Sampling in Distributions


Simon Byrne

Feb 13, 2015, 3:11:40 PM
to juli...@googlegroups.com
One of the sticking points to getting rid of Rmath from Base is its interaction with the random number generator. The ideal solution to this is implementing the sampling algorithms in Distributions.jl. Between Dahua and myself, most of the algorithms have now been implemented: the stumbling block is actually tying it all together in an efficient manner.

As we seem to have hit a roadblock with this, I thought I would outline the main issues, in the hope that others might have some useful suggestions.

1. When sampling multiple times from the same distribution, some quantities need only be computed once. Rmath uses static variables for this (checking at each call whether the distribution parameters have changed), but this presents possible thread-safety issues. The current approach we are exploring uses immutable types to hold all the pre-computed constants, for example:
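A minimal sketch of the idea (the names, fields, and algorithm here are illustrative, in the style of a Marsaglia-Tsang sampler for shape >= 1, not the actual Distributions.jl implementation):

```julia
# Illustrative only: an immutable sampler caching the constants that a
# Marsaglia-Tsang style gamma sampler would otherwise recompute per draw.
immutable GammaMTSampler
    d::Float64      # shape - 1/3, computed once
    c::Float64      # 1/sqrt(9d), computed once
    scale::Float64
end

function GammaMTSampler(shape::Float64, scale::Float64)
    d = shape - 1.0/3.0
    GammaMTSampler(d, 1.0/sqrt(9.0*d), scale)
end

function Base.rand(s::GammaMTSampler)
    while true
        x = randn()
        v = (1.0 + s.c*x)^3
        v > 0.0 || continue
        u = rand()
        # log-acceptance test; the cached s.d and s.c are reused every draw
        if log(u) < 0.5*x*x + s.d*(1.0 - v + log(v))
            return s.d * v * s.scale
        end
    end
end
```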


The idea here being that you could use

s = GammaGDSampler(d)
X = [rand(s) for i = 1:n]

for efficiently sampling an array of values from the same distribution. Unfortunately, this turns out to be inefficient for sampling single values, as you may end up computing values on branches that aren't used. We could get around this by implementing everything twice, but hopefully there is a neater solution.

2. Typically there isn't one optimal algorithm for all parameter values, instead you need a meta-algorithm to select the appropriate algorithm for a particular value. One idea we tried was defining a function to choose the appropriate algorithm, e.g.

function sampler(d::Gamma)
    if d.shape < 1.0
        GammaIPSampler(d)
    elseif d.shape == 1.0
        Exponential(d.scale)
    else
        GammaGDSampler(d)
    end
end

and then define:

rand(d::Gamma, n...) = rand(sampler(d), n...)

The problem with this is that the return type instability of sampler imposes a significant cost on the resulting rand.

3. These problems are compounded by the fact that rand methods call other rand methods, e.g. a Beta distribution can be sampled by taking

function rand(d::Beta)
    a = rand(Gamma(d.α))
    b = rand(Gamma(d.β))
    return a/(a+b)
end

In order to do this efficiently for an array, you would want to choose the appropriate algorithm, and pre-compute the appropriate values, for each Gamma distribution.
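One way to phrase this, sketched with a hypothetical composite type: once the two gamma samplers are chosen and their constants computed, a wrapper can hold them both, with type parameters keeping the fields concrete so the inner calls dispatch statically.

```julia
# Hypothetical composite sampler: holds the two pre-built gamma samplers.
# The type parameters A and B keep the fields concretely typed.
immutable BetaSampler{A,B}
    sα::A
    sβ::B
end

# Each draw reuses the pre-computed constants inside sα and sβ.
Base.rand(s::BetaSampler) = (a = rand(s.sα); b = rand(s.sβ); a / (a + b))
```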


Hopefully that explains the main obstacles we're facing. Please let me know if you have any suggestions, or would like further explanation.

Simon

Elliot Saba

Feb 13, 2015, 3:35:57 PM
to Julia Dev
It sounds to me like the way to deal with your polyalgorithmic problem is to try to take advantage of multiple dispatch somehow.  Is it possible to, for instance, have Gamma be a parent type, and have child types for each sub-algorithm you need in your polyalgorithm?  This way, when you construct a Gamma distribution, it would return something of type GammaIP, or GammaGD, etc... Then you can have the rand() function directly call the relevant sampler function.
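Something like the following (all names are hypothetical, just to show the shape of the idea):

```julia
# Hypothetical layout: an abstract parent with one concrete subtype per
# algorithm; the choice of algorithm is made once, at construction time.
abstract AbstractGamma

immutable GammaSmall <: AbstractGamma   # shape < 1 algorithm
    shape::Float64
    scale::Float64
end

immutable GammaLarge <: AbstractGamma   # shape >= 1 algorithm
    shape::Float64
    scale::Float64
end

# a plain function (not a constructor) so it can return either subtype
makegamma(shape, scale) =
    shape < 1.0 ? GammaSmall(shape, scale) : GammaLarge(shape, scale)

# rand then dispatches directly to the relevant algorithm
Base.rand(g::GammaSmall) = 0.0   # stand-in for the shape < 1 sampler
Base.rand(g::GammaLarge) = 1.0   # stand-in for the shape >= 1 sampler
```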

Of course, if the sub-algorithm is dependent upon parameters that can change after construction of the distribution object, multiple dispatch isn't going to give us any speedup here, I don't think.
-E

Tim Holy

Feb 14, 2015, 6:56:37 AM
to juli...@googlegroups.com
These are tough problems. In another context I've started wondering whether
there might be a need to develop a package that provides a convenient hack
around type-instability. I have no idea whether this will actually be useful
in practice, but the idea might be something like the following: suppose you
have 3 different types, A, B, and C. We can create a type-stable object
container like this:

immutable ObjectContainer
    objtype::DataType
    objdata::Vector{UInt8}
end

where objdata is a wrapper around the memory associated with the object. Then
one creates a @dispatch_for macro,

@dispatch_for A B C myfunction(obj::ObjectContainer, args...)

which generates code that looks like this:

function myfunction(obj::ObjectContainer, args...)
    if obj.objtype == A
        myfunction(unsafe_pointer_to_objref(convert(Ptr{A},
            pointer(obj.objdata))), args...)
    elseif obj.objtype == B
        ...
    end
end
Basically the idea here is that you bypass generic runtime method lookup and
handle dispatch manually. Naturally, you have to write myfunction methods for
A, B, and C as usual. There's a remote chance you'd have to give these
different names (myfunction_A, myfunction_B, myfunction_C) and have the
@dispatch_for macro use these names, but I hope that wouldn't be necessary.
You'd also want to declare all of them @noinline.

There are quite a few issues to investigate: will the array-wrapper allocation
destroy any benefit? Can one realistically handle multiple dispatch, not just
single dispatch on the first argument?

--Tim

Simon Byrne

Feb 14, 2015, 7:02:41 AM
to juli...@googlegroups.com
On Friday, 13 February 2015 20:35:57 UTC, Elliot Saba wrote:
It sounds to me like the way to deal with your polyalgorithmic problem is to try to take advantage of multiple dispatch somehow.  Is it possible to, for instance, have Gamma be a parent type, and have child types for each sub-algorithm you need in your polyalgorithm?  This way, when you construct a Gamma distribution, it would return something of type GammaIP, or GammaGD, etc... Then you can have the rand() function directly call the relevant sampler function.
 
Yeah, that is kind of the pattern we're going for, but we were also using the types to hold the pre-computed constants: maybe that is the wrong idea though.

Simon Byrne

Feb 14, 2015, 7:15:21 AM
to juli...@googlegroups.com


On Saturday, 14 February 2015 11:56:37 UTC, Tim wrote:
Then one creates a @dispatch_for macro,

@dispatch_for A B C myfunction(obj::ObjectContainer, args...)

which generates code that looks like this:

function myfunction(obj::ObjectContainer, args...)
    if obj.objtype == A
        myfunction(unsafe_pointer_to_objref(convert(Ptr{A},
            pointer(obj.objdata))), args...)
    elseif obj.objtype == B
        ...
    end
end

...
 
That is an interesting approach, and one that hadn't occurred to me. It does seem like you're almost reimplementing your own dispatch system though!

In my code above, once sampler is inlined, the rand method effectively becomes

function rand(d::Gamma,n...)
    if d.shape < 1.0
        s = GammaIPSampler(d)
    elseif d.shape == 1.0
        s = Exponential(d.scale)
    else
        s = GammaGDSampler(d)
    end
    rand(s,n...)
end
 
We would avoid the type instability altogether if there were a way to lift the rand call into each branch, i.e. you want to generate the following code:

function rand(d::Gamma,n...)
    if d.shape < 1.0
        s = GammaIPSampler(d)
        rand(s,n...)
    elseif d.shape == 1.0
        s = Exponential(d.scale)
        rand(s,n...)
    else
        s = GammaGDSampler(d)
        rand(s,n...)
    end
end

This would be considerably more difficult for problem 3, where we would ideally want two nested if blocks, i.e.

function rand(d::Beta,n...)
    if d.α < 1.0
        sα = GammaIPSampler(Gamma(d.α))
        if d.β < 1.0
            sβ = GammaIPSampler(Gamma(d.β))
            rand(BetaSampler(sα, sβ),n...)
        elseif d.β == 1.0
            sβ = Exponential(1.0)
            rand(BetaSampler(sα, sβ),n...)
        else
            sβ = GammaGDSampler(Gamma(d.β))
            rand(BetaSampler(sα, sβ),n...)
        end
    elseif d.α == 1.0
        sα = Exponential(1.0)
        if d.β < 1.0
            sβ = GammaIPSampler(Gamma(d.β))
            rand(BetaSampler(sα, sβ),n...)
        elseif d.β == 1.0
            sβ = Exponential(1.0)
            rand(BetaSampler(sα, sβ),n...)
        else
            sβ = GammaGDSampler(Gamma(d.β))
            rand(BetaSampler(sα, sβ),n...)
        end
    else
        sα = GammaGDSampler(Gamma(d.α))
        if d.β < 1.0
            sβ = GammaIPSampler(Gamma(d.β))
            rand(BetaSampler(sα, sβ),n...)
        elseif d.β == 1.0
            sβ = Exponential(1.0)
            rand(BetaSampler(sα, sβ),n...)
        else
            sβ = GammaGDSampler(Gamma(d.β))
            rand(BetaSampler(sα, sβ),n...)
        end
    end
end

Now, of course I could write all this out myself, but given all the neat metaprogramming tools in Julia, we should be able to write this once.

Simon Byrne

Feb 18, 2015, 5:32:40 PM
to juli...@googlegroups.com
So after more thought, I realised one solution to problem 2 (and probably 3 as well) would be to instead make sampler use a callback, e.g.

function sampler(f,d::Gamma)
    if shape(d) < 1.0
        s = GammaIPSampler(d)
        f(s)
    elseif shape(d) == 1.0
        s = Exponential(scale(d))
        f(s)
    else
        s = GammaGDSampler(d)
        f(s)
    end
end

rand(d::Gamma,n...) = sampler(s -> rand(s,n...), d)

The only problem is that Julia loses the ability to infer the return type, so code that makes repeated calls becomes woefully inefficient, e.g.

function bar(X)
    s = 0.0
    for x in X
        s += rand(Gamma(x))
    end
    s
end

(this problem persists even if I @inline sampler and use a generic function instead of an anonymous function as the callback)

Any suggestions as to how to get around this?

-simon

Michael Francis

May 19, 2015, 11:53:58 AM
to juli...@googlegroups.com
Simon, you can make the code type stable by doing the following 

function sampler(d::Gamma)
    local ret::Sampleable
    if shape(d) < 1.0
        ret = Distributions.GammaIPSampler(d)
    elseif shape(d) == 1.0
        ret = Distributions.Exponential(d.scale)
    else
        ret = Distributions.GammaGDSampler(d)
    end
    return convert(Sampleable, ret)
end

The above forces the inferred return type to always be the abstract Sampleable (note that you need both the type declaration and the convert). The subsequent calls are also type stable. Unfortunately I don't see much in the way of performance improvement for the example given: around 0.015 (±0.005) seconds for 100000 random Gamma values, which appears to be consistent with the rcall version of rand(Gamma). Do you have an example of the more complex use case?

John Myles White

May 19, 2015, 11:55:21 AM
to juli...@googlegroups.com
I suspect type stability isn't sufficient: you also need concrete types to be inferred.

 -- John

Michael Francis

May 19, 2015, 12:14:21 PM
to juli...@googlegroups.com
This performance does seem to be consistent with the rcall version. It would be good to see a fuller example of how this is being used, as currently the time is spent constructing the immutable samplers.

Tim Holy

May 19, 2015, 2:37:36 PM
to juli...@googlegroups.com
Sorry, I just missed the rest of this thread. I think what you're asking is
doable with

if d.shape < 1.0
    s = GammaIPSampler(d)
    rand(s::GammaIPSampler{typeparams},n...)
elseif d.shape == 1.0
    s = Exponential(d.scale)
    rand(s::ExponentialScale{othertypeparams},n...)

etc.

--Tim