Random Set element?

676 views
Skip to first unread message

Jorge Fernández de Cossío Díaz

unread,
May 5, 2016, 5:46:06 PM5/5/16
to julia-users
How to get a random element from a Set? Or in general, from a non-indexable collection?

Obviously A[rand(1:length(A))] doesn't work (where A is the Set). I tried this:

function rand{T}(s::Set{T})
    @assert !isempty(s)
    k = rand(1:length(s))
    for x in s
        k -= 1
        if k == 0; return x end
    end
end

But this is O(n), where n is the length of the set. Is there a more efficient way?

Steven G. Johnson

unread,
May 5, 2016, 7:39:05 PM5/5/16
to julia-users
I think you have to use the internals of the Set implementation to have an O(1) algorithm here.   For example, the following seems to work:

import Base: GLOBAL_RNG, isslotfilled, rand
function rand(r::AbstractRNG, s::Set)
    isempty(s) && throw(ArgumentError("set must be non-empty"))
    n = length(s.dict.slots)
    while true
        i = rand(r, 1:n)
        isslotfilled(s.dict, i) && return s.dict.keys[i]
    end
end
rand(s::Set) = rand(Base.GLOBAL_RNG, s)

Something like this should probably be in Base.  This could make a good PR if you want to add a test case, documentation, etc.

Steven G. Johnson

unread,
May 6, 2016, 10:46:55 AM5/6/16
to julia-users

Jorge Fernández de Cossío Díaz

unread,
May 6, 2016, 11:15:52 AM5/6/16
to julia...@googlegroups.com
How do you test a random algorithm like this? Besides obvious checks like that the returned element must belong to the set.
Any deterministic test to check randomness will have a finite probability of failure even if the algorithm is sound.

On Fri, May 6, 2016 at 10:46 AM, Steven G. Johnson <steve...@gmail.com> wrote:

Stefan Karpinski

unread,
May 6, 2016, 11:28:38 AM5/6/16
to Julia Users
Inline image 1

Steven G. Johnson

unread,
May 6, 2016, 11:44:17 AM5/6/16
to julia-users
On Friday, May 6, 2016 at 11:15:52 AM UTC-4, Jorge Fernández de Cossío Díaz wrote:
How do you test a random algorithm like this? Besides obvious checks like that the returned element must belong to the set.
Any deterministic test to check randomness will have a finite probability of failure even if the algorithm is sound.

I think it would be sufficient to have a simple test that the returned elements cover the set if you run for long enough.  Testing that the elements occur with equal probability seems like overkill; this is something you prove by the construction of the algorithm, whereas the tests are mainly there to catch gross errors.

e.g.

let s = Set(1:10), r = rand(s, 10000)
   @test Set(unique(r)) == s
end

This should only fail with probability 0.9^10000 ≈ 2e-458, which is low enough for us not to worry about I think.
Reply all
Reply to author
Forward
0 new messages