Random Set element?

Jorge Fernández de Cossío Díaz

unread,

May 5, 2016, 5:46:06 PM5/5/16

to julia-users

How to get a random element from a Set? Or in general, from a non-indexable collection?

Obviously A[rand(1:length(A))] doesn't work (where A is the Set). I tried this:

function rand{T}(s::Set{T})

@assert !isempty(s)

k = rand(1:length(s))

for x in s

k -= 1

if k == 0; return x end

end

But this is O(n), where n is the length of the set. Is there a more efficient way?

Steven G. Johnson

unread,

May 5, 2016, 7:39:05 PM5/5/16

to julia-users

I think you have to use the internals of the Set implementation to have an O(1) algorithm here. For example, the following seems to work:

import Base: GLOBAL_RNG, isslotfilled, rand
function rand(r::AbstractRNG, s::Set)
isempty(s) && throw(ArgumentError("set must be non-empty"))
n = length(s.dict.slots)
while true
i = rand(r, 1:n)
isslotfilled(s.dict, i) && return s.dict.keys[i]
end
end
rand(s::Set) = rand(Base.GLOBAL_RNG, s)

Something like this should probably be in Base. This could make a good PR if you want to add a test case, documentation, etc.

Steven G. Johnson

unread,

May 6, 2016, 10:46:55 AM5/6/16

to julia-users

Filed https://github.com/JuliaLang/julia/issues/16231

Jorge Fernández de Cossío Díaz

unread,

May 6, 2016, 11:15:52 AM5/6/16

to julia...@googlegroups.com

How do you test a random algorithm like this? Besides obvious checks like that the returned element must belong to the set.

Any deterministic test to check randomness will have a finite probability of failure even if the algorithm is sound.

On Fri, May 6, 2016 at 10:46 AM, Steven G. Johnson <steve...@gmail.com> wrote:

Filed https://github.com/JuliaLang/julia/issues/16231

Stefan Karpinski

unread,

May 6, 2016, 11:28:38 AM5/6/16

to Julia Users

Steven G. Johnson

unread,

May 6, 2016, 11:44:17 AM5/6/16

to julia-users

On Friday, May 6, 2016 at 11:15:52 AM UTC-4, Jorge Fernández de Cossío Díaz wrote:

How do you test a random algorithm like this? Besides obvious checks like that the returned element must belong to the set.
Any deterministic test to check randomness will have a finite probability of failure even if the algorithm is sound.

I think it would be sufficient to have a simple test that the returned elements cover the set if you run for long enough. Testing that the elements occur with equal probability seems like overkill; this is something you prove by the construction of the algorithm, whereas the tests are mainly there to catch gross errors.

e.g.

let s = Set(1:10), r = rand(s, 10000)

@test Set(unique(r)) == s

end

This should only fail with probability 0.9^10000 ≈ 2e-458, which is low enough for us not to worry about I think.

Reply all

Reply to author

Forward