I think it would be sufficient to have a simple test that the returned elements cover the set if you run for long enough. Testing that the elements occur with equal probability seems like overkill; this is something you prove by the construction of the algorithm, whereas the tests are mainly there to catch gross errors.
e.g.
let s = Set(1:10), r = rand(s, 10000)
@test Set(unique(r)) == s
end
This should only fail with probability 0.9^10000 ≈ 2e-458, which is low enough for us not to worry about I think.