Shuffle

2 views
Skip to first unread message

Ryan Tarpine

unread,
Aug 30, 2001, 3:32:41 PM8/30/01
to
What's the easiest way to get a shuffled list or array of a sequence of
numbers? e.g. I need 0 through 1599 in random order, each number appearing
exactly once. Ocaml if possible. Thanks in advance

--
Ryan Tarpine, rtar...@hotmail.com
"Technology is a queer thing. It brings you great gifts
in one hand, and it stabs you in the back with the other."
-- C.P. Snow

Joachim Durchholz

unread,
Aug 30, 2001, 4:07:48 PM8/30/01
to
Ryan Tarpine <rtar...@hotmail.com> wrote:
> What's the easiest way to get a shuffled list or array of a sequence
of
> numbers? e.g. I need 0 through 1599 in random order, each number
appearing
> exactly once. Ocaml if possible. Thanks in advance

Easiest would be:
Put them in an array.
for i from 1 to N:
exchange the i'th element with a randomly chosen one
(exchanging it with itself resulting in a no-op)
(Sorry no OCaml. Should be easy enough to translate though.)

This means ca. 2N writes, on the average. Knuth's Art of Computer
Programming might have more efficient algorithms, but I can't imagine
one right now.

The algorithm is also very imperative; I'm not sure that it's possible
to write a functional version that matches both speed and efficiency.
(I'd be happy to see such a function.)

Regards,
Joachim
--
This is not an official statement from my employer.


Brian Rogoff

unread,
Aug 30, 2001, 4:15:36 PM8/30/01
to
On Thu, 30 Aug 2001, Ryan Tarpine wrote:
> What's the easiest way to get a shuffled list or array of a sequence of
> numbers?

Post to a newsgroup and ask for help. ;-)

> e.g. I need 0 through 1599 in random order, each number appearing
> exactly once. Ocaml if possible. Thanks in advance

let swap a i j =
let tmp = Array.get a i in
(Array.set a i (Array.get a j);
Array.set a j tmp)

let shuffled_array n =
let a = Array.init n (fun i -> i) in
(Random.init n;
for i = 0 to (n - 1) do
swap a i (Random.int n);
done;
a)

let a = shuffled_array 1600

Not really a very functional program, I admit, and probably not the best,
but I work best for money or love.

-- Brian


Al Christians

unread,
Aug 30, 2001, 4:49:27 PM8/30/01
to
Brian Rogoff wrote:
>
> Post to a newsgroup and ask for help. ;-)
>
> > e.g. I need 0 through 1599 in random order, each number appearing
> > exactly once. Ocaml if possible. Thanks in advance
>
> let swap a i j =
> let tmp = Array.get a i in
> (Array.set a i (Array.get a j);
> Array.set a j tmp)
>
> let shuffled_array n =
> let a = Array.init n (fun i -> i) in
> (Random.init n;
> for i = 0 to (n - 1) do
> swap a i (Random.int n);
> done;
> a)
>
> let a = shuffled_array 1600
>
> Not really a very functional program, I admit, and probably not the
> best, but I work best for money or love.
>

I believe that you will get a better randommess if you make that:

swap a i (i + (Random.int (n-i));


Al

Ryan Tarpine

unread,
Aug 30, 2001, 6:08:10 PM8/30/01
to
Al Christians wrote:

> Brian Rogoff wrote:
>> ...


>> let swap a i j =
>> let tmp = Array.get a i in
>> (Array.set a i (Array.get a j);
>> Array.set a j tmp)
>>
>> let shuffled_array n =
>> let a = Array.init n (fun i -> i) in
>> (Random.init n;
>> for i = 0 to (n - 1) do
>> swap a i (Random.int n);
>> done;
>> a)
>>
>> let a = shuffled_array 1600

>> ...


>
> I believe that you will get a better randommess if you make that:
>
> swap a i (i + (Random.int (n-i));

> ...

Thanks to both of you. Just what I needed.

Al Christians

unread,
Aug 30, 2001, 6:47:59 PM8/30/01
to
Ryan Tarpine wrote:
>
> Thanks to both of you. Just what I needed.
>

If you get an A, send us each 15%.


Al

Alexander V. Voinov

unread,
Aug 30, 2001, 7:03:09 PM8/30/01
to
Hi Al,

Al Christians wrote:

But it amounts to slightly less than a... newline! Furthermore your
mailer would strip it as a white space :-(.

Alexander


Russ Allbery

unread,
Aug 30, 2001, 7:59:52 PM8/30/01
to
Joachim Durchholz <joac...@gmx.de> writes:

> Easiest would be:
> Put them in an array.
> for i from 1 to N:
> exchange the i'th element with a randomly chosen one
> (exchanging it with itself resulting in a no-op)

I don't have the proof handy, but you can prove that swapping each element
with a random element is more likely to produce some final orderings than
others. You instead want to swap each element with a randomly chosen
element between it and the end of the list.

--
Russ Allbery (r...@stanford.edu) <http://www.eyrie.org/~eagle/>

Joachim Durchholz

unread,
Aug 31, 2001, 4:27:52 AM8/31/01
to
Russ Allbery <r...@stanford.edu> wrote:
> Joachim Durchholz <joac...@gmx.de> writes:
>
> > Easiest would be:
> > Put them in an array.
> > for i from 1 to N:
> > exchange the i'th element with a randomly chosen one
> > (exchanging it with itself resulting in a no-op)
>
> I don't have the proof handy, but you can prove that swapping each
element
> with a random element is more likely to produce some final orderings
than
> others.

I don't know. My first approach was to swap randomly chosen elements,
for N times, but I wasn't sure that that wouldn't create a bias.

> You instead want to swap each element with a randomly chosen
> element between it and the end of the list.

I do have a proof (well, sketch of it) that this algorithm will work.
It is by induction:

For one element, the algorithm will be trivially correct.

For N elements, there's an 1/N chance that the algorithm will swap the
first element with itself, leaving the others alone.
Further there's an 1/N chance that the algorithm will swap the first
element with any given other element O. This gives O an 1/N chance to
finish on position 1, and an (N-1)/N chance to end up on any other
position, at which is was placed with an 1/(N-1) probability in the
previous round by induction, which gives an (N-1)/N*1/(N-1) = 1/N
probability, q.e.d.

Conclusion: I don't really know whether my algorithm is correct, but I
can't prove it and I *can* prove yours, so your algorithm is
preferrable.

Jón Fairbairn

unread,
Aug 31, 2001, 6:25:01 AM8/31/01
to
"Joachim Durchholz" <joac...@gmx.de> writes:

> Ryan Tarpine <rtar...@hotmail.com> wrote:
> > What's the easiest way to get a shuffled list or array of a
> > sequence of numbers? e.g. I need 0 through 1599 in random order,
> > each number appearing exactly once. Ocaml if possible. Thanks in
> > advance

> Easiest would be:

in Haskell, (sorry)

factorial n = product [1..n]
premutations l = left as an exercise

then

do r <- newStdGen -- start off a random generator
let (index, nextgen) = randomIvalInteger (0, factorial 1600 - 1) r
-- generate a random integer between 0 and 1600!
return (permutations [0..1599] !! index)

gives you what you want

> The algorithm is also very imperative;

My algorithem is quite functional. Might be a bit slow, though ;-)


--
Jón Fairbairn Jon.Fa...@cl.cam.ac.uk

Joachim Durchholz

unread,
Aug 31, 2001, 6:44:32 AM8/31/01
to
Jón Fairbairn <j...@cl.cam.ac.uk> wrote:
> premutations l = left as an exercise
>
> then
>
> do r <- newStdGen -- start off a random generator
> let (index, nextgen) = randomIvalInteger (0, factorial 1600 - 1) r
> -- generate a random integer between 0 and 1600!
> return (permutations [0..1599] !! index)
>
> gives you what you want
>
> > The algorithm is also very imperative;
>
> My algorithem is quite functional. Might be a bit slow, though ;-)

Yup, it's O(N!) in time and space, since permutations is generating all
of them. I don't think that current-day computers can do that with 1600
items.

Is there a functional, efficient algorithm for generating a random
permutation?

Ketil Z Malde

unread,
Aug 31, 2001, 7:05:19 AM8/31/01
to
"Joachim Durchholz" <joac...@gmx.de> writes:

> > do r <- newStdGen -- start off a random generator
> > let (index, nextgen) = randomIvalInteger (0, factorial 1600 - 1) r
> > -- generate a random integer between 0 and 1600!
> > return (permutations [0..1599] !! index)

> Yup, it's O(N!) in time and space, since permutations is generating all
> of them.

Uh, doesn't laziness turn it into an O(n) space thing? So unless I'm
mistaken, you're left with only O(n!) time complexity. Might still be
enough to put some people off, though.

> Is there a functional, efficient algorithm for generating a random
> permutation?

Uh, given a list of elements, draw a random one and add to an
accumulator list until none are left? Gives you O(n^2), but that's
because you have to traverse lists - and the constant factor is low.
And space is O(n).

-kzm
--
If I haven't seen further, it is by standing in the footprints of giants

Mark Jason Dominus

unread,
Aug 31, 2001, 7:08:23 AM8/31/01
to
In article <9mnhqf$3649q$1...@ID-9852.news.dfncis.de>,
Joachim Durchholz <joachim....@gmx.de> wrote:

>Russ Allbery <r...@stanford.edu> wrote:
>> I don't have the proof handy, but you can prove that swapping each
>element
>> with a random element is more likely to produce some final orderings
>than
>> others.
>
>Conclusion: I don't really know whether my algorithm is correct, but I
>can't prove it and I *can* prove yours, so your algorithm is
>preferrable.

I can prove that yours is wrong. Let N=3. Your algorithm generates
three random numbers in the course of its execution. Each of these
random numbers is between 1 and 3. The result, R, is a function of
these three random numbers and nothing else.

There are 27 possible triples of random numbers (1,1,1) .. (3,3,3),
and 6 possible results R. Since 27 is not an exact multiple of 6,
your algorithm must select some results more often than others.

In general, N^N is not a multiple of N! unless N<=2, so your algorithm
is biased except when N<=2.

--
@P=split//,".URRUU\c8R";@d=split//,"\nrekcah xinU / lreP rehtona tsuJ";sub p{
@p{"r$p","u$p"}=(P,P);pipe"r$p","u$p";++$p;($q*=2)+=$f=!fork;map{$P=$P[$f^ord
($p{$_})&6];$p{$_}=/ ^$P/ix?$P:close$_}keys%p}p;p;p;p;p;map{$p{$_}=~/^[P.]/&&
close$_}%p;wait until$?;map{/^r/&&<$_>}%p;$_=$d[$q];sleep rand(2)if/\S/;print

Remi VANICAT

unread,
Aug 31, 2001, 6:21:10 AM8/31/01
to

you can have an O(n log n) by using a tree instead of a list.
--
Rémi Vanicat
van...@labri.u-bordeaux.fr
http://dept-info.labri.u-bordeaux.fr/~vanicat

Joachim Durchholz

unread,
Aug 31, 2001, 7:58:46 AM8/31/01
to
Ketil Z Malde <ke...@ii.uib.no> wrote:
> "Joachim Durchholz" <joac...@gmx.de> writes:
>
> > > do r <- newStdGen -- start off a random generator
> > > let (index, nextgen) = randomIvalInteger (0, factorial 1600 -
1) r
> > > -- generate a random integer between 0 and 1600!
> > > return (permutations [0..1599] !! index)
>
> > Yup, it's O(N!) in time and space, since permutations is generating
all
> > of them.
>
> Uh, doesn't laziness turn it into an O(n) space thing?

Hmm... I'd think this heavily depends on the details of the
implementation of permutations.
Am I guessing right here?

> So unless I'm
> mistaken, you're left with only O(n!) time complexity. Might still be
> enough to put some people off, though.

Are you sure you meant O(n!) here? That's one of the worst complexities
that one encounters in practice!

> > Is there a functional, efficient algorithm for generating a random
> > permutation?
>
> Uh, given a list of elements, draw a random one and add to an
> accumulator list until none are left?

For example.

> Gives you O(n^2), but that's
> because you have to traverse lists - and the constant factor is low.

O(N^2) isn't "efficient" if a linear one exists. The worst I'd accept is
O(N log N) (and when it comes to log N factors, I'm more concerned about
constants than big-O complexity).
One could use a tree that stores the not-yet-assigned numbers. We'd have
to store the size of each subtree in its root to guide the random-number
generator. And, since we're at it, we can dispose of the actual numbers,
since the number generated can be inferred from the sequence of
left-right decisions of the RNG, leaving us with a tree where each node
stores the number of unused numbers in some power-of-two-sized interval.
This would have a space and time complexity of O(N log N), but with a
relatively high constant overhead.

> And space is O(n).

Agreed.

Jón Fairbairn

unread,
Aug 31, 2001, 6:45:08 PM8/31/01
to
"Joachim Durchholz" <joac...@gmx.de> writes:

> Ketil Z Malde <ke...@ii.uib.no> wrote:
> > "Joachim Durchholz" <joac...@gmx.de> writes:
> >

[I wrote]


> > > > do r <- newStdGen -- start off a random generator
> > > > let (index, nextgen) = randomIvalInteger (0, factorial 1600 -
> 1) r
> > > > -- generate a random integer between 0 and 1600!
> > > > return (permutations [0..1599] !! index)
> >
> > > Yup, it's O(N!) in time and space, since permutations is generating
> all
> > > of them.
> >
> > Uh, doesn't laziness turn it into an O(n) space thing?
>
> Hmm... I'd think this heavily depends on the details of the
> implementation of permutations.

I think so.

> Am I guessing right here?
>
> > So unless I'm
> > mistaken, you're left with only O(n!) time complexity. Might still be
> > enough to put some people off, though.
>
> Are you sure you meant O(n!) here?

He did. I'm not sure that people have fully appreciated the
implications of the smiley in my original post!

--
Jón Fairbairn Jon.Fa...@cl.cam.ac.uk

Mattias Waldau

unread,
Sep 1, 2001, 5:17:40 AM9/1/01
to
I would use sort to shuffle, and have a compare function that
is random. You will get a n log n shuffle using this technique.

Code in Ocaml:

let _ = Random.self_init ()

(* shuffle the array in place *)
let shuffle (a:'a array) : unit =
let cmp a b = (Random.int 3) - 1 in
Array.sort cmp a

(* return a permuted array of ints of size l *)
let permutated_int (l:int) : int array =
let a = Array.init l (fun ii -> ii) in
shuffle a;
a

and run you get

# permutated_int 10;;
- : int array = [|3; 5; 9; 4; 1; 6; 8; 7; 2; 0|]
# permutated_int 20;;
- : int array =
[|11; 18; 3; 14; 2; 8; 7; 17; 9; 10; 5; 16; 13; 1; 15; 19; 12; 6; 4; 0|]
# permutated_int 20;;
- : int array =
[|9; 4; 17; 5; 14; 18; 7; 6; 11; 15; 19; 10; 13; 12; 16; 3; 8; 2; 1; 0|]
# permutated_int 20;;
- : int array =
[|4; 8; 16; 7; 17; 6; 10; 14; 15; 18; 11; 19; 9; 3; 0; 5; 13; 2; 1; 12|]
#

--
Mattias Waldau, http://www.abc.se/~m10217/

Xavier Leroy

unread,
Sep 1, 2001, 6:13:30 AM9/1/01
to

Mattias Waldau <mat...@localhost.localdomain> writes:

> I would use sort to shuffle, and have a compare function that
> is random. You will get a n log n shuffle using this technique.

Argh! There is a risk that the sort function will loop, or perform an
"array out of bounds" error. All sorting algorithms rely on the comparison
function to be a total order, which is certainly not the case for yours:

> let cmp a b = (Random.int 3) - 1

Better not do this -- unless you're competing for the "best abuse of a
library function" contest :-)

What you could do, however, is associate random integers to each element of
the array to be shuffled, then sort the (elt, random) pairs by increasing
"random" components. That should give you a good shuffle of the "elt"
components, in n log n time.

- Xavier Leroy
--
Valid e-mail address (without the underscores): Xavier.Leroy@i_n_r_i_a.f_r
This is a protection against junk mail. Apologies for the inconvenience.
Home page: http://pauillac.inria.fr/~xleroy/

Joachim Durchholz

unread,
Sep 1, 2001, 6:47:53 AM9/1/01
to
Mattias Waldau <mat...@localhost.localdomain> wrote:
> I would use sort to shuffle, and have a compare function that
> is random. You will get a n log n shuffle using this technique.

Yeah... but can you prove that the distribution is truly random?

You'd have to prove that the sort algorithm, whatever it is, will place
an element in any array slot with equal probability if the comparison
operation is random. Quicksort is optimized for other purposes, so I
doubt it will have that property. OTOH if it does work, I suspect that
it will work for any sort algorithm.
Food for thought for algorithm researchers :)

Jón Fairbairn

unread,
Sep 1, 2001, 7:25:21 AM9/1/01
to
Xavier...@see.my.sig.for.address (Xavier Leroy) writes:

> What you could do, however, is associate random integers to each element of
> the array to be shuffled, then sort the (elt, random) pairs by increasing
> "random" components. That should give you a good shuffle of the "elt"
> components, in n log n time.

That's what I'd do, though you need to make sure that the randoms are
all distinct, otherwise you get the n^n instead of n! problem
mentioned earlier in the thread in connexion with one of the more
imperative algorithms. Then the difficulty is to generate n /distinct/
random numbers in reasonable time/space.

--
Jón Fairbairn Jon.Fa...@cl.cam.ac.uk

Neelakantan Krishnaswami

unread,
Sep 1, 2001, 7:34:51 AM9/1/01
to

Linear congruential generators have that property. Since they are of
the form x' <- (ax + b) mod m, you get a repeat only when the prng
loops. So a long-period will generate numbers without repeats.

This is occasional justification for using them -- I can't think of
any other situation in which you would want to.


Neel

Xavier Leroy

unread,
Sep 1, 2001, 7:45:32 AM9/1/01
to

j...@cl.cam.ac.uk (=?iso-8859-1?q?J=F3n?= Fairbairn) writes:

> That's what I'd do, though you need to make sure that the randoms are
> all distinct, otherwise you get the n^n instead of n! problem
> mentioned earlier in the thread in connexion with one of the more
> imperative algorithms. Then the difficulty is to generate n /distinct/
> random numbers in reasonable time/space.

As someone else responded, if your PRNG has a large period, this won't
happen. In the general case, it's also improbable to get two identical
random numbers if your sequence is short enough. E.g. assuming 64-bit
random numbers, collisions are unlikely if the sequence to shuffle has
much less than 2^32 elements, which should often be the case :-)

Jón Fairbairn

unread,
Sep 1, 2001, 8:33:19 AM9/1/01
to
Xavier...@see.my.sig.for.address (Xavier Leroy) writes:

> In the general case, it's also improbable to get two identical
> random numbers if your sequence is short enough. E.g. assuming 64-bit
> random numbers, collisions are unlikely if the sequence to shuffle has
> much less than 2^32 elements, which should often be the case :-)

Well, yes, that's true. But it doesn't /feel/ right to rely on
it, somehow!

--
Jón Fairbairn Jon.Fa...@cl.cam.ac.uk

Mattias Waldau

unread,
Sep 1, 2001, 9:48:13 AM9/1/01
to
"Joachim Durchholz" <joac...@gmx.de> writes:
> operation is random. Quicksort is optimized for other purposes, so I
> doubt it will have that property. OTOH if it does work, I suspect that
> it will work for any sort algorithm.

Quicksort and mergesort are the sort algoritms that I mostly believe will
make a good shuffle.

In quicksort, the call to partition will take an element at a time
and decide if it should be put into the left or to the right subpart, then
this will be repeated recursively for the subparts.

In mergesort, the merger will either to take the first element
from the left or the right subpart, and then continue to recurse.

Regarding Xavier Leroy fear that the sorting library might crash
if compare isn't a total ordering, of course I cannot argue since
Xavier Leroy probably is the one who implemented the sorting in Ocaml.

A good and pure sorting functions should never be able to detect that
the compare function is cheating, since the only way to to that
would be to compare the same two elements twice. That compare is a
total ordering is only needed in order to be sure that the result
is sorted.

Al Christians

unread,
Sep 1, 2001, 12:07:12 PM9/1/01
to
Joachim Durchholz wrote:
>
>
> Yeah... but can you prove that the distribution is truly random?
>
No, you can prove by the same logic as the proof previously presented
that it cannot be random in all cases. The sort decision in the
algorithm posted is binary, with equal probability that elements will
compare one way or the other. So, the probability of any result must
be a sum of powers of 1/2. If the number of elements in the list is
not a power of 2, then some results must be more likely than others.


Al

Mark Jason Dominus

unread,
Sep 1, 2001, 12:01:52 PM9/1/01
to
In article <slrn9p1il1...@brick.cswv.com>,

Neelakantan Krishnaswami <ne...@alum.mit.edu> wrote:
>> Then the difficulty is to generate n
>> /distinct/ random numbers in reasonable time/space.
>
>Linear congruential generators have that property. Since they are of
>the form x' <- (ax + b) mod m, you get a repeat only when the prng
>loops. So a long-period will generate numbers without repeats.

For shuffling, LCGs are no good, because they do not have enough state.
Say your LCG has N bits of state. You are trying to produce an
unbiased shuffle of K items. Since 2^N does not divide K!, there is bias.

For example, consider an LCG with 15 bits of state. You are trying to
shuffle 7 items. You have just put 32768 items (the 32768 possible
outputs of the LCG) into 5040 boxes (the shuffles). The best possible
distribution will assign 6 random seeds to some permutations and 7
seeds to some others; this means that even in the optimal case some
permuations will come up 17% more often than others. With this much
bias, it is not clear that it is even worth using an 'unbiased'
algorithm.

A 15-bit LCG is not random enough to shuffle more than about 5 items
effectively. A 64-bit LCG will suffice to shuffle a list of up to 12
items. As the size of the list K increases, the amount of state
necessary in the LCG to acheave a reasonably unbiased shuffle
increases as O(K log K).

Joachim Durchholz

unread,
Sep 1, 2001, 12:42:26 PM9/1/01
to

That's not really a disproof. It still depends on how often the random
comparison is applied to what elements.
I don't think that such abstract arguments would help much. Somebody
would have to either go through the gory details of analysing the effect
of a random sort order on a concrete sort algorithm, or be a genius and
give a *strict* argument why sorting by "random order" is a good or a
bad idea.

A better idea might be to apply the generate-next-random-number
algorithm to all numbers and use these numbers to establish a sort
order.
As Jon has pointed out, if you're truly worried about the quality of the
randomness, you'd have to use a PRNG with more bits of state than each
number alone can provide.
(Hope you can sort this out, it's not too clear what I'm writing here
but I don't have the time to reword this once again.)

Al Christians

unread,
Sep 1, 2001, 12:58:35 PM9/1/01
to
Mattias Waldau wrote:
>
> Code in Ocaml:
>
> let _ = Random.self_init ()
>
> (* shuffle the array in place *)
> let shuffle (a:'a array) : unit =
> let cmp a b = (Random.int 3) - 1 in
> Array.sort cmp a
>

If you check on this to see what the distribution of results is,
you will almost certainly find that it is very non-random looking.
Running with the original spec (1600 elements to sort, initially
arranged 0 to 1599), I find that the array returned has the first
elements too high by about 10%, and the last elements too low by
about 98%.

Look at the last two or three elements in the arrays returned below,
and you can see that this effect shows in your results, too:


> # permutated_int 10;;
> - : int array = [|3; 5; 9; 4; 1; 6; 8; 7; 2; 0|]
> # permutated_int 20;;
> - : int array =
> [|11; 18; 3; 14; 2; 8; 7; 17; 9; 10; 5; 16; 13; 1; 15; 19; 12; 6; 4; 0|]

> [|9; 4; 17; 5; 14; 18; 7; 6; 11; 15; 19; 10; 13; 12; 16; 3; 8; 2; 1; 0|]
> # permutated_int 20;;
> - : int array =
> [|4; 8; 16; 7; 17; 6; 10; 14; 15; 18; 11; 19; 9; 3; 0; 5; 13; 2; 1; 12|]

Al

Neelakantan Krishnaswami

unread,
Sep 2, 2001, 10:07:48 AM9/2/01
to
On Sat, 01 Sep 2001 16:01:52 GMT, Mark Jason Dominus <m...@plover.com> wrote:
>In article <slrn9p1il1...@brick.cswv.com>,
>Neelakantan Krishnaswami <ne...@alum.mit.edu> wrote:
>>> Then the difficulty is to generate n
>>> /distinct/ random numbers in reasonable time/space.
>>
>>Linear congruential generators have that property. Since they are of
>>the form x' <- (ax + b) mod m, you get a repeat only when the prng
>>loops. So a long-period will generate numbers without repeats.
>
> For shuffling, LCGs are no good, because they do not have enough state.
> Say your LCG has N bits of state. You are trying to produce an
> unbiased shuffle of K items. Since 2^N does not divide K!, there is bias.

Damn, you're right -- I should have thought of that. Thanks for
pointing out my error!


Neel

Jeffrey M. Vinocur

unread,
Sep 2, 2001, 11:57:31 AM9/2/01
to
In article <wf66b3r...@calligramme.cl.cam.ac.uk>,

Well, you can check for repeats in n log n time, and repeat if
necessary. So you don't get a guaranteed time bound, but you get
guaranteed correctness and an infintesimal chance of repeating
more than a constant number of times.

--
Jeffrey M. Vinocur * jm...@cornell.edu
http://www.people.cornell.edu/pages/jmv16/

christian adam hresko

unread,
Sep 2, 2001, 1:40:05 PM9/2/01
to

Ryan Tarpine wrote:

> What's the easiest way to get a shuffled list or array of a sequence of
> numbers? e.g. I need 0 through 1599 in random order, each number appearing
> exactly once. Ocaml if possible. Thanks in advance
>

i'm still new at functional programming, so here's a C++ algorithm. you'll
need a better way to seed the random variable though...


#include <iostream>

using namespace std;

void shuffle (int []);

int main()
{
int arr[1600] = {0};
shuffle(arr);

return 0;
}

/*
* I need 0 through 1599 in random order, each number appearing exactly
once. Ocaml if possible.
*/

void shuffle (int shuffarr[])
{
int rnum;
for (int i = 1; i <= 1600; i++) {
do {
rnum = rand()%1599;
} while (shuffarr[rnum] != 0);
shuffarr[rnum] = i - 1;
}
}


hope this helps a little.


cheers,

christian


Mark Jason Dominus

unread,
Sep 2, 2001, 2:26:05 PM9/2/01
to
In article <9mr35p$3lci8$1...@ID-9852.news.dfncis.de>,

Joachim Durchholz <joachim....@gmx.de> wrote:
>> No, you can prove by the same logic as the proof previously presented
>> that it cannot be random in all cases. The sort decision in the
>> algorithm posted is binary, with equal probability that elements will
>> compare one way or the other. So, the probability of any result must
>> be a sum of powers of 1/2. If the number of elements in the list is
>> not a power of 2, then some results must be more likely than others.
>
>That's not really a disproof. It still depends on how often the random
>comparison is applied to what elements.

It does not, as long as the total number of comparisons is finite.
There is no way to add up a finite number of fractions of the form
2^(-k) and get 1/6.

>Somebody would have to either go through the gory details of
>analysing the effect of a random sort order on a concrete sort
>algorithm,

Fine. I'll use the following sort algorithm for three elements a,b,c:

Compare a and b.
If a<b, compare b and c.
If b<c, you are done: a<b<c
If c<b, compare a and c.
If a<c, you are done: a<c<b
If c<a, you are done: c<a<b
If b<a, compare b and c.
If b<c, compare a and c.
If a<c, you are done: b<a<c
If c<a, you are done: b<c<a
If c<b, you are done: c<b<a

If each 'comparison' is done at random with a 1/2 probability of going
either way, then the outcome is:

a<b<c 1/4
a<c<b 1/8
c<a<b 1/8
b<a<c 1/8
b<c<a 1/8
c<b<a 1/4

Some of the orderings appear twice as often as others.

No sorting algorithm can do better than this in general, and most will
do exactly this. For instance, a straightforward implementation of
quicksort produces a table just like this one when it is run on a
3-element set. The pivot element is chosen. It is compared with the
other two elements. Half the time it turns out to be the middle
element and no more comparisons are done. The other half the time it
is either the largest or smallest, and the remaining two items must be
compared.

You can already see that something is wrong, because the pivot element
should be the middle element 1/3 of the time, not 1/2 of the time.

It's now trivial to prove that quicksort with a fair random comparison
function on any number of elements produces a biased shuffle. The
pivot element might be larger than all but three of the remaining
elements, and in that case the remaining three elements will be
recursively sorted with quicksort, which we've already seen will order
them in a biased way.

You can fix this by adjusting the random comparison function so that
it doesn't always choose between a<b and b<a equiprobably. But to do
that probably turns out to be equivalent to the simpler well-known
algorithms like Fischer-Yates.

>A better idea might be to apply the generate-next-random-number
>algorithm to all numbers and use these numbers to establish a sort
>order.

That works, if there is enough entropy in the PRNG. Fischer-Yates
will be more efficient, if your language has arrays.