2 views

Skip to first unread message

Aug 30, 2001, 3:32:41 PM8/30/01

to

What's the easiest way to get a shuffled list or array of a sequence of

numbers? e.g. I need 0 through 1599 in random order, each number appearing

exactly once. Ocaml if possible. Thanks in advance

numbers? e.g. I need 0 through 1599 in random order, each number appearing

exactly once. Ocaml if possible. Thanks in advance

--

Ryan Tarpine, rtar...@hotmail.com

"Technology is a queer thing. It brings you great gifts

in one hand, and it stabs you in the back with the other."

-- C.P. Snow

Aug 30, 2001, 4:07:48 PM8/30/01

to

Ryan Tarpine <rtar...@hotmail.com> wrote:

> What's the easiest way to get a shuffled list or array of a sequence

of

> numbers? e.g. I need 0 through 1599 in random order, each number

appearing

> exactly once. Ocaml if possible. Thanks in advance

> What's the easiest way to get a shuffled list or array of a sequence

of

> numbers? e.g. I need 0 through 1599 in random order, each number

appearing

> exactly once. Ocaml if possible. Thanks in advance

Easiest would be:

Put them in an array.

for i from 1 to N:

exchange the i'th element with a randomly chosen one

(exchanging it with itself resulting in a no-op)

(Sorry no OCaml. Should be easy enough to translate though.)

This means ca. 2N writes, on the average. Knuth's Art of Computer

Programming might have more efficient algorithms, but I can't imagine

one right now.

The algorithm is also very imperative; I'm not sure that it's possible

to write a functional version that matches both speed and efficiency.

(I'd be happy to see such a function.)

Regards,

Joachim

--

This is not an official statement from my employer.

Aug 30, 2001, 4:15:36 PM8/30/01

to

On Thu, 30 Aug 2001, Ryan Tarpine wrote:

> What's the easiest way to get a shuffled list or array of a sequence of

> numbers?

> What's the easiest way to get a shuffled list or array of a sequence of

> numbers?

Post to a newsgroup and ask for help. ;-)

> e.g. I need 0 through 1599 in random order, each number appearing

> exactly once. Ocaml if possible. Thanks in advance

let swap a i j =

let tmp = Array.get a i in

(Array.set a i (Array.get a j);

Array.set a j tmp)

let shuffled_array n =

let a = Array.init n (fun i -> i) in

(Random.init n;

for i = 0 to (n - 1) do

swap a i (Random.int n);

done;

a)

let a = shuffled_array 1600

Not really a very functional program, I admit, and probably not the best,

but I work best for money or love.

-- Brian

Aug 30, 2001, 4:49:27 PM8/30/01

to

Brian Rogoff wrote:

>

> Post to a newsgroup and ask for help. ;-)

>

> > e.g. I need 0 through 1599 in random order, each number appearing

> > exactly once. Ocaml if possible. Thanks in advance

>

> let swap a i j =

> let tmp = Array.get a i in

> (Array.set a i (Array.get a j);

> Array.set a j tmp)

>

> let shuffled_array n =

> let a = Array.init n (fun i -> i) in

> (Random.init n;

> for i = 0 to (n - 1) do

> swap a i (Random.int n);

> done;

> a)

>

> let a = shuffled_array 1600

>

> Not really a very functional program, I admit, and probably not the

> best, but I work best for money or love.

>

>

> Post to a newsgroup and ask for help. ;-)

>

> > e.g. I need 0 through 1599 in random order, each number appearing

> > exactly once. Ocaml if possible. Thanks in advance

>

> let swap a i j =

> let tmp = Array.get a i in

> (Array.set a i (Array.get a j);

> Array.set a j tmp)

>

> let shuffled_array n =

> let a = Array.init n (fun i -> i) in

> (Random.init n;

> for i = 0 to (n - 1) do

> swap a i (Random.int n);

> done;

> a)

>

> let a = shuffled_array 1600

>

> Not really a very functional program, I admit, and probably not the

> best, but I work best for money or love.

>

I believe that you will get a better randommess if you make that:

swap a i (i + (Random.int (n-i));

Al

Aug 30, 2001, 6:08:10 PM8/30/01

to

Al Christians wrote:

> Brian Rogoff wrote:

>> ...

>> let swap a i j =

>> let tmp = Array.get a i in

>> (Array.set a i (Array.get a j);

>> Array.set a j tmp)

>>

>> let shuffled_array n =

>> let a = Array.init n (fun i -> i) in

>> (Random.init n;

>> for i = 0 to (n - 1) do

>> swap a i (Random.int n);

>> done;

>> a)

>>

>> let a = shuffled_array 1600

>> ...

>

> I believe that you will get a better randommess if you make that:

>

> swap a i (i + (Random.int (n-i));

> ...

Thanks to both of you. Just what I needed.

Aug 30, 2001, 6:47:59 PM8/30/01

to

Ryan Tarpine wrote:

>

> Thanks to both of you. Just what I needed.

>

>

> Thanks to both of you. Just what I needed.

>

If you get an A, send us each 15%.

Al

Aug 30, 2001, 7:03:09 PM8/30/01

to

Hi Al,

Al Christians wrote:

But it amounts to slightly less than a... newline! Furthermore your

mailer would strip it as a white space :-(.

Alexander

Aug 30, 2001, 7:59:52 PM8/30/01

to

Joachim Durchholz <joac...@gmx.de> writes:

> Easiest would be:

> Put them in an array.

> for i from 1 to N:

> exchange the i'th element with a randomly chosen one

> (exchanging it with itself resulting in a no-op)

I don't have the proof handy, but you can prove that swapping each element

with a random element is more likely to produce some final orderings than

others. You instead want to swap each element with a randomly chosen

element between it and the end of the list.

--

Russ Allbery (r...@stanford.edu) <http://www.eyrie.org/~eagle/>

Aug 31, 2001, 4:27:52 AM8/31/01

to

Russ Allbery <r...@stanford.edu> wrote:

> Joachim Durchholz <joac...@gmx.de> writes:

>

> > Easiest would be:

> > Put them in an array.

> > for i from 1 to N:

> > exchange the i'th element with a randomly chosen one

> > (exchanging it with itself resulting in a no-op)

>

> I don't have the proof handy, but you can prove that swapping each

element

> with a random element is more likely to produce some final orderings

than

> others.

> Joachim Durchholz <joac...@gmx.de> writes:

>

> > Easiest would be:

> > Put them in an array.

> > for i from 1 to N:

> > exchange the i'th element with a randomly chosen one

> > (exchanging it with itself resulting in a no-op)

>

> I don't have the proof handy, but you can prove that swapping each

element

> with a random element is more likely to produce some final orderings

than

> others.

I don't know. My first approach was to swap randomly chosen elements,

for N times, but I wasn't sure that that wouldn't create a bias.

> You instead want to swap each element with a randomly chosen

> element between it and the end of the list.

I do have a proof (well, sketch of it) that this algorithm will work.

It is by induction:

For one element, the algorithm will be trivially correct.

For N elements, there's an 1/N chance that the algorithm will swap the

first element with itself, leaving the others alone.

Further there's an 1/N chance that the algorithm will swap the first

element with any given other element O. This gives O an 1/N chance to

finish on position 1, and an (N-1)/N chance to end up on any other

position, at which is was placed with an 1/(N-1) probability in the

previous round by induction, which gives an (N-1)/N*1/(N-1) = 1/N

probability, q.e.d.

Conclusion: I don't really know whether my algorithm is correct, but I

can't prove it and I *can* prove yours, so your algorithm is

preferrable.

Aug 31, 2001, 6:25:01 AM8/31/01

to

"Joachim Durchholz" <joac...@gmx.de> writes:

> Ryan Tarpine <rtar...@hotmail.com> wrote:

> > What's the easiest way to get a shuffled list or array of a

> > sequence of numbers? e.g. I need 0 through 1599 in random order,

> > each number appearing exactly once. Ocaml if possible. Thanks in

> > advance

> Easiest would be:

in Haskell, (sorry)

factorial n = product [1..n]

premutations l = left as an exercise

then

do r <- newStdGen -- start off a random generator

let (index, nextgen) = randomIvalInteger (0, factorial 1600 - 1) r

-- generate a random integer between 0 and 1600!

return (permutations [0..1599] !! index)

gives you what you want

> The algorithm is also very imperative;

My algorithem is quite functional. Might be a bit slow, though ;-)

--

Jón Fairbairn Jon.Fa...@cl.cam.ac.uk

Aug 31, 2001, 6:44:32 AM8/31/01

to

Jón Fairbairn <j...@cl.cam.ac.uk> wrote:

> premutations l = left as an exercise

>

> then

>

> do r <- newStdGen -- start off a random generator

> let (index, nextgen) = randomIvalInteger (0, factorial 1600 - 1) r

> -- generate a random integer between 0 and 1600!

> return (permutations [0..1599] !! index)

>

> gives you what you want

>

> > The algorithm is also very imperative;

>

> My algorithem is quite functional. Might be a bit slow, though ;-)

> premutations l = left as an exercise

>

> then

>

> do r <- newStdGen -- start off a random generator

> let (index, nextgen) = randomIvalInteger (0, factorial 1600 - 1) r

> -- generate a random integer between 0 and 1600!

> return (permutations [0..1599] !! index)

>

> gives you what you want

>

> > The algorithm is also very imperative;

>

> My algorithem is quite functional. Might be a bit slow, though ;-)

Yup, it's O(N!) in time and space, since permutations is generating all

of them. I don't think that current-day computers can do that with 1600

items.

Is there a functional, efficient algorithm for generating a random

permutation?

Aug 31, 2001, 7:05:19 AM8/31/01

to

"Joachim Durchholz" <joac...@gmx.de> writes:

> > do r <- newStdGen -- start off a random generator

> > let (index, nextgen) = randomIvalInteger (0, factorial 1600 - 1) r

> > -- generate a random integer between 0 and 1600!

> > return (permutations [0..1599] !! index)

> Yup, it's O(N!) in time and space, since permutations is generating all

> of them.

Uh, doesn't laziness turn it into an O(n) space thing? So unless I'm

mistaken, you're left with only O(n!) time complexity. Might still be

enough to put some people off, though.

> Is there a functional, efficient algorithm for generating a random

> permutation?

Uh, given a list of elements, draw a random one and add to an

accumulator list until none are left? Gives you O(n^2), but that's

because you have to traverse lists - and the constant factor is low.

And space is O(n).

-kzm

--

If I haven't seen further, it is by standing in the footprints of giants

Aug 31, 2001, 7:08:23 AM8/31/01

to

In article <9mnhqf$3649q$1...@ID-9852.news.dfncis.de>,

Joachim Durchholz <joachim....@gmx.de> wrote:

>Russ Allbery <r...@stanford.edu> wrote:

>> I don't have the proof handy, but you can prove that swapping each

>element

>> with a random element is more likely to produce some final orderings

>than

>> others.

>

Joachim Durchholz <joachim....@gmx.de> wrote:

>Russ Allbery <r...@stanford.edu> wrote:

>> I don't have the proof handy, but you can prove that swapping each

>element

>> with a random element is more likely to produce some final orderings

>than

>> others.

>

>Conclusion: I don't really know whether my algorithm is correct, but I

>can't prove it and I *can* prove yours, so your algorithm is

>preferrable.

>can't prove it and I *can* prove yours, so your algorithm is

>preferrable.

I can prove that yours is wrong. Let N=3. Your algorithm generates

three random numbers in the course of its execution. Each of these

random numbers is between 1 and 3. The result, R, is a function of

these three random numbers and nothing else.

There are 27 possible triples of random numbers (1,1,1) .. (3,3,3),

and 6 possible results R. Since 27 is not an exact multiple of 6,

your algorithm must select some results more often than others.

In general, N^N is not a multiple of N! unless N<=2, so your algorithm

is biased except when N<=2.

--

@P=split//,".URRUU\c8R";@d=split//,"\nrekcah xinU / lreP rehtona tsuJ";sub p{

@p{"r$p","u$p"}=(P,P);pipe"r$p","u$p";++$p;($q*=2)+=$f=!fork;map{$P=$P[$f^ord

($p{$_})&6];$p{$_}=/ ^$P/ix?$P:close$_}keys%p}p;p;p;p;p;map{$p{$_}=~/^[P.]/&&

close$_}%p;wait until$?;map{/^r/&&<$_>}%p;$_=$d[$q];sleep rand(2)if/\S/;print

Aug 31, 2001, 6:21:10 AM8/31/01

to

you can have an O(n log n) by using a tree instead of a list.

--

Rémi Vanicat

van...@labri.u-bordeaux.fr

http://dept-info.labri.u-bordeaux.fr/~vanicat

Aug 31, 2001, 7:58:46 AM8/31/01

to

Ketil Z Malde <ke...@ii.uib.no> wrote:

> "Joachim Durchholz" <joac...@gmx.de> writes:

>

> > > do r <- newStdGen -- start off a random generator

> > > let (index, nextgen) = randomIvalInteger (0, factorial 1600 -

1) r

> > > -- generate a random integer between 0 and 1600!

> > > return (permutations [0..1599] !! index)

>

> > Yup, it's O(N!) in time and space, since permutations is generating

all

> > of them.

>

> Uh, doesn't laziness turn it into an O(n) space thing?

> "Joachim Durchholz" <joac...@gmx.de> writes:

>

> > > do r <- newStdGen -- start off a random generator

> > > let (index, nextgen) = randomIvalInteger (0, factorial 1600 -

1) r

> > > -- generate a random integer between 0 and 1600!

> > > return (permutations [0..1599] !! index)

>

> > Yup, it's O(N!) in time and space, since permutations is generating

all

> > of them.

>

> Uh, doesn't laziness turn it into an O(n) space thing?

Hmm... I'd think this heavily depends on the details of the

implementation of permutations.

Am I guessing right here?

> So unless I'm

> mistaken, you're left with only O(n!) time complexity. Might still be

> enough to put some people off, though.

Are you sure you meant O(n!) here? That's one of the worst complexities

that one encounters in practice!

> > Is there a functional, efficient algorithm for generating a random

> > permutation?

>

> Uh, given a list of elements, draw a random one and add to an

> accumulator list until none are left?

For example.

> Gives you O(n^2), but that's

> because you have to traverse lists - and the constant factor is low.

O(N^2) isn't "efficient" if a linear one exists. The worst I'd accept is

O(N log N) (and when it comes to log N factors, I'm more concerned about

constants than big-O complexity).

One could use a tree that stores the not-yet-assigned numbers. We'd have

to store the size of each subtree in its root to guide the random-number

generator. And, since we're at it, we can dispose of the actual numbers,

since the number generated can be inferred from the sequence of

left-right decisions of the RNG, leaving us with a tree where each node

stores the number of unused numbers in some power-of-two-sized interval.

This would have a space and time complexity of O(N log N), but with a

relatively high constant overhead.

> And space is O(n).

Agreed.

Aug 31, 2001, 6:45:08 PM8/31/01

to

"Joachim Durchholz" <joac...@gmx.de> writes:

> Ketil Z Malde <ke...@ii.uib.no> wrote:

> > "Joachim Durchholz" <joac...@gmx.de> writes:

> >

[I wrote]

> > > > do r <- newStdGen -- start off a random generator

> > > > let (index, nextgen) = randomIvalInteger (0, factorial 1600 -

> 1) r

> > > > -- generate a random integer between 0 and 1600!

> > > > return (permutations [0..1599] !! index)

> >

> > > Yup, it's O(N!) in time and space, since permutations is generating

> all

> > > of them.

> >

> > Uh, doesn't laziness turn it into an O(n) space thing?

>

> Hmm... I'd think this heavily depends on the details of the

> implementation of permutations.

I think so.

> Am I guessing right here?

>

> > So unless I'm

> > mistaken, you're left with only O(n!) time complexity. Might still be

> > enough to put some people off, though.

>

> Are you sure you meant O(n!) here?

He did. I'm not sure that people have fully appreciated the

implications of the smiley in my original post!

--

Jón Fairbairn Jon.Fa...@cl.cam.ac.uk

Sep 1, 2001, 5:17:40 AM9/1/01

to

I would use sort to shuffle, and have a compare function that

is random. You will get a n log n shuffle using this technique.

is random. You will get a n log n shuffle using this technique.

Code in Ocaml:

let _ = Random.self_init ()

(* shuffle the array in place *)

let shuffle (a:'a array) : unit =

let cmp a b = (Random.int 3) - 1 in

Array.sort cmp a

(* return a permuted array of ints of size l *)

let permutated_int (l:int) : int array =

let a = Array.init l (fun ii -> ii) in

shuffle a;

a

and run you get

# permutated_int 10;;

- : int array = [|3; 5; 9; 4; 1; 6; 8; 7; 2; 0|]

# permutated_int 20;;

- : int array =

[|11; 18; 3; 14; 2; 8; 7; 17; 9; 10; 5; 16; 13; 1; 15; 19; 12; 6; 4; 0|]

# permutated_int 20;;

- : int array =

[|9; 4; 17; 5; 14; 18; 7; 6; 11; 15; 19; 10; 13; 12; 16; 3; 8; 2; 1; 0|]

# permutated_int 20;;

- : int array =

[|4; 8; 16; 7; 17; 6; 10; 14; 15; 18; 11; 19; 9; 3; 0; 5; 13; 2; 1; 12|]

#

--

Mattias Waldau, http://www.abc.se/~m10217/

Sep 1, 2001, 6:13:30 AM9/1/01

to

Mattias Waldau <mat...@localhost.localdomain> writes:

> I would use sort to shuffle, and have a compare function that

> is random. You will get a n log n shuffle using this technique.

Argh! There is a risk that the sort function will loop, or perform an

"array out of bounds" error. All sorting algorithms rely on the comparison

function to be a total order, which is certainly not the case for yours:

> let cmp a b = (Random.int 3) - 1

Better not do this -- unless you're competing for the "best abuse of a

library function" contest :-)

What you could do, however, is associate random integers to each element of

the array to be shuffled, then sort the (elt, random) pairs by increasing

"random" components. That should give you a good shuffle of the "elt"

components, in n log n time.

- Xavier Leroy

--

Valid e-mail address (without the underscores): Xavier.Leroy@i_n_r_i_a.f_r

This is a protection against junk mail. Apologies for the inconvenience.

Home page: http://pauillac.inria.fr/~xleroy/

Sep 1, 2001, 6:47:53 AM9/1/01

to

Mattias Waldau <mat...@localhost.localdomain> wrote:

> I would use sort to shuffle, and have a compare function that

> is random. You will get a n log n shuffle using this technique.

> I would use sort to shuffle, and have a compare function that

> is random. You will get a n log n shuffle using this technique.

Yeah... but can you prove that the distribution is truly random?

You'd have to prove that the sort algorithm, whatever it is, will place

an element in any array slot with equal probability if the comparison

operation is random. Quicksort is optimized for other purposes, so I

doubt it will have that property. OTOH if it does work, I suspect that

it will work for any sort algorithm.

Food for thought for algorithm researchers :)

Sep 1, 2001, 7:25:21 AM9/1/01

to

Xavier...@see.my.sig.for.address (Xavier Leroy) writes:

> What you could do, however, is associate random integers to each element of

> the array to be shuffled, then sort the (elt, random) pairs by increasing

> "random" components. That should give you a good shuffle of the "elt"

> components, in n log n time.

That's what I'd do, though you need to make sure that the randoms are

all distinct, otherwise you get the n^n instead of n! problem

mentioned earlier in the thread in connexion with one of the more

imperative algorithms. Then the difficulty is to generate n /distinct/

random numbers in reasonable time/space.

--

Jón Fairbairn Jon.Fa...@cl.cam.ac.uk

Sep 1, 2001, 7:34:51 AM9/1/01

to

Linear congruential generators have that property. Since they are of

the form x' <- (ax + b) mod m, you get a repeat only when the prng

loops. So a long-period will generate numbers without repeats.

This is occasional justification for using them -- I can't think of

any other situation in which you would want to.

Neel

Sep 1, 2001, 7:45:32 AM9/1/01

to

j...@cl.cam.ac.uk (=?iso-8859-1?q?J=F3n?= Fairbairn) writes:

> That's what I'd do, though you need to make sure that the randoms are

> all distinct, otherwise you get the n^n instead of n! problem

> mentioned earlier in the thread in connexion with one of the more

> imperative algorithms. Then the difficulty is to generate n /distinct/

> random numbers in reasonable time/space.

As someone else responded, if your PRNG has a large period, this won't

happen. In the general case, it's also improbable to get two identical

random numbers if your sequence is short enough. E.g. assuming 64-bit

random numbers, collisions are unlikely if the sequence to shuffle has

much less than 2^32 elements, which should often be the case :-)

Sep 1, 2001, 8:33:19 AM9/1/01

to

Xavier...@see.my.sig.for.address (Xavier Leroy) writes:

> In the general case, it's also improbable to get two identical

> random numbers if your sequence is short enough. E.g. assuming 64-bit

> random numbers, collisions are unlikely if the sequence to shuffle has

> much less than 2^32 elements, which should often be the case :-)

Well, yes, that's true. But it doesn't /feel/ right to rely on

it, somehow!

--

Jón Fairbairn Jon.Fa...@cl.cam.ac.uk

Sep 1, 2001, 9:48:13 AM9/1/01

to

"Joachim Durchholz" <joac...@gmx.de> writes:

> operation is random. Quicksort is optimized for other purposes, so I

> doubt it will have that property. OTOH if it does work, I suspect that

> it will work for any sort algorithm.

> operation is random. Quicksort is optimized for other purposes, so I

> doubt it will have that property. OTOH if it does work, I suspect that

> it will work for any sort algorithm.

Quicksort and mergesort are the sort algoritms that I mostly believe will

make a good shuffle.

In quicksort, the call to partition will take an element at a time

and decide if it should be put into the left or to the right subpart, then

this will be repeated recursively for the subparts.

In mergesort, the merger will either to take the first element

from the left or the right subpart, and then continue to recurse.

Regarding Xavier Leroy fear that the sorting library might crash

if compare isn't a total ordering, of course I cannot argue since

Xavier Leroy probably is the one who implemented the sorting in Ocaml.

A good and pure sorting functions should never be able to detect that

the compare function is cheating, since the only way to to that

would be to compare the same two elements twice. That compare is a

total ordering is only needed in order to be sure that the result

is sorted.

Sep 1, 2001, 12:07:12 PM9/1/01

to

Joachim Durchholz wrote:

>

>

> Yeah... but can you prove that the distribution is truly random?

>

No, you can prove by the same logic as the proof previously presented>

>

> Yeah... but can you prove that the distribution is truly random?

>

that it cannot be random in all cases. The sort decision in the

algorithm posted is binary, with equal probability that elements will

compare one way or the other. So, the probability of any result must

be a sum of powers of 1/2. If the number of elements in the list is

not a power of 2, then some results must be more likely than others.

Al

Sep 1, 2001, 12:01:52 PM9/1/01

to

In article <slrn9p1il1...@brick.cswv.com>,

Neelakantan Krishnaswami <ne...@alum.mit.edu> wrote:

>> Then the difficulty is to generate n

>> /distinct/ random numbers in reasonable time/space.

>

>Linear congruential generators have that property. Since they are of

>the form x' <- (ax + b) mod m, you get a repeat only when the prng

>loops. So a long-period will generate numbers without repeats.

Neelakantan Krishnaswami <ne...@alum.mit.edu> wrote:

>> Then the difficulty is to generate n

>> /distinct/ random numbers in reasonable time/space.

>

>Linear congruential generators have that property. Since they are of

>the form x' <- (ax + b) mod m, you get a repeat only when the prng

>loops. So a long-period will generate numbers without repeats.

For shuffling, LCGs are no good, because they do not have enough state.

Say your LCG has N bits of state. You are trying to produce an

unbiased shuffle of K items. Since 2^N does not divide K!, there is bias.

For example, consider an LCG with 15 bits of state. You are trying to

shuffle 7 items. You have just put 32768 items (the 32768 possible

outputs of the LCG) into 5040 boxes (the shuffles). The best possible

distribution will assign 6 random seeds to some permutations and 7

seeds to some others; this means that even in the optimal case some

permuations will come up 17% more often than others. With this much

bias, it is not clear that it is even worth using an 'unbiased'

algorithm.

A 15-bit LCG is not random enough to shuffle more than about 5 items

effectively. A 64-bit LCG will suffice to shuffle a list of up to 12

items. As the size of the list K increases, the amount of state

necessary in the LCG to acheave a reasonably unbiased shuffle

increases as O(K log K).

Sep 1, 2001, 12:42:26 PM9/1/01

to

That's not really a disproof. It still depends on how often the random

comparison is applied to what elements.

I don't think that such abstract arguments would help much. Somebody

would have to either go through the gory details of analysing the effect

of a random sort order on a concrete sort algorithm, or be a genius and

give a *strict* argument why sorting by "random order" is a good or a

bad idea.

A better idea might be to apply the generate-next-random-number

algorithm to all numbers and use these numbers to establish a sort

order.

As Jon has pointed out, if you're truly worried about the quality of the

randomness, you'd have to use a PRNG with more bits of state than each

number alone can provide.

(Hope you can sort this out, it's not too clear what I'm writing here

but I don't have the time to reword this once again.)

Sep 1, 2001, 12:58:35 PM9/1/01

to

Mattias Waldau wrote:

>

> Code in Ocaml:

>

> let _ = Random.self_init ()

>

> (* shuffle the array in place *)

> let shuffle (a:'a array) : unit =

> let cmp a b = (Random.int 3) - 1 in

> Array.sort cmp a

>

>

> Code in Ocaml:

>

> let _ = Random.self_init ()

>

> (* shuffle the array in place *)

> let shuffle (a:'a array) : unit =

> let cmp a b = (Random.int 3) - 1 in

> Array.sort cmp a

>

If you check on this to see what the distribution of results is,

you will almost certainly find that it is very non-random looking.

Running with the original spec (1600 elements to sort, initially

arranged 0 to 1599), I find that the array returned has the first

elements too high by about 10%, and the last elements too low by

about 98%.

Look at the last two or three elements in the arrays returned below,

and you can see that this effect shows in your results, too:

> # permutated_int 10;;

> - : int array = [|3; 5; 9; 4; 1; 6; 8; 7; 2; 0|]

> # permutated_int 20;;

> - : int array =

> [|11; 18; 3; 14; 2; 8; 7; 17; 9; 10; 5; 16; 13; 1; 15; 19; 12; 6; 4; 0|]

> [|9; 4; 17; 5; 14; 18; 7; 6; 11; 15; 19; 10; 13; 12; 16; 3; 8; 2; 1; 0|]

> # permutated_int 20;;

> - : int array =

> [|4; 8; 16; 7; 17; 6; 10; 14; 15; 18; 11; 19; 9; 3; 0; 5; 13; 2; 1; 12|]

Al

Sep 2, 2001, 10:07:48 AM9/2/01

to

On Sat, 01 Sep 2001 16:01:52 GMT, Mark Jason Dominus <m...@plover.com> wrote:

>In article <slrn9p1il1...@brick.cswv.com>,

>Neelakantan Krishnaswami <ne...@alum.mit.edu> wrote:

>>> Then the difficulty is to generate n

>>> /distinct/ random numbers in reasonable time/space.

>>

>>Linear congruential generators have that property. Since they are of

>>the form x' <- (ax + b) mod m, you get a repeat only when the prng

>>loops. So a long-period will generate numbers without repeats.

>

> For shuffling, LCGs are no good, because they do not have enough state.

> Say your LCG has N bits of state. You are trying to produce an

> unbiased shuffle of K items. Since 2^N does not divide K!, there is bias.

>In article <slrn9p1il1...@brick.cswv.com>,

>Neelakantan Krishnaswami <ne...@alum.mit.edu> wrote:

>>> Then the difficulty is to generate n

>>> /distinct/ random numbers in reasonable time/space.

>>

>>Linear congruential generators have that property. Since they are of

>>the form x' <- (ax + b) mod m, you get a repeat only when the prng

>>loops. So a long-period will generate numbers without repeats.

>

> For shuffling, LCGs are no good, because they do not have enough state.

> Say your LCG has N bits of state. You are trying to produce an

> unbiased shuffle of K items. Since 2^N does not divide K!, there is bias.

Damn, you're right -- I should have thought of that. Thanks for

pointing out my error!

Neel

Sep 2, 2001, 11:57:31 AM9/2/01

to

In article <wf66b3r...@calligramme.cl.cam.ac.uk>,

Well, you can check for repeats in n log n time, and repeat if

necessary. So you don't get a guaranteed time bound, but you get

guaranteed correctness and an infintesimal chance of repeating

more than a constant number of times.

--

Jeffrey M. Vinocur * jm...@cornell.edu

http://www.people.cornell.edu/pages/jmv16/

Sep 2, 2001, 1:40:05 PM9/2/01

to

Ryan Tarpine wrote:

> What's the easiest way to get a shuffled list or array of a sequence of

> numbers? e.g. I need 0 through 1599 in random order, each number appearing

> exactly once. Ocaml if possible. Thanks in advance

>

i'm still new at functional programming, so here's a C++ algorithm. you'll

need a better way to seed the random variable though...

#include <iostream>

using namespace std;

void shuffle (int []);

int main()

{

int arr[1600] = {0};

shuffle(arr);

return 0;

}

/*

* I need 0 through 1599 in random order, each number appearing exactly

once. Ocaml if possible.

*/

void shuffle (int shuffarr[])

{

int rnum;

for (int i = 1; i <= 1600; i++) {

do {

rnum = rand()%1599;

} while (shuffarr[rnum] != 0);

shuffarr[rnum] = i - 1;

}

}

hope this helps a little.

cheers,

christian

Sep 2, 2001, 2:26:05 PM9/2/01

to

In article <9mr35p$3lci8$1...@ID-9852.news.dfncis.de>,

Joachim Durchholz <joachim....@gmx.de> wrote:

>> No, you can prove by the same logic as the proof previously presented

>> that it cannot be random in all cases. The sort decision in the

>> algorithm posted is binary, with equal probability that elements will

>> compare one way or the other. So, the probability of any result must

>> be a sum of powers of 1/2. If the number of elements in the list is

>> not a power of 2, then some results must be more likely than others.

>

>That's not really a disproof. It still depends on how often the random

>comparison is applied to what elements.

Joachim Durchholz <joachim....@gmx.de> wrote:

>> No, you can prove by the same logic as the proof previously presented

>> that it cannot be random in all cases. The sort decision in the

>> algorithm posted is binary, with equal probability that elements will

>> compare one way or the other. So, the probability of any result must

>> be a sum of powers of 1/2. If the number of elements in the list is

>> not a power of 2, then some results must be more likely than others.

>

>That's not really a disproof. It still depends on how often the random

>comparison is applied to what elements.

It does not, as long as the total number of comparisons is finite.

There is no way to add up a finite number of fractions of the form

2^(-k) and get 1/6.

>Somebody would have to either go through the gory details of

>analysing the effect of a random sort order on a concrete sort

>algorithm,

Fine. I'll use the following sort algorithm for three elements a,b,c:

Compare a and b.

If a<b, compare b and c.

If b<c, you are done: a<b<c

If c<b, compare a and c.

If a<c, you are done: a<c<b

If c<a, you are done: c<a<b

If b<a, compare b and c.

If b<c, compare a and c.

If a<c, you are done: b<a<c

If c<a, you are done: b<c<a

If c<b, you are done: c<b<a

If each 'comparison' is done at random with a 1/2 probability of going

either way, then the outcome is:

a<b<c 1/4

a<c<b 1/8

c<a<b 1/8

b<a<c 1/8

b<c<a 1/8

c<b<a 1/4

Some of the orderings appear twice as often as others.

No sorting algorithm can do better than this in general, and most will

do exactly this. For instance, a straightforward implementation of

quicksort produces a table just like this one when it is run on a

3-element set. The pivot element is chosen. It is compared with the

other two elements. Half the time it turns out to be the middle

element and no more comparisons are done. The other half the time it

is either the largest or smallest, and the remaining two items must be

compared.

You can already see that something is wrong, because the pivot element

should be the middle element 1/3 of the time, not 1/2 of the time.

It's now trivial to prove that quicksort with a fair random comparison

function on any number of elements produces a biased shuffle. The

pivot element might be larger than all but three of the remaining

elements, and in that case the remaining three elements will be

recursively sorted with quicksort, which we've already seen will order

them in a biased way.

You can fix this by adjusting the random comparison function so that

it doesn't always choose between a<b and b<a equiprobably. But to do

that probably turns out to be equivalent to the simpler well-known

algorithms like Fischer-Yates.

>A better idea might be to apply the generate-next-random-number

>algorithm to all numbers and use these numbers to establish a sort

>order.

That works, if there is enough entropy in the PRNG. Fischer-Yates

will be more efficient, if your language has arrays.