--
Ryan Tarpine, rtar...@hotmail.com
"Technology is a queer thing. It brings you great gifts
in one hand, and it stabs you in the back with the other."
-- C.P. Snow
Easiest would be:
Put them in an array.
for i from 1 to N:
exchange the i'th element with a randomly chosen one
(exchanging it with itself resulting in a no-op)
(Sorry no OCaml. Should be easy enough to translate though.)
This means ca. 2N writes, on the average. Knuth's Art of Computer
Programming might have more efficient algorithms, but I can't imagine
one right now.
The algorithm is also very imperative; I'm not sure that it's possible
to write a functional version that matches it in both speed and efficiency.
(I'd be happy to see such a function.)
Regards,
Joachim
--
This is not an official statement from my employer.
Post to a newsgroup and ask for help. ;-)
> e.g. I need 0 through 1599 in random order, each number appearing
> exactly once. Ocaml if possible. Thanks in advance
let swap a i j =
  let tmp = Array.get a i in
  (Array.set a i (Array.get a j);
   Array.set a j tmp)

let shuffled_array n =
  let a = Array.init n (fun i -> i) in
  (Random.init n;
   for i = 0 to (n - 1) do
     swap a i (Random.int n);
   done;
   a)

let a = shuffled_array 1600
Not really a very functional program, I admit, and probably not the best,
but I work best for money or love.
-- Brian
I believe that you will get better randomness if you make that:
  swap a i (i + (Random.int (n - i)));
Al
> Brian Rogoff wrote:
>> ...
>> let swap a i j =
>>   let tmp = Array.get a i in
>>   (Array.set a i (Array.get a j);
>>    Array.set a j tmp)
>>
>> let shuffled_array n =
>>   let a = Array.init n (fun i -> i) in
>>   (Random.init n;
>>    for i = 0 to (n - 1) do
>>      swap a i (Random.int n);
>>    done;
>>    a)
>>
>> let a = shuffled_array 1600
>> ...
>
> I believe that you will get better randomness if you make that:
>
>   swap a i (i + (Random.int (n - i)));
> ...
Thanks to both of you. Just what I needed.
If you get an A, send us each 15%.
Al
Al Christians wrote:
But it amounts to slightly less than a... newline! Furthermore your
mailer would strip it as a white space :-(.
Alexander
> Easiest would be:
> Put them in an array.
> for i from 1 to N:
> exchange the i'th element with a randomly chosen one
> (exchanging it with itself resulting in a no-op)
I don't have the proof handy, but you can prove that swapping each element
with a random element is more likely to produce some final orderings than
others. You instead want to swap each element with a randomly chosen
element between it and the end of the list.
--
Russ Allbery (r...@stanford.edu) <http://www.eyrie.org/~eagle/>
I don't know. My first approach was to swap randomly chosen elements
N times, but I wasn't sure that it wouldn't create a bias.
> You instead want to swap each element with a randomly chosen
> element between it and the end of the list.
I do have a proof (well, a sketch of one) that this algorithm works.
It is by induction:
For one element, the algorithm is trivially correct.
For N elements, there's a 1/N chance that the algorithm will swap the
first element with itself, leaving the others alone.
Further, there's a 1/N chance that the algorithm will swap the first
element with any given other element O. This gives O a 1/N chance of
finishing at position 1, and an (N-1)/N chance of ending up at some
other position, where it is placed with probability 1/(N-1) by the
induction hypothesis applied to the remaining N-1 elements, which gives
an (N-1)/N * 1/(N-1) = 1/N probability, q.e.d.
Conclusion: I don't really know whether my algorithm is correct; I
can't prove it, and I *can* prove yours, so your algorithm is
preferable.
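For reference, a minimal OCaml sketch of the unbiased shuffle just
described, in which element i is swapped only with a position chosen
uniformly from i .. n-1. The function name and the Random.self_init
call are my additions, not anyone's posted code:

(* unbiased in-place shuffle of 0 .. n-1 *)
let fair_shuffled_array n =
  let a = Array.init n (fun i -> i) in
  Random.self_init ();
  for i = 0 to n - 2 do
    let j = i + Random.int (n - i) in   (* j is uniform over i .. n-1 *)
    let tmp = a.(i) in
    a.(i) <- a.(j);
    a.(j) <- tmp
  done;
  a

let a = fair_shuffled_array 1600 then gives a permutation of 0 .. 1599
in which every ordering is equally likely.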
> Ryan Tarpine <rtar...@hotmail.com> wrote:
> > What's the easiest way to get a shuffled list or array of a
> > sequence of numbers? e.g. I need 0 through 1599 in random order,
> > each number appearing exactly once. Ocaml if possible. Thanks in
> > advance
> Easiest would be:
in Haskell, (sorry)
factorial n = product [1..n]
permutations l = left as an exercise
then
do r <- newStdGen                  -- start off a random generator
   let (index, nextgen) = randomIvalInteger (0, factorial 1600 - 1) r
       -- generate a random integer between 0 and 1600!
   return (permutations [0..1599] !! index)
gives you what you want
> The algorithm is also very imperative;
My algorithm is quite functional. Might be a bit slow, though ;-)
--
Jón Fairbairn Jon.Fa...@cl.cam.ac.uk
Yup, it's O(N!) in time and space, since permutations is generating all
of them. I don't think that current-day computers can do that with 1600
items.
Is there a functional, efficient algorithm for generating a random
permutation?
> > do r <- newStdGen -- start off a random generator
> > let (index, nextgen) = randomIvalInteger (0, factorial 1600 - 1) r
> > -- generate a random integer between 0 and 1600!
> > return (permutations [0..1599] !! index)
> Yup, it's O(N!) in time and space, since permutations is generating all
> of them.
Uh, doesn't laziness turn it into an O(n) space thing? So unless I'm
mistaken, you're left with only O(n!) time complexity. Might still be
enough to put some people off, though.
> Is there a functional, efficient algorithm for generating a random
> permutation?
Uh, given a list of elements, draw a random one and add to an
accumulator list until none are left? Gives you O(n^2), but that's
because you have to traverse lists - and the constant factor is low.
And space is O(n).
-kzm
--
If I haven't seen further, it is by standing in the footprints of giants
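A rough OCaml rendering of that draw-one-at-a-time idea (the function
names are mine; O(n^2) overall because of the list traversals):

let shuffle_list xs =
  (* remove the k-th element (0-based), returning it and the remainder *)
  let rec take_nth k = function
    | [] -> invalid_arg "take_nth"
    | x :: rest ->
        if k = 0 then (x, rest)
        else let y, rest' = take_nth (k - 1) rest in (y, x :: rest')
  in
  let rec loop remaining len acc =
    if len = 0 then acc
    else
      let x, rest = take_nth (Random.int len) remaining in
      loop rest (len - 1) (x :: acc)
  in
  loop xs (List.length xs) []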
I can prove that yours is wrong. Let N=3. Your algorithm generates
three random numbers in the course of its execution. Each of these
random numbers is between 1 and 3. The result, R, is a function of
these three random numbers and nothing else.
There are 27 possible triples of random numbers (1,1,1) .. (3,3,3),
and 6 possible results R. Since 27 is not an exact multiple of 6,
your algorithm must select some results more often than others.
In general, N^N is not a multiple of N! unless N<=2, so your algorithm
is biased except when N<=2.
--
@P=split//,".URRUU\c8R";@d=split//,"\nrekcah xinU / lreP rehtona tsuJ";sub p{
@p{"r$p","u$p"}=(P,P);pipe"r$p","u$p";++$p;($q*=2)+=$f=!fork;map{$P=$P[$f^ord
($p{$_})&6];$p{$_}=/ ^$P/ix?$P:close$_}keys%p}p;p;p;p;p;map{$p{$_}=~/^[P.]/&&
close$_}%p;wait until$?;map{/^r/&&<$_>}%p;$_=$d[$q];sleep rand(2)if/\S/;print
you can have an O(n log n) by using a tree instead of a list.
--
Rémi Vanicat
van...@labri.u-bordeaux.fr
http://dept-info.labri.u-bordeaux.fr/~vanicat
Hmm... I'd think this heavily depends on the details of the
implementation of permutations.
Am I guessing right here?
> So unless I'm
> mistaken, you're left with only O(n!) time complexity. Might still be
> enough to put some people off, though.
Are you sure you meant O(n!) here? That's one of the worst complexities
that one encounters in practice!
> > Is there a functional, efficient algorithm for generating a random
> > permutation?
>
> Uh, given a list of elements, draw a random one and add to an
> accumulator list until none are left?
For example.
> Gives you O(n^2), but that's
> because you have to traverse lists - and the constant factor is low.
O(N^2) isn't "efficient" if a linear one exists. The worst I'd accept is
O(N log N) (and when it comes to log N factors, I'm more concerned about
constants than big-O complexity).
One could use a tree that stores the not-yet-assigned numbers. We'd have
to store the size of each subtree in its root to guide the random-number
generator. And, since we're at it, we can dispose of the actual numbers,
since the number generated can be inferred from the sequence of
left-right decisions of the RNG, leaving us with a tree where each node
stores the number of unused numbers in some power-of-two-sized interval.
This would have a space and time complexity of O(N log N), but with a
relatively high constant overhead.
> And space is O(n).
Agreed.
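For illustration, here is one way that counting-tree idea could look in
OCaml (the representation and names are mine, not anyone's posted code):
a complete binary tree over 0 .. n-1 in which each node stores how many
numbers beneath it are still unused, so drawing the k-th unused number is
a single walk down the tree. It assumes the RNG has already been seeded,
e.g. with Random.self_init ().

type tree =
  | Leaf of bool                 (* true = this number is still unused *)
  | Node of int * tree * tree    (* count of unused numbers below *)

let count t =
  match t with
  | Leaf true -> 1
  | Leaf false -> 0
  | Node (c, _, _) -> c

(* build the tree for the interval [lo, lo + len); len is a power of two,
   and numbers >= n start out marked as used *)
let rec build lo len n =
  if len = 1 then Leaf (lo < n)
  else
    let half = len / 2 in
    let l = build lo half n and r = build (lo + half) half n in
    Node (count l + count r, l, r)

(* extract the k-th unused number (0-based) and mark it as used *)
let rec draw t k lo len =
  match t with
  | Leaf _ -> (lo, Leaf false)
  | Node (_, l, r) ->
      let half = len / 2 in
      if k < count l then
        let x, l' = draw l k lo half in
        (x, Node (count l' + count r, l', r))
      else
        let x, r' = draw r (k - count l) (lo + half) half in
        (x, Node (count l + count r', l, r'))

(* random permutation of 0 .. n-1, returned as a list *)
let shuffle_tree n =
  let len = ref 1 in
  while !len < n do len := 2 * !len done;
  let rec loop t remaining acc =
    if remaining = 0 then List.rev acc
    else
      let x, t' = draw t (Random.int remaining) 0 !len in
      loop t' (remaining - 1) (x :: acc)
  in
  loop (build 0 !len n) n []

Each draw costs O(log n), so a full permutation costs O(n log n) time;
the tree itself takes O(n) space.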
> Ketil Z Malde <ke...@ii.uib.no> wrote:
> > "Joachim Durchholz" <joac...@gmx.de> writes:
> >
[I wrote]
> > > > do r <- newStdGen -- start off a random generator
> > > > let (index, nextgen) = randomIvalInteger (0, factorial 1600 - 1) r
> > > >     -- generate a random integer between 0 and 1600!
> > > > return (permutations [0..1599] !! index)
> >
> > > Yup, it's O(N!) in time and space, since permutations is generating all
> > > of them.
> >
> > Uh, doesn't laziness turn it into an O(n) space thing?
>
> Hmm... I'd think this heavily depends on the details of the
> implementation of permutations.
I think so.
> Am I guessing right here?
>
> > So unless I'm
> > mistaken, you're left with only O(n!) time complexity. Might still be
> > enough to put some people off, though.
>
> Are you sure you meant O(n!) here?
He did. I'm not sure that people have fully appreciated the
implications of the smiley in my original post!
--
Jón Fairbairn Jon.Fa...@cl.cam.ac.uk
Code in Ocaml:
let _ = Random.self_init ()
(* shuffle the array in place *)
let shuffle (a : 'a array) : unit =
  let cmp a b = (Random.int 3) - 1 in
  Array.sort cmp a

(* return a permuted array of ints of size l *)
let permutated_int (l : int) : int array =
  let a = Array.init l (fun ii -> ii) in
  shuffle a;
  a
and when you run it, you get:
# permutated_int 10;;
- : int array = [|3; 5; 9; 4; 1; 6; 8; 7; 2; 0|]
# permutated_int 20;;
- : int array =
[|11; 18; 3; 14; 2; 8; 7; 17; 9; 10; 5; 16; 13; 1; 15; 19; 12; 6; 4; 0|]
# permutated_int 20;;
- : int array =
[|9; 4; 17; 5; 14; 18; 7; 6; 11; 15; 19; 10; 13; 12; 16; 3; 8; 2; 1; 0|]
# permutated_int 20;;
- : int array =
[|4; 8; 16; 7; 17; 6; 10; 14; 15; 18; 11; 19; 9; 3; 0; 5; 13; 2; 1; 12|]
#
--
Mattias Waldau, http://www.abc.se/~m10217/
> I would use sort to shuffle, and have a compare function that
> is random. You will get a n log n shuffle using this technique.
Argh! There is a risk that the sort function will loop, or raise an
"array out of bounds" error. All sorting algorithms rely on the comparison
function being a total order, which is certainly not the case for yours:
> let cmp a b = (Random.int 3) - 1
Better not do this -- unless you're competing for the "best abuse of a
library function" contest :-)
What you could do, however, is associate random integers to each element of
the array to be shuffled, then sort the (elt, random) pairs by increasing
"random" components. That should give you a good shuffle of the "elt"
components, in n log n time.
- Xavier Leroy
--
Valid e-mail address (without the underscores): Xavier.Leroy@i_n_r_i_a.f_r
This is a protection against junk mail. Apologies for the inconvenience.
Home page: http://pauillac.inria.fr/~xleroy/
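For what it's worth, a small OCaml sketch of that decorate-sort-undecorate
shuffle (the names are illustrative, and it assumes the RNG has been
seeded):

let shuffle_by_keys a =
  (* pair each element with a random key, sort by the key, drop the keys;
     Random.bits gives 30 random bits, so ties are possible but rare *)
  let keyed = Array.map (fun x -> (Random.bits (), x)) a in
  Array.sort (fun (k1, _) (k2, _) -> compare k1 k2) keyed;
  Array.map snd keyed

let permutation n = shuffle_by_keys (Array.init n (fun i -> i))

As discussed below, identical keys are possible in principle; with enough
key bits they are rare.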
Yeah... but can you prove that the distribution is truly random?
You'd have to prove that the sort algorithm, whatever it is, will place
an element in any array slot with equal probability if the comparison
operation is random. Quicksort is optimized for other purposes, so I
doubt it will have that property. OTOH if it does work, I suspect that
it will work for any sort algorithm.
Food for thought for algorithm researchers :)
> What you could do, however, is associate random integers to each element of
> the array to be shuffled, then sort the (elt, random) pairs by increasing
> "random" components. That should give you a good shuffle of the "elt"
> components, in n log n time.
That's what I'd do, though you need to make sure that the randoms are
all distinct, otherwise you get the n^n instead of n! problem
mentioned earlier in the thread in connexion with one of the more
imperative algorithms. Then the difficulty is to generate n /distinct/
random numbers in reasonable time/space.
--
Jón Fairbairn Jon.Fa...@cl.cam.ac.uk
Linear congruential generators have that property. Since they are of
the form x' <- (ax + b) mod m, you get a repeat only when the PRNG
loops. So a long-period LCG will generate numbers without repeats.
This is an occasional justification for using them -- I can't think of
any other situation in which you would want to.
Neel
> That's what I'd do, though you need to make sure that the randoms are
> all distinct, otherwise you get the n^n instead of n! problem
> mentioned earlier in the thread in connexion with one of the more
> imperative algorithms. Then the difficulty is to generate n /distinct/
> random numbers in reasonable time/space.
As someone else responded, if your PRNG has a large period, this won't
happen. In the general case, it's also improbable to get two identical
random numbers if your sequence is short enough. E.g. assuming 64-bit
random numbers, collisions are unlikely if the sequence to shuffle has
much less than 2^32 elements, which should often be the case :-)
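(For a rough sense of scale: with independent 64-bit keys, the chance of
any collision among n keys is about n(n-1)/2^65, which for the n = 1600
of the original question is roughly 7e-14.)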
> In the general case, it's also improbable to get two identical
> random numbers if your sequence is short enough. E.g. assuming 64-bit
> random numbers, collisions are unlikely if the sequence to shuffle has
> much less than 2^32 elements, which should often be the case :-)
Well, yes, that's true. But it doesn't /feel/ right to rely on
it, somehow!
--
Jón Fairbairn Jon.Fa...@cl.cam.ac.uk
Quicksort and mergesort are the sort algorithms that I would most expect
to make a good shuffle.
In quicksort, the call to partition takes one element at a time
and decides whether it should go into the left or the right subpart; this
is then repeated recursively for the subparts.
In mergesort, the merger either takes the first element
from the left or from the right subpart, and then continues to recurse.
Regarding Xavier Leroy's fear that the sorting library might crash
if compare isn't a total ordering, of course I cannot argue, since
Xavier Leroy is probably the one who implemented the sorting in Ocaml.
A good, pure sorting function should never be able to detect that
the compare function is cheating, since the only way to do that
would be to compare the same two elements twice. That compare is a
total ordering is only needed in order to be sure that the result
is sorted.
Al
For shuffling, LCGs are no good, because they do not have enough state.
Say your LCG has N bits of state. You are trying to produce an
unbiased shuffle of K items. Since 2^N is not a multiple of K! (for K >= 3),
there is bias.
For example, consider an LCG with 15 bits of state. You are trying to
shuffle 7 items. You have just put 32768 items (the 32768 possible
outputs of the LCG) into 5040 boxes (the shuffles). The best possible
distribution will assign 6 random seeds to some permutations and 7
seeds to some others; this means that even in the optimal case some
permuations will come up 17% more often than others. With this much
bias, it is not clear that it is even worth using an 'unbiased'
algorithm.
A 15-bit LCG is not random enough to shuffle more than about 5 items
effectively. A 64-bit LCG will suffice to shuffle a list of up to 12
items. As the size of the list K increases, the amount of state
necessary in the LCG to achieve a reasonably unbiased shuffle
increases as O(K log K).
That's not really a disproof. It still depends on how often the random
comparison is applied to what elements.
I don't think that such abstract arguments would help much. Somebody
would have to either go through the gory details of analysing the effect
of a random sort order on a concrete sort algorithm, or be a genius and
give a *strict* argument why sorting by "random order" is a good or a
bad idea.
A better idea might be to apply the generate-next-random-number
algorithm to all numbers and use these numbers to establish a sort
order.
As Jon has pointed out, if you're truly worried about the quality of the
randomness, you'd have to use a PRNG with more bits of state than each
number alone can provide.
(Hope you can sort this out, it's not too clear what I'm writing here
but I don't have the time to reword this once again.)
If you check on this to see what the distribution of results is,
you will almost certainly find that it is very non-random looking.
Running with the original spec (1600 elements to sort, initially
arranged 0 to 1599), I find that the array returned has the first
elements too high by about 10%, and the last elements too low by
about 98%.
Look at the last two or three elements in the arrays returned below,
and you can see that this effect shows in your results, too:
> # permutated_int 10;;
> - : int array = [|3; 5; 9; 4; 1; 6; 8; 7; 2; 0|]
> # permutated_int 20;;
> - : int array =
> [|11; 18; 3; 14; 2; 8; 7; 17; 9; 10; 5; 16; 13; 1; 15; 19; 12; 6; 4; 0|]
> # permutated_int 20;;
> - : int array =
> [|9; 4; 17; 5; 14; 18; 7; 6; 11; 15; 19; 10; 13; 12; 16; 3; 8; 2; 1; 0|]
> # permutated_int 20;;
> - : int array =
> [|4; 8; 16; 7; 17; 6; 10; 14; 15; 18; 11; 19; 9; 3; 0; 5; 13; 2; 1; 12|]
Al
Damn, you're right -- I should have thought of that. Thanks for
pointing out my error!
Neel
Well, you can check for repeats in n log n time, and repeat if
necessary. So you don't get a guaranteed time bound, but you get
guaranteed correctness and an infinitesimal chance of repeating
more than a constant number of times.
--
Jeffrey M. Vinocur * jm...@cornell.edu
http://www.people.cornell.edu/pages/jmv16/
Ryan Tarpine wrote:
> What's the easiest way to get a shuffled list or array of a sequence of
> numbers? e.g. I need 0 through 1599 in random order, each number appearing
> exactly once. Ocaml if possible. Thanks in advance
>
i'm still new at functional programming, so here's a C++ algorithm. you'll
need a better way to seed the random variable though...
#include <iostream>
#include <cstdlib>   // for rand()
using namespace std;

void shuffle (int []);

int main()
{
    int arr[1600];
    for (int i = 0; i < 1600; i++)
        arr[i] = -1;           // -1 marks a still-empty slot
    shuffle(arr);
    return 0;
}

/*
 * I need 0 through 1599 in random order, each number appearing exactly
 * once.  Ocaml if possible.
 */
void shuffle (int shuffarr[])
{
    int rnum;
    for (int i = 0; i < 1600; i++) {
        do {
            rnum = rand() % 1600;          // candidate slot 0..1599
        } while (shuffarr[rnum] != -1);    // retry until the slot is empty
        shuffarr[rnum] = i;
    }
}
hope this helps a little.
cheers,
christian
It does not, as long as the total number of comparisons is finite.
There is no way to add up a finite number of fractions of the form
2^(-k) and get 1/6.
>Somebody would have to either go through the gory details of
>analysing the effect of a random sort order on a concrete sort
>algorithm,
Fine. I'll use the following sort algorithm for three elements a,b,c:
Compare a and b.
If a<b, compare b and c.
If b<c, you are done: a<b<c
If c<b, compare a and c.
If a<c, you are done: a<c<b
If c<a, you are done: c<a<b
If b<a, compare b and c.
If b<c, compare a and c.
If a<c, you are done: b<a<c
If c<a, you are done: b<c<a
If c<b, you are done: c<b<a
If each 'comparison' is done at random with a 1/2 probability of going
either way, then the outcome is:
a<b<c 1/4
a<c<b 1/8
c<a<b 1/8
b<a<c 1/8
b<c<a 1/8
c<b<a 1/4
Some of the orderings appear twice as often as others.
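One can also see the bias empirically. A small OCaml experiment along
these lines (illustrative only; the exact split depends on which sort is
used, and List.sort is a merge sort, so an inconsistent comparison cannot
make it crash, it just yields some ordering) tallies the results of a
coin-flip comparator:

let () =
  Random.self_init ();
  let tally = Hashtbl.create 6 in
  for _run = 1 to 100_000 do
    (* sort three elements with a comparison that is a fair coin flip *)
    let order =
      List.sort (fun _ _ -> if Random.bool () then 1 else -1) [1; 2; 3]
    in
    let n = try Hashtbl.find tally order with Not_found -> 0 in
    Hashtbl.replace tally order (n + 1)
  done;
  Hashtbl.iter
    (fun order n ->
       Printf.printf "%s: %d\n"
         (String.concat "<" (List.map string_of_int order)) n)
    tally

The counts come out close to the 1/4, 1/8, ... figures above rather than
to a uniform 1/6 each.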
No sorting algorithm can do better than this in general, and most will
do exactly this. For instance, a straightforward implementation of
quicksort produces a table just like this one when it is run on a
3-element set. The pivot element is chosen. It is compared with the
other two elements. Half the time it turns out to be the middle
element and no more comparisons are done. The other half the time it
is either the largest or smallest, and the remaining two items must be
compared.
You can already see that something is wrong, because the pivot element
should be the middle element 1/3 of the time, not 1/2 of the time.
It's now trivial to prove that quicksort with a fair random comparison
function on any number of elements produces a biased shuffle. The
pivot element might be larger than all but three of the remaining
elements, and in that case the remaining three elements will be
recursively sorted with quicksort, which we've already seen will order
them in a biased way.
You can fix this by adjusting the random comparison function so that
it doesn't always choose between a<b and b<a equiprobably. But doing
that probably turns out to be equivalent to the simpler, well-known
algorithms like Fisher-Yates.
>A better idea might be to apply the generate-next-random-number
>algorithm to all numbers and use these numbers to establish a sort
>order.
That works, if there is enough entropy in the PRNG. Fisher-Yates
will be more efficient, if your language has arrays.