Here's what I'm currently doing:
my %sections = (
    '0' => 'x',
    '1' => 't',
    '2' => 'm',
    '3' => 'v',
);
my $count_sect = keys %sections;    # counts the sections in the keys
do {
    my $rand_sect = int(rand($count_sect)); # random integer < $count_sect
    print 'key:' . $rand_sect . "\n";
    if (exists $sections{$rand_sect}) {
        my $sect = $sections{$rand_sect};
        print "SECTION:" . $sect . "\n";    # do things with the returned section
        delete $sections{$rand_sect};
    }
    else {
        print '-key non existent' . "\n";
    }
} until (!keys %sections);
Anyone have suggestions for improvement? Thanks.
>I have an array that I need to cycle through once, but in a random
>order.
perldoc -q shuffle
This is at least significantly less code:
my @arr = qw/x t m v/;
print splice(@arr, rand $_, 1) for reverse 1 .. @arr;
As Brian suggested, you should also read the applicable FAQ entry,
especially if your array is large.
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
Or even
print splice(@arr, rand @arr, 1) while @arr;
> As Brian suggested, you should also read the applicable FAQ entry,
> especially if your array is large.
There is also List::Util::shuffle, ready to use.
Anno
Let's not. Splicing out elements one-by-one of an array is very
inefficient. Shuffling can be done in linear time, while the splice
method takes time quadratic in the length of the array.
It's even worse than suggesting to use Bubblesort to sort an array.
See also http://www.perlmonks.org/index.pl?node_id=349969
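For reference, the linear-time algorithm in question is the Fisher-Yates shuffle; a minimal pure-Perl sketch (an illustration, not code taken from the node above):

```perl
# Fisher-Yates shuffle: runs in O(n) and, given a fair rand(),
# produces every permutation with equal probability.
sub fisher_yates {
    my @arr = @_;
    for my $i ( reverse 1 .. $#arr ) {
        my $j = int rand( $i + 1 );      # pick 0 <= $j <= $i
        @arr[ $i, $j ] = @arr[ $j, $i ]; # swap instead of splicing
    }
    return @arr;
}

my @shuffled = fisher_yates( 'x', 't', 'm', 'v' );
```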
Abigail
--
perl -wlpe '}$_=$.;{' file # Count the number of lines.
Yeah, that's what the FAQ says, too. OTOH, I measured the speed with
the help of the Benchmark module, and code that created the array
my @arr = 1 .. 5000;
and then spliced it randomly using those inefficient methods ran
about 15 times per second. So, if you don't have a really big array
and aren't about to run it very frequently, is there anything wrong
with keeping it simple?
> Shuffling can be done in linear time, while the splice
> method takes time quadratic in the length of the array.
I measured with the List::Util::shuffle() function, too, and it ran
about 70 times per second. However, when I included loading of the
module in the benchmark, the overall speed was reduced to 12 times per
second, i.e. slightly slower than the splice() method...
use Benchmark 'cmpthese';
my %oldINC = %INC;
cmpthese -5, {
    'splice()' => sub {
        my @arr = 1 .. 5000;
        my @shuffled;
        push @shuffled, splice(@arr, rand @arr, 1) while @arr;
    },
    'shuffle()' => sub {
        my @arr = 1 .. 5000;
        %INC = %oldINC;
        require List::Util;
        my @shuffled = List::Util::shuffle(@arr);
    },
};
Output including module loading:
Rate shuffle() splice()
shuffle() 12.0/s -- -23%
splice() 15.5/s 30% --
Odd. I just tried the same thing (well, I added a third option - sort
{int(rand(3)) - 1} @arr - to double check):
use Benchmark 'cmpthese';
my %oldINC = %INC;
cmpthese -5, {
    'splice()' => sub {
        my @arr = 1 .. 5000;
        my @shuffled;
        push @shuffled, splice(@arr, rand @arr, 1) while @arr;
    },
    'shuffle()' => sub {
        my @arr = 1 .. 5000;
        %INC = %oldINC;
        require List::Util;
        my @shuffled = List::Util::shuffle(@arr);
    },
    'rand()' => sub {
        my @arr = 1 .. 5000;
        my @shuffled = sort { int(rand(3)) - 1 } @arr;
    },
};
and I got:
Rate rand() splice() shuffle()
rand() 23.1/s -- -47% -88%
splice() 43.3/s 87% -- -77%
shuffle() 189/s 719% 338% --
Where shuffle was the clear winner by a mile.... Normally I'd say YMMV, but this
seems like a slightly over the top difference. I'm using a slightly old perl
(5.8.1) - maybe something is different between our versions?
MB
Josh
I've seen variations of the third option before, being presented as a
way to shuffle a list. But I've yet to see any proof that this gives a
fair shuffle (fair being every possible permutation of the list has the
same chance of being produced, given a truly random 'rand()' function).
Abigail
--
perl -we '$_ = q ;4a75737420616e6f74686572205065726c204861636b65720as;;
for (s;s;s;s;s;s;s;s;s;s;s;s)
{s;(..)s?;qq qprint chr 0x$1 and \161 ssq;excess;}'
Given that it's also one of the most efficient ways of doing it, there's
no need for a change if the array grows.
Abigail
--
perl -we 'print split /(?=(.*))/s => "Just another Perl Hacker\n";'
<snip>
>> Output including module loading:
>>
>> Rate shuffle() splice()
>> shuffle() 12.0/s -- -23%
>> splice() 15.5/s 30% --
>
> Odd. I just tried the same thing (well, I added a third option -
> sort {int(rand(3)) - 1} @arr - to double check):
<snip>
> and I got:
>
> Rate rand() splice() shuffle()
> rand() 23.1/s -- -47% -88%
> splice() 43.3/s 87% -- -77%
> shuffle() 189/s 719% 338% --
>
> Where shuffle was the clear winner by a mile.... Normally I'd say
> YMMV, but this seems like a slightly over the top difference. I'm
> using a slightly old perl (5.8.1) - maybe something is different
> between our versions?
Well, I ran it using 5.8.0 on Windows 98. Just tried it with 5.8.1 on
Linux, and then the results were similar to yours. Can the explanation
possibly be that loading a compiled module is much slower on Windows?
Clearly it doesn't. Consider the simple case of a two element list (a,b).
There are two possible shuffle results: (a,b) and (b,a) each of which will
have equal probability of occurring in a fair shuffle.
Any sane implementation of sort will, when asked to sort a two-element
list, do a single comparison, so for the above there are three cases:
(int(rand(3)) -1)==-1 -> (a,b)
(int(rand(3)) -1)==0 -> (a,b)
(int(rand(3)) -1)==1 -> (b,a)
Clearly the shuffle isn't fair since (a,b) is twice as likely. Of course
it might improve with a larger list, but it doesn't bode well when the
algorithm fails for such a simple case.
Now the middle case, it could be argued, could result in (b,a) if the
sort was not stable - but even so if it did it always would and hence
(b,a) would be favoured and the problem remains. Well, for a sane
sort implementation anyway.
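The bias is easy to check empirically; a quick counting script (mine, for illustration) for the two-element case:

```perl
# Count how often sort { int(rand(3)) - 1 } yields each ordering
# of a two-element list. With perl's (stable) default mergesort,
# a comparator return of 0 keeps the original order, so 'ab'
# should turn up roughly twice as often as 'ba'.
my %count;
for ( 1 .. 30_000 ) {
    my @s = sort { int(rand(3)) - 1 } ( 'a', 'b' );
    $count{ join '', @s }++;
}
printf "%s => %d\n", $_, $count{$_} for sort keys %count;
```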
--
Sam Holden
> On 23 Aug 2004 07:27:36 GMT, Abigail <abi...@abigail.nl> wrote:
>> Matthew Braid (m...@uq.net.au.invalid) wrote on MMMMX September MCMXCIII in
>><URL:news:cgbd5q$j40$1...@bunyip.cc.uq.edu.au>:
>> ''
>> '' Odd. I just tried the same thing (well, I added a third option - sort
>> '' {int(rand(3)) - 1} @arr - to double check:
>>
>> I've seen variations of the third option before, being presented as a
>> way to shuffle a list. But I've yet to see any proof that this gives a
>> fair shuffle (fair being every possible permutation of the list has the
> same chance of being produced, given a truly random 'rand()' function).
>
> Clearly it doesn't. Consider the simple case of a two element list (a,b).
> There are two possible shuffle results: (a,b) and (b,a) each of which will
> have equal probability of occurring in a fair shuffle.
>
> Any sane implementation of sort will, when asked to sort a two-element
> list, do a single comparison, so for the above there are three cases:
>
> (int(rand(3)) -1)==-1 -> (a,b)
> (int(rand(3)) -1)==0 -> (a,b)
> (int(rand(3)) -1)==1 -> (b,a)
>
> Clearly the shuffle isn't fair since (a,b) is twice as likely. Of course
> it might improve with a larger list, but it doesn't bode well when the
> algorithm fails for such a simple case.
The naive solution to that would be disallowing '0' so that pairs are
always swapped.
However, this won't make the shuffling fair either. One problem, which
can be verified with some simple empirical tests, is that the fairness
depends on the sort algorithm used.
Taking this little program:
my @list = (1 .. 4);
my %perms;

shuffle() for 1 .. 10000;

while (my ($key, $val) = each %perms) {
    print "$key => $val\n";
}

sub shuffle {
    my @shuffled = sort { int(rand(2)) == 0 ? -1 : 1 } @list;
    $perms{ join "-", @shuffled }++;
}
On perl5.8.4 with merge-sort, this yields a distribution resembling:
2-1-4-3 => 620
1-2-4-3 => 606
3-2-1-4 => 313
1-4-2-3 => 331
2-4-3-1 => 300
1-3-4-2 => 330
2-1-3-4 => 583
4-1-2-3 => 320
1-2-3-4 => 678
3-4-2-1 => 619
2-3-4-1 => 313
3-1-4-2 => 301
4-3-1-2 => 646
2-3-1-4 => 325
4-3-2-1 => 627
1-3-2-4 => 317
4-2-1-3 => 296
1-4-3-2 => 323
2-4-1-3 => 289
3-2-4-1 => 303
3-4-1-2 => 609
4-1-3-2 => 332
4-2-3-1 => 303
3-1-2-4 => 316
You consistently get certain permutations that show up only around 300
times where others happen at least 600 times (2-1-4-3 is one of the latter).
Now running the same with 'use sort "_qsort"' gives a very different
picture. It's even more biased:
2-1-4-3 => 621
1-2-4-3 => 584
3-2-1-4 => 651
1-4-2-3 => 321
2-4-3-1 => 155
1-3-4-2 => 323
2-1-3-4 => 1242
4-1-2-3 => 299
1-2-3-4 => 1199
3-4-2-1 => 142
2-3-4-1 => 344
3-1-4-2 => 286
4-3-1-2 => 169
4-3-2-1 => 155
2-3-1-4 => 647
1-3-2-4 => 629
4-2-1-3 => 328
1-4-3-2 => 154
2-4-1-3 => 290
3-2-4-1 => 316
4-1-3-2 => 174
3-4-1-2 => 147
3-1-2-4 => 659
4-2-3-1 => 165
Permutations 2-1-3-4 and 1-2-3-4 consistently come in at around 1200.
In a strict sense, this doesn't prove anything and the randomized sort
could still be fair. But really, it's not.
It might be possible to write a sort algorithm in a way that it could be
used for shuffling. Perl's implementations however weren't designed that
way. Especially perl's quicksort has been optimized to death and I
wouldn't be surprised if those optimizations contribute to this strong
bias towards certain permutations.
Having said that, I tried it with these two sort-routines:
sub bubble (&@) {
    my ($cmp, @copy) = @_;
    for my $i (reverse 1 .. $#copy) {
        for my $j (1 .. $i) {
            local ($a, $b) = @copy[$j-1, $j];
            @copy[$j-1, $j] = @copy[$j, $j-1] if &$cmp == 1;
        }
    }
    return @copy;
}

sub selection (&@) {
    my ($cmp, @copy) = @_;
    my $min;
    for my $i (0 .. $#copy-1) {
        $min = $i;
        for my $j ($i+1 .. $#copy) {
            local ($a, $b) = @copy[$j, $min];
            $min = $j if &$cmp == -1;
        }
        @copy[$min, $i] = @copy[$i, $min];
    }
    return @copy;
}
in the hope that maybe those naive sort algorithms could be used (as
they tend to compare each element with all remaining ones). But I was
wrong. Selection sort has a similar bias to quicksort (this time
favoring 4-1-2-3 and 4-1-3-2). Bubble sort is slightly better, but still
more biased than perl's built-in mergesort.
There are other interesting things to note. When allowing the compare
function to return -1, 1 and 0 (as opposed to only -1 and 1), selection
sort gets less biased, whereas bubble sort gets more biased (4-3-2-1
only shows up around ten times and 1-2-3-4 1800 times on average).
Without being a formal proof for anything, these observations show that
the shuffle-by-sort approach is probably very rotten and should best be
left alone.
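The straightforward fair alternative remains List::Util's shuffle, mentioned earlier in the thread:

```perl
use List::Util 'shuffle';

# shuffle() implements a Fisher-Yates shuffle in linear time, so
# every permutation is equally likely (given a fair rand()).
my @deck     = ( 1 .. 52 );
my @shuffled = shuffle(@deck);
```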
Tassilo
--
$_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{rehtonabus})!JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexiixesixeseg;y~\n~~dddd;eval
True, though the quadratic term has a small constant factor. A time-linear
shuffler swaps list elements instead of extracting them. That saves the
time splice() needs to shift the remaining elements down to close the gap:
sub swapping {
    for ( reverse 1 .. $#_ ) {
        my $r = rand( 1 + $_ );
        @_[ $r, $_ ] = @_[ $_, $r ];
    }
    @_;
}
Benchmarking this against
sub splicing {
    map splice( @_, rand $_, 1 ), reverse 1 .. @_;
}
shows splicing() about twice as fast for arrays shorter than 1000.
The break-even point is with lists of length 12_000; from then on
swapping() wins out by an increasing margin. At 30_000, swapping
is twice as fast as splicing. On my machine.
That leaves quite a few applications (such as card shuffling) where
the splice solution is practical.
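The two subs can be raced directly with the Benchmark module; a sketch of such a harness (my reconstruction, not the one actually used):

```perl
use Benchmark 'cmpthese';

# The swapping() and splicing() subs from above, repeated here so
# the snippet is self-contained. Both work on @_ aliases, so they
# must be called with a real array, not a literal list.
sub swapping {
    for ( reverse 1 .. $#_ ) {
        my $r = rand( 1 + $_ );
        @_[ $r, $_ ] = @_[ $_, $r ];
    }
    @_;
}

sub splicing {
    map splice( @_, rand $_, 1 ), reverse 1 .. @_;
}

# Compare at one list size; vary 1000 to look for the break-even point.
cmpthese -1, {
    swapping => sub { my @a = 1 .. 1000; () = swapping(@a) },
    splicing => sub { my @a = 1 .. 1000; () = splicing(@a) },
};
```

The relative speeds will of course depend on the machine and perl version at hand.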
> It's even worse than suggesting to use Bubblesort to sort an array.
Even Bubblesort has found respectable applications. It is fast for small
lists that are almost in order and has been used to speed up another
sort algorithm. I think the example is documented in the _Programming
Pearls_ series, though the example I find [1] uses insertion sort (another
n**2 sorting algorithm).
Anno
[1] Bentley, _Programming Pearls_, p. 112f
The benchmark I posted showed that the execution time of
List::Util::shuffle() is much faster than that. Is the explanation,
then, that I was actually comparing apples and oranges, since
List::Util is a compiled module?
It is. It also contains a pure Perl implementation, but by default it
loads an XS version. That's why I wanted to add a benchmark against
a Perl implementation of the in-place shuffle.
Anno
[interesting results snipped]
One problem with this approach is that it violates the assumptions of
the underlying algorithm. Sort procedures assume that comparisons are
consistent with a linear ordering of the list elements. In particular,
they assume that comparing the same elements twice will have the same
outcome, but also transitivity of the order relation and more. Who
is to say what a sort algorithm is going to do? It may not even halt,
for instance if the random decisions indicate a cyclic ordering.
Nor are any predictions on running time valid, so the expectation
that it runs in n*log n time may not be true. Since that is worse
than the linear time of an in-place shuffle, there isn't much to
recommend the "random sort" at all.
Anno
So, it's fair to compare a splice() that does its work in C with a
shuffle written in pure Perl?
I'd say comparing the XS shuffle with the C splice is more fair.
Abigail
--
echo "==== ======= ==== ======"|perl -pes/=/J/|perl -pes/==/us/|perl -pes/=/t/\
|perl -pes/=/A/|perl -pes/=/n/|perl -pes/=/o/|perl -pes/==/th/|perl -pes/=/e/\
|perl -pes/=/r/|perl -pes/=/P/|perl -pes/=/e/|perl -pes/==/rl/|perl -pes/=/H/\
|perl -pes/=/a/|perl -pes/=/c/|perl -pes/=/k/|perl -pes/==/er/|perl -pes/=/./;
[...]
> One problem with this approach is that it violates the assumptions of
> the underlying algorithm. Sort procedures assume that comparisons are
> consistent with a linear ordering of the list elements. In particular,
> they assume that comparing the same elements twice will have the same
> outcome, but also transitivity of the order relation and more. Who
> is to say what a sort algorithm is going to do? It may not even halt,
> for instance if the random decisions indicate a cyclic ordering.
Indeed, the documentation for Perl's sort function says:
The comparison function is required to behave. If it returns
inconsistent results (sometimes saying $x[1] is less than $x[2]
and sometimes saying the opposite, for example) the results are
not well-defined.
-----ScottG.
I didn't want to imply any unfairness, just to add a data point.
> I'd say comparing the XS shuffle with the C splice is more fair.
Replacing n calls to a C-level function by a single one is fair?
We know that the overhead swamps a lot of what may be happening at the
low level, and these benchmarks show it again. So setting one
swap operation against one call to splice seems fair enough from
that point of view.
Anno
> Matthew Braid (m...@uq.net.au.invalid) wrote on MMMMX September MCMXCIII in
> <URL:news:cgbd5q$j40$1...@bunyip.cc.uq.edu.au>:
> ''
> '' Odd. I just tried the same thing (well, I added a third option - sort
> '' {int(rand(3)) - 1} @arr - to double check:
>
> I've seen variations of the third option before, being presented as a
> way to shuffle a list. But I've yet to see any proof that this gives a
> fair shuffle (fair being every possible permutation of the list has the
> same chance of being produced, given a truly random 'rand()' function).
>
>
> Abigail
I only included the rand() method out of curiosity (I knew it was gonna be
slow, I just put it there as a comparison). I'd never actually _use_ that method
in my code :)
MB
>Odd. I just tried the same thing (well, I added a third option - sort
>{int(rand(3)) - 1} @arr - to double check:
I've always heard that using an inconsistent sort function in perl,
like this one, is a recipe for disaster.
You could just generate a random number per entry, and sort according to
that list. Like:
@sort = map rand, 0 .. $#item;
@shuffled = @item[sort { $sort[$a] <=> $sort[$b] } 0 .. $#item];
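The same idea is often written as a single Schwartzian-transform-style chain (a common idiom; a sketch, not Bart's exact code):

```perl
my @item = ( 'a' .. 'e' );

# Pair each element with a random key, sort on the key, strip the key.
my @shuffled =
    map  { $_->[1] }
    sort { $a->[0] <=> $b->[0] }
    map  { [ rand, $_ ] } @item;
```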
--
Bart.
It may still be hard to prove that this generates an unbiased sequence.
There will usually be a few duplicates among the generated numbers.
For these, the sequence is determined by the sort algorithm, not by
the random generator.
Anno
>There will usually be a few duplicates among the generated numbers.
Not *exact* duplicates, in the numerical sense (though perhaps it looks
that way when stringified). Otherwise, you'd have reached the end of the
random generator's cycle, and would start getting the exact same sequence
again.
--
Bart.