How to shuffle a list?

Phil Bewig

unread,

May 28, 2002, 11:27:18 AM5/28/02

to

It's easy enough to shuffle a vector:

(define (shuffle v)
(let ((n (- (vector-length v) 1)))
(do ((i 0 (+ i 1))) ((< n i) v)
(let* ((r (randint i n))
(t (vector-ref v r)))
(vector-set! v r (vector-ref v i))
(vector-set! v i t)))))

But I can't figure out any way to shuffle a list that is
simpler than copying the list to a vector, shuffling the
vector, and copying the vector back to a list.

Any ideas?

Phil

George Demmy

unread,

May 28, 2002, 12:00:03 PM5/28/02

to

pbe...@swbell.net (Phil Bewig) writes:

I think that the vector method is going to be hard to beat in
practice. To shuffle a list in place or using only lists, you're going
to have to cook up some sort of bookkeeping for the positions and will
wind up traversing the list a bunch of times. A hybrid approach would
set up and empty vector the same length as the list, and cdr-down your
list, inserting the car of the list into a randomly-selected position
in the vector as you go.

G
--
George Demmy
Layton Graphics, Inc

Ben Goetter

unread,

May 28, 2002, 12:18:12 PM5/28/02

to

Here's a destructive list shuffle. RAND returns the obvious thing.

(define (shuffle! deck)
(do ((l (length deck) (- l 1)) (d deck (cdr d)))
((= l 1) deck)
(let* ((n (rand l)) (c (list-ref d n)))
(set-car! (list-tail d n) (car d))
(set-car! d c))))

Joe Marshall

unread,

May 28, 2002, 11:38:01 AM5/28/02

to

"Phil Bewig" <pbe...@swbell.net> wrote in message news:455f7154.02052...@posting.google.com...

Partition the list into two sublists.
Recursively shuffle the two sublists.
Combine the shuffled sublists in some manner.

I can see several ways of doing this and I'm not sure which
would be the best. One way that seems promising to me is
to recombine the shuffled sublists by selecting one at
random to start with and then alternating between them.

Joe Marshall

unread,

May 28, 2002, 2:42:17 PM5/28/02

to

"Joe Marshall" <prunes...@attbi.com> wrote in message news:tvNI8.107$fT5....@typhoon.ne.ipsvc.net...

Turns out that this doesn't work too well.

The problem with this solution is that at each level there
are two possible outcomes, so with 8 elements, for example,
there would be 128 possible outcomes. But there are
8! (40320) permutations of 8 elements so my recursive shuffling
doesn't even get close to covering the space.

Sigh. Back to the drawing board.

David Rush

unread,

May 28, 2002, 4:43:42 PM5/28/02

to

"Joe Marshall" <prunes...@attbi.com> writes:
> "Joe Marshall" <prunes...@attbi.com> wrote in message news:tvNI8.107$fT5....@typhoon.ne.ipsvc.net...
> >
> > "Phil Bewig" <pbe...@swbell.net> wrote in message news:455f7154.02052...@posting.google.com...
> > > It's easy enough to shuffle a vector:

> > > But I can't figure out any way to shuffle a list that is
> > > simpler than copying the list to a vector, shuffling the
> > > vector, and copying the vector back to a list.
> > >
> > > Any ideas?
> >
> > Partition the list into two sublists.
> > Recursively shuffle the two sublists.
> > Combine the shuffled sublists in some manner.
> >
> >
> > I can see several ways of doing this and I'm not sure which
> > would be the best. One way that seems promising to me is
> > to recombine the shuffled sublists by selecting one at
> > random to start with and then alternating between them.
>
> Turns out that this doesn't work too well.
>
> The problem with this solution is that at each level there
> are two possible outcomes, so with 8 elements, for example,
> there would be 128 possible outcomes. But there are
> 8! (40320) permutations of 8 elements so my recursive shuffling
> doesn't even get close to covering the space.

I don't think you've covered the possibility of randomly choosing the
merged element. Your algorithm produces a random tree with a
deterministic merge. I think that if you randomize the merge that
you'll probably over-cover the space. Making it less efficient
order-wise than the copy-to-vector solution, but with a smaller
constant factor.

This is actually a pretty cool homework problem.

david rush
--
There's a difference between knowing the path...and walking it.
-- Morpheus, _The Matrix_

Joe Marshall

unread,

May 28, 2002, 5:11:13 PM5/28/02

to

"David Rush" <ku...@bellsouth.net> wrote in message news:okfy9e4...@bellsouth.net...

>
> I don't think you've covered the possibility of randomly choosing the
> merged element. Your algorithm produces a random tree with a
> deterministic merge. I think that if you randomize the merge that
> you'll probably over-cover the space. Making it less efficient
> order-wise than the copy-to-vector solution, but with a smaller
> constant factor.
>

Yeah, but that's more or less equivalent to selecting a single
element out of the list at random and placing it in another list.
(A solution that *does* work, but doesn't have that certain
je ne sais quoi that a recursive, purely functional (except for
random) solution has.)

ol...@pobox.com

unread,

May 28, 2002, 6:31:16 PM5/28/02

to

A pure-functional list-based perfect shuffling algorithm is described
in

http://pobox.com/~oleg/ftp/Haskell/perfect-shuffle.txt

I hope you don't mind reading Haskell code. The algorithm has the
complexity of Theta(n*logn), which is probably the best one can hope
for. The web page also shows a proof why the algorithm is correct.

Bernhard Pfahringer

unread,

May 28, 2002, 10:03:41 PM5/28/02

to

In article <7eb8ac3e.02052...@posting.google.com>,

please excuse me using Common Lisp here, but my Scheme is rather
rusty I am afraid. This version is as functional as the above,
and also obviously dominated by the sort step, which should be
O(nlogn). (random 1.0) returns a uniform random float from [0,1):

(defun shuffle (l)
(mapcar #'cdr
(sort (loop for x in l
collect (cons (random 1.0) x))
#'< :key #'car)))

I used to use this trick when having to shuffle lists in Prolog, actually.
--
---------------------------------------------------------------------
Bernhard Pfahringer, Dept. of Computer Science, University of Waikato
http://www.cs.waikato.ac.nz/~bernhard +64 7 838 4041
Boycott the IEEE for their copyright transfer and export control form

Ben Goetter

unread,

May 28, 2002, 10:41:03 PM5/28/02

to

Quoth Bernhard Pfahringer:

> (defun shuffle (l)
> (mapcar #'cdr
> (sort (loop for x in l
> collect (cons (random 1.0) x))
> #'< :key #'car)))

In Chez Scheme, which has equivalent SORT and RANDOM library support,
that'd be

(define (shuffle l)
(map cdr
(sort (lambda (x y) (< (car x) (car y)))
(map (lambda (x) (cons (random 1.0) x)) l))))

Phil Bewig

unread,

May 29, 2002, 10:18:11 AM5/29/02

to

pbe...@swbell.net (Phil Bewig) wrote in message news:<455f7154.02052...@posting.google.com>...

Thanks for all the suggestions. And no, David, this isn't
a pretty cool homework problem; I'm writing a test suite
and need random permutations of a list to feed the tester.

Ben's suggestion did what I hoped to avoid; calls to length,
list-ref and list-tail repeatedly traverse the list, leading
to O(n^2) complexity.

George suggested cdring down the list, inserting items into
a random location in a vector, then converting the vector
back into a list. That works well at first, but slows down
as the vector fills. Consider the case of putting the last
item into the vector when only one position remains unfilled.
The program keeps going on and on until it guesses the right
random number.

Joe suggested a divide-and-conquer approach that would
probably be O(n log n). This looks intriguing to me, and
I'm persuaded by David's argument that it will cover the
space of all possible permutations if done correctly.
Perhaps Joe would like to continue exploring his approach,
and report working code back to the group. Or perhaps not.

Oleg's perfect shuffle is perfect, but more machinery than
I want to build into my test suite, and it still has a time
complexity of O(n log n). It was hiding in c.l.functional,
so my searching didn't find it.

Section 13.3 of Jon Bentley's book More Programming Pearls
discusses the problem of random permutations. He presents
an algorithm by Bob Floyd, and suggests an O(n) method
involving a hash table with a linked list connecting the
entries. Again, this is more machinery than I want.

I've decided to go with Bernhard's suggestion that Ben
translated from Lisp to Scheme:

(define (shuffle l)
(map cdr
(sort (lambda (x y) (< (car x) (car y)))
(map (lambda (x) (cons (random 1.0) x)) l))))

This has time complexity O(n log n), but constant factors
make it even with the O(n) approach of copying list to
vector, shuffling vector, and copying vector back to list
for the lists of up to one hundred items I am looking at,
and at only four lines of code it fits nicely in the test
suite.

I think George said it best: the vector method is hard to
beat. If I had larger lists to shuffle, I would use the
vector method. In fact, I still may change my mind and
use the vector method; why keep two shuffles lying around
my bit bucket and have to decide which one to use, when
the vector method is still only ten lines long:

(define (shuffle lst)
(vector->list
((lambda (v)

(let ((n (- (vector-length v) 1)))
(do ((i 0 (+ i 1))) ((< n i) v)
(let* ((r (randint i n))
(t (vector-ref v r)))
(vector-set! v r (vector-ref v i))
(vector-set! v i t)))))

(list->vector lst))))

I suspected when I posted that the vector method might be
best, as most array-based O(n) algorithms seem to become
O(n log n) when adapted to lists, which is the cost of
having a data structure with a size not fixed in advance.

Thanks to all who responded. This was fun, wasn't it?

Phil

Seth Gordon

unread,

May 29, 2002, 11:35:47 AM5/29/02

to

> (define (shuffle l)
> (map cdr
> (sort (lambda (x y) (< (car x) (car y)))
> (map (lambda (x) (cons (random 1.0) x)) l))))

Omigod! It's the Schwartzian Transform!

--
"Fixing a compile is not an accomplishment unless
you did it by actually fixing the bugs the compiler
was complaining about." --Chris Smith
// seth gordon // wi/mit ctr for genome research //
// se...@genome.wi.mit.edu // standard disclaimer //

Joe Marshall

unread,

May 29, 2002, 5:08:32 PM5/29/02

to

"Phil Bewig" <pbe...@swbell.net> wrote in message news:455f7154.0205...@posting.google.com...

>
> Joe suggested a divide-and-conquer approach that would
> probably be O(n log n). This looks intriguing to me, and
> I'm persuaded by David's argument that it will cover the
> space of all possible permutations if done correctly.
> Perhaps Joe would like to continue exploring his approach,
> and report working code back to the group. Or perhaps not.
>
> Oleg's perfect shuffle is perfect, but more machinery than
> I want to build into my test suite, and it still has a time
> complexity of O(n log n). It was hiding in c.l.functional,
> so my searching didn't find it.
>

I looked at Oleg's solution and played with it a bit.
I think I'd liken it to a heap-sort run backwards. It is
purely functional, but as it removes elements from the
heap it has to rebuild parts of the tree.

I thought of a variation on it. Instead of building the
tree deterministically and selecting out of it at random,
build the tree at random and flatten it deterministically.
I think this just pushes the load from the back end of
the algorithm to the front end, though. Sort of like
running a merge-sort backwards.

I played with a few things and basically came up with a
variation on what you settled on

>
> (define (shuffle l)
> (map cdr
> (sort (lambda (x y) (< (car x) (car y)))
> (map (lambda (x) (cons (random 1.0) x)) l))))
>

Instead of traversing the list once to paste on random numbers,
then traversing the result to peel them off, I made a variant
of mergesort where the outermost level pastes on random numbers
as it is dividing the list in two and peels them off as it
is merging the final result.

I'm still not too happy with it. I think the idea of using
random numbers and sorting is neat, but there doesn't seem
to be an obvious divide and conquer method that doesn't
need to know the length of the list beforehand.

The best method seems to be the vector method, which traverses
the list 3 times (so O(3n))

Sorting is N log N, but `unsorting' is O(n) (randomizing
a vector). What gives?

Ben Goetter out of office

unread,

May 29, 2002, 7:48:23 PM5/29/02

to

Seth Gordon <se...@genome.wi.mit.edu> wrote in message news:<3CF4F553...@genome.wi.mit.edu>...

> Omigod! It's the Schwartzian Transform!

Ha! So it is.

As long as we have SORT, here's a better (non-ST, non-pointlessly-consing) version.

(define (shuffle deck)
(sort (lambda (x y) (even? (random 2))) deck))

Chris Kirkwood-Watts

unread,

May 29, 2002, 8:01:59 PM5/29/02

to

__ "Joe Marshall" <prunes...@attbi.com> _____

| Instead of traversing the list once to paste on random numbers, then
| traversing the result to peel them off, I made a variant of mergesort
| where the outermost level pastes on random numbers as it is dividing the
| list in two and peels them off as it is merging the final result.
|
| I'm still not too happy with it. I think the idea of using random numbers
| and sorting is neat, but there doesn't seem to be an obvious divide and
| conquer method that doesn't need to know the length of the list
| beforehand.

One could randomly partition the list into two sublists and then just
append the results of shuffling (randomly partitioning) those. In CL:

(defun shuffle-list (list)
(if (not (cdr list))
list
(loop for element in list
when (evenp (random 2))
collect element into first
else collect element into second
finally (return (append (shuffle-list first)
(shuffle-list second))))))

In Scheme:

(define (shuffle-list liszt)
(if (< (length liszt) 2)
liszt
(let ((first '())
(second '()))
(for-each (lambda (element)
(if (even? (rand 2))
(set! first (cons element first))
(set! second (cons element second))))
liszt)
(append (shuffle-list first) (shuffle-list second)))))

or set!-freely:

(define shuffle-list
(letrec ((partition
(lambda (liszt first second)
(cond ((null? liszt)
(values first second))
((even? (rand 2))
(partition (cdr liszt)
(cons (car liszt) first)
second))
(else
(partition (cdr liszt)
first
(cons (car liszt) second)))))))
(lambda (liszt)
(if (< (length liszt) 2)
liszt
(call-with-values
(lambda () (partition liszt '() '()))
(lambda (first second)
(append (shuffle-list first)
(shuffle-list second))))))))

Chris.

Jeffrey Siegal

unread,

May 29, 2002, 8:02:31 PM5/29/02

to

Ben Goetter out of office wrote:
> (define (shuffle deck)
> (sort (lambda (x y) (even? (random 2))) deck))

(zero? (random 2)) would be (slightly) faster on most architectures.

Thien-Thi Nguyen

unread,

May 29, 2002, 8:26:16 PM5/29/02

to

goe...@hotmail.com (Ben Goetter out of office) writes:

> (sort (lambda (x y) (even? (random 2))) deck)

(the uh... M-C-t-transform:

(sort deck (lambda (x y) (even? (random 2))))

:-)

thi

Bernhard Pfahringer

unread,

May 29, 2002, 10:03:37 PM5/29/02

to

In article <91af2201.02052...@posting.google.com>,

Sorry to disappoint, but this is usually worse,
because you generate O(nlogn) random numbers vs. n random numbers
in my approach. The cost of any good rng is (much) higher than
the consing necessary in my approach.

Bernhard

Joe Marshall

unread,

May 29, 2002, 10:28:33 PM5/29/02

to

"Jeffrey Siegal" <j...@quiotix.com> wrote in message news:3CF56C17...@quiotix.com...

It would?

Testing for zero involves `or'ing together 32 bits,
testing for even involves sampling a single bit.

I doubt that the difference is measurable (the probably both
can be tested in exactly one clock cycle).

Joe Marshall

unread,

May 29, 2002, 10:26:07 PM5/29/02

to

"Ben Goetter out of office" <goe...@hotmail.com> wrote in message
news:91af2201.02052...@posting.google.com...

Let me throw in my two cents.

Since your sort predicate is not transitive (or even deterministic!)
it might behave *very* poorly with regard to certain sort
algorithms. If the sort algorithm infers that I < J and J < K
implies I < K, then you have 2^(n-1) possible outcomes, which
isn't good enough.

Jeffrey Siegal

unread,

May 29, 2002, 10:44:49 PM5/29/02

to

Joe Marshall wrote:
> "Jeffrey Siegal" <j...@quiotix.com> wrote in message news:3CF56C17...@quiotix.com...
>
>>Ben Goetter out of office wrote:
>>
>>>(define (shuffle deck)
>>> (sort (lambda (x y) (even? (random 2))) deck))
>>
>>(zero? (random 2)) would be (slightly) faster on most architectures.
>
>
> It would?
>
> Testing for zero involves `or'ing together 32 bits,
> testing for even involves sampling a single bit.

Yes. 'Or'ing together all those bits is usually done in highly
optimized hardware, since testing against zero is very, very common in
most programs. Testing for zero usually involves, at most, a single
instruction, but is often included in other instructions. Testing for
even involves anding a bit and *then* testing for zero (which might be
included in the and or test instruction or might not).

On some very new architectures, you might be right. I think there is a
trend toward discouraging the use of implicit condition tests and
exposing their cost rather than buring it into other instructions. But
it is still going to be slightly faster (and often smaller) to test for
zero on *most* architectures.

Al Petrofsky

unread,

May 30, 2002, 12:25:22 AM5/30/02

to

Chris Kirkwood-Watts <kirk...@totzeit.net> writes:

> One could randomly partition the list into two sublists and then just
> append the results of shuffling (randomly partitioning) those. In CL:
>
> (defun shuffle-list (list)
> (if (not (cdr list))
> list
> (loop for element in list
> when (evenp (random 2))
> collect element into first
> else collect element into second
> finally (return (append (shuffle-list first)
> (shuffle-list second))))))

[Long functional scheme version omitted]

Lest anyone think scheme is any more hopelessly verbose than it
actually is, here's a functional version that is more compact, and
also does about 1/3 less consing:

(define (shuffle-list liszt)
(let shuffle ((liszt liszt) (acc '()))
(cond ((null? liszt) acc)
((null? (cdr liszt))
(cons (car liszt) acc))
(else (let partition ((liszt liszt) (l1 '()) (l2 '()))
(if (null? liszt)
(shuffle l1 (shuffle l2 acc))
(let ((first (car liszt))
(rest (cdr liszt)))
(if (zero? (random 2))
(partition rest (cons first l1) l2)
(partition rest l1 (cons first l2))))))))))

A theoretical drawback to this algorithm is that its execution time is
unbounded. If your coin keeps coming up heads for a very long time,
you can go through several partitionings without making any progress.

I would guess that the expected execution time would be O(n^2) and
possibly O(n*log(n)). Does anyone see a simple proof of the
complexity?

Joe Marshall writes:

> The best method seems to be the vector method, which traverses the
> list 3 times (so O(3n))

Of course, O(3n), O(n), and O(10n+n^.5) all describe the same set of
functions.

> Sorting is N log N, but `unsorting' is O(n) (randomizing a vector).
> What gives?

Sorting requires Omega(n log n) binary decisions, which follows from
the fact that there are n! possible orderings and log_2(n!) is Omega(n
log n). Similarly, unsorting requires Omega(n log n) random bits.
The vector algorithm is actually O(n log n) and not O(n), but we tend
to ignore factors of log n when log n < 32 (or whatever our machine's
word size is).

In fact, for practical purposes, I think you should compare an O(n)
algorithm and an O(n log n) algorithm the same way you compare two
O(n) algorithms: base your judgment on other information. What I find
instructive to remember is that if someone told me that algorithm A
was O(n^1.001) and algorithm B was O(n^1.000) (i.e. O(n)), I would
automatically discount that difference as negligible. The difference
between O(n) and O(n log n) is even more negligible: O(n) is a subset
of O(n log n) which is a subset of O(n^1.001).

Now, for impractical purposes, I still like to know what an
algorithm's tightest asymptotic bound is.

-al

Al Petrofsky

unread,

May 30, 2002, 1:37:21 AM5/30/02

to

Al Petrofsky <a...@petrofsky.org> writes:
> Chris Kirkwood-Watts <kirk...@totzeit.net> writes:
>
> > One could randomly partition the list into two sublists and then just
> > append the results of shuffling (randomly partitioning) those.

> A theoretical drawback to this algorithm is that its execution time is

> unbounded. If your coin keeps coming up heads for a very long time,
> you can go through several partitionings without making any progress.

I just realized that my code and Kirkwood's versions have a slightly
more serious flaw: they consume unbounded space. More specifically,
they require an unbounded number of active continuations. Below is a
fixed version. One could also put a cap on execution time by
arbitrarily splitting the list in half if several partitioning steps
failed to make any progress, but that would skew the output
distribution. In general, I don't think it's possible to get bounded
execution time without at least slightly disrupting the uniformity of
the output, if we assume the random number source is binary. That
follows from the fact that there's no perfectly random way to make a
trinary decision in a limited amount of time using binary coin flips.

(define (shuffle-list liszt)
(let shuffle ((liszt liszt) (acc '()))
(cond ((null? liszt) acc)
((null? (cdr liszt))
(cons (car liszt) acc))
(else (let partition ((liszt liszt) (l1 '()) (l2 '()))
(if (null? liszt)

(if (null? l1)
(partition l2 '() '())
(shuffle l1 (shuffle l2 acc)))

(let ((first (car liszt))
(rest (cdr liszt)))
(if (zero? (random 2))
(partition rest (cons first l1) l2)
(partition rest l1 (cons first l2))))))))))

-al

Al Petrofsky

unread,

May 30, 2002, 4:37:15 AM5/30/02

to

ol...@pobox.com (ol...@pobox.com) writes:

Here's a purely functional scheme shuffler that also has
Theta(n*log(n)) complexity. Like Oleg's version, it takes a second
list argument, of equal length, for which the nth (zero-based) element
should be an integer uniformly randomly chosen from the interval zero
to n (inclusive). Like the vector algorithm, and unlike the (random
1.0) algorithms, these functional algorithms consume only the minimum
amount of randomness required: log_2(n!) bits, rather than 64*n bits.
(Yes, for large enough n, 64*n is less than log_2(n!), but those are
the cases in which the (random 1.0) algorithm fails, i.e., produces
significantly non-uniform results.)

-al

(define (functional-shuffle liszt randoms)
(let shuffle ((liszt liszt)
(randoms (reverse randoms))
(len (length liszt))
(acc '()))
(case len
((0) acc)
((1) (cons (car liszt) acc))
(else (let ((half-len (quotient len 2)))
(let partition ((liszt liszt) (randoms randoms) (l1-len 0)
(l1 '()) (l1-randoms '())
(l2 '()) (l2-randoms '()))
(if (null? liszt)
(shuffle l1 (reverse l1-randoms) half-len
(shuffle l2 (reverse l2-randoms) (- len half-len)
acc))
(let ((elt (car liszt)) (liszt (cdr liszt))
(random (car randoms)) (randoms (cdr randoms)))
(let ((index (+ random l1-len)))
(if (< index half-len)
(partition liszt randoms (+ 1 l1-len)
(cons elt l1)
(cons random l1-randoms)
l2 l2-randoms)
(partition liszt randoms l1-len
l1 l1-randoms
(cons elt l2)
(cons (- index half-len)
l2-randoms))))))))))))

;; A driver function using a typical non-functional random procedure.
(define (shuffle liszt)
(functional-shuffle liszt
(let gen-randoms ((n (length liszt)) (acc '()))
(if (zero? n)
acc
(gen-randoms (- n 1) (cons (random n) acc))))))

Lauri Alanko

unread,

May 30, 2002, 8:55:39 AM5/30/02

to

In article <455f7154.02052...@posting.google.com>,

Phil Bewig <pbe...@swbell.net> wrote:
>It's easy enough to shuffle a vector:

No it isn't, apparently.

> (define (shuffle v)
> (let ((n (- (vector-length v) 1)))
> (do ((i 0 (+ i 1))) ((< n i) v)
> (let* ((r (randint i n))
> (t (vector-ref v r)))
> (vector-set! v r (vector-ref v i))
> (vector-set! v i t)))))

This won't give you evenly distributed random permutations. You have to swap
each element with a random element _ahead of it_. This is obvious, if you
stop and think about it. Hint: try to define a random permutation
inductively. Then translate that to scheme, and you'll very likely end up
with the straightforward list shuffling algorithm.

Lauri Alanko
l...@iki.fi

Bruce Lewis

unread,

May 30, 2002, 9:17:53 AM5/30/02

to

pbe...@swbell.net (Phil Bewig) writes:

> (do ((i 0 (+ i 1))) ((< n i) v)
> (let* ((r (randint i n))

I question whether the vector algorithm is really as good as the
sort-based list algorithms. For example, if you shuffle numbers 0-40,
will 10 move to the left as often as 30 moves to the right, or will
there be a statistically noticeable difference?

--
<brlewis@[(if (brl-related? message) ; Bruce R. Lewis
"users.sourceforge.net" ; http://brl.codesimply.net/
"alum.mit.edu")]>

Jeffrey Siegal

unread,

May 30, 2002, 10:26:47 AM5/30/02

to

Bruce Lewis wrote:
> pbe...@swbell.net (Phil Bewig) writes:
>
>
>> (do ((i 0 (+ i 1))) ((< n i) v)
>> (let* ((r (randint i n))
>
>
> I question whether the vector algorithm is really as good as the
> sort-based list algorithms. For example, if you shuffle numbers 0-40,
> will 10 move to the left as often as 30 moves to the right, or will
> there be a statistically noticeable difference?

It is when properly implemented.

Al Petrofsky

unread,

May 30, 2002, 1:00:16 PM5/30/02

to

Lauri Alanko <l...@iki.fi> writes:

> Phil Bewig <pbe...@swbell.net> wrote:
>
> > (define (shuffle v)
> > (let ((n (- (vector-length v) 1)))
> > (do ((i 0 (+ i 1))) ((< n i) v)
> > (let* ((r (randint i n))
> > (t (vector-ref v r)))
> > (vector-set! v r (vector-ref v i))
> > (vector-set! v i t)))))
>
> This won't give you evenly distributed random permutations. You have
> to swap each element with a random element _ahead of it_. This is
> obvious, if you stop and think about it.

Obviously :-), that's what he was doing, assuming that (randint i n)
generates a random integer in the range [i,n). Here's a version using
an unary random procedure that generates an integer in the range [0,n):

(define (shuffle lst)
(do ((v (list->vector lst))
(i (length lst) (- i 1)))
((zero? i) (vector->list v))
(let* ((r (random i))
(t (vector-ref v r)))
(vector-set! v r (vector-ref v (- i 1)))
(vector-set! v (- i 1) t))))

-al

Lauri Alanko

unread,

May 30, 2002, 1:07:12 PM5/30/02

to

In article <87znyhk...@radish.petrofsky.org>,
Al Petrofsky <a...@petrofsky.org> wrote:
>Lauri Alanko <l...@iki.fi> writes:
[foolishly]

>> This won't give you evenly distributed random permutations. You have
>> to swap each element with a random element _ahead of it_. This is
>> obvious, if you stop and think about it.
>
>Obviously :-), that's what he was doing, assuming that (randint i n)
>generates a random integer in the range [i,n).

Duh. This was an, uh, braino by me. Somehow the "i" in that expression
managed to consistently avert my eyes.

Apologies.

Lauri Alanko
l...@iki.fi

Phil Bewig

unread,

May 30, 2002, 2:24:58 PM5/30/02

to

Lauri Alanko <l...@iki.fi> wrote in message news:<ad57gb$eck$1...@oravannahka.helsinki.fi>...

I believe my shuffle is correct, in the sense that all possible random permu-
tations of the original vector are equally likely, given an unbiased source of
random integers. The call to (randint i n) generates a random integer between
i and n inclusive. The random integer is never less than i. This means that
the swap is never with an element behind; for instance, after the first time
through the loop, element zero is never changed again.

I also disagree with your statement that you must swap with a random element
ahead of the current element. You must also allow the current element to be
swapped with itself. To see why, consider the swap made at element zero. If
element zero can't be swapped with itself, and if element zero only takes part
in the first swap, then whatever is initially in element zero could never end
up in element zero in the shuffled list, eliminating a whole class of possible
permutations.

Please explain exactly what you think I did wrong, and show code to fix it.
Also show the straightforward list shuffling algorithm you mention in the last
line of your posting.

Phil

Ben Goetter out of office

unread,

May 30, 2002, 2:44:49 PM5/30/02

to

bern...@hummel.cs.waikato.ac.nz (Bernhard Pfahringer) wrote in message news:<ad419p$duk$1...@hummel.cs.waikato.ac.nz>...

> you generate O(nlogn) random numbers vs. n random numbers
> in my approach. The cost of any good rng is (much) higher than
> the consing necessary in my approach.

Whether this wins depends on the hosting implementation, I suppose.

I tested these two approaches in Chez on some non-small datasets
(n = 10000, n = 100000), and found there the overhead of generating
additional random numbers (albeit fixnum ones) more than offset by the
savings from skipping the cons/uncons and flonum arithmetic. As n
increased, the nonconsing fixnum version looked ever better.

If calls into the RNG were indeed expensive, one might be able to use
more than one bit out of it before returning for another RN.

Lauri Alanko

unread,

May 30, 2002, 2:53:41 PM5/30/02

to

In article <455f7154.02053...@posting.google.com>,

Phil Bewig <pbe...@swbell.net> wrote:
>I also disagree with your statement that you must swap with a random element
>ahead of the current element. You must also allow the current element to be
>swapped with itself.

Yes, sloppy language on my part. I did mean "ahead" in an inclusive sense.

As for the your algorithm, it was quite correct. I somehow managed to
repeteadly misread the bit where the element to be swapped was chosen.
Sorry. I was probably too used to seeing something like this:

http://www.augsburg.edu/compsci/Courses/CSC210/fall01thur/09a/#sectionC

(And that's _course material_, yikes.)

>Also show the straightforward list shuffling algorithm you mention in the last
>line of your posting.

Right. With "straigthforward" I meant something like this:

A random permutation of an empty list is an empty list.

A random permutation of a non-empty list is a list whose first element is a
random element of the input list, and the tail of which is a random
permutation of the remainder of the input list.

This translates fairly naturally to scheme (Haskell would be even simpler,
except for the complexity imposed by its purely functional random numbers).

(define (list-remove l i)
(if (zero? i)
(cdr l)
(cons (car l) (list-remove (cdr l) (- i 1)))))

(define (shuffle l)
(if (null? l)
'()
(let ((idx (random (length l))))
(cons (list-ref l idx)
(shuffle (list-remove l idx))))))

Not very efficient, admittedly.

And this is really what the vector-based algorithm also does, except that
its "lists" are subvectors, and "list-remove" is done by swapping the first
element with the element to be removed, and then moving the head of the
subvector forward.

Lauri Alanko
l...@iki.fi

Phil Bewig

unread,

May 30, 2002, 3:26:19 PM5/30/02

to

Are we having fun yet? Wow!

Here is another shuffling algorithm for consideration by all of you.
It feels
good, but I'm not a sophisticated enough mathematician to either prove
or
disprove it. Please take a look and let me know what you think.

Create a vector with M elements, each initialized to the null list.
Cdr down the original list, consing each item on the list to a
randomly-chosen vector element, counting the number of items in the
list as they pass. Fold the contents of the vector into one big list
with append. Repeat the cdr/cons/fold operation logM N - 1 additional
times, where M is the base of the logarithm. The list generated the
last time through the loop is the final output. The beauty of this
algorithm is that small values of M can shuffle large numbers of items
in small numbers of passes; for example, if N is 1,000,000, setting M
to 100 requires only three passes, and even setting M to 10 requires
only six passes. The algorithm has time complexity N logM N, but
since small M suffice, the logM N term becomes vanishingly small
compared to N, so the algorithm is effectively linear. Here's the
code, with M=2 for testing:

(define m 2)
(define (shuffle lst)
(define (fold v) (let loop ((i 0) (b '()))
(if (= i m) b (loop (+ i 1) (append (vector-ref v i) b)))))
(define (pass n) (let loop ((n n) (k 0))
(if (< n m) k (loop (quotient n m) (+ k 1)))))
(let loop ((l lst) (v (make-vector m '())) (n 0) (k -1))
(if (null? l)
(cond ((negative? k) (loop (fold v) (make-vector m '()) 0 (pass
n)))
((zero? k) (fold v))
((positive? k) (loop (fold v) (make-vector m '()) 0 (- k
1))))
(let ((i (randint m)))
(vector-set! v i (cons (car l) (vector-ref v i)))
(loop (cdr l) v (+ n 1) k)))))

Notes on the code: (randint m) returns a non-negative integer less
than m. K is -1 the first time through the loop, then is set to the
number of passes left to be made. Pass calculates (floor (logM n)).
The append is only peformed M logM N times, so it doesn't cost very
much. I would probably set M to somewhere around 10 in a production
version of the code.

The big question is whether logM N passes are sufficient to generate
all possible permutations of the original list with equal probability.
I generated 1,000 lists of 100 items each, shuffled them, and looked
at the car of each of the 1,000 lists; the histogram was agreeably
flat. I know that's not proof of the validity of my method.

Can someone please help?

Phil

Joe English

unread,

May 30, 2002, 2:44:25 PM5/30/02

to

Ben Goetter wrote:

>Seth Gordon wrote:
>> Omigod! It's the Schwartzian Transform!
>
>Ha! So it is.
>
>As long as we have SORT, here's a better (non-ST,
>non-pointlessly-consing) version.
>
>(define (shuffle deck)
> (sort (lambda (x y) (even? (random 2))) deck))

This is not likely to work very well.

The sorting algorithm -- whatever it is -- assumes that the comparison
function is transitive; (lambda (x y) (even? (random 2))) is not.
Since this assumption does not hold, the result is unpredictable.

For instance, suppose 'sort' used an insertion sort.
Then in the above implementation of 'shuffle', half of
the list elements would remain in their original position,
one quarter would move up one position, one eighth would
move up two positions, etc. This is clearly not a very
good shuffle!

With other sorting algorithms this will behave differently,
but it will just as likely be wrong.

--Joe English

jeng...@flightlab.com

Al Petrofsky

unread,

May 30, 2002, 5:28:59 PM5/30/02

to

pbe...@swbell.net (Phil Bewig) writes:

> The big question is whether logM N passes are sufficient to generate
> all possible permutations of the original list with equal probability.

No. Another question would be whether they are sufficient to generate
all the permutations with acceptably uniform probability. The answer
is no again, at least for M=2.

Consider the case of a two-element list. Your code does three passes.
Each pass has a 1/2 chance of putting both elements in the same
bucket, which swaps their order because buckets are built in reverse,
and a 1/2 chance of putting them into different buckets, creating an
additional 1/4 chance of swapping their order and a 1/4 chance of
preserving it. The probability of the order being preserved after
three passes is the sum of the probability that all three passes
preserved it, 1/4*1/4*1/4, and the probability that one pass preserved
it and the other two swapped it, 3*(1/4*3/4*3/4). Total = 1/64+27/64
= 7/16, which is significantly different from the desired 1/2.

You can verify this experimentally:

(do ((n 16000 (- n 1))
(count 0 (+ count (if (equal? '(a b) (shuffle '(a b)))
1
0))))
((zero? n) count))
=> 7051

For M >> N (">>" meaning "much bigger than"), your algorithm would
certainly generate an acceptably uniform distribution in a single
pass, equivalent to that of the assign-random-flonums-and-sort
algorithm, but your version would use much more space and time.

For M=10 and N < 10^6, which seems to be about what you're proposing,
I can't say off-hand whether the distribution would be "acceptably"
uniform. The answer would, of course, depend on how you intend to use
the results.

-al

Joe Marshall

unread,

May 30, 2002, 5:13:51 PM5/30/02

to

"Al Petrofsky" <a...@petrofsky.org> wrote in message news:87r8jul...@radish.petrofsky.org...

>
> Joe Marshall writes:
>
> > Sorting is N log N, but `unsorting' is O(n) (randomizing a vector).
> > What gives?
>
> Sorting requires Omega(n log n) binary decisions, which follows from
> the fact that there are n! possible orderings and log_2(n!) is Omega(n
> log n). Similarly, unsorting requires Omega(n log n) random bits.

D'oh! Forgot about generating those bits.

Jeffrey Siegal

unread,

May 30, 2002, 8:09:32 PM5/30/02

to

Lauri Alanko wrote:
> http://www.augsburg.edu/compsci/Courses/CSC210/fall01thur/09a/#sectionC
>
> (And that's _course material_, yikes.)

From a course using Java. What do you expect?

Max Hailperin

unread,

May 31, 2002, 10:03:43 AM5/31/02

to

Jeffrey Siegal <j...@quiotix.com> writes:

I'm not convinced the language has anything to do with it. Also, so
far as I can figure this out, the code in question is not just "course
material" but rather "textbook material" -- it looks like the course
instructor is just passing along questionable material from the
textbook author. (You can find the same shuffling stuff on the
textbook author's web site as well.) If somebody had a bone to pick
with an algorithm in my textbook, I wouldn't want them to say "what do
you expect, it is a book using Scheme." If I were to one day write
another book, using Java (in more than just the last chapter), I would
still want to be held responsible for more than just my choice of
language. -max

Joe Marshall

unread,

May 31, 2002, 11:55:28 AM5/31/02

to

"Max Hailperin" <m...@max.mcs.gac.edu> wrote in message news:t0u3cw8...@max.mcs.gac.edu...

> If I were to one day write
> another book, using Java (in more than just the last chapter), I would
> still want to be held responsible for more than just my choice of
> language. -max
>

Don't worry, we'll hold you responsible.

I agree that language selection isn't necessarily a reason to critique
an author. However, it does indicate something. How the author comes
to a decision about what language to use depends on factors such as
what the target audience is, the number of people in the target audience
already familiar with the language, the ease the author has in
expressing himself in the language, and the clarity of expression
available to the author.

Clearly Abelson and Sussman felt that the clarity of Scheme (and
the ease both have with the language) outweighed the other two
despite the fact that number of people in the target audience already
familiar with scheme would be quite low.

I find this to be true in general. People who choose Scheme for
writing a book have come to the conclusion that it is more
important to express yourself clearly and precisely than it is
to reach a larger audience.

People who choose a popular language, such as Java, have come
to the conclusion that that the goal of reaching a
larger audience is more important than the goal of clarity of
expression. There are many reasons they may come to this
conclusion:

1. They may be ignorant of Scheme and thus believe that
Java or C++ is as good a language as any.

2. They may wish to educate those people that would not
otherwise listen.

3. They may believe that they can be as clear or almost
as clear in Java as in Scheme.

4. They may not care if *anyone* understands the content
of the book, but only that the maximum number of people
purchase it.

Now when I see a book written in Java, there's a certain
probability that the author is simply trendy (4), ignorant (1),
or self-deluded (3). Alternatively, he may be quite a good
hacker (3, when his belief is grounded in fact) and trying
to impart knowledge that is not well known (2).

Unfortunately, the latter are quite rare and the former are
legion. Thus I approach any book written in Java (or C++)
with a certain amount of skepticism. (And I wouldn't bother
opening a book written in Perl.)

On the page with the array shuffling, the author first presents
the bubblesort algorithm. None of the `features' of Java are
being exploited. The algorithms are written in an imperative
style, the code is unnecessarily verbose and baroque. It appears
as if the author is not completely comfortable with block
scope because most routines `declare' all the variables at
function entry rather than in the nearest block. It appears
that the author is uncomfortable with for, if, and while
statements with only one expression. It appears as if the author
is uncomfortable with conditional expressions ( the ? : ternary).

I've come to expect these problems in Java-oriented books, and
I don't think it is unfounded.

Ray Dillinger

unread,

Jun 2, 2002, 6:53:38 PM6/2/02

to

Ben Goetter out of office wrote:

> bern...@hummel.cs.waikato.ac.nz (Bernhard Pfahringer) wrote in message news:<ad419p$duk$1...@hummel.cs.waikato.ac.nz>...

>

> Whether this wins depends on the hosting implementation, I suppose.
>
> I tested these two approaches in Chez on some non-small datasets
> (n = 10000, n = 100000), and found there the overhead of generating
> additional random numbers (albeit fixnum ones) more than offset by the
> savings from skipping the cons/uncons and flonum arithmetic. As n
> increased, the nonconsing fixnum version looked ever better.
>
> If calls into the RNG were indeed expensive, one might be able to use
> more than one bit out of it before returning for another RN.

Actually it depends on the application one is writing whether
the calls to the RNG are expensive or not.

Briefly, if you are using an RNG for simulation, you can use a
fairly cheap RNG. But if you are using an RNG in an adversarial
situation, especially for high stakes (simulating chance in high-
stakes games of chance, doing cryptography, etc) then the least
expensive RNG's you can actually use (those which your opponent
cannot predict) are way more expensive than cons calls.

Bear