
How do I generate random numbers in Tcl/Tk?


Nathan Gundlach

Sep 3, 2000, 4:48:09 PM
Hi,
 
How do I generate random numbers in Tcl/Tk?
 
I need one between 1 and 8.
 
Please be as specific as possible; I'm a newbie.
 
Thanks in advance,
 
-- Nathan

Steve Offutt

Sep 5, 2000, 3:12:53 PM
In article <AOPs5.6513$B54....@newsfeed.slurp.net>,
"Nathan Gundlach" <dagc...@frognet.net> wrote:
> "Jeff David" <jld...@enteract.com> wrote in message
> news:39B2FCCE...@enteract.com...

> > > Nathan Gundlach wrote:
> > >
> > > Hi,
> > >
> > > How do I generate random numbers in Tcl/Tk?
> > >
> > > I need one between 1 and 8.
> > >
> > > Please be as specific as possible; I'm a newbie.
> > >
> > The rand() function within expr should do the trick:
> >
> > If you want a random integer:
> > expr {int(8.0*rand()+1)}
> > If you want a random real:
> > expr {7.0*rand()+1}
> >
> > rand() gives a uniform random number in the range [0,1)
> > so the real example will never give you exactly 8.
> > You will get a real in the range [1,8)
>
> Right, thanks. I knew this, but unfortunately what I need is a new
> random number each time. Something like:
>
> randomize;
> getrand {$1 $8};
>
> (Note: I know that this is a poor example!)
>
proc random_int { upper_limit } {
    global myrand
    set myrand [expr int(rand() * $upper_limit + 1)]
    return $myrand
}

Something like this help?

To change the lower limit, just change the 1 to whatever.

Steve

--
Steve Offutt
*Learning* to program is my hobby


Sent via Deja.com http://www.deja.com/
Before you buy.

Donal K. Fellows

Sep 4, 2000, 5:24:04 PM
In article <NHQs5.6760$B54....@newsfeed.slurp.net>, Nathan Gundlach
<dagc...@frognet.net> writes
>I am writing an app to teach chessplayers algebraic notation (a way of
>writing down the moves of a game). I need to have a way of selecting which
>square to quiz them on. I think that Darren New provided the answer but I
>haven't tested it yet:
>expr srand([clock seconds])
>
>I just need it to be basically random, with different numbers each time the
>app starts up.

Oh, in that case using rand() and srand() is just fine. Had you been
doing cryptography you would have needed something else much stronger...

Donal.
--
Donal K. Fellows (at home)
--
FOOLED you! Absorb EGO SHATTERING impulse rays, polyester poltroon!!
(WARNING: There is precisely one error in this message.)

Steve Offutt

Sep 6, 2000, 12:11:56 AM
In article <39B3D5B8...@ajubasolutions.com>,
Dan Kuchler <kuc...@ajubasolutions.com> wrote:
>
> What are you trying to do exactly? It sounds like you
> need to guarantee the uniqueness of the number. Do you
> just need something that is guaranteed to be unique or
> do you need something that is guaranteed to be unique
> *AND* random? How many random numbers are you
> talking about generating? (100s, 1000s, 1000000s?)
>
Dan,

Do you have something handy that would guarantee uniqueness
and random (in the range of 1000 for instance)?

TIA

Dan Kuchler

Sep 6, 2000, 2:51:22 AM, to Steve Offutt
Steve Offutt wrote:
>
> Dan,
>
> Do you have something handy that would guarantee uniqueness
> and random (in the range of 1000 for instance)?

Because of the nature of rand, I think you need to generate
many more than 1000 numbers before you get the same number
again (that is because rand doesn't generate a truly random
number).

According to the man page, it sounds like the period for rand
is about 2^32 (although this will probably vary based on
the underlying implementation; I seem to remember
that it is implemented using the same formula/algorithm
on most platforms).

Anyway, if you only have about 1000 random numbers it should
be cheap (both memory and performance wise) enough to use
the random numbers as an index in an array.

Each time you generate a random number, check (using
'info exists') whether that element of the array already exists.
If it does, generate another random number. Repeat until
you find a unique one, and then do a
'set randArray($randomNumber) 1'
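A minimal sketch of this check-and-retry loop as a proc (the proc name
unique_rand is illustrative, not from the thread):

```tcl
# Rejection sampling against the randArray bookkeeping array:
# draw until we hit a value we have not handed out before.
proc unique_rand {limit} {
    global randArray
    while {1} {
        set n [expr {int(rand() * $limit)}]
        if {![info exists randArray($n)]} {
            set randArray($n) 1
            return $n
        }
    }
}
```

Each call returns an integer in [0, limit) that has not been returned
before; as discussed below, the retries get expensive as the range
fills up.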

I hope that helps.

--Dan

Stephen D. Cohen

Sep 6, 2000, 4:40:06 AM
On Tue, 05 Sep 2000 23:51:22 -0700, Dan Kuchler
<kuc...@ajubasolutions.com> wrote:

>Because of the nature of rand, I think you need to generate
>many more than 1000 numbers before you get the same number
>again (that is because rand doesn't generate a truly random
>number).

If he is generating random numbers between 1 and 1000, he
obviously would have to get the same number again in less than 1001
rand calls. Or am I missing something here?

>According to the man page it sounds like the period for rand
>is about 2^32 (although this will probably vary based on
>the underlying implementation, althought I seem to remember
>that it is implemented using the same formula/algorithm
>on most platforms).

That is the period before you get the same *sequence* of
numbers again. Not the period before you get the same *number* again.
If you are taking the output of rand and generating "random" integers
from it, you are likely to get many repeats in a small number of
calls.

>Anyway, if you only have about 1000 random numbers it should
>be cheap (both memory and performance wise) enough to use
>the random numbers as an index in an array.

That is the best way to do things if you want to order a set
of elements in "random" order.

>Each time you generate a random number check (using
>'info exists') if that element of the array already exists.
> If it does generate another random number. Repeat until
>you find a unique one, and then do a
>'set randArray($randomNumber) 1'

Of course, crossing off those last few numbers is going to
take forever. On the average, crossing off all 1000 numbers will take
about a half a million calls to rand. This is not the best solution.

A far better solution is to remove the number selected from
the "scoreboard" list and generate a new random number in the range of
1..<number of elements in scoreboard list>. That way it only takes
999 calls to rand to randomly order 1000 unique elements.

I made the mistake of trying it the other way once, it was
excruciating to watch.

Regards,

Steve

David Cuthbert

Sep 6, 2000, 9:43:34 AM
"Stephen D. Cohen" wrote:
> On Tue, 05 Sep 2000 23:51:22 -0700, Dan Kuchler
> <kuc...@ajubasolutions.com> wrote:
> >Because of the nature of rand, I think you need to generate
> >many more than 1000 numbers before you get the same number
> >again (that is because rand doesn't generate a truly random
> >number).
>
> If he is generating random numbers between 1 and 1000, he
> obviously would have to get the same number again in less than 1001
> rand calls. Or am I missing something here?

Tcl's rand() works by generating an integer in the range 1..2^31-1; the
output you get is divided by 2^31-1.

I can confirm (empirically) that the generated integers appear to repeat
after 2^31-2 iterations. Given the way the algorithm works (for you EE
types out there, it's a feedback shift register), once a number repeats the
sequence will repeat.

Of course, if you restrict the domain to 1..1000, there are a number of
elements which will map to the same number and you'll get the same number
again (but not the same sequence).

> >According to the man page it sounds like the period for rand
> >is about 2^32 (although this will probably vary based on
> >the underlying implementation, althought I seem to remember
> >that it is implemented using the same formula/algorithm
> >on most platforms).

s/most/all/

The routine can be found in tclExecute.c. It's not a platform-specific
file, so unless you're doing some other trickery under the hood to call a
different function...

> Of course, crossing off those last few numbers is going to
> take forever. On the average, crossing off all 1000 numbers will take
> about a half a million calls to rand. This is not the best solution.
>
> A far better solution is to remove the number selected from
> the "scoreboard" list and generate a new random number in the range of
> 1..<number of elements in scoreboard list>. That way it only takes
> 999 calls to rand to randomly order 1000 unique elements.

Even this is a bit of a pain. In C/C++, you're constantly shifting elements
of an array down (or, worse, traversing down a linked list 1000 times).
Either way, you're shifting/visiting about n/2 nodes per removal
(quadratic overall). Tcl's lists appear as the former. Array indices may
be better, but you still have to get a list of the valid indices each time
(so you're effectively creating a list 1000 times) or you're back to the
shifting problem.

My suggestion is to create an array in order (e.g., a(1)=1, a(2)=2, ...),
and then run through it and perform a swap with another random element.
Traverse it again to get your random ordering.
--
David Cuthbert
da...@kanga.org

Dan Kuchler

Sep 6, 2000, 10:18:09 AM, to David Cuthbert
David Cuthbert wrote:
>
> Even this is a bit of a pain. In C/C++, you're constantly shifting elements
> of an array down (or, worse, traversing down a linked list 1000 times).
> Either way, you're shifting/visiting about n/2 nodes per removal
> (quadratic overall). Tcl's lists appear as the former. Array indices may
> be better, but you still have to get a list of the valid indices each time
> (so you're effectively creating a list 1000 times) or you're back to the
> shifting problem.

My solution with array indices used 'info exists' to determine whether
a given index already existed in the array. This is a fast call,
and does not create a list of 1000 items. If I remember correctly,
'info exists' does a hashtable lookup which on average has a depth
of somewhere between 0 and 2 given the implementation of the hashtable
behind tcl's arrays. So that is why I suggested using an array
and 'info exists' to do the checking for uniqueness.

--Dan

Steve Offutt

Sep 6, 2000, 10:43:32 AM
In article <39B64C82...@mayo.edu>,
Bob Techentin <techenti...@mayo.edu> wrote:

> "Stephen D. Cohen" wrote:
> >
> > If he is generating random numbers between 1 and 1000, he
> > obviously would have to get the same number again in less than 1001
> > rand calls. Or am I missing something here?
> ...

> > A far better solution is to remove the number selected from
> > the "scoreboard" list and generate a new random number in the range
> > of 1..<number of elements in scoreboard list>. That way it only takes
> > 999 calls to rand to randomly order 1000 unique elements.
>
> I was reading of an alternative algorithm in "Programming Pearls" by Jon
> Louis Bentley a little while back. Great book, by the way. Has nothing
> to do with Perl. :-)
>
> To get a unique list of numbers (say, 1-1000) in a random order, you can
> first create an ordered list, then randomize the list by walking through
> once and performing a random 'swap'. Something like this:
>
> for {set i 0} {$i<1000} {incr i} {lappend nums $i}
> for {set i 0} {$i<1000} {incr i} {
>     set j [expr {int(rand()*1000)}]
>     set temp [lindex $nums $j]
>     set nums [lreplace $nums $j $j [lindex $nums $i]]
>     set nums [lreplace $nums $i $i $temp]
> }
>
> This might be a little slower than randomly selecting from a shrinking
> "scoreboard" list because you've got to use [lreplace] twice. I don't
> have a good feel for how fast it would be compared to methods that use
> array accesses. But it doesn't take much code to get the job done, and
> that is a good thing.
>
Bob,

Interesting algorithm, and compact code.

FWIW, I have been dividing my available coding time between a couple
of small projects and going back through my C++ text, developing
solutions for selected problems in Tcl/Tk (and adding a GUI).

1000 was just a number that I felt would show up any significant
differences in the speed and efficiency of different algorithms.

In the interest of discussion, and maybe having a little fun with
this, what has everybody got for a solution to shuffling a
standard deck of playing cards? (Whatever the best solution for
52 items is ought to apply to 1000, or 10000, or whatever, right?)

Thanks to all for your input.

Kevin Kenny

Sep 6, 2000, 1:53:01 PM

Bob Techentin wrote:
> To get a unique list of numbers (say, 1-1000) in a random order, you can
> first create an ordered list, then randomize the list by walking through
> once and performing a random 'swap'. Something like this:
>
> for {set i 0} {$i<1000} {incr i} {lappend nums $i}
> for {set i 0} {$i<1000} {incr i} {
>     set j [expr {int(rand()*1000)}]
>     set temp [lindex $nums $j]
>     set nums [lreplace $nums $j $j [lindex $nums $i]]
>     set nums [lreplace $nums $i $i $temp]
> }
>
> This might be a little slower than randomly selecting from a shrinking
> "scoreboard" list because you've got to use [lreplace] twice. I don't
> have a good feel for how fast it would be compared to methods that use
> array accesses. But it doesn't take much code to get the job done, and
> that is a good thing.

It's fine for lists with a few hundred elements at most. Otherwise,
[lreplace] gets to be a disaster. I timed several methods, and posted
the results at http://mini.net/cgi-bin/wikit/941.html.

// Kevin

Bob Techentin

Sep 6, 2000, 4:00:34 PM
Kevin Kenny wrote:
>
> I timed several methods, and posted the
> results at http://mini.net/cgi-bin/wikit/941.html.

Bravo, Kevin. Very well done.

Bob
--
Bob Techentin techenti...@mayo.edu
Mayo Foundation (507) 284-2702
Rochester MN, 55905 USA http://www.mayo.edu/sppdg/sppdg_home_page.html

Bob Techentin

Sep 6, 2000, 4:27:20 PM
Kevin Kenny wrote:
>
> I timed several methods, and posted the
> results at http://mini.net/cgi-bin/wikit/941.html.

One of the suggestions, I think first proposed by Stephen D. Cohen, was
to randomly pick elements from one list and hand them off to another.
Something like:

proc K { x y } { set x }
proc shuffle3 { list } {
    set n [llength $list]
    while {$n>0} {
        set j [expr {int(rand()*$n)}]
        lappend slist [lindex $list $j]
        set list [lreplace [K $list [set list {}]] $j $j]
        incr n -1
    }
    return $slist
}

I've tried timing this on my WinNT machine, and it runs 3x slower than
shuffle1a, which I don't completely understand. It *should* just be
moving the elements from one list to another, but something else must be
going on.

Thanks,

Donal K. Fellows

Sep 5, 2000, 3:51:15 PM
In article <8p3gj8$6mu$1...@nnrp1.deja.com>, Steve Offutt
<ind...@oz.sunflower.org> writes

> proc random_int { upper_limit } {
> global myrand
> set myrand [expr int(rand() * $upper_limit + 1)]
> return $myrand
> }
>
>Something like this help?
>
>To change the lower limit, just change the 1 to whatever.

Shame on you for not writing a more general solution! :^)

proc random_int {upperLimit {lowerLimit 1}} {
    expr {int(rand()*($upperLimit-$lowerLimit+1)+$lowerLimit)}
}
expr {srand([clock seconds])}

Donal.

Donal K. Fellows

Sep 6, 2000, 6:01:21 PM
In article <ae0crsgu69eojv175...@4ax.com>, Stephen D.
Cohen <sco...@tampabay.rr.com> writes

> If he is generating random numbers between 1 and 1000, he
>obviously would have to get the same number again in less than 1001
>rand calls. Or am I missing something here?

Grrr. People who simply *don't* understand random number sequences. If
you have a random sequence, the probability of a number occurring twice
in a row must be exactly the same as the probability of that number
being followed by any other nominated number. (In fact, it gets better
than that. For a *truly* random sequence, the shortest description of
the sequence should be the sequence itself...)

[ random permutations of a list ]


> Of course, crossing off those last few numbers is going to
>take forever. On the average, crossing off all 1000 numbers will take
>about a half a million calls to rand. This is not the best solution.
>
> A far better solution is to remove the number selected from
>the "scoreboard" list and generate a new random number in the range of
>1..<number of elements in scoreboard list>. That way it only takes
>999 calls to rand to randomly order 1000 unique elements.
>
> I made the mistake of trying it the other way once, it was
>excruciating to watch.

Heh! I find this hilarious. :^)

There was a discussion a while back (either on this group or on the
Wiki) of how to randomly permute a list very quickly. IIRC, the fastest
technique is to pair each list element with a random float (such as
rand() really generates), [lsort] on that element, and then strip those
values out again.

proc shuffle {list} {
    set tmp [list]
    foreach item $list {
        lappend tmp [list $item [expr {rand()}]]
    }
    set res [list]
    foreach item [lsort -real -index 1 $tmp] {
        lappend res $item
    }
    return $res
}

Let's see: two O(n) operations and an O(n log n) operation. And with
really small constants. It'd be very difficult to beat, since in any
technique with selection from a set (the only potential O(n) algorithm)
you have the problem of shrinking the array. Hmm. You could do it in C
by pulling the element at the end of the list into the gap left by the
removal, but getting that right in Tcl even with the techniques outlined
in the Wiki page on Tcl Performance... Tricky...
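The swap-into-the-gap idea can in fact be sketched in Tcl itself as an
in-place Fisher-Yates shuffle. This is an editorial sketch, not code
from the thread, and it assumes [lset] (Tcl 8.4+, newer than the
interpreters current when this thread was written):

```tcl
# In-place Fisher-Yates shuffle: walk from the end, swapping each
# element with a randomly chosen element at or before it. Because the
# list's length never changes, no copying/shifting of elements occurs.
proc shuffleInPlace {list} {
    for {set i [expr {[llength $list] - 1}]} {$i > 0} {incr i -1} {
        set j [expr {int(rand() * ($i + 1))}]
        set tmp [lindex $list $i]
        lset list $i [lindex $list $j]
        lset list $j $tmp
    }
    return $list
}
```

Since $list is the proc's private copy, [lset] can modify it without
duplicating the whole list, giving the O(n) behavior Donal describes.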

sco...@xybion.com

Sep 6, 2000, 7:06:35 PM
Gang,

This horse is now nearly completely beaten to death. Thus I will
give it one more good solid flog:

In article <39B6A8A8...@mayo.edu>,
Bob Techentin <techenti...@mayo.edu> wrote:
> Kevin Kenny wrote:
> >
> > I timed several methods, and posted the
> > results at http://mini.net/cgi-bin/wikit/941.html.

I must agree with the "Bravo Kevin" expressed earlier. This is a
nice piece on the relative efficiencies of the various algorithms and
Tcl in general in handling lists.

> One of the suggestions, I think first proposed by Stephen D. Cohen was
> to randomly pick elements from one list and hand them off to another.
> Something like:
>
> proc K { x y } { set x }
> proc shuffle3 { list } {
> set n [llength $list]
> while {$n>0} {
> set j [expr {int(rand()*$n)}]
> lappend slist [lindex $list $j]
> set list [lreplace [K $list [set list {}]] $j $j]
> incr n -1
> }
> return $slist
> }

This is a close approximation of/exactly the code that Kevin ran
in his test, isn't it? I wonder if we could convince him to run *just
one more routine* through his gamut. Here it is:

#
# Steve Cohen's implementation of Steve Cohen's proposed method.
#
proc shuffle4 { list } {
    set n [llength $list]
    while {$n>0} {
        set j [expr {int(rand()*$n)}]
        lappend slist [lindex $list $j]
        incr n -1
        set temp [lindex $list $n]
        set list [lreplace [K $list [set list {}]] $j $j $temp]
    }
    return $slist
}

Note that the only big modification is to simply swap the list
element selected from the first list with the nth element from the
original list. By not modifying the size of the list, Tcl seems to
leave it alone (in memory). On my (really slow P133 laptop) machine,
this runs linearly with list size. I would be interested to see what
Kevin gets with his big-hootie computer.

Regards,

Steve

Kevin Kenny

Sep 6, 2000, 5:15:30 PM
Bob Techentin wrote:
> I've tried timing this on my WinNT machine, and it runs 3x slower than
> shuffle1a, which I don't completely understand. It *should* just be
> moving the elements from one list to another, but something else must be
> going on.

The problem is that the 'lreplace' is changing the size of the list, and
so it has to copy, on average, half the elements. The copying leads to
quadratic behavior again. Look at the code in Tcl_ListObjReplace
(tclListObj.c) to see the optimization:

    /*
     * Shift the elements after the last one removed to their
     * new locations.
     */

    start = (first + count);
    numAfterLast = (numElems - start);
    shift = (objc - count);    /* numNewElems - numDeleted */
    if ((numAfterLast > 0) && (shift != 0)) {
        /* ... code that shifts the elements up and down deleted ... */
    }

Clever fellows, those Tcl implementors.

I've updated the Wiki page (http://mini.net/cgi-bin/wikit/941.html) with
the shuffle3 procedure and its numbers.

// Kevin

Kevin Kenny

Sep 6, 2000, 7:47:48 PM
sco...@xybion.com wrote:

> proc shuffle4 { list } {
>     set n [llength $list]
>     while {$n>0} {
>         set j [expr {int(rand()*$n)}]
>         lappend slist [lindex $list $j]
>         incr n -1
>         set temp [lindex $list $n]
>         set list [lreplace [K $list [set list {}]] $j $j $temp]
>     }
>     return $slist
> }
>
> Note that the only big modification is to simply swap the list
> element selected from the first list with the nth element from the
> original list. By not modifying the size of the list, Tcl seems to
> leave it alone (in memory). On my (really slow P133 laptop) machine,
> this runs linearly with list size. I would be interested to see what
> Kevin gets with his big-hootie computer.

'Big-hootie computer?' Is this a new technical term?

Never mind, I added 'shuffle4' to the benchmark on the Wiki. As you
suspected, it's a shade faster than 'shuffle1a'.

Y'know, I still kind of like the naive 'shuffle0' implementation. It
seems to partake more of the Tao of Tcl, even if it isn't *quite* as
fast as the others.

// Kevin

Stephen D. Cohen

Sep 6, 2000, 8:35:11 PM
On Wed, 6 Sep 2000 23:01:21 +0100, "Donal K. Fellows"
<do...@ugglan.demon.co.uk> wrote:

>In article <ae0crsgu69eojv175...@4ax.com>, Stephen D.
>Cohen <sco...@tampabay.rr.com> writes
>> If he is generating random numbers between 1 and 1000, he
>>obviously would have to get the same number again in less than 1001
>>rand calls. Or am I missing something here?
>
>Grrr. People who simply *don't* understand random number sequences. If
>you have a random sequence, the probability of a number occurring twice
>in a row must be exactly the same as the probability of that number
>being followed by any other nominated number. (In fact, it gets better
>than that. For a *truly* random sequence, the shortest description of
>the sequence should be the sequence itself...)

	I hope that the Grrr.. was for the original poster, and not my
rather obvious explanation of the pigeonhole principle. Obviously,
it will generally take a lot more than 1000 rand calls to cover the
numbers from 1 to 1000. On the average, it will be more like infinity
(well, it approaches infinity).

>[ random permutations of a list ]
>> Of course, crossing off those last few numbers is going to
>>take forever. On the average, crossing off all 1000 numbers will take
>>about a half a million calls to rand. This is not the best solution.
>>
>> A far better solution is to remove the number selected from
>>the "scoreboard" list and generate a new random number in the range of
>>1..<number of elements in scoreboard list>. That way it only takes
>>999 calls to rand to randomly order 1000 unique elements.
>>
>> I made the mistake of trying it the other way once, it was
>>excruciating to watch.
>
>Heh! I find this hilarious. :^)

It was. I was trying to randomly order a list of about 500
elements. I let it run for a while, and gave up a half hour later
with about 95% of the list done (on a VAX 11/750). :-) I stopped and
did the math and was most impressed with how unlikely it was for me to
succeed with that method.

But it isn't as bad as I speculated in the paragraph quoted
above. The actual expected number of rand calls for a list of 1000 is
7,486. Quite a few calls, but certainly less than half a million.
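The 7,486 figure is the classic coupon-collector expectation: covering
all n values by blind sampling takes on average n times the n-th
harmonic number calls. A quick check (the proc name is mine, not from
the thread):

```tcl
# Coupon-collector expectation: the expected number of rand() calls
# to see all n values is n * (1/1 + 1/2 + ... + 1/n).
proc expectedCalls {n} {
    set h 0.0
    for {set k 1} {$k <= $n} {incr k} {
        set h [expr {$h + 1.0 / $k}]
    }
    expr {$n * $h}
}
```

For n = 1000 this comes out near 7485.5, matching the figure above.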

>There was a discussion a while back (either on this group or on the
>Wiki) of how to randomly permute a list very quickly. IIRC, the fastest
>technique is to pair each list element with a random float (such as
>rand() really generates,) [lsort] on that element, and then strip those
>values out again.
>
> proc shuffle {list} {
>     set tmp [list]
>     foreach item $list {
>         lappend tmp [list $item [expr {rand()}]]
>     }
>     set res [list]
>     foreach item [lsort -real -index 1 $tmp] {
>         lappend res $item
>     }
>     return $res
> }

This is a good choice provided that doubling up the list is
not expensive and that lsort does some sort of reasonably intelligent
sorting. Both of these seem like safe assumptions.

>Let's see: two O(n) operations and an O(nlogn) operation. And with
>really small constants. It'd be very difficult to beat, since in any
>technique with selection from a set (the only potential O(n) algorithm)
>you have the problem of shrinking the array. Hmm. You could do it in C
>by pulling the element at the end of the list into the gap left by the
>removal, but getting that right in Tcl even with the techniques outlined
>in the Wiki page on Tcl Performance... Tricky...

I proposed exactly this solution for Kevin to try about two
hours ago. We'll see if he gives it a whirl and adds it to his web
page. It wasn't really too difficult to get the indexing straight and
I used your K operator to fool Tcl into not doing the ugly thing.

Regards,

Steve

Steve Offutt

Sep 6, 2000, 8:57:03 PM
In article <eDKt5BAz...@ugglan.demon.co.uk>,
"Donal K. Fellows" <do...@ugglan.demon.co.uk> wrote:
> In article <8p3gj8$6mu$1...@nnrp1.deja.com>, Steve Offutt
> <ind...@oz.sunflower.org> writes
> > proc random_int { upper_limit } {
> > global myrand
> > set myrand [expr int(rand() * $upper_limit + 1)]
> > return $myrand
> > }
> >
> >Something like this help?
> >
> >To change the lower limit, just change the 1 to whatever.
>
> Shame on you for not writing a more general solution! :^)

Yes, shame on me!!! :^) I must admit to posting a quick hack to
a procedure I wrote returning numbers from 0 to $upper_limit,
thinking it might point the OP in the right direction.

I wrote a general one last night, however, by the time I was
getting around to posting it, well...

proc random_int { upper lower } {
    set range [expr $upper - $lower]
    return [expr int(rand() * $range) + $lower]
}

Is this a little better Prof Donal?
(and if not, please comment)

Thanks for the nudge. Consider me sufficiently chastised.

Steve


>
> proc random_int {upperLimit {lowerLimit 1}} {
>     expr {int(rand()*($upperLimit-$lowerLimit+1)+$lowerLimit)}
> }
> expr {srand([clock seconds])}
>

Donal, is "expr {srand([clock seconds])}" necessary?
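One way to see what srand() buys you: the same seed replays the same
stream of rand() values, so seeding from [clock seconds] is what makes
each run of the application differ. A sketch (the proc name is mine,
not from the thread):

```tcl
# srand() seeds the generator; after a fixed seed, the sequence of
# rand() values is fully determined and reproducible.
proc firstRandAfterSeed {seed} {
    expr {srand($seed)}
    expr {rand()}
}
```

Calling it twice with the same seed returns the identical value;
different seeds (almost always) give different values.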


David Cuthbert

Sep 6, 2000, 9:58:06 PM
Dan Kuchler wrote:
> My solution with array indices used 'info exists' to determine whether
> a given index already existed in the array. This is a fast call,
> and does not create a list of 1000 items.

True, but, as Stephen pointed out, you need a lot of these checks to get the
last few numbers in the sequence out. My trusty ol' Mathematica says that,
when there's only one number remaining, 1000 calls to this only gives you a
63% chance of finding that last number.

On average, you'll need 5300 calls to extract all of the number using this
type of blind searching.
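That 63% figure is simply 1 - (999/1000)^1000, the chance of hitting
the one remaining value at least once in 1000 independent draws:

```tcl
# Probability of drawing the single remaining number at least once
# in 1000 blind draws from a 1000-value range.
set p [expr {1.0 - pow(999.0 / 1000.0, 1000)}]
# p is about 0.632
```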

To improve this, you can make each draw succeed by picking from a list
of the numbers not yet found (e.g., array names ...)

But, anyway, enough theory. I threw these into a benchmark, with 20 trials
each trying to create a list of between 100 and 6400 unique random
integers. The results:

Method:     100     200     400     800     1600     3200      6400
Dan         6.0   16.55   31.55   67.65   149.65   350.50    716.60
Stephen     2.0    5.0    19.00   71.10   471.70  2341.95  15000.15
Dave        2.5    5.1    12.50   23.00    49.05   106.15    224.35
Donal       2.0    2.0    10.50   23.00    46.55   107.65    250.40

(Times in wall ms/iteration, resolution of 1.0 ms; benchmark machine is an
unloaded AMD K6-2/333 running NT 4.0, Tcl8.4a1).

These results may skew a bit on a machine with better FPU performance
(PII/III, Athlon, Alpha, Sparc, etc.).
--
David Cuthbert
da...@kanga.org

--------------------------------------------------------------------------

proc DanOrder {size} {
    set result [list]

    for {set i 0} {$i < $size} {incr i} {
        set number [expr {int(rand() * $size)}]
        while {[info exists found($number)]} {
            set number [expr {int(rand() * $size)}]
        }

        set found($number) 1
        lappend result $number
    }

    return $result
}


proc StephenOrder {size} {
    set result [list]
    set scoreboard [list]

    for {set i 0} {$i < $size} {incr i} {
        lappend scoreboard $i
    }

    for {set i 0} {$i < $size} {incr i} {
        set index [expr {int(rand() * [llength $scoreboard])}]
        lappend result [lindex $scoreboard $index]
        set scoreboard [lreplace $scoreboard $index $index]
    }

    return $result
}


proc DaveOrder {size} {
    set result [list]

    for {set i 0} {$i < $size} {incr i} {
        set position($i) $i
    }

    for {set i 0} {$i < $size} {incr i} {
        set swap [expr {int(rand() * $size)}]
        set temp $position($i)
        set position($i) $position($swap)
        set position($swap) $temp
    }

    for {set i 0} {$i < $size} {incr i} {
        lappend result $position($i)
    }

    return $result
}


proc DonalOrder {size} {
    set result [list]

    for {set i 0} {$i < $size} {incr i} {
        lappend interim [list [expr {rand()}] $i]
    }

    foreach el [lsort -real -index 0 $interim] {
        lappend result [lindex $el 1]
    }

    return $result
}


proc Check {method size} {
    set start [clock clicks -milliseconds]
    set result [$method $size]
    set end [clock clicks -milliseconds]

    foreach el $result {
        if {[info exists found($el)]} {
            puts "Error: $method duplicated item $el"
        } else {
            set found($el) 1
        }
    }

    for {set i 0} {$i < $size} {incr i} {
        if {![info exists found($i)]} {
            puts "Error: $method missed item $i"
        }
    }

    return [expr {$end - $start}]
}

proc Trial {methods} {
    set sizes [list 100 200 400 800 1600 3200 6400]
    set trials 20

    foreach method $methods {
        puts -nonewline stdout "$method: "
        flush stdout

        foreach size $sizes {
            set time 0.0
            for {set trial 0} {$trial < $trials} {incr trial} {
                set time [expr {$time + [Check $method $size]}]
            }

            set time [expr {$time / $trials}]
            puts -nonewline stdout " $time"
            flush stdout
        }

        puts stdout ""
    }
}

Trial [list DanOrder StephenOrder DaveOrder DonalOrder]

Stephen D. Cohen

Sep 6, 2000, 9:59:15 PM
On Wed, 06 Sep 2000 19:47:48 -0400, Kevin Kenny <ken...@crd.ge.com>
wrote:

>sco...@xybion.com wrote:
>
>> proc shuffle4 { list } {
>>     set n [llength $list]
>>     while {$n>0} {
>>         set j [expr {int(rand()*$n)}]
>>         lappend slist [lindex $list $j]
>>         incr n -1
>>         set temp [lindex $list $n]
>>         set list [lreplace [K $list [set list {}]] $j $j $temp]
>>     }
>>     return $slist
>> }
>>
>> Note that the only big modification is to simply swap the list
>> element selected from the first list with the nth element from the
>> original list. By not modifying the size of the list, Tcl seems to
>> leave it alone (in memory). On my (really slow P133 laptop) machine,
>> this runs linearly with list size. I would be interested to see what
>> Kevin gets with his big-hootie computer.
>
>'Big-hootie computer?' Is this a new technical term?

	Perhaps it is an age-group thing; I am an old-timer.

>Never mind, I added 'shuffle4' to the benchmark on the Wiki. As you
>suspected, it's a shade faster than 'shuffle1a'.

But only for the really big lists, as I suspected. I wonder
if the set temp variable reference / dereference could be removed? I
bet it could, and that would further improve the performance of this
quite dead horse.

>Y'know, I still kind of like the naive 'shuffle0' implementation. It
>seems to partake more of the Tao of Tcl, even if it isn't *quite* as
>fast as the others.

Isn't that actually the Zen of Tcl? Whatever... I agree with
your sentiments and was just trying out some academic type stuff.

Regards,

Steve

Dan Kuchler

Sep 7, 2000, 12:40:30 AM, to David Cuthbert
David Cuthbert wrote:
>
> Dan Kuchler wrote:
> > My solution with array indices used 'info exists' to determine whether
> > a given index already existed in the array. This is a fast call,
> > and does not create a list of 1000 items.
>
> True, but, as Stephen pointed out, you need a lot of these checks to get the
> last few numbers in the sequence out. My trusty ol' Mathematica says that,
when there's only one number remaining, 1000 calls to this only give you a
> 63% chance of finding that last number.

The original question posed to me was:

> Do you just need something that is guaranteed to be unique or
> do you need something that is guaranteed to be unique
> *AND* random? How many random numbers are you
> talking about generating? (100s, 1000s, 1000000s?)
>

Dan,

Do you have something handy that would guarantee uniqueness
and random (in the range of 1000 for instance)?

When the 'in the range of 1000' statement was made, I didn't
interpret it to be a range between 1 and 1000, I interpreted
it to be 1000 unique numbers.

I think the algorithm I proposed would be fine for generating
1000 unique numbers using rand. rand by its definition generates
a number between 0 and 2^32-1, so there shouldn't be many
collisions. You are correct that what I said wouldn't be
very fast if you needed to randomize numbers between 1 and
1000 but that wasn't the problem that I was trying to solve.
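A minimal sketch of the approach Dan describes (the proc name is illustrative, not from the thread; the range is narrowed to fit a signed 32-bit int, and it stays effective as long as far fewer values are requested than the range holds):

```tcl
# Draw integers from a large range, using an array and [info exists]
# to reject the (rare) duplicate draws.
proc unique_randoms {n} {
    set result {}
    while {[llength $result] < $n} {
        set r [expr {int(rand() * 2147483647)}]
        if {![info exists seen($r)]} {
            set seen($r) 1
            lappend result $r
        }
    }
    return $result
}
```

For 1000 values out of a 2^31 range, the expected number of rejected draws is negligible.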

I guess maybe I misunderstood what Steve Offutt
was asking.

Thanks for writing up the code and testing it though..

--Dan

David Cuthbert

Sep 7, 2000, 1:11:21 AM
to
Dan Kuchler wrote:
> When the 'in the range of 1000' statement was made, I didn't
> interpret it to be a range between 1 and 1000, I interpreted
> it to be 1000 unique numbers.

Ah, I read (paraphrasing) "generate 1000 unique random numbers" as
"randomise 1000 consecutive numbers." Yes, this makes a big difference!

Indeed, your code would work extremely well in that case. I was doing a few
double-takes when I saw your solution, given my assumptions (hmm... I think
Ajuba has been working Dan a bit too hard lately... ;-).

> Thanks for writing up the code and testing it though..

I have to admit, I'm a bit surprised by the results. I knew that Stephen's
proposal was going to take a hit due to the list copies, but I didn't
realise just how much this would affect the performance. And seeing Donal's
proposal, my thought was, "Dang, that's going to be fast." But I guess my
CPU just isn't beefy enough on the floating point side; I'd like to run the
benchmark on a Sparc.

--
David Cuthbert
da...@kanga.org

Dan Kuchler

Sep 7, 2000, 3:30:00 AM
to David Cuthbert
David Cuthbert wrote:
>
> I have to admit, I'm a bit surprised by the results. I knew that Stephen's
> proposal was going to take a hit due to the list copies, but I didn't
> realise just how much this would affect the performance.

One good thing came out of this. Eric Melski took a look at
the code that implements lreplace (the underlying
Tcl_ListObjReplace from generic/tclListObj.c; see the snippet
below).

The first thing that grabbed his attention from the code
was that there are for loops being used to copy a contiguous
region of memory (or at least that is how I read the code
below). I think if those are replaced with memcpy, there
will be some nice performance speed up.

(I just overheard him, so I may have misunderstood, but
if I understand the code right, he made a nice find..)

--Dan


/*
 * Shift the elements after the last one removed to their
 * new locations.
 */

start = (first + count);
numAfterLast = (numElems - start);
shift = (objc - count);    /* numNewElems - numDeleted */
if ((numAfterLast > 0) && (shift != 0)) {
    Tcl_Obj **src, **dst;

    if (shift < 0) {
        for (src = elemPtrs + start, dst = src + shift;
                numAfterLast > 0; numAfterLast--, src++, dst++) {
            *dst = *src;
        }
    } else {
        for (src = elemPtrs + numElems - 1, dst = src + shift;
                numAfterLast > 0; numAfterLast--, src--, dst--) {
            *dst = *src;
        }
    }
}

Laurent Duperval

Sep 7, 2000, 7:55:24 AM
to
On 7 Sep, Steve Offutt wrote:
> Donal, is "expr {srand([clock seconds])}" necessary?
>

I feel more chastising on its way. Of course it's necessary. Otherwise, you
always generate the same sequence of random numbers. Remember, these are
pseudo random numbers.
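For the record, the idiom under discussion is just the following (a sketch; note that later Tcl releases also seed rand() automatically from the clock on its first use in an interpreter, so the explicit srand() mainly matters when you want control over the seed, e.g. for reproducible runs):

```tcl
# Seed the generator once at startup so separate runs differ...
expr {srand([clock seconds])}
# ...then draw values; here, an integer in 1..8.
puts [expr {int(rand() * 8) + 1}]
```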

L

--
MY EMAIL ADDRESS HAS CHANGED --> UPDATE YOUR ADDRESSBOOK

Laurent Duperval "Montreal winters are an intelligence test,
and we who are here have failed it."
mailto:laurent....@netergynet.com -Doug Camilli
Penguin Power! ***Nothing I say reflects the views of my employer***

Kevin Kenny

Sep 7, 2000, 9:51:58 AM
to

Dan Kuchler wrote:
>
> David Cuthbert wrote:
> >
> > I have to admit, I'm a bit surprised by the results. I knew that Stephen's
> > proposal was going to take a hit due to the list copies, but I didn't
> > realise just how much this would affect the performance.
>
> One good thing came out of this. Eric Melski took a look at
> the code that implements lreplace (the underlying
> Tcl_ListObjReplace from generic/tclListObj.c (See snippet
> below)
>
> The first thing that grabbed his attention from the code
> was that there are for loops being used to copy a contiguous
> region of memory (or at least that is how I read the code
> below). I think if those are replaced with memcpy, there
> will be some nice performance speed up.

Yeah, it will help, but only by a constant factor. The
quadratic time factor involved in recopying the list on
each trip through the loop will still get you in the end.
The key to performance in this application is to make sure
that the [lreplace] code:

(1) is working with a unique Tcl_Obj, so that it doesn't need
to be duplicated (Donal's 'K' combinator addresses this
problem).
(2) doesn't resize the list, or else deletes elements only from
the end (so that either 'numAfterLast' or 'shift' will be
zero in the code).
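For readers who haven't met it, the 'K' device mentioned in (1) works like this (a sketch of the idiom, not code quoted from a post):

```tcl
# K returns its first argument and discards the second.
proc K {a b} {return $a}

set list {a b c d e}
# [set list {}] empties the variable, dropping its reference to the
# list object; K then hands the now-unshared object to lreplace,
# which can modify it in place instead of copying it first.
set list [lreplace [K $list [set list {}]] 1 1 X]
;# $list is now: a X c d e
```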

// Kevin

Steve Offutt

Sep 7, 2000, 12:27:56 PM
to
In article <39B71C3E...@ajubasolutions.com>,
Dan Kuchler <kuc...@ajubasolutions.com> wrote:

Dan,

You did not misunderstand. What I was looking for was indeed
a single procedure to get 1000 unique random numbers.

Thanks to you and everyone else for their input.

Another place in this thread, also asked about shuffling.

thanks to everyone there also.

> Thanks for writing up the code and testing it though..
>
> --Dan
>

ditto

lvi...@cas.org

Sep 7, 2000, 1:30:48 PM
to

According to Steve Offutt <ind...@oz.sunflower.org>:
:I wrote a general one last night; however, by the time I was
:getting around to posting it, well...
:
: proc random_int { upper lower } {
:     set range [expr {$upper - $lower}]
:     return [expr {int(rand() * $range) + $lower}]
: }

Perhaps this should be submitted to Ajuba Solutions for the tcllib , so
that we don't run into this each time?

--
<URL: https://secure.paypal.com/refer/pal=lvirden%40yahoo.com>
<URL: mailto:lvi...@cas.org> <URL: http://www.purl.org/NET/lvirden/>
Even if explicitly stated to the contrary, nothing in this posting
should be construed as representing my employer's opinions.

Laurent Duperval

Sep 7, 2000, 1:41:25 PM
to
On 7 Sep, Tadeusz Liszka wrote:
>
> I dunno what units are (8.0 does not have clock -milliseconds so I divided raw
> numbers by 1E4), and I skipped the longest case, but the skew of FPU is much in
> the favor of Donal's code.
>

Pretty soon his head won't be able to fit thru the door. :-)

Paul Duffin

Sep 7, 2000, 12:48:32 PM
to
Dan Kuchler wrote:
>
> David Cuthbert wrote:
> >
> > I have to admit, I'm a bit surprised by the results. I knew that Stephen's
> > proposal was going to take a hit due to the list copies, but I didn't
> > realise just how much this would affect the performance.
>
> One good thing came out of this. Eric Melski took a look at
> the code that implements lreplace (the underlying
> Tcl_ListObjReplace from generic/tclListObj.c (See snippet
> below)
>
> The first thing that grabbed his attention from the code
> was that there are for loops being used to copy a contiguous
> region of memory (or at least that is how I read the code
> below). I think if those are replaced with memcpy, there
> will be some nice performance speed up.
>

Don't whatever you do replace it with memcpy() as that has undefined
behaviour when the areas being copied overlap, as indeed they may do.
Use memmove() instead as that is defined to work *properly* if they
overlap.

Donal K. Fellows

Sep 8, 2000, 11:26:44 AM
to
In article <fDQt5.4023$Ld.8...@newscontent-01.sprint.ca>,

Laurent Duperval <laurent....@netergynet.com> wrote:
> On 7 Sep, Tadeusz Liszka wrote:
>> I dunno what units are (8.0 does not have clock -milliseconds so I
>> divided raw numbers by 1E4), and I skipped the longest case, but
>> the skew of FPU is much in the favor of Donal's code.

AMD processors have (at the very least historically, and maybe even
currently) very poor floating point performance relative to their
integer[*] performance, and my code does require FP comparison to be
fast. Of course, building FP hardware is non-trivial, so it's no wonder
that AMD don't fare very well there; Intel haven't been much to write
home about in this field either...

You could convert to integers and then use those instead, but the
question is then whether the cost of the conversion would outweigh the
savings due to no longer working with floats.

proc shuffle_f2i {list} {
    set x {};   # Force locality in compiler?
    set working [list]
    foreach item $list {
        # Better than multiplying by a constant, as keeps range large
        binary scan [binary format f [expr {rand()}]] i x
        lappend working [list $item $x]
    }
    set result [list]
    foreach item [lsort -integer -index 1 $working] {
        lappend result [lindex $item 0]
    }
    return $result
}

The above code is grotty, but it works. It will probably be faster on
*86 processors too. It does lack that certain elegance though...

> Pretty soon his head won't be able to fit thru the door. :-)

My door is extra wide, so I'm safe for a while yet...

Donal.
[* "Integer performance" is a slight misnomer here, as it really
refers to all non-floating-point performance. Most instructions in
all but the most mathematically-oriented of programs are integer
instructions. ]
--
Donal K. Fellows http://www.cs.man.ac.uk/~fellowsd/ fell...@cs.man.ac.uk
-- Name me one elf who wants to go to Blackpool after he dies.
-- Raymond E. Feist on feist...@cornell.edu

Donal K. Fellows

Sep 8, 2000, 11:47:21 AM
to
In article <8p5l6d$je1$1...@nnrp1.deja.com>,

Steve Offutt <ind...@oz.sunflower.org> wrote:
> In the interest of discussion, and maybe having a little fun with
> this, what has everybody got for a solution to shuffling a standard
> deck of playing cards?

Any of the saner list-shuffling algorithms posted here will do with
such a small number of elements. Picking them at random until you get
one that's not been chosen before is not very sane...
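By way of illustration, here is one reasonable way to do the deck (a sketch using the swap-based shuffle discussed elsewhere in this thread; the proc name and card labels are just for show):

```tcl
# Build a 52-card deck, then shuffle it with random swaps drawn
# from the not-yet-visited tail (Fisher-Yates style).
proc shuffled_deck {} {
    set deck {}
    foreach suit {Spades Hearts Diamonds Clubs} {
        foreach rank {A 2 3 4 5 6 7 8 9 10 J Q K} {
            lappend deck "$rank of $suit"
        }
    }
    set n [llength $deck]
    for {set i 0} {$i < $n - 1} {incr i} {
        set j [expr {$i + int(rand() * ($n - $i))}]
        set tmp [lindex $deck $j]
        set deck [lreplace $deck $j $j [lindex $deck $i]]
        set deck [lreplace $deck $i $i $tmp]
    }
    return $deck
}
```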

> (Whatever the best solution for 52 items is ought to apply to 1000,
> or 10000, or whatever, right?)

No. The choice of solution for any particular size of input depends
on both the big-Oh notation performance rating *and* the constant
factors involved. Big-Oh notation (O(n), O(nlogn) and friends) tells
you how an algorithm's performance behaves in the worst case as the
size of input grows large. However, many algorithms have additional
gotchas (like a huge setup time, or a very large constant multiplier
for one of the factors involved) and can be utterly useless in
practice. A case in point is the fact that the very fastest sorting
algorithm is the Bin-Sort. Bin sorting is O(n) unlike the other
algorithms (mostly O(nlogn) or O(n*n)) but it is only efficient where
you have many more elements than unique values and where the actual
range of the values is small (you sort by counting the number of each
element, and then treating the array a bit like a run-length encoding
of the output!) Nobody in their right mind uses bin-sort. Instead,
they either use quick-sort (very good normal behaviour, awful
worst-case, and unstable too), merge-sort (not quite as fast in the
normal case, but the worst case is hardly worse than the normal case,
and the sort is stable), or insertion-sort (which has bad performance,
but is great if you need to maintain a sorted list and are only ever
inserting one element at a time, as the cost of each insertion is
low.) Bubble-sort should only ever be seen in algorithms classes, and
then only as an example of what *not* to do! :^)

One of the reasons why programmers should take advanced classes (like
a degree programme) is so that they can learn this sort of thing.
"How to hack in $LANGUAGE_DU_JOUR" is not really worthy of academic
study, but "What is performance and how do you get it" is.

Donal (Bah! Run out of time...)

Donal K. Fellows

Sep 7, 2000, 5:04:20 PM
to
In article <remdrs87hbfavr0ns...@4ax.com>, Stephen D.
Cohen <sco...@tampabay.rr.com> writes

><do...@ugglan.demon.co.uk> wrote:
>>Grrr. People who simply *don't* understand random number sequences. If
>>you have a random sequence, the probability of a number occurring twice
>>in a row must be exactly the same as the probability of that number
>>being followed by any other nominated number. (In fact, it gets better
>>than that. For a *truly* random sequence, the shortest description of
>>the sequence should be the sequence itself...)
>
> I hope that the Grrr.. was for the original poster, and not my
>rather obvious explanation of the pigeonhole principle.

Would you accept a "Grrr." for anyone that misunderstands probability
when trying to apply it? :^)

>Obviously,
>it will generally take a lot more than 1000 rand calls to cover the
>numbers from 1 to 1000. On the average, it will be more like infinity
>(well, it approaches infinity).

It's infinitely improbable that it would take infinitely long. Just
look out for spaceships called "The Heart of Gold"...

And watch out for coding errors; an off-by-one error could turn an
infinitely improbable event into a certainty... :^)

>I proposed exactly this solution for Kevin to try about two
>hours ago. We'll see if he gives it a whirl and adds it to his web
>page. It wasn't really too difficult to get the indexing straight and
>I used your K operator to fool Tcl into not doing the ugly thing.

Ah! Good old asynchronous USENET! How do we praise thy name?

Donal (in a very good mood tonight, despite family woes...)

Bob Techentin

Sep 6, 2000, 9:54:10 AM
to
"Stephen D. Cohen" wrote:
>
> If he is generating random numbers between 1 and 1000, he
> obviously would have to get the same number again in less than 1001
> rand calls. Or am I missing something here?
...

> A far better solution is to remove the number selected from
> the "scoreboard" list and generate a new random number in the range of
> 1..<number of elements in scoreboard list>. That way it only takes
> 999 calls to rand to randomly order 1000 unique elements.

I was reading about an alternative algorithm in "Programming Pearls" by Jon
Louis Bentley a little while back. Great book, by the way. Has nothing
to do with Perl. :-)

To get a unique list of numbers (say, 1-1000) in a random order, you can
first create an ordered list, then randomize the list by walking through
once and performing a random 'swap'. Something like this:

for {set i 0} {$i<1000} {incr i} {lappend nums $i}
for {set i 0} {$i<1000} {incr i} {
    set j [expr {int(rand()*1000)}]
    set temp [lindex $nums $j]
    set nums [lreplace $nums $j $j [lindex $nums $i]]
    set nums [lreplace $nums $i $i $temp]
}

This might be a little slower than randomly selecting from a shrinking
"scoreboard" list because you've got to use [lreplace] twice. I don't
have a good feel for how fast it would be compared to methods that use
array accesses. But it doesn't take much code to get the job done, and
that is a good thing.
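One small caveat, offered tentatively: because j ranges over all 1000 slots on every pass, this is the "naive" shuffle, which is very slightly biased (it produces 1000^1000 equally likely outcomes, which cannot map evenly onto 1000! permutations). The standard Fisher-Yates correction draws j only from the not-yet-visited tail:

```tcl
for {set i 0} {$i < 1000} {incr i} {lappend nums $i}
for {set i 0} {$i < 999} {incr i} {
    # Pick j uniformly from i..999 so every permutation is equally likely.
    set j [expr {$i + int(rand() * (1000 - $i))}]
    set temp [lindex $nums $j]
    set nums [lreplace $nums $j $j [lindex $nums $i]]
    set nums [lreplace $nums $i $i $temp]
}
```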

Bob
