Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Generate random number in SwiftForth & GForth

1,293 views
Skip to first unread message

Mark Wills

unread,
Dec 13, 2011, 4:36:44 PM12/13/11
to
Is there any way to do this? I tried RND RAND & RANDOM but no luck.

Mark

A. K.

unread,
Dec 13, 2011, 4:49:36 PM12/13/11
to
On 13.12.2011 22:36, Mark Wills wrote:
> Is there any way to do this? I tried RND RAND& RANDOM but no luck.
>
> Mark

As ususal in Forth "roll your own" eg

( Fast Random Number Generator
algorithm by George Marsaglia "Xorshift RNGs" )

2463534242 VARIABLE (RND) \ seed

: RND ( -- x )
(rnd) @ dup 13 lshift xor
dup 17 rshift xor
dup 5 lshift xor (rnd) ! ;

Andreas

Mark Wills

unread,
Dec 13, 2011, 5:14:04 PM12/13/11
to
Thanks Andreas

I had to re-jig it slightly to get it to work. VARIABLE doesn't take an initial value (unlike a constant or a value) and an extra DUP was needed:

VARIABLE (RND)
2463534242 (rnd) ! \ seed

: rnd ( -- n)
(rnd) @ dup 13 lshift xor
dup 17 rshift xor
dup DUP 5 lshift xor (rnd) !
;

Happily working in both SwiftForth & GForth now. Thank you.

Mark

Roelf Toxopeus

unread,
Dec 13, 2011, 5:50:05 PM12/13/11
to
In article
<1891233.395.1323812204382.JavaMail.geo-discussion-forums@yqey19>,
Mark Wills <forth...@gmail.com> wrote:

> Is there any way to do this? I tried RND RAND & RANDOM but no luck.
>
> Mark

last time I looked:

SwiftForth/lib/options/rnd.f

gforth-0.6.2/random.fs

are they gone?

Elizabeth D. Rather

unread,
Dec 13, 2011, 6:58:09 PM12/13/11
to
On 12/13/11 12:50 PM, Roelf Toxopeus wrote:
> In article
> <1891233.395.1323812204382.JavaMail.geo-discussion-forums@yqey19>,
> Mark Wills<forth...@gmail.com> wrote:
>
>> Is there any way to do this? I tried RND RAND& RANDOM but no luck.
>>
>> Mark
>
> last time I looked:
>
> SwiftForth/lib/options/rnd.f
>
> gforth-0.6.2/random.fs
>
> are they gone?

No, still there :-) There's quite a lot that comes with SwiftForth that
isn't routinely loaded. It's worth exploring. The Windows version of
SwiftForth has menu support for rummaging around the many options,
libraries and examples, but the Linux/OSX versions are more command-line
oriented, so you have to look in the directories.

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================

Hans Bezemer

unread,
Dec 14, 2011, 1:32:49 AM12/14/11
to
Mark Wills wrote:
This is a heavier duty randomizer. It shouldn't be very hard to port:

TIME: returns an integer, POSIX time
MAX-N: returns the largest positive signed integer
SHIFT: LSHIFT, RSHIFT when negative

S" MAX-N" ENVIRONMENT? \ query environment
[IF] \ if successful
CONSTANT MAX-N \ create constant MAX-N
[ELSE]
.( Warning: MAX-N undefined) CR
[THEN]

[DEFINED] TIME&DATE [IF]
: >JD
>R 3 - DUP 0< IF 12 + R> 1- >R THEN
306 * 5 + 10 / + R@ 1461 4 */ + 1721116 +
DUP 2299169 > IF 3 + R@ 100 / - R@ 400 / + THEN R> DROP
;

: >TIME >JD 2440588 - 86400 * >R 3600 * SWAP 60 * + + R> + ;
: TIME TIME&DATE >TIME ;
[THEN]

\ -- KISS generator --
\ http://objectmix.com/fortran/385655-random_number-intrinsic-gfortran.html
\ Algorithm 2007 by George Marsaglia
\ 4tH version 2009, Bill Cook

[UNDEFINED] kiss [IF]
variable x
variable y
variable z
variable w

max-n constant MAX-KISS

: m ( k n ** m ) over swap shift xor ;
: znew z @ dup 65535 and 18000 * swap -16 shift + dup z ! ;
: wnew w @ dup 65535 and 30903 * swap -16 shift + dup w ! ;

: seed4! w ! z ! y ! x ! ;
: seed4@ x @ y @ z @ w @ ;

: cong x @ 69069 * 1327217885 + dup x ! ;
: shr3 y @ 13 m -17 m 5 m dup y ! ;
: mwc znew 16 shift wnew + ;
: kiss mwc cong + shr3 + ;

: randomize time x ! cong y ! shr3 z ! cong w ! ;

[UNDEFINED] random [IF]
aka kiss random
MAX-KISS constant MAX-RAND
[THEN]

[DEFINED] 4th# [IF]
hide m
hide znew
hide wnew
[THEN]
[THEN]

Hans Bezemer

Brad

unread,
Dec 14, 2011, 10:43:35 AM12/14/11
to
On Dec 13, 2:36 pm, Mark Wills <forthfr...@gmail.com> wrote:
> Is there any way to do this? I tried RND RAND & RANDOM but no luck.
>
Charles G Montgomery posted a really short RNG in this newsgroup. I
found it here:

http://groups.google.com/group/comp.lang.forth/browse_frm/thread/e09554ea9dc6334/41e2b9935decb95b?lnk=gst&q=cgm+random+number#41e2b9935decb95b

-Brad

rickman

unread,
Dec 14, 2011, 12:05:40 PM12/14/11
to
Are the properties of this generator important? If so you might check
this implementation. I don't recognize the design off the top of my
head. Usually when pseudo random numbers are generated a maximal
length Linear Feedback Shift Register (LFSR) is used. The above code
may well be a standard LFSR, but I've not seen this before. If it is
not designed properly, the pattern will repeat well short of the
maximal length. Does that matter to you?

Rick

Andrew Haley

unread,
Dec 14, 2011, 12:43:40 PM12/14/11
to
rickman <gnu...@gmail.com> wrote:
> On Dec 13, 4:49?pm, "A. K." <a...@nospam.org> wrote:
>> On 13.12.2011 22:36, Mark Wills wrote:
>>
>> > Is there any way to do this? I tried RND RAND& RANDOM but no luck.
>>
>> > Mark
>>
>> As ususal in Forth "roll your own" eg
>>
>> ( Fast Random Number Generator
>> algorithm by George Marsaglia "Xorshift RNGs" )
>>
>> 2463534242 VARIABLE (RND) \ seed
>>
>> : RND ( -- x )
>> (rnd) @ dup 13 lshift xor
>> dup 17 rshift xor
>> dup 5 lshift xor (rnd) ! ;
>
> Are the properties of this generator important? If so you might
> check this implementation. I don't recognize the design off the top
> of my head. Usually when pseudo random numbers are generated a
> maximal length Linear Feedback Shift Register (LFSR) is used. The
> above code may well be a standard LFSR, but I've not seen this
> before. If it is not designed properly, the pattern will repeat
> well short of the maximal length.

http://en.wikipedia.org/wiki/George_Marsaglia

It is designed properly, I assure you.

Enjoy: www.jstatsoft.org/v08/i14/paper

Andrew.

Bernd Paysan

unread,
Dec 14, 2011, 5:05:21 PM12/14/11
to
rickman wrote:

> On Dec 13, 4:49 pm, "A. K." <a...@nospam.org> wrote:
>> ( Fast Random Number Generator
>> algorithm by George Marsaglia "Xorshift RNGs" )
>>
>> 2463534242 VARIABLE (RND) \ seed
>>
>> : RND ( -- x )
>> (rnd) @ dup 13 lshift xor
>> dup 17 rshift xor
>> dup 5 lshift xor (rnd) ! ;
>>
>> Andreas
[...]
> The above code
> may well be a standard LFSR, but I've not seen this before. If it is
> not designed properly, the pattern will repeat well short of the
> maximal length. Does that matter to you?

This is an xorshift random number generator, and this is significantly
better than an LFSR. It does have the maximum length, and it does not
have the funny property of LFSRs that many "random" numbers are just
twice the previous random number.

The source code above misses a ! between "(RND)" and "\ seed"

That's my version, it uses different shift counts, but there are many.
This is a expicit 32 bit RNG, which is made 32 bit even on 64 bit
systems:

Variable seed $3f98c5ac seed !

: xorshift ( n -- n' )
dup 1 lshift xor $FFFFFFFF and
dup 3 rshift xor
dup 10 lshift xor $FFFFFFFF and ;
: rnd seed @ xorshift dup seed ! ;

What I don't like with xorshift: 0 is not a legal starting point, just
like LFSR, the algorithm is stuck at 0. I didn't do any investigation
into this problem, but maybe inverting the value at the end of the logic
could solve that problem.

My cryptographic hard Wurstkessel RNG (proven to be "good" with the
dieharder suite, the actual cryptographic peer review is still pending -
but when I look at all the problems AES has, I don't really trust that
sort of peer review, either ;-) with all optimizations on is 6 times
slower than this xorshift RNG, using gforth-fast (maybe it could be only
4 or 5 times slower if I generate more than 64 bytes at a time). But
for all serious random number stuff, using such a more complex, but
proven RNG is a good idea - simple algorithms fail pretty soon in this
test suite, and the artefacts they display there also may show up in
your program.

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://bernd-paysan.de/

Andrew Haley

unread,
Dec 15, 2011, 5:35:33 AM12/15/11
to
Bernd Paysan <bernd....@gmx.de> wrote:
> rickman wrote:
>
>> On Dec 13, 4:49 pm, "A. K." <a...@nospam.org> wrote:
>>> ( Fast Random Number Generator
>>> algorithm by George Marsaglia "Xorshift RNGs" )
>>>
>>> 2463534242 VARIABLE (RND) \ seed
>>>
>>> : RND ( -- x )
>>> (rnd) @ dup 13 lshift xor
>>> dup 17 rshift xor
>>> dup 5 lshift xor (rnd) ! ;
>>>
>>> Andreas
> [...]
>> The above code
>> may well be a standard LFSR, but I've not seen this before. If it is
>> not designed properly, the pattern will repeat well short of the
>> maximal length. Does that matter to you?
>
> This is an xorshift random number generator, and this is significantly
> better than an LFSR. It does have the maximum length, and it does not
> have the funny property of LFSRs that many "random" numbers are just
> twice the previous random number.

https://groups.google.com/group/comp.lang.forth/msg/cc7dec31fa374841?hl=en

Brent proved that for every XOR-shift generator there is an
equivalent LFSR. However, the XOR-shift form is more efficient in
software than in its LFSR form.

Andrew.

Richard P. Brent, Note on Marsaglia's Xorshift Random Number
Generators, JSS 2004

Bernd Paysan

unread,
Dec 15, 2011, 4:08:12 PM12/15/11
to
Andrew Haley wrote:
> Brent proved that for every XOR-shift generator there is an
> equivalent LFSR. However, the XOR-shift form is more efficient in
> software than in its LFSR form.

I'm not convinced. A single LFSR goes from state n to 2n when the MSB
of n is not set (or, when the shift direction is the other way round,
from 2n to n, when the LSB of n is not set). Xorshift RNGs don't, they
have no such trivial relationship between two results.

LFSRs are easier to implement than xorshift generators, but have this
additional awkward property.

Brent gives an equvalent to the (1,3,10) xorshift RNG, which is what I
have used. This time for 32 bit systems only:

: xorshift ( n -- n' )
dup 1 lshift xor
dup 3 rshift xor
dup 10 lshift xor ;

This should be the equivalent LFSR, if I believe in Brent's paper:

$382d1e61 constant lfsr-eq \ polynom from the Brent paper
: lfsr dup 0< IF 2* lfsr-eq xor ELSE 2* THEN ;

If the clamed equivalence is true, xorshift and lfsr should perform the
same function. I use a Forth approach to math: let's try. For
simplicity I use 1 as starting point:

hex 1 ok
xorshift dup u. C03 ok
xorshift dup u. 5A0285 ok
xorshift dup u. CFEE3F7E ok
xorshift dup u. 8A12C1B2 ok
xorshift dup u. 4B5B9A8C ok
xorshift dup u. 82B8A266 ok
xorshift dup u. 5459267F ok
xorshift dup u. 3B6943D1 ok
xorshift dup u. 76FF48FD ok
drop 1 ok
lfsr dup u. 2 ok
lfsr dup u. 4 ok
lfsr dup u. 8 ok
lfsr dup u. 10 ok
lfsr dup u. 20 ok
lfsr dup u. 40 ok
lfsr dup u. 80 ok
lfsr dup u. 100 ok

I think we can stop here, and we have not even used the polynom Brent
proposed. Any LFSR will output this sequence with the starting point 1.

These are not equivalent functions. Whatever equivalence Brent has
proven, it's not functional equivalence. I'm pretty sure I know what an
LFSR is, so I don't think there is a misunderstanding. Did Brent test
his calculations? I mean, in the way we understand testing a Forth
program?

LFSRs have somewhat better properties when you use only the MSB.

: lfsr-seq ( seed rnd -- seed' rnd' )
2dup d+ dup 1 and IF swap lfsr-eq xor swap THEN ;
: gen-seq ( seed -- seed' rnd ) 0 $20 0 do lfsr-seq loop ;

If we start with 1, we will get 1 as first result.

1 gen-seq u. 1 ok
gen-seq u. 3DA97C19 ok
gen-seq u. 53BD8241 ok
gen-seq u. 879B88CF ok
gen-seq u. 37AE719E ok
gen-seq u. 9E525CC9 ok
gen-seq u. 36E8CFAB ok
gen-seq u. 1AFFC028 ok
gen-seq u. E86DFC12 ok

These results look better than the xorshift RNG, but they are way more
costly, and the cycle length is also smaller (only 2^(32-5) values).

Arnold Doray

unread,
Dec 17, 2011, 12:12:06 AM12/17/11
to
You might have put the second DUP in the wrong place. Your results may
still appear "random" but have undesirable/suboptimal properties.

As Andreas' post says, this is a Xorshift RNG, which might be more
clearly factored as:

\ Xorshift (13,17,5)
: xorshift ( n -- n )
dup 13 lshift xor
dup 17 rshift xor
dup 5 lshift xor ;

variable (rnd) \ seed
2463534242 (rnd) ! \ initialize seed

: rnd ( -- n )
(rnd) @ xorshift dup (rnd) ! ;

The second DUP should be *after* the xorshift operation, in order to
update the seed.

Note that zero is obviously a fixed point for any XORSHIFT operation. Ie,

0 xorshift . 0 ok

so zero should not be used as a seed.

Also, you might want to investigate the conditions under which a 0 could
occur after a XORSHIFT, because if you ever hit a zero in the sequence,
the RNG will be stuck at 0 after that. Not something you want to happen
midway as your app runs!

If you need zero in your pseudorandom sequence, you could translate the
sequence by subtraction:

: rnd-with-zero ( -- n )
(rnd) @ xorshift dup (rnd) ! 1- ;

Cheers,
Arnold

Andrew Haley

unread,
Dec 17, 2011, 4:23:47 AM12/17/11
to
Arnold Doray <thinks...@gmail.com> wrote:
> On Tue, 13 Dec 2011 14:14:04 -0800, Mark Wills wrote:
>
> Also, you might want to investigate the conditions under which a 0 could
> occur after a XORSHIFT, because if you ever hit a zero in the sequence,
> the RNG will be stuck at 0 after that. Not something you want to happen
> midway as your app runs!

Zero doesn't appear in the sequence; see the paper.

Andrew.

Arnold Doray

unread,
Dec 17, 2011, 10:27:10 AM12/17/11
to
Yes.

Another, perhaps simpler way of seeing this is to notice that:

x y XOR = 0 iff x = y.

This should be obvious from the definition of the XOR operator.

The XORSHIFT operator is composed of a sequence of three X Y XORs, each
taking an X then left/right shifting to get Y then XOR'ing. Eg,

X (n l/r shift X) XOR == X Y XOR

For the final operation to be zero, X must be the same as Y. Obviously,
the only number that is the same is its right/left shifted value is zero.

This reasoning can be applied to the other remaining 2 XORs, so you end
up with:

X XORSHIFT = 0 iff X = 0.

(BTW, iff means "if and only if").

Cheers,
Arnold



Arnold Doray

unread,
Dec 18, 2011, 7:41:19 AM12/18/11
to
On Thu, 15 Dec 2011 22:08:12 +0100, Bernd Paysan wrote:

> This should be the equivalent LFSR, if I believe in Brent's paper:
>
> $382d1e61 constant lfsr-eq \ polynom from the Brent paper
> : lfsr dup 0< IF 2* lfsr-eq xor ELSE 2* THEN ;
>

The heart of Brent's paper ( http://maths.anu.edu.au/~brent/pd/
rpb218.pdf ) is quite simple, if not trivial -- see Section 5.

He starts with the definition of the minimal polynomial for T, the
XORSHIFT operator, and with trivial manipulations, obtains a recurrence
relation that he asserts defines a LFSR. It's airtight. The equivalence
should be exact.

Unfortunately, his paper only defines the LFSR for a bitstream, while
this recurrence relation is for a sequence of bitvectors.

What's less clear (at least to me) is how to get from this recurrence
relation of his to your LFSR function. Would you care to elaborate?

Thanks,
Arnold

Bernd Paysan

unread,
Dec 18, 2011, 8:58:50 AM12/18/11
to
Hm, I found the clue:

"In hardware implementations of LFSR sequences, the x_j are usually
single bits (elements of F2), but in software implementations it is easy
and more efficient to operate on whole words." (Page 3, first sentence)

So what I did is an LFSR as implemented in hardware, with single bits
x_j. Where Brent assumes is that "it is easy and more efficient to
operate on whole words." No, that's wrong. It is easy and efficient in
software to operate on bits (since the appropriate parallelized bit
vector operations are available), see above definition of a software-
defined bit-wise LFSR.

So he proved the equivalence of xorshift with something that
mathematically resembles an LFSR, but actually is not a "normal" LFSR,
because the x_j are bit vectors, not bits.

As I said, I'm not convinced. LFSRs are sequences of bits, not of bit
vectors. Redefining a concept but not changing the name is at least
misleading. The term "LFSR" is extremely well defined, and its
definition is very narrow.

Arnold Doray

unread,
Dec 18, 2011, 7:13:08 PM12/18/11
to
Brent's results imply that some "word LFSR"s (at the very least those
corresponding to xorshift operators) do not have the nasty 'doubling'
subsequences that "bitstream LFSR"s have for positive integers.

In fact, Brent's result is actually very general -- *any* stateless RNG
that can be written as the action of a matrix on a bitvector is
equivalent to some "word LFSR". So, if you have a rule to identify "good"
word LFSRs, you could then know if your specific stateless RNG is also
"good".

Cheers,
Arnold


rickman

unread,
Dec 19, 2011, 9:20:20 PM12/19/11
to
On Dec 18, 7:13 pm, Arnold Doray <thinksqua...@gmail.com> wrote:
> On Sun, 18 Dec 2011 14:58:50 +0100, Bernd Paysan wrote:
> > Arnold Doray wrote:
>
> >> On Thu, 15 Dec 2011 22:08:12 +0100, Bernd Paysan wrote:
>
> >>> This should be the equivalent LFSR, if I believe in Brent's paper:
>
> >>> $382d1e61 constant lfsr-eq \ polynom from the Brent paper : lfsr  dup
> >>> 0< IF  2* lfsr-eq xor  ELSE  2*   THEN  ;
>
> >> The heart of Brent's paper (http://maths.anu.edu.au/~brent/pd/
I don't feel like digging into all the terminology to understand this
at the level that you all seem to. But LFSR's, at least the useful
ones, are "maximal length". If your xor-shift generator is also
maximal length, won't they generate the same pattern? I haven't given
this a lot of thought so I may be missing something obvious.

Rick

Arnold Doray

unread,
Dec 20, 2011, 11:31:43 AM12/20/11
to
On Mon, 19 Dec 2011 18:20:20 -0800, rickman wrote:

> don't feel like digging into all the terminology to understand this at
> the level that you all seem to. But LFSR's, at least the useful ones,
> are "maximal length". If your xor-shift generator is also maximal
> length, won't they generate the same pattern? I haven't given this a
> lot of thought so I may be missing something obvious.

The Xor-Shift RNG is Marsaglia's invention, not mine. He was a guru on
RNGs.

To answer your question, just because 2 RNGs are 'maximal length' doesn't
mean they necessarily produce the same sequence/pattern.

An "RNG" is just a number generator.

The "random" part is a largely subjective set of criteria that you need
to test for depending on your usage of that sequence. Of course, the
usual meaning is that each term in the sequence should be i.i.d, but this
is not an algorithmic prescription. That's why you need tests like
"Diehard" (also Marsaglia's invention). It is by no means easy to ensure
i.i.d, unless your source is truly random like thermal noise. But even
this has correlations, yes? I don't believe any natural source of noise
is truly "white" (ie, the autocorrelation is a delta function).

So what you're saying is that two number generators produce the same
sequence just because they are both 'maximal length'.

"Maximal length" means that the sequence repeats after N items, where N
is the size of the space. Eg, for an K bit vector, N = 2^K.

Put this way, you can see your assertion is obviously false. A simple
counterexample: in the space {0,1,2}, (whose N = 3) the sequences:

0 1 2 0 1 2 ... and 0 2 1 0 2 1 ... are both maximal length, but
obviously not the same. The stateless number generators for these are:

: 3mod 1+ 3 mod ; \ for the first.
: 3mod+ 1+ 3mod ; \ for the second.

BTW, the exact equivalence proved by Brent is between *word* (ie,
bitvector) LFSRs and xorshift RNGs. These LFSRs are not the same as
bitstream LFSRs.

You have to prove equivalence on a case-by-case basis.

Cheers,
Arnold

Bernd Paysan

unread,
Dec 20, 2011, 5:45:45 PM12/20/11
to
Arnold Doray wrote:
> To answer your question, just because 2 RNGs are 'maximal length'
> doesn't mean they necessarily produce the same sequence/pattern.

In fact, "maximal length" here is about the maximum achievable length of
an xorshift or LFSR RNG pattern length - it's actually 2^n-1 for a bit
width of n (0 is not a member of the output sequence). For an LFSR, you
need a primitive polynom to be maximal length, and for an xorshift RNG,
only a small subset of the three shift numbers give you maximal length,
so you have to choose carefully.

And about the sequence: For a sequence length of n, there are n!
possible sequences. So there really is an abundance of possibilities to
generate a 2^n-1 length sequence of all numbers between 1 and 2^n-1.

As random number generators, these sequence generators have a number of
shortcomings. The first is that their output really is a long sequence,
which means that the same number will not occur again within the
sequence length. Therefore they will fairly quickly fail on a birthday
paradoxon test. The birthday paradoxon says that the likelyhood of two
times the same 32 bit number in a stream of just 64k numbers is 50%.
Run this test often enough, and you should be fairly close to this 50%
figure (i.e. if you run 100 times this test, you should have 50 "twins"
as result). An xorshift or LFSR RNG will result in 0%.

This is a quick-and-dirty test for a 32 bit RNG, called by the generic
word rng32 (you can alias your RNG to that):

Create buckets $40000 cells allot
Variable twins

: bucket! ( n addr -- )
2dup @ = IF 1 twins +! THEN ! ;

: push-bucket ( n -- )
dup $e rshift cells buckets + bucket! ;

: test-rng ( n -- )
buckets $40000 cells erase
0 ?DO rng32 push-bucket LOOP ;

: tests ( n -- ) twins off
0 ?DO $10000 test-rng LOOP twins ? ;

This test has some false negatives, as I don't chain my buckets.
Running this 1000 times with my Wurstkessel rng resulted in a "too good
to be true" 500 twins result (this is with the static test seed for
reproduction).

The diehard and dieharder test suites have a lot more such tests. What
I'm doing as well (this is not part of these test suites) is an FFT and
looking at the spectrum. The frequency distribution should be almost
flat, almost the same everywhere with a gaussian distribution, and the
standard deviation of this distribution should be close to the
theoretically expected one.

rickman

unread,
Dec 20, 2011, 6:34:34 PM12/20/11
to
Yes, once I gave it a little thought I realized how simply this can be
analyzed. An N bit FSM has at most 2^N states and a maximal length
LFSR will typically use at most 2^N-1. That is always the same. But
there are many permutations on how the states will transition. So no,
all sequences of maximal length LFSR of length N are not equivalent.

Rick

Arnold Doray

unread,
Dec 21, 2011, 6:19:50 AM12/21/11
to
On Tue, 20 Dec 2011 23:45:45 +0100, Bernd Paysan wrote:

>
> And about the sequence: For a sequence length of n, there are n!
> possible sequences. So there really is an abundance of possibilities to
> generate a 2^n-1 length sequence of all numbers between 1 and 2^n-1.
>
> As random number generators, these sequence generators have a number of
> shortcomings. The first is that their output really is a long sequence,
> which means that the same number will not occur again within the
> sequence length. Therefore they will fairly quickly fail on a birthday
> paradoxon test. The birthday paradoxon says that the likelyhood of two
> times the same 32 bit number in a stream of just 64k numbers is 50%. Run
> this test often enough, and you should be fairly close to this 50%
> figure (i.e. if you run 100 times this test, you should have 50 "twins"
> as result). An xorshift or LFSR RNG will result in 0%.
>

Yes, that's a good point. Quite a departure from i.i.d behaviour!

You might be able to partly alleviate this problem by wraping the
sequence to a shorter length, eg by using a RNG N MOD. The results might
be better for N << 2^K .


> This is a quick-and-dirty test for a 32 bit RNG, called by the generic
> word rng32 (you can alias your RNG to that):
>
> Create buckets $40000 cells allot Variable twins
>
> : bucket! ( n addr -- )
> 2dup @ = IF 1 twins +! THEN ! ;
>
> : push-bucket ( n -- )
> dup $e rshift cells buckets + bucket! ;
>
> : test-rng ( n -- )
> buckets $40000 cells erase 0 ?DO rng32 push-bucket LOOP ;
>
> : tests ( n -- ) twins off
> 0 ?DO $10000 test-rng LOOP twins ? ;
>
> This test has some false negatives, as I don't chain my buckets. Running
> this 1000 times with my Wurstkessel rng resulted in a "too good to be
> true" 500 twins result (this is with the static test seed for
> reproduction).

Probably because Wurstkessel is not stateless like Xorshift? :)

>
> The diehard and dieharder test suites have a lot more such tests. What
> I'm doing as well (this is not part of these test suites) is an FFT and
> looking at the spectrum. The frequency distribution should be almost
> flat,

An interesting test since it may pick up the doubling subsequences in
your bitstream LFSRs.

You might use the spectrum's entropy as a measure of flatness.

Cheers,
Arnold

Paul Rubin

unread,
Dec 21, 2011, 6:43:09 AM12/21/11
to
Arnold Doray <inv...@invalid.com> writes:
>> paradoxon test. The birthday paradoxon says that the likelyhood of two
>> times the same 32 bit number in a stream of just 64k numbers is 50%....
>> An xorshift or LFSR RNG will result in 0%...
>
> You might be able to partly alleviate this problem by wraping the
> sequence to a shorter length, eg by using a RNG N MOD. The results might
> be better for N << 2^K .

In cryptography the standard approach is just make the period big enough
that the discrepancy won't be noticed (e.g. 2**128 in the case of AES).
A result called the PRP-PRF Switching Lemma shows that other than this
O(2**64) birthday test or something else equally slow, there's no way to
distinguish a (pseudo) random permutation from a PRF (i.e. RNG).

Andrew Haley

unread,
Dec 21, 2011, 10:10:13 AM12/21/11
to
Arnold Doray <inv...@invalid.com> wrote:
> On Tue, 20 Dec 2011 23:45:45 +0100, Bernd Paysan wrote:
>
>> And about the sequence: For a sequence length of n, there are n!
>> possible sequences. So there really is an abundance of possibilities to
>> generate a 2^n-1 length sequence of all numbers between 1 and 2^n-1.
>>
>> As random number generators, these sequence generators have a number of
>> shortcomings. The first is that their output really is a long sequence,
>> which means that the same number will not occur again within the
>> sequence length. Therefore they will fairly quickly fail on a birthday
>> paradoxon test. The birthday paradoxon says that the likelyhood of two
>> times the same 32 bit number in a stream of just 64k numbers is 50%. Run
>> this test often enough, and you should be fairly close to this 50%
>> figure (i.e. if you run 100 times this test, you should have 50 "twins"
>> as result). An xorshift or LFSR RNG will result in 0%.
>
> Yes, that's a good point. Quite a departure from i.i.d behaviour!

Sure, but this is all perfectly normal. The rule of thumb for any
random permutation generator is not to use it for more than the square
root of its period. And, of course, you usually won't be using all of
its state each time. The period can be extended by combining (with a
simple XOR) generators whose period is relatively prime or using a
nonlinear combiner of some kind.

The real issue is "bang for the buck", and from that POV XOR-shift
generators will probably never be beaten.

For many uses, though, it doesn't matter. For example, I've used a
XOR-shift generator to control the delay in an exponential backoff
algorithm.

Andrew.

Arnold Doray

unread,
Dec 21, 2011, 10:30:21 AM12/21/11
to
On Wed, 21 Dec 2011 09:10:13 -0600, Andrew Haley wrote:

> Arnold Doray <inv...@invalid.com> wrote:
>> On Tue, 20 Dec 2011 23:45:45 +0100, Bernd Paysan wrote:
>>
>>> And about the sequence: For a sequence length of n, there are n!
>>> possible sequences. So there really is an abundance of possibilities
>>> to generate a 2^n-1 length sequence of all numbers between 1 and
>>> 2^n-1.
>>>
>>> As random number generators, these sequence generators have a number
>>> of shortcomings. The first is that their output really is a long
>>> sequence,
>>> which means that the same number will not occur again within the
>>> sequence length. Therefore they will fairly quickly fail on a
>>> birthday paradoxon test. The birthday paradoxon says that the
>>> likelyhood of two times the same 32 bit number in a stream of just 64k
>>> numbers is 50%. Run this test often enough, and you should be fairly
>>> close to this 50% figure (i.e. if you run 100 times this test, you
>>> should have 50 "twins" as result). An xorshift or LFSR RNG will
>>> result in 0%.
>>
>> Yes, that's a good point. Quite a departure from i.i.d behaviour!
>
> Sure, but this is all perfectly normal. The rule of thumb for any
> random permutation generator is not to use it for more than the square
> root of its period.

Why exactly?


> And, of course, you usually won't be using all of
> its state each time. The period can be extended by combining (with a
> simple XOR) generators whose period is relatively prime or using a
> nonlinear combiner of some kind.
>

What is the point of doing this? Because both are stateless, you'd still
end up with a period less/equal to than 2^K. Might as well go with a
maximal RNG to start with, then wrap to your needs.


Cheers,
Arnold

Bernd Paysan

unread,
Dec 21, 2011, 10:31:10 AM12/21/11
to
Arnold Doray wrote:
>> This test has some false negatives, as I don't chain my buckets.
>> Running this 1000 times with my Wurstkessel rng resulted in a "too
>> good to be true" 500 twins result (this is with the static test seed
>> for reproduction).
>
> Probably because Wurstkessel is not stateless like Xorshift? :)

Of course not. A cryptographic hard random number generator must be
hard to predict, and in order to achieve that, the hidden part of the
internal state must be at least as long as the visible part - that's
against a brute force attack.

Andrew Haley

unread,
Dec 21, 2011, 11:45:50 AM12/21/11
to
Arnold Doray <inv...@invalid.com> wrote:
> On Wed, 21 Dec 2011 09:10:13 -0600, Andrew Haley wrote:
>
>> Arnold Doray <inv...@invalid.com> wrote:
>>> On Tue, 20 Dec 2011 23:45:45 +0100, Bernd Paysan wrote:
>>>
>>>> And about the sequence: For a sequence length of n, there are n!
>>>> possible sequences. So there really is an abundance of possibilities
>>>> to generate a 2^n-1 length sequence of all numbers between 1 and
>>>> 2^n-1.
>>>>
>>>> As random number generators, these sequence generators have a number
>>>> of shortcomings. The first is that their output really is a long
>>>> sequence,
>>>> which means that the same number will not occur again within the
>>>> sequence length. Therefore they will fairly quickly fail on a
>>>> birthday paradoxon test. The birthday paradoxon says that the
>>>> likelyhood of two times the same 32 bit number in a stream of just 64k
>>>> numbers is 50%. Run this test often enough, and you should be fairly
>>>> close to this 50% figure (i.e. if you run 100 times this test, you
>>>> should have 50 "twins" as result). An xorshift or LFSR RNG will
>>>> result in 0%.
>>>
>>> Yes, that's a good point. Quite a departure from i.i.d behaviour!
>>
>> Sure, but this is all perfectly normal. The rule of thumb for any
>> random permutation generator is not to use it for more than the square
>> root of its period.
>
> Why exactly?

For reasons like the birthday paradox above: you're getting into the
region where nonrandomness will be observable. This is particularly
true of linear congruential generators, but AFAIK it's generally
applied to other kinds of generators. There may not be any very
strong reason for this, but it is safe.

>> And, of course, you usually won't be using all of its state each
>> time. The period can be extended by combining (with a simple XOR)
>> generators whose period is relatively prime or using a nonlinear
>> combiner of some kind.
>
> What is the point of doing this? Because both are stateless, you'd still
> end up with a period less/equal to than 2^K.

I'm sorry, what is K? And why do you say the generators are
stateless? Surely this is quite impossible: they all have some state.
I don't see what you're getting at.

Andrew.

Arnold Doray

unread,
Dec 23, 2011, 4:56:02 AM12/23/11
to
Any stateless RNG would exhibit the birthday paradox. This is because
they never repeat within their period. So the probability of birthday
pairs occuring within the period is zero. This won't change even if you
take a shorter subsequence.

By "stateless" I mean RNGs like xorshift or bitstream LFSR or linear
congruential that are just mappings/functions. Calling it again on the
same input always results in the same answer. I don't know what the
kosher term is.

Bernd's "wurstkessel" passes birthday test because it stores state. So,
calling it twice on the same input may well result in different answers.
This is obviously a necessary property for cryptographically hard RNGs.
Otherwise, its behaviour would be too easy to predict.

However, a different interpretation of what you're saying occurs to me.

Usually, you only want pseudo random numbers within a given limit
[0..N-1]. To get this from the (say) xorshift, you'd simply apply a MOD:

: RNG ( N -- RND )
seed @ XORSHIFT \ get the next RNG
DUP seed ! \ update the seed
SWAP MOD ; \ wrap to [0..N-1]

So perhaps you need to take N < sqrt(2^K), where K is the bitsize of the
integer used (K = 32 for a 32 bit int). This may pass the birthday test
for if N is limited this way. I'll try to work it out and post the
results.


>>> And, of course, you usually won't be using all of its state each time.
>>> The period can be extended by combining (with a simple XOR)
>>> generators whose period is relatively prime or using a nonlinear
>>> combiner of some kind.
>>
>> What is the point of doing this? Because both are stateless, you'd
>> still end up with a period less/equal to than 2^K.
>
> I'm sorry, what is K? And why do you say the generators are stateless?
> Surely this is quite impossible: they all have some state.
> I don't see what you're getting at.
>

I see now you mean two stateless RNGs with *each with its own separate
seed*.

The combination is obviously stateful, and the period may exceed 2^K,
with an upper limit of the product of the periods, if they are co-prime.
That's a neat trick, thanks.


Cheers,
Arnold






Andrew Haley

unread,
Dec 23, 2011, 5:23:08 AM12/23/11
to
Arnold Doray <inv...@invalid.com> wrote:
> On Wed, 21 Dec 2011 10:45:50 -0600, Andrew Haley wrote:
>
>> Arnold Doray <inv...@invalid.com> wrote:
>>> On Wed, 21 Dec 2011 09:10:13 -0600, Andrew Haley wrote:
>>>
>>>> Sure, but this is all perfectly normal. The rule of thumb for any
>>>> random permutation generator is not to use it for more than the square
>>>> root of its period.
>>>
>>> Why exactly?
>>
>> For reasons like the birthday paradox above: you're getting into the
>> region where nonrandomness will be observable. This is particularly
>> true of linear congruential generators, but AFAIK it's generally applied
>> to other kinds of generators. There may not be any very strong reason
>> for this, but it is safe.
>
> Any stateless RNG would exhibit the birthday paradox. This is
> because they never repeat within their period. So the probability of
> birthday pairs occuring within the period is zero. This won't change
> even if you take a shorter subsequence.

Sure, but you have no way to notice that if you take a shorter
subsequence.

> By "stateless" I mean RNGs like xorshift or bitstream LFSR or linear
> congruential that are just mappings/functions. Calling it again on
> the same input always results in the same answer. I don't know what
> the kosher term is.

But a LCG or an xorshift generator doesn't have any input apart from
an initial seed; it outputs a sequence. Terminologically, it's a
random permutation generator. Its output is a permutation of the
integers from 1 to N.

Andrew.
0 new messages