1217 views

Skip to first unread message

Jan 20, 1999, 3:00:00 AM1/20/99

to

My offer of RNG's for C was an invitation to dance;

I did not expect the Tarantella. I hope this post will

stop the music, or at least slow it to a stately dance

for language chauvinists and software police---under

a different heading.

I did not expect the Tarantella. I hope this post will

stop the music, or at least slow it to a stately dance

for language chauvinists and software police---under

a different heading.

In response to a number of requests for good RNG's in

C, and mindful of the desirability of having a variety

of methods readily available, I offered several. They

were implemented as in-line functions using the #define

feature of C.

Numerous responses have led to improvements; the result

is the listing below, with comments describing the

generators.

I thank all the experts who contributed suggestions, either

directly to me or as part of the numerous threads.

It seems necessary to use a (circular) table in order

to get extremely long periods for some RNG's. Each new

number is some combination of the previous r numbers, kept

in the circular table. The circular table has to keep

at least the last r, but possible more than r, numbers.

For speed, an 8-bit index seems best for accessing

members of the table---at least for Fortran, where an

8-bit integer is readily available via integer*1, and

arithmetic on the index is automatically mod 256

(least-absolute-residue).

Having little experience with C, I got out my little

(but BIG) Kernighan and Ritchie book to see if there

were an 8-bit integer type. I found none, but I did

find char and unsigned char: one byte. Furthemore, K&R

said arithmetic on characters was ok. That, and a study

of the #define examples, led me to propose #define's

for in-line generators LFIB4 and SWB, with monster

periods. But it turned out that char arithmetic jumps

"out of character", other than for simple cases such as

c++ or c+=1. So, for safety, the index arithmetic

below is kept in character by the UC definition.

Another improvement on the original version takes

advantage of the comma operator, which, to my chagrin,

I had not seen in K&R. It is there, but only with an

example of (expression,expression). From the advice of

contributors, I found that the comma operator allows

(expression,...,expression,expression) with the

last expression determining the value. That makes it

much easier to create in-line functions via #define

(see SHR3, LFIB4, SWB and FIB below).

The improved #define's are listed below, with a

function to initialize the table and a main program

that calls each of the in-line functions one million

times and then compares the result to what I got with

a DOS version of gcc. That main program can serve

as a test to see if your system produces the same

results as mine.

_________________________________________

|If you run the program below, your output|

| should be seven lines, each a 0 (zero).|

-----------------------------------------

Some readers of the threads are not much interested

in the philosophical aspects of computer languages,

but want to know: what is the use of this stuff?

Here are simple examples of the use of the in-line

functions: Include the #define's in your program, with

the accompanying static variable declarations, and a

procedure, such as the example, for initializing

the static variable (seeds) and the table.

Then any one of those in-line functions, inserted

in a C expression, will provide a random 32-bit

integer, or a random float if UNI or VNI is used.

For example, KISS&255; would provide a random byte,

while 5.+2.*UNI; would provide a random real (float)

from 5 to 7. Or 1+MWC%10; would provide the

proverbial "take a number from 1 to 10",

(but with not quite, but virtually, equal

probabilities).

More generally, something such as 1+KISS%n; would

provide a practical uniform random choice from 1 to n,

if n is not too big.

A key point is: a wide variety of very fast, high-

quality, easy-to-use RNG's are available by means of

the nine in-line functions below, used individually or

in combination.

The comments after the main test program describe the

generators. These descriptions are much as in the first

post, for those who missed them. Some of the

generators (KISS, MWC, LFIB4) seem to pass all tests of

randomness, particularly the DIEHARD battery of tests,

and combining virtually any two or more of them should

provide fast, reliable, long period generators. (CONG

or FIB alone and CONG+FIB are suspect, but quite useful

in combinations.)

Serious users of random numbers may want to

run their simulations with several different

generators, to see if they get consistent results.

These #define's may make it easy to do.

Bonne chance,

George Marsaglia

The C code follows---------------------------------:

#include <stdio.h>

#define znew (z=36969*(z&65535)+(z>>16))

#define wnew (w=18000*(w&65535)+(w>>16))

#define MWC ((znew<<16)+wnew )

#define SHR3 (jsr^=(jsr<<17), jsr^=(jsr>>13), jsr^=(jsr<<5))

#define CONG (jcong=69069*jcong+1234567)

#define FIB ((b=a+b),(a=b-a))

#define KISS ((MWC^CONG)+SHR3)

#define LFIB4 (c++,t[c]=t[c]+t[UC(c+58)]+t[UC(c+119)]+t[UC(c+178)])

#define SWB (c++,bro=(x<y),t[c]=(x=t[UC(c+34)])-(y=t[UC(c+19)]+bro))

#define UNI (KISS*2.328306e-10)

#define VNI ((long) KISS)*4.656613e-10

#define UC (unsigned char) /*a cast operation*/

typedef unsigned long UL;

/* Global static variables: */

static UL z=362436069, w=521288629, jsr=123456789, jcong=380116160;

static UL a=224466889, b=7584631, t[256];

/* Use random seeds to reset z,w,jsr,jcong,a,b, and the table t[256]*/

static UL x=0,y=0,bro; static unsigned char c=0;

/* Example procedure to set the table, using KISS: */

void settable(UL i1,UL i2,UL i3,UL i4,UL i5, UL i6)

{ int i; z=i1;w=i2,jsr=i3; jcong=i4; a=i5; b=i6;

for(i=0;i<256;i=i+1) t[i]=KISS;

}

/* This is a test main program. It should compile and print 7 0's. */

int main(void){

int i; UL k;

settable(12345,65435,34221,12345,9983651,95746118);

for(i=1;i<1000001;i++){k=LFIB4;} printf("%u\n", k-1064612766U);

for(i=1;i<1000001;i++){k=SWB ;} printf("%u\n", k- 627749721U);

for(i=1;i<1000001;i++){k=KISS ;} printf("%u\n", k-1372460312U);

for(i=1;i<1000001;i++){k=CONG ;} printf("%u\n", k-1529210297U);

for(i=1;i<1000001;i++){k=SHR3 ;} printf("%u\n", k-2642725982U);

for(i=1;i<1000001;i++){k=MWC ;} printf("%u\n", k- 904977562U);

for(i=1;i<1000001;i++){k=FIB ;} printf("%u\n", k-3519793928U);

}

/*-----------------------------------------------------

Write your own calling program and try one or more of

the above, singly or in combination, when you run a

simulation. You may want to change the simple 1-letter

names, to avoid conflict with your own choices. */

/* All that follows is comment, mostly from the initial

post. You may want to remove it */

/* Any one of KISS, MWC, FIB, LFIB4, SWB, SHR3, or CONG

can be used in an expression to provide a random 32-bit

integer.

The KISS generator, (Keep It Simple Stupid), is

designed to combine the two multiply-with-carry

generators in MWC with the 3-shift register SHR3 and

the congruential generator CONG, using addition and

exclusive-or. Period about 2^123.

It is one of my favorite generators.

The MWC generator concatenates two 16-bit multiply-

with-carry generators, x(n)=36969x(n-1)+carry,

y(n)=18000y(n-1)+carry mod 2^16, has period about

2^60 and seems to pass all tests of randomness. A

favorite stand-alone generator---faster than KISS,

which contains it.

FIB is the classical Fibonacci sequence

x(n)=x(n-1)+x(n-2),but taken modulo 2^32.

Its period is 3*2^31 if one of its two seeds is odd

and not 1 mod 8. It has little worth as a RNG by

itself, but provides a simple and fast component for

use in combination generators.

SHR3 is a 3-shift-register generator with period

2^32-1. It uses y(n)=y(n-1)(I+L^17)(I+R^13)(I+L^5),

with the y's viewed as binary vectors, L the 32x32

binary matrix that shifts a vector left 1, and R its

transpose. SHR3 seems to pass all except those

related to the binary rank test, since 32 successive

values, as binary vectors, must be linearly

independent, while 32 successive truly random 32-bit

integers, viewed as binary vectors, will be linearly

independent only about 29% of the time.

CONG is a congruential generator with the widely used 69069

multiplier: x(n)=69069x(n-1)+1234567. It has period

2^32. The leading half of its 32 bits seem to pass

tests, but bits in the last half are too regular.

LFIB4 is an extension of what I have previously

defined as a lagged Fibonacci generator:

x(n)=x(n-r) op x(n-s), with the x's in a finite

set over which there is a binary operation op, such

as +,- on integers mod 2^32, * on odd such integers,

exclusive-or(xor) on binary vectors. Except for

those using multiplication, lagged Fibonacci

generators fail various tests of randomness, unless

the lags are very long. (See SWB below).

To see if more than two lags would serve to overcome

the problems of 2-lag generators using +,- or xor, I

have developed the 4-lag generator LFIB4 using

addition: x(n)=x(n-256)+x(n-179)+x(n-119)+x(n-55)

mod 2^32. Its period is 2^31*(2^256-1), about 2^287,

and it seems to pass all tests---in particular,

those of the kind for which 2-lag generators using

+,-,xor seem to fail. For even more confidence in

its suitability, LFIB4 can be combined with KISS,

with a resulting period of about 2^410: just use

(KISS+LFIB4) in any C expression.

SWB is a subtract-with-borrow generator that I

developed to give a simple method for producing

extremely long periods:

x(n)=x(n-222)-x(n-237)- borrow mod 2^32.

The 'borrow' is 0, or set to 1 if computing x(n-1)

caused overflow in 32-bit integer arithmetic. This

generator has a very long period, 2^7098(2^480-1),

about 2^7578. It seems to pass all tests of

randomness, except for the Birthday Spacings test,

which it fails badly, as do all lagged Fibonacci

generators using +,- or xor. I would suggest

combining SWB with KISS, MWC, SHR3, or CONG.

KISS+SWB has period >2^7700 and is highly

recommended.

Subtract-with-borrow has the same local behaviour

as lagged Fibonacci using +,-,xor---the borrow

merely provides a much longer period.

SWB fails the birthday spacings test, as do all

lagged Fibonacci and other generators that merely

combine two previous values by means of =,- or xor.

Those failures are for a particular case: m=512

birthdays in a year of n=2^24 days. There are

choices of m and n for which lags >1000 will also

fail the test. A reasonable precaution is to always

combine a 2-lag Fibonacci or SWB generator with

another kind of generator, unless the generator uses

*, for which a very satisfactory sequence of odd

32-bit integers results.

The classical Fibonacci sequence mod 2^32 from FIB

fails several tests. It is not suitable for use by

itself, but is quite suitable for combining with

other generators.

The last half of the bits of CONG are too regular,

and it fails tests for which those bits play a

significant role. CONG+FIB will also have too much

regularity in trailing bits, as each does. But keep

in mind that it is a rare application for which

the trailing bits play a significant role. CONG

is one of the most widely used generators of the

last 30 years, as it was the system generator for

VAX and was incorporated in several popular

software packages, all seemingly without complaint.

Finally, because many simulations call for uniform

random variables in 0<x<1 or -1<x<1, I use #define

statements that permit inclusion of such variates

directly in expressions: using UNI will provide a

uniform random real (float) in (0,1), while VNI will

provide one in (-1,1).

All of these: MWC, SHR3, CONG, KISS, LFIB4, SWB, FIB

UNI and VNI, permit direct insertion of the desired

random quantity into an expression, avoiding the

time and space costs of a function call. I call

these in-line-define functions. To use them, static

variables z,w,jsr,jcong,a and b should be assigned

seed values other than their initial values. If

LFIB4 or SWB are used, the static table t[256] must

be initialized.

A note on timing: It is difficult to provide exact

time costs for inclusion of one of these in-line-

define functions in an expression. Times may differ

widely for different compilers, as the C operations

may be deeply nested and tricky. I suggest these

rough comparisons, based on averaging ten runs of a

routine that is essentially a long loop:

for(i=1;i<10000000;i++) L=KISS; then with KISS

replaced with SHR3, CONG,... or KISS+SWB, etc. The

times on my home PC, a Pentium 300MHz, in nanoseconds:

FIB 49;LFIB4 77;SWB 80;CONG 80;SHR3 84;MWC 93;KISS 157;

VNI 417;UNI 450;

*/

Jan 21, 1999, 3:00:00 AM1/21/99

to

George Marsaglia (g...@stat.fsu.edu) wrote:

: My offer of RNG's for C was an invitation to dance;

: I did not expect the Tarantella.

: My offer of RNG's for C was an invitation to dance;

: I did not expect the Tarantella.

I saw your original post; I wasn't aware that there was quite a thread

after it. If I wanted to nitpick about language issues, I'd note that as a

FORTRAN programmer, I am liable to use variables with names like "x" and

"y", and thus I would probably modify your preprocessor functions

accordingly.

I am honored to see a post by one of the co-inventors of the famed

MacLaren-Marsaglia random number generator. Although, due partly to a

Cryptologia article, that generator has generally been dismissed as a

component of a cryptosecure stream cipher, I've tended to feel:

1) the article was flawed, since one would never, in practice, use more

than a few of the most significant bits of generated random numbers for

stream cipher purposes, and

2) more elaborate variations on the principle, using the basic

MacLaren-Marsaglia generator as a building block, are possible.

On the other hand, it may be just as well that there is a dichotomy in the

methods used for stream ciphers and those used for random number

generation in the numerical solution of scientific problems, as this has

no doubt contributed to the fact that random number generators are not

affected by export control problems.

John Savard

Jan 22, 1999, 3:00:00 AM1/22/99

to

On 21 Jan 99 05:09:33 GMT, in <36a6b...@ecn.ab.ca>, in sci.crypt

jsa...@ecn.ab.ca () wrote:

>[...]

>I am honored to see a post by one of the co-inventors of the famed

>MacLaren-Marsaglia random number generator. Although, due partly to a

>Cryptologia article, that generator has generally been dismissed as a

>component of a cryptosecure stream cipher, I've tended to feel:

>

>1) the article was flawed, since one would never, in practice, use more

>than a few of the most significant bits of generated random numbers for

>stream cipher purposes, and

There were two formal articles, the first of which was a complete,

fully-exposed attack on a real encryption system using

MacLaren-Marsaglia:

1. Retter, C. 1984. Cryptanalysis of a MacLaren-Marsaglia System.

Cryptologia. 8(2): 97-108.

The next article addressed the question of the extent to which M-M

combining provides strength to the combined generators:

2. Retter, C. 1985. A Key-Search Attack on MacLaren-Marsaglia

Systems. Cryptologia. 9(2): 114-130.

There were also comments in letters:

3. Letters to the Editor. 1984. Cryptologia. 8(4): 374-378.

>2) more elaborate variations on the principle, using the basic

>MacLaren-Marsaglia generator as a building block, are possible.

Anything is possible. But, by itself, MacLaren-Marsaglia is simply

not a mechanism with significant cryptographic strength.

---

Terry Ritter rit...@io.com http://www.io.com/~ritter/

Crypto Glossary http://www.io.com/~ritter/GLOSSARY.HTM

Reply all

Reply to author

Forward

0 new messages

Search

Clear search

Close search

Google apps

Main menu