Need help evaluating output from a PRNG

0 views
Skip to first unread message

maf

unread,
Jan 8, 1998, 3:00:00 AM1/8/98
to

In the continuing quest for a (near) perfect pseudo
random number generator that we all seek, I have created
an algorithm that seems to meet quite a few of the
requirements mentioned in the various postings in this
newsgroup as to the properties of what a good set of
random numbers would possess. Using this algorithm, I
generated a 100,000 byte file of random bits for testing
purposes and evaluation.

Below is report created by a program I wrote to do a
frequency count of the random numbers generated by my
algorithm. The file is examined byte by byte, and the
number of 0 and 1 bits that occur in each of the bit
positions in a byte are counted (the bits are numbered
01234567). Also, the bit pairs that occur in each byte
are tabulated. The report summarizes the findings for
the entire file of random numbers.

What I need is someone knowledgeable in such things to
examine these numbers and indicate if this is what a
good random number generation would produce. If not,
what should these numbers look like?

BIT VALUE COUNT (PERCENT) VALUE COUNT (PERCENT)
--- ----- ---------- --------- -----
---------- ---------
0 0 50,170 ( 50.17) 1
49,830 ( 49.83)
1 0 50,043 ( 50.04) 1
49,957 ( 49.96)
2 0 50,101 ( 50.10) 1
49,899 ( 49.90)
3 0 49,776 ( 49.78) 1
50,224 ( 50.22)
4 0 50,248 ( 50.25) 1
49,752 ( 49.75)
5 0 50,330 ( 50.33) 1
49,670 ( 49.67)
6 0 50,132 ( 50.13) 1
49,868 ( 49.87)
7 0 49,992 ( 49.99) 1
50,008 ( 50.01)
---------- ---------
---------- ---------
TOTAL 400,792 ( 50.10) 399,208 (
49.90)

Freq of bits in pairs
---------------------
00 101,895 ( 25.47)
01 98,981 ( 24.75)
10 99,087 ( 24.77)
11 100,037 ( 25.01)


Thanks in advance,

maf


maf

unread,
Jan 8, 1998, 3:00:00 AM1/8/98
to

In the continuing quest for a (near) perfect pseudo
random number generator that we all seek, I have created
an algorithm that seems to meet quite a few of the
requirements mentioned in the various postings in this
newsgroup as to the properties of what a good set of
random numbers would possess. Using this algorithm, I
generated a 100,000 byte file of random bits for testing
purposes and evaluation.

Below is report created by a program I wrote to do a
frequency count of the random numbers generated by my
algorithm. The file is examined byte by byte, and the
number of 0 and 1 bits that occur in each of the bit
positions in a byte are counted (the bits are numbered
01234567). Also, the bit pairs that occur in each byte
are tabulated. The report summarizes the findings for
the entire file of random numbers.

What I need is someone knowledgeable in such things to
examine these numbers and indicate if this is what a
good random number generation would produce. If not,
what should these numbers look like?

BIT VALUE COUNT (PERCENT) VALUE COUNT (PERCENT)
--- ----- ------ --------- ----- ------ --------

0 0 50,170 ( 50.17) 1 49,830 ( 49.83)
1 0 50,043 ( 50.04) 1 49,957 ( 49.96)
2 0 50,101 ( 50.10) 1 49,899 ( 49.90)
3 0 49,776 ( 49.78) 1 50,224 ( 50.22)
4 0 50,248 ( 50.25) 1 49,752 ( 49.75)
5 0 50,330 ( 50.33) 1 49,670 ( 49.67)
6 0 50,132 ( 50.13) 1 49,868 ( 49.87)
7 0 49,992 ( 49.99) 1 50,008 ( 50.01)

------- --------- ------ ---------

Colin Dooley

unread,
Jan 9, 1998, 3:00:00 AM1/9/98
to

maf wrote:
>
> What I need is someone knowledgeable in such things to
> examine these numbers and indicate if this is what a
> good random number generation would produce. If not,
> what should these numbers look like?
>

You can get the results you show from the sequence:

0x00, 0x01, 0x02, 0x03, 0x04, ..... , 0xfd, 0xfe, 0xff


repeated over and over again...


--
<\___/>
/ O O \
\_____/ FTB.

Peter Pearson

unread,
Jan 9, 1998, 3:00:00 AM1/9/98
to

In article <34B5B660...@crl.com>, maf <m...@crl.com> wrote:
...

>
>Below is report created by a program I wrote to do a
>frequency count of the random numbers generated by my
>algorithm. ...

>
>What I need is someone knowledgeable in such things to
>examine these numbers and indicate if this is what a
>good random number generation would produce. If not,
>what should these numbers look like?
>

The standard way of assessing such data is to compute the probability
that a random source would have given a distribution whose lopsidedness
(measured by the Chi-squared statistic) would be less than that observed.
These statistics are tabulated below.


BIT VALUE COUNT (PERCENT) VALUE COUNT (PERCENT) prob(chi2)
--- ----- ------ --------- ----- ------ -------- ----------
0 0 50,170 ( 50.17) 1 49,830 ( 49.83) 0.717703
1 0 50,043 ( 50.04) 1 49,957 ( 49.96) 0.214344
2 0 50,101 ( 50.10) 1 49,899 ( 49.90) 0.477034
3 0 49,776 ( 49.78) 1 50,224 ( 50.22) 0.843429
4 0 50,248 ( 50.25) 1 49,752 ( 49.75) 0.883233
5 0 50,330 ( 50.33) 1 49,670 ( 49.67) 0.963121
6 0 50,132 ( 50.13) 1 49,868 ( 49.87) 0.596193
7 0 49,992 ( 49.99) 1 50,008 ( 50.01) 0.040353


------- --------- ------ ---------
TOTAL 400,792 ( 50.10) 399,208 ( 49.90)

Freq of bits in pairs
---------------------
00 101,895 ( 25.47)
01 98,981 ( 24.75)
10 99,087 ( 24.77)

11 100,037 ( 25.01) Chi2 = 54.64, 3 d.f., prob = 1.000000.


The single-bit values for bits 5 and 7 are "significantly different"
from what would be expected from a random source (the customary
threshold for "statistical significance" being 0.05), with bit 5
being too lopsided and bit 7 being too uniform.

The bit-pair statistics are distinctly too lopsided.

- Peter

Pierre Abbat

unread,
Jan 9, 1998, 3:00:00 AM1/9/98
to

maf <m...@crl.com> wrote in article <34B5B1D0...@crl.com>...

> In the continuing quest for a (near) perfect pseudo
> random number generator that we all seek, I have created
> an algorithm that seems to meet quite a few of the
> requirements mentioned in the various postings in this
> newsgroup as to the properties of what a good set of
> random numbers would possess. Using this algorithm, I
> generated a 100,000 byte file of random bits for testing
> purposes and evaluation.

I strongly suggest that you make a ten-megabyte file and run Diehard on it.
That will do those tests and much more. Even passing Diehard, though, does
not guarantee that the RNG is cryptographically secure.

phma

Gary Ardell

unread,
Jan 9, 1998, 3:00:00 AM1/9/98
to

On Thu, 08 Jan 1998 21:32:16 -0800, maf <m...@crl.com> wrote:

>In the continuing quest for a (near) perfect pseudo
>random number generator that we all seek, I have created
>an algorithm that seems to meet quite a few of the
>requirements mentioned in the various postings in this
>newsgroup as to the properties of what a good set of
>random numbers would possess. Using this algorithm, I
>generated a 100,000 byte file of random bits for testing
>purposes and evaluation.
>

>Below is report created by a program I wrote to do a
>frequency count of the random numbers generated by my

>algorithm. The file is examined byte by byte, and the
>number of 0 and 1 bits that occur in each of the bit
>positions in a byte are counted (the bits are numbered
>01234567). Also, the bit pairs that occur in each byte
>are tabulated. The report summarizes the findings for
>the entire file of random numbers.
>

>What I need is someone knowledgeable in such things to
>examine these numbers and indicate if this is what a
>good random number generation would produce. If not,
>what should these numbers look like?
>
>
>

>BIT VALUE COUNT (PERCENT) VALUE COUNT (PERCENT)

>--- ----- ------ --------- ----- ------ --------

> 0 0 50,170 ( 50.17) 1 49,830 ( 49.83)

> 1 0 50,043 ( 50.04) 1 49,957 ( 49.96)

> 2 0 50,101 ( 50.10) 1 49,899 ( 49.90)

> 3 0 49,776 ( 49.78) 1 50,224 ( 50.22)

> 4 0 50,248 ( 50.25) 1 49,752 ( 49.75)

> 5 0 50,330 ( 50.33) 1 49,670 ( 49.67)

> 6 0 50,132 ( 50.13) 1 49,868 ( 49.87)

> 7 0 49,992 ( 49.99) 1 50,008 ( 50.01)

> ------- --------- ------ ---------
> TOTAL 400,792 ( 50.10) 399,208 ( 49.90)
>
>Freq of bits in pairs
>---------------------
> 00 101,895 ( 25.47)
> 01 98,981 ( 24.75)
> 10 99,087 ( 24.77)
> 11 100,037 ( 25.01)
>
>

> Thanks in advance,
>
>
>
Suggest you get Diehard at:

http://stat.fsu.edu/~geo/diehard.html

Gary


Colin G. St. John

unread,
Jan 10, 1998, 3:00:00 AM1/10/98
to

In article <01bd1d21$d2774500$d250...@phma.trellis.net>,
ph...@pop.trellis.net says...

>
>maf <m...@crl.com> wrote in article <34B5B1D0...@crl.com>...
>> In the continuing quest for a (near) perfect pseudo
>> random number generator that we all seek, I have created
>> an algorithm that seems to meet quite a few of the
>> requirements mentioned in the various postings in this
>> newsgroup as to the properties of what a good set of
>> random numbers would possess. Using this algorithm, I
>> generated a 100,000 byte file of random bits for testing
>> purposes and evaluation.
>
>I strongly suggest that you make a ten-megabyte file and run Diehard on it.
>That will do those tests and much more. Even passing Diehard, though, does
>not guarantee that the RNG is cryptographically secure.
>
>phma


To the suggestion of DIEHARD I'd like to add a program called ENT.
It analyzes a file of test data and gives a result for entropy
( bits per char ), compressability, a chi-squared test, a mean,
a Monte-Carlo calcuation for PI, and a serial correlation test.
I don't have a URL for it, but I think I found it at Simtel. A
search of "random" and "test" ought to turn it up ( at least I
think that's how I found it.

Colin


Terry Ritter

unread,
Jan 10, 1998, 3:00:00 AM1/10/98
to

On Sat, 10 Jan 1998 03:37:18 GMT, in <88440347...@wagasa.cts.com>
in sci.crypt jdco...@cts.com (J. D. Cooley) wrote:

>[...]
>What I find interesting is that you read in this NG and in the documents that
>explain the different random number generator test programs that just because it
>passes does not mean it is acceptable for cryptographic use.

A cryptographic RNG must have very long cycles, a large internal
state, and be somehow isolated so the internal state is hard to
develop externally. Traditionally, statistical RNG's have not had a
large internal state, although that may be changing.


>Also, if it fails
>one or more tests, that doesn't mean it isn't random! So, why even run the
>tests? It looks like you can't trust them either way!

There can be no test on a sequence which unfailingly decides whether
the generator which produced the sequence was random. Since a random
generator can produce any possible sequence, any sequence we think
looks non-random could have been produced fairly at random.

We do expect to see the statistics of large numbers play themselves
out in long random sequences, and this allows us to identify and
reject the worst generators. But a generator which always tests OK is
not OK either: With a random source, we generally *expect* and
*demand* that a statistic exceed the 95% level (often called
"failure") 1 time in 20, and that one time may happen on the very
first test.

The usual role of statistics is to identify particular systematic
events in the context of expected random variations that may conceal
such events. This often occurs in a context of difficult and costly
experimentation, and there is a premium on results which are so good
that they stand above the noise; it may be that not much is lost if a
weak positive is ignored.

In contrast, cryptography and randomness generally support vastly
larger amounts of testing at low cost, and we seek weak indications.
In this context, I find it more useful to conduct many tests and
collect many statistic values, then visually and mathematically
compare the experimental distribution to the ideal for that statistic.

Of course, the mathematical comparison between two distributions
*also* has a statistical distribution, and *will* show very unusual
values occasionally.

---
Terry Ritter rit...@io.com http://www.io.com/~ritter/
The Crypto Glossary: http://www.io.com/~ritter/GLOSSARY.HTM

David A. Scott

unread,
Jan 10, 1998, 3:00:00 AM1/10/98
to

In a previous article, jdco...@cts.com (J. D. Cooley) says:

>If you can't find ENT.EXE, e-mail me. I will be glad to e-mail it to you either
>uuencoded or base64 encoded.
>
>I like ENT better than DIEHARD because with DIEHARD you need a file size of
>11Mb. I don't have that much room on my little notebook (only computer I have).
>
It uses a large file so that more accurate tests can be done

>What I find interesting is that you read in this NG and in the documents that
>explain the different random number generator test programs that just because it

>passes does not mean it is acceptable for cryptographic use. Also, if it fails


>one or more tests, that doesn't mean it isn't random! So, why even run the
>tests? It looks like you can't trust them either way!
>

> "JD"
>
But if something keeps failing a particular test over and over again
with different generated numbers it tells you that the sequence is not
random. Nothing can really tell if it is random.
But the DIEHARDTESTS are the best I have used to catch non randomness
so far. I assume they my add other tests in the future.

--


Bryan Olson

unread,
Jan 10, 1998, 3:00:00 AM1/10/98
to

maf wrote:
>
> In the continuing quest for a (near) perfect pseudo
> random number generator that we all seek, I have created
> an algorithm
[...]

> Below is report created by a program I wrote to do a
> frequency count of the random numbers generated by my
> algorithm.

First, we can't say much about the generator's
security based on these numbers. About all we can
say is whether these statistics are consistent with
a truly unbiased and uncorrelated source.

> BIT VALUE COUNT (PERCENT) VALUE COUNT (PERCENT)
> --- ----- ------ --------- ----- ------ --------
> 0 0 50,170 ( 50.17) 1 49,830 ( 49.83)
> 1 0 50,043 ( 50.04) 1 49,957 ( 49.96)
> 2 0 50,101 ( 50.10) 1 49,899 ( 49.90)
> 3 0 49,776 ( 49.78) 1 50,224 ( 50.22)
> 4 0 50,248 ( 50.25) 1 49,752 ( 49.75)
> 5 0 50,330 ( 50.33) 1 49,670 ( 49.67)
> 6 0 50,132 ( 50.13) 1 49,868 ( 49.87)
> 7 0 49,992 ( 49.99) 1 50,008 ( 50.01)

The the generator were perfect, the counts would
follow a binomial distribution, with p=0.5 and n=100000.
That gives us a standard deviation of
(100000*0.5*0.5)^0.5 =aprox 158
The count farthest from the expected is about 2.09
standard deviations out, which is farther than we'd
expect in eight trials, but not unreasonable. I think
the counts are consistent with a perfect source,
neither too far nor too close to the expected values.

> TOTAL 400,792 ( 50.10) 399,208 ( 49.90)

Standard deviation for a perfect source would be 447,
so this is 1.77 standard deviations from the expected
value. Not too bad.

>
> Freq of bits in pairs

I'm assuming these don't overlap, so if the source were
perfect the pairs would be independent. The standard
deviation would be
(400000*0.25*0.75)^0.5 =aprox 274

> ---------------------
> 00 101,895 ( 25.47)
> 01 98,981 ( 24.75)
> 10 99,087 ( 24.77)
> 11 100,037 ( 25.01)

The count for 00 differs from the expected value by 6.9
times the standard deviation. That's way too far. The
generator has revealed that it does not produce unbiased
uncorrelated bits.

In any introductory statistics text you can find
explanations of standard deviation and how to use it. You
should check my work while you're at it (I didn't do all
that well when I took introduction to statistics).

--Bryan

Reply all
Reply to author
Forward
0 new messages