On Sunday, November 24, 2013 3:17:27 PM UTC+1, Albert van der Horst wrote:
[..]
> This is an invitation to back up his generator with some theory.
> What he should do IMO is to run the tests from Knuth's TAO
> and publish the results on his website. Even if the outcome is
> soso, that would make the Wikipedia entry respectable.
> If the outcome is really bad, then of course it is better to
> remove the Wikipedia mention.
Marsaglia's Diehard test is more convenient.
The results are very bad. The birthday and gorilla tests
are even off the scale.
-marcel
-- -------------------------
FORTH> init-seeds diehard
DIEHARD is checking Ran-HA..
.---------------------------------------------------------------.
| This is the "tough" BIRTHDAY SPACINGS TEST |
| Choose 4096 birthdays in a "year" of 2^32 days. Thus each |
| birthday is a 32-bit integer and the test uses 2^12 of them, |
| so that j, the number of duplicate spacings, is asympotically |
| Poisson distributed with lambda=4. Generators that pass the |
| earlier tests for m=1024 and n=2^24 often fail this test, yet |
| those that pass this test seem to pass the "weaker" test. |
| Each set of 4096 birthdays provide a Poisson variate j, and |
| 500 such j's lead to a chisquare test to see if the result |
| is consistent with the Poisson distribution with lambda=16. |
`---------------------------------------------------------------'
Table of Expected versus Observed counts:
Duplicates 0 1 2 3 4 5 6 7 8 9 >=10
Expected 91 366 732 976 976 781 520 297 148 66 40
Observed 1102 1578 1274 661 277 79 23 3 2 1 0
(O-E)^2/E******4008.0 400.0 102.1 501.4 631.5 476.0 291.7 144.9 64.2 40.7
Birthday Spacings: Sum(O-E)^2/E = *******, p = 1.000
( 5.068 seconds elapsed. )
.-------------------------------------------------------------.
| This is the GCD TEST. Let the (32-bit) RNG produce two |
| successive integers u,v. Use Euclids algorithm to find the |
| gcd, say x, of u and v. Let k be the number of steps needed |
| to get x. Then k is approximately binomial with p=.376 |
| and n=50, while the distribution of x is very close to |
| Pr(x=i)=c/i^2, with c=6/pi^2. The gcd test uses ten |
| million such pairs u,v to see if the resulting frequencies |
| of k's and x's are consistent with the above distributions. |
| Congruential RNG's---even those with prime modulus---fail |
| this test for the distribution of k, the number of steps, |
| and often for the distribution of gcd values x as well. |
`-------------------------------------------------------------'
Euclid's algorithm:
p-value, steps to gcd: 0.640036
p-value, distance of gcd's: 0.892458
( 2.323 seconds elapsed. )
.---------------------------------------------------------------.
| This is the GORILLA test, a strong version of the monkey |
| tests that I developed in the 70's. It concerns strings |
| formed from specified bits in 32-bit integers from the RNG. |
| We specify the bit position to be studied, from 0 to 31, |
| say bit 3. Then we generate 67,108,889 (2^26+25) numbers |
| from the generator and form a string of 2^26+25 bits by |
| taking bit 3 from each of those numbers. In that string of |
| 2^26+25 bits we count the number of 26-bit segments that |
| do not appear. That count should be approximately normal |
| with mean 24687971 and std. deviation 4170. This leads to |
| a normal z-score and hence to a p-value. The test is |
| applied for each bit position 0 (leftmost) to 31. |
| (Some older tests use Fortran's 1-32 for most- to least- |
| significant bits. Gorilla and newer tests use C's 0 to 31.) |
`---------------------------------------------------------------'
Gorilla test for 2^26 bits, positions 0 to 31:
Note: lengthy test -- ~20 minutes for 900 MHz PC
Bits 0 to 7 || 0.000 0.000 0.001 0.000 0.000 0.000 0.000 0.000
Bits 8 to 15 || 0.000 0.006 0.000 0.000 0.000 0.000 0.001 0.000
Bits 16 to 23 || 0.001 0.000 0.000 0.003 0.000 0.003 0.000 0.000
Bits 24 to 31 || 0.003 0.000 0.291 0.049 0.102 0.611 0.001 0.000
ADKS test for the above 32 p values: 1.000
( 49.874 seconds elapsed. )
.---------------------------------------------------------------.
| THE OVERLAPPING 5-PERMUTATION TEST |
| This is the OPERM5 test. It looks at a sequence of ten |
| million 32-bit random integers. Each set of five consecutive |
| integers can be in one of 120 states, for the 5! possible |
| orderings of five numbers. Thus the 5th, 6th, 7th,...numbers |
| each provide a state. As many thousands of state transitions |
| are observed, cumulative counts are made of the number of |
| occurences of each state. Then the quadratic form in the |
| weak inverse of the 120x120 covariance matrix yields a test |
| that the 120 cellcounts came from the specified (asymptotic) |
| distribution with the specified means and 120x120 covariance. |
`---------------------------------------------------------------'
The OPERM5 test for 10 million (overlapping) 5-tuples for Ran-HA.
p-values for 5 runs: 0.5952 0.9339 0.8200 0.5496 0.1751
( 7.197 seconds elapsed. ) ok