[Announce] A rand package for high quality 64bit random numbers (possibly go2)

1482 views
Skip to first unread message

Florian Uekermann

unread,
Feb 24, 2014, 5:25:06 PM2/24/14
to golan...@googlegroups.com
As discussed here (https://code.google.com/p/go/issues/detail?id=4254) and here (https://groups.google.com/group/golang-nuts/browse_thread/thread/2a57bcef8953d44a/335e3adb37bb4333), the math/rand package has a few small shortcomings whose fixes require API changes and are therefore postponed for now. Since the need for 64bit random numbers has come up several times on this list and elsewhere, I want to offer a variant of the rand/math package that uses a uint64 source and offers a Uint64 function/method.

Another issue that has come up frequently on this list in the past  is the lack of a fast high quality random number source for non-cryptographic purposes in Go (such as Mersenne Twister). At the moment math/rand uses a linear congruential generator, which is not suitable for some common purposes (such as Monte Carlo simulations). Since I had to change the random number source anyway to provide 64 random bits instead of 63, I used the popular and well documented MT19937-64 as random number source to solve both issues at once (with similar performance).

The code can be found at https://bitbucket.org/MaVo159/rand/. You can use it with the import statement:
The code passes the unchanged tests in rand_test.go and apart from using a uint64 for seeding instead of int64 users of the math/rand.Rand type should be able to use this without changing code.

I intend to keep the sourcecode as close to the original package as possible which takes into account future changes to the math/rand package. Apart from the removal of the old random number source (rng.go) and the addition of Mersenne Twister as new source (mt.go) the changes are minimal in the hope that we can converge to a solution that is mergeable through a small commit once developement of go2 starts.

I am open to any criticism, suggestions, wishes and nothing is set in stone yet, although the library is supposed to be in a usable state right now.

Some possible points/issues for discussion:
  • I use uint64 as basic type for the random number source (seed and output). The reasoning is that this is appropriate for a "bunch of random bits" especially because of the "special" semantics of the sign bit in signed integers (for example in shifting operations).  One might instead stick with int64 and just start using the sign bit (This is not necessarily good because it might hide semantic changes). This would for example allow keeping the prototype of the Seed function unchanged.
  • What about methods to generate shorter than 32bit integers.
  • What about an io.Reader, that supplies a stream of random bytes.
  • Am I using the best way to seed a Mersenne Twister
  • What about other choices for sources
  • Additional tests (As mentioned, the existing sources, including tests, should remain unchanged if possible, but I don't mind adding new files to the package.)
  • Is there a more appropriate place for this package (or a more appropriate maintainer)
Best regards,
Florian

PS.: This package has been created because I (and other people at my place of employment) need these issues solved now in job related projects. You can expect this to be well maintained until these issues are solved in the go standard library or by a reliable and better third party library.

Nick Craig-Wood

unread,
Feb 24, 2014, 6:18:41 PM2/24/14
to Florian Uekermann, golan...@googlegroups.com
On 24/02/14 22:25, Florian Uekermann wrote:
> At the moment math/rand uses a linear congruential generator,

math/rand's generator definitely isn't a linear congruential generator.

It looks more like Knuth's random number generator algorithm from the
art of computer programming Vol 2, using a very large LSFR of 607 terms
with a tap at 273.

Here is my implementation of the same algorithm

https://github.com/ncw/stressdisk/blob/master/stressdisk.go#L159

I'd say that is a very good random number generator, though I've been
unable to confirm that those taps make a maximal size random number
generator.

The unusual part of the go random number generator seems to be the way
it is seeded.

I can't find any reference to the "algorithm by DP Mitchell and JA
Reeds" mentioned in the comments.

There is no real reason that the go random number generator couldn't
provide 64 bit results - it has been artificially constrained to
positive numbers for some reason.

--
Nick Craig-Wood <ni...@craig-wood.com> -- http://www.craig-wood.com/nick

Florian Uekermann

unread,
Feb 24, 2014, 6:59:51 PM2/24/14
to golan...@googlegroups.com, Florian Uekermann
On Tuesday, February 25, 2014 12:18:41 AM UTC+1, Nick Craig-Wood wrote:
It looks more like Knuth's random number generator algorithm from the
art of computer programming Vol 2, using a very large LSFR of 607 terms
with a tap at 273.
 ...
I can't find any reference to the "algorithm by DP Mitchell and JA 
Reeds" mentioned in the comments. 
 
My statement was based on nothing else but this discussion: https://groups.google.com/forum/#!topic/golang-nuts/hnG34EuJtU0. Reading it more carefully, I realize that while it mentions an LCG being part of the algorithm, that is not all there is to it. The puzzle of the unknown origin of this is also the only thing I could find about it when I looked for it. Thanks for the correction and the hint what is might be.

There is no real reason that the go random number generator couldn't
provide 64 bit results - it has been artificially constrained to
positive numbers for some reason.
 
I would like to stick to MT for the time being, since a reference implementation and the papers are available. If someone can dig up further references for the rng used right now or provide an alternative implementation of something similar, I am happy to add it as an additional source. 

Michael Jones

unread,
Feb 24, 2014, 7:31:43 PM2/24/14
to Florian Uekermann, golang-nuts

On Mon, Feb 24, 2014 at 3:59 PM, Florian Uekermann <florian....@googlemail.com> wrote:
I can't find any reference to the "algorithm by DP Mitchell and JA 
Reeds" mentioned in the comments. 

Both Don P. Mitchell and James A. Reeds were researchers at Bell Telephone Laboratories during the general time of Inferno and Plan 9.

James A. Reeds

Don P. Mitchell

Write to them if you have questions.

--
Michael T. Jones | Chief Technology Advocate  | m...@google.com |  +1 650-335-5765

Andreas Klauer

unread,
Feb 24, 2014, 7:39:40 PM2/24/14
to golan...@googlegroups.com
Is it supposed to be ~15% slower than math/rand?

(
My benchmark won't be optimal, I'm still learning Go and my first program was one that garbles files with random data.

I ended up doing weird things like pulling 7 bytes out of each Int63() and filling up the missing bytes at the end of each block. Switching it to your rand with minimal changes (and removing the 7 byte "logic"), it takes longer to run for a 7GiB file in tmpfs.

I put the code here: https://github.com/frostschutz/garble
)

Florian Uekermann

unread,
Feb 25, 2014, 10:25:38 AM2/25/14
to golan...@googlegroups.com
I have not done proper statistics on this. I think on my machine the slowdown is a bit less. But it is definitely a little bit slower. Once I understood the properties of the old source better I intend to put it back in and add a few benchmarks.

Florian Uekermann

unread,
Feb 25, 2014, 1:13:49 PM2/25/14
to golan...@googlegroups.com, Florian Uekermann
I read up on the details of the old source a little bit. As Nick Craig-Wood pointed out, it is not an LCG. The LCG is used for seeding (can I edit my first message somehow, to add a note that the statement is wrong?).

If I am not mistaken again, the generator is an ALFG (Additive Lagged Fibonacci Generator, thats what Wikipedia calls it). Knuth describes the algorithm in Volume 2 of The art of computer programming in section 3.2.2 (around equation 7). Both Wikipedia and Knuth state the parameter combination 607,273 as possible combination with a period length of 2^(e-1)*(2^607-1) where e is the length of the random number in bits.
I actually found a few references examining its properties and it seems to be a good rng so faar, but there is still seems to be a lack of mathematical background and it is fairly easy to get into trouble by not seeding properly.

I will add the current go implementation back into my repository as default, since it indeed seems to produce 64bit random numbers. That way it should even be possible to have the new implementation produce the same random numbers as the old one by default and have a Uint64 method.
If I have time I will dig for more references to document the current version a little more extensively.

Best regards,
Florian

Nick Craig-Wood

unread,
Feb 25, 2014, 4:28:30 PM2/25/14
to Michael Jones, Florian Uekermann, golang-nuts
On 25/02/14 00:31, Michael Jones wrote:
> On Mon, Feb 24, 2014 at 3:59 PM, Florian Uekermann
> <florian....@googlemail.com
> <mailto:florian....@googlemail.com>> wrote:
>
> I can't find any reference to the "algorithm by DP Mitchell and JA
> Reeds" mentioned in the comments.
>
>
> Both Don P. Mitchell and James A. Reeds were researchers at Bell
> Telephone Laboratories during the general time of Inferno and Plan 9.
>
> James A. Reeds
> http://www.dtc.umn.edu/~reedsj/
>
> Don P. Mitchell
> http://mentallandscape.com/
>
>
> Write to them if you have questions.

I shall do that if no-one else has already done so, and I'll propose a
CL with any info that I find for future source code readers.

Nick Craig-Wood

unread,
Feb 25, 2014, 4:32:49 PM2/25/14
to Florian Uekermann, golan...@googlegroups.com
On 25/02/14 18:13, Florian Uekermann wrote:
> I read up on the details of the old source a little bit. As Nick
> Craig-Wood pointed out, it is not an LCG. The LCG is used for seeding
> (can I edit my first message somehow, to add a note that the statement
> is wrong?).

;-)

> If I am not mistaken again, the generator is an ALFG (Additive Lagged
> Fibonacci Generator, thats what Wikipedia calls it).

http://en.wikipedia.org/wiki/Lagged_Fibonacci_generator

Yes that is definitely the right algorithm! I'd not seen that page
before - thanks.

Florian Uekermann

unread,
Feb 26, 2014, 5:57:19 PM2/26/14
to golan...@googlegroups.com, Florian Uekermann
On Tuesday, February 25, 2014 10:32:49 PM UTC+1, Nick Craig-Wood wrote:
Yes that is definitely the right algorithm!  I'd not seen that page
before - thanks.

I wouldn't have found it without your hint that Knuth describes it in Volume 2.

I have dug a little bit through the literature and the way it is used seems to be a common suggestion in the literature (and boost also does it similarly). The one thing that is special about it is the XOR with a predefined vector. Since it is XOR it is not really imaginable that this could be problematic in itself, but I wonder what the origin of these numbers is. Especially because they are all 63bit. The generator is easily modified to produce 64bit numbers, but I would like to extend the initial vector to include MSBs that are not all zero without changing the generated sequence of the lesser 63bit.

Any ideas as to where that comes from? The comment says:
// cooked random numbers
// the state of the rng
// after 780e10 iterations
, but I am not sure what this means (which rng is referenced here?).
If nobody has any ideas, Ken Thompson is probably the best person to answer this.

Once we figured this out it is probably time to extend the comments a bit.

I added the old random number generator back into my repository with minimal modifications to satisfy the new 64 bit Source interface. The numbers that are generated by my package should be exactly the same as those generated by "math/rand". The only code change required is that all seeds in my package are uint64 instead of int64, but a simple conversion should solve this without changing the initialization. On top of the old functionality, there is of course the Mersenne Twister Source and the Uint64() method. A few more choices for Sources will follow.

Andreas Klauer

unread,
Feb 26, 2014, 6:37:24 PM2/26/14
to golan...@googlegroups.com, Florian Uekermann
Am Mittwoch, 26. Februar 2014 23:57:19 UTC+1 schrieb Florian Uekermann:
Any ideas as to where that comes from? The comment says:
// cooked random numbers
// the state of the rng
// after 780e10 iterations
, but I am not sure what this means (which rng is referenced here?).

It's just something to add more noise to the seed / initial state of the source? I'm no expert for random generators, but I've read somewhere that they tend to start out with low quality numbers and only become better in the long run...

And the rng would be rng.go itself, I'd assume, takes some time to verify though, comment the cooked numbers, follow it up with 780e10 Int63() calls to a source with the default seed 1, then look at its internal state?

Just guesswork.

Nick Craig-Wood

unread,
Feb 27, 2014, 12:25:20 PM2/27/14
to Andreas Klauer, golan...@googlegroups.com, Florian Uekermann
Jim Reeds sent me this about the Go random number generator. He's given
me permission to post it here provided that I point out that these
comments were written off-the-cuff, 25 years or so
after the events, at the end of a tiring day, without any
serious attempt to verify the facts.

> I am the co-author, together with Don Mitchell. Actually, we wrote
> it first for the then-current version of UNIX (v9, or v10) and it
> was picked up by Ken Thompson for use in Plan 9. From there I
> suppose it flowed into Go naturally. I don't actually know when Don
> and I did this. Definitely before 1997.
>
> We never documented it. Our value-added to what's described in
> Knuth is the seeding. Ken had noticed that a naive seeding of an
> earlier lag-17 generator gave biased results in the early output of
> the RNG, and liked the idea of running the RNG a long time to bring
> it into a "steady state" of randomness. So that's what we did: use
> a more-or-less perfunctory LCG to nudge the presumably OK-looking
> rng_cooked[] vector.
>
> (Ken used some earlier RNG in the 'fortunes' program, which, at
> login-time, typed a randomly selected motto from a file of one-liner
> zingers. It was seeded with, let's say, the pid or the output of
> time(0). He saw repeated fortunes more often than he expected, and
> discovered that the effect of the seed on the RNG he was using did
> not really kick in in a serious way until thousands and thousands of
> calls later. Hence the concern.)

So I think your comments are spot on.

I'd like to put a little more documentation in rng.go referencing
Knuth / wikipedia for the generator part and a description of what the
cooked numbers are and how the seeding works, something like this

diff -r abd51e52924a src/pkg/math/rand/rng.go
--- a/src/pkg/math/rand/rng.go Thu Feb 27 01:45:22 2014 -0800
+++ b/src/pkg/math/rand/rng.go Thu Feb 27 17:23:02 2014 +0000
@@ -26,6 +26,10 @@
// cooked random numbers
// the state of the rng
// after 780e10 iterations
+ //
+ // These are used as a set of clean random numbers which are
+ // perturbed by the seeding algorithm into a new state for the
+ // generator.
rng_cooked [_LEN]int64 = [...]int64{
5041579894721019882, 4646389086726545243, 1395769623340756751,
5333664234075297259,
2875692520355975054, 9033628115061424579, 7143218595135194537,
4812947590706362721,
@@ -201,6 +205,8 @@

// Seed uses the provided seed value to initialize the generator to a
deterministic state.
func (rng *rngSource) Seed(seed int64) {
+ // This uses a Linear Congruential Generator to perturb the
+ // cooked random numbers into a new state for the generator.
rng.tap = 0
rng.feed = _LEN - _TAP

@@ -229,6 +235,10 @@
}

// Int63 returns a non-negative pseudo-random 63-bit integer as an int64.
+//
+// The random numbers are generated using an Additive Lagged Fibonacci
+// Generator as described in Knuth: Art of Computer Programming Vol 2,
+// with a sequence length of at least 2**607.
func (rng *rngSource) Int63() int64 {
rng.tap--
if rng.tap < 0 {

Michael Jones

unread,
Feb 27, 2014, 11:42:09 PM2/27/14
to Nick Craig-Wood, Andreas Klauer, golang-nuts, Florian Uekermann
Great! The source should also credit them by name (I put public contact addresses in an earlier email)


--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Nick Craig-Wood

unread,
Feb 28, 2014, 6:34:51 AM2/28/14
to Michael Jones, Andreas Klauer, golang-nuts, Florian Uekermann
The source already credits them by name here

http://golang.org/src/pkg/math/rand/rng.go#L7

That won't appear in the godoc though

Maybe the comments I wrote, and a reference to the original authors
should be in the public godoc ?

http://golang.org/src/pkg/math/rand/rand.go
> Nick Craig-Wood <ni...@craig-wood.com <mailto:ni...@craig-wood.com>>
> -- http://www.craig-wood.com/nick
>
> --
> You received this message because you are subscribed to the Google
> Groups "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it,
> send an email to golang-nuts...@googlegroups.com
> <mailto:golang-nuts%2Bunsu...@googlegroups.com>.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>
>
>
> --
> /Michael T. Jones | Chief Technology Advocate | m...@google.com
> <mailto:m...@google.com> | +1 650-335-5765/

Florian Uekermann

unread,
Feb 28, 2014, 9:47:00 AM2/28/14
to golan...@googlegroups.com, Andreas Klauer, Florian Uekermann
I like those comments, but one thing we still don't know where the cooked numbers come from. Andreas suggestion has a problem with being circular if I understand it correctly beause seeding the rng already involves them. There must be some method to generate them (also they don't seem to be present in the plan9 code I found). I ask Ken Thompson.

I have started running the dieharder test suite on this rng (using my version that returns a 64th bit) and it looks fine so faar (I had my doubts about the 64th bit because it is 0 in the cooked numbers and therefore mostly defined by the lcg in the beginning. Seems not to be too much of an issue since the lcgs have more problems with the lower bits afaik).

Andreas Klauer

unread,
Feb 28, 2014, 10:20:09 AM2/28/14
to golan...@googlegroups.com, Andreas Klauer, Florian Uekermann
Am Freitag, 28. Februar 2014 15:47:00 UTC+1 schrieb Florian Uekermann:
seeding the rng already involves them

Not if they're set to 0. It was just a guess, it might just as well be that the internal state has to start with all 0, instead of seeding.

Florian Uekermann

unread,
Feb 28, 2014, 10:38:41 AM2/28/14
to golan...@googlegroups.com, Andreas Klauer, Florian Uekermann
Yes it could be that they are all 0 and only the lcg is used for seeding. That is also my best guess atm. Starting from 0 without seeding is not possible for mathematical reasons. I will try to verify the different approaches after I got a reply from Ken Thompson.

Michael Jones

unread,
Feb 28, 2014, 10:38:35 AM2/28/14
to Nick Craig-Wood, Andreas Klauer, golang-nuts, Florian Uekermann
Just thinking that since the question came up, then the source could name them by name rather than initials, and include what you wrote.
--
Michael T. Jones | Chief Technology Advocate  | m...@google.com |  +1 650-335-5765

Florian Uekermann

unread,
Mar 2, 2014, 10:21:45 AM3/2/14
to golan...@googlegroups.com, Andreas Klauer, Florian Uekermann
I got a reply from ken. He told me the details about why the cooked vector was created (in the fortunes program for plan9). He did however not specify exactly which random number generator was used to generate them. I think I found the first revision that includes (a slightly different version of) rand. I am running that right 780e10 times now without the cooked vector and the standard seeding at that time to check whether or not that was what ken used. It will take some time. I will report back with more details soon.

The hg revision for the rand package I am talking about is 399 and the path is "src/lib/rand.go".

Best,
Florian

Note: This message was written yesterday, I just sent it to the wrong address. I believe I found the algorithm in the meantime! I am just checking a few things, I will post the code for generating the cooked vector later.

Andreas Klauer

unread,
Mar 2, 2014, 10:38:11 AM3/2/14
to golan...@googlegroups.com, Andreas Klauer, Florian Uekermann
Am Sonntag, 2. März 2014 16:21:45 UTC+1 schrieb Florian Uekermann:
I think I found the first revision that includes (a slightly different version of) rand. I am running that right 780e10 times now without the cooked vector and the standard seeding at that time to check whether or not that was what ken used. It will take some time.

Ah, nice. I did it with the current rand for seed 0 and 1 (took 6 hours) and it produced completely different numbers...

Florian Uekermann

unread,
Mar 3, 2014, 1:32:20 PM3/3/14
to golan...@googlegroups.com, Andreas Klauer, Florian Uekermann
I attached the code to this message. If you execute this you will get the cooked random numbers that are used for seeding the current random number generator. As you can see in the main I plan to modify it to generate the a non-zero 64th bit without changing the other 63. That way we could use it for seeding a 64bit generator properly without changing the outcome of the existing functions by substituting the 64bit cooked numbers with the 63 bit ones. I provide updated code for that tomorrow if I have time. We should also include this in the documentation somehow, to document the origin of the cooked numbers. This code is based on the rnd.go from revision 399, written by Ken Thompson.
randGenCooked.go

Florian Uekermann

unread,
Mar 4, 2014, 9:17:12 AM3/4/14
to golan...@googlegroups.com, Andreas Klauer, Florian Uekermann
Here is the version that would be fit for inclusion in the math/rand package. Nick, how is the documentation of rng.go coming along?, do you want to add a reference to this file, so we upload can both of them together. We should probably file an issue first...

The file attached is a standalone program that includes a "// +build ignore" comment and is ignored for building math/rand. That is how it is done for the certificate generation program in the tls package.

I also attached the output, since running this takes a few hours, but that doesn't need to go upstream. I included the 64bit cooked numbers in my package and they produce the same output when calling the Int63 (and friends) methods.

Best regards,
Florian

PS.: As stated in older messages: The code is heavily based on kens code from revision 399 I just got rid of stuff that isn't needed and modified a few lines (to generate 64 bits and to remove the cooked vector from seeding)
ALFGgenerateCooked.go
out64.txt
Reply all
Reply to author
Forward
0 new messages