Setting random number generator seed using sequential integers

James Douglas

unread,

Aug 9, 2016, 10:27:07 AM8/9/16

to julia-users

Hi All,

I have a question about how Julia handles the seed given to the random number generator through the function srand(seed). I am doing Monte Carlo simulations using Julia, where I have some code that depends on the output from the rand() function and then I run this code with many different initial seeds. Naively I have set the seed specifically for each run of the code as the integer corresponding to the run, i.e., I run the code starting with srand(1), srand(2), etc. up to srand(N) for N runs. My primary motivation for this is that I would like to reproduce a particular run if I see anything strange. However, I recently saw that at least in C++ (http://www.pcg-random.org/posts/cpp-seeding-surprises.html), seeding the Mersenne Twister with integers like this is a bad idea. So I would like to know if there is a problem with seeding Julia's random number generator in this way? Will it bias my sample or does Julia somehow avoid this in the way it processes the seed given to srand()? And finally, if it is a bad idea, what would be the best way to seed so that I can reproduce everything at a later stage?

Thanks in advance,

James

Stefan Karpinski

unread,

Aug 9, 2016, 11:11:30 AM8/9/16

to Julia Users

That's a very good question. Issue filed: https://github.com/JuliaLang/julia/issues/17918.

Andreas Lobinger

unread,

Aug 9, 2016, 11:23:16 AM8/9/16

to julia-users

Hello,

we somehow have similar problem (not in julia, but matlab). One idea is to have a long (624 32-bit int) (random) number available and multiply by the single seed. Or get /dev/random for seeding and store it along the output.

Simon Byrne

unread,

Aug 9, 2016, 11:26:46 AM8/9/16

to julia-users

It's an interesting problem. In general, there's not much you can do if the state of your generator is larger than the size of the seed (which it definitely is in the case of Mersenne twisters).

As to whether it would bias your sample: it ultimately depends on what you are doing with your random numbers. The example given in the blog post is somewhat pathological: you are essentially testing for an exact bit pattern. A more typical use case (outside of cryptography) is to treat it like continuous variates where you assume that the probability of observing any exact value is zero (though of course it isn't in floating point), but that you care about values in a particular range, i.e.

if rand() < 0.1

...

end

In this case, you generally should be fine, as any bias would typically be in the order of machine epsilon, and easily swamped by the Monte Carlo error.

-Simon

James Douglas

unread,

Aug 12, 2016, 10:00:57 AM8/12/16

to julia-users

Yes, testing whether the random numbers given by rand() lie within some continuous range is indeed what I am doing.

I think I understand the problem now better, thanks to your answers and the discussion in the issue filed above. It is of course true that any set of N fixed seeds can only produce N initial random numbers, the question is then whether taking seeds as 1:N gives a representative random distribution on [0,1) for this first value and whether there is any overlap in the following streams. From some very naive testing it doesn't seem like there is an obvious problem, for example if I take seeds 1:10000 in Julia and look at the distribution of the initial random numbers is appears uniformly distributed, and if I take the first ten thousand random numbers from each stream (what I'm doing in my MC simulations) and test for repeated values between the streams (using the function unique(), probably some better tests out there?) I only get 1 repeated value out of 10^8. So I will for the moment assume that what I am doing is ok and perhaps in the future use randjump() to ensure unique streams as suggested by rfourquet.