Setting RNG seeds for parallel computing

Gray Calhoun

Jul 15, 2014, 6:25:11 PM
to julia...@googlegroups.com
Hi everyone, I'm trying to start using Julia for some Monte Carlo simulations
(not MCMC) which I'd like to parallelize. I haven't found any documentation
for setting the RNG's seed for parallelization. The naive approach gives
different results than non-parallel execution (which is not surprising).

Starting with `julia -p 4` and executing:

    ## Sequential execution
    srand(1)
    seqfor = Array(Float64,4)
    for i=1:4
        seqfor[i] = rand()
    end

    ## Parallel execution
    srand(1)
    parfor = @parallel (vcat) for i=1:4
        rand()
    end

    [sort(parfor) sort(seqfor)]

gives

    4x2 Array{Float64,2}:
     0.346517  0.00790928
     0.346517  0.236033
     0.662369  0.312707
     0.914194  0.346517

and re-running the parallel code can give different results even after
re-seeding. If we start julia without `-p 4` then both loops give the
same results. If it matters, I'm using Julia version 0.3.0-rc1+28
from source (commit 79e4771).

Is there documentation on the right way to parallelize simulations in
Julia? If not, should my next step be to carefully read the "parallel
computing" documentation?

Andreas Noack

Jul 16, 2014, 1:01:35 AM
to julia...@googlegroups.com
You'll have to set the seed on each process, e.g. `fetch(@spawnat 2 srand(1)); fetch(@spawnat 3 srand(2))`. However, I am not sure what our random number generator (dsfmt) promises about independence of the different streams. We should look into that to ensure that you get good random numbers when running in parallel.
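
For example, something along these lines (just a sketch against the 0.3-era API; the per-worker seed values here are arbitrary):

    for p in workers()
        ## set the global RNG seed on worker p, one distinct seed per worker;
        ## whether dsfmt guarantees these streams are independent is the open
        ## question mentioned above
        fetch(@spawnat p srand(p))
    end
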
--
Med venlig hilsen

Andreas Noack Jensen

james.dill...@gmail.com

Jul 16, 2014, 1:07:09 AM
to julia...@googlegroups.com
If your goal is that each call to rand(), regardless of process, sequentially pulls a number from the RNG stream, then perhaps you just need to create a MersenneTwister([seed]) object.

    mystream = MersenneTwister(1)

    ## Parallel execution
    srand(1)
    parfor = @parallel (vcat) for i=1:4
        rand(mystream)
    end

Intuitively, this seems like it must come with some communication overhead. There should be a way to avoid that overhead with pmap().

Jim

Tomas Lycken

Jul 16, 2014, 3:09:41 AM
to julia...@googlegroups.com
Another way would be to just initialize a `MersenneTwister` stream on each process, and seed them differently. Since the streams are process local, there shouldn't be any communication overhead, and streams with different seeds should be independent.
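
Something like this, perhaps (a sketch; using each worker's process id as its seed is just one illustrative way to get distinct seeds):

    ## give every process its own local stream, seeded differently
    @everywhere mystream = MersenneTwister(myid())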

// T

Gray Calhoun

Jul 16, 2014, 10:45:50 AM
to julia...@googlegroups.com
On Wednesday, July 16, 2014 12:07:09 AM UTC-5, James Delaney wrote:

> If your goal is that each call to rand(), regardless of process,
> sequentially pulls a number from the RNG stream, then perhaps
> you just need to create a MersenneTwister([seed]) object.
>
>     mystream = MersenneTwister(1)
>
>     ## Parallel execution
>     srand(1)
>     parfor = @parallel (vcat) for i=1:4
>         rand(mystream)
>     end

Thanks for the reply. This is probably close to what I'll end up
doing.  The execution time of each simulation is long enough that it
should dwarf any communication overhead, and I'd (ideally) like the
results to be independent of the number of processors used, which this
accomplishes.

Unfortunately the code doesn't work as written. With four processors,
it returns

    julia> parfor
    4-element Array{Float64,1}:
     0.236033
     0.236033
     0.236033
     0.236033

I'll look into what's going on more carefully, because it would be
nice to have an approach like this work. (I assume `mystream` is
copied into each process, which `pmap` should be able to avoid.)
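
One quick way to check that assumption (a sketch; if every worker reports the same draw, each holds its own copy of the same initial state):

    check = @parallel (vcat) for i=1:4
        [(myid(), rand(mystream))]  ## tag each draw with the worker's id
    end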

--Gray

--
Gray Calhoun Assistant Professor of Economics, Iowa State University
http://gray.clhn.co // (515) 294-6271 // 467 Heady Hall

Gray Calhoun

Jul 16, 2014, 10:54:07 AM
to julia...@googlegroups.com
On Wednesday, July 16, 2014 2:09:41 AM UTC-5, Tomas Lycken wrote:
> Another way would be to just initialize a `MersenneTwister` stream
> on each process, and seed them differently. Since the streams are
> process local, there shouldn't be any communication overhead, and
> streams with different seeds should be independent.

My understanding is that streams with different seeds are not
guaranteed to be independent, but there are some RNGs for which they
are.

--Gray

Tomas Lycken

Jul 16, 2014, 11:15:00 AM
to julia...@googlegroups.com

Yeah, sorry - I suspected there might be something I was missing. According to the Wikipedia page about the Mersenne Twister, one of its disadvantages is just that:

> It can take a long time to turn a non-random initial state—particularly an initial state with many zeros—into output that passes randomness tests. A consequence of this is that two instances of the generator, started with initial states that are almost the same, will output nearly the same sequence for many iterations before eventually diverging.

So you should probably not do this without significant warm-up, or in some other way make sure that the seeds are different enough. If you don't need random numbers very often, you're probably better off with James' approach.
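
Something like this on each worker, perhaps (a sketch; the burn-in length of 10^6 is arbitrary):

    @everywhere begin
        mystream = MersenneTwister(myid())
        for i = 1:10^6
            rand(mystream)  ## discard draws to warm up the state
        end
    end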

// T

Gray Calhoun

Jul 18, 2014, 12:58:13 PM
to julia...@googlegroups.com
On Tuesday, July 15, 2014 5:25:11 PM UTC-5, Gray Calhoun wrote:
> Hi everyone, I'm trying to start using Julia for some Monte
> Carlo simulations (not MCMC) which I'd like to parallelize. I
> haven't found any documentation for setting the RNG's seed for
> parallelization. The naive approach gives different results
> than non-parallel execution (which is not surprising).
[rest of message cut]

Thanks to everyone who replied to my original email.  Just to follow
up, it turns out that `pmap` gives a simple solution, since it manages
the parallel tasks from the main process. The following code gives the
general idea:

```
addprocs(1)
srand(1)
function rvproducer()
    for i=1:4
        produce(rand())
    end
end

parallelrvs = pmap(Task(rvproducer)) do x
    println(x) ## verify that the work is done on different processes
    return x
end

srand(1)
sort(parallelrvs) == sort(rand(4)) ## returns true
```

--Gray

Viral Shah

Jul 18, 2014, 5:36:11 PM
to julia...@googlegroups.com
One thing we can do is use SFMT Jump. Its dependencies are not so straightforward to build, and I am not sure how well supported this is, but it may be the best way to get parallel streams.

Gray Calhoun

Jul 19, 2014, 3:53:14 PM
to julia...@googlegroups.com
On Friday, July 18, 2014 5:36:11 PM UTC-4, Viral Shah wrote:

> One thing we can do is use SFMT Jump. Its dependencies are not so straightforward to build, and I am not sure how well supported this is, but it may be the best way to get parallel streams.

That's probably the right approach long-term; I know that R uses an RNG developed by L'Ecuyer [1] that might be easier to implement and would probably be worth including on its own. (For reproducibility of results that depend on this RNG, if nothing else.) If the pmap method I mentioned in an earlier email turns out to be too slow [2], I might look into these other approaches.

[1] See section 6 of <https://stat.ethz.ch/R-manual/R-devel/library/parallel/doc/parallel.pdf>,
this package is included in R 3.0+ by default.
[2] https://groups.google.com/d/msg/julia-users/AydkFR7mJqo/_bu4S2W-x90J

James Delaney

Jul 19, 2014, 8:52:11 PM
to julia...@googlegroups.com
Gray, thanks for the update. 

Just FWIW, I did put the `MersenneTwister` state vector (`state.val::Array{Int32,1}`) in a `SharedArray` so that it could be referenced by all of the processes in your original `parfor` implementation. This method seemed to require writing a new wrapper around the call to `dsfmt_genrand_close_open()` in libdSFMT.

Doing this ultimately gets the results you were looking for, but seems to be far less elegant and less efficient than your pmap() implementation.

Since I also agree that Monte Carlo in parallel seems unavoidably "distributed", each process should ideally work on non-overlapping substreams of the same stream. I wonder (mainly for my own purposes) whether this pmap() implementation is all that "safe" in the meantime, before something like SFMT Jump is implemented.

That is, if you end up generating more than one rand() in a process, setting the seed alone does not guarantee the same results each time you run it.

In the meantime, might it be preferable to just generate all the rand()s you are ever going to need, serially, ahead of time, put them in a distributed array, and have your parallel processes work on their separate pieces of the distributed array?
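
Something along these lines, perhaps (a sketch using the 0.3-era distributed-array API; the number of draws and the per-chunk work here are placeholders):

    srand(1)
    drvs = distribute(rand(1000))  ## generate every draw serially, then split
    ## each worker operates only on its local piece of the distributed array
    results = [fetch(@spawnat p sum(localpart(drvs))) for p in procs(drvs)]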

Jim

Gray Calhoun

Jul 20, 2014, 4:18:08 PM
to julia...@googlegroups.com
On Saturday, July 19, 2014 8:52:11 PM UTC-4, James Delaney wrote:

> Since I also agree that Monte Carlo in parallel seems unavoidably "distributed", each process should ideally work on non-overlapping substreams of the same stream. I wonder (mainly for my own purposes) whether this pmap() implementation is all that "safe" in the meantime, before something like SFMT Jump is implemented.
>
> That is, if you end up generating more than one rand() in a process, setting the seed alone does not guarantee the same results each time you run it.
>
> In the meantime, might it be preferable to just generate all the rand()s you are ever going to need, serially, ahead of time, put them in a distributed array, and have your parallel processes work on their separate pieces of the distributed array?

Hi Jim, that's a good point. As long as all of the calls to rand() happen in the `rvproducer` function, I think it should be safe: the `rvproducer` code runs only on the main process, and the calls to it are wrapped in `@async` tasks that wait for each call to finish before requesting the next value. It should produce exactly the same numbers in the same order as it would without the parallelization.

At least, that's my understanding of the pmap code. The `@sync` block in pmap starts here:

<https://github.com/JuliaLang/julia/blob/05aa8101d1a4ad5b504bb5f4c7f1dc49e82964f4/base/multi.jl#L1341>

and `getnext_tasklet()` is defined immediately before this block (line 1325); it is what generates the next values from `Task(rvproducer)`. Those random variables are then passed to the other processes as an argument in line 1348. A secondary feature (that I didn't realize until just now) is that the random variables are stored if a call fails, so the work can be retried later. In my case, this code is all executed on a workstation, so I don't expect individual processes to fail, but that could be nice on a shakier system.

Take all of this with a grain of salt. This is my first attempt at a serious project with Julia, so I might be misunderstanding the code base.

--Gray