Dear all,
Please compare the following simple scripts:
script 1:
library(pbdMPI, quiet = TRUE)
init()
# here all ranks get the same RNG stream
comm.set.seed(seed=123456)
comm.print(rnorm(5), all.rank = TRUE)
comm.end.seed()
# here each rank gets a different RNG stream
comm.set.seed(seed=123456, diff = TRUE)
comm.print(rnorm(5), all.rank = TRUE)
comm.end.seed()
finalize()
and script 2:
library(parallel, quiet = TRUE)
np <- 2L;
cl <- makeCluster(spec=np, type="MPI", outfile="");
# here all processes get the same RNG stream
a <- parSapply(cl=cl, X=1:2, FUN=function(x) { set.seed(seed=123456, kind="L'Ecuyer"); print(rnorm(5)); });
stopCluster(cl=cl);
cl <- makeCluster(spec=np, type="MPI", outfile="");
# here each process gets a different RNG stream
clusterSetRNGStream(cl=cl, iseed=123456);
a <- parSapply(cl=cl, X=1:2, FUN=function(x) print(rnorm(5)));
stopCluster(cl=cl);
mpi.quit();
You can run script 1 with "mpirun -np 2 Rscript script1.R" and script 2
with "mpirun -np 1 Rscript script2.R".
I have the following questions:
1. In script 1, the streams obtained with diff=TRUE are completely
different from those obtained with diff=FALSE. Shouldn't one of the
random vectors with diff=TRUE be the same as the vector with diff=FALSE?
That is exactly what happens in script 2.
2. Why are the random vectors generated in script 1 completely different
from those generated in script 2, even though the seed is the same
(seed=123456 in both scripts)?
3. Is there a better way to implement the same stream for all workers in
script 2?
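To make question 3 concrete, here is a minimal sketch of one possible alternative. It assumes a default PSOCK cluster rather than type="MPI", only so that it runs without mpirun, and it is not necessarily the recommended approach. It uses clusterEvalQ() from the parallel package to evaluate the same set.seed() call once on every worker, instead of calling set.seed() inside the function passed to parSapply():

```r
library(parallel)

# Assumption: a PSOCK cluster is used here only for illustration;
# the original script uses type="MPI".
cl <- makeCluster(2L)

# Evaluate the same expression on every worker: seed the L'Ecuyer-CMRG
# generator identically, then draw five normal variates.
res <- clusterEvalQ(cl, {
  set.seed(123456, kind = "L'Ecuyer-CMRG")
  rnorm(5)
})

stopCluster(cl)

# res is a list with one numeric vector per worker;
# both workers were seeded identically, so the vectors should match.
same_stream <- identical(res[[1]], res[[2]])
print(same_stream)
```

Because the seeding runs once per worker and not once per task, the result does not depend on how parSapply() happens to distribute the elements of X across workers.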
Best regards,
Martin Ivanov