pbdNCDF4 with foreach

Dawg

Feb 2, 2016, 9:49:57 PM
to RBigDataProgramming
Hello -

Is it possible to write to the same 2-D matrix with foreach, where each worker fills in a different set of columns?
The code below works with %do% but not with %dopar%.


library(pbdNCDF4)
library(doMC)
registerDoMC()
options(cores=4)

n.loc <- 1000
n.events <- 1000
n.sample <- 10
locID <- c(1:n.loc)
n.hazard <- ceiling(runif(n.loc,1,n.events))
event.address <- c(0,cumsum(n.hazard))

ncfname <- "test_ncdf4.nc"
n.sample.nc <- ncdim_def("sampleID", "sampleID", 1:n.sample)
n.event.nc <- ncdim_def("eventID", "eventID", 1:sum(n.hazard))
varloss <- ncvar_def("varloss", "Loss", list(n.sample.nc,n.event.nc), 0, longname="Sampled Losses", prec="double",compression=NA)
ncout <- nc_create_par(ncfname,varloss,force_v4=T)

xxx <- foreach(i.loc=c(1:n.loc),.combine='c', .inorder=FALSE) %dopar% {
        s <- rnorm(n.hazard[i.loc]*n.sample)
        s2 <- matrix(ifelse(s>1,s,0),n.sample,n.hazard[i.loc])
        ncvar_put(ncout,varloss,s2,start=c(1,event.address[i.loc]+1),count=c(-1,event.address[i.loc+1]-event.address[i.loc]))
        if((i.loc%%100) == 0) print(paste(i.loc))
        i.loc
}
nc_close(ncout)

Ostrouchov, George

Feb 2, 2016, 11:33:23 PM
to RBigDataProgramming
While mixing fork (via multicore) and MPI can work if managed correctly, I do not recommend using fork (or foreach) for NetCDF4 parallel writes.

Remove all foreach components. Use the pbdMPI concepts comm.rank() or get.jid() to divide and assign work. Then run in batch with
mpirun -np 4 Rscript your_script_file.r
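
As a rough illustration of dividing the work by rank, here is a minimal base-R sketch. It does not call pbdMPI or pbdNCDF4: rank and size are plain integers standing in for what comm.rank() and comm.size() would report under mpirun, and block_jobs() and column_window() are hypothetical helpers written for this example. The split shown is a simple contiguous block partition and may not match the exact assignment get.jid() returns.

```r
# Split 1..n into `size` contiguous blocks; lower ranks absorb the remainder.
# `rank` is zero-based, matching the MPI convention.
block_jobs <- function(n, rank, size) {
  base  <- n %/% size
  extra <- n %% size
  len   <- base + (rank < extra)
  first <- rank * base + min(rank, extra) + 1
  if (len == 0) integer(0) else seq.int(first, length.out = len)
}

# Column window along eventID covered by a contiguous block of locations,
# using the same event.address offsets as the original post.
column_window <- function(jobs, event.address) {
  if (length(jobs) == 0) return(NULL)
  first <- min(jobs)
  last  <- max(jobs)
  list(start = event.address[first] + 1,
       count = event.address[last + 1] - event.address[first])
}

set.seed(1)
n.loc <- 10
n.hazard <- ceiling(runif(n.loc, 1, 20))
event.address <- c(0, cumsum(n.hazard))
size <- 4  # assumed number of ranks, as in "mpirun -np 4"

# One window per rank; in a real run each rank would compute only its own.
windows <- lapply(0:(size - 1), function(rank) {
  column_window(block_jobs(n.loc, rank, size), event.address)
})
```

Because the windows do not overlap and together cover every column, each rank could generate its own block of samples and write it with a single ncvar_put() call, passing start=c(1, w$start) and count=c(-1, w$count) in the style of the original loop, where w is that rank's window.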

That said, if you are not using a parallel file system, parallel writes will slow you down.

See examples in pbdDEMO.

George

