Parallel calculation using R on Linux compute server


Yang Yang

Dec 5, 2016, 6:50:39 PM
to RBigDataProgramming
I am working with a large dataset and want to use parallel computation to speed up processing on a Linux compute server. I use two packages, doSNOW and parallel, to run the parallel jobs. When I submit the job with "qsub", I get the following error.

MKtre.pbs.e12231587:
mpirun has exited due to process rank 0 with PID 19053 on node cl2n166 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in the job did. This can cause a job to hang indefinitely while it waits for all processes to call "init". By rule, if one process calls "init", then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize". By rule, all processes that call "init" MUST call "finalize" prior to exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be terminated by signals sent by mpirun (as reported here).

WestGrid.Rout:
> cl <- makeCluster(mpi.universe.size()-1, type='MPI',outfile='')
Error in Rmpi::mpi.comm.spawn(slave = mpitask, slavearg = args, nslaves = count,  :  MPI_Comm_spawn is not supported.
Calls: makeCluster -> makeMPIcluster -> <Anonymous>
Execution halted

Below is my R script code:
library(Rmpi)
library(fume)
library(foreach)
library(doSNOW)

load("spei03_df.rdata",.GlobalEnv)

spei03_data=spei03_df[,c(-1,-2)]
rownames(spei03_data)=1:nrow(spei03_data)

cl <- makeCluster(mpi.universe.size()-1, type='MPI',outfile='')
registerDoSNOW(cl)
MK_grid <- 
  foreach(i=1:10, .packages="fume",.combine='rbind') %dopar% {
    abc <- mkTrend(as.matrix(spei03_data)[i,])
    data.frame(P_value=abc$`Corrected p.value`, Slope=abc$`Sen's Slope`*10,
               Zc=abc$Zc)
  }
stopCluster(cl)
save(MK_grid,file="MK_grid.rdata")
mpi.exit()


Below is my PBS file:
#!/bin/bash
#PBS -l nodes=2:ppn=12
#PBS -l walltime=2:00:00 
module load application/R/3.3.1
cd $PBS_O_WORKDIR 
mpirun -n 1 R CMD BATCH ./WestGrid.R

How can I fix this error, or is there a better solution using the "pbdR" packages? Thanks a lot.


Wei-Chen Chen

Dec 6, 2016, 7:44:08 PM
to RBigDataProgramming
Reasons 1 and 2 in the mpirun message are both inaccurate for your case. You have only one process (mpirun -n 1), and it did call init. The problem is that your MPI does not support spawning (see "MPI_Comm_spawn is not supported" in WestGrid.Rout), so makeCluster failed and R halted before finalize could be called. To fully utilize your resources, you are better off running your code in SPMD style. You can start with the pbdMPI vignette.
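
For illustration, here is a minimal sketch of what an SPMD version of the script above might look like with pbdMPI. It is not tested on WestGrid. It assumes pbdMPI and fume are installed for the loaded R module, that the data has at least as many rows as there are MPI ranks, and that the job is launched with all ranks started by mpirun (no spawning). The file name WestGrid_spmd.R and the Row column are hypothetical names introduced here.

```r
# Hypothetical file: WestGrid_spmd.R
# Assumed launch line in the PBS file (24 = nodes * ppn from the original request):
#   mpirun -np 24 Rscript WestGrid_spmd.R
suppressMessages(library(pbdMPI))
library(fume)

init()  # every rank calls init; nothing is spawned from inside R

# Every rank loads the same data and drops the first two columns,
# as in the original script.
load("spei03_df.rdata")
spei03_mat <- as.matrix(spei03_df[, c(-1, -2)])

# get.jid() hands each rank its own subset of the row indices.
# Assumption: nrow(spei03_mat) >= comm.size(), so no rank gets an empty set.
my_rows <- get.jid(nrow(spei03_mat))

# Each rank runs mkTrend only on its own rows.
my_part <- do.call(rbind, lapply(my_rows, function(i) {
  abc <- mkTrend(spei03_mat[i, ])
  data.frame(Row     = i,  # hypothetical column, kept to track the source row
             P_value = abc$`Corrected p.value`,
             Slope   = abc$`Sen's Slope` * 10,
             Zc      = abc$Zc)
}))

# Collect the per-rank pieces on rank 0 (returned there as a list).
all_parts <- gather(my_part)

if (comm.rank() == 0) {
  MK_grid <- do.call(rbind, all_parts)
  save(MK_grid, file = "MK_grid.rdata")
}

finalize()  # every rank calls finalize before exiting
```

Every rank runs the same script and calls init and finalize itself, so no MPI_Comm_spawn support is needed, and only rank 0 writes the result file.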