Parallelizing within R function - how does this work on HPC?


Keith, Sally

Jun 17, 2013, 11:59:23 PM
to Connolly, Sean, Blowes, Shane, Erin Graham, Figueiredo, Joana, Hisano, Mizue, Hoogenboom, Mia, Jess Hopf, Jordan.Casey, Karen Chong Seng, Louis Vignali, Mariana Alvarez Noreiga, Martino Malerba, Neil Chan, Phillips, Ben, Stephen Ban, Theophilus Zhi En Teo, Thibaut, Loic, Thomas Roberts, Bridge, Thomas, Walker, Stefan, tropi...@googlegroups.com

Hi All

 

I am running Approximate Bayesian Computation (ABC) with Sequential Monte Carlo in R using the package ‘EasyABC’. The function I’m using (ABC_sequential) has an argument that allows the computation to be parallelized across cores (it draws on the package ‘cluster’ to do this).

 

http://cran.r-project.org/web/packages/EasyABC/EasyABC.pdf

 

This works fine on my desktop and uses the cores fully. However, I cannot get it to work on the HPC: when I try, it fails with “cannot open the connection”. I’ve also tried turning off parallelization within the function and instead requesting multiple cores in the PBS script used to submit the R job to the HPC – that doesn’t work either.
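For context, the PBS script I submit looks roughly like the sketch below (the job name, resource requests and script filename are illustrative placeholders, not my exact script):

```shell
#!/bin/bash
#PBS -N abc_sequential        # job name (illustrative)
#PBS -l nodes=1:ppn=4         # ask for 4 cores on a single node
#PBS -l walltime=24:00:00
#PBS -l pmem=4gb

# run from the directory the job was submitted from
cd $PBS_O_WORKDIR
module load R

# EasyABC's n_cluster argument is expected to spread the work
# over the 4 requested cores on this node
Rscript my_abc_run.R
```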

 

Sample code is available within the vignette http://cran.r-project.org/web/packages/EasyABC/vignettes/EasyABC.pdf (my own code is not really simple enough to provide here as an example)

 

Has anyone managed to use the HPC to run jobs parallelized within the R function rather than through PBS + bash scripts? If so, I would really appreciate some enlightenment on this because my models are taking a LOOOOOOOONG time to run!

 

Thanks

Sal

 

Phillips, Ben

Jun 18, 2013, 12:23:46 AM
to Keith, Sally, Connolly, Sean, Blowes, Shane, Erin Graham, Figueiredo, Joana, Hisano, Mizue, Hoogenboom, Mia, Jess Hopf, Jordan.Casey, Karen Chong Seng, Louis Vignali, Mariana Alvarez Noreiga, Martino Malerba, Neil Chan, Stephen Ban, Theophilus Zhi En Teo, Thibaut, Loic, Thomas Roberts, Bridge, Thomas, Walker, Stefan, tropi...@googlegroups.com
Hey Sal,

I've run into problems running parallel code through R on the HPC before.  I wrote some R functions in C that used OpenMP, and the code worked beautifully on my desktop (though the computer did get a tad hot).  When I moved things across to the HPC, nothing.  The code executed, but I could never get more than one CPU working on the problem.  It was something about how R on the HPC interacted with the request for multiple cores.  The problem is solvable (there are R libraries that run things in parallel on the HPC), but I have no idea how they pull that trick off.  My guess is that your EasyABC library is falling into whatever trap I fell into.  In the end I gave up.

I had a conversation with Wayne Mallett about this and he had a few suggestions, which I didn't really chase up – see below.  Might be helpful?

Ben


From Wayne:

You can watch a process's resource usage using "top".  By default, top aggregates CPU usage in multiprocessor jobs (so you may see processes reporting >100% CPU usage), although there is an option to expand the view so that each "sub-process" gets shown separately.

I've not used OpenMP with GCC/gfortran.  I've only worked with SGI compilers (during my PhD) and Intel compilers (later, while they were still free for educational use).

My guess is that something (R?) on the HPC has set your "affinity" to 1 CPU.  You could check this using calls to omp_get_num_procs() and omp_get_max_threads(). As for a fix, you could try calling sched_setaffinity or pthread_setaffinity_np in your program before starting any parallel tasks.

Please note that I haven't done OMP programming for at least 4 years now, and I've never used R in the manner you are.  I will try to help where I can but you are delving into things I haven't done so don't expect quick responses from me if further help is required on this matter.
Regards,
Wayne


From: Ben Phillips <ben.ph...@jcu.edu.au>
Date: Fri, 3 Aug 2012 09:26:55 +1000
To: Wayne Mallett <wayne....@jcu.edu.au>
Subject: Re: code and such

Howdy Wayne,

Sorry to be a humbug again, but I am still having issues.  I managed to resolve the compilation issues on my mac, and the function parallelizes beautifully across the four cores I have at my disposal there. I can watch this happen using an activity monitor.

I can resolve the pragma issues we discussed on the HPC simply by adding the path to the /usr/lib64 directory as follows:

R CMD SHLIB -L/usr/lib64/ -lgomp -fopenmp ~/evo-dispersal/KBGrad/PointMetrics1D.c

This generates the call to gcc as follows:
gcc -m64 -std=gnu99 -shared -L/usr/local/lib64 -o /home/jc227089/evo-dispersal/KBGrad/PointMetrics1D.so /home/jc227089/evo-dispersal/KBGrad/PointMetrics1D.o -L/usr/lib64/ -lgomp -fopenmp -L/usr/lib64/R/lib -lR

which compiles with no issues whatsoever. 

I have thrown in the -Wall switch at times as well just to make sure my pragma errors are not going unreported.

The function then dyn.loads into R – again, no worries.  So far, all good.  The problem, though, is that I get no change in performance when I vary the number of cores available to the process.  A job that takes 19s with one core takes 19s with four.  When I run the same trial on the mac, I get a halving of the time when I change from one core to four.  I have no idea how to monitor CPU activity on the HPC, so I am just assuming this lack of improvement is because I am still only accessing one CPU.

So it really seems like I am still, somehow, not getting parallel behaviour on the HPC, and I am completely stumped.  I wondered if there was some rule on the login node that a user could only get one CPU, so I ran the trial interactively on four private nodes (qsub -l nodes=4 -l pmem=4gb -I), to the same effect.

Any (more) advice you can offer will be greatly appreciated.  

In confusion.

Ben

On 02/08/2012, at 8:41 AM, Mallett, Wayne wrote:

I've found the libgomp.a file – it is in
/usr/lib/gcc/x86_64-redhat-linux/4.4.?/
where ? = 4 or 6.
My previous search only looked inside the /usr/lib64 directory.  Note that it is the x86_64 versions of gcc that create the files mentioned above.
Regards,
Wayne

Jeremy VanDerWal

Jun 18, 2013, 2:12:01 AM
to tropi...@googlegroups.com, Connolly, Sean, Blowes, Shane, Erin Graham, Figueiredo, Joana, Hisano, Mizue, Hoogenboom, Mia, Jess Hopf, Jordan.Casey, Karen Chong Seng, Louis Vignali, Mariana Alvarez Noreiga, Martino Malerba, Neil Chan, Phillips, Ben, Stephen Ban, Theophilus Zhi En Teo, Thibaut, Loic, Thomas Roberts, Bridge, Thomas, Walker, Stefan, Wayne Mallett
G'day Sal,

Unlike Ben, I have had luck parallelizing R on the HPC. The difference between your computer and the HPC may be the versions of R and EasyABC.

The HPC is running R 2.15.1 and has a version of EasyABC that uses the parallel package rather than the cluster package -- that functionality has been integrated into the core of R through the parallel package since R 2.14.

I modified the example from the help file for the ABC_mcmc function and checked whether its parallel mode works on the HPC... it does.

Below is the script I tested on the HPC. The difference in compute time in this simple example is small -- 8.5 sec without parallelization and 7.5 sec with 4 cores -- but the gain will grow with the size of the computation.

I hope this helps.

Jeremy

--------------------------------------------------------------------------
module load R
R

library(EasyABC)
## the model has two parameters and outputs two summary statistics.
## defining a simple toy model:
toy_model<-function(x){ c( x[1] + x[2] + rnorm(1,0,0.1) , x[1] * x[2] + rnorm(1,0,0.1) ) }
## define prior information
toy_prior=list(c("unif",0,1),c("normal",1,2))
# a uniform prior distribution between 0 and 1 for parameter 1, and a normal distribution
# of mean 1 and standard deviation of 2 for parameter 2.
## define the targeted summary statistics
sum_stat_obs=c(1.5,0.5)
## artificial example to perform the Marjoram et al. (2003)’s method, with modifications
# drawn from Wegmann et al. (2009) without Box-Cox and PLS transformations.
##
ABC_Marjoram<-ABC_mcmc(method="Marjoram", model=toy_model, prior=toy_prior,summary_stat_target=sum_stat_obs) ##original example from the help documentation
ABC_Marjoram
ABC_Marjoram<-ABC_mcmc(method="Marjoram", model=toy_model, prior=toy_prior,summary_stat_target=sum_stat_obs,n_cluster=4,use_seed=TRUE) ##modified to use 4 cores
ABC_Marjoram

--------------------------------------------------------------------------

--
An R group for questions, tips and tricks relevant to spatial ecology and climate change.
All R questions welcome.
---
You received this message because you are subscribed to the Google Groups "Tropical R" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tropical-r+...@googlegroups.com.
To post to this group, send an email to tropi...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/tropical-r/784FB8B78AB80245ACF1783D83A15249203F4440%40HKXPRD0610MB352.apcprd06.prod.outlook.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Keith, Sally

Jun 18, 2013, 3:13:08 AM
to tropi...@googlegroups.com, Connolly, Sean, Blowes, Shane, Erin Graham, Figueiredo, Joana, Hisano, Mizue, Hoogenboom, Mia, Jess Hopf, Jordan.Casey, Karen Chong Seng, Louis Vignali, Mariana Alvarez Noreiga, Martino Malerba, Neil Chan, Phillips, Ben, Stephen Ban, Theophilus Zhi En Teo, Thibaut, Loic, Thomas Roberts, Bridge, Thomas, Walker, Stefan, Mallett, Wayne

Hi Jeremy

 

Thanks for the response. Although the code to run on 4 cores works for the ABC_mcmc function, when I modify it for use with the sequential function it returns the error below (NB: the single-core version of this sequential code works fine). Perhaps Wayne Mallett will be able to decipher this for me!

 

Thanks

Sal

 

ABC_2<-ABC_sequential(method="Beaumont", model=toy_model, nb_simul=10, tolerance_tab=c(1.25,0.75),
     prior=toy_prior, summary_stat_target=sum_stat_obs, n_cluster=4, use_seed=TRUE) ##modified to use 4 cores

 

Error in socketConnection("localhost", port = port, server = TRUE, blocking = TRUE,  :
  cannot open the connection
Calls: ABC_sequential ... makePSOCKcluster -> newPSOCKnode -> socketConnection
In addition: Warning message:
In socketConnection("localhost", port = port, server = TRUE, blocking = TRUE,  :
  port 10187 cannot be opened
Execution halted

Jeremy VanDerWal

Jun 18, 2013, 7:20:03 PM
to tropi...@googlegroups.com, Connolly, Sean, Blowes, Shane, Erin Graham, Figueiredo, Joana, Hisano, Mizue, Hoogenboom, Mia, Jess Hopf, Jordan.Casey, Karen Chong Seng, Louis Vignali, Mariana Alvarez Noreiga, Martino Malerba, Neil Chan, Phillips, Ben, Stephen Ban, Theophilus Zhi En Teo, Thibaut, Loic, Thomas Roberts, Bridge, Thomas, Walker, Stefan, Mallett, Wayne
Sal,

Can you let me know exactly what you are doing? I tested your code on the HPC on login1, login3 and node 33, and I cannot reproduce your error. My sequence is below.

I did notice that method="Beaumont" is quick as a single-core run, but once you increase n_cluster to use more than a single core, it takes forever... This is not the case for the "Drovandi", "Delmoral" or "Lenormand" methods; they appear to work well.

I am on campus this week if you want to bring your code by and we can do some testing?

Cheers,

Jeremy

---------------------------------------------------------------------------------
##on the HPC at the command prompt
module load R
R
##once in R -- the R version that loaded was R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows" 
library(EasyABC)
##### EXAMPLE 1 #####
#####################
set.seed(1)
## artificial example to show how to use the ’ABC_sequential’ function.
## defining a simple toy model:
toy_model<-function(x){ 2 * x + 5 + rnorm(1,0,0.1) }
## define prior information
toy_prior=list(c("unif",0,1)) # a uniform prior distribution between 0 and 1
## define the targeted summary statistics
sum_stat_obs=6.5
## to perform the Beaumont et al. (2009)’s method:
##
tolerance=c(1.5,0.5)
ABC_Beaumont<-ABC_sequential(method="Beaumont", model=toy_model, prior=toy_prior,nb_simul=20, summary_stat_target=sum_stat_obs, tolerance_tab=tolerance)
ABC_Beaumont

ABC_Beaumont<-ABC_sequential(method="Beaumont", model=toy_model, prior=toy_prior,nb_simul=20, summary_stat_target=sum_stat_obs, tolerance_tab=tolerance,n_cluster=4,use_seed=TRUE)
ABC_Beaumont

ABC_2<-ABC_sequential(method="Beaumont", model=toy_model, nb_simul=10, tolerance_tab=c(1.25,0.75), prior=toy_prior, summary_stat_target=sum_stat_obs, n_cluster=4, use_seed=TRUE) ##modified to use 4 cores
ABC_2
---------------------------------------------------------------------------------


Keith, Sally

Jun 18, 2013, 10:47:28 PM
to tropi...@googlegroups.com, Connolly, Sean, Blowes, Shane, Erin Graham, Figueiredo, Joana, Hisano, Mizue, Hoogenboom, Mia, Jess Hopf, Jordan.Casey, Karen Chong Seng, Louis Vignali, Mariana Alvarez Noreiga, Martino Malerba, Neil Chan, Phillips, Ben, Stephen Ban, Theophilus Zhi En Teo, Thibaut, Loic, Thomas Roberts, Bridge, Thomas, Walker, Stefan, Mallett, Wayne

Hi All

 

Now I have got it working, here’s what I have discovered. It may be useful for others.

 

If it is a problem with the HPC itself then, as Wayne says, “a system error would abruptly take you out of R and there would be no reference to R functions/methods” within the ‘.out’ file.

 

When I went back to check my R code again, it turned out that I hadn't updated the path of a file one of the functions was trying to open since testing it on my desktop. This led to the rather uninformative error message:

 

Error in checkForRemoteErrors(val) :
  4 nodes produced errors; first error: cannot open the connection
Calls: ABC_sequential ... clusterApplyLB -> dynamicClusterApply -> checkForRemoteErrors
Execution halted

 

 

Moral of the story – if you get the above error message, it means something is wrong in the R code running on the worker nodes, NOT in the HPC itself.

 

Thanks Jeremy and Wayne!

 

Cheers

Sal
