Dear Nimble users and devs,
I would like to run MCMC, in parallel, for several models that differ only in their constants or data values. The code of the model involves a user-defined distribution without an associated random-generation function.
I tried following the example on the Nimble website,
https://r-nimble.org/nimbleExamples/parallelizing_NIMBLE.html
where in my case the "container function" for parallelization passes the relevant constants or data to the model. But I keep on getting some kind of error related to the non-existence of the random-generation function. The code defined within the container function, however, runs just fine when executed normally (or called with Rscript). Do you know what the problem could be?
Here is an example:
##################################################
library('parallel')
mycluster <- makeCluster(3)
parallelSamplesFun <- function(XX){ # XX enters the constants of the model
library('nimble')
dmymnorm <- nimbleFunction(
run = function(x=double(1), mean=double(1), prec=double(2), log=integer(0, default=0)){
lp <- -sum(asRow(x-mean) %*% prec %*% asCol(x-mean))/2
if(log) return(lp)
else return(exp(lp))
returnType(double(0))
})
registerDistributions(list(
dmymnorm = list(
BUGSdist = "dmymnorm(mean, prec)",
Rdist = "dmymnorm(mean, prec)",
pqAvail = FALSE,
types = c("value = double(1)", "mean = double(1)", "prec = double(2)")
)))
assign('dmymnorm', dmymnorm, envir=.GlobalEnv)
code <- nimbleCode({
x[1:2] ~ dmymnorm(mean=means[1:2], prec=prec[1:2,1:2])
})
constants <- list(means=rep(XX,2), prec=diag(c(1/1^2, 1/0.1^2)))
inits <- list(x=1:2)
modeldata <- list()
model <- nimbleModel(code=code, name='model', constants=constants, inits=inits, data=modeldata, dimensions=list(x=2, means=2, prec=c(2,2)), calculate=F)
Cmodel <- compileNimble(model, showCompilerOutput = TRUE, resetFunctions = TRUE)
confmodel <- configureMCMC(Cmodel, nodes=NULL)
confmodel$addSampler(target='x', type='AF_slice')
confmodel$setMonitors(c('x','logProb_x'))
mcmc <- buildMCMC(confmodel)
Cmcmc <- compileNimble(mcmc)
samples <- runMCMC(mcmc=Cmcmc, niter=10, nburnin=5, inits=list(x=1:2))
return(samples)
}
## Now we call the container function
results <- parLapply(cl = mycluster, X = c(1,10), parallelSamplesFun)
## Error in checkForRemoteErrors(val) :
## 2 nodes produced errors; first error: In sizeAssignAfterRecursing: 'rmymnorm' is not available or its output type is unknown.
## This occurred for: eigenBlock(model_x,1:2) <<- rmymnorm(=1,mean=model_means[1:2],prec=model_prec[1:2, 1:2])
## This was part of the call: {
## eigenBlock(model_x,1:2) <<- rmymnorm(=1,mean=model_means[1:2],prec=model_prec[1:2, 1:2])
## }
##################################################
Some additional notes on my experiments with this problem:
1. As already said, the code within the function works just fine when called directly (after assigning some value to XX).
2. Calling the function directly, eg "parallelSamplesFun(1)", gives the same error.
3. I tried eliminating the registration of the user-defined function and/or its assignment to the global environment, but the error persists.
4. I also tried with the 'foreach'+'doFuture' packages. Same error.
I manage to get parallel runs only by invoking the Nimble code through three separate Rscript instances, so somehow I'm getting it to work, but this is not an ideal solution for debugging.
Thank you again for your comments and advice!
Cheers,
Luca