Run parallel chains and update model

1,605 views
Skip to first unread message

cro...@ualberta.ca

unread,
Jun 27, 2018, 8:09:01 PM6/27/18
to nimble-users
Hello all,
I am a quantitative ecologist - i.e. not a programmer or a statistician - and I use JAGS through R quite a bit to run hierarchical Bayesian models. I am trying to use nimble in the same way I use JAGS in an attempt to speed things up on models that take a long time in JAGS (like 3 weeks) and use a great deal of memory for the processing and output. 

My questions are:
1. Can I use parallel processing with nimble, i.e. send individual MCMC chains to separate cores (and if so, how? - I can't find anything on the internet about this)? I've tried with doParallel and can't get it to work. 

2. Is there any way to update the models after they have finished if the haven't converged yet? 

Thank you in advance for any help you can give or resources you can point me to. 

Andrew D. Crosby
Postdoctoral Fellow
Boreal Avian Modelling Project
Dept. of Biological Sciences
CW405 Biological Sciences Building
University of Alberta
Edmonton, AB T6G 2E9

Daniel Turek

unread,
Jun 27, 2018, 9:23:12 PM6/27/18
to cro...@ualberta.ca, nimble-users
I can give an easy answer to #2:

>> Is there any way to update the models after they have finished if the haven't converged yet? 

If you use the lowest-level entry point to NIMBLE's MCMC engine, the run() method of the compiled MCMC object, then at that level there's the ability to "continue" an on-going run using reset=FALSE.  Note the different syntax for extracting the posterior samples after using this method.


Rmodel <- nimbleModel(code, constants, data, inits)

conf <- configureMCMC(Rmodel)
Rmcmc <- buildMCMC(conf)
Cmodel <- compileNimble(Rmodel)
Cmcmc <- compileNimble(Rmcmc, project = Rmodel)

Cmcmc$run(10000)
samples <- as.matrix(Cmcmc$mvSamples)

## examine samples array, determine it hasn't converged yet...

## continue that same run of the MCMC for 50000 more iterations:
Cmcmc$run(50000, reset = FALSE)

samplesAll60000 <- as.matrix(Cmcmc$mvSamples)


Let me know if this answers your question.

Cheers,
Daniel





--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users+unsubscribe@googlegroups.com.
To post to this group, send email to nimble...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/nimble-users/9375bca7-b53b-4536-a4ef-275b2062649f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Perry de Valpine

unread,
Jun 28, 2018, 2:20:42 AM6/28/18
to Daniel Turek, cro...@ualberta.ca, nimble-users
Regarding parallelization, currently what you need to do is compile separate copies of the model and MCMC on each thread.  This is not ideal because that step can take some time.  We hope in the future to make it easier.

This is a brief answer but let me know if you need more details.

-Perry

To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.

To post to this group, send email to nimble...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/nimble-users/9375bca7-b53b-4536-a4ef-275b2062649f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.

To post to this group, send email to nimble...@googlegroups.com.

Andy Crosby

unread,
Jun 28, 2018, 12:12:35 PM6/28/18
to Perry de Valpine, Daniel Turek, nimble-users

Thank you both for the quick answers. So I guess it’s a tradeoff between compilation time and run time. I’ll keep working on it to see what I can do.

 

Best,

 

Andrew D. Crosby
Postdoctoral Fellow
Boreal Avian Modelling Project
Dept. of Biological Sciences
CW405 Biological Sciences Building
University of Alberta
Edmonton, AB T6G 2E9

 

Cat Sun

unread,
Mar 21, 2020, 3:34:18 AM3/21/20
to nimble-users
Hi,

I'm also interested in being able to send chains off to their own cores while also making sure that I can extend the chains if they end up not being long enough. I found code ( https://r-nimble.org/nimbleExamples/parallelizing_NIMBLE.html ) illustrating how to parallelize, and had some questions that I'm hoping someone can clarify:

1) What is the point of the 'seed' argument in the 'run_MCMC_allcode' function? It's not invoked in the parLapply call, but it seems necessary to avoid errors like " Error in checkForRemoteErrors(val) :  3 nodes produced errors; first error: unused argument (X[[i]])"

2) I cant seem to get the function to correctly return the compiled MCMC object, when I use the run() method in anticipation of wanting to extend the chains. I get empty slots. Is this to do with the active binding? See code and output below.

3) Does it even make sense to try to do this? If i have to bring the chains from the different cores back together to assess convergence and effective sample sizes, is it possible to get back to the cores and where they left off? Or do I somehow need to send the compiled model MCMC object back to each core with another custom function  and parLapply()... This latter option does not seem likely, as the code in the link for parallelization suggests that the model and MCMC building and compiling need to happen on the cores. 

My models dont take long to compile, especially relative to the chain lengths that are likely to be necessary. I just dont want to risk running the chains for not long enough, and then having to start from scratch again to run for longer.

any thoughts would be greatly appreciated - thanks!


-Cat


################
#### the code 
################

library(parallel)
this_cluster <- makeCluster(3)
set.seed(10120)
# Simulate some data
myData <- rgamma(1000, shape = 0.4, rate = 0.8)

# Create a function with all the needed code
run_MCMC_allcode <- function( seed,data) { # removing the seed argument here causes issues with the parLapply call that comes after
  library(nimble)
  
  myCode <- nimbleCode({
    a ~ dunif(0, 100)
    b ~ dnorm(0, 100)
    
    for (i in 1:length_y) {
      y[i] ~ dgamma(shape = a, rate = b)
    }
  })
  
  myModel <- nimbleModel(code = myCode,
                         data = list(y = data),
                         constants = list(length_y = 1000),
                         inits = list(a = 0.5, b = 0.5))
 

 # Instead of this, since runMCMC does not allow for chain extension... 
  # CmyModel <- compileNimble(myModel)
  # myMCMC <- buildMCMC(CmyModel)
  # CmyMCMC <- compileNimble(myMCMC)
  # results <- runMCMC(CmyMCMC, niter = 10) # but there is no issue when removing setSeed = seed 
  # return(results)
  # 
  # 

#and instead have this:
  conf <- configureMCMC(myModel)
  conf$addMonitors(c("a","b"))
  Rmcmc <- buildMCMC(conf)
  Cmodel <- compileNimble(myModel,showCompilerOutput = TRUE)
  Cmcmc <- compileNimble(Rmcmc, project = myModel,showCompilerOutput = TRUE)
  Cmcmc$run(20)
  return(Cmcmc)
}

chain_output <- parLapply(cl = this_cluster, X = 1:3, 
                          fun = run_MCMC_allcode, 
                          data = myData)

stopCluster(this_cluster)

####### the output

> chain_output[[1]]$mvSamples
CmodelValues object with variables: a, b.

> as.matrix(chain_output[[1]]$mvSamples)
Error in .Call(NULL, <pointer: (nil)>) : 
  first argument must be a string (of length 1) or native symbol reference


















--

To unsubscribe from this group and stop receiving emails from it, send an email to nimble...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.

To unsubscribe from this group and stop receiving emails from it, send an email to nimble...@googlegroups.com.

Adrian Monroe

unread,
Mar 21, 2020, 11:53:00 AM3/21/20
to Cat Sun, nimble-users
Hi Cat,

To answer your third question, see below for some code to check on your model, and then keep sampling more iterations until convergence. If anyone sees areas for improvement I would be grateful!

Best,
Adrian

library(parallel)
library(coda)
set.seed(22)
nc <- 3 # number of chains
cl<-makeCluster(nc,timeout=5184000)
inits <- function() {list(a = rnorm(1), b = rnorm(1))}
nimbledata <- list(y = data)
nimbleconstants <- list(length_y = 1000)
params <- c("a", "b")
clusterExport(cl, c("myCode", "inits", "nimbledata", "nimbleconstants", "params"))
for (j in seq_along(cl)) {
  set.seed(j)
  init <- inits()
  clusterExport(cl[j], "init")
}
out <- clusterEvalQ(cl, {
  library(nimble)
  library(coda)
  model <- nimbleModel(code = myCode, name = "myCode",
                           constants = nimbleconstants, data = nimbledata,
                           inits = init)
  Cmodel <- compileNimble(model)
  modelConf <- configureMCMC(model)
  modelConf$addMonitors(params)
  modelMCMC <- buildMCMC(modelConf)
  CmodelMCMC <- compileNimble(modelMCMC, project = myCode)
  out1 <- runMCMC(CmodelMCMC, niter = 10000)
  return(as.mcmc(out1))
})

out.mcmc <- as.mcmc(out)
traceplot(out.mcmc[, "b"]

## If has not converged, continue sampling
start <- Sys.time()
out2 <- clusterEvalQ(cl, {
  out1 <- runMCMC(CmodelMCMC, niter = 20000)
  return(as.mcmc(out1))
})

out.mcmc.update1 <- as.mcmc(out2)

out.mcmc.bind <- mcmc.list()
for (i in seq_len(nc)) {
out.mcmc.bind[[i]] <- mcmc(rbind(out.mcmc[[i]], out.mcmc.update1[[i]]))
}
traceplot(out.mcmc.bind[, "b"]

To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/nimble-users/4fa0c5c5-9fc3-492c-b21f-50f6c5f3185a%40googlegroups.com.

Cat Sun

unread,
Mar 21, 2020, 3:50:01 PM3/21/20
to nimble-users
Hi Adrian - yes, that worked fantastically, thank you! 

Interested to hear if anybody has clarity on questions 1 and 2, just for curiosity's sake.

Cheers,
To unsubscribe from this group and stop receiving emails from it, send an email to nimble...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/nimble-users/4fa0c5c5-9fc3-492c-b21f-50f6c5f3185a%40googlegroups.com.

Chris Paciorek

unread,
Mar 24, 2020, 2:46:57 PM3/24/20
to Cat Sun, nimble-users
Hi Cat,

1) In the `parLapply` call, the values of `X` are the values used for
the `seed` function. So it is used.

2) Passing around NIMBLE's compiled objects is not something that will
work in general at the moment because of how the C++ objects are set
up and interfaced to R.
We are working (long-term) on improving NIMBLE's ability to save and
reload models and nimbleFunctions, which should ultimately address
this.

3) Adrian's solution looks quite nice. Thanks!

Chris
> To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/nimble-users/dfdf70b6-3291-48d5-904a-ab41ecbbdec0%40googlegroups.com.

Adrian Monroe

unread,
Mar 28, 2020, 2:14:01 PM3/28/20
to oliver...@gmail.com, nimble-users
Hi Ollie,

You bring up a good point, and I am sharing this with the rest of the nimble group to clarify. It seems indeed according to chapter 7 of the nimble help (https://r-nimble.org/html_manual/cha-mcmc.html#sec:runMCMC) that runMCMC does NOT continue an MCMC run where it left off previously. I must have missed that, and I have modified the code for sampling and updating in parallel is below. Note the update returns all iterations from the start, not just the additional samples. Interestingly, when I used runMCMC to update, it didn't look to me like it was completely starting over, but instead was sampling from distributions it had reached previously (which is probably why I assumed it picked up from the previous run). If any of the nimble developers have some insights on this, and why reset = FALSE is not the default, I would be interested.

Best,
Adrian

out1 <- clusterEvalQ(cl, {
  library(nimble)
  library(coda)
  model <- nimbleModel(code = myCode, name = "model",

                           constants = nimbleconstants, data = nimbledata,
                           inits = init)
  Cmodel <- compileNimble(model)
  modelConf <- configureMCMC(model)
  modelConf$addMonitors(params)
  modelMCMC <- buildMCMC(modelConf)
  CmodelMCMC <- compileNimble(modelMCMC, project = model)
  CmodelMCMC$run(10000, reset = FALSE)
  return(as.mcmc(as.matrix(CmodelMCMC$mvSamples)))
})

out.mcmc <- as.mcmc(out1)
traceplot(out.mcmc[, "b"]

## If has not converged, continue sampling
start <- Sys.time()
out2 <- clusterEvalQ(cl, {
  CmodelMCMC$run(20000, reset = FALSE)
  return(as.mcmc(as.matrix(CmodelMCMC$mvSamples)))
})

out.mcmc.update1 <- as.mcmc(out2) 
 

On Sat, Mar 28, 2020 at 6:07 AM orwearn <oliver...@gmail.com> wrote:
Hi Adrian

I'm implementing your code solution - thanks for posting it! I had a quick question, though. The Nimble help page (in Section 7.4says 

However, using runMCMC does not support several lower-level options, such as...continuing an existing MCMC run (picking up where it left off)

So what is happening in your code where you continue sampling? Is it starting from the beginning again, and therefore includes burn-in that might need removing?

Thanks for any help with this, I'm just getting started with nimble
Ollie

orwearn

unread,
Mar 29, 2020, 8:55:50 AM3/29/20
to nimble-users
Hi Adrian (and all), 

Thanks for updating your code example. I have a follow-up query on it:

Am I right in thinking that R cannot be closed between the initial run and the update in your code, because it depends on the same cluster ('cl') being open? 

Is there a way of saving, exiting and then updating with more iterations at a later time (as possible in runjags etc.)? (I am running NIMBLE on Google Cloud, wherein it's difficult for me to keep R open for 3+ weeks...). 

Many thanks for any pointers from anyone, I'm only just starting to transition over from JAGS
Ollie

PS If there's a way to add a progress bar (e.g. just for node 1), that would also be a dream...! But not to worry if that isn't possible.

Adrian Monroe

unread,
Apr 1, 2020, 12:56:24 PM4/1/20
to orwearn, nimble-users
Hi Ollie,

Your connection to the clusters will be lost if you shut down R. That said, unless I am missing something, I don't see why you couldn't use clusterEvalQ() to return the necessary nimble objects from each cluster (such as in a list), save these nimble objects, then export the nimble objects back to individual clusters in another R session (looping clusterExport() through each cluster) and continue updating with more iterations. I am curious to see if that works, so please let me know how it goes.

I don't know about a progress bar, and the run times for each chain can be different so I don't know what one would really tell you. 

Best,
Adrian

To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/nimble-users/fd4c56ae-fa02-4702-818d-f76ac52467d4%40googlegroups.com.

Daniel Turek

unread,
Apr 2, 2020, 3:39:15 PM4/2/20
to Adrian Monroe, orwearn, nimble-users
To add just a few things:

1) I'm not sure if there's any way to have the progress bar printed from one cluster node.  I agree, that would be nice.

2) Even if you return the compiled object(s) from one cluster (looping over clusterExport(), as was suggested), these compiled objects still could not be saved (to be reloaded subsequently into a new R session) due to the current nature of NIMBLE's compilation and the resulting objects.  As I believe was mentioned, addressing this is a long-term goal for NIMBLE.

3) Towards that end - being able to start a new R session, and resume executing an MCMC chain precisely where it left off - I wrote a vignette about "Saving Model and MCMC State", which demonstrates how to save the model and MCMC state after executing an MCMC algorithm.  This is only demonstrated for a single chain, but could be generalized for the multi-chain case.  See the vignette here:

Hope this helps!
Daniel




Daniel Eacker

unread,
Jan 18, 2022, 2:36:39 PM1/18/22
to nimble-users
Hi,

A bit late to add this, but I just wanted to add that I don't think this is correct for setting the seed in parallel processing using the parallel package:

1) In the `parLapply` call, the values of `X` are the values used for
the `seed` function. So it is used.

I've tried this and it produces different results across MCMC simulations.

What I've found that works for repeatable results using parallel processing is to add this just after making the cluster:

e.g.,

this_cluster <- makeCluster(2)

clusterSetRNGStream(cl = this_cluster, 500) # note that 500 could be any integer.

So, I don't think the X in parLapply actually does anything to produce repeatable results, whereas using this builtin function of the parallel package provides reproducible results.

Cheers,

Dan

Chris Paciorek

unread,
Jan 20, 2022, 4:57:06 PM1/20/22
to Daniel Eacker, nimble-users
hi Dan,

Thanks for commenting here.

I've checked the example and I do get reproducible results using the code provided. I'm wondering if you modified it such that you are creating random initial values? If so, you'd need to use set.seed before you generate the initial values within the run_MCMC_allcode function.

One other note is that the approach in that example relies on simply setting different seeds for the different chains. While this is probably safe, it is possible that the random numbers generated for one chain could partially overlap the random numbers generated for another chain. Even if this happened, I don't think it would cause substantive dependence between the chains. But your approach of using clusterSetRNGStream has the nice feature of ensuring the sequences of random numbers for each of the worker processes won't overlap.

-chris

Quresh Latif

unread,
Aug 15, 2022, 1:23:43 PM8/15/22
to nimble-users
Hello all. I realize this is an older post, but thought I'd share that I was able to successfully build on Adrian's code. The code below implements parallel processing, checks Rhats for specified parameters, and if they are too low, it keeps resuming until Rhats meet the desired threshold. The calculation of Rhats leverages the 'mcmcOutput' package. Perhaps others may find this useful.

    #~~~~~~~~~Parallel processing code (won't work in Windows)~~~~~~~~~#
    library(parallel)
    library(coda)
    set.seed(3)
    cl<-makeCluster(nc, timeout = 5184000)
    clusterExport(cl, c("model", "inits", "data", "constants", "parameters", "ni"))

    for (j in seq_along(cl)) {
      set.seed(j)
      init <- inits()
      clusterExport(cl[j], "init")
    }
    out1 <- clusterEvalQ(cl, {
      library(nimble)
      library(coda)
      model <- nimbleModel(code = model, name = "model",
                           
                           constants = constants, data = data,

                           inits = init)
      Cmodel <- compileNimble(model)
      modelConf <- configureMCMC(model)
      modelConf$addMonitors(parameters)

      modelMCMC <- buildMCMC(modelConf)
      CmodelMCMC <- compileNimble(modelMCMC, project = model)
      CmodelMCMC$run(ni, reset = FALSE)
      return(as.mcmc(as.matrix(CmodelMCMC$mvSamples)))
    })
   
    ## Check convergence ##
    out2 <- out1
    out2[[1]] <- out2[[1]][seq(nb+2, ni+1, by = nt),]
    out2[[2]] <- out2[[2]][seq(nb+2, ni+1, by = nt),]
    out.mcmc <- coda::as.mcmc.list(lapply(out2, coda::as.mcmc))
    #traceplot(out.mcmc[, "b"])
    library(mcmcOutput)
    #if(nc > 1) mod.raw <- coda::as.mcmc.list(lapply(out$samples, coda::mcmc))
    mod <- mcmcOutput(out.mcmc)
    sumTab <- summary(mod, MCEpc = F, Rhat = T, n.eff = T, f = T, overlap0 = T, verbose = F)
    sumTab <- sumTab %>%
      as_tibble() %>%
      mutate(Parameter = row.names(sumTab)) %>%
      select(Parameter, mean:f)
    #MCMCvis::MCMCtrace(out.mcmc, params = "DELTA1[114, 2]", pdf = F, ISB = F)
    ind.Rht <- which(!str_detect(sumTab$Parameter, "test.n") &
                       !str_detect(sumTab$Parameter, "M.save") &
                       !str_detect(sumTab$Parameter, "pXtheta") &
                       !str_detect(sumTab$Parameter, "Ind") &
                       !str_detect(sumTab$Parameter, "ind"))
    mxRht <- sumTab %>% slice(ind.Rht) %>% pull(Rhat) %>% max(na.rm = T)
    #sumTab %>% slice(ind.Rht) %>% arrange(desc(Rhat)) %>% View()

   
    ## If has not converged, continue sampling
    n.reruns <- 0
    while(round(mxRht, digits = 1) > 1.1) {
      out2 <- clusterEvalQ(cl, {
        CmodelMCMC$run(ni, reset = FALSE)
        return(as.mcmc(as.matrix(CmodelMCMC$mvSamples)))
      })
     
      n.reruns <- n.reruns + 1
      print(str_c("Reruns = ", n.reruns))
      if(n.reruns == 1) {
        nb2 <- nb + nb
        ni2 <- ni + ni
      } else {
        nb2 <- nb2 + nb
        ni2 <- ni2 + ni
      }
      out2[[1]] <- out2[[1]][seq(nb2+2, ni2+1, by = nt),]
      out2[[2]] <- out2[[2]][seq(nb2+2, ni2+1, by = nt),]
      out.mcmc.update <- coda::as.mcmc.list(lapply(out2, coda::as.mcmc))
     
      mod <- mcmcOutput(out.mcmc.update)
      sumTab <- summary(mod, MCEpc = F, Rhat = T, n.eff = T, f = T, overlap0 = T, verbose = F)
      sumTab <- sumTab %>%
        as_tibble() %>%
        mutate(Parameter = row.names(sumTab)) %>%
        select(Parameter, mean:f)
      #MCMCvis::MCMCtrace(out.mcmc, params = "DELTA1[114, 2]", pdf = F, ISB = F)
      mxRht <- sumTab %>% slice(ind.Rht) %>% pull(Rhat) %>% max(na.rm = T)
      gc(verbose = F)
    }
    #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#

Quresh Latif

unread,
Aug 23, 2022, 12:17:16 PM8/23/22
to nimble-users
One trade-off I am discovering with the above code is that because it uses the base function 'CModelMCMC$run' to allow resumption of sampling after checking convergence, I can't thin the samples as I go. I can thin after recovering the samples to save hard drive space, but the primary sampler does not allow thinning (or burn-in), which means that I end up running out of RAM before I reach convergence. For the time being, it seems like this code is useful at the initial stage of exploring a model, but once I have a sense of how long I'm going to need to run the sampler to get to convergence, I am probably going to have to switch to using the higher level 'nimbleMCMC' function, so that I can implement burn-in and thinning for better memory management.

It might be nice for future versions of NIMBLE if either thinning/burn-in could be included in the base function (...$run) or if an option for resuming sampling could be incorporated into higher level functions that also allow thinning and burn-in.

Daniel Turek

unread,
Aug 23, 2022, 12:24:59 PM8/23/22
to Quresh Latif, nimble-users
Quresh, just set the thinning interval in the MCMC configuration object (using conf$setThin(INTERVAL), I believe), doing that before you build and compile the MCMC.  Then, even using mcmc$run(ITERATIONS) will respect the thinning interval, and only record those samples internally.

Quresh Latif

unread,
Aug 23, 2022, 12:38:00 PM8/23/22
to nimble-users
Ah, perfect! Thanks for pointing that out. I also see that you can set thinning when actually configuring the MCMC using the 'thin' argument (e.g., configureMCMC(Rmodel, thin = 10)). I don't see an argument for setting burn-in, but maybe I'm missing it? In any case, the option for setting thinning in the primary sampler is going to help a lot. Thank you!

Daniel Turek

unread,
Aug 23, 2022, 12:44:41 PM8/23/22
to Quresh Latif, nimble-users
Great, glad it helps.

Quresh Latif

unread,
Aug 23, 2022, 12:56:31 PM8/23/22
to nimble-users
Is it possible or are there plans to allow the user to pull the saved samples out of CModelMCMC and just keep the last sample along with mcmc parameters at that step (e.g., jump probabilities, etc.)? The idea would be to save the samples to the hard drive and keep on sampling without relying on RAM to store sampled iterations. This is what the `saveJAGS` package allows for JAGS, but JAGS is clearly a lot slower, so I'm wondering if there's any way to do this with NIMBLE.

Daniel Turek

unread,
Aug 23, 2022, 2:27:17 PM8/23/22
to Quresh Latif, nimble-users
Quresh, I think we have this functionality pretty well built-in, although I'm not sure how well documented this is... if it's documented at all.

Try:

## build and compile model and MCMC
mcmc$run(iterations)
samples <- as.matrix(mcmc$mvSamples)
## save samples to disk
mcmc$run(more_iterations, reset = FALSE, resetMV = TRUE)   ## resets the internal mvSamples object, but leaves everything else the same
more_samples <- as.matrix(mcmc$mvSamples)




Quresh Latif

unread,
Aug 23, 2022, 2:37:09 PM8/23/22
to nimble-users
Oh wow! Another game changer. I searched for 'resetMV' in the NIMBLE user manual and came up with nothing, so it's not documented in there. I'm not sure how to access the help file (or if there is one) for the ...$run function. In any case, I'll write some code that saves the samples to disk and post it here. Thanks so much!

Daniel Turek

unread,
Aug 23, 2022, 3:07:24 PM8/23/22
to Quresh Latif, nimble-users
You're welcome!  It makes me happy to hear "another game changer".

It looks like it's only documented using help(buildMCMC) --- at least it's documented somewhere, I suppose.

Take a look for yourself, but I'll copy that documentation here:

     ‘resetMV’: Boolean specifying whether to begin recording posterior
     sample chains anew. This argument is only considered when using
     ‘reset = FALSE’.  Specifying ‘reset = FALSE, resetMV = TRUE’
     allows the MCMC algorithm to continue running from where it left
     off, but without appending the new posterior samples to the
     already existing samples, i.e. all previously obtained samples
     will be erased. This option can help reduce memory usage during
     burn-in (default = FALSE).





Quresh Latif

unread,
Aug 23, 2022, 3:11:58 PM8/23/22
to nimble-users

I do see the documentation there. Thanks for pointing it out! Working on upgrading my code now….

 

Quresh S. Latif 
Research Scientist
Bird Conservancy of the Rockies

Phone: (970) 482-1707 ext. 15

www.birdconservancy.org

 

From: Daniel Turek <db...@williams.edu>
Sent: Tuesday, August 23, 2022 1:07 PM
To: Quresh Latif <quresh...@birdconservancy.org>
Cc: nimble-users <nimble...@googlegroups.com>
Subject: Re: Run parallel chains and update model

 

You're welcome!  It makes me happy to hear "another game changer".

 

On Wed, Apr 1, 2020 at 12:56 PM Adrian Monroe <apm...@gmail.com> wrote:

Hi Ollie,

 

Your connection to the clusters will be lost if you shut down R. That said, unless I am missing something, I don't see why you couldn't use clusterEvalQ() to return the necessary nimble objects from each cluster (such as in a list), save these nimble objects, then export the nimble objects back to individual clusters in another R session (looping clusterExport() through each cluster) and continue updating with more iterations. I am curious to see if that works, so please let me know how it goes.

 

I don't know about a progress bar, and the run times for each chain can be different so I don't know what one would really tell you. 

 

Best,

Adrian

 

On Sun, Mar 29, 2020 at 6:55 AM orwearn <oliver...@gmail.com> wrote:

Hi Adrian (and all), 

 

Thanks for updating your code example. I have a follow-up query on it:

 

Am I right in thinking that R cannot be closed between the initial run and the update in your code, because it depends on the same cluster ('cl') being open? 

 

Is there a way of saving, exiting and then updating with more iterations at a later time (as possible in runjags etc.)? (I am running NIMBLE on Google Cloud, wherein it's difficult for me to keep R open for 3+ weeks...). 

 

Many thanks for any pointers from anyone, I'm only just starting to transition over from JAGS

Ollie

 

PS If there's a way to add a progress bar (e.g. just for node 1), that would also be a dream...! But not to worry if that isn't possible.

On Sunday, March 29, 2020 at 1:14:01 AM UTC+7, Adrian Monroe wrote:

Hi Ollie,

 

You bring up a good point, and I am sharing this with the rest of the nimble group to clarify. It seems indeed according to chapter 7 of the nimble help (https://link.edgepilot.com/s/2d4acb37/e08YctqZB0a2BWthtIRhPg?u=https://r-nimble.org/html_manual/cha-mcmc.html%23sec:runMCMC) that runMCMC does NOT continue an MCMC run where it left off previously. I must have missed that, and I have modified the code for sampling and updating in parallel is below. Note the update returns all iterations from the start, not just the additional samples. Interestingly, when I used runMCMC to update, it didn't look to me like it was completely starting over, but instead was sampling from distributions it had reached previously (which is probably why I assumed it picked up from the previous run). If any of the nimble developers have some insights on this, and why reset = FALSE is not the default, I would be interested.

Hi,

 

I'm also interested in being able to send chains off to their own cores while also making sure that I can extend the chains if they end up not being long enough. I found code ( https://link.edgepilot.com/s/5efceaac/RtQw4wQLCESzDc-vZHMF5Q?u=https://r-nimble.org/nimbleExamples/parallelizing_NIMBLE.html ) illustrating how to parallelize, and had some questions that I'm hoping someone can clarify:

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble...@googlegroups.com.
To post to this group, send email to nimble...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.

Quresh Latif

unread,
Aug 24, 2022, 6:08:57 PM8/24/22
to nimble-users
Hi all. I modified the code above to include all of the desired functionality noted in previous posts (parallel processing, resuming after checking convergence, thinning the primary sampler in each cluster, saving to the hard drive to avoid overloading RAM). The code appears to be working well in the script I have running for my particular project, but when I try to put the code in an R function (see attached), I am getting an error on line 17 ("Error in get(name, envir = envir) : object 'init' not found"). Not sure if this is a fixable problem, or if this is just not a task that works from within a function. Any insight would be appreciated. In the meantime, perhaps the code contained within the attached (not yet operational) function will prove useful to folks here.
RunNimbleParallel.R

Chris Paciorek

unread,
Aug 24, 2022, 8:44:39 PM8/24/22
to Quresh Latif, nimble-users
Hi Quresh,

I just looked in the documentation for clusterExport and it indicates that the default location where variables (such as 'init') are looked for is in the global environment:

  ‘clusterExport’ assigns the values on the master R process of the
     variables named in ‘varlist’ to variables of the same names in the
     global environment (aka ‘workspace’) of each node.  The
     environment on the master from which variables are exported
     defaults to the global environment.

So it looks like you'll have to modify the `envir` argument to tell clusterExport to look in the local environment of the function call to find init.

Or you could just export all the inits to all the workers and then on the worker side determine which element of the inits to use.

-Chris

Quresh Latif

unread,
Aug 25, 2022, 1:49:51 PM8/25/22
to nimble-users
Great! Thanks for sleuthing that out for me! The attached version of the function is now working for me. I just replaced the '<-' with '<<-' on line 17. Hope this ends up being useful to others.
RunNimbleParallel.R

Quresh Latif

unread,
Sep 29, 2022, 1:57:43 PM9/29/22
to nimble-users
Hi all. In case anyone is or is considering trying my function out, I have found some errors and made some revisions. Here is the most up to date version. Feel free to post issues there if you find any problems.

Quresh Latif

unread,
Nov 2, 2022, 10:28:39 AM11/2/22
to nimble-users
Hi all. I've been using the parallel processing code posted in this thread for some time now, wherein I check for convergence periodically and resume sampling as needed. I am wondering if there would be any effect of periodic interruption and resumption of sampling on adaptation. Does the information used to adapt jump probabilities reset every time I interrupt and resume? If so, would it be better to let the sampler run longer when I know from initial trials that it will require a longer run, or would adaptation be the same regardless of how frequently sampling is interrupted?

Daniel Turek

unread,
Nov 2, 2022, 4:15:08 PM11/2/22
to Quresh Latif, nimble-users
Quresh, the very short answer is:

(1) if you are using runMCMC(compiled_mcmc_object, n_iterations), then adaptation is being reset every time you "interrupt and restart" the MCMC in this manner.

(2) if you are using compiled_mcmc_object$run(n_iterations, reset = FALSE), then the adaptive parameters are *not* reset, upon "restarting" everything picks up precisely where it left off, and it doesn't matter how many times you "interrupt and restart" the MCMC in this manner, it will be identical to an "uninterrupted" run of the same total length.

Hope this helps,
Daniel




Quresh Latif

unread,
Nov 2, 2022, 4:22:46 PM11/2/22
to nimble-users

Perfect, thank you!

 

Quresh S. Latif 
Research Scientist
Bird Conservancy of the Rockies

Phone: (970) 482-1707 ext. 15

www.birdconservancy.org

 

From: nimble...@googlegroups.com <nimble...@googlegroups.com> On Behalf Of Daniel Turek
Sent: Wednesday, November 2, 2022 2:14 PM
To: Quresh Latif <quresh...@birdconservancy.org>
Cc: nimble-users <nimble...@googlegroups.com>
Subject: Re: Run parallel chains and update model

 

Quresh, the very short answer is:

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.

--
You received this message because you are subscribed to a topic in the Google Groups "nimble-users" group.
To unsubscribe from this topic, visit https://link.edgepilot.com/s/8154d790/AQlDNgDryUejIidZGXvmMQ?u=https://groups.google.com/d/topic/nimble-users/RHH9Ybh7bSI/unsubscribe.
To unsubscribe from this group and all its topics, send an email to nimble-users...@googlegroups.com.
To view this discussion on the web visit https://link.edgepilot.com/s/d28140a6/JHrNdPU_bESfqX4s_f04FQ?u=https://groups.google.com/d/msgid/nimble-users/CAKbe0hqzPwNMGzt9Tzo8xmRSF11wT6Aechuhxuje6GOYQ2cN%253Dw%2540mail.gmail.com.

Quresh Latif

unread,
Dec 25, 2022, 3:30:25 PM12/25/22
to nimble-users
Hi all. I have run into a headscratcher with the parallel processing function that I wrote and posted in this thread. I am trying to run a zero-inflated dynamic community abundance model. The model runs fine with 'nimbleMCMC', but when I try to run it using ' RunNimbleParallel' function, I get the following error:

"Error in checkForRemoteErrors(lapply(cl, recvResult)) : 2 nodes produced errors; first error: Could not evaluate loop syntax: is indexing information provided via 'constants'?"

A zip file containing a somewhat stripped down version of the model (no covariates), along with data in an R workspace and scripts for running the model both with 'nimbleMCMC' and 'RunNimbleParallel' is posted here. I suspect it has something to do with the multi-species aspect of the model since a single species version runs just fine, but it's not yet clear to me what's wrong with my indexing, especially since the model runs fine using 'nimbleMCMC'. I am still working on stripping the model down further to see if I can clearly isolate the component that isn't incompatible with my parallelization function, but just thought I'd post what I've got so far in case anyone has the time and interest to help me troubleshoot. I'd especially be curious to know if anyone else is able to run this model using the 'RunNimbleParallel' function, or if they get the same error in their environment as I know that parallelization functions can be sensitive to the operating system.

Appreciate any thoughts or suggestions from anyone.

Quresh Latif

unread,
Dec 27, 2022, 5:16:28 PM12/27/22
to nimble-users
Well, once I sat down at my desk for real and my brain wasn't in vacation mode, I quickly found the error in my code. I had not included 'nspp' in my list of constants. The 'nspp' object specifies the number of species represented in the data, and thus the length of the species loop for defining the model. Given the bone-headed nature of my mistake, I now feel sheepish posting about it here, except that it's still puzzling why the model ran with 'nimbleMCMC' without throwing an error. I also was able to run the model using the more base set of NIMBLE functions that are embedded within 'RunNimbleParallel' (i.e., nimbleModel, compileNimble, configureMCMC, etc. implemented in lines 24 to 32 in the 'RunNimbleParallel' function). The model and samplers apparently get compiled and run just fine with no errors even though I am missing 'nspp' in my list of constants. Moreover, when I check the model after defining it using the 'nimbleModel' function by running 'model$calculate()', I get a number rather than NA or Inf, so the log-probability appears fully defined despite not having specified the length of the species loop. Curious if there are any insights out there as to why I only get an error with this mistake when attempting parallel processing.

Chris Paciorek

unread,
Jan 2, 2023, 6:24:04 PM1/2/23
to Quresh Latif, nimble-users
Hi Quresh,

It's probably picking up the value of `nspp` from a variable named `nspp` in your R global environment when it is not found in `constants`. That variable wouldn't be present in the worker processes, hence the error when parallelizing.

In some cases, we've been careful to prevent objects from being found via R's scoping rules, but in this case and some other cases, objects will be found by R's scoping rules if you don't provide them in the usual 'nimble' manner. In this case, it might be better if nimble errored out -- it's something that's been in the back of our minds for a while.

-chris

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.

Quresh Latif

unread,
Jan 2, 2023, 6:36:30 PM1/2/23
to nimble-users
Gotcha, that makes sense. Thanks!

Quresh Latif

unread,
Jan 23, 2023, 12:08:01 PM1/23/23
to nimble-users
I am starting to thinking a little more about how to build on the work documented in this thread, one piece of functionality that would be nice to add is to be able to resume sampling from where one has left off if interrupted (e.g., due to a power outage or automatic Windows update). Is it possible to save the entire MCMC sampling state to the hard drive such that one can resume sampling from that point rather than starting from scratch?

Daniel Turek

unread,
Jan 24, 2023, 9:17:29 AM1/24/23
to Quresh Latif, nimble-users
Queresh, good question.  The short answer is there's no built-in mechanism for doing this in NIMBLE's MCMC, but there have been two attempts at documenting the procedure you're asking about.  I think the second of these is a little better, but please take a look at both:


Let me know if you have any questions about this.

Cheers,
Daniel


Quresh Latif

unread,
Apr 6, 2023, 11:37:01 AM4/6/23
to nimble-users
Question about specifying monitors. In my function, I am specifying monitors using  modelConf$addMonitors(parameters). It appears that monitors are added for parameters that contain all strings in `parameters`. For example, if I have two parameters appearing in my model, "sigma" and "tvar_sigma", and I include "sigma" as an element in `parameters`, monitors will be added for both "sigma" and "tvar_sigma" because both contain the string "sigma". Is this the way adding monitors is intended to function with Nimble, and is there a way to change this functionality to only add monitors for parameters whose names match exactly the elements in `parameters`?

Daniel Turek

unread,
Apr 6, 2023, 3:34:12 PM4/6/23
to Quresh Latif, nimble-users
Quresh, thanks for the question as always.

Bottom line, what you described should not be the case.  There is no "matching of all strings to all variables" that takes place when adding monitors to an MCMC configuration.  I believe something else must be causing whatever you're witnessing.

One possibility, I see that in your function, you first call:

modelConf <- configureMCMC(model, thin = nt)

It's possible that "tvar_sigma" is being added as a default monitor by configureMCMC -- unless otherwise specified, configureMCMC assigns monitors to all top-level non-data stochastic nodes in a model.  Then, later, when you use conf$addMonitors to add a monitor for "sigma", you now have monitors on both "sigma" (added manually) and "tvar_sigma" (added initially by configureMCMC by default).  Is this possibly the case?

I'll also point out functions:

conf$resetMonitors()       ## remove all current monitors
conf$setMonitors(params)   ## remove all current monitors, and add monitors for params

If this still doesn't resolve your question, can you send a reproducible example demonstrating the problem?

Thanks,
Daniel



Quresh Latif

unread,
Apr 10, 2023, 5:13:05 PM4/10/23
to Daniel Turek, nimble-users

Thanks so much! Switching to modelConf$setMonitors(params) did the trick. Makes perfect sense!

 

Quresh S. Latif 
Research Scientist
Bird Conservancy of the Rockies

Phone: (970) 482-1707 ext. 15

www.birdconservancy.org

 

From: Daniel Turek <danie...@gmail.com>
Sent: Thursday, April 6, 2023 1:34 PM
To: Quresh Latif <quresh...@birdconservancy.org>
Cc: nimble-users <nimble...@googlegroups.com>
Subject: Re: Run parallel chains and update model

 

Quresh, thanks for the question as always.

 

Bottom line, what you described should not be the case.  There is no "matching of all strings to all variables" that takes place when adding monitors to an MCMC configuration.  I believe something else must be causing whatever you're witnessing.

 

One possibility, I see that in your function, you first call:

 

modelConf <- configureMCMC(model, thin = nt)

 

It's possible that "tvar_sigma" is being added as a default monitor by configureMCMC -- unless otherwise specified, configureMCMC assigns monitors to all top-level non-data stochastic nodes in a model.  Then, later, when you use conf$addMonitors to add a monitor for "sigma", you now have monitors on both "sigma" (added manually) and "tvar_sigma" (added initially by configureMCMC by default).  Is this possibly the case?

 

I'll also point out functions:

 

conf$resetMonitors()       ## remove all current monitors

conf$setMonitors(params)   ## remove all current monitors, and add monitors for params

 

If this still doesn't resolve your question, can you send a reproducible example demonstrating the problem?

 

Thanks,

Daniel

 

 

On Thu, Apr 6, 2023 at 11:37 AM Quresh Latif <quresh...@birdconservancy.org> wrote:

Question about specifying monitors. In my function, I am specifying monitors using  modelConf$addMonitors(parameters). It appears that monitors are added for parameters that contain all strings in `parameters`. For example, if I have two parameters appearing in my model, "sigma" and "tvar_sigma", and I include "sigma" as an element in `parameters`, monitors will be added for both "sigma" and "tvar_sigma" because both contain the string "sigma". Is this the way adding monitors is intended to function with Nimble, and is there a way to change this functionality to only add monitors for parameters whose names match exactly the elements in `parameters`?

On Tuesday, January 24, 2023 at 7:17:29 AM UTC-7 Daniel Turek wrote:

Queresh, good question.  The short answer is there's no built-in mechanism for doing this in NIMBLE's MCMC, but there have been two attempts at documenting the procedure you're asking about.  I think the second of these is a little better, but please take a look at both:

 

 

Let me know if you have any questions about this.

 

Cheers,

Daniel

 

 

On Mon, Jan 23, 2023 at 12:08 PM Quresh Latif <quresh...@birdconservancy.org> wrote:

I am starting to thinking a little more about how to build on the work documented in this thread, one piece of functionality that would be nice to add is to be able to resume sampling from where one has left off if interrupted (e.g., due to a power outage or automatic Windows update). Is it possible to save the entire MCMC sampling state to the hard drive such that one can resume sampling from that point rather than starting from scratch?

On Monday, January 2, 2023 at 4:36:30 PM UTC-7 Quresh Latif wrote:

Gotcha, that makes sense. Thanks!

On Monday, January 2, 2023 at 4:24:04 PM UTC-7 Christopher Paciorek wrote:

Hi Quresh,

 

It's probably picking up the value of `nspp` from a variable named `nspp` in your R global environment when it is not found in `constants`. That variable wouldn't be present in the worker processes, hence the error when parallelizing.

 

In some cases, we've been careful to prevent objects from being found via R's scoping rules, but in this case and some other cases, objects will be found by R's scoping rules if you don't provide them in the usual 'nimble' manner. In this case, it might be better if nimble errored out -- it's something that's been in the back of our minds for a while.

 

-chris

 

On Tue, Dec 27, 2022 at 2:16 PM Quresh Latif <quresh...@birdconservancy.org> wrote:

Well, once I sat down at my desk for real and my brain wasn't in vacation mode, I quickly found the error in my code. I had not included 'nspp' in my list of constants. The 'nspp' object specifies the number of species represented in the data, and thus the length of the species loop for defining the model. Given the bone-headed nature of my mistake, I now feel sheepish posting about it here, except that it's still puzzling why the model ran with 'nimbleMCMC' without throwing an error. I also was able to run the model using the more base set of NIMBLE functions that are embedded within 'RunNimbleParallel' (i.e., nimbleModel, compileNimble, configureMCMC, etc. implemented in lines 24 to 32 in the 'RunNimbleParallel' function). The model and samplers apparently get compiled and run just fine with no errors even though I am missing 'nspp' in my list of constants. Moreover, when I check the model after defining it using the 'nimbleModel' function by running 'model$calculate()', I get a number rather than NA or Inf, so the log-probability appears fully defined despite not having specified the length of the species loop. Curious if there are any insights out there as to why I only get an error with this mistake when attempting parallel processing.

On Sunday, December 25, 2022 at 1:30:25 PM UTC-7 Quresh Latif wrote:

Hi all. I have run into a headscratcher with the parallel processing function that I wrote and posted in this thread. I am trying to run a zero-inflated dynamic community abundance model. The model runs fine with 'nimbleMCMC', but when I try to run it using ' RunNimbleParallel' function, I get the following error:

 

"Error in checkForRemoteErrors(lapply(cl, recvResult)) : 2 nodes produced errors; first error: Could not evaluate loop syntax: is indexing information provided via 'constants'?"

 

A zip file containing a somewhat stripped down version of the model (no covariates), along with data in an R workspace and scripts for running the model both with 'nimbleMCMC' and 'RunNimbleParallel' is posted here. I suspect it has something to do with the multi-species aspect of the model since a single species version runs just fine, but it's not yet clear to me what's wrong with my indexing, especially since the model runs fine using 'nimbleMCMC'. I am still working on stripping the model down further to see if I can clearly isolate the component that isn't incompatible with my parallelization function, but just thought I'd post what I've got so far in case anyone has the time and interest to help me troubleshoot. I'd especially be curious to know if anyone else is able to run this model using the 'RunNimbleParallel' function, or if they get the same error in their environment as I know that parallelization functions can be sensitive to the operating system.

 

Appreciate any thoughts or suggestions from anyone.

 

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.

ak...@wisc.edu

unread,
Jun 2, 2023, 4:52:56 PM6/2/23
to nimble-users
Hey y'all, Thank you for this very helpful thread! I am adding onto it, because my question is relevant for running things in parallel using the approach/tricks here. I have a model with far too many user defined distributions and multiple nimble functions that I need to call within each of the parallel chain runs. In the past, when needing to do this, I've just defined nimble functions within clusterEvalQ(....) via copy and paste. However, given the scope of what I'm trying to fit, I'd like a more elegant solution. 

 I've tried to just source the external .R file in 

out1 <-  mcmc.list(clusterEvalQ(cl, {
  library(nimble)
  library(coda)

  source("script_with_distributions.R")
  source("script_with_funs.R")
...})
In those scripts I do assign the functions and user-defined distributions in the global environment. But this returns the error: 

3 nodes produced errors; first error: object of type 'closure' is not subsettable

I've also tried to "feed" in the names of the functions in the clusterExport(...) function a la:

clusterExport(cl, c("modelcode",
                    "initsFun",
                    "nimData",
                    "nimConsts",
                    "parameters",
                    "reps",
                    "n_thin",
                    "my_fun",
                    "dMyDistn"
                    ))

It returns the same: 3 nodes produced errors; first error: object of type 'closure' is not subsettable.
Lastly, I've tried within the clusterEvalQ() statement to include assign
  assign("my_fun", my_fun, envir = .GlobalEnv)
  assign("dMyDistn", dMyDistn, envir = .GlobalEnv)

Outside of the parallel compution, the model compiles and short test runs do appear to work. Any ideas for solving this? I'm assuming I'm missing something very straightforward. Thanks!

Chris Paciorek

unread,
Jun 5, 2023, 9:51:19 PM6/5/23
to ak...@wisc.edu, nimble-users
A few thoughts here...

It might work to export the uncompiled nimbleFunctions and then compile them as part of the clusterEvalQ, but I'd have to spend more time trying that out.

I would expect it to work to source a code file that defines the user-defined functions / distributions using `nimbleFunction()`. However I also get an error when trying to use `source` (though a different error than you are getting).

Given that, and that I don't have more time now to look into this, you might try the approach with `parLapply` presented in this example on our webpage. But if you've already seen that approach and it isn't what you want, let me know and I can try to dig deeper here.

-chris

ak...@wisc.edu

unread,
Jun 7, 2023, 4:34:22 PM6/7/23
to nimble-users
Hi Chris,

In typical style, I found my own error as shortly after I asked for help (I was calling an initial values function as opposed to the init object that had already been assigned using the initial values function via a previous clusterExport() call while setting the seed for each cluster. The method of calling source("myfun.R"), which is a script with all of the user-defined distributions (and functions) worked. Hope it helps others using externally defined distributions and functions. 

Thanks,
Alison

Abby Keller

unread,
Jun 26, 2024, 1:30:03 PMJun 26
to nimble-users
Hi!

Just following up on this thread with a question about saving the model state when running multiple chains in parallel. So far, I've been following Adrian's example (thanks BTW!!!):

cl <- makeCluster(8)

set.seed(10120)

clusterExport(cl, c("code", "inits", "data", "constants"))

for (j in seq_along(cl)) {
  set.seed(j)
  init <- inits()
  clusterExport(cl[j], "init")
}

out <- clusterEvalQ(cl, {
  library(nimble)

  # nimble functions here ...

  myModel <- nimbleModel(code = code,
                         data = data,
                         constants = constants,
                         inits = init)

  mcmcConf_myModel <- configureMCMC(myModel,
                                    useConjugacy=FALSE,
                                    enableWAIC=TRUE)

  #build MCMC
  myMCMC <- buildMCMC(mcmcConf_myModel)
 
  #compile the model and MCMC
  CmyModel <- compileNimble(myModel)
  #compile the MCMC
  cmodel_mcmc <- compileNimble(myMCMC,project=myModel)
 
  cmodel_mcmc$run(750000,thin=100,
                  reset=FALSE)
  return(as.mcmc(as.matrix(cmodel_mcmc$mvSamples)))
})


And then when trying to pick the MCMC back up:

out2 <- clusterEvalQ(cl, {
  cmodel_mcmc$run(500000,thin=100,
               reset = FALSE)
  return(as.mcmc(as.matrix(cmodel_mcmc$mvSamples)))
})

I get the error: 'Error in summary.connection(connection) : invalid connection'. This code previously has worked for me, and I have only recently started seeing this error. I think this is because I'm running on a shared server, and perhaps when nodes are not in use (i.e., in the time between when I notice the first run has completed and when I try to begin the second), they automatically get re-allocated before I stop the cluster with stopCluster(cl).

I'm thinking the solution to this would be applying Daniel's example of saving the model state, but in parallel. Before attempting, I just wanted to see if this would be the correct way of implementing:

To run the first time:

cl <- makeCluster(8)

set.seed(10120)

clusterExport(cl, c("code", "inits", "data", "constants"))

for (j in seq_along(cl)) {
  set.seed(j)
  init <- inits()
  clusterExport(cl[j], "init")
}

out1 <- clusterEvalQ(cl, {
  library(nimble)

  # nimble functions here ...

  # add functions for getting state
  getStateVariableNames <- function(samplerDef) {}
  getModelState <- function(model) {}
  getMCMCstate <- function(conf, mcmc) {}
  getWAICstate <- function(mcmc) {}

  myModel <- nimbleModel(code = code,
                         data = data,
                         constants = constants,
                         inits = init)

  mcmcConf_myModel <- configureMCMC(myModel,
                                    useConjugacy=FALSE,
                                    enableWAIC=TRUE)

  #build MCMC
  myMCMC <- buildMCMC(mcmcConf_myModel)
 
  #compile the model and MCMC
  CmyModel <- compileNimble(myModel)
  #compile the MCMC
  cmodel_mcmc <- compileNimble(myMCMC,project=myModel)
 
  cmodel_mcmc$run(750000,thin=100,
                  reset=FALSE)

  stateList <- list(modelState = getModelState(CmyModel),
                    mcmcState = getMCMCstate(mcmcConf_myModel, cmodel_mcmc),
                    waicState = getWAICstate(cmodel_mcmc))

  samples <- as.mcmc(as.matrix(cmodel_mcmc$mvSamples))
 
  return(list(stateList,samples))
})

# save
saveRDS(out1[[1]], file = 'savedstate1.rds')
saveRDS(out1[[2]], file = 'savedsamples1.rds')

stopCluster(cl)


And then in a new R session to pick up where I left off:

## load the saved "state" file
stateList <- readRDS(file = 'savedstate.rds')
cl <- makeCluster(8)

set.seed(10120)

clusterExport(cl, c("code", "inits", "data", "constants"))

for (j in seq_along(cl)) {
  set.seed(j)
  init <- inits()
  clusterExport(cl[j], "init")

  ### this is new
  clusterExport(stateList[[j]]$modelState, "modelState")
  clusterExport(stateList[[j]]$mcmcState, "mcmcState")
  clusterExport(stateList[[j]]$waicState, "waicState")
}
out2 <- clusterEvalQ(cl, {
  library(nimble)

  # nimble functions here ...

  # add functions for getting and saving state
  getStateVariableNames <- function(samplerDef) {}
  getModelState <- function(model) {}
  getMCMCstate <- function(conf, mcmc) {}
  getWAICstate <- function(mcmc) {}
  setModelState <- function(model, modelState) {}
  setMCMCstate <- function(conf, mcmc, mcmcState) {}
  setWAICstate <- function(mcmc, waicState) {}

  myModel <- nimbleModel(code = code,
                         data = data,
                         constants = constants,
                         inits = init)

  mcmcConf_myModel <- configureMCMC(myModel,
                                    useConjugacy=FALSE,
                                    enableWAIC=TRUE)

  #build MCMC
  myMCMC <- buildMCMC(mcmcConf_myModel)
 
  #compile the model and MCMC
  CmyModel <- compileNimble(myModel)
  #compile the MCMC
  cmodel_mcmc <- compileNimble(myMCMC,project=myModel)

  ## restore the saved "state" into the new model and new MCMC
  setModelState(CmyModel, modelState)
  setMCMCstate(mcmcConf_myModel, cmodel_mcmc, mcmcState)
  setWAICstate(cmodel_mcmc, waicState)
 
  cmodel_mcmc$run(750000,thin=100,
                  reset=FALSE)

  stateList <- list(modelState = getModelState(CmyModel),
                    mcmcState = getMCMCstate(mcmcConf_myModel, cmodel_mcmc),
                    waicState = getWAICstate(cmodel_mcmc))

  samples <- as.mcmc(as.matrix(cmodel_mcmc$mvSamples))
 
  return(list(stateList,samples))
})

# save
saveRDS(out2[[1]], file = 'savedstate2.rds')
saveRDS(out2[[2]], file = 'savedsamples2.rds')



Does this make sense?


Thanks so much!
Best,
Abby Keller


Abby Keller

unread,
Jun 26, 2024, 4:55:09 PMJun 26
to nimble-users
Hi again -- 

Sorry, I have an updated question. I think the following code (slightly different than above) works in terms of saving the model and mcmc states and adding back the model and mcmc states when run again in parallel. 

#########
# run 1 #
#########
# save model state
state <- list()
for(i in 1:8){
  state[[i]] <- out[[i]][1]
}
# save samples
samples <- list()
for(i in 1:8){
  samples[[i]] <- out[[i]][2]
}

saveRDS(state, 'savedstate1.rds')
saveRDS(samples, 'savedsamples1.rds')

stopCluster(cl)


#########
# run 2 #
#########


## load the saved "state" file
stateList <- readRDS(file = 'savedstate1.rds')

cl <- makeCluster(8)

set.seed(10120)

clusterExport(cl, c("code", "inits", "data", "constants"))

for (j in seq_along(cl)) {
  set.seed(j)
  init <- inits()
  clusterExport(cl[j], "init")

  ### this is new
  modelState <- stateList[[j]][[1]]$modelState
  mcmcState <- stateList[[j]][[1]]$mcmcState
  waicState <- stateList[[j]][[1]]$waicState
  clusterExport(cl[j], "modelState")
  clusterExport(cl[j], "mcmcState")
  clusterExport(cl[j], "waicState")
# save model state
state <- list()
for(i in 1:8){
  state[[i]] <- out[[i]][1]
}
# save samples
samples <- list()
for(i in 1:8){
  samples[[i]] <- out[[i]][2]
}

saveRDS(state, 'savedstate2.rds')
saveRDS(samples, 'savedsamples2.rds')

stopCluster(cl)

However, it's not clear to me if the actual samples are being saved in the MCMC state, such that the last MCMC sample for each dimension in the first run is being used to get the first MCMC sample in the second run. Would it make sense to set the initial values of the second run equal to the values of the last sample in the first run?

Thanks as always for your help!
Best,
Abby

Chris Paciorek

unread,
Jun 29, 2024, 10:51:44 AMJun 29
to Abby Keller, nimble-users
Hi Abby,

I'm not quite understanding your question. Restarting multiple chains in parallel should "just" involve carrying out the steps in Daniel's example but for multiple chains instead of one chain, including setting the state of the model (namely model variables, including logProb values) and the state of the MCMC (tuning parameters, if used, for the MCMC samplers). As long as the model has the right values in it, that's what's needed in terms of samples - one doesn't need anything further in terms of the "actual samples".

One other note -- in looking back at Daniel's example, it looks like the MCMC state that he works with is specific to the RW sampler (and perhaps some others that share adaptation behavior), but for other nimble samplers that have tuning parameters (e.g, RW_block or HMC), one would need to modify his code. But I haven't looked more closely, and he may have other comments on that front.

-chris

Abby Keller

unread,
Jul 1, 2024, 9:37:56 AMJul 1
to nimble-users
Hi Chris,

Thanks for your quick response, I appreciate it. And so sorry, I think I had answered my own question right after this post- I was expecting the last MCMC sample to be stored in the MCMC state, rather than in the model state. This makes a lot more sense now, and thanks for that clarification.

And thanks also for the heads up about block samplers. I'll look into that and see if anyone else has posted about it.

Thanks again.
Best,
Abby

Reply all
Reply to author
Forward
0 new messages