nimble and memory use


Henry Scharf

Aug 8, 2021, 7:56:19 PM
to nimble-users
Hello nimble users and developers,

Thanks very much for all your hard work building and maintaining nimble. Recently, I've been trying to conduct a large simulation study that involves refitting the same model to various datasets. The model is somewhat complex and a little slow to build/compile, but it behaves reasonably well for a single run on one dataset. However, when I try to repeatedly fit the model for the simulation study, R rapidly grows in its memory usage until it overwhelms my system's resources.

I've created what I think is a minimal working example of the phenomenon I'm encountering. The first loop below aligns with my understanding of how R's memory use works (i.e., removing variables and running garbage collection releases memory). The second loop shows the confusing behavior I'm encountering with nimble. I would expect memory usage to stay flat as in the first case, but it seems to grow linearly with each iteration. I'd be grateful for any insight.

Thanks,
Henry

library(pryr)

## Baseline: allocate a large vector each iteration, then remove everything
## except the tracking vector and garbage-collect. Memory use should stay
## flat across iterations.
mem_1 <- rep(NA, 10)
for (i in 1:length(mem_1)) {
  a <- rnorm(1e7)
  mem_1[i] <- mem_used()
  var_list <- ls(all.names = TRUE)
  rm(list = var_list[-which(var_list == "mem_1")])
  gc()
}
plot(mem_1)

## The puzzling case: build, compile, and run a small nimble model each
## iteration, then remove everything except the tracking vector and
## garbage-collect. Memory use grows roughly linearly instead of staying flat.
mem_2 <- rep(NA, 10)
for (i in 1:length(mem_2)) {
  library(nimble)
  code <- nimbleCode({
    mu ~ dnorm(mean = 0, sd = 1)
    for (i in 1:N) {
      y[i] ~ dnorm(mean = mu, sd = 1)
    }
  })
  constants <- list(N = 1e3)
  data <- list(y = rnorm(constants$N))
  model <- nimbleModel(code = code, constants = constants, data = data)
  Cmodel <- compileNimble(model)
  conf <- configureMCMC(model = model)
  mcmc <- buildMCMC(conf = conf)
  Cmcmc <- compileNimble(mcmc, project = model)
  Cmcmc$run(niter = 1e3)
  samples <- as.matrix(Cmcmc$mvSamples)
  mem_2[i] <- mem_used()
  var_list <- ls(all.names = TRUE)
  rm(list = var_list[-which(var_list == "mem_2")])
  gc()
}
plot(mem_2)

Perry de Valpine

Aug 9, 2021, 9:02:51 PM
to Henry Scharf, nimble-users
Hi Henry,

This is a tricky issue. We have built two features to try to reduce memory use, but it still seems that objects evade R's garbage collection. This may be because we have many reference class objects and environments, which can create closed loops of referenced objects. The two features are:

  nimbleOptions(clearNimbleFunctionsAfterCompiling = TRUE) # This can modestly reduce memory use

  nimble:::clearCompiled(model) # This attempts to clear all compiled content for the project related to model and to unload the on-the-fly compiled shared library used for it.

However, I tried both of these and they don't resolve the issue you're reporting. It's something we'll have to look into more; we have worked on this in the past, but this doesn't look like good behavior.

Here are some potential workarounds.

Within each loop, you could use system2() to call Rscript and launch a self-contained process; a sketch follows. This would make sense if you really need to do the full nimble building and compilation each time.
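
A minimal sketch of that idea, assuming a hypothetical driver script fit_one.R that builds, compiles, and fits the model for one dataset (selected by a command-line argument) and saves its results to disk:

## "fit_one.R" is a hypothetical script name. Each call launches a fresh R
## process, and all memory held by nimble's compiled objects is released
## when that process exits, so the parent session stays flat.
for (i in 1:10) {
  system2("Rscript", args = c("fit_one.R", i))
}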

If you are really re-using the same model structure, you could build and compile just once and then simply re-assign data values.  e.g.
Cmodel$y <- some_other_values
then re-run your already-compiled MCMC.
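
A minimal sketch of that reuse pattern, assuming the objects from your original example (constants, Cmodel, Cmcmc) have already been built and compiled once, outside the loop:

for (i in 1:10) {
  Cmodel$y <- rnorm(constants$N)        # swap in dataset i
  Cmodel$calculate()                    # refresh log-probabilities for the new data
  Cmcmc$run(niter = 1e3, reset = TRUE)  # reset = TRUE starts a fresh chain
  samples <- as.matrix(Cmcmc$mvSamples)
  ## ... summarize or save `samples` for dataset i ...
}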

If your simulations need models of different sizes, or different choices of which nodes are data (observed vs. unobserved), it gets trickier, but it is still possible to build and compile just once and re-use those objects. For example, you can configure, build, and compile a full set of samplers for all nodes (including, atypically, data nodes) and then control the sampler order in a particular run of the MCMC to include some samplers and omit others. That way, you can treat some nodes as data and others as unobserved on a run-by-run basis. If that sounds like something you need and it is too imprecisely described here, please holler again and we can go into more detail.
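
A configuration-time sketch of the first part of that idea (the sampler type and the dropped indices are illustrative, and run-by-run control of the execution order is the part that would need the more detailed recipe):

conf <- configureMCMC(model)
## Atypically, add samplers for the data nodes too, so the compiled MCMC
## carries a sampler for every stochastic node:
for (node in model$getNodeNames(dataOnly = TRUE))
  conf$addSampler(target = node, type = "RW")
conf$printSamplers()  # inspect the full sampler list
## Omit selected samplers for a given scenario by pruning the execution order:
ord <- conf$getSamplerExecutionOrder()
conf$setSamplerExecutionOrder(ord[-c(1, 2)])  # dropped indices are illustrative
mcmc <- buildMCMC(conf)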

Will one of those approaches help you?

-Perry


Henry Scharf

Aug 10, 2021, 7:16:14 PM
to nimble-users
Hi Perry,

Thanks for the quick reply and helpful suggestions. I think the system2() + Rscript workaround will work for me. Memory use during some preliminary runs using system2() looked good.
Not having to rebuild and compile is tempting, but the data structures for the various simulation scenarios are pretty intricate and I'm reluctant to kick that nest.

Best,
Henry

gesta...@gmail.com

Aug 25, 2022, 11:21:18 AM
to nimble-users
Perry or Henry,

I know this thread is a year old now, but do either of you have a very simple code example demonstrating this system2() + Rscript approach for repeatedly building and compiling nimble models when the data structure and model constants are changing from one simulation to the next?

Thanks,
Glenn


Chris Paciorek

Apr 15, 2023, 2:28:48 PM
to Keith Lau, nimble-users
I think using Rscript could be a good strategy. You'll likely want to write a file of R code (say, `run_chain.R`) that implements a single chain and saves the results in a chain-specific file.
The R file should take an ID as an argument, using syntax like:

  args <- commandArgs(TRUE)
  chainID <- as.numeric(args[1])

Then, in your main R script, you can use parLapply (as seen in our parallelization example) to invoke a function that uses system2 to run the file with the chain's ID:

run_MCMC_allcode <- function(seed) {
  ## system2 takes the command and its arguments separately
  system2("Rscript", args = c("run_chain.R", seed))
}
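
A minimal sketch of how the pieces might fit together; the model inside run_chain.R is a stand-in for your own, and the cluster size is illustrative:

## --- run_chain.R (hypothetical contents; substitute your own model) ---
args <- commandArgs(TRUE)
chainID <- as.numeric(args[1])
library(nimble)
set.seed(chainID)
code <- nimbleCode({ mu ~ dnorm(mean = 0, sd = 1) })
model <- nimbleModel(code, inits = list(mu = 0))
Cmodel <- compileNimble(model)
Cmcmc <- compileNimble(buildMCMC(configureMCMC(model)), project = model)
Cmcmc$run(niter = 1e3)
samples <- as.matrix(Cmcmc$mvSamples)
saveRDS(samples, paste0("samples_chain_", chainID, ".rds"))

## --- main script: launch the chains in parallel, one subprocess each ---
library(parallel)
cl <- makeCluster(4)
parLapply(cl, 1:4, run_MCMC_allcode)
stopCluster(cl)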

Let me know if any of that is not clear.

It's clunky, I know. We are working on the memory limitations as part of an overhaul of the compiler and model-building systems, but that's a long-term project.

You could also experiment with using `nimble:::clearCompiled()` to clear out compiled objects that might be holding onto a lot of memory, but when I played a bit with that for a simple case it didn't seem to be effective at freeing up memory, so I'm not sure how well it would work for you.

-chris

On Tue, Apr 11, 2023 at 3:05 PM Keith Lau <genw...@gmail.com> wrote:
Dear all, 

I am also interested in how to free redundant memory when computing in parallel. I ran 50 chains in parallel with foreach. When it returned the 50 summary objects (each small in memory), I found that the main R session's memory use grew a great deal (almost 256 GB). It seems a lot of redundant memory is returned from each session (I'm not sure). I wonder whether there is any way to solve this memory issue. I saw there is a system2() + Rscript approach, but I would like to see an example of how to do it.

Thanks a lot!
Keith

On Thursday, August 25, 2022 at 11:21:18 PM UTC+8, gesta...@gmail.com wrote: