Adding a module dramatically slows down a capture-recapture model


Marwan Naciri

Jun 1, 2026, 8:56:47 AM
to nimble-users

Hi Nimble team,

I am trying to test for an effect of food availability on survival (specifically, on mortality hazard rates) in a capture-recapture (CR) model. Since no clean covariate for food availability exists, I built a model that estimates food availability in each year from raw, messy data. On their own, both models run fine: the food availability model takes ~20 min and the CR model ~10 h. But when I include the food availability module in the CR model, together with an effect of (estimated) food availability on mortality hazard rates, the model takes much longer to build and compile (the buildMCMC() and compileNimble() steps), requires far more memory, and takes much longer to run.

I ran a test with a simulated dataset of modest size (15 years, 30 marked individuals per year, food availability data). When I include both the CR module and the food module in a single model without a parameter for an effect of food on mortality, my whole script takes ~30 min to run. But when I add a parameter for an effect of food (acorn) availability on mortality, the buildMCMC() and compileNimble() steps each take ~5 minutes instead of seconds, and the model takes >12 hours to run. With my real dataset (30 years of data, ~100 marked individuals in each year), the model becomes prohibitively slow.

Is this to be expected? Is it related to the specifics of my food availability model? Is there a way to speed things up? Any insight would be appreciated. 

The model code and the code to simulate data are attached.

Thanks in advance,

Marwan


CRR_food_availability_code.R

Daniel Turek

Jun 3, 2026, 11:40:09 AM
to Marwan Naciri, nimble-users
Marwan, thanks for sending this question. I'll try to take a look and hopefully come up with some suggestions for how to speed things up when you include both the CR model and the food availability model.

One question, if you have a moment: could you please point me directly to the parameter you mentioned, the one for the effect of food on mortality? It wasn't immediately clear to me from looking at your code, and knowing exactly which parameter you're referring to will help me understand what is causing the computational bottleneck.

Thanks,
Daniel



Marwan Naciri

Jun 3, 2026, 12:16:24 PM
to nimble-users
Hi Daniel,
Sorry, I mixed things up and sent the code that does not include an effect of food on mortality, instead of the code that does.
Attached is the code that includes the effect in the model (the parameter is called 'c'), though I simulate the data without any effect of food on mortality.

Sorry about that,
Many thanks,
Marwan
CRR_food_availability_code.R

Daniel Turek

Jun 4, 2026, 3:22:13 PM
to Marwan Naciri, nimble-users
Marwan, I was able to look at this.  Everything is working "fine"; the issue is that including the acorn model introduces 2123 latent nodes acorn_obs[i, t, k] into the model.  Each of these must be sampled by the MCMC (and, being discrete nodes arising from a dnegbin distribution, they are all assigned slice samplers).

In every MCMC iteration, each of these 2123 acorn_obs nodes is updated.  The slice sampler that does this update requires multiple calculations of all the dependencies of each acorn_obs[i, t, k] node.  As the model is written, each acorn_obs[i, t, k] node has a total of 6672 dependencies, broken down below:
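To check which samplers NIMBLE assigns to latent count nodes, one can inspect the MCMC configuration before building it. The following is only a minimal sketch on a hypothetical toy model: the names toy_code, n, y and p and the data values are made up for illustration and are not taken from the attached script.

```r
library(nimble)

# Hypothetical toy model (not the attached model): latent negative-binomial
# counts n[i], each with one Poisson observation y[i] depending on it.
toy_code <- nimbleCode({
  p ~ dunif(0, 1)
  for (i in 1:N) {
    n[i] ~ dnegbin(prob = p, size = 5)   # latent discrete node
    y[i] ~ dpois(n[i] + 0.1)             # observed data depending on n[i]
  }
})

toy_model <- nimbleModel(
  toy_code,
  constants = list(N = 4),
  data      = list(y = c(2, 0, 5, 1)),
  inits     = list(p = 0.5, n = c(2, 1, 4, 1))
)

# Default sampler assignment; print only the samplers acting on n
conf <- configureMCMC(toy_model)
conf$printSamplers("n")
```

In the real model, replacing "n" with "acorn_obs" in the printSamplers() call should show the samplers assigned to those nodes.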

Dependencies of each acorn_obs[i, t, k] node:
first there's a single acorn_corrected[i,t,k] node - no problem here.
then a single acorn[i,t] node - no problem
then a single acorn_index[t] - no problem
then *all 15* acorn_index_s[t] nodes.

This number by itself isn't large.  But each acorn_corrected[i,t,k] flows into *all 15* acorn_index_s[t] nodes, and all 15 acorn_index_s[t] nodes are in turn used for mN, and then for sW, sS, etc.  As a result, the dependencies of every individual acorn_obs[i, t, k] end up including *all* of each of the following:

acorn_index_s: 15
lik: 202
mN: 84
ones: 202
psiH: 84
psiN: 84
S: 252
sS: 84
sW: 84
zeta: 5578

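for a total of 6672 dependencies (the 6669 nodes listed above, plus the three single nodes acorn_corrected[i,t,k], acorn[i,t] and acorn_index[t]).

The sampling time of these acorn_obs[i, t, k] nodes accounts for 99.92994 % of the total MCMC runtime.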
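One way to obtain this kind of per-sampler timing breakdown is to run the compiled MCMC with time = TRUE and then call getTimes(). The sketch below uses a hypothetical toy model (toy_code, n, y and p are illustrative names, not those of the attached script) and is not necessarily how the figure above was computed.

```r
library(nimble)

# Hypothetical toy model: latent negative-binomial counts with Poisson data
toy_code <- nimbleCode({
  p ~ dunif(0, 1)
  for (i in 1:N) {
    n[i] ~ dnegbin(prob = p, size = 5)
    y[i] ~ dpois(n[i] + 0.1)
  }
})

toy_model <- nimbleModel(
  toy_code,
  constants = list(N = 4),
  data      = list(y = c(2, 0, 5, 1)),
  inits     = list(p = 0.5, n = c(2, 1, 4, 1))
)

conf   <- configureMCMC(toy_model)
mcmc   <- buildMCMC(conf)
Cmodel <- compileNimble(toy_model)
Cmcmc  <- compileNimble(mcmc, project = toy_model)

# Run with timing enabled, then retrieve the time spent in each sampler
Cmcmc$run(2000, time = TRUE)
times <- Cmcmc$getTimes()

# Label each timing with its sampler's target node(s)
targets <- sapply(conf$getSamplers(), function(s) paste(s$target, collapse = ","))
share   <- 100 * times / sum(times)
names(share) <- targets

# Percentage of sampling time per sampler, and the total for the n[i] nodes
print(round(sort(share, decreasing = TRUE), 2))
print(sum(share[grepl("^n\\[", targets)]))
```

On a toy model this small the timings are dominated by noise; the same pattern applied to the full model, with "^n\\[" replaced by "^acorn_obs\\[", would give the share of runtime spent on those nodes.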

In short, because the standardization funnels the dependencies into *all 15* of the acorn_index_s[t] nodes, and they fan out again from there, the dependencies of each acorn_obs node include most of the model.
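The effect of standardization on the dependency graph can be seen in a much smaller example. This is a sketch with hypothetical names (x, x_s, y, nT), not the attached model: it builds the same toy model with and without a standardization step and counts the dependencies of a single latent node using getDependencies().

```r
library(nimble)

nT <- 15
y_obs  <- seq(-1, 1, length.out = nT)    # made-up data
x_init <- seq(-2, 2, length.out = nT)    # made-up initial values

# Without standardization: y[t] depends only on x[t]
raw_code <- nimbleCode({
  for (t in 1:nT) {
    x[t] ~ dnorm(0, sd = 1)
    y[t] ~ dnorm(x[t], sd = 1)
  }
})

# With standardization: every x[t] feeds the mean and sd,
# which in turn feed every x_s[t] and therefore every y[t]
std_code <- nimbleCode({
  for (t in 1:nT) {
    x[t] ~ dnorm(0, sd = 1)
  }
  mu <- mean(x[1:nT])
  s  <- sd(x[1:nT])
  for (t in 1:nT) {
    x_s[t] <- (x[t] - mu) / s
    y[t] ~ dnorm(x_s[t], sd = 1)
  }
})

build <- function(code) {
  nimbleModel(code, constants = list(nT = nT),
              data = list(y = y_obs), inits = list(x = x_init))
}
raw_model <- build(raw_code)
std_model <- build(std_code)

deps_raw <- raw_model$getDependencies("x[1]")
deps_std <- std_model$getDependencies("x[1]")

length(deps_raw)                    # dependencies without standardization
length(deps_std)                    # dependencies with standardization
table(sub("\\[.*", "", deps_std))   # breakdown by variable name
```

The same getDependencies() call and table() breakdown, applied to one acorn_obs[i, t, k] node of the full model, is one way to produce a per-variable count like the one listed above.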

One way to limit this would be to skip the standardization.  That is, never create acorn_index_s[t], and instead just use acorn_index[t].  I gave this a quick try, and building the MCMC took about 20% of the original time (due to so many fewer dependencies in each sampler, I believe), and the MCMC runtime itself was reduced to about 40% of the original.  So this is one option for you.

I'm sure there are other things you could do to reduce these dependencies as much as possible, but this was one quick approach, which worked pretty well.

Let me know if anything doesn't make sense.

Cheers,
Daniel





Marwan Naciri

Jun 15, 2026, 8:56:39 AM
to nimble-users
Hi Daniel, 

Thanks a lot for taking the time to look into this issue and for your detailed answer!
With my actual data (30 years, thousands of marked individuals) and model structure (the CR model is slightly more complicated than the model used in the simulations), and with the standardization step removed, the model still takes very long and a huge amount of memory to compile. It crashed after 3.25 hours because 50 GB of memory was not sufficient, whereas the CR model on its own used to take less than 10 minutes to compile and 10 GB of memory was sufficient. If you have in mind any other way of reducing the number of dependencies, I would greatly appreciate your insight.
Would using a distribution other than the negbin (e.g., lognormal), so that a slice sampler is not required, speed up compilation? Or the MCMC run?

Thanks in advance,

Cheers,

Marwan

Perry de Valpine

Jun 15, 2026, 12:40:07 PM
to Marwan Naciri, nimble-users
