Efficient simulation loop with different NA in simulated data


Mathieu Pruvot

Mar 12, 2026, 6:17:58 PM
to nimble-users
Hi everyone,
I am trying to run simulations of an integrated dynamic occupancy model with missing observations.
In my first version, I recompiled the model after simulating each new data set, and everything worked great. However, recompiling the model for every simulation is very time-consuming. I then tried compiling the model once at the beginning and only updating the data, but it appears that nimble doesn't update which data points are flagged as missing, and this results in errors. Any recommendations on how to do this more efficiently?
Thanks
Mathieu

Daniel Turek

Mar 13, 2026, 9:39:46 AM
to Mathieu Pruvot, nimble-users
Mathieu, you should be able to do what you're trying - setting new data into a compiled model, even when the missing / observed data points are different.  If the new data is in a named list called NEWDATA, and Cmodel is the compiled model object, then you should use:

Cmodel$setData(NEWDATA)

This will update the values in the model object, and also the internal "data flags" inside the model (logical variables inside the model, which indicate which nodes are "observed data").
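As a minimal sketch of the flag update described above, the following uses a toy model (an assumption for illustration, not the occupancy model from this thread) and checks the data flags with isData() before and after setData(). It works on the uncompiled model so that no C++ compilation is needed; per the reply above, the same setData() call applies to the compiled model object.

```r
library(nimble)

# Toy model, assumed purely for illustration
code <- nimbleCode({
  mu ~ dnorm(0, sd = 10)
  for (i in 1:4) {
    y[i] ~ dnorm(mu, sd = 1)
  }
})

# Original data: y[3] and y[4] are missing
model <- nimbleModel(code,
                     data  = list(y = c(1, 2, NA, NA)),
                     inits = list(mu = 0))

# Data flags for y[1], ..., y[4] under the original data
flags_before <- model$isData("y")

# New data with a different missingness pattern
NEWDATA <- list(y = c(1, NA, 3, 4))
model$setData(NEWDATA)

# Data flags after setData(): non-NA entries are flagged as data
flags_after <- model$isData("y")

print(flags_before)
print(flags_after)
```

Comparing the two printed vectors shows which nodes the model itself now regards as observed; it says nothing about what an already-built MCMC will do.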

There is one caveat, however.  If you have already configured and built an MCMC to operate on the model, then the MCMC will *not* change its sampling strategy.  Here's what I mean:

Say in the original model, the data is:
data = list(y = c(1, 2, NA, NA))
Then you build and execute an MCMC on this model.  The MCMC will treat y[1] and y[2] as fixed data, and will assign samplers to update the missing values y[3] and y[4].

Say you now change the data in the model, using:
NEWDATA = list(y = c(1, NA, 3, 4))
with different missing values.  The old MCMC, the one you previously built and executed, will *not* update its sampling strategy, and will still have samplers assigned to y[3] and y[4], and no sampler assigned to y[2].
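The mismatch in this example can be spelled out in plain R (no NIMBLE calls; the variable names below are hypothetical). This is only a sketch of the bookkeeping: it compares the indices the original MCMC was built to sample with the indices that are actually missing in the new data.

```r
# Data vectors copied from the example above
old_y <- c(1, 2, NA, NA)
new_y <- c(1, NA, 3, 4)

# Indices the original MCMC treats as missing (it holds samplers for these)
sampled_idx <- which(is.na(old_y))

# Indices that are actually missing in the new data
missing_idx <- which(is.na(new_y))

# Observed in the new data, but the old MCMC still has samplers on them
stale_samplers <- setdiff(sampled_idx, missing_idx)

# Missing in the new data, but the old MCMC has no sampler for them
unsampled_missing <- setdiff(missing_idx, sampled_idx)

# TRUE whenever the old sampler assignment no longer matches the data
pattern_changed <- length(stale_samplers) > 0 || length(unsampled_missing) > 0

print(stale_samplers)
print(unsampled_missing)
print(pattern_changed)
```

A check like pattern_changed could be run inside a simulation loop to detect when a previously built MCMC no longer matches the missingness pattern of the current data set.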

There are some ways you could work around this, but from your email I'm not even sure if you're using the MCMC engine, so I'll leave it here, and please follow up if you have further questions.

Cheers,
Daniel



Mathieu Pruvot

Mar 13, 2026, 12:14:36 PM
to nimble-users
Hi Daniel,
Thanks for the response. I realize I should have given a little more detail.
The caveat you illustrated is precisely my issue, as my detection data-generating functions are designed to create data gaps (to be consistent with real-life monitoring data).
Below is the chunk of code for the initial model fitting and the simulation loop (ignore the low number of iterations and simulations; this was just for testing purposes).
The error I get is:
Error in mcmc$run(niter, nburnin = nburnin, thin = thinToUseVec[1], thin2 = thinToUseVec[2],  :
  in binary sampler, all log probability density values are negative infinity and sampling cannot proceed

I believe this is related to the sampling strategy not being updated, as you described above.
The model takes a while to compile, so I think the gain in speed could be substantial, but I can't quite figure out how to set it up properly.
Thanks!


############################################################
## 9. BUILD MODEL ONCE
############################################################


# Constants 

constants <- list(
  N=N, T=T,
  nScout=nScout, nDrone=nDrone,
  maxCam=maxCam, nWeek=nWeek,
  W=W
)

# --- Simulate initial landscape ---
land <- simulate_landscape(forest_prop, smooth_strength)
z <- simulate_occupancy(land, true)
detec <- simulate_detection(
  z,
  prop_cell_scout_gap, prop_scout_replicate,
  prop_cell_drone_gap, prop_drone_replicate,
  pWeekActive, siteActiveProb, seasonGapProb,
  maxGapLength, siteWideWeekMiss,
  ViewMaps
)

# --- Initialize latent occupancy based on any positive detection ---
z_inits <- matrix(0, N, T)
for(i in 1:N){
  for(t in 1:T){
    det <- any(detec$yScout[i,t,]==1, na.rm=TRUE) ||
      any(detec$yDrone[i,t,]==1, na.rm=TRUE) ||
      any(detec$yCam[i,t,,]==1, na.rm=TRUE)
    z_inits[i,t] <- ifelse(det, 1, rbinom(1,1,0.3))
  }
}


data <- c(land,detec)


# --- Initial values  ---
inits <- list(
  z = z_inits,
  beta0=0, beta1=0, beta2=0,
  alpha0=0, alphaNbr=0, alpha2=0, alpha3=0, alphaControl=0,
  delta0=0, delta1=0,
  theta0=0, theta_adequate=0, theta_good=0,
  eta0=0, eta_adequate=0, eta_good=0,
  phi0=0, phi_old=0, phi_fresh=0

)

# --- Build and compile Nimble model ---
model <- nimbleModel(code, constants=constants, data=data, inits=inits)
cmodel <- compileNimble(model)

conf <- configureMCMC(model, monitors=c(
  "beta0","beta1","beta2",
  "alpha0","alphaNbr","alpha2","alpha3","alphaControl",
  "delta0","delta1",
  "theta0","theta_adequate","theta_good",
  "eta0","eta_adequate","eta_good",
  "phi0","phi_old","phi_fresh"
))

mcmc <- buildMCMC(conf)
cmcmc <- compileNimble(mcmc, project=model)

############################################################
## 10. SIMULATION LOOP
############################################################

results_A_effL <- list()
for(sim in 1:nSim){
 
  cat("Simulation", sim, "\n")
 
  # --- Simulate landscape, occupancy, detection ---
  land <- simulate_landscape(forest_prop, smooth_strength)
  z <- simulate_occupancy(land, true)
  detec <- simulate_detection(
    z,
    prop_cell_scout_gap, prop_scout_replicate,
    prop_cell_drone_gap, prop_drone_replicate,
    pWeekActive, siteActiveProb, seasonGapProb,
    maxGapLength, siteWideWeekMiss,
    ViewMaps
  )
 
  # --- Update model data ---
  cmodel$forest[] <- land$forest
  cmodel$distCrop[] <- land$distCrop
  cmodel$control[,] <- land$control
  cmodel$scoutCond[,,] <- detec$scoutCond
  cmodel$droneCond[,,] <- detec$droneCond
  cmodel$camFOV[,,,] <- detec$camFOV
 
  # --- Update detection data ---
  cmodel$setData(list(
    yScout = detec$yScout,
    yDrone = detec$yDrone,
    yCam   = detec$yCam
  ))
 
 
  # --- Re-initialize latent states based on detection ---
  z_init <- matrix(0, N, T)
  for(i in 1:N){
    for(t in 1:T){
      det <- any(detec$yScout[i,t,]==1, na.rm=TRUE) ||
        any(detec$yDrone[i,t,]==1, na.rm=TRUE) ||
        any(detec$yCam[i,t,,]==1, na.rm=TRUE)
      z_init[i,t] <- ifelse(det, 1, rbinom(1,1,0.3))
    }
  }
  cmodel$z[,] <- z_init
 

  # --- Run MCMC ---
  samples <- runMCMC(
    cmcmc,
    niter=2000,
    nburnin=1000,
    nchains=1,
    samplesAsCodaMCMC=TRUE
  )
 
 
  results_A_effL[[sim]] <- samples
 
}


Daniel Turek

Mar 18, 2026, 10:08:32 AM
to nimble-users, Mathieu Pruvot
To follow up with the users list: Mathieu and I have been in touch (off-list) discussing his specific model and ways to speed up the simulation process (using new data) without recompiling at every iteration.  This is an ongoing effort, but other users should note that there are strategies for accomplishing this if they encounter a similar problem.

Cheers,
Daniel

