Is it possible to have some nodes (not all) constant in a variable?


roger pradel

Aug 28, 2021, 11:19:17 AM
to nimble-users
I did not find an answer to this question, nor, assuming it is possible, an explanation of how to specify which nodes are constant.
I think this could be useful for neutralizing the occasions prior to first encounter in CJS-type capture-recapture models. Currently, these are apparently given as data with no distribution. I am wondering whether specifying them as constants would make a difference.

Many thanks for any light that could be shed on this.

Roger

Daniel Turek

Aug 31, 2021, 8:41:26 AM
to roger pradel, nimble-users
Roger, it's good to hear from you.  I agree, the documentation available on this sort of question is minimal at best, and it's a difficult question.

When a node is considered to be "constant", there are two different ways that can arise.  This applies in the same way to stand-alone nodes ("a" or "b") and to nodes that appear inside a larger variable, as your question asks ("x[4]" or "y[3,6]"):

1) The node has no distributional assumption associated with it and is merely a fixed value.  These usually appear as covariates, and we also call them "RHS-only" (right-hand-side-only) nodes, since they only appear on the right-hand side of declarations, e.g., "a" in:
y ~ dbern(a)
x ~ dnorm(mu, a^2)
... anything else, so long as "a" never appears on the left-hand side of ~ or <-

In these cases, the "constant" value of a will factor into all the declarations in which it appears, but "a" itself has no likelihood distribution.  So whatever value you provide for "a" will affect (in this case) x, y, etc, but "a" itself has no prior.

2) The node appears (somewhere in the model code) on the left-hand side of a declaration; e.g., in addition to any of the above, we also have:
a ~ dAnyDistribution( [parameters] )

In this case, the "constant" value of a will also be used for evaluating the likelihood of [parameters] according to this prior distribution.

Now, finally, getting to your question.  In (1) above, the value of "a" can equally well be provided as "constants", "data", or "inits" (via the corresponding list passed to nimbleModel).  All will work the same.
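
For instance, a minimal toy sketch (the dbern declaration is invented for illustration, not from your model) where all three calls build the same model with "a" fixed at 0.3:

library(nimble)
code <- nimbleCode({
  y ~ dbern(a)    ## "a" is RHS-only: it never appears on the left of ~ or <-
})
m1 <- nimbleModel(code, constants = list(a = 0.3), data = list(y = 1))
m2 <- nimbleModel(code, data = list(a = 0.3, y = 1))
m3 <- nimbleModel(code, inits = list(a = 0.3), data = list(y = 1))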

In (2) above, the value of "a" cannot be provided via the "constants" list.  If "a" is provided as "data", then it will be treated as a constant (un-changing) value, but will affect the likelihood of [parameters] as explained above.  And if the value of "a" is provided in the "inits" list, then "a" will not in fact be constant; it will use this initial value and undergo MCMC sampling, so I don't think this is the case you want.
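
To make the distinction concrete, another invented toy sketch (again, not your model):

code2 <- nimbleCode({
  mu ~ dnorm(0, sd = 10)
  a ~ dnorm(mu, sd = 1)    ## "a" now appears on the left-hand side
  y ~ dnorm(a, sd = 1)
})
## "a" as data: fixed throughout MCMC, but a's declaration still
## contributes to the likelihood of mu:
mData  <- nimbleModel(code2, data = list(a = 2, y = 2.5))
## "a" in inits: 2 is only a starting value, and "a" will be sampled by MCMC:
mInits <- nimbleModel(code2, inits = list(a = 2), data = list(y = 2.5))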

Finally: when the node in question is part of a larger variable, all of the above applies in the same way; one just has to be careful about how to specify the full node.  Say you have x[1:5], and only x[3] is "constant".  Then you'd supply the "inits" and "data" lists as:

inits = list(x = c(1, 2, NA, 4, 5) )    ##  These are initial values for the non-constant x[1], x[2], x[4], and x[5]
data = list(x = c(NA, NA, 3, NA, NA))  ## This is the "constant" value of 3 provided for x[3]
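
Wrapped into a complete toy model (the dnorm declarations are invented just for illustration):

codeX <- nimbleCode({
  for(i in 1:5) {
    x[i] ~ dnorm(0, sd = 10)
  }
})
mX <- nimbleModel(codeX,
                  data  = list(x = c(NA, NA, 3, NA, NA)),
                  inits = list(x = c(1, 2, NA, 4, 5)))
## x[3] stays fixed at 3 during MCMC; x[1], x[2], x[4], and x[5] start
## at 1, 2, 4, 5 and will be sampled.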

So, like I said, there's a lot going on here.  I hope this makes sense and helps with your question (rather than confusing things more).  Please feel free to expand on your question further, if that would be helpful.

Cheers, all the best,
Daniel



roger pradel

Aug 31, 2021, 11:30:21 AM
to nimble-users
Hi Daniel,

Very happy to hear from you too!! Hope you are well back in the US!

Yes, your explanations help me better understand how constants, inits, and data are handled in Nimble. I asked the question initially because I had a message urging me to specify some nodes of a multidimensional variable as constants:
Detected use of non-constant indexes: s[1, 8, 2], s[1, 9, 2].... For computational efficiency we recommend specifying these in 'constants'.
I see now that I can do that with inits and data (your last example).

However, thinking more about it, this is probably a secondary issue. To get down to my real problem: I am modeling capture-recapture data of what I thought was a reasonably small size (15 occasions, 120 individuals). Yet, on a modern laptop, nimbleModel ran for 102 minutes, and the subsequent compileNimble of its output took close to 1 day... and eventually failed.

Compiling...
Error in inDL(x, as.logical(local), as.logical(now), ...) :
   impossible de charger l'objet partagé 'C:/Users...... .dll':
   LoadLibrary failure: Le module spécifié est introuvable.

The message is partly in French, but basically it says the temporary dll object could not be found ("impossible de charger l'objet partagé" = "unable to load shared object"; "Le module spécifié est introuvable" = "The specified module could not be found").
When I looked at the size of the resulting object, I saw that it was huge (6,202,891,504 bytes, i.e., about 6.2 GB). So, I guess THIS is the real problem.

It is probably so large because it has many nodes; at least this is my guess. I did try running the model with 10, then 20 individuals, and that seemed to work well.

There are many nodes because an individual may be equipped with 3 different gears, each with 3 levels. Add to that that the individual itself may be alive, just dead (recoverable), or gone (unrecoverable), combined with 8 occasions per individual on average, and that makes for many, probably too many, nodes. My feeling at the moment is that I should concentrate on drastically reducing the number of nodes by filtering over latent states, as you did in your Environ Ecol Stat 2016 paper. Trying to chase constants seems futile here.

Do you believe this is the right strategy or that better specifying the constant nodes would solve the size/time problem?

Cheers and many thanks

Roger

PS: I could send you the code if you prefer.

roger pradel

Sep 1, 2021, 6:26:59 AM
to nimble-users
Hi Daniel,

Since yesterday, I have made progress in understanding my problem. At its root is the fact that, for a certain code in the capture history, the event step must be absent. I opted for metaprogramming the model code, replacing the for loop over occasions and individuals with a long list of statements from which I could skip the ones corresponding to that specific code. That apparently was a very bad idea. Yesterday, I ran trials with the pump example in the manual.

I compared the performance of the original code:

pumpCode <- nimbleCode({
  for (i in 1:N) {
    theta[i] ~ dgamma(alpha,beta)
    lambda[i] <- theta[i]*t[i]
    x[i] ~ dpois(lambda[i])
  }
  alpha ~ dexp(1.0)
  beta ~ dgamma(0.1,1.0)
})


with the equivalent code without the for loop:

pumpCodebis <- nimbleCode({
  theta[1] ~ dgamma(alpha,beta)
  lambda[1] <- theta[1]*t[1]
  x[1] ~ dpois(lambda[1])
  theta[2] ~ dgamma(alpha,beta)
  lambda[2] <- theta[2]*t[2]
  x[2] ~ dpois(lambda[2])
  theta[3] ~ dgamma(alpha,beta)
  lambda[3] <- theta[3]*t[3]
  x[3] ~ dpois(lambda[3])
  theta[4] ~ dgamma(alpha,beta)
  lambda[4] <- theta[4]*t[4]
  x[4] ~ dpois(lambda[4])
  theta[5] ~ dgamma(alpha,beta)
  lambda[5] <- theta[5]*t[5]
  x[5] ~ dpois(lambda[5])
  theta[6] ~ dgamma(alpha,beta)
  lambda[6] <- theta[6]*t[6]
  x[6] ~ dpois(lambda[6])
  theta[7] ~ dgamma(alpha,beta)
  lambda[7] <- theta[7]*t[7]
  x[7] ~ dpois(lambda[7])
  theta[8] ~ dgamma(alpha,beta)
  lambda[8] <- theta[8]*t[8]
  x[8] ~ dpois(lambda[8])
  theta[9] ~ dgamma(alpha,beta)
  lambda[9] <- theta[9]*t[9]
  x[9] ~ dpois(lambda[9])
  theta[10] ~ dgamma(alpha,beta)
  lambda[10] <- theta[10]*t[10]
  x[10] ~ dpois(lambda[10])
  alpha ~ dexp(1.0)
  beta ~ dgamma(0.1,1.0)
})


With the same constants, data, and inits (those in the manual), I then compared the time it took to build the model.

start.time <- Sys.time()
for (i in 1:20) {
  pump <- nimbleModel(code = pumpCode, name = "pump", constants = pumpConsts,
                      data = pumpData, inits = pumpInits)
}
end.time <- Sys.time()
difftime(end.time, start.time, units = "mins")

vs.

start.time <- Sys.time()
for (i in 1:20) {
  pump <- nimbleModel(code = pumpCodebis, name = "pump", constants = pumpConsts,
                      data = pumpData, inits = pumpInits)
}
end.time <- Sys.time()
difftime(end.time, start.time, units = "mins")

The code with the for loop builds faster: 0.44 mins vs 2.68 mins. Does this make sense to you?

I also noticed an apparent slowdown over the 20 iterations, so I repeated the test 2 more times with the original pumpCode to check this impression.
After the 0.44 mins of the first test, the second test took 0.75 mins, and the third 0.81 mins. Could that be due to some cluttering of memory or disk space?

Would you recommend keeping a for loop in the code?

In this case, I imagine that I could use a nested index. Say I want to skip 3 in the for loop in pumpCode. I would construct a vector of indices without 3, pass it as a constant (as the warning message above suggests), and loop from 1 to N-1 like this:

pumpCode <- nimbleCode({
  for (i in 1:M) {    ## M = length(index), supplied via 'constants'
    theta[index[i]] ~ dgamma(alpha, beta)
    lambda[index[i]] <- theta[index[i]] * t[index[i]]
    x[index[i]] ~ dpois(lambda[index[i]])
  }
  alpha ~ dexp(1.0)
  beta ~ dgamma(0.1,1.0)
})

pumpConsts <- list(N = 10,
                   t = c(94.3, 15.7, 62.9, 126, 5.24,
                         31.4, 1.05, 1.05, 2.1, 10.5),
                   index = c(1, 2, 4, 5, 6, 7, 8, 9, 10),
                   M = 9)    ## length(index) = N - 1
pumpData <- list(x = c(5, 1, 5, 14, 3, 19, 1, 1, 4, 22))
pumpInits <- list(alpha = 1, beta = 1,
                  theta = rep(0.1, pumpConsts$N))
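
If I understand correctly, with this version theta[3], lambda[3], and x[3] are simply never declared, so no nodes or samplers would be created for them, and the model would be built exactly as before:

pump <- nimbleModel(code = pumpCode, name = "pump", constants = pumpConsts,
                    data = pumpData, inits = pumpInits)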


Do you believe this is a good approach? Can you suggest another one?

Cheers

Roger

Perry de Valpine

Sep 1, 2021, 12:55:33 PM
to roger pradel, nimble-users
Hi Roger,

Hello and good to see this exchange.  I can chime in on some of the issues.

Yes, there is a computational cost in model building and compilation for each line (declaration) of model code.  Each line of model code is converted into a new set of functions for calculation, simulation, and other utilities.  When there is a for loop, there is one such set of functions that works for every value of the loop indices.  When you manually unroll the loop, you make nimble create a new set of functions for each line, and then compile and manage all of those.  That would definitely increase the time of nimbleModel and compileNimble for the model.  It is less clear why it would increase the run time, except that the mysteries of compiled C++ performance can involve memory accesses, executable code size, and such; I suspect that for such reasons the executable from the unrolled version loses some efficiency.

Back to your main problem: would the hidden Markov model implementations in nimbleEcology (marginalized distributions to be used in nimble models) help in your case?  Do you want to share the code (on-list or off-list, as you prefer)?  nimble can get bogged down in large problems, but it's not immediately clear to me that yours should be in that zone; then again, I'm not seeing your problem clearly.

Best,
Perry



Daniel Turek

Sep 2, 2021, 10:10:13 AM
to roger pradel, Perry de Valpine, nimble-users
Thanks for the additional information, Perry.  I agree that in most cases keeping the for loop (rather than writing every line explicitly) will lead to better performance, although we are sometimes surprised by the different situations that arise.

Without seeing the full model code, it's difficult to give confident advice.  But if some subset of the indices being used in the model (e.g., s[1, 8, 2], s[1, 9, 2]) are fixed (never-changing), then it's definitely better for model efficiency to specify these as model "constants" rather than as "inits" or "data".  However, if some other indices in the same variable are non-constant (dynamic), for example if s[2, 8, 2] is a dynamic index, then s[] cannot be specified as a constant in its entirety.  In that case, depending on the form and breakdown of constant/non-constant s[], maybe s[] could be broken into two variables (say s_constant and s_dynamic), one of which is never-changing (and specified as a model "constant") and the other of which is dynamic (and specified in the model "inits").  Both could be used inside for loops in the model, but perhaps two separate for loops, as in the sketch below.
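
Purely for illustration, a hypothetical sketch of that split (none of these names or distributions come from your model; it assumes the observations are ordered so the constant-index cases come first, and nConst, nDyn, K, s_constant, and alphaVec are all supplied as model constants):

code <- nimbleCode({
  for(i in 1:nConst) {
    ## s_constant comes in via 'constants', so these indices are baked in
    ## at model-building time:
    y[i] ~ dnorm(mu[s_constant[i]], sd = 1)
  }
  for(j in 1:nDyn) {
    ## s_dynamic is a latent categorical index, initialized via 'inits'
    ## and sampled by MCMC:
    s_dynamic[j] ~ dcat(p[1:K])
    y[nConst + j] ~ dnorm(mu[s_dynamic[j]], sd = 1)
  }
  for(k in 1:K) {
    mu[k] ~ dnorm(0, sd = 10)
  }
  p[1:K] ~ ddirch(alphaVec[1:K])
})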

Regarding the memory usage and the progressive slowdown of R: this is something we're aware of and are trying to track down.  We've heard of other users experiencing it, and it's on our list of known issues to fix.  Thanks for reporting it.

Yes, if you're able to use a "marginalize over the latent states" strategy for your model, then I would certainly recommend it.  The 2016 EES paper you mentioned gives the general mathematical form of the marginalization, and nimbleEcology implements a few commonly used cases of it.  That said, what's provided in nimbleEcology certainly doesn't cover all possible cases where marginalization is possible, and I've seen a number of other people write their own custom distributions, specifically for more complex models, to marginalize over the latent states.  If you think it might be possible for your model, then yes, I'd certainly recommend it, since I think it would help with the speed and memory problems you're experiencing.  Let us know if you need help with that; a rough sketch is below.
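
As a very rough sketch of what the nimbleEcology route can look like (invented names throughout: nInd, nOcc, S, O, first, delta, omega, gamma; delta, omega, and gamma would need their own priors or fixed values, which I've omitted, and this is not fitted to your actual model):

library(nimbleEcology)
code <- nimbleCode({
  for(i in 1:nInd) {
    ## dHMM marginalizes over the latent state sequence, so no z[i, t]
    ## nodes (and no samplers for them) are ever created:
    y[i, first[i]:nOcc] ~ dHMM(init = delta[1:S],
                               probObs = omega[1:S, 1:O],
                               probTrans = gamma[1:S, 1:S],
                               len = nOcc - first[i] + 1,
                               checkRowSums = 1)
  }
})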

Thanks again for sharing these issues.  I hope we're able to bring your (full-sized) problem into the realm of tractability.

Cheers, keep in touch,
Daniel

