Spatial Capture Recapture with partial identities


Ben Augustine

Nov 13, 2019, 3:38:54 PM
to nimble-users
I am trying to see if I can fit various spatial partial identity models in NIMBLE that are currently only implemented in my R-based samplers. Most of these models are extensions of unmarked SCR, so I think the first step is to see if I can fit unmarked SCR in NIMBLE. A caveat is that I need to probabilistically reconstruct the capture history instead of marginalizing over the individuals.

Unmarked SCR is typical SCR where the individual identities are lost. The data are typically summarized as n, the trap-by-occasion counts, or just the trap-level counts. To include observation-level partial identity covariates, I will need to work with the latent identity samples one by one. The true individual-by-trap capture history is latent:

y.true[i,j] ~ Poisson(K*lamd[i,j])

where lamd[i,j] is the individual-by-trap expected count, a function of the distance between activity center i and trap j. We observe a matrix y.obs of dimension n.obs x J, where n.obs = sum(y.true) is the total number of latent identity samples. Each row corresponds to one observation, with a single 1 indicating the trap of capture. Equivalently, y.obs could be a vector of length n.obs listing the trap of capture for each observation.

We can create a map from y.obs to y.true, call it ID. ID is a vector of length n.obs indicating the individual identity that each sample is currently assigned to. We can construct y.true from y.obs and ID by:

 y.true <- matrix(0, nrow = M, ncol = J)
 for(i in 1:length(ID)){
   y.true[ID[i],] <- y.true[ID[i],] + y.obs[i,]
 }

In R, I update y.true by iterating through the rows of y.obs, or a subset of these rows, on each iteration and switching their individual identities. This is a Gibbs update:


 up <- sample(1:n.samples, nswap, replace = FALSE) # which latent observations to update? Could just update all of them.
 for(l in up){
   nj <- which(y.obs[l,] > 0) # which trap is the sample recorded in?
   possible <- which(z == 1)  # can only swap the sample to z = 1 individuals
   # more constraints will limit the possible swaps once the partial identity covariates are introduced

   # construct the full conditional
   njprobs <- lamd[, nj]
   njprobs[setdiff(1:M, possible)] <- 0
   njprobs <- njprobs/sum(njprobs)

   newID <- sample(1:M, 1, prob = njprobs) # sample a new identity for this observation

   # update y.true if the proposed ID differs from the current one
   if(ID[l] != newID){
     swapped <- c(ID[l], newID)
     y.true[ID[l],] <- y.true[ID[l],] - y.obs[l,] # subtract this observation from the current ID
     y.true[newID,] <- y.true[newID,] + y.obs[l,] # add this observation to the new ID
     ID[l] <- newID
     ll.y[swapped,] <- dpois(y.true[swapped,], K*lamd[swapped,], log = TRUE) # update the observation model likelihood
   }
 }

Can this or something similar be done in NIMBLE? I appreciate any help and will be happy to provide more information if needed.

Ben

Perry de Valpine

Nov 15, 2019, 7:03:58 PM
to Ben Augustine, nimble-users
Hi Ben,

Yes it should be possible.  Let me sketch how you would do it.  You will need to fill in a meaningful model to have a working example, so I haven't tested this.

I'll assume in the model you have something like
for(i in 1:N)
    ID[i] ~ dcat(prior_probs[1:N])
where prior_probs could be set in the constants list in nimbleModel, or it could be a variable in the model.

And I assume you are using the ID elements as stochastic indices, say like this (but I know your SCR model is different)
  for(i in 1:N) {
    y[i] ~ dnorm( mu[ ID[i] ], sd = 1)
  }

You can translate your sampler to a nimble sampler that is compilable and compatible with the MCMC system like this:

ID_sampler <- nimbleFunction(
  name = "ID_sampler",
  contains = sampler_BASE,
  setup = function(model, mvSaved, target, control) { ## Required arguments. setup is run once when the MCMC is built.
    N <- length(model$y)
    probs <- rep(1/N, N) ## for drawing random integers
    ## target should be "ID".
    ##  The names are hard-coded below but could be generalized, but anyway target is used two lines down.
    nSwap <- control$nSwap ## how many trials to do each time the sampler is run
    calcNodes <- model$getDependencies(target) ## This could be broken down to individual ones, but as a simple demonstration, I'm using all together
  },
  run = function() {
    ## for uncompiled execution, you can put a browser() in here and check what is happening
    for(i in 1:nSwap) {
      l <- rcat(1, probs) ## We don't support compilation of sample(), so I'm using rcat instead.
      nj <- which(model$y.obs[l,]>0) ## This will be a length-1 vector
      possible <- which(model$z == 1)
     ## setdiff is not supported for compilation, so I re-arranged how to get the result you need, IIUC
      njProbs <- numeric(value = 0, length = N)
      njProbs[ possible ] <- model$lamd[possible, nj]
      njProbs <- njProbs / sum(njProbs)
      newID <- rcat(1, njProbs)
      if(model$ID[l] != newID) {
        model$y.true[model$ID[l],] <<- model$y.true[model$ID[l],] - model$y.obs[l,] # subtract this observation from the current ID
        model$y.true[newID,] <<- model$y.true[newID,] + model$y.obs[l,] # add this observation to the new ID
        model$ID[l] <<- newID
      }
    }
    model$calculate(calcNodes) ## This updates the log probabilities in the model
    nimCopy(model, mvSaved, nodes = calcNodes, row = 1, logProb = TRUE) ## This updates the saved states of the model
  },
  methods = list( reset = function() {} ) ## required by sampler_BASE
)

Once you have used "model <- nimbleModel(...)", you would do as follows

MCMCconf <- configureMCMC(model)
MCMCconf$printSamplers() ## inspect default samplers
MCMCconf$removeSamplers("ID") ## remove default samplers for ID
MCMCconf$addSampler(type = ID_sampler, target = "ID", control = list(nSwap = 10)) ## add your new sampler
MCMCconf$printSamplers() ## inspect modified samplers
MCMC <- buildMCMC(MCMCconf) ## build the MCMC from the configuration
MCMC$run(niter = 10) ## debug by running it uncompiled

## When you are happy with it, you can skip uncompiled testing and compile and run
Cmodel <- compileNimble(model)
CMCMC <- compileNimble(MCMC, project = model) ## the compiled MCMC will use the compiled model
CMCMC$run(niter) ## niter = your desired number of iterations

I don't immediately see why a Gibbs update with no model probability calculations makes sense, but that's for you to be clear on.

If you get it working, if nSwap is small, it could be more efficient to calculate and copy only the swapped nodes and their dependencies, but I've used all nodes for now.

HTH

-Perry

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/nimble-users/01187de8-4cfb-447f-bf84-4dcaf706f355%40googlegroups.com.

Ben Augustine

Nov 19, 2019, 10:33:32 AM
to nimble-users
Much appreciated, Perry!

Before you responded, Nathan Hostetter helped me out and I believe we got the basic unmarked SCR sampler working in NIMBLE. This sampler proposes all y[1:M,j] at once, but I should be able to figure out how to propose them one at a time at each trap. I think I'm on track now, but will come back if I encounter more questions.

Ben


Basic model:

####NIMBLE model
NimModel <- nimbleCode({

  lam0~dunif(0,10)       
  sigma~dunif(0,10)     
  psi~dunif(0,1)

  for(i in 1:M) {
    z[i] ~ dbern(psi)
    s[i,1] ~ dunif(xlim[1],xlim[2])
    s[i,2] ~ dunif(ylim[1],ylim[2])
    lam[i,1:J] <- GetDetectionRate(s = s[i,1:2], X = X[1:J,1:2], J=J, sigma=sigma, lam0=lam0, z=z[i])
    for(j in 1:J) {
      y[i,j] ~ dpois(K*lam[i,j])
    }#j
   }#i
  n <- Getn(y=y[1:M,1:J],z=z[1:M],M=M,J=J)
  N <- sum(z[1:M])

})#model

Alternate sampler for y:

ySampler <- nimbleFunction(

 contains = sampler_BASE,
 setup = function(model, mvSaved, target, control) {
   # pull constants from the control list
   n <- control$n
   j <- control$j
   M <- control$M
   calcNodes <- model$getDependencies(target)
 },

 run = function() {
     lam.curr <- model$lam[1:M,j] #individual by trap expected counts
     #Sample y[1:M,j] by reassigning n[j] using full conditional
     fullcond <- lam.curr[1:M]/sum(lam.curr[1:M])
     model$y[1:M,j] <<- rmulti(1, n, fullcond[1:M])
     # proposal model logProb
     model_lp_proposed <- model$calculate(calcNodes)
     copy(from = model, to = mvSaved, row = 1, nodes = calcNodes, logProb = TRUE)
 },
 methods = list( reset = function () {} )
)

Ben Augustine

Nov 19, 2019, 6:56:37 PM
to nimble-users
I'm now updating the latent capture history one observation at a time (necessary to add partial identity covariates later) and have 3 questions.

1) Is there a way to not call the custom capture history update for traps with no samples? Currently, I'm aborting inside the custom update via the if(length(idx) > 0) check in the sampler below. Can I just remove the sampler that updates the capture histories for those traps and not supply a new one?

2) I don't know how to correctly supply "ID" as data with an init when ID is not referenced in the main NIMBLE model, but only in a custom MCMC update. I am able to get it to work by saying ID is required in a custom function where I don't in fact use it (the unused ID argument to GetDetectionRate in the model code below). ID is used and updated in the custom capture history update. It is just a derived parameter describing how the observed data are currently allocated to individuals. There is nothing I can say about it in the main NIMBLE model.

3) Is it possible to do multiple MH updates in a loop inside one call to a custom update? I am currently doing multiple Gibbs updates in the loop over idx below, but they don't require the likelihood to be updated before the next ID update.

Main model code.

####NIMBLE model
NimModel <- nimbleCode({

  lam0~dunif(0,10)       
  sigma~dunif(0,10)     
  psi~dunif(0,1)

  for(i in 1:M) {
    z[i] ~ dbern(psi)
    s[i,1] ~ dunif(xlim[1],xlim[2])
    s[i,2] ~ dunif(ylim[1],ylim[2])
    lam[i,1:J] <- GetDetectionRate(s = s[i,1:2], X = X[1:J,1:2], J=J,
                                   sigma=sigma, lam0=lam0, z=z[i],ID=ID[1:n.samp])

    for(j in 1:J) {
      y[i,j] ~ dpois(K*lam[i,j])
    }#j
   }#i
  ncap <- Getncap(y=y[1:M,1:J],z=z[1:M],M=M,J=J)

  N <- sum(z[1:M])
  # for(l in 1:n.samp){
  #   ID[l]~dunif(0,M)
  # }
 

})#model


Custom capture history update:

## sampler to update the identities of the samples at trap j, one observation at a time
IDSampler <- nimbleFunction(

 contains = sampler_BASE,
 setup = function(model, mvSaved, target, control) {
   # Defined stuff
   y.obs<-control$y.obs

   j<-control$j
   M<-control$M
   calcNodes <- model$getDependencies(target)
 },

 run = function() {
   lam.curr <- model$lam[1:M,j] #individual by trap expected counts
   fullcond <- lam.curr[1:M]/sum(lam.curr[1:M])
   idx <- which(y.obs == j) # which samples are at this trap?
   if(length(idx)>0){
     for(i in 1:length(idx)){ #update them one by one
       ID.curr <- model$ID[idx[i]]
       ID.prop <- rcat(1,fullcond[1:M])
       #subtract out old ID obs
       model$y[ID.curr,j] <<- model$y[ID.curr,j]-1
       #add in new ID obs
       model$y[ID.prop,j] <<- model$y[ID.prop,j]+1
       #swap in new ID
       model$ID[idx[i]] <<- ID.prop
     }
     # calculate logProb after Gibbs updates

     model_lp_proposed <- model$calculate(calcNodes)
     copy(from = model, to = mvSaved, row = 1, nodes = calcNodes, logProb = TRUE)
   }
 },
 methods = list( reset = function () {} )
)


Perry de Valpine

Nov 19, 2019, 7:23:19 PM
to Ben Augustine, nimble-users
I may not follow everything but here goes.

On Tue, Nov 19, 2019 at 3:56 PM Ben Augustine <baug...@vt.edu> wrote:
I'm now updating the latent capture history one observation at a time (necessary to add partial identity covariates later) and have 3 questions.

1) Is there a way to not call the custom capture history update for traps with no samples? Currently, I'm aborting inside the custom update via the if(length(idx) > 0) check in the sampler below. Can I just remove the sampler that updates the capture histories for those traps and not supply a new one?


It looks right now (following my initial reply, where I was adapting your R code) that you have a single sampler responsible for all the custom updates.  In that case, doing some kind of check to avoid unnecessary effort, as you do, or passing in a vector of indices needing sampling effort, is the solution I see.

Another way you could go, building from the basic idea you now have working, would be to modify the custom sampler to do just one index and have a separate copy of the sampler for each index that needs sampling.  That is similar to what happens with our standard samplers; there is one sampler object for each node to be sampled.  To create that, you would use "addSampler" for indices that need sampling (and not others), i.e. for traps with samples.  A convenient way to get node names for adding (or removing, using removeSampler), is model$expandNodeNames("lam"), for example.
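For example, the per-trap approach might be set up something like this (a sketch only; it assumes IDSampler has been modified to sample a single trap given by control$j, and trapsWithSamples is a hypothetical vector of the trap indices that have at least one sample):

```r
## Sketch with hypothetical names: one sampler instance per trap with samples.
## Here y.obs.matrix is assumed to be the n.obs x J observation matrix.
trapsWithSamples <- which(colSums(y.obs.matrix) > 0)  # traps with >= 1 sample

MCMCconf <- configureMCMC(model)
for(j in trapsWithSamples){
  MCMCconf$addSampler(type = IDSampler,
                      target = "y",  # or whichever node(s) the sampler is responsible for
                      control = list(j = j, M = M, y.obs = y.obs))
}
MCMCconf$printSamplers()  # should show one IDSampler per trap with samples
```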

2) I don't know how to correctly supply "ID" as data with an init when ID is not referenced in the main NIMBLE model, but only in a custom MCMC update. I am able to get it to work by saying ID is required in a custom function where I don't in fact use it (the unused ID argument to GetDetectionRate in the model code below). ID is used and updated in the custom capture history update. It is just a derived parameter describing how the observed data are currently allocated to individuals. There is nothing I can say about it in the main NIMBLE model.

One way to think of the model is as a data structure that caches calculations.  I think you are saying you need ID to be in that data structure, but you don't need it to do anything meaningful in the model graph.  In that case, just get it declared somewhere.  Putting it artificially on the right-hand side of a declaration as you've done will do the trick.  An alternative would be to uncomment the lines with the uniform prior.  That will get it into the model, and the uniform prior, even if it is ever calculated, will not influence anything.
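Concretely, the second option just means uncommenting the lines already in the model code above:

```r
## A throwaway prior whose only purpose is to declare ID as a model variable,
## so that inits and monitors can reference it. The custom sampler overwrites
## ID directly, so this prior never influences the results.
for(l in 1:n.samp){
  ID[l] ~ dunif(0, M)
}
```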
 

3) Is it possible to do multiple MH updates in a loop inside one call to a custom update? I am currently doing multiple Gibbs updates in the loop over idx below, but they don't require the likelihood to be updated before the next ID update.

You can do anything valid that you want.  The rules (documented in the manual) are that, upon entry, you can expect that both the model and mvSaved are in the same state, which is up to date.  Upon exit, your sampler must leave both the model and mvSaved in the same state, again up to date.  If you implement multiple MH updates internal to one "sampler", that is fine as long as you know what you are doing.  It is (not recommended but) possible to violate the rules if you know your samplers will come in sequence in the sampler list (you can see the sampler order by MCMCconf$printSamplers();  you can access sampling assignment internals by MCMCconf$getSamplers();  you can modify the order of samplers by MCMCconf$setSamplers();  you can also modify this order before running the MCMC, but this gets into more internals so I'll stop for now).  In that case, as long as the last of your samplers leaves model and mvSaved in sync upon exit, before any other samplers you didn't write are called, you're good.  That is definitely *not* recommended, but I'm just saying it to highlight how the system works.
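As an illustration of those rules, a run() method doing several internal MH updates might be structured like this sketch (the proposal step is omitted; the essential point is that model and mvSaved are back in the same, up-to-date state before run() returns):

```r
## Sketch of a run() with several internal MH updates obeying the
## model/mvSaved contract. calcNodes and nUpdates come from setup code.
run = function() {
  for(k in 1:nUpdates) {
    lp_current <- model$getLogProb(calcNodes)
    ## ... write a proposal into the model here ...
    lp_proposed <- model$calculate(calcNodes)
    if(runif(1, 0, 1) < exp(lp_proposed - lp_current)) {
      ## accept: copy the model state into mvSaved
      nimCopy(from = model, to = mvSaved, nodes = calcNodes, row = 1, logProb = TRUE)
    } else {
      ## reject: restore the model from mvSaved
      nimCopy(from = mvSaved, to = model, nodes = calcNodes, row = 1, logProb = TRUE)
    }
  }
  ## On exit, model and mvSaved are in sync, as required.
}
```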
 
The User Manual is long but should have a lot of these details.


Ben Augustine

Nov 21, 2019, 10:42:31 AM
to Perry de Valpine, nimble-users
Much appreciated, Perry! I think I know enough now to stop bugging you!

Ben

Ben Augustine

Nov 26, 2019, 3:16:16 PM
to nimble-users
OK, I've made a lot of progress and I'm almost done transferring a progressively more complex set of spatial partial identity models to NIMBLE: unmarked SCR -> categorical SPIM -> categorical SPIM with observation error in category levels -> genotype SPIM.

One (hopefully) last thing I'm unsure about is how to handle a specific set of parameters that I house in a list in my R-based sampler. I transferred another set of parameters from a list structure to a ragged matrix for use in NIMBLE, which works, but if I want to record the posteriors, it also records the matrix indices that are not used, which isn't ideal. The set of parameters is a list of matrices that could be of length over 100 for specific applications. The NIMBLE documentation seems to suggest that the only way to access list elements is to subset by name, e.g., pi$locus1, pi$locus2, etc. This will be very tedious if there are 100 or so list elements.

So my question is, is there a workaround where I can index lists by their numeric index?

Thanks.

Ben


Perry de Valpine

Nov 26, 2019, 10:03:36 PM
to Ben Augustine, nimble-users
Can you clarify a bit more?  Do you need these matrices in the model, or in a nimbleFunction?  It sounded like you want them in a model (since you mentioned recording them as part of the posterior), but your last comment about accessing list elements makes it sound like they are in a nimbleFunction, because there wouldn't be any list elements in a model.  Do you need to use them as a matrix per se, e.g. as a covariance matrix, or would it work to organize them as a vector?  Is the issue that in the list of over 100 matrices, the matrices differ in dimension?  Or is the issue that each one of them is ragged?  Sometimes a good strategy is to organize a data structure by concatenation, so you'd have myManyElements[ start[i]:end[i] ], where start[i] and end[i] give the first and last indices of the section of myManyElements that corresponds to unit i.  I can't see if that would work or not in your case from the information provided.
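The concatenation strategy might look like the following sketch in plain R (hypothetical names; lens[i] is the length of unit i's block):

```r
## Flatten a ragged collection of parameter blocks into one long vector,
## with start[i]:end[i] giving the slice that belongs to unit i.
lens  <- c(3, 5, 2)                  # e.g. number of levels for each of 3 loci
end   <- cumsum(lens)                # 3, 8, 10
start <- end - lens + 1              # 1, 4, 9
myManyElements <- numeric(sum(lens)) # one vector holding all blocks

## Unit i's parameters are then myManyElements[start[i]:end[i]],
## and the same indexing works inside model code or a nimbleFunction.
```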

A simple option, if the ragged-array implementation works but is simply very large, is to use monitors2 to record those nodes with a high thin2 interval.

A somewhat more complicated approach, which would take some indexing gymnastics, would be if there are a small set of sizes needed, e.g. if some of the arrays are 2x2, some 3x3, etc.  Then with some nested indexing you could set up my2by2arrays[i, j, k], my3x3arrays[i, j, k], etc.  Obviously that wouldn't work well if there are many sizes needed.

-Perry


Ben Augustine

Nov 27, 2019, 10:52:09 AM
to nimble-users
Thanks Perry. I'll refrain from elaborating more now because a full description is tedious and probably not worth your time to wade through--it looks like the answer is no, there is no way to index lists by an integer. The easier option between concatenation and a ragged array for this problem is to use a ragged array. I can make that work and I'll look into using the thin2 option. The model will work, but it won't be pretty!

Ben

Ben Augustine

Dec 1, 2019, 4:32:42 PM
to nimble-users
OK, one more problem. I believe I'm running into the issue referenced here:

https://github.com/nimble-dev/nimble/issues/640

I want to use a 3D array for a parameter and index it to send into a custom distribution. Can this currently be done? This is the code in question:

# #Model for observation error.
  for(m in 1:nsamp){
    for(l in 1:ncat){
      for(obs in 1:nrep){
          G_obs[m,l,obs]~dmycat(pOE = pOE[l,1:nlevels[l],G_true[ID[m],l]],
                                          G_obs_seen = G_obs_seen[m,l,obs])
      }
    }
  }

returning:

 Cannot do math with arrays that have more than 2 dimensions.
 This occurred for: pOE
 This was part of the call:  nimNonseqIndexedd(pOE,l,G_obs[idx[l], l2, obs],nimNonseqIndexedd(G_true,whichpossible,l2))

Ben Augustine

Dec 1, 2019, 4:37:03 PM
to nimble-users
Actually, maybe it's coming from here inside a custom MCMC update. It worked when pOE was 2D.

 Gprobs <- rep(1, M)
 for(l2 in 1:ncat){
   for(obs in 1:nrep){
     if(G_obs_seen[idx[l],l2,obs]){ # bypass missing values
       Gprobs[whichpossible] <- Gprobs[whichpossible]*pOE[l,G_obs[idx[l],l2,obs],G_true[whichpossible,l2]]
     }
   }
 }

Ben Augustine

Dec 1, 2019, 4:44:30 PM
to nimble-users
Solved by using a for loop to index the non-consecutive set of indices one at a time. Sorry for the premature posting.
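For anyone hitting the same error: the fix is to replace the single non-sequentially-indexed assignment with an explicit loop over those indices, e.g. this sketch of the snippet above (g stands in for G_obs[idx[l],l2,obs]):

```r
## Instead of indexing with the non-consecutive vector whichpossible at once,
## loop over its elements one at a time.
for(w in 1:length(whichpossible)){
  i2 <- whichpossible[w]
  Gprobs[i2] <- Gprobs[i2] * pOE[l, g, G_true[i2, l2]]
}
```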

Perry de Valpine

Dec 2, 2019, 12:16:12 PM
to Ben Augustine, nimble-users
Catching up a bit post-Thanksgiving-holiday.

It sounds like you moved ahead with a solution, but just in case it is helpful:

In a nimbleFunction, you can make an integer-indexed list by creating a nimbleFunctionList in setup code whose elements are nimbleFunctions whose only purpose may be to serve as data structures.
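A sketch of that nimbleFunctionList idea (hedged; the container names here are hypothetical and exact usage may differ):

```r
library(nimble)

## A virtual base class that the containers share
containerBase <- nimbleFunctionVirtual()

## A trivial nimbleFunction whose only job is to hold a matrix in setup
matrixContainer <- nimbleFunction(
  contains = containerBase,
  setup = function(mat) {},  # mat is captured and stored by setup
  run = function() {}
)

## In another nimbleFunction's setup code:
##   piList <- nimbleFunctionList(containerBase)
##   for(i in 1:nLoci) piList[[i]] <- matrixContainer(piMatrices[[i]])
## giving an integer-indexed "list" whose i-th matrix is piList[[i]]$mat.
```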

In a model, there are no lists of any kind.

-Perry


Perry de Valpine

Dec 2, 2019, 12:17:14 PM
to Ben Augustine, nimble-users
No problem.  Glad to hear you worked it out.
