Block sampling of categorical nodes in MCMC

176 views
Skip to first unread message

roger pradel

unread,
Sep 20, 2021, 12:36:47 PM9/20/21
to nimble-users
Hi everybody,

In an analysis of capture recapture data, I have noticed that some nodes would not update. I have now identified the problem but wonder what is the best way to solve it in practice and seek advice. This is a general problem, so I hope someone has the answer.

The core of the problem is that I have categorical nodes with 3 ordered possible states like this:
From state 1 (S1), one may move to state 2 (S2) with some probability, or remain in S1
From S2, one moves SYSTEMATICALLY to S3 (probability 1)
S3 is absorbing

In my case, the states are 'alive', 'dead recoverable', 'long dead'

At any time, an individual may be in any of the 3 states. However, if death occurs at some point in time, the state sequence will be

1112333

and this cannot be updated by separate categorical samplers (default configuration).

This is because one node cannot be changed without changing a neighboring node.

The solution I see is to sample 2 successive nodes simultaneously with overlap: (sampling node 1 and node 2), (sampling 2 and 3),... Then, the states will be able to move.

Has someone done that before and can explain me the best way to implement it? Or is there a better solution?

I know that another solution could be to implement the distribution dHMM for capture recapture histories found in the nimbleEcology package, but before doing that, which means quite a lot of work on my code, I'd rather do something more immediate!

Thanks for any suggestion!

Have a good day further!

Roger

Chris Paciorek

unread,
Sep 28, 2021, 8:18:42 PM9/28/21
to roger pradel, nimble-users
Hi Roger,

Indeed, if you have such dependence between nodes, you generally need to set up a joint sampler. At a high level, you can do this in nimble using user-defined samplers. We have some info on this in section 15.5 of our user manual. We also presented on this in a workshop in May 2020. You can see the content here: https://github.com/nimble-training/nimble-virtual-2020/tree/master/content/6_writing_new_mcmc_samplers.

I don't have any specific pointers on how to correctly set up this specific sort of sampler, unfortunately.

-chris

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/nimble-users/2dcb3b24-0867-4fae-9242-5d8c372df79bn%40googlegroups.com.

roger pradel

unread,
Oct 1, 2021, 7:33:16 AM10/1/21
to nimble-users
Hi Christopher,

Thanks for your advice. I also exchanged off-list with Daniel and Perry and I have now written my own sampler. I provide it below in case it might be of some use to somebody else. It worked perfectly for me.

# change default categorical samplers for survival state s
# to own samplers of a 2-occasion sliding window of s states

# moving-window categorical sampler
# (see Manual p188 for guidance on writing samplers)
# (this is adapted from the standard categorical sampler code)
MW_cat <- nimbleFunction(
  contains = sampler_BASE,
  setup = function(model, mvSaved, target, control) {
    calcNodes <- model$getDependencies(target)
    calcNodesNoSelf <- model$getDependencies(target, self = FALSE)
    isStochCalcNodesNoSelf <- model$isStoch(calcNodesNoSelf)
    calcNodesNoSelfDeterm <- calcNodesNoSelf[!isStochCalcNodesNoSelf]
    calcNodesNoSelfStoch <- calcNodesNoSelf[isStochCalcNodesNoSelf]
    # 4 below is specific to current problem; in general this should be ns^window_length
    k <- 4
    probs <- numeric(k)
    logProbs <- numeric(k)
    # 'double' required for compilation
    validstatepairs <- nimMatrix(c(1,1,2,3,1,2,3,3),nrow = 4,ncol = 2,type = 'double')
  },
  run = function() {
    currentValue <- model[[target]]
    if(currentValue[1] == 3 & currentValue[2] ==3) currentCase <- 4 else currentCase <- currentValue[2]
    logProbs[currentCase] <<- model$getLogProb(calcNodes)
    for (i in 1:k) {
      if (i != currentCase) {
        model[[target]] <<- validstatepairs[i,1:2]
        logProbs[i] <<- model$calculate(calcNodes)
        if (is.nan(logProbs[i]))
          logProbs[i] <<- -Inf
      }
    }
    logProbs <<- logProbs - max(logProbs)
    probs <<- exp(logProbs)
    newCase <- rcat(1, probs)
    if (newCase != currentCase) {
      model[[target]] <<- validstatepairs[newCase,1:2]
      model$calculate(calcNodes)
      nimCopy(from = model, to = mvSaved, row = 1, nodes = target,
              logProb = TRUE)
      nimCopy(from = model, to = mvSaved, row = 1, nodes = calcNodesNoSelfDeterm,
              logProb = FALSE)
      nimCopy(from = model, to = mvSaved, row = 1, nodes = calcNodesNoSelfStoch,
              logProbOnly = TRUE)
    }
    else {
      nimCopy(from = mvSaved, to = model, row = 1, nodes = target,
              logProb = TRUE)
      nimCopy(from = mvSaved, to = model, row = 1, nodes = calcNodesNoSelfDeterm,
              logProb = FALSE)
      nimCopy(from = mvSaved, to = model, row = 1, nodes = calcNodesNoSelfStoch,
              logProbOnly = TRUE)
    }
  },
  methods = list( reset = function () {} )
)
Note that this sampler is specialized to my particular problem as it examines only the relevant combinations of successive states (validstatepairs in code)

Roger
Reply all
Reply to author
Forward
0 new messages