Good day everyone,
I've just submitted a pre-print, "Efficient Bayesian implementations of capture-recapture models with Stan", to EcoEvoRxiv, available here.
The manuscript and accompanying GitHub repo cover a range of open capture-recapture models: single-survey and robust designs, multistate and multievent, for both Cormack-Jolly-Seber and Jolly-Seber models. The Jolly-Seber variants introduce a new method to account for unequal survey intervals in the entry process, where survey interval lengths are treated as offsets in Dirichlet or logistic-normal entry probabilities.
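As a rough illustration of the offset idea (a minimal Python sketch, not the Stan code from the repo): assume the interval lengths enter on the log scale inside a softmax, so that with equal log entry rates the entry probabilities are proportional to the interval lengths. The names `entry_probs`, `eta` and `intervals` are hypothetical, and giving the first survey a unit interval is an assumption made only for this example.

```python
import math

def entry_probs(eta, intervals):
    """Softmax of eta[t] + log(intervals[t]) (hypothetical helper).

    eta: unconstrained log entry rates, one per survey.
    intervals: time elapsed before each survey (assumed 1.0 for the first).
    """
    scores = [e + math.log(tau) for e, tau in zip(eta, intervals)]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Equal rates: entry probabilities are proportional to the interval lengths.
beta = entry_probs([0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 2.0, 4.0])
```

With equal `eta`, a survey preceded by an interval twice as long receives twice the entry probability, which is the behaviour an offset is meant to produce.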
Jolly-Seber models have accompanying `js_*_rng()` functions that return population sizes (N), the numbers of entries to (B) and exits from (D) the population, and the super-population size, using the forward-backward sampling algorithm. A primary motivation for writing this manuscript was that the BPA translation does not recover latent states properly, but instead simulates them from the posterior predictive distribution.
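To make the derived quantities concrete, here is a minimal Python sketch of how N, B, D and the super-population size could be tallied from one sampled matrix of latent states. It does not reproduce the `js_*_rng()` functions or the forward-backward sampler itself; the state coding (0 = not yet entered, 1 = alive, 2 = dead) and the name `derived_quantities` are assumptions for illustration.

```python
def derived_quantities(z):
    """Tally N, B, D and super-population size from latent states.

    z[i][t] is the state of individual i at survey t, coded (by assumption)
    as 0 = not yet entered, 1 = alive, 2 = dead.
    """
    T = len(z[0])
    # N[t]: individuals alive at survey t
    N = [sum(1 for row in z if row[t] == 1) for t in range(T)]
    # B[t]: individuals alive at t that had not entered at t - 1
    B = [sum(1 for row in z if row[t] == 1 and (t == 0 or row[t - 1] == 0))
         for t in range(T)]
    # D[t]: individuals alive at t and dead at t + 1
    D = [sum(1 for row in z if row[t] == 1 and row[t + 1] == 2)
         for t in range(T - 1)]
    # Super-population: individuals that were ever alive
    N_super = sum(1 for row in z if 1 in row)
    return N, B, D, N_super

z = [
    [1, 1, 2, 2],  # present at the first survey, dies before the third
    [0, 1, 1, 1],  # enters at the second survey
    [0, 0, 1, 2],  # enters at the third survey, dies before the fourth
    [0, 0, 0, 0],  # augmented individual that never enters
]
N, B, D, N_super = derived_quantities(z)
```

Under this coding the counts satisfy N[t + 1] = N[t] - D[t] + B[t + 1], which is a handy sanity check on any sampled state matrix.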
The likelihood function for each model is overloaded with two variants:
1. Parameters varying by survey only, which offers major computational savings. (This variant would still be used with an intercepts-only model.)
2. Parameters varying by both survey and individual, accommodating individual-level heterogeneity in time-varying parameters.
For variant (1) with Jolly-Seber models (which use data augmentation), the likelihood need only be computed once for an "all-0" augmented capture history and then reused for every augmented individual. For variant (2), individual-by-survey varying parameters are usually imputed for all augmented individuals, which is computationally demanding. For that reason, there is a "hybrid" version for variant (2) in Jolly-Seber models, where intercepts are used for the augmented individuals, requiring only a single likelihood computation for the augmented individuals while allowing time-varying parameters for all observed individuals.
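The saving in variant (1) can be sketched in a few lines of Python. This is an illustration under assumptions, not the repo's Stan code: it uses a three-state hidden Markov model (not yet entered, alive, dead) with conditional entry probabilities `gamma`, survival probabilities `phi` and detection probabilities `p`, all varying by survey only. The function name `all_zero_loglik` and the parameter values are made up for the example.

```python
import math

def all_zero_loglik(gamma, phi, p):
    """Log-probability of an all-zero capture history (forward algorithm).

    gamma[t]: probability of entering at survey t, given not yet entered.
    phi[t]:   survival probability from survey t to t + 1.
    p[t]:     detection probability at survey t.
    """
    not_entered = 1.0 - gamma[0]
    alive = gamma[0] * (1.0 - p[0])  # entered but missed at the first survey
    dead = 0.0
    for t in range(1, len(p)):
        # All updates use the previous survey's forward probabilities.
        new_alive = (not_entered * gamma[t] + alive * phi[t - 1]) * (1.0 - p[t])
        dead = dead + alive * (1.0 - phi[t - 1])
        not_entered = not_entered * (1.0 - gamma[t])
        alive = new_alive
    return math.log(not_entered + alive + dead)

gamma = [0.3, 0.2, 0.2, 0.5]
phi = [0.8, 0.7, 0.9]
p = [0.4, 0.5, 0.3, 0.6]
n_aug = 200  # number of augmented (never-detected) individuals

# One forward pass, scaled by the number of augmented individuals ...
ll_once = n_aug * all_zero_loglik(gamma, phi, p)
# ... matches the sum of one forward pass per augmented individual.
ll_loop = sum(all_zero_loglik(gamma, phi, p) for _ in range(n_aug))
```

Because every augmented individual shares the same history and the same parameters, the per-individual loop does identical work `n_aug` times, which is what the single computation avoids.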
All model variants were tested with simulation-based calibration (SBC) to ensure the Stan programs correctly recover the data-generating process. Results and the simulation script used are available in the `sbc` folder of the repo.
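For anyone unfamiliar with SBC, the core loop looks roughly like the Python sketch below. It is a toy stand-in, not the script in the `sbc` folder: it assumes a beta-binomial model whose posterior is known in closed form, so no MCMC is involved, and the name `sbc_ranks` and all settings are hypothetical.

```python
import random

def sbc_ranks(n_sims=500, n_trials=20, n_draws=19, seed=1):
    """Rank statistics for a toy beta-binomial model (hypothetical example)."""
    rng = random.Random(seed)
    ranks = []
    for _ in range(n_sims):
        theta = rng.betavariate(1.0, 1.0)  # draw the parameter from the prior
        # Simulate binomial data given that parameter
        y = sum(1 for _ in range(n_trials) if rng.random() < theta)
        # Exact posterior draws; a real check would use the fitted model's draws
        post = [rng.betavariate(1.0 + y, 1.0 + n_trials - y)
                for _ in range(n_draws)]
        # Rank of the prior draw among the posterior draws
        ranks.append(sum(1 for d in post if d < theta))
    return ranks

ranks = sbc_ranks()
```

If the inference is correct, the ranks are uniform on 0 to `n_draws`, so a histogram of `ranks` should look flat; systematic shapes point to bias or miscalibrated uncertainty.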
Cheers,
Matt