Hi everyone,
I am working on a series of single-species binomial N-mixture models for birds, using the R package ubms, to link abundance to a set of covariates describing forest structure and management. The data cover 6 years of point count surveys at 135 plots, with up to 3 visits to each plot each year (but quite often fewer than that). My focus is on the effects of the covariates and not on how abundance varies across years, so I am using the “stacked” approach (https://kenkellner.com/ubms/articles/random-effects.html) and including a random effect of year.
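To make the data layout concrete, here is a minimal sketch of how I understand the stacked format: each plot-year combination becomes its own row, with one column per visit and missing visits left as NaN. It is written in Python with made-up records purely for illustration (the actual models are fitted in R with ubms), and the names `records` and `stack_counts` are hypothetical.

```python
import math

# Hypothetical long-format records: (plot, year, visit, count)
records = [
    ("P001", 2018, 1, 2),
    ("P001", 2018, 2, 0),
    ("P001", 2019, 1, 1),
    ("P002", 2018, 1, 3),
    ("P002", 2018, 2, 1),
    ("P002", 2018, 3, 2),
]
MAX_VISITS = 3


def stack_counts(records, max_visits=MAX_VISITS):
    """Return one row per plot-year: (plot, year, [count per visit]).

    Visits that were not carried out stay as NaN.
    """
    rows = {}
    for plot, year, visit, count in records:
        key = (plot, year)
        if key not in rows:
            rows[key] = [math.nan] * max_visits
        rows[key][visit - 1] = count
    # Sort by plot, then year, so the row order is reproducible
    return [(plot, year, counts) for (plot, year), counts in sorted(rows.items())]


stacked = stack_counts(records)
for plot, year, counts in stacked:
    print(plot, year, counts)
```

Year is then kept as a row-level covariate so that it can enter the abundance model as a random effect.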
I managed to get all models running and converging, but on examining the diagnostics some issues came up:
- The posterior predictive p-value for many of the species is either very close to 1 or very close to 0. As far as I understand, the first case (close to 1) may reflect underdispersed residuals and overfitting, but I tried running a much simpler model for one of the species, containing only the most important covariates, and the situation remained the same.
- The second case (close to 0) should reflect overdispersed residuals, but I am surprised that this happens in a model that already contains a random effect. I tried including an observation-level random effect to absorb the extra dispersion, but I could not get the model to converge.
- For most species, the apparent detection probability (maximum observed count divided by estimated abundance) is suspiciously low (around 10% or below), and abundance estimates look a bit too high, though not crazily so. Besides, if I use the function plot_residuals for the detection part of the model, the residuals show a rising trend. I tried changing the priors of the detection model to “push” the intercept away from extreme values, but this barely had any effect.
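In case my definitions are part of the problem, here is a minimal sketch of the two quantities mentioned in the list above, as I understand them. It is plain Python with made-up numbers, not ubms code, and the function names are hypothetical. I am assuming the posterior predictive p-value is the proportion of posterior draws in which the fit statistic of the simulated data exceeds that of the observed data.

```python
def bayesian_p_value(observed_stats, simulated_stats):
    """Proportion of draws whose simulated fit statistic exceeds the observed one.

    Values near 0 mean the observed data are more variable than the model
    predicts; values near 1 mean they are less variable.
    """
    pairs = list(zip(observed_stats, simulated_stats))
    return sum(sim > obs for obs, sim in pairs) / len(pairs)


def apparent_detection(max_counts, estimated_abundance):
    """Maximum observed count divided by estimated abundance, per site.

    Sites with an estimated abundance of 0 are skipped to avoid dividing by zero.
    """
    return [c / n for c, n in zip(max_counts, estimated_abundance) if n > 0]


# Made-up fit statistics for four posterior draws
obs_stats = [10.0, 12.0, 11.0, 13.0]
sim_stats = [8.0, 9.0, 12.5, 10.0]
p_value = bayesian_p_value(obs_stats, sim_stats)

# Made-up maximum counts and abundance estimates for three sites
ratios = apparent_detection([2, 1, 3], [20.0, 10.0, 30.0])

print(p_value)
print(ratios)
```

If either definition differs from what the package actually reports, that alone might explain part of what I am seeing.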
After tinkering with the models for a while, I am a bit lost as to how to address these issues. I am also relatively new to hierarchical models and Bayesian approaches. If this were a regular GLM, I would have gone for a quasipoisson (in case of underdispersion) or negative binomial (in case of overdispersion) model. I’d be very thankful for any insights or suggestions!