Convergence and sampling diagnostics in multi-species latent factor spatial model


Aaron Skinner

Feb 6, 2026, 12:43:42 AM
to spOccupancy and spAbundance users

Hi Jeff and all, 

I am running a multi-species latent factor spatial model with the 100 most abundant species in my system (more info on study system & goals). Overall the models seem to be running decently well, but there are several parameters that have Rhats greater than 1.1 or effective sample sizes < 100 (about 10% total). I've attached a Word document with some screenshots of model output, figures, tables, and the model specification. 

The parameters that are not fitting well are mostly the intercepts for both alpha and beta, as well as several of the beta estimates. And unfortunately, the community level parameters also don't seem to be sampling very well. Furthermore, the trace plots on the factor loadings show poor convergence for many species.

So I have a few questions about what I might do to improve convergence, sampling efficiency, and GOF. You have at least three suggestions in your 'convergence issues vignette': 1) considering the order of the species, 2) using more factors, or 3) fitting one chain. I’m curious whether you think any of these might be particularly helpful in my case, and how you would approach them in practice.

1) Species order: In the vignette you recommend ordering species based on underlying biology (e.g., functional guilds). How should goodness of fit factor into this decision? My goal is inference on the full bird community, so I’d like biologically meaningful structure to be represented. However, ordering more widespread species (those observed at more sites and across more ecoregions) first consistently improves WAIC. I recognize WAIC reflects predictive performance, but it also loosely tracks model fit (ideally I’d rely on PPCs, though I’m currently limited by RAM). With respect to the recommendation from Carvalho et al. (2008) discussed in the convergence vignette, I’ve included code and output in the Word document—does my interpretation look correct?

2) Number of latent factors: Would you prioritize goodness of fit here, biological interpretability, or some balance of the two? Given that my main interest is community response to land-use change, should factor structure reflect functional guilds, shared responses to disturbance, or something else?

Finally, outside of the three suggested strategies, I’m wondering whether tuning parameters or initial values might help improve sampling efficiency. Acceptance rates are close to the target (~0.43), and tuning values stabilize above 2 by the first report (n.report = 100). I’ve included example output in the document.

Thanks very much for any insight you all might have.
Aaron, PhD Candidate, UBC

Convergence sampling diagnostics spOccupancy.docx

Jeffrey Doser

Feb 17, 2026, 4:01:57 PM
to spOccupancy and spAbundance users
Hi Aaron, 

Sorry for the delay. Have you been able to successfully fit a model with msPGOcc()? This is what I would first suggest doing as this model is substantially simpler and would let you better narrow down how/if you can achieve convergence of the model you currently have specified. Here are some other thoughts: 
  • It's not clear to me how long you've run the model for based on the document you shared (n.batch = 200 it seems, but I don't know what you set batch.length to, and the total number of iterations per chain is n.batch × batch.length). These models can require hundreds of thousands of MCMC iterations to achieve convergence, so if you're running it for far fewer than 100,000 iterations I'm not too surprised it hasn't converged.
  • I may be misinterpreting the density plot of the factor loadings that you show in the Word document, but is it showing the density of the actual values of the factor loadings? If so, there is certainly an identifiability problem in the model, since the values of the factor loadings span an extremely wide range (-5000 to 5000). I'm not exactly sure what the plot is showing though.
  • You should be cautious in interpreting WAIC for models that are far from convergence. 
  • I don't remember the exact approach Carvalho et al. recommend, but generally you should look at which species has the maximum mean value of a given factor loading, move that species to the corresponding position in the y-matrix (the species with the largest loading on the first factor goes first, and so on), and then do that for all the factors.
  • It might be worthwhile to try to fit a model with fewer factors just to see if you can achieve convergence of the model. 
  • These are complex models, so there is no single best approach for setting the number of latent factors. I would recommend choosing the number of factors on biological grounds, while also considering potential computational limitations. While you can more formally assess the number of factors you want in your model by doing a series of WAIC comparisons, that is probably not the best approach if you're really focused on inference. If you have way too many factors, you can look at the factor loadings and you would see that there are effectively no "significant" loadings for certain factors (the later factors). Alternatively, if you have way too few factors, you would likely see many significant factor loadings for all factors in your model, which may signal that there is additional residual spatial structure you're not accounting for.
  • Tuning parameters won't really have any influence if acceptance rates are close to 0.43 by the end of the MCMC chain. 
  • Initial values could help with convergence of the factor loadings. The factor loadings can be hard to identify, so sometimes the initial values need to be restricted more tightly. You can do this by manually specifying the initial values for lambda based on a preliminary run. Then, you would want to add a small amount of noise to those values across different MCMC chains so that the chains don't start at exactly the same values. To do that, you would need to run the chains separately by setting n.chains = 1 and running that script three different times with different initial values. Setting n.chains > 1 won't allow you to control the initial values of lambda (unless you fix them at exactly the same value, which I don't recommend).
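
The last point (start each separately run chain from preliminary lambda values plus a little noise) can be sketched as follows. This is a language-neutral illustration in plain Python, not spOccupancy code: the function name, the example matrix, and the noise level are all hypothetical, and it assumes the loadings matrix has ones on the diagonal and zeros above it, which should be checked against the package documentation before the values are passed to the model.

```python
import random

def jitter_loadings(prelim, sd=0.05, seed=0):
    """Return a noisy copy of a species-by-factor loadings matrix.

    Assumed constraint (verify against the package docs): entries above
    the diagonal stay 0 and diagonal entries stay 1; only the entries
    below the diagonal receive Gaussian noise.
    """
    rng = random.Random(seed)
    noisy = []
    for i, row in enumerate(prelim):
        new_row = []
        for j, value in enumerate(row):
            if j > i:
                new_row.append(0.0)      # upper triangle is kept at 0
            elif j == i:
                new_row.append(1.0)      # diagonal is kept at 1
            else:
                new_row.append(value + rng.gauss(0.0, sd))
        noisy.append(new_row)
    return noisy

# Hypothetical posterior means from a preliminary run: 4 species, 2 factors.
prelim_lambda = [
    [1.0, 0.0],
    [0.6, 1.0],
    [-0.3, 0.8],
    [0.2, -0.5],
]

# One set of starting values per chain. Each chain would then be run as
# its own model fit (n.chains = 1) using one of these matrices.
chain_inits = [jitter_loadings(prelim_lambda, sd=0.05, seed=s) for s in (1, 2, 3)]
```

The noise standard deviation is a judgment call: it should be large enough that the chains start at visibly different values, but small enough that they stay near the preliminary estimates.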
Jeff
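
The species-ordering rule described in the reply above (for each factor, find the species with the maximum mean loading and move it to that factor's position) can be sketched as follows. Again this is plain Python for illustration only, with a hypothetical function name and made-up numbers. Skipping a species that has already been placed, and leaving the remaining species in their original order, are assumptions added here rather than part of the reply or of Carvalho et al. (2008).

```python
def reorder_species(mean_loadings):
    """Return species indices in a new order.

    mean_loadings[i][k] is the posterior mean loading of species i on
    factor k. For factor k, the not-yet-placed species with the largest
    mean loading is moved to position k; all other species follow in
    their original order.
    """
    n_species = len(mean_loadings)
    n_factors = len(mean_loadings[0])
    lead = []
    for k in range(n_factors):
        candidates = [i for i in range(n_species) if i not in lead]
        best = max(candidates, key=lambda i: mean_loadings[i][k])
        lead.append(best)
    rest = [i for i in range(n_species) if i not in lead]
    return lead + rest

# Hypothetical posterior mean loadings: 4 species (rows), 2 factors (columns).
mean_loadings = [
    [0.2, 0.1],
    [1.4, -0.3],
    [0.5, 0.9],
    [-0.7, 0.4],
]

new_order = reorder_species(mean_loadings)
print(new_order)  # [1, 2, 0, 3]
```

The resulting index vector would be used to reorder the species dimension of the detection data before refitting. Because loadings estimated under one ordering depend on that ordering, the result is best treated as a starting point and checked after the refit.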
