Hi Jeff and all,
I am running a multi-species latent factor spatial model with the 100 most abundant species in my system (more info on study system & goals). Overall the models seem to be running decently well, but about 10% of the parameters have Rhat values greater than 1.1 or effective sample sizes below 100. I've attached a Word document with some screenshots of model output, figures, tables, and the model specification.
The parameters that are not converging well are mostly the intercepts for both alpha and beta, as well as several of the other beta estimates. Unfortunately, the community-level parameters also don't seem to be sampling very well, and the trace plots of the factor loadings show poor convergence for many species.
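As a rough illustration of the flagging rule I describe above (Rhat > 1.1 or effective sample size < 100), here is a minimal Python sketch. The parameter names and diagnostic values are made up for illustration and are not my actual model output.

```python
# Hypothetical diagnostics: (parameter name, Rhat, effective sample size).
# These names and numbers are illustrative assumptions, not real output.
diagnostics = [
    ("beta.comm[1]", 1.02, 850.0),
    ("alpha.comm[1]", 1.15, 240.0),
    ("beta[1,1]", 1.08, 75.0),
    ("lambda[2,1]", 1.31, 40.0),
]

# Flag a parameter if Rhat exceeds 1.1 or ESS falls below 100.
flagged_names = [
    name for name, rhat, ess in diagnostics if rhat > 1.1 or ess < 100
]

# Share of all monitored parameters that fail either check.
frac_flagged = len(flagged_names) / len(diagnostics)
```

In practice the same rule would be applied to the full table of diagnostics across all species and parameter types.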
So I have a few questions about what I might do to improve convergence, sampling efficiency, and goodness of fit (GOF). You have at least three suggestions in your 'convergence issues vignette': 1) consider the order of the species, 2) use more factors, or 3) fit one chain. I’m curious whether you think any of these might be particularly helpful in my case, and how you would approach them in practice.
1) Species order: In the vignette you recommend ordering species based on underlying biology (e.g., functional guilds). How should goodness of fit factor into this decision? My goal is inference on the full bird community, so I’d like biologically meaningful structure to be represented. However, ordering more widespread species (those observed at more sites and across more ecoregions) first consistently improves WAIC. I recognize WAIC reflects predictive performance, but it also loosely tracks model fit (ideally I’d rely on PPCs, though I’m currently limited by RAM). With respect to the recommendation from Carvalho et al. (2008) discussed in the convergence vignette, I’ve included code and output in the Word document—does my interpretation look correct?
2) Number of latent factors: Would you prioritize goodness of fit here, biological interpretability, or some balance of the two? Given that my main interest is community response to land-use change, should factor structure reflect functional guilds, shared responses to disturbance, or something else?
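To make the prevalence-based ordering from question 1 concrete, here is a minimal Python sketch of what I mean by putting more widespread species first. The species labels and detection data are hypothetical, and the sketch ranks only by number of sites with a detection (not by ecoregion coverage).

```python
# Hypothetical data: for each species, one flag per site
# (1 = detected at least once at that site, 0 = never detected).
detections = {
    "sp_a": [1, 0, 0, 0, 1],
    "sp_b": [1, 1, 1, 0, 1],
    "sp_c": [0, 0, 1, 0, 0],
}

# Prevalence = number of sites where the species was detected.
prevalence = {sp: sum(sites) for sp, sites in detections.items()}

# Order species from most to least widespread.
species_order = sorted(prevalence, key=lambda sp: prevalence[sp], reverse=True)
```

The resulting order would then be used to rearrange the species dimension of the detection array before fitting, so the most widespread species occupy the first positions tied to the factor loadings.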
Finally, outside of the three suggested strategies, I’m wondering whether tuning parameters or initial values might help improve sampling efficiency. Acceptance rates are close to the target (~0.43), and tuning values stabilize above 2 by the first report (n.report = 100). I’ve included example output in the document.
Thanks very much for any insight you all might have.
Aaron, PhD Candidate, UBC