covariates' order does matter(?)

Claudio

Dec 26, 2025, 6:45:51 AM

to spOccupancy and spAbundance users

Hi all,

I fitted an sfJSDM model in spOccupancy with four covariates: covariate a is categorical, and covariates b, c, and d are numerical. I noticed different WAIC values for what I presume are the same models, differing only in the order of the covariates. For example, model a+b+c+d got WAIC = 647.4, while model a+b+d+c got 653.9. Falling in between those two values, model a+b+d got 650.9 and a+b got 653.7.

When I fitted the same covariate combinations with lfJSDM, I again got differences, although smaller ones: a+b+d+c = 645 and a+b+c+d = 646.

I wonder why this happens, and if there is any "rule of thumb" to be followed.

thanks in advance

Claudio

Jeffrey Doser

Dec 29, 2025, 4:15:45 PM

to Claudio, spOccupancy and spAbundance users

Hi Claudio,

The order of covariates in any model in spOccupancy (whether it is in the detection or occupancy formula) does not matter. There are two points to mention regarding what you stated in your post:

1. It is not surprising that models with different sets of covariates have different WAIC values. In fact, this is exactly what makes WAIC useful for model comparison. So, the different WAIC values for models a+b+d and a+b are not necessarily unexpected.
2. WAIC can differ between runs of the same model, for several reasons. First, one would expect some slight differences from one run to the next just based on Monte Carlo error. Second, it is also important to mention that WAIC has a standard error associated with it, but I have not implemented this in spOccupancy (although that would be a nice feature to add). Third, if the model has not fully converged, then you may see WAIC differences between different runs of the same model. The convergence of a model should also not be impacted by covariate order, but it could be impacted by which level of a categorical variable is treated as the baseline level. Finally, sfJSDM and lfJSDM can be complex models to estimate depending on how many species, how many factors, and how many sites are included in the model. If the model is fairly complex for the data set, then there may be difficulties in estimating the latent factors, which could lead to the same model arriving at slightly different solutions for the latent/spatial factors, which could then lead to slightly different WAIC values. Factor models are notoriously difficult to estimate, and there is a lot of statistical literature on the topic. This paper by Papastamoulis and Ntzoufras presents a very nice overview of the statistical challenges with these types of models.
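To make the Monte Carlo error and standard error in point 2 concrete, here is a minimal Python sketch. It is not spOccupancy code and does not reproduce the package's internals: it assumes the standard WAIC definition (log pointwise predictive density minus the pointwise posterior variance of the log-likelihood) and the usual standard-error approximation from the general WAIC literature. The function names and the toy Bernoulli "model" are hypothetical and exist only for illustration.

```python
import math
import random
from statistics import variance


def log_mean_exp(values):
    """Numerically stable log(mean(exp(values)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values) / len(values))


def waic_with_se(log_lik):
    """log_lik[s][i] is the log-likelihood of observation i under posterior draw s.

    Returns (WAIC on the deviance scale, its approximate standard error).
    """
    n_obs = len(log_lik[0])
    elpd = []
    for i in range(n_obs):
        column = [draw[i] for draw in log_lik]
        lppd_i = log_mean_exp(column)   # log pointwise predictive density
        p_waic_i = variance(column)     # pointwise effective number of parameters
        elpd.append(lppd_i - p_waic_i)
    waic = -2.0 * sum(elpd)
    se = 2.0 * math.sqrt(n_obs * variance(elpd))
    return waic, se


def toy_run(seed, n_draws=500):
    """Hypothetical stand-in for one MCMC run: exact posterior draws for a
    single Bernoulli probability, given 80 successes out of 200 observations."""
    rng = random.Random(seed)
    y = [1] * 80 + [0] * 120
    draws = [rng.betavariate(1 + 80, 1 + 120) for _ in range(n_draws)]
    return [[math.log(p) if obs else math.log(1.0 - p) for obs in y] for p in draws]


# Same model, same data, different random seeds.
waic_run1, se_run1 = waic_with_se(toy_run(seed=1))
waic_run2, se_run2 = waic_with_se(toy_run(seed=2))
print(f"run 1: WAIC = {waic_run1:.2f} (SE {se_run1:.2f})")
print(f"run 2: WAIC = {waic_run2:.2f} (SE {se_run2:.2f})")
```

In this toy case the two runs differ only through Monte Carlo noise, and that difference is small relative to the standard error. For a real factor model, incomplete convergence or different latent-factor solutions can make the run-to-run gap larger, so it is worth judging any WAIC difference against both sources of variability before treating it as meaningful.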

Hope that helps clarify some reasons as to why this might be happening.

Jeff

--

Jeffrey W. Doser, Ph.D.

Assistant Professor

Department of Forestry and Environmental Resources

North Carolina State University

Statistical Ecology and Forest Science Lab

Pronouns: he/him/his

Claudio

Feb 1, 2026, 4:39:15 AM

to spOccupancy and spAbundance users

Hi Jeff,

thanks for the clear reply (and sorry for replying so late).

Just to clarify point 1: I mentioned models a+b+d and a+b not to compare them with each other, but to highlight that the two "alternative" WAIC values I got for a+b+c+d and a+b+d+c (647.4 and 653.9) bracket the WAIC values of two other models (a+b+d at 650.9 and a+b at 653.7), so the choice between models was not trivial.

I assessed convergence on the basis of Rhat < 1.1 and large ESS at the community level. Should I have checked the species level too? And what if just one or a few species show poor Rhat and ESS for one covariate?

thanks again

Claudio

Jeffrey Doser

Feb 5, 2026, 9:01:58 PM

to spOccupancy and spAbundance users

Hi Claudio,

Yes, you should ensure convergence of the species-specific parameters. You generally want all species-specific effects to converge, so if you see that some are not converging, I would suggest fine-tuning the model and re-running it (e.g., run for longer, increase burn-in and/or thinning, change starting values). Of course, if the parameters are very far from converging, that could indicate that the model is overparameterized and you are not able to generate reliable estimates for the given species. In that case, you would want to consider simplifying the model structure and/or dropping the species for which you are unable to obtain converged estimates.
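The species-level check can be sketched as follows. This is a hedged illustration in plain Python rather than spOccupancy code: it assumes the basic (non-split) Gelman-Rubin Rhat and the 1.1 threshold mentioned earlier in the thread, and the function names, species labels, and simulated chains are all hypothetical. Values reported by a fitted model may be computed with a somewhat different variant of the statistic.

```python
import math
import random
from statistics import mean, variance


def rhat(chains):
    """Basic Gelman-Rubin potential scale reduction factor for one parameter.

    chains is a list of equal-length lists of posterior draws, one per chain.
    """
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean(variance(c) for c in chains)    # average within-chain variance
    between = n * variance(chain_means)           # between-chain variance
    pooled = (n - 1) / n * within + between / n   # pooled variance estimate
    return math.sqrt(pooled / within)


def flag_unconverged(draws, threshold=1.1):
    """Return the (species, covariate) keys whose Rhat exceeds the threshold."""
    return sorted(key for key, chains in draws.items() if rhat(chains) > threshold)


# Simulated draws: three chains of 500 samples per species-level coefficient.
rng = random.Random(1)
draws = {}
for species in ("sp1", "sp2"):
    for covariate in ("b", "c"):
        draws[(species, covariate)] = [
            [rng.gauss(0.0, 0.5) for _ in range(500)] for _ in range(3)
        ]

# Deliberately make one coefficient disagree across chains (poor mixing).
draws[("sp2", "c")] = [
    [rng.gauss(center, 0.5) for _ in range(500)] for center in (-1.0, 0.0, 1.0)
]

flagged = flag_unconverged(draws)
print(flagged)
```

Only the coefficient whose chains were deliberately shifted apart should be flagged here. Listing the offending species and covariate pairs this way makes it easier to decide whether a longer run is likely to be enough or whether the model should be simplified for those species.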