Fitting Spatial vs Non-spatial Occupancy Model

25 views

Skip to first unread message

Mohamad Nur-hafizuddin Abdullah Mohamad Norajame

unread,

Jan 22, 2026, 1:23:06 AMJan 22

to spOccupancy and spAbundance users

Hi all,

I fitted a non-spatial single-season occupancy models to opportunistic citizen-science data. I only include the sites/grid cells that at least have one observation of any species and generate non-detection within these grid cells. Grid cells with no observation records are not included because in my understanding, including the cells will underestimates detection probability (lots of grid cells will contain NAs across replicates).

As a result, the grid are not evenly distributed, largely reflecting observer access and activity rather a systematic spatial design. I did try to run spatial model with the same MCMC settings but the running time is 2 times slower than the non-spatial model, and the spatial parameter does not converge really well (low ESS, high rhat) and produce very high standard deviation.

My question is that is there any rule of thumb on when to use spatial vs non-spatial model? My concern is I have about 20 species to analyse (and will increase over time) and if I go for the spatial model, then I will need to increase iteration, investigate further on the prior distributions and tuning parameter (Metropolis), and this will take a longer time. I am not sure if adding spatial autocorrelation to the occupancy component is appropriate given the unstructured data.

Best

Jeffrey Doser

unread,

Jan 26, 2026, 6:34:42 AMJan 26

to Mohamad Nur-hafizuddin Abdullah Mohamad Norajame, spOccupancy and spAbundance users

Hi Mohamad,

Thanks for the message. Like many statistical modeling decisions, there is not a general rule of thumb on when to use a spatial model vs. a non-spatial model and it is highly dependent on the objectives and specific data at hand. It is perhaps insightful to consider what happens if you fit a non-spatial model when there is truly residual spatial autocorrelation in the underlying occupancy process. If you do this, your model is assuming occupancy between two locations are conditionally independent, when in fact they may not be. In this case, one key risk is that the credible intervals you generate for covariate effects will be overly precise (e.g., too narrow) and so you may not be accurately reflecting the amount of uncertainty in the covariate effects and/or occupancy estimates. Further, in certain situations, failure to include spatial autocorrelation can lead to bias in estimated covariate effects, although this is more dependent on the underlying pattern of the spatial autocorrelation and its relationship to the covariates in the model. Finally, if your objective is prediction and making maps of occupancy probability, failing to account for spatial autocorrelation when present results in throwing away a lot of information that can be used to help reduce uncertainty in predictions and provide more accurate predictions.

Of course, there are downsides to spatial models. They take longer, are more complex, and generally require more data than non-spatial models. As with any comparison of two models using the same detection-nondetection data, you can compare a spatial model and a non-spatial model using WAIC and/or cross-validation. For PGOcc and spPGOcc models in spOccupancy, you can also check the residuals using the residuals() function, which would allow you to diagnose if there was residual spatial autocorrelation in occupancy after fitting a non-spatial occupancy model.

In the case of your specific example where locations were determined following non-probability sampling (e.g., based on observer access instead of some probability sampling design), estimates from a spatial model could certainly reflect processes that are the result of the non-probability sampling (e.g., if species occupancy was tied to vegetation conditions in easily accessible locations). However, the same could be said about any estimated covariate effects in a non-spatial model (e.g., covariate effects could be biased if sampling locations are only in a part of the covariate space), so I don't think the sampling design itself is a reason to not use a spatial model.

Altogether, I would suggest exploring the use of a spatial model for at least one species, using residuals() to see if there is spatial autocorrelation in residuals of non-spatial models, and testing using WAIC and/or cross-validation if there are substantial gains in predictive power from the spatial model compared to the nonspatial model. Of course, there are practical reasons to not use a spatial model as you point out, but you would need to live with the potential consequences of doing so.

Hope that helps,

Jeff

--
You received this message because you are subscribed to the Google Groups "spOccupancy and spAbundance users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to spocc-spabund-u...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/spocc-spabund-users/1ee3fcf8-6715-42ce-82d8-42705efa91a8n%40googlegroups.com.

Jeffrey W. Doser, Ph.D.

Assistant Professor

Department of Forestry and Environmental Resources

North Carolina State University

Statistical Ecology and Forest Science Lab

Pronouns: he/him/his

Reply all

Reply to author

Forward

0 new messages