Hi Mohamad,
Thanks for the message. Like many statistical modeling decisions, there is not a general rule of thumb on when to use a spatial model vs. a non-spatial model and it is highly dependent on the objectives and specific data at hand. It is perhaps insightful to consider what happens if you fit a non-spatial model when there is truly residual spatial autocorrelation in the underlying occupancy process. If you do this, your model is assuming occupancy between two locations are conditionally independent, when in fact they may not be. In this case, one key risk is that the credible intervals you generate for covariate effects will be overly precise (e.g., too narrow) and so you may not be accurately reflecting the amount of uncertainty in the covariate effects and/or occupancy estimates. Further, in certain situations, failure to include spatial autocorrelation can lead to bias in estimated covariate effects, although this is more dependent on the underlying pattern of the spatial autocorrelation and its relationship to the covariates in the model. Finally, if your objective is prediction and making maps of occupancy probability, failing to account for spatial autocorrelation when present results in throwing away a lot of information that can be used to help reduce uncertainty in predictions and provide more accurate predictions.
Of course, there are downsides to spatial models. They take longer, are more complex, and generally require more data than non-spatial models. As with any comparison of two models using the same detection-nondetection data, you can compare a spatial model and a non-spatial model using WAIC and/or cross-validation. For PGOcc and spPGOcc models in spOccupancy, you can also check the residuals using the
residuals() function, which would allow you to diagnose if there was residual spatial autocorrelation in occupancy after fitting a non-spatial occupancy model.
In the case of your specific example where locations were determined following non-probability sampling (e.g., based on observer access instead of some probability sampling design), estimates from a spatial model could certainly reflect processes that are the result of the non-probability sampling (e.g., if species occupancy was tied to vegetation conditions in easily accessible locations). However, the same could be said about any estimated covariate effects in a non-spatial model (e.g., covariate effects could be biased if sampling locations are only in a part of the covariate space), so I don't think the sampling design itself is a reason to not use a spatial model.
Altogether, I would suggest exploring the use of a spatial model for at least one species, using residuals() to see if there is spatial autocorrelation in residuals of non-spatial models, and testing using WAIC and/or cross-validation if there are substantial gains in predictive power from the spatial model compared to the nonspatial model. Of course, there are practical reasons to not use a spatial model as you point out, but you would need to live with the potential consequences of doing so.
Hope that helps,
Jeff