about my goals and data structure, including a map of our study region within Colombia and point counts on an example farm.
I think accounting for spatial structure will be important given the small distance between points, and the benefits for prediction. There could also be spatially structured unmeasured confounds that are correlated with silvopasture implementation (or survival of trees), which could bias my estimate of interest. Thus, I’m looking for thoughts on how I can improve performance on these spatial models.
So far I’ve just fit models to try to understand the spatial process. I’ve experimented with occupancy models ranging from very basic (intercept, ecoregion, a random effect of point count cluster) to full models. I’ve fit 3 detection covariates in all cases. The variables I’ve been playing with for occupancy are:
Ecoregion + Elevation + Precipitation + Total_edge_300m + landcover_300m + Local habitat + Survey_year + (1 | point count cluster)
Generally, I’ve found that the priors on phi and sigma.sq have had the biggest impact on the performance of these models. Originally, I couldn’t get model convergence or decent chain mixing on the spatial parameters until I changed the sigma.sq prior to a uniform distribution and didn’t allow any probability on zero (I set the minimum to 0.5). I have also played with the priors on phi to constrain the effective spatial range to scales from within-farm (500 m - 5 km) to regional (e.g. 5 km - 25 km).
These changes have produced much better chain mixing and reasonable Rhats and effective sample sizes. However, the goodness-of-fit tests aren’t particularly inspiring: roughly a third of the 120 GOF tests (30 species x 4 GOF tests) give values under 0.1.
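To make the prior setup above concrete, here is a minimal sketch of how I translate effective spatial ranges into bounds on phi. It assumes coordinates are in meters and uses the rule of thumb that, under the exponential covariance, effective range is about 3 / phi. The helper name range_to_phi_bounds and the sigma.sq upper bound of 5 are illustrative placeholders, not values from my actual models.

```r
# Convert a span of plausible effective spatial ranges (same units as the
# coordinates, assumed here to be meters) into bounds for a uniform prior on
# phi under an exponential covariance, where effective range ~= 3 / phi.
range_to_phi_bounds <- function(min_range, max_range) {
  stopifnot(min_range > 0, max_range > min_range)
  # The largest range gives the smallest phi, and vice versa.
  c(lower = 3 / max_range, upper = 3 / min_range)
}

# Within-farm scale: 500 m to 5 km
phi_farm <- range_to_phi_bounds(500, 5e3)
# Regional scale: 5 km to 25 km
phi_regional <- range_to_phi_bounds(5e3, 25e3)

# Prior list using the phi.unif and sigma.sq.unif tags.
# The lower bound of 0.5 on sigma.sq is what I described above; the upper
# bound of 5 is a hypothetical placeholder.
priors_farm <- list(
  phi.unif = unname(phi_farm),
  sigma.sq.unif = c(0.5, 5)
)
```

Note that the lower bound on phi corresponds to the longest effective range, which is easy to flip by accident.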
Some specific questions:
- I’m struggling to interpret the sigma.sq term. Phi has a nice biological interpretation as the effective spatial range, but the sigma.sq term isn’t as clear to me. Should sigma.sq scale with the effective spatial range? Perhaps tighter, more regularizing priors would be helpful?
- I’ve noticed that phi tends toward the lower limit of its prior, so the effective spatial range goes to its upper limit. For example, if I set lower <- 3 / 50e3, the effective spatial range tends to approach 50 km for all species. Similarly, sigma.sq tends to approach the lower limit of whatever prior I give it. Is this behavior expected?
- Occasionally the model seems to get stuck and just shows ‘Chain 1 Sampling…’. No error is produced, but spOccupancy doesn’t actually start to sample, so eventually I just have to shut down R and start again. What is the model telling me in this case?
- I am not that interested in a full model selection workflow (e.g. dredging all possible submodels), but would like to beef up or pare down models to achieve better model fit. Do you have suggestions for this process?
- Overall, what would you recommend for next steps?
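For context on the phi question, this is how I am reading the lower limit of 3 / 50e3. It is a small sketch assuming distances in meters and the exponential correlation exp(-phi * d), with effective range defined as the distance where correlation drops to 0.05. The function name exp_corr is my own.

```r
# Exponential correlation at distance d (same units as the coordinates).
exp_corr <- function(d, phi) exp(-phi * d)

# Lower limit of the phi prior from my example.
phi_lower <- 3 / 50e3

# Distance at which the correlation falls to 0.05; the "3 / phi" rule of
# thumb comes from -log(0.05) being close to 3.
eff_range <- -log(0.05) / phi_lower

# Correlation remaining at 50 km when phi sits at its lower limit.
corr_at_50km <- exp_corr(50e3, phi_lower)
```

So when phi sits at this lower limit, points 50 km apart still have a correlation of exp(-3), which is about 0.05.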
Model specifications:
spPGOcc(
  occ.formula = occ.formula,
  det.formula = det.formula,
  priors = priors,
  cov.model = "exponential", NNGP = TRUE, n.neighbors = 15,
  data = spOcc[1:4], n.burn = 2000, n.batch = 400, batch.length = 25,
  n.thin = 20, n.chains = 3, n.report = 100, verbose = TRUE
)
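For reference, here is the arithmetic for how many posterior samples these settings retain, assuming the total number of MCMC iterations per chain is n.batch * batch.length and that burn-in is discarded before thinning. The n_iter and n_kept_* names are my own.

```r
# MCMC settings copied from the call above.
n.batch <- 400
batch.length <- 25
n.burn <- 2000
n.thin <- 20
n.chains <- 3

# Total iterations per chain.
n_iter <- n.batch * batch.length

# Samples kept per chain after discarding burn-in and thinning.
n_kept_per_chain <- (n_iter - n.burn) / n.thin

# Samples kept across all chains.
n_kept_total <- n_kept_per_chain * n.chains
```

That works out to 10,000 iterations and 400 retained samples per chain, or 1,200 in total.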
Thanks so much for any thoughts,
Aaron
PhD Candidate, University of British Columbia