questions about correlations between occupancy and detectability in community occupancy models (multi-part email)

Quresh Latif

unread,

Mar 17, 2022, 5:12:28 PM3/17/22

to hmecology: Hierarchical Modeling in Ecology

I have some questions about modeling of the correlation between occupancy and detectability in the context of community occupancy models. Previously, I've always modeled the correlation (roughly) as follows (i indexes species):

rho ~ dunif(-1, 1)

lpsi.intercept[i] ~ dnorm(beta0.mu, pow(sigma.beta0, -2))

lp.intercept[i] ~ dnorm(zeta0.mu + (rho*sigma.zeta0 / sigma.beta0) * (beta0[i] - beta0.mu), pow(sigma.zeta0, -2)/(1 - pow(rho, 2)))

Lately, however, I am running multi-scale community occupancy models, where I am modeling occupancy at two spatial scales (corresponding with nested sampling units of survey points nested within 1-km^2 grid cells) along with detectability, which makes me want to take the mutlivariate normal approach using a Wishart prior exemplified in Chapter 11.6.2, Volume 1 of AHM, so that I can model all three pair-wise correlations between coarse-scale occupancy, fine-scale occupancy, and detectability. I have three questions arising for me that I'm hoping someone here can help me answer:

###########

# Question 1 #

###########

I am having trouble understanding the prior for the bi-variate species-specific random effect in chapter 11.6.2. The variance-covariance matrix is implemented as

Omega[1:2, 1:2] ~ dwish(R[,], df)

Sigma[1:2,1:2] <- inverse(Omega[,])

for(k in 1:nspec){ # Group lpsi and lp together in array eta
lpsi[k] <- eta[k,1]
lp[k] <- eta[k,2]
eta[k, 1:2] ~ dmnorm(mu.eta[], Omega[,])
}

For the above, R is given in the data as `matrix(c(5,0,0,1), ncol=2)` and df = 3. Is there an explanation somewhere in the book or that someone can provide here for the values for R and df in this example?

##########

# Question 2 #

##########

I followed a tutorial for psychology researchers, and came up with the approach in the attached file for modeling cross-scale correlations, but looking at the example in AHM, I'm not sure I am doing this right (seems like I am setting priors in too many places). Would it be possible to get a second set of eyes on my code to check if what I am doing is correct?

###########

# Question 3 #

###########

It has always been my understanding that we expect a positive correlation between occupancy and detectability. I've always used an uninformative prior for rho (dunif[-1,1]), but lately I've been wondering if it might make more sense to force rho to be positive (dunif[0,1]). Has anyone considered doing this? Are there any drawbacks that might not be immediately obvious?

model_community_static_simple.jags

Quresh Latif

unread,

Mar 18, 2022, 3:03:59 PM3/18/22

to hmecology: Hierarchical Modeling in Ecology

A colleague sent a response to my first question. I am posting his response here, and my follow-up question.

Response to Question 1:

Hi Quresh,

I can perhaps help address your first question. The “DF” represents degrees of freedom and should be equal to K+1 for a uniform distribution, where K is the number of modeled random effects. The R matrix provided represents an uninformative prior, according to Kery and Shaub (Bayesian Population Analysis Using WinBugs). Below is a photo of a paragraph from this text (pg. 206), which discusses the Inverse Wishart and the parameters you inquired about.

[Screen shot of page 206 in Kery and Shaub also posted with this response]

My follow-up question:

Thanks! I got the book and took a look. I guess I maybe should be setting my df to 4 in my model, but I don’t understand the rationale for R[5,0,0,1]. Seems like the zeros on the off-diagonals implies a prior with no correlation between detectability and occupancy. Also, I don’t get why the prior for logit-occupancy variance is so much larger than the prior for logit-detectability variance. I also don't know what the analog would be for R with 3 dimensions to specify an uninformed prior for coarse-scale occupancy, fine-scale occupancy, and detectability as in my model attached to the initial post.

Quresh Latif

unread,

Mar 18, 2022, 4:21:53 PM3/18/22

to hmecology: Hierarchical Modeling in Ecology

I did some more exploring. Using the code pasted below, I tried different values for R and df to understand their implications for the priors for rho and variance of occupancy and detectability. Based on my explorations, I am thinking that R <- matrix(c(5, 0, 0, 0, 1, 0, 0, 0, 1), ncol = 3) and df = 4 would be roughly analogous to the R used for the single-scale community model in AHM. It seems like the larger value for R[1,1] sets the prior for occupancy (or large-scale occupancy in the multi-scale model) to be somewhat narrower than priors for small-scale occupancy and detectability, which I suppose makes sense since there are less data at the highest level. It seems to me that setting R[1,1] to 1 would be more uninformative than the R[1,1] = 5 used in AHM, but I like the idea of using weakly informative priors, so I'll follow AHM's lead and set R = matrix(c(5, 0, 0, 0, 1, 0, 0, 0, 1), ncol = 3) for my multi-scale model. Not sure if this in fact constitutes a weakly informative prior, but it seems like it follows the idea of R = matrix(c(5, 0, 0, 1), ncol = 2) for the single-scale model in AHM. Would much appreciate any thoughts or suggestions from the community on this or questions in my initial post. I am still curious what folks think about forcing rho to be positive, but it doesn't seem like there is an easy way to do that with the Wishart prior, so I'll leave that idea aside for now.

library(matlib)

# 2-dimensional (single-scale occupancy)
nsims <- 10000
R <- matrix(c(5, 0, 0, 1), ncol = 2)
df <- 3
Omega <- rWishart(n = nsims, df, R)
Sigma <- Omega
for(i in 1:dim(Omega)[3]) Sigma[,,i] <- inv(Omega[,,i])
rho <- numeric(length = nsims)
for(i in 1:nsims) rho[i] <- Sigma[1, 2, i] / sqrt(Sigma[1,1,i] * Sigma[2,2,i])
hist(rho)
hist(Sigma[1,1,])
hist(Sigma[2,2,])

# 3-dimensional (multi-scale occupancy)
nsims <- 10000
R <- matrix(c(5, 0, 0, 0, 1, 0, 0, 0, 1), ncol = 3)
df <- 4
Omega <- rWishart(n = nsims, df, R)
Sigma <- Omega
for(i in 1:dim(Omega)[3]) Sigma[,,i] <- inv(Omega[,,i])
rho.12 <- rho.23 <- rho.13 <- numeric(length = nsims)
for(i in 1:nsims) {
rho.12[i] <- Sigma[1, 2, i] / sqrt(Sigma[1,1,i] * Sigma[2,2,i])
rho.23[i] <- Sigma[2, 3, i] / sqrt(Sigma[2,2,i] * Sigma[3,3,i])
rho.13[i] <- Sigma[1, 3, i] / sqrt(Sigma[1,1,i] * Sigma[3,3,i])
}
hist(rho.12)
hist(rho.23)
hist(rho.13)
hist(Sigma[1,1,])
min(Sigma[1,1,])
max(Sigma[1,1,])
hist(Sigma[2,2,])
min(Sigma[2,2,])
max(Sigma[2,2,])
hist(Sigma[3,3,])
min(Sigma[3,3,])
max(Sigma[3,3,])

Thomas Riecke

unread,

Mar 19, 2022, 5:31:41 AM3/19/22

to Quresh Latif, hmecology: Hierarchical Modeling in Ecology

Hi Quresh,

I wouldn't use the inverse Wishart. It has a number of fairly well documented problems. See this link for a really fantastic write up of different approaches by Alvarez et al.:

https://arxiv.org/pdf/1408.4050.pdf

You might check out the 'dmnorm.vcov' distribution in JAGS. It allows users to define independent priors on the covariance matrix components (sigma's and rho's), and uses a likelihood based constraint to ignore non-positive definite (i.e., impossible) matrices. It's ad hoc, but pretty brilliant. Plummer also implemented a scaled inverse Wishart (JAGS 4.3 user manual) that kind of 'softens' some of the issues caused by the inverse Wishart. Fay and colleagues published a really nice paper recently where they demonstrate the use of a Cholesky decomposition (code is provided in the appendix).

https://besjournals.onlinelibrary.wiley.com/doi/pdfdirect/10.1111/2041-210X.13728

If you're starting to use Nimble you can use the Barnard et al. (2000, Statistica Sinica) variance decomposition approach (it can't be implemented in JAGS or BUGS). This allows for specifying independent priors on the variances, and vague priors for the correlations. I think this is probably the best solution.

Anyway, lots of options out there. I think the Barnard approach is 'best' (that's what we use for current projects), but if you want a quick(er) fix then perhaps check out some of Fay et al.'s code or just use dmnorm.vcov() and provide priors for the covariance matrix components. It's tricky to figure out optimal priors with covariance matrices, and then transforming everything with link functions makes it even more complicated. Best of luck!

Thomas

--
*** Three hierarchical modeling email lists ***
(1) unmarked: for questions specific to the R package unmarked
(2) SCR: for design and Bayesian or non-bayesian analysis of spatial capture-recapture
(3) HMecology (this list): for everything else, especially material covered in the books by Royle & Dorazio (2008), Kéry & Schaub (2012), Kéry & Royle (2016, 2020)
---
You received this message because you are subscribed to the Google Groups "hmecology: Hierarchical Modeling in Ecology" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hmecology+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/hmecology/b0cf85d5-8884-44ba-89a6-6171e97825ddn%40googlegroups.com.

--

Thomas Riecke (he/him)

postdoc

Swiss Ornithological Institute

https://thomasriecke.github.io

Quresh Latif

unread,

Mar 21, 2022, 8:52:50 AM3/21/22

to hmecology: Hierarchical Modeling in Ecology

Well, it seems that even though the priors seemed to make sense when I explored them in R using the code posted previously, the posteriors don't make sense at all (see attached). Not sure if these estimates reflect the issues brought up in the materials shared by T. Riecke, or if I am just implementing the Wishart incorrectly, but considering T. Riecke's advice, I'm going to revert to the univeriate approximation (i.e., defining separate correlations between coarse-scale and fine-scale occupancy, and between fine-scale occupancy and detectability) for my main analysis for now, and then report back once I've had a chance to try out the alternatives in the reference provided by T. Riecke.

Sad.csv

Thomas Riecke

unread,

Mar 24, 2022, 8:10:24 AM3/24/22

to Quresh Latif, hmecology: Hierarchical Modeling in Ecology

Hi all,

As a brief follow-up, see the attached script for some different ways to parameterize correlation and covariance matrices in MCMC samplers. This is by no means exhaustive, and apologies it's a simple R script and not more fully commented or 'marked down,' but this may be helpful to folks that are using these approaches in their analyses. Approaches covered include:

1) inverse Wishart (not recommended)

2) scaled inverse Wishart

3) Martyn Plummer's dmnorm.vcov 'trick', highly useful

4) Chen and Dunson 2003 and Kinney and Dunson 2008 (from the excellent write-up in Fay et al.)

5) my best attempt at an LKJ prior in JAGS (note that this is much more straightforward in Nimble or Stan, examples linked)

6) the full Barnard et al. (2000) decomposition approach in Nimble

No promises that this is 'error-free' (particularly the LKJ approach in JAGS). I'm happy to answer questions off-list if folks have any, feel free to make corrections on-list if there are any major errors. Just wish I'd had something like this when I was banging my head against my desk five years ago thinking about these issues with CMR data, and figured I'd blast this out there.

Thomas

To view this discussion on the web visit https://groups.google.com/d/msgid/hmecology/df8f8e4b-9507-4c3b-89cd-1205b4592e2fn%40googlegroups.com.

Riecke_hmecology_covariance_correlation_priors.R

Quresh Latif

unread,

Mar 25, 2022, 11:25:59 AM3/25/22

to hmecology: Hierarchical Modeling in Ecology

Thanks Thomas! It is definitely most helpful to have all of those options in one place.

I have certainly not tried them all, but I did try incorporating the dmnorm.vcov trick into the multi-scale community occupancy context (see attached JAGS code). I tried both an unconstrained version of the model, wherein I estimated all three correlations between coarse-scale occupancy, fine-scale occupancy, and detectability, and a constrained version where I fixed the correlation between grid-scale occupancy and detectability to zero. In both cases, all correlations are estimated to be near zero while both logit grid-scale occupancy (BETA0) and logit point-scale occupancy (beta0) are estimated to be extremely high. In contrast, the old way of specifying two bivariate correlations (commented out in JAGS code and labeled "old way") produces estimates that make a lot more sense, with species-specific estimates more variable and also much more sensible (not shown in the attached summary).

My suspicion is that the bivariate correlation approach is also more directional, i.e., only lower level parameters are subjected to the correlation parameter (e.g, fine-scale occupancy is a function of the correlation between fine-scale and coarse-scale occupancy, but coarse-scale occupancy is only explicitly governed by the community mean and SD). The inverse Wishart approach also did not work (see my original post). I'd be curious to know if any of the other approaches listed by Thomas work better, but my suspicion at this point is that estimating a full correlation matrix with all three correlations is too unconstrained in a multiscale community occupancy context. I am going to revert to the bivariate approach since that has been working fine thus far. I'd appreciate any thoughts or experiences suggesting a multivariate approach that actually works and is better than the bivariate approach. Also, there is certainly the possibility that I made an error in how I implemented the dmnorm.vcov, so I'd appreciate any suggestions on error-correcting adjustments to my JAGS code.

Quresh

Mod_sum_simple_bivariate_correlations.csv

model_community_static_simple.jags

Mod_sum_simple_TRieke_model_full_correlation.csv

Mod_sum_simple_TRieke_restricted_corr.csv

Reply all

Reply to author

Forward