I try to perform a full dynamic model on JAG for a data set extracted from a citizen science program.
I have ~600 sites surveyed for 11 years with maximum 10 secondary sessions (surveys per year). Unfortunately, I do not have 10 survey for each site every year (~70% NA).
I want to perform a full-time dependent model with annual estimation of extinction, colonisation and detection parameters. In addition, I want to add covariates of Julian date (continuous variable) and type of list (categorical variable) on detection model.
It seems JAG don't like NA values in Y array (detection/non-detection) as well as in Observation covariates array (same dimension as Y array). I am a little bit lost. How can I deal with it ?
data_detect_m<-as.matrix(data_detec)
data_detect_m <- array(data_detect_m, dim=c(nrow(data_detect_m), 10, length(2014:2024)))
head(data_detect_m)
data_type_m<-as.matrix(data_type)
data_type_m <- array(data_type_m, dim=c(nrow(data_detect_m), 10, length(2014:2024)))
head(data_type_m)
data_date_m<-as.matrix(data_date)
data_date_m <- array(data_date_m, dim=c(nrow(data_detect_m), 10, length(2014:2024)))
head(data_date_m)
Y <- as.array(data_detect_m) # nSites x nSurvey x nYears
# Convert categorical variable `list_type` into numeric indices (1-4)
ObsCov1_index <- as.numeric(factor(data_type_m, levels = c("Comprehensive_daily_list",
"Medium_daily_list",
"Short_daily_list",
"Single_records")))
ObsCov1_array <- array(ObsCov1_index, dim = dim(data_type_m))
ObsCov2 <- as.array(data_date_m) # nSites x nSurvey x nYears
dim(ObsCov2)
# Get dimensions
nsites <- dim(Y)[1] # Number of sites
nsurvey <- dim(Y)[2] # Number of surveys per year
nyears <- dim(Y)[3] # Number of years
# Structure the data for JAGS
bdata <- list(
y = Y, # Detection matrix (nsites x nsurvey x nyears)
list_type = ObsCov1_array, # List type as numeric indices
julian_date = ObsCov2, # Julian date (continuous)
nSites = nsites,
nSurvey = nsurvey,
nYears = nyears,
nListTypes = 4 # Number of levels in list_type
)
cat(file = "dynocc_model.jags", "
model {
for (i in 1:nSites) {
# Initial occupancy probability
z[i,1] ~ dbern(psi1)
# Occupancy dynamics (Colonization and Extinction)
for (t in 2:nYears) {
logit(psi[i, t]) <- logit(phi[t-1] * z[i, t-1] + (1 - z[i, t-1]) * gamma[t-1])
z[i, t] ~ dbern(psi[i, t])
}
# Detection process
for (t in 1:nYears) {
for (j in 1:nSurvey) {
# Quadratic effect of Julian date
logit(p[i, j, t]) <- beta0 +
beta1 * julian_date[i, j, t] +
beta2 * pow(julian_date[i, j, t], 2) +
alpha[list_type[i, j, t]] # Categorical effect
y[i, j, t] ~ dbern(p[i, j, t] * z[i, t])
}
}
}
# Priors
psi1 ~ dbeta(1,1) # Prior for initial occupancy
for (t in 1:(nYears-1)) {
phi[t] ~ dbeta(1,1) # Extinction probability (yearly)
gamma[t] ~ dbeta(1,1) # Colonization probability (yearly)
}
beta0 ~ dnorm(0, 0.1) # Detection intercept
beta1 ~ dnorm(0, 0.1) # Linear effect of Julian date
beta2 ~ dnorm(0, 0.1) # Quadratic effect of Julian date
for (k in 1:nListTypes) {
alpha[k] ~ dnorm(0, 0.1) # Effect of list type categories
}
# Derived quantities
for (t in 1:nYears) {
N[t] <- sum(z[, t]) # Total occupied sites per year
}
}
")
# Initialize z using max detection per site-year
zst <- apply(Y, c(1, 3), max, na.rm = TRUE)
# Replace NA values (if any)
zst[
is.na(zst)] <- 0
# Define initialization function
inits <- function() {
list(z = zst)
}
# Parameters to monitor
wanted <- c("psi1", "phi", "gamma", "beta0", "beta1", "beta2", "alpha", "p", "N")
# Run JAGS Model
outY <- jags(
data = bdata,
inits = inits,
parameters.to.save = wanted,
model.file = "dynocc_model.jags",
n.adapt=1000,
n.chains = 3,
n.iter = 2500,
n.burnin = 500,
n.thin = 2,
parallel = TRUE
)
I got some error such as :
I get it is due to NA values. I am quite new in the Bayesian world as I am much more used to frequentist approach. Can someone help me ?
I initially wanted to add annual heterogeneity in site-level extra-dispersion in p as suggested for citizen science data in AHM2 p 268 but I'll be very happy to run simpler models.
Many thanks.