Multinomial N-mixture model

289 views
Skip to first unread message

Guillaume Souchay

unread,
Oct 11, 2016, 4:41:42 AM10/11/16
to hmecology: Hierarchical Modeling in Ecology
Hi guys,

I am trying to fit a N-mixture model using a multinomial distribution for abundance. I would like to get the total abundance on my site over all sectors that are surveyed.
I could just use a classical N-mixture model and derive the sum, but as this model would be part of an integrated model, I would like to get total abundance somewhere in the right part of the model. I can define M (total abundance) twice in an integrated population model, so I need to get it on the right part here. Thus, the trick would be the use of the multinomial distribution, and this is easily done with JAGS.

Here is the description of the model :
    model {
    
    # Priors
    beta0 ~ dunif(-20,20)
    beta1 ~ dnorm(0,0.01)
    beta2 ~ dnorm(0,0.01)
    beta3 ~ dnorm(0,0.01)
    beta4 ~ dnorm(0,0.01)
    
      for (k in 1:n.year){
    lam_tot[k] ~ dunif(1,50)
    eps[k] ~ dnorm(0,tau.p)
    for (i in 1:n.sec){
    lam[i,k] ~ dgamma(5,1)
    }
    }
    tau.p <- pow(sd.p,-2)
    sd.p ~ dunif(0,5)

    # Likelihood
    # Ecological model 
    for (k in 1:n.year){                                 
    M[k] ~ dpois(lam_tot[k])
    N.sec[1:n.sec,k] ~ dmulti(pi[1:n.sec,k],M[k])         # Abundance on each sector is a part of the total abundance
    pi[1:n.sec,k] <- lam[1:n.sec,k]/sum(lam[,k])          # definition of pi
                          
    # Observation model for replicated counts
    for (i in 1:n.sec){
    for (j in 1:(field[i,k])){              
    y[i,j,k] ~ dbin(z.sec[i,k]*p[i,j,k], N.sec[i,k])       # z.sec is used to force the detection to be 0 when no counts occured    
    p[i,j,k] <- exp(lp[i,j,k])/(1+exp(lp[i,j,k]))
    lp[i,j,k] <- beta0 + beta1 * date[i,j,k] + beta2 * pow(date[i,j,k],2) + beta3 * pow(hour[i,j,k],2) + beta4 * hab[i,j,k] + eps[k]   # several cov are used to fit the detection function    

    
    # # Assess model fit using Chi-squared discrepancy # I remove this part for the moment, as I would like to be able to get the model as least once !
    # # Compute fit statistic E for observed data
    # eval[i,j,k] <- p[i,j,k] * N.sec[i,k]     # Expected values
    # E[i,j,k] <- pow((y[i,j,k] - eval[i,j,k]),2) / (eval[i,j,k] + 0.5)
    # # Generate replicate data and compute fit stats for them
    # y.new[i,j,k] ~ dbin(p[i,j,k], N.sec[i,k])
    # E.new[i,j,k] <- pow((y.new[i,j,k] - eval[i,j,k]),2) / (eval[i,j,k] + 0.5)
    # } #j
    # for (j in (field[i,k]+1):max(field)){
    # eval[i,j,k] <- 0
    # E[i,j,k] <- 0
    # y.new[i,j,k] <- 0
    # E.new[i,j,k] <- 0
    } #j 
    } #i 
    } #k 
    
    # # Derived and other quantities
    # # Total pop. size across all sectors from the site
    # # mean abundance on sector of each site
    # for (k in 1:n.year){
    # M[k] <- sum(N.sec[,k])                   # this is the standard way to get total abundance
    # }
    # fit <- sum(E[,,])
    # fit.new <- sum(E.new[,,])
    }
    

I am using JAGS and package jagsUI, so the use of the dmulti is possible.
However, problems arise with the inits. Everytime, I get stucked due to init for beta0 :
Error: Error in node beta0
Current value is inconsistent with data

In the similar way that the classical N-mixture model for these dataset is initiated, I tried to give only inits for abundance. 
But, 1 over 20 tries, I could pass the adaptive phase and it crashed at the burn-in phase with always the same error.
I tried to give also init value for beta0 (the intercept of the detection) but it crashes always, either at adaptive or at burn-in phase.

A solution could be to fit the model using WinBUGS which is more flexible about inits, but BUGS don't handle multinomial distribution with unfixed values.
In the AHM book, there are some examples on how to fit a conditional multinomial model, but the multinomial distribution is used for the detection part of the model and you have to give the total number of captured animal and this is defined as data, but it's not exactly the case here. And I'm not sure that I could use the same trick for the abundance part of the model.

So, any idea or suggestion is welcome ! 

I provide the R script to run the model as well as the data to reproduce the code.

ps: the reason why I have to get abundance at a higher scale is that some demographic information for the IPM is only available at the scale of the higher scale ... 

Guillaume
Multi_Nmix.RData
multi_Nmix.R

John Clare

unread,
Oct 14, 2016, 5:04:07 PM10/14/16
to hmecology: Hierarchical Modeling in Ecology
This seems to work (or at least run ?).  Couple changes in red.

# use of the joint Poisson-multinomial distribution
# if {n_1 ~ Poisson(lam_1),...,n_k ~ Poisson(lam_k)}, then, conditionnal on the total, the vector of counts is multinomial 
# {n_1,...,n_k} ~ Multinomial(sum(n[1:k]),{lam_1/sum(lam[1:k]),...,lam_k/sum(lam[1:k])})

#### model with overdispersion in both abundance and detection
sink("Nmix_multi.txt")
cat("
    model {
    
    # Priors
    beta0 ~ dunif(-3,3)
    beta1 ~ dnorm(0,0.01)
    beta2 ~ dnorm(0,0.01)
    beta3 ~ dnorm(0,0.01)
    beta4 ~ dnorm(0,0.01)
    
    # eps.N ~ dunif(0,tau.lam)
    # tau.lam <- pow(sd.lam,-2)
    # sd.lam ~ dunif(0,5)
    
    for (k in 1:n.year){
    # alpha.lam[k] ~ dnorm(0,0.1)
    lam_tot[k] ~ dgamma(.1,.1)
    eps[k] ~ dnorm(0,tau.p)
    for (i in 1:n.sec){
    lam[i,k] ~ dgamma(5,1)
    # for (j in 1:(field[i,k])){
    # p[i,j,k] ~ dunif(0,1)
    # }
    }
    }
    tau.p <- pow(sd.p,-2)
    sd.p ~ dunif(0,5)
    
    # Likelihood
    # Ecological model for true abundance with overdispersion in abundance
    for (k in 1:n.year){                                # Loop over years 
    M[k] ~ dpois(lam_tot[k])

    # Observation model for replicated counts
    for (i in 1:n.sec){
    N.sec[i,k] ~ dmulti(pi[i,k],M[k])         # Abundance on each sector
    pi[i,k] <- lam[i,k]/sum(lam[,k])
    
    for (j in 1:(field[i,k])){              # Loop over temporal reps 
    y[i,j,k] ~ dbin(p[i,j,k], N.sec[i,k])       # Detection  z.sec[i,k]  
    #p[i,j,k] <- exp(lp[i,j,k])/(1+exp(lp[i,j,k]))
    logit(p[i,j,k]) <- beta0 + beta1 * date[i,j,k] + beta2 * pow(date[i,j,k],2) + beta3 * pow(hour[i,j,k],2) + beta4 * hab[i,j,k] + eps[k]
    # lp[i,j,k] <- beta0 + eps[k]
    
    # # Assess model fit using Chi-squared discrepancy
    # # Compute fit statistic E for observed data
    # eval[i,j,k] <- p[i,j,k] * N.sec[i,k]     # Expected values
    # E[i,j,k] <- pow((y[i,j,k] - eval[i,j,k]),2) / (eval[i,j,k] + 0.5)
    # # Generate replicate data and compute fit stats for them
    # y.new[i,j,k] ~ dbin(p[i,j,k], N.sec[i,k])
    # E.new[i,j,k] <- pow((y.new[i,j,k] - eval[i,j,k]),2) / (eval[i,j,k] + 0.5)
    # } #j
    # for (j in (field[i,k]+1):max(field)){
    # eval[i,j,k] <- 0
    # E[i,j,k] <- 0
    # y.new[i,j,k] <- 0
    # E.new[i,j,k] <- 0
    } #j 
    } #i 
    } #k 
    
    # # Derived and other quantities
    # # Total pop. size across all sectors from the site N
    # # mean abundance on sector of each site
    # for (k in 1:n.year){
    # N[k] <- sum(N.sec[,k])                   
    # mean.abundance.sec[k] <- mean(lambda[,k])
    # }
    # fit <- sum(E[,,])
    # fit.new <- sum(E.new[,,])
    }
    ",fill = TRUE)
sink()

## data
z.sec <- field
z.sec[field!=0] <- 1
jags.data <- list(y = y,n.year = n.year, n.sec = n.sec,z.sec = z.sec,
                  date = sd.date,hour = sd.hour,hab = sd.hab,field = field)

## Initial values
# using estimated abundance from classical N-mixture model
M.init <- apply(Abun_Pa[,,3],2,sum)*3
Nst1 <- apply(y,c(1,3),max,na.rm=T)+1
Nst1[Nst1=="-Inf"] <- 1

inits <- function(){list(M=M.init)}

# Parameters monitored
parameters <- c("N.sec","M","beta0","beta1","beta2","beta3","beta4","sd.p","lam","pi","fit","fit.new") 

# MCMC settings
ni <- 1000
nt <- 3
nb <- 100
nc <- 3

# Call JAGS from R
library(jagsUI)
Nmix_multi <- jags(jags.data, inits, parameters, "Nmix_multi.txt", 
                   n.chains = nc, n.thin = nt, n.iter = ni, n.burnin = nb)

John Clare

unread,
Oct 16, 2016, 3:22:28 PM10/16/16
to hmecology: Hierarchical Modeling in Ecology
Whoops, had left this up in r and looked at the results quickly, and my proposed solution solves very little.   Some of the priors above weren't appropriate: beta0 seems to hit a boundary around -3, so the original formulation was better; z.sec can move back into the likelihood just fine, and gamma (.1, .1) is a terrible prior for lam_tot--gamma (5, .1) looks reasonable, or maybe better, gamma(shape, scale) with shape and scale defined as stochastic nodes.  The biggest issue, if I understand what you are trying to estimate correctly, is that dmulti is expecting a vector of probabilities rather than a single probability: as written, all M animals are found both in N.sec values.  Maybe something like this works better:

sink("Nmix_multi.txt")
cat("
    model {
    
    # Priors
    beta0 ~ dunif(-20,20)
    beta1 ~ dnorm(0,0.01)
    beta2 ~ dnorm(0,0.01)
    beta3 ~ dnorm(0,0.01)
    beta4 ~ dnorm(0,0.01)
    
    # eps.N ~ dunif(0,tau.lam)
    # tau.lam <- pow(sd.lam,-2)
    # sd.lam ~ dunif(0,5)
    gam1~dunif(0, 50)    
    gam2~dunif(0, 50)

    for (k in 1:n.year){
    # alpha.lam[k] ~ dnorm(0,0.1)
    lam_tot[k] ~ dgamma(gam1,gam2)
    eps[k] ~ dnorm(0,tau.p)
    for (i in 1:n.sec){
    lam[i,k] ~ dgamma(5,1)
    # for (j in 1:(field[i,k])){
    # p[i,j,k] ~ dunif(0,1)
    # }
    }
    }
    tau.p <- pow(sd.p,-2)
    sd.p ~ dunif(0,5)
    
    # Likelihood
    # Ecological model for true abundance with overdispersion in abundance
    for (k in 1:n.year){                                # Loop over years 
    M[k] ~ dpois(lam_tot[k])
    N.sec[1, k] ~ dbin(pi[k,1],M[k]) 
    N.sec[2, k] <-M[k]-N.sec[1,k] 
    # Abundance on each sector
    pi[k, 1] <- lam[1,k]/sum(lam[,k])
    # Observation model for replicated counts
    for (i in 1:n.sec){

    for (j in 1:(field[i,k])){              # Loop over temporal reps 
    y[i,j,k] ~ dbin(p[i,j,k]*z.sec[i,k], N.sec[i,k])       # Detection  z.sec[i,k]  

Guillaume Souchay

unread,
Oct 17, 2016, 11:13:48 AM10/17/16
to hmecology: Hierarchical Modeling in Ecology
Hi John,

Thanks a lot for your contribution. It's very helpful.
Indeed, dmulti is expecting a vector, and I think this is why it is socomplicated to run a model using dmulti and random variables with JAGS. With Gibbs Sampler, you can not upgrade the whole vector at the same time but each of the components of the vector, so the multinomial relationship is no more valid and then, the detection part is no more valid.
As you mentionned, for this special case with only 2 sectors, the formalisation of the multinomial using the binomial is a nice trick and this is currently running.

The code used is :
sink("Nmix_bin.txt")
cat("
    model {
    
    # Priors
    beta0 ~ dunif(-20,20)
    beta1 ~ dnorm(0,0.01)
    beta2 ~ dnorm(0,0.01)
    beta3 ~ dnorm(0,0.01)
    beta4 ~ dnorm(0,0.01)
    
    for (k in 1:n.year){
    lam_tot[k] ~ dgamma(gam1,gam2)
    eps[k] ~ dnorm(0,tau.p)
    for (i in 1:n.sec){
    lam[i,k] ~ dgamma(10,1)
    }
    }
    tau.p <- pow(sd.p,-2)
    sd.p ~ dunif(0,5)
    gam1~dunif(0,50)    
    gam2~dunif(0,50)
    
    # Likelihood
    # Ecological model for true abundance with overdispersion in abundance
    for (k in 1:n.year){                                # Loop over years 
    M[k] ~ dpois(lam_tot[k])
    N.sec[1,k] ~ dbin(pi[1,k],M[k])
    N.sec[2,k] <- M[k] - N.sec[1,k] 
    pi[1,k] <- lam[1,k]/sum(lam[,k])
    
    # Observation model for replicated counts
    for (i in 1:n.sec){
    for (j in 1:(field[i,k])){              # Loop over temporal reps 
    y[i,j,k] ~ dbin(z.sec[i,k]*p[i,j,k], N.sec[i,k])       # Detection 
    p[i,j,k] <- exp(lp[i,j,k])/(1+exp(lp[i,j,k]))
    lp[i,j,k] <- beta0 + beta1 * date[i,j,k] + beta2 * pow(date[i,j,k],2) + beta3 * pow(hour[i,j,k],2) + beta4 * hab[i,j,k] + eps[k]
    } #j
    } #i 
    } #k 

    }
    ",fill = TRUE)
sink()


For the moment, I only ran it over 1000 iterations, just to be sure that the model is right and it works with JAGS.
I will be able afterwards to integrate this N-mixture model into an IPM.

However, I am still wondering about what do you in the case of more than 2 sectors, or in other words, for a vector with more than 2 elements for the multinomial distribution with random variables.
Should we use several conditional binomial distribution ? e.g.:
N.sec[1] ~ dbin(pi[1],N)
N.sec[2] ~ dbin(pi[2]/(pi[1]+pi[2]),N-N.sec[1])
...
N.sec[n] ~ dbin(pi[n]/(
pi[1]+pi[2]+...+pi[n-1]),N-N.sec[1]-...-N.sec[n-1])

or should we try to use something else than JAGS, like Nimble or STAN to implement such model?
This is an open question and if anyone has a suggestion

Cheers,
Guillaume

John Clare

unread,
Oct 18, 2016, 12:34:45 PM10/18/16
to hmecology: Hierarchical Modeling in Ecology
Well, I I think the model will work using dmulti--the binomial usage was a trick that worked quickly in this context, so more a convenience thing rather than a necessity for fitting the model in JAGS (although in theory, there's nothing wrong with extending the binomials).
Reply all
Reply to author
Forward
0 new messages