Inf or NaN in BYM when zero_mean = 1

Konstantinoudis Akis

Nov 10, 2020, 9:48:12 AM
to nimble-users
Hi all, 

I hope this finds you well. 

I have been trying to fit a very simple BYM model in NIMBLE. The code is as follows:

BYMCode <- nimbleCode(
{
  for (i in 1:N){
    
    O[i] ~ dpois(mu[i])                                                         # Poisson likelihood for observed counts 
    log(mu[i]) <- log(E[i]) + alpha + theta[i] + phi[i]                         
                             
    theta[i] ~ dnorm(0, tau = tau.theta)        # area-specific RE
    RR[i] <- exp(alpha + theta[i] + phi[i])                # area-specific RR
    resRR[i] <- exp(theta[i] + phi[i])                # area-specific residual RR
    e[i] <- (O[i]-mu[i])/sqrt(mu[i])                                     # residuals  
    proba.resRR[i]<-step(resRR[i]-1)                                   # Posterior probability
  } 
  
  # BYM prior
  phi[1:N] ~ dcar_normal(adj = adj[1:L], weights = weights[1:L], num = num[1:N], tau = tau.phi, zero_mean = 1)
  
  # Priors
  alpha ~ dflat()                                                               # vague prior (Unif(-inf, +inf)) 
  overallRR <- exp(alpha)                                                # overall RR across study region
                                 
  tau.theta ~ dgamma(0.5, 0.05)                                   # prior for the precision hyperparameter
  sigma2.theta <- 1/tau.theta                                          # variance of unstructured area random effects
                     
  tau.phi ~ dgamma(0.5, 0.0005)                                    # prior on precision of spatial area random effects
  sigma2.phi <- 1/tau.phi                                                    # conditional variance of spatial area random effects
  sigma2.phi.marginal <- sd(phi[1:N])*sd(phi[1:N])        # empirical marginal variance of spatial effects
  
  frac.spatial <-  sigma2.phi.marginal/( sigma2.phi.marginal + sigma2.theta)    # fraction of total variation in log RR
  
}
)

n.LTLA <- dim(COVID19Deaths)[1] 

# Format the data for NIMBLE in a list
COVIDdata = list(
                 O = COVID19Deaths$deaths                      # observed nb of deaths
)      
      
COVIDConsts <-list(      
                 N = n.LTLA,                                   # nb of LTLAs
                 L = length(nbWB_A$weights),                   # length of the adjacency vector (total number of neighbour pairs)
                 E = COVID19Deaths$expected,                   # expected number of deaths
                 adj = nbWB_A$adj,                             # indices of the neighbouring areas
                 num = nbWB_A$num,                             # number of neighbours for each area
                 weights = nbWB_A$weights                      # neighbour weights
)

inits <- list(
  list(alpha = 0.01,
       tau.theta = 10,
       tau.phi = 1,
       theta = rep(0.01, times = 317), 
       phi = c(rep(0.5, times = 158), 0, rep(-0.5, times = 158))
       ),
  list(alpha = 0.5,
       tau.theta = 1,
       tau.phi = 0.1,
       theta = rep(0.05, times = 317),
       phi = c(rep(0.05, times = 158), 0, rep(-0.05, times = 158))
       )
)

params <- c("sigma2.theta", "sigma2.phi","overallRR", "RR", "theta",
             "resRR", "frac.spatial", "sigma2.phi.marginal", "proba.resRR", "mu")


ni <- 50000  # number of iterations
nt <- 5      # thinning interval
nb <- 10000  # number of burn-in iterations
nc <- 2      # number of chains

t0<- Sys.time()
modelBYM.sim <- nimbleMCMC(code = BYMCode,
                      data = COVIDdata,
                      constants = COVIDConsts, 
                      inits = inits,
                      monitors = params,
                      niter = ni,
                      nburnin = nb,
                      thin = nt, 
                      nchains = nc, 
                      setSeed = 9, 
                      progressBar = TRUE, 
                      samplesAsCodaMCMC = TRUE, 
                      summary = TRUE, 
                      WAIC = TRUE
                      )
t1<- Sys.time()
t1 - t0



So the model runs smoothly if I exclude phi (i.e., the ICAR component). Imposing the sum-to-zero constraint is important because I need to keep the intercept. I have tried several things:

1. If I remove zero_mean = 1, the model runs but never converges, even with many different values of niter. I have read that it might be better to remove the intercept when using the zero_mean specification, but I want to keep the intercept.

2. I have played with the priors on the hyperparameters and the intercept.

3. Note that I also imposed the constraint the way we used to in WinBUGS (making the initial values sum to zero), but this didn't work either.

Any ideas?

Best wishes, 
Gary

Chris Paciorek

Nov 10, 2020, 5:20:48 PM
to Konstantinoudis Akis, nimble-users
Hi Akis,

Can you tell us which values go off to Inf or NaN? That would help us diagnose the problem.

I do generally recommend that you remove alpha and let the intercept be the mean of the phi values with zero_mean=0 (as you've read). Is there a reason you can't use the mean of the phi values in place of alpha? It really should be equivalent given you have a flat prior on alpha in your attempt with zero_mean = 1.
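For concreteness, here is a minimal sketch of that reparameterization (my reading of the suggestion, reusing the data objects from the original post; not tested against the actual data):

BYMCode2 <- nimbleCode(
{
  for (i in 1:N){
    O[i] ~ dpois(mu[i])                               # Poisson likelihood
    log(mu[i]) <- log(E[i]) + theta[i] + phi[i]       # no separate intercept alpha
    theta[i] ~ dnorm(0, tau = tau.theta)              # unstructured area random effect
  }

  # ICAR prior without the sum-to-zero constraint
  phi[1:N] ~ dcar_normal(adj = adj[1:L], weights = weights[1:L],
                         num = num[1:N], tau = tau.phi, zero_mean = 0)

  alpha <- mean(phi[1:N])                             # implicit intercept
  overallRR <- exp(alpha)                             # overall RR across study region

  tau.theta ~ dgamma(0.5, 0.05)
  tau.phi ~ dgamma(0.5, 0.0005)
}
)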

-chris

 


Chris Paciorek

Nov 12, 2020, 12:52:43 PM
to Konstantinoudis Akis, nimble-users
Hi Gary,

(Note to others reading this: this response follows some off-list discussion in which Gary shared his data with me.)

It looks like the issue is that the 50th location has 0 neighbors and also a 0 count. This means that the inference for phi[50] is unconstrained by the CAR prior (since there are no neighbors) and unconstrained by the data (since, with an observed count of 0, the likelihood increases as the Poisson mean for that location goes to zero). So in either the zero_mean=0 or zero_mean=1 case, you will see phi[50] wandering off towards minus infinity. In the zero_mean=1 case the side effect is that the other phi values need to increase to offset phi[50].

I would suggest simply taking that location out of the modeling.
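For example, something along these lines (a sketch only: `LTLA_shp` stands in for whatever spatial object was used to build nbWB_A, and I have not run this against the actual data):

library(spdep)

# areas with no neighbours and a zero observed count are the problematic ones
iso <- which(nbWB_A$num == 0 & COVID19Deaths$deaths == 0)

COVID19Deaths_sub <- COVID19Deaths[-iso, ]    # drop them from the data
LTLA_shp_sub <- LTLA_shp[-iso, ]              # and from the map

nb_sub <- poly2nb(LTLA_shp_sub)               # recompute the neighbour structure
nbWB_sub <- nb2WB(nb_sub)                     # adj / weights / num for dcar_normal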

-Chris

Anthony S

Mar 10, 2021, 11:06:13 PM
to nimble-users
Good evening Chris

I suspect I am having the same issue with a simple N-mixture model. However, I cannot remove the location because it is essential to downstream analyses. Are there any other options? I thought the initial values were the problem, but if I remove the spatial component the problem goes away. I appreciate any input.

Respectfully,

Anthony Snead

Chris Paciorek

Mar 11, 2021, 12:57:36 PM
to Anthony S, nimble-users
hi Anthony,

If a location has no neighbors, then there is no information coming from the spatial structure. So one is relying solely on the data, and if the count is zero, the likelihood says the best value for the mean of the Poisson is one infinitesimally close to zero. So that's not so helpful. So in principle one would need either a somewhat informative prior for the mean of the Poisson for that location, or to give up and fix that Poisson mean at some plausible value.
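A quick way to see this in R (the Poisson log-likelihood at a zero count is just -mu, so it keeps increasing as mu shrinks, with no finite maximizer):

mu <- c(1, 0.1, 0.01, 0.001)
dpois(0, mu, log = TRUE)   # equals -mu: larger as mu -> 0
#> [1] -1.000 -0.100 -0.010 -0.001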

All that being said, I'm not fully understanding why removing the spatial component would solve the problem. What is your prior on the mean of the Poisson for that location when you remove the spatial component? I suppose if the prior you set amounts to a somewhat informative one, then what you are seeing is consistent with my analysis above.

-chris
