Hi all,
I’m currently testing a set of poisson/nb models, some of which include a spatial effect and others not (for comparison purposes). For your record, my model includes both continuous and factor variables as fixed-effect terms.
While following the extensive (and very helpful) INLA material available online, I noticed that one has to explicitly remove the Intercept term when using a spde model (and add it in the inla.stack effects). This is not the case when running a ‘simple’ model without spatial random effect, whereby the formula uses R’s internal intercept specification.
Why does one have to make such distinction explicitly? I spent quite some time reading many INLA material, but couldn’t find any palatable answer to that.
That said, I also noticed that estimates of fixed effects changed remarkably between a spatial model that removes the intercept explicitly from R’s standard model formula (see mod1 below), and one that uses R’s internal formula intercept (see mod2 below). The most drastic difference happens for the factor variable, where in the first case the first level of the first factor isn’t even absorbed by the intercept, whereas the one from the second factor variable is. This also leads to estimates with very high standard deviations (see summary of mod1 below).
Why is this odd behavior happening? I’m certainly missing some important link, but can't figure out why. In either case, it would be very nice to understand this aspect of INLA better, including why the first factor level is not being absorbed when using INLA standard intercept approach for spde models.
Below I include part of my code and the summary of the two models to understand my issue.The model I’m showing has two continuous (offNadirAngle & PopDens) and two factor variables (Image_resolution (three levels) & Cloud_presence(two levels)) as fixed effects.
Thank you in advance very much for your time and guidance on this issue.
-Marie
################# CODE #################
## Estimation stack
#~~~~~~~~~~~~~~~~
stk.est <- inla.stack(
data = list(Ncars = df$Ncars),
A = list(A, 1),
effects = list(list(spatial = 1:spde$n.spde),
data.frame(Intercept = 1,
Image_resolution = df$Image_resolution,
offNadirAngle = df$offNadirAngle,
Cloud_presence = df$Cloud_presence,
Effort = df$SQKM,
PopDens = df$Worldpop_dens)),
tag = 'estimation')
## Models
#~~~~~~~~
# model 1: Uses the intercept specified in the estimation stack (stk.est), with R's internal intercept explicityl removed
mod1 <- inla(Ncars ~ 0 + Intercept + Image_resolution + Cloud_presence + offNadirAngle + log1p(PopDens) + f(spatial, model = spde)+ offset(log(Effort)),
family="poisson",
control.compute = list(dic=T, waic=T, cpo=T),
data = inla.stack.data(stk.est),
control.predictor = list(A = inla.stack.A(stk.est)))
# model 2: Uses R's internal intercept in the model formula
mod2 <- inla(Ncars ~ Image_resolution + Cloud_presence + offNadirAngle + log1p(PopDens) + f(spatial, model = spde)+ offset(log(Effort)),
family="poisson",
control.compute = list(dic=T, waic=T, cpo=T),
data = inla.stack.data(stk.est),
control.predictor = list(A = inla.stack.A(stk.est)))
## Model summaries
#~~~~~~~~~~~~~~~~~~
### For mod1
summary(mod1)
Time used:
Pre = 2.58, Running = 13.2, Post = 0.0677, Total = 15.8
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
Intercept -0.784 15.813 -31.796 -0.784 30.227 -0.784 0
Image_resolution0.3 0.271 15.811 -30.737 0.271 31.280 0.271 0
Image_resolution0.4 -0.261 15.811 -31.269 -0.261 30.748 -0.261 0
Image_resolution0.5 -0.811 15.811 -31.819 -0.811 30.198 -0.811 0
Cloud_presenceNo 0.050 0.001 0.048 0.050 0.051 0.050 0
offNadirAngle -0.013 0.000 -0.013 -0.013 -0.013 -0.013 0
log1p(PopDens) 0.726 0.046 0.636 0.726 0.817 0.726 0
Random effects:
Name Model
spatial SPDE2 model
Model hyperparameters:
mean sd 0.025quant 0.5quant 0.975quant mode
Theta1 for spatial -2.19 0.314 -2.85 -2.17 -1.62 -2.09
Theta2 for spatial 1.69 0.289 1.17 1.67 2.30 1.60
Deviance Information Criterion (DIC) ...............: 3298720.68
Deviance Information Criterion (DIC, saturated) ....: 3303741.23
Effective number of parameters .....................: 3873.00
Watanabe-Akaike information criterion (WAIC) ...: 642107.11
Effective number of parameters .................: 292598.15
Marginal log-Likelihood: -1695542.82
CPO, PIT is computed
Posterior summaries for the linear predictor and the fitted values are computed
(Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
### For mod2
summary(mod2)
Time used:
Pre = 2.56, Running = 8.89, Post = 0.0609, Total = 11.5
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
(Intercept) -0.258 0.150 -0.553 -0.258 0.036 -0.257 0
Image_resolution0.4 -0.532 0.001 -0.534 -0.532 -0.531 -0.532 0
Image_resolution0.5 -1.082 0.001 -1.084 -1.082 -1.080 -1.082 0
Cloud_presenceNo 0.050 0.001 0.048 0.050 0.051 0.050 0
offNadirAngle -0.013 0.000 -0.013 -0.013 -0.013 -0.013 0
log1p(PopDens) 0.727 0.046 0.637 0.727 0.817 0.727 0
Random effects:
Name Model
spatial SPDE2 model
Model hyperparameters:
mean sd 0.025quant 0.5quant 0.975quant mode
Theta1 for spatial -2.16 0.302 -2.79 -2.14 -1.61 -2.08
Theta2 for spatial 1.67 0.284 1.15 1.65 2.26 1.59
Deviance Information Criterion (DIC) ...............: 3298720.68
Deviance Information Criterion (DIC, saturated) ....: 3303741.23
Effective number of parameters .....................: 3873.00
Watanabe-Akaike information criterion (WAIC) ...: 642107.08
Effective number of parameters .................: 292598.14
Marginal log-Likelihood: -1695538.46
CPO, PIT is computed
Posterior summaries for the linear predictor and the fitted values are computed
(Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')