How to interpret inla outputs that include categorical variables?

krtmc...@gmail.com

unread,

Sep 28, 2018, 11:08:57 AM9/28/18

to R-inla discussion group

Hi,

I’d like some help figuring out how to interpret the output from inla for categorical variables.

This post sought to ensure "that the interpretation of a would be the differences from the ground mean": https://groups.google.com/forum/#!searchin/r-inla-discussion-group/sum-to-zero$20constraint$20for$20fixed$20categorical$20variable%7Csort:date/r-inla-discussion-group/aCvAChJk0So/hA4b7VzgAQAJ

The suggestion was as follows: “To get the behaviour you describe, you can instead use an f(...,model=“iid”, constr=TRUE) random effects model and impose a sum to zero constrain there (and optionally fix the precision parameter, to mimic the behaviour of a regular factor covariate).”

I would like to know how this this differs from using “control.fixed=list(expand.factor.strategy="inla")”.

It gives the output below. How should I interpret this? The 95% credible intervals for the most important levels of the categorical variable do not contain zero? How is this determined. i.e. was it compared to the mean of all of the levels? Also, how different is this interpretation when compared with using f(...,model=“iid”, constr=TRUE)? Thanks for your help.

Kurt

######For single variable soil..,

stack.est = inla.stack(data = list(AGB=est.data), A = list(A.est, 1),

effects = list(c(field.indices), data.frame (Intercept=1, soil=biomass4$DOMSOI_12, Plot = inla.group(biomass4$Plot))),

tag="est")

formula <- AGB ~ -1 + Intercept + soil + f(Plot, model="rw1")+ f(field, model=spde)

family = 'gaussian'

control.family = list(hyper = list(prec = list(

prior = "pc.prec", fixed = FALSE, param = c(prior.median.gaus.sd,0.5))))

#-- Call INLA and get results --#

mod = inla(formula,

data=inla.stack.data(stack.est),

family=family,

control.predictor=list(A=inla.stack.A(stack.est), compute=TRUE),

control.fixed=list(expand.factor.strategy="inla"),

control.compute=list(cpo=TRUE, dic=TRUE, waic = TRUE, config=TRUE),

keep=FALSE, verbose=TRUE)

mod$dic$dic

print(summary(mod))

Fixed effects:
	mean	sd	0.025quant	0.5quant	0.975quant	mode
Intercept	202.5691	8.5827	185.7158	202.5694	219.4049	202.5709
Ah	61.4027	20.4614	21.1749	61.4197	101.499	61.4557
Ao	6.5099	14.4237	-21.8095	6.5086	34.8105	6.5071
Ap	53.892	21.0617	12.4911	53.9072	95.1706	53.9394
Bd	-5.5688	25.786	-56.1909	-5.5717	45.0225	-5.5752
Be	2.6193	29.3545	-55.0186	2.6199	60.2011	2.6235
Bh	9.5713	24.52	-38.5799	9.5733	57.6671	9.5794
E	-23.0918	25.8093	-73.7437	-23.1	27.5594	-23.1144
Fh	10.5801	29.2085	-46.7717	10.5811	67.8739	10.5856
Fo	45.6997	14.8441	16.5363	45.7046	74.8089	45.7157
Fr	16.9426	29.2151	-40.4254	16.9447	74.2465	16.9513
Fx	69.1909	11.6517	46.305	69.193	92.0438	69.1983
Gd	85.9835	14.3183	57.8385	85.9933	114.0484	86.0142
Ge	7.2266	29.2058	-50.1182	7.227	64.5167	7.2303
Gm	-11.5483	27.3186	-65.1754	-11.5524	42.052	-11.5582
I	4.9352	12.4293	-19.4662	4.9334	29.3243	4.9307
Je	-27.2367	20.4151	-67.2947	-27.2467	12.8391	-27.2648
Kl	-31.0783	19.3554	-69.0542	-31.0888	6.9207	-31.1083
Lc	-99.5886	19.9491	-138.6698	-99.6196	-60.3725	-99.6803
Lf	16.633	29.2146	-40.7339	16.6351	73.9359	16.6418
Lo	-46.6099	24.5689	-94.8031	-46.6263	1.6298	-46.6572
Nd	3.0787	20.4036	-36.9839	3.078	43.1079	3.0785
O	-12.457	29.2103	-69.8002	-12.4601	44.8512	-12.4639
Od	20.3532	23.4661	-25.7392	20.3587	66.373	20.3716
Tm	-22.655	29.2237	-80.0193	-22.6599	34.6842	-22.6673
Tv	26.6877	29.2302	-30.7151	26.6915	84.0166	26.7018
WR	28.6334	19.3648	-9.4104	28.6398	66.6068	28.6542
Zg	12.4649	20.4067	-27.6122	12.4672	52.4927	12.4733

Random effects:
Name Model
Plot RW1 model
field SPDE2 model

Model hyperparameters:
mean sd 0.025quant 0.5quant 0.975quant mode
Precision for the Gaussian observations 0.0002 0.0000 0.0002 0.0002 0.0002 0.0002
Precision for Plot 53.1149 14.0537 32.1695 50.8144 86.8022 46.3932
Range for field 2181.8808 973.7679 855.1236 1993.7392 4608.8956 1665.5552
Stdev for field 0.7496 0.4021 0.2514 0.6583 1.7860 0.5124

Expected number of effective parameters(std dev): 13.25(0.2999)
Number of equivalent replicates : 24.45

Deviance Information Criterion (DIC) ...: 3740.01
Effective number of parameters .........: 13.68

Watanabe-Akaike information criterion (WAIC) ...: 3743.41
Effective number of parameters .................: 16.00

Marginal log-Likelihood: -1932.28
CPO and PIT are computed

Posterior marginals for linear predictor and fitted values computed

krtmc...@gmail.com

unread,

Nov 1, 2018, 6:34:15 PM11/1/18

to R-inla discussion group

Can anyone help??

Helpdesk

unread,

Nov 3, 2018, 5:33:33 AM11/3/18

to krtmc...@gmail.com, R-inla discussion group

the issue about 'factors' is that in the freq viewpoint, the common
approach is to chose a reference category, and then everything else is
relative to that one, this is what happen with 'lm()' and relatives,

> model.matrix(~1+x, data=data.frame(x=as.factor(c(1,2,3))))
(Intercept) x2 x3
1 1 0 0
2 1 1 0
3 1 0 1
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")$x
[1] "contr.treatment"

from a Bayesian point of view, this leads to a non-exchangable prior
for 'x', as the marginal variance depends on the choice of reference
category. As an alternative, you can use the iid which sums to zero,

f(x, model="iid", constr=T)

so that the average effect of 'x' is zero.

this one, explains a third strategy

http://www.r-inla.org/faq#TOC-How-does-inla-deal-with-NA-s-in-the-data-argument-

by setting the expand.factor.strategy="inla", then you'll get a
'singular' expandsion without the reference group, but then only
contrasts are likelihood identifiable. see the refence for details.

there is no single best solution here...

Best
H

> > .4049 202.5709

> --
> You received this message because you are subscribed to the Google
> Groups "R-inla discussion group" group.
> To unsubscribe from this group and stop receiving emails from it,
> send an email to r-inla-discussion...@googlegroups.com
> .
> To post to this group, send email to
> r-inla-disc...@googlegroups.com.
> Visit this group at
> https://groups.google.com/group/r-inla-discussion-group.
> For more options, visit https://groups.google.com/d/optout.

--
Håvard Rue
Helpdesk
he...@r-inla.org

krtmc...@gmail.com

unread,

Nov 5, 2018, 9:20:01 AM11/5/18

to R-inla discussion group

Thanks a million!

Kurt

> send an email to r-inla-discussion-group+unsub...@googlegroups.com

Brian

unread,

Jan 24, 2025, 3:22:16 AM1/24/25

to R-inla discussion group

Hello, does this mean that when using f(x, model ="iid", constructed = T), the effects of 'x' levels are interpreted relative to the overall average effect of the outcome? Specifically, does it imply that the average 'y' value in x1/x2 is higher or lower compared to the average y?

Helpdesk (Haavard Rue)

unread,

Jan 24, 2025, 6:30:01 AM1/24/25

to Brian, R-inla discussion group

I guess you mean f(x, model ="iid", constr=TRUE)

constr=T will say that the sum (or mean), of the iid effects are set to zero,
hence the iid effects are centered. this would ensure, f.ex. that a model

1 + f(x, model="iid", constr=TRUE)

has no confounding between the intercept and iid terms

he...@r-inla.org

Brian

unread,

Jan 24, 2025, 7:06:49 AM1/24/25

to R-inla discussion group

Thank you for the clarification! Just to confirm, with constr=TRUE, the random effects for x (where x is a factor with several levels) capture the deviations of y for each level of x from the overall average (the intercept)?

Helpdesk (Haavard Rue)

unread,

Jan 24, 2025, 2:57:13 PM1/24/25

to Brian, R-inla discussion group

On Fri, 2025-01-24 at 04:06 -0800, Brian wrote:
> Thank you for the clarification! Just to confirm, with constr=TRUE, the random
> effects for x (where x is a factor with several levels) capture the deviations
> of y for each level of x from the overall average (the intercept)?

yes

n=10
> x=as.factor(1:10)
> y=rnorm(n)
> r=inla(y~1 + f(idx,model="iid",constr=TRUE), data=data.frame(y,idx=1:n, x),
family="stdnormal")
> r$summary.fixed

mean sd 0.025quant 0.5quant 0.975quant

(Intercept) 0.3337736279 0.3162254153 -0.2863130495 0.3337736279 0.9538603054
mode kld
(Intercept) 0.3337736279 5.526824971e-11
> r$summary.random$idx$mean
[1] -1.279748355e-04 -1.140560885e-04 8.522459669e-05 1.652662371e-04
[5] -1.344403152e-04 -3.720065591e-05 5.959069135e-05 9.553867382e-05
[9] -8.215610038e-05 9.021291741e-05
> mean(r$summary.random$idx$mean)
[1] 5.120967894e-10

Brian

unread,

Jan 25, 2025, 4:26:23 AM1/25/25

to R-inla discussion group

Thank you!

Reply all

Reply to author

Forward