formula comparison with lme and multilevel modeling

399 views
Skip to first unread message

ignacio quintero

unread,
May 15, 2014, 4:27:47 PM5/15/14
to r-inla-disc...@googlegroups.com

Dear all, 


I'm trying to specify the following models from lme to INLA. So far, as I understand this first models have the correct INLA formulation (and they give almost exact results):


For each model I have first the lme formulation and then the formula for INLA:


# random intercept

m1 = lmer(dr ~ st.ele + (1|poId), data = sdat)

form = dr ~ st.ele + f(poId, model = 'iid')


#organize data

n.loc = max(sdat$poId)

sdat$i.int = sdat$poId

sdat$j.int = sdat$poId + n.loc

sdat$k.int = sdat$j.int + n.loc


# uncorrelated random intercept and random slope

m2 = lmer(dr ~ st.ele + (1|poId) + (0 + st.ele|poId), data = sdat)

form = dr ~ st.ele + f(i.int, model = 'iid') + f(j.int, st.ele, model = 'iid')


# correlated random intercept and random slope

m3 = lmer(dr ~ st.ele + (st.ele |poId), data = sdat)

form = dr ~ st.ele + f(i.int, model = 'iid2d', n = 2*n.loc) + f(j.int, st.ele, copy = 'i.int')


# uncorrelated random intercept and random slope of st.ele^2

m4 = lmer(dr ~ st.ele + I(st.ele^2) + (1|poId) + (0 + I(st.ele^2)|poId), data = sdat)

form = dr ~ st.ele + I(st.ele^2) + f(i.int, model = 'iid') + f(j.int, I(st.ele^2), model = 'iid')

Now, the following two models are not giving similar results at all. My first guess is that I didn't get the models correctly (before than specifying different priors)?


#random intercept and slopes, with correlated slopes

m5 = lmer(dr ~ st.ele + I(st.ele^2) + (1|poId) + (0 + st.ele + I(st.ele^2)|poId), data = sdat)

form = dr ~ st.ele + I(st.ele^2) + f(poId, model = 'iid') + f(i.int, st.ele, model = 'iid2d', n = 2*n.loc) + f(j.int, I(st.ele^2), copy = 'i.int')


# correlated random intercept and slopes

m6 = lmer(dr ~ st.ele + I(st.ele^2) + (st.ele + I(st.ele^2)|poId), data = sdat)

form = dr ~ st.ele + I(st.ele^2) + f(i.int, model = 'iid3d', n = 3*n.loc) + f(j.int, st.ele, copy = 'i.int') + f(k.int, I(st.ele^2), copy = 'i.int')



A final question I have is how to specify a three level model, where poId is nested within moId, as in the following lmd statement:


# Global Model: random intercept and slopes, with correlated slopes

lmer(dr ~ st.ele + I(st.ele^2) + (1|poId/moId) + (0 + st.ele + I(st.ele^2)|poId/moId), data = 1)


Any help would be much appreciated!

Thank you,

Ignacio



INLA help

unread,
May 15, 2014, 5:28:32 PM5/15/14
to ignacio quintero, r-inla-disc...@googlegroups.com
yes, its all about getting the model `right' its the same. the
model-spesification in inla is much more 'basic' compared to lme, which
is tailored for 'classical mixed models'.

if you are uncertain if its 'correct', run both on Gaussian data with
fixed high precision, and the results should be about the same (at least
the mode values).

H
> --
> You received this message because you are subscribed to the Google
> Groups "R-inla discussion group" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to r-inla-discussion...@googlegroups.com.
> To post to this group, send email to
> r-inla-disc...@googlegroups.com.
> Visit this group at
> http://groups.google.com/group/r-inla-discussion-group.
> For more options, visit https://groups.google.com/d/optout.


--
Håvard Rue
he...@r-inla.org

ignacio quintero

unread,
May 16, 2014, 12:41:02 PM5/16/14
to r-inla-disc...@googlegroups.com, ignacio quintero, he...@r-inla.org
Thanks for your reply, it seems like the models are matching (see data generation and modelling below). I know many things can explain why lme and INLA would give different results, but, with my data, lme fails to converge. Which should I trust? Any ideas? I'm sorry but I just started playing with INLA one week ago, and thus my ignorance.

I would share my data, but it is very big (~700,000 rows), and that's the reason why I recently just turned from STAN/JAGS to INLA.

Additionally, I don't know how to add a covariate that affects the group level coefficients (the second level model). Is there anywhere I can find this? Is it possible?

Any help would be much appreciated.

Ignacio


library(mvtnorm); library(lme4); library(INLA)


# MODEL 1


#generate data

n.loc = 20 # number of groups

n.sams = 20 # samples per group

ntot = n.loc*n.sams # total n

id = gl(n = n.loc, n.sams) #generate levels

x1 = runif(ntot, 0, 30) # covariate

x = scale(x1) #scale it for computation ease


#generate coefficients for intercept and lineear and quadratic factors

coe = rmvnorm(n.loc, c(20, 50, -10), sigma = matrix(c(25, 0, 0, 0, 30, 10, 0, 10, 30), ncol = 3, nrow = 3))


#generate y values according to the above

y = c()

for (i in 1:20) y = c(y, rnorm(n.loc, coe[i, 1] + coe[i, 2]*x[(((i-1)*20)+1):(i*20)] + coe[i, 3]*(x[(((i-1)*20)+1):(i*20)]^2), sd = 2))


#final data

dat = data.frame(x, y, id)


# lme analysis

m5 = lmer(y ~ x + I(x^2) + (1|id) + (0 + x + I(x^2)|id), data = dat)


#INLA analysis

dat$id = droplevels(dat$id)

dat$id = as.integer(dat$id)

dat$i.int = dat$id

dat$j.int = dat$id + n.loc

form = y ~ x + I(x^2) + f(id, model = 'iid') + f(i.int, x, model = 'iid2d', n = 2*n.loc) + f(j.int, I(x^2), copy = 'i.int')

mod = inla(form, data = dat, family="gaussian", control.compute = list(dic = TRUE))



# MODEL 2


# generate data

n.loc = 20 # number of groups

n.sams = 20 # samples per group

ntot = n.loc*n.sams # total n

id = gl(n = n.loc, n.sams) #generate levels

x1 = runif(ntot, 0, 30) # covariate

x = scale(x1) #scale it for computation ease


#generate coefficients for intercept and lineear and quadratic factors

coe = rmvnorm(n.loc, c(20, 50, -10), sigma = matrix(c(25, 20, 10, 20, 30, 5, 10, 5, 30), ncol = 3, nrow = 3))


#generate y values according to the above

y = c()

for (i in 1:20) y = c(y, rnorm(n.loc, coe[i, 1] + coe[i, 2]*x[(((i-1)*20)+1):(i*20)] + coe[i, 3]*(x[(((i-1)*20)+1):(i*20)]^2), sd = 2))


#final data

dat = data.frame(x, y, id)


# lme analysis

m6 = lmer(y ~ x + I(x^2) + (x + I(x^2)|id), data = dat)


#INLA analysis

dat$id = droplevels(dat$id)

dat$id = as.integer(dat$id)

dat$i.int = dat$id

dat$j.int = dat$id + n.loc

dat$k.int = dat$j.int + n.loc

form = y ~ x + I(x^2) + f(i.int, model = 'iid3d', n = 3*n.loc) + f(j.int, x, copy = 'i.int') + f(k.int, I(x^2), copy = 'i.int')

mod = inla(form, data = dat, family="gaussian", control.compute = list(dic = TRUE))

INLA help

unread,
May 17, 2014, 2:22:02 AM5/17/14
to ignacio quintero, r-inla-disc...@googlegroups.com
On Fri, 2014-05-16 at 09:41 -0700, ignacio quintero wrote:
> Thanks for your reply, it seems like the models are matching (see data
> generation and modelling below). I know many things can explain why
> lme and INLA would give different results, but, with my data, lme
> fails to converge. Which should I trust? Any ideas? I'm sorry but I
> just started playing with INLA one week ago, and thus my ignorance.

Hi,

with gaussian responce the Bayesian results in INLA should be exact up
to numerical integration error.

>
> Additionally, I don't know how to add a covariate that affects the
> group level coefficients (the second level model). Is there anywhere I
> can find this? Is it possible?

can you describe this using math?

Best
H


--
Håvard Rue
he...@r-inla.org

ignacio quintero

unread,
May 17, 2014, 11:39:59 AM5/17/14
to r-inla-disc...@googlegroups.com, ignacio quintero, he...@r-inla.org

Yes, it is the formula in the attached snapshot.

Thanks,

i
Screen Shot 2014-05-17 at 11.38.07 AM.png

INLA help

unread,
May 18, 2014, 3:21:58 AM5/18/14
to ignacio quintero, r-inla-disc...@googlegroups.com
On Sat, 2014-05-17 at 08:39 -0700, ignacio quintero wrote:
>
> Yes, it is the formula in the attached snapshot.


yes, that formula is doable. its a little involed to do this though, as
you have one model for the mean and another one for deviation around the
mean. beta1 and 2 are dependent then you have the model iid2d. you also
have to use the weights argument f(idx, weight, model="...") and the
copy-feature...

I attach a simulated exampel showing how this can be done.

In this case all the calculations are exact except for the numerical
integration of the hyperparameters.
ex.R

ignacio quintero

unread,
May 19, 2014, 10:31:07 AM5/19/14
to r-inla-disc...@googlegroups.com, ignacio quintero, he...@r-inla.org
This is great! Thanks so much to take some time to do this. I get the idea now.

Best,

i
Reply all
Reply to author
Forward
0 new messages