Recover parameters in correlated random intercept and slope on simulated data

415 views
Skip to first unread message

marcel...@gmail.com

unread,
Oct 1, 2016, 11:08:08 PM10/1/16
to R-inla discussion group
Hi INLA team,

Thank you for creating and providing help support for INLA! 
I am still new at using INLA, and am having a bit of trouble recovering parameters in a hierarchical random slope and intercept model.
Below is my code for simulated data, I'd like to compare INLA's result with lmer(). While I carefully read and replicated the 
code in this post
I am still having trouble recovering the variance estimates, such as the diagonals of the covariance structure, and estimation of rho. Fixed effects are recovered well.
I have read the documentation for the multivariate random effects, "iid2d" from the INLA webpage, but I am still stuck.

Furthermore, if you could please provide an example/correct my attempt below on how to set the wishart priors with the "iid2d" distribution is would be a great help.
  
Any help with the above is very much appreciated.
Marcela



####################################################################
# My attempt at running a random intercept and slope model in INLA with 
# simulated data this data is compared with lmer() and inla()

library(ggplot2)
library(MASS)

tp = 4
no.ppl = 100
# create Age matrix - participants aged around baseline with repeated measures
# apporx 1.5 years

baseline.age = 10
baseline<- runif(no.ppl, min = baseline.age, max = (baseline.age+5))

follow.up1<- baseline + runif(no.ppl, min = 1, max = 2) # randomly people are followed up between 1-2 years
follow.up2<- follow.up1 + runif(no.ppl, min = 1, max = 2)
follow.up3<- follow.up2 + runif(no.ppl, min = 1, max = 2)

Age.mat<- matrix(c(baseline, follow.up1, follow.up2, follow.up3), nrow = no.ppl, ncol = tp)
##
res.mat<- matrix(0, nrow = no.ppl, ncol = tp)

b0 = 10
b1 = 3
response.mat<- matrix(0, no.ppl, tp)

# random effects for each person - on slope and intercept
rho = 0.01
Sigma<- matrix(c(2, rho, rho, 2), 2,2) # <== covariance structure

r.eff<- rmvnorm(no.ppl, mean = c(b0, b1), sigma = Sigma)

for(p in 1:no.ppl){
  for(t in 1:tp){
    response.mat[p, t]<- r.eff[p, 1] + r.eff[p, 2]*Age.mat[p,t] + rnorm(1, mean=0, sd = residual.noise.sd)  
  }  
}

# combine into long format to plot spaghetti plot
res.v<- melt(as.data.frame(t(response.mat)))
age.v<- melt(as.data.frame(t(Age.mat)))

long.d<- data.frame(ppl = factor(rep(1:no.ppl, each = tp)), tp = factor(rep(1:no.ppl, times = 100)), 
                    response = res.v$value, age = age.v$value)

#  ****************************************** 
#  plot  *
#  ******************************************

windows()
ggplot(long.d, aes(x = age, y = response, group = ppl, colour = ppl)) + geom_line() + theme_bw()

# ### implement with lmer *****************************************************************
class.m <- lmer(response ~ age + (1 + age|ppl), data = long.d)
summary(class.m)

# compare with actual values
Sigma  # <== theoretical values
print(diag(cov(r.eff)))  # this is the estimated diagonals of the covariance structure
print(cor(r.eff)[1,2]) # estimated correlation in the covariance structure

# ^^ on this 'ideal' data set, it recovered the parameters well
# both global intercept and age, as well as the random effects for the slope. NOT the intercept
#############################################################################################

long.d$ppl<- as.numeric(as.character(long.d$ppl))

long.d$ppl.age<- long.d$ppl
formula = response ~ f(ppl, model = "iid2d", n = 2*no.ppl) + age + f(ppl.age, age, copy = "ppl")

inla.m<- inla(formula, data = long.d, family = "gaussian",
              control.compute=list(dic=TRUE, mlik=TRUE, config=TRUE), 
               control.predictor=list(compute=TRUE)) # to plot the marginals
summary(inla.m)

# check values
b0  # <== yup fixed effects are quite close to solution
b1

# check noise estimates *************************************
1/inla.m$summary.hyperpar[1,6]  # residual variance est mode 

solve(Sigma) # <= theoretical covariance precision
1/diag(cov(r.eff)) # precision simulated random effects

# what INLA estimated - note they are usually skewed so don't use the mean, compare with the mode
inla.m$summary.hyperpar[2, 6]  # close to true value
inla.m$summary.hyperpar[3, 6]  # <== not quite right

# plot the marginals of residual noise and Sigma
windows()
par(mfrow = c(2,2))
plot(inla.m$marginals.hyperpar[[1]], main = "precision for residual noise")
plot(inla.m$marginals.hyperpar[[2]], main = "precision for Sigma_11")
plot(inla.m$marginals.hyperpar[[3]], main = "precision for Sigma_22") # <==why does this have such a heavy tail?
                                                                       # and Sigma_11 doesn't
plot(inla.m$marginals.hyperpar[[4]], main = "precision for rho_12")

rho # <== this is the actual covar(Sigma_11, Sigma_22)
cor(r.eff)[1,2] # correlation from simulated data <=== this is what is estimated in lmer()
# and (methinks) inla()
  
# what INLA estimates is  (am not sure here... how to recover the covariance of Sigma.hat)
inla.m$summary.hyperpar[4, 6]/sqrt(Sigma[1,1]*Sigma[2,2])

# or this ???
inla.m$summary.hyperpar[4, 6]/sqrt(diag(cov(r.eff))[1]*diag(cov(r.eff))[2] )

#################################################################################
##################################################################################
# my attempt at setting the wishart prior for random slope and intercept model
formula.1 = response ~ f(ppl, model = "iid2d", n = 2*no.ppl,
                         hyper = list(theta = list(prior = "wishart2d", 
                       param =c(1000,1000, 1)))) + 
                         age + f(ppl.age, age, copy = "ppl")

# tried to set the random effects precision prior to
# W.inverse = [1/1000           1/sqrt(1000^2)
#              1/sqrt(1000^2)    1/1000]
# with rho = 1
                                        
inla.m1<- inla(formula.1, data = long.d, family = "gaussian",
              control.compute=list(dic=TRUE, mlik=TRUE, config=TRUE), 
               control.predictor=list(compute=TRUE)) # to plot the marginals
summary(inla.m1)
# there seems to be no difference in the Sigma estimates when I specify different
# prior values!


Håvard Rue

unread,
Oct 2, 2016, 1:49:24 PM10/2/16
to marcel...@gmail.com, R-inla discussion group
On Sat, 2016-10-01 at 20:08 -0700, marcel...@gmail.com wrote:
> Hi INLA team,


oops, for Gaussian observations, there is no approximation errors
except numerical integration, so unless its correct there is something
wrong somewhere. 


## you need to add this, as for the that part you want the second half
## of the iid2d model

long.d$ppl.age<- long.d$ppl + no.ppl
----------------------------^^^^^^^^^

you can also add this after the inla() call

inla.m = inla.rerun(inla.m)

as the posterior is somewhat flat to ensure to get the presice
estimate. it just rerun starting at the located mode


hope this helps. 


btw, I think there is a little confusion what inla() returns for the
iid2d model. 

if you have covariance matrix Sigma with elements S_ij,

then inla returns the posterior for

1/S11, 1/S22, S12/sqrt(S11*S22)


you we're confused further down the code 

let me know...

H



--
Håvard Rue
Department of Mathematical Sciences
Norwegian University of Science and Technology
N-7491 Trondheim, Norway
Voice: +47-7359-3533 URL : http://www.math.ntnu.no/~hrue
Mobile: +47-9260-0021 Email: havar...@math.ntnu.no

R-INLA: www.r-inla.org

marcel...@gmail.com

unread,
Oct 2, 2016, 8:08:47 PM10/2/16
to R-inla discussion group, marcel...@gmail.com, hr...@r-inla.org
Hi Havard,

Thank you for your prompt reply and help!
I've fixed my code and it runs well now and figured out how to make it rerun starting at the located mode.

Can you please confirm this is how I specify the priors for the "iid2d" model 

formula.1 = response ~ f(ppl, model = "iid2d", n = 2*no.ppl,
                         hyper = list(theta = list(prior = "wishart2d",   # <====  is the prior = "wishart2d" necessary? as it only runs the wishart prior?
                       param =c(1000,1000, 1)))) +                         # <=== I've tried different values here, and it seems to have no effect on 1/S11, 1/S22 and  S12/sqrt(S11*S22)
                         age + f(ppl.age, age, copy = "ppl")

# tried to set the random effects precision prior to
# W.inverse = [1/1000           1/sqrt(1000^2)
#              1/sqrt(1000^2)    1/1000]
# with rho = 1

thanks in advanced
-marcela

Håvard Rue

unread,
Oct 3, 2016, 5:35:07 AM10/3/16
to marcel...@gmail.com, R-inla discussion group
On Sun, 2016-10-02 at 17:08 -0700, marcel...@gmail.com wrote:

> formula.1 = response ~ f(ppl, model = "iid2d", n = 2*no.ppl,
>                          hyper = list(theta = list(prior =
> "wishart2d",   # <====  is the prior = "wishart2d" necessary? as it
> only runs the wishart prior?

essentially, yes. its a joint prior and its designed to do that for the
wishart2d only. 

>     and it seems to have no effect on 1/S11, 1/S22 and
>  S12/sqrt(S11*S22)

you can check doing this on a fake much smaller dataset .

I attach a small example


the parameterisation of the wishart is a little tricky but the
description in the doc is correct...
iid2d.R
Reply all
Reply to author
Forward
0 new messages