ggplot2 for plotting lmer output

3,994 views
Skip to first unread message

Andrey K.

unread,
Nov 23, 2012, 10:13:33 AM11/23/12
to ggp...@googlegroups.com
Hi everybody,
    did anyone have experience plotting the results of the mixed/multilevel models as produced by lmer?

the output of the lmer, like
time
model<- lmer(outcome~1+time+(1+| subjectID))

would be a S4 object ( not clear what that it and the documentation left my rookie mind confused). 
Does anyone know of an easy way to graphs the results of this model in ggplot2 by extracting the necessary parameters from the 4S output from lmer?
For example, if I have 1000 subjects in the longitudinal model above, i would have 1000 individual trajectories could be showed by 

coef(model)

function. So, ideally it would produce 1000 lines to go on the plot.The

fixef(model)

would spit out the parameters for the overall trajectory ( intercept + slope), which could be over-imposed over the multiple lines plots from (coef(model)).
but the trick is to extract those parameters from the 4S lmer objects. To resume` the an opening question:

Does anyone know of an easy way use  4S object from lmer to create plots with ggplot2?

William Beasley

unread,
Nov 24, 2012, 5:22:01 PM11/24/12
to ggp...@googlegroups.com
Hey Andrey, It was good seeing you at that conference thingy the other week.  Great work, by the way.  You have good instincts about pulling the values out of the 'S4' object.  There are a bunch of differences between S3 and S4, but for the purposes of this, the only practical difference is that you use a '@' to pull out values from the `mer` object, instead of using a '$'.

There's no native way to specify a lmer model to ggplot2's `stat_smooth` function, as far as I know.  I imagine it's because there are so many ways to specify a MLM.  Instead, my recommendation is to pull out the predicted/modeled values from the `mer` object (which is returned by lmer).  This occurs below where `dsPredict` is defined.

I just realized I used a different capitalization scheme for the variables, so be careful with that, I guess.  If I left out some important details, please tell me.


rm(list=ls(all=TRUE))
library(ggplot2)
library(plyr)
library(lme4)

subjectCount <- 20
timePointCount <- 11

populationIntercept <- 1
populationSlope <- 2.5
subjectIntercept <- rnorm(subjectCount)
subjectSlope <- rnorm(subjectCount)
dsFull <- data.frame(
  SubjectID=factor(rep(seq_len(subjectCount), each=timePointCount)),
  Time=rep(seq_len(timePointCount), times=subjectCount),
  G0=populationIntercept,
  G1=populationSlope,
  B0=rep(subjectIntercept, each=timePointCount),
  B1=rep(subjectSlope, each=timePointCount),
  E=rnorm(subjectCount*timePointCount),
  Y=NA_real_
)

dsFull$Y <- (dsFull$G0 + dsFull$B0) + (dsFull$G1 + dsFull$B1) * dsFull$Time + dsFull$E
ds <- dsFull[, c("SubjectID", "Time", "Y")]
# m1 <- lmer(Y ~ 1 + Time + (1 | SubjectID), data=ds)
m2 <- lmer(Y ~ 1 + Time + (1 + Time | SubjectID), data=ds)

dsPredict <- data.frame(
  m2@flist,  #The grouping factors for the random effects
  Time=m2@X[, 2], #The values of the time points (which varies, depending how the model equation is specified).
  YHat=m2@eta #Predicted response, given the fixed and random effects.
)

ds <- plyr::join(x=ds, y=dsPredict, by=c("SubjectID", "Time")) #Probably overkill

g <- ggplot(ds, aes(x=Time, y=Y, group=SubjectID)) + geom_line()#color="gray60")
g #Plot the observed points
g + geom_line(aes(y=YHat), color="tomato") #Plot the modeled points

Andrey K.

unread,
Nov 24, 2012, 6:27:35 PM11/24/12
to ggp...@googlegroups.com
brilliant! it took me no time adapting that code to my model. here are the results. now it is just a matter of cosmetics and overimposing the group trajectories. 
thanks!  
predicted trajectories.png
Reply all
Reply to author
Forward
0 new messages