Error terms - lme4 vs rstanarm


Paul Scutti

Oct 16, 2016, 12:59:48 PM
to Stan users mailing list
Hi Stan users,

I am running a basic random-intercept model in both lme4 and rstanarm. While I am getting identical results for the fixed effects, the same cannot be said for the random effects. The standard deviation of the residual is fine (0.74 for both), but for the intercept I get 0.31 in lme4 and more than double that (0.63) in rstanarm. Any thoughts?

Below you will find the summaries and sessionInfo(); I am also attaching the workspace should you need further information.

Cheers,
Paul


mod_lmer_1
Linear mixed model fit by REML ['lmerMod']
Formula: DV1 ~ VarF1 + VarF2 + VarF3 + VarF4 + VarF5 + VarF6 + VarF7 +  
    VarF8 + VarF9 + VarT1 + VarT2 + VarT3 + (1 | Level1)
   Data: mydata
REML criterion at convergence: 7546.922
Random effects:
 Groups   Name        Std.Dev.
 Level1   (Intercept) 0.3072  
 Residual             0.7386  
Number of obs: 3346, groups:  Level1, 2
Fixed Effects:
(Intercept)        VarF1        VarF2        VarF3        VarF4        VarF5        VarF6  
  -0.031232     0.036068     0.100994     0.050455     0.029563    -0.038123    -0.029646  
      VarF7        VarF8        VarF9        VarT1        VarT2        VarT3  
  -0.006044     0.003626    -0.024349     0.243189     0.271287     0.070910 


mod_rstanarm_1
stan_glmer(formula = DV1 ~ VarF1 + VarF2 + VarF3 + VarF4 + VarF5 + 
    VarF6 + VarF7 + VarF8 + VarF9 + VarT1 + VarT2 + VarT3 + (1 | 
    Level1), data = mydata, cores = 2, prior = priorC, prior_intercept = priorI, 
    prior_covariance = decov(regularization = 2), adapt_delta = 0.9999)

Estimates:
            Median MAD_SD
(Intercept) 0.0    0.3   
VarF1       0.0    0.0   
VarF2       0.1    0.0   
VarF3       0.1    0.0   
VarF4       0.0    0.0   
VarF5       0.0    0.0   
VarF6       0.0    0.0   
VarF7       0.0    0.0   
VarF8       0.0    0.0   
VarF9       0.0    0.0   
VarT1       0.2    0.0   
VarT2       0.3    0.0   
VarT3       0.1    0.0   
sigma       0.7    0.0   

Error terms:
 Groups   Name        Std.Dev.
 Level1   (Intercept) 0.62    
 Residual             0.74    
Num. levels: Level1 2 

Sample avg. posterior predictive 
distribution of y (X = xbar):
         Median MAD_SD
mean_PPD 0.0    0.0   

Observations: 3346  Number of unconstrained parameters: 18  



sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] rstan_2.12.1       StanHeaders_2.12.0 ggplot2_2.1.0      arm_1.9-1          lme4_1.1-12       
 [6] Matrix_1.2-6       MASS_7.3-45        rstanarm_2.12.1    Rcpp_0.12.7        shiny_0.14        

loaded via a namespace (and not attached):
 [1] gtools_3.5.0       zoo_1.7-13         shinyjs_0.7        reshape2_1.4.1     splines_3.3.1     
 [6] lattice_0.20-33    colorspace_1.2-6   miniUI_0.1.1       htmltools_0.3.5    stats4_3.3.1      
[11] loo_0.1.6          base64enc_0.1-3    nloptr_1.0.4       matrixStats_0.50.2 plyr_1.8.4        
[16] stringr_1.1.0      munsell_0.4.3      gtable_0.2.0       htmlwidgets_0.7    threejs_0.2.2     
[21] codetools_0.2-14   coda_0.18-1        colourpicker_0.2   inline_0.3.14      httpuv_1.3.3      
[26] parallel_3.3.1     markdown_0.7.7     xts_0.9-7          xtable_1.8-2       scales_0.4.0      
[31] DT_0.2             shinystan_2.2.1    jsonlite_1.1       abind_1.4-5        mime_0.5          
[36] gridExtra_2.2.1    digest_0.6.10      stringi_1.1.1      grid_3.3.1         tools_3.3.1       
[41] magrittr_1.5       shinythemes_1.1    rsconnect_0.4.3    dygraphs_1.1.1.2   rstudioapi_0.6    
[46] minqa_1.2.4        R6_2.1.3           nlme_3.1-128      
Attachment: ErrorTerms_161016.RData

Ben Goodrich

Oct 16, 2016, 1:15:27 PM
to Stan users mailing list
On Sunday, October 16, 2016 at 12:59:48 PM UTC-4, Paul Scutti wrote:
I am running a basic random intercept model both in lme4 and rstanarm. While I am getting identical results for the fixed effects, the same cannot be said for the random effects. The standard deviation for the residual is ok (0.74 for both), but for the intercept I get 0.31 in lme4 and more than double (0.63) in rstanarm. Any thoughts? 

The prior on the standard deviation; only having two groups with which to estimate that standard deviation; mode of the likelihood vs. mean of the posterior; the irrelevance of comparing point estimates; etc.

Ben

P.S. Setting regularization = 2 does not do anything when there is only one hyperparameter to estimate.

Jonah Gabry

Oct 16, 2016, 6:10:32 PM
to Stan users mailing list
Hi Paul, 

The estimates from rstanarm are not intended to always mirror those from lme4. lme4 does a type of maximum likelihood estimation, giving you point estimates at the mode of the likelihood. rstanarm does full Bayesian inference: we estimate the full joint posterior distribution of all parameters, which is a compromise between the likelihood and the prior, and when we summarize the posterior we typically look at expectations, which can give you something quite different from the mode of the likelihood. So even before estimating anything we shouldn't necessarily expect to get the same answers, especially if those answers are point estimates. For some model+data combinations there won't be much difference, and for others there might be substantial differences. In most cases, when there are important differences I would trust the estimates from Stan over those from lme4 (and not only for the estimates of the parameters themselves but also for prediction), assuming of course that there were no red flags when running the MCMC.
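To make the mode-vs-mean distinction concrete, here is a small sketch with hypothetical numbers (not the data from this thread): the posterior of a group-level SD when only two group effects are observed. The likelihood mode, roughly what an lme4-style point estimate targets, can sit well below the posterior mean that full Bayes reports.

```python
import numpy as np

# Toy sketch (hypothetical numbers, not this thread's data): posterior of
# a group-level SD tau when only J = 2 group effects are observed, each
# with a known, small standard error.
group_effects = np.array([-0.3, 0.3])   # hypothetical group intercepts
se = 0.05                               # hypothetical standard error

tau = np.linspace(1e-3, 5.0, 5000)      # grid over the group-level SD
dt = tau[1] - tau[0]
v = tau[:, None] ** 2 + se ** 2         # marginal variance per grid point
# log-likelihood: each group effect ~ Normal(0, tau^2 + se^2)
loglik = (-0.5 * np.log(2 * np.pi * v)
          - group_effects ** 2 / (2 * v)).sum(axis=1)
log_prior = -0.5 * tau ** 2             # half-Normal(0, 1) prior on tau
log_post = loglik + log_prior
post = np.exp(log_post - log_post.max())
post /= post.sum() * dt                 # normalize on the grid

mode = tau[np.argmax(loglik)]           # likelihood mode
mean = (tau * post).sum() * dt          # posterior mean
print(f"likelihood mode: {mode:.2f}  posterior mean: {mean:.2f}")
```

With these toy numbers the likelihood mode lands near 0.30 while the posterior mean is noticeably larger, the same qualitative gap as between the two summaries above, without either estimate being "wrong".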

If you do want to reproduce the lme4 standard deviation, this should come closer:

stan_glmer(
  DV1 ~ VarF1 + VarF2 + VarF3 + VarF4 + VarF5 +
    VarF6 + VarF7 + VarF8 + VarF9 + VarT1 + 
    VarT2 + VarT3 + (1 | Level1),
  data = mydata,
  QR = TRUE,
  prior = normal(0,1),
  prior_covariance = decov(scale = 0.25),
  prior_intercept = priorI,
  adapt_delta = 0.99,
  cores = 2
)

although I'm not sure whether or not it's the right thing to do in this particular case. Often with only a few group levels it's good to be skeptical of the lme4 estimate of the standard deviation of the "random effects". The uncertainty estimates are probably too small. 
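To see how little information two group levels carry, here is a quick hypothetical simulation (again, not this dataset): even if the two group effects could be observed without any noise, their sample SD would still be a very shaky estimate of the true between-group SD.

```python
import numpy as np

# Hypothetical simulation: how variable is the sample SD of just two
# group effects, even when each effect is observed exactly? Draw pairs
# of effects from Normal(0, tau = 0.3) and look at the spread of the
# resulting SD estimates.
rng = np.random.default_rng(1)
true_tau = 0.3
draws = rng.normal(0.0, true_tau, size=(10_000, 2))
est = draws.std(axis=1, ddof=1)         # sample SD of each pair
q10, q50, q90 = np.quantile(est, [0.1, 0.5, 0.9])
print(f"10%: {q10:.2f}  median: {q50:.2f}  90%: {q90:.2f}")
```

The middle 80% of these estimates spans roughly 0.04 to 0.5 around a true value of 0.3, so a single point estimate of this SD from two levels says very little on its own.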

Hope that helps, 

Jonah

P.S. See this recent thread for more on why Ben's P.S. is true.

Paul Scutti

Oct 17, 2016, 11:45:38 AM
to Stan users mailing list
Hi Jonah and Ben,

Thank you for taking the time to clarify some important points: it was really helpful. I re-ran the model with prior_covariance = decov(scale = 0.25) and the standard deviation of the random intercept is now in line with what I was expecting.

In my random-intercept model, that regularization = 2 was a left-over from a previous model with random slopes, and I was not expecting it to play any role in the new one. So, Jonah, thank you for pointing out that prior_covariance can indeed be applied in the 1x1 covariance matrix scenario.


Best regards,
Paul

Bruno Nicenboim

Oct 17, 2016, 1:52:50 PM
to Stan users mailing list
Hi Paul,

You might be interested in having a look at this paper.
They show that "even under convergence, overparameterization may lead to uninterpretable models" in lme4.

Bests,
Bruno

Andrew Gelman

Oct 17, 2016, 4:21:36 PM
to stan-...@googlegroups.com
Yes, I've become more and more convinced of the effectiveness of informative priors in these situations.
A


Paul Scutti

Oct 19, 2016, 9:22:34 AM
to Stan users mailing list
Hi Bruno,

Thank you for the link. I read the paper today, installed the RePsychLing package to test it out, and promptly got some interesting results out of my analysis.
I work in market research, where multilevel models are basically unheard of, but, given the nature of our data, I actually believe we should use them all the time. The paper you referred me to, and this mailing list in general, go a long way toward bridging the gap between research and everyday application.


Regards,
Paul

Ben Goodrich

Oct 19, 2016, 11:15:54 AM
to Stan users mailing list
On Wednesday, October 19, 2016 at 9:22:34 AM UTC-4, Paul Scutti wrote:
Thank you for the link. I read the paper today, installed the RePsychLing package to test it out, and promptly got some interesting results out of my analysis.
I work in market research where multilevel models are basically unheard of but, given the nature of our data, I actually believe we should use them all the time. The paper you referred me to and this mailing list in general go a long way in bridging the gap between research and everyday application.

The RePsychLing package was started on GitHub before rstanarm was on CRAN. I don't think RePsychLing does anything that rstanarm can't do and rstanarm can do a lot of additional things and is more convenient. This (pre-Stan) Bayesian book

http://www.perossi.org/home/bsm-1

has a lot of marketing applications and data, although you would have to use the Stan language to implement most of those case studies.

Ben

Paul Scutti

Oct 19, 2016, 11:28:24 AM
to Stan users mailing list
Hi Ben,

Yes, I found RePsychLing useful in the context of lme4 only. Generally speaking, though, rstanarm seems to me more intuitive and much better at explaining your data and your models.

Thank you for the link to Rossi's website.


Best,
Paul

Bob Carpenter

Oct 19, 2016, 11:31:26 AM
to stan-...@googlegroups.com
It'd be great if someone wanted to translate all of
the models in that book to Stan; we already have an
issue for it:

https://github.com/stan-dev/example-models/issues/65

We're happy to help with feedback on Stan code. Or
we're happy to take case studies if someone just wants
to do one of the models in more depth.

- Bob