Standard errors in a hierarchical model (se.fixef() and se.ranef() for Stan)

813 views
Skip to first unread message

Sean Raleigh

unread,
Jul 11, 2015, 1:21:59 AM7/11/15
to stan-...@googlegroups.com
In Gelman and Hill, one finds the functions se.fixef() and se.ranef() for pulling standard errors out of a model fit by lmer(). Is there some way to get these standard errors from Stan (or, for that matter, any MCMC sample)?

Ben Goodrich

unread,
Jul 11, 2015, 12:23:13 PM7/11/15
to stan-...@googlegroups.com, sean.r...@gmail.com
On Saturday, July 11, 2015 at 1:21:59 AM UTC-4, Sean Raleigh wrote:
In Gelman and Hill, one finds the functions se.fixef() and se.ranef() for pulling standard errors out of a model fit by lmer(). Is there some way to get these standard errors from Stan (or, for that matter, any MCMC sample)?

The se.fixef() and se.ranef() functions are part of the arm package, which supports a closed set of models. Stan allows you to do pretty much any model, but as a consequence, rstan has no way of recognizing which parameters are constant across groups and which parameters are group-specific. Moreover, MCMC output does not produce frequentist standard errors (the se_mean column refers to the estimated standard deviation of that mean across many chains) but rather posterior standard deviations. So, nothing is stopping you from looking at the sd column of the output for posterior margins that you know to be constant across groups vs. margins that are group-specific. If you need to access that programatically, do

s <- summary(posterior)$summary
s
[,"sd"]

Ben

Sean Raleigh

unread,
Jul 11, 2015, 12:35:19 PM7/11/15
to stan-...@googlegroups.com, sean.r...@gmail.com
Thanks for the quick response, Ben. I didn't appreciate the fact that the sd for the posterior was playing the same role as the frequentist standard error produced by lmer. Somehow I imagined I'd have to do something much more complicated (involving covariance matrices or something like that).

So if I'm understanding you correctly, the only difference between "random effects" and "fixed effects" standard errors in my MCMC output is me deciding that my parameters vary by group or not and reading off the sd column in either case. Do I have that right?

--Sean

Jonah

unread,
Jul 11, 2015, 12:47:39 PM7/11/15
to stan-...@googlegroups.com
Yup. (And like Ben said, make sure to use 'sd' and not 'se_mean'). 

Sean Raleigh

unread,
Jul 11, 2015, 9:25:34 PM7/11/15
to stan-...@googlegroups.com
So I am noticing that these numbers are slightly different. For example, in a variable-intercept model with 20 groups, I get the following standard deviations from Stan:

a[1] 2.4395695
a
[2] 6.9842508
a
[3] 3.3638914
a
[4] 8.2936868
a
[5] 7.3915319
a
[6] 5.9151832
a
[7] 3.1845462
a
[8] 6.1735792
a
[9] 6.7233317
a
[10] 1.8390783
a
[11] 2.4875086
a
[12] 9.1757516
a
[13] 3.1529689
a
[14] 5.3231362
a
[15] 5.7129531
a
[16] 7.8969134
a
[17] 8.3591933
a
[18] 3.5772681
a
[19] 8.3709972
a
[20] 3.9758822


These are the ones from lmer from se.ranef():

 (Intercept)
1 2.414051
2 6.436764
3 3.279047
4 7.705084
5 7.056111
6 5.500787
7 3.071937
8 5.956278
9 6.135242
10 1.809878
11 2.437176
12 8.329476
13 3.083749
14 5.081977
15 5.500787
16 7.202888
17 7.526200
18 3.448435
19 7.526200
20 3.861101


It seems that Stan is giving me consistently larger standard deviations than lmer. The bigger deviations are in groups like 4 and 12 in which--not coincidentally, I imagine--the sample sizes are quite small (7 and 4 respectively, where several other groups are well over 100).

Not sure what to make of that.

--Sean

Sean Raleigh

unread,
Jul 11, 2015, 9:31:18 PM7/11/15
to stan-...@googlegroups.com
To clarify, I understand why the standard errors are larger in groups with small sample sizes. What I meant is that the differences between Stan output and lmer output are greatest in those groups as well.

Sean Raleigh

unread,
Jul 11, 2015, 9:52:13 PM7/11/15
to stan-...@googlegroups.com
And one more update. When I look at the percentage change from one to the other, the largest factors are in groups with the most shrinkage. For example, group 19 is off by a factor of 1.11. You can see it in the attached picture.
OQ_diff.png

Ben Goodrich

unread,
Jul 12, 2015, 2:12:43 AM7/12/15
to stan-...@googlegroups.com, sean.r...@gmail.com
On Saturday, July 11, 2015 at 9:25:34 PM UTC-4, Sean Raleigh wrote:
It seems that Stan is giving me consistently larger standard deviations than lmer. The bigger deviations are in groups like 4 and 12 in which--not coincidentally, I imagine--the sample sizes are quite small (7 and 4 respectively, where several other groups are well over 100).

Not sure what to make of that.

The best way to look at is to say that lmer is an asymptotic approximation to a full-Bayesian approach like Stan. Thus, if you don't utilize (much) prior information in your Stan program and lmer() understates some of uncertainty, then it is not surprising that the posterior standard deviations produced by Stan are larger than the standard errors produced by lmer, particularly in small groups.

Ben

Sean Raleigh

unread,
Jul 12, 2015, 11:21:31 AM7/12/15
to stan-...@googlegroups.com
That all makes good sense. Thanks again for your help!

--
You received this message because you are subscribed to a topic in the Google Groups "Stan users mailing list" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/stan-users/QC4QkR6H93A/unsubscribe.
To unsubscribe from this group and all its topics, send an email to stan-users+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Andrew Gelman

unread,
Jul 13, 2015, 1:49:45 AM7/13/15
to stan-...@googlegroups.com
Hi, the se.fixed and se.random things actually aren’t so helpful because you can’t use them to get uncertainty for combinations of coefficients.  stan.regression() will be more convenient because it will return posterior draws which can be manipulated at will.  We should put examples in the documentation.

On Jul 11, 2015, at 1:21 AM, Sean Raleigh <sean.r...@gmail.com> wrote:

In Gelman and Hill, one finds the functions se.fixef() and se.ranef() for pulling standard errors out of a model fit by lmer(). Is there some way to get these standard errors from Stan (or, for that matter, any MCMC sample)?

--
You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.

Sean Raleigh

unread,
Jul 13, 2015, 10:41:05 AM7/13/15
to stan-...@googlegroups.com
Is this stan.regression() function something that already exists or something in development?

You received this message because you are subscribed to a topic in the Google Groups "Stan users mailing list" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/stan-users/QC4QkR6H93A/unsubscribe.
To unsubscribe from this group and all its topics, send an email to stan-users+...@googlegroups.com.

Ben Goodrich

unread,
Jul 13, 2015, 10:53:00 AM7/13/15
to stan-...@googlegroups.com, sean.r...@gmail.com
On Monday, July 13, 2015 at 10:41:05 AM UTC-4, Sean Raleigh wrote:
Is this stan.regression() function something that already exists or something in development?

I doubt there is going to be a function called stan.regression(). Things are in the works for there to be a function that probably will be called stan_lmer() that is compatible with lme4::lmer() syntax but estimates the model via Stan.

But if you have already written a hierarchical .stan program and have estimated it with stan(), then you can already manipulate the draws if you want just by doing

draws <- as.matrix(my_stan_output)
dim
(draws)
colnames
(draws)
# do something with draws

Ben
Reply all
Reply to author
Forward
0 new messages