Help with confidence intervals for standardised regression coefficients


Hannah

May 23, 2016, 4:08:24 PM
to lavaan
Hi all,

I would really like to be able to compute confidence intervals for the standardised regression coefficients from my SEM model; however, I have only been able to compute the CIs for the unstandardised coefficients.

Here is my model:

clpm1 <- '

#latent variables age 7
SF7 =~ Sfears7_Q1 + Sfears7_Q2 + Sfears7_Q3 + Sfears7_Q4 + Sfears7_Q5 + Sfears7_Q6
AT7 =~ SC7_Q1 + SC7_Q2 + SC7_Q3 + SC7_Q4 + SC7_Q5 + SC7_Q6 + SC7_Q7 + SC7_Q8 + SC7_Q9 + SC7_Q10 + SC7_Q11 + SC7_Q12

#latent variable age 10
SF10 =~ Sfears10_Q1 + Sfears10_Q2 + Sfears10_Q3 + Sfears10_Q4 + Sfears10_Q5 + Sfears10_Q6
AT10 =~ SC10_Q1 + SC10_Q2 + SC10_Q3 + SC10_Q4 + SC10_Q5 + SC10_Q6 + SC10_Q7 + SC10_Q8 + SC10_Q9 + SC10_Q10 + SC10_Q11 + SC10_Q12

#latent variable age 13
SF13 =~ Sfears13_Q1 + Sfears13_Q2 + Sfears13_Q3 + Sfears13_Q4 + Sfears13_Q5 + Sfears13_Q6
AT13 =~ SC13_Q1 + SC13_Q2 + SC13_Q3 + SC13_Q4 + SC13_Q5 + SC13_Q6 + SC13_Q7 + SC13_Q8 + SC13_Q9 + SC13_Q10 + SC13_Q11 + SC13_Q12

#correlations
AT10 ~~ SF10
AT13 ~~ SF13

#autoregressive and cross-lagged paths
SF10 ~ AT7 + SF7 
AT10 ~ SF7 + AT7 
SF13 ~ AT10 + SF10
AT13 ~ SF10 + AT10
'

#Cross lagged panel model 1 
ModFitLV1 <- sem(clpm1, data = FinalData, std.ov = TRUE, std.lv=TRUE, missing = "fiml", estimator = "MLR", verbose = TRUE)
summary(ModFitLV1, fit.measures =TRUE, standardized=TRUE, rsquare = TRUE, ci = TRUE)

Using the summary() command, the CIs are computed for the unstandardised coefficients but not for the standardised coefficients.

I have also tried the parameterEstimates() function (see below); however, the CIs are again only for the unstandardised regression coefficients.
parameterEstimates(ModFitLV1, standardized = TRUE, ci = TRUE, level = 0.95)


Does anyone know how to compute the CIs for the standardised regression coefficients from my model fit?

Any help would be very much appreciated!!

Thanks in advance.
Hannah

Terrence Jorgensen

May 26, 2016, 4:50:32 AM
to lavaan
Does anyone know how to compute the CIs for the standardised regression coefficients from my model fit?

You should use unstandardized coefficients for making inferences about a null hypothesis, and use standardized coefficients as a standardized measure of effect size.  But if you just want to use the width of a CI (or just the size of the SE) to describe the sampling variability you would expect for a standardized coefficient across repeated experiments, then you can get the SE from the standardizedSolution() function, and calculate a central normal-theory CI from that.
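
For illustration, here is a minimal sketch of that calculation, assuming the fitted object ModFitLV1 from earlier in this thread (standardizedSolution() returns the standardized estimates in the est.std column with their SEs in the se column):

std <- standardizedSolution(ModFitLV1)
std$ci.lower <- std$est.std - qnorm(0.975) * std$se   # lower limit of a central 95% CI
std$ci.upper <- std$est.std + qnorm(0.975) * std$se   # upper limit
std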

Terrence D. Jorgensen
Postdoctoral Researcher, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Hannah

May 26, 2016, 12:01:40 PM
to lavaan
Thank you for the advice!!

Hannah

May 31, 2016, 4:01:58 AM
to lavaan
Hi Terrence, 

Would you possibly be able to help me with one more issue I am stuck on?

I would like to extract the exact p-value of my regression coefficients from my SEM model (ModFitLV1), instead of reporting p < .001. Do you know how to do this using inspect(), parameterEstimates(), or any other function in lavaan?

Thank you in advance!

Best wishes,
Hannah




Terrence Jorgensen

May 31, 2016, 5:19:00 AM
to lavaan
I would like to extract the exact p-value of my regression coefficients from my SEM model (ModFitLV1), instead of reporting p < .001. Do you know how to do this using inspect(), parameterEstimates(), or any other function in lavaan?

You can save the output of parameterEstimates() and extract the pvalue column, as long as you know which row to look for.

params <- parameterEstimates(ModFitLV1)
params$pvalue
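
For example (purely illustrative, using one of the cross-lagged paths defined in the model above), you can pull out the p-value for a single path by subsetting on the lhs, op, and rhs columns:

subset(params, lhs == "SF10" & op == "~" & rhs == "AT7", select = pvalue)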

Or you can print the parameterEstimates() output using the print() method with the nd argument (for up to 8 digits, I think).

print(params, nd = 8)

Hannah

May 31, 2016, 7:16:01 AM
to lavaan
This worked perfectly. Thank you very much for your help!

Douglas Bonett

Apr 12, 2017, 4:27:58 PM
to lavaan
Request: Please add ci = and level = in standardizedSolution()

Mark Seeto

Apr 13, 2017, 7:08:21 AM
to lavaan


On Thursday, May 26, 2016 at 6:50:32 PM UTC+10, Terrence Jorgensen wrote:

You should use unstandardized coefficients for making inferences about a null hypothesis, and use standardized coefficients as a standardized measure of effect size.  But if you just want to use the width of a CI (or just the size of the SE) to describe the sampling variability you would expect for a standardized coefficient across repeated experiments, then you can get the SE from the standardizedSolution() function, and calculate a central normal-theory CI from that.

Terrence (or anyone else), what's the reason for using unstandardized coefficients for inferences about a null hypothesis instead of standardized coefficients? It would seem strange to me to report the p-value for the unstandardized coefficient together with the estimate and confidence interval for the standardized coefficient.

Thanks,
Mark

Terrence Jorgensen

Apr 17, 2017, 9:55:54 AM
to lavaan
Terrence (or anyone else), what's the reason for using unstandardized coefficients for inferences about a null hypothesis instead of standardized coefficients?

Historically, I think this had something to do with the SEs not being trustworthy when analyzing correlation matrices instead of covariance matrices (at least in certain models without constraining residual variances to be 1 - explained variance).  


Since null (nil) hypotheses of zero are interpreted the same way for unstandardized and standardized parameters (no effect or relationship), the same hypothesis can be tested using the unstandardized estimate's test statistic or CI.  And the standardized point estimate is used for interpretation, if we don't want reference to an arbitrarily set scale for latent common factors.  Of course, we need to be careful about interpreting standardized effect sizes that way because we use sample estimates of variances to standardize, rather than any known population variance, and variances are also affected by design issues (e.g., how homogeneous is the sampling frame?), so standardized effect sizes are often not comparable across studies anyway. Here's some interesting reading on the subject:


Of course nowadays, it has been worked out how to calculate unbiased SEs for standardized solutions, but I don't think that means unstandardized solutions should be ignored.  A host of more seasoned SEM veterans on SEMNET might have more to say about this tradition, if you want to try posting this question there:


It would seem strange to me to report the p-value for the unstandardized coefficient together with the estimate and confidence interval for the standardized coefficient.

Well, according to the APA manual, we are expected to report an estimate along with a test statistic, df (if applicable), p value, CI for the estimate, and a standardized measure of effect size.  As Dr. Bonett pointed out, it would be naïve not to report a CI along with the point estimate of a standardized effect size, because those are just as susceptible to sampling variability as unstandardized estimates.  But I am unsure whether normal-theory CIs calculated from SEs for standardized estimates are appropriate, or whether unstandardized confidence limits should be transformed -- maybe the latter is wrong if the sampling variability of estimates used to standardize also need to be considered in the transformation.  Likelihood profile CIs for standardized estimates would be great...  Anyway, I'd love to see literature on this.  Again, SEMNET might be a good place to ask about this, in case someone has already investigated the issue.

Chris I

Apr 17, 2017, 10:11:32 AM
to lavaan
Following up on this, when I run a SEM in lavaan, I get the following output:

"Regressions:
Estimate 0.232 Std.Err 0.072 z-value 3.225 P(>|z|) 0.001 Std.lv 0.217 Std.all 0.217"

Are these p values (and standard errors, z-values) given for the unstandardised coefficients then?

I would just like to confirm as I read the following in Kline's SEM textbook:

"• Do not indicate anything about statistical significance for the standardized parameter estimates unless you used a method, such as constrained estimation, that generates correct standard errors in the standardized solution."

Terrence Jorgensen

Apr 17, 2017, 10:30:02 AM
to lavaan
Are these p values (and standard errors, z-values) given for the unstandardised coefficients then?

Yes. To view SEs for the standardized coefficients, use standardizedSolution().

I would just like to confirm as I read the following in Kline's SEM textbook:

"• Do not indicate anything about statistical significance for the standardized parameter estimates unless you used a method, such as constrained estimation, that generates correct standard errors in the standardized solution."

The SEs reported in standardizedSolution() are correct.

Chris I

Apr 17, 2017, 10:41:10 AM
to lavaan
Thanks for the quick reply!

Mark Seeto

Apr 17, 2017, 8:48:47 PM
to lavaan
Thanks for the explanation, Terrence.

Yves Rosseel

Aug 30, 2017, 8:49:02 AM
to lav...@googlegroups.com
On 04/12/2017 10:27 PM, Douglas Bonett wrote:
> Request: Please add ci = and level = in standardizedSolution()

In dev 0.6 now.
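
For example, something like this (using the fitted object from earlier in this thread; exact defaults may still change before release):

standardizedSolution(ModFitLV1, ci = TRUE, level = 0.95)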

Yves.

Stas Kolenikov

Aug 30, 2017, 1:12:31 PM
to lav...@googlegroups.com
What Terrence said, in terms of methodology, is approximately true. Instead of saying "unbiased", he should have said "asymptotic". Very few things are technically unbiased in statistics: the sample mean is, when we are talking about i.i.d. data, and the Horvitz-Thompson survey estimator of totals is, when we are talking about data with no nonresponse. Everything else is biased, period. The estimates and standard errors that we get in SEM are biased, biased, biased, biased. You may get an unbiased estimate of the means and the covariance matrix, but then you twist them and screw them and stretch them and reverse them with nonlinear maximization, and with the computations that the standard errors require on top of that. (Read Browne 1984 to get a feeling for just how complex those computations are.)

There are two main issues involved. First, the standard errors for standardized coefficients are obtained through the delta method, and with all the nonlinearities involved, the promised land of asymptotia is pushed a little bit further away. Any nonlinearity introduces or exacerbates small-sample biases, and introduces or blows up higher-order moments like skewness and kurtosis of the sampling distributions. Take the log of a positive random variable, and the mean of the logs is not equal to the log of the mean (read up on the lognormal distribution). What the delta method says is that as the sampling distribution gets tighter around the population value (which we hope to achieve with larger sample sizes), the variance of the nonlinear transformation gets tighter as well, in some predictable fashion. However, this is only an approximation -- and as an approximation, by Murphy's law, it is usually biased downward, making the standard errors anticonservative. Note that the standard errors for unstandardized coefficients are themselves obtained by delta-method approximations, and as such are less reliable than, say, the standard errors for the item means (as if anybody is interested in that trivia). So the unstandardized standard errors aren't that great to begin with, and by convoluting things further to produce standardized coefficients, you make them even less great.
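
A tiny R illustration of that log example (not lavaan-specific; the numbers are approximate):

set.seed(1)
x <- rlnorm(1e5, meanlog = 0, sdlog = 1)  # a positive random variable
mean(log(x))   # approximately 0: the mean of the logs
log(mean(x))   # approximately 0.5: the log of the mean -- not the same thing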

The second issue is that most analytical derivations (have to) assume that the model is correctly specified. When it isn't, things may get iffy. If the model is not correctly specified, then you may end up dividing your coefficient by the wrong quantity. In a more subtle way, when the SEM model is incorrect, ADF/WLS standard errors are too small. They are not necessarily disastrously small, but enough to get worried about (in my simulations, the CIs had something like upper-80% coverage; in some disastrous examples in other areas of statistics, I have seen a simulated 20% actual coverage).

Terrence points to the possibility of transforming the unstandardized CIs, or of using likelihood profiles. These are good ideas -- but then again, the likelihood profile mostly makes sense when the model is exactly right and you have a multivariate normal distribution to deal with. With non-normal data, you don't have that luxury.

-- Stas Kolenikov, PhD, PStat (ASA, SSC) @StatStas
-- Senior Scientist, Abt Associates @AbtDataScience
-- Program Chair (2018), Survey Research Methods Section of the American Statistical Association
-- Opinions stated in this email are mine only, and do not reflect the position of my employer
-- http://stas.kolenikov.name