Help with confidence intervals for standardised regression coefficients


Hannah

May 23, 2016, 4:08:24 PM
to lavaan
Hi all,

I would really like to be able to compute confidence intervals for the standardised regression coefficients from my SEM model; however, I have only been able to compute the CIs for the unstandardised coefficients.

Here is my model:

clpm1 <- '

#latent variables age 7
SF7 =~ Sfears7_Q1 + Sfears7_Q2 + Sfears7_Q3 + Sfears7_Q4 + Sfears7_Q5 + Sfears7_Q6
AT7 =~ SC7_Q1 + SC7_Q2 + SC7_Q3 + SC7_Q4 + SC7_Q5 + SC7_Q6 + SC7_Q7 + SC7_Q8 + SC7_Q9 + SC7_Q10 + SC7_Q11 + SC7_Q12

#latent variable age 10
SF10 =~ Sfears10_Q1 + Sfears10_Q2 + Sfears10_Q3 + Sfears10_Q4 + Sfears10_Q5 + Sfears10_Q6
AT10 =~ SC10_Q1 + SC10_Q2 + SC10_Q3 + SC10_Q4 + SC10_Q5 + SC10_Q6 + SC10_Q7 + SC10_Q8 + SC10_Q9 + SC10_Q10 + SC10_Q11 + SC10_Q12

#latent variable age 13
SF13 =~ Sfears13_Q1 + Sfears13_Q2 + Sfears13_Q3 + Sfears13_Q4 + Sfears13_Q5 + Sfears13_Q6
AT13 =~ SC13_Q1 + SC13_Q2 + SC13_Q3 + SC13_Q4 + SC13_Q5 + SC13_Q6 + SC13_Q7 + SC13_Q8 + SC13_Q9 + SC13_Q10 + SC13_Q11 + SC13_Q12

#correlations
AT10 ~~ SF10
AT13 ~~ SF13

#autoregressive and cross-lagged paths
SF10 ~ AT7 + SF7 
AT10 ~ SF7 + AT7 
SF13 ~ AT10 + SF10
AT13 ~ SF10 + AT10
'

#Cross lagged panel model 1 
ModFitLV1 <- sem(clpm1, data = FinalData, std.ov = TRUE, std.lv=TRUE, missing = "fiml", estimator = "MLR", verbose = TRUE)
summary(ModFitLV1, fit.measures =TRUE, standardized=TRUE, rsquare = TRUE, ci = TRUE)

Using the summary() command, the CIs are computed for the unstandardised coefficients but not for the standardised coefficients.

I have also tried the parameterEstimates() function (see below); however, the CIs are again only for the unstandardised regression coefficients.
parameterEstimates(ModFitLV1, standardized = TRUE, ci = TRUE, level = 0.95)


Does anyone know how to compute the CIs for the standardised regression coefficients from my model fit?

Any help would be very much appreciated!!

Thanks in advance.
Hannah

Terrence Jorgensen

May 26, 2016, 4:50:32 AM
to lavaan
Does anyone know how to compute the CIs for the standardised regression coefficients from my model fit?

You should use unstandardized coefficients for making inferences about a null hypothesis, and use standardized coefficients as a standardized measure of effect size.  But if you just want to use the width of a CI (or just the size of the SE) to describe the sampling variability you would expect for a standardized coefficient across repeated experiments, then you can get the SE from the standardizedSolution() function, and calculate a central normal-theory CI from that.
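
For illustration, here is a minimal sketch of that calculation, assuming the fitted object ModFitLV1 from earlier in this thread (standardizedSolution() returns the standardized estimates in the est.std column with their SEs in the se column):

std <- standardizedSolution(ModFitLV1)
std$ci.lower <- std$est.std - qnorm(0.975) * std$se   # lower limit of a central 95% CI
std$ci.upper <- std$est.std + qnorm(0.975) * std$se   # upper limit
std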

Terrence D. Jorgensen
Postdoctoral Researcher, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Hannah

May 26, 2016, 12:01:40 PM
to lavaan
Thank you for the advice!!

Hannah

May 31, 2016, 4:01:58 AM
to lavaan
Hi Terrence, 

Would you possibly be able to help me with one more issue I am stuck on?

I would like to extract the exact p-value of my regression coefficients from my SEM model (ModFitLV1), instead of reporting p < .001. Do you know how to do this using inspect(), parameterEstimates(), or any other function in lavaan?

Thank you in advance!

Best wishes,
Hannah




Terrence Jorgensen

May 31, 2016, 5:19:00 AM
to lavaan
I would like to extract the exact p-value of my regression coefficients from my SEM model (ModFitLV1), instead of reporting p < .001. Do you know how to do this using inspect(), parameterEstimates(), or any other function in lavaan?

You can save the output of parameterEstimates() and extract the pvalue column, as long as you know which row to look for.

params <- parameterEstimates(ModFitLV1)
params$pvalue
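
For example (purely illustrative, using one of the cross-lagged paths defined in the model above), you can pull out the p-value for a single path by subsetting on the lhs, op, and rhs columns:

subset(params, lhs == "SF10" & op == "~" & rhs == "AT7", select = pvalue)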

Or you can print the parameterEstimates() output using the print() method with the nd argument (for up to 8 digits, I think).

print(params, nd = 8)

Hannah

May 31, 2016, 7:16:01 AM
to lavaan
This worked perfectly. Thank you very much for your help!

Douglas Bonett

Apr 12, 2017, 4:27:58 PM
to lavaan
Request: Please add ci = and level = in standardizedSolution()

Mark Seeto

Apr 13, 2017, 7:08:21 AM
to lavaan


On Thursday, May 26, 2016 at 6:50:32 PM UTC+10, Terrence Jorgensen wrote:

You should use unstandardized coefficients for making inferences about a null hypothesis, and use standardized coefficients as a standardized measure of effect size.  But if you just want to use the width of a CI (or just the size of the SE) to describe the sampling variability you would expect for a standardized coefficient across repeated experiments, then you can get the SE from the standardizedSolution() function, and calculate a central normal-theory CI from that.

Terrence (or anyone else), what's the reason for using unstandardized coefficients for inferences about a null hypothesis instead of standardized coefficients? It would seem strange to me to report the p-value for the unstandardized coefficient together with the estimate and confidence interval for the standardized coefficient.

Thanks,
Mark

Terrence Jorgensen

Apr 17, 2017, 9:55:54 AM
to lavaan
Terrence (or anyone else), what's the reason for using unstandardized coefficients for inferences about a null hypothesis instead of standardized coefficients?

Historically, I think this had something to do with the SEs not being trustworthy when analyzing correlation matrices instead of covariance matrices (at least in certain models without constraining residual variances to be 1 - explained variance).  


Since null (nil) hypotheses of zero are interpreted the same way for unstandardized and standardized parameters (no effect or relationship), the same hypothesis can be tested using the unstandardized estimate's test statistic or CI.  And the standardized point estimate is used for interpretation, if we don't want reference to an arbitrarily set scale for latent common factors.  Of course, we need to be careful about interpreting standardized effect sizes that way because we use sample estimates of variances to standardize, rather than any known population variance, and variances are also affected by design issues (e.g., how homogeneous is the sampling frame?), so standardized effect sizes are often not comparable across studies anyway. Here's some interesting reading on the subject:


Of course nowadays, it has been worked out how to calculate unbiased SEs for standardized solutions, but I don't think that means unstandardized solutions should be ignored.  A host of more seasoned SEM veterans on SEMNET might have more to say about this tradition, if you want to try posting this question there:


It would seem strange to me to report the p-value for the unstandardized coefficient together with the estimate and confidence interval for the standardized coefficient.

Well, according to the APA manual, we are expected to report an estimate along with a test statistic, df (if applicable), p value, CI for the estimate, and a standardized measure of effect size.  As Dr. Bonett pointed out, it would be naïve not to report a CI along with the point estimate of a standardized effect size, because those are just as susceptible to sampling variability as unstandardized estimates.  But I am unsure whether normal-theory CIs calculated from SEs for standardized estimates are appropriate, or whether unstandardized confidence limits should be transformed -- maybe the latter is wrong if the sampling variability of estimates used to standardize also need to be considered in the transformation.  Likelihood profile CIs for standardized estimates would be great...  Anyway, I'd love to see literature on this.  Again, SEMNET might be a good place to ask about this, in case someone has already investigated the issue.

Chris I

Apr 17, 2017, 10:11:32 AM
to lavaan
Following up on this, when I run a SEM in lavaan, I get the following output:

"Regressions:
Estimate 0.232 Std.Err 0.072 z-value 3.225 P(>|z|) 0.001 Std.lv 0.217 Std.all 0.217"

Are these p values (and standard errors, z-values) given for the unstandardised coefficients then?

I would just like to confirm as I read the following in Kline's SEM textbook:

"• Do not indicate anything about statistical significance for the standardized parameter estimates unless you used a method, such as constrained estimation, that generates correct standard errors in the standardized solution."

Terrence Jorgensen

Apr 17, 2017, 10:30:02 AM
to lavaan
Are these p values (and standard errors, z-values) given for the unstandardised coefficients then?

Yes. To view SEs for the standardized coefficients, use standardizedSolution().

I would just like to confirm as I read the following in Kline's SEM textbook:

"• Do not indicate anything about statistical significance for the standardized parameter estimates unless you used a method, such as constrained estimation, that generates correct standard errors in the standardized solution."

The SEs reported in standardizedSolution() are correct.

Chris I

Apr 17, 2017, 10:41:10 AM
to lavaan
Thanks for the quick reply!

Mark Seeto

Apr 17, 2017, 8:48:47 PM
to lavaan
Thanks for the explanation, Terrence.

Yves Rosseel

Aug 30, 2017, 8:49:02 AM
to lav...@googlegroups.com
On 04/12/2017 10:27 PM, Douglas Bonett wrote:
> Request: Please add ci = and level = in standardizedSolution()

In dev 0.6 now.
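
For example, something like this (using the fitted object from earlier in this thread; exact defaults may still change before release):

standardizedSolution(ModFitLV1, ci = TRUE, level = 0.95)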

Yves.

Stas Kolenikov

Aug 30, 2017, 1:12:31 PM
to lav...@googlegroups.com
What Terrence said, in terms of methodology, is approximately true. Instead of saying "unbiased", he should have said "asymptotic". Very few things are technically unbiased in statistics: the sample mean is, when we are talking about i.i.d. data, and the Horvitz-Thompson survey estimator of totals is, when we are talking about data with no nonresponse. Everything else is biased, period. The estimates and standard errors that we get in SEM are biased, biased, biased, biased. You may get an unbiased estimate of the means and the covariance matrix, but then you twist them and screw them and stretch them and reverse them with nonlinear maximization, and with the computations that the standard errors require on top of that. (Read Browne 1984 to get a feeling for just how complex those computations are.)

There are two main issues involved. First, the standard errors for standardized coefficients are obtained through the delta method, and with all the nonlinearities involved, the promised land of asymptotia is pushed a little bit further away. Any nonlinearity introduces or exacerbates small-sample biases, and introduces or blows up higher-order moments like skewness and kurtosis of the sampling distributions. Take the log of a positive random variable, and the mean of the logs is not equal to the log of the mean (read up on the lognormal distribution). What the delta method says is that as the sampling distribution gets tighter around the population value (which we hope to achieve with larger sample sizes), the variance of the nonlinear transformation gets tighter as well, in some predictable fashion. However, this is only an approximation -- and as an approximation, by Murphy's law, it is usually biased downward, making the standard errors anticonservative. Note that the standard errors for unstandardized coefficients are themselves obtained by delta-method approximations, and as such are less reliable than, say, the standard errors for the item means (as if anybody is interested in that trivia). So the unstandardized standard errors aren't that great to begin with, and by convoluting things further to produce standardized coefficients, you make them even less great.
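
A tiny R illustration of that log example (not lavaan-specific; the numbers are approximate):

set.seed(1)
x <- rlnorm(1e5, meanlog = 0, sdlog = 1)  # a positive random variable
mean(log(x))   # approximately 0: the mean of the logs
log(mean(x))   # approximately 0.5: the log of the mean -- not the same thing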

The second issue is that most analytical derivations (have to) assume that the model is correctly specified. When it isn't, things may get iffy. If the model is not correctly specified, then you may end up dividing your coefficient by the wrong quantity. In a more subtle way, when the SEM model is incorrect, ADF/WLS standard errors are too small. They are not necessarily disastrously small, but enough to get worried about (in my simulations, the CIs had something like upper-80% coverage; in some disastrous examples in other areas of statistics, I have seen a simulated 20% actual coverage).

Terrence points to the possibility of transforming the unstandardized CIs, or of using likelihood profiles. These are good ideas -- but then again, the likelihood profile mostly makes sense when the model is exactly right and you have a multivariate normal distribution to deal with. With non-normal data, you don't have that luxury.

-- Stas Kolenikov, PhD, PStat (ASA, SSC) @StatStas
-- Senior Scientist, Abt Associates @AbtDataScience
-- Program Chair (2018), Survey Research Methods Section of the American Statistical Association
-- Opinions stated in this email are mine only, and do not reflect the position of my employer
-- http://stas.kolenikov.name