Understanding std.all estimates

Ralph Peterson

Jun 27, 2019, 3:56:50 AM
to lavaan
Hi everyone,

I suppose this is a very basic SEM question, but I'm having a hard time understanding the standardized estimates.

I tried simulating data and fitting a model like this:

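library(lavaan)
set.seed(42)
y  <- rnorm(1000, 5, 3)          # common source of the three indicators
x1 <- y + rnorm(1000, 0, 1)
x2 <- y + rnorm(1000, 0, 2)
x3 <- y + rnorm(1000, 0, 3)
df <- data.frame(y, x1, x2, x3)

# One factor with all loadings fixed to 1; the commented-out lines would
# constrain the residual variances to be equal via the shared label a
model <- 'y =~ 1*x1+1*x2+1*x3
#x1~~a*x1
#x2~~a*x2
#x3~~a*x3
'
fit <- sem(model, data = df, std.ov = TRUE)
summary(fit, standardized = TRUE)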

Given that I use std.ov in the sem command, all indicators should be standardized. I would assume that should lead to a case where the std.ov and std.all columns are equal, given that the indicators are already standardized.
But I get this output:

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  y =~                                                                 
    x1                1.000                               0.856    0.898
    x2                1.000                               0.856    0.843
    x3                1.000                               0.856    0.788

Only when I constrain the residual variances to be equal (the commented-out lines in the model) do I get the same std.all estimate for every indicator, so it seems they are somehow responsible?

Any help to understand would be very much appreciated!

Thanks
Ralph

Edward Rigdon

Jun 27, 2019, 1:49:06 PM
to lav...@googlegroups.com
The y in your model is not the y in your data frame. You used the =~ operator, so lavaan created a new common factor named y. That common factor is not standardized--its variance is set by the first loading fixed to 1. So your std.all results reflect standardization of the factor as well, while the std.ov results do not.
--Ed Rigdon

Ralph Peterson

Jun 27, 2019, 2:48:08 PM
to lavaan
Thanks for your answer! Sorry, my question was posed in a very confusing way. I should not have included y in the data frame. Of course, the latent factor y is of interest.
What I meant was: why are the Std.all estimates different from the Std.lv ones, given that the observed variables are standardized anyway and both kinds of estimates standardize the latent factor?

Terrence Jorgensen

Jun 27, 2019, 5:35:37 PM
to lavaan
> why are the Std.all estimates different from the Std.lv ones, given that the ovs are standardized anyway and both kinds of estimates standardize the latent factor?

Because the parameters are standardized using the model-implied SDs (not "observed" SDs, which are 1 by definition here).  You are fitting a restricted model, so the sample statistics are not perfectly reproduced:

lavInspect(fit, "sampstat") # observed
lavInspect(fit, "cov.ov")   # model-implied
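
As a rough sketch of that comparison, the observed and model-implied SDs can be put side by side. This assumes the same simulated data as in the first post; the factor is renamed f here only to avoid clashing with the data column y, and the names sd_obs and sd_imp are made up for illustration.

```r
library(lavaan)

# Re-create the example from the first post (factor renamed to f)
set.seed(42)
y  <- rnorm(1000, 5, 3)
df <- data.frame(x1 = y + rnorm(1000, 0, 1),
                 x2 = y + rnorm(1000, 0, 2),
                 x3 = y + rnorm(1000, 0, 3))
fit <- sem('f =~ 1*x1 + 1*x2 + 1*x3', data = df, std.ov = TRUE)

# Observed SDs: essentially 1, because std.ov = TRUE rescales the data
sd_obs <- sqrt(diag(lavInspect(fit, "sampstat")$cov))

# Model-implied SDs: need not equal 1 in a restricted model
sd_imp <- sqrt(diag(lavInspect(fit, "cov.ov")))

round(cbind(observed = sd_obs, implied = sd_imp), 3)
```

If the two columns differ, dividing by the implied SD instead of the observed one is what makes Std.all differ from Std.lv.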

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Ralph Peterson

Jun 28, 2019, 4:44:07 AM
to lavaan
Thank you very much!
That explains a lot - so the standardization takes place after model estimation.

Could you write out the formulas for the standardization in both cases? I think I'm still struggling with how error terms play a role here.
Sorry for the very basic questions.

Ralph Peterson

Jun 28, 2019, 5:07:39 AM
to lavaan
It seems like it's
Std.lv Loading = Loading * SD (Latent Variable)
Std.all Loading = Loading * SD (Latent Variable) / SD (Estimated Indicator)

But why the multiplication?

Terrence Jorgensen

Jun 29, 2019, 1:18:02 PM
to lavaan
> Std.lv Loading = Loading * SD (Latent Variable)
> Std.all Loading = Loading * SD (Latent Variable) / SD (Estimated Indicator)

Correct
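
A minimal sketch for checking both formulas by hand, assuming the simulated data from the first post (factor renamed f; the names sd_lv, sd_ov and by_hand are illustrative only). The SDs are the model-implied ones, taken from lavInspect():

```r
library(lavaan)

set.seed(42)
y  <- rnorm(1000, 5, 3)
df <- data.frame(x1 = y + rnorm(1000, 0, 1),
                 x2 = y + rnorm(1000, 0, 2),
                 x3 = y + rnorm(1000, 0, 3))
fit <- sem('f =~ 1*x1 + 1*x2 + 1*x3', data = df, std.ov = TRUE)

loading <- 1                                               # all loadings are fixed to 1
sd_lv <- as.numeric(sqrt(diag(lavInspect(fit, "cov.lv")))) # model-implied SD of the factor
sd_ov <- sqrt(diag(lavInspect(fit, "cov.ov")))             # model-implied SDs of x1..x3

by_hand <- cbind(std.lv  = rep(loading * sd_lv, 3),
                 std.all = loading * sd_lv / sd_ov)

# lavaan's own standardized loadings, for comparison
pe  <- parameterEstimates(fit, standardized = TRUE)
lam <- pe[pe$op == "=~", c("rhs", "std.lv", "std.all")]

by_hand
lam
```

The hand-computed columns should match the Std.lv and Std.all columns that summary() prints for the loadings.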

> But why the multiplication?

Think of Cov(X,Y) (i.e., the covariance between X and Y) as the fundamental information about their linear relationship.  A regression slope and a correlation coefficient both simply divide the covariance to express it relative to something of interest.
  1. Slope of Y regressed on X = Cov(X,Y) / Var(X) = Cov(X,Y) / [SD(X) * SD(X)]
  2. Cor(X,Y) = Cov(X,Y) / [SD(X) * SD(Y)]
Note that in factor analysis, the predictor X is the latent factor, and the outcome Y is the indicator.  Transforming the simple slope (1) above into the correlation (2) involves removing one of the SD(X) in the denominator of the slope (i.e., multiplying by SD(X) to cancel it out) and replacing it with SD(Y).  
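
The same transformation can be tried on plain observed variables. This is only an illustration with made-up simulated data (x stands in for the factor, y for the indicator), not part of the lavaan model in this thread:

```r
set.seed(1)
x <- rnorm(500, mean = 0, sd = 2)       # predictor
y <- 0.7 * x + rnorm(500)               # outcome

slope <- cov(x, y) / var(x)             # (1) unstandardized slope
r_xy  <- cov(x, y) / (sd(x) * sd(y))    # (2) correlation

# Multiply by SD(X) to cancel one SD(X) in the denominator, then divide by SD(Y)
slope_to_r <- slope * sd(x) / sd(y)

# Both comparisons should agree up to floating-point error
all.equal(slope_to_r, r_xy)
all.equal(r_xy, cor(x, y))
```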

Partial regression/correlation coefficients controlling for a third variable Z (or more) are more complex because they subtract out the covariances/correlations of Z with X and Y, but the principle is the same. Note, however, that the formula above transforms a partial slope into a standardized slope, which is not equivalent to a partial correlation.
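
To see that last point numerically, here is a small sketch with made-up simulated data (the coefficients 0.6, 0.5 and 2 are arbitrary choices for illustration). It compares the standardized partial slope with the partial correlation, the latter computed from residuals after regressing out z:

```r
set.seed(2)
n <- 500
z <- rnorm(n)
x <- 0.6 * z + rnorm(n)
y <- 0.5 * x + 2 * z + rnorm(n)

# Partial slope of y on x, controlling for z
b_x <- unname(coef(lm(y ~ x + z))["x"])

# Standardized slope: same transformation as for the simple slope
beta_x <- b_x * sd(x) / sd(y)

# Partial correlation: correlate the parts of x and y not explained by z
pcor_xy <- cor(resid(lm(x ~ z)), resid(lm(y ~ z)))

c(standardized_slope = beta_x, partial_correlation = pcor_xy)
```

The two values differ because the standardized slope rescales by the total SDs of x and y, whereas the partial correlation rescales by the SDs of what is left after removing z.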

In the case of Std.lv, only the latent variable (the predictor X in the example above) is standardized: you assume only that the factor variance is 1, so you only have to cancel out the factor's units with the transformation formula.

Ralph Peterson

Jun 29, 2019, 2:36:23 PM
to lavaan
Thank you so much! That was very helpful and well-explained!