Algorithm used in lavPredict

Yue Teng

Jun 28, 2017, 11:25:41 PM

to lavaan

Hello,

I'm doing a project that uses factor scores in causal effect estimation. I obtained the factor scores of a latent variable in SEM models using the lavPredict function. I want to know the exact algorithm used by lavPredict so I can compare the results of different methods of factor scoring (i.e., the regression method and Bartlett's method). My models are:

"f =~ y1 + y2 + y3 + y4 + y5
f ~ x1 + x2 + x3
x1 ~~ x2 + x3
x2 ~~ x3" (an SEM model), and

"f =~ y1 + y2 + y3 + y4 + y5" (a CFA model), where f is the latent variable whose factor score is to be estimated.

Many thanks in advance.

Yue

Terrence Jorgensen

Jun 29, 2017, 4:30:16 AM

to lavaan

You can ask general questions on SEMNET:

www2.gsu.edu/~mkteer/semnet.html

It is also easy to find descriptions and formulas by searching online. This was the first hit I found:

http://pareonline.net/getvn.asp?v=14&n=20

The text describes both methods, and the appendix has formulae.

Terrence D. Jorgensen

Postdoctoral Researcher, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam

UvA web page: http://www.uva.nl/profile/t.d.jorgensen

Yue Teng

Jun 29, 2017, 2:52:51 PM

to lavaan

Thanks Terrence, I'll try posting my question on SEMNET too. Actually, I was aware of and had read the paper you referred to (by DiStefano). It talked about the calculation of factor scores under the EFA framework. I wonder whether lavPredict uses the same formulae, because under the SEM framework latent factors simultaneously predict and are predicted by indicators, predictors, and covariates. Also, the output of an SEM model doesn't readily provide the pattern/structure matrices that the described formulas require. I've also read other papers on factor scores (e.g., Grice 2001), but they all work under the EFA framework. I still couldn't figure out how lavPredict obtains the factor scores.

Terrence Jorgensen

Jun 30, 2017, 5:33:42 AM

to lavaan

CFA is just a restricted EFA, so the same formulas apply. You can just calculate factor scores from a CFA instead of an SEM (not sure why you would need factor scores anyway, if you can estimate your structural model -- using factor scores would mean you can't trust your SEs or test stats). But the model-implied factor covariance matrix can be calculated from the residual covariance matrix and regression coefficients, so I don't see how having latent regressions would prevent factor scores from being calculated. The pattern matrix is the matrix of factor loadings:

lavInspect(fit, "est")$lambda

If you want the model-implied covariances among factors:

lavInspect(fit, "cov.lv")

The structure matrix is the correlation between indicators and factors, which is easy to extract:

https://groups.google.com/d/msg/lavaan/d_VFwIn-2EM/e3k3qp_vBAAJ
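As a rough sketch of how these pieces combine under the regression method, the base-R snippet below builds the factor score weights from a loading matrix, a factor covariance matrix, and residual variances. The numeric values are made up for illustration; with a fitted model they would presumably come from lavInspect(fit, "est")$lambda, lavInspect(fit, "cov.lv"), and lavInspect(fit, "est")$theta. This is the textbook formula, not necessarily what lavPredict does line for line.

```r
# Hypothetical parameter values for a one-factor model with five indicators.
# With a fitted lavaan model, these would come from lavInspect().
lambda <- matrix(c(1.0, 0.8, 0.9, 0.7, 0.6), ncol = 1)  # loadings (pattern matrix)
phi    <- matrix(1.5, 1, 1)                              # factor covariance ("cov.lv")
theta  <- diag(c(0.5, 0.6, 0.4, 0.7, 0.8))               # residual variances

# Model-implied covariance matrix of the indicators
sigma <- lambda %*% phi %*% t(lambda) + theta

# Structure matrix: correlations between indicators and factors
structure <- diag(1 / sqrt(diag(sigma))) %*% lambda %*% phi %*%
  diag(1 / sqrt(diag(phi)), nrow = nrow(phi))

# Regression-method factor score weights (one row per factor)
w_reg <- phi %*% t(lambda) %*% solve(sigma)

# Algebraically equivalent form that only inverts theta and a small matrix
theta_inv <- solve(theta)
w_alt <- solve(solve(phi) + t(lambda) %*% theta_inv %*% lambda) %*%
  t(lambda) %*% theta_inv

# Apply the weights to mean-centered data (simulated here for illustration)
set.seed(1)
y <- matrix(rnorm(20 * 5), ncol = 5)
scores <- scale(y, center = TRUE, scale = FALSE) %*% t(w_reg)
```

With latent regressions in the model, phi should be the model-implied factor covariance matrix rather than the residual (psi) matrix, which is why "cov.lv" is the relevant piece to extract.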

yros...@gmail.com

Jun 30, 2017, 5:40:19 AM

to lavaan

On 06/29/2017 08:52 PM, Yue Teng wrote:

talked about the calculation of factor scores under the EFA framework.

But that does not matter. The formulas are identical. Yet another source for the formulas (in the continuous case only):

Devlieger, I., Mayer, A., & Rosseel, Y. (2016). Hypothesis testing using factor score regression: A comparison of four methods. Educational and Psychological Measurement, 76(5), 741-770.

And of course, you can check the source code in the file lav_predict.R. Also from the source code, on the Bartlett method:

# factor scores - normal case - Bartlett method
# NOTES: 1) this is the classic 'Bartlett' method; for the linear/continuous
#           case, this is equivalent to 'ML'
#        2) the usual formula is:
#               FSC = solve(lambda' theta.inv lambda) (lambda' theta.inv)
#           BUT to deal with zero or negative variances, we use the
#           'GLS' version instead:
#               FSC = solve(lambda' sigma.inv lambda) (lambda' sigma.inv)
#           Reference: Bentler & Yuan (1997) 'Optimal Conditionally Unbiased
#                      Equivariant Factor Score Estimators'
#                      in Berkane (Ed) 'Latent variable modeling with
#                      applications to causality' (Springer-Verlag)
#        3) instead of solve(), we use MASS::ginv, for special settings where
#           -by construction- (lambda' sigma.inv lambda) is singular
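
To illustrate note 2, the following base-R sketch checks numerically that the classic and 'GLS' versions of the Bartlett weights coincide when theta is positive definite. The parameter values and the helper name bartlett_weights are invented for illustration; this is only a sketch of the formulas in the comments above, not a copy of the code in lav_predict.R.

```r
# Hypothetical parameter values for f =~ y1 + y2 + y3 + y4 + y5
lambda <- matrix(c(1.0, 0.8, 0.9, 0.7, 0.6), ncol = 1)  # loadings
phi    <- matrix(1.5, 1, 1)                              # factor variance
theta  <- diag(c(0.5, 0.6, 0.4, 0.7, 0.8))               # residual variances
sigma  <- lambda %*% phi %*% t(lambda) + theta           # model-implied covariance

# Illustrative helper: Bartlett-type weights for a given inverse weight matrix.
# 'inverse' is the function used to invert (lambda' w_inv lambda).
bartlett_weights <- function(lambda, w_inv, inverse = solve) {
  inverse(t(lambda) %*% w_inv %*% lambda) %*% t(lambda) %*% w_inv
}

fsc_classic <- bartlett_weights(lambda, solve(theta))  # uses theta.inv
fsc_gls     <- bartlett_weights(lambda, solve(sigma))  # uses sigma.inv

# Same GLS weights, with a generalized inverse in place of solve()
if (requireNamespace("MASS", quietly = TRUE)) {
  fsc_ginv <- bartlett_weights(lambda, solve(sigma), inverse = MASS::ginv)
} else {
  fsc_ginv <- fsc_gls  # fallback so the sketch still runs without MASS
}

# The two versions agree here because theta is positive definite
agree <- isTRUE(all.equal(fsc_classic, fsc_gls))

# Conditional unbiasedness: FSC %*% lambda is the identity matrix
unbiased <- isTRUE(all.equal(fsc_gls %*% lambda, diag(1),
                             check.attributes = FALSE))
```

The GLS form never inverts theta, so it can still be evaluated when a residual variance is estimated as zero or negative, as long as sigma itself is invertible.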

Yves.

Yue Teng

Jun 30, 2017, 4:25:16 PM

to lavaan

I see! I now have a better understanding of EFA and CFA. Thank you so much, Terrence and Yves, for your explanations and references!

Yue
