Possible to sidestep measurement model estimation?


Matti Cervin

Nov 1, 2019, 12:36:28 PM
to lavaan
I am fitting a big CFA model to a large dataset. The model includes eight latent variables (five first-order and three second-order). All indicators are binary. Model fit is good, and I want to move on to examining associations between these eight latent variables and other variables in the dataset. Is there a way to do this without having to run a new CFA or SEM each time? Can I somehow extract each participant's values on each latent variable into the original dataset and then run regressions on that dataset? At the moment, I run a separate model for each regression. Each run takes about 20 minutes, so it would be nice to sidestep this part of the process.

Mauricio Garnier-Villarreal

Nov 1, 2019, 6:18:03 PM
to lavaan
For now, the best way to get proper estimates using factor scores is the plausibleValues() function from the semTools package. It draws multiple sets of plausible factor scores, and you then estimate the model on these as multiply imputed data sets.

If you extract only one set of factor scores, that will most likely bias the results.
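
A minimal sketch of the plausible-values workflow described above, not taken from the thread. It assumes lavaan's built-in HolzingerSwineford1939 data with continuous indicators (the original question involves binary indicators, and whether plausibleValues() supports that case should be checked against the semTools documentation), and it assumes the returned data frames carry one column per latent variable. The second-stage regression and the manual pooling are illustrative choices only.

```r
library(lavaan)    # cfa(), HolzingerSwineford1939
library(semTools)  # plausibleValues()

# Stage 1: fit the measurement model once.
# Assumption: continuous indicators from the built-in example data.
HS.model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'
fit <- cfa(HS.model, data = HolzingerSwineford1939)

# Draw several sets of plausible factor scores (a list of data frames).
pv <- plausibleValues(fit, nDraws = 20, seed = 12345)

# Stage 2: run the same (cheap) regression on every draw.
fits <- lapply(pv, function(d) lm(speed ~ visual + textual, data = d))

# Pool across draws with Rubin's rules, as for multiply imputed data.
m       <- length(fits)
est     <- sapply(fits, coef)                        # coefficients x draws
var_w   <- sapply(fits, function(f) diag(vcov(f)))   # squared SEs x draws
within  <- rowMeans(var_w)                           # average sampling variance
between <- apply(est, 1, var)                        # variance across draws

pooled_est <- rowMeans(est)
pooled_se  <- sqrt(within + (1 + 1 / m) * between)

cbind(estimate = pooled_est, se = pooled_se)
```

The measurement model is estimated only once; each additional regression then costs a handful of lm() calls rather than a full SEM run.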

Terrence Jorgensen

Nov 2, 2019, 12:13:00 PM
to lavaan
> the best way to get proper estimates using factor scores is the plausibleValues() function from the semTools package. It draws multiple sets of plausible factor scores, and you then estimate the model on these as multiply imputed data sets.
>
> If you extract only one set of factor scores, that will most likely bias the results.

Specifically, the SEs (and therefore test statistics) will be biased because estimated factor scores will be treated as known/observed data.

There is also an analytical alternative that should be more efficient: it is less computationally intensive and yields smaller SEs, because the plausible-values approach adds Monte Carlo error (this is analogous to comparing multiple imputation to FIML). Unfortunately, the fsr() implementation in lavaan is very "beta" and not yet ready for public consumption (research is still needed to reveal the wisest default options). But when it is, we might never need to simultaneously estimate these big SEM models again :-)
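
To make the Monte Carlo error concrete, here are the standard multiple-imputation pooling rules (Rubin's rules), assuming they are applied to m sets of plausible values. The symbols are introduced here for illustration and do not come from the thread: \hat{\theta}_j is the estimate from draw j and U_j is its squared SE.

```latex
\bar{\theta} = \frac{1}{m}\sum_{j=1}^{m}\hat{\theta}_j, \qquad
\bar{U} = \frac{1}{m}\sum_{j=1}^{m} U_j, \qquad
B = \frac{1}{m-1}\sum_{j=1}^{m}\bigl(\hat{\theta}_j - \bar{\theta}\bigr)^2

T = \bar{U} + \Bigl(1 + \frac{1}{m}\Bigr) B, \qquad
\mathrm{SE}(\bar{\theta}) = \sqrt{T}
```

The B/m part of T reflects the use of a finite number of draws and shrinks as m grows; an analytical correction has no such term. Analyzing a single set of factor scores as if it were observed data reports only the within-draw variance U and leaves out B altogether.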


Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Matti Cervin

Nov 2, 2019, 3:04:16 PM
to lav...@googlegroups.com
Thanks both of you - really interesting reading. Guess I just have to buy a stronger computer while waiting for fsr()!
