Possible to sidestep measurement model estimation?

Matti Cervin

Nov 1, 2019, 12:36:28 PM

to lavaan

I am fitting a large CFA model to a large dataset. The model includes eight latent variables (five first-order and three second-order). All indicators are binary. Model fit is good, and I want to move on to examining associations between these eight latent variables and other variables in the dataset. Is there a way to do this without having to run a new CFA or SEM each time? Can I somehow extract each participant's values on each latent variable into the original dataset and then run regressions on that dataset? At the moment, I run a separate model for each regression. Each run takes about 20 minutes, so it would be nice to sidestep this part of the process.

Mauricio Garnier-Villarreal

Nov 1, 2019, 6:18:03 PM

to lavaan

For now, the best way to get proper estimates using factor scores is the plausibleValues() function from the semTools package. It draws multiple sets of plausible factor scores, and you then estimate the model on these as multiply imputed data sets.

If you extract only one set of factor scores, this will most likely bias the results.
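
A minimal sketch of that workflow, under several assumptions: it uses lavaan's built-in HolzingerSwineford1939 data with continuous indicators rather than the binary-indicator model from the question, the second-step regression (visual on ageyr and sex) is purely illustrative, and the pooling is done by hand with Rubin's rules instead of a dedicated pooling function. It also assumes that each data frame returned by plausibleValues() has a case.idx column identifying the original rows, with a fallback to row order if it does not. Whether plausibleValues() works for a model with binary indicators depends on the installed lavaan and semTools versions and should be checked before relying on it.

```r
library(lavaan)
library(semTools)

# Step 1: fit the measurement model once (toy model, continuous indicators)
HS.model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'
dat <- HolzingerSwineford1939
fit <- cfa(HS.model, data = dat)

# Step 2: draw several sets of plausible factor scores
pv_list <- plausibleValues(fit, nDraws = 20, seed = 12345)

# Step 3: run the same regression in every draw
fits <- lapply(pv_list, function(pv) {
  # assumption: case.idx maps each row back to the original data
  idx <- if ("case.idx" %in% names(pv)) pv$case.idx else seq_len(nrow(pv))
  d <- cbind(pv, dat[idx, c("ageyr", "sex")])
  lm(visual ~ ageyr + sex, data = d)
})

# Step 4: pool across draws with Rubin's rules
m    <- length(fits)
est  <- sapply(fits, coef)                          # coefficients x draws
uvar <- sapply(fits, function(f) diag(vcov(f)))     # within-draw variances
qbar <- rowMeans(est)                               # pooled estimates
ubar <- rowMeans(uvar)                              # mean within-draw variance
bvar <- apply(est, 1, var)                          # between-draw variance
pooled <- data.frame(estimate = qbar,
                     se = sqrt(ubar + (1 + 1 / m) * bvar))
print(pooled)
```

The between-draw variance term is what a single set of factor scores leaves out, which is why standard errors from one extracted set come out too small. Note also that variables which were not part of the first-step model (here ageyr and sex) played no role in estimating the factor scores, so their associations with the scores may not be fully accounted for.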

Terrence Jorgensen

Nov 2, 2019, 12:13:00 PM

to lavaan

the best way to get proper estimates using factor scores is the plausibleValues() function from the semTools package. It draws multiple sets of plausible factor scores, and you then estimate the model on these as multiply imputed data sets.

If you extract only one set of factor scores, this will most likely bias the results.

Specifically, the SEs (and therefore test statistics) will be biased because estimated factor scores will be treated as known/observed data.

There is also an analytical alternative that should be more efficient: it is less computationally intensive and yields smaller SEs, because the plausible-values approach carries additional Monte Carlo error (this is analogous to comparing multiple imputation to FIML). Unfortunately, the fsr() implementation in lavaan is very "beta" and not yet ready for public consumption (research is still needed to reveal the wisest default options). But when it is, we might never need to simultaneously estimate these big SEM models again :-)

https://doi.org/10.1177/0013164415607618

https://doi.org/10.1027/1614-2241/a000130

Terrence D. Jorgensen

Assistant Professor, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam

http://www.uva.nl/profile/t.d.jorgensen

Matti Cervin

Nov 2, 2019, 3:04:16 PM

to lav...@googlegroups.com

Thanks to both of you, really interesting reading. I guess I just have to buy a more powerful computer while waiting for fsr()!
