Why Is CFA Excluding Cases and How to Override It?

Jonathan Dinsmore

Jul 27, 2022, 5:47:52 PM
to lavaan
Hi, I'm running CFA with lavaan, and it's only using a portion of my cases:

                                                  Used       Total
  Number of observations                           260         309


I want to overcome this issue. If it's because some of the observations have a missing value here or there, I would like to insert the mean into those missing values (impute?). If it's some other reason, I'd like to do whatever is needed to make it work with the full set of 309. I don't even know how to investigate why it's doing this.

Can anyone help me with this? It's the first analysis I've ever done in R, so if you have any suggestions, please provide the specific code if you can.

Thanks!

Alejandro Hermida

Jul 28, 2022, 2:07:18 AM
to lavaan
Hi!

This is because lavaan performs listwise deletion by default. You can add the argument missing = "fiml" to your cfa() call, and it will handle the missing values using Full Information Maximum Likelihood (FIML), which is not imputation. I would also use maximum likelihood with robust standard errors, via the argument estimator = "MLR", to account for any non-normality in your data (the default is ML). Here are references for both issues:

Enders, C. K., & Bandalos, D. L. (2001). The relative performance of full information maximum likelihood estimation for missing data in structural equation models. Structural Equation Modeling, 8(3), 430-457.
Zhong, X., & Yuan, K. H. (2011). Bias and efficiency in structural equation modeling: Maximum likelihood versus robust methods. Multivariate Behavioral Research, 46(2), 229-265.

Hope this helps,

Alejandro 
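
For readers who want to see the two arguments in context, here is a minimal sketch. It assumes the HolzingerSwineford1939 example data that ships with lavaan, with missing values injected artificially for illustration; the model and variable names are not from the original poster's analysis.

```r
library(lavaan)

# Example data shipped with lavaan. The NAs are injected artificially,
# purely to reproduce the "Used < Total" situation.
set.seed(1)
dat <- HolzingerSwineford1939[, paste0("x", 1:9)]
dat$x1[sample(nrow(dat), 20)] <- NA
dat$x5[sample(nrow(dat), 15)] <- NA

model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'

# Rows with no missing value on any model variable:
# this is what listwise deletion keeps.
n_complete <- sum(complete.cases(dat))

# Default behaviour: listwise deletion.
fit_listwise <- cfa(model, data = dat)

# FIML for the missing values, MLR for robust standard errors.
fit_fiml <- cfa(model, data = dat, missing = "fiml", estimator = "MLR")

# Compare the number of cases each fit actually used.
c(complete = n_complete,
  listwise = lavInspect(fit_listwise, "nobs"),
  fiml     = lavInspect(fit_fiml, "nobs"))

summary(fit_fiml, fit.measures = TRUE)
```

Comparing sum(complete.cases(...)) on the model variables with the "Used" count is a quick way to confirm that missing values are the reason cases were dropped.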

Jonathan Dinsmore

Jul 28, 2022, 9:44:07 AM
to lavaan
Alejandro, 

Thank you very much. After implementing what you suggested, I believe that lavaan is indeed using all the cases; however, it tells me that I have "missing patterns." I'm not sure what this means. I suppose the main thing is to understand what to report when I go to publish these findings.

Here is the top area of the results, in case it gives any clues (see attached).

Thanks!



Screen Shot 2022-07-28 at 9.43.38 AM.png

Daniel Morillo Cuadrado

Jul 28, 2022, 10:11:16 AM
to lav...@googlegroups.com
Having "missing patterns" means exactly that: you have response patterns (i.e., the response vector for one participant/case) with missing values in some of the variables.
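
As a rough illustration of what a "pattern" is, the sketch below tabulates observed/missing patterns in base R. The data frame and item names are hypothetical toy values. On a fitted model, lavInspect(fit, "patterns") and lavInspect(fit, "coverage") should give related information directly.

```r
# Hypothetical toy data: three items, five respondents.
dat <- data.frame(
  item1 = c(1, 2, NA, 4, NA),
  item2 = c(3, NA, 2, 5, 1),
  item3 = c(2, 3, 4, 1, 5)
)

# Encode each row as a string: "1" = observed, "0" = missing.
pattern <- apply(ifelse(is.na(dat), "0", "1"), 1, paste, collapse = "")

# Frequency of each distinct pattern, most common first.
pattern_counts <- sort(table(pattern), decreasing = TRUE)
pattern_counts

# Number of distinct patterns (here including the fully observed one).
n_patterns <- length(pattern_counts)
n_patterns
```

Each distinct string is one pattern, so two respondents who skipped the same items share a pattern.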

Jonathan Dinsmore

Jul 28, 2022, 11:22:06 AM
to lavaan
Hi Daniel, 

Thanks, I see, so in terms of reporting the analysis, I should probably have some more detailed metric of the missing vectors, right? Any way for me to compute that? 

Appreciate your help. 

Thanks,
Jonathan

Alejandro Hermida

Jul 29, 2022, 3:12:14 AM
to lavaan
Hi Jonathan,

You can simply report the missingness in your variables for this. This can be done in different ways. For example, if the CFA is central to your analysis, you could add a "%NA" column to the table where you present item details (e.g., loadings, wording). If the CFA is less central or you don't present such a table, you could report Nmin and Nmax, the minimum and maximum number of observations used for the analysis, in the text or in the table that includes the fit indices of the CFA. In any case, the missing values for your items can be computed easily with base R functions such as table(data$your_item, useNA = "always") for single items, colSums(is.na(data)) for the whole data set, or using special packages such as this one. This resource explains in detail best practices for discussing missingness in your manuscript:

Newman, D. A. (2014). Missing data: Five practical guidelines. Organizational Research Methods, 17(4), 372-411.

Hope this helps,

Alejandro 
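
A minimal base R sketch of the quantities mentioned above (per-item missing counts, a %NA column, and Nmin/Nmax). The data frame and item names are hypothetical; substitute your own items.

```r
# Hypothetical toy data: three items, six respondents.
dat <- data.frame(
  item1 = c(1, 2, NA, 4, NA, 3),
  item2 = c(3, NA, 2, 5, 1, 4),
  item3 = c(2, 3, 4, 1, 5, 2)
)

# Frequency table for a single item, with NA shown as its own category.
table(dat$item1, useNA = "always")

# Missing and observed counts per item, and percent missing.
n_missing   <- colSums(is.na(dat))
n_observed  <- colSums(!is.na(dat))
pct_missing <- round(100 * colMeans(is.na(dat)), 1)

# Smallest and largest per-item sample size (Nmin and Nmax).
n_min <- min(n_observed)
n_max <- max(n_observed)

# One row per item, ready to merge into an item-level table.
missing_table <- data.frame(
  item = names(dat),
  n_observed,
  n_missing,
  pct_missing,
  row.names = NULL
)
missing_table
```

The pct_missing column is what would go into a "%NA" column of an item table; n_min and n_max cover the in-text alternative.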

Jonathan Dinsmore

Jul 29, 2022, 10:10:17 AM
to lavaan
Awesome, that's very helpful, Alejandro! Thanks so much. I will go through all this when I next have a chance, and will get back to you all if I have further questions. Cheers!