Hi all,
I have been trying to fit a CFA model to more than 50 items measured on a 5-category Likert scale, with 8 latent factors and a sample size of over 28,000. I used WLSMV as the estimator because it is recommended for CFAs with ordinal data.
In my first preliminary round of CFAs, I used listwise deletion because that is the default method of handling missing data in lavaan. After listwise deletion, my sample size dropped by roughly 20%. That is a substantial loss, but ultimately not a big deal, as my sample is still quite large. However, I understand that listwise deletion is not ideal because it assumes that the data are missing completely at random (MCAR). I ran Little's MCAR test in SPSS and, unfortunately, it came out statistically significant, so the MCAR assumption is rejected and listwise deletion is not a viable approach.
Unfortunately, according to Timothy Brown, there are issues with pairwise deletion too (e.g., if the data are missing at random (MAR), parameter estimates as well as standard errors are severely biased).
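To make the difference between the two deletion methods concrete, here is a minimal sketch of how the usable N is counted under each. It is plain Python rather than lavaan code, the responses are made-up toy data (not my actual dataset), and the helper names `listwise_n` and `pairwise_n` are hypothetical, not part of any package.

```python
from itertools import combinations

# Hypothetical toy responses: 6 respondents x 3 Likert items (1-5).
# None marks a missing answer.
rows = [
    [4, 5, 3],
    [2, None, 1],
    [5, 4, None],
    [3, 3, 2],
    [None, 2, 4],
    [1, 1, 5],
]

def listwise_n(data):
    """Count respondents who answered every item (complete cases)."""
    return sum(all(v is not None for v in row) for row in data)

def pairwise_n(data):
    """For each pair of items, count respondents who answered both."""
    n_items = len(data[0])
    return {
        (i, j): sum(row[i] is not None and row[j] is not None for row in data)
        for i, j in combinations(range(n_items), 2)
    }

complete = listwise_n(rows)
pairs = pairwise_n(rows)

print(f"listwise N = {complete} of {len(rows)}")
for pair, n in pairs.items():
    print(f"items {pair}: pairwise N = {n}")
```

Listwise deletion keeps only the fully complete rows, whereas pairwise deletion uses a different subsample for each pair of items. Neither count says anything about whether the resulting estimates are biased; that depends on the missingness mechanism.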
So I have a few questions: Is WLSMV still the best estimator given the nature of the missing data in my sample? If not, is there another estimator that handles both ordinal data and missing data well? If no other estimator can handle ordinal data the way WLSMV does, do people recommend imputation as the next step? If so, which imputation method? I really hope I don’t have to impute, but I will if that’s the best way of handling it.
Many thanks to those who have been helping me on my journey with CFAs. I hope to continue getting valuable insights from you and the rest of the lavaan community!
Anh
Because my data are ordinal, they do not have a multivariate normal distribution, so I wonder whether multiple imputation is still the best approach?
How can I efficiently test for MCAR and MAR in R, especially when I have close to 60 variables in my dataset?
Do you have a reference for what you said, i.e., "even if imperfect, imputation model is probably going to lead to less biased estimates than listwise deletion"?