WLSMV Estimator in CFA with missing data / multiple imputation

Jan Brederecke

Oct 11, 2018, 4:31:52 AM

to lavaan

Hello everyone,

I am confronted with the following problem:

To evaluate a questionnaire's factor structure via confirmatory factor analysis, we collected data from 140 participants.

Of these, 37 have either one or two items missing.

The data are ordinal, so I concluded that I have to use the WLSMV estimator in lavaan.

When I did so a few weeks ago with an older version of lavaan, I was able to set missing = "ml", and R gave me a nice output telling me that it had used all 140 participants via the DWLS estimator.

I was rather happy to have "solved" the problem of missing data until I read more deeply into the topic and found that there must be a mistake, because WLSMV would not use the incomplete cases. As far as I understand, it was a kind of "bug" in the old version of lavaan that made it possible to specify "ml" for the missing argument, whatever then happened behind the scenes.

Meanwhile, my colleagues have written a paper that includes my findings, so my questions are as follows:

What exactly did I compute via my analysis and is there any chance that it can be of use once I understand what really happened?

Would multiple imputation for the missing values be a good way to go to be able to use all 140 participants? The whole paper is now based on that number (it investigates other topics as well), so I would be really glad to find a way to keep it.

Thank you all for taking the time.

Jan Brederecke

Terrence Jorgensen

Oct 11, 2018, 6:28:18 AM

to lavaan

> What exactly did I compute via my analysis and is there any chance that it can be of use once I understand what really happened?

If you declared the indicators as ordered, then lavaan would have used estimator = "DWLS", not "ML". I think the previous behavior was to change missing = "ML" to missing = "pairwise" without a warning, but do not quote me on that. Now, it explicitly tells you that ML is not available for ordered data (thus, of course, no FIML).

> Would multiple imputation for the missing values be a good way to go to be able to use all 140 participants?

Pairwise deletion uses all available information to estimate each polychoric correlation, so all 140 would have been used. It assumes MCAR though, whereas MI would only assume MAR (assuming that any relevant predictors of missingness were included in the imputation model). So MI is less restrictive.
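
To make the counting argument concrete, here is a minimal Python sketch (not lavaan code, and not what lavaan runs internally) that compares listwise and pairwise sample sizes. The 140 participants and 37 incomplete cases come from the thread; the number of items (`N_ITEMS = 10`), the 1-5 response scale, the random placement of the missing responses, and all function names are assumptions made purely for illustration.

```python
import random
from itertools import combinations

N_PARTICIPANTS = 140  # from the thread
N_INCOMPLETE = 37     # from the thread: participants missing one or two items
N_ITEMS = 10          # ASSUMPTION: the questionnaire length is not given in the thread


def simulate_responses(seed=1):
    """Build hypothetical 1-5 ordinal responses; None marks a missing item."""
    rng = random.Random(seed)
    data = [[rng.randint(1, 5) for _ in range(N_ITEMS)]
            for _ in range(N_PARTICIPANTS)]
    # Pick 37 participants and blank out one or two of their items each.
    for row in rng.sample(data, N_INCOMPLETE):
        for item in rng.sample(range(N_ITEMS), rng.choice([1, 2])):
            row[item] = None
    return data


def listwise_n(data):
    """Number of participants with no missing item (complete cases only)."""
    return sum(all(v is not None for v in row) for row in data)


def pairwise_n(data):
    """For each item pair, number of participants who answered both items."""
    return {
        (i, j): sum(row[i] is not None and row[j] is not None for row in data)
        for i, j in combinations(range(N_ITEMS), 2)
    }


def contributing_n(data):
    """Participants who answered at least two items, so they enter some pair."""
    return sum(sum(v is not None for v in row) >= 2 for row in data)


if __name__ == "__main__":
    data = simulate_responses()
    counts = pairwise_n(data)
    print("listwise n:", listwise_n(data))
    print("smallest pairwise n:", min(counts.values()))
    print("largest pairwise n:", max(counts.values()))
    print("participants contributing to at least one pair:", contributing_n(data))
```

Under these assumptions, listwise deletion keeps only the 103 complete cases, while each item pair is observed for somewhere between 103 and 140 participants and all 140 participants contribute to at least one pair. The sketch only counts cases; it says nothing about whether the missingness is MCAR or MAR.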

Terrence D. Jorgensen

Postdoctoral Researcher, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam

http://www.uva.nl/profile/t.d.jorgensen

Jan Brederecke

Oct 11, 2018, 6:51:06 AM

to lavaan

Dear Mr Jorgensen,

Thank you so much for your quick and helpful answer. This is all that I needed and more!