CFA on non-normal ordinal data

Jonil Ursin

Aug 4, 2023, 1:57:26 AM

to lavaan

Hi

I'm doing a three-factor CFA where the items are all scored on a 4-point Likert scale. I've used the WLSMV estimator for the analysis, as this seems to be the estimator to use for ordinal data. However, as I understand it, this assumes that the latent variables are normally distributed. How do I test for this?

I've looked at each of the items, and like most Likert scale data they are not normally distributed. The skewness and kurtosis are both way off for each item. In addition, Mardia's test is significant, so the data do not follow a multivariate normal distribution.

So I am wondering:

1. Is this the right way to "test" the normality of the underlying latent variables? Can I assume the latent variables are not normally distributed when the items are not?

2. Should I not be using the WLSMV estimator for this data? Is MLR an alternative even though my data is not continuous? I tried running a CFA with the MLR estimator - the factor loadings got a little worse, but some of the fit indices improved.

Thanks in advance!

Kind regards,

Jonil

Edward Rigdon

Aug 8, 2023, 9:46:16 PM

to lav...@googlegroups.com

If you had some other criterion for identifying "the" underlying variable, then you might have a shot at assessing its normality. Or if you had prior knowledge to allow you to constrain some thresholds and intercepts or other attributes of your ordinal variables (as in some multiple group analyses), you might then be able to assess underlying normality. As it is, an assumption of underlying normality just means that thresholds and intercepts will be optimized such that underlying normality is most plausible.
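To illustrate that last point, here is a minimal sketch in plain Python (not lavaan code). The response proportions are made up for illustration, and the helper names are hypothetical. It shows that for a single ordinal item, thresholds on a standard normal underlying variable can always be chosen to reproduce the observed category proportions exactly, however skewed the item is. This is why skewness or kurtosis of the observed items carries no information about the normality of the underlying variable.

```python
from statistics import NormalDist

# Hypothetical response proportions for one 4-point Likert item (strongly skewed).
proportions = [0.60, 0.25, 0.10, 0.05]

std_normal = NormalDist(mu=0.0, sigma=1.0)


def thresholds_from_proportions(props):
    """Return the K-1 thresholds that cut a standard normal into the given proportions."""
    cumulative = 0.0
    taus = []
    for p in props[:-1]:
        cumulative += p
        taus.append(std_normal.inv_cdf(cumulative))
    return taus


def implied_proportions(taus):
    """Category probabilities implied by a standard normal underlying variable."""
    cdf_values = [0.0] + [std_normal.cdf(t) for t in taus] + [1.0]
    return [upper - lower for lower, upper in zip(cdf_values, cdf_values[1:])]


def skewness(props):
    """Skewness of the observed item, scoring the categories 1..K."""
    scores = range(1, len(props) + 1)
    mean = sum(s * p for s, p in zip(scores, props))
    variance = sum((s - mean) ** 2 * p for s, p in zip(scores, props))
    third = sum((s - mean) ** 3 * p for s, p in zip(scores, props))
    return third / variance ** 1.5


taus = thresholds_from_proportions(proportions)
implied = implied_proportions(taus)
item_skew = skewness(proportions)

print("thresholds:", [round(t, 3) for t in taus])
print("implied proportions:", [round(p, 3) for p in implied])
print("skewness of observed item:", round(item_skew, 3))
```

The observed item is clearly skewed, yet the normal underlying variable fits its distribution perfectly. Any information about underlying normality would have to come from elsewhere, such as the joint distribution of pairs of items or the extra constraints mentioned above.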

You might consider whether there is a plausible theoretical story about the distribution of the underlying variables. That might carry weight with some readers.

Other alternatives include two-part models. If you have a large number of respondents, say, at a base level on your scale and the rest distributed across the response scale, then you might investigate whether a two-part model (one logistic regression assessing "base or not base" and another model for the distribution of non-base cases) makes more sense. There are also "zero-inflated" options that you might consider.

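If your observed variables are declared ordinal, then DWLS is a good choice (in lavaan, WLSMV is DWLS estimation with robust standard errors and a mean- and variance-adjusted test statistic).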

To view this discussion on the web visit https://groups.google.com/d/msgid/lavaan/811f0a9e-ebc8-4823-9d64-f648141a585cn%40googlegroups.com.

Valeria Ivaniushina

Aug 9, 2023, 4:50:00 AM

to lav...@googlegroups.com

Dear Professor Rigdon,

Could you give a reference for such a two-part model?

I think it’s an interesting approach, but I’ve never seen an application.

Regards,

Valeria

On Wed, Aug 9, 2023 at 03:46, Edward Rigdon <edward...@gmail.com> wrote:

Jonil Ursin

Aug 11, 2023, 2:47:01 AM

to lavaan

Thank you for a very detailed and useful response :)
