Questions about the use of estimation


Chen Ali

Jun 7, 2019, 8:13:17 AM
to lavaan

I have questions about the choice of estimator. I checked univariate and multivariate normality for 6 items measured on a 5-point Likert scale.

Both univariate and multivariate normality were violated. Therefore, I used the MLR and WLSMV estimators to run a 1-factor CFA, treating the data as continuous. WLSMV showed better model fit than MLR: CFI = 0.994, TLI = 0.988, RMSEA = 0.052, SRMR = 0.041, and the loadings are high.

In addition, I used the WLSMV estimator to run a 1-factor CFA, treating the data as categorical. The results were CFI = 0.994, TLI = 0.990, RMSEA = 0.108, SRMR = 0.044, and the loadings are high. The RMSEA is much higher than when I used WLSMV and treated the data as continuous.

My question: can WLSMV be applied to data treated as continuous (i.e., a 5-point Likert scale) when the data are not normal? I thought WLSMV could only be applied to categorical data.

Terrence Jorgensen

Jun 7, 2019, 10:28:45 AM
to lavaan

Could WLSMV be applied in the continuous data (i.e. 5-point Likert scale) when data are not normal? Because I thought that WLSMV only could be applied in the categorical data.


WLSMV is just a keyword used as a shortcut for the lavaan arguments

estimator = "DWLS" # only used for categorical data
se = "robust"
test = "scaled.shifted"

WLS can be used for continuous data, but the robust procedures are only available for DWLS with categorical data.

Model fit is not comparable between the different estimators because they are finding solutions to reproduce different data (means + covariances of continuous data vs. thresholds + polychoric correlations for categorical data). So that is not the basis for choosing an estimator.
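A minimal sketch of both points, using simulated data rather than the poster's. The item names q10 to q15 follow the thread; the data-generating loadings, cut points, sample size, and the definition of `Items` are assumptions made only for illustration, and se = "robust.sem" is the spelling used later in the thread. The sketch checks that the WLSMV keyword and the spelled-out arguments give the same estimates, and shows that the categorical fit summarizes the data as polychoric correlations (unit diagonal) while the continuous fit uses ordinary covariances of the 1-5 scores.

```r
library(lavaan)

# Hypothetical data: six 5-category items driven by one factor.
set.seed(2019)
n      <- 1000
eta    <- rnorm(n)
lambda <- c(0.9, 0.8, 0.7, 0.6, 0.5, 0.4)     # assumed loadings
latent <- sapply(lambda, function(l) l * eta + rnorm(n))
cuts   <- c(-Inf, -1, -0.3, 0.3, 1, Inf)      # assumed cut points
All    <- as.data.frame(apply(latent, 2, function(y) as.integer(cut(y, cuts))))
Items  <- paste0("q", 10:15)                  # assumed content of `Items`
names(All) <- Items

model <- 'QoLIBRI =~ q10 + q11 + q12 + q13 + q14 + q15'

# Keyword form and the spelled-out arguments it stands for.
fit_short <- cfa(model, data = All, ordered = Items, estimator = "WLSMV")
fit_long  <- cfa(model, data = All, ordered = Items, estimator = "DWLS",
                 se = "robust.sem", test = "scaled.shifted")
same_estimates <- isTRUE(all.equal(coef(fit_short), coef(fit_long),
                                   tolerance = 1e-6))

# Same items treated as continuous and fitted with robust ML.
fit_mlr <- cfa(model, data = All, estimator = "MLR")

# The two fits try to reproduce different sample statistics.
S_cat  <- lavInspect(fit_short, "sampstat")$cov  # polychoric correlations
S_cont <- lavInspect(fit_mlr, "sampstat")$cov    # covariances of the 1-5 scores

print(same_estimates)
print(round(diag(S_cat), 3))
print(round(diag(S_cont), 3))
```

Because the two discrepancy functions are defined on different summaries of the data, putting their fit indices side by side says little about which estimator suits the data better.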

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam


Chen Ali

Jun 11, 2019, 4:32:28 AM
to lavaan

Thank you!

I am still a bit confused about WLSMV. Can WLSMV be applied to continuous data as well as categorical data?

In the following code, I used the WLSMV estimator, but it did not treat the data as categorical:

model <- 'QoLIBRI =~ q10 + q11 + q12 + q13 + q14 + q15'
fit_WLSMV <- cfa(model = model, data = All, estimator = "WLSMV")

In this other code, I used the WLSMV estimator, and it treated the data as categorical:

model <- 'QoLIBRI =~ q10 + q11 + q12 + q13 + q14 + q15'
fit_WLSMV <- cfa(model = model, data = All, ordered = Items, estimator = "WLSMV")

Also, do the following two calls use the same WLSMV estimation for the same CFA model?

model <- 'QoLIBRI =~ q10 + q11 + q12 + q13 + q14 + q15'
fit_WLSMV <- cfa(model = model, data = All, ordered = Items, estimator = "WLSMV")
fit_mplus <- cfa(model = model, estimator = "DWLS", se = "robust.sem", test = "scaled.shifted",
                 parameterization = "theta", ordered = Items, data = All)



Terrence Jorgensen

Jun 17, 2019, 4:43:51 PM
to lavaan

Could WLSMV be applied in continuous data as well as categorical data?


Not without tricking the program into it.  Why would you want to?  DWLS and theta parameterization were developed for the application of covariance structure models to ordinal data. 
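The two calls from the question can be compared directly. The sketch below is a hedged illustration on simulated data (the loadings, cut points, sample size, and the definition of `Items` are assumptions, not the poster's data). The calls differ in one respect: the second requests parameterization = "theta", whereas lavaan's default for ordered indicators is "delta". For a single-group CFA these are two scalings of the same model, so one would expect the standardized loadings and the test statistic to agree up to numerical tolerance, while the unstandardized loadings differ.

```r
library(lavaan)

# Hypothetical data: six 5-category items driven by one factor.
set.seed(2019)
n      <- 1000
eta    <- rnorm(n)
lambda <- c(0.9, 0.8, 0.7, 0.6, 0.5, 0.4)     # assumed loadings
latent <- sapply(lambda, function(l) l * eta + rnorm(n))
cuts   <- c(-Inf, -1, -0.3, 0.3, 1, Inf)      # assumed cut points
All    <- as.data.frame(apply(latent, 2, function(y) as.integer(cut(y, cuts))))
Items  <- paste0("q", 10:15)                  # assumed content of `Items`
names(All) <- Items

model <- 'QoLIBRI =~ q10 + q11 + q12 + q13 + q14 + q15'

# First call from the question: default (delta) parameterization.
fit_delta <- cfa(model, data = All, ordered = Items, estimator = "WLSMV")

# Second call from the question: same estimator, theta parameterization.
fit_theta <- cfa(model, data = All, ordered = Items, estimator = "DWLS",
                 se = "robust.sem", test = "scaled.shifted",
                 parameterization = "theta")

# Unstandardized and standardized loadings under each parameterization.
raw_delta <- subset(parameterEstimates(fit_delta), op == "=~")$est
raw_theta <- subset(parameterEstimates(fit_theta), op == "=~")$est
std_delta <- subset(standardizedSolution(fit_delta), op == "=~")$est.std
std_theta <- subset(standardizedSolution(fit_theta), op == "=~")$est.std

print(round(cbind(raw_delta, raw_theta, std_delta, std_theta), 3))
print(c(delta = unname(fitMeasures(fit_delta, "chisq")),
        theta = unname(fitMeasures(fit_theta, "chisq"))))
```

If the unstandardized columns differ while the standardized columns match, the "changed results" come from the scaling of the latent responses rather than from a different estimator.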

Yago Luksevicius de Moraes

Jun 12, 2023, 12:31:48 PM
to lavaan
Hi all,

My colleagues and I are doing a simulation study and, so far, WLSMV seems to behave almost identically to MLR when the data are continuous: both estimate the same loadings and intercepts and give almost equal fit statistics (WLSMV tends to indicate a better fit, but the difference is not big enough to change any decision). The only practical reason I've seen so far to prefer MLR over WLSMV is that WLSMV is less likely to converge. We still have some issues to solve before submitting the paper, but hopefully it will be published before the end of this year.

On Monday, June 12, 2023 at 12:07:14 UTC-3, joh4nd wrote:
Adding test = "scaled.shifted", parameterization = "theta" when estimating a model with ordered = TRUE (and estimator = "DWLS") changes the results.

This leaves me confused about what Terrence means by "DWLS and theta parameterization were developed for the application of covariance structure models to ordinal data."

Would you mind explaining what you meant, Terrence?

Christian Arnold

Jun 12, 2023, 12:57:29 PM
to lav...@googlegroups.com
Hi,

Interesting. You wrote: "My colleagues and I are doing a simulation study and, for now, it seems that WLSMV behaves almost equal to MLR when data is continuous [...]".

For convenience, I will ignore the convergence problem and translate your statement into my own terms: both estimators are equally effective for continuous data. What about efficiency? Which estimator is less computationally intensive and faster?

Best

Christian 



Yago Luksevicius de Moraes

Jun 12, 2023, 1:07:51 PM
to lavaan
That's harder to say because the differences are just fractions of a second. Sometimes WLSMV is faster, other times MLR.
I think that, as long as the models converge, there's not much difference. However, we are using minimal models, so there are not many parameters to estimate. To see differences, we would need to test models more complex than the ones we are using.
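A rough sketch of how such a comparison could be set up, not the code from the simulation study: one factor, six continuous indicators, with loadings and sample size chosen arbitrarily for illustration. `time_fit` is a hypothetical helper, and the timings it reports depend on the machine and on the model, so no numbers are claimed here.

```r
library(lavaan)

# Hypothetical continuous data: one factor, six indicators.
set.seed(2023)
n      <- 1000
eta    <- rnorm(n)
lambda <- c(0.9, 0.8, 0.7, 0.6, 0.5, 0.4)     # assumed loadings
dat    <- as.data.frame(sapply(lambda, function(l) l * eta + rnorm(n)))
names(dat) <- paste0("y", 1:6)

model <- 'f =~ y1 + y2 + y3 + y4 + y5 + y6'

# Hypothetical helper: fit the model `reps` times with one estimator and
# return the last fit together with the average wall-clock time per fit.
time_fit <- function(estimator, reps = 20) {
  elapsed <- system.time(
    for (i in seq_len(reps)) {
      fit <- cfa(model, data = dat, estimator = estimator)
    }
  )[["elapsed"]]
  list(fit = fit, seconds_per_fit = elapsed / reps)
}

mlr   <- time_fit("MLR")
wlsmv <- time_fit("WLSMV")   # continuous data, no `ordered` argument

# Loadings side by side, then average seconds per fit.
load_mlr   <- subset(parameterEstimates(mlr$fit), op == "=~")$est
load_wlsmv <- subset(parameterEstimates(wlsmv$fit), op == "=~")$est
print(round(cbind(MLR = load_mlr, WLSMV = load_wlsmv), 3))
print(c(MLR = mlr$seconds_per_fit, WLSMV = wlsmv$seconds_per_fit))
```

With a model this small, any timing gap is likely to be dominated by noise; larger models or many repetitions (as in a bootstrap) would be needed before drawing conclusions about efficiency.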

Best,
Yago

Christian Arnold

Jun 12, 2023, 1:13:56 PM
to lav...@googlegroups.com
I would not ignore this. Think, for example, about the bootstrap with many thousands of draws. For me it's a logical consequence: with the same effectiveness, the more efficient solution is to be preferred. 

Maybe this is an aspect you could consider? Just my 0.0001 cents.

Best

Christian 



Christian Arnold

Jun 12, 2023, 5:02:56 PM
to lav...@googlegroups.com
Think of the many thousands of people using the bootstrap (although there are probably other methods, for example Monte Carlo CIs). You "only" need ML estimation for the bootstrap... WLSMV or anything else may be just as effective for continuous data (or not). Is that estimation more efficient? I don't know, but I doubt it. What do your results say? It is worth thinking a little about the resources used and the environment, right?

Best

Christian 

 

Yago Luksevicius de Moraes

Jun 13, 2023, 11:21:36 AM
to lavaan
Yeah, there are lots of methods that are poorly known and may be better than the most popular ones in some situations.
Using WLSMV with continuous data has surprised us. When we first looked at the results of WLSMV and MLR, we couldn't see any difference. I plotted both sets of results in the same graphs, and you can't tell that two models are plotted. At first I thought I had done something wrong, and I had to double-check and redo the analyses to convince myself the results were correct.

Of course, numerically, the results are not exactly equal, but the differences are so small (around the third decimal place) that, qualitatively, it is rare to find an example where the interpretation changes. We only see differences when the combination of sample size and model complexity makes WLSMV start failing to converge.

A curious thing, though, is that MLR rejected the wrong model more often (i.e., it is more powerful) when the sample size is small, but WLSMV seems to become better as the sample increases. "Small" is relative to the kind of model: WLSMV seems to become more efficient than MLR when the sample size is > 150 for models with 6 continuous items, and at some threshold between 2,000 and 10,000 for models with 6 dichotomous items. This was not something we were looking for, and it will probably have to wait for a future study, so I've done only a few comparisons and wouldn't recommend WLSMV for big samples of continuous data for now, especially because differences tend to be less than 20 observation units when the sample size is < 500. However, it seems an interesting field of research.

Best,
Yago
