Ordinal data : DWLS, MLR and ML Bootstrap


Helene

Jul 9, 2015, 4:36:31 PM
to lav...@googlegroups.com

Hi,


My model has 5 latent variables, measured by a total of 37 indicators. All indicators except nb_kms are ordinal (a scale with values ranging from 1 to 5). Following the lavaan tutorial, I specified the indicators as ordinal and ran the CFA and SEM with WLS as the estimator. At this stage I encountered four problems:

1- AIC and BIC are not computed with WLS. I need them because I would like to compare alternative models.

2- For the structural model, standard errors and p-values are not computed for the parameters of the equations. Thus I can’t draw conclusions about the significance of the relationships between my latent variables.

3- I get more than 50 warnings of the form: “empty cell(s) in bivariate table of sent8 x pol3”

4- I tried several estimators: WLS, DWLS and WLSMV. How do I choose the most appropriate one?


Alternatively, I ran the CFA and SEM models with the ML bootstrap and MLR estimators, which are recommended for non-normal variables. But I doubt that I can consider those results reliable…

What would you advise in this situation? My program is below.


Thank you for your help!

Best regards,


Hélène


################### Measurement model ##########################################

modele_mesure <- ' environnement =~ nep1+nep4+nep5+pol1+pol3+pol4+pol5+pol6+pbvoit1+pbvoit3+pbvoit4+pbvoit5

affectif =~ aff1+aff3+aff6+aff5+instr1+instr5

perception =~ instr7 + instr8 + instr9 + temps1 + temps2 + temps3 + temps7 + sent1 + sent2 + sent3 + sent4 + sent5 + sent6 + sent7 + sent8

tpt_doux =~ fce_tcu + fce_train_car

voiture =~ fce_voit + nb_voit'

fit_mesure <- cfa(modele_mesure, data = epd_opi_ord, estimator = "WLS")

summary(fit_mesure, fit.measures = TRUE)

fitMeasures(fit_mesure,"all")

 

################### Structural model ##########################################

modele_str1 <- '


#measurement model

environnement =~ nep1+nep4+nep5+pol1+pol3+pol4+pol5+pol6+pbvoit1+pbvoit3+pbvoit4+pbvoit5

affectif =~ aff1+aff3+aff5+aff6+instr1+instr5

perception =~ instr7 + instr8 + instr9 + temps1 + temps2 + temps3 + temps7 + sent1 + sent2 + sent3 + sent4 + sent5 + sent6 + sent7 + sent8

tpt_doux =~ fce_train_car+ fce_tcu

voiture =~ fce_voit + nb_voit

 

#regressions

perception ~ affectif + environnement

tpt_doux ~ perception +voiture + affectif + environnement

voiture ~ perception + tpt_doux + affectif + environnement

'

 

# Names passed to `ordered` must match the data columns exactly, so the stray spaces inside the quotes have been removed.
vars_ord <- c("nep1", "nep4", "nep5", "pol1", "pol3", "pol4", "pol5", "pol6", "pbvoit1", "pbvoit3", "pbvoit4", "pbvoit5", "aff1", "aff3", "aff5", "aff6", "instr1", "instr5", "instr7", "instr8", "instr9", "temps1", "temps2", "temps3", "temps7", "sent1", "sent2", "sent3", "sent4", "sent5", "sent6", "sent7", "sent8", "fce_tcu", "fce_train_car", "fce_voit")
fit1 <- sem(modele_str1, data = epd_opi, estimator = "WLS", ordered = vars_ord)

summary(fit1, standardized = TRUE, fit.measures=TRUE)

Terrence Jorgensen

Jul 15, 2015, 11:52:26 AM
to lav...@googlegroups.com

1- AIC and BIC are not computed with WLS. I need them because I would like to compare alternative models.


That is because there is no likelihood calculated with least-squares estimators (e.g., ULS, WLS, GLS).  AIC and BIC are calculated from the likelihood, so they are only available using likelihood-based estimators.
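To make the dependence on the likelihood concrete, here is a minimal sketch in base R. It uses an ordinary linear model on simulated data rather than lavaan, purely as an illustration; the variable names are made up for the example.

```r
# Illustration with a base-R linear model (not lavaan): AIC and BIC are
# simple functions of the maximized log-likelihood.
set.seed(1)
n <- 200
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)
fit_lm <- lm(y ~ x)

ll <- logLik(fit_lm)      # maximized log-likelihood
k  <- attr(ll, "df")      # number of estimated parameters (incl. residual variance)

aic_manual <- -2 * as.numeric(ll) + 2 * k
bic_manual <- -2 * as.numeric(ll) + log(n) * k

# Both comparisons should agree with R's built-in AIC() and BIC()
print(all.equal(aic_manual, AIC(fit_lm)))
print(all.equal(bic_manual, BIC(fit_lm)))
```

Without a log-likelihood there is nothing to plug into these formulas, which is why a least-squares fit cannot report them.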

2- For the structural model, standard errors and p-values are not computed for the parameters of the equations. Thus I can’t conclude on the significance of the relationship between my latent variables.


If your model converged and provides point estimates, but standard errors could not be calculated, then the covariance matrix of your parameters probably could not be inverted.  This can happen with insufficiently identified models, or it can happen when the data are particularly troublesome.

3- I get more than 50 warnings in the format: “empty cell(s) in bivariate table of sent8 x pol3”


This might be related to (2).  You have around 30 categorical indicators, each with 5 response categories.  So there are 5^30 = 9.3132257 x 10^20 possible response patterns.  Unless your sample size is on the order of a hundred billion billions, there is no way you can have observed all possible response patterns.  These warnings are pointing out specific instances where there are zeros in 2-way contingency tables.  The more of those cases there are, the less you can trust your chi-squared test statistic, and estimation of coefficients and standard errors can become difficult.
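A quick way to see both points in base R is sketched below. The two items are simulated with deliberately skewed response probabilities (an assumption for illustration, not the original data); with real data you would cross-tabulate the two columns named in the warning.

```r
# Number of possible response patterns for 30 five-category items
n_patterns <- 5^30
print(format(n_patterns, scientific = TRUE))

# Two simulated (hypothetical) five-point items with skewed distributions
set.seed(42)
n <- 150
skew <- c(0.55, 0.30, 0.10, 0.04, 0.01)
item_a <- factor(sample(1:5, n, replace = TRUE, prob = skew), levels = 1:5)
item_b <- factor(sample(1:5, n, replace = TRUE, prob = rev(skew)), levels = 1:5)

# Bivariate (2-way) contingency table; zeros are the "empty cells"
tab <- table(item_a, item_b)
print(tab)

n_empty <- sum(tab == 0)
cat("Empty cells:", n_empty, "of", length(tab), "\n")
```

If only one or two rarely used categories are responsible for most empty cells, collapsing them with a neighbouring category is one commonly used remedy, at the cost of some information.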

4- I tried several estimators: WLS, DWLS and WLSMV. How to choose the most appropriate one?


DWLS is preferable.  You will see those results in the "robust" column of your summary() output.
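For reference, in lavaan estimator = "WLSMV" requests DWLS point estimates together with robust standard errors and a mean-and-variance adjusted test statistic. A minimal sketch on simulated data follows; the one-factor model, the item names y1 to y4, the thresholds and the helper make_item() are all invented for illustration.

```r
library(lavaan)

set.seed(123)
n   <- 500
eta <- rnorm(n)  # simulated latent factor

# Hypothetical helper: continuous response cut into 5 ordered categories
make_item <- function(loading) {
  ystar <- loading * eta + rnorm(n, sd = sqrt(1 - loading^2))
  as.integer(cut(ystar, breaks = c(-Inf, -1.2, -0.4, 0.4, 1.2, Inf)))
}

dat <- data.frame(y1 = make_item(0.8), y2 = make_item(0.7),
                  y3 = make_item(0.6), y4 = make_item(0.7))

model <- ' f =~ y1 + y2 + y3 + y4 '

# DWLS estimation with robust corrections, items declared as ordered
fit_cat <- cfa(model, data = dat,
               ordered = c("y1", "y2", "y3", "y4"),
               estimator = "WLSMV")

summary(fit_cat, fit.measures = TRUE)

# Naive and scaled ("robust") versions of the test statistic side by side
print(fitMeasures(fit_cat, c("chisq", "chisq.scaled", "df", "pvalue.scaled")))
```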

Alternatively, I ran the CFA and SEM models with ML bootstrap and MLR estimators which are advised for non-normal variables. But I doubt that I can consider the results as reliable…

What would you advise in this situation? 


I don't usually advocate treating categorical data as though they were continuous, but it is a reasonable option in this case (perhaps your best option).  Your ordinal indicators all have 5 response categories, and there is some evidence from simulation studies that with 5 or more response categories, WLS and MLR yield acceptably close conclusions, particularly for structural parameters.  MLR performs less well for factor loadings unless you have more categories.


So if your research questions are primarily about the structural regressions, then those should be fine with MLR.  If you are also interested in factor loadings,  then I would use ML with bootstrapping.
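The two options could be set up roughly as follows. This is only a sketch on simulated five-point items treated as continuous; the two-factor model, the item names and the helper functions are invented for illustration, and the number of bootstrap draws is kept small so the example runs quickly.

```r
library(lavaan)

set.seed(321)
n  <- 400
f1 <- rnorm(n)
f2 <- 0.5 * f1 + rnorm(n, sd = sqrt(0.75))  # structural regression f2 ~ f1

# Hypothetical helpers: five-point items generated from a latent variable
cut5 <- function(z) as.integer(cut(z, breaks = c(-Inf, -1.2, -0.4, 0.4, 1.2, Inf)))
item <- function(f, loading) cut5(loading * f + rnorm(n, sd = sqrt(1 - loading^2)))

dat <- data.frame(x1 = item(f1, 0.8), x2 = item(f1, 0.7), x3 = item(f1, 0.7),
                  y1 = item(f2, 0.8), y2 = item(f2, 0.7), y3 = item(f2, 0.7))

model <- '
  f1 =~ x1 + x2 + x3
  f2 =~ y1 + y2 + y3
  f2 ~ f1
'

# Option 1: ML point estimates with robust (Huber-White) standard errors
fit_mlr <- sem(model, data = dat, estimator = "MLR")

# Option 2: ML with bootstrapped standard errors
# (100 draws only to keep the sketch fast; use many more in practice)
fit_boot <- sem(model, data = dat, estimator = "ML",
                se = "bootstrap", bootstrap = 100)

print(parameterEstimates(fit_mlr))
print(parameterEstimates(fit_boot, boot.ci.type = "perc"))

# Likelihood-based fits also provide AIC and BIC for model comparison
print(fitMeasures(fit_mlr, c("aic", "bic")))
```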

Terry

Helene

Jul 20, 2015, 7:43:16 AM
to lav...@googlegroups.com
Thank you very much for your answer and your help!
Best regards,

Hélène