WLSMV estimator: are results reliable when number of observations is too small to compute Gamma?

Łukasz Deryło

Jun 14, 2019, 7:19:00 AM

to lavaan

I ran a CFA (confirmatory factor analysis) with the WLSMV estimator (since my data are ordinal) in lavaan, and I get the following warning message:

number of observations (190) too small to compute Gamma

Is this a problem with Gamma only, with everything else computed correctly? Can I proceed with the results obtained despite this warning, e.g., interpret the estimates, p-values, and fit indices in the usual way?

Or does this somehow (how?) affect the credibility of the whole CFA?

Terrence Jorgensen

Jun 17, 2019, 5:03:27 PM

to lavaan

Can I proceed with results obtained with this warning?

"Can" and "should" are 2 different things. This estimator, and the corresponding robust corrections, require much larger N to stabilize and yield unbiased estimates/SEs and nominal error rates.

Terrence D. Jorgensen

Assistant Professor, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam

http://www.uva.nl/profile/t.d.jorgensen

Pavneet Kaur

Jun 24, 2019, 11:36:37 AM

to lavaan

I am also working on a CFA with ordinal variables on a scale of 0-3. I ran into the same problem as Łukasz Deryło because my sample size is 199 (the total is 227, but the remaining cases have missing data).

Can anyone please suggest how I can address the warning:

In lav_samplestats_from_data(lavdata = lavdata, missing = lavoptions$missing, :

lavaan WARNING: number of observations (199) too small to compute Gamma

Thanks in advance.

Terrence Jorgensen

Jun 25, 2019, 1:29:14 AM

to lavaan

I am also working on CFA with ordinal variables from a scale of 0-3. I faced the same problem as Łukasz Deryło because my sample size is 199

You can't escape the need for more data. SEM involves complex multivariate systems and relies heavily on asymptotic theory (what should happen as N approaches infinity, not what does happen in finite samples). Modeling 2nd-order moments (covariance matrices) already requires N > 120 for estimation to stabilize, and for the test statistic's sampling distribution to be approximately chi-squared, even for smallish models in the best-case scenario (multivariate normality). Robust corrections for continuous data rely on even higher-order moments (4th order, i.e., multivariate kurtosis), which require even larger N to stabilize. But the robust procedure in WLSMV is adjusting for multistage estimation (first thresholds, then polychoric correlations, then fitting your model to those), which involves even more assumptions about the latent variables that underlie each observed discrete indicator, so even more data are needed for that process to stabilize.
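A minimal R sketch of the moment counting behind that argument. It assumes the 22 indicators from the model posted in this thread are treated as continuous, and that Gamma refers to the asymptotic covariance matrix of the unique sample variances and covariances. Whether lavaan triggers its warning on exactly this comparison is an assumption, not something confirmed in this thread.

```r
# Counting moments for p observed indicators (plain arithmetic, no lavaan needed).
p <- 22    # indicators in the model posted in this thread (5 e-items + 17 B-items)
N <- 199   # complete cases reported in the warning

# Number of unique variances and covariances (2nd-order moments)
pstar <- p * (p + 1) / 2

# Gamma is a pstar x pstar matrix, so it has this many unique elements
# (for continuous data these are built from 4th-order moments)
gamma_unique <- pstar * (pstar + 1) / 2

# A pstar x pstar matrix estimated from N cases cannot be full rank when N < pstar.
# Assumption: a comparison of this kind is what lies behind the warning.
n_too_small <- N < pstar

cat("pstar =", pstar,
    "| unique Gamma elements =", gamma_unique,
    "| N < pstar:", n_too_small, "\n")
```

With 22 indicators there are 253 unique variances and covariances, which is more than either N = 190 or N = 199.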

I thought Gamma was only necessary for calculating the robust chi-squared statistic. Do you still get estimates and SEs?

Pavneet Bharaj

Jun 25, 2019, 9:49:40 AM

to lav...@googlegroups.com

Thanks a lot Dr. Terrence.

Yeah I am getting all the estimates even after the warning message.

Please see the attached:

lavaan 0.6-3 ended normally after 58 iterations

  Optimization method                           NLMINB
  Number of free parameters                         50

                                                  Used       Total
  Number of observations                           199         214

  Estimator                                       DWLS      Robust
  Model Fit Test Statistic                     365.028     376.625
  Degrees of freedom                               203         203
  P-value (Chi-square)                           0.000       0.000
  Scaling correction factor                                  1.270
  Shift parameter                                           89.145
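A quick arithmetic check of how the two test statistics in this output relate. This is a sketch that assumes the "Robust" column is the scaled-and-shifted statistic, i.e., the DWLS statistic divided by the scaling correction factor, plus the shift parameter. The variable names are made up for illustration.

```r
# Values copied from the summary output above
T_dwls  <- 365.028   # standard DWLS test statistic
c_scale <- 1.270     # scaling correction factor (rounded to 3 decimals in the output)
shift   <- 89.145    # shift parameter

# Assumed relation: robust statistic = standard statistic / scaling factor + shift
T_robust <- T_dwls / c_scale + shift

cat("Reconstructed robust statistic:", round(T_robust, 2), "\n")
cat("Reported robust statistic:      376.625\n")
```

The reconstructed value is close to, but not exactly, the reported 376.625, because the scaling factor is printed with only three decimals.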

--
You received this message because you are subscribed to the Google Groups "lavaan" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lavaan+un...@googlegroups.com.
To post to this group, send email to lav...@googlegroups.com.
Visit this group at https://groups.google.com/group/lavaan.
To view this discussion on the web visit https://groups.google.com/d/msgid/lavaan/c22f2ce3-b977-4127-9d82-7b6c31c4087b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

--

Pavneet Kaur Bharaj

Doctoral Student

Indiana University Bloomington

pkbh...@iu.edu

Terrence Jorgensen

Jun 25, 2019, 10:52:48 AM

to lavaan

Oh, you even get the robust test. I must have misunderstood what Gamma is used for.

Pavneet Bharaj

Jun 25, 2019, 10:58:43 AM

to lav...@googlegroups.com

My model, for which I shared the output above, is specified as:

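model <- '
  # measurement model (NA* frees the first loading of each factor)
  Exper   =~ NA*e1 + e2 + e3 + e4 + e5
  Belief1 =~ NA*B1.1 + B1.2
  Belief2 =~ NA*B2.1 + B2.2 + B2.3 + B3.1 + B3.2 + B4.1 + B4.2
  Belief3 =~ NA*B5.1 + B5.2 + B5.3 + B6.1 + B6.2 + B6.3 + B7.1 + B7.2

  # fix the (residual) variances of the factors to 1 to set their scale
  Belief2 ~~ 1*Belief2
  Belief1 ~~ 1*Belief1
  Belief3 ~~ 1*Belief3
  Exper   ~~ 1*Exper

  # structural regressions with labeled paths
  Belief1 ~ g11*Exper
  Belief2 ~ b21*Belief1
  Belief2 ~ g21*Exper
  Belief3 ~ b31*Belief1
  Belief3 ~ b32*Belief2
  Belief3 ~ g31*Exper

  # indirect effects of Exper on Belief3
  a := g11*b31
  b := g21*b32
  c := g11*b21*b32
'
fit_model <- cfa(model, data = mydata, estimator = "wlsmv")
summary(fit_model, fit.measures = TRUE, standardized = TRUE)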

Also, I would like to know what difference the "ordered" argument makes in the analysis, as my fit indices were quite different when I used it as follows:

fit_model <- cfa(model, data = mydata, estimator = "wlsmv",
                 ordered = c("B1.1", "B1.2", "B2.1", "B2.2", "B2.3", "B3.1",
                             "B3.2", "B4.1", "B4.2", "B5.1", "B5.2", "B5.3", "B6.1",
                             "B6.2", "B6.3", "B7.1", "B7.2", "e1", "e2", "e3", "e4", "e5"))


Terrence Jorgensen

Jun 25, 2019, 11:32:10 AM

to lavaan

what difference does "ordered" command do in the analysis

Without it, lavaan will treat the numeric values as though the numbers are meaningful when placed on a number line (i.e., interval-level data). Declaring outcomes as "ordered" means the numbers are treated as ordinal categories, so lavaan assumes there is a normally distributed latent item-response underlying each ordered indicator, and that your model is hypothesizing relationships among those latent item-responses.
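A minimal sketch of that difference, using simulated toy data rather than the data from this thread. The names `toy`, `toy_model`, and `cut4` are hypothetical and exist only for this illustration. It fits the same one-factor model twice with the WLSMV estimator, once treating the 0-3 scores as numeric and once declaring them as ordered, and then counts the threshold parameters in each fit.

```r
library(lavaan)

set.seed(123)
n   <- 500
eta <- rnorm(n)  # simulated common factor scores

# Hypothetical helper: cut a continuous response into categories 0, 1, 2, 3
cut4 <- function(x) as.integer(cut(x, breaks = c(-Inf, -1, 0, 1, Inf))) - 1L

# Toy data: four items scored 0-3, each driven by the same latent factor
toy <- data.frame(
  y1 = cut4(0.8 * eta + rnorm(n, sd = 0.6)),
  y2 = cut4(0.7 * eta + rnorm(n, sd = 0.7)),
  y3 = cut4(0.6 * eta + rnorm(n, sd = 0.8)),
  y4 = cut4(0.7 * eta + rnorm(n, sd = 0.7))
)

toy_model <- 'F =~ y1 + y2 + y3 + y4'

# Scores treated as interval-level numbers
fit_numeric <- cfa(toy_model, data = toy, estimator = "WLSMV")

# Scores treated as ordinal categories with latent item-responses underneath
fit_ordered <- cfa(toy_model, data = toy, estimator = "WLSMV",
                   ordered = c("y1", "y2", "y3", "y4"))

# Threshold parameters appear with the operator "|" in the parameter table
n_thresholds_numeric <- sum(parameterEstimates(fit_numeric)$op == "|")
n_thresholds_ordered <- sum(parameterEstimates(fit_ordered)$op == "|")

cat("Thresholds without 'ordered':", n_thresholds_numeric, "\n")
cat("Thresholds with 'ordered':   ", n_thresholds_ordered, "\n")
```

Only the ordered fit estimates thresholds (three per four-category item), so the two calls fit different summary statistics, and their fit measures are not directly comparable.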

Here is a great teaching article about how to interpret SEMs with categorical outcomes, although it is about growth factors rather than common factors.

http://dx.doi.org/10.1037/1082-989X.9.3.301

my fit indices were quite different when I used it as

Yes, that is expected. You are modeling different data.

https://doi.org/10.1080/10705511.2014.882658

https://doi.org/10.1080/10705511.2014.859510

Pavneet Bharaj

Jun 25, 2019, 11:34:03 AM

to lav...@googlegroups.com

Thanks a lot Dr. Terrence. That really helps in clarifying my doubts.


Łukasz Deryło

Jun 27, 2019, 5:39:11 AM

to lavaan

I wonder what I should do now: the discussion changed its subject (thanks, Mr Kaur!) and ended with an answer to a question that was asked somewhere in the middle (the one about the "ordered" argument), while the main question (the one about Gamma) is still unanswered. Should I post it again?
