Sep 17, 2019, 5:23:11 AM

to lavaan

Hello,

I am new to lavaan and I'd really appreciate your guidance on the following issue.

I want to do a multigroup confirmatory factor analysis on a scale containing 3 factors and 16 items on a 5-point Likert scale.

As suggested, I used the WLSMV estimator because the data are categorical. What I want to do is test the invariance of the scale across three different groups (depending on the age of the patients):

- young adults (*n* = 83)

- adults (*n* = 1281)

- elderly persons (*n* = 107)

When it comes to my data, they are **not normally distributed** (Mardia coefficient: b2d = 519.104 / **z = 51.119** / p = 0.00). There are no missing data or outliers.

The model tested is the following:

model<-'

F1=~E1+E5+E9+E11+E14+E15+E18

F2=~E2+E4+E13+E17

F3=~E3+E7+E8+E12+E16

F1~~F2

F1~~F3

F2~~F3

'

Now, I know that this topic was already mentioned before, but I couldn't find the answers needed.

Whenever I want to do an MGCFA on young adults and elderly patients, I get the lavaan warning message:

`In lav_samplestats_from_data(lavdata = lavdata, missing = lavoptions$missing, :`

lavaan WARNING:
number of observations (83) too small to compute Gamma

Therefore, I have a few questions:

1) What is Gamma and do I need it to test the scale's invariance depending on the age of the participants?

2) Can I use the MLR estimator instead (or any other estimator)?

3) I know that one of the solutions is to increase the number of participants, the problem is I can't do that since I'm analysing the data from a project that is over now. What are my other options instead?

Thank you for your answers,

- Natalija, a desperate PhD baby student

Sep 17, 2019, 6:10:55 AM

to lavaan

When it comes to my data, they are **not normally distributed** (Mardia coefficient: b2d = 519.104 / **z = 51.119** / p = 0.00).

With *N* > 1400, you have a lot of power to detect minor deviations, so the test is not very informative on its own.

Whenever I want to do an MGCFA on young adults and elderly patients, I get the lavaan warning message:

`In lav_samplestats_from_data(lavdata = lavdata, missing = lavoptions$missing, :`

lavaan WARNING: number of observations (83) too small to compute Gamma

Therefore, I have a few questions:

1) What is Gamma and do I need it to test the scale's invariance depending on the age of the participants?

As described on the ?lavInspect help page, Gamma is the asymptotic covariance matrix of the sample statistics. A parameter's point estimate is accompanied by a *SE*, which is an estimate of its sampling variability across repeated samples from the same population. Multiple parameters not only vary but covary (i.e., correlate across samples), leading to an estimated sampling-covariance matrix. Gamma is needed for some calculations (I forget which), but Gamma can only be calculated if you have enough information from your data. With 16 items on 5-point scales, you have 16*15/2 polychoric correlations + 16*4 thresholds = 184 sample statistics in each group, which is way larger than your 2 small *N*s.
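The counting argument above can be reproduced in a few lines of base R. This is only a sketch of the arithmetic: the variable names are made up for illustration, and the comparison against *N* follows the reasoning in this reply rather than lavaan's internal check.

```r
n_items <- 16   # items in the scale
n_cat   <- 5    # response categories per item

n_cor   <- n_items * (n_items - 1) / 2   # polychoric correlations
n_thr   <- n_items * (n_cat - 1)         # thresholds (categories - 1 per item)
n_stats <- n_cor + n_thr                 # sample statistics per group

# Group sizes as reported in the original question
group_n <- c(young = 83, adult = 1281, elderly = 107)

print(n_stats)             # 184
print(group_n < n_stats)   # TRUE for the young and elderly groups only
```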

If you still get test statistics from your models and can use lavTestLRT() to compare models, then I'm guessing the Gamma is not needed for those calculations.

2) Can I use the MLR estimator instead (or any other estimator)?

A barplot() of each variable would show you how asymmetric the distributions are. If each variable is approximately symmetric and each response category has a large enough *N*, then treating these as continuous with a robust ML estimator would give you approximately unbiased results about your factor loadings/correlations, as well as test statistics with approximately nominal Type I error rates.
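A minimal sketch of that check, using base R only. The data frame `dat` below is simulated as a stand-in (83 rows, four hypothetical items with skewed response probabilities); with real data you would use your own data frame and item names instead.

```r
set.seed(2)

# Hypothetical stand-in for real item responses coded 1-5
items <- paste0("E", 1:4)
dat <- setNames(
  as.data.frame(replicate(
    4, sample(1:5, 83, replace = TRUE, prob = c(.05, .10, .20, .30, .35))
  )),
  items
)

# Frequency of every category per item; factor() keeps empty categories as 0
counts <- sapply(dat, function(x) table(factor(x, levels = 1:5)))
print(counts)
print(min(counts))   # size of the sparsest response category

# One bar plot per item to judge symmetry by eye
op <- par(mfrow = c(2, 2))
for (i in items) {
  barplot(counts[, i], main = i,
          xlab = "Response category", ylab = "Frequency")
}
par(op)
```

With three groups it would make sense to run this per group as well, since a category that is well populated overall can still be nearly empty in a group of 83.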

Unfortunately, I don't know of any studies about treating ordinal data as continuous in the context of testing invariance, especially with a major imbalance in sample sizes. ML **might** avoid the issue with Gamma, but you would still have smaller *N*s in those 2 groups than the number of sample statistics: 16 means + 16*17/2 variances and covariances = 16*19/2 = 152 sample stats.

3) I know that one of the solutions is to increase the number of participants, the problem is I can't do that since I'm analysing the data from a project that is over now. What are my other options instead?

I think you could try both estimators (robust ML and robust DWLS). If they both lead to the same conclusions, that adds a little extra confidence. And if invariance constraints hold (i.e., null hypotheses are not rejected), then that stabilizes the estimates because there are fewer of them and they are using information from all groups instead of one. But the big problem is still the small *N*s: with so little information about those populations, you have very little power to detect truly meaningful violations of invariance (unless the violations are huge). So Type II error rates will probably be large.
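A rough sketch of fitting the same pair of nested models under both estimators. Everything about the data is an assumption for illustration: `dat` is simulated with 200 cases per group, only the four F2 items are used, and "metric" here simply means equal loadings via `group.equal = "loadings"`. The lavaan calls are skipped if the package is not installed.

```r
set.seed(3)

# Hypothetical data: one factor, four 5-point items, three equal-sized groups
items <- c("E2", "E4", "E13", "E17")
n_per_group <- 200
eta <- rnorm(3 * n_per_group)
dat <- as.data.frame(setNames(lapply(items, function(i) {
  y <- 0.7 * eta + rnorm(length(eta), sd = sqrt(0.51))
  as.integer(cut(y, breaks = c(-Inf, -1.2, -0.4, 0.4, 1.2, Inf)))
}), items))
dat$age_group <- rep(c("young", "adult", "elderly"), each = n_per_group)

model_F2 <- "F2 =~ E2 + E4 + E13 + E17"

if (requireNamespace("lavaan", quietly = TRUE)) {
  # Robust DWLS, items declared ordinal
  wls_config <- lavaan::cfa(model_F2, data = dat, group = "age_group",
                            ordered = items, estimator = "WLSMV")
  wls_metric <- lavaan::cfa(model_F2, data = dat, group = "age_group",
                            ordered = items, estimator = "WLSMV",
                            group.equal = "loadings")

  # Robust ML, items treated as continuous
  mlr_config <- lavaan::cfa(model_F2, data = dat, group = "age_group",
                            estimator = "MLR")
  mlr_metric <- lavaan::cfa(model_F2, data = dat, group = "age_group",
                            estimator = "MLR", group.equal = "loadings")

  # Do both estimators lead to the same conclusion about equal loadings?
  print(lavaan::lavTestLRT(wls_config, wls_metric))
  print(lavaan::lavTestLRT(mlr_config, mlr_metric))
}
```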

If the models fit well in each sample and you have simple structure (no cross-loadings or correlated residuals across factors), you could test invariance for each factor separately to decrease the number of estimated parameters in your models (and the number of sample stats, which might resolve the Gamma issue).
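To see how much splitting by factor reduces the number of sample statistics, here is a small base R sketch using the item-to-factor assignment from the model in the original question. The counting rule is the same one used above for ordinal items (polychoric correlations plus thresholds); the object names are invented for illustration.

```r
# Item-to-factor assignment taken from the model in the question
factors <- list(
  F1 = c("E1", "E5", "E9", "E11", "E14", "E15", "E18"),
  F2 = c("E2", "E4", "E13", "E17"),
  F3 = c("E3", "E7", "E8", "E12", "E16")
)
n_cat <- 5   # response categories per item

# Polychoric correlations + thresholds for each single-factor model
n_stats <- sapply(factors, function(it) {
  p <- length(it)
  p * (p - 1) / 2 + p * (n_cat - 1)
})
print(n_stats)   # F1 = 49, F2 = 22, F3 = 30

# lavaan model syntax for each factor on its own
models <- sapply(names(factors), function(f) {
  paste0(f, " =~ ", paste(factors[[f]], collapse = " + "))
})
print(models)
```

All three counts are below the smallest group size (83), which is why the separate models might get around the Gamma warning.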

Terrence D. Jorgensen

Assistant Professor, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam

Sep 17, 2019, 10:50:23 AM

to lavaan

Thank you so much for your clear explanations, Terrence Jorgensen! Testing each factor separately is an excellent idea!

I forgot to mention that, despite the number of observations being too small to compute Gamma, lavaan gave me fit indices that were pretty good. But, when comparing the models between them, the invariance constraints stopped applying to the models at one point. Are those fit indices reliable even with this Gamma problem?

Thanks!

Sep 18, 2019, 6:25:06 AM

to lavaan

In the book by Beaujean, A. (2014), *Latent Variable Modeling Using R: A Step-by-Step Guide*, I found the following explanation:

**"If a categorical data is coded numerically, R assumes it is a continuous variable, unless told differently"** (p.12).

I assume then that if your Likert scale points are coded numerically, you should use estimators for the continuous data?

Either way, I'll try both estimators, but I'm a little bit confused with the nature of the data depending on different statistical programs.

Sep 18, 2019, 4:29:22 PM

to lavaan

I assume then that if your Likert scale points are coded numerically, you should use estimators for the continuous data?

No, it means lavaan has no way of knowing the numbers represent ordinal categories, unless you tell it that the variables are ordinal using the ordered= argument:
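A minimal sketch of what that looks like, assuming a data frame `dat` whose items are stored as plain integers 1-5. The data here are simulated and only the four F2 items are used, so treat the names and numbers as placeholders. The lavaan calls are skipped if the package is not installed.

```r
set.seed(4)

# Hypothetical data: four 5-point items stored as integers
items <- c("E2", "E4", "E13", "E17")
n <- 500
eta <- rnorm(n)
dat <- as.data.frame(setNames(lapply(items, function(i) {
  y <- 0.7 * eta + rnorm(n, sd = sqrt(0.51))
  as.integer(cut(y, breaks = c(-Inf, -1.2, -0.4, 0.4, 1.2, Inf)))
}), items))

model_F2 <- "F2 =~ E2 + E4 + E13 + E17"

if (requireNamespace("lavaan", quietly = TRUE)) {
  # Without ordered=, the integer codes are treated as continuous scores
  fit_cont <- lavaan::cfa(model_F2, data = dat)

  # With ordered=, lavaan models thresholds and polychoric correlations
  fit_ord <- lavaan::cfa(model_F2, data = dat, ordered = items)

  # Which observed variables does each fit regard as ordinal?
  print(lavaan::lavNames(fit_cont, "ov.ord"))
  print(lavaan::lavNames(fit_ord, "ov.ord"))
}
```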

when comparing the models between them, the invariance constraints stopped applying to the models at one point

I don't understand what you mean.
