Multigroup CFA - problems with strong invariance estimation

177 views
Skip to first unread message

Matti Cervin

unread,
Sep 16, 2019, 8:01:20 AM9/16/19
to lavaan
I am examining whether a model is invariant across two different groups. Total N~2300, >1000 in each group.

This is the model

model<-'
F1 =~ x1+x2+x3
F2 =~ 1*x4+1*x5
F3 =~ 1*x6+1*x7
F4 =~ x8+x9+x10
F5 =~ 1*x11+1*x12

Three factors have only two indicators and I fix factor loadings for these indicators.

I test invariance with the following CFAs:

config <- cfa(model, data=all, group="gr",
              ordered = c('x1','x2','x3','x4',
                          'x5','x6','x7','x8','x9',
                          'x10','x11','x12'))
 
weak <- cfa(model, data=all, group="gr",
            ordered = c('x1','x2','x3','x4',
                          'x5','x6','x7','x8','x9',
                          'x10','x11','x12'),
            group.equal = c("loadings"))
 
strong <- cfa(model,data=all, group="gr",
              ordered = c('x1','x2','x3','x4',
                          'x5','x6','x7','x8','x9',
                          'x10','x11','x12'),
              group.equal = c("loadings", "intercepts"))
 
strict <- cfa(model, data=all, group="gr",
               ordered = c('x1','x2','x3','x4',
                          'x5','x6','x7','x8','x9',
                          'x10','x11','x12'),
               parameterization = "theta",
               group.equal = c("loadings", "intercepts", "residuals"))


All models run fine except for the "strong" model (i.e.,   group.equal = c("loadings", "intercepts")) where I get this warning:

Warning message:
In lavaan::lavaan(model = model, data = all, ordered = c("x1",  :
  lavaan WARNING: the optimizer warns that a solution has NOT been found!

The exact same warning appears for the strong model when I test for invariance for this model across two other groups. The "strict" model runs fine.

Any ideas on what may be the problem with running the "strong" model?

Terrence Jorgensen

unread,
Sep 16, 2019, 9:22:54 AM9/16/19
to lavaan
You have ordered indicators, so you need to start by constraining thresholds (at which point you need to freely estimate the intercepts and residual variances that are fixed in Group 2 for identification -- this necessitates using parameterization = "theta" in all models). If you have only 3 categories, this should be equivalent to your configural model (i.e., you just trade 2 thresholds for those 2 estimates).  If you have only 2 categories, you can't distinguish these levels of invariance, so you just simultaneously constrain thresholds, loadings, and intercepts to test strong variance (compared to the default configural model).

See examples in the ?semTools::measEq.syntax help page (after installing semTools), and read the Wu & Estabrook (2016) reference therein.

Terrence D. Jorgensen

Assistant Professor, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam


Terrence Jorgensen

unread,
Sep 16, 2019, 9:24:16 AM9/16/19
to lavaan

Matti Cervin

unread,
Sep 16, 2019, 9:30:17 AM9/16/19
to lav...@googlegroups.com
Thanks a lot, Terrence!

--
You received this message because you are subscribed to a topic in the Google Groups "lavaan" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/lavaan/O_dRFM2XtwM/unsubscribe.
To unsubscribe from this group and all its topics, send an email to lavaan+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lavaan/79191648-6161-4cc1-89f3-c50cefde5ea5%40googlegroups.com.

Matti Cervin

unread,
Sep 16, 2019, 10:39:44 AM9/16/19
to lav...@googlegroups.com
Terry - I am reading the sources you linked to, but I am no expert in SEM so I am having a hard time figuring out the correct way to go about this.

My indicators are binary (i.e., 0 and 1). Can I use the code that I specified above and only add constraints? That is, this code:

config <- cfa(model, data=all, group="gr",
              ordered = c('x1','x2','x3','x4',
                          'x5','x6','x7','x8','x9',
                          'x10','x11','x12'))
 
weak <- cfa(model, data=all, group="gr",
            ordered = c('x1','x2','x3','x4',
                          'x5','x6','x7','x8','x9',
                          'x10','x11','x12'),
            group.equal = c("loadings"))
 
strong <- cfa(model,data=all, group="gr",
              ordered = c('x1','x2','x3','x4',
                          'x5','x6','x7','x8','x9',
                          'x10','x11','x12'),
              group.equal = c("loadings", "intercepts"))
 
strict <- cfa(model, data=all, group="gr",
               ordered = c('x1','x2','x3','x4',
                          'x5','x6','x7','x8','x9',
                          'x10','x11','x12'),
               parameterization = "theta",
               group.equal = c("loadings", "intercepts", "residuals"))

 Or do I need to use another function (e.g., measEq.syntax)?

--
You received this message because you are subscribed to a topic in the Google Groups "lavaan" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/lavaan/O_dRFM2XtwM/unsubscribe.
To unsubscribe from this group and all its topics, send an email to lavaan+un...@googlegroups.com.

Matti Cervin

unread,
Sep 16, 2019, 12:31:18 PM9/16/19
to lav...@googlegroups.com
Sorry for the last, unedited script. This is what I use after reading your links, but I am unsure whether it is correct. Note that the indicators are binary (0/1).
 
vars<-c('x1','x2','x3','x4',

                          'x5','x6','x7','x8','x9',
                          'x10','x11','x12')
 
config <- cfa(model, estimator="WLSMV",data=all, group="child",
              ordered = vars,
              parameterization = "theta",group.equal = c("thresholds"))
 
weak <- cfa(model, estimator="WLSMV",data=all, group="child",
            ordered = vars,
            parameterization = "theta",
            group.equal = c("thresholds", "loadings"))
 
strict <- cfa(model, estimator="WLSMV",data=all, group="child",
              ordered = vars,
              parameterization = "theta",
              group.equal = c("thresholds","loadings", "intercepts"))


This code gives me interpretable output, but I am not sure whether it is correct. Thanks so much for your patience!

Terrence Jorgensen

unread,
Sep 17, 2019, 5:22:13 AM9/17/19
to lavaan
I am unsure whether it is correct. Note that the indicators are binary (0/1).

Then you need to simultaneously constrain thresholds and loadings.  FYI, declaring ordered= outcomes forces lavaan to use DWLS by default, with the MV-adjusted test statistic, so no need to request it.

config <- cfa(model, data=all, group="child",
              ordered
= vars, parameterization = "theta"))
 
weak
<- cfa(model, data=all, group="child",

            ordered
= vars, parameterization = "theta",
           
group.equal = c("thresholds", "loadings"))

 
strict
<- cfa(model, data=all, group="child",

              ordered
= vars, parameterization = "theta",
             
group.equal = c("thresholds","loadings", "intercepts"))
Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Matti Cervin

unread,
Sep 17, 2019, 9:18:45 AM9/17/19
to lav...@googlegroups.com
Then you need to simultaneously constrain thresholds and loadings.

Thanks. Is this the correct code to do that:

model<-'
F1 =~ x1+x2+x3
F2 =~ 1*x4+1*x5
F3 =~ 1*x6+1*x7
F4 =~ x8+x9+x10
F5 =~ 1*x11+1*x12

fit <- cfa(model, data=all, group="child",
            ordered = vars, std.lv=T,

            parameterization = "theta",
            group.equal = c("thresholds", "loadings"))


Or should I use this code:

model<-'
F1 =~ x1+x2+x3
F2 =~ 1*x4+1*x5
F3 =~ 1*x6+1*x7
F4 =~ x8+x9+x10
F5 =~ 1*x11+1*x12

fit2 <- measEq.syntax(configural.model = model, data = all, 
parameterization = "theta",
ordered = vars, group = "male",
ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
group.equal = c("thresholds", "loadings"),
return.fit = TRUE)

If the latter is correct, which ID.fac and ID.cat settings are best to use in my case. Thanks again! Really appreciate it.

--
You received this message because you are subscribed to a topic in the Google Groups "lavaan" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/lavaan/O_dRFM2XtwM/unsubscribe.
To unsubscribe from this group and all its topics, send an email to lavaan+un...@googlegroups.com.

Terrence Jorgensen

unread,
Sep 18, 2019, 2:31:06 PM9/18/19
to lavaan
Thanks. Is this the correct code to do that:

I showed you the correct code in the previous email, except that I made a typo in the strict invariance model.  Strict invariance should additionally constrain residual variances, not intercepts (the latter are fixed to zero by default, so they are already equal).

group.equal = c("thresholds","loadings","residuals")

Or should I use this code:

The only reason to use measEq.syntax() would be to learn about the model specification and lavaan syntax, so leave the default return.fit=FALSE so you can print the syntax and see what is happening.  This is a pedagogical tool.

If the latter is correct, which ID.fac and ID.cat settings are best to use in my case

ID.cat should always be the default (following Wu & Estabrook, 2016), but with binary indicators the choice is arbitrary (the model specification would be statistically equivalent to lavaan's default specification).  ID.fac is arbitrary unless a level of invariance is rejected, in which case tests for partial invariance could be biased using the reference-indicator/marker-variable method if that indicator in fact has DIF; so either the default std.lv or effects.coding would be smarter if you need to test for partial invariance.

Matti Cervin

unread,
Sep 18, 2019, 2:40:16 PM9/18/19
to lav...@googlegroups.com
Terrence: Thank you so much for this. After your last set of instructions, everything runs smoothly.

Just to be sure -- so when binary items are used, the best approach is to directly test for strict invariance by constraining thresholds, loadings and residuals?

--
You received this message because you are subscribed to a topic in the Google Groups "lavaan" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/lavaan/O_dRFM2XtwM/unsubscribe.
To unsubscribe from this group and all its topics, send an email to lavaan+un...@googlegroups.com.

Terrence Jorgensen

unread,
Sep 18, 2019, 4:33:26 PM9/18/19
to lavaan
when binary items are used, the best approach is to directly test for strict invariance by constraining thresholds, loadings and residuals?

You can still test strong invariance first (constrained thresholds and loadings; by virtue of fixing intercepts to 0 for identification, they are already equal).  But if strong invariance fits significantly worse than configural invariance, even if you narrow it down to a violation in one specific indicator, you have no way of knowing whether the violation of invariance is due to differences in that item's loading, intercept, or threshold.  I posted my slides, which explain why:

Matti Cervin

unread,
Sep 19, 2019, 5:10:11 AM9/19/19
to lav...@googlegroups.com
Thanks! For one group comparison, everything works nicely. However, when I test for invariance for another set of groups, I can't identify a model for configural and strong settings. However, for the strict invariance, everything works. This is the code and output:

model<-'
F1 =~ x1+x2+x3
F2 =~ 1*x4+1*x5
F3 =~ 1*x6+1*x7
F4 =~ x8+x9+x10
F5 =~ 1*x11+1*x12'
 

fitconfig <- cfa(malt2, data=all, group="child",

                    ordered = vars,
                    parameterization = "theta",
                    group.equal = c("thresholds"))
 
# lavaan WARNING: the optimizer warns that a solution has NOT been found!
 
fitstrong <- cfa(malt2, data=all, group="child",

           ordered = vars,
           parameterization = "theta",
           group.equal = c("thresholds", "loadings"))
 
# lavaan WARNING: the optimizer warns that a solution has NOT been found!
 
fitstrict <- cfa(malt2, data=all, group="child",

               ordered = vars,
               parameterization = "theta",
               group.equal = c("thresholds", "loadings", "residuals"))
 
#  This runs fine and gives me the following output: 
chisq.scaled: 197.195
df.scaled: 91.000
pvalue.scaled: 0.000
rmsea.scaled: 0.032
cfi.scaled: 0.991
tli.scaled: 0.987
srmr: 0.068

Given that the strict model fits nicely, can I just exclude the configural and strong tests and assume strict invariance?

What may cause the problems with the configural and strong models?                    


--
You received this message because you are subscribed to a topic in the Google Groups "lavaan" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/lavaan/O_dRFM2XtwM/unsubscribe.
To unsubscribe from this group and all its topics, send an email to lavaan+un...@googlegroups.com.

Terrence Jorgensen

unread,
Sep 19, 2019, 10:46:55 AM9/19/19
to lavaan
Given that the strict model fits nicely, can I just exclude the configural and strong tests and assume strict invariance?

This is a general question better suited for SEMNET.

All conclusions are tentative in science; the more evidence is gathered in support (or that does not refute) a conclusion, the more confidence we can have.  If your models don't converge, you don't have the opportunity to falsify your hypothesis of invariance.  You could argue the decent approximate fit is at least not evidence against the null hypothesis that strict invariance is a decent approximation to the truth.  Since misfit is statistically signifcant, you could still look for a partial invariance model by freeing the thresholds, loadings, and residuals for one item at a time using the group.partial= argument. 

What may cause the problems with the configural and strong models?                    

No idea.  My first guess would have been a small sample, but you have large N in both groups.  My only other guess would be such poor fit that a solution can't be found, but your fit is admittedly not that poor.  However, global fit indices can mask a lot of important local misfit.  Try looking at the lavResiduals() output from your strict model to see which parts of your model fail.  Even if SRMR is fairly low, that is just an average, which could be due to several near-zero residuals and one or two very large ones that could point to why a solution is hard to find (no guarantees).

Matti Cervin

unread,
Sep 19, 2019, 11:05:58 AM9/19/19
to lav...@googlegroups.com
This has been really helpful. Thanks a lot!

--
You received this message because you are subscribed to a topic in the Google Groups "lavaan" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/lavaan/O_dRFM2XtwM/unsubscribe.
To unsubscribe from this group and all its topics, send an email to lavaan+un...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages