scalar invariance for ordinal data

Chen Ali

Nov 21, 2019, 9:48:15 AM
to lavaan

Hello,

I am testing scalar invariance for ordinal data across three groups, following the approaches of Millsap & Yun-Tein (2004) and Wu & Estabrook (2016). When testing scalar invariance, I am unsure whether I should fix the latent mean to 0 only in the reference group and freely estimate the latent means in the other two groups.


When I used the following code, the latent means were fixed to 0 in all groups.

intercepts <- measEq.syntax(configural.model = first,
                            data = healthy,
                            ordered = c("q10", "q11", "q12", "q13", "q14", "q15"),
                            parameterization = "delta",
                            ID.fac = "std.lv",
                            ID.cat = "Wu.Estabrook.2016",
                            group = "CNT",
                            group.equal = c("thresholds", "loadings", "intercepts"))

Thank you in advance!

Ali

Terrence Jorgensen

Nov 21, 2019, 10:53:40 AM
to lavaan

> followed Millsap & Yun-Tein (2004) and Wu & Estabrook (2016) approaches.


You can't follow both at the same time.  Do you mean you tried both approaches to compare them?

> I am confused whether I should make the means of a latent variable for the rest two groups be freely estimated and only constrain the mean of latent variable of a reference group to as 0 when I did the test of scalar invariance.


The syntax-writing function should make the right choice based on your selected method of identification.  What did the script show you?

cat(as.character(intercepts))

I think Millsap's approach always fixes one group's mean to zero and estimates the mean in other groups, regardless of the model.  With Wu & Estabrook's approach and ID.fac = "std.lv", the same thing should happen in your syntax above: one group's mean fixed to zero, but others are estimated.
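
For readers without access to the original data, here is a minimal sketch of the same check. It assumes the example data set datCat that ships with semTools (items u1 to u4, grouping variable g) and a hypothetical one-factor model named FU; none of these names come from Ali's analysis. It generates the scalar-invariance syntax and prints the line(s) that specify the latent mean, so you can see directly which groups have a fixed or free mean in your installed version.

```r
library(semTools)  # provides measEq.syntax() and the datCat example data

# Illustrative one-factor model (an assumption, not the model from the question)
mod <- ' FU =~ u1 + u2 + u3 + u4 '

syn.scalar <- measEq.syntax(configural.model = mod,
                            data = datCat,
                            ordered = paste0("u", 1:4),
                            parameterization = "delta",
                            ID.fac = "std.lv",
                            ID.cat = "Wu.Estabrook.2016",
                            group = "g",
                            group.equal = c("thresholds", "loadings", "intercepts"))

# Full generated lavaan syntax
script <- as.character(syn.scalar)
cat(script)

# Only the latent-mean specification for the factor FU
script.lines <- unlist(strsplit(script, "\n", fixed = TRUE))
print(grep("^FU ~ ", script.lines, value = TRUE))
```

In the printed line, a 0 in a group's position means that group's latent mean is fixed, and NA means it is freely estimated.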

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Chen Ali

Nov 22, 2019, 7:00:52 AM
to lavaan

I used the Wu & Estabrook (2016) approach, but I would like to fix one group's mean to 0 and estimate the means in the other two groups.

I also used ID.fac = "std.lv" in the syntax. However, the output showed that the means were fixed to 0 in all groups.


cat(as.character(intercepts)) showed me:


## LOADINGS:

OS =~ c(NA, NA, NA)*q10 + c(lambda.1_1, lambda.1_1, lambda.1_1)*q10
OS =~ c(NA, NA, NA)*q11 + c(lambda.2_1, lambda.2_1, lambda.2_1)*q11
OS =~ c(NA, NA, NA)*q12 + c(lambda.3_1, lambda.3_1, lambda.3_1)*q12
OS =~ c(NA, NA, NA)*q13 + c(lambda.4_1, lambda.4_1, lambda.4_1)*q13
OS =~ c(NA, NA, NA)*q14 + c(lambda.5_1, lambda.5_1, lambda.5_1)*q14
OS =~ c(NA, NA, NA)*q15 + c(lambda.6_1, lambda.6_1, lambda.6_1)*q15

## THRESHOLDS:

q10 | c(NA, NA, NA)*t1 + c(q10.thr1, q10.thr1, q10.thr1)*t1
q10 | c(NA, NA, NA)*t2 + c(q10.thr2, q10.thr2, q10.thr2)*t2
q10 | c(NA, NA, NA)*t3 + c(q10.thr3, q10.thr3, q10.thr3)*t3
q10 | c(NA, NA, NA)*t4 + c(q10.thr4, q10.thr4, q10.thr4)*t4
q11 | c(NA, NA, NA)*t1 + c(q11.thr1, q11.thr1, q11.thr1)*t1
q11 | c(NA, NA, NA)*t2 + c(q11.thr2, q11.thr2, q11.thr2)*t2
q11 | c(NA, NA, NA)*t3 + c(q11.thr3, q11.thr3, q11.thr3)*t3
q11 | c(NA, NA, NA)*t4 + c(q11.thr4, q11.thr4, q11.thr4)*t4
q12 | c(NA, NA, NA)*t1 + c(q12.thr1, q12.thr1, q12.thr1)*t1
q12 | c(NA, NA, NA)*t2 + c(q12.thr2, q12.thr2, q12.thr2)*t2
q12 | c(NA, NA, NA)*t3 + c(q12.thr3, q12.thr3, q12.thr3)*t3
q12 | c(NA, NA, NA)*t4 + c(q12.thr4, q12.thr4, q12.thr4)*t4
q13 | c(NA, NA, NA)*t1 + c(q13.thr1, q13.thr1, q13.thr1)*t1
q13 | c(NA, NA, NA)*t2 + c(q13.thr2, q13.thr2, q13.thr2)*t2
q13 | c(NA, NA, NA)*t3 + c(q13.thr3, q13.thr3, q13.thr3)*t3
q13 | c(NA, NA, NA)*t4 + c(q13.thr4, q13.thr4, q13.thr4)*t4
q14 | c(NA, NA, NA)*t1 + c(q14.thr1, q14.thr1, q14.thr1)*t1
q14 | c(NA, NA, NA)*t2 + c(q14.thr2, q14.thr2, q14.thr2)*t2
q14 | c(NA, NA, NA)*t3 + c(q14.thr3, q14.thr3, q14.thr3)*t3
q14 | c(NA, NA, NA)*t4 + c(q14.thr4, q14.thr4, q14.thr4)*t4
q15 | c(NA, NA, NA)*t1 + c(q15.thr1, q15.thr1, q15.thr1)*t1
q15 | c(NA, NA, NA)*t2 + c(q15.thr2, q15.thr2, q15.thr2)*t2
q15 | c(NA, NA, NA)*t3 + c(q15.thr3, q15.thr3, q15.thr3)*t3
q15 | c(NA, NA, NA)*t4 + c(q15.thr4, q15.thr4, q15.thr4)*t4

## INTERCEPTS:

q10 ~ c(nu.1, nu.1, nu.1)*1 + c(0, 0, 0)*1
q11 ~ c(nu.2, nu.2, nu.2)*1 + c(0, 0, 0)*1
q12 ~ c(nu.3, nu.3, nu.3)*1 + c(0, 0, 0)*1
q13 ~ c(nu.4, nu.4, nu.4)*1 + c(0, 0, 0)*1
q14 ~ c(nu.5, nu.5, nu.5)*1 + c(0, 0, 0)*1
q15 ~ c(nu.6, nu.6, nu.6)*1 + c(0, 0, 0)*1

## SCALING FACTORS:

q10 ~*~ c(1, NA, NA)*q10
q11 ~*~ c(1, NA, NA)*q11
q12 ~*~ c(1, NA, NA)*q12
q13 ~*~ c(1, NA, NA)*q13
q14 ~*~ c(1, NA, NA)*q14
q15 ~*~ c(1, NA, NA)*q15

## LATENT MEANS/INTERCEPTS:

OS ~ c(alpha.1.g1, alpha.1.g2, alpha.1.g3)*1 + c(0, 0, 0)*1   ---> Does this mean that the latent means of all three groups are fixed to 0?

## COMMON-FACTOR VARIANCES:

OS ~~ c(1, NA, NA)*OS + c(psi.1_1.g1, psi.1_1.g2, psi.1_1.g3)*OS

Terrence Jorgensen

Nov 22, 2019, 7:12:40 AM
to lavaan

> ## LATENT MEANS/INTERCEPTS:
>
> OS ~ c(alpha.1.g1, alpha.1.g2, alpha.1.g3)*1 + c(0, 0, 0)*1   ---> Does this mean that the latent means of all three groups are fixed to 0?


Yes, and I think this may be an old bug that is already fixed.  What version are you using?

sessionInfo()

If it is not 0.5-2.911, please install the development version

devtools::install_github("simsem/semTools/semTools")

Restart R and load semTools to check the version installed correctly, then see if the bug is fixed.  If not, please send me a reproducible example (minimal R script and just enough data to reproduce the problem) so I can track it down.
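
A quicker way to check only the package versions (rather than reading through the full sessionInfo() output) is sketched below. It assumes both packages are already installed; the version string is the one mentioned above.

```r
# Installed versions of the two packages (assumes both are installed)
v.semTools <- packageVersion("semTools")
v.lavaan   <- packageVersion("lavaan")
print(v.semTools)
print(v.lavaan)

# TRUE if semTools is at least the version mentioned above
print(v.semTools >= "0.5-2.911")
```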

Chen Ali

Nov 25, 2019, 7:49:06 AM
to lavaan


Thanks! After updating to the new version of semTools, only the reference group's mean is fixed to 0, and the means in the other groups are freely estimated.

I have another question about scalar invariance. Could the code below be used to test scalar invariance for ordinal data? If I understand correctly, the code below fixes the intercepts to 0 across groups.


thresholds <- cfa(model = first, estimator = "DWLS", se = "robust.sem",
                  test = "scaled.shifted", parameterization = "theta",
                  ordered = Items, group = "CNT",
                  group.equal = c("loadings", "thresholds"), data = healthy)

Ali

Terrence Jorgensen

Nov 25, 2019, 11:01:38 AM
to lavaan

> Could the code below be used to test scalar invariance for ordinal data? If I understand correctly, the code below fixes the intercepts to 0 across groups.
>
> thresholds <- cfa(model = first, estimator = "DWLS", se = "robust.sem",
>                   test = "scaled.shifted", parameterization = "theta",
>                   ordered = Items, group = "CNT",
>                   group.equal = c("loadings", "thresholds"), data = healthy)

Yes, you can test scalar invariance by comparing the model above to the default configural model, because lavaan does not free intercepts for ordered indicators even when they can be identified from the data (consistent with Mplus).  But if you reject the null hypothesis of scalar invariance, it will not be clear which types of parameters differ.  Think of it as an omnibus test, which would need follow-up tests to reveal the source of DIF.

FYI, the highlighted arguments are redundant because cfa() will use those automatically with any ordered indicators.
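
As an illustration of the omnibus comparison described above, here is a minimal sketch. It assumes the datCat example data from semTools and a hypothetical one-factor model (FU, items u1 to u4, grouping variable g) rather than Ali's data, and it relies on lavaan's defaults for ordered indicators instead of setting the estimator arguments by hand.

```r
library(lavaan)
library(semTools)  # used here only for the example data set datCat

# Illustrative model and item names (assumptions, not from the question)
mod   <- ' FU =~ u1 + u2 + u3 + u4 '
items <- paste0("u", 1:4)

# Default configural model: no cross-group equality constraints
fit.config <- cfa(mod, data = datCat, ordered = items, group = "g")

# Omnibus model: loadings and thresholds equated across groups; lavaan keeps
# the intercepts of ordered indicators fixed at 0 in every group
fit.omnibus <- cfa(mod, data = datCat, ordered = items, group = "g",
                   group.equal = c("loadings", "thresholds"))

# Chi-squared difference test of all equality constraints at once
print(lavTestLRT(fit.config, fit.omnibus))
```

A significant difference here would only say that some measurement parameters differ, not which ones, which is why follow-up tests would be needed.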

Chen Ali

Nov 26, 2019, 7:35:12 AM
to lavaan
Thanks! That cleared up my confusion.

Thomas McCauley

Dec 6, 2019, 8:35:27 PM
to lavaan
I'm interested in testing scalar invariance for ordinal indicators, and am a bit confused by the responses here.

First, according to the guidelines set forth by Svetina, Rutkowski, & Rutkowski (2019), you can test for threshold invariance by directly comparing a model in which thresholds are constrained to a model in which thresholds and loadings are constrained. However, your answer, Terrence, would suggest that one can only compare the threshold model to the configural model. Which is correct? Here is the code I am running to generate and compare the threshold-constrained and threshold-and-loadings-constrained models:

### Step 3. Testing threshold and loadings invariance

# Generate the model syntax
dddLoadingsTheory <- measEq.syntax(configural.model = dddfactor,
                                   data = dddata,
                                   ordered = c("compassionate",
                                               "empathic",
                                               "sympathetic",
                                               "softhearted",
                                               "tender"),
                                   parameterization = "delta",
                                   ID.fac = "std.lv",
                                   ID.cat = "Wu.Estabrook.2016",
                                   group = "SexCode",
                                   group.equal = c("thresholds", "loadings"))

# Fit the model
dddLoadingsTheoryFit <- cfa(as.character(dddLoadingsTheory),
                            data = dddata,
                            group = "SexCode",
                            ordered = c("compassionate",
                                        "empathic",
                                        "sympathetic",
                                        "softhearted",
                                        "tender"),
                            test = "Satorra-Bentler")

summary(dddLoadingsTheoryFit,
        fit.measures = TRUE)

# Compare the thresholds-constrained model to the thresholds-and-loadings-constrained model
# (dddThresholdTheoryFit is presumably the thresholds-only model from an earlier step, not shown here)
lavTestLRT(dddLoadingsTheoryFit,
           dddThresholdTheoryFit,
           A.method = "delta",
           scaled.shifted = TRUE,
           type = "chisq",
           method = "satorra.bentler.2001")

Second, Chen Ali's original post indicated that they tested scalar invariance by constraining the intercepts of manifest indicators to be equal, in addition to thresholds and loadings. As I understand from Wu & Estabrook (2016), you can identify a model in which intercepts of ordinal indicators can be constrained to be equal (pg. 1031), which is the convention for MI with continuous indicators. But is constraining intercepts necessary for testing scalar invariance with ordinal indicators? I realize this second question is likely more appropriate for semnet, but it seems that Terrence's statement -- "Yes, you can test scalar invariance by comparing the model above to the default configural model, because lavaan does not free intercepts for ordered indicators even when they can be identified from the data (consistent with Mplus)." -- is in conflict with the approach outlined by Chen Ali.

Best,
Thomas

References:

Svetina, D., Rutkowski, L., & Rutkowski, D. (2019). Multiple-Group Invariance with Categorical Outcomes Using Updated Guidelines: An Illustration Using Mplus and the lavaan/semTools Packages. Structural Equation Modeling: A Multidisciplinary Journal, 1-20.

Wu, H., & Estabrook, R. (2016). Identification of confirmatory factor analysis models of different levels of invariance for ordered categorical outcomes. Psychometrika, 81(4), 1014-1045.

Terrence Jorgensen

Dec 10, 2019, 8:47:25 AM
to lavaan
> according to the guidelines set forth by Svetina, Rutkowski, & Rutkowski (2019), you can test for threshold invariance by directly comparing a model in which thresholds are constrained to a model in which thresholds and loadings are constrained.

If thresholds are constrained in both models, then how do you expect to test that constraint?  The difference between those 2 models is that the loadings are additionally constrained, so those are the constraints you would test in that model comparison.
 
> However, your answer, Terrence, would suggest that one can only compare the threshold model to the configural model.

I don't suggest you can "only" do anything.  I suggest that it is wiser to test threshold invariance before testing higher-order measurement parameters, because (as Wu and Estabrook show) tests/interpretation of all the other measurement parameters depend on whether thresholds are constrained to equality.  If you can decouple the location/scale of latent item responses by equating thresholds, then tests of higher-order measurement parameters are meaningful.

> Second, Chen Ali's original post indicated that they tested scalar invariance by constraining the intercepts of manifest indicators to be equal, in addition to thresholds and loadings. As I understand from Wu & Estabrook (2016), you can identify a model in which intercepts of ordinal indicators can be constrained to be equal (pg. 1031), which is the convention for MI with continuous indicators. But is constraining intercepts necessary for testing scalar invariance with ordinal indicators? I realize this second question is likely more appropriate for semnet, but it seems that Terrence's statement -- "Yes, you can test scalar invariance by comparing the model above to the default configural model, because lavaan does not free intercepts for ordered indicators even when they can be identified from the data (consistent with Mplus)." -- is in conflict with the approach outlined by Chen Ali.

Keep reading.  The next sentence says "But if you reject the null hypothesis of scalar invariance, it will not be clear which types of parameters differ.  Think of it as an omnibus test, which would need follow-up tests to reveal the source of DIF".  So you can test equivalence of all measurement parameters simultaneously, or you can proceed in steps by constraining one type of parameter at a time.  The latter is more informative, but perhaps unnecessary if a single omnibus test is not significant.
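
The stepwise route could be sketched as follows. This is only an illustration under stated assumptions: it uses the datCat example data from semTools with a hypothetical one-factor model (FU, items u1 to u4, grouping variable g), and fit_step() is a helper invented for this sketch, not a semTools function. Each step adds one type of equality constraint, with thresholds constrained first.

```r
library(lavaan)
library(semTools)  # measEq.syntax() and the datCat example data

# Illustrative model and item names (assumptions, not from the thread)
mod   <- ' FU =~ u1 + u2 + u3 + u4 '
items <- paste0("u", 1:4)

# Hypothetical helper: generate syntax for a set of equality constraints,
# then fit that syntax to the same data
fit_step <- function(equal) {
  syn <- measEq.syntax(configural.model = mod, data = datCat,
                       ordered = items, parameterization = "delta",
                       ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
                       group = "g", group.equal = equal)
  cfa(as.character(syn), data = datCat, group = "g",
      ordered = items, parameterization = "delta")
}

fit.config <- fit_step("")                                        # configural
fit.thresh <- fit_step("thresholds")                              # + thresholds
fit.load   <- fit_step(c("thresholds", "loadings"))               # + loadings
fit.int    <- fit_step(c("thresholds", "loadings", "intercepts")) # + intercepts

# Sequential chi-squared difference tests between adjacent models
print(lavTestLRT(fit.config, fit.thresh, fit.load, fit.int))
```

The omnibus alternative compares only the first and last models in this sequence.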

Claire Chen

Jun 15, 2020, 11:29:30 PM
to lavaan
Hello!

To clarify, if residual invariance is not of interest and delta parameterization is used, you don't have to constrain the intercepts to be equal, since they are fixed to 0 across all models. Is that correct?
If yes, you would set group.equal = c("thresholds", "loadings") instead of group.equal = c("thresholds", "loadings", "intercepts") for testing scalar invariance, as shown in the previous example by Thomas, right?

Thank you,
Claire

Terrence Jorgensen

Jun 17, 2020, 8:11:53 AM
to lavaan
> if residual invariance is not of interest and delta parameterization is used, you don't have to constrain the latent intercept to be equal since they are fixed to be 0 across all models. Is that correct?

Not when ID.cat = "Wu.Estabrook.2016" and group.equal= (or long.equal=) includes "thresholds".  Upon constraining thresholds, intercepts for subsequent groups/occasions no longer need to be fixed for identification (nor do (residual) variances, given > 1 threshold).
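
One way to see this in the generated syntax is to compare the intercept specifications with and without "intercepts" in group.equal. The sketch below assumes the datCat example data from semTools and a hypothetical one-factor model (FU, items u1 to u4, grouping variable g); syntax_lines() is a helper made up for this illustration.

```r
library(semTools)  # measEq.syntax() and the datCat example data

# Illustrative model (an assumption, not from the thread)
mod <- ' FU =~ u1 + u2 + u3 + u4 '

# Hypothetical helper: return the generated syntax as a vector of lines
syntax_lines <- function(equal) {
  syn <- measEq.syntax(configural.model = mod, data = datCat,
                       ordered = paste0("u", 1:4), parameterization = "delta",
                       ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
                       group = "g", group.equal = equal)
  unlist(strsplit(as.character(syn), "\n", fixed = TRUE))
}

without.int <- syntax_lines(c("thresholds", "loadings"))
with.int    <- syntax_lines(c("thresholds", "loadings", "intercepts"))

# Lines containing "*1" specify item intercepts and latent means
print(grep("*1", without.int, fixed = TRUE, value = TRUE))
print(grep("*1", with.int, fixed = TRUE, value = TRUE))
```

In each printed line, a 0 marks a parameter fixed for that group, NA marks a free parameter, and a repeated label marks an equality constraint across groups.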

Chen Yun-Ju

Jun 17, 2020, 2:05:33 PM
to lav...@googlegroups.com
Thank you so much, Dr. Jorgensen!

In my analysis of a large set of ordinal data, I did intend to use Wu & Estabrook's method by first constraining thresholds. When setting intercepts to be equal after constraining thresholds and loadings, I got this error message:
A factor's mean cannot be freed unless it has at least one indicator without a cross-loading whose intercept is constrained to equality. Use cat(as.character()) to check whether the syntax returned by measEq.syntax() must be manually adapted to free the necessary latent means.

I found the discussion here (https://github.com/simsem/semTools/issues/60) quite helpful. However, the tricky thing is that EVERY indicator in my model has a cross-loading. In this case, does it still make sense to manually free any latent means? Would you have any suggestions for testing scalar invariance given the lack of a unique indicator for each factor?

Here is the relevant code for your reference:
data[, 3:39] <- lapply(data[, 3:39], ordered)
scalar <- measEq.syntax(configural.model = mod4, data = data,
                        parameterization = "delta", ID.fac = "std.lv",
                        ID.cat = "Wu.Estabrook.2016", group = "sex",
                        group.equal = c("thresholds", "loadings", "intercepts"))

Thank you!
Claire



Terrence Jorgensen

Jun 18, 2020, 4:35:12 AM
to lavaan
> EVERY indicator in my model has a cross-loading. In this case, does it still make sense to manually free any latent means?

Since you have a bifactor or MTMM model, constraining intercepts to equality only allows you to estimate differences in means of EITHER general / trait factors OR specific / method factors.  You could even fit both models to compare their fit (although I doubt they are nested; you could use net() to check).  I would recommend adding "means" to your group.equal= argument, which should avoid the error (since means will all be fixed to zero).  Then you can copy / paste or use update() to free only the (set of) means to specify the model you want to fit.  