Hi everyone,
I have been working with some (simulated) categorical data for a while and greatly appreciate the developments in the measEq.syntax function, as it has made my life easier when applying the identification constraints from Wu & Estabrook (2016).
There is still one thing that I do not understand, though. In the Wu & Estabrook paper, they mention that it is impossible to test for threshold and loading invariance simultaneously, because the resulting model is “statistically equivalent” to the baseline (i.e., configural) model (Section 5.3; Proposition 6). A screenshot of that specific part of the paper is attached to this post. For brevity, I will write “metric invariance” instead of “threshold and loading invariance” from now on.
Looking at the documentation of the semTools package, I see that the description of the measEq.syntax function addresses other untestable invariance constraints (e.g., Propositions 4 and 5 of Wu & Estabrook) on page 70:
"For binary data, there is no independent test of threshold, intercept, or residual-variance equality. Equivalence of thresholds must also be assumed for three-category indicators."
However, there is no mention of exactly what happens when testing for metric invariance with binary data. Following what Wu & Estabrook explain, I would expect a configural model and a model with metric invariance to be “statistically equivalent” (e.g., to yield the same chi-squared test statistic). Is this correct?
I was playing around with a simple (simulated) toy data set with two groups. It was generated with equal thresholds and loadings across the groups. I fitted a configural and a metric invariant model and compared them, but I noticed they are not “statistically equivalent,” at least not in terms of chi-squared values.
Am I misinterpreting what Wu & Estabrook stated in their paper? Or is semTools doing something else in the background? I am just trying to understand precisely what Wu & Estabrook meant by their "Proposition 6".
Attached to this post, you can find the toy dataset and the R script where I ran the two models. I am also pasting the script below in case that is easier to access.
Thank you very much for your help!
Best,
Andres Perez
Code:
# 2025-12-16
# Testing metric invariance with binary data using semTools
# Load libraries
library(lavaan)
library(semTools)
# Load data
load("binary_data.Rdata")
# Define the model
S1 <- '
# factor loadings
F1 =~ x1 + x2 + x3 + x4 + x5
F2 =~ z1 + z2 + z3 + z4 + z5
F3 =~ m1 + m2 + m3 + m4 + m5
F4 =~ y1 + y2 + y3 + y4 + y5
'
######################################
########### CONFIGURAL FIT ###########
######################################
S1.config <- as.character(
  semTools::measEq.syntax(configural.model = S1,
                          data = binary,
                          parameterization = "delta",
                          ordered = TRUE,
                          ID.fac = "std.lv",
                          ID.cat = "Wu",
                          group = "group")
)
fit.config <- cfa(model = S1.config,
                  data = binary,
                  group = "group",
                  ordered = TRUE)
######################################
## THRESHOLD AND LOADING INVARIANCE ##
######################################
S1.inv <- as.character(
  semTools::measEq.syntax(configural.model = S1,
                          data = binary,
                          parameterization = "delta",
                          ordered = TRUE,
                          ID.fac = "std.lv",
                          ID.cat = "Wu",
                          group = "group",
                          group.equal = c("thresholds", "loadings"))
)
fit.inv <- cfa(model = S1.inv,
               data = binary,
               group = "group",
               ordered = TRUE)
#################################
########### COMPARING ###########
#################################
fit.config
fit.inv
# summary(fit.config)
# summary(fit.inv)
fitmeasures(fit.config, c("chisq")) # chisq 219.339
fitmeasures(fit.inv, c("chisq")) # chisq 255.249
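To make the comparison more explicit than two printed chi-squared values, here is a minimal sketch of the check I have in mind. It assumes that "statistically equivalent" models should show the same degrees of freedom and (up to convergence tolerance) the same chi-squared. The helper name same_fit and the tolerance are my own hypothetical choices, not part of lavaan or semTools.

```r
# Hypothetical helper (not part of lavaan or semTools): checks whether two
# fits look statistically equivalent, i.e., equal df and near-equal chi-squared.
# fm0, fm1: named numeric vectors with elements "chisq" and "df",
# e.g., as returned by fitMeasures(fit, c("chisq", "df")).
same_fit <- function(fm0, fm1, tol = 1e-3) {
  d_chisq <- as.numeric(fm1["chisq"]) - as.numeric(fm0["chisq"])
  d_df    <- as.numeric(fm1["df"]) - as.numeric(fm0["df"])
  list(delta_chisq = d_chisq,
       delta_df    = d_df,
       equivalent  = isTRUE(d_df == 0) && abs(d_chisq) < tol)
}

# Usage with the two fits from the script above (requires lavaan):
# same_fit(fitMeasures(fit.config, c("chisq", "df")),
#          fitMeasures(fit.inv,    c("chisq", "df")))
```

A nonzero delta_df would suggest that the constrained model estimates fewer parameters than the configural one, in which case the two models would not be equivalent as specified. lavTestLRT(fit.config, fit.inv) should report the same comparison as a formal difference test.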