measEq.syntax (Wu & Estabrook) for binary items


Andres Perez

Dec 16, 2025, 6:17:31 AM
to lavaan
Hi everyone,

I have been working with some (simulated) categorical data for a while and greatly appreciate the developments in the measEq.syntax function, as it has made my life easier when applying the identification constraints from Wu & Estabrook (2016).

There is still one thing that I do not understand, though. In their paper, Wu & Estabrook mention that threshold and loading invariance cannot be tested simultaneously, as the resulting model is “statistically equivalent” to the baseline (i.e., configural) model (Section 5.3; Proposition 6). A screenshot of that specific part of the paper is attached to this post. For brevity, I will write “metric invariance” instead of “threshold and loading invariance” from now on.

Looking at the semTools documentation (page 70), I see that the description of the measEq.syntax function mentions other untestable invariance constraints (e.g., Propositions 4 and 5 of Wu & Estabrook):

"For binary data, there is no independent test of threshold, intercept, or residual-variance equality. Equivalence of thresholds must also be assumed for three-category indicators."

However, it does not say exactly what happens when testing for metric invariance with binary data. Following Wu & Estabrook's explanation, I would expect a configural model and a metric invariance model to be “statistically equivalent” (e.g., to have the same chi-squared test statistic). Is this correct?

I was playing around with a simple (simulated) toy data set with two groups. It was generated with equal thresholds and loadings across the groups. I fitted a configural and a metric invariant model and compared them, but I noticed they are not “statistically equivalent,” at least not in terms of chi-squared values. 

Am I misinterpreting what Wu & Estabrook stated in their paper? Or is semTools doing something else in the background? I am just trying to understand precisely what Wu & Estabrook meant by their "Proposition 6". 

Attached to this post, you can find the toy dataset and the R script where I ran the two models. I am also copying & pasting it here if that is easier to access.

Thank you very much for your help!

Best,
Andres Perez

Code:
# 2025-12-16
# Testing metric invariance with binary data using semTools

# Load libraries
library(lavaan)
library(semTools)

# Load data
load("binary_data.Rdata")

# Define the model
S1 <- '
    # factor loadings
    F1 =~ x1 + x2 + x3 + x4 + x5
    F2 =~ z1 + z2 + z3 + z4 + z5
    F3 =~ m1 + m2 + m3 + m4 + m5
    F4 =~ y1 + y2 + y3 + y4 + y5
'

######################################
########### CONFIGURAL FIT ###########
######################################
S1.config <- as.character(
  semTools::measEq.syntax(configural.model = S1,
                          dat              = binary,
                          parameterization = "delta",
                          ordered          = TRUE,
                          ID.fac           = "std.lv",
                          ID.cat           = "Wu",
                          group            = "group")
)

fit.config <- cfa(model       = S1.config,
                  data        = binary,
                  group       = "group",
                  ordered     = TRUE)

######################################
####### METRIC INVARIANT FIT #########
######################################

S1.inv <- as.character(
  semTools::measEq.syntax(configural.model = S1,
                          dat              = binary,
                          parameterization = "delta",
                          ordered          = TRUE,
                          ID.fac           = "std.lv",
                          ID.cat           = "Wu",
                          group            = "group",
                          group.equal      = c("thresholds", "loadings"))
)

fit.inv <- cfa(model       = S1.inv,
               data        = binary,
               group       = "group",
               ordered     = TRUE)

#################################
########### COMPARING ###########
#################################
fit.config
fit.inv

# summary(fit.config)
# summary(fit.inv)

fitmeasures(fit.config, c("chisq"))
# chisq 219.339
fitmeasures(fit.inv, c("chisq"))
# chisq 255.249


binary_data.Rdata
Screenshot 2025-12-16 104654.png
binary_test.R

Victoria Savalei

Dec 17, 2025, 4:50:43 PM
to lavaan

This paper is not easy to understand. What these Propositions state are identification conditions, that is, minimum sets of constraints for a model to be identified. For Proposition 6, I think they are pointing out that for binary data, there is a parameterization equivalent to the default configural model that has equal thresholds and equal loadings. In particular, if all factor loadings are set to 1 (and are thus invariant) and all thresholds are set to 0 (and are thus invariant), but we free all intercepts in all groups (so we have p*G of these estimates) and we free all residual variances in all groups (so we have p*G of these estimates), we get a model with equivalent fit (chi-square and df) to the configural model. See Table 4, the line that says T and Lambda.
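The algebra behind this equivalence can be sketched as follows. This is only an illustration under the usual probit setup for a binary item, with notation chosen here rather than taken from the paper: for item j in group g, tau is the threshold, lambda the loading, nu the latent-response intercept, theta the residual variance, and eta the factor score. It also assumes the configural loadings are positive and that the configural model fixes nu = 0 and theta = 1.

```latex
% Latent-response model for binary item j in group g (illustrative notation)
\begin{align}
y^{*}_{jg} &= \nu_{jg} + \lambda_{jg}\,\eta + \varepsilon_{jg},
  \qquad \varepsilon_{jg} \sim N(0, \theta_{jg}),
  \qquad y_{jg} = 1 \iff y^{*}_{jg} > \tau_{jg}, \\
P(y_{jg} = 1 \mid \eta)
  &= \Phi\!\left(\frac{\nu_{jg} - \tau_{jg} + \lambda_{jg}\,\eta}{\sqrt{\theta_{jg}}}\right).
\end{align}

% Configural model (assumed here: \nu_{jg} = 0, \theta_{jg} = 1)
\begin{equation}
P(y_{jg} = 1 \mid \eta) = \Phi\!\left(-\tau_{jg} + \lambda_{jg}\,\eta\right).
\end{equation}

% Alternative: thresholds 0 and loadings 1 in every group,
% intercepts \tilde{\nu}_{jg} and residual variances \tilde{\theta}_{jg} free
\begin{equation}
P(y_{jg} = 1 \mid \eta)
  = \Phi\!\left(\frac{\tilde{\nu}_{jg} + \eta}{\sqrt{\tilde{\theta}_{jg}}}\right).
\end{equation}

% The two coincide for every \eta when (requires \lambda_{jg} > 0)
\begin{equation}
\tilde{\theta}_{jg} = \frac{1}{\lambda_{jg}^{2}},
\qquad
\tilde{\nu}_{jg} = -\frac{\tau_{jg}}{\lambda_{jg}}.
\end{equation}
```

Only the ratios (nu - tau)/sqrt(theta) and lambda/sqrt(theta) enter the response probability, so each group's two configural parameters per item can be traded one-for-one for an intercept and a residual variance.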

To see this in lavaan, you would have to manually free the intercepts and residual variances in both groups, which currently have some default constraints applied to them (intercepts fixed to zero in at least one group, depending on how the loading and threshold constraints are imposed; residual variances fixed to 1 under the "theta" parameterization).

So threshold and loading constraints are not testable with binary data because the configural model can be "rotated" to imply that they are true, unless you make further assumptions about some of the other parameters. The differences between your two lavaan runs mean that lavaan has already made other assumptions (such as those above) that make these constraints testable. This is all rather esoteric. The biggest applied contribution of Wu & Estabrook is to point out that intercepts of underlying continuous indicators can be free once thresholds have been constrained.
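As a rough numerical illustration of this "rotation", the base-R sketch below checks that a single binary item gives the same response probabilities under both parameterizations. It does not use lavaan or the attached data: the threshold and loading values are made up, the loadings are assumed positive, and the configural model is assumed to fix the intercept at 0 and the residual variance at 1.

```r
# Hypothetical configural estimates for ONE binary item in two groups
# (assumed: intercept = 0 and residual variance = 1 in each group).
tau    <- c(g1 = -0.40, g2 = 0.25)  # made-up group-specific thresholds
lambda <- c(g1 =  0.80, g2 = 1.30)  # made-up group-specific loadings (> 0)

eta <- seq(-3, 3, by = 0.5)         # grid of factor scores

# P(y = 1 | eta) under the configural parameterization
p_config <- sapply(names(tau), function(g) {
  pnorm(-tau[[g]] + lambda[[g]] * eta)
})

# Equivalent parameterization: threshold 0 and loading 1 in BOTH groups,
# with group-specific intercepts and residual variances instead.
nu    <- -tau / lambda
theta <- 1 / lambda^2

p_equal <- sapply(names(tau), function(g) {
  pnorm((nu[[g]] + 1 * eta - 0) / sqrt(theta[[g]]))
})

# The two sets of response probabilities agree up to rounding error
all.equal(p_config, p_equal)
```

Because the implied probabilities are identical, equal thresholds and loadings cannot be distinguished from the configural model unless other parameters (here the intercepts and residual variances) are restricted.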
