default lavaan vs. Wu & Estabrook (2016) or Millsap & Tein (2004) for multiple-group measurement invariance


ahmad

Nov 19, 2025, 7:36:30 AM
to lavaan
Hi,

I want to conduct multiple-group measurement and structural models with categorical data (two latent factors with 7 categories [highly skewed] and one with 3 categories) using the WLSMV estimator. I tried the default options in lavaan (group.equal = c("thresholds", "loadings", "intercepts")), and I also tested the Wu & Estabrook (2016) approach with ID.fac = "std.lv" and the Millsap & Tein (2004) approach using the measEq.syntax function. I have the following questions:

1. Can I use the default lavaan approach for my categorical model, specifying group.equal in the appropriate sequence for categorical indicators and fixing the marker method (e.g., configural → group = "sex"; then group.equal = "thresholds"; then group.equal = c("thresholds", "loadings"); then group.equal = c("thresholds", "loadings", "intercepts")), instead of using the newer identification approaches such as Wu & Estabrook (2016) or Millsap & Tein (2004)?
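For concreteness, the sequence described in question 1 can be sketched as follows. This is a hypothetical minimal example: the factor structure, indicator names, data frame `dat`, and grouping variable are placeholders, not the poster's actual model.

```r
library(lavaan)

model <- '
  F1 =~ y1 + y2 + y3 + y4
  F2 =~ y5 + y6 + y7
'

# Configural model: no cross-group equality constraints
fit.config <- cfa(model, data = dat, group = "sex",
                  ordered = TRUE, estimator = "WLSMV")

# Add equality constraints cumulatively, in the order appropriate
# for categorical indicators (thresholds first)
fit.thresh <- cfa(model, data = dat, group = "sex",
                  ordered = TRUE, estimator = "WLSMV",
                  group.equal = "thresholds")
fit.metric <- cfa(model, data = dat, group = "sex",
                  ordered = TRUE, estimator = "WLSMV",
                  group.equal = c("thresholds", "loadings"))
fit.scalar <- cfa(model, data = dat, group = "sex",
                  ordered = TRUE, estimator = "WLSMV",
                  group.equal = c("thresholds", "loadings", "intercepts"))

# Compare the nested models with (scaled) difference tests
lavTestLRT(fit.config, fit.thresh, fit.metric, fit.scalar)
```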

2. When I compared the three methods, the results were exactly the same for the default lavaan specification (using group.equal manually) and the Wu & Estabrook (2016) method with ID.fac = "std.lv", but both differed from the Millsap & Tein (2004) approach. Why did the first two methods yield identical results?

3. When I used measEq.syntax for Wu & Estabrook (2016) and Millsap & Tein (2004), I did not obtain robust versions of the fit indices (these were NA; only the standard and scaled versions were available). However, when I used group.equal manually, robust fit indices were provided. Why does this happen?

Best wishes,
A

Yves Rosseel

Nov 20, 2025, 11:31:25 AM
to lav...@googlegroups.com
On 11/19/25 13:36, ahmad wrote:
> 1. Can I use the default lavaan approach for my categorical model,
> specifying group.equal in the appropriate sequence for categorical
> indicators and fixing the marker method (e.g., configural → group =
> "sex"; then group.equal = "thresholds"; then group.equal =
> c("thresholds", "loadings"); then group.equal = c("thresholds",
> "loadings", "intercepts"))

I would say: yes.

> 2. When I compared the three methods, the results were exactly the same
> for the default lavaan specification (using group.equal manually) and
> the Wu & Estabrook (2016) method with ID.fac = "std.lv"

Indeed. This has changed in recent versions of lavaan. The current
version (0.6-20) will 'mimic' the Wu & Estabrook (2016) approach by default.

> 3. When I used measEq.syntax for Wu & Estabrook (2016) and Millsap &
> Tein (2004), I did not obtain robust versions of the fit indices (these
> were NA; only the standard and scaled versions were available). However,
> when I used group.equal manually, robust fit indices were provided. Why
> does this happen?

Even if you use ID.fac = "std.lv" and get identical results/fit
measures? Then I don't know.

The 'usual' reason why you may get NA for the robust fit indices (in the
categorical case) is that the polychoric correlation matrix is not
positive definite, which unfortunately happens a lot. For the moment, we
have no way around it.

Yves.

--
Yves Rosseel
Department of Data Analysis, Ghent University


ahmad

Nov 20, 2025, 9:42:24 PM
to lavaan
Thank you for your response, Prof. Rosseel.

Following up on the issue of obtaining NA values for robust fit indices when using measEq.syntax (versus using the default group.equal argument), I tried writing the syntax manually (rather than using the syntax generated by measEq.syntax), and I again obtained NA values, just as with measEq.syntax.

You asked whether the NA values also occur when using ID.fac = "std.lv". Initially, I used the default marker-method identification when specifying constraints manually with the group.equal argument, while I used ID.fac = "std.lv" for the Wu and Estabrook (2016) approach via measEq.syntax. I have now rerun the model using the default approach with std.lv = TRUE, and I obtained the same results. Although I am using lavaan.mi with multiply imputed data, it seems that whenever the syntax is user-defined, either written manually or generated by measEq.syntax, the model behaves differently. In my experience, in many such cases the model fails to converge, produces warnings, or returns NA values for the robust fit indices. I am not sure, but there may be a bug (see below for the syntax and corresponding results).

For example, when comparing the default method and measEq.syntax using the syntax below, the default method produced robust fit indices, whereas measEq.syntax did not. As you can see, the model fit indices (both standard and scaled) were exactly the same, and all factor loadings, thresholds, intercepts, variances, etc. were identical. This seems inconsistent with your suggestion that the most common reason for NA robust fit indices is a non-positive definite polychoric correlation matrix; if that were the case, we should also have seen NA values when using the default method. NA robust fit indices are a very common problem, and my results suggest the issue may stem from writing the syntax (manually or via measEq.syntax) versus using the default methods in lavaan, rather than from model mis-specification.
[screenshots attached: default1.PNG, default.PNG, measEq.PNG, measEq2.PNG]

I also have an additional question regarding this topic, and I would greatly appreciate your insight. I plan to conduct multiple-group SEM measurement invariance testing and then compare regression paths (direct, indirect, and total effects) across groups. In this context, I am unsure whether the marker method is compatible with Wu and Estabrook (2016), and which identification method (marker vs. std.lv) is preferable for categorical indicators for this purpose. When I used the marker method with Wu and Estabrook (2016) through measEq.syntax, the model did not converge. However, when I used the same model with ID.fac = "std.lv", it converged without any issues (though with NA for robust fit indices), and when I used the marker method with the default approach, the model also converged without issue (with robust fit indices).

Best wishes,
A

Terrence Jorgensen

Nov 21, 2025, 4:46:52 AM
to lavaan
> 2. When I compared the three methods, the results were exactly the same
> for the default lavaan specification (using group.equal manually) and
> the Wu & Estabrook (2016) method with ID.fac = "std.lv"

> Indeed. This has changed in recent versions of lavaan. The current
> version (0.6-20) will 'mimic' the Wu & Estabrook (2016) approach by default.

Hooray!  
But to be clear for anyone reading, this would only be true for invariance across the group= variable.  To (additionally) evaluate invariance across repeated/dependent measures (e.g., multiple occasions), the semTools::measEq.syntax() function will still help write the correct lavaan syntax.
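As a hypothetical sketch of that use case (the factor and indicator names, the `longFacNames` mapping, and the data frame `dat` are made up for illustration, not taken from the thread):

```r
library(semTools)

# Same construct measured at two occasions (wide-format data)
mod <- '
  FU1 =~ y1.t1 + y2.t1 + y3.t1
  FU2 =~ y1.t2 + y2.t2 + y3.t2
'

# longFacNames tells measEq.syntax() that FU1 and FU2 are the same factor
# over time; long.equal requests equality across occasions, group.equal
# (additionally) across groups
syntax.metric <- measEq.syntax(configural.model = mod, data = dat,
                               ordered = TRUE,
                               parameterization = "delta",
                               ID.fac = "std.lv",
                               ID.cat = "Wu.Estabrook.2016",
                               longFacNames = list(FU = c("FU1", "FU2")),
                               long.equal = c("thresholds", "loadings"),
                               group = "sex",
                               group.equal = c("thresholds", "loadings"),
                               return.fit = TRUE)
summary(syntax.metric)
```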

Terrence D. Jorgensen    (he, him, his)
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam
http://www.uva.nl/profile/t.d.jorgensen


Terrence Jorgensen

Nov 21, 2025, 5:32:07 AM
to lavaan
> Following up on the issue of obtaining NA values for robust fit indices when using measEq.syntax

This has nothing to do with how you write your syntax.
You are using lavaan.mi to fit your models, whose results have a mean- and variance-adjusted test statistic.  This already complicates the calculation of robust (i.e., population-consistent estimates of) fit indices when using ML estimation:


However, the solution above is not applicable to the 2-stage estimation algorithm we call the (categorical) diagonally weighted least-squares (DWLS) estimator.  That solution is proposed here:


It involves fitting the model again using an ML estimator, but fixing the parameters to their DWLS estimates.  I have not implemented this extra step for models fitted to multiple imputations, nor am I certain how easily this method can be implemented in that situation.

Another option is to pool your saturated model and fit all your hypothesized (e.g., invariance) models to those pooled results.  That yields a single SEM result that does not need pooling; thus, you have a lavaan object (not lavaan.mi) and you will get robust fit indices.  You can find an example and references to read on the ?semTools::poolSat help page.  Note that (changes in) fit indices should not be the basis of your statistical decisions about whether invariance is violated.



Something I noticed in your output is that the default baseline model (used to calculate CFI and TLI) has the same df=72 as the target model (but obviously it fits worse).  That means they are not nested models, so the CFI and TLI are not meaningful:



To evaluate an SEM with equivalent thresholds using CFI (or any incremental fit index), an appropriate baseline model should also have equivalent thresholds.  It is an unfortunate oversight that I never included such a step in the ?measEq.syntax help page, probably because I don't recommend using (changes in) fit indices to test invariance hypotheses.
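A sketch of how such a step could look, under the following assumptions: `fit.target` is the fitted model with equal thresholds, and `fit.base` is an independence model (no factor structure) that the user has separately fitted with group.equal = "thresholds", so that it is nested in the target model. lavaan's fitMeasures() accepts a user-supplied baseline via its baseline.model argument.

```r
# fit.base   - hypothetical user-fitted independence model with
#              thresholds constrained equal across groups
# fit.target - hypothetical target model with equal thresholds
fitMeasures(fit.target, c("cfi.scaled", "tli.scaled"),
            baseline.model = fit.base)
```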


> I plan to conduct multiple-group SEM measurement invariance testing and then compare regression paths (direct, indirect, and total effects) across groups.

Invariant thresholds and loadings (metric invariance) are sufficient for such comparisons.
 
> In this context, I am unsure whether the marker method is compatible with Wu and Estabrook (2016), and which identification method (marker vs. std.lv) is preferable for categorical indicators for this purpose.

They should yield statistically equivalent results.  Only the Millsap & Tein (2004) approach might be problematic (in certain circumstances), due to the constraints they recommend imposing relative to the marker variable.


> When I used the marker method with Wu and Estabrook (2016) through measEq.syntax, the model did not converge. However, when I used the same model with ID.fac = "std.lv", it converged without any issues (with NA for robust fit indices), whereas when I used the same model with the marker method with the default approach, the model converged without any issue (with robust fit indices).

Without the syntax or output, I can't judge why these inconsistencies would occur in your particular (multiply imputed) data.

ahmad

Nov 21, 2025, 2:10:41 PM
to lavaan
Thank you for your response, Dr. Jorgensen.

I noticed that not only in the scalar measurement invariance model, but also in my mediation model estimated with the total sample (not multiple-group SEM), the user model has more degrees of freedom than the baseline model. In the mediation model, this appears to be due to including covariates, since the model without covariates has fewer df than the baseline model. Is this problematic in the same way it is for measurement invariance? How should a baseline model be defined in this context? I could not find an example in lavaan or semTools. If I define a custom baseline model, should covariates be included there as well?

You mentioned that “to evaluate an SEM with equivalent thresholds using CFI (or any incremental fit index), an appropriate baseline model should also have equivalent thresholds.” My question is why, when I use the MLR estimator instead of WLSMV with ordered, the baseline-model degrees of freedom in lavaan do not change. In both cases the baseline model has df = 72, while the user model has df = 56 with MLR and df = 72 with WLSMV.

Regarding your comment that you cannot judge the inconsistencies without syntax or output, please see the syntax and results below.
[screenshots attached: 1.PNG, 2.PNG]
Best wishes,
A

ahmad

Nov 21, 2025, 9:38:28 PM
to lavaan
Following up on my previous post, I realised why the degrees of freedom (df) for the user model were larger than those of the baseline model in my SEM. When covariates are included with fixed.x = TRUE, lavaan does not modify the baseline (independence) model to account for relationships among the covariates. As a result, the baseline model's df remains the same as when no covariates are included, while the df for the user model increases because additional parameters are estimated. However, when I explicitly include the covariates in the model and specify their covariances, the baseline model's df becomes larger than the user model's df, which is the expected ordering.
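A hypothetical sketch of the explicit-covariate specification described above (the variable names X, M, Y, cov1, cov2 and the data frame `dat` are placeholders):

```r
library(lavaan)

model <- '
  # structural part of a simple mediation model with two covariates
  M ~ X + cov1 + cov2
  Y ~ M + X + cov1 + cov2

  # bring the covariates into the model explicitly
  cov1 ~~ cov1 + cov2
  cov2 ~~ cov2
'

# fixed.x = FALSE frees the covariate (co)variances, so the baseline
# (independence) model also accounts for them, restoring the expected
# df ordering between the baseline and user models
fit <- sem(model, data = dat, fixed.x = FALSE)
```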

I understand that your recommendation is generally to avoid modeling covariates directly (i.e., keep fixed.x = TRUE). My question is: what should be done when fixed.x = TRUE leads to the baseline model having fewer df than the user model?

Best wishes,

A
