Replicating measurementInvarianceCat() function using lavaan syntax


christophe...@nicholls.edu

Jul 6, 2018, 7:22:09 PM
to lavaan
Hi folks,

I have been looking for guidance on replicating the measurementInvarianceCat() function using lavaan syntax but have come up empty-handed, so I was hoping someone could help me out.

To better understand what the function is doing under the hood, I'm trying to reproduce each of its models in plain lavaan syntax.

Suppose I have a simple one factor model:
model1.cfa <- '
  CM =~ CM1 + CM2 + CM3 + CM4
'

I could, of course, call the measurementInvarianceCat() function directly, like so:
mi <- measurementInvarianceCat(model = model1.cfa, ordered = c("CM1","CM2","CM3","CM4","CM5","CM6","CM7","CM8"), 
                         data = data1, group="COND", parameterization = "theta", estimator = "dwls", information = "expected")
...to get the configural through equal means models. 

How would I have to modify the syntax above to derive the models tested by the measurementInvarianceCat() function?

Thanks in advance,

Chris

Terrence Jorgensen

Jul 9, 2018, 10:46:43 AM
to lavaan
> How would I have to modify the syntax above to derive the models tested by the measurementInvarianceCat() function?

The author of that function used to host the attached materials on his website. The R syntax shows the models that are fit by the measurementInvarianceCat() function. The method is based on Millsap & Tein (2004).

But Wu & Estabrook (2016) recently gave some good reasons not to bother yourself with the particular specifications Millsap & Tein recommended, which were motivated by identifying models with a simple nesting structure.
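
For readers who would rather not hand-write either identification scheme, later semTools releases include measEq.syntax(), which can generate the multigroup syntax. The following is only a minimal sketch under stated assumptions: it assumes a semTools version that provides measEq.syntax() is installed, and the simulated data frame `dat`, the helper `to_cat()`, and the group labels are made up for illustration (only the item names and `COND` echo the question above).

```r
# Illustrative data only: two groups answering four 4-category items
set.seed(123)
n <- 600
eta <- rnorm(n)  # common factor scores
to_cat <- function(y) as.integer(cut(y, breaks = c(-Inf, -0.8, 0, 0.8, Inf)))
dat <- data.frame(
  CM1  = to_cat(0.8 * eta + rnorm(n, sd = 0.6)),
  CM2  = to_cat(0.7 * eta + rnorm(n, sd = 0.7)),
  CM3  = to_cat(0.6 * eta + rnorm(n, sd = 0.8)),
  CM4  = to_cat(0.7 * eta + rnorm(n, sd = 0.7)),
  COND = rep(c("control", "treatment"), each = n / 2)
)

model1.cfa <- ' CM =~ CM1 + CM2 + CM3 + CM4 '

# Only runs when semTools is available
if (requireNamespace("semTools", quietly = TRUE)) {
  # Thresholds constrained equal across groups, identified as in Wu & Estabrook (2016)
  syn_thresh <- semTools::measEq.syntax(
    configural.model = model1.cfa, data = dat,
    ordered = paste0("CM", 1:4), parameterization = "theta",
    ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
    group = "COND", group.equal = "thresholds"
  )
  cat(as.character(syn_thresh))  # print the generated lavaan model syntax
}
```

Printing the generated syntax is a convenient way to see exactly which parameters are fixed, freed, or equated at each step. If the installed version supports it, ID.cat = "Millsap.Tein.2004" should produce the older scheme for comparison.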


Terrence D. Jorgensen
Postdoctoral Researcher, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

catInvariance.docx
catInvariance2c.R
catInvariance5c.R
example2c.csv
example5c.csv

Christopher Castille

Jul 10, 2018, 11:01:45 AM
to lav...@googlegroups.com
You’re awesome, Terrence!

Just applied your code to the data you passed along and my own while also comparing to the measurementInvarianceCat() function. The chi-square values are not exactly the same. Should I ignore this discrepancy?

Chris


Christopher Castille

Jul 10, 2018, 11:17:46 AM
to lav...@googlegroups.com
One more question. In my specific context, we learned (unfortunately, after the data were collected) that a particular measurement model is best specified as a bifactor model. Are there any unique considerations I need to make for modeling a bifactor (in this case, 20 variables with two uncorrelated factors explaining 10 items apiece and a broader bifactor)?


christophe...@nicholls.edu

Jul 10, 2018, 2:48:40 PM
to lavaan
Just noting here that I was able to run the model seemingly without difficulty. 



christophe...@nicholls.edu

Jul 10, 2018, 2:58:58 PM
to lavaan
One error did pop up when I wrote out my configural model:
Error in qr.default(t(ceq.JAC)) : 
  NA/NaN/Inf in foreign function call (arg 1)

I looked this up, and apparently some kind of 0/0 error is being generated by the syntax, but I'm not sure where it lies.

I apologize for the long bit of text here. The lavaan syntax and model are below. Please advise. 

```{r}
Configural.cfa <- '
#For each factor, the factor loading of one marker variable is fixed (usually as 1). Other factor loadings are freely estimated across groups.
##Substantive factors
PP =~ c(1, 1)*PP1 + PP2 + PP3 + PP4 + PP5 + PP6 + PP7 + PP8 + PP9 + PP10
IRB =~ c(1, 1)*IRB1 + IRB2 + IRB3 + IRB4 + IRB5 + IRB6 + IRB7
OCBI =~ c(1, 1)*OCBI1 + OCBI2 + OCBI3 + OCBI4 + OCBI5 + OCBI6 + OCBI7
OCBO =~ c(1, 1)*OCBO1 + OCBO2 + OCBO3 + OCBO4 + OCBO5 + OCBO6 + OCBO7
##Method factors
CM =~ c(1, 1)*CM1 + CM2 + CM3 + CM4 + CM5 + CM6 + CM7 + CM8
PA =~ c(1, 1)*PA1 + PA2 + PA3 + PA4 + PA5 + PA6 + PA7 + PA8 + PA9 + PA10 
Na =~ c(1, 1)*NA1 + NA2 + NA3 + NA4 + NA5 + NA6 + NA7 + NA8 + NA9 + NA10 
AP =~ c(1, 1)*PA1 + PA2 + PA3 + PA4 + PA5 + PA6 + PA7 + PA8 + PA9 + PA10 + NA1 + NA2 + NA3 + NA4 + NA5 + NA6 + NA7 + NA8 + NA9 + NA10 
NegIW =~ c(1, 1)*IRB6 + IRB7 + OCBO3 + OCBO4 + OCBO5

#One threshold of each variable is constrained to equality across groups. One additional threshold of the marker variable is also constrained across groups; all other thresholds are free. In other words, two thresholds are constrained for each marker variable and one threshold for every other variable.
#Note: Some variables have fewer thresholds because of low endorsement of certain response categories.
##Substantive factors
###Proactive Personality
PP1 | c(t1a,t1a)*t1 + c(t1b,t1b)*t2 + t3 + t4
PP2 | c(t2a,t2a)*t1 + t2 + t3
PP3 | c(t3a,t3a)*t1 + t2 + t3 + t4
PP4 | c(t4a,t4a)*t1 + t2 + t3 + t4
PP5 | c(t5a,t5a)*t1 + t2 + t3 + t4
PP6 | c(t6a,t6a)*t1 + t2 + t3 
PP7 | c(t7a,t7a)*t1 + t2 + t3 + t4
PP8 | c(t8a,t8a)*t1 + t2 + t3 + t4
PP9 | c(t9a,t9a)*t1 + t2 + t3 + t4
PP10 | c(t10a,t10a)*t1 + t2 + t3
###In-Role Behavior. IRB6 is also included because it serves as the marker for NegIW.
IRB1 | c(t11a,t11a)*t1 + c(t11b,t11b)*t2 + t3
IRB2 | c(t12a,t12a)*t1 + t2 + t3
IRB3 | c(t13a,t13a)*t1 + t2 + t3
IRB4 | c(t14a,t14a)*t1 + t2 + t3 
IRB5 | c(t15a,t15a)*t1 + t2 + t3 + t4
IRB6 | c(t16a,t16a)*t1 + t2 + t3
IRB7 | c(t17a,t17a)*t1 + t2 + t3 + t4
###OCBI
OCBI1 | c(t18a,t18a)*t1 + c(t18b,t18b)*t2 + t3
OCBI2 | c(t19a,t19a)*t1 + t2 + t3
OCBI3 | c(t20a,t20a)*t1 + t2 + t3 + t4
OCBI4 | c(t21a,t21a)*t1 + t2 + t3 + t4
OCBI5 | c(t22a,t22a)*t1 + t2 + t3 + t4
OCBI6 | c(t23a,t23a)*t1 + t2 + t3 + t4
OCBI7 | c(t24a,t24a)*t1 + t2 + t3
###OCBO
OCBO1 | c(t25a,t25a)*t1 + c(t25b,t25b)*t2 + t3 + t4
OCBO2 | c(t26a,t26a)*t1 + t2 + t3
OCBO3 | c(t27a,t27a)*t1 + t2 + t3 + t4
OCBO4 | c(t28a,t28a)*t1 + t2 + t3
OCBO5 | c(t29a,t29a)*t1 + t2 + t3 + t4
OCBO6 | c(t30a,t30a)*t1 + t2 + t3
OCBO7 | c(t31a,t31a)*t1 + t2 + t3
##Method factors
###CM
CM1 | c(t32a,t32a)*t1 + c(t32b,t32b)*t2 + t3 + t4
CM2 | c(t33a,t33a)*t1 + t2 + t3 + t4
CM3 | c(t34a,t34a)*t1 + t2 + t3 + t4
CM4 | c(t35a,t35a)*t1 + t2 + t3 + t4
CM5 | c(t36a,t36a)*t1 + t2 + t3 + t4
CM6 | c(t37a,t37a)*t1 + t2 + t3 + t4
CM7 | c(t38a,t38a)*t1 + t2 + t3 + t4
CM8 | c(t39a,t39a)*t1 + t2 + t3 + t4
###Positive Affectivity
PA1 | c(t40a,t40a)*t1 + c(t40b,t40b)*t2 + t3 + t4
PA2 | c(t41a,t41a)*t1 + t2 + t3 + t4
PA3 | c(t42a,t42a)*t1 + t2 + t3 + t4
PA4 | c(t43a,t43a)*t1 + t2 + t3 + t4
PA5 | c(t44a,t44a)*t1 + t2 + t3 + t4
PA6 | c(t45a,t45a)*t1 + t2 + t3 + t4
PA7 | c(t46a,t46a)*t1 + t2 + t3 + t4
PA8 | c(t47a,t47a)*t1 + t2 + t3 + t4
PA9 | c(t48a,t48a)*t1 + t2 + t3 + t4
PA10 | c(t49a,t49a)*t1 + t2 + t3 + t4
###Negative Affectivity #Note: Some variables have fewer thresholds because of low endorsement of certain response categories.
NA1 | c(t50a,t50a)*t1 + c(t50b,t50b)*t2 + t3 + t4
NA2 | c(t51a,t51a)*t1 + t2 + t3
NA3 | c(t52a,t52a)*t1 + t2 + t3
NA4 | c(t53a,t53a)*t1 + t2 + t3 
NA5 | c(t54a,t54a)*t1 + t2 + t3
NA6 | c(t55a,t55a)*t1 + t2 + t3
NA7 | c(t56a,t56a)*t1 + t2 + t3
NA8 | c(t57a,t57a)*t1 + t2 + t3 + t4
NA9 | c(t58a,t58a)*t1 + t2 + t3
NA10 | c(t59a,t59a)*t1 + t2 + t3

#Factor variances and covariances are freely estimated.
##Factor variances freely estimated. 
###Substantive factors
PP ~~ NA*PP
IRB ~~ NA*IRB
OCBI ~~ NA*OCBI
OCBO ~~ NA*OCBO
###Method factors
CM ~~ NA*CM
PA ~~ NA*PA
Na ~~ NA*Na
AP ~~ NA*AP
NegIW ~~ NA*NegIW
##Factor covariances freely estimated (with the exception being the bifactors and, also, PA and NA are uncorrelated)
PP ~~ NA*IRB
PP ~~ NA*OCBI
PP ~~ NA*OCBO
PP ~~ NA*CM
PP ~~ NA*PA
PP ~~ NA*Na
PP ~~ NA*AP
PP ~~ NA*NegIW
IRB ~~ NA*OCBI
IRB ~~ NA*OCBO
IRB ~~ NA*CM
IRB ~~ NA*PA
IRB ~~ NA*Na
IRB ~~ NA*AP
IRB ~~ 0*NegIW
OCBI ~~ NA*OCBO
OCBI ~~ NA*CM
OCBI ~~ NA*PA
OCBI ~~ NA*Na
OCBI ~~ NA*AP
OCBI ~~ NA*NegIW
OCBO ~~ NA*CM
OCBO ~~ NA*PA
OCBO ~~ NA*Na
OCBO ~~ NA*AP
OCBO ~~ 0*NegIW
CM ~~ NA*PA
CM ~~ NA*Na
CM ~~ NA*AP
CM ~~ NA*NegIW
PA ~~ 0*Na
PA ~~ 0*AP
PA ~~ NA*NegIW
Na ~~ 0*AP
Na ~~ NA*NegIW

#Factor means of the first group are fixed as 0. The factor means of the other groups are freely estimated. 
##Substantive factors
PP ~ c(0, NA)*1
IRB ~ c(0, NA)*1
OCBI ~ c(0, NA)*1
OCBO ~ c(0, NA)*1
##Method factors
CM ~ c(0, NA)*1
PA ~ c(0, NA)*1
Na ~ c(0, NA)*1
AP ~ c(0, NA)*1
NegIW ~ c(0, NA)*1

#The unique variances of the first group are fixed as 1. The unique variances of other groups are freely estimated.
PP1 ~~ c(1,NA)*PP1
PP2 ~~ c(1,NA)*PP2
PP3 ~~ c(1,NA)*PP3
PP4 ~~ c(1,NA)*PP4
PP5 ~~ c(1,NA)*PP5
PP6 ~~ c(1,NA)*PP6
PP7 ~~ c(1,NA)*PP7
PP8 ~~ c(1,NA)*PP8
PP9 ~~ c(1,NA)*PP9
PP10 ~~ c(1,NA)*PP10
###In-Role Behavior
IRB1 ~~ c(1,NA)*IRB1
IRB2 ~~ c(1,NA)*IRB2
IRB3 ~~ c(1,NA)*IRB3
IRB4 ~~ c(1,NA)*IRB4
IRB5 ~~ c(1,NA)*IRB5
IRB6 ~~ c(1,NA)*IRB6
IRB7 ~~ c(1,NA)*IRB7
###OCBI
OCBI1 ~~ c(1,NA)*OCBI1
OCBI2 ~~ c(1,NA)*OCBI2
OCBI3 ~~ c(1,NA)*OCBI3
OCBI4 ~~ c(1,NA)*OCBI4
OCBI5 ~~ c(1,NA)*OCBI5
OCBI6 ~~ c(1,NA)*OCBI6
OCBI7 ~~ c(1,NA)*OCBI7
###OCBO
OCBO1 ~~ c(1,NA)*OCBO1
OCBO2 ~~ c(1,NA)*OCBO2
OCBO3 ~~ c(1,NA)*OCBO3
OCBO4 ~~ c(1,NA)*OCBO4
OCBO5 ~~ c(1,NA)*OCBO5
OCBO6 ~~ c(1,NA)*OCBO6
OCBO7 ~~ c(1,NA)*OCBO7
##Method factors
###CM
CM1 ~~ c(1,NA)*CM1
CM2 ~~ c(1,NA)*CM2
CM3 ~~ c(1,NA)*CM3
CM4 ~~ c(1,NA)*CM4
CM5 ~~ c(1,NA)*CM5
CM6 ~~ c(1,NA)*CM6
CM7 ~~ c(1,NA)*CM7
CM8 ~~ c(1,NA)*CM8
###Positive Affectivity
PA1 ~~ c(1,NA)*PA1
PA2 ~~ c(1,NA)*PA2
PA3 ~~ c(1,NA)*PA3
PA4 ~~ c(1,NA)*PA4
PA5 ~~ c(1,NA)*PA5
PA6 ~~ c(1,NA)*PA6
PA7 ~~ c(1,NA)*PA7
PA8 ~~ c(1,NA)*PA8
PA9 ~~ c(1,NA)*PA9
PA10 ~~ c(1,NA)*PA10
###Negative Affectivity
NA1 ~~ c(1,NA)*NA1
NA2 ~~ c(1,NA)*NA2
NA3 ~~ c(1,NA)*NA3
NA4 ~~ c(1,NA)*NA4
NA5 ~~ c(1,NA)*NA5
NA6 ~~ c(1,NA)*NA6
NA7 ~~ c(1,NA)*NA7
NA8 ~~ c(1,NA)*NA8
NA9 ~~ c(1,NA)*NA9
NA10 ~~ c(1,NA)*NA10

##Free residual covariances
#Synonyms and Antonyms
CM1~~CM2
CM3~~CM4
CM5~~CM6
CM7~~CM8
'

configural <- cfa(Configural.cfa, ordered = c("PA1", "PA2", "PA3", "PA4", "PA5", "PA6", "PA7", "PA8", 
                                              "PA9", "PA10", "NA1", "NA2", "NA3", "NA4", "NA5", "NA6", 
                                              "NA7", "NA8", "NA9", "NA10", "Mood_T1", "PP1", "PP2", "PP3", 
                                              "PP4", "PP5", "PP6", "PP7", "PP8", "PP9", "PP10", 
                                              "IRB1", "IRB2", "IRB3", "IRB4", "IRB5", "IRB6", "IRB7", "OCBI1",
                                              "OCBI2", "OCBI3", "OCBI4", "OCBI5", "OCBI6", "OCBI7", "OCBO1", 
                                              "OCBO2", "OCBO3", "OCBO4", "OCBO5", "OCBO6", "OCBO7","CM1","CM2","CM3","CM4","CM5","CM6","CM7","CM8"),
                  group = "COND", data = data1, estimator = "DWLS", information = "expected", std.lv=TRUE)
```
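
The threshold and unique-variance blocks in a model like the one above are repetitive enough to generate rather than type, which also reduces the chance of a stray typo. The following is a minimal base-R sketch, not something from the original post: the helper names `make_threshold_syntax()` and `make_residual_syntax()` are hypothetical, and the labelling scheme (`t<i>a`, `t<i>b`) simply mimics the hand-written syntax.

```r
# Build threshold lines: the first threshold of every item is equated across groups;
# marker items also get their second threshold equated.
make_threshold_syntax <- function(items, n_thresh, markers, n_groups = 2) {
  stopifnot(length(items) == length(n_thresh))
  lines <- vapply(seq_along(items), function(i) {
    # Same label repeated once per group, e.g. c(t1a,t1a)
    lab <- function(suffix) {
      paste0("c(", paste(rep(paste0("t", i, suffix), n_groups), collapse = ","), ")")
    }
    terms <- paste0("t", seq_len(n_thresh[i]))
    terms[1] <- paste0(lab("a"), "*", terms[1])
    if (items[i] %in% markers && n_thresh[i] >= 2) {
      terms[2] <- paste0(lab("b"), "*", terms[2])
    }
    paste(items[i], "|", paste(terms, collapse = " + "))
  }, character(1))
  paste(lines, collapse = "\n")
}

# Build unique-variance lines: fixed to 1 in group 1, free in the remaining groups
make_residual_syntax <- function(items, n_groups = 2) {
  free_part <- paste(rep("NA", n_groups - 1), collapse = ",")
  paste(paste0(items, " ~~ c(1,", free_part, ")*", items), collapse = "\n")
}

# Example with three of the items and their numbers of thresholds
items      <- c("PP1", "PP2", "PP3")
n_thresh   <- c(4, 3, 4)
thr_syntax <- make_threshold_syntax(items, n_thresh, markers = "PP1")
res_syntax <- make_residual_syntax(items)
cat(thr_syntax, res_syntax, sep = "\n")
```

The resulting strings can be pasted together with the loading and factor (co)variance lines using paste() before being passed to cfa().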

Terrence Jorgensen

Jul 11, 2018, 5:18:38 AM
to lavaan
> The chi-square values are not exactly the same. Should I ignore this discrepancy?

Are the df the same?  When you compare the lavInspect(fit, "free") output between the two fitted models, are there any differences?

In your syntax in a later post, you did not set parameterization = "theta", which is what measurementInvarianceCat() does.
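
One way to make that comparison systematic is to diff the fixed/free pattern of each parameter matrix. The sketch below is an illustration only: `compare_free()` is a hypothetical helper, and the two small lists stand in for one group's element of lavInspect(fit, "free") from each fitted model (with multiple groups, lavInspect() returns one such list per group).

```r
# Compare the fixed/free pattern of two sets of parameter matrices.
# Nonzero entries mark free parameters, as in lavInspect(fit, "free").
compare_free <- function(free_a, free_b) {
  shared <- intersect(names(free_a), names(free_b))
  out <- lapply(shared, function(m) {
    a <- free_a[[m]] != 0
    b <- free_b[[m]] != 0
    if (!identical(dim(a), dim(b))) return(NA)  # not comparable
    which(a != b, arr.ind = TRUE)               # cells free in one model only
  })
  names(out) <- shared
  out
}

# Mock input: model B frees one residual covariance that model A leaves fixed
theta_b <- diag(0, 4)
theta_b[2, 1] <- theta_b[1, 2] <- 4
free_a <- list(lambda = matrix(c(0, 1, 2, 3), 4, 1), theta = diag(0, 4))
free_b <- list(lambda = matrix(c(0, 1, 2, 3), 4, 1), theta = theta_b)

res <- compare_free(free_a, free_b)
print(res$theta)  # row/column positions where the two patterns differ
```

With real fits, something like compare_free(lavInspect(fitA, "free")[[1]], lavInspect(fitB, "free")[[1]]) would be called once per group, alongside a check that the df of the two models agree.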

Terrence Jorgensen

Jul 11, 2018, 5:21:08 AM
to lavaan
> Are there any unique considerations I need to make for modeling a bifactor (in this case, 20 variables with two uncorrelated factors explaining 10 items apiece and a broader bifactor)?

I don't know, but this is not a lavaan issue, so SEMNET is a better place to ask.

christophe...@nicholls.edu

Jul 11, 2018, 5:14:03 PM
to lavaan
Once again, you point me in the right direction. The lavInspect(fit, "free") output helped me diagnose the problem. My measurement model requires that certain residual covariances be freed (it is a scale that contains psychometric synonyms and antonyms, which must be correlated). When building the configural model, I should have specified that these residual correlations were to be freely estimated in both groups.

christophe...@nicholls.edu

Jul 15, 2018, 10:56:06 PM
to lavaan
I'd like to follow up. 

Suppose I'm only interested in the factor loadings and latent covariances of a measurement model. Do I need to replicate all elements of the syntax to test for invariance in these parameters?

For instance, suppose I have a simple two-factor model and two groups:

m1 <- '
f1 =~ c(NA,NA)*x1 + x2 + x3
f2 =~ c(NA,NA)*x4 + x5 + x6

#Fix latent variances to 1.
f1 ~~ c(1,1)*f1
f2 ~~ c(1,1)*f2

#Fix latent means to 0.
f1 ~ c(0,0)*1
f2 ~ c(0,0)*1

#Freely estimate covariances.
f1 ~~ c(NA,NA)*f2
'

I've deliberately left out the thresholds to help draw your attention to one of my concerns. 

Is there a particular problem with comparing the above model to one where the factor loadings are constrained to equality:

m2 <- '
f1 =~ c(a,a)*x1 + c(b,b)*x2 + c(c,c)*x3
f2 =~ c(d,d)*x4 + c(e,e)*x5 + c(f,f)*x6

#Fix latent variances to 1.
f1 ~~ c(1,1)*f1
f2 ~~ c(1,1)*f2

#Fix latent means to 0.
f1 ~ c(0,0)*1
f2 ~ c(0,0)*1

#Freely estimate covariances.
f1 ~~ c(NA,NA)*f2
'

...or one where the covariances are forced to be equal?

m3 <- '
f1 =~ c(a,a)*x1 + c(b,b)*x2 + c(c,c)*x3
f2 =~ c(d,d)*x4 + c(e,e)*x5 + c(f,f)*x6

#Fix latent variances to 1.
f1 ~~ c(1,1)*f1
f2 ~~ c(1,1)*f2

#Fix latent means to 0.
f1 ~ c(0,0)*1
f2 ~ c(0,0)*1

#Impose equality constraints on latent covariances.
f1 ~~ c(g,g)*f2
'

If there is, would you please either explain or point me to a reading that might be instructive?

Thanks in advance,

Chris

Terrence Jorgensen

Jul 17, 2018, 6:08:48 AM
to lavaan
> Do I need to replicate all elements of the syntax to test for invariance in these parameters?

Only if you want to follow the sequence recommended by Millsap & Tein (2004), which is not something I am a fan of.  If you follow Wu & Estabrook's (2016) advice, you would begin with constraining thresholds (allowing you to free residual variances and intercepts) before progressing through the usual sequence of constraints used with continuous indicators.  These methods both require you to specify fixed/free measurement parameters in the syntax because they do not conform to software's identification defaults.

If you follow Muthén's advice (from the Mplus manual and some other invariance papers he coauthored with Asparouhov), you would simultaneously (not in separate steps) constrain loadings and thresholds to test invariance.  This method requires the least extra syntax for you to write manually, because the cfa() function will impose default identification constraints that do not conflict with this model comparison.
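
As a rough illustration of that last approach, the sketch below lets cfa() handle identification and adds the constraints through group.equal. It is a sketch under stated assumptions, not the thread author's code: the simulated data, the helper `to_cat()`, the grouping variable `g`, and the choice of estimator = "WLSMV" with the default parameterization are all made up for the example, and it only runs when lavaan is installed.

```r
# Illustrative data only: two groups answering four 4-category items
set.seed(42)
n <- 600
eta <- rnorm(n)
to_cat <- function(y) as.integer(cut(y, breaks = c(-Inf, -0.8, 0, 0.8, Inf)))
dat <- data.frame(
  x1 = to_cat(0.8 * eta + rnorm(n, sd = 0.6)),
  x2 = to_cat(0.7 * eta + rnorm(n, sd = 0.7)),
  x3 = to_cat(0.6 * eta + rnorm(n, sd = 0.8)),
  x4 = to_cat(0.7 * eta + rnorm(n, sd = 0.7)),
  g  = rep(c("g1", "g2"), each = n / 2)
)

model <- ' f1 =~ x1 + x2 + x3 + x4 '
items <- paste0("x", 1:4)

fits <- NULL
if (requireNamespace("lavaan", quietly = TRUE)) {
  # Configural model: same structure, all measurement parameters free per group
  configural <- lavaan::cfa(model, data = dat, ordered = items, group = "g",
                            estimator = "WLSMV")
  # Loadings and thresholds constrained in a single step
  invariant <- lavaan::cfa(model, data = dat, ordered = items, group = "g",
                           estimator = "WLSMV",
                           group.equal = c("loadings", "thresholds"))
  fits <- list(configural = configural, invariant = invariant)
  print(lavaan::lavTestLRT(configural, invariant))  # nested-model comparison
}
```

No threshold or residual-variance lines are written by hand here; the trade-off is that loadings and thresholds are tested jointly rather than in separate steps.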

> I've deliberately left out the thresholds to help draw your attention to one of my concerns.

The cfa() function will include those by default for indicators declared as ordered.

> Is there a particular problem with comparing the above model to one where the factor loadings are constrained to equality:

Not a problem to compare, but you are not testing only the equality of loadings.  When loadings are constrained to equality, it is no longer necessary to fix all factor variances to 1 (just a reference group's factor variances).  If you don't release those identification constraints in group 2, then you are simultaneously testing whether both the factor loadings and the factor variances are equal, which is more restrictive than simply testing whether the loadings alone are equal.  You can then test whether factor variances are equal in a second step.
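
In syntax terms, the loadings-only test described here might look like the sketch below. Several things are assumptions for illustration: the labels `lam1` to `lam6`, the object name `m_loadings`, and the explicit `c(NA, NA)*` modifiers, which are included so that the first loadings are not fixed to 1 by a marker-variable default. The lavaanify() check only runs when lavaan is installed.

```r
m_loadings <- '
# Loadings estimated but equated across groups
f1 =~ c(NA, NA)*x1 + c(lam1, lam1)*x1 + c(lam2, lam2)*x2 + c(lam3, lam3)*x3
f2 =~ c(NA, NA)*x4 + c(lam4, lam4)*x4 + c(lam5, lam5)*x5 + c(lam6, lam6)*x6

# Factor variances: fixed to 1 in the reference group only, free in group 2
f1 ~~ c(1, NA)*f1
f2 ~~ c(1, NA)*f2

# Latent means fixed to 0; factor covariance free in both groups
f1 ~ c(0, 0)*1
f2 ~ c(0, 0)*1
f1 ~~ c(NA, NA)*f2
'

# Optional check of the fixed/free status without fitting anything
if (requireNamespace("lavaan", quietly = TRUE)) {
  pt <- lavaan::lavaanify(m_loadings, ngroups = 2)
  is_factor_var <- pt$op == "~~" & pt$lhs == pt$rhs & pt$lhs %in% c("f1", "f2")
  print(pt[is_factor_var, c("lhs", "group", "free", "ustart")])
}
```

Comparing this model with the configural model tests the loadings alone; replacing `c(1, NA)` with `c(1, 1)` afterwards gives the second step, in which the factor variances are also held equal.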

> ...or one where the covariances are forced to be equal?

As long as factor variances can be constrained to equality (which means fixing them all to 1 if that is your method of identification), then yes, you can meaningfully constrain latent covariances to test whether the factor correlations are equal across groups.

> If there is, would you please either explain or point me to a reading that might be instructive?

About measurement invariance with categorical indicators? I recommend Wu & Estabrook (2016). More generally about testing invariance, Roger Millsap's (2011) book Statistical Approaches to Measurement Invariance is excellent.