Theta parameterisation convergence


Giacomo Mason

Aug 31, 2017, 7:15:06 AM
to lavaan
Hi all,

I apologise in advance for the long post.

I'm having some issues conducting a measurement invariance analysis in a CFA with categorical data.

The data are structured as follows: I have a 12-item questionnaire, of which items 1-7 load on factor EXT and items 8-12 load on factor INT. Most items have 3 response categories, apart from items 5, 6, 7, and 12, which are dichotomous.

I'm using Millsap and Yun-Tein (2004) and Sunthud's code (http://sunthud.com/catInvariance.html) as references throughout.

This is how I'm setting up the configural model, using items 2 and 9 as anchor/reference items:
configural <- "
EXT =~ c(1, 1)*X2 + X1 + X3 + X4 + X5 + X6 + X7
INT =~ c(1, 1)*X9 + X8 + X10 + X11 + X12
X2 | c(t1_2, t1_2)*t1 + c(t2_2, t2_2)*t2
X1 | c(t1_1, t1_1)*t1 + t2
X3 | c(t1_3, t1_3)*t1 + t2
X4 | c(t1_4, t1_4)*t1 + t2
X5 | c(t1_5, t1_5)*t1
X6 | c(t1_6, t1_6)*t1
X7 | c(t1_7, t1_7)*t1
X9 | c(t1_9, t1_9)*t1 + c(t2_9, t2_9)*t2
X8 | c(t1_8, t1_8)*t1 + t2
X10 | c(t1_10, t1_10)*t1 + t2
X11 | c(t1_11, t1_11)*t1 + t2
X12 | c(t1_12, t1_12)*t1
EXT ~~ NA*EXT
INT ~~ NA*INT
EXT ~~ NA*INT
EXT ~ c(0, NA)*1
INT ~ c(0, NA)*1
X1 ~~ c(1, NA)*X1
X2 ~~ c(1, NA)*X2
X3 ~~ c(1, NA)*X3
X4 ~~ c(1, NA)*X4
X5 ~~ c(1, NA)*X5
X6 ~~ c(1, NA)*X6
X7 ~~ c(1, NA)*X7
X8 ~~ c(1, NA)*X8
X9 ~~ c(1, NA)*X9
X10 ~~ c(1, NA)*X10
X11 ~~ c(1, NA)*X11
X12 ~~ c(1, NA)*X12
"


Question 1:
I have struggled to find references on the identification of systems with mixed dichotomous/polytomous items. Do you think the restrictions I impose are the correct ones? I'm essentially assuming that I need to restrict the thresholds for the dichotomous items, but that I can leave their unique variances unrestricted.

__________________________________________________________________________________


Question 2:
I cannot seem to estimate the model using the theta parameterisation.
This converges easily:
fa.config <- cfa(configural, data = items.c, group = "cohort", estimator = "wlsmv")
This does not:
fa.config.theta <- cfa(configural, data = items.c, group = "cohort", estimator = "wlsmv", parameterization = "theta")

Looking around the group, it seems this can happen sometimes. Following the suggestion in another thread, I tried fitting a statistically equivalent model with free loadings/thresholds but restricted means/variances for the latents:
configural2 <- "
EXT =~ c(NA, NA)*X2 + X1 + X3 + X4 + X5 + X6 + X7
INT =~ c(NA, NA)*X9 + X8 + X10 + X11 + X12
X2 | c(t1_2, t1_2)*t1 + t2
X1 | c(t1_1, t1_1)*t1 + t2
X3 | c(t1_3, t1_3)*t1 + t2
X4 | c(t1_4, t1_4)*t1 + t2
X5 | c(t1_5, t1_5)*t1
X6 | c(t1_6, t1_6)*t1
X7 | c(t1_7, t1_7)*t1
X9 | c(t1_9, t1_9)*t1 + t2
X8 | c(t1_8, t1_8)*t1 + t2
X10 | c(t1_10, t1_10)*t1 + t2
X11 | c(t1_11, t1_11)*t1 + t2
X12 | c(t1_12, t1_12)*t1
EXT ~~ c(1,1)*EXT
INT ~~ c(1,1)*INT
EXT ~~ NA*INT
EXT ~ c(0, 0)*1
INT ~ c(0, 0)*1
X1 ~~ c(1, NA)*X1
X2 ~~ c(1, NA)*X2
X3 ~~ c(1, NA)*X3
X4 ~~ c(1, NA)*X4
X5 ~~ c(1, NA)*X5
X6 ~~ c(1, NA)*X6
X7 ~~ c(1, NA)*X7
X8 ~~ c(1, NA)*X8
X9 ~~ c(1, NA)*X9
X10 ~~ c(1, NA)*X10
X11 ~~ c(1, NA)*X11
X12 ~~ c(1, NA)*X12
"
Again, this converges only under the delta parameterisation, not under theta.

Also, when I compare the two statistically equivalent models under the delta parameterisation, I get the same degrees of freedom but slightly different chi-square values. Maybe they are not equivalent after all?

> anova(fa.config, fa.config2)
Scaled Chi Square Difference Test (method = "satorra.2000")

            Df AIC BIC  Chisq Chisq diff Df diff Pr(>Chisq)
fa.config  106         5492.7                              
fa.config2 106         5451.5    -12.525  3.0108          1

__________________________________________________________________________________


Question 3:
When I place restrictions on loadings (metric model) and thresholds (scalar model), both delta and theta parameterisations converge. However, I get wildly different fit indices between theta and delta.

> c( fitMeasures(fa.metric, c("rmsea", "mfi", "cfi")), moreFitIndices(fa.metric)["gammaHat"] )
     rmsea        mfi        cfi   gammaHat 
0.07875093 0.83539587 0.88289309 0.94344432 
> c( fitMeasures(fa.metric.theta, c("rmsea", "mfi", "cfi")), moreFitIndices(fa.metric.theta)["gammaHat"] )
     rmsea        mfi        cfi   gammaHat 
0.06385695 0.88847071 0.92300057 0.96207935 

> c( fitMeasures(fa.scalar, c("rmsea", "mfi", "cfi")), moreFitIndices(fa.scalar)["gammaHat"] )
     rmsea        mfi        cfi   gammaHat 
0.09987232 0.73769735 0.80190970 0.90793484 
> c( fitMeasures(fa.scalar.theta, c("rmsea", "mfi", "cfi")), moreFitIndices(fa.scalar.theta)["gammaHat"] )
     rmsea        mfi        cfi   gammaHat 
0.08120037 0.81782922 0.86905499 0.93718116 

Is this to be expected?

__________________________________________________________________________________

I would be happy to provide data, but they are from a protected dataset.

Thanks so much in advance for your help, and thanks for maintaining this great resource!

All the best,

Giacomo

Felix Fischer

Sep 1, 2017, 8:01:47 AM
to lavaan
Hi,

Question 1: Wu & Estabrook discuss the binary and the polytomous case, but I don't think the combination.
Question 2: your model constrains the thresholds to be equal across groups, which is not the configural model. Also, I think that if you use the delta parameterization you need to define the scaling factors in the model instead of the residual variances; see ?model.syntax.

Question 3: theta and delta parameterization should give the same fit, I think, as both are the same model. I think you would need to use the scaled fit statistics, e.g. "cfi.scaled"?

best, felix


Giacomo Mason

Sep 1, 2017, 9:09:02 AM
to lavaan
Hi Felix,

thanks for your reply!

Question 2: I was under the impression that Millsap & Yun-Tein say that the first threshold should be equal across groups for all items. This is also the case in Sunthud's examples.
Question 2b: Are the constraints on the unique variances ignored if I'm using the delta parameterisation? If so, do you know how I can come up with equivalent constraints on the scaling factors?

Question 3: even with the scaled CFI, I get different values. It might be because (as in 2b) I'm not specifying the delta parameterisation correctly...
> c( fitMeasures(fa.scalar, c("rmsea", "mfi", "cfi.scaled")), moreFitIndices(fa.scalar)["gammaHat"] )
     rmsea        mfi cfi.scaled   gammaHat 
0.09987232 0.73769735 0.71386126 0.90793484 
> c( fitMeasures(fa.scalar.theta, c("rmsea", "mfi", "cfi.scaled")), moreFitIndices(fa.scalar.theta)["gammaHat"] )
     rmsea        mfi cfi.scaled   gammaHat 
0.08120037 0.81782922 0.80117664 0.93718116 

Thanks again

G

Felix Fischer

Sep 1, 2017, 10:35:54 AM
to lavaan
Wu & Estabrook introduce a different order of models compared to Millsap & Yun-Tein. So, in their configural model, you would have:

EXT =~ X2 + X1 + X3 + X4 + X5 + X6 + X7 # free loadings
EXT ~~ c(1,1)*EXT # factor variance fixed at 1
EXT ~ c(0,0)*1 # factor mean fixed at 0

X1 | t1 + t2 # free thresholds
X1 ~ c(0, 0)*1 # intercept fixed at 0
X1 ~~ c(1,1)*X1 # residual variance fixed at 1


Just add your other terms and run the model with parameterization = "theta". In the delta case, simply replace the residual-variance lines (X1 ~~ c(1,1)*X1) with the scaling factor: X1 ~*~ X1

You should also get the configural model by just specifying the loadings (better try that), but I felt that specifying the entire model made it somewhat clearer and easier to introduce the constraints as needed (see Terry's response in the other thread).

Best, Felix

Felix Fischer

Sep 1, 2017, 10:45:07 AM
to lavaan
The scaling factor needs to be constrained: X1 ~*~ c(1,1)*X1
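Putting the two corrections together, a sketch of one free item under each parameterization might look like this (illustrative object names, two groups; only the variance-related line differs):

```r
## Sketch: the same item under each parameterization (two groups).
item.theta <- "
F =~ c(NA, NA)*X1        # free loading
X1 | t1 + t2             # free thresholds
X1 ~ c(0, 0)*1           # intercept fixed at 0
X1 ~~ c(1, 1)*X1         # residual variance fixed at 1
"
# fit with: cfa(item.theta, ..., parameterization = "theta")

item.delta <- "
F =~ c(NA, NA)*X1
X1 | t1 + t2
X1 ~ c(0, 0)*1
X1 ~*~ c(1, 1)*X1        # scaling factor fixed at 1 instead
"
# fit with: cfa(item.delta, ..., parameterization = "delta")
```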

Giacomo Mason

Sep 4, 2017, 12:30:11 PM
to lavaan
Dear Felix,

thanks, this was incredibly helpful. It seems that the main message of Wu and Estabrook, versus M&YT, is: constrain thresholds first, then thresholds plus loadings and any other parameters.

I was able to write down the theta (Condition 8) and delta (Condition 7) parameterisations of the two configural models proposed in W&E:

# CONFIGURAL as in Wu Estabrook Condition 8
configural.we.theta <- "
EXT =~ c(NA, NA)*X2 + X1 + X3 + X4 + X5 + X6 + X7
INT =~ c(NA, NA)*X9 + X8 + X10 + X11 + X12
X1 | t1 + t2
X2 | t1 + t2
X3 | t1 + t2
X4 | t1 + t2
X5 | t1
X6 | t1
X7 | t1
X8 | t1 + t2
X9 | t1 + t2
X10 | t1 + t2
X11 | t1 + t2
X12 | t1
X1  ~ c(0, 0)*1
X2  ~ c(0, 0)*1
X3  ~ c(0, 0)*1
X4  ~ c(0, 0)*1
X5  ~ c(0, 0)*1
X6  ~ c(0, 0)*1
X7  ~ c(0, 0)*1
X8  ~ c(0, 0)*1
X9  ~ c(0, 0)*1
X10  ~ c(0, 0)*1
X11  ~ c(0, 0)*1
X12  ~ c(0, 0)*1
X1 ~~ c(1,1)*X1
X2 ~~ c(1,1)*X2
X3 ~~ c(1,1)*X3
X4 ~~ c(1,1)*X4
X5 ~~ c(1,1)*X5
X6 ~~ c(1,1)*X6
X7 ~~ c(1,1)*X7
X8 ~~ c(1,1)*X8
X9 ~~ c(1,1)*X9
X10 ~~ c(1,1)*X10
X11 ~~ c(1,1)*X11
X12 ~~ c(1,1)*X12
EXT ~~ c(1,1)*EXT
INT ~~ c(1,1)*INT
EXT ~~ NA*INT
EXT ~ c(0, 0)*1
INT ~ c(0, 0)*1
"

# CONFIGURAL as in Wu Estabrook Condition 7
configural.we.delta <- "
EXT =~ c(NA, NA)*X2 + X1 + X3 + X4 + X5 + X6 + X7
INT =~ c(NA, NA)*X9 + X8 + X10 + X11 + X12
X1 | t1 + t2
X2 | t1 + t2
X3 | t1 + t2
X4 | t1 + t2
X5 | t1
X6 | t1
X7 | t1
X8 | t1 + t2
X9 | t1 + t2
X10 | t1 + t2
X11 | t1 + t2
X12 | t1
X1  ~ c(0, 0)*1
X2  ~ c(0, 0)*1
X3  ~ c(0, 0)*1
X4  ~ c(0, 0)*1
X5  ~ c(0, 0)*1
X6  ~ c(0, 0)*1
X7  ~ c(0, 0)*1
X8  ~ c(0, 0)*1
X9  ~ c(0, 0)*1
X10  ~ c(0, 0)*1
X11  ~ c(0, 0)*1
X12  ~ c(0, 0)*1
X1 ~*~ c(1,1)*X1
X2 ~*~ c(1,1)*X2
X3 ~*~ c(1,1)*X3
X4 ~*~ c(1,1)*X4
X5 ~*~ c(1,1)*X5
X6 ~*~ c(1,1)*X6
X7 ~*~ c(1,1)*X7
X8 ~*~ c(1,1)*X8
X9 ~*~ c(1,1)*X9
X10 ~*~ c(1,1)*X10
X11 ~*~ c(1,1)*X11
X12 ~*~ c(1,1)*X12
EXT ~~ c(1,1)*EXT
INT ~~ c(1,1)*INT
EXT ~~ NA*INT
EXT ~ c(0, 0)*1
INT ~ c(0, 0)*1
"

Indeed, these two give the same likelihood and fit indices (df = 106):
fa.config.we.theta <- cfa(configural.we.theta, data = items.c, group = "cohort", estimator="wlsmv", parameterization="theta")
fa.config.we.delta <- cfa(configural.we.delta, data = items.c, group = "cohort", estimator="wlsmv", parameterization="delta")


For my analysis I need a parameterisation that allows comparison of latent means and variances across groups, as in M&YT. However, I'm still not sure how to write it down in lavaan. W&E describe it in the following way (page 1020):
  1. In all groups, set the intercepts to 0 and one reference loading per factor to 1.
  2. In the first group, set the means of the LVs to 0 and their variances to 1.
  3. Constrain one threshold per item to be invariant, plus a second threshold in the reference item of each factor.

I thus estimate the following:

configural.myt.theta <- "
EXT =~ c(1, 1)*X2 + X1 + X3 + X4 + X5 + X6 + X7
INT =~ c(1, 1)*X9 + X8 + X10 + X11 + X12
X2 | c(t1_2, t1_2)*t1 + c(t2_2, t2_2)*t2
X1 | c(t1_1, t1_1)*t1 + t2
X3 | c(t1_3, t1_3)*t1 + t2
X4 | c(t1_4, t1_4)*t1 + t2
X5 | c(t1_5, t1_5)*t1
X6 | c(t1_6, t1_6)*t1
X7 | c(t1_7, t1_7)*t1
X9 | c(t1_9, t1_9)*t1 + c(t2_9, t2_9)*t2
X8 | c(t1_8, t1_8)*t1 + t2
X10 | c(t1_10, t1_10)*t1 + t2
X11 | c(t1_11, t1_11)*t1 + t2
X12 | c(t1_12, t1_12)*t1
X1  ~ c(0, 0)*1
X2  ~ c(0, 0)*1
X3  ~ c(0, 0)*1
X4  ~ c(0, 0)*1
X5  ~ c(0, 0)*1
X6  ~ c(0, 0)*1
X7  ~ c(0, 0)*1
X8  ~ c(0, 0)*1
X9  ~ c(0, 0)*1
X10  ~ c(0, 0)*1
X11  ~ c(0, 0)*1
X12  ~ c(0, 0)*1
X1 ~~ c(1,1)*X1
X2 ~~ c(1,1)*X2
X3 ~~ c(1,1)*X3
X4 ~~ c(1,1)*X4
X5 ~~ c(1,1)*X5
X6 ~~ c(1,1)*X6
X7 ~~ c(1,1)*X7
X8 ~~ c(1,1)*X8
X9 ~~ c(1,1)*X9
X10 ~~ c(1,1)*X10
X11 ~~ c(1,1)*X11
X12 ~~ c(1,1)*X12
EXT ~~ c(1,NA)*EXT
INT ~~ c(1,NA)*INT
EXT ~~ NA*INT
EXT ~ c(0, NA)*1
INT ~ c(0, NA)*1
"
fa.config <- cfa(configural.myt.theta, data = items.c, group = "cohort", estimator="wlsmv", parameterization="theta")

But this results in a model with 120 df instead of 106. Do you have any suggestions on how I can get this M&YT parameterisation right?

Thanks again for your help!

Giacomo

Felix Fischer

Sep 4, 2017, 2:47:51 PM
to lavaan
Great, good to hear. In order to narrow down the difference between your model and Millsap's advice, I would suggest comparing your number of free parameters with those mentioned in Table 3 of the Wu paper. Maybe that sheds some light.

And to compare means, check section 5.8, "The Comparison of Factor Means and Variances". It essentially says you can do so when you constrain the thresholds, loadings, and intercepts, and the paper gives the identification specifications as well.


Terrence Jorgensen

Sep 10, 2017, 5:38:18 AM
to lavaan
In order to narrow down the difference between your model and Millsap's advice, I would suggest you compare your number of free parameters with those mentioned in Table 3 of the Wu paper.

That's good advice, and looking at the output of 

PT.config.we.theta <- parTable(fa.config.we.theta)

will show you which parameters are freely estimated (PT.config.we.theta$free > 0), fixed (PT.config.we.theta$free == 0), or constrained to equality (look for labels in PT.config.we.theta$label or PT.config.we.theta$plabel, and match them to the constraints listed in the last row(s) of the parameter table).
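A compact sketch of that bookkeeping, assuming `fit` stands for any of the fitted lavaan objects in this thread (e.g. fa.config.we.theta):

```r
## Sketch: tally parameter types from a fitted model's parameter table.
PT <- parTable(fit)

sum(PT$free > 0)                    # freely estimated parameters
sum(PT$free == 0 & PT$op != "==")   # fixed parameters
sum(PT$op == "==")                  # explicit equality-constraint rows
## effective parameter count: free parameters minus equality constraints
sum(PT$free > 0) - sum(PT$op == "==")
```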

Terrence D. Jorgensen
Postdoctoral Researcher, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Giacomo Mason

Sep 11, 2017, 6:10:35 AM
to lavaan
Hi both,

thanks so much for your help! I was already trying the route suggested by Terrence, merging the parameter tables of the two models. Sadly, I still can't figure out the correct syntax for the configural model in the M&YT parameterisation.

If I estimate both models with the theta parameterisation, I get a difference of 14 df between the two:
Scaled Chi Square Difference Test (method = "satorra.2000")

                     Df AIC BIC  Chisq Chisq diff Df diff Pr(>Chisq)    
fa.config.we.theta  106         2254.6                                  
fa.config.myt.theta 120         6021.7     1618.8  8.8352  < 2.2e-16 ***


When I assemble and merge the two parameter tables, I get the following pattern of free parameters, where the free.MYT and free.WE columns each count 66 free parameters:
      lhs  op    rhs group free.MYT free.WE
1   .p13.  ==  .p86.     0      0     NA
2   .p14.  ==  .p87.     0      0     NA
3   .p15.  ==  .p88.     0      0     NA
4   .p17.  ==  .p90.     0      0     NA
5   .p19.  ==  .p92.     0      0     NA
6   .p21.  ==  .p94.     0      0     NA
7   .p22.  ==  .p95.     0      0     NA
8   .p23.  ==  .p96.     0      0     NA
9   .p24.  ==  .p97.     0      0     NA
10  .p25.  ==  .p98.     0      0     NA
11  .p26.  ==  .p99.     0      0     NA
12  .p28.  == .p101.     0      0     NA
13  .p30.  == .p103.     0      0     NA
14  .p32.  == .p105.     0      0     NA
15    EXT  =~     X1     1      1      2
16    EXT  =~     X1     2     32     35
17    EXT  =~     X2     1      0      1
18    EXT  =~     X2     2      0     34
19    EXT  =~     X3     1      2      3
20    EXT  =~     X3     2     33     36
21    EXT  =~     X4     1      3      4
22    EXT  =~     X4     2     34     37
23    EXT  =~     X5     1      4      5
24    EXT  =~     X5     2     35     38
25    EXT  =~     X6     1      5      6
26    EXT  =~     X6     2     36     39
27    EXT  =~     X7     1      6      7
28    EXT  =~     X7     2     37     40
35    INT  =~    X10     1      8     10
36    INT  =~    X10     2     39     43
37    INT  =~    X11     1      9     11
38    INT  =~    X11     2     40     44
39    INT  =~    X12     1     10     12
40    INT  =~    X12     2     41     45
41    INT  =~     X8     1      7      9
42    INT  =~     X8     2     38     42
43    INT  =~     X9     1      0      8
44    INT  =~     X9     2      0     41
49     X1   |     t1     1     13     13
50     X1   |     t1     2     44     46
51     X1   |     t2     1     14     14
52     X1   |     t2     2     45     47
59    X10   |     t1     1     26     28
60    X10   |     t1     2     57     61
61    X10   |     t2     1     27     29
62    X10   |     t2     2     58     62
69    X11   |     t1     1     28     30
70    X11   |     t1     2     59     63
71    X11   |     t2     1     29     31
72    X11   |     t2     2     60     64
79    X12   |     t1     1     30     32
80    X12   |     t1     2     61     65
87     X2   |     t1     1     11     15
88     X2   |     t1     2     42     48
89     X2   |     t2     1     12     16
90     X2   |     t2     2     43     49
97     X3   |     t1     1     15     17
98     X3   |     t1     2     46     50
99     X3   |     t2     1     16     18
100    X3   |     t2     2     47     51
107    X4   |     t1     1     17     19
108    X4   |     t1     2     48     52
109    X4   |     t2     1     18     20
110    X4   |     t2     2     49     53
117    X5   |     t1     1     19     21
118    X5   |     t1     2     50     54
125    X6   |     t1     1     20     22
126    X6   |     t1     2     51     55
133    X7   |     t1     1     21     23
134    X7   |     t1     2     52     56
141    X8   |     t1     1     24     24
142    X8   |     t1     2     55     57
143    X8   |     t2     1     25     25
144    X8   |     t2     2     56     58
151    X9   |     t1     1     22     26
152    X9   |     t1     2     53     59
153    X9   |     t2     1     23     27
154    X9   |     t2     2     54     60
53     X1 ~*~     X1     1      0      0
54     X1 ~*~     X1     2      0      0
63    X10 ~*~    X10     1      0      0
64    X10 ~*~    X10     2      0      0
73    X11 ~*~    X11     1      0      0
74    X11 ~*~    X11     2      0      0
81    X12 ~*~    X12     1      0      0
82    X12 ~*~    X12     2      0      0
91     X2 ~*~     X2     1      0      0
92     X2 ~*~     X2     2      0      0
101    X3 ~*~     X3     1      0      0
102    X3 ~*~     X3     2      0      0
111    X4 ~*~     X4     1      0      0
112    X4 ~*~     X4     2      0      0
119    X5 ~*~     X5     1      0      0
120    X5 ~*~     X5     2      0      0
127    X6 ~*~     X6     1      0      0
128    X6 ~*~     X6     2      0      0
135    X7 ~*~     X7     1      0      0
136    X7 ~*~     X7     2      0      0
145    X8 ~*~     X8     1      0      0
146    X8 ~*~     X8     2      0      0
155    X9 ~*~     X9     1      0      0
156    X9 ~*~     X9     2      0      0
29    EXT  ~~    EXT     1      0      0
30    EXT  ~~    EXT     2     62      0
31    EXT  ~~    INT     1     31     33
32    EXT  ~~    INT     2     64     66
45    INT  ~~    INT     1      0      0
46    INT  ~~    INT     2     63      0
55     X1  ~~     X1     1      0      0
56     X1  ~~     X1     2      0      0
65    X10  ~~    X10     1      0      0
66    X10  ~~    X10     2      0      0
75    X11  ~~    X11     1      0      0
76    X11  ~~    X11     2      0      0
83    X12  ~~    X12     1      0      0
84    X12  ~~    X12     2      0      0
93     X2  ~~     X2     1      0      0
94     X2  ~~     X2     2      0      0
103    X3  ~~     X3     1      0      0
104    X3  ~~     X3     2      0      0
113    X4  ~~     X4     1      0      0
114    X4  ~~     X4     2      0      0
121    X5  ~~     X5     1      0      0
122    X5  ~~     X5     2      0      0
129    X6  ~~     X6     1      0      0
130    X6  ~~     X6     2      0      0
137    X7  ~~     X7     1      0      0
138    X7  ~~     X7     2      0      0
147    X8  ~~     X8     1      0      0
148    X8  ~~     X8     2      0      0
157    X9  ~~     X9     1      0      0
158    X9  ~~     X9     2      0      0
33    EXT  ~1            1      0      0
34    EXT  ~1            2     65      0
47    INT  ~1            1      0      0
48    INT  ~1            2     66      0
57     X1  ~1            1      0      0
58     X1  ~1            2      0      0
67    X10  ~1            1      0      0
68    X10  ~1            2      0      0
77    X11  ~1            1      0      0
78    X11  ~1            2      0      0
85    X12  ~1            1      0      0
86    X12  ~1            2      0      0
95     X2  ~1            1      0      0
96     X2  ~1            2      0      0
105    X3  ~1            1      0      0
106    X3  ~1            2      0      0
115    X4  ~1            1      0      0
116    X4  ~1            2      0      0
123    X5  ~1            1      0      0
124    X5  ~1            2      0      0
131    X6  ~1            1      0      0
132    X6  ~1            2      0      0
139    X7  ~1            1      0      0
140    X7  ~1            2      0      0
149    X8  ~1            1      0      0
150    X8  ~1            2      0      0
159    X9  ~1            1      0      0
160    X9  ~1            2      0      0



Again, the two syntaxes (maybe more legible like this) are, for the MYT parameterisation:
configural.myt.theta <- "
EXT =~ c(1, 1)*X2 + X1 + X3 + X4 + X5 + X6 + X7    # LOADINGS CONSTRAINED ON ANCHORS
INT =~ c(1, 1)*X9 + X8 + X10 + X11 + X12
X2 | c(t1_2, t1_2)*t1 + c(t2_2, t2_2)*t2           # FIRST THRESHOLD ON EACH ITEM, AND SECOND THRESHOLD ON ANCHORS
X1 | c(t1_1, t1_1)*t1 + t2
X3 | c(t1_3, t1_3)*t1 + t2
X4 | c(t1_4, t1_4)*t1 + t2
X5 | c(t1_5, t1_5)*t1
X6 | c(t1_6, t1_6)*t1
X7 | c(t1_7, t1_7)*t1
X9 | c(t1_9, t1_9)*t1 + c(t2_9, t2_9)*t2
X8 | c(t1_8, t1_8)*t1 + t2
X10 | c(t1_10, t1_10)*t1 + t2
X11 | c(t1_11, t1_11)*t1 + t2
X12 | c(t1_12, t1_12)*t1
X1  ~ c(0, 0)*1                       # ALL INTERCEPTS TO ZERO
X2  ~ c(0, 0)*1
X3  ~ c(0, 0)*1
X4  ~ c(0, 0)*1
X5  ~ c(0, 0)*1
X6  ~ c(0, 0)*1
X7  ~ c(0, 0)*1
X8  ~ c(0, 0)*1
X9  ~ c(0, 0)*1
X10  ~ c(0, 0)*1
X11  ~ c(0, 0)*1
X12  ~ c(0, 0)*1
X1 ~~ c(1,1)*X1           # ALL UNIQUE VARIANCES TO 1
X2 ~~ c(1,1)*X2
X3 ~~ c(1,1)*X3
X4 ~~ c(1,1)*X4
X5 ~~ c(1,1)*X5
X6 ~~ c(1,1)*X6
X7 ~~ c(1,1)*X7
X8 ~~ c(1,1)*X8
X9 ~~ c(1,1)*X9
X10 ~~ c(1,1)*X10
X11 ~~ c(1,1)*X11
X12 ~~ c(1,1)*X12
EXT ~~ c(1,NA)*EXT            # FIXED MEAN AND VARIANCE IN FIRST GROUP, FREE COVARIANCE
INT ~~ c(1,NA)*INT
EXT ~~ NA*INT
EXT ~ c(0, NA)*1
INT ~ c(0, NA)*1
"

... and for the Wu Estabrook (condition 8) theta parameterisation:
configural.we.theta <- "
EXT =~ c(NA, NA)*X2 + X1 + X3 + X4 + X5 + X6 + X7     # ALL FREE LOADINGS
INT =~ c(NA, NA)*X9 + X8 + X10 + X11 + X12
X1 | t1 + t2                                          # ALL FREE THRESHOLDS
X2 | t1 + t2
X3 | t1 + t2
X4 | t1 + t2
X5 | t1
X6 | t1
X7 | t1
X8 | t1 + t2
X9 | t1 + t2
X10 | t1 + t2
X11 | t1 + t2
X12 | t1
X1  ~ c(0, 0)*1                            # ALL INTERCEPTS TO ZERO
X2  ~ c(0, 0)*1
X3  ~ c(0, 0)*1
X4  ~ c(0, 0)*1
X5  ~ c(0, 0)*1
X6  ~ c(0, 0)*1
X7  ~ c(0, 0)*1
X8  ~ c(0, 0)*1
X9  ~ c(0, 0)*1
X10  ~ c(0, 0)*1
X11  ~ c(0, 0)*1
X12  ~ c(0, 0)*1
X1 ~~ c(1,1)*X1           # ALL UNIQUE VARIANCES TO 1
X2 ~~ c(1,1)*X2
X3 ~~ c(1,1)*X3
X4 ~~ c(1,1)*X4
X5 ~~ c(1,1)*X5
X6 ~~ c(1,1)*X6
X7 ~~ c(1,1)*X7
X8 ~~ c(1,1)*X8
X9 ~~ c(1,1)*X9
X10 ~~ c(1,1)*X10          
X11 ~~ c(1,1)*X11
X12 ~~ c(1,1)*X12
EXT ~~ c(1,1)*EXT          # FIXED MEANS AND VARIANCES IN BOTH GROUPS, FREE COVARIANCE
INT ~~ c(1,1)*INT
EXT ~~ NA*INT
EXT ~ c(0, 0)*1
INT ~ c(0, 0)*1
"

So I'm not sure where I'm going wrong here. It seems that the threshold constraints are driving the difference between the two!

Thanks again for your help

Giacomo

Giacomo Mason

Sep 21, 2017, 10:37:57 AM
to lavaan
I'm sorry to chase, and I understand it's a cumbersome question, but I would really appreciate your help on this!

Best wishes,

Giacomo

Felix Fischer

Sep 22, 2017, 3:07:52 PM
to lavaan
Oh, sorry for not looking into this. Apparently, if I use the function lav_partable_df() (which I think is what the cfa() function uses to get the df), I get the same df for every model. Maybe a bug?

@Yves: could you have a look into this?

Minimal example:

items.c = data.frame(
  X1 = sample(1:3, 100, replace = TRUE),
  X2 = sample(1:3, 100, replace = TRUE),
  X3 = sample(1:3, 100, replace = TRUE),
  X4 = sample(1:3, 100, replace = TRUE),
  X5 = sample(1:2, 100, replace = TRUE),
  X6 = sample(1:2, 100, replace = TRUE),
  X7 = sample(1:2, 100, replace = TRUE),
  X8 = sample(1:3, 100, replace = TRUE),
  X9 = sample(1:3, 100, replace = TRUE),
  X10 = sample(1:3, 100, replace = TRUE),
  X11 = sample(1:3, 100, replace = TRUE),
  X12 = sample(1:2, 100, replace = TRUE),
  cohort = letters[1:2])

configural.we.theta <- "
EXT =~ c(NA, NA)*X2 + X1 + X3 + X4 + X5 + X6 + X7
INT =~ c(NA, NA)*X9 + X8 + X10 + X11 + X12
X1 | t1 + t2
X2 | t1 + t2
X3 | t1 + t2
X4 | t1 + t2
X5 | t1
X6 | t1
X7 | t1
X8 | t1 + t2
X9 | t1 + t2
X10 | t1 + t2
X11 | t1 + t2
X12 | t1
X1  ~ c(0, 0)*1
X2  ~ c(0, 0)*1
X3  ~ c(0, 0)*1
X4  ~ c(0, 0)*1
X5  ~ c(0, 0)*1
X6  ~ c(0, 0)*1
X7  ~ c(0, 0)*1
X8  ~ c(0, 0)*1
X9  ~ c(0, 0)*1
X10  ~ c(0, 0)*1
X11  ~ c(0, 0)*1
X12  ~ c(0, 0)*1
X1 ~~ c(1,1)*X1
X2 ~~ c(1,1)*X2
X3 ~~ c(1,1)*X3
X4 ~~ c(1,1)*X4
X5 ~~ c(1,1)*X5
X6 ~~ c(1,1)*X6
X7 ~~ c(1,1)*X7
X8 ~~ c(1,1)*X8
X9 ~~ c(1,1)*X9
X10 ~~ c(1,1)*X10
X11 ~~ c(1,1)*X11
X12 ~~ c(1,1)*X12
EXT ~~ c(1,1)*EXT
INT ~~ c(1,1)*INT
EXT ~~ NA*INT
EXT ~ c(0, 0)*1
INT ~ c(0, 0)*1
"

# CONFIGURAL as in Wu Estabrook Condition 7
configural.we.delta <- "
EXT =~ c(NA, NA)*X2 + X1 + X3 + X4 + X5 + X6 + X7
INT =~ c(NA, NA)*X9 + X8 + X10 + X11 + X12
X1 | t1 + t2
X2 | t1 + t2
X3 | t1 + t2
X4 | t1 + t2
X5 | t1
X6 | t1
X7 | t1
X8 | t1 + t2
X9 | t1 + t2
X10 | t1 + t2
X11 | t1 + t2
X12 | t1
X1  ~ c(0, 0)*1
X2  ~ c(0, 0)*1
X3  ~ c(0, 0)*1
X4  ~ c(0, 0)*1
X5  ~ c(0, 0)*1
X6  ~ c(0, 0)*1
X7  ~ c(0, 0)*1
X8  ~ c(0, 0)*1
X9  ~ c(0, 0)*1
X10  ~ c(0, 0)*1
X11  ~ c(0, 0)*1
X12  ~ c(0, 0)*1
X1 ~*~ c(1,1)*X1
X2 ~*~ c(1,1)*X2
X3 ~*~ c(1,1)*X3
X4 ~*~ c(1,1)*X4
X5 ~*~ c(1,1)*X5
X6 ~*~ c(1,1)*X6
X7 ~*~ c(1,1)*X7
X8 ~*~ c(1,1)*X8
X9 ~*~ c(1,1)*X9
X10 ~*~ c(1,1)*X10
X11 ~*~ c(1,1)*X11
X12 ~*~ c(1,1)*X12
EXT ~~ c(1,1)*EXT
INT ~~ c(1,1)*INT
EXT ~~ NA*INT
EXT ~ c(0, 0)*1
INT ~ c(0, 0)*1
"



configural.myt.theta <- "
EXT =~ c(1, 1)*X2 + X1 + X3 + X4 + X5 + X6 + X7
INT =~ c(1, 1)*X9 + X8 + X10 + X11 + X12
X2 | c(t1_2, t1_2)*t1 + c(t2_2, t2_2)*t2
X1 | c(t1_1, t1_1)*t1 + t2
X3 | c(t1_3, t1_3)*t1 + t2
X4 | c(t1_4, t1_4)*t1 + t2
X5 | c(t1_5, t1_5)*t1
X6 | c(t1_6, t1_6)*t1
X7 | c(t1_7, t1_7)*t1
X9 | c(t1_9, t1_9)*t1 + c(t2_9, t2_9)*t2
X8 | c(t1_8, t1_8)*t1 + t2
X10 | c(t1_10, t1_10)*t1 + t2
X11 | c(t1_11, t1_11)*t1 + t2
X12 | c(t1_12, t1_12)*t1
X1  ~ c(0, 0)*1
X2  ~ c(0, 0)*1
X3  ~ c(0, 0)*1
X4  ~ c(0, 0)*1
X5  ~ c(0, 0)*1
X6  ~ c(0, 0)*1
X7  ~ c(0, 0)*1
X8  ~ c(0, 0)*1
X9  ~ c(0, 0)*1
X10  ~ c(0, 0)*1
X11  ~ c(0, 0)*1
X12  ~ c(0, 0)*1
X1 ~~ c(1,1)*X1
X2 ~~ c(1,1)*X2
X3 ~~ c(1,1)*X3
X4 ~~ c(1,1)*X4
X5 ~~ c(1,1)*X5
X6 ~~ c(1,1)*X6
X7 ~~ c(1,1)*X7
X8 ~~ c(1,1)*X8
X9 ~~ c(1,1)*X9
X10 ~~ c(1,1)*X10
X11 ~~ c(1,1)*X11
X12 ~~ c(1,1)*X12
EXT ~~ c(1,NA)*EXT
INT ~~ c(1,NA)*INT
EXT ~~ NA*INT
EXT ~ c(0, NA)*1
INT ~ c(0, NA)*1
"

fa.config.we.theta <- cfa(configural.we.theta, data = items.c, group = "cohort", estimator="wlsmv", parameterization="theta")
fa.config.we.delta <- cfa(configural.we.delta, data = items.c, group = "cohort", estimator="wlsmv", parameterization="delta")
fa.config <- cfa(configural.myt.theta, data = items.c, group = "cohort", estimator="wlsmv", parameterization="theta")

lav_partable_npar(partable(fa.config))
lav_partable_npar(partable(fa.config.we.delta))
lav_partable_npar(partable(fa.config.we.theta))

lav_partable_ndat(partable(fa.config))
lav_partable_ndat(partable(fa.config.we.delta))
lav_partable_ndat(partable(fa.config.we.theta))

lav_partable_df(partable(fa.config))
lav_partable_df(partable(fa.config.we.delta))
lav_partable_df(partable(fa.config.we.theta))

Giacomo Mason

Sep 23, 2017, 10:35:08 AM
to lavaan
Felix,

thanks so much for following up on this. I did not know about lav_partable_df(). I can confirm that I get 106 df for all models on my real data as well. A bug would also explain why I get different fit indices, if they obtain the df the same way anova() does.

Thanks also for producing the minimal example for Yves to look at! 

Giacomo

Felix Fischer

Sep 24, 2017, 4:46:57 PM
to lavaan
"For my analysis I need a parameterisation that allows comparison of latent means and variances across groups, like in M&YT." This can be done if you follow Wu & Estabrook's advice. Building on the configural theta model, you constrain the thresholds and loadings to be equal across groups and the intercepts to 0. The other parameters (factor means and variances) can then be freely estimated and compared. The residual variances must be constrained to 1 in the reference group, but I am not entirely sure about that, so you had better check. Best, Felix
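A sketch of what that advice could look like for one factor and one item (the labels l1, t1_1, t2_1 are illustrative, not from the thread; shared labels impose cross-group equality):

```r
## Sketch only: one factor / one item of the constrained model described
## above; group 1 is the reference group.
scalar.theta <- "
EXT =~ c(l1, l1)*X1                        # loading equal across groups
X1 | c(t1_1, t1_1)*t1 + c(t2_1, t2_1)*t2   # thresholds equal across groups
X1 ~ c(0, 0)*1                             # intercepts fixed at 0
X1 ~~ c(1, NA)*X1                          # residual variance: 1 in group 1, free in group 2
EXT ~~ c(1, NA)*EXT                        # factor variance fixed only in group 1
EXT ~ c(0, NA)*1                           # factor mean fixed only in group 1
"
## fit with: cfa(scalar.theta, ..., parameterization = "theta")
```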

Giacomo Mason

Sep 25, 2017, 9:30:34 AM
to lavaan
Hi Felix,

yes I think I understand the restrictions in Wu and Estabrook now (thanks to your major help!).

I would still feel more comfortable with the whole analysis if I could compare the W&E parameterisation with the M&YT parameterisation, so I can check I'm doing things correctly. But any fit comparison is impossible at the moment if neither anova() nor the fit indices are computed using the correct degrees of freedom.

Giacomo

Giacomo Mason

Sep 26, 2017, 8:58:24 AM
to lavaan
I do think there might be a bug in how the df are calculated... This is what I get when I compare the output of lav_partable_df() and fitmeasures( , "df") on my data (the same holds for npar):

> lav_partable_df(partable(fa.config))
[1] 106
> fitmeasures(fa.config, "df")
 df
120
> lav_partable_df(partable(fa.config.we.delta))
[1] 106
> fitmeasures(fa.config.we.delta, "df")
 df
106
> lav_partable_df(partable(fa.config.we.theta))
[1] 106
> fitmeasures(fa.config.we.theta, "df")
 df
106

kma...@aol.com

Sep 26, 2017, 9:39:52 AM
to lavaan
Giacomo & Felix,
This is an interesting thread, but the example is a lot to digest. Is the example with two latent variables and 12 observed indicators really minimal, in the sense that the problem does not occur with anything less? 66 parameters and 106 df are a lot to juggle by hand.

Have you tried inspecting the parameter tables directly?  The 'free' column counts the free parameters for you.  Can you isolate the parameters that you think that lavaan is counting incorrectly (wrongly omitting or wrongly including)?

lavaanify(configural.we.theta, ngroups=2, parameterization='theta')
lavaanify(configural.we.delta, ngroups=2, parameterization='delta')
lavaanify(configural.myt.theta, ngroups=2, parameterization='theta')
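Keith's suggestion, sketched for one of the three syntaxes (using the object names defined earlier in the thread):

```r
## Sketch: inspect a pre-fit parameter table for one syntax object.
PT <- lavaanify(configural.myt.theta, ngroups = 2, parameterization = "theta")

## isolate the free parameters, with their labels, for manual comparison
subset(PT, free > 0, select = c(lhs, op, rhs, group, label))

sum(PT$op == "==")   # explicit equality-constraint rows at the bottom
```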

Just to clarify the terminology, one would normally expect different parameterizations of the same model to share the same df count.  You seem to be expecting different df counts for each of your model syntax objects.  So, strictly speaking, is it safe to infer that you do not mean these as parameterizations of the same model?  Could you annotate the intended differences so that the reader can better understand how they impact parameter count and df count?

Keith
------------------------
Keith A. Markus
John Jay College of Criminal Justice, CUNY
http://jjcweb.jjay.cuny.edu/kmarkus
Frontiers of Test Validity Theory: Measurement, Causation and Meaning.
http://www.routledge.com/books/details/9781841692203/

Terrence Jorgensen

Sep 26, 2017, 11:51:50 AM
to lavaan
I was also struggling to follow this example (though I acknowledge that this topic is neither simple nor straightforward, regardless of the toy example). In the spirit of Keith's recommendation, I am posting a smaller example here, which is in fact reproducible because I set a seed before generating the data. I also generated the data from a factor model, so that it would be more likely to converge on a solution than the example generating independent vectors (no common factor).

I apply different thresholds to the same latent item responses to generate binary and 3-category data, as well as a mixture (2 of each type). I only fit models to the ternary data, to get the ball rolling. I noticed that in Felix's and in Giacomo's syntax for fitting Millsap & Yun-Tein's model, the residual variances were mistakenly fixed to 1 in all groups ("# ALL UNIQUE VARIANCES TO 1") rather than only in the first group. For binary data, only the marker variable's residual variance would also have to be fixed to 1 in the other groups. Here's a summary of the rules in Appendix A of Millsap & Yun-Tein's article:

Anyway, when I free those residual variances in the example below, and when I free the factor variances in group 1 (which the previous examples also failed to do for the MYT parameterization), the models all have the same df.  
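For anyone skimming, the key difference comes down to a single line of lavaan syntax per item.  A sketch, using a hypothetical ordinal indicator X1 in a two-group model under the theta parameterization (binary items follow different rules, per the Appendix A summary above):

```r
## residual-variance line for an ordinal (3+ category) item X1, two groups:
over.restrictive <- ' X1 ~~ c(1, 1)*X1 '   # fixed to 1 in BOTH groups
myt.rule         <- ' X1 ~~ c(1, NA)*X1 '  # fixed to 1 in group 1 only,
                                           # freely estimated in group 2
```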

As Felix & Giacomo point out, the df returned by lav_partable_df() is not consistent with the correct df reported by fitMeasures() when the model includes constraints, presumably because lav_partable_npar() returns the count of all free parameters without subtracting out the number of equality constraints at the bottom of the parameter table.
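To make that discrepancy concrete, here is a toy df calculation.  The counts below are hypothetical, chosen only to illustrate the relationship; they are not taken from any of the fitted models in this thread:

```r
## hypothetical counts for a two-group ordinal model:
ndat <- 60  # sample statistics (thresholds + polychoric correlations, both groups)
npar <- 50  # free-parameter rows in the parameter table
neq  <- 6   # non-redundant equality constraints among those parameters

ndat - npar        # what lav_partable_df() effectively returns: 10
ndat - npar + neq  # what fitMeasures(fit, "df") reports: 16
```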


N <- 1000
pop <- ' F1 =~ .6*X1 + .7*X2 + .8*X3 + .9*X4 '
## generate normal latent-response data
dat <- simulateData(pop, model.type = "cfa", sample.nobs = N,
                    standardized = TRUE, std.lv = TRUE, seed = 12345,
                    empirical = TRUE)
## apply thresholds for binary data
dat2 <- data.frame(lapply(dat, cut, breaks = c(-Inf, 0, Inf),
                          ordered_result = TRUE))
dat2$group <- 1:2
## apply thresholds for ternary data
dat3 <- data.frame(lapply(dat, cut, breaks = c(-Inf, -.5, .5, Inf),
                          ordered_result = TRUE))
dat3$group <- 1:2
## mixture of binary and ternary data
mix <- cbind(dat3[c("X1","X2")], dat2[c("X3","X4")])


#### fit models to 3-category data


# CONFIGURAL as in Wu Estabrook Condition 7

mod.c7.3 <- "
F1 =~ c(NA, NA)*X1 + X2 + X3 + X4

X1 | t1 + t2
X2 | t1 + t2
X3 | t1 + t2
X4 | t1 + t2

X1  ~ c(0, 0)*1
X2  ~ c(0, 0)*1
X3  ~ c(0, 0)*1
X4  ~ c(0, 0)*1

X1 ~*~ c(1, 1)*X1
X2 ~*~ c(1, 1)*X2
X3 ~*~ c(1, 1)*X3
X4 ~*~ c(1, 1)*X4

F1  ~~ c(1, 1)*F1
F1   ~ c(0, 0)*1
"


# CONFIGURAL as in Wu Estabrook Condition 8
mod.c8.3 <- "
F1 =~ c(NA, NA)*X1 + X2 + X3 + X4

X1 | t1 + t2
X2 | t1 + t2
X3 | t1 + t2
X4 | t1 + t2

X1  ~ c(0, 0)*1
X2  ~ c(0, 0)*1
X3  ~ c(0, 0)*1
X4  ~ c(0, 0)*1

X1 ~~ c(1, 1)*X1
X2 ~~ c(1, 1)*X2
X3 ~~ c(1, 1)*X3
X4 ~~ c(1, 1)*X4

F1 ~~ c(1, 1)*F1
F1  ~ c(0, 0)*1
"


# Millsap & Yun-Tein's (MYT) parameterization
mod.myt3 <- "
F1 =~ c(1, 1)*X1 + X2 + X3 + X4
X1 | c(t1_1, t1_1)*t1 + c(t2_1, t2_1)*t2
X2 | c(t1_2, t1_2)*t1 + t2

X3 | c(t1_3, t1_3)*t1 + t2
X4 | c(t1_4, t1_4)*t1 + t2

X1  ~ c(0, 0)*1
X2  ~ c(0, 0)*1
X3  ~ c(0, 0)*1
X4  ~ c(0, 0)*1

X1 ~~ c(1, NA)*X1
X2 ~~ c(1, NA)*X2
X3 ~~ c(1, NA)*X3
X4 ~~ c(1, NA)*X4

F1 ~~ c(NA, NA)*F1
F1  ~ c( 0, NA)*1
"


fit.c7.3 <- cfa(mod.c7.3, data = dat3, group = "group", parameterization = "delta")
fit.c8.3 <- cfa(mod.c8.3, data = dat3, group = "group", parameterization = "theta")
fit.myt3 <- cfa(mod.myt3, data = dat3, group = "group", parameterization = "theta")

WED <- partable(fit.c7.3)
WET <- partable(fit.c8.3)
MYT <- partable(fit.myt3) # does not subtract constraints

lav_partable_npar(WED)
lav_partable_npar(WET)
lav_partable_npar(MYT)

lav_partable_ndat(WED)
lav_partable_ndat(WET)
lav_partable_ndat(MYT)

lav_partable_df(WED)
lav_partable_df(WET)
lav_partable_df(MYT)         # wrong
fitMeasures(fit.myt3, "df")  # correct

WED #[WED$op %in% c("~~","~*~"), ]
WET #[WET$op %in% c("~~","~*~"), ]
MYT #[MYT$op %in% c("~~","~*~"), ]

Giacomo Mason

unread,
Sep 26, 2017, 12:22:13 PM9/26/17
to lavaan
Hey Keith,

thanks for your kind help and time.

@Terrence: your post came exactly as I was writing the below, so pardon the duplication. The issue was precisely that of the unique variances!

What I'm trying to do is indeed to compare equivalent parameterisations of the same model, following Wu and Estabrook (2016), WE henceforth.

Using the data and example code at Sunthud's webpage, I've finally figured out how to implement these parameterisations in lavaan:
  1. the MYT parameterisation (from Millsap and Yun-Tein, 2004)
  2. the WE delta parameterisation (condition 7 in WE)
  3. the WE theta parameterisation (condition 8 in WE)
  4. an alternative parameterisation similar to WE theta, which relaxes equality of latent means and variances by restricting the loading and one threshold on an item for each factor.
The code is attached, and follows below (using the dataset Sunthud provides):
library(lavaan)

dat5 <- read.csv("example5c.csv", header = FALSE)
colnames(dat5) <- c("u1", "u2", "u3", "u4", "u5", "u6", "u7", "u8", "g")
for(i in 1:8) dat5[,i] <- ordered(dat5[,i])
dat5[,9] <- factor(dat5[,9], labels = c("male", "female"))

configural5 <- "
f1 =~ c(1, 1)*u1 + u2 + u3 + u4
f2 =~ c(1, 1)*u5 + u6 + u7 + u8
u1 | c(t11, t11)*t1 + c(t12, t12)*t2 + t3 + t4
u2 | c(t21, t21)*t1 + t2 + t3 + t4
u3 | c(t31, t31)*t1 + t2 + t3 + t4
u4 | c(t41, t41)*t1 + t2 + t3 + t4
u5 | c(t51, t51)*t1 + c(t52, t52)*t2 + t3 + t4
u6 | c(t61, t61)*t1 + t2 + t3 + t4
u7 | c(t71, t71)*t1 + t2 + t3 + t4
u8 | c(t81, t81)*t1 + t2 + t3 + t4
f1 ~~ NA*f1
f2 ~~ NA*f2
f1 ~~ NA*f2
f1 ~ c(0, NA)*1
f2 ~ c(0, NA)*1
u1 ~~ c(1, NA)*u1
u2 ~~ c(1, NA)*u2
u3 ~~ c(1, NA)*u3
u4 ~~ c(1, NA)*u4
u5 ~~ c(1, NA)*u5
u6 ~~ c(1, NA)*u6
u7 ~~ c(1, NA)*u7
u8 ~~ c(1, NA)*u8
"

configural5.we.theta <- "
f1 =~ c(NA, NA)*u1 + u2 + u3 + u4
f2 =~ c(NA, NA)*u5 + u6 + u7 + u8
u1 | t1 + t2 + t3 + t4
u2 | t1 + t2 + t3 + t4
u3 | t1 + t2 + t3 + t4
u4 | t1 + t2 + t3 + t4
u5 | t1 + t2 + t3 + t4
u6 | t1 + t2 + t3 + t4
u7 | t1 + t2 + t3 + t4
u8 | t1 + t2 + t3 + t4
f1 ~~ c(1,1)*f1
f2 ~~ c(1,1)*f2
f1 ~~ NA*f2
f1 ~ c(0, 0)*1
f2 ~ c(0, 0)*1
u1 ~~ c(1, 1)*u1
u2 ~~ c(1, 1)*u2
u3 ~~ c(1, 1)*u3
u4 ~~ c(1, 1)*u4
u5 ~~ c(1, 1)*u5
u6 ~~ c(1, 1)*u6
u7 ~~ c(1, 1)*u7
u8 ~~ c(1, 1)*u8
"
configural5.we.theta.alt <- "
f1 =~ c(NA,NA)*u2 + c(l1, l1)*u1 + u3 + u4
f2 =~ c(NA,NA)*u6 + c(l5, l5)*u5 + u7 + u8
u1 | c(t11, t11)*t1 + t2 + t3 + t4
u2 | t1 + t2 + t3 + t4
u3 | t1 + t2 + t3 + t4
u4 | t1 + t2 + t3 + t4
u5 | c(t15, t15)*t1 + t2 + t3 + t4
u6 | t1 + t2 + t3 + t4
u7 | t1 + t2 + t3 + t4
u8 | t1 + t2 + t3 + t4
f1 ~~ c(1,NA)*f1
f2 ~~ c(1,NA)*f2
f1 ~~ NA*f2
f1 ~ c(0, NA)*1
f2 ~ c(0, NA)*1
u1 ~~ c(1, 1)*u1
u2 ~~ c(1, 1)*u2
u3 ~~ c(1, 1)*u3
u4 ~~ c(1, 1)*u4
u5 ~~ c(1, 1)*u5
u6 ~~ c(1, 1)*u6
u7 ~~ c(1, 1)*u7
u8 ~~ c(1, 1)*u8
"
configural5.we.delta <- "
f1 =~ c(NA, NA)*u1 + u2 + u3 + u4
f2 =~ c(NA, NA)*u5 + u6 + u7 + u8
u1 | t1 + t2 + t3 + t4
u2 | t1 + t2 + t3 + t4
u3 | t1 + t2 + t3 + t4
u4 | t1 + t2 + t3 + t4
u5 | t1 + t2 + t3 + t4
u6 | t1 + t2 + t3 + t4
u7 | t1 + t2 + t3 + t4
u8 | t1 + t2 + t3 + t4
f1 ~~ c(1,1)*f1
f2 ~~ c(1,1)*f2
f1 ~~ NA*f2
f1 ~ c(0, 0)*1
f2 ~ c(0, 0)*1
u1 ~*~ c(1, 1)*u1
u2 ~*~ c(1, 1)*u2
u3 ~*~ c(1, 1)*u3
u4 ~*~ c(1, 1)*u4
u5 ~*~ c(1, 1)*u5
u6 ~*~ c(1, 1)*u6
u7 ~*~ c(1, 1)*u7
u8 ~*~ c(1, 1)*u8
"
outConfigural5 <- cfa(configural5, data = dat5, group = "g", parameterization="theta", estimator="wlsmv")
outConfigural5.we.theta <- cfa(configural5.we.theta, data = dat5, group = "g", parameterization="theta", estimator="wlsmv")
outConfigural5.we.theta.alt <- cfa(configural5.we.theta.alt, data = dat5, group = "g", parameterization="theta", estimator="wlsmv")
outConfigural5.we.delta <- cfa(configural5.we.delta, data = dat5, group = "g", parameterization="delta", estimator="wlsmv")

## moreFitIndices() comes from the semTools package
c(df.partab = lav_partable_df(partable(outConfigural5)),
  fitMeasures(outConfigural5, c("df", "rmsea", "mfi", "cfi")),
  moreFitIndices(outConfigural5)["gammaHat"])
c(df.partab = lav_partable_df(partable(outConfigural5.we.theta)),
  fitMeasures(outConfigural5.we.theta, c("df", "rmsea", "mfi", "cfi")),
  moreFitIndices(outConfigural5.we.theta)["gammaHat"])
c(df.partab = lav_partable_df(partable(outConfigural5.we.theta.alt)),
  fitMeasures(outConfigural5.we.theta.alt, c("df", "rmsea", "mfi", "cfi")),
  moreFitIndices(outConfigural5.we.theta.alt)["gammaHat"])
c(df.partab = lav_partable_df(partable(outConfigural5.we.delta)),
  fitMeasures(outConfigural5.we.delta, c("df", "rmsea", "mfi", "cfi")),
  moreFitIndices(outConfigural5.we.delta)["gammaHat"])


I hope this can be useful to others (it took me ages to figure out, but I'm quite a noob with SEM).

It still puzzles me why lav_partable_df() returns different values than fitMeasures(, "df"), as you can see in the example... but maybe that's intended behaviour.

Also, my original issue of non-convergence using the MYT parameterisation on my original data remains. But at least now I can be sure I'm proceeding correctly if I study measurement invariance within the WE framework instead of MYT.

Thanks everyone for your help, it's been precious and useful for my research!

Giacomo
catInvariance5_configural.R
example5c.csv

Yves Rosseel

unread,
Sep 26, 2017, 1:53:56 PM9/26/17
to lav...@googlegroups.com, Giacomo Mason
> It still puzzles me why the lav_partable_df() function returns different
> values than fitmeasures(, "df"), as you can see in the example... But
> maybe it's intended behaviour

It is intended indeed.

lav_partable_df() does not take into account the number of
(non-redundant) equality constraints. For the latter, I need to compute
the rank of the 'constraints matrix', which is not stored in the
parameter table (the input of lav_partable_df()), but in the
@Model@con.jac slot.

This is the relevant code snippet in lavaan (in lav_test.R):

# degrees of freedom
df <- lav_partable_df(lavpartable)

# handle equality constraints (note: we ignore inequality constraints,
# active or not!)
# we use the rank of con.jac (even if the constraints are nonlinear)
if(nrow(lavmodel@con.jac) > 0L) {
    ceq.idx <- attr(lavmodel@con.jac, "ceq.idx")
    if(length(ceq.idx) > 0L) {
        neq <- qr(lavmodel@con.jac[ceq.idx, , drop = FALSE])$rank
        df <- df + neq
    }
}


The code of lav_partable_df() is simply:

lav_partable_df <- function(partable) {
    npar <- lav_partable_npar(partable)
    ndat <- lav_partable_ndat(partable)
    df <- ndat - npar
    as.integer(df)
}

To get the 'correct' df, there are two ways: either use fitMeasures(), or
use:

lavInspect(fit, "test")[[1]]$df


Yves.

Terrence Jorgensen

unread,
Sep 26, 2017, 2:45:16 PM9/26/17
to lavaan
Also, my original issue of non-convergence using the MYT parameterisation on my original data remains. 

I forgot to mention that in my last post.  Notice my example works with a large sample (N = 1000).  The example posted earlier had N = 100 (50 per group).  Try running the earlier example to simulate a much larger sample, and you will get convergence. Try running my example to simulate a much smaller sample, and you will notice convergence problems.  Oddly, convergence problems arise for one parameterization before another equivalent parameterization.  Not sure why, but I've noticed it before.  
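As a sketch of how to check this, re-using the `pop` and `mod.myt3` objects from my earlier post (the sample size of 100 is just an arbitrary "small" value, and I dropped `empirical = TRUE` for simplicity):

```r
## simulate a small sample and refit the MYT-parameterized model
datS <- simulateData(pop, model.type = "cfa", sample.nobs = 100,
                     standardized = TRUE, std.lv = TRUE, seed = 12345)
dat3S <- data.frame(lapply(datS, cut, breaks = c(-Inf, -.5, .5, Inf),
                           ordered_result = TRUE))
dat3S$group <- 1:2
fitS <- cfa(mod.myt3, data = dat3S, group = "group", parameterization = "theta")
lavInspect(fitS, "converged")  # check whether the optimizer converged
```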

Yves Rosseel

unread,
Sep 26, 2017, 3:18:19 PM9/26/17
to lav...@googlegroups.com
> convergence problems arise for one parameterization before another
> equivalent parameterization. Not sure why, but I've noticed it before.

Indeed. The theta parameterization is for some reason harder for the
optimizer than the delta parameterization. One day, I will understand
why, and fix it.

Yves.

Felix Fischer

unread,
Sep 26, 2017, 3:39:31 PM9/26/17
to lavaan
ah, great! i'll need to print this thread for future reference. :P

Giacomo Mason

unread,
Sep 27, 2017, 4:50:11 AM9/27/17
to lavaan
Thank you Yves, Terrence, and Felix! Keep up the great work, this group is an amazing resource.

Giacomo

kma...@aol.com

unread,
Sep 27, 2017, 9:11:31 AM9/27/17
to lavaan
Just a footnote for future reference:  Although she goes by Jenn informally, Jenn's full first name is Jenn-Yun and her last name is Tein.  So, for example, an APA formatted citation might read Millsap & Tein, 2004 and a reference might begin Millsap, R. E. & Tein, J.- Y. (2004)... 

Thanks everyone for an illuminating thread.

Terrence Jorgensen

unread,
Oct 9, 2017, 3:28:11 AM10/9/17
to lavaan
Although she goes by Jenn informally, Jenn's full first name is Jenn-Yun and her last name is Tein. 

Wow, I've been citing her incorrectly for a long time, all based on this one reference with a misplaced hyphen


Thanks Keith!

Terrence Jorgensen

unread,
Aug 26, 2018, 6:12:58 PM8/26/18
to lavaan
Anyone still following this thread might be interested to know there is a new function in the development version of semTools, which is more general and flexible than the previous invariance-testing functions.
