Multigroup CFA with ordinal data in R

veren...@gmail.com

unread,

Aug 14, 2018, 4:28:29 AM8/14/18

to lavaan

Hi all,

I'm testing factorial invariance between groups (dichotomous data). I know that I have to take thresholds into account and not intercepts. But i don't know, if i did it in the right way?? My syntax for continous data (testing strong measurement invariance) looks like the following. As you can see it is partial measuremant (otherwise i had a high SRMR). Can I do it like this? Results of my fitindices: CFI=.99, RMSEA=.024, SRMR=.07

And what does this actually mean: T2_OD1a ~*~ T2_OD1a??

Thank you :)

# Strong Measurement Invariance

model.strong1<-"
Diagnostik=~c(1,1,1)*V1_OD1a+
    c(fl1,fl1,fl1)*V2_OD1a+
    c(fl2,fl2,fl2)*T1_OD1a+
    c(fl3,fl3,fl3)*T2_OD1a
Rückmeldung=~c(1,1,1)*V1_OR1a+
    c(fl4,fl4,fl4)*V2_OR1a+
    c(fl5,fl5,fl5)*T1_OR1a+
    c(fl6,fl7,fl6)*T2_OR1a

V1_OD1a |c(t11, t11, t11)*t1
V2_OD1a |c(t21, t21, t21)*t1
T1_OD1a |c(t31, t31, t31)*t1
T2_OD1a |c(t41, t41, t41)*t1

V1_OR1a |c(t51, t51, t51)*t1
V2_OR1a |c(t61, t61, t61)*t1
T1_OR1a |c(t71, t71, t71)*t1
T2_OR1a |c(t81, t81, t81)*t1

V1_OD1a    ~~    V1_OR1a
V1_OR1a    ~~    V2_OR1a
T2_OD1a    ~*~    T2_OD1a
T2_OR1a    ~*~    T2_OR1a
V1_OD1a    ~*~    V1_OD1a
V2_OD1a    ~*~    V2_OD1a

Diagnostik~c(lm1,lm2,lm3)*1
     lm1==0
Rückmeldung~c(lm4,lm5,lm6)*1
     lm4==0
"

fit.strong1<-sem(model.strong1,data=data.T1
                      ,group="Studigruppe",meanstructure=T, estimator="WLSMV", ordered = c("T1_OD1a, V1_OD1a, T2_OD1a, V2_OD1a, T1_OR1a, V1_OR1a, T2_OR1a, V2_OR1a"))
fit.strong1
(fitind.strong1<-fitmeasures(fit.strong1
                            ,c("SRMR", "RMSEA.scaled", "CFI.scaled")))
summary(fit.strong1,stand=T)

Terrence Jorgensen

unread,

Aug 14, 2018, 6:53:53 AM8/14/18

to lavaan

I know that I have to take thresholds into account and not intercepts

Well, that's an arbitrary software default. If you fix thresholds to zero and free the intercepts instead, the intercepts will be the same magnitude as the free thresholds were, but the opposite sign. Since you only have 1 threshold per item, there are actually many alternative parameterizations of the configural model that are equivalent.

https://doi.org/10.1007/s11336-016-9506-0

But i don't know, if i did it in the right way??

You can compare the default configural model to a model in which both loadings and thresholds are constrained to equality (and intercepts will be equal because they remain fixed to zero by default) to test strong invariance. You cannot separately test loadings because if that is a misspecification, it can be expressed as differences in thresholds.

And what does this actually mean: T2_OD1a ~*~ T2_OD1a??

That is the scaling factor, which is the SD of the latent item-response.

https://www.statmodel.com/download/webnotes/CatMGLong.pdf

If you want to be able to test strict invariance, you need to set parameterization = "theta" so that the residual variances are instead fixed to 1 by default.

Either way, when you constrain thresholds, you need to free either the residual variances (theta) or scaling factors (default delta parameterization) in all but the reference (first) group. Otherwise, you are testing strict invariance rather than merely strong invariance.

Delta parameterization:

T2_OD1a ~*~ c(1, NA, NA)*T2_OD1a

Theta parameterization:

T2_OD1a ~~ c(1, NA, NA)*T2_OD1a

To subsequently test strict invariance, you then equate the residual variances by fixing them all to 1 again, and comparing that to the strong-invariance model.

T2_OD1a ~~ c(1, 1, 1)*T2_OD1a

Terrence D. Jorgensen

Postdoctoral Researcher, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam

UvA web page: http://www.uva.nl/profile/t.d.jorgensen

veren...@gmail.com

unread,

Aug 16, 2018, 9:09:29 AM8/16/18

to lavaan

Thank you for your fast reply and your help!

Just to be sure... taking into account your advices. Does my syntax look like this??:

model.strong<-"
Diagnostik=~c(1,1,1)*V1_OD1a+
    c(fl1,fl1,fl1)*V2_OD1a+
    c(fl2,fl8,fl2)*T1_OD1a+
    c(fl3,fl7,fl3)*T2_OD1a
Rückmeldung=~c(1,1,1)*V1_OR1a+
    c(fl4,fl4,fl4)*V2_OR1a+
    c(fl5,fl5,fl5)*T1_OR1a+
    c(fl6,fl6,fl6)*T2_OR1a

V1_OD1a ~~ c(1,NA,NA)*V1_OD1a
T1_OD1a ~~ c(1,NA,NA)*T1_OD1a
V2_OD1a ~~ c(1,NA,NA)*V2_OD1a
T2_OD1a ~~ c(1,NA,NA)*T2_OD1a

V1_OR1a ~~ c(1,NA,NA)*V1_OR1a
T1_OR1a ~~ c(1,NA,NA)*T1_OR1a
V2_OR1a ~~ c(1,NA,NA)*V2_OR1a
T2_OR1a ~~ c(1,NA,NA)*T2_OR1a

Diagnostik~c(lm1,lm2,lm3)*1
     lm1==0
Rückmeldung~c(lm4,lm5,lm6)*1
     lm4==0
"

veren...@gmail.com

unread,

Aug 16, 2018, 10:17:57 AM8/16/18

to lavaan

Besides...when I try to compare two models by using "anova" i get the following error:

What does it mean?

Error in MASS::ginv(Delta1) %*% Delta0 : non-conformable arguments.

Michelle Degli Esposti

unread,

Aug 17, 2018, 10:14:16 AM8/17/18

to lav...@googlegroups.com

Hello all,

I was hoping someone might be able to help me with the following error. I’m trying to fit a second-order latent growth model using ordinal factors.

I’ve already established longitudinal invariance (i.e. unique factor invariance) and am now fitting the structural part of the model. Although this has worked for a complete case analysis and pairwise deletion, I’m getting funny errors when using lavaan.mi on a list of 20 imputed datasets (imputed via mice package).

# CONVERT MIDS OBJECT TO LIST OF IMPUTED DATAFRAMES

imputed_list <- list()

for(i in 1:20){

imputed_list[[i]] <- complete(imputed,action=i,include=FALSE)

}

# SPECIFY ORDINAL FACTORS

ordered_cat <- c("destruct_7", "fight_7", "disobedient_7",

"destruct_11", "fight_11", "disobedient_11",

"destruct_16", "fight_16", "disobedient_16")

as.character(ordered_cat)

# SPECIFY MODEL1

model1 <- '

# # Measurement model: Unique Factor Invariance

# Loadings (NB fix first factor loading to 1 for model identification)

latent_asb7 =~ destruct_7 + FightLoading*fight_7 + DisobedientLoading*disobedient_7

latent_asb11 =~ destruct_11 + FightLoading*fight_11 + DisobedientLoading*disobedient_11

latent_asb16 =~ destruct_16 + FightLoading*fight_16 + DisobedientLoading*disobedient_16

# Latent common factor means; fix asb7 to 0 for model identification

latent_asb7 ~ 0*1

latent_asb11 ~ 1

latent_asb16 ~ 1

# Latent common factor variances

latent_asb7 ~~ latent_asb7

latent_asb11 ~~ latent_asb11

latent_asb16 ~~ latent_asb16

# Constrain thresholds to be equal across measurement occasions

destruct_7 | destructt1*t1 + destructt2*t2

destruct_11 | destructt1*t1 + destructt2*t2

destruct_16 | destructt1*t1 + destructt2*t2

fight_7 | fightt1*t1 + fightt2*t2

fight_11 | fightt1*t1 + fightt2*t2

fight_16 | fightt1*t1 + fightt2*t2

disobedient_7 | disobedientt1*t1 + disobedientt2*t2

disobedient_11 | disobedientt1*t1 + disobedientt2*t2

disobedient_16 | disobedientt1*t1 + disobedientt2*t2

# Unique variances constrained to 1.00

destruct_7 ~~ 1*destruct_7

fight_7 ~~ 1*fight_7

disobedient_7 ~~ 1*disobedient_7

destruct_11 ~~ 1*destruct_11

fight_11 ~~ 1*fight_11

disobedient_11 ~~ 1*disobedient_11

destruct_16 ~~ 1*destruct_16

fight_16 ~~ 1*fight_16

disobedient_16 ~~ 1*disobedient_16

# Lagged unique factor covariances

destruct_7 ~~ destruct_11 + destruct_16

destruct_11 ~~ destruct_16

fight_7 ~~ fight_11 + fight_16

fight_11 ~~ fight_16

disobedient_7 ~~ disobedient_11 + disobedient_16

disobedient_11 ~~ disobedient_16

# # Structural model: Conditional Latent Growth Curve Model

# Loadings

i =~ 1*latent_asb7 + 1*latent_asb11 + 1*latent_asb16

s =~ 0*latent_asb7 + 4*latent_asb11 + 9*latent_asb16

# fixed effects

i ~ 0*1

s ~ 1

# random effects

i ~~ i

s ~~ s

i ~~ s

# regression

i + s ~ neglect6

'

# FIT MODEL1

fit. model1 <- lavaan.mi(model1, data = imputed_list,

ordered= ordered_cat,

parameterization = "theta",

estimator = "wlsmv",

auto.fix.first=TRUE)

1. When I first ran the above I got the following error, presumably because neglect6 is is a binary variable:

Error in lav_data_full(data = data, group = group, cluster = cluster, :

lavaan ERROR: unordered factor(s) detected; make them numeric or ordered: neglect6

2. I then changed ‘neglect6’ to an ordered factor using the below loop and got the following error

for(i in 1:20){

imputed_list[[i]]$neglect6 <- factor(imputed_list[[i]]$neglect6, ordered=TRUE, levels=c(0,1))

}

Error in DX1[th.idx > 0L, ] <- TAU :

number of items to replace is not a multiple of replacement length

In addition: Warning messages:

1: In th[th.idx > 0L] <- TAU[, 1L] :

number of items to replace is not a multiple of replacement length

2: In WLS.obs - WLS.est :

longer object length is not a multiple of shorter object length

3: In diff * diff * WLS.VD :

longer object length is not a multiple of shorter object length

4: In th[th.idx > 0L] <- TAU[, 1L] :

number of items to replace is not a multiple of replacement length

3. Finally I tried changing ‘neglect6’ to numeric and this gave me the below error:

Error in nlminb(start = start.x, objective = objective_function, gradient = GRADIENT, :

NA/NaN gradient evaluation

In addition: Warning message:

In muthen1984(Data = X[[g]], ov.names = ov.names[[g]], ov.types = ov.types, :

lavaan WARNING: trouble constructing W matrix; used generalized inverse for A11 submatrix

If I remove regression ‘i + s ~ neglect6’ the model works but if I replace the regression with another binary endogenous variable (i.e. sex) I get the same errors. I’ve checked the 20 imputed dataframes and they seem to be all correct and consistent. I have a large sample size (n=8088).

If anyone could shed light on the above I would be very grateful as I’m running out of ideas!

Many thanks,

Michelle

Terrence Jorgensen

unread,

Aug 18, 2018, 9:46:59 AM8/18/18

to lavaan

Does my syntax look like this??:

Looks right to me.

when I try to compare two models by using "anova" i get the following error:
What does it mean?
Error in MASS::ginv(Delta1) %*% Delta0 : non-conformable arguments.

That has to do with how the robust test statistic is calculated. Not sure what is causing the problem. Can you post enough data and syntax to reproduce this error? (including the other model you are comparing strong invariance to -- it should be the configural model, since you only have binary indicators)

Terrence Jorgensen

unread,

Aug 18, 2018, 10:26:06 AM8/18/18

to lavaan

2. I then changed ‘neglect6’ to an ordered factor

You don't want to do that. Exogenous categorical predictors must be represented by numeric dummy codes:

http://lavaan.ugent.be/tutorial/cat.html

Also, check the labels of the unordered factor, which should be levels you use when coercing it to ordered. If the labels were not "0" and "1", then the factor() function might have returned NAs.

3. Finally I tried changing ‘neglect6’ to numeric and this gave me the below error:

After you tried changing it to an ordered factor? Check an imputation, just in case it was an error in transformation.

table(imputed_list[[1]]$neglect6, useNA = "ifany")

If I remove regression ‘i + s ~ neglect6’ the model works but if I replace the regression with another binary endogenous variable (i.e. sex) I get the same errors.

Are they also factors, or are they numeric dummy codes?

I would try 3 things to try narrowing down the problem.

explicitly set fixed.x=FALSE (should be set automatically in runMI()
use cfa.mi() in case there is some default setting that should be turned on in lavaan()
try fitting the same model to the each imputed data set in a loop, using cfa()

for (i in 1:20) {
  fit.model1 <- try(cfa(model1, data = imputed_list[[i]], ordered= ordered_cat,
                        parameterization = "theta", estimator = "wlsmv"),
                    silent = TRUE)
  if (inherits(fit.model1, "try-error") {
    cat('imputation', i, 'failed \n')
  } else cat('imputation', i, 'succeeded\n')
}

Then you can see which imputations produce an error, and possibly figure out why.

If you can't narrow down the problem, please email me your R script and enough data to reproduce the problem (preferably send me your original and a few imputed data sets in files saved by readRDS(), just so I can see when lavaan.mi() fails).

Message has been deleted

veren...@gmail.com

unread,

Aug 19, 2018, 10:49:22 AM8/19/18

to lavaan

Terrence,

thank you for your support.

I think you are right. My problem could be caused by the comparing model...

When I use the following syntax, the summary contains intercepts and not thresholds. Perhaps something is missing in the syntax?

model.weak<-"

Diagnostik=~c(1,1,1)*V1_OD1a+
c(fl1,fl1,fl1)*V2_OD1a+
c(fl2,fl8,fl2)*T1_OD1a+
c(fl3,fl7,fl3)*T2_OD1a
Rückmeldung=~c(1,1,1)*V1_OR1a+
c(fl4,fl4,fl4)*V2_OR1a+
c(fl5,fl5,fl5)*T1_OR1a+
c(fl6,fl6,fl6)*T2_OR1a

V1_OR1a ~~ V2_OR1a "

fit.weak<-sem(model.weak,data=data.T1

,group="Studigruppe",meanstructure=T, estimator= "WLSMV", ordered = c("T1_OD1a, V1_OD1a, T2_OD1a, V2_OD1a, T1_OR1a, V1_OR1a, T2_OR1a, V2_OR1a"))

fit.weak
(fitind.weak<-fitmeasures(fit.weak

,c("SRMR", "RMSEA.scaled", "CFI.scaled")))

summary(fit.weak,stand=T)

Yves Rosseel

unread,

Sep 16, 2018, 2:24:52 PM9/16/18

to lav...@googlegroups.com

Do you still get this with a recent version of lavaan (0.6-2, or 0.6-3
dev)? If so, please provide a reproducible example that we can run.

Yves.

On 8/19/18 4:48 PM, veren...@gmail.com wrote:

> --
> You received this message because you are subscribed to the Google
> Groups "lavaan" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to lavaan+un...@googlegroups.com
> <mailto:lavaan+un...@googlegroups.com>.
> To post to this group, send email to lav...@googlegroups.com
> <mailto:lav...@googlegroups.com>.
> Visit this group at https://groups.google.com/group/lavaan.
> For more options, visit https://groups.google.com/d/optout.

Reply all

Reply to author

Forward