Longitudinal Measurement Invariance


Ye

Jun 21, 2022, 12:58:05 PM
to blavaan
Hi Blavaan group,

I am trying to test a longitudinal measurement invariance model with blavaan and couldn't find a good example online. The examples I found are for multi-group MI, where I can't specify covariances between the uniquenesses.

In Mplus, I can specify the CFA model and add approximate priors using the DIFF command; I am not sure how to do that in blavaan.

I searched the group discussions and tried the HolzingerSwineford1939 example, which fits a configural model first and uses Stan parameter names to specify the priors in the weak invariance model. Unfortunately, I got an error message: 'Error in FUN(X[[i]], ...) : Stan does not support NA (in lambda_y_mn) in data'.

I would appreciate it if anyone could share a good example of how to test longitudinal MI in blavaan.

Thank you!

Ed Merkle

Jun 21, 2022, 2:52:26 PM
to Ye, blavaan
Ye,

You are right that there is no good example online. I have been meaning to write one, but I also need a dataset that could be posted online.

For approximate invariance, you would want to use the "wiggle" argument, as shown at the bottom of this page:


If you send the model syntax, and the command that leads to the error message, maybe I could give more specific advice.

Ed
--
You received this message because you are subscribed to the Google Groups "blavaan" group.
To unsubscribe from this group and stop receiving emails from it, send an email to blavaan+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/blavaan/220b373a-4472-44c8-bed2-e8441fc38bb8n%40googlegroups.com.

Mauricio Garnier-Villarreal

Jun 22, 2022, 4:47:47 AM
to blavaan
Ye

As Ed mentioned, we need to write up a good example of these models for the website.

Here is some quick example code for three time points.

#### Configural invariance model
configural <-'
verbal1 =~ item11 + item12 + item13
verbal2 =~ item21 + item22 + item23
verbal3 =~ item31 + item32 + item33
'
fitConfig <- bcfa(configural,
                  data=datsim,
                  std.lv = TRUE,
                  n.chains = 3,
                  burnin=3000,
                  sample=2000
                  )

### Weak invariance ###
weak <-'
verbal1 =~ L11*item11 + L12*item12 + L13*item13
verbal2 =~ L11*item21 + L12*item22 + L13*item23
verbal3 =~ L11*item31 + L12*item32 + L13*item33

verbal1 ~~ 1*verbal1
verbal2 ~~ NA*verbal2
verbal3 ~~ NA*verbal3
'
fitWeak <- bcfa(weak,
                  data=datsim,
                  std.lv = TRUE,
                  n.chains = 3,
                  burnin=2000,
                  sample=2000
                )

##### Strong Invariance #################
strong <-'
verbal1 =~ L11*item11 + L12*item12 + L13*item13
verbal2 =~ L11*item21 + L12*item22 + L13*item23
verbal3 =~ L11*item31 + L12*item32 + L13*item33

##Latent variances
verbal1 ~~ 1*verbal1
verbal2 ~~ NA*verbal2
verbal3 ~~ NA*verbal3

## Intercepts

item11 ~ I11*1
item21 ~ I11*1
item31 ~ I11*1
item12 ~ I12*1
item22 ~ I12*1
item32 ~ I12*1
item13 ~ I13*1
item23 ~ I13*1
item33 ~ I13*1

## Latent means

verbal1 ~ 0*1
verbal2 ~ NA*1
verbal3 ~ NA*1

'
fitStrong <- bcfa(strong,
                  data=datsim,
                  std.lv = TRUE,
                  n.chains = 3,
                  burnin=3000,
                  sample=2000
             )


### weak approximate invariance using wiggle argument
weak <-'
verbal1 =~ L11*item11 + L12*item12 + L13*item13
verbal2 =~ L11*item21 + L12*item22 + L13*item23
verbal3 =~ L11*item31 + L12*item32 + L13*item33

verbal1 ~~ 1*verbal1
verbal2 ~~ NA*verbal2
verbal3 ~~ NA*verbal3
'
fitWeakWiggle <- bcfa(weak,
                  data=datsim,
                  std.lv = TRUE,
                  n.chains = 3,
                  burnin=2000,
                  sample=2000,
                  wiggle = c("L11", "L12", "L13"),
                  wiggle.sd = 0.01
                )

### Strong with wiggle
strong <-'
verbal1 =~ L11*item11 + L12*item12 + L13*item13
verbal2 =~ L11*item21 + L12*item22 + L13*item23
verbal3 =~ L11*item31 + L12*item32 + L13*item33

##Latent variances
verbal1 ~~ 1*verbal1
verbal2 ~~ NA*verbal2
verbal3 ~~ NA*verbal3

## Intercepts

item11 ~ I11*1
item21 ~ I11*1
item31 ~ I11*1
item12 ~ I12*1
item22 ~ I12*1
item32 ~ I12*1
item13 ~ I13*1
item23 ~ I13*1
item33 ~ I13*1

## Latent means

verbal1 ~ 0*1
verbal2 ~ NA*1
verbal3 ~ NA*1

'
fitStrongWiggle <- bcfa(strong,
                  data=datsim,
                  std.lv = TRUE,
                  n.chains = 3,
                  burnin=3000,
                  sample=2000,
                  target = "stan",
                  wiggle = c("I11", "I12", "I13"),
                  wiggle.sd = 0.01
             )
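If it helps, the fits above can be compared with blavaan's blavCompare(), which contrasts two fitted models via WAIC/LOO (a minimal sketch using the fit objects from this post):

```r
# Compare adjacent invariance levels; blavCompare() reports
# WAIC/LOO differences between the two fitted blavaan objects.
blavCompare(fitConfig, fitWeak)
blavCompare(fitWeak, fitStrong)
```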


Ye

Jun 22, 2022, 5:13:17 PM
to blavaan

Hi Dr. Garnier-Villarreal and Dr. Merkle,

Thank you very much for the information and for the example code. The code examples are extremely helpful! 

Thank you again!

Ye

Ye

Jul 25, 2022, 11:38:11 PM
to blavaan
Hi Dr. Garnier-Villarreal and Dr. Merkle,

Thank you for the help with the longitudinal measurement invariance question. 

I am trying to see how this approach can be extended to ordinal items, and have simulated a dataset of 20 ordered items (three categories) and 500 observations (4 time points, with 5 items at each time point).

I noticed that it takes an extremely long time to fit the configural model (~5 hours) and the null model (~18 hours). I only tested these two models, and neither converged with 3000 burn-in iterations.

I want to ask whether this computation time is reasonable for a model with 20 ordered items, and whether you have any suggestions on how to speed it up. Understanding that 3000 burn-in iterations are not enough, do you have a suggestion for the number of burn-in iterations?

Thank you very much!

Ed Merkle

Jul 26, 2022, 11:33:28 AM
to Ye, blavaan
Ye,

It is correct that the ordinal models take a long time, especially as the number of ordinal items increases. The functionality is relatively new, and I plan to continue to work on this to speed up estimation.

I am a little surprised that the model does not converge. I have often found that, if the model does not converge by 1000 burnin iterations, extra iterations will not help. If possible, I would be interested in looking at your model and data to see whether I can find any bugs.

Thanks,
Ed

Ye Feng

Jul 26, 2022, 11:46:08 AM
to Ed Merkle, blavaan
Thank you Dr. Merkle for the information!

Absolutely. I attached the data file and the model syntax I used here. As a follow-up question: I found that in Mplus, the DIFF option is not available for polytomous items. Do you mind letting me know whether blavaan supports wiggle room for threshold differences of polytomous items?

Thank you again for your help!

bfit.config <- bcfa(configural.cat, data = dat,
                    n.chains = 4, std.lv = TRUE,
                    target = 'stan',
                    burnin = 10000, sample = 5000,
                    bcontrol = list(cores = 4),
                    ordered = TRUE)

configural.cat <- '
  # Define the latent factors.
  F1 =~ 1*x11  + x12 + x13 + x14 + x15
  F2 =~ 1*x21  + x22 + x23 + x24 + x25
  F3 =~ 1*x31  + x32 + x33 + x34 + x35
  F4 =~ 1*x41  + x42 + x43 + x44 + x45

  # Thresholds: Three-category indicators have two thresholds
  x11|V1TH1*t1
  x21|V1TH1*t1
  x31|V1TH1*t1
  x41|V1TH1*t1
  x11|V1TH2*t2
  x21|V1TH2*t2
  x31|V1TH2*t2
  x41|V1TH2*t2
 
  x12|V2TH1*t1
  x22|V2TH1*t1
  x32|V2TH1*t1
  x42|V2TH1*t1
 
  x12|t2
  x22|t2
  x32|t2
  x42|t2
 
  x13|V3TH1*t1
  x23|V3TH1*t1
  x33|V3TH1*t1
  x43|V3TH1*t1
 
  x13|t2
  x23|t2
  x33|t2
  x43|t2
 
  x14|V4TH1*t1
  x24|V4TH1*t1
  x34|V4TH1*t1
  x44|V4TH1*t1
 
  x14|t2
  x24|t2
  x34|t2
  x44|t2
 
  x15|V5TH1*t1
  x25|V5TH1*t1
  x35|V5TH1*t1
  x45|V5TH1*t1
 
  x15|t2
  x25|t2
  x35|t2
  x45|t2
 
  # the unique factor variances of all items at the first measurement occasion were constrained to 1
  x11 ~~ 1*x11
  x12 ~~ 1*x12
  x13 ~~ 1*x13
  x14 ~~ 1*x14
  x15 ~~ 1*x15
 
  x21 ~~ x21
  x22 ~~ x22
  x23 ~~ x23
  x24 ~~ x24
  x25 ~~ x25
 
  x31 ~~ x31
  x32 ~~ x32
  x33 ~~ x33
  x34 ~~ x34
  x35 ~~ x35
 
  x41 ~~ x41
  x42 ~~ x42
  x43 ~~ x43
  x44 ~~ x44
  x45 ~~ x45
 
  # Unique factor covariances
  x11 ~~ x21 + x31 + x41
  x21 ~~ x31 + x41
  x31 ~~ x41
 
  x12 ~~ x22 + x32 + x42
  x22 ~~ x32 + x42
  x32 ~~ x42
 
  x13 ~~ x23 + x33 + x43
  x23 ~~ x33 + x43
  x33 ~~ x43
 
  x14 ~~ x24 + x34 + x44
  x24 ~~ x34 + x44
  x34 ~~ x44
 
  x15 ~~ x25 + x35 + x45
  x25 ~~ x35 + x45
  x35 ~~ x45
 
  # At one measurement occasion, the common factor mean is constrained at 0
  F1 ~ 0*1
  F2 ~ 1
  F3 ~ 1
  F4 ~ 1
 
  # Latent Variable Variances and Covariance
  F1 ~~ F1
  F2 ~~ F2
  F3 ~~ F3
  F4 ~~ F4
 
  F1 ~~ F2 + F3 + F4
  F2 ~~ F3 + F4
  F3 ~~ F4
'
--
Ye Feng
Psychometrics and Quantitative Psychology Ph.D. Student
Fordham University
441 East Fordham Road
Bronx, NY 10458
Mobile: (917) 601-8112
dat.csv

Ed Merkle

Jul 26, 2022, 4:35:46 PM
to Ye Feng, blavaan
I believe that wiggle should work with thresholds, though that might not be helpful until the model speeds up.

I have looked at it a little bit and wonder whether it would run faster as a 4-group model. Basically, you would rearrange the data to have five columns (x1 to x5), then add a sixth column "time" to indicate which time point each row belongs to. I think this might make it easier to specify equality constraints, and the model would then deal with a series of 5x5 covariance matrices instead of a single 20x20 matrix.
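A base-R sketch of that rearrangement, assuming the x11..x45 naming from the model syntax (first digit = time point, second digit = item):

```r
# Reshape the 20 wide columns (x11..x45) into long format:
# one row per person-by-time, with item columns x1..x5 plus "time".
dat$id <- seq_len(nrow(dat))
long <- reshape(dat, direction = "long",
                idvar = "id", timevar = "time",
                varying = list(paste0("x", 1:4, 1),  # x11, x21, x31, x41 -> x1
                               paste0("x", 1:4, 2),
                               paste0("x", 1:4, 3),
                               paste0("x", 1:4, 4),
                               paste0("x", 1:4, 5)),
                v.names = paste0("x", 1:5))
```

The 4-group fit would then use group = "time" in the bcfa() call.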

Speed might still be an issue after this change, but I think it might help.

Ed

Mauricio Garnier-Villarreal

Jul 27, 2022, 6:27:58 AM
to blavaan
Ye

I think the model might not be converging because you are mixing identification methods: you are using both the marker variable and the fixed variance approaches. When you specify std.lv=TRUE, blavaan uses the fixed factor variance method of identification. But you are also adding the marker variable method with constraints like the fixed first-indicator factor loadings, plus your additional constraints:

F1 =~ 1*x11 + x12 + x13 + x14 + x15
F2 =~ 1*x21 + x22 + x23 + x24 + x25
F3 =~ 1*x31 + x32 + x33 + x34 + x35
F4 =~ 1*x41 + x42 + x43 + x44 + x45

In general, I recommend using labels and constraints for tests like measurement invariance, but letting blavaan handle the identification constraints, e.g., by using std.lv=TRUE.

I would say that this should be the configural model; it was 30% faster in a small test run, and almost converged even with just a few iterations:

configural.cat <- '
  # Define the latent factors.
  F1 =~ x11  + x12 + x13 + x14 + x15
  F2 =~ x21  + x22 + x23 + x24 + x25
  F3 =~ x31  + x32 + x33 + x34 + x35
  F4 =~ x41  + x42 + x43 + x44 + x45


  # Unique factor covariances
  x11 ~~ x21 + x31 + x41
  x21 ~~ x31 + x41
  x31 ~~ x41
 
  x12 ~~ x22 + x32 + x42
  x22 ~~ x32 + x42
  x32 ~~ x42
 
  x13 ~~ x23 + x33 + x43
  x23 ~~ x33 + x43
  x33 ~~ x43
 
  x14 ~~ x24 + x34 + x44
  x24 ~~ x34 + x44
  x34 ~~ x44
 
  x15 ~~ x25 + x35 + x45
  x25 ~~ x35 + x45
  x35 ~~ x45
'



bfit.config <- bcfa(configural.cat, data = dat,
                    n.chains = 3, std.lv = TRUE,
                    target = 'stan',
                    burnin = 1000, sample = 1000,
                    bcontrol = list(cores = 3),
                    ordered = TRUE)

Ye

Jul 28, 2022, 10:55:50 AM
to blavaan
Thank you Dr. Garnier-Villarreal and Dr. Merkle for such helpful information! It was a problem with how I specified the model identification, and the convergence issue is solved!

I noticed that the BSEM fit indices (BRMSEA, BCFI, BTLI, ...) are not available for ordinal items. Do you have any suggestions on how to obtain these BSEM indices for models with ordinal items?

Thank you again for your help!

Ye

Mauricio Garnier-Villarreal

Jul 29, 2022, 8:18:15 AM
to blavaan
Ye

The BSEM fit index formulas so far only work with continuous indicators. We have a couple of ideas on how to make them work with categorical indicators, but these still need testing and development, so there is no solution for now.

Ye

Aug 22, 2022, 11:18:14 PM
to blavaan
Hi Dr. Garnier-Villarreal and Dr. Merkle,

Thank you very much for your help with my previous questions!

I encountered more issues while testing the approximate invariance with ordered categorical items.

To reduce the computation time, I treated the longitudinal measurement invariance test as a multi-group one and converted the data into long format. I used the lavaan default method for model identification, and the models converged with the WLSMV estimator. Specifically,

1. Configural model: loadings and thresholds are freely estimated in all groups, and the latent item responses are constrained to variances of 1 and intercepts of 0 in all groups. The common-factor variances are set to 1 and the means to 0 in all groups.
2. Loading invariance model: loadings are constrained equal across groups; thresholds are freely estimated in all groups. The latent item responses are constrained to variances of 1 and intercepts of 0. The common-factor variances are constrained to 1 in the first group and freely estimated in the other groups, and the common-factor means are constrained to 0 in all groups.
3. Threshold invariance model: loadings and thresholds are constrained equal across groups. The latent item variances are fixed to 1 in the first group and freely estimated in the other groups. The common-factor mean and variance are fixed to 0 and 1 in the first group and freely estimated in the other groups.

To test approximate loading invariance, I added the item loading labels to the wiggle statement, and the models converged across the different datasets I simulated.
However, when I further tested the exact loading invariance + approximate threshold invariance model, by adding the threshold labels to the wiggle statement, the models failed to converge.
Similarly, when testing the approximate loading + threshold invariance model, the models failed to converge. Moreover, some of the models produced an error message: Error in validate_ll(x) : All input values must be finite.

I understand that treating longitudinal data as multi-group data introduces a definite source of model misspecification that may cause the convergence issue. However, when I tested the approximate threshold invariance model using the wide-format data, I still had convergence issues with the approximate threshold invariance model and the approximate loading + threshold invariance model.

Do you have any suggestions on correctly testing approximate threshold invariance with ordered categorical indicators? E.g., specific model identification methods, increasing the burn-in sample, or increasing/decreasing the wiggle SD?

 I attached the syntax and data for your reference. Thank you very much!
long.csv
model_syntax.R

Mauricio Garnier-Villarreal

Aug 23, 2022, 6:06:40 AM
to blavaan
Ye

When testing measurement invariance with categorical indicators, you should test thresholds first, then loadings, and then intercepts. This way, the threshold model defines a first-order measurement model between the categorical responses and the underlying factor.

See

Wu, H., & Estabrook, R. (2016). Identification of confirmatory factor analysis models of different levels of invariance for ordered categorical outcomes. Psychometrika, 81(4), 1014–1045. doi: 10.1007/s11336-016-9506-0

When you say that it doesn't converge, what do you mean? After how many iterations? How high is the R-hat? When you add the wiggle, the model commonly needs more iterations.

Sorry, I can't read the code in detail right now.

Ed Merkle

Aug 23, 2022, 11:57:27 AM
to Ye, blavaan
Ye,

I tried a short run of bfit.scalar (100 burnin/100 sample) and got convergence. I suspect the problem could be related to std.lv=TRUE paired with uninformative priors on loadings. You might consider more informative priors on the loadings. For example, if all items are known to be in the same direction, you might use something like normal(1, .33) there. I generally think that normal(0, big SD) is not best for loadings, because people almost always know the intended direction of the items.

Also, for groups 2-4, the variance of each observed variable is free. These priors default to gamma(1,.5) (on SD, not variance), which again might be too uninformative and combine with loadings to cause problems. Because the SDs are fixed to 1 in group 1, I would opt for priors that place more density around 1, like maybe

dp = dpriors(lambda = "normal(1,.33)", theta = "gamma(4,4)[sd]")

Also, for the wiggle argument, you can shorten it to something like 

wiggle = c("loadings", "thresholds")
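Putting the prior and wiggle suggestions together, a sketch of the full call might look like this (the model name "scalar.cat" is an illustrative placeholder, not from the thread; wiggle.sd reuses the 0.01 value from the earlier examples):

```r
# Sketch only: combines the informative priors and the shortened
# wiggle specification suggested above. "scalar.cat" is a placeholder
# for the user's loading + threshold invariance model.
dp <- dpriors(lambda = "normal(1,.33)", theta = "gamma(4,4)[sd]")

bfit.scalar <- bcfa(scalar.cat, data = dat, group = "time",
                    std.lv = TRUE, target = "stan", ordered = TRUE,
                    n.chains = 3, burnin = 1000, sample = 1000,
                    dp = dp,
                    wiggle = c("loadings", "thresholds"),
                    wiggle.sd = 0.01)
```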

Hope it helps,
Ed

Ye

Aug 26, 2022, 12:04:44 PM
to blavaan
Thank you Dr. Garnier-Villarreal and Dr. Merkle for your suggestions! They are really helpful.

I am running the models with simulated data, including both invariant and non-invariant populations. I recorded the number of models that had a maximum R-hat larger than 1.1, with 1000 burn-in and 1000 sample iterations. Based on the results so far, nearly all of the exact loading + approximate threshold invariance models and the approximate loading + threshold invariance models had a maximum R-hat larger than 1.1, across conditions and replications.

Unfortunately, I only recorded the model fit indices, so it is difficult to trace back the largest R-hat of the non-converged models. I do get the warning messages 'The largest R-hat is 2.33, indicating chains have not mixed.' and 'blavaan WARNING: Small effective sample sizes (< 100) for some parameters.'

I tried adding informative priors to an example dataset that had problems before, and it helped! I will incorporate this into the code and increase the number of sampling iterations in my further simulations. As a follow-up question: the error message "Error in validate_ll(x) : All input values must be finite." is also caused by model non-convergence, right?

Besides, I tried following the Wu & Estabrook approach to testing threshold invariance, and the df of the baseline model and the threshold invariance model are the same. Does that suggest the threshold invariance model for three-category ordered indicators is not testable? If so, is the current identification approach (lavaan default) the right way to go, or is it not recommended?

Thank you again for all your help!

Best regards,
Ye

Mauricio Garnier-Villarreal

Aug 28, 2022, 12:05:48 PM
to blavaan
Ye

When you fit the threshold model you should not have the same df.

In the configural model you have the "theta" parameterization; nothing is fixed equal between groups/time points.
Threshold model: fix the thresholds equal between groups/time points, and freely estimate the intercepts and variances of the indicators in group 2 (keeping fixed the ones for group 1).

My guess is that you fixed the thresholds but didn't free the intercepts and variances.

Look at this quick example from the semTools package



library(semTools)

?measEq.syntax

mod.cat <- ' FU1 =~ u1 + u2 + u3 + u4'

syntax.config <- measEq.syntax(configural.model = mod.cat, data = datCat,
                               ordered = paste0("u", 1:4),
                               parameterization = "theta",
                               ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
                               group = "g")

mod.config <- as.character(syntax.config)
fit.config <- cfa(mod.config, data = datCat, group = "g",
                  ordered = paste0("u", 1:4), parameterization = "theta")
summary(fit.config)

## threshold invariance
syntax.thr <- measEq.syntax(configural.model = mod.cat, data = datCat,
                               ordered = paste0("u", 1:4),
                               parameterization = "theta",
                               ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
                               group = "g",
                               group.equal = c("thresholds"))
summary(syntax.thr)                    # summarize model features
mod.thr <- as.character(syntax.thr) # save as text
cat(mod.thr)                           # display the model syntax

fit.thr <- cfa(mod.thr, data = datCat, group = "g",
                  ordered = paste0("u", 1:4), parameterization = "theta")
summary(fit.thr)

anova(fit.config, fit.thr)