Questions about using lavaan with multiple imputations on longitudinal data with multiple groups


Blaž Rebernjak

Feb 27, 2020, 9:17:24 AM
to lavaan
Dear all,

We are trying to fit a CFA model with several latent factors measured by ordinal indicators at two time points and across two groups of participants. We have had some success specifying and running smaller models (for individual constructs), but have run into some issues when putting them all together. We have several questions, some theoretical, some practical:

1) Since we are using ordinal indicators, we are using the WLSMV estimator. We have found in the literature that this requires a sample size between 200 and 500. But how does this work with a multiple-group structure? Does it mean that we would need 500 participants per group?

2) Some pooled estimates have infinite degrees of freedom. Is this a problem, and what does it mean? How can p values even be calculated with infinite degrees of freedom?

3) How exactly are standardized solutions pooled across imputations? In our case factor loadings, for example, have the same values for standardized and non-standardized parameter estimates. This is not the case when evaluating models on individual imputed datasets.

4) Pooled outputs have fit indices in two columns; what is the difference between the two? The values reported by anova() correspond to the left column.

5) Our more complex measurement models (6 constructs, 2-5 indicators each) have some problems in estimating fit using the cfa.mi() function. Even though the model converged for the vast majority of imputations, we can't get a pooled estimate of model fit. We can get the coefficients by calling summary() with the fit.measures=FALSE argument, but every time we try to access the model fit we get the following error: Error in calculate.D2(w, DF, asymptotic = TRUE) : (list) object cannot be coerced to type 'double'

6) Finally, we would like to test for measurement invariance using the measEq.syntax() function to produce model descriptions. However, the documentation for this function is very scarce; is there some literature somebody can point us to for using this function? How do we access the model descriptions for various models generated by the function? Does this function test the models against each other, and how does it choose which parameter estimates to show? It says in the documentation "Optionally returns the fitted model (if data are provided) representing some chosen level of measurement equivalence/invariance across groups and/or repeated measures", but what does "chosen" mean here, and how do we choose?

Thank you very much for your time,

Blaž


Terrence Jorgensen

Feb 27, 2020, 4:40:45 PM
to lavaan
1) Since we are using ordinal indicators, we are using the WLSMV estimator. We have found in the literature that this requires a sample size between 200 and 500. But how does this work with a multiple-group structure? Does it mean that we would need 500 participants per group?

For asymptotic behaviors to kick in, yes, but the N required also depends on the size of your model and reliability of your indicators.  Models that estimate more parameters (which are closer to zero in the population) require more data for stability.

 
2) Some pooled estimates have infinite degrees of freedom. Is this a problem, and what does it mean? How can p values even be calculated with infinite degrees of freedom?

A t distribution with infinite df is the normal distribution.  The semTools package just rounds df > 9999 to Inf to prevent ugly displays (and because there is no practical difference between t and z statistics).
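For instance, a quick sketch in R showing the equivalence (purely illustrative; the t functions accept df = Inf):

t.stat <- 2.1
2 * pt(abs(t.stat), df = Inf, lower.tail = FALSE)  # t-based p value with df = Inf
2 * pnorm(abs(t.stat), lower.tail = FALSE)         # z-based p value; identical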
  
3) How exactly are standardized solutions pooled across imputations?

The same way as for complete data, but using pooled unstandardized estimates, which are standardized using pooled saturated-model estimates.

 
In our case factor loadings, for example, have the same values for standardized and non-standardized parameter estimates. This is not the case when evaluating models on individual imputed datasets.

Really?  If you are using parameterization="delta" and std.lv=TRUE, you should expect them to be the same, because in that case the factors and the latent item responses both have unit variance by definition.  If you think something is amiss, please provide a reproducible example.
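For instance, a minimal sketch of that check (borrowing the datCat example data from the semTools help page; any CFA with ordinal indicators would do):

library(lavaan)
library(semTools)  # only for the datCat example data

mod <- ' F1 =~ u1 + u2 + u3 + u4 '
fit <- cfa(mod, data = datCat, ordered = paste0("u", 1:4),
           std.lv = TRUE, parameterization = "delta")

## in this parameterization, unstandardized loadings equal the std.all loadings
subset(parameterEstimates(fit), op == "=~")
subset(standardizedSolution(fit), op == "=~")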

4) Pooled outputs have fit indices in two columns; what is the difference between the two? The values reported by anova() correspond to the left column.

As with complete data, the second column contains robust test statistics.  I am working on synchronizing the print methods with lavaan before the next version goes to CRAN.
  
5) Our more complex measurement models (6 constructs, 2-5 indicators each) have some problems in estimating fit using the cfa.mi() function. Even though the model converged for the vast majority of imputations, we can't get a pooled estimate of model fit. We can get the coefficients by calling summary() with the fit.measures=FALSE argument, but every time we try to access the model fit we get the following error: Error in calculate.D2(w, DF, asymptotic = TRUE) : (list) object cannot be coerced to type 'double'

Are you using the latest development version?

devtools::install_github("simsem/semTools/semTools")

If so, please provide a reproducible example (syntax and enough data to generate the error, so I can track down the issue).
 
6) Finally, we would like to test for measurement invariance using the measEq.syntax() function to produce model descriptions. However, the documentation for this function is very scarce; is there some literature somebody can point us to for using this function?

That's probably the most detailed help-page I have written, other than permuteMeasEq(), but I do agree a vignette is in order.  Here is a published tutorial, although I find it problematic and not very accurate:  https://doi.org/10.1080/10705511.2019.1602776

I am planning to write a better one after I make it through some major life events.
  
How do we access the model descriptions for various models generated by the function?

What is a "model description"?  Have you used the summary() or as.character() methods, as shown in the help-page examples?  What specific information would you like to access?

Does this function test the models against each other, and how does it choose which parameter estimates to show?

Again, see the help-page examples, which illuminate how to use the function.
 
It says in the documentation "Optionally returns the fitted model (if data are provided) representing some chosen level of measurement equivalence/invariance across groups and/or repeated measures", but what does "chosen" mean here, and how do we choose?

Using the group.equal= and long.equal= arguments.  But as the help page says, I do not recommend automatically fitting the model.  I recommend inspecting the syntax that the function writes, to verify you are fitting the model you intend to fit (or to learn how to specify the model using certain guidelines, controlled by the ID.* arguments).
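For instance, a minimal sketch of that recommended workflow, reusing the mod.cat and datCat example objects from the help page (the object names here are just illustrative):

## generate the syntax only (return.fit = FALSE is the default)
syn.config <- measEq.syntax(configural.model = mod.cat, data = datCat,
                            ordered = paste0("u", 1:8),
                            parameterization = "theta",
                            ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
                            group = "g")
summary(syn.config)            # overview of which constraints were imposed
cat(as.character(syn.config))  # inspect the lavaan model syntax itself

## once you are satisfied the model is what you intend, fit it yourself
fit.config <- cfa(as.character(syn.config), data = datCat,
                  ordered = paste0("u", 1:8),
                  parameterization = "theta", group = "g")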

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam
 

Blaž Rebernjak

Mar 5, 2020, 7:00:06 AM
to lavaan
First of all, thank you very much for a very detailed and informative response.

We have consulted the literature you mention, as well as some other papers discussing the subject, and feel we have a much better grasp of what's going on.

On Thursday, February 27, 2020 at 10:40:45 PM UTC+1, Terrence Jorgensen wrote:
1) Since we are using ordinal indicators, we are using the WLSMV estimator. We have found in the literature that this requires a sample size between 200 and 500. But how does this work with a multiple-group structure? Does it mean that we would need 500 participants per group?

For asymptotic behaviors to kick in, yes, but the N required also depends on the size of your model and reliability of your indicators.  Models that estimate more parameters (which are closer to zero in the population) require more data for stability.

Okay, thanks for the clarification.
 

 
2) Some pooled estimates have infinite degrees of freedom. Is this a problem, and what does it mean? How can p values even be calculated with infinite degrees of freedom?

A t distribution with infinite df is the normal distribution.  The semTools package just rounds df > 9999 to Inf to prevent ugly displays (and because there is no practical difference between t and z statistics).

Makes sense; we just figured that calculations involving Inf would be problematic.
 
  
3) How exactly are standardized solutions pooled across imputations?

The same way as for complete data, but using pooled unstandardized estimates, which are standardized using pooled saturated-model estimates.

 
In our case factor loadings, for example, have the same values for standardized and non-standardized parameter estimates. This is not the case when evaluating models on individual imputed datasets.

Really?  If you are using parameterization="delta" and std.lv=TRUE, you should expect them to be the same, because in that case the factors and the latent item responses both have unit variance by definition.  If you think something is amiss, please provide a reproducible example.

No, we are using theta. I couldn't come up with a reproducible example, though; I think it could be down to the way the latent variables were scaled, which differed between the cfa() and cfa.mi() cases. We will revisit this.
 

4) Pooled outputs have fit indices in two columns; what is the difference between the two? The values reported by anova() correspond to the left column.

As with complete data, the second column contains robust test statistics.  I am working on synchronizing the print methods with lavaan before the next version goes to CRAN.

Okay, makes sense. Although I wonder: what do robust estimates mean in the context of DWLS? In ML it compensates for departures from multivariate normality, right? Do you have any literature on this topic to recommend?
 
  
5) Our more complex measurement models (6 constructs, 2-5 indicators each) have some problems in estimating fit using the cfa.mi() function. Even though the model converged for the vast majority of imputations, we can't get a pooled estimate of model fit. We can get the coefficients by calling summary() with the fit.measures=FALSE argument, but every time we try to access the model fit we get the following error: Error in calculate.D2(w, DF, asymptotic = TRUE) : (list) object cannot be coerced to type 'double'

Are you using the latest development version?

devtools::install_github("simsem/semTools/semTools")

If so, please provide a reproducible example (syntax and enough data to generate the error, so I can track down the issue).

Unfortunately, I cannot install the dev version, as I get the following error when trying to run the above code (with the latest version of R):

Downloading GitHub repo simsem/semTools@master
Error: Failed to install 'semTools' from GitHub:
  missing value where TRUE/FALSE needed

A colleague of mine managed to do it on her computer, so it must be some machine-specific, odd R bug; the same thing happens for other people and with other packages as well.
 
 
6) Finally, we would like to test for measurement invariance using the measEq.syntax() function to produce model descriptions. However, the documentation for this function is very scarce; is there some literature somebody can point us to for using this function?

That's probably the most detailed help-page I have written, other than permuteMeasEq(), but I do agree a vignette is in order.  Here is a published tutorial, although I find it problematic and not very accurate:  https://doi.org/10.1080/10705511.2019.1602776

I am planning to write a better one after I make it through some major life events.
  
How do we access the model descriptions for various models generated by the function?

What is a "model description"?  Have you used the summary() or as.character() methods, as shown in the help-page examples?  What specific information would you like to access?

Does this function test the models against each other, and how does it choose which parameter estimates to show?

Again, see the help-page examples, which illuminate how to use the function.
 
It says in the documentation "Optionally returns the fitted model (if data are provided) representing some chosen level of measurement equivalence/invariance across groups and/or repeated measures", but what does "chosen" mean here, and how do we choose?

Using the group.equal= and long.equal= arguments.  But as the help page says, I do not recommend automatically fitting the model.  I recommend inspecting the syntax that the function writes, to verify you are fitting the model you intend to fit (or to learn how to specify the model using certain guidelines, controlled by the ID.* arguments).

We managed to miss that part of the documentation; it is indeed really well written, and the usage of the function is clear to us now. Thank you very much for writing it. It could probably be clearer for beginners if measEq.syntax() only created syntax and did not actually have the ability to test models, as this could lead to confusion and, as you say, it is not advisable to do so in any case.
 

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam
 


We still have some further questions, if you don't mind:

1) According to Liu, Millsap, West, Tein, Tanaka, and Grimm (2017), one threshold for each indicator should be constrained to equality across time even in the configural model. This is not done by the function automatically. Should we specify this by hand when supplying the configural fit object from which the specification of the configural model is derived? Or is this not necessary for scaling the latent variables?

2) When using the compareFit() function, what do the crosses next to some fit indices mean? Also, the chisq difference presented in the top comparison (the nested-model comparison) does not equal the difference between the actual chisq.scaled values presented in part two (the model fit indices); why is this the case?

3) When using the same values (thresholds and loadings) in both group.equal and long.equal, everything works fine, but when we try to constrain thresholds both across groups and across time while constraining loadings only across groups (or only across time), we get the following error:

Error in GLIST.specify[[1]]$lambda[, ff] : subscript out of bounds

The measEq.syntax() function should allow for different types of constraints across groups and across time, right?

4) With smaller models, we add some constraints to the base ("configural") model to make it identified (for example, constraining loadings across time) before supplying this fit object to measEq.syntax(). The function later overrides this, and the configural model it outputs does not contain these constraints. In the documentation it is written that configural.model is "A model with no measurement-invariance constraints (i.e., representing only configural invariance), unless required for model identification." Does this mean that they were not actually required for identification?

Thank you very much for your time.

Blaž

Terrence Jorgensen

Mar 21, 2020, 10:48:50 AM
to lavaan
No, we are using theta. I couldn't come up with a reproducible example, though; I think it could be down to the way the latent variables were scaled, which differed between the cfa() and cfa.mi() cases. We will revisit this.

cfa.mi() just passes any arguments down to cfa(), otherwise relying on the same defaults, so that is unlikely.

what do robust estimates mean in the context of DWLS? In ML it compensates for departures from multivariate normality, right? Do you have any literature on this topic to recommend? 

DWLS fits your normal-theory model to estimated sample moments for latent responses underlying ordered observations.  So the "data" (polychoric correlations) are estimated, not known.  The robust method adjusts for this by taking uncertainty of the estimated polychorics into account.
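For instance, a minimal sketch of what this looks like in lavaan output (again borrowing the datCat example data from the semTools help page; the exact values are not the point):

library(lavaan)
library(semTools)  # only for the datCat example data

mod <- ' F1 =~ u1 + u2 + u3 + u4 '
fit <- cfa(mod, data = datCat, ordered = paste0("u", 1:4), estimator = "WLSMV")

## naive DWLS statistic vs. its robust (mean- and variance-adjusted) counterpart
fitMeasures(fit, c("chisq", "df", "pvalue",
                   "chisq.scaled", "df.scaled", "pvalue.scaled"))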

 
Unfortunately, I cannot install the dev version, as I get the following error when trying to run the above code (with the latest version of R):

Downloading GitHub repo simsem/semTools@master
Error: Failed to install 'semTools' from GitHub:
  missing value where TRUE/FALSE needed

A colleague of mine managed to do it on her computer, so it must be some machine-specific, odd R bug; the same thing happens for other people and with other packages as well.

Try installing in a fresh R session, and first verify that nothing has been loaded beyond the base packages:

sessionInfo()

RStudio has some insidious hidden features like automatically saving workspaces and reloading them when RStudio is opened from that working directory.  This has the unfortunate side-effect of making R "remember" old crap you don't want anything to do with anymore.

It could probably be clearer for beginners if measEq.syntax() only created syntax and did not actually have the ability to test models, as this could lead to confusion,

That is why the default is return.fit=FALSE.  The option was included for the eventual design of a wrapper that automates the steps to test invariance (far out on the horizon).
 
1) According to Liu, Millsap, West, Tein, Tanaka, and Grimm (2017), one threshold for each indicator should be constrained to equality across time even in the configural model. This is not done by the function automatically. Should we specify this by hand when supplying the configural fit object from which the specification of the configural model is derived? Or is this not necessary for scaling the latent variables?

Read carefully -- their advice is merely Millsap & Tein's (2004) advice applied to longitudinal data.  So you can reproduce this behavior merely by setting ID.cat="millsap".
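For instance, a minimal sketch using the help-page example objects (mod.cat, datCat, longFacNames, shown in full further below); check the help page for how ID.cat interacts with ID.fac and parameterization:

measEq.syntax(configural.model = mod.cat, data = datCat,
              ordered = paste0("u", 1:8),
              parameterization = "theta", ID.cat = "millsap",
              group = "g", longFacNames = longFacNames)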
 
2) When using the compareFit() function, what do the crosses next to some fit indices mean?

Those indicate the model preferred by each index.

Also, the chisq difference presented in the top comparison (the nested-model comparison) does not equal the difference between the actual chisq.scaled values presented in part two (the model fit indices); why is this the case?

Because the robust chi-squared-difference test is not a difference between two robust statistics.  It applies (essentially) the same "robustification" method to the naive chi-squared-difference statistic as is applied to the overall fit statistic.
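For instance, a minimal sketch of where each number comes from (fit0 and fit1 here are hypothetical nested lavaan fits, e.g., configural vs. loadings-invariant models):

summary(compareFit(fit0, fit1))  # crosses mark the model preferred by each index

## the robust difference test reported in the nested-model comparison ...
lavTestLRT(fit0, fit1)
## ... is not simply the difference between the two scaled statistics
fitMeasures(fit1, "chisq.scaled") - fitMeasures(fit0, "chisq.scaled")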

3) When using the same values (thresholds and loadings) in both group.equal and long.equal, everything works fine, but when we try to constrain thresholds both across groups and across time while constraining loadings only across groups (or only across time), we get the following error:

Error in GLIST.specify[[1]]$lambda[, ff] : subscript out of bounds

The measEq.syntax() function should allow for different types of constraints across groups and across time, right?

Yes, and it works fine for the help-page example:

mod.cat <- ' FU1 =~ u1 + u2 + u3 + u4
             FU2 =~ u5 + u6 + u7 + u8 '

## the 2 factors are actually the same factor (FU) measured twice
longFacNames <- list(FU = c("FU1", "FU2"))

measEq.syntax(configural.model = mod.cat, data = datCat,
              ordered = paste0("u", 1:8),
              parameterization = "theta",
              ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
              group = "g", longFacNames = longFacNames,
              group.equal = c("thresholds", "loadings"),
              long.equal = "thresholds")

measEq.syntax(configural.model = mod.cat, data = datCat,
              ordered = paste0("u", 1:8),
              parameterization = "theta",
              ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
              group = "g", longFacNames = longFacNames,
              long.equal = c("thresholds", "loadings"),
              group.equal = "thresholds")


4) With smaller models, we add some constraints to the base ("configural") model to make it identified (for example, constraining loadings across time) before supplying this fit object to measEq.syntax(). The function later overrides this, and the configural model it outputs does not contain these constraints. In the documentation it is written that configural.model is "A model with no measurement-invariance constraints (i.e., representing only configural invariance), unless required for model identification." Does this mean that they were not actually required for identification?

No, although perhaps they were required for empirical identification, given your small N.  Did they not converge without those constraints?  If you need to assume additional constraints for identification, then I would recommend adding those identification constraints in the earliest step.  For example, if you have categorical data on which a configural model does not converge (even after trying other methods to help it along), constrain thresholds first.  Then, if that is necessary to achieve convergence, you have to treat that model as your least-constrained baseline model (in place of a configural model).
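For instance, a minimal sketch of such a thresholds-constrained baseline model (again reusing the help-page objects above; whether you need this at all depends on your own data):

## thresholds equated across groups and time as the least-constrained baseline,
## used only if a purely configural model will not converge
syn.base <- measEq.syntax(configural.model = mod.cat, data = datCat,
                          ordered = paste0("u", 1:8),
                          parameterization = "theta",
                          ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
                          group = "g", longFacNames = longFacNames,
                          group.equal = "thresholds",
                          long.equal = "thresholds")
cat(as.character(syn.base))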