First of all, thank you very much for a very detailed and informative response.
We have consulted the literature you mention, and some other papers discussing the subject, and feel we have a much better grasp of what's going on.
On Thursday, February 27, 2020 at 10:40:45 PM UTC+1, Terrence Jorgensen wrote:
1) Since we are using ordinal indicators, we are using the WLSMV estimator. We have found in the literature that this requires a sample size between 200 and 500. But how does this work with a multigroup structure? Does it mean that we would need 500 participants per group?
For asymptotic behaviors to kick in, yes, but the N required also depends on the size of your model and reliability of your indicators. Models that estimate more parameters (which are closer to zero in the population) require more data for stability.
Okay, thanks for the clarification.
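For context, our basic setup on a single dataset looks roughly like this (the construct, indicator, and grouping names below are placeholders, not our actual variables):

library(lavaan)
## hypothetical two-factor model with ordinal indicators
mod <- '
  f1 =~ y1 + y2 + y3
  f2 =~ y4 + y5 + y6
'
fit <- cfa(mod, data = dat,                  # dat = one complete (or imputed) data set
           ordered   = paste0("y", 1:6),     # declare the indicators as ordinal
           estimator = "WLSMV",              # DWLS with mean- and variance-adjusted test statistic
           group     = "country")            # placeholder grouping variable
summary(fit, fit.measures = TRUE, standardized = TRUE)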
2) Some pooled estimates have infinite degrees of freedom. Is this a problem, and what does it mean? How can p-values even be calculated with infinite degrees of freedom?
A t distribution with infinite df is the normal distribution. The semTools package just rounds df > 9999 to Inf to prevent ugly displays (and because there is no practical difference between t and z statistics).
Makes sense; we just figured that calculating anything with Inf would be problematic.
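Indeed, a quick check in base R shows there is nothing problematic about it, since pt() accepts df = Inf directly:

2 * pt(-1.96, df = Inf)   # two-sided p-value from a t distribution with infinite df
2 * pnorm(-1.96)          # identical result: t with df = Inf is the standard normal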
3) How exactly are standardized solutions pooled across imputations?
The same way as for complete data, but using pooled unstandardized estimates, which are standardized using pooled saturated-model estimates.
In our case the factor loadings, for example, have the same values for standardized and unstandardized parameter estimates. This is not the case when evaluating models on the individual imputed datasets.
Really? If you are using parameterization="delta" and std.lv=TRUE, you should expect them to be the same, because they are by definition. If you think something is amiss, please provide a reproducible example.
No, we are using theta, but I couldn't come up with a reproducible example. I think it could be down to the way the latents were scaled, which differed between the cfa() and cfa.mi() cases; we will revisit this.
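For reference, this is roughly how we compare the two solutions on a single dataset (a sketch with placeholder names; mod and dat are the hypothetical model and data from the sketch above):

library(lavaan)
fit <- cfa(mod, data = dat, ordered = paste0("y", 1:6), estimator = "WLSMV",
           std.lv = TRUE, parameterization = "theta")
## unstandardized vs. fully standardized loadings, side by side
subset(parameterEstimates(fit),   op == "=~")[, c("lhs", "rhs", "est")]
subset(standardizedSolution(fit), op == "=~")[, c("lhs", "rhs", "est.std")]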
4) Pooled outputs have fit indices in two columns; what is the difference between the two? The values reported by anova() correspond to the left column.
As with complete data, the second column contains robust test statistics. I am working on synchronizing the print methods with lavaan before the next version goes to CRAN.
Okay, makes sense. Although I wonder: what do robust statistics mean in the context of DWLS? In ML they compensate for departures from multivariate normality, right? Do you have any literature on this topic to recommend?
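For what it's worth, on single datasets we have been pulling both versions like this (a sketch; fit is the WLSMV-estimated lavaan object from the sketch above):

fitMeasures(fit, c("chisq", "chisq.scaled",
                   "df",    "df.scaled",
                   "cfi",   "cfi.scaled",
                   "rmsea", "rmsea.scaled"))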
5) Our more complex measurement models (6 constructs, 2-5 indicators each) have some problems in estimating fit using the cfa.mi() function. Even though the model converged for the vast majority of imputations, we can't get a pooled estimate of model fit. We can get the coefficients by calling summary() with the fit.measures=FALSE argument, but every time we try to access the model fit we get the following error:
Error in calculate.D2(w, DF, asymptotic = TRUE) : (list) object cannot be coerced to type 'double'
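For completeness, the calls on our end look roughly like this (model syntax and data omitted; imps is our list of imputed data frames and ord.items the vector of ordinal indicator names):

library(semTools)
fit.mi <- cfa.mi(mod, data = imps, estimator = "WLSMV", ordered = ord.items)
summary(fit.mi, fit.measures = FALSE)   # pooled coefficients: works
summary(fit.mi, fit.measures = TRUE)    # pooled fit: throws the calculate.D2() error above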
Are you using the latest development version?
devtools::install_github("simsem/semTools/semTools")
If so, please provide a reproducible example (syntax and enough data to generate the error, so I can track down the issue).
Unfortunately, I cannot install the dev version, as I get the following error when trying to run the above code (with the latest version of R):
Downloading GitHub repo simsem/semTools@master
Error: Failed to install 'semTools' from GitHub:
missing value where TRUE/FALSE needed
A colleague of mine managed to install it on her computer, so it must be some specific/weird R bug, because it also happens for other people and with other packages.
6) Finally, we would like to test the invariances using the measEq.syntax() function to produce model descriptions. However, the documentation for this function is very scarce; is there some literature somebody can point us to for using this function?
That's probably the most detailed help-page I have written, other than permuteMeasEq(), but I do agree a vignette is in order. Here is a published tutorial, although I find it problematic and not very accurate:
https://doi.org/10.1080/10705511.2019.1602776
I am planning to write a better one after I make it through some major life events.
How do we access the model descriptions for various models generated by the function?
What is a "model description"? Have you used the summary() or as.character() methods, as shown in the help-page examples? What specific information would you like to access?
Does this function test the models against each other? How does it choose which parameter estimates to show?
Again, see the help-page examples to illuminate how to use the function.
It says in the documentation "Optionally returns the fitted model (if data are provided) representing some chosen level of measurement equivalence/invariance across groups and/or repeated measures", but what does "chosen" mean here? How do we choose?
Using the group.equal= and long.equal= arguments. But as the help page says, I do not recommend automatically fitting the model. I recommend inspecting the syntax that the function writes, to verify you are fitting the model you intend to fit (or to learn how to specify the model using certain guidelines, controlled by the ID.* arguments).
We managed to miss that part of the documentation; it is indeed really well written, and the usage of the function is clear to us now. Thank you very much for writing it. It could probably be clearer for beginners if measEq.syntax() only created syntax and did not actually have the ability to fit models, as this could lead to confusion and, as you say, is not advisable in any case.
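For anyone else finding this thread, the workflow we settled on looks roughly like this (a sketch with placeholder model, data, and variable names; our real models are larger):

library(semTools)
syn <- measEq.syntax(configural.model = mod,            # configural CFA syntax
                     data = dat, group = "country",
                     ordered = paste0("y", 1:6),
                     parameterization = "theta",
                     ID.fac = "std.lv",                  # how the factors are identified
                     ID.cat = "Wu.Estabrook.2016",       # identification constraints for thresholds
                     group.equal = c("thresholds", "loadings"),
                     return.fit = FALSE)                 # only write the syntax, do not fit
summary(syn)                    # overview of the constraints that were imposed
cat(as.character(syn))          # inspect the generated lavaan model syntax
fit <- cfa(as.character(syn), data = dat, group = "country",
           ordered = paste0("y", 1:6), parameterization = "theta")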
Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam
We still have some further questions, if you don't mind:
1) According to Liu, Millsap, West, Tein, Tanaka, and Grimm (2017), one threshold for each indicator in configural models should be constrained to equality across time. This is not done by the function automatically. Should we specify this by hand when supplying the configural fit object from which the specification of the configural model is derived? Or is this not necessary for scaling the latents?
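To make this concrete, we mean adding constraints by hand along these lines (a hypothetical indicator measured at two waves; the shared label imposes equality of the first threshold over time):

mod.config <- '
  f_t1 =~ y1_t1 + y2_t1 + y3_t1
  f_t2 =~ y1_t2 + y2_t2 + y3_t2
  y1_t1 | thr1*t1
  y1_t2 | thr1*t1
'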
2) When using the compareFit() function, what do the crosses (daggers) next to some fit indices mean? Also, the difference in chisq presented in the top comparison (nested model comparison) does not equal the difference between the actual chisq.scaled values presented in part two (model fit indices); why is this the case?
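For context, we are calling it roughly like this (the fit objects are placeholders for our fitted invariance models):

library(semTools)
out <- compareFit(fit.config, fit.thresh, fit.metric, nested = TRUE)
summary(out)   # part one: nested model comparisons; part two: fit indices per model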
3) When using the same values (thresholds and loadings) in both group.equal and long.equal, everything works fine, but when trying to keep thresholds the same for both while constraining loadings only across groups (or only across time), we get the following error:
Error in GLIST.specify[[1]]$lambda[, ff] : subscript out of bounds
The measEq.syntax() function should allow for different types of constraints across groups and across time, right?
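Concretely, a call of roughly this shape triggers the error (placeholder names again; the longitudinal factor names are hypothetical):

syn <- measEq.syntax(configural.model = mod, data = dat,
                     group = "country", ordered = paste0("y", 1:6),
                     parameterization = "theta",
                     longFacNames = list(F = c("f_t1", "f_t2")),     # same factor at two waves
                     group.equal = c("thresholds", "loadings"),      # loadings equal across groups ...
                     long.equal  = "thresholds")                     # ... but only thresholds across time
## Error in GLIST.specify[[1]]$lambda[, ff] : subscript out of bounds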
4) With smaller models, we add some constraints to the base ("configural") model to make it identified (for example, loadings across time) before supplying this fit object to measEq.syntax(). The function later overrides this, and the configural model it outputs does not contain these constraints. The documentation says that configural.model is "A model with no measurement-invariance constraints (i.e., representing only configural invariance), unless required for model identification." Does this mean that they were not actually required for identification?
Thank you very much for your time.
Blaž