Hi everyone,
I am running a confirmatory factor analysis on ordinal survey data (response range: 1-4). Because some data are missing, I am using semTools::runMI to generate multiply imputed datasets and pool the results with Rubin's rules. However, as you can see below, the pooled goodness-of-fit indices do not make much sense (CFI = 1, TLI > 1, RMSEA = 0, SRMR = 0.09), especially compared with the indices estimated from each imputed dataset separately (CFI between 0.909 and 0.925, RMSEA between 0.067 and 0.076). Does anybody know how to fix this issue? I thought I could average the goodness-of-fit indices across the imputed datasets, but there might be a better solution.
Separately, I wonder whether there is an easy way to pass auxiliary variables (e.g., gender, linguistic background) to the runMI function so that they inform the imputation.
R syntax:
ordered2 <- scaled_db_lib |> select(i1:i6, i8:i22, i25:i28) |> colnames()
out1 <- semTools::runMI(model_cfa,
                        data = scaled_db_lib_mice,
                        m = 5,
                        miPackage = "mice",
                        fun = "cfa",
                        rotation = "oblimin",
                        estimator = "WLSMV",
                        ordered = ordered2)
#Pooled goodness of fit indices
> out1 |> fitMeasures(c("cfi", "tli", "rmsea", "srmr"))
"D3" only available using maximum likelihood estimation. Changed test to "D2".
Robust corrections are made by pooling the naive chi-squared statistic across 5 imputations for which the model converged, then applying the average (across imputations) scaling factor and shift parameter to that pooled value.
To instead pool the robust test statistics, set test = "D2" and pool.robust = TRUE.
                  cfi            cfi.scaled                   tli
                1.000                 0.717                 1.031
           tli.scaled                 rmsea        rmsea.ci.lower
                0.686                 0.000                 0.000
       rmsea.ci.upper          rmsea.pvalue          rmsea.scaled
                0.022                 1.000                 0.028
rmsea.ci.lower.scaled rmsea.ci.upper.scaled   rmsea.pvalue.scaled
                0.012                 0.039                 1.000
                 srmr          srmr_bentler   srmr_bentler_nomean
                0.091                 0.091                 0.091
           srmr_mplus     srmr_mplus_nomean
                0.091                 0.091
#Imputed dataset 1
> fitMeasures(lavaan::sem(model = model_cfa, data = out1@DataList[[1]], missing = "pairwise", rotation = "oblimin", estimator = "WLSMV", ordered = ordered2), c("cfi", "tli", "rmsea", "srmr"))
cfi tli rmsea srmr
0.924 0.915 0.069 0.090
#Imputed dataset 2
> fitMeasures(lavaan::sem(model = model_cfa, data = out1@DataList[[2]], missing = "pairwise", rotation = "oblimin", estimator = "WLSMV", ordered = ordered2), c("cfi", "tli", "rmsea", "srmr"))
cfi tli rmsea srmr
0.913 0.903 0.076 0.094
#Imputed dataset 3
> fitMeasures(lavaan::sem(model = model_cfa, data = out1@DataList[[3]], missing = "pairwise", rotation = "oblimin", estimator = "WLSMV", ordered = ordered2), c("cfi", "tli", "rmsea", "srmr"))
cfi tli rmsea srmr
0.925 0.917 0.067 0.089
#Imputed dataset 4
> fitMeasures(lavaan::sem(model = model_cfa, data = out1@DataList[[4]], missing = "pairwise", rotation = "oblimin", estimator = "WLSMV", ordered = ordered2), c("cfi", "tli", "rmsea", "srmr"))
cfi tli rmsea srmr
0.909 0.899 0.072 0.092
#Imputed dataset 5
> fitMeasures(lavaan::sem(model = model_cfa, data = out1@DataList[[5]], missing = "pairwise", rotation = "oblimin", estimator = "WLSMV", ordered = ordered2), c("cfi", "tli", "rmsea", "srmr"))
cfi tli rmsea srmr
0.920 0.911 0.070 0.091
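For reference, the simple average I mentioned can be computed directly from the five sets of values above. This is only a minimal sketch of that naive approach, not a proper pooling method; fit_by_imputation and avg_fit are just illustrative names, and the numbers are typed in by hand from the output above.

```r
# Fit indices copied by hand from the five per-imputation fits above
fit_by_imputation <- rbind(
  imp1 = c(cfi = 0.924, tli = 0.915, rmsea = 0.069, srmr = 0.090),
  imp2 = c(cfi = 0.913, tli = 0.903, rmsea = 0.076, srmr = 0.094),
  imp3 = c(cfi = 0.925, tli = 0.917, rmsea = 0.067, srmr = 0.089),
  imp4 = c(cfi = 0.909, tli = 0.899, rmsea = 0.072, srmr = 0.092),
  imp5 = c(cfi = 0.920, tli = 0.911, rmsea = 0.070, srmr = 0.091)
)

# Naive average across imputations (NOT a Rubin's-rules pooled statistic)
avg_fit <- round(colMeans(fit_by_imputation), 3)
print(avg_fit)
```

The averages come out at roughly CFI = 0.918, TLI = 0.909, RMSEA = 0.071 and SRMR = 0.091, which is much closer to the individual fits than the pooled values reported by runMI.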
Thank you for your help!
Michael