Pooled test statistic


aygul

May 3, 2020, 4:30:11 PM
to lavaan
Hello,

I am new to lavaan, so apologies if my issue appears naive. I am working with 10 imputed datasets and having issues with the test statistic. The test statistic pooled across the 10 imputed datasets is relatively small, whereas when I run the model on only one dataset it is much larger. What could be the issue here? Many thanks for your help.

Here is what I run and the output I get for 10 imputed datasets: 

> dis1 <- '
+ SACQ1 =~ Q18_2 + Q18_3 + Q18_13 + Q18_19 + Q18_7 + Q18_1 + Q18_18 + Q18_4 + Q18_9 + Q18_11
+ SACQ4 =~ Q17_23 + Q17_21 + Q17_19 + Q17_17 + Q17_24 + Q17_18 + Q17_14 + Q18_6 + Q17_11
+ SACQ2 =~ Q17_10neg + Q17_12neg + Q17_6neg + Q17_15neg + Q17_20neg + Q17_1
+ SACQ3 =~ Q18_15neg + Q18_16neg + Q18_8neg + Q18_12neg + Q18_17neg + Q18_14neg + Q17_22neg + Q18_10
+ SACQ6 =~ Q17_3neg + Q17_16neg + Q17_4neg + Q17_8neg + Q17_5
+ SACQ5 =~ Q17_2 + Q17_7 + Q17_9 + Q17_13neg
+
+ # Regressions
+ SACQ4 ~ explor + commit + geninstr + coethinstr + eicoethinstr   # acadsatisfaction
+ SACQ2 ~ explor + commit + geninstr + coethinstr + eicoethinstr   # acadfocus
+ SACQ6 ~ explor + commit + geninstr + coethinstr + eicoethinstr   # acadperf
+ SACQ5 ~ explor + commit + geninstr + coethinstr + eicoethinstr   # genpurpose
+ SACQ1 ~ explor + commit + genfriend + coethfriend + eicoethpeer  # social/general fit
+ SACQ3 ~ explor + commit + genfriend + coethfriend + eicoethpeer  # loneliness/socializing
+ '
> dis1 <- runMI(dis1, data = datapmm1, estimator = "MLR", fun = "sem")
> summary(dis1, standardized = TRUE, fit.measures = TRUE)

lavaan.mi object based on 10 imputed data sets.

Convergence information:

The model converged on 10 imputed data sets

 

Rubin's (1987) rules were used to pool point and SE estimates across 10 imputed data sets, and to calculate degrees of freedom for each parameter's t test and CI.

Robust corrections are made by pooling the naive chi-squared statistic across 10 imputations for which the model converged, then applying the average (across imputations) scaling factor to that pooled value.

To instead pool the robust test statistics, set test = "D2" and pool.robust = TRUE.

 

Model Test User Model:

                                              Standard      Robust
  Test statistic                              1010.187     853.214

  Degrees of freedom                              1110        1110

  P-value                                        0.985       1.000

  Scaling correction factor                                  1.184

 

Model Test Baseline Model:

  Test statistic                              2915.474    2414.672

  Degrees of freedom                              1197        1197

  P-value                                        0.000       0.000

  Scaling correction factor                                  1.207

 

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    1.000       1.000

  Tucker-Lewis Index (TLI)                       1.063       1.227

                                                                 

  Robust Comparative Fit Index (CFI)                         1.000

  Robust Tucker-Lewis Index (TLI)                            1.000

 

Root Mean Square Error of Approximation: 

  RMSEA                                          0.000       0.000

  90 Percent confidence interval - lower         0.000       0.000

  90 Percent confidence interval - upper         0.000          NA

  P-value RMSEA <= 0.05                          1.000       1.000

                                                                 

  Robust RMSEA                                               0.000

  90 Percent confidence interval - lower                     0.000

  90 Percent confidence interval - upper                        NA

 

Standardized Root Mean Square Residual:

  SRMR                                           0.074       0.074


When I try to get the fit measures for the 10 imputed datasets by calling
fitMeasures(dis1, test = "D2", pool.robust = TRUE),
I get the error message:

Negative pooled test statistic was set to zero, so fit will appear to be arbitrarily perfect. Robust corrections uninformative, not returned.


Here is the output I get when I run the model on just one imputed dataset:

> dis11 <- sem(dis1, data = datapmm2, estimator = "MLR")
> summary(dis11, standardized = TRUE, fit.measures = TRUE)

lavaan 0.6-5 ended normally after 141 iterations

  Estimator                                         ML

  Optimization method                           NLMINB

  Number of free parameters                        129

                                                      

  Number of observations                           779

                                                      

Model Test User Model:

                                              Standard      Robust

  Test Statistic                              6508.117    5406.498

  Degrees of freedom                              1110        1110

  P-value (Chi-square)                           0.000       0.000

  Scaling correction factor                                  1.204

    for the Yuan-Bentler correction (Mplus variant) 


Model Test Baseline Model:


  Test statistic                             15367.955   12604.650

  Degrees of freedom                              1197        1197

  P-value                                        0.000       0.000

  Scaling correction factor                                  1.219


User Model versus Baseline Model:


  Comparative Fit Index (CFI)                    0.619       0.623

  Tucker-Lewis Index (TLI)                       0.589       0.594

                                                                  

  Robust Comparative Fit Index (CFI)                         0.628

  Robust Tucker-Lewis Index (TLI)                            0.599


Loglikelihood and Information Criteria:


  Loglikelihood user model (H0)             -59850.751  -59850.751

  Scaling correction factor                                  1.182

      for the MLR correction                                      

  Loglikelihood unrestricted model (H1)     -56596.693  -56596.693

  Scaling correction factor                                  1.201

      for the MLR correction                                      

                                                                  

  Akaike (AIC)                              119959.503  119959.503

  Bayesian (BIC)                            120560.386  120560.386

  Sample-size adjusted Bayesian (BIC)       120150.748  120150.748


Root Mean Square Error of Approximation:

  RMSEA                                          0.079       0.070

  90 Percent confidence interval - lower         0.077       0.069

  90 Percent confidence interval - upper         0.081       0.072

  P-value RMSEA <= 0.05                          0.000       0.000

                                                                  

  Robust RMSEA                                               0.077

  90 Percent confidence interval - lower                     0.075

  90 Percent confidence interval - upper                     0.079


Standardized Root Mean Square Residual:

  SRMR                                           0.094       0.094


Now the test statistic is larger than the degrees of freedom, and the fit indices make more sense (although the model clearly does not fit). Why is the pooled test statistic so much smaller than the test statistic from an individual dataset, and why does it turn negative when I try to pool the robust test statistics?

Terrence Jorgensen

May 15, 2020, 10:30:35 AM
to lavaan
I get the error message: 

Negative pooled test statistic was set to zero, so fit will appear to be arbitrarily perfect. Robust corrections uninformative, not returned.


That is not an error message.  It is just information. 

Why is the pooled test statistic so much smaller than the test statistic from an individual dataset, and why does it turn negative when I try to pool the robust test statistics?


Not sure, but make sure this is still occurring in the latest semTools: 

devtools::install_github("simsem/semTools/semTools")

You can also fit the model to each imputed data set separately, save the 10 chi-squared statistics in a vector, and pass that vector to the calculate.D2() function yourself to verify that this is what happens when you pool that set of statistics.
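For example, a minimal sketch of that check (impList is a placeholder name for a list holding your 10 imputed data.frames, and dis1 is the model syntax string from your first post):

```r
library(lavaan)
library(semTools)

# Fit the model separately to each imputed data set and collect
# the robust (scaled) chi-squared statistic from each fit
chisq <- sapply(impList, function(d) {
  fit <- sem(dis1, data = d, estimator = "MLR")
  fitMeasures(fit, "chisq.scaled")
})

# Pool the 10 statistics with the D2 method (df = 1110, per your output)
calculate.D2(chisq, DF = 1110)
```

If the chi-squared values vary widely across imputations, the between-imputation variance term in the D2 formula can drag the pooled statistic below zero, which is when you would see that "negative pooled test statistic" message.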

I do notice you have exogenous predictors in your model, which lavaan treats as fixed by default (fixed.x = TRUE). An older version of runMI() set fixed.x = FALSE in all models, but I don't see how that would have changed your results very much.
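If you want to rule that out, you can set it explicitly when fitting; a sketch, assuming runMI() still forwards extra arguments to lavaan:

```r
fit.free <- runMI(dis1, data = datapmm1, estimator = "MLR", fun = "sem",
                  fixed.x = FALSE)
```

Then compare its pooled test statistic to the one you got with the default fixed.x = TRUE.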

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam
