Pooled test statistic


aygul

May 3, 2020, 4:30:11 PM
to lavaan
Hello,

I am new to lavaan, so apologies if my issue appears naive. I am working with 10 imputed datasets and having issues with the test statistic. The test statistic pooled across the 10 imputed datasets is relatively small, whereas when I run the model on only one dataset it is much larger. What could be the issue here? Many thanks for your help.

Here is what I run and the output I get for 10 imputed datasets: 

> dis1 <- '
+ SACQ1 =~ Q18_2 + Q18_3 + Q18_13 + Q18_19 + Q18_7 + Q18_1 + Q18_18 + Q18_4 + Q18_9 + Q18_11
+ SACQ4 =~ Q17_23 + Q17_21 + Q17_19 + Q17_17 + Q17_24 + Q17_18 + Q17_14 + Q18_6 + Q17_11
+ SACQ2 =~ Q17_10neg + Q17_12neg + Q17_6neg + Q17_15neg + Q17_20neg + Q17_1
+ SACQ3 =~ Q18_15neg + Q18_16neg + Q18_8neg + Q18_12neg + Q18_17neg + Q18_14neg + Q17_22neg + Q18_10
+ SACQ6 =~ Q17_3neg + Q17_16neg + Q17_4neg + Q17_8neg + Q17_5
+ SACQ5 =~ Q17_2 + Q17_7 + Q17_9 + Q17_13neg
+
+ # Regressions
+ SACQ4 ~ explor + commit + geninstr + coethinstr + eicoethinstr   # acadsatisfaction
+ SACQ2 ~ explor + commit + geninstr + coethinstr + eicoethinstr   # acadfocus
+ SACQ6 ~ explor + commit + geninstr + coethinstr + eicoethinstr   # acadperf
+ SACQ5 ~ explor + commit + geninstr + coethinstr + eicoethinstr   # genpurpose
+ SACQ1 ~ explor + commit + genfriend + coethfriend + eicoethpeer  # social/general fit
+ SACQ3 ~ explor + commit + genfriend + coethfriend + eicoethpeer  # loneliness/socializing
+ '
> dis1 <- runMI(dis1, data = datapmm1, estimator = "MLR", fun = "sem")
> summary(dis1, standardized = TRUE, fit.measures = TRUE)

lavaan.mi object based on 10 imputed data sets.

Convergence information:

The model converged on 10 imputed data sets

 

Rubin's (1987) rules were used to pool point and SE estimates across 10 imputed data sets, and to calculate degrees of freedom for each parameter's t test and CI.

Robust corrections are made by pooling the naive chi-squared statistic across 10 imputations for which the model converged, then applying the average (across imputations) scaling factor to that pooled value.

To instead pool the robust test statistics, set test = "D2" and pool.robust = TRUE.

 

Model Test User Model:

                                              Standard      Robust
  Test statistic                              1010.187     853.214

  Degrees of freedom                              1110        1110

  P-value                                        0.985       1.000

  Scaling correction factor                                  1.184

 

Model Test Baseline Model:

  Test statistic                              2915.474    2414.672

  Degrees of freedom                              1197        1197

  P-value                                        0.000       0.000

  Scaling correction factor                                  1.207

 

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    1.000       1.000

  Tucker-Lewis Index (TLI)                       1.063       1.227

                                                                 

  Robust Comparative Fit Index (CFI)                         1.000

  Robust Tucker-Lewis Index (TLI)                            1.000

 

Root Mean Square Error of Approximation: 

  RMSEA                                          0.000       0.000

  90 Percent confidence interval - lower         0.000       0.000

  90 Percent confidence interval - upper         0.000          NA

  P-value RMSEA <= 0.05                          1.000       1.000

                                                                 

  Robust RMSEA                                               0.000

  90 Percent confidence interval - lower                     0.000

  90 Percent confidence interval - upper                        NA

 

Standardized Root Mean Square Residual:

  SRMR                                           0.074       0.074


When I try to get the fit measures for the 10 imputed datasets by calling
fitMeasures(dis1, test = "D2", pool.robust = TRUE),
I get the error message:

Negative pooled test statistic was set to zero, so fit will appear to be arbitrarily perfect. Robust corrections uninformative, not returned.


Here is the output I get when I run the model on just one imputed dataset:

> dis11 <- sem(dis1, data = datapmm2, estimator = "MLR")
> summary(dis11, standardized = TRUE, fit.measures = TRUE)

lavaan 0.6-5 ended normally after 141 iterations

  Estimator                                         ML

  Optimization method                           NLMINB

  Number of free parameters                        129

                                                      

  Number of observations                           779

                                                      

Model Test User Model:

                                              Standard      Robust

  Test Statistic                              6508.117    5406.498

  Degrees of freedom                              1110        1110

  P-value (Chi-square)                           0.000       0.000

  Scaling correction factor                                  1.204

    for the Yuan-Bentler correction (Mplus variant) 


Model Test Baseline Model:


  Test statistic                             15367.955   12604.650

  Degrees of freedom                              1197        1197

  P-value                                        0.000       0.000

  Scaling correction factor                                  1.219


User Model versus Baseline Model:


  Comparative Fit Index (CFI)                    0.619       0.623

  Tucker-Lewis Index (TLI)                       0.589       0.594

                                                                  

  Robust Comparative Fit Index (CFI)                         0.628

  Robust Tucker-Lewis Index (TLI)                            0.599


Loglikelihood and Information Criteria:


  Loglikelihood user model (H0)             -59850.751  -59850.751

  Scaling correction factor                                  1.182

      for the MLR correction                                      

  Loglikelihood unrestricted model (H1)     -56596.693  -56596.693

  Scaling correction factor                                  1.201

      for the MLR correction                                      

                                                                  

  Akaike (AIC)                              119959.503  119959.503

  Bayesian (BIC)                            120560.386  120560.386

  Sample-size adjusted Bayesian (BIC)       120150.748  120150.748


Root Mean Square Error of Approximation:

  RMSEA                                          0.079       0.070

  90 Percent confidence interval - lower         0.077       0.069

  90 Percent confidence interval - upper         0.081       0.072

  P-value RMSEA <= 0.05                          0.000       0.000

                                                                  

  Robust RMSEA                                               0.077

  90 Percent confidence interval - lower                     0.075

  90 Percent confidence interval - upper                     0.079


Standardized Root Mean Square Residual:

  SRMR                                           0.094       0.094


Now the test statistic is larger than the degrees of freedom, and the fit indices make more sense (although the model clearly does not fit). Why is the pooled test statistic so much smaller than the test statistic from an individual dataset, and why does it turn negative when I try to pool the robust test statistics?

Terrence Jorgensen

May 15, 2020, 10:30:35 AM
to lavaan
I get the error message: 

Negative pooled test statistic was set to zero, so fit will appear to be arbitrarily perfect. Robust corrections uninformative, not returned.


That is not an error message.  It is just information. 

Why is the pooled test statistic so much smaller than the test statistic from an individual dataset, and why does it turn negative when I try to pool the robust test statistics?


Not sure, but make sure this is still occurring in the latest semTools: 

devtools::install_github("simsem/semTools/semTools")

You can also fit the model to each imputed data set separately, save the 10 chi-squared statistics in a vector, and pass that vector to the calculate.D2() function yourself to verify that this is what happens when you pool that set of statistics.
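For example, a minimal sketch of that check (impList is a placeholder name for a list holding your 10 imputed data.frames, and dis1 is the model syntax string from your first post):

```r
library(lavaan)
library(semTools)

# Fit the model separately to each imputed data set and collect
# the robust (scaled) chi-squared statistic from each fit
chisq <- sapply(impList, function(d) {
  fit <- sem(dis1, data = d, estimator = "MLR")
  fitMeasures(fit, "chisq.scaled")
})

# Pool the 10 statistics with the D2 method (df = 1110, per your output)
calculate.D2(chisq, DF = 1110)
```

If the chi-squared values vary widely across imputations, the between-imputation variance term in the D2 formula can drag the pooled statistic below zero, which is when you would see that "negative pooled test statistic" message.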

I do notice you have exogenous predictors in your model, which lavaan treats as fixed by default (fixed.x = TRUE). An older version of runMI() set fixed.x = FALSE in all models, but I don't see how that would have changed your results very much.
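If you want to rule that out, you can set it explicitly when fitting; a sketch, assuming runMI() still forwards extra arguments to lavaan:

```r
fit.free <- runMI(dis1, data = datapmm1, estimator = "MLR", fun = "sem",
                  fixed.x = FALSE)
```

Then compare its pooled test statistic to the one you got with the default fixed.x = TRUE.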

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam
