Values of df = inf using semTools (cfa.mi)

Brandon McCormick

May 28, 2020, 1:19:05 PM
to lavaan
I ran a CFA on 40 imputed data sets (and got the same result with just 5 imputations). The odd result is that all of my df = Inf, and I also got these warning messages:

 1: In if (attr(x, "information") == "observed") { :
  the condition has length > 1 and only the first element will be used
2: In cbind(c1, c2, deparse.level = 0) :
  number of rows of result is not a multiple of vector length (arg 1)

 Here are my full results:

lavaan.mi object based on 40 imputed data sets. 
See class?lavaan.mi help page for available methods. 

Convergence information:
The model converged on 40 imputed data sets 

Rubin's (1987) rules were used to pool point and SE estimates across 40 imputed data sets, and to calculate degrees of freedom for each parameter's t test and CI.

Model Test User Model:

  Test statistic                               697.943
  Degrees of freedom                                77
  P-value                                        0.000

Model Test Baseline Model:

  Test statistic                              7210.955
  Degrees of freedom                                91
  P-value                                        0.000

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    0.913
  Tucker-Lewis Index (TLI)                       0.897

Root Mean Square Error of Approximation:

  RMSEA                                          0.049
  90 Percent confidence interval - lower         0.046
  90 Percent confidence interval - upper         0.053
  P-value RMSEA <= 0.05                          0.636

Standardized Root Mean Square Residual:

  SRMR                                           0.036

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model            Expected
  Standard errors                           Structured
  Information                               Structured

Latent Variables:
                   Estimate  Std.Err  t-value       df  P(>|t|)   Std.lv  Std.all
  AD_W1 =~                                                                       
    AD1_W1            0.302    0.011   26.324      Inf    0.000    0.302    0.481
    AD2_W1            0.229    0.009   25.789      Inf    0.000    0.229    0.472
    AD3_W1            0.227    0.007   32.219      Inf    0.000    0.227    0.572
    AD4_W1            0.258    0.011   23.682      Inf    0.000    0.258    0.438
    AD5_W1            0.202    0.011   17.837      Inf    0.000    0.202    0.337
    AD6_W1            0.210    0.009   24.049      Inf    0.000    0.210    0.444
    AD7_W1            0.219    0.014   15.885      Inf    0.000    0.219    0.302
    AD8_W1            0.334    0.010   32.996      Inf    0.000    0.334    0.584
    AD9_W1            0.233    0.008   29.981      Inf    0.000    0.233    0.539
    AD10_W1           0.207    0.006   32.528      Inf    0.000    0.207    0.577
    AD11_W1           0.167    0.006   27.246      Inf    0.000    0.167    0.496
    AD12_W1           0.305    0.013   22.755      Inf    0.000    0.305    0.422
    AD13_W1           0.259    0.011   22.769      Inf    0.000    0.259    0.422
    AD14_W1           0.251    0.009   26.567      Inf    0.000    0.251    0.485

Variances:
                   Estimate  Std.Err  t-value       df  P(>|t|)   Std.lv  Std.all
    AD_W1             1.000                                        1.000    1.000
   .AD1_W1            0.302    0.008   37.467      Inf    0.000    0.302    0.769
   .AD2_W1            0.183    0.005   37.598      Inf    0.000    0.183    0.777
   .AD3_W1            0.106    0.003   35.698      Inf    0.000    0.106    0.672
   .AD4_W1            0.281    0.007   38.074      Inf    0.000    0.281    0.808
   .AD5_W1            0.317    0.008   39.104      Inf    0.000    0.317    0.886
   .AD6_W1            0.180    0.005   37.996      Inf    0.000    0.180    0.803
   .AD7_W1            0.477    0.012   39.366      Inf    0.000    0.477    0.909
   .AD8_W1            0.215    0.006   35.414      Inf    0.000    0.215    0.659
   .AD9_W1            0.133    0.004   36.447      Inf    0.000    0.133    0.710
   .AD10_W1           0.086    0.002   35.587      Inf    0.000    0.086    0.667
   .AD11_W1           0.086    0.002   37.231      Inf    0.000    0.086    0.754
   .AD12_W1           0.428    0.011   38.264      Inf    0.000    0.428    0.822
   .AD13_W1           0.309    0.008   38.261      Inf    0.000    0.309    0.822
   .AD14_W1           0.205    0.005   37.406      Inf    0.000    0.205    0.765

Warning messages:
1: In if (attr(x, "information") == "observed") { :
  the condition has length > 1 and only the first element will be used
2: In cbind(c1, c2, deparse.level = 0) :
  number of rows of result is not a multiple of vector length (arg 1)


Here is my code:

library(lavaan)
library(semTools)
library(Amelia)
mTBI_SEM_Data <- read.delim("~/Desktop/mTBI_SEM_Data.txt", na.strings="-99")
View(mTBI_SEM_Data)
set.seed(123456)  # set.seed() must be called; "set.seed = 123456" only creates a variable and does not seed the RNG
mTBI_SEM_Data_amelia <- amelia(mTBI_SEM_Data, m = 40, noms = c("Sex", "Ethnicity",   "HI_Bi_B21"))

imps <- mTBI_SEM_Data_amelia$imputations
AD_W1 <- 'AD_W1 =~ NA*AD1_W1 + AD2_W1 + AD3_W1 + AD4_W1 + AD5_W1 + AD6_W1 + AD7_W1 + AD8_W1 + AD9_W1 + AD10_W1 + AD11_W1 + AD12_W1 + AD13_W1 + AD14_W1
 AD_W1 ~~ 1* AD_W1'

fitAD_W1 <- cfa.mi(AD_W1,data=imps)
summary(fitAD_W1,standardized=TRUE,fit.measures=TRUE)

Brandon McCormick

May 28, 2020, 4:22:14 PM
to lavaan
I was able to replicate the issue with the Holzinger and Swineford data set (example from https://rdrr.io/cran/semTools/man/runMI.html). The only difference is that some df values are Inf rather than all of them.

Here are the results: 

lavaan.mi object based on 3 imputed data sets. 
See class?lavaan.mi help page for available methods. 

Convergence information:
The model converged on 3 imputed data sets 

Rubin's (1987) rules were used to pool point and SE estimates across 3 imputed data sets, and to calculate degrees of freedom for each parameter's t test and CI.

Model Test User Model:

  Test statistic                                60.393
  Degrees of freedom                                24
  P-value                                        0.000

Model Test Baseline Model:

  Test statistic                               656.799
  Degrees of freedom                                36
  P-value                                        0.000

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    0.941
  Tucker-Lewis Index (TLI)                       0.912

Root Mean Square Error of Approximation:

  RMSEA                                          0.071
  90 Percent confidence interval - lower         0.049
  90 Percent confidence interval - upper         0.094
  P-value RMSEA <= 0.05                          0.059

Standardized Root Mean Square Residual:

  SRMR                                           0.063

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model            Expected
  Standard errors                           Structured
  Information                               Structured

Latent Variables:
                   Estimate  Std.Err  t-value       df  P(>|t|)   Std.lv  Std.all
  visual =~                                                                      
    x1                1.000                                        0.895    0.768
    x2                0.562    0.105    5.372  939.048    0.000    0.503    0.428
    x3                0.735    0.115    6.419  870.405    0.000    0.658    0.583
  textual =~                                                                     
    x4                1.000                                        0.952    0.819
    x5                0.670    0.049   13.531  137.991    0.000    0.637    0.776
    x6                0.991    0.068   14.528  233.415    0.000    0.943    0.862
  speed =~                                                                       
    x7                1.000                                        0.631    0.580
    x8                1.242    0.194    6.407 3229.343    0.000    0.784    0.775
    x9                0.870    0.135    6.458  185.748    0.000    0.549    0.559

Covariances:
                   Estimate  Std.Err  t-value       df  P(>|t|)   Std.lv  Std.all
  visual ~~                                                                      
    textual           0.420    0.076    5.546  210.207    0.000    0.493    0.493
    speed             0.232    0.058    4.011      Inf    0.000    0.410    0.410
  textual ~~                                                                     
    speed             0.159    0.051    3.152 2320.091    0.002    0.265    0.265

Variances:
                   Estimate  Std.Err  t-value       df  P(>|t|)   Std.lv  Std.all
   .x1                0.557    0.118    4.731  575.366    0.000    0.557    0.410
   .x2                1.129    0.106   10.686      Inf    0.000    1.129    0.817
   .x3                0.842    0.094    8.919 3923.682    0.000    0.842    0.660
   .x4                0.445    0.058    7.706  344.491    0.000    0.445    0.330
   .x5                0.268    0.030    8.868   22.853    0.000    0.268    0.398
   .x6                0.307    0.050    6.147   67.273    0.000    0.307    0.256
   .x7                0.785    0.087    8.992 3667.386    0.000    0.785    0.663
   .x8                0.408    0.090    4.525 8484.786    0.000    0.408    0.399
   .x9                0.664    0.071    9.337   45.751    0.000    0.664    0.687
    visual            0.801    0.150    5.324 1465.293    0.000    1.000    1.000
    textual           0.905    0.117    7.769 5063.068    0.000    1.000    1.000
    speed             0.398    0.094    4.241 4887.961    0.000    1.000    1.000

Warning messages:
1: In if (attr(x, "information") == "observed") { :
  the condition has length > 1 and only the first element will be used
2: In cbind(c1, c2, deparse.level = 0) :
  number of rows of result is not a multiple of vector length (arg 1)


Terrence Jorgensen

May 29, 2020, 4:18:57 AM
to lavaan
> all my df = inf

Only approximately.  When df > 9999, I just print Inf to keep the output from being too messy.  When a t statistic's df are that large, it is effectively a normal distribution.
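
A rough sketch of why the df can get that large and why it does not matter, using base R only. It assumes the classic Rubin (1987) df formula mentioned in the summary output; `rubin_df()` and `format_df()` are hypothetical helpers written for illustration, not semTools' actual code, and the estimates are made up.

```r
# Rubin's (1987) degrees of freedom for one pooled parameter.
# est: point estimates from the m imputations; se: their standard errors.
rubin_df <- function(est, se) {
  m <- length(est)
  W <- mean(se^2)            # within-imputation variance
  B <- var(est)              # between-imputation variance
  r <- (1 + 1 / m) * B / W   # relative increase in variance due to missing data
  (m - 1) * (1 + 1 / r)^2
}

# Hypothetical display rule mirroring the description above:
# df larger than the cutoff are shown as Inf.
format_df <- function(df, cutoff = 9999) ifelse(df > cutoff, Inf, round(df, 3))

# Made-up estimates that barely differ across 5 imputations,
# so the between-imputation variance is tiny relative to the SEs.
est <- c(0.3020, 0.3021, 0.3019, 0.3020, 0.3022)
se  <- rep(0.011, 5)
df  <- rubin_df(est, se)
format_df(df)    # df is far above 9999, so it is displayed as Inf

# With df this large, t and normal critical values are practically identical.
qt(0.975, df = 10000)
qnorm(0.975)
```

When the estimates hardly vary across imputations (little missing information for that parameter), `r` is close to zero and the df grows very large, so the t test is effectively a z test.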

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Brandon McCormick

May 29, 2020, 12:03:35 PM
to lav...@googlegroups.com

Thank you, that greatly helps my interpretation. 

Do you have any insight into these warnings? I am coming up short in my investigation. 

 1: In if (attr(x, "information") == "observed") { :
  the condition has length > 1 and only the first element will be used
2: In cbind(c1, c2, deparse.level = 0) :
  number of rows of result is not a multiple of vector length (arg 1)



--
You received this message because you are subscribed to the Google Groups "lavaan" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lavaan+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lavaan/61506561-cd47-4a84-9191-a09cc168698b%40googlegroups.com.

Terrence Jorgensen

May 29, 2020, 5:09:24 PM
to lavaan
> Do you have any insight into these warnings?
>
>  1: In if (attr(x, "information") == "observed") { :
>   the condition has length > 1 and only the first element will be used

No idea, and it seems strange because there should be only 1 value for the type of information used to derive SEs (i.e., length == 1). 

> 2: In cbind(c1, c2, deparse.level = 0) :
>   number of rows of result is not a multiple of vector length (arg 1)
 
I think this follows from the first warning, where the type of information is supposed to be printed, but there is more than one. Perhaps this is related to a change Yves made in version 0.6-6; is that the version you are using?

Brandon McCormick

May 29, 2020, 5:32:08 PM
to lav...@googlegroups.com
Yes, I updated lavaan and semTools. Fortunately, the bug seems inconsequential.


Yves Rosseel

May 31, 2020, 1:33:56 PM
to lav...@googlegroups.com
On 5/29/20 11:09 PM, Terrence Jorgensen wrote:
> Do you have any insight into these warnings?
>
>  1: In if (attr(x, "information") == "observed") { :
>   the condition has length > 1 and only the first element will be used

This will be due to a change in lavaan 0.6-6: the information= (and
related options) can (optionally) be a vector of 2 elements, e.g.,
information = c("observed", "expected"). The first one is used to
compute vcov (and hence the standard errors). The second one is only
used for the test statistic. Decoupling this turns out to be useful in
some settings.
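
A minimal base-R sketch of why a two-element information attribute trips the `if()` in the first warning. The object `x` is a hypothetical stand-in that only mimics the attribute named in the warning; it is not an actual lavaan or semTools object, and the fix shown is one possible approach, not necessarily what the packages adopted.

```r
# Hypothetical stand-in carrying a two-element "information" attribute,
# as described above for lavaan 0.6-6.
x <- structure(list(), information = c("expected", "expected"))

info <- attr(x, "information")
cond <- info == "observed"
length(cond)   # one comparison result per element, so the length is 2

# Passing `cond` straight to if() is what produced the warning
# (in R >= 4.2.0 a condition of length > 1 is an error instead).
# Testing only the first element, the one used for the vcov and SEs,
# gives a single TRUE/FALSE:
uses_observed_se <- info[1] == "observed"
uses_observed_se
```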

> 2: In cbind(c1, c2, deparse.level = 0) :
>   number of rows of result is not a multiple of vector length (arg 1)

> I think this follows from the first warning

Indeed.

Yves.