FIML and missing standard errors


kinga.bie...@gmail.com

Feb 27, 2019, 11:10:15 AM
to lavaan
Dear colleagues, 

I am completely new to lavaan, and it may be a beginner's question but still I would appreciate any help.

I am working with a secondary dataset and have three IVs with missing data (x1, x2, x3). More specifically, each participant was randomly assigned to respond to only one IV, and the other two were not displayed. That is, each IV is missing for two-thirds of the participants.

When I try to fit my model with all three IVs, using FIML to handle the missing data, lavaan does not report CFI, TLI, or standard errors for the model.
When I fit the same model with only one or two IVs, lavaan estimates everything normally.

I assume it has to do with the missing data and with FIML estimating high covariances between the IVs. Is there any way to run the model with all variables and get all estimates (other than using MI)?

Thanks for any suggestions! 

Model <- "
y1 ~ b1*m1 + b2*m2 + c1*x1 + c2*x2 + c3*x3
m1 ~ a1*x1 + a2*x2 + a3*x3
m2 ~ a4*x1 + a5*x2 + a6*x3
m1 ~~ m2
"
fit <- sem(model = Model, data = Data, fixed.x = FALSE, missing = "ml")

Best regards, 

Kinga Bierwiaczonek 

Terrence Jorgensen

Feb 28, 2019, 10:33:46 PM
to lavaan
fit <- sem(model = Model, data = Data, fixed.x = F, missing = "ml")

You can set fixed.x = TRUE and missing = "fiml.x" to allow missing values on exogenous covariates. But what are the 3 IVs? Are they conceptually redundant (i.e., would they be multicollinear if there were any overlap in observations)?
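For readers following along, here is a minimal sketch of that call. The data are simulated and are not the poster's dataset: the sample size, the effect sizes, and the 20% random missingness on each IV are all assumptions made for illustration. In the poster's design the IVs never overlap, so results there may differ. The lavaan call is only run if the package is installed.

```r
# Sketch only: `Data` is simulated here (hypothetical), not the original dataset.
set.seed(123)
n  <- 500
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)
m1 <- 0.4 * x1 + 0.3 * x2 + 0.2 * x3 + rnorm(n)
m2 <- 0.2 * x1 + 0.3 * x2 + 0.4 * x3 + rnorm(n)
y1 <- 0.5 * m1 + 0.5 * m2 + 0.1 * x1 + 0.1 * x2 + 0.1 * x3 + rnorm(n)
Data <- data.frame(y1, m1, m2, x1, x2, x3)

# Assumption: 20% of the values on each IV are missing completely at random.
for (v in c("x1", "x2", "x3")) {
  Data[sample(n, size = 0.2 * n), v] <- NA
}

Model <- "
y1 ~ b1*m1 + b2*m2 + c1*x1 + c2*x2 + c3*x3
m1 ~ a1*x1 + a2*x2 + a3*x3
m2 ~ a4*x1 + a5*x2 + a6*x3
m1 ~~ m2
"

if (requireNamespace("lavaan", quietly = TRUE)) {
  # fixed.x = TRUE keeps the IVs exogenous; "fiml.x" retains cases with
  # missing values on them instead of deleting those cases first.
  fit <- lavaan::sem(model = Model, data = Data,
                     fixed.x = TRUE, missing = "fiml.x")
  # Number of cases actually used in the analysis
  print(lavaan::lavInspect(fit, "nobs"))
}
```

Checking the number of observations used is a quick way to confirm that cases with missing IVs were not dropped.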

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Kinga Bierwiaczonek

Mar 1, 2019, 3:29:28 AM
to lav...@googlegroups.com
Dear Terrence, 

Thanks a lot for your answer! When I changed the arguments as suggested, lavaan estimated the standard errors, but CFI and TLI are still NA, and RMSEA is 0 with a p-value of NA.

The 3 IVs are three types of intergroup threat. They are conceptually related and usually correlated at about .5 (in other studies). There are no overlapping observations in this dataset: participants always responded to only one type of threat. I assume, however, that the IVs may explain similar variance in the DVs and mediators, and that this ends up producing “multicollinearity” when using FIML, if that makes sense.

If so, is there any solution? 

Thanks again, 
Kinga 


--
You received this message because you are subscribed to a topic in the Google Groups "lavaan" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/lavaan/H65xyGbzhdA/unsubscribe.
To unsubscribe from this group and all its topics, send an email to lavaan+un...@googlegroups.com.
To post to this group, send email to lav...@googlegroups.com.
Visit this group at https://groups.google.com/group/lavaan.
For more options, visit https://groups.google.com/d/optout.

kinga.bie...@gmail.com

Mar 21, 2019, 1:57:52 PM
to lavaan
Dear colleagues, 

I was able to get the above script working by switching to the robust two-stage (EM-based) approach, i.e., missing = "robust.two.stage". Both DVs and one mediator are quite skewed, but this approach seems to be handling that well.

However, I am running into another issue. When I try to test invariance across two groups using group.equal = "regressions", lavaan does not seem to constrain the paths, and I get the same output as for the unconstrained model. No warnings are produced, but I do notice that the SE of the covariance between the two IVs with missing data (status_t and values_t) is not estimated in Group 2.

I am wondering why that might be. I would be grateful for any suggestions! 

I am pasting below my script and the output.  
Kinga

model <- "
Hostil + neg ~ b1*CN + b2*nat_id_s + c1*status_t + c2*values_t
CN ~ a1*status_t + a2*values_t
nat_id_s ~ a3*status_t + a4*values_t
CN ~~ nat_id_s
indirect1 := a1*b1
indirect2 := a2*b1
indirect3 := a3*b2
indirect4 := a4*b2
"
fit.all <- sem(model = model, data = Data, fixed.x = FALSE,
               missing = "robust.two.stage",
               group = "Stat", group.equal = "regressions")

summary(fit.all, standardized = TRUE, fit.measures = TRUE, rsq = TRUE)

lavaan 0.6-3 ended normally after 63 iterations

  Optimization method                           NLMINB
  Number of free parameters                         54
  Number of equality constraints                     4

  Number of observations per group         
  HS                                               688
  LS                                               732
  Number of missing patterns per group     
  HS                                                 3
  LS                                                 8

  Estimator                                         ML      Robust
  Model Fit Test Statistic                      39.361      23.332
  Degrees of freedom                                 4           4
  P-value (Chi-square)                           0.000       0.000
  Scaling correction factor                                  1.687
    for the Satorra-Bentler correction

Chi-square for each group:

  HS                                            39.361      23.332
  LS                                             0.000       0.000

Model test baseline model:

  Minimum Function Test Statistic             2680.476   22649.458
  Degrees of freedom                                28          28
  P-value                                        0.000       0.000

User model versus baseline model:

  Comparative Fit Index (CFI)                    0.987       0.999
  Tucker-Lewis Index (TLI)                       0.907       0.994

  Robust Comparative Fit Index (CFI)                         0.988
  Robust Tucker-Lewis Index (TLI)                            0.915

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)             -11529.976  -11529.976
  Loglikelihood unrestricted model (H1)     -11510.267  -11510.267

  Number of free parameters                         50          50
  Akaike (AIC)                               23159.952   23159.952
  Bayesian (BIC)                             23422.872   23422.872
  Sample-size adjusted Bayesian (BIC)        23264.040   23264.040

Root Mean Square Error of Approximation:

  RMSEA                                          0.112       0.083
  90 Percent Confidence Interval          0.082  0.145       0.059  0.108
  P-value RMSEA <= 0.05                          0.001       0.014

  Robust RMSEA                                               0.107
  90 Percent Confidence Interval                             0.068  0.151

Standardized Root Mean Square Residual:

  SRMR                                           0.011       0.011

Parameter Estimates:

  Information                                 Observed
  Information saturated (h1) model          Structured
  Observed information based on                     H1
  Standard Errors                     Robust.two.stage


Group 1 [HS]:

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  Hostil ~                                                              
    CN        (b1)    0.087    0.045    1.932    0.053    0.087    0.119
    nat_id_s  (b2)   -0.144    0.032   -4.580    0.000   -0.144   -0.206
    status_t  (c1)    0.235    0.046    5.146    0.000    0.235    0.290
    values_t  (c2)    0.337    0.064    5.265    0.000    0.337    0.344
  neg ~                                                                 
    CN        (b1)    0.087    0.045    1.932    0.053    0.087    0.111
    nat_id_s  (b2)   -0.144    0.032   -4.580    0.000   -0.144   -0.192
    status_t  (c1)    0.235    0.046    5.146    0.000    0.235    0.270
    values_t  (c2)    0.337    0.064    5.265    0.000    0.337    0.319
  CN ~                                                                  
    status_t  (a1)    0.309    0.078    3.943    0.000    0.309    0.279
    values_t  (a2)    0.620    0.084    7.364    0.000    0.620    0.462
  nat_id_s ~                                                            
    status_t  (a3)    0.136    0.075    1.819    0.069    0.136    0.117
    values_t  (a4)    0.408    0.102    3.989    0.000    0.408    0.290

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
 .CN ~~                                                                 
   .nat_id_s          0.596    0.093    6.385    0.000    0.596    0.451
 .Hostil ~~                                                             
   .neg               0.353    0.047    7.537    0.000    0.353    0.500
  status_t ~~                                                           
    values_t          0.285    0.012   24.255    0.000    0.285    0.253

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .Hostil           -0.135    0.213   -0.634    0.526   -0.135   -0.142
   .neg               0.204    0.216    0.942    0.346    0.204    0.200
   .CN               -0.101    0.306   -0.330    0.742   -0.101   -0.078
   .nat_id_s          2.940    0.385    7.634    0.000    2.940    2.173
    status_t          3.091    0.070   43.890    0.000    3.091    2.647
    values_t          3.855    0.056   69.103    0.000    3.855    4.001

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .Hostil            0.640    0.073    8.799    0.000    0.640    0.714
   .neg               0.780    0.055   14.308    0.000    0.780    0.753
   .CN                1.078    0.116    9.302    0.000    1.078    0.644
   .nat_id_s          1.619    0.116   13.919    0.000    1.619    0.885
    status_t          1.363    0.093   14.600    0.000    1.363    1.000
    values_t          0.928    0.091   10.254    0.000    0.928    1.000

R-Square:
                   Estimate
    Hostil            0.286
    neg               0.247
    CN                0.356
    nat_id_s          0.115


Group 2 [LS]:

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  Hostil ~                                                              
    CN               -0.017    0.059   -0.286    0.775   -0.017   -0.020
    nat_id_s         -0.096    0.043   -2.246    0.025   -0.096   -0.100
    status_t          0.340    0.058    5.873    0.000    0.340    0.361
    values_t          0.373    0.077    4.823    0.000    0.373    0.322
  neg ~                                                                 
    CN               -0.045    0.063   -0.715    0.474   -0.045   -0.055
    nat_id_s         -0.152    0.044   -3.438    0.001   -0.152   -0.165
    status_t          0.346    0.062    5.589    0.000    0.346    0.384
    values_t          0.383    0.075    5.128    0.000    0.383    0.346
  CN ~                                                                  
    status_t          0.463    0.054    8.640    0.000    0.463    0.417
    values_t          0.621    0.066    9.389    0.000    0.621    0.455
  nat_id_s ~                                                            
    status_t          0.190    0.065    2.926    0.003    0.190    0.194
    values_t          0.420    0.074    5.643    0.000    0.420    0.347

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
 .CN ~~                                                                 
   .nat_id_s          0.297    0.062    4.800    0.000    0.297    0.343
 .Hostil ~~                                                             
   .neg               0.351    0.048    7.371    0.000    0.351    0.430
  status_t ~~                                                           
    values_t          0.469       NA                      0.469    0.434

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .Hostil           -0.755    0.303   -2.491    0.013   -0.755   -0.694
   .neg              -0.202    0.283   -0.712    0.477   -0.202   -0.194
   .CN               -0.880    0.264   -3.330    0.001   -0.880   -0.687
   .nat_id_s          3.010    0.270   11.168    0.000    3.010    2.655
    status_t          4.168    0.065   63.922    0.000    4.168    3.611
    values_t          4.465    0.051   87.878    0.000    4.465    4.760

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .Hostil            0.856    0.076   11.325    0.000    0.856    0.723
   .neg               0.779    0.059   13.126    0.000    0.779    0.721
   .CN                0.745    0.078    9.543    0.000    0.745    0.454
   .nat_id_s          1.007    0.077   13.029    0.000    1.007    0.784
    status_t          1.332    0.105   12.682    0.000    1.332    1.000
    values_t          0.880    0.067   13.148    0.000    0.880    1.000

R-Square:
                   Estimate
    Hostil            0.277
    neg               0.279
    CN                0.546
    nat_id_s          0.216

Defined Parameters:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
    indirect1         0.027    0.013    2.006    0.045    0.027    0.033
    indirect2         0.054    0.029    1.899    0.058    0.054    0.055
    indirect3        -0.020    0.012   -1.597    0.110   -0.020   -0.024
    indirect4        -0.059    0.021   -2.772    0.006   -0.059   -0.060



Kinga Bierwiaczonek

Mar 21, 2019, 2:05:07 PM
to lavaan
Just to add one more detail, the correlation between the two IVs retrieved from the fitted model is .43 for Group 2. 


Terrence Jorgensen

Mar 22, 2019, 4:45:44 AM
to lavaan
When I try to test invariance with two groups using group.equal = "regressions", lavaan does not seem to constrain the paths and I get the same output as for the unconstrained model. No warnings are produced, but I do notice that the SE of covariance between two DVs with missings is not estimated in Group 2. 

I am wondering why that might be.

You are already providing labels for the paths, but they apply to only one group, because you are providing only one label per path. Instead, you need a vector of labels, one per group. You can constrain paths to equality by using the same label for both groups.

Another mistake in your script is that you are using labels for parameters in equations with 2 DVs on the left-hand side, which (I assume inadvertently) constrains the effects on Hostil to equal the effects on neg. Instead, use different labels on different lines for each DV:

model <- "
Hostil ~ c(h.b1, h.b1)*CN + ...
neg    ~ c(n.b1, n.b1)*CN + ...
...
"
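A fuller sketch of this pattern, using the variable names from the model earlier in the thread. Everything specific here is an assumption for illustration: the data are simulated, the per-group label names (such as h.b1.g1) are made up, and only the paths from the mediators to the DVs are tested for equality. Repeating a label inside c() constrains that path across groups, while distinct labels leave it free. The two fits can then be compared with a likelihood ratio test. The lavaan calls are only run if the package is installed.

```r
# Sketch only: simulated two-group data (hypothetical), not the original dataset.
set.seed(42)
n <- 400
Data <- data.frame(
  Stat     = rep(c("HS", "LS"), each = n / 2),
  status_t = rnorm(n),
  values_t = rnorm(n)
)
Data$CN       <- 0.3 * Data$status_t + 0.6 * Data$values_t + rnorm(n)
Data$nat_id_s <- 0.1 * Data$status_t + 0.4 * Data$values_t + rnorm(n)
Data$Hostil   <- 0.1 * Data$CN - 0.1 * Data$nat_id_s +
                 0.2 * Data$status_t + 0.3 * Data$values_t + rnorm(n)
Data$neg      <- 0.1 * Data$CN - 0.1 * Data$nat_id_s +
                 0.2 * Data$status_t + 0.3 * Data$values_t + rnorm(n)

# Different labels per group: the labeled paths are free in each group.
model.free <- "
Hostil ~ c(h.b1.g1, h.b1.g2)*CN + c(h.b2.g1, h.b2.g2)*nat_id_s + status_t + values_t
neg    ~ c(n.b1.g1, n.b1.g2)*CN + c(n.b2.g1, n.b2.g2)*nat_id_s + status_t + values_t
CN       ~ status_t + values_t
nat_id_s ~ status_t + values_t
CN ~~ nat_id_s
"

# Same label repeated: the labeled paths are constrained equal across groups.
model.equal <- "
Hostil ~ c(h.b1, h.b1)*CN + c(h.b2, h.b2)*nat_id_s + status_t + values_t
neg    ~ c(n.b1, n.b1)*CN + c(n.b2, n.b2)*nat_id_s + status_t + values_t
CN       ~ status_t + values_t
nat_id_s ~ status_t + values_t
CN ~~ nat_id_s
"

if (requireNamespace("lavaan", quietly = TRUE)) {
  fit.free  <- lavaan::sem(model.free,  data = Data, group = "Stat")
  fit.equal <- lavaan::sem(model.equal, data = Data, group = "Stat")
  # Likelihood ratio test of the four cross-group equality constraints
  print(lavaan::lavTestLRT(fit.free, fit.equal))
}
```

Separate labels per DV (h.* and n.*) keep the effects on Hostil and neg distinct. Reusing one label on both lines would additionally equate them across the two DVs.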

Kinga Bierwiaczonek

Mar 22, 2019, 11:33:57 AM
to lav...@googlegroups.com
Dear Terrence, 

Thanks a million for your advice! I was convinced group.equal would take care of the labels, but using a vector did indeed solve the issue.

As to the second point, thanks for your keen observation. The effects on Hostil and neg are constrained intentionally, as both DVs are theoretically related and constraining these paths to equality improves the overall model fit.

Best, 
Kinga

