FIML and missing standard errors

kinga.bie...@gmail.com

Feb 27, 2019, 11:10:15 AM

to lavaan

Dear colleagues,

I am completely new to lavaan, and it may be a beginner's question but still I would appreciate any help.

I am working with a secondary dataset that has three IVs with missing data (x1, x2, x3). More specifically, each participant was randomly shown only one of the three IVs, and the other two were not displayed. That is, each IV is missing for 2/3 of the sample.

When I try to fit my model with all 3 IVs, using FIML to deal with the missing data, lavaan doesn't estimate CFI, TLI, or standard errors for the model.

When I try to fit the same model but with only 1 or 2 IVs, lavaan estimates everything normally.

I assume it has to do with the missing data and with the fact that FIML estimates high covariances between the IVs. I wonder if there is any way to run the model with all variables and get all estimates (other than using MI).

Thanks for any suggestions!

Model <- "
  y1 ~ b1*m1 + b2*m2 + c1*x1 + c2*x2 + c3*x3
  m1 ~ a1*x1 + a2*x2 + a3*x3
  m2 ~ a4*x1 + a5*x2 + a6*x3
  m1 ~~ m2
"

fit <- sem(model = Model, data = Data, fixed.x = FALSE, missing = "ml")

Best regards,

Kinga Bierwiaczonek
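
To illustrate the design described in the post above, here is a minimal base-R sketch on simulated data. The sample size, seed, and values are made up; only the names x1, x2, x3 come from the post. It shows that when every participant sees exactly one IV, no pair of IVs is ever observed together, so the data contain no direct information about the covariances among the IVs, which are free parameters when fixed.x = FALSE. This is one plausible reason for the missing standard errors, not a confirmed diagnosis.

```r
set.seed(123)
n <- 300

# Hypothetical planned-missingness design: each participant is randomly
# shown exactly one of three IVs (x1, x2, x3); the other two are NA.
shown <- sample(rep(c("x1", "x2", "x3"), length.out = n))
Data <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
for (v in names(Data)) {
  Data[[v]][shown != v] <- NA
}

# Proportion missing per IV (2/3 each by construction)
prop_missing <- colMeans(is.na(Data))
print(prop_missing)

# Number of cases in which each pair of IVs is jointly observed:
# diagonal = cases observed per IV, off-diagonal = jointly observed cases
observed <- (!is.na(Data)) * 1
joint_n <- crossprod(observed)
print(joint_n)
```

Running the same two checks (colMeans(is.na(...)) and the crossprod of the observed-indicator matrix) on the real data is a quick way to see which covariances have no jointly observed cases behind them.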

Terrence Jorgensen

Feb 28, 2019, 10:33:46 PM

to lavaan

fit <- sem(model = Model, data = Data, fixed.x = F, missing = "ml")

You can set fixed.x = TRUE and missing = "fiml.x" to retain cases with missing values on exogenous covariates. But what are the 3 IVs? Are they conceptually redundant (i.e., would they be multicollinear if there were any overlap in observations)?

Terrence D. Jorgensen

Assistant Professor, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam

http://www.uva.nl/profile/t.d.jorgensen
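
A minimal sketch of the suggested call, assuming the lavaan package is installed. The data are simulated and the model is a simplified placeholder (not the poster's model); the covariates here are missing completely at random rather than by design, purely so that the example runs cleanly.

```r
library(lavaan)

set.seed(42)
n <- 300

# Simulated data (hypothetical values) with two exogenous covariates
x1 <- rnorm(n)
x2 <- 0.5 * x1 + rnorm(n)
m1 <- 0.4 * x1 + 0.3 * x2 + rnorm(n)
y1 <- 0.5 * m1 + 0.2 * x1 + rnorm(n)
Data <- data.frame(x1, x2, m1, y1)

# Make some covariate values missing completely at random
Data$x1[sample(n, 60)] <- NA
Data$x2[sample(n, 60)] <- NA

Model <- "
  y1 ~ m1 + x1 + x2
  m1 ~ x1 + x2
"

# fixed.x = TRUE together with missing = "fiml.x" keeps cases that have
# missing values on the exogenous covariates
fit <- sem(Model, data = Data, fixed.x = TRUE, missing = "fiml.x")

lavInspect(fit, "nobs")   # number of cases used in the analysis
parameterEstimates(fit)   # estimates with standard errors
```

Comparing lavInspect(fit, "nobs") against nrow(Data) is a simple check that no cases were dropped because of missing covariate values.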

Kinga Bierwiaczonek

Mar 1, 2019, 3:29:28 AM

to lav...@googlegroups.com

Dear Terrence,

Thanks a lot for your answer! When I changed the parameters as suggested, lavaan estimated standard errors, but CFI and TLI are still NA, and RMSEA is 0 with a p-value of NA.

The 3 IVs are three types of intergroup threat. They are conceptually related and usually correlated at about .5 (in other studies). There are no overlapping observations in this dataset; participants always responded to only one type of threat. I assume, however, that the IVs may explain similar variance in the DVs and mediators, and that this ends up producing “multicollinearity” when using FIML, if that makes sense.

If so, is there any solution?

Thanks again,

Kinga

--
You received this message because you are subscribed to a topic in the Google Groups "lavaan" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/lavaan/H65xyGbzhdA/unsubscribe.
To unsubscribe from this group and all its topics, send an email to lavaan+un...@googlegroups.com.
To post to this group, send email to lav...@googlegroups.com.
Visit this group at https://groups.google.com/group/lavaan.
For more options, visit https://groups.google.com/d/optout.

kinga.bie...@gmail.com

Mar 21, 2019, 1:57:52 PM

to lavaan

Dear colleagues,

I was able to get the above script working by switching to robust EM (missing = "robust.two.stage"). Both DVs and one mediator are quite skewed, but robust EM seems to be handling this well.

However, I am running into another issue. When I try to test invariance with two groups using group.equal = "regressions", lavaan does not seem to constrain the paths and I get the same output as for the unconstrained model. No warnings are produced, but I do notice that the SE of the covariance between the two IVs with missing data is not estimated in Group 2.

I am wondering why that might be. I would be grateful for any suggestions!

I am pasting below my script and the output.

Kinga

model <- "
  Hostil + neg ~ b1*CN + b2*nat_id_s + c1*status_t + c2*values_t
  CN ~ a1*status_t + a2*values_t
  nat_id_s ~ a3*status_t + a4*values_t
  CN ~~ nat_id_s

  indirect1 := a1*b1
  indirect2 := a2*b1
  indirect3 := a3*b2
  indirect4 := a4*b2
"

fit.all <- sem(model = model, data = Data, fixed.x = FALSE,
               missing = "robust.two.stage",
               group = "Stat", group.equal = "regressions")

summary(fit.all, standardized = TRUE, fit.measures = TRUE, rsq = TRUE)

lavaan 0.6-3 ended normally after 63 iterations

  Optimization method                         NLMINB
  Number of free parameters                       54
  Number of equality constraints                   4

  Number of observations per group
  HS                                             688
  LS                                             732
  Number of missing patterns per group
  HS                                               3
  LS                                               8

  Estimator                                       ML      Robust
  Model Fit Test Statistic                    39.361      23.332
  Degrees of freedom                               4           4
  P-value (Chi-square)                         0.000       0.000
  Scaling correction factor                                1.687
    for the Satorra-Bentler correction

Chi-square for each group:

  HS                                          39.361      23.332
  LS                                           0.000       0.000

Model test baseline model:

  Minimum Function Test Statistic           2680.476   22649.458
  Degrees of freedom                              28          28
  P-value                                      0.000       0.000

User model versus baseline model:

  Comparative Fit Index (CFI)                  0.987       0.999
  Tucker-Lewis Index (TLI)                     0.907       0.994

  Robust Comparative Fit Index (CFI)                       0.988
  Robust Tucker-Lewis Index (TLI)                          0.915

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)           -11529.976  -11529.976
  Loglikelihood unrestricted model (H1)   -11510.267  -11510.267

  Number of free parameters                       50          50
  Akaike (AIC)                             23159.952   23159.952
  Bayesian (BIC)                           23422.872   23422.872
  Sample-size adjusted Bayesian (BIC)      23264.040   23264.040

Root Mean Square Error of Approximation:

  RMSEA                                        0.112       0.083
  90 Percent Confidence Interval      0.082 0.145    0.059 0.108
  P-value RMSEA <= 0.05                        0.001       0.014

  Robust RMSEA                                             0.107
  90 Percent Confidence Interval                     0.068 0.151

Standardized Root Mean Square Residual:

  SRMR                                         0.011       0.011

Parameter Estimates:

  Information                               Observed
  Information saturated (h1) model        Structured
  Observed information based on                   H1
  Standard Errors                   Robust.two.stage

Group 1 [HS]:

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  Hostil ~
    CN        (b1)    0.087    0.045    1.932    0.053    0.087    0.119
    nat_id_s  (b2)   -0.144    0.032   -4.580    0.000   -0.144   -0.206
    status_t  (c1)    0.235    0.046    5.146    0.000    0.235    0.290
    values_t  (c2)    0.337    0.064    5.265    0.000    0.337    0.344
  neg ~
    CN        (b1)    0.087    0.045    1.932    0.053    0.087    0.111
    nat_id_s  (b2)   -0.144    0.032   -4.580    0.000   -0.144   -0.192
    status_t  (c1)    0.235    0.046    5.146    0.000    0.235    0.270
    values_t  (c2)    0.337    0.064    5.265    0.000    0.337    0.319
  CN ~
    status_t  (a1)    0.309    0.078    3.943    0.000    0.309    0.279
    values_t  (a2)    0.620    0.084    7.364    0.000    0.620    0.462
  nat_id_s ~
    status_t  (a3)    0.136    0.075    1.819    0.069    0.136    0.117
    values_t  (a4)    0.408    0.102    3.989    0.000    0.408    0.290

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
 .CN ~~
   .nat_id_s          0.596    0.093    6.385    0.000    0.596    0.451
 .Hostil ~~
   .neg               0.353    0.047    7.537    0.000    0.353    0.500
  status_t ~~
    values_t          0.285    0.012   24.255    0.000    0.285    0.253

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .Hostil           -0.135    0.213   -0.634    0.526   -0.135   -0.142
   .neg               0.204    0.216    0.942    0.346    0.204    0.200
   .CN               -0.101    0.306   -0.330    0.742   -0.101   -0.078
   .nat_id_s          2.940    0.385    7.634    0.000    2.940    2.173
    status_t          3.091    0.070   43.890    0.000    3.091    2.647
    values_t          3.855    0.056   69.103    0.000    3.855    4.001

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .Hostil            0.640    0.073    8.799    0.000    0.640    0.714
   .neg               0.780    0.055   14.308    0.000    0.780    0.753
   .CN                1.078    0.116    9.302    0.000    1.078    0.644
   .nat_id_s          1.619    0.116   13.919    0.000    1.619    0.885
    status_t          1.363    0.093   14.600    0.000    1.363    1.000
    values_t          0.928    0.091   10.254    0.000    0.928    1.000

R-Square:
                   Estimate
    Hostil            0.286
    neg               0.247
    CN                0.356
    nat_id_s          0.115

Group 2 [LS]:

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  Hostil ~
    CN               -0.017    0.059   -0.286    0.775   -0.017   -0.020
    nat_id_s         -0.096    0.043   -2.246    0.025   -0.096   -0.100
    status_t          0.340    0.058    5.873    0.000    0.340    0.361
    values_t          0.373    0.077    4.823    0.000    0.373    0.322
  neg ~
    CN               -0.045    0.063   -0.715    0.474   -0.045   -0.055
    nat_id_s         -0.152    0.044   -3.438    0.001   -0.152   -0.165
    status_t          0.346    0.062    5.589    0.000    0.346    0.384
    values_t          0.383    0.075    5.128    0.000    0.383    0.346
  CN ~
    status_t          0.463    0.054    8.640    0.000    0.463    0.417
    values_t          0.621    0.066    9.389    0.000    0.621    0.455
  nat_id_s ~
    status_t          0.190    0.065    2.926    0.003    0.190    0.194
    values_t          0.420    0.074    5.643    0.000    0.420    0.347

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
 .CN ~~
   .nat_id_s          0.297    0.062    4.800    0.000    0.297    0.343
 .Hostil ~~
   .neg               0.351    0.048    7.371    0.000    0.351    0.430
  status_t ~~
    values_t          0.469       NA                      0.469    0.434

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .Hostil           -0.755    0.303   -2.491    0.013   -0.755   -0.694
   .neg              -0.202    0.283   -0.712    0.477   -0.202   -0.194
   .CN               -0.880    0.264   -3.330    0.001   -0.880   -0.687
   .nat_id_s          3.010    0.270   11.168    0.000    3.010    2.655
    status_t          4.168    0.065   63.922    0.000    4.168    3.611
    values_t          4.465    0.051   87.878    0.000    4.465    4.760

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .Hostil            0.856    0.076   11.325    0.000    0.856    0.723
   .neg               0.779    0.059   13.126    0.000    0.779    0.721
   .CN                0.745    0.078    9.543    0.000    0.745    0.454
   .nat_id_s          1.007    0.077   13.029    0.000    1.007    0.784
    status_t          1.332    0.105   12.682    0.000    1.332    1.000
    values_t          0.880    0.067   13.148    0.000    0.880    1.000

R-Square:
                   Estimate
    Hostil            0.277
    neg               0.279
    CN                0.546
    nat_id_s          0.216

Defined Parameters:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
    indirect1         0.027    0.013    2.006    0.045    0.027    0.033
    indirect2         0.054    0.029    1.899    0.058    0.054    0.055
    indirect3        -0.020    0.012   -1.597    0.110   -0.020   -0.024
    indirect4        -0.059    0.021   -2.772    0.006   -0.059   -0.060


Kinga Bierwiaczonek

Mar 21, 2019, 2:05:07 PM

to lavaan

Just to add one more detail, the correlation between the two IVs retrieved from the fitted model is .43 for Group 2.


Terrence Jorgensen

Mar 22, 2019, 4:45:44 AM

to lavaan

When I try to test invariance with two groups using group.equal = "regressions", lavaan does not seem to constrain the paths and I get the same output as for the unconstrained model. No warnings are produced, but I do notice that the SE of the covariance between the two IVs with missing data is not estimated in Group 2.

I am wondering why that might be.

You are already providing labels for the paths, but only for one group, because you are providing only one label per path. Instead, you need a vector of labels, one per group:

http://lavaan.ugent.be/tutorial/groups.html

You can constrain paths to equality by using the same label for both groups.

Another mistake in your script is that you are using labels for parameters in equations with 2 DVs on the left-hand side, which (I assume inadvertently) constrains the effects on Hostil to equal the effects on neg. Instead, use different labels on different lines for each DV:

model <- "
  Hostil ~ c(h.b1, h.b1)*CN + ...
  neg ~ c(n.b1, n.b1)*CN + ...
  ...
"
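
A minimal, runnable sketch of the label-vector approach, assuming the lavaan package is installed. It uses lavaan's built-in HolzingerSwineford1939 data (two groups defined by school); the variables x1, x2, x4 and the label names are placeholders for illustration, not the variables from this thread.

```r
library(lavaan)

# Unconstrained model: a different label per group, so paths are free
model.free <- "
  x4 ~ c(b1.g1, b1.g2)*x1 + c(b2.g1, b2.g2)*x2
"

# Constrained model: repeating the same label inside c() forces the
# path to be equal across the two groups
model.equal <- "
  x4 ~ c(b1, b1)*x1 + c(b2, b2)*x2
"

fit.free  <- sem(model.free,  data = HolzingerSwineford1939, group = "school")
fit.equal <- sem(model.equal, data = HolzingerSwineford1939, group = "school")

# Check that the constraint took hold: estimates should match across groups
pe <- parameterEstimates(fit.equal)
print(pe[pe$op == "~", c("lhs", "op", "rhs", "group", "label", "est")])

# Likelihood-ratio (chi-square difference) test of the equality constraints
print(lavTestLRT(fit.equal, fit.free))
```

Inspecting the regression rows of parameterEstimates() for both groups is a direct way to confirm that the paths really are constrained before interpreting the difference test.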

Kinga Bierwiaczonek

Mar 22, 2019, 11:33:57 AM

to lav...@googlegroups.com

Dear Terrence,

Thanks a million for your advice! I was convinced group.equal would take care of the labels, but using a vector did indeed solve the issue.

As to the second point, thanks for your keen observation. The effects on Hostil and neg are constrained intentionally as both DVs are theoretically related and constraining these paths to equality improves the overall model fit.

Best,

Kinga
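
When the paths carry group-specific labels (rather than labels constrained equal), defined parameters such as indirect effects have to refer to those group-specific labels. A minimal sketch, assuming lavaan is installed and again using its built-in HolzingerSwineford1939 data; the variables and label names are placeholders, not the ones from this thread.

```r
library(lavaan)

model.med <- "
  # a-path and b-path, each with one label per group
  x2 ~ c(a.g1, a.g2)*x1
  x3 ~ c(b.g1, b.g2)*x2 + x1

  # one indirect effect per group, plus their difference
  ind.g1   := a.g1*b.g1
  ind.g2   := a.g2*b.g2
  ind.diff := a.g1*b.g1 - a.g2*b.g2
"

fit.med <- sem(model.med, data = HolzingerSwineford1939, group = "school")

# Defined parameters appear with op == ":="
pe <- parameterEstimates(fit.med)
print(pe[pe$op == ":=", c("lhs", "est", "se", "pvalue")])
```

If the a and b paths are instead constrained equal with a repeated label, a single definition such as ind := a*b is enough, because both groups then share the same parameters.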
