Hi all,
I've run into an interesting situation involving the relationships between predictors in a lavaan SEM that I'm having difficulty fully understanding.
First, an outline: all IVs have moderate to strong correlations with one another, and each IV correlates negatively with the DV. Yet in the SEM regression, two of the predictors change sign (and one of them remains significant). In a plain linear regression model, the VIF values seem to rule out a multicollinearity problem.
My main questions are:
1. Is there a better way to test for multicollinearity in SEM?
2. If multicollinearity is indeed no concern here, am I looking at a suppression model?
3. If so, what next steps would allow me to identify the suppressor variable?
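For question 3, the kind of check I have in mind is a drop-one-predictor comparison: refit the model leaving out each predictor in turn and see whose removal flips the reversed signs back. A rough sketch (it assumes a string `measurement` holding the five `=~` lines of the model below, and the same data frame `a`; both are from my setup, not lavaan built-ins):

```r
library(lavaan)

# Drop each structural predictor in turn and refit. If removing one
# predictor flips another's coefficient back to negative, the dropped
# variable is a candidate suppressor.
predictors <- c("X1", "X2", "X3", "X4")
for (dropped in predictors) {
  rhs <- paste(setdiff(predictors, dropped), collapse = " + ")
  f   <- sem(paste0(measurement, "\nY ~ ", rhs), data = a)
  cat("dropped:", dropped, "\n")
  print(subset(standardizedSolution(f), op == "~"))
}
```

Is that a sensible way to go about it, or is there a more principled procedure?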
Thanks!
Hila
Specific data and code:
Zero order correlation matrix:
|    | x1     | x2     | x3     | x4     |
|----|--------|--------|--------|--------|
| x2 | 0.372  |        |        |        |
| x3 | 0.434  | 0.615  |        |        |
| x4 | 0.380  | 0.322  | 0.447  |        |
| Y  | -0.245 | -0.400 | -0.128 | -0.021 |
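Working just from these zero-order correlations, the sign reversal can be reproduced by hand: the standardized regression weights are beta = Rxx^-1 * rxy, so inverting the predictor correlation matrix already shows x3 and x4 flipping positive. A quick check in base R:

```r
# Predictor inter-correlations from the table above
Rxx <- matrix(c(1.000, 0.372, 0.434, 0.380,
                0.372, 1.000, 0.615, 0.322,
                0.434, 0.615, 1.000, 0.447,
                0.380, 0.322, 0.447, 1.000), nrow = 4, byrow = TRUE)
# Correlations of x1..x4 with Y
rxy <- c(-0.245, -0.400, -0.128, -0.021)

# Standardized partial regression weights
beta <- solve(Rxx, rxy)
round(beta, 2)
# roughly -0.20 -0.49  0.21  0.12 :
# x3 and x4 turn positive despite their negative zero-order correlations
```

So the reversal is already built into the correlation structure, independent of the measurement model.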
Creating an SEM model with the following syntax (indicator names anonymized here to match the output below):
model <- '
# measurement model
X1 =~ 1X1 + 2X1 + 3X1
X2 =~ 1X2 + 2X2 + 3X2
X3 =~ 1X3 + 2X3 + 3X3
X4 =~ 1X4 + 2X4 + 3X4
Y  =~ 1Y + 2Y + 3Y
# regressions
Y ~ X1 + X2 + X3 + X4
# covariances among the exogenous latents
X1 ~~ X2 + X3 + X4
X2 ~~ X3 + X4
X3 ~~ X4'
fit <- sem(model, data = a, se = "bootstrap")
summary(fit, fit.measures = TRUE, rsq=TRUE, ci=TRUE)
This produced the following results:
lavaan (0.5-20) converged normally after 55 iterations

                                            Used    Total
  Number of observations                     479      483

  Estimator                                   ML
  Minimum Function Test Statistic        293.695
  Degrees of freedom                          80
  P-value (Chi-square)                     0.000

Model test baseline model:

  Minimum Function Test Statistic       4639.602
  Degrees of freedom                         105
  P-value                                  0.000

User model versus baseline model:

  Comparative Fit Index (CFI)              0.953
  Tucker-Lewis Index (TLI)                 0.938

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)         -8277.149
  Loglikelihood unrestricted model (H1) -8130.302
  Number of free parameters                    40
  Akaike (AIC)                          16634.299
  Bayesian (BIC)                        16801.167
  Sample-size adjusted Bayesian (BIC)   16674.211

Root Mean Square Error of Approximation:

  RMSEA                                    0.075
  90 Percent Confidence Interval     0.066  0.084
  P-value RMSEA <= 0.05                    0.000

Standardized Root Mean Square Residual:

  SRMR                                     0.071

Parameter Estimates:

  Information                           Observed
  Standard Errors                      Bootstrap
  Number of requested bootstrap draws       1000
  Number of successful bootstrap draws      1000

Latent Variables:
                   Estimate  Std.Err  Z-value  P(>|z|)  ci.lower  ci.upper
  X1 =~
    1X1               1.000                                1.000     1.000
    2X1               1.126    0.057   19.852    0.000     1.020     1.253
    3X1               1.060    0.055   19.388    0.000     0.964     1.175
  X2 =~
    1X2               1.000                                1.000     1.000
    2X2               0.961    0.037   25.802    0.000     0.891     1.039
    3X2               0.912    0.036   25.442    0.000     0.839     0.985
  X3 =~
    1X3               1.000                                1.000     1.000
    2X3               0.906    0.049   18.528    0.000     0.815     1.004
    3X3               1.074    0.040   27.114    0.000     0.999     1.157
  X4 =~
    1X4               1.000                                1.000     1.000
    2X4               1.983    0.446    4.447    0.000     1.118     2.896
    3X4               1.324    0.354    3.742    0.000     0.610     1.989
  Y =~
    1Y                1.000                                1.000     1.000
    2Y                1.431    0.071   20.228    0.000     1.296     1.572
    3Y                1.348    0.061   22.087    0.000     1.235     1.476

Regressions:
                   Estimate  Std.Err  Z-value  P(>|z|)  ci.lower  ci.upper
  Y ~
    X1               -0.100    0.025   -4.036    0.000    -0.150    -0.053
    X2               -0.275    0.034   -8.023    0.000    -0.348    -0.212
    X3                0.124    0.034    3.616    0.000     0.058     0.192
    X4                0.093    0.062    1.504    0.133    -0.026     0.217

Covariances:
                   Estimate  Std.Err  Z-value  P(>|z|)  ci.lower  ci.upper
  X1 ~~
    X2                0.360    0.056    6.394    0.000     0.247     0.475
    X3                0.503    0.058    8.727    0.000     0.389     0.622
    X4                0.266    0.077    3.473    0.001     0.148     0.445
  X2 ~~
    X3                0.578    0.055   10.565    0.000     0.467     0.684
    X4                0.166    0.078    2.120    0.034     0.056     0.369
  X3 ~~
    X4                0.336    0.077    4.380    0.000     0.199     0.485

Variances:
                   Estimate  Std.Err  Z-value  P(>|z|)  ci.lower  ci.upper
    1X1               0.655    0.064   10.222    0.000     0.527     0.784
    2X1               0.339    0.046    7.351    0.000     0.239     0.423
    3X1               0.391    0.049    8.006    0.000     0.299     0.494
    1X2               0.121    0.018    6.830    0.000     0.087     0.157
    2X2               0.182    0.023    7.966    0.000     0.137     0.225
    3X2               0.168    0.020    8.483    0.000     0.130     0.207
    1X3               0.513    0.042   12.305    0.000     0.428     0.593
    2X3               0.534    0.062    8.656    0.000     0.426     0.660
    3X3               0.227    0.033    6.956    0.000     0.162     0.292
    1X4               0.672    0.097    6.914    0.000     0.430     0.832
    2X4               0.790    0.255    3.095    0.002     0.334     1.343
    3X4               2.224    0.181   12.318    0.000     1.884     2.602
    1Y                0.113    0.015    7.318    0.000     0.085     0.145
    2Y                0.143    0.018    8.157    0.000     0.110     0.178
    3Y                0.051    0.012    4.262    0.000     0.027     0.074
    X1                1.064    0.105   10.125    0.000     0.860     1.283
    X2                0.753    0.064   11.763    0.000     0.631     0.880
    X3                1.032    0.078   13.216    0.000     0.870     1.178
    X4                0.337    0.104    3.228    0.001     0.191     0.607
    Y                 0.132    0.015    8.540    0.000     0.101     0.161

R-Square:
                   Estimate
    1X1               0.619
    2X1               0.799
    3X1               0.753
    1X2               0.861
    2X2               0.792
    3X2               0.788
    1X3               0.668
    2X3               0.614
    3X3               0.840
    1X4               0.334
    2X4               0.626
    3X4               0.210
    1Y                0.616
    2Y                0.721
    3Y                0.866
    Y                 0.270
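To put these structural paths on the same scale as the zero-order correlations (and make the sign reversals directly comparable), the standardized estimates can be pulled from the fitted object; a small sketch, assuming the `fit` object from above:

```r
# Standardized structural paths, comparable to the zero-order correlations
std <- standardizedSolution(fit)
std[std$op == "~", c("lhs", "op", "rhs", "est.std", "ci.lower", "ci.upper")]
```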
Entering all variables into a linear regression model and checking for multicollinearity:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0005982 0.0406861 0.015 0.988276
x1 -0.2001228 0.0469520 -4.262 0.0000244 ***
x2 -0.4954629 0.0524208 -9.452 < 0.0000000000000002 ***
x3 0.2092660 0.0559207 3.742 0.000205 ***
x4 0.1187509 0.0469726 2.528 0.011792 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.8904 on 474 degrees of freedom
  (4 observations deleted due to missingness)
Multiple R-squared: 0.2157, Adjusted R-squared: 0.2091
F-statistic: 32.59 on 4 and 474 DF, p-value: < 0.00000000000000022

> vif(fit)
      x1       x2       x3       x4
1.319706 1.638911 1.883168 1.317138
Is there some lavaan command to compute the VIF values directly?
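As far as I can tell, lavaan has no built-in VIF function, but the VIFs for the latent predictors can be computed by hand as the diagonal of the inverse of their model-implied correlation matrix (the same VIF = diag(R^-1) identity that `car::vif` uses). A sketch, assuming the `fit` object from above:

```r
# Model-implied correlations among the latent variables
lv_cor <- lavInspect(fit, "cor.lv")

# Keep only the exogenous predictors, then invert:
# VIF_j is the j-th diagonal element of the inverse correlation matrix
preds <- c("X1", "X2", "X3", "X4")
Rxx   <- lv_cor[preds, preds]
vif_latent <- diag(solve(Rxx))
round(vif_latent, 3)
```

Would this be an acceptable substitute, or is there a better-established diagnostic at the latent level?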