Coefficients that reverse sign when entering SEM


Hila

unread,
Apr 27, 2017, 5:45:11 AM4/27/17
to lavaan

Hi all,

I've encountered an interesting relationship between predictors in lavaan's SEM that I'm having difficulty understanding thoroughly.

First, an outline: all the IVs have moderate to strong correlations with one another, and each IV is negatively correlated with the DV. In a regression analysis using SEM, two variables change sign (one of them remains significant). In an ordinary linear regression model, the VIFs seem to rule out multicollinearity as a concern.

My main questions are:

1. Is there a better way to test for multicollinearity in SEM?

2. If multicollinearity is indeed not a concern here, am I looking at a suppression model?

3. If so, what next steps would allow me to identify the suppressor variable?


Thanks!

Hila


Specific data and code:


Zero-order correlation matrix (r / p-value / N):

           x1                  x2                  x3                  x4
x2     0.372 <.001 481
x3     0.434 <.001 481    0.615 <.001 481
x4     0.380 <.001 481    0.322 <.001 480    0.447 <.001 480
Y     -0.245 <.001 481   -0.400 <.001 480   -0.128 0.005 480   -0.021 0.647 480

Creating a SEM model with the following syntax:

model <- '
  # measurement model
  X1 =~ 1x1 + 2x1 + 3x1
  X2 =~ sem1expressionNew + sem2expressionNew + sem3expressionNew
  X3 =~ sem1hoffman.SelfDefinition + sem2hoffman.SelfDefinition + sem3hoffman.SelfDefinition
  X4 =~ sem1hoffman.SelfAcceptance + sem2hoffman.SelfAcceptance + sem3hoffman.SelfAcceptance
  Y  =~ 1Y + 2Y + 3Y

  # regressions
  Y ~ X1 + X2 + X3 + X4

  # covariances among the latent predictors
  X1 ~~ X2 + X3 + X4
  X2 ~~ X3 + X4
  X3 ~~ X4
'

 

fit <- sem(model, data = a, se = "bootstrap")

summary(fit, fit.measures = TRUE, rsq=TRUE, ci=TRUE)

 

This produced the following results:

lavaan (0.5-20) converged normally after  55 iterations
 
                                                  Used       Total
  Number of observations                           479         483
 
  Estimator                                         ML
  Minimum Function Test Statistic              293.695
  Degrees of freedom                                80
  P-value (Chi-square)                           0.000
 
Model test baseline model:
 
  Minimum Function Test Statistic             4639.602
  Degrees of freedom                               105
  P-value                                        0.000
 
User model versus baseline model:
 
  Comparative Fit Index (CFI)                    0.953
  Tucker-Lewis Index (TLI)                       0.938
 
Loglikelihood and Information Criteria:
 
  Loglikelihood user model (H0)              -8277.149
  Loglikelihood unrestricted model (H1)      -8130.302
 
  Number of free parameters                         40
  Akaike (AIC)                               16634.299
  Bayesian (BIC)                             16801.167
  Sample-size adjusted Bayesian (BIC)        16674.211
 
Root Mean Square Error of Approximation:
 
  RMSEA                                          0.075
  90 Percent Confidence Interval          0.066  0.084
  P-value RMSEA <= 0.05                          0.000
 
Standardized Root Mean Square Residual:
 
  SRMR                                           0.071
 
Parameter Estimates:
 
  Information                                 Observed
  Standard Errors                            Bootstrap
  Number of requested bootstrap draws             1000
  Number of successful bootstrap draws            1000
 
Latent Variables:
                  Estimate  Std.Err  Z-value  P(>|z|) ci.lower ci.upper
  X1=~                                                                  
    1X1             1.000                               1.000    1.000
    2X1             1.126    0.057   19.852    0.000    1.020    1.253
    3X1             1.060    0.055   19.388    0.000    0.964    1.175
  X2 =~                                                      
    1X2             1.000                               1.000    1.000
    2X2             0.961    0.037   25.802    0.000    0.891    1.039
    3X2             0.912    0.036   25.442    0.000    0.839    0.985
  X3 =~                                                      
    1X3             1.000                               1.000    1.000
    2X3             0.906    0.049   18.528    0.000    0.815    1.004
    3X3             1.074    0.040   27.114    0.000    0.999    1.157
  X4 =~                                                               
    1X4             1.000                               1.000    1.000
    2X4             1.983    0.446    4.447    0.000    1.118    2.896
    3X4             1.324    0.354    3.742    0.000    0.610    1.989
  Y =~                                                                       
    1Y              1.000                               1.000    1.000
    2Y              1.431    0.071   20.228    0.000    1.296    1.572
    3Y              1.348    0.061   22.087    0.000    1.235    1.476
 
Regressions:
                   Estimate  Std.Err  Z-value  P(>|z|) ci.lower ci.upper
  Y ~                                                               
    X1             -0.100    0.025   -4.036    0.000   -0.150   -0.053
    X2             -0.275    0.034   -8.023    0.000   -0.348   -0.212
    X3              0.124    0.034    3.616    0.000    0.058    0.192
    X4              0.093    0.062    1.504    0.133   -0.026    0.217
 
Covariances:
                   Estimate  Std.Err  Z-value  P(>|z|) ci.lower ci.upper
  X1~~                                                                  
    X2              0.360    0.056    6.394    0.000    0.247    0.475
    X3              0.503    0.058    8.727    0.000    0.389    0.622
    X4              0.266    0.077    3.473    0.001    0.148    0.445
  X2 ~~                                                      
    X3              0.578    0.055   10.565    0.000    0.467    0.684
    X4              0.166    0.078    2.120    0.034    0.056    0.369
  X3 ~~                                                      
    X4              0.336    0.077    4.380    0.000    0.199    0.485
 
Variances:
         Estimate  Std.Err  Z-value  P(>|z|) ci.lower ci.upper
    1X1    0.655    0.064   10.222    0.000    0.527    0.784
    2X1    0.339    0.046    7.351    0.000    0.239    0.423
    3X1    0.391    0.049    8.006    0.000    0.299    0.494
    1X2    0.121    0.018    6.830    0.000    0.087    0.157
    2X2    0.182    0.023    7.966    0.000    0.137    0.225
    3X2    0.168    0.020    8.483    0.000    0.130    0.207
    1X3    0.513    0.042   12.305    0.000    0.428    0.593
    2X3    0.534    0.062    8.656    0.000    0.426    0.660
    3X3    0.227    0.033    6.956    0.000    0.162    0.292
    1X4    0.672    0.097    6.914    0.000    0.430    0.832
    2X4    0.790    0.255    3.095    0.002    0.334    1.343
    3X4    2.224    0.181   12.318    0.000    1.884    2.602
    1Y     0.113    0.015    7.318    0.000    0.085    0.145
    2Y     0.143    0.018    8.157    0.000    0.110    0.178
    3Y     0.051    0.012    4.262    0.000    0.027    0.074
    X1     1.064    0.105   10.125    0.000    0.860    1.283
    X2     0.753    0.064   11.763    0.000    0.631    0.880
    X3     1.032    0.078   13.216    0.000    0.870    1.178
    X4     0.337    0.104    3.228    0.001    0.191    0.607
    Y      0.132    0.015    8.540    0.000    0.101    0.161
 
R-Square:
         Estimate
    1X1    0.619
    2X1    0.799
    3X1    0.753
    1X2    0.861
    2X2    0.792
    3X2    0.788
    1X3    0.668
    2X3    0.614
    3X3    0.840
    1X4    0.334
    2X4    0.626
    3X4    0.210
    1Y     0.616
    2Y     0.721
    3Y     0.866
    Y      0.270

 

Entering all the variables into an ordinary linear regression model and checking for multicollinearity:

              Estimate Std. Error t value             Pr(>|t|)   

(Intercept)  0.0005982  0.0406861   0.015             0.988276   

x1          -0.2001228  0.0469520  -4.262            0.0000244 ***

x2          -0.4954629  0.0524208  -9.452 < 0.0000000000000002 ***

x3           0.2092660  0.0559207   3.742             0.000205 ***

x4           0.1187509  0.0469726   2.528             0.011792 * 

---

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 
Residual standard error: 0.8904 on 474 degrees of freedom
  (4 observations deleted due to missingness)
Multiple R-squared:  0.2157,   Adjusted R-squared:  0.2091 
F-statistic: 32.59 on 4 and 474 DF,  p-value: < 0.00000000000000022
 
> vif(fit)
      x1       x2       x3       x4 
1.319706 1.638911 1.883168 1.317138 

 

Terrence Jorgensen

unread,
Apr 29, 2017, 6:16:00 AM4/29/17
to lavaan
Hi Hila,

Looks like a suppression situation.  Which variable you consider a "suppressor" might be as arbitrary as which variable in an interaction you consider the "moderator".  Here is some reading on the subject:


Since this question has nothing to do with lavaan, you would probably have more luck getting advice from the much larger forum of SEM experts on SEMNET:


Terrence D. Jorgensen
Postdoctoral Researcher, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Diógenes Bido

unread,
Jan 30, 2018, 11:04:42 AM1/30/18
to lavaan
Is there some lavaan command to compute the VIF values directly?

Or do we have to compute the VIFs "by hand": (i) run one model for each predictor, regressing it on the others; (ii) VIF = 1 / (1 − R²)?

Best regards

===================================

Terrence Jorgensen

unread,
Feb 12, 2018, 10:10:10 AM2/12/18
to lavaan
Is there some lavaan command to compute the VIF values directly?

No, the model does not estimate the effects of the predictors on each other.  You would need to run a model for each predictor, in which it is regressed on all the other predictors, to obtain that R².
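
For what it's worth, a minimal sketch of that by-hand approach for one latent predictor, reusing the measurement syntax from the model above (the auxiliary regression and the R² extraction are my own suggestion, not a built-in lavaan feature):

```r
library(lavaan)

# Auxiliary model: regress the predictor of interest (X1) on the other
# latent predictors; the measurement part is copied from the original model.
vif.model <- '
  X1 =~ 1x1 + 2x1 + 3x1
  X2 =~ sem1expressionNew + sem2expressionNew + sem3expressionNew
  X3 =~ sem1hoffman.SelfDefinition + sem2hoffman.SelfDefinition + sem3hoffman.SelfDefinition
  X4 =~ sem1hoffman.SelfAcceptance + sem2hoffman.SelfAcceptance + sem3hoffman.SelfAcceptance

  X1 ~ X2 + X3 + X4
'
vif.fit <- sem(vif.model, data = a)

# R-square of X1 from this auxiliary model, then VIF = 1 / (1 - R^2).
r2  <- inspect(vif.fit, "rsquare")["X1"]
vif <- 1 / (1 - r2)
vif
# Repeat with X2, X3, and X4 as the outcome to get the remaining VIFs.
```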