Question on multiple mediation


Niels

Nov 23, 2015, 10:32:36 AM
to lavaan

The lavaan tutorial has an example with one mediator. I wonder how I should proceed if I have multiple indirect effects.

Suppose a model with the following regressions:
lat4 ~ lat3 + lat2 + lat1
lat3 ~ lat2 + lat1
lat2 ~ lat1

which gives us the following path-diagram:



If I want to know the total effect of lat1 on lat4, I would do the following:

total             := c1 + (a1*b1) + (a2*b2) + (a1*c2*b2)
total_indirect := total - c1

Would this be correct?
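
One way to check a sum of indirect effects like this is to compare it with the matrix form of the total effects in a recursive model, (I - B)^-1 - I, where B holds the direct paths. The sketch below assumes the labels map onto the regressions as lat2 ~ a1*lat1, lat3 ~ a2*lat1 + c2*lat2 and lat4 ~ c1*lat1 + b1*lat2 + b2*lat3; the path diagram is not preserved here, so that mapping is a guess, and the numeric values are arbitrary illustrations rather than estimates.

```r
# Hypothetical label assignment (the original diagram is not available):
#   lat2 ~ a1*lat1
#   lat3 ~ a2*lat1 + c2*lat2
#   lat4 ~ c1*lat1 + b1*lat2 + b2*lat3
a1 <- 0.5; a2 <- 0.3; c2 <- 0.4   # arbitrary illustrative values
b1 <- 0.2; b2 <- 0.6; c1 <- 0.1

vars <- c("lat1", "lat2", "lat3", "lat4")
# B[i, j] is the direct effect of variable j on variable i
B <- matrix(0, 4, 4, dimnames = list(vars, vars))
B["lat2", "lat1"] <- a1
B["lat3", "lat1"] <- a2
B["lat3", "lat2"] <- c2
B["lat4", "lat1"] <- c1
B["lat4", "lat2"] <- b1
B["lat4", "lat3"] <- b2

# Total effects in a recursive model: (I - B)^-1 - I
total_effects <- solve(diag(4) - B) - diag(4)

# Path-tracing version, as written in the question
total          <- c1 + (a1 * b1) + (a2 * b2) + (a1 * c2 * b2)
total_indirect <- total - c1

# Row 4 is lat4, column 1 is lat1
print(c(matrix_form = total_effects[4, 1], path_tracing = total))
```

Under this label mapping the two numbers agree, which suggests the three indirect routes (via lat2, via lat3, and via lat2 then lat3) are all accounted for.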

[Inline image 1: path diagram of the model above; image not preserved]

Edward Rigdon

Nov 23, 2015, 12:09:49 PM
to lav...@googlegroups.com




Niels

Nov 23, 2015, 12:57:06 PM
to lavaan
Exactly, that's the one-mediator example I mentioned. But I'm still unsure whether I adapted this concept correctly in the example with two mediators above.

Manuel Herrera-Usagre

Nov 25, 2015, 6:08:34 AM
to lavaan
Seems correct to me. I wonder whether, with such a large number of parameters, the model is actually identified (http://davidakenny.net/cm/identify.htm). But if you think the model is identified or "overidentified", the results might be feasible. Hope this helps.

Peace,
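
The identification concern can be checked roughly with the counting rule: the number of unique variances and covariances must be at least the number of free parameters. The sketch below counts only the structural part of Niels's model and treats lat1 to lat4 as if they were observed, which is an assumption because the measurement model is not shown in the thread. The counting rule is a necessary condition only, not a guarantee.

```r
# Counting rule for the structural part only, treating lat1..lat4 as if
# they were observed (an assumption; the measurement model is not shown).
p <- 4                              # variables in the structural model
n_moments <- p * (p + 1) / 2        # unique variances and covariances

n_paths     <- 3 + 2 + 1            # lat4 ~ 3 paths, lat3 ~ 2 paths, lat2 ~ 1 path
n_variances <- 4                    # 1 exogenous variance + 3 residual variances
n_free      <- n_paths + n_variances

df <- n_moments - n_free            # degrees of freedom of the structural part
print(c(moments = n_moments, free = n_free, df = df))
```

With every variable connected to every other, the structural part uses up all available moments, so any remaining degrees of freedom would have to come from the measurement model.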

Jessica Fritz

May 19, 2016, 5:26:09 AM
to lavaan
Hi lavaan group,

I was running a model similar to Niels's, but I used a slightly different script. I thought that the mediators in a 'classic' multiple mediation model do not predict one another but are correlated? This would affect the calculation of the indirect and total effects, given that a directed path would be included in the calculation, but a correlation perhaps not?

Question 1: Does anyone know which model is 'correct' in terms of classic multiple mediator models?

model1 <-
'Y ~ c1*X + b1*Mediator1 + b2*Mediator2
Mediator1 ~ a1*X
Mediator2 ~ a2*X
Mediator1 ~~ Mediator2
med := (a1*b1) + (a2*b2)

total := c1 + (a1*b1) + (a2*b2)
total_indirect := c1 - (a1*b1) + (a2*b2)'

model2 <-
'Y ~ c1*X + b1*Mediator1 + b2*Mediator2
Mediator1 ~ a1*X + c2*Mediator2
Mediator2 ~ a2*X
Mediator1 ~ c2*Mediator2
med := (a1*b1) + (a2*b2) + (a1*c2*b2)

total := c1 + (a1*b1) + (a2*b2) + (a1*c2*b2)
total_indirect := c1 - (a1*b1) + (a2*b2) + (a1*c2*b2)'

Question 2: Or is it wrong to integrate a correlation in the model without adding it to the calculation of the mediation and the total effect?

Question 3: Interestingly, the models above give exactly the same results, except for the beta values of the a2 and c2 paths. Even the values of the mediation and the total effect are exactly the same. How is that possible?

Question 4: What I am also wondering about is, why the model with the c2 path does converge, whereas the model with the correlation instead of the c2 path does not converge?

Model 1:


Model 2




Thank you very much for considering my questions!

Jes
[Inline images 1 and 2: path diagrams for Model 1 and Model 2; images not preserved]

Terrence Jorgensen

May 20, 2016, 4:27:00 AM
to lavaan
Question 1: Does anyone know which model is 'correct' in terms of classic multiple mediator models?

The "correct" model to fit would be the model that reflects your substantive theory about how these variables are related.  Covariance structure models can take any form (subject to identification constraints), and the examples you read about are just that:  examples.

Question 2: Or is it wrong to integrate a correlation in the model without adding it to the calculation of the mediation and the total effect?

Indirect paths are products of directed effects (single-headed arrows), not undirected relations (double-headed arrows).

Question 3: Interestingly, the models above give exactly the same results, except for the beta values of the a2 and c2 paths. Even the values of the mediation and the total effect are exactly the same. How is that possible?

Your models are saturated because each variable is directly connected (via single- or double-headed arrows) to every other variable in the model, so I would expect chi-squared and df == 0 either way.  But unless the residual correlation between mediators == 0 (or a1 or b2 == 0), I would expect the indirect paths to differ by the quantity (a1*c2*b2).  
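
The observation in Question 3 (same med and total, different a2) can be reproduced on simulated data. The sketch below uses ordinary least squares via lm() as a stand-in for lavaan, on the assumption that point estimates coincide for a saturated model with continuous variables; the data and coefficients are invented for illustration. It compares a specification where the mediators are only correlated (M2 regressed on X alone) with one that adds a directed path M2 ~ c2*M1.

```r
set.seed(1)                          # simulated data, for illustration only
n  <- 50
X  <- rnorm(n)
M1 <- 0.5 * X + rnorm(n)
M2 <- 0.3 * X + 0.4 * M1 + rnorm(n)
Y  <- 0.2 * X + 0.3 * M1 + 0.6 * M2 + rnorm(n)

# Outcome equation, shared by both specifications
fit_y <- lm(Y ~ X + M1 + M2)
b1 <- unname(coef(fit_y)["M1"])
b2 <- unname(coef(fit_y)["M2"])
a1 <- unname(coef(lm(M1 ~ X))["X"])

# Specification A: mediators only correlated (M1 ~~ M2), so M2 ~ X
a2_cov  <- unname(coef(lm(M2 ~ X))["X"])
med_cov <- a1 * b1 + a2_cov * b2

# Specification B: directed path M2 ~ c2*M1, so M2 ~ X + M1
fit_m2  <- lm(M2 ~ X + M1)
a2_dir  <- unname(coef(fit_m2)["X"])
c2      <- unname(coef(fit_m2)["M1"])
med_dir <- a1 * b1 + a2_dir * b2 + a1 * c2 * b2

print(c(a2_cov = a2_cov, a2_dir = a2_dir, c2 = c2))
print(c(med_cov = med_cov, med_dir = med_dir))
```

In this sketch a2 from the correlated specification equals a2 + a1*c2 from the directed one, so the extra term a1*c2*b2 is offset by the smaller a2 and the summed indirect effect stays the same. That pattern is consistent with what Jessica reports, though whether it holds for a given lavaan fit depends on the model being saturated and converging.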

But note that your syntax does not match your model.  You are regressing M1 on M2, not the other way around, so the indirect path in the model you actually fitted should be (a2*c2*b1).  Or you can leave it as it is but change "Mediator1 ~ c2*Mediator2" to "Mediator2 ~ c2*Mediator1" (and remove the redundant parameter two lines above it).

Also, your user-defined parameter "med" IS the total indirect effect of X on Y (i.e., you are summing all the indirect effects via the mediators).  The "total" calculation is correct, but your "total_indirect" calculation is not a meaningful quantity.

Question 4: What I am also wondering about is, why the model with the c2 path does converge, whereas the model with the correlation instead of the c2 path does not converge?

Was there an error message about the syntax having a redundant parameter?  ("Mediator1 ~ c2*Mediator2" was specified twice in your syntax.)

Terrence D. Jorgensen
Postdoctoral Researcher, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Jessica Fritz

May 20, 2016, 9:47:01 AM
to lavaan
Hi Terry, thanks a lot for sharing your thoughts with me and for your excellent advice.

You are completely right! I unfortunately posted a slightly wrong model syntax here (due to changing the order and the names of the variables to make the models more comprehensible). I was actually running exactly these two models (see below) and do get the same results for all paths except for paths a2 and c2. I have now deleted the calculation of 'total_indirect' and kept med and total, because, if I understood you correctly, these were correct.


model1 <-
'Y ~ c1*X + b1*Mediator1 + b2*Mediator2
Mediator1 ~ a1*X
Mediator2 ~ a2*X
Mediator1 ~~ Mediator2
med := (a1*b1) + (a2*b2)
total := c1 + (a1*b1) + (a2*b2)'

model2 <-
'Y ~ c1*X + b1*Mediator1 + b2*Mediator2
Mediator1 ~ a1*X
Mediator2 ~ a2*X
Mediator2 ~ c2*Mediator1

med := (a1*b1) + (a2*b2) + (a1*c2*b2)
total := c1 + (a1*b1) + (a2*b2) + (a1*c2*b2)'

For the first model with the correlation, I get the following warning, but no error:
Warning message:
In lavaan(slotOptions = object@Options, slotParTable = object@ParTable,  :
  lavaan WARNING: model has NOT converged!

My supervisor wants me to reduce the risk of multiple testing, given that I am running the same single-mediator model 4 times. I was wondering how to correct for multiple testing when the mediation models have non-independent mediators. I therefore decided to collapse the 4 single-mediator models into 2 multiple-mediator models (with two mediators each). These are the models I tried to run above. However, I have brain imaging data and thus a very small sample of 50 participants, so I am a bit concerned about whether it is feasible at all to run multiple mediation models (2 mediators) with 50 participants. Is there a trustworthy guideline anywhere for the number of variables you need when fitting a two-mediator model with SEM?

I also read in the literature that you could use bootstrapping to 'kind of' correct for multiple testing. Is that a feasible approach? I at least found that bootstrapping in mediation analysis is already suitable for samples of 20 to 80 participants, which would be good in my case. So I thought that I could either (a) collapse the 4 mediation models into 2 (with 2 mediators each), or (b) bootstrap the four models, or (c) collapse the four models into two and then bootstrap the two mediation models.
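
For reference, a percentile bootstrap of the summed indirect effect in a two-mediator model might look like the sketch below. It uses simulated data with n = 50 and lm() rather than lavaan, and the variable names, effect sizes, and number of resamples are all invented for illustration. It yields a confidence interval for a1*b1 + a2*b2; it does not by itself adjust for the number of models tested.

```r
set.seed(2016)                       # simulated data, for illustration only
n   <- 50
dat <- data.frame(X = rnorm(n))
dat$M1 <- 0.5 * dat$X + rnorm(n)
dat$M2 <- 0.4 * dat$X + rnorm(n)
dat$Y  <- 0.2 * dat$X + 0.3 * dat$M1 + 0.3 * dat$M2 + rnorm(n)

# Total indirect effect (a1*b1 + a2*b2) for one data set
indirect <- function(d) {
  a1 <- coef(lm(M1 ~ X, data = d))["X"]
  a2 <- coef(lm(M2 ~ X, data = d))["X"]
  b  <- coef(lm(Y ~ X + M1 + M2, data = d))
  unname(a1 * b["M1"] + a2 * b["M2"])
}

B <- 500                             # number of bootstrap resamples (arbitrary)
# Resample rows with replacement and recompute the indirect effect each time
boot_med <- replicate(B, indirect(dat[sample(n, replace = TRUE), ]))

ci <- quantile(boot_med, c(0.025, 0.975))   # percentile interval
print(indirect(dat))
print(ci)
```

In lavaan itself, the analogous request is usually made with se = "bootstrap" in the sem() call and boot.ci.type in parameterEstimates(); check the documentation for your version before relying on the exact argument values.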

Do you maybe have a recommendation whether either of these approaches might be feasible, and if yes, which one?

Or is there maybe anything else I can do to correct several SEM models for multiple testing?


Thank you for helping me!

Jessica Fritz

May 20, 2016, 5:23:20 PM
to lavaan
Special thanks for bringing up the following: "Also, your user-defined parameter "med" IS the total indirect effect of X on Y (i.e., you are summing all the indirect effects via the mediators).  The "total" calculation is correct, but your "total_indirect" calculation is not a meaningful quantity."

I had totally lost track at this point; I only now realize that this specification was completely wrong! Thanks a lot!

Terrence Jorgensen

May 22, 2016, 11:49:00 AM
to lavaan
I've never heard of bootstrapping accounting for multiple testing.  Although I can imagine it would be possible to bootstrap a distribution of a maximum slope under the null in general linear models (like a bootstrapped studentized range distribution), I'm not sure how to do that in mediation models.  But these are general SEM questions, more appropriate for SEMNET.

Jessica Fritz

May 23, 2016, 7:15:12 AM
to lavaan
Ok! Thanks a lot Terry!

olivier pahud

May 23, 2016, 8:33:32 AM
to lavaan
Isn't the (correct) way to do multiple mediation to let the residual variances of both mediators covary (as suggested in the first model)?
-> residual of M1 ~~ residual of M2

You only suppress this covariance if you have strong theoretical assumptions that these two residual variances are not related...

Terrence Jorgensen

May 26, 2016, 4:44:58 AM
to lavaan
You only suppress this covariance if you have strong theoretical assumptions that these two residual variances are not related...

I agree, but Jessica wasn't fixing it to zero, she was modeling it as a directed instead of undirected effect.

Irene G.

May 17, 2017, 8:41:39 AM
to lavaan
Hello all,

I am running a model similar to the two previous ones in this thread: a multiple mediation model (2 mediators), but the endogenous variable (Y) is binary (I converted it with "as.factor"). I've been reading all the info available at http://www.da.ugent.be/cvs/pages/en/Presentations/Presentation%20Yves%20Rosseel.pdf
However, I still struggle with the interpretation of the results I got:

mymodel <- 'Coex~c*log_size +b*M1 + e*M2
M1~a*log_size
M2~d*log_size
indirect1 := a*b
indirect2 := d*e
total   := c + (a*b) + (d*e)
direct := c
Coex | b0*t1
probit11 := (-b0+c+b*a)/sqrt(b^2+1)
probit10 := (-b0+c )/sqrt(b^2+1)
probit00 := (-b0 )/sqrt(b^2+1)
indirect := pnorm(probit11) - pnorm(probit10)
direct := pnorm(probit10) - pnorm(probit00)
OR.indirect := (pnorm(probit11)/(1-pnorm(probit11)))/
(pnorm(probit10)/(1-pnorm(probit10)))
OR.direct := (pnorm(probit10)/(1-pnorm(probit10)))/
(pnorm(probit00)/(1-pnorm(probit00)))'


fit <- sem(mymodel, data = result, ordered = c("Coex"))

summary(fit, fit.measures = TRUE, standardized = TRUE, rsquare = TRUE)

lavaan (0.5-20) converged normally after  42 iterations

  Number of observations                            15

  Estimator                                       DWLS      Robust
  Minimum Function Test Statistic                3.755       4.023
  Degrees of freedom                                 1           1
  P-value (Chi-square)                           0.053       0.045
  Scaling correction factor                                  0.933
  Shift parameter                                            0.000
    for simple second-order correction (Mplus variant)

Model test baseline model:

  Minimum Function Test Statistic               27.411      23.485
  Degrees of freedom                                 6           6
  P-value                                        0.000       0.001

User model versus baseline model:

  Comparative Fit Index (CFI)                    0.871       0.827
  Tucker-Lewis Index (TLI)                       0.228      -0.037

Root Mean Square Error of Approximation:

  RMSEA                                          0.444       0.465
  90 Percent Confidence Interval          0.000  0.957       0.058  0.976
  P-value RMSEA <= 0.05                          0.057       0.049

Weighted Root Mean Square Residual:

  WRMR                                           0.584       0.584

Parameter Estimates:

  Information                                 Expected
  Standard Errors                           Robust.sem

Regressions:
                   Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
  Coex ~                                                                
    log_size   (c)    2.627    0.842    3.118    0.002    2.627    1.319
    M1         (b)   -1.904    1.192   -1.597    0.110   -1.904   -0.418
    M2         (e)   -3.561    1.476   -2.413    0.016   -3.561   -0.616
  M1 ~                                                                  
    log_size   (a)    0.306    0.118    2.593    0.010    0.306    0.701
  M2 ~                                                                  
    log_size   (d)    0.216    0.082    2.615    0.009    0.216    0.626

Intercepts:
                   Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
    Coex              0.000                               0.000    0.000
    M1                0.088    0.132    0.666    0.505    0.088    0.308
    M2                0.056    0.124    0.450    0.653    0.056    0.247

Thresholds:
                   Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
    Coex|t1   (b0)    0.654    0.957    0.684    0.494    0.654    0.503

Variances:
                   Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
    Coex              0.457                               0.457    0.269
    M1                0.042    0.020    2.057    0.040    0.042    0.509
    M2                0.031    0.014    2.199    0.028    0.031    0.609

Scales y*:
                   Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
    Coex              1.000                               1.000    1.000

R-Square:
                   Estimate
    Coex              0.731
    M1                0.491
    M2                0.391

Defined Parameters:
                   Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
    indirect1        -0.583    0.388   -1.501    0.133   -0.583   -0.293
    indirect2        -0.768    0.383   -2.007    0.045   -0.768   -0.386
    total             1.276    1.036    1.231    0.218    1.276    0.641
    direct            0.440    0.227    1.936    0.053    0.440    0.453
    probit11          0.646    0.280    2.305    0.021    0.646    0.483
    probit10          0.917    0.273    3.354    0.001    0.917    0.753
    probit00         -0.304    0.533   -0.571    0.568   -0.304   -0.464
    indirect         -0.080    0.036   -2.232    0.026   -0.080   -0.089
    direct            0.440    0.227    1.936    0.053    0.440    0.453
    OR.indirect       0.626    0.112    5.566    0.000    0.626    0.635
    OR.direct         7.443    7.917    0.940    0.347    7.443    7.247

1. How can I report the R2 and p-value of the direct path (c) and of each indirect path predicting Coex (= coexistence)? (Is it R2 = 0.731, or do I have to calculate something else? Which p-value counts: the one I obtained in the regression Coex ~ log_size, or the one at the end of the results for "direct" (p = 0.053)?)
2. The estimates for (b) and (e) are negative. Does that mean negative Coex ~ M1 and Coex ~ M2 relationships?
3. Coex variance has a p = 0.4. What is the implication for understanding my results?
4. Do I really need the last part of the model (pnorm and probit)? I saw a similar example, but I don't fully understand what it is used for. Actually, only one of my two indirect effects was included (should I include the other one?). All my paths but one (Coex ~ M1) are significant. Is this why my indirect1 is not significant (p = 0.133)?
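
To see what the defined parameters are doing, they can be recomputed by hand from the estimates reported in the output above. The sketch below is plain R arithmetic on those printed (rounded) values, so small differences in the third decimal are expected; the names c_, s, indirect_p and direct_p are introduced here for illustration and do not appear in the original model.

```r
# Estimates copied from the summary() output above (rounded to 3 decimals)
a  <- 0.306;  b <- -1.904   # log_size -> M1, M1 -> Coex
d  <- 0.216;  e <- -3.561   # log_size -> M2, M2 -> Coex
c_ <- 2.627                 # log_size -> Coex ("c" in the model; renamed here
                            # to avoid confusion with R's c() function)
b0 <- 0.654                 # threshold Coex|t1

# Product-of-coefficients effects on the latent response (probit) scale
indirect1 <- a * b
indirect2 <- d * e
total     <- c_ + indirect1 + indirect2

# Probability-scale quantities; these formulas involve M1 only (a and b)
s        <- sqrt(b^2 + 1)
probit11 <- (-b0 + c_ + b * a) / s
probit10 <- (-b0 + c_) / s
probit00 <- (-b0) / s
indirect_p <- pnorm(probit11) - pnorm(probit10)
direct_p   <- pnorm(probit10) - pnorm(probit00)

print(round(c(indirect1 = indirect1, indirect2 = indirect2, total = total,
              indirect_p = indirect_p, direct_p = direct_p), 3))
```

Two things stand out when comparing with the output. First, both rows labelled direct show 0.440, which matches the probability-scale difference rather than c = 2.627, so the second `direct :=` definition appears to have replaced the first; giving the two quantities distinct labels would keep both. Second, the probit and odds-ratio terms use only a and b, so they describe the M1 route and say nothing about M2.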

Thanks in advance.

Irene
