Question on multiple mediation

Niels

Nov 23, 2015, 10:32:36 AM

to lavaan

The lavaan tutorial has an example with one mediator. I wonder how I should proceed if I have multiple indirect effects.

Suppose a model with the following regressions:
lat4 ~ lat3 + lat2 + lat1
lat3 ~ lat2 + lat1
lat2 ~ lat1

which gives us the following path-diagram:

If I want to know the total effect of lat1 on lat4, I would do the following:

total := c1 + (a1*b1) + (a2*b2) + (a1*c2*b2)
total_indirect := total - c1

Would this be correct?
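
The regressions as posted carry no labels, so the formula cannot be checked against them directly. A minimal sketch of labeled lavaan syntax that would match it, assuming a1 = lat1 -> lat2, a2 = lat1 -> lat3, c2 = lat2 -> lat3, b1 = lat2 -> lat4, b2 = lat3 -> lat4 and c1 = lat1 -> lat4. This mapping is inferred from the products in the formula and is not confirmed by the poster. The measurement part (the `=~` lines) is omitted, and `mydata` is a hypothetical data frame:

```r
# Labeled structural part of the model; the label-to-path mapping is an assumption.
model <- '
  lat4 ~ c1*lat1 + b1*lat2 + b2*lat3
  lat3 ~ a2*lat1 + c2*lat2
  lat2 ~ a1*lat1

  # the three indirect routes from lat1 to lat4, summed
  total_indirect := (a1*b1) + (a2*b2) + (a1*c2*b2)
  # direct effect plus all indirect routes
  total          := c1 + (a1*b1) + (a2*b2) + (a1*c2*b2)
'

# With lavaan installed and the measurement model added, the fit would look like:
# fit <- lavaan::sem(model, data = mydata)
# lavaan::parameterEstimates(fit)
```

Under this mapping, every term in the posted formula corresponds to one directed route from lat1 to lat4.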

[Attached image: path diagram of the model]

Edward Rigdon

Nov 23, 2015, 12:09:49 PM

to lav...@googlegroups.com

See http://lavaan.ugent.be/tutorial/mediation.html

--Ed Rigdon


Niels

Nov 23, 2015, 12:57:06 PM

to lavaan

Exactly, that's the one-mediator example I mentioned. But I'm still unsure whether I adapted this concept correctly in the example with two mediators above.

Manuel Herrera-Usagre

Nov 25, 2015, 6:08:34 AM

to lavaan

Seems correct to me. I wonder whether, with such a large number of parameters, the model is actually identified (http://davidakenny.net/cm/identify.htm). But if you think that the model is identified or "overidentified", the results may be usable. Hope it's helpful.

Peace,
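
As a rough check on the identification question, the structural part can be counted by hand. A sketch that treats the four variables as if they were observed, which is an assumption: with latent variables, identification also depends on the measurement model, and that is not shown in the thread.

```r
p <- 4                                   # lat1..lat4, treated as observed here
n_moments <- p * (p + 1) / 2             # unique variances and covariances
n_paths   <- 6                           # 3 + 2 + 1 regression slopes in the posted model
n_vars    <- 4                           # 1 exogenous variance + 3 residual variances
df_model  <- n_moments - (n_paths + n_vars)
df_model                                 # zero means just-identified (saturated)
```

A count of zero means the structural paths by themselves add no testable restrictions, so any degrees of freedom would have to come from the measurement model.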

Jessica Fritz

May 19, 2016, 5:26:09 AM

to lavaan

Hi lavaan group,

I was running a model similar to Niels', but I used a slightly different script. I thought that the mediators in a 'classic' multiple mediation model are not predicted by one another but are correlated? This would actually have an impact on the calculation of the indirect and total effects, given that a prediction would be included in the calculation, but a correlation perhaps not?

Question 1: Does anyone know which model is 'correct' in terms of classic multiple mediator models?

model1 <-
'Y ~ c1*X + b1*Mediator1 + b2*Mediator2
Mediator1 ~ a1*X
Mediator2 ~ a2*X
Mediator1 ~~ Mediator2
med := (a1*b1) + (a2*b2)

total := c1 + (a1*b1) + (a2*b2)

total_indirect := c1 - (a1*b1) + (a2*b2)'

model2 <-
'Y ~ c1*X + b1*Mediator1 + b2*Mediator2
Mediator1 ~ a1*X + c2*Mediator2
Mediator2 ~ a2*X
Mediator1 ~ c2*Mediator2
med := (a1*b1) + (a2*b2) + (a1*c2*b2)

total := c1 + (a1*b1) + (a2*b2) + (a1*c2*b2)

total_indirect := c1 - (a1*b1) + (a2*b2) + (a1*c2*b2)'

Question 2: Or is it wrong to integrate a correlation in the model without adding it to the calculation of the mediation and the total effect?

Question 3: Interestingly, the models above give exactly the same results, except for the estimates of the a2 and c2 paths. Even the values of the mediation and total effects are exactly the same. How is that possible?

Question 4: I am also wondering why the model with the c2 path converges, whereas the model with the correlation instead of the c2 path does not.

Model 1:

Model 2

Thank you very much for considering my questions!

Jes

[Attached images 1 and 2: path diagrams for Model 1 and Model 2]

Terrence Jorgensen

May 20, 2016, 4:27:00 AM

to lavaan

> Question 1: Does anyone know which model is 'correct' in terms of classic multiple mediator models?

The "correct" model to fit would be the model that reflects your substantive theory about how these variables are related. Covariance structure models can take any form (subject to identification constraints), and the examples you read about are just that: examples.

> Question 2: Or is it wrong to integrate a correlation in the model without adding it to the calculation of the mediation and the total effect?

Indirect paths are products of directed effects (single-headed arrows), not undirected relations (double-headed arrows).

> Question 3: Interestingly, the models above give exactly the same results, except for the estimates of the a2 and c2 paths. Even the values of the mediation and total effects are exactly the same. How is that possible?

Your models are saturated because each variable is directly connected (via single- or double-headed arrows) to every other variable in the model, so I would expect chi-squared and df == 0 either way. But unless the residual correlation between mediators == 0 (or a1 or b2 == 0), I would expect the indirect paths to differ by the quantity (a1*c2*b2).

But note that your syntax does not match your model. You are regressing M1 on M2, not the other way around, so the indirect path in the model you actually fitted should be (a2*c2*b1). Or you can leave the calculation as it is but change "Mediator1 ~ c2*Mediator2" to "Mediator2 ~ c2*Mediator1" (and remove the redundant parameter two lines above it).

Also, your user-defined parameter "med" IS the total indirect effect of X on Y (i.e., you are summing all the indirect effects via the mediators). The "total" calculation is correct, but your "total_indirect" calculation is not a meaningful quantity.
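
The relation between "med" and "total" can be checked by path tracing in matrix form: for a recursive model with coefficient matrix B, the total effects are (I - B)^-1 - I, and the total indirect effect is the total minus the direct path. A minimal base-R sketch for a model in which Mediator2 is regressed on Mediator1; the coefficient values are made up for illustration:

```r
# Hypothetical path coefficients (not estimates from the thread)
a1 <- 0.5; a2 <- 0.3; c2 <- 0.4; b1 <- 0.6; b2 <- 0.5; c1 <- 0.2

vars <- c("X", "M1", "M2", "Y")
B <- matrix(0, 4, 4, dimnames = list(vars, vars))   # B[to, from]
B["M1", "X"] <- a1
B["M2", "X"] <- a2; B["M2", "M1"] <- c2
B["Y",  "X"] <- c1; B["Y",  "M1"] <- b1; B["Y", "M2"] <- b2

# Total effects: sum of all directed routes of any length between two variables
total_mat <- solve(diag(4) - B) - diag(4)
dimnames(total_mat) <- list(vars, vars)

total <- total_mat["Y", "X"]   # direct + all indirect routes from X to Y
med   <- total - c1            # total indirect effect
```

Here `total` equals c1 + a1*b1 + a2*b2 + a1*c2*b2 and `med` equals the same sum without c1, which is why "med" already is the total indirect effect.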

> Question 4: I am also wondering why the model with the c2 path converges, whereas the model with the correlation instead of the c2 path does not.

Was there an error message about the syntax having a redundant parameter? ("Mediator1 ~ c2*Mediator2" was specified twice in your syntax.)

Terrence D. Jorgensen

Postdoctoral Researcher, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam

UvA web page: http://www.uva.nl/profile/t.d.jorgensen

Jessica Fritz

May 20, 2016, 9:47:01 AM

to lavaan

Hi Terry, thanks a lot for sharing your thoughts with me and for your excellent advice.

You are completely right! Unfortunately, I posted a slightly wrong model syntax here (because I changed the order and the names of the variables to make the models easier to follow). I was actually running exactly these two models (see below), and I do get the same results for all paths except a2 and c2. I have now deleted the calculation of 'total_indirect' and kept med and total, because, if I understood you correctly, these were correct.

model1 <-
'Y ~ c1*X + b1*Mediator1 + b2*Mediator2
Mediator1 ~ a1*X
Mediator2 ~ a2*X
Mediator1 ~~ Mediator2
med := (a1*b1) + (a2*b2)

total := c1 + (a1*b1) + (a2*b2)'

model2 <-
'Y ~ c1*X + b1*Mediator1 + b2*Mediator2
Mediator1 ~ a1*X

Mediator2 ~ a2*X
Mediator2 ~ c2*Mediator1

med := (a1*b1) + (a2*b2) + (a1*c2*b2)

total := c1 + (a1*b1) + (a2*b2) + (a1*c2*b2)'
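
One way to see why `med` and `total` can coincide across these two specifications: both are saturated, and in model2 the a2 path is estimated holding Mediator1 constant, so the model1 value of a2 splits into a2 + a1*c2. A minimal sketch with simulated data and ordinary least squares (`lm`) standing in for lavaan. The generating coefficients are made up, and the sketch assumes that the point estimates of a saturated recursive model with continuous variables match the OLS slopes:

```r
set.seed(1)
n  <- 200
X  <- rnorm(n)
M1 <- 0.5 * X + rnorm(n)
M2 <- 0.3 * X + 0.4 * M1 + rnorm(n)
Y  <- 0.2 * X + 0.6 * M1 + 0.5 * M2 + rnorm(n)

# Outcome equation is identical in both models
fit_y <- lm(Y ~ X + M1 + M2)
c1 <- coef(fit_y)[["X"]]; b1 <- coef(fit_y)[["M1"]]; b2 <- coef(fit_y)[["M2"]]
a1 <- coef(lm(M1 ~ X))[["X"]]

# model1: Mediator2 ~ a2*X (mediators only correlated)
a2_model1 <- coef(lm(M2 ~ X))[["X"]]

# model2: Mediator2 ~ a2*X + c2*Mediator1
fit_m2    <- lm(M2 ~ X + M1)
a2_model2 <- coef(fit_m2)[["X"]]
c2        <- coef(fit_m2)[["M1"]]

med_model1 <- a1*b1 + a2_model1*b2
med_model2 <- a1*b1 + a2_model2*b2 + a1*c2*b2

all.equal(a2_model1, a2_model2 + a1*c2)   # a2 from model1 is split in model2
all.equal(med_model1, med_model2)         # so the summed indirect effect is unchanged
```

In this sketch only a2 differs between the two fits (and c2 exists only in the second), which matches the pattern described above.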

For the first model with the correlation, I get the following warning, but no error:

Warning message:
In lavaan(slotOptions = object@Options, slotParTable = object@ParTable,  :
  lavaan WARNING: model has NOT converged!

My supervisor wants me to reduce the risk of multiple testing, given that I am running the same single-mediator model 4 times. I was wondering how to correct for multiple testing when the mediation models have non-independent mediators. I therefore decided to collapse the 4 single-mediator models into 2 multiple-mediator models (with two mediators each). These are the models I tried to run above. However, I have brain imaging data and thus a very small sample of 50 participants, so I am a bit concerned about whether it is feasible at all to run multiple mediation models (2 mediators) with 50 participants. Is there a trustworthy guideline anywhere for the sample size you need when fitting a two-mediator model with SEM?

I also read in the literature that you could use bootstrapping to 'kind of' correct for multiple testing. Is that a feasible approach? At least I found that bootstrapping in mediation analysis is already suitable for samples of 20 to 80 participants, which would be good in my case. So I thought that I could either (a) collapse the 4 mediation models into 2 (with 2 mediators each), or (b) bootstrap the four models, or (c) collapse the four models into two and then bootstrap the two mediation models.
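
For reference, a percentile bootstrap gives a confidence interval for an indirect effect without assuming that the product of coefficients is normally distributed; by itself it does not adjust for testing several models. A minimal base-R sketch on simulated data (n = 50, made-up coefficients), with `lm` standing in for the SEM fit; in lavaan the usual route would be `sem(..., se = "bootstrap")`:

```r
set.seed(42)
n   <- 50
dat <- data.frame(X = rnorm(n))
dat$M1 <- 0.5 * dat$X + rnorm(n)
dat$M2 <- 0.3 * dat$X + rnorm(n)
dat$Y  <- 0.2 * dat$X + 0.6 * dat$M1 + 0.5 * dat$M2 + rnorm(n)

# Total indirect effect a1*b1 + a2*b2 for one data set
indirect <- function(d) {
  a1 <- coef(lm(M1 ~ X, data = d))[["X"]]
  a2 <- coef(lm(M2 ~ X, data = d))[["X"]]
  b  <- coef(lm(Y ~ X + M1 + M2, data = d))
  a1 * b[["M1"]] + a2 * b[["M2"]]
}

# Resample rows with replacement and recompute the indirect effect each time
boot_med <- replicate(1000, indirect(dat[sample(n, replace = TRUE), ]))
ci <- quantile(boot_med, c(0.025, 0.975))   # 95% percentile interval
ci
```

With several models, each interval would still need its own adjustment (for example a narrower nominal level) if the family-wise error rate is the concern.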

Do you maybe have a recommendation whether either of these approaches might be feasible, and if yes, which one?

Or is there maybe anything else I can do to correct several SEM models for multiple testing?

Thank you for helping me!

Jessica Fritz

May 20, 2016, 5:23:20 PM

to lavaan

Special thanks for bringing up the following: "Also, your user-defined parameter "med" IS the total indirect effect of X on Y (i.e., you are summing all the indirect effects via the mediators). The "total" calculation is correct, but your "total_indirect" calculation is not a meaningful quantity."

I totally lost track at this point, and I only now realize that this specification was completely wrong! Thanks a lot!

Terrence Jorgensen

May 22, 2016, 11:49:00 AM

to lavaan

I've never heard of bootstrapping accounting for multiple testing. Although I can imagine it would be possible to bootstrap a distribution of a maximum slope under the null in general linear models (like a bootstrapped studentized range distribution), I'm not sure how to do that in mediation models. But these are general SEM questions, more appropriate for SEMNET.

Jessica Fritz

May 23, 2016, 7:15:12 AM

to lavaan

Ok! Thanks a lot Terry!

olivier pahud

May 23, 2016, 8:33:32 AM

to lavaan

Isn't the (correct) way to do multiple mediation to let the residual variances of both mediators covary (as suggested in the first model)?
-> residual of M1 ~~ residual of M2

You only suppress this covariance if you have strong theoretical assumptions that these two residual variances are not related...

Terrence Jorgensen

May 26, 2016, 4:44:58 AM

to lavaan

> You only suppress this covariance if you have strong theoretical assumptions that these two residual variances are not related...

I agree, but Jessica wasn't fixing it to zero, she was modeling it as a directed instead of undirected effect.

Irene G.

May 17, 2017, 8:41:39 AM

to lavaan

Hello all,

I am running a model similar to the two previous ones in this thread: a multiple mediation model (2 mediators), but the endogenous variable (Y) is binary (I converted it with "as.factor"). I've been reading all the info available at http://www.da.ugent.be/cvs/pages/en/Presentations/Presentation%20Yves%20Rosseel.pdf

However, I still struggle with the interpretation of the results I got:

mymodel <- 'Coex ~ c*log_size + b*M1 + e*M2
  M1 ~ a*log_size
  M2 ~ d*log_size
  indirect1 := a*b
  indirect2 := d*e
  total := c + (a*b) + (d*e)
  direct := c
  Coex | b0*t1
  probit11 := (-b0+c+b*a)/sqrt(b^2+1)
  probit10 := (-b0+c    )/sqrt(b^2+1)
  probit00 := (-b0      )/sqrt(b^2+1)
  indirect := pnorm(probit11) - pnorm(probit10)
  direct := pnorm(probit10) - pnorm(probit00)
  OR.indirect := (pnorm(probit11)/(1-pnorm(probit11)))/
                 (pnorm(probit10)/(1-pnorm(probit10)))
  OR.direct := (pnorm(probit10)/(1-pnorm(probit10)))/
               (pnorm(probit00)/(1-pnorm(probit00)))'

fit <- sem(mymodel, data=result, ordered=c("Coex"))
summary(fit, fit.measure=TRUE, standardize=TRUE, rsquare=TRUE)

lavaan (0.5-20) converged normally after 42 iterations

  Number of observations                            15

  Estimator                                       DWLS      Robust
  Minimum Function Test Statistic                3.755       4.023
  Degrees of freedom                                 1           1
  P-value (Chi-square)                           0.053       0.045
  Scaling correction factor                                  0.933
  Shift parameter                                            0.000
    for simple second-order correction (Mplus variant)

Model test baseline model:

  Minimum Function Test Statistic               27.411      23.485
  Degrees of freedom                                 6           6
  P-value                                        0.000       0.001

User model versus baseline model:

  Comparative Fit Index (CFI)                    0.871       0.827
  Tucker-Lewis Index (TLI)                       0.228      -0.037

Root Mean Square Error of Approximation:

  RMSEA                                          0.444       0.465
  90 Percent Confidence Interval          0.000  0.957       0.058  0.976
  P-value RMSEA <= 0.05                          0.057       0.049

Weighted Root Mean Square Residual:

  WRMR                                           0.584       0.584

Parameter Estimates:

  Information                                 Expected
  Standard Errors                           Robust.sem

Regressions:
                   Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
  Coex ~
    log_size   (c)    2.627    0.842    3.118    0.002    2.627    1.319
    M1         (b)   -1.904    1.192   -1.597    0.110   -1.904   -0.418
    M2         (e)   -3.561    1.476   -2.413    0.016   -3.561   -0.616
  M1 ~
    log_size   (a)    0.306    0.118    2.593    0.010    0.306    0.701
  M2 ~
    log_size   (d)    0.216    0.082    2.615    0.009    0.216    0.626

Intercepts:
                   Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
    Coex              0.000                               0.000    0.000
    M1                0.088    0.132    0.666    0.505    0.088    0.308
    M2                0.056    0.124    0.450    0.653    0.056    0.247

Thresholds:
                   Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
    Coex|t1   (b0)    0.654    0.957    0.684    0.494    0.654    0.503

Variances:
                   Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
    Coex              0.457                               0.457    0.269
    M1                0.042    0.020    2.057    0.040    0.042    0.509
    M2                0.031    0.014    2.199    0.028    0.031    0.609

Scales y*:
                   Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
    Coex              1.000                               1.000    1.000

R-Square:
                   Estimate
    Coex              0.731
    M1                0.491
    M2                0.391

Defined Parameters:
                   Estimate  Std.Err  Z-value  P(>|z|)   Std.lv  Std.all
    indirect1        -0.583    0.388   -1.501    0.133   -0.583   -0.293
    indirect2        -0.768    0.383   -2.007    0.045   -0.768   -0.386
    total             1.276    1.036    1.231    0.218    1.276    0.641
    direct            0.440    0.227    1.936    0.053    0.440    0.453
    probit11          0.646    0.280    2.305    0.021    0.646    0.483
    probit10          0.917    0.273    3.354    0.001    0.917    0.753
    probit00         -0.304    0.533   -0.571    0.568   -0.304   -0.464
    indirect         -0.080    0.036   -2.232    0.026   -0.080   -0.089
    direct            0.440    0.227    1.936    0.053    0.440    0.453
    OR.indirect       0.626    0.112    5.566    0.000    0.626    0.635
    OR.direct         7.443    7.917    0.940    0.347    7.443    7.247

1. How can I report the R2 and p-value for the direct path (c) and for each indirect path predicting Coex (= coexistence)? Is it R2 = 0.731, or do I have to calculate something else? Which p-value counts: the one I obtained in the regression Coex ~ log_size, or the one at the end of the results for "direct" (p = 0.053)?

2. The estimates (b) and (e) are negative. Does that mean negative Coex ~ M1 and Coex ~ M2 relationships?

3. The Coex variance has p = 0.4... what is the implication for understanding my results?

4. Do I really need the last part of the model (pnorm and probit)? I saw a similar example, but I don't fully understand what it is used for. Actually, only one of my two indirect effects was included there (should I include the other one?). All my paths but one (Coex ~ M1) are significant... is this why my indirect1 is not significant (p = 0.133)?
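
To make the probit part of the syntax concrete, the defined parameters can be recomputed by hand from the reported point estimates. A minimal base-R sketch using the rounded values printed in the output, so small rounding differences are expected. Two things are visible from it: the formulas involve only the M1 path (a and b), and the label `direct` is defined twice in the syntax, with both `direct` rows showing 0.440, the value of the pnorm-based definition rather than c = 2.627:

```r
# Point estimates copied (rounded) from the summary output
b0 <- 0.654    # threshold Coex|t1
c_ <- 2.627    # Coex ~ log_size (label c; renamed to avoid masking base::c)
b  <- -1.904   # Coex ~ M1
a  <- 0.306    # M1 ~ log_size

scale_fac <- sqrt(b^2 + 1)                 # denominator used in the posted syntax
probit11 <- (-b0 + c_ + b * a) / scale_fac
probit10 <- (-b0 + c_) / scale_fac
probit00 <- (-b0) / scale_fac

# Differences on the probability scale, as in the posted definitions
indirect_p <- pnorm(probit11) - pnorm(probit10)
direct_p   <- pnorm(probit10) - pnorm(probit00)

# Odds ratios built from the same three probabilities
odds <- function(p) p / (1 - p)
or_indirect <- odds(pnorm(probit11)) / odds(pnorm(probit10))
or_direct   <- odds(pnorm(probit10)) / odds(pnorm(probit00))

round(c(probit11 = probit11, probit10 = probit10, probit00 = probit00,
        indirect = indirect_p, direct = direct_p), 3)
```

The results should agree with the corresponding "Defined Parameters" rows up to rounding, which confirms that the `direct` row with p = 0.053 refers to a difference in probabilities and not to the regression coefficient c.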