SEM Mediation - Indirect Effect Size

Emma Mills

Jun 4, 2016, 5:43:27 AM

to lavaan

Hi,

I am new to SEM and lavaan, and I am trying to understand whether the approach I am using is justifiable. I would be really grateful for any feedback on the below.

Having previously performed confirmatory factor analysis, I am now conducting a mediation analysis using the latent variables. My aim is to use bootstrapping and effect sizes rather than a causal steps/Sobel test approach:

SCTModel2 <- 'PF1=~pers2+pers4+pers3+pers8+pers7

BehF1=~BF1+BF2+BF3+BF4+BF6+BF7

EnvF1=~Env1+Env3+Env4+Env5

# direct effect

BehF1 ~ c*PF1

# mediator

EnvF1 ~ a*PF1

BehF1 ~ b*EnvF1

# indirect effect (a*b)

indirect := a*b

# total effect

total := c + (a*b)

'

The next thing I did was to fit this using the syntax below.

SCTfit<-sem(SCTModel2, data = SMBI_Alldata, estimator="MLR",bootstrap = 5000)

summary(SCTfit, standardized = T, fit.measures = T, rsq = T)

I chose robust maximum likelihood to account for multivariate non-normality. This gives me the model fit statistics which, from my understanding, tell me about the consistency of the hypothesised mediational model with the data.

Is it normal/recommended to report model fit statistics (e.g. CFI, TLI, RMSEA, SRMR) for a mediation analysis in a PhD thesis?

Also what has the 'bootstrap = 5000' command actually done here as I can't see any reference to bootstrapping in my output?

Number of observations 358

Estimator ML Robust

Minimum Function Test Statistic 145.322 136.336

Degrees of freedom 87 87

P-value (Chi-square) 0.000 0.001

Scaling correction factor 1.066

for the Yuan-Bentler correction

Model test baseline model:

Minimum Function Test Statistic 2047.623 1800.308

Degrees of freedom 105 105

P-value 0.000 0.000

User model versus baseline model:

Comparative Fit Index (CFI) 0.970 0.971

Tucker-Lewis Index (TLI) 0.964 0.965

Loglikelihood and Information Criteria:

Loglikelihood user model (H0) -5891.208 -5891.208

Scaling correction factor 1.490

for the MLR correction

Loglikelihood unrestricted model (H1) -5818.547 -5818.547

Scaling correction factor 1.217

for the MLR correction

Number of free parameters 48 48

Akaike (AIC) 11878.416 11878.416

Bayesian (BIC) 12064.682 12064.682

Sample-size adjusted Bayesian (BIC) 11912.402 11912.402

Root Mean Square Error of Approximation:

RMSEA 0.043 0.040

90 Percent Confidence Interval 0.031 0.055 0.027 0.052

P-value RMSEA <= 0.05 0.810 0.915

Standardized Root Mean Square Residual:

SRMR 0.045 0.045

Parameter Estimates:

Information Observed

Standard Errors Robust.huber.white

Latent Variables:

Estimate Std.Err Z-value P(>|z|) Std.lv Std.all

PF1 =~

pers2 1.000 1.061 0.802

pers4 0.988 0.066 14.917 0.000 1.048 0.816

pers3 0.805 0.061 13.170 0.000 0.854 0.770

pers8 0.904 0.058 15.647 0.000 0.960 0.746

pers7 0.731 0.068 10.763 0.000 0.776 0.690

BehF1 =~

BF1 1.000 0.565 0.809

BF2 0.783 0.076 10.256 0.000 0.443 0.747

BF3 0.836 0.071 11.802 0.000 0.472 0.695

BF4 0.662 0.094 7.035 0.000 0.374 0.679

BF6 0.758 0.088 8.640 0.000 0.428 0.529

BF7 0.753 0.080 9.416 0.000 0.425 0.503

EnvF1 =~

Env1 1.000 0.569 0.636

Env3 0.880 0.122 7.229 0.000 0.501 0.696

Env4 0.908 0.137 6.615 0.000 0.516 0.657

Env5 0.964 0.121 7.986 0.000 0.548 0.706

Regressions:

Estimate Std.Err Z-value P(>|z|) Std.lv Std.all

BehF1 ~

PF1 (c) 0.129 0.047 2.718 0.007 0.242 0.242

EnvF1 ~

PF1 (a) 0.131 0.046 2.880 0.004 0.245 0.245

BehF1 ~

EnvF1 (b) 0.162 0.103 1.585 0.113 0.164 0.164

Intercepts:

Estimate Std.Err Z-value P(>|z|) Std.lv Std.all

pers2 4.749 0.070 67.871 0.000 4.749 3.587

pers4 4.955 0.068 72.955 0.000 4.955 3.856

pers3 5.109 0.059 87.120 0.000 5.109 4.604

pers8 4.835 0.068 71.067 0.000 4.835 3.756

pers7 4.754 0.059 80.023 0.000 4.754 4.229

BF1 4.385 0.037 118.747 0.000 4.385 6.276

BF2 4.573 0.031 146.061 0.000 4.573 7.720

BF3 4.419 0.036 123.102 0.000 4.419 6.506

BF4 4.656 0.029 159.860 0.000 4.656 8.449

BF6 4.419 0.043 103.165 0.000 4.419 5.452

BF7 4.184 0.045 93.656 0.000 4.184 4.950

Env1 2.830 0.047 59.837 0.000 2.830 3.162

Env3 2.961 0.038 77.838 0.000 2.961 4.114

Env4 3.165 0.042 76.166 0.000 3.165 4.026

Env5 2.989 0.041 72.812 0.000 2.989 3.848

PF1 0.000 0.000 0.000

BehF1 0.000 0.000 0.000

EnvF1 0.000 0.000 0.000

Variances:

Estimate Std.Err Z-value P(>|z|) Std.lv Std.all

pers2 0.626 0.101 6.222 0.000 0.626 0.357

pers4 0.553 0.070 7.888 0.000 0.553 0.335

pers3 0.502 0.046 10.994 0.000 0.502 0.407

pers8 0.736 0.097 7.590 0.000 0.736 0.444

pers7 0.662 0.078 8.491 0.000 0.662 0.524

BF1 0.169 0.026 6.553 0.000 0.169 0.346

BF2 0.155 0.016 9.524 0.000 0.155 0.441

BF3 0.238 0.027 8.776 0.000 0.238 0.516

BF4 0.164 0.016 10.280 0.000 0.164 0.539

BF6 0.473 0.072 6.565 0.000 0.473 0.721

BF7 0.534 0.061 8.691 0.000 0.534 0.747

Env1 0.477 0.074 6.407 0.000 0.477 0.596

Env3 0.267 0.033 8.068 0.000 0.267 0.516

Env4 0.351 0.036 9.660 0.000 0.351 0.568

Env5 0.302 0.037 8.115 0.000 0.302 0.501

PF1 1.126 0.130 8.680 0.000 1.000 1.000

BehF1 0.286 0.053 5.401 0.000 0.895 0.895

EnvF1 0.304 0.066 4.616 0.000 0.940 0.940

R-Square:

Estimate

pers2 0.643

pers4 0.665

pers3 0.593

pers8 0.556

pers7 0.476

BF1 0.654

BF2 0.559

BF3 0.484

BF4 0.461

BF6 0.279

BF7 0.253

Env1 0.404

Env3 0.484

Env4 0.432

Env5 0.499

BehF1 0.105

EnvF1 0.060

Defined Parameters:

Estimate Std.Err Z-value P(>|z|) Std.lv Std.all

indirect 0.021 0.012 1.752 0.080 0.040 0.040

total 0.150 0.044 3.414 0.001 0.282 0.282

If I've understood this correctly, the regression shows b is not significant and so you would conclude no mediation rather than carry on at this stage (do you need a, b and c to be significant)? In addition the indirect effect is not significant so you would conclude no mediation? Or have I misinterpreted this?

Supposing my data did suggest mediation, my understanding is that R uses the Sobel (delta) approach to calculating the indirect effect, and that arguably a better approach is bootstrapping and reading the bootstrapped confidence intervals, as they can capture asymmetries in the distribution of the estimator. I chose bca.simple because it produces bias-corrected intervals based on the adjusted bootstrap percentile method. So I then ran the following command to get the bootstrap parameters:

> boot.fit <- parameterEstimates(SCTfit, boot.ci.type = "bca.simple", level = 0.95, ci = TRUE, standardized = FALSE)

> boot.fit

lhs op rhs label est se z pvalue ci.lower ci.upper

1 PF1 =~ pers2 1.000 0.000 NA NA 1.000 1.000

2 PF1 =~ pers4 0.988 0.066 14.917 0.000 0.858 1.117

3 PF1 =~ pers3 0.805 0.061 13.170 0.000 0.685 0.925

4 PF1 =~ pers8 0.904 0.058 15.647 0.000 0.791 1.018

5 PF1 =~ pers7 0.731 0.068 10.763 0.000 0.598 0.864

6 BehF1 =~ BF1 1.000 0.000 NA NA 1.000 1.000

7 BehF1 =~ BF2 0.783 0.076 10.256 0.000 0.634 0.933

8 BehF1 =~ BF3 0.836 0.071 11.802 0.000 0.697 0.975

9 BehF1 =~ BF4 0.662 0.094 7.035 0.000 0.478 0.847

10 BehF1 =~ BF6 0.758 0.088 8.640 0.000 0.586 0.930

11 BehF1 =~ BF7 0.753 0.080 9.416 0.000 0.596 0.909

12 EnvF1 =~ Env1 1.000 0.000 NA NA 1.000 1.000

13 EnvF1 =~ Env3 0.880 0.122 7.229 0.000 0.641 1.118

14 EnvF1 =~ Env4 0.908 0.137 6.615 0.000 0.639 1.177

15 EnvF1 =~ Env5 0.964 0.121 7.986 0.000 0.727 1.200

16 BehF1 ~ PF1 c 0.129 0.047 2.718 0.007 0.036 0.222

17 EnvF1 ~ PF1 a 0.131 0.046 2.880 0.004 0.042 0.221

18 BehF1 ~ EnvF1 b 0.162 0.103 1.585 0.113 -0.038 0.363

19 pers2 ~~ pers2 0.626 0.101 6.222 0.000 0.429 0.823

20 pers4 ~~ pers4 0.553 0.070 7.888 0.000 0.415 0.690

21 pers3 ~~ pers3 0.502 0.046 10.994 0.000 0.412 0.591

22 pers8 ~~ pers8 0.736 0.097 7.590 0.000 0.546 0.926

23 pers7 ~~ pers7 0.662 0.078 8.491 0.000 0.509 0.814

24 BF1 ~~ BF1 0.169 0.026 6.553 0.000 0.118 0.220

25 BF2 ~~ BF2 0.155 0.016 9.524 0.000 0.123 0.187

26 BF3 ~~ BF3 0.238 0.027 8.776 0.000 0.185 0.291

27 BF4 ~~ BF4 0.164 0.016 10.280 0.000 0.132 0.195

28 BF6 ~~ BF6 0.473 0.072 6.565 0.000 0.332 0.615

29 BF7 ~~ BF7 0.534 0.061 8.691 0.000 0.413 0.654

30 Env1 ~~ Env1 0.477 0.074 6.407 0.000 0.331 0.623

31 Env3 ~~ Env3 0.267 0.033 8.068 0.000 0.202 0.332

32 Env4 ~~ Env4 0.351 0.036 9.660 0.000 0.280 0.423

33 Env5 ~~ Env5 0.302 0.037 8.115 0.000 0.229 0.376

34 PF1 ~~ PF1 1.126 0.130 8.680 0.000 0.872 1.381

35 BehF1 ~~ BehF1 0.286 0.053 5.401 0.000 0.182 0.390

36 EnvF1 ~~ EnvF1 0.304 0.066 4.616 0.000 0.175 0.434

37 pers2 ~1 4.749 0.070 67.871 0.000 4.611 4.886

38 pers4 ~1 4.955 0.068 72.955 0.000 4.822 5.088

39 pers3 ~1 5.109 0.059 87.120 0.000 4.994 5.224

40 pers8 ~1 4.835 0.068 71.067 0.000 4.702 4.969

41 pers7 ~1 4.754 0.059 80.023 0.000 4.638 4.871

42 BF1 ~1 4.385 0.037 118.747 0.000 4.313 4.458

43 BF2 ~1 4.573 0.031 146.061 0.000 4.511 4.634

44 BF3 ~1 4.419 0.036 123.102 0.000 4.349 4.489

45 BF4 ~1 4.656 0.029 159.860 0.000 4.599 4.714

46 BF6 ~1 4.419 0.043 103.165 0.000 4.335 4.503

47 BF7 ~1 4.184 0.045 93.656 0.000 4.097 4.272

48 Env1 ~1 2.830 0.047 59.837 0.000 2.737 2.922

49 Env3 ~1 2.961 0.038 77.838 0.000 2.886 3.035

50 Env4 ~1 3.165 0.042 76.166 0.000 3.083 3.246

51 Env5 ~1 2.989 0.041 72.812 0.000 2.908 3.069

52 PF1 ~1 0.000 0.000 NA NA 0.000 0.000

53 BehF1 ~1 0.000 0.000 NA NA 0.000 0.000

54 EnvF1 ~1 0.000 0.000 NA NA 0.000 0.000

55 indirect := a*b indirect 0.021 0.012 1.752 0.080 -0.003 0.045

56 total := c+(a*b) total 0.150 0.044 3.414 0.001 0.064 0.236

This shows that the 95% confidence interval for the indirect effect is (-.003, .045) - is that all you would report from this output?

How would I generate the effect size of an indirect effect using R? I understand that kappa squared is a good measure, but I'm unsure if R allows me to do this.

Also - I know that the Bollen-Stine bootstrap could test the model fit - what exactly are the advantages of this? Should this be added to my initial command when fitting the model to give me a different model fit output, and what is the command in lavaan for this?

Apologies for all the questions - if this is not the right place for such questions, I'd be grateful if you could let me know where else I should go!

Thanks

Emma.

Terrence Jorgensen

Jun 6, 2016, 4:39:14 AM

to lavaan

> Is it normal/recommended to report model fit statistics (e.g. CFI, TLI, RMSEA, SRMR) for a mediation analysis in a PhD thesis?

Yes, for any non-saturated SEM you would be expected to address whether the model appears consistent with how the data were generated from a population process.

> Also what has the 'bootstrap = 5000' command actually done here as I can't see any reference to bootstrapping in my output?

If you set test = "bollen.stine", you would see a bootstrap p value for the chi-squared test of perfect model fit. Although the labels don't change, the SEs are the SDs of the bootstrap distribution, not calculated from the covariance matrix of the model parameters. But in your sem() call, you should also explicitly set se = "boot" if that is what you want.

> If I've understood this correctly, the regression shows b is not significant and so you would conclude no mediation rather than carry on at this stage (do you need a, b and c to be significant)? In addition the indirect effect is not significant so you would conclude no mediation? Or have I misinterpreted this?

I thought you said you weren't following those steps (and that you were interested in effect sizes, not tests)? Whether a particular direct effect is significant at an arbitrary alpha level is a function of sample size and effect size, so it is not necessarily the case that "b" is truly zero in the population, just that the data you have don't allow you to distinguish the estimated effect from zero. For testing the null hypothesis that the indirect effect is zero in the population, I would only test the relevant parameter: the indirect effect (which is also not significant in your case). But this isn't software specific, so you might find different advice if you post your question on SEMNET (a more general forum with a wider audience of SEM experts).

> Supposing my data did suggest mediation, my understanding is that R uses the Sobel (delta) approach to calculating the indirect effect

"R" does lots of things in lots of different packages. The lavaan package will give you delta-method SEs (for a Sobel test) by default, if you don't request bootstrap SEs.
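
As a rough illustration of what the delta method does for a product of two coefficients, here is a minimal base R sketch. The function name `delta_se_product` and its `cov_ab` argument are made up for this example, and the inputs are the rounded a and b estimates and robust SEs copied from the summary output earlier in the thread. With `cov_ab = 0` the formula reduces to the first-order Sobel SE. lavaan works from the full covariance matrix of the parameter estimates, so the SE it reports for `indirect` need not match this simplified calculation.

```r
# First-order delta-method SE for an indirect effect a*b.
# cov_ab is the sampling covariance between the a and b estimates;
# setting it to 0 gives the classic first-order Sobel formula.
delta_se_product <- function(a, se_a, b, se_b, cov_ab = 0) {
  sqrt(b^2 * se_a^2 + a^2 * se_b^2 + 2 * a * b * cov_ab)
}

# Rounded values copied from the summary output earlier in the thread
a <- 0.131; se_a <- 0.046
b <- 0.162; se_b <- 0.103

indirect <- a * b
se_sobel <- delta_se_product(a, se_a, b, se_b)  # covariance assumed to be 0 here

# Normal-theory (Wald) z statistic and symmetric 95% CI
z <- indirect / se_sobel
ci <- indirect + c(-1, 1) * qnorm(0.975) * se_sobel

print(round(c(indirect = indirect, se = se_sobel, z = z), 3))
print(round(ci, 3))
```

The interval built this way is symmetric around the estimate by construction, which is the limitation that bootstrap intervals are meant to avoid.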

> This shows that the 95% confidence interval for the indirect effect is (-.003, .045) - is that all you would report from this output?

If that is the only hypothesis you are interested in testing, then the CI will give you a way to test it (whether zero is in the CI).

> How would I generate the effect size of an indirect effect using R? I understand that kappa squared is a good measure, but I'm unsure if R allows me to do this.

R doesn't prevent you from doing anything. If you know the formula and can extract the relevant information from your results, you can calculate it. But the standardized slopes (std.all) you requested in your summary output are also effect sizes: they tell you how many SDs the outcome changes when the predictor goes up by 1 SD.
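
To make that concrete for the indirect effect: with a single mediator, the completely standardized indirect effect is the product of the standardized a and b paths. The following base R check uses the rounded Std.all values from the summary output earlier in the thread; the variable names are only for illustration.

```r
# Rounded Std.all values copied from the Regressions section of the summary
a_std <- 0.245  # EnvF1 ~ PF1
b_std <- 0.164  # BehF1 ~ EnvF1
c_std <- 0.242  # BehF1 ~ PF1 (direct effect)

indirect_std <- a_std * b_std        # standardized indirect effect
total_std <- c_std + indirect_std    # standardized total effect

print(round(c(indirect = indirect_std, total = total_std), 3))
```

Rounded to three decimals these come out as 0.040 and 0.282, the same values shown in the Std.all column of the "Defined Parameters" section, so the summary already reports a standardized effect size for the indirect effect. lavaan's standardizedSolution() function should return the same standardized estimates in a data frame.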

> Also - I know that the Bollen-Stine bootstrap could test the model fit - what exactly are the advantages of this? Should this be added to my initial command when fitting the model to give me a different model fit output, and what is the command in lavaan for this?

You can set test = "bollen.stine". The advantage is that it bases the p value on a null hypothesis that is actually true because the data are rescaled to be perfectly consistent with estimated model parameters before bootstrapping and fitting the model to each bootstrapped sample. If you bootstrapped the chi-squared test statistic without that transformation, then the mean and variance of that bootstrap distribution would follow the observed data-model fit, so the resulting bootstrap distribution would not be a valid distribution for testing the null hypothesis of perfect model fit.

Note that if you are using the Bollen-Stine bootstrap and the bootstrap CIs for all tests, then you don't need to set estimator = "MLR" because bootstrapping already does not rely on the assumption of normality. But if you are only interested in bootstrap CIs for the indirect effects, then the robust SEs and robust chi-squared under MLR provide valid tests.
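
A minimal sketch of how these options fit together in one call, assuming lavaan is installed. The data are simulated from a made-up population model, so `pop_model`, `sim_data`, `med_model` and all variable names are hypothetical and stand in for the real data set. The number of bootstrap draws is cut to 100 only to keep the example quick, and percentile intervals ("perc") are used here; "bca.simple" is another option for boot.ci.type.

```r
library(lavaan)

set.seed(1)

# Hypothetical population model, used only to simulate illustrative data
pop_model <- '
  X =~ 0.8*x1 + 0.7*x2 + 0.6*x3
  M =~ 0.8*m1 + 0.7*m2 + 0.6*m3
  Y =~ 0.8*y1 + 0.7*y2 + 0.6*y3
  M ~ 0.4*X
  Y ~ 0.3*M + 0.2*X
'
sim_data <- simulateData(pop_model, sample.nobs = 300)

# Latent mediation model with labelled paths and defined parameters
med_model <- '
  X =~ x1 + x2 + x3
  M =~ m1 + m2 + m3
  Y =~ y1 + y2 + y3
  M ~ a*X
  Y ~ b*M + c*X
  indirect := a*b
  total := c + a*b
'

# se = "boot" requests bootstrap SEs; test = "bollen.stine" requests the
# Bollen-Stine bootstrap p value for the chi-squared test
fit <- sem(med_model, data = sim_data,
           se = "boot", test = "bollen.stine", bootstrap = 100)

# Bootstrap confidence intervals for all parameters, then keep the defined ones
pe <- parameterEstimates(fit, boot.ci.type = "perc", level = 0.95)
ind <- pe[pe$op == ":=", c("label", "est", "se", "ci.lower", "ci.upper")]
print(ind)
```

Calling summary(fit, fit.measures = TRUE) on the fitted object should then also show the Bollen-Stine bootstrap p value alongside the usual fit measures.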

Terrence D. Jorgensen

Postdoctoral Researcher, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam

UvA web page: http://www.uva.nl/profile/t.d.jorgensen

Emma Mills

Jun 7, 2016, 2:51:32 PM

to lav...@googlegroups.com

This is really really helpful and makes things a lot clearer to me - thank you!

--
You received this message because you are subscribed to a topic in the Google Groups "lavaan" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/lavaan/8xPLljwz4m4/unsubscribe.
To unsubscribe from this group and all its topics, send an email to lavaan+un...@googlegroups.com.
To post to this group, send email to lav...@googlegroups.com.
Visit this group at https://groups.google.com/group/lavaan.
For more options, visit https://groups.google.com/d/optout.

Emma Mills

Jun 8, 2016, 8:17:07 AM

to lav...@googlegroups.com

Just one quick thing to clarify - would you use test = "bollen.stine" and se = "boot", or does the Bollen-Stine test replace the need for se = "boot"?

Thanks

Emma.

Yves Rosseel

Jun 8, 2016, 10:16:42 AM

to lav...@googlegroups.com

On 06/08/2016 02:17 PM, 'Emma Mills' via lavaan wrote:
> Just one quick thing to clarify - would you use test = "bollen.stine"
> and se = "boot", or does the Bollen-Stine test replace the need for se = "boot"?

You need both.

Yves.

Emma Mills

Dec 17, 2016, 11:15:26 AM

to lav...@googlegroups.com

Can I check that my syntax below has used the Bollen-Stine bootstrap and bootstrap CIs for all tests (rather than just the indirect effects), and so removes the need for estimator = "MLR"? (My data have multivariate non-normality.) If not, where would I add this instruction?

SCTModelCov1 <- 'PF1=~pers2+pers4+pers3+pers8+pers7

BehF1=~BF1_AV+BF2_AV+BF3_AV+BF4_AV+BF6_AV+BF7_AV

EnvF1=~Env1+Env3+Env4+Env5

# direct effect

BehF1 ~ c*PF1

# mediator

EnvF1 ~ a*PF1

BehF1 ~ b*EnvF1

# indirect effect (a*b)

indirect := a*b

# total effect

total := c + (a*b)

PF1~~BehF1

PF1~~EnvF1

EnvF1~~BehF1

'

SCT1Covfit<-sem(SCTModelCov1, data = SMBI_Alldata, test = "bollen.stine",  se="boot", bootstrap = 5000)

summary(SCT1Covfit, standardized = T, fit.measures = T, rsq = T)

boot.fit1 <- parameterEstimates(SCT1Covfit, boot.ci.type="bca.simple",level=0.95, ci=TRUE,standardized = FALSE)

boot.fit1


Terrence Jorgensen

Dec 17, 2016, 6:39:10 PM

to lavaan

> Can I check that my syntax below has used the Bollen-Stine bootstrap and bootstrap CIs for all tests (rather than just the indirect effects), and so removes the need for estimator = "MLR"?

It does.
