lavaan.survey and bootstrap


Niels

Aug 20, 2015, 8:14:20 AM
to lavaan
Hi all,

Have I missed something, or is se="boot" not yet available for lavaan.survey?

Thank you!

Stas Kolenikov

Aug 20, 2015, 12:14:13 PM
to lav...@googlegroups.com
Survey bootstrap involves a fair share of oddities, at least the
clustered one with few clusters per stratum. See
http://www.citeulike.org/user/ctacmo/article/1475866,
http://www.citeulike.org/user/ctacmo/article/1036970 and
http://www.citeulike.org/user/ctacmo/article/582039 for the underlying
methodological developments. (I also summarized them in
http://www.citeulike.org/user/ctacmo/article/9101177 along with my
Stata code for survey bootstrap.)

-- Stas Kolenikov, PhD, PStat (ASA, SSC)
-- Principal Survey Scientist, Abt SRBI
-- Education Officer, Survey Research Methods Section of the American
Statistical Association
-- Opinions stated in this email are mine only, and do not reflect the
position of my employer
-- http://stas.kolenikov.name

Niels

Aug 24, 2015, 3:19:50 AM
to lavaan
Oh, wow. Thank you! I should have guessed that this won't be easy ;)

A.Bi.

Sep 10, 2015, 10:20:32 AM
to lavaan
Dear all,

I also have survey data (with a weighting variable) and want to estimate a multiple mediation model (with bootstrapping). Do you know whether an example or a tutorial exists?


Thank you for your help!

Mattia Indi Gerin

Jun 27, 2017, 12:50:24 PM
to lavaan
Can anyone reply to this question by A.Bi., please? I am having the same problem.

Terrence Jorgensen

Jun 28, 2017, 6:59:30 AM
to lavaan
> Can anyone reply to this question by A.Bi., please? I am having the same problem.

On Thursday, 10 September 2015 15:20:32 UTC+1, A.Bi. wrote:
> I also have survey data (with a weighting variable) and want to estimate a multiple mediation model (with bootstrapping). Do you know whether an example or a tutorial exists?

Presumably if you are using complex survey methods, you have a large enough sample size to warrant relying on the Sobel test (the test printed by default for user-defined parameters, using the delta method).  Or if the little bit of nonnormality in the nearly asymptotic sampling distribution of the indirect effect still bothers you (understandably), then you could rely on other methods that perform at least as well as the bootstrap, like Monte Carlo:


Hopefully you can get vcov() output from a lavaan.survey() result.  Also available in the semTools package:

?monteCarloMed
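
For readers who want to see what the Monte Carlo approach does with vcov() output, here is a minimal base-R sketch. The estimates and covariance matrix below are made-up numbers, not results from any model in this thread; with a real fit they would presumably come from coef() and vcov(). It is an illustration of the idea, not the semTools implementation.

```r
# Hypothetical estimates of paths a and b and their sampling covariance matrix.
# With a fitted model these would come from coef(fit) and vcov(fit).
est <- c(a = 0.40, b = 0.30)
acov <- matrix(c(0.010, 0.001,
                 0.001, 0.008),
               nrow = 2, dimnames = list(names(est), names(est)))

# Sobel (delta-method) standard error of the indirect effect a*b
grad <- c(est[["b"]], est[["a"]])        # d(ab)/da = b, d(ab)/db = a
se_sobel <- sqrt(drop(t(grad) %*% acov %*% grad))
ci_sobel <- est[["a"]] * est[["b"]] + c(-1, 1) * qnorm(0.975) * se_sobel

# Monte Carlo: draw (a, b) from N(est, acov), then take percentiles of a*b
set.seed(1234)
n_rep <- 20000
z <- matrix(rnorm(n_rep * 2), ncol = 2)
draws <- sweep(z %*% chol(acov), 2, est, "+")   # each row ~ N(est, acov)
ab <- draws[, 1] * draws[, 2]
ci_mc <- unname(quantile(ab, c(0.025, 0.975)))

print(rbind(sobel = ci_sobel, monte_carlo = ci_mc))
```

The Sobel interval is symmetric around a*b by construction, while the Monte Carlo interval is free to be asymmetric because it uses the percentiles of the simulated products.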

Terrence D. Jorgensen
Postdoctoral Researcher, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam



Mattia Indi Gerin

Jun 28, 2017, 8:44:36 AM
to lavaan
Thank you for the quick reply! Will try that!

Mattia Indi Gerin

Jun 28, 2017, 12:52:31 PM
to lavaan
Dear Dr Jorgensen,

Again, thank you for your quick reply.

I have two more queries that you may be able to help me with:

1. Given that bootstrapping seems to be slightly superior to Monte Carlo, do you know of any way to apply bootstrapping to an SEM fitted with lavaan.survey? I am only aware of this being possible with lavaan, but I have to use lavaan.survey (and not lavaan) because my model uses weights, and lavaan does not support weights.

2. The online tool that you suggested for calculating the CI of indirect effects using the Monte Carlo method gives the CI for a*b. Do you know of any other tool or script that can also calculate the CI for each coefficient separately (i.e. a, b and c’) and for the total effect, i.e. c’ + (a*b)?

Thank you in advance,

Mattia




Terrence Jorgensen

Jun 29, 2017, 4:27:26 AM
to lavaan
> 1. Given that bootstrapping seems to be slightly superior to Monte Carlo, do you know of any way that bootstrapping can be applied to a SEM fitted model from lavaan.survey (i.e. I am only aware of this being possible with lavaan, but not lavaan.survey

This difficulty is precisely the advantage of the Monte Carlo method (similar to the parametric bootstrap method), as pointed out in one of the articles posted on Preacher's site I linked to:


I'm not convinced bootstrapping is really superior, though. MacKinnon et al. (2004) found Monte Carlo had slightly lower Type I error rates and power than BCa bootstrap (which had nominal errors, hence the "slight" superiority description), but that power discrepancy shrank as N increased (and they only simulated up to N = 200 -- how big is your sample?). Preacher and Selig's (2012) simulation, on the other hand, found comparable rejection rates between the two, using the same range of sample sizes. The point is just that bootstrapping is not the only tool available for valid inferences, and a couple of recent simulations have found that the Sobel test can even outperform bootstrapping:


> 2. The online tool that you suggested for calculating the CI of indirect effects using the Monte Carlo method gives the CI for a*b. Do you know of any other tool or script that can also calculate the CI for each coefficient separately (i.e. a, b and c’) and for the total effect, i.e. c’ + (a*b)?

The semTools package:

?monteCarloMed

Label the necessary parameters in the model syntax, and you can write whatever user-defined function of those parameters you want to test using the Monte Carlo method.
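
To make the "whatever user-defined function" idea concrete, here is a rough base-R sketch of a Monte Carlo CI for an arbitrary expression written in terms of parameter labels. The helper `mc_ci()` is hypothetical (it is not part of semTools), and the estimates and covariance matrix are invented for illustration; they stand in for what coef() and vcov() would return for labeled parameters.

```r
# Monte Carlo CI for an arbitrary function of labeled parameters.
# `expr` is an R expression (as a string) written in terms of the labels.
mc_ci <- function(expr, est, acov, n_rep = 20000, level = 0.95) {
  z <- matrix(rnorm(n_rep * length(est)), ncol = length(est))
  draws <- sweep(z %*% chol(acov), 2, est, "+")   # rows ~ N(est, acov)
  colnames(draws) <- names(est)
  # Evaluate the expression once per simulated parameter vector
  values <- eval(parse(text = expr), envir = as.data.frame(draws))
  alpha <- 1 - level
  c(estimate = eval(parse(text = expr), envir = as.list(est)),
    lower = unname(quantile(values, alpha / 2)),
    upper = unname(quantile(values, 1 - alpha / 2)))
}

# Hypothetical labeled estimates and their (here diagonal) covariance matrix
est <- c(a = 0.40, b = 0.30, c = 0.20)
acov <- diag(c(0.010, 0.008, 0.012))
dimnames(acov) <- list(names(est), names(est))

set.seed(2017)
indirect <- mc_ci("a * b", est, acov)
total <- mc_ci("c + a * b", est, acov)
print(rbind(indirect, total))
```

Any expression that uses only the labels can be passed the same way, which is why labeling the paths in the model syntax is the key step.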

Mattia Indi Gerin

Jun 29, 2017, 10:25:38 AM
to lav...@googlegroups.com
Fantastic! Thank you so much, Terrence! I think it works. Does the script below look right (sorry, I am quite new to R!)?



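# Packages needed for the model, the survey design, and the Monte Carlo CIs
library(lavaan)
library(survey)
library(lavaan.survey)
library(semTools)

# Survey design with sampling weights only (no strata or clusters declared)
weightsdesign <- svydesign(ids = ~1, weights = ~weights, data = MT_CT_data_SEM)

# Simple mediation model with labeled paths a, b, and c
model_AMY_INT <- 
'ANGERFEAR_NEUTRAL_Bilat ~ a*MT_CT
Internalizing_Symptoms_LastAssessmentMAX3 ~ b*ANGERFEAR_NEUTRAL_Bilat + c*MT_CT

direct := c
indirect := a*b
total := c + (a*b)'

# Fit without weights first, then adjust the fit for the survey design
fit_model_AMY_INT <- sem(model_AMY_INT, data = MT_CT_data_SEM)

fit_model_AMY_INT_wght <- lavaan.survey(fit_model_AMY_INT, weightsdesign)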



meda <- 'a'
medb <- 'b'
medc <- 'c'
medab <- 'a*b'
medabc <- ' c + a*b'


fit_model_AMY_INT_wght_a <- monteCarloMed(meda,object=fit_model_AMY_INT_wght, rep=20000, CI=95, plot=TRUE)
write.csv(fit_model_AMY_INT_wght_a, file="fit_model_AMY_INT_wght_a.csv")

fit_model_AMY_INT_wght_b <- monteCarloMed(medb,object=fit_model_AMY_INT_wght, rep=20000, CI=95, plot=TRUE)
write.csv(fit_model_AMY_INT_wght_b, file="fit_model_AMY_INT_wght_b.csv")

fit_model_AMY_INT_wght_c <- monteCarloMed(medc,object=fit_model_AMY_INT_wght, rep=20000, CI=95, plot=TRUE)
write.csv(fit_model_AMY_INT_wght_c, file="fit_model_AMY_INT_wght_c.csv")

fit_model_AMY_INT_wght_ab <- monteCarloMed(medab,object=fit_model_AMY_INT_wght, rep=20000, CI=95, plot=TRUE)
write.csv(fit_model_AMY_INT_wght_ab, file="fit_model_AMY_INT_wght_ab.csv")

fit_model_AMY_INT_wght_abc <- monteCarloMed(medabc,object=fit_model_AMY_INT_wght, rep=20000, CI=95, plot=TRUE)
write.csv(fit_model_AMY_INT_wght_abc, file="fit_model_AMY_INT_wght_abc.csv")


Stas Kolenikov

Jun 29, 2017, 11:32:18 AM
to lav...@googlegroups.com
The code may work (and I don't want to comment on that), but there are
methodological issues with how the whole thing is put together.

1. The survey weights may reflect unequal probabilities of selection,
as well as nonresponse and coverage adjustments. In some projects,
survey statisticians create replicate weights that reflect all of
these adjustments, so that the standard errors that you'd see account
for these. Sometimes, the standard errors would go up, sometimes,
down.

2. Beyond the weights, there are also aspects of stratification,
clustering, and finite population corrections that affect the variance
/ standard errors of the estimates. I pointed these out earlier in
the thread in my 2015 post. Properly constructed replicate weights
account for these. Without access to the sampling methodology
documentation and supplementary data such as the population totals on
the calibration variables, it is very difficult to get these
adjustments right.

3. Simulating from the asymptotic distribution of the estimates
produced by lavaan() is all nice, but it does not really deliver the
benefits of bootstrapping the original data, such as more
accurate confidence intervals. It is probably a step in the right
direction, but only about a third of the way. (I am pleasantly
surprised by the results in Preacher and Selig 2012 simulation study,
though.)

4. Taking 20 thousand replicates delivers as much on capturing the
asymmetry of the mediation effect estimates as it does on uncovering
the biases of the method itself, i.e. of simulating from the
asymptotic distribution of the original estimates. (No method is
unbiased; each introduces its own errors. 20K is enough to drive the
Monte Carlo sampling errors to oblivion, so that any biases would be
all too obvious.)
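
As a rough illustration of what replicate weights that respect strata and clusters can look like, the base-R sketch below uses the Rao-Wu rescaled bootstrap: within each stratum, n_h - 1 PSUs are drawn with replacement and the weights are rescaled by n_h / (n_h - 1) times the number of draws. This technique is chosen here for illustration and is an assumption, not necessarily what the references above describe. The data, the variable names (stratum, psu_id, w, x, m, y), and the OLS-based mediation estimate are all hypothetical, and the sketch does not redo nonresponse or calibration adjustments within each replicate.

```r
set.seed(42)

# Simulated (hypothetical) survey: 4 strata x 5 PSUs x 10 respondents
dat <- expand.grid(unit = 1:10, psu = 1:5, stratum = 1:4)
dat$psu_id <- interaction(dat$stratum, dat$psu, drop = TRUE)
dat$w <- runif(nrow(dat), 0.5, 2)        # stand-in for final survey weights
dat$x <- rnorm(nrow(dat))
dat$m <- 0.4 * dat$x + rnorm(nrow(dat))
dat$y <- 0.3 * dat$m + 0.2 * dat$x + rnorm(nrow(dat))

# Weighted point estimate of the indirect effect a*b (observed variables, OLS)
indirect <- function(data, w) {
  data$.w <- w
  a <- coef(lm(m ~ x, data = data, weights = .w))[["x"]]
  b <- coef(lm(y ~ m + x, data = data, weights = .w))[["m"]]
  a * b
}

# One set of Rao-Wu replicate weights (needs at least 2 PSUs per stratum)
rao_wu_weights <- function(data) {
  mult <- numeric(nrow(data))
  for (h in unique(data$stratum)) {
    in_h <- data$stratum == h
    psus <- unique(data$psu_id[in_h])
    n_h <- length(psus)
    # Draw n_h - 1 PSUs with replacement and count how often each appears
    drawn <- sample(seq_len(n_h), size = n_h - 1, replace = TRUE)
    counts <- tabulate(drawn, nbins = n_h)
    mult[in_h] <- (n_h / (n_h - 1)) * counts[match(data$psu_id[in_h], psus)]
  }
  data$w * mult
}

n_boot <- 500
boot_ab <- replicate(n_boot, indirect(dat, rao_wu_weights(dat)))
est <- indirect(dat, dat$w)
se_boot <- sqrt(mean((boot_ab - est)^2))   # replicate-based standard error
ci_boot <- unname(quantile(boot_ab, c(0.025, 0.975)))
print(c(estimate = est, se = se_boot, lower = ci_boot[1], upper = ci_boot[2]))
```

Whole PSUs are kept or dropped together, so the spread of the replicate estimates reflects clustering and stratification rather than treating respondents as independent draws.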





-- Stas Kolenikov, PhD, PStat (ASA, SSC) @StatStas
-- Senior Scientist, Abt Associates @AbtDataScience
-- Program Chair (2018), Survey Research Methods Section of the
American Statistical Association
-- Opinions stated in this email are mine only, and do not reflect the
position of my employer
-- http://stas.kolenikov.name




Terrence Jorgensen

Jun 30, 2017, 5:50:19 AM
to lavaan
Definitely pay attention to Stas' current and earlier posts. That said, the syntax is correct, although there is no need to use it for the direct effects: the z tests in the output already assume a normal sampling distribution, so sampling a single parameter from a normal distribution doesn't buy you anything.
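
A quick way to see this, sketched in base R with a made-up estimate and standard error for a single path: the Monte Carlo percentile interval for one normally distributed parameter simply reproduces the usual Wald interval, up to simulation noise.

```r
# Hypothetical estimate and standard error for a single path (e.g. c)
est_c <- 0.20
se_c <- 0.11

set.seed(99)
draws <- rnorm(20000, mean = est_c, sd = se_c)
ci_mc <- unname(quantile(draws, c(0.025, 0.975)))     # Monte Carlo interval
ci_wald <- est_c + c(-1, 1) * qnorm(0.975) * se_c     # estimate +/- 1.96 * SE

# Largest gap between the two sets of limits; only simulation noise remains
max_diff <- max(abs(ci_mc - ci_wald))
print(rbind(monte_carlo = ci_mc, wald = ci_wald))
```

The Monte Carlo method only adds something for functions such as a*b, whose sampling distribution is not normal even when a and b are.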