Mediation using maximum likelihood estimation with robust standard errors


MP

Apr 18, 2019, 12:52:50 PM
to lavaan
Hi all, I am trying to reproduce an SEM mediation analysis from a published article on my own data. The authors reported that parameter estimates were calculated using "maximum likelihood with robust standard errors" and that "missing data were handled using full information maximum likelihood procedures". In an attempt to reproduce this, I used the following input:

modelFear <- 'CLg ~ c*HRR
             Fear ~ a*HRR
             CLg ~ b*Fear
             ab := a*b
             total := c + (a*b)'
fitFear <- sem(modelFear, data = cabsfull, missing = "fiml", se = "robust.sem")

When I run this, I get the following message:

Warning message:
In lav_options_set(opt) :
  lavaan WARNING: missing will be set to "listwise" for se = "robust.sem"

Does this mean I cannot use full information ML to deal with missing values if I want robust standard errors?

If I omit the se argument and input just the following:

fitFear <- sem(modelFear, data = cabsfull, missing = "fiml")

I get the following message:

Warning message:
In lav_data_full(data = data, group = group, cluster = cluster,  :
  lavaan WARNING: 5 cases were deleted due to missing values in exogenous variable(s), while fixed.x = TRUE.

So if I understood correctly, it used listwise deletion even though I asked for full information ML. Is it possible to reproduce what these authors did, i.e., calculate the parameter estimates using ML with robust standard errors and use FIML for the missing data? If yes, what am I doing wrong?
I am new to R and SEM, so any advice is very welcome. Thanks!

Terrence Jorgensen

Apr 24, 2019, 12:23:56 PM
to lavaan
You need Huber-White SEs (and also a robust Yuan-Bentler test statistic). You can set fixed.x=FALSE, or add ".x" to "fiml". Try

fitFear <- sem(modelFear, data = cabsfull, missing = "fiml.x", estimator = "MLR")
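For completeness, the fixed.x=FALSE route mentioned above would look like this (a sketch keeping your other arguments; note that the exogenous variable HRR is then modeled rather than conditioned on):

fitFear <- sem(modelFear, data = cabsfull, missing = "fiml", fixed.x = FALSE, estimator = "MLR")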

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

MP

Apr 25, 2019, 11:36:34 AM
to lavaan
Hi Terrence, thank you very much for your reply! The solution you provided worked perfectly!

I was just wondering if I may ask you another question. As I mentioned before, I am trying to reproduce a mediation analysis that another group published some time ago. They mentioned that "parameter estimates were calculated using maximum likelihood with robust standard errors" to account for non-normality in the data (my data have the same issues) and that "to test the significance of the indirect effects, bias-corrected confidence intervals (CIs) for the indirect effects were generated using 1,000 bootstrap samples". Therefore, if I use the function:

fitFear <- sem(modelFear, data = cabsfull, missing = "fiml.x", estimator = "MLR")

That will reproduce everything they did except bootstrapping the (bias-corrected) CIs for the indirect effect. I was wondering whether it is possible to add another function after my fitFear call that will just bootstrap the confidence intervals, or whether I have to change my call entirely and maybe try something like:


fitFear <- sem(modelFear, data = cabsfull, missing = "fiml.x", estimator = "ML", test = "yuan.bentler", se = "bootstrap", bootstrap = 1000)

The overall idea is to use the approach that best accounts for the non-normality of the data. My sample size is N = 487. Thank you!



PD

Apr 25, 2019, 2:12:44 PM
to lavaan
You can only bootstrap if you have the raw data; otherwise, use the robust standard errors you have calculated and see whether you reach the same conclusion the authors did. If you want to bootstrap for your own study, you can use se = "bootstrap" and bootstrap = 1000 (although you could use 5000 if you want; it shouldn't make much difference).

Terrence Jorgensen

Apr 26, 2019, 3:56:36 AM
to lavaan
I was wondering if there is a possibility to add another function after my fitFear function that will just bootstrap the confidence intervals
 
Use the parameterEstimates() function, which has the argument boot.ci.type = "bca.simple" to request BC bootstrap CIs.
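For example, a minimal sketch using the model from earlier in the thread (note that se = "bootstrap" must be requested when fitting the model; parameterEstimates() only computes the CIs from the stored bootstrap draws):

fitFear <- sem(modelFear, data = cabsfull, missing = "fiml.x", se = "bootstrap", bootstrap = 1000)
parameterEstimates(fitFear, boot.ci.type = "bca.simple")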

MP

Apr 26, 2019, 5:08:25 AM
to lavaan
Thanks Terrence! Just a final question: how do I enter the number of bootstrap draws in the parameterEstimates() function? I know that for the sem() function it is bootstrap = , but lavaan does not recognise that argument if I pass it to parameterEstimates(). If I just enter:

parameterEstimates(fitFear, boot.ci.type="bca.simple")


It works fine, but it doesn't say how many bootstrap samples it is based on.

Thanks for all the help!

Terrence Jorgensen

Apr 27, 2019, 4:45:07 AM
to lavaan
parameterEstimates(fitFear, boot.ci.type="bca.simple")

It works fine, but it doesn't say how many bootstrap samples it is based on.

Because you already specified that when you fit the model fitFear.

MP

Apr 27, 2019, 6:12:49 AM
to lavaan
Apologies, but I am a bit confused now, because my fitFear model does not include a bootstrapping argument. It is as you recommended earlier:

fitFear <- sem(modelFear, data = cabsfull, estimator = "MLR", missing = "fiml.x")
parameterEstimates(fitFear, boot.ci.type = "bca.simple")

(The first line should give me the MLR estimator with robust standard errors and FIML for missing data, and the second should just provide me with a BCa bootstrapped CI for the indirect effect.)

Also, when I run both the fitFear and parameterEstimates() calls I get the results immediately, whereas when I usually bootstrap things in lavaan it takes time because it requires computational power, so I'm not sure whether I am doing it properly.

Many thanks for your patience!

Terrence Jorgensen

Apr 28, 2019, 4:56:14 AM
to lavaan
Apologies, but I am a bit confused now, because my fitFear model does not include a bootstrapping argument.

Then add it.

It is as you recommended earlier:

You mean in response to your first post, which did not mention bootstrapping at all?

MP

Apr 29, 2019, 9:11:33 AM
to lavaan
Thanks! Just to confirm, it should be like this then?

fitFear <- sem(modelFear, data = cabsfull, estimator = "MLR", missing = "fiml.x", bootstrap = 5000)
parameterEstimates(fitFear, boot.ci.type = "bca.simple")

Terrence Jorgensen

May 8, 2019, 5:11:19 AM
to lavaan


On Monday, April 29, 2019 at 3:11:33 PM UTC+2, MP wrote:
Thanks! Just to confirm, it should be like this then?

fitFear <- sem(modelFear, data = cabsfull, estimator = "MLR", missing = "fiml.x", bootstrap = 5000)
parameterEstimates(fitFear, boot.ci.type = "bca.simple")

Sure, but I don't see why you would insist on MLR for robust SEs when you are already bootstrapping. And if your N is large, there isn't much point to bootstrapping; just rely on the delta-method (robust) SEs provided by default for your indirect effects.
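For instance (a sketch reusing the earlier call), the delta-method CIs based on the robust SEs already appear in the default output:

fitFear <- sem(modelFear, data = cabsfull, estimator = "MLR", missing = "fiml.x")
parameterEstimates(fitFear)  # ci.lower / ci.upper for ab and total use the robust SEs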

Nikola Ćirović

May 16, 2020, 6:36:38 PM
to lavaan
Hello!

I also have a question regarding this. Namely, I followed the example above and specified the following command:

fit. <- sem(model, data = data, orthogonal = TRUE, estimator = "ML", missing = "fiml.x", test = "yuan.bentler", se = "bootstrap", bootstrap = 500)

However, after this, I was confused to find that the scaled fit indices (CFI/TLI) rose substantially (though the robust ones did not) in comparison to what is obtained with the following code:

fit <- sem(model, data = data, std.lv = TRUE, orthogonal = TRUE, estimator = "MLR", missing = "fiml")

I am not sure whether the command also bootstrapped the chi-square statistic or only the SEs and CIs (the CIs are important to me, as this is a latent mediation model). The scaled chi-square statistics from the first (bootstrapped) and the second code differed only slightly (the Yuan-Bentler scaling factor differed in the third decimal), and it confuses me that the CFI is much more favourable after the bootstrap.


I understand the recommendation to simply use the delta method; however, I am writing in the hope of understanding what the command I wrote actually performed.
Sorry if I am asking (missing) something basic.
Thanks in advance.

Terrence Jorgensen

May 18, 2020, 9:13:25 AM
to lavaan


On Sunday, May 17, 2020 at 12:36:38 AM UTC+2, Nikola Ćirović wrote:

fit. <- sem(model, data = data, orthogonal = TRUE, estimator = "ML", missing = "fiml.x", test = "yuan.bentler", se = "bootstrap", bootstrap = 500)

However, after this, I was confused to find that the scaled fit indices (CFI/TLI) rose substantially (though the robust ones did not) in comparison to what is obtained with the following code:

fit <- sem(model, data = data, std.lv = TRUE, orthogonal = TRUE, estimator = "MLR", missing = "fiml")

I am not sure whether the command also bootstrapped the chi-square statistic or only the SEs and CIs ... it confuses me that the CFI is much more favourable after the bootstrap.

Setting estimator = "MLR" sets test = "yuan.bentler.mplus", not test = "yuan.bentler". See the description in ?lavOptions. The Mplus variant is only asymptotically equivalent to what Yuan & Bentler described.

If the result didn't change much for the hypothesized model, then the CFI probably changed because the baseline model's chi-square changed more substantially.

FYI, setting missing = "fiml" vs. "fiml.x" can yield different behavior too: if you have missing data on exogenous observed variables, "fiml" will still listwise-delete observations with missing values on the exogenous covariates, whereas "fiml.x" retains them (so the model will be fitted to a larger sample).
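For instance, if one wanted the original Yuan-Bentler statistic rather than the Mplus variant that estimator = "MLR" implies, it can be requested explicitly (a sketch; se = "robust.huber.white" is the SE method MLR uses):

fit <- sem(model, data = data, estimator = "ML", test = "yuan.bentler", se = "robust.huber.white", missing = "fiml")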

Nikola Ćirović

Jun 1, 2020, 7:17:40 PM
to lavaan
Dear Professor Jorgensen,

Thank you so much for your quick reply. 

I understand that there may be more than one possible reason for this. (As far as I understand, I don't have observed exogenous variables, because my model is a measurement model with latent paths for which I want to bootstrap SEs and obtain CIs, so I am trying to wrap my mind around the other reason you singled out as a possibility: the MLR.)

However, given the task at hand, which is to obtain bootstrapped CIs for the latent path coefficients, i.e. their indirect effects, I want to be safe and only use the se = "bootstrap" option (with the default ML, even though I need MLR; once I obtain the MLR fit indices, it only remains to find the CIs of the paths). It takes quite some time to do this (setting it to 500 bootstrap draws takes an hour), so I judge that it really performs the bootstrap for all the SEs. However, I find it confusing that when I call the parameterEstimates() function with the boot.ci.type = "bca.simple" argument (I am trying to follow Hayes on this one and use the bias-corrected percentile bootstrap), it finishes in a second. So I wonder how I should describe what happened with these two commands. Can I use the same terminology that Hayes uses in his book on process analysis, and can I say that I obtained bias-corrected bootstrapped CIs for the latent path coefficients and their indirect and total effects?

Terrence Jorgensen

Jun 2, 2020, 3:36:26 AM
to lavaan
to obtain bootstrapped CIs for the latent path coefficients, i.e. their indirect effects, I want to be safe and only use the se = "bootstrap" option (with the default ML, even though I need MLR; once I obtain the MLR fit indices, it only remains to find the CIs of the paths). It takes quite some time to do this (setting it to 500 bootstrap draws takes an hour), so I judge that it really performs the bootstrap for all the SEs.

Yes, the entire model is fitted to each bootstrap sample.
 
However, I find it confusing that when I call the parameterEstimates() function with the boot.ci.type = "bca.simple" argument (I am trying to follow Hayes on this one and use the bias-corrected percentile bootstrap), it finishes in a second.

The bootstrapping already happened when you called sem(), so parameterEstimates() only needs to do the calculations, not the bootstrapping.

FYI, since you are using MLR anyway, it would be faster to use the Monte Carlo approach. That would take seconds rather than hours for 2,000+ replications (the condition under which bootstrapping works well, not 500), and it performs just as well as bootstrapping (see the References and Examples on the ?semTools::monteCarloMed help page).

Patrick (Malone Quantitative)

Jun 2, 2020, 10:16:02 AM
to lav...@googlegroups.com
Chiming in that the principal reason to use bootstrapped CIs for indirect effects is to allow the CIs to be asymmetric, because the sampling distribution of a product of coefficients is asymmetric. Using the bootstrap just to generate SEs, while it has other merits in some situations, does not achieve this. Hence the various approaches to bootstrapped CIs (percentile, bias-corrected, accelerated).



--
Patrick S. Malone, Ph.D., Malone Quantitative
NEW Service Models: http://malonequantitative.com

He/Him/His

Nikola Ćirović

Jun 3, 2020, 7:04:27 PM
to lavaan
Dear Mr. Malone, thank you for your additional clarification.

Dear Mr. Jorgensen, thank you for the feedback.

If I understood you correctly, the procedure I used (the code for the bias-corrected CIs) is written correctly, but the procedure itself is insufficient for my model given the circumstances.

I decided to follow your advice and use the Monte Carlo method, but that function is somewhat more difficult to use (although I am eager to explore semTools' capabilities). Could you please help me by checking whether I used the function correctly? It is at the bottom of my code, for one of the effects. (I copied the full syntax, as it also confuses me how to correctly label the parameters, so I use the same letters as in the syntax.)

I used some examples from this Google group.


model.temp.rum <- '
bis =~ p114+p125+p129+p130+p133+p136+p138
bas =~ p116+p118+p127+p131+p137+p140
fight =~ p115+p121+p123+p124+p132+p139
freez =~ p117+p119+p126+p128+p134
flight =~ p113+p120+p122+p135+p141

bis ~~ bas
bis ~~ fight
bis ~~ freez
bis ~~ flight

bas ~~ fight
bas ~~ freez
bas ~~ flight

fight ~~ freez
fight ~~ flight

freez ~~ flight

SAA =~ p142+p143+p144+p145+p146+p147+p148+p149+p150+p151+p152+p153+p154+p155+p156+p157+p158+p159+p160+p161+p162+p163+p164+p165+p166
SA1 =~ p143+p144+p145+p146
SA2 =~ p144+p154+p155
SA3 =~ p156+p157+p158+p159+p160
SA4 =~ p161+p162+p163+p164+p165+p166

XY =~ p167+p168+p169+p170+p171+p172+p173+p174+p175+p176+p177+p178+p179+p180+p181+p183+p184+p185+p186
bbp =~ p167+p168+p169+p170
vdp =~ p172+p173+p174
fft =~ p175+p177+p178+p179
p171 ~~ p185
p185 ~~ p186

XY ~ b*bis + fight + flight + freez + a*bas
SAA ~ c*XY + e*bis + fight + flight + freez + d*bas

direktBAS := d
direktBIS := e
indirektBAS := a*c
indirektBIS := b*c
totalBAS := d + (a*c)
totalBIS := e + (b*c)

'

fit.model.temp.rum <- sem(model.temp.rum, data = data, std.lv = TRUE, orthogonal = TRUE, estimator = "MLR", missing = "fiml")
summary(fit.model.temp.rum,
        fit.measures = TRUE, standardized = TRUE,
        rsquare = TRUE)

# only for the indirect effect BAS-XY-SAA

med <- 'a*c'
myParams <- c("a", "c")
myCoefs <- coef(fit.model.temp.rum)[myParams]
myACM <- vcov(fit.model.temp.rum)[myParams, myParams]  # This one I did not particularly understand
monteCarloMed(med, myCoefs, ACM = myACM, rep = 2000)


This produced a narrower interval than my bias-corrected CI with 500 replications.

Thanks.

Terrence Jorgensen

Jun 4, 2020, 6:40:48 PM
to lavaan
the principal reason to use bootstrapped CIs for indirect effects is to allow the CIs to be asymmetric

Likewise, Monte Carlo CIs make no assumption about the form of the distribution of the product (or of any function of model parameters), so they can be asymmetric. See the references on the ?monteCarloMed help page.
 
med <- 'a*c'
myParams <- c("a", "c")
myCoefs <- coef(fit.model.temp.rum)[myParams]
myACM <- vcov(fit.model.temp.rum)[myParams, myParams]  # This one I did not particularly understand
monteCarloMed(med, myCoefs, ACM = myACM, rep = 2000)

You are combining two different examples from the help page. Notice the comments say the second example is unnecessary. Mimic the first lavaan example:

monteCarloMed('a*c', object = fit.model.temp.rum)

Nikola Ćirović

Jun 6, 2020, 6:31:11 PM
to lavaan
Dear Mr. Jorgensen, 

Thank you! This works well. So the single line of code that you wrote in your last post is enough? (I was confused by the command that specifies the "asymptotic covariance matrix".)


I just have one more question (again, maybe I'm missing something obvious): can I use this command for total effects too? (The help page mentions only indirect effects, i.e. products, so it left me confused about the total effects.)

Like this: 

monteCarloMed('d+(a*c)', object = fit.model.temp.rum.SA.basicBEZantiZAEFEbezBSre)

Terrence Jorgensen

Jun 12, 2020, 4:55:03 AM
to lavaan
So the single line of code that you wrote in your last post is enough? (I was confused by the command that specifies the "asymptotic covariance matrix".)
 
Yes.  When you pass a lavaan model, it will detect that and automatically extract the coefficients and their ACOV from the lavaan object.
 
can I use this command for total effects too? 

Like this: 

monteCarloMed('d+(a*c)', object = fit.model.temp.rum.SA.basicBEZantiZAEFEbezBSre)

Yes! Honestly, it can be used to test any null hypothesis about a transformation of parameters. The Monte Carlo method can be used as a semi-parametric alternative to the generic, normal-theory-based "delta method" (e.g., ?deltaMethod in the car package). Normal theory is assumed for the individual estimated parameters, but (unlike the delta method) no parametric assumptions are made about functions of those parameters.
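To illustrate what monteCarloMed() does under the hood (a sketch only; the function handles all of this when you pass it the lavaan object), using the labeled parameters a and c from your model:

library(MASS)  # for mvrnorm()
est <- coef(fit.model.temp.rum)[c("a", "c")]                # point estimates
acov <- vcov(fit.model.temp.rum)[c("a", "c"), c("a", "c")]  # their asymptotic covariance matrix
set.seed(123)
draws <- mvrnorm(n = 20000, mu = est, Sigma = acov)  # normal theory for a and c ...
indirect <- draws[, "a"] * draws[, "c"]              # ... but no normality assumed for the product
quantile(indirect, c(.025, .975))                    # 95% Monte Carlo CI, can be asymmetric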

Nikola Ćirović

Jun 14, 2020, 7:31:09 PM
to lavaan
Dear Prof. Jorgensen, thank you for your kind and helpful comments (as always!).