why is there no estimate="bootstrap"?


Moritz Lürig

Jul 31, 2023, 6:38:59 AM
to lavaan
I have a small dataset (160 observations) and a relatively complex path model with 8 individual models. Given the small size of the dataset, and because I noticed some variation in the results when I had to remove some data, I figured that I should bootstrap my analysis.

I successfully used the se = "bootstrap" and test = "bootstrap" options. Still, I was wondering why I can't bootstrap the estimates this way? Do I even want this, or are bootstrapped p-values enough to account for data sparsity?

 

Christian Arnold

Jul 31, 2023, 6:56:46 AM
to lavaan
Hi Moritz,

What do you mean by "I was wondering why I can't bootstrap the estimates this way"? If you are referring to the confidence intervals, you have to request them:

library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

fit <- cfa(HS.model, data = HolzingerSwineford1939)
parameterEstimates(fit)

fit.boot <- cfa(HS.model, data = HolzingerSwineford1939, se = "bootstrap", bootstrap = 100, verbose = TRUE)
parameterEstimates(fit.boot, boot.ci.type = "perc")

Moritz Lürig

Jul 31, 2023, 8:00:55 AM
to lavaan
Hi Christian,

Thanks. I am using sem, and then I pull out all statistics of interest using summary. What I meant was that I do get the bootstrapped SEs and p-values in this summary, but the estimates remain the same, with or without bootstrapping.

I tried parameterEstimates on my standard model, which gives me the same as summary, but when I use parameterEstimates on the bootstrapped model, I get the following error:

Error in ci[free.idx, ] <- t(qq[c(3, 4), ]) :
  number of items to replace is not a multiple of replacement length

Christian Arnold

Jul 31, 2023, 8:28:37 AM
to lavaan
Hi Moritz,

The bootstrap provides the parameter estimates for each draw. From these per-draw estimates, the CIs are formed and the SEs are calculated. The reported estimate is still that of the original model. If you want to look at the estimates from each bootstrap draw, you can use bootstrapLavaan. This information is also stored somewhere in the lavaan object. I don't know off the top of my head where to find it, but it is probably fit@boot$coef.
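
A minimal sketch of this point, assuming the Holzinger-Swineford example from earlier in the thread and assuming the draws really are stored in the boot slot as guessed above. It only illustrates that the point estimates do not change and that the draws form a matrix with one row per draw; 100 draws are used just to keep it fast:

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

fit <- cfa(HS.model, data = HolzingerSwineford1939)

set.seed(1234)
fit.boot <- cfa(HS.model, data = HolzingerSwineford1939,
                se = "bootstrap", bootstrap = 100)

# The point estimates come from the original fit in both cases
same.est <- isTRUE(all.equal(as.numeric(coef(fit)),
                             as.numeric(coef(fit.boot))))
print(same.est)

# One row per bootstrap draw, one column per free parameter
draws <- fit.boot@boot$coef
print(dim(draws))

# Spread of the draws per parameter; the bootstrap SEs reported by
# lavaan are derived from this spread
draw.sd <- apply(draws, 2, sd, na.rm = TRUE)
print(round(draw.sd, 3))
```

The values in draw.sd should be close to the se column of parameterEstimates(fit.boot) for the free parameters, though not necessarily identical to the last digit.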

If you need detailed information about individual bootstrap draws, maybe x.boot will help you (see forum). Unfortunately I can't say anything about the error message without code and without a reproducible example.

Best

Christian

Moritz Lürig

Jul 31, 2023, 10:50:17 AM
to lavaan
Hi Christian, 

thanks for your help - I do have a few follow-up questions:

1) OK, so the coefficients (same as estimates?) are in fit@boot$coef, but I don't understand why I get 25 coefficients for a model with 10 parameters?

2) The fact that the bootstrapped estimates are hidden this way brings me to the second part of my initial question: does it even make sense to present them in the path model instead of the original model estimates?

3) What is the difference between using se/test = "bootstrap" and  bootstrapLavaan?

Cheers
Moritz



Christian Arnold

Jul 31, 2023, 11:59:51 AM
to lavaan
Hi Moritz,

1) I don't think so. What do you mean by parameters? By default, lavaan bootstraps all free parameters. The Holzinger-Swineford example has 21 free parameters (this count is sometimes labeled npar). You can inspect the details:

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

fit <- cfa(HS.model, data = HolzingerSwineford1939)
lavInspect(fit, "free")
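
As a small sketch of how the count can be checked for this example: the free-parameter matrices number each free parameter and use 0 for fixed ones, so (assuming no equality constraints) the largest index should match the number of coefficients and the npar fit measure:

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

fit <- cfa(HS.model, data = HolzingerSwineford1939)

# List of matrices (lambda, theta, psi); entries index the free parameters
free <- lavInspect(fit, "free")

# Three ways to count the free parameters of this model
n.from.free <- max(unlist(free))
n.from.coef <- length(coef(fit))
n.from.npar <- as.numeric(fitMeasures(fit, "npar"))

print(c(free = n.from.free, coef = n.from.coef, npar = n.from.npar))
```

The number of columns in the matrix of bootstrap draws should correspond to this count, not to the number of paths drawn in the diagram.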

2) This makes no sense and is not the purpose of bootstrapping. You repeat the bootstrap a thousand times or more, so you have a thousand or more estimates per parameter. How would you reasonably include that in a path diagram? The main purpose of the bootstrap is to build confidence intervals, and you can certainly incorporate those into a path diagram.

3) It is, I think, important to understand that what is "colloquially" called bootstrapping consists of 2 operations: (1) Refit the model many times, each time on a sample drawn with replacement from the data. (2) Create confidence intervals from the resulting estimates.

Operation 1 is identical for se = "bootstrap" and bootstrapLavaan:

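# Bootstrap inside the model call
set.seed(1234)

fit.boot <- cfa(HS.model, data = HolzingerSwineford1939, se = "bootstrap", bootstrap = 100, verbose = TRUE)

# Bootstrap an already fitted model, using the same seed
fit <- cfa(HS.model, data = HolzingerSwineford1939)
set.seed(1234)
boot <- bootstrapLavaan(fit, R = 100, verbose = TRUE)

# Compare the draws of both approaches (all zeros if they are identical)
fit.boot@boot$coef - boot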


Operation 2 is not implemented in bootstrapLavaan. You have to calculate the CIs yourself, which is usually not too complicated:

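# Percentile CI reported by lavaan for the first free parameter (row 2)
parameterEstimates(fit.boot, boot.ci.type = "perc")[2, c("ci.lower", "ci.upper")]
# The same interval computed by hand from the first column of the draws
round(quantile(boot[,1], c(0.025, 0.975), type = 6), 3)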
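
To do the same for all free parameters at once, a sketch along these lines may help. It assumes the default 95% level and reuses the type = 6 quantile definition from the call above; the object name ci is only illustrative:

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

fit <- cfa(HS.model, data = HolzingerSwineford1939)

set.seed(1234)
boot <- bootstrapLavaan(fit, R = 100)

# Percentile interval for every column (one column per free parameter)
ci <- t(apply(boot, 2, quantile, probs = c(0.025, 0.975),
              type = 6, na.rm = TRUE))
colnames(ci) <- c("ci.lower", "ci.upper")

print(round(ci, 3))
```

With only 100 draws the interval limits are rough; for reporting, a much larger R is advisable.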

HTH

Christian

Moritz Lürig

Aug 1, 2023, 5:42:33 AM
to lavaan
Hi Christian,

thanks so much for taking the time to explain - this clarifies matters quite a bit!

Best 
Moritz