Blavaan incredible slow...speed-up solutions?

Gabe Avakian Orona

Feb 23, 2022, 7:10:19 PM

to blavaan

Hi All,

I'm working with a dataset of 260 rows (no missing data). I'm trying to run a cross-lagged model with several predictors and informative priors on some of the parameters. The estimation is incredibly slow: ~7 hours to complete. Any help in speeding things up? I paste some of my code below in case trouble-shooting can help.

Thank you very much!

bl1<-'
# Latent Variables
#_ _ _ _ _ _ _ _ _
CCR1 =~
z_ana1 + prior("normal(.5,.1)")*z_ana1 +
z_syn1 + prior("normal(.5,.1)")*z_syn1 +
z_win1 + prior("normal(.5,.1)")*z_win1 +
z_ptk1 + prior("normal(.5,.1)")*z_ptk1
CCR2 =~
z_ana2 + prior("normal(.5,.1)")*z_ana2 +
z_syn2 + prior("normal(.5,.1)")*z_syn2 +
z_win2 + prior("normal(.5,.1)")*z_win2 +
z_ptk2 + prior("normal(.5,.1)")*z_ptk2
CUR =~ NFC + EC + openess
EFF =~ Optim + ASE + consci
#_ _ _ _ _ _ _ _ _
#Regressions
CCR2 ~ zOuterBreadth + prior("normal(.13,.5)")*zOuterBreadth +
CCR1 + prior("normal(.7,.05)")*CCR1 +
CUR + EFF+
msf19jtsk2+ msf19nfge1+ polisoc+
male+ Urm + sesIndex+ junior +age+full_time +
d_pubhealth + d_bio + d_bus + d_soe + d_engi+ d_hum+ d_info+
d_nurs +d_phys +d_seco+ d_ss+ d_art
#_ _ _ _ _ _ _ _ _
# Variance/CUs
CCR2 ~~ CCR2
z_syn1 ~~ z_syn2
z_win1 ~~ z_win2
z_ptk1 ~~ z_ptk2
z_ana1 ~~ z_ana2
NFC ~~ openess
EC ~~ openess
EC ~~ ASE
'

#Fit
bl1 <- bsem(bl1, data=dat, std.lv=T,
n.chains = 3, burnin=5000,
sample=1000, target = "stan")

Best,

Gabe

Ed Merkle

Feb 23, 2022, 7:50:27 PM

to Gabe Avakian Orona, blavaan

That is incredibly slow! Some misc thoughts:

- bsem() automatically adds some covariance parameters between latent variables here, and it might not be what you want (not sure). You might try specifying all the desired parameters (maybe you already have), then use blavaan() instead of bsem().

- I wonder whether you have tried sem() from lavaan for this model. That can provide a hint about whether it is a problem with the Bayesian estimation, or a more general problem with the model.

- Try a simpler model without the regressions, to see whether the problem remains.

- In my experience, if you need more than 500 burnin iterations, these models will never converge.

- For lines like

z_ana1 + prior("normal(.5,.1)")*z_ana1

you only need

prior("normal(.5,.1)")*z_ana1

I don't think that will change anything, but it will make the code easier to read.
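The sem() check and the shorter prior() syntax can be sketched together. This is only an illustration: it uses lavaan's built-in HolzingerSwineford1939 data and made-up priors rather than the model in this thread, and stripping the prior() modifiers with a regular expression is just a convenience assumed here so that the frequentist run sees plain syntax.

```r
# Illustrative model (not the model from this thread); priors are made up.
bayes_model <- '
visual  =~ prior("normal(.5,.1)")*x1 + prior("normal(.5,.1)")*x2 + prior("normal(.5,.1)")*x3
textual =~ x4 + x5 + x6
textual ~ visual
'

# Drop every prior("...")* modifier to get plain lavaan syntax.
ml_model <- gsub('prior\\("[^"]*"\\)\\*', "", bayes_model)
cat(ml_model)

# Time the frequentist fit first; if this is already slow or fails to
# converge, the problem is probably the model rather than the MCMC.
if (requireNamespace("lavaan", quietly = TRUE)) {
  dat <- lavaan::HolzingerSwineford1939
  print(system.time(
    fit_ml <- lavaan::sem(ml_model, data = dat, std.lv = TRUE)
  ))
}
```

If the frequentist fit is quick and well behaved, the same bayes_model string can then be passed to bsem() or blavaan().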

Ed

--
You received this message because you are subscribed to the Google Groups "blavaan" group.
To unsubscribe from this group and stop receiving emails from it, send an email to blavaan+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/blavaan/b59b3983-854c-4351-ac10-dd0882cab7a5n%40googlegroups.com.

Mauricio Garnier-Villarreal

Feb 24, 2022, 8:39:58 AM

to blavaan

Gabe

Can you show us the output of sessionInfo()? That would let us see whether anything there needs attention.

Mauricio Garnier-Villarreal

Feb 25, 2022, 11:16:40 AM

to blavaan

Gabe

Based on the sessionInfo(), I would also recommend updating your packages, since lavaan, blavaan, and rstan all have newer versions, and the latest version of blavaan has some speed improvements.

update.packages(ask = FALSE, checkBuilt = TRUE)

> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] blavaan_0.3-15 RcppParallel_5.1.4 Rcpp_1.0.6 lavaan_0.6-8

loaded via a namespace (and not attached):
[1] utf8_1.2.1 questionr_0.7.4 tidyselect_1.1.1 lme4_1.1-27 htmlwidgets_1.5.3 grid_4.1.0 munsell_0.5.0 codetools_0.2-18
[9] effectsize_0.4.5 DT_0.18 future_1.21.0 miniUI_0.1.1.1 withr_2.4.2 colorspace_2.0-1 OpenMx_2.19.5 highr_0.9
[17] knitr_1.33 rstudioapi_0.13 stats4_4.1.0 listenv_0.8.0 bayesplot_1.8.1 mi_1.0 emmeans_1.7.0 rstan_2.21.2
[25] mnormt_2.0.2 MCMCpack_1.5-0 parallelly_1.26.0 coda_0.19-4 vctrs_0.3.8 generics_0.1.0 TH.data_1.1-0 xfun_0.24
[33] R6_2.5.0 markdown_1.1 arm_1.11-2 rstanarm_2.21.1 assertthat_0.2.1 promises_1.2.0.1 scales_1.1.1 multcomp_1.4-17
[41] nnet_7.3-16 gtable_0.3.0 globals_0.14.0 rethinking_2.13 conquer_1.0.2 processx_3.5.2 mcmc_0.9-7 sandwich_3.0-1
[49] MatrixModels_0.5-0 timeDate_3043.102 rlang_0.4.11 splines_4.1.0 checkmate_2.0.0 inline_0.3.19 yaml_2.2.1 reshape2_1.4.4
[57] abind_1.4-5 threejs_0.3.3 crosstalk_1.1.1 backports_1.2.1 httpuv_1.6.1 rsconnect_0.8.18 Hmisc_4.5-0 tools_4.1.0
[65] psych_2.1.3 ggplot2_3.3.4 ellipsis_0.3.2 RColorBrewer_1.1-2 Rsolnp_1.16 stargazer_5.2.2 ggridges_0.5.3 plyr_1.8.6
[73] base64enc_0.1-3 purrr_0.3.4 rockchalk_1.8.144 ps_1.6.0 prettyunits_1.1.1 rpart_4.1-15 pbapply_1.4-3 zoo_1.8-9
[81] qgraph_1.6.9 haven_2.4.1 cluster_2.1.2 magrittr_2.0.1 data.table_1.14.0 nonnest2_0.5-5 openxlsx_4.2.4 SparseM_1.81
[89] colourpicker_1.1.0 truncnorm_1.0-8 tmvnsim_1.0-2 mvtnorm_1.1-2 matrixcalc_1.0-4 matrixStats_0.59.0 hms_1.1.0 shinyjs_2.0.0
[97] mime_0.10 evaluate_0.14 xtable_1.8-4 shinystan_2.5.0 XML_3.99-0.6 jpeg_0.1-8.1 gridExtra_2.3 shape_1.4.6
[105] rstantools_2.1.1 compiler_4.1.0 tibble_3.1.2 V8_3.4.2 crayon_1.4.1 minqa_1.2.4 StanHeaders_2.21.0-7 htmltools_0.5.1.1
[113] corpcor_1.6.9 later_1.2.0 semTools_0.5-4 Formula_1.2-4 tidyr_1.1.3 DBI_1.1.1 kutils_1.70 MASS_7.3-54
[121] see_0.6.4 boot_1.3-28 Matrix_1.3-3 cli_3.0.0 parallel_4.1.0 insight_0.14.5 igraph_1.2.6 forcats_0.5.1
[129] pkgconfig_2.0.3 sem_3.1-11 foreign_0.8-81 dygraphs_1.1.1.6 pbivnorm_0.6.0 CompQuadForm_1.4.3 estimability_1.3 stringr_1.4.0
[137] callr_3.7.0 digest_0.6.27 parameters_0.14.0 semPlot_1.1.2 rmarkdown_2.9 htmlTable_2.2.1 lisrelToR_0.1.4 curl_4.3.1
[145] quantreg_5.86 shiny_1.6.0 gtools_3.9.2 nloptr_1.2.2.2 lifecycle_1.0.0 nlme_3.1-152 glasso_1.11 jsonlite_1.7.2
[153] carData_3.0-4 fansi_0.5.0 labelled_2.8.0 pillar_1.6.1 lattice_0.20-44 loo_2.4.1 fastmap_1.1.0 pkgbuild_1.2.0
[161] survival_3.2-11 glue_1.4.2 xts_0.12.1 bayestestR_0.10.0 zip_2.2.0 fdrtool_1.2.16 png_0.1-7 shinythemes_1.2.0
[169] stringi_1.6.1 regsem_1.8.0 latticeExtra_0.6-29 dplyr_1.0.6 future.apply_1.7.0

Ed Merkle

Feb 25, 2022, 2:29:58 PM

to Mauricio Garnier-Villarreal, blavaan

I agree with Mauricio that a package update might lead to some small improvements. I have also corresponded with Gabe off-list, and we found that some of the default priors on SD parameters were problematic. For example, some of the observed variables ranged from 0-100, so that a gamma(1,.5) on the corresponding residual SD is problematic. Additionally, some priors on regression weights were highly informative, and potentially conflicted with the data.

Maybe it is possible to have blavaan scale all observed variables to have an SD of 1 here, so that the default priors work better. But I worry that it would be problematic for models with certain equality constraints. For example, if you constrain a residual variance to be equal across groups, but then you scale the observed variables, perhaps the constraint leads to a different model than you started with.
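Until something like that exists, the rescaling can be done by hand before fitting. A minimal sketch on simulated data, where the column names (score_0_100, z_item, male) are hypothetical stand-ins for a 0-100 variable, an already standardized variable, and a dummy that is left alone. For a single-group model without equality constraints this should only change the scale on which the default priors operate, but that is an assumption to check for your own model.

```r
set.seed(123)

# Simulated stand-in for the real data (hypothetical column names).
dat_raw <- data.frame(
  score_0_100 = runif(260, min = 0, max = 100),  # variable on a 0-100 scale
  z_item      = rnorm(260),                      # already roughly standardized
  male        = rbinom(260, size = 1, prob = 0.5)
)

# Standardize only the continuous variables; leave dummies untouched.
cont_vars <- c("score_0_100", "z_item")
dat_std <- dat_raw
dat_std[cont_vars] <- lapply(dat_raw[cont_vars],
                             function(x) as.numeric(scale(x)))

# SDs before and after rescaling.
print(round(sapply(dat_raw[cont_vars], sd), 3))
print(round(sapply(dat_std[cont_vars], sd), 3))
```

The rescaled data frame (dat_std here) is then what gets passed as the data argument.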

Ed

Mauricio Garnier-Villarreal

Feb 25, 2022, 2:36:33 PM

to blavaan

Gabe

Please keep the answers in the Google group instead of individual emails; otherwise it is harder for us to keep track of the conversation.

Based on the last version conflict that you sent me:

> library(blavaan)
Error: package or namespace load failed for ‘blavaan’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
namespace ‘rlang’ 0.4.11 is already loaded, but >= 1.0.1 is required

You need to install the latest version of rlang

install.packages("rlang", dependencies = TRUE)

Ed Merkle

Feb 25, 2022, 8:27:18 PM

to Mauricio Garnier-Villarreal, blavaan

Beyond Mauricio's recommendation, sometimes packages can get installed in different places when you use RStudio, as opposed to just R. Then R might look in the wrong place for a package and find an old version, even though you have the new version installed somewhere else. The link below describes a bit more about this issue. You might also update R and RStudio if you are not at the newest versions, and then fiddle with updating packages once you are on the newest versions.

https://www.accelebrate.com/library/how-to-articles/r-rstudio-library
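A quick way to see whether this is happening, using only base R: list the library paths that the current session searches, and every installed copy of the relevant packages in each of them. The package list is simply the set of packages mentioned in this thread.

```r
# Library paths searched by this R session, in order.
print(.libPaths())

# Packages mentioned in this thread.
pkgs <- c("rlang", "lavaan", "blavaan", "rstan")

# Collect every installed copy of those packages, one library at a time.
per_lib <- lapply(.libPaths(), function(lib) {
  ip <- installed.packages(lib.loc = lib)
  ip[rownames(ip) %in% pkgs, c("Package", "LibPath", "Version"), drop = FALSE]
})
copies <- do.call(rbind, per_lib)

# More than one row for the same package means there are duplicate installs;
# the copy in the earliest library path is the one that gets loaded.
print(copies)
```

Running this once in plain R and once in RStudio shows whether the two are looking in different places.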

Ed


Gabe Avakian Orona

Feb 26, 2022, 1:24:58 PM

to Mauricio Garnier-Villarreal, blavaan

Sure thing. Thanks Mauricio.

I'm still getting this issue:

This is blavaan 0.4-1
On multicore systems, we suggest use of future::plan("multicore") or
future::plan("multisession") for faster post-MCMC computations.

Even after updating blavaan.

Gabe


--

Gabe Avakian Orona

PhD Student
School of Education

University of California, Irvine

3200 Education Irvine, CA 92697

gor...@uci.edu

Ed Merkle

Feb 26, 2022, 3:22:03 PM

to Gabe Avakian Orona, Mauricio Garnier-Villarreal, blavaan

That is just a startup suggestion to help speed things up and appears for everyone.

Ed


Gabe Avakian Orona

Feb 27, 2022, 12:53:37 PM

to Ed Merkle, Mauricio Garnier-Villarreal, blavaan

Hi Ed and Mauricio,

There appears to be a unique problem occurring after implementing all the suggestions: the summary function (and other functions) don't seem to recognize my blavaan fit object as a blavaan object. For instance, when I execute summary() on my fit object, it shows "estimate" instead of posterior mean--and no information on SD, Rhat, etc. Here is an example of what I mean:

#1. Model Specification

m0<-'

# Latent Variables
#_ _ _ _ _ _ _ _ _
CCR1 =~
z_ana1 +

z_syn1 +
z_win1 +
z_ptk1
CCR2 =~
z_ana2 +
z_syn2 +
z_win2 +
z_ptk2

#_ _ _ _ _ _ _ _ _
#Regressions

CCR2 ~ CCR1

'
#_________________________________________

#2. Model Fit

fit0<-bcfa(m0,data = dat, std.lv = TRUE, save.lvs = T)

# 3. Summary

> summary(fit0)
lavaan 0.6-10 ended normally after 1000 iterations

Estimator BAYES
Optimization method NLMINB
Number of model parameters 25

Number of observations 260
Number of missing patterns 1

Model Test User Model:

Test statistic -2893.373
Degrees of freedom NA

Test statistic 0.003
Degrees of freedom NA

Parameter Estimates:

Latent Variables:
Estimate
CCR1 =~
z_ana1 0.691
z_syn1 0.807
z_win1 0.436
z_ptk1 0.281
CCR2 =~
z_ana2 0.491
z_syn2 0.509
z_win2 0.274
z_ptk2 0.242

Regressions:
Estimate
CCR2 ~
CCR1 1.009

Intercepts:

Estimate
.z_ana1 0.003
.z_syn1 0.002
.z_win1 -0.000
.z_ptk1 0.001
.z_ana2 0.002
.z_syn2 0.003
.z_win2 0.000
.z_ptk2 0.002
CCR1 0.000
.CCR2 0.000

Variances:
Estimate
.z_ana1 0.542
.z_syn1 0.369
.z_win1 0.826
.z_ptk1 0.934
.z_ana2 0.535
.z_syn2 0.499
.z_win2 0.863
.z_ptk2 0.896
CCR1 1.000
.CCR2 1.000

Mauricio Garnier-Villarreal

Feb 28, 2022, 5:57:02 AM

to blavaan

Gabe

This is a semi-recurrent problem. It happens when your R session gets confused and reads the blavaan object as a lavaan object instead, so it presents the lavaan summary.

I still haven't found a "good" solution for when this happens. I just close RStudio and open it again. When you set up your code, make sure to run only library(blavaan), and do NOT call library(lavaan). blavaan automatically loads lavaan, and sometimes loading lavaan separately adds to the confusion of the R session.

Ed Merkle

Feb 28, 2022, 9:28:36 AM

to blavaan

I keep adding fixes that I think will solve this problem, but apparently it still exists. I agree that library(lavaan) after library(blavaan) might cause problems. One other possible solution, without exiting Rstudio, is:

blsumm <- getMethod(summary, "blavaan")

and then you use blsumm() as if it were summary().
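To illustrate what getMethod() is doing, here is a toy sketch with made-up S4 classes (toy_lavaan and toy_blavaan are hypothetical and only mimic the inheritance between lavaan and blavaan objects). It pulls out the method registered for the subclass and calls it directly, so the result does not depend on what summary() dispatches to in the current session.

```r
library(methods)

# Hypothetical classes that mimic the lavaan -> blavaan inheritance.
setClass("toy_lavaan", representation(est = "numeric"))
setClass("toy_blavaan", representation(psd = "numeric"),
         contains = "toy_lavaan")

# One summary method per class.
setMethod("summary", "toy_lavaan",
          function(object, ...) "frequentist-style summary")
setMethod("summary", "toy_blavaan",
          function(object, ...) "Bayesian-style summary")

fit <- new("toy_blavaan", est = 0.5, psd = 0.1)

# First check that the object really has the class you expect.
print(is(fit, "toy_blavaan"))

# Fetch the method for that class explicitly and call it like a function.
toy_summ <- getMethod("summary", "toy_blavaan")
print(toy_summ(fit))
```

With a real fit, the same is() check on the fitted object with "blavaan" as the class name tells you whether the object itself or the method dispatch is the problem.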

Ed

Gabe Avakian Orona

Feb 28, 2022, 12:11:32 PM

to blavaan

Hi Mauricio,

I implemented your suggested solutions. When I went back in to run and load only blavaan, I get this error:

Error in lav_syntax_get_modifier(rhs[[3]][[2L]]) :
lavaan ERROR: evaluating modifier failed: *()*

Any chance you have come across this issue before?


Mauricio Garnier-Villarreal

Mar 1, 2022, 4:56:34 AM

to blavaan

Gabe

I haven't seen this error before. Can you share the full syntax that gives you this?

Ed Merkle

Mar 1, 2022, 10:23:33 AM

to Gabe Avakian Orona, blavaan

I think that lavaan is providing a hint about a problem with your model specification. It is saying that your model syntax is bad, like maybe you have an extra asterisk in it (maybe around a prior() statement).
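One pattern worth checking for (a guess, not a confirmed diagnosis) is a term line that does not end in + even though the formula continues on the next line, for example between two prior() terms. The helper below is a hypothetical base-R sketch, not part of lavaan or blavaan, and the model snippet is made up. It flags non-comment lines that do not end in + or ~ when the following line contains no operator.

```r
# Hypothetical helper (not part of lavaan or blavaan): flag lines where a
# formula seems to continue on the next line without a trailing "+".
find_missing_plus <- function(model) {
  lines <- trimws(strsplit(model, "\n", fixed = TRUE)[[1]])
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]
  n <- length(lines)
  if (n < 2) return(character(0))
  has_op    <- grepl("~", lines, fixed = TRUE)  # "=~", "~" and "~~" all contain "~"
  ends_open <- grepl("(\\+|~)$", lines)         # line ends in "+" or in an operator
  # Suspicious: line i is "closed", yet line i + 1 does not start a new formula.
  lines[which(!ends_open[-n] & !has_op[-1])]
}

# Made-up snippet with a "+" missing after the CUR term.
snippet <- '
CCR2 ~
prior("normal(.7,.5)")*CCR1 +
prior("normal(.1,.8)")*CUR
prior("normal(.1,.8)")*EFF +
polisoc
CCR2 ~~ CCR2
'

suspects <- find_missing_plus(snippet)
print(suspects)
```

This is only a rough text check, so a flagged line is a place to look rather than proof of an error.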

Ed

Gabe Avakian Orona

Mar 1, 2022, 10:34:44 AM

to Mauricio Garnier-Villarreal, blavaan

Hi Mauricio,

Sure, here it is:

library(blavaan)

bl2<-'

# Latent Variables
#_ _ _ _ _ _ _ _ _
CCR1 =~

prior("normal(.5,.4)")*z_ana1 +
prior("normal(.5,.4)")*z_syn1 +
prior("normal(.5,.4)")*z_win1 +
prior("normal(.5,.4)")*z_ptk1
CCR2 =~
prior("normal(.5,.4)")*z_ana2 +
prior("normal(.5,.4)")*z_syn2 +
prior("normal(.5,.4)")*z_win2 +
prior("normal(.5,.4)")*z_ptk2

CUR =~ NFC + EC + openess
EFF =~ Optim + ASE + consci
#_ _ _ _ _ _ _ _ _
#Regressions
CCR2 ~

prior("normal(.13,.8)")*zOuterBreadth +
prior("normal(.7,.5)")*CCR1 +
prior("normal(.2,.5)")*Hs_GPA +
prior("normal(.2,.5)")*sat+
prior("normal(.1,.8)")*CUR
prior("normal(.1,.8)")*EFF+
polisoc+ msf19nfge1 +

male+ Urm + sesIndex+ junior +age+full_time +
d_pubhealth + d_bio + d_bus + d_soe + d_engi+ d_hum+ d_info+
d_nurs +d_phys +d_seco+ d_ss+ d_art
#_ _ _ _ _ _ _ _ _
# Variance/CUs
CCR2 ~~ CCR2
z_syn1 ~~ z_syn2
z_win1 ~~ z_win2
z_ptk1 ~~ z_ptk2
z_ana1 ~~ z_ana2
NFC ~~ openess
EC ~~ openess
EC ~~ ASE
'

#Fit & Summary of Freq Model
bl2 <- bsem(bl2, data=dat, std.lv=T, save.lvs = T,
n.chains = 3, burnin=500,

sample=1000, target = "stan")

Error in lav_syntax_get_modifier(rhs[[3]][[2L]]) :
lavaan ERROR: evaluating modifier failed: *()*


Gabe Avakian Orona

Mar 1, 2022, 10:36:26 AM

to Ed Merkle, blavaan

Hi Ed,

The thing is this model was working prior to rebooting R and Rstudio and updating the packages. I have since tried deleting and re-installing both lavaan and blavaan. Neither of those options seemed to work.

Mauricio Garnier-Villarreal

Mar 1, 2022, 10:48:24 AM

to blavaan

Gabe

I had a problem with the updates (a different error) that got fixed by using the development version of lavaan.

You can try installing it with this:

library(devtools)

install_github("yrosseel/lavaan")

Gabe Avakian Orona

Mar 1, 2022, 11:03:03 AM

to blavaan

Oddly enough, when I just run lavaan models (frequentist models), there are no issues. I can get summaries, etc. The issue is mostly with blavaan. Do you think this would help blavaan, too?

Thank you for your help.

-Gabe


Mauricio Garnier-Villarreal

Mar 2, 2022, 3:55:46 AM

to blavaan

Yes, updating lavaan can help blavaan, because blavaan relies on a lot of lavaan's structure.

christop...@gmail.com

Apr 7, 2022, 12:10:02 PM

to blavaan

For what it is worth: I too find blavaan extremely slow. A simple model that takes a few minutes in Mplus (with estimator = bayes) takes many hours in blavaan (I gave up once again and forced R to quit).

Sorry for being negative. I've moved back to Mplus, even though I have great respect for free and open-source software. Please note that I use large samples from 1000 to 30,000.

christop...@gmail.com

Apr 7, 2022, 12:14:40 PM

to blavaan

PS. No problem with lavaan. Fast and good.

Ed Merkle

Apr 7, 2022, 12:40:14 PM

to christop...@gmail.com, blavaan

Some models are indeed slow. I'd be interested to see the code for your slow model, in case it involves something I haven't seen before.

Ed

Gabe Avakian Orona

Apr 7, 2022, 12:59:20 PM

to blavaan

@ christop...@gmail.com Blavaan is still incredibly slow for me after implementing the changes suggested. Anything you learn to speed it up I will be happy to hear. No worries about being negative; honesty is best in my view.

Thank you,

Gabe


Mauricio Garnier-Villarreal

Apr 8, 2022, 7:44:01 AM

to blavaan

Can you run this on your setup and report the system.time()?

I think the problem is not blavaan; rather, something in the Stan installation is making the Stan models run slowly.

library(rstan)

scode <- "
parameters {
  real y[2];
}
model {
  y[1] ~ normal(0, 1);
  y[2] ~ double_exponential(0, 2);
}
"

system.time(
  fit2 <- stan(model_code = scode, iter = 10000, verbose = FALSE)
)

Ed Merkle

Apr 8, 2022, 9:32:03 AM

to Mauricio Garnier-Villarreal, blavaan

Mauricio and all,

While the Stan install could definitely be a problem, Christopher's comment about large samples makes me think that blavaan is probably to blame. That reminded me of a sufficient statistic trick that I hadn't gotten around to implementing, so I tried it out yesterday and it really seems to help certain models when there is no missing data. These additions are on github, if you want to try it yourself.

Beyond that, common causes of slowness are prior-data conflict (when priors are informative and conflict with data) and sampling for tens of thousands of iterations (fewer iterations are needed in Stan as compared to Gibbs samplers).

Ed

Mauricio Garnier-Villarreal

Apr 12, 2022, 9:13:30 AM

to blavaan

Ed

Is this new trick on GitHub part of target="stan", or does something else need to be called?

thanks

Ed Merkle

Apr 12, 2022, 9:43:22 AM

to Mauricio Garnier-Villarreal, blavaan

It should work for target="stan", without requiring anything else from you. But right now, it only works for models with continuous variables, complete data, and fixed.x=FALSE. I think you would see the most speedup for large sample sizes (thousands or more).

Ed

Terrence Jorgensen

Apr 25, 2022, 9:44:57 AM

to blavaan

right now, it only works for models with continuous variables, complete data, and fixed.x=FALSE.

Would the same trick work if implemented per missing-data pattern? It is perhaps problematic for unplanned missingness, but with planned-missing designs it should be possible to compute the summary stats for the subset of variables relevant to each missing-data pattern. Vika and Ke-Hai capitalized on that for one of their proposed model-based transformation methods (extending Bollen-Stine for incomplete data).

Terrence D. Jorgensen

Assistant Professor, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam

http://www.uva.nl/profile/t.d.jorgensen

Ed Merkle

Apr 25, 2022, 1:46:06 PM

to Terrence Jorgensen, blavaan

Yes, it should work for missing data but I think will require a custom lpdf in Stan. The problem is that you get to a missing data pattern with 1 observation, and the sample covariance matrix for that pattern is not positive definite, and then Stan throws an error when you try to model it. I think you can get around it by avoiding the constant terms in the density function.

Ed


christop...@gmail.com

Jan 2, 2023, 1:54:53 AM

to blavaan

Sorry for bringing up the speed issue again.

I will teach Bayesian statistics and initially wanted to let participants in the workshop use R. With R, my preference would be blavaan. But speed seems to be an issue?

I have estimated a simple regression model (y ~ x1 + x2, N = approx 2000, item-level missing data less than 5%). Time needed for running this model with Bayesian estimations with diffuse priors is:

blavaan 0.4-3: more than 9 minutes

Stata 17: 10.6 seconds

Mplus 8.8: 1 second

That seems strange: 1 sec with Mplus but nearly 10 minutes with blavaan? Do these differences make any sense?

CODE:

fit <- bsem("stflife ~ agea + age2", data = data)

Only R and Stata are options for this particular workshop, and I would prefer using R.

(I just tried brms and couldn't even make brms run. But if the source of slow estimation with blavaan lies in Stan, I guess there's no point in trying brms.)

Thanks,

Christopher

Ed Merkle

Jan 2, 2023, 12:12:02 PM

to christop...@gmail.com, blavaan

Christopher,

Thanks for the report. As it currently stands, I think the speed is ok for continuous variables and somewhat slow for ordinal variables. Missingness should slow it down a bit, but 9min does seem long if you are using continuous variables. If you can send data (off list if needed), I could explore it more.

Thanks,

Ed

Gabe Avakian Orona

Jan 2, 2023, 12:21:22 PM

to Ed Merkle, christop...@gmail.com, blavaan

I'm glad this was brought up; I'm also still experiencing the slow run time, even with no missing data.

Gabe


--

Gabe Avakian Orona, Ph.D.

Postdoctoral Research Fellow
Hector Research Institute of Education Sciences and Psychology
University of Tübingen
gabrie...@uni-tuebingen.de

Ed Merkle

Jan 3, 2023, 12:21:51 PM

to Gabe Avakian Orona, christop...@gmail.com, blavaan

In the past, I have noticed speed problems when the user does not specify priors, and the blavaan default priors do not work well for the user's data. This might especially happen if your variables have lots of scores far from 0 (say, in the 100s). I am planning to change the defaults so they are more similar to rstanarm or brms, which scale with your data.

Ed

christop...@gmail.com

Jan 4, 2023, 6:01:03 AM

to blavaan

Thanks, Ed

I've sent data and R code in a separate email. Hopefully, it will be possible to replicate my result.

I want to start teaching with non-informative priors, so no priors were added.

One question, though. I would like to avoid more complex code in RStan, especially when giving an introduction to Bayes. But would using RStan directly be any faster? I understand that the Mplus team has developed some proprietary algorithms that make computations run very fast, also compared to Stata. So I think it's fairer to compare with Stata. Why is RStan/R this much slower than Stata? (I don't think the implementation of SEM or Bayes is particularly good in Stata, so I would prefer teaching with R and blavaan for this reason too, in addition to the fact that R makes science more accessible than proprietary software does.)

Best,

Chris

Ed Merkle

Jan 5, 2023, 2:18:33 PM

to christop...@gmail.com, blavaan

Thanks for this example. I can reproduce that it is slow (many minutes, when one would hope seconds).

Here is what I think is happening: I have spent time optimizing estimation for bigger models that require SEM software. This has neglected many tricks that can be done for regression (and related models) to speed up model estimation. I know that lavaan flags regression models and does an estimation specific to those models, and this example has made it clear that I need to do the same thing in blavaan.

Stan can be fast for your model. For example, the brms code below uses Stan and finishes in seconds (after model compilation).

library(brms)

mb <- brm(stflife ~ agea + age2, data = data)

With some additions, blavaan should be able to get close to that.

And if you have other slow models that do not qualify as regression, I'd be interested to see those.

Thanks,

Ed

Ed Merkle

Feb 6, 2023, 2:18:00 PM

to christop...@gmail.com, blavaan

Chris and others on this thread,

If you are able to install from github, please give the new version of blavaan a try and look at the speed improvements. It should especially be faster for large sample sizes of complete, continuous data.

Instructions to install are at the bottom of this page:

https://github.com/ecmerkle/blavaan

I am working on getting this on CRAN, but it is taking longer than I hoped.

Ed


christop...@gmail.com

Feb 7, 2023, 12:34:14 PM

to blavaan

> remotes::install_github("ecmerkle/blavaan", INSTALL_opts = "--no-multiarch")

---

... Dialogue box pops up: "Building R package from source required installation of additional building tools"

I answer yes

----

Downloading GitHub repo ecmerkle/blavaan@HEAD
Error: Failed to install 'blavaan' from GitHub:
Could not find tools necessary to compile a package
Call `pkgbuild::check_build_tools(debug = TRUE)` to diagnose the problem.

> pkgbuild::check_build_tools(debug = TRUE)

Trying to compile a simple C file
Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I/usr/local/include -fPIC -Wall -g -O2 -c foo.c -o foo.o
clang -mmacosx-version-min=10.13 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib -L/usr/local/lib -o foo.so foo.o -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
ld: framework not found CoreFoundation
: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [foo.so] Error 1
Error: Could not find tools necessary to compile a package
Call `pkgbuild::check_build_tools(debug = TRUE)` to diagnose the problem.

---

I'm in a loop, it seems? as https://ecmerkle.github.io/blavaan/ states:

Compilation is required; this may be a problem for users who currently rely on a binary version of blavaan from CRAN.

Ed Merkle

Feb 7, 2023, 12:51:26 PM

to christop...@gmail.com, blavaan

This generally means that your system requires extra tools outside of R to compile a Stan model. An advantage of installing from CRAN is that they handle this compilation step for you. So I think the question is whether you have the desire/bandwidth to install some extra stuff on your system:

If no, I hope to get the new version of blavaan on CRAN within a week.

If yes, then you would want to look at the RStan "getting started" materials, especially the part about configuring a C++ toolchain:

https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started

It can seem overwhelming if you've never done it before, but the process gets easier the more you do it.

Ed

christop...@gmail.com

Feb 7, 2023, 2:08:49 PM

to blavaan

Thanks, Ed. In that case, I'll leave it for now. I don't think that procedure for installing Stan works for me, see here:

https://discourse.mc-stan.org/t/unable-to-run-rstan-and-brms/30008/6

I own a copy of Mplus 8.8, but I generally like lavaan/blavaan (and R). I will stick to Mplus for a while, and hopefully return later. Sorry for not being able to test!

Best,

Chris

christop...@gmail.com

Feb 17, 2023, 7:24:04 AM

to blavaan

Ed, I've now been able to rerun the regression model mentioned above with blavaan (using the latest version).

These were the original estimates:

blavaan 0.4-3: more than 9 minutes

Stata 17: 10.6 seconds

Mplus 8.8: 1 second

New result:

blavaan 0.4-6: 1 minute

Ed Merkle

Feb 17, 2023, 3:27:47 PM

to christop...@gmail.com, blavaan

Thanks, it is better than before but could also use more improvement. Eventually, I will try to circle back and add some faster code that is dedicated to regression.

In the meantime, there is a chance that parallelization helps a bit more. In your bsem command, you could add

bcontrol = list(cores = 3)
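A small sketch for picking the number of cores, assuming 3 chains as in the earlier bsem() calls in this thread. The bsem() call itself is left as a comment because it needs your own model and data (model and dat are placeholders).

```r
# One core per chain is the most that helps during sampling.
n_chains <- 3

# Cores reported by the machine; detectCores() can return NA.
avail <- parallel::detectCores()
if (is.na(avail)) avail <- 1L

# Leave one core free for the rest of the system, but never go below 1.
n_cores <- max(1L, min(n_chains, avail - 1L))
bcontrol <- list(cores = n_cores)
print(bcontrol)

# Placeholder call, not run here:
# fit <- bsem(model, data = dat, n.chains = n_chains, bcontrol = bcontrol)
```

Asking for more cores than chains should not buy anything for the sampling step, which is why the value is capped at n_chains.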

Also, to parallelize post-estimation computations, you could try

library("future")

plan("multicore") ## mac or linux

plan("multisession") ## windows

Although I have heard that Rstudio might complain about the future package...

Ed


christop...@gmail.com

Feb 17, 2023, 6:11:52 PM

to blavaan

Well, Ed, I think a reduction by 90% is a huge improvement! Congratulations.

Of course, it's still much slower than Stata (which is much slower than Mplus, and my ultimate goal is to stop using Mplus).

But I was able to improve speed further: Going from estimations on an iMac Pro to estimations with an M1 processor on Macbook Air, I am now down to 29 seconds.

And then, using parallelization, I end up with 14 seconds on the M1 Macbook Air.

It's still much longer than on Stata with no parallelization, but I think your recent coding was a success. I'd be interested in knowing where the bottleneck is - blavaan or Stan?

Personally, I need SEM models much more than regression, also when using bayesian estimations. I have yet to test such models with blavaan after the update. I could report back once I know more.

Christopher

Ed Merkle

Feb 17, 2023, 6:56:03 PM

to christop...@gmail.com, blavaan

Thanks, the main bottleneck is figuring out how to code the Stan model for maximal speed/efficiency. In the traditional case, you cycle through each row of the data and evaluate the model likelihood for each row. But if you have complete data, you can use the sample covariance matrix and sample mean in place of each row of the data (at least, you can do it for SEM). If you have complete data and don't care about the means, you can use the sample covariance matrix and Wishart distribution. This gives better speed for large datasets because you no longer have to do computations for each row of the data.
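For complete multivariate normal data, the equivalence can be checked numerically. A minimal base-R sketch with simulated data and arbitrary values of mu and Sigma (none of this is blavaan's actual Stan code): the row-by-row log-likelihood equals an expression that only needs n, the sample mean, and the sample covariance matrix computed with divisor n.

```r
set.seed(1)
n <- 500
p <- 3

# Simulated complete data with correlated columns.
R <- matrix(c(1, .5, .3,
              .5, 1, .4,
              .3, .4, 1), p, p)
X <- matrix(rnorm(n * p), n, p) %*% chol(R)

# Arbitrary model-implied mean vector and covariance matrix.
mu    <- c(0.1, -0.2, 0)
Sigma <- matrix(c(1, .4, .2,
                  .4, 1, .3,
                  .2, .3, 1), p, p)

# Traditional approach: one density evaluation per row of the data.
loglik_rows <- function(X, mu, Sigma) {
  Sinv <- solve(Sigma)
  ldet <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)
  dev  <- sweep(X, 2, mu)                 # subtract mu from every row
  quad <- rowSums((dev %*% Sinv) * dev)   # one quadratic form per row
  sum(-0.5 * (ncol(X) * log(2 * pi) + ldet + quad))
}

# Sufficient-statistic approach: only n, the mean, and the covariance.
loglik_suff <- function(n, xbar, S, mu, Sigma) {
  Sinv <- solve(Sigma)
  ldet <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)
  d    <- xbar - mu
  -0.5 * n * (length(xbar) * log(2 * pi) + ldet +
                sum(diag(Sinv %*% S)) + sum(d * (Sinv %*% d)))
}

xbar <- colMeans(X)
S    <- crossprod(sweep(X, 2, xbar)) / n  # divisor n, not n - 1

ll_rows <- loglik_rows(X, mu, Sigma)
ll_suff <- loglik_suff(n, xbar, S, mu, Sigma)
print(all.equal(ll_rows, ll_suff))
```

The second function does the same amount of work whether n is 500 or 50,000, which is where the speedup for large complete datasets comes from.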

And I might have already said it in a previous email, but if you don't care about the ppp (the model fit metric), then setting

test = "none"

should speed up some more.

Ed


Christopher Bratt

Feb 18, 2023, 1:33:14 AM

to Ed Merkle, blavaan

I very much care about the PPP ;-)

Thanks for the great work you’re doing. And thanks for explaining the challenge with missing data. One solution for me/us might be to try listwise deletion in an initial phase of more complex projects, and then include the full sample in the final estimations. I will try that, given moderate missingness at around 5%. (I guess this approach will be more problematic with a higher proportion of missingness.)

Kind regards,

Christopher Bratt


Gabe Avakian Orona

Jun 20, 2023, 7:30:06 AM

to blavaan

Hello,

Blavaan is incredibly slow for me. I have implemented the suggestions. Are there any developments on this front?

Gabe

Ed Merkle

Jun 20, 2023, 10:37:21 AM

to Gabe Avakian Orona, blavaan

Hi Gabe, could you be more specific about your model? This thread was previously about regression models with thousands of observations, so I wonder whether you are doing that or something different.

Ed


Gabe Avakian Orona

Jun 21, 2023, 4:55:31 AM

to blavaan

Hi Ed, thank you for your reply. I believe this thread was originally about 260 observations; but the issue still applies to thousands of observations. I am experiencing issues with about 300 observations.

Mauricio Garnier-Villarreal

Jun 21, 2023, 9:35:26 AM

to blavaan

Hi Gabe

Could you give us more details? The model, data characteristics, and sessionInfo(), for example.

Also, how fast does Stan run in general? That would make sure it is not an issue with the general installation of Stan.

take care

Gabe Avakian Orona

Feb 7, 2024, 3:31:12 AM

to blavaan

Hi All, Blavaan runs incredibly slow. I'm running a rather simple cfa model; meanwhile, Stan in general (stan_glm function) runs at light speed.

Ed Merkle

Feb 7, 2024, 9:30:15 AM

to blavaan

Gabe, we continue to make improvements, and it would be helpful to see model syntax and details about the data to see where further improvement is needed. Ordinal data, missing data, and prior-data conflict are some common slowdowns.

Best,

Ed

Gabe Avakian Orona

Feb 7, 2024, 9:50:00 AM

to Ed Merkle, blavaan

Hi Ed,

Thank you for your message. Yes, I have some ordered variables, but the issue has remained in non-ordered variables. The data have a lot of missing data, about 53 percent on a data file of 364 participants. Here is some syntax.

Also, is there a way to get a posterior distribution from a generated parameter (such as an indirect effect in mediation analysis)?

Thank you again for replying back to me.

Best,

Gabe

