Monte Carlo Simulation: Implementing a design using simsem


Chris Castille

Aug 31, 2021, 10:57:31 PM
to lavaan
Hi folks,

I'm new to simulation with lavaan and simsem and am trying to implement a particular design. I want to see how well certain estimating models recover the population parameter values, and I also want to examine model fit. I need to create at least two population models and then two estimating models.

The population models will consist of 2 correlated latent variables (c1 ~~ .2*c2; I call these "substantive") with 4 indicators each (x1-x8). However, one model will have one additional latent variable that functions as a method factor. In this iteration, the method factor (m1) contributes to x1-x8. This is a bifactor model. Another population model will include another method factor (m2), but this model will have m1 contributing to x1-x4 and m2 contributing to x5-x8 (m1 ~~ 0*m2).

There will be two estimating models: one where there are two substantive factors (method factors ignored) and another where there are two substantive factors and one method factor. While the former model is misspecified for both population models, the latter model should work well when there is one method factor underlying the data.

We want to examine how well the estimating models perform across different sample sizes (n = 100, 300, 1000) and scale reliability levels (.70, .80, .90). So there are 18 conditions in this example, but in our research setting there could be well over 200 conditions (I've kept things relatively small and easy to digest). That's a lot of code. I'm looking for any suggestions on using the sim() function of the simsem package for conducting the simulation and reporting the results.
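A minimal sketch of how the design described above could be laid out as a table, assuming the 18 conditions are the data-generating ones (2 population models x 3 sample sizes x 3 reliability levels). The level labels are hypothetical placeholders, not names from the post.

```r
# Sketch of the design grid; the level labels below are hypothetical.
pop_levels  <- c("one_method", "two_methods")
est_levels  <- c("two_factor", "bifactor")
n_levels    <- c(100, 300, 1000)
rel_levels  <- c(0.70, 0.80, 0.90)

# One row per data-generating condition.
design <- expand.grid(
  pop_model   = pop_levels,
  n           = n_levels,
  reliability = rel_levels,
  stringsAsFactors = FALSE
)
nrow(design)  # 18

# One row per analysis run: each data-generating condition is
# analysed by both estimating models.
runs <- expand.grid(
  pop_model   = pop_levels,
  est_model   = est_levels,
  n           = n_levels,
  reliability = rel_levels,
  stringsAsFactors = FALSE
)
nrow(runs)  # 36
```

Keeping the grid in one data frame means adding a factor to the design only adds one argument to expand.grid().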

To share some sample code, the population, estimation, and my use of the sim() function are below:

pop <- ' 
  # define two substantive factors with standardized loadings
  c1 =~ 0.570*x1 + 0.632*x2 + 0.744*x3 + 0.543*x4
  c2 =~ 0.570*x5 + 0.632*x6 + 0.744*x7 + 0.543*x8
  
  # define a single unmeasured latent method construct (ULMC) loading on all indicators
  m1 =~ .271*x1 + .316*x2 + .374*x3 + .300*x4 + .374*x5 + .300*x6 + .271*x7 + .316*x8 

  # define a relationship linking substantive factors
  c1 ~~ 0.2*c2
  
  # substantive factors standardized
  c1 ~~ 1*c1
  c2 ~~ 1*c2
  
  # method factor standardized
  m1 ~~ 1*m1
  
  # method factor is not correlated with the substantive factors
  c1 ~~ 0*m1
  c2 ~~ 0*m1
   
  # residual variances for the indicators
  x1 ~~ 0.602*x1
  x2 ~~ 0.501*x2
  x3 ~~ 0.307*x3
  x4 ~~ 0.615*x4
  x5 ~~ 0.535*x5
  x6 ~~ 0.511*x6
  x7 ~~ 0.373*x7
  x8 ~~ 0.605*x8
'
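As a quick check on the population values above, here is a sketch in base R. It assumes each indicator is meant to have unit variance and that the method factor is orthogonal to the substantive factors, so each residual variance should equal 1 minus the two squared loadings. The vector names are made up for illustration.

```r
# Loadings copied from the population model above.
lambda_c <- c(x1 = 0.570, x2 = 0.632, x3 = 0.744, x4 = 0.543,
              x5 = 0.570, x6 = 0.632, x7 = 0.744, x8 = 0.543)
lambda_m <- c(x1 = 0.271, x2 = 0.316, x3 = 0.374, x4 = 0.300,
              x5 = 0.374, x6 = 0.300, x7 = 0.271, x8 = 0.316)

# Residual variances as posted.
theta_posted <- c(x1 = 0.602, x2 = 0.501, x3 = 0.307, x4 = 0.615,
                  x5 = 0.535, x6 = 0.511, x7 = 0.373, x8 = 0.605)

# With orthogonal factors of variance 1:
# var(x) = lambda_c^2 + lambda_m^2 + theta, so theta = 1 - both squares.
theta_implied <- 1 - lambda_c^2 - lambda_m^2

round(theta_implied, 3)
max(abs(theta_implied - theta_posted))  # below 0.001: only rounding differences
```

The posted residual variances agree with the implied ones to three decimals, so the indicators are standardized in the population.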

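est <- ' 
  # define a single ULMC loading on all eight indicators
  m1 =~ x8 + x7 + x6 + x5 + x4 + x3 + x2 + x1 
  
  # define two substantive factors (loadings freely estimated)
  c2 =~ x5 + x6 + x7 + x8
  c1 =~ x1 + x2 + x3 + x4

  # define a relationship linking substantive factors
  c1 ~~ c2

  # keep the method factor orthogonal to the substantive factors
  # (cfa() would otherwise let all latent variables covary)
  m1 ~~ 0*c1
  m1 ~~ 0*c2
'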

simOutput_1.100 <- sim(
  nRep = 100,
  model = est, # estimating (analysis) model
  n = 1000,
  generate = pop, # generating model
  std.lv = TRUE,
  lavaanfun = 'cfa'
)

Any guidance is appreciated!

Chris

Alex Schoemann

Sep 1, 2021, 12:20:32 PM
to lavaan
Hi Chris,

A couple of things I've found useful in running larger scale simulation studies with simsem:
  • If you're varying model parameters across replications, it's much easier to specify things using the simsem matrix style input than using lavaan syntax (see the vignette and examples for more information on this: https://github.com/simsem/simsem/wiki/Vignette)
  • I usually set up my simulation as a function that will run one condition (with the different varying components as arguments to that function) and then use an apply function to iterate across all simulation conditions. Assuming a fully crossed design, it's often easy to set up a data frame of simulation conditions with expand.grid. The function outputs the results from the simulation (I usually output the raw results and do any processing of those separately). This also makes it easy to parallelize the simulation, running different conditions on different compute cores.
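A rough sketch of the second point, not the poster's actual code. It reuses the population values from the first message and varies only sample size and estimating model. The names pop_one_method, est_models, conditions, and run_condition are hypothetical. The simsem calls are skipped if the package is not installed, and nRep is kept small here only so the sketch runs quickly.

```r
# Population model with one method factor (values from the original post).
pop_one_method <- '
  c1 =~ 0.570*x1 + 0.632*x2 + 0.744*x3 + 0.543*x4
  c2 =~ 0.570*x5 + 0.632*x6 + 0.744*x7 + 0.543*x8
  m1 =~ 0.271*x1 + 0.316*x2 + 0.374*x3 + 0.300*x4
  m1 =~ 0.374*x5 + 0.300*x6 + 0.271*x7 + 0.316*x8
  c1 ~~ 1*c1
  c2 ~~ 1*c2
  m1 ~~ 1*m1
  c1 ~~ 0.2*c2
  c1 ~~ 0*m1
  c2 ~~ 0*m1
  x1 ~~ 0.602*x1
  x2 ~~ 0.501*x2
  x3 ~~ 0.307*x3
  x4 ~~ 0.615*x4
  x5 ~~ 0.535*x5
  x6 ~~ 0.511*x6
  x7 ~~ 0.373*x7
  x8 ~~ 0.605*x8
'

# The two estimating models, stored by name so the grid can refer to them.
est_models <- list(
  two_factor = '
    c1 =~ x1 + x2 + x3 + x4
    c2 =~ x5 + x6 + x7 + x8
  ',
  bifactor = '
    c1 =~ x1 + x2 + x3 + x4
    c2 =~ x5 + x6 + x7 + x8
    m1 =~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8
    m1 ~~ 0*c1
    m1 ~~ 0*c2
  '
)

# Fully crossed design: one row per condition.
conditions <- expand.grid(
  n   = c(100, 300, 1000),
  est = names(est_models),
  stringsAsFactors = FALSE
)

# Run a single condition and return the raw simsem result object.
run_condition <- function(n, est, nRep = 100, seed = 1234) {
  simsem::sim(
    nRep      = nRep,
    model     = est_models[[est]],  # estimating model
    n         = n,
    generate  = pop_one_method,     # generating model
    std.lv    = TRUE,
    lavaanfun = "cfa",
    seed      = seed
  )
}

if (requireNamespace("simsem", quietly = TRUE)) {
  library(simsem)
  # Iterate over the rows of the grid; store raw results, process later.
  results <- lapply(seq_len(nrow(conditions)), function(i) {
    run_condition(conditions$n[i], conditions$est[i], nRep = 10)
  })
  names(results) <- paste0(conditions$est, "_n", conditions$n)

  # Post-processing for one condition: fit-index cutoffs and parameter recovery.
  print(summaryFit(results[["two_factor_n100"]]))
  print(summaryParam(results[["two_factor_n100"]], detail = TRUE))
}
```

Because each condition is an independent function call, the lapply() could be swapped for parallel::mclapply() or a similar parallel apply to spread conditions across cores.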
Hope this helps!

Alex

Chris Castille

Sep 2, 2021, 4:21:40 PM
to lav...@googlegroups.com
Thanks Alex. The vignettes were quite helpful in getting me started. 

I’m working through your second bullet. I have the expand.grid portion in my code…I’m just struggling to unpack everything else you’ve described (my naiveté, I’m sure). Any chance you’d be willing to share some code or point me to a reproducible example that I might be able to adapt for my purposes? 

Thanks again for your time! This was very helpful for pushing me forward.

Chris
