Simsem error using sim function

323 views
Skip to first unread message

Samuel Donnelly

unread,
Oct 24, 2020, 5:18:47 PM10/24/20
to lavaan
I'm getting an error when using the "sim" function from the "simsem" package. "Error in clean(simResults): All replications in the result object are not convergent. Thus, the result object cannot be used. Below is code to reference. 

effectSizes <- c(.1,  .3, .5)

modelEffects <- expand.grid(effectSizes, effectSizes, effectSizes)
names(modelEffects) <- c('a', 'b', 'c')

models <- modelEffects %>%
    rowwise() %>%
    do({
        genModel <- paste0('# direct effect
                           Y ~ ', .$c, '*X
                           # mediator
                           M ~ ', .$a, '*X
                           Y ~ ', .$b, '*M
                           X ~~ 1*X
                           Y ~~ 1*Y
                           M ~~ 1*M
                           ')
        testModel <-'# direct effect
                    Y ~ c*X
                    # mediator
                    M ~ a*X
                    Y ~ b*M
                    # indirect effect (a*b)
                    ab := a*b
                    # total effect
                    total := c + (a*b)
                    '
        data.frame(a=.$a, b=.$b, c=.$c, 
                   gen=genModel, test=testModel, stringsAsFactors=F)
    })
#############################################################################

REDOSIMS=T #Set REDOSIMS = T for the first time executing and F for future iterations. 
if(REDOSIMS){
    allModelPowerSim <- models %>%
        rowwise() %>%
        do({
            manySims <- sim(NULL, model=.$test[1], n=50:1500, generate=.$gen[1], 
                            lavaanfun='sem', multicore=T) 
            data_frame(a=.$a, b=.$b, c=.$c, powersims=list(manySims))
        })
        #save the above
        saveRDS(allModelPowerSim, 'power_simulations.RDS')
} else {
    #load the above
    allModelPowerSim <- readRDS('power_simulations.RDS')
}
################################################################################

powerData <- allModelPowerSim %>% rowwise() %>%
    do({
        aSimPower <- as.data.frame(getPower(.$powersims, 
                                            nVal=seq(50, 1500, 5),
                                            powerParam='ab'))
        data_frame(a=.$a, b=.$b, c=.$c, 
                   alab=paste0('a=',.$a), 
                   blab=paste0('b=',.$b), 
                   clab=paste0('c=',.$c), 
                   N=aSimPower[,1],
                   ab=aSimPower[,2])
    })

print(powerData)

Terrence Jorgensen

unread,
Oct 25, 2020, 5:17:03 AM10/25/20
to lavaan
I'm getting an error 

Can you first try installing the development version of simsem, in case this was due to a bug that was already resolved?

devtools::install_github("simsem/simsem/simsem")

If the error persists, could you then write a minimal reproducible example?  Your current example must take quite some time to run n=50:1500, when only a few reps should be necessary to track down the problem.  And your syntax is not annotated, making it especially difficult to follow since it is written in Hadley Wickham's bizarro-R / tidyverse language instead of the standard style used in the simsem vignettes.  For instance, the top portion of your code seems to cover several data-generating conditions (necessary to track down the problem?  I expect not), and it could be much more legible when written simply as:

effectSizes <- c(.1,  .3, .5)

modelEffects
<- expand.grid(a = effectSizes, b = effectSizes, c = effectSizes)
modelEffects$genModel
<- paste0('# direct effect
                           Y ~ '
, modelEffects$c, '*X
                           # mediator
                           M ~ '
, modelEffects$a, '*X
                           Y ~ '
, modelEffects$b, '*M

                           X ~~ 1*X
                           Y ~~ 1*Y
                           M ~~ 1*M
                           '
)
modelEffects$testModel
<- paste0('# direct effect

                    Y ~ c*X
                    # mediator
                    M ~ a*X
                    Y ~ b*M
                    # indirect effect (a*b)
                    ab := a*b
                    # total effect
                    total := c + (a*b)
                    ')

Your second section is running the simulation and trying to save results.  But what do you hope to accomplish with list(manySims)?  I can't see how you would expect a SimResult object to be converted to a data.frame, not should it be necessary.  If you are trying to see how power to detect particular parameters changes with continuously increasing sample size, you could write a function that extracts what you are interested in from each SimResult object, such as whether the (in)direct effects were rejected at each sample size.  Following from my syntax above:

# when all effects = 0.3
manySims
<- sim(NULL, model = modelEffects$test[14], n = 10*c(10:20),
                generate
= modelEffects$gen[14], lavaanfun = 'sem')
pow
<- data.frame(N = manySims@n,
                  pow
.a = manySims@cilower$a > 0,
                  pow
.b = manySims@cilower$b > 0,
                  pow
.c = manySims@cilower$c > 0,
                  pow
.ab = manySims@cilower$ab > 0)
pow
# predict rejection with N to infer power
pow.ab
 <- glm(pow.ab ~ N, data = pow, family = binomial(logit))
# required N for power = 80%
round
((log(.8/(1 - .8)) - coef(pow.ab)[[1]]) / coef(pow.ab)[[2]]) # N = 113

This code runs fine using simsem version 0.5-15.905
 
If you share more details about what specifically you are trying to do, perhaps I suggest an efficient way to accomplish it.

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Samuel Donnelly

unread,
Oct 25, 2020, 11:46:27 AM10/25/20
to lavaan
Hi Dr. Jorgensen, 

Installing the package from github solved the issue, thank you very much! I apologize for the sloppy code - fairly new to R and this is my first stab at simulating for a power analysis. 

The above was to get more familiar with using the sim function and getting sample sizes for different effect sizes. My end goal is to do a power analysis for conditional indirect effect, specifically Hayes process model 7 (moderation via the a-path). My IV is a dummy coded variable (2 pairwise variables: Self-efficacy control vs. Self-efficacy high and self-efficacy control vs. self-efficacy low). My moderator is also a dummy coded variable based on two condition types (ambiguous vs unambiguous feedback). My mediator (effort) and DV (task performance) are both observed continuous variables. I've never seen this type of simulation done with categorical IVs & Moderators, it is possible using the simsem package? If so, I have a good estimate for the final model effect size (R2) in my DV, which is 10%, although I don't have any great estimates for the individual effects within the model, which is why I wanted to try out a range of possibilities (.1, .3, .5). 

Here is my dummy coding to be more specific for a 3x2 design (6 groups)
X1:  0 = Control, 0 = Low Self-efficacy, 1 = High Self-efficacy
X2: 0 = Control, 1 = Low Self-efficacy, 0 = High Self-efficacy
W: 0 = Ambiguous, 1 = Unambiguous 
Interaction 1: W*X1
Interaction 2: W*X2

I'm a PhD student in Organizational Psychology - my field often skips power analysis for conditional indirect effects since we are slower to adopt programing, although I'm striving to add this skill to my toolkit. I greatly appreciate any acumen you can lend me.
-Sam 

Terrence Jorgensen

unread,
Oct 25, 2020, 5:35:15 PM10/25/20
to lavaan
I've never seen this type of simulation done with categorical IVs & Moderators, it is possible using the simsem package?

Yes, you can model exogenous covariates using the kappa and gamma matrices.  The dummy codes must be specified as fixed values rather than randomly simulated like normal variables, but that shouldn't be a problem.  Just base it on the sample sizes per group (e.g., equal for all 6 groups?).  You can find examples for a simpler case in this vignette:  https://github.com/simsem/simsem/blob/master/SupportingDocs/Examples/Version05/exMultipleFormat/covData.R

Alternatively, you could specify it as a multigroup model, with parameters varying across the 6 groups in a way consistent with the hypotheses you have about moderated mediation (varying indirect effects across groups).

Samuel Donnelly

unread,
Oct 25, 2020, 7:00:13 PM10/25/20
to lavaan
I'm glad to hear it's possible, thank you very much for your recommendations. I really like your first approach of creating exogenous covariates for my IVs and Moderator. This looks a few levels above my current R skills - if it's not too time consuming, would you mind providing code for the skeleton of what this should look like? You've been very generous with your time and more informative than any other source I've come across, it's much appreciated nonetheless! 

Terrence Jorgensen

unread,
Oct 26, 2020, 6:32:13 AM10/26/20
to lavaan
if it's not too time consuming, would you mind providing code for the skeleton of what this should look like? 

This is something I think has come up before.  I just made a new example to add to the list of vignettes, designed around your design:


Other than setting / varying your population parameters, this should be what you need.  Let me know if you have questions.

Samuel Donnelly

unread,
Oct 26, 2020, 10:32:35 AM10/26/20
to lavaan
This is amazing thank you so much!! I'll pass this knowledge onto my peers. 

The hypothesis I'm testing is "Self-efficacy will have an indirect effect on performance through effort that is greater in the ambiguous condition than in the unambiguous condition".  Should I make any modifications to better reflect this direction?

Terrence Jorgensen

unread,
Oct 26, 2020, 4:27:38 PM10/26/20
to lavaan
The hypothesis I'm testing is "Self-efficacy will have an indirect effect on performance through effort that is greater in the ambiguous condition than in the unambiguous condition".  Should I make any modifications to better reflect this direction?

You can add another user-defined parameter that is the difference between indirect effects in those 2 conditions:

ind1.diff := ind1.amb - ind1.unamb
ind2
.diff := ind2.amb - ind2.unamb

Then the summaryParam() output will show power for that specific NHST.

Samuel Donnelly

unread,
Oct 26, 2020, 5:35:14 PM10/26/20
to lavaan
I cannot thank you enough. Please let me know if there is anyway I can support your generous work here. 

Hyeseung Koh

unread,
May 22, 2022, 7:41:53 PM5/22/22
to lavaan
Please check that #k-1 = the number of dummy codes when k refers to the level of a variable. It would be inaccurate information on the number of dummy codes here: https://github.com/simsem/simsem/blob/master/SupportingDocs/Examples/Version05/exMultipleFormat/moderatedMediation.R

In addition, do you have any plan to construct the command for a multi-group model with three mediators with an independent variable and an outcome variable using an observed variables? 
2020년 10월 26일 월요일 오후 4시 35분 14초 UTC-5에 donne...@gmail.com님이 작성:
Reply all
Reply to author
Forward
0 new messages