I’m working on the sample size estimation for a moderated mediation model. Specifically, the independent variable is categorical and has three between-subject conditions. The mediator, moderator, and outcome are continuous. Theoretically, the moderator can moderate the path between the independent variable and the mediator, the path between the mediator and the outcome, or both. So I want to calculate the minimum sample size for the three possible scenarios based on given a prior parameters. I read the following wonderful tutorial and adapted their R scripts to fit the scenario that the moderator can moderate the path between the independent variable and the mediator, which is quite straightforward, and I hope I didn’t make any mistakes. I attached the complied function as follows.
Donnelly, S., Jorgensen, T. D., & Rudolph, C. W. (2022). Power analysis for conditional indirect effects: A tutorial for conducting Monte Carlo simulations with categorical exogenous variables. Behavior Research Methods, 1-18.
The mathematical expression of the model for the first scenario (intercept and residual omitted):

X1 and X2 represent the two dummy codings that define the independent variable.
M represents the mediator variable.
W represents the moderator variable.
X1W and X2W represent the two interaction terms.
Y represents the outcome variable.
a1 and a2 represent the effects of X1 and X2 on M.
a1.mod and a2.mod represent the moderating effect of W between the relations between X1 and X2 and M.
b represents the effect of M on Y.
c1 and c2 represent the effects of X1 and X2 on Y.
W. M and W.Y represent the effects of W on M and Y.
R script for the first scenario:
HH_moderate_a1_a2 <- function(N.per.group,
a1,a2,a1.mod,a2.mod,
c1,c2,c1.mod,c2.mod,
b,b.mod,
W.M,W.Y,
nRep,seed) {
N.per.group <- N.per.group # balanced group sample sizes
exoData <- data.frame(X1 = rep(c(0,1,0), each = N.per.group),X2 = rep(c(0,0,1), each = N.per.group))
W <- rnorm(n=3*N.per.group, mean = 0, sd = 1)
exoData <- cbind(exoData, W)
exoData$X1W <- exoData$X1*exoData$W
exoData$X2W <- exoData$X2*exoData$W
## effect of X1 on M
a1 <- a1
## effect of X2 on M
a2 <- a2
## how much "a1" path is moderated by W (i.e., effect of X1W interaction)
a1.mod <- a1.mod
## how much "a2" path is moderated by W (i.e., effect of X2W interaction)
a2.mod <- a2.mod
## effect of M on Y
b <- b
## effect of X on Y
c1 <- c1
c2 <- c2
## Assume that the b, c1, and c2 paths are not moderated by W
b.mod <- b.mod
c1.mod <- c1.mod
c2.mod <- c2.mod
## simple effects of moderator (W) on M or Y
W.M <- W.M # arbitrary, not the effect of interest
W.Y <- W.Y
kappa.free <- kappa.pop <- matrix(NA, nrow = 2, ncol = 5, dimnames = list(c("M","Y"), c("X1","X2","W","X1W","X2W")))
## label free parameters to define indirect / total effects
kappa.free["M","X1"] <- "a1"
kappa.free["M","X2"] <- "a2"
kappa.free["M","W"] <- "W.M"
kappa.free["M","X1W"] <- "a1.mod"
kappa.free["M","X2W"] <- "a2.mod"
kappa.free["Y","X1"] <- "c1"
kappa.free["Y","X2"] <- "c2"
kappa.free["Y","W"] <- "W.Y"
kappa.free["Y","X1W"] <- 0
kappa.free["Y","X2W"] <- 0
## fill in population parameters
kappa.pop["M","X1"] <- a1
kappa.pop["M","X2"] <- a2
kappa.pop["M","W"] <- W.M
kappa.pop["M","X1W"] <- a1.mod
kappa.pop["M","X2W"] <- a2.mod
kappa.pop["Y","X1"] <- c1
kappa.pop["Y","X2"] <- c2
kappa.pop["Y","W"] <- W.Y
kappa.pop["Y","X1W"] <- c1.mod
kappa.pop["Y","X2W"] <- c2.mod
exoPaths <- bind(free = kappa.free, popParam = kappa.pop)
beta.free <- beta.pop <- matrix(0, nrow = 2, ncol = 2, dimnames = list(c("M","Y"), c("M","Y")))
beta.free["Y", "M"] <- "b" # estimate the effect of M on Y, label it "b"
beta.pop["Y", "M"] <- b # set the population value
endoPaths <- bind(free = beta.free, popParam = beta.pop)
residCor <- binds(free = diag(as.numeric(NA), 2), popParam = diag(2))
userParams <- ' ind1 := a1 * b
total1 := ind1 + c1
ind2 := a2 * b
total2 := ind2 + c2
mod1 := a1.mod
mod2 := a2.mod
path.a1 := a1
path.a2 := a2
path.b := b'
modMed1 <- model.path(BE = endoPaths, RPS = residCor, KA = exoPaths, # or GA=exoPaths, arbitrary for path models
indLab = rownames(kappa.free), covLab = colnames(kappa.free), con = userParams)
rejectMCCI <- function(object) {
CIs <- semTools::monteCarloCI(object, level = 0.975)
apply(CIs, 1, function(CI) 0 < CI["ci.lower"] | 0 > CI["ci.upper"])
}
out <- sim(nRep = nRep, model = modMed1, covData = exoData, n = nrow(exoData), seed = seed, outfun = rejectMCCI, multicore = TRUE)
simulation_results_nonmonteCarloCI <- summaryParam(out, matchParam = TRUE, digits = 3, detail = TRUE)
testMCCI <- do.call(rbind, getExtraOutput(out))
list(full_simulation_results_nonmonteCarloCI=simulation_results_nonmonteCarloCI, simulation_results_monteCarloCI=colMeans(testMCCI))
}
The function for the first scenario went well. However, I couldn’t figure out how to adapt the R script to the second scenario (and the third scenario), where the moderator moderates the path between the mediator and the outcome. I tried to treat the interaction term of the moderator and the mediator (M*W) as a covariate and generated the data beforehand as take it as the input for the argument covData of the sim function, but got error message “Error in rep(psLab, nf:1) : invalid 'times' argument” when running the model.path function. I googled this error message but still couldn’t figure out what happened. I also tried other adaptations, and all got this error message… I also attached my adapted R script as follows. I really what to know what happened or what is the right way to adapt the script for my case. Thank you for your support!
The mathematical expression of the model for the second scenario (intercept and residual omitted):

MW represents the interaction term of the mediator and the moderator.
R script for the first scenario:
N.per.group <- 350 # balanced group sample sizes
exoData <- data.frame(X1 = rep(c(0,1,0), each = N.per.group),X2 = rep(c(0,0,1), each = N.per.group))
exoData$M <- (-0.4*exoData$X1) + (-0.3*exoData$X2)
W <- rnorm(n=3*N.per.group, mean = 0, sd = 1)
exoData <- cbind(exoData, W)
exoData$MW <- exoData$M*exoData$W
## effect of X1 on M
a1 <- -0.4
## effect of X2 on M
a2 <- -0.3
## effect of M on Y
b <- -0.4
## effect of X on Y
c1 <- 0.1
c2 <- 0.1
## effect of the interaction term on M
MW.M <- 0
## Assume that the b, c1, and c2 paths are not moderated by W
b.mod <- -0.15
## simple effects of moderator (W) on M or Y
W.M <- 0 # arbitrary, not the effect of interest
W.Y <- 0
kappa.free <- kappa.pop <- matrix(NA, nrow = 2, ncol = 3, dimnames = list(c("M","Y"), c("X1","X2","W")))
## label free parameters to define indirect / total effects
kappa.free["M","X1"] <- "a1"
kappa.free["M","X2"] <- "a2"
kappa.free["M","W"] <- "W.M"
kappa.free["Y","X1"] <- "c1"
kappa.free["Y","X2"] <- "c2"
kappa.free["Y","W"] <- "W.Y"
## fill in population parameters
kappa.pop["M","X1"] <- a1
kappa.pop["M","X2"] <- a2
kappa.pop["M","W"] <- W.M
kappa.pop["Y","X1"] <- c1
kappa.pop["Y","X2"] <- c2
kappa.pop["Y","W"] <- W.Y
(exoPaths <- bind(free = kappa.free, popParam = kappa.pop))
beta.free <- beta.pop <- matrix(0, nrow = 3, ncol = 3, dimnames = list(c("M","Y","MW"), c("M","Y","MW")))
beta.free["Y", "M"] <- "b" # estimate the effect of M on Y, label it "b"
beta.free["MW", "M"] <- 0
beta.free["MW", "Y"] <- "b.mod"
beta.pop["Y", "M"] <- b # set the population value
beta.pop["MW", "M"] <- MW.M
beta.pop["MW", "Y"] <- b.mod
(endoPaths <- bind(free = beta.free, popParam = beta.pop))
(residCor <- binds(free = diag(as.numeric(NA), 3), popParam = diag(3)))
userParams <- '
ind1 := a1 * b
total1 := ind1 + c1
ind2 := a2 * b
total2 := ind2 + c2
path.a1 := a1
path.a2 := a2
path.b := b
mod := b.mod'
## got error message “Error in rep(psLab, nf:1) : invalid 'times' argument” when running this part
modMed1 <- model.path(BE = endoPaths, RPS = residCor, KA = exoPaths, # or GA=exoPaths, arbitrary for path models
indLab = rownames(kappa.free), covLab = colnames(kappa.free), con = userParams)
set.seed(20230423)
datmod <- generate(modMed1, covData = exoData, n = nrow(exoData))
```
## memo.fit1 <- analyze(modMed1, data = datmod) # simsem function, which calls lavaan
memo.mod1 <- ' Y ~ c1*X1 + c2*X2+ b*M + b.mod*MW + W # outcome
M ~ a1*X1 + a2*X2 + W # mediator
ind1 := a1 * b
total1 := ind1 + c1
ind2 := a2 * b
total2 := ind2 + c2
path.a1 := a1
path.a2 := a2
path.b := b
mod := b.mod
'
memo.fit1 <- sem(memo.mod1, data = datmod) # lavaan function
parameterEstimates(memo.fit1, output = "pretty")
rejectMCCI <- function(object) {
CIs <- semTools::monteCarloCI(object, level = 0.975)
apply(CIs, 1, function(CI) 0 < CI["ci.lower"] | 0 > CI["ci.upper"])
}
out1 <- sim(nRep = 100, model = modMed1, covData = exoData, n = nrow(exoData), seed = 20230423, outfun = rejectMCCI)
summaryParam(out1, matchParam = TRUE, digits = 3, detail = TRUE)
testMCCI <- do.call(rbind, getExtraOutput(out1))
colMeans(testMCCI) # empirical estimate of power for Monte Carlo CIs