Moderation simsem

Zuzanna Wojcik

Apr 27, 2021, 7:53:44 AM

to lav...@googlegroups.com

Hello,

I am running a power analysis with the R package "simsem" and its sim() function.

In my population model, how can I represent the added model complexity of an interaction that will be created using a product-indicator approach with the indProd() function? Simply adding another factor (i.e., the moderator) as a predictor in my population model does not account for such complexity. My idea was to set a high covariance (e.g., 0.7) between the moderator and the variables used to compute the moderator. Is that correct? What else could I do?

My model consists of 6 latent factors (about 5 indicators each), one mediation and one moderation. The estimated power for each path is 1 even with only N = 300 and loadings set to only .4. However, from my limited experience with SEM, I think that 300 participants might not be enough, especially considering the moderation.

I would appreciate an explanation using lavaan syntax (not matrix style, please).

Thank you.

Terrence Jorgensen

Apr 28, 2021, 11:50:31 PM

to lavaan

> In my population model, how can I represent an interaction using a product indicator approach

1. Write your own custom data-generating function to pass to the generate= argument. It should first simulate factor scores for the exogenous factors, then use your moderation model (plus simulated residuals) to generate the endogenous factor scores, and finally create indicators of all factors using the measurement-model parameters.
2. Write another custom function that accepts the data generated by function 1 and applies indProd() to it. Pass that function to the datafun= argument.
3. Use lavaan syntax to specify the analysis model, and pass that to the model= argument.
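
The three steps above can be sketched as follows for the simplest case of Y regressed on X, W, and their interaction, with 3 indicators per factor. This is only an illustration under stated assumptions: every population value (loadings of .7, the slopes, the factor correlation of .3, the residual variance of .7) and the names genData, addProducts, and analysisModel are hypothetical. The product-indicator names x1.w1, x2.w2, x3.w3 assume indProd()'s default naming, so check names(addProducts(genData(10))) before relying on them. The sim() call is left commented out because it needs simsem and semTools installed.

```r
## Step 1: data-generating function with a single sample-size argument
genData <- function(n) {
  ## exogenous factor scores: unit variances, correlation .3 (hypothetical)
  X <- rnorm(n)
  W <- 0.3 * X + sqrt(1 - 0.3^2) * rnorm(n)
  ## endogenous factor scores: moderation model plus a simulated residual
  Y <- 0.3 * X + 0.3 * W + 0.2 * X * W + rnorm(n, sd = sqrt(0.7))
  ## indicators: loading * factor score + unique error (hypothetical loading)
  lambda <- 0.7
  makeInd <- function(f) {
    sapply(1:3, function(i) lambda * f + rnorm(n, sd = sqrt(1 - lambda^2)))
  }
  dat <- data.frame(makeInd(X), makeInd(W), makeInd(Y))
  names(dat) <- c(paste0("x", 1:3), paste0("w", 1:3), paste0("y", 1:3))
  dat
}

## Step 2: function applied to each generated data set (needs semTools)
addProducts <- function(data) {
  semTools::indProd(data, var1 = paste0("x", 1:3), var2 = paste0("w", 1:3),
                    match = TRUE, doubleMC = TRUE)
}

## Step 3: analysis model in lavaan syntax
analysisModel <- '
  X  =~ x1 + x2 + x3
  W  =~ w1 + w2 + w3
  Y  =~ y1 + y2 + y3
  XW =~ x1.w1 + x2.w2 + x3.w3
  Y ~ X + W + XW
'

## Putting it together (not run here; requires simsem and semTools):
## out <- simsem::sim(nRep = 500, model = analysisModel, n = 300,
##                    generate = genData, datafun = addProducts,
##                    lavaanfun = "sem")
## simsem::summaryParam(out, alpha = 0.05)
```

Keeping the three pieces as separate objects makes it easy to test each one on its own (for example, fitting analysisModel once to addProducts(genData(300)) with lavaan::sem()) before running the full simulation.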

Terrence D. Jorgensen

Assistant Professor, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam

http://www.uva.nl/profile/t.d.jorgensen

Zuzanna Wojcik

Apr 29, 2021, 9:24:39 AM

to lav...@googlegroups.com

That's great, I really appreciate your help.

Do you have any examples that could give an idea of what the structure of the custom object/function should look like?

Actually, an example of all three steps you described would be really helpful for checking my syntax. Maybe a simple example with latent variables, in which Y is regressed on X and the effect is moderated by W (e.g., 3 indicators each)?

Thank you very much, Terrence.

Zu

--
You received this message because you are subscribed to the Google Groups "lavaan" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lavaan+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lavaan/23c1cb87-0dee-48b3-a7f3-32ec87a89928n%40googlegroups.com.

Zuzanna Wojcik

May 1, 2021, 8:38:35 AM

to lav...@googlegroups.com

Can someone help me interpret Terrence's advice?

> You need to write your own custom data-generating function

I would need more information about the structure of the function.

> then you can use your moderation model (plus simulated residuals) to generate endogenous factor scores.

Does this mean using matrices to create the model() arguments, such as LY, RTE, and so on?

I am wondering whether these two sentences simply mean creating my population model to pass to the generate= argument.

> Then you create indicators of all factors using the measurement model parameters.

Indicator scores? How?

> You need to write another custom function that accepts the data generated by function 1, which then applies indProd() to it. Pass that function to the datafun= argument.

Does this mean something like the simFunction() example from version 0.2-0 (which is not included in version 0.6-8)?

I could not find anyone able to help me, so a complete example would be really useful.

Thank you.

Zu

Terrence Jorgensen

May 2, 2021, 2:23:37 AM

to lavaan

I typed "simulate latent interaction" into the group's search bar and found an old post of mine from 2013 that should help show how to simulate from a latent interaction model:

https://groups.google.com/g/lavaan/c/PxFUKcIwPd0/m/-T67JNWL4doJ

Then you just need to put it into a function with a sample-size argument, as the ?sim help page describes for the generate= argument. There are numerous tutorials about writing R functions online, such as: http://adv-r.had.co.nz/Functions.html

Zuzanna Wojcik

May 2, 2021, 10:09:55 PM

to lav...@googlegroups.com

Thanks so much for this, as I would have never been able to do that without that script. I have just one last doubt. In my model I have two endogenous variables, and I am trying to compute the residual covariance matrix for the endogenous variables. My attempt was this one:

# compute predicted endogenous
factors$endog1 <- 0 + .7*factors$esog1
factors$endog2 <- 0 + .7*factors$esog2 + .7*factors$esog1 + .7*factors$esog3 + .7*factors$endog1 + .5*factors$esog2*factors$esog3

# 4. specify a residual covariance matrix for the endogenous latent variables, draw random errors --------
# residual covariance matrix
endoMat <- matrix(cov(factors$endog1, factors$endog2), ### ??
nrow = 2, ncol = 2,
dimnames = list(c("endog1", "endog2"), c("endog1", "endog2")))
diag(endoMat) <- c(var(factors$endog1),var(factors$endog2)) ### ??

# draw errors
set.seed(327468)
uniqueFactors <- data.frame(MASS::mvrnorm(N, mu = rep(0, 2), Sigma = endoMat))

As you can see, I have set the means to 0. For Sigma, I set the covariance between the two endogenous variables to the covariance between their predicted values, and I put the variances of the two predicted endogenous variables on the diagonal. Is this correct?

Thank you very much,
Zu


Terrence Jorgensen

May 4, 2021, 8:24:33 PM

to lavaan

> I have two endogenous latent variables

Your syntax is correct, except for one detail. You set the residual-covariance parameters by calculating the covariance matrix of the explained parts of the endogenous factors (i.e., their expected values given the common-factor effects):

endoMat <- matrix(cov(factors$endog1, factors$endog2), ### ??
nrow = 2, ncol = 2,
dimnames = list(c("endog1", "endog2"), c("endog1", "endog2")))
diag(endoMat) <- c(var(factors$endog1),var(factors$endog2)) ### ??

Even if these were the correct values, it would be sufficient to simply take the covariance matrix returned by cov():

cov(factors[ , c("endog1","endog2")])

But the residual covariance matrix should contain the unexplained (co)variances, and you can choose any values you want for those. If the explained variances are < 1 and you want the population slopes to be interpretable as standardized slopes, then you can subtract the explained covariance matrix from an identity matrix, so that each residual variance is 1 minus the explained variance:

exoMat <- cov(factors[ , c("endog1","endog2")])

## subtract from 2 × 2 identity matrix

endoMat <- diag(2) - exoMat

## choose a residual correlation

resCor <- 0.3 # or whatever value you want

## convert to residual covariance by multiplying by the SDs

endoMat[1,2] <- endoMat[2,1] <- resCor * prod(sqrt(diag(endoMat)))

Then this line will correctly simulate the latent-variable residuals:

uniqueFactors <- data.frame(MASS::mvrnorm(N, mu = rep(0, 2), Sigma = endoMat))
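
As a quick check of this recipe, here is a minimal sketch that applies it to simulated factor scores and confirms that the total variance of each endogenous factor comes out close to 1 and the residual correlation close to the chosen 0.3. The slopes (.5, .3, .3, .2), the independent unit-variance exogenous factors, and the large N are hypothetical choices made so the explained variances stay below 1. For simplicity endog2 does not depend on endog1 here, and a base-R Cholesky draw stands in for MASS::mvrnorm() so that nothing beyond base R is needed.

```r
set.seed(1234)
N <- 1e5  # large N so sample moments sit close to population values

## hypothetical exogenous factor scores: independent, unit variance
factors <- data.frame(esog1 = rnorm(N), esog2 = rnorm(N), esog3 = rnorm(N))

## predicted (explained) parts of the endogenous factors
factors$endog1 <- 0.5 * factors$esog1
factors$endog2 <- 0.3 * factors$esog2 + 0.3 * factors$esog3 +
                  0.2 * factors$esog2 * factors$esog3

## explained covariance matrix; residual variances = 1 - explained variances
exoMat  <- cov(factors[ , c("endog1", "endog2")])
endoMat <- diag(2) - exoMat

## chosen residual correlation, converted to a covariance
resCor <- 0.3
endoMat[1, 2] <- endoMat[2, 1] <- resCor * prod(sqrt(diag(endoMat)))

## draw residuals with covariance endoMat (Cholesky factor instead of mvrnorm)
resid <- matrix(rnorm(N * 2), nrow = N, ncol = 2) %*% chol(endoMat)

## total endogenous factor scores = explained part + residual
total <- as.matrix(factors[ , c("endog1", "endog2")]) + resid

round(diag(cov(total)), 2)  # both should be close to 1
round(cor(resid)[1, 2], 2)  # should be close to resCor
```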

Zuzanna Wojcik

May 5, 2021, 9:31:47 AM

to lav...@googlegroups.com

Thank you Terrence!

> If the explained variances are < 1 and you want the population slopes to be interpretable as standardized slopes [...]

In my case, the variance of endog2 is larger than 1, and when I run your script I get NaNs. Only when I set very low slopes (e.g., 0.2) for the predicted endogenous variable (factors$endog2 <- 0 + .2*factors$esog2 + .2*factors$esog1 + [...]) do I get a variance of endog2 below 1.

So, I deduce that:

If the explained variance is < 1, I have two options:

Option 1: choose any values, e.g., 0.2, 0.3, or 0.5.

Option 2: compute interpretable (standardized) population slopes.

If the explained variance is > 1, I have one option:
Option 1: choose any values, e.g., 0.2, 0.3, or 0.5.

Can I use these general "rules"?


Terrence Jorgensen

May 13, 2021, 10:51:18 PM

to lavaan

If you want to set standardized population parameters, the total variance must be 1, in which case any subset (explained or unexplained) must be < 1. If your explained variance > 1, then your slopes are not standardized. They don't have to be.
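
To see where the NaNs come from, here is a small sketch using the slopes from the earlier post. It assumes, hypothetically, that the three exogenous factors are independent with unit variance (the thread does not state their covariances). Under that assumption the explained variance of endog2 is (.7 + .7 × .7)² + .7² + .7² + .5² ≈ 2.65, so 1 minus the explained variance is negative and its square root (needed to turn the residual correlation into a covariance) is NaN.

```r
set.seed(42)
N <- 1e5

## hypothetical exogenous factor scores: independent, unit variance
esog1 <- rnorm(N)
esog2 <- rnorm(N)
esog3 <- rnorm(N)

## predicted endogenous scores with the slopes from the earlier post
endog1 <- 0.7 * esog1
endog2 <- 0.7 * esog2 + 0.7 * esog1 + 0.7 * esog3 + 0.7 * endog1 +
          0.5 * esog2 * esog3

explained <- var(endog2)     # well above 1 with these slopes
residVar  <- 1 - explained   # negative, so not a valid variance

## the square root of a negative variance is what produces the NaN
residSD <- suppressWarnings(sqrt(residVar))
is.nan(residSD)
```

With slopes this large the parameters simply are not on a standardized metric. Either pick the residual variances directly, or shrink the slopes until the explained variance falls below 1.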
