Moderation simsem


Zuzanna Wojcik

Apr 27, 2021, 7:53:44 AM
to lav...@googlegroups.com
Hello,

I am running a power analysis with the R package "simsem" and the function sim.

In my population model, how can I represent the model complexity of an interaction that will be created using a product-indicator approach with the indProd function? Simply regressing on another factor (i.e., the moderator) in my population model does not account for such complexity. My idea was to set a high covariance (e.g., 0.7) between the moderator and the variables used to compute the moderator. Is that correct? What could I do?

My model consists of 6 latent factors (about 5 indicators each), one mediation and one moderation. The power of each path is 1 when using only N=300, even though the loadings are set to only .4. However, from my limited experience with SEM models, I think that 300 participants might not be enough, especially considering the moderation.

I would appreciate an explanation using lavaan syntax (please not matrix style).

Thank you.

Terrence Jorgensen

Apr 28, 2021, 11:50:31 PM
to lavaan
In my population model, how can I represent an interaction using a product indicator approach 

  1. You need to write your own custom data-generating function to pass to the generate= argument. It first simulates factor scores for the exogenous factors; then you can use your moderation model (plus simulated residuals) to generate endogenous factor scores.  Then you create indicators of all factors using the measurement model parameters.
  2. You need to write another custom function that accepts the data generated by function 1, which then applies indProd() to it.  Pass that function to the datafun= argument.
  3. You can use lavaan syntax to specify the analysis model, and pass that to the model= argument.
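
A minimal sketch of how the three steps above might fit together, for the simple case of latent X and W predicting latent Y with 3 indicators each. Everything specific here is an illustrative assumption rather than something taken from this thread: the population values, the helper names genData, addProducts, and runPower, and the choice of matched, double-mean-centered product indicators. The product-indicator names (x1.w1, ...) assume indProd()'s default naming.

```r
library(MASS)  # mvrnorm()

## Step 1: custom generator; its first argument is the sample size.
## All population values are illustrative assumptions.
genData <- function(n) {
  # exogenous factor scores: unit variances, correlation .3
  Phi <- matrix(c(1, .3, .3, 1), nrow = 2,
                dimnames = list(c("X", "W"), c("X", "W")))
  f <- as.data.frame(MASS::mvrnorm(n, mu = c(0, 0), Sigma = Phi))
  # endogenous factor: main effects + latent interaction + residual
  f$Y <- .3 * f$X + .3 * f$W + .2 * f$X * f$W + rnorm(n, sd = sqrt(.7))
  # measurement model: loadings .7, residual variances .51
  dat <- list()
  for (fac in c("X", "W", "Y")) {
    for (i in 1:3) {
      dat[[paste0(tolower(fac), i)]] <- .7 * f[[fac]] + rnorm(n, sd = sqrt(.51))
    }
  }
  as.data.frame(dat)
}

## Step 2: function applied to each generated data set (datafun=).
## It appends the product indicators x1.w1, x2.w2, x3.w3.
addProducts <- function(data) {
  semTools::indProd(data,
                    var1 = c("x1", "x2", "x3"),
                    var2 = c("w1", "w2", "w3"),
                    match = TRUE, meanC = TRUE,
                    residualC = FALSE, doubleMC = TRUE)
}

## Step 3: analysis model in lavaan syntax (model=).
analysisModel <- '
  X  =~ x1 + x2 + x3
  W  =~ w1 + w2 + w3
  Y  =~ y1 + y2 + y3
  XW =~ x1.w1 + x2.w2 + x3.w3
  Y ~ X + W + XW
'

## Wiring the pieces into sim(); requires simsem and semTools.
runPower <- function(nRep = 1000, n = 300) {
  stopifnot(requireNamespace("simsem", quietly = TRUE),
            requireNamespace("semTools", quietly = TRUE))
  simsem::sim(nRep = nRep, model = analysisModel, n = n,
              generate = genData, datafun = addProducts,
              lavaanfun = "sem")
}
```

Calling out <- runPower() and then simsem::summaryParam(out) should report, among other things, how often each parameter was significant across replications, which is the power estimate for that parameter.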

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Zuzanna Wojcik

Apr 29, 2021, 9:24:39 AM
to lav...@googlegroups.com
That's great, I really appreciate your help.

Do you have any examples that could give an idea of what the structure of the custom object/function should look like?

Actually, an example of all three steps you described would be really helpful to check my syntax. Maybe a simple example with a latent variable Y regressed on X, moderated by W (e.g., 3 indicators each)?


Thank you very much, Terrence. 

Zu




--
You received this message because you are subscribed to the Google Groups "lavaan" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lavaan+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lavaan/23c1cb87-0dee-48b3-a7f3-32ec87a89928n%40googlegroups.com.

Zuzanna Wojcik

May 1, 2021, 8:38:35 AM
to lav...@googlegroups.com
Can someone help me interpret Terrence's advice?


You need to write your own custom data-generating function
I would need more information about the structure of the function.

then you can use your moderation model (plus simulated residuals) to generate endogenous factor scores.
Does this mean matrices to create the model() arguments, such as LY, RTE, and so on?

I am wondering if the two sentences simply mean creating my population model to pass to the generate function.

Then you create indicators of all factors using the measurement model parameters.
Indicator scores? How?

You need to write another custom function that accepts the data generated by function 1, which then applies indProd() to it.  Pass that function to the datafun= argument.
Does it mean something like the simFunction() example in version 0.2-0 (which is not included in version 0.6-8)?

I could not find anyone able to help me; a complete example would be really useful.

Thank you. 

Zu






 

Terrence Jorgensen

May 2, 2021, 2:23:37 AM
to lavaan
I typed "simulate latent interaction" into the group's search bar and found an old post of mine from 2013 that should help show how to simulate from a latent interaction model:


Then you just need to put it into a function with a sample-size argument, as the ?sim help page describes for the generate= argument.   There are numerous tutorials about writing R functions online, such as: http://adv-r.had.co.nz/Functions.html

Zuzanna Wojcik

May 2, 2021, 10:09:55 PM
to lav...@googlegroups.com
Thanks so much for this, as I would have never been able to do that without that script. I have just one last doubt. In my model I have two endogenous variables, and I am trying to compute the residual covariance matrix for the endogenous variables. My attempt was this one:

# compute predicted endogenous
factors$endog1 <- 0 + .7*factors$esog1  
factors$endog2 <- 0 + .7*factors$esog2 + .7*factors$esog1 + .7*factors$esog3 + .7*factors$endog1 + .5*factors$esog2*factors$esog3 

# 4. specify a residual covariance matrix for the endogenous latent variables, draw random errors --------
# residual covariance matrix
endoMat <- matrix(cov(factors$endog1, factors$endog2), ### ??
                  nrow = 2, ncol = 2,
                  dimnames = list(c("endog1", "endog2"), c("endog1", "endog2")))
diag(endoMat) <- c(var(factors$endog1),var(factors$endog2)) ### ??

# draw errors
set.seed(327468)
uniqueFactors <- data.frame(MASS::mvrnorm(N, mu = rep(0, 2), Sigma = endoMat))


As you can see, I have set the means to 0. As for Sigma, I have set the covariance between the two endogenous variables to the covariance between the predicted endogenous scores. Then, on the diagonal of that matrix, I have put the variances of the two predicted endogenous scores. Is this correct?

Thank you very much,
Zu 


Terrence Jorgensen

May 4, 2021, 8:24:33 PM
to lavaan
I have two endogenous latent variables

Your syntax is correct, except for one detail.  You set the residual-covariance parameters by calculating the covariance matrix of the explained parts of the endogenous factors (i.e., their expected values given the common-factor effects):

endoMat <- matrix(cov(factors$endog1, factors$endog2), ### ??
                  nrow = 2, ncol = 2,
                  dimnames = list(c("endog1", "endog2"), c("endog1", "endog2")))
diag(endoMat) <- c(var(factors$endog1),var(factors$endog2)) ### ??

Even if these were the correct values, it would be sufficient to simply take the covariance matrix returned by cov()

cov(factors[ , c("endog1","endog2")])

But the residual covariance matrix contains the unexplained (co)variances.  You can just choose any values you want.  If the explained variances are < 1 and you want the population slopes to be interpretable as standardized slopes, then you can subtract the explained variances from an identity matrix to get 1 minus the explained variance:

exoMat <- cov(factors[ , c("endog1","endog2")])
## subtract from 2 × 2 identity matrix
endoMat <- diag(2) - exoMat
## choose a residual correlation
resCor <- 0.3 # or whatever value you want
## convert to residual covariance by multiplying by the SDs
endoMat[1,2] <- endoMat[2,1] <- resCor * prod(sqrt(diag(endoMat)))

Then this line will correctly simulate the latent-variable residuals:

uniqueFactors <- data.frame(MASS::mvrnorm(N, mu = rep(0, 2), Sigma = endoMat))
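
A hedged, self-contained sketch of the same idea with a guard against the case where an explained variance is not below 1 (which would make a residual variance negative and its square root NaN). The slopes, N, and resCor are illustrative assumptions, and for simplicity endog2 here does not depend on endog1; the names explained and resVar are introduced only for this example.

```r
library(MASS)  # mvrnorm()

set.seed(327468)
N <- 500

# exogenous factor scores (standard normal, uncorrelated; an assumption)
factors <- data.frame(MASS::mvrnorm(N, mu = rep(0, 3), Sigma = diag(3)))
names(factors) <- c("esog1", "esog2", "esog3")

# predicted (explained) parts of the endogenous factors; slopes are illustrative
factors$endog1 <- .5 * factors$esog1
factors$endog2 <- .3 * factors$esog2 + .3 * factors$esog3 +
  .2 * factors$esog2 * factors$esog3

# explained variances of the two endogenous factors
explained <- diag(cov(factors[ , c("endog1", "endog2")]))

# standardized slopes require each explained variance to be below 1
if (any(explained >= 1)) {
  stop("explained variance >= 1: pick residual variances freely instead")
}

# residual variances that bring each total variance to 1
resVar <- 1 - explained
resCor <- 0.3  # chosen residual correlation

endoMat <- diag(resVar)
endoMat[1, 2] <- endoMat[2, 1] <- resCor * prod(sqrt(resVar))
dimnames(endoMat) <- list(c("endog1", "endog2"), c("endog1", "endog2"))

# draw latent residuals and add them to the predicted parts
uniqueFactors <- data.frame(MASS::mvrnorm(N, mu = rep(0, 2), Sigma = endoMat))
factors$endog1 <- factors$endog1 + uniqueFactors$endog1
factors$endog2 <- factors$endog2 + uniqueFactors$endog2
```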

Zuzanna Wojcik

May 5, 2021, 9:31:47 AM
to lav...@googlegroups.com
Thank you Terrence!

If the explained variances are < 1 and you want the population slopes to be interpretable as standardized slopes [...]

In my case, the variance of endog2 is larger than one; in fact, when running your script, I get NaNs. Only if I set very low slopes (e.g., 0.2) for the predicted endogenous variable (factors$endog2 <- 0 + .2*factors$esog2 + .2*factors$esog1 + [...]) do I get a variance of endog2 < 1.

So, I deduce that:

If the explained variance is < 1, I have two options:
Option 1: choose any values, e.g., 0.2, 0.3 or 0.5.
Option 2: compute interpretable (standardized) population slopes.

If the explained variance is > 1, I have one option:
Option 1: choose any values, e.g., 0.2, 0.3 or 0.5.

Can I use these general "rules"?




Terrence Jorgensen

May 13, 2021, 10:51:18 PM
to lavaan
If you want to set standardized population parameters, the total variance must be 1, in which case any subset (explained or unexplained) must be < 1.  If your explained variance > 1, then your slopes are not standardized.  They don't have to be.
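
To illustrate the arithmetic, here is a small sketch for one outcome predicted by two factors and their product. It assumes the two predictors are bivariate normal with means 0, variances 1, and correlation r, in which case Var(X*W) = 1 + r^2 and the product is uncorrelated with X and W. The function names explainedVar and residualVar are hypothetical helpers, and the slope values are only examples.

```r
# Explained variance of  b1*X + b2*W + b3*X*W  under the assumptions above
explainedVar <- function(b1, b2, b3, r) {
  b1^2 + b2^2 + 2 * b1 * b2 * r + b3^2 * (1 + r^2)
}

# Residual variance needed for a total variance of 1;
# NA signals that the slopes cannot be standardized
residualVar <- function(b1, b2, b3, r) {
  ev <- explainedVar(b1, b2, b3, r)
  if (ev >= 1) NA_real_ else 1 - ev
}

explainedVar(.7, .7, .5, r = .3)  # 1.5465: exceeds 1, so slopes are unstandardized
explainedVar(.3, .3, .2, r = .3)  # 0.2776: leaves a residual variance of 0.7224
residualVar(.3, .3, .2, r = .3)
```

With large slopes the explained variance alone exceeds 1, so no positive residual variance can bring the total to 1; the slopes are then simply on an unstandardized metric.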