Just another post on moderated mediation analysis

Dijana Ostojić

Aug 23, 2025, 3:17:38 AM
to lavaan
Hi everyone,

I have read many threads posted here and watched numerous YouTube tutorials by Regorz Statistics and QuantFish. I have also spent time reviewing other resources, such as the Cookbook for SEM and lavaan.org.

I want to fit a moderated mediation model (Model 59, see the attached picture):
  • X (Personality) is a latent variable measured by 3 observed variables
  • Med (PRS SZ) is a continuous variable
  • Y (Quality of life) is a binary variable
  • Mod (Depression) is a binary variable
I have created interaction terms using semTools::indProd:
depression <- semTools::indProd(
  df,
  var1      = c("personality_1", "personality_2", "personality_3"),
  var2      = "depression_binary",
  match     = FALSE,
  meanC     = TRUE,
  namesProd = c("int_personalityXdep_1", "int_personalityXdep_2", "int_personalityXdep_3"),
  doubleMC  = TRUE
)

# prs_sz × depression (observed interaction)
depression_2 <- semTools::indProd(
  depression,
  var1      = "prs_sz",
  var2      = "depression_binary",
  match     = TRUE,
  meanC     = TRUE,
  namesProd = "int_prsXdep",
  doubleMC  = TRUE
)
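
To make explicit what I understand doubleMC = TRUE to be doing, here is a minimal base R sketch of the double-mean-centering idea on simulated data. It is not semTools' actual implementation, and the names x and m are made-up stand-ins for one personality indicator and the binary moderator:

```r
# Simulated stand-ins (hypothetical data, not my real variables)
set.seed(1)
x <- rnorm(1000, mean = 3)                # e.g., personality_1
m <- rbinom(1000, size = 1, prob = 0.3)   # e.g., depression_binary

# Step 1: mean-center both components
x_c <- x - mean(x)
m_c <- m - mean(m)

# Step 2: form the product, then mean-center the product itself
prod_raw <- x_c * m_c
prod_dmc <- prod_raw - mean(prod_raw)

mean(prod_dmc)  # zero up to floating-point error
```

If this reading is right, the moderator enters the product in centered form, so a value of 0 on that scale corresponds to the sample mean of depression_binary rather than to the non-depressed group.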

The model is defined as:
model <- '
  # latent personality factor
  personality =~ personality_1 + personality_2 + personality_3

  # latent interaction (personality × depression)
  int_personalityXdep =~ int_personalityXdep_1 + int_personalityXdep_2 + int_personalityXdep_3

  # a path moderated by depression (personality x depression)
  prs_sz ~ a1*personality +
           a2*depression_binary +
           a3*int_personalityXdep +
           age + sex

  # b path moderated via observed interaction (prs_sz x depression)
  quality_of_life ~ b1*prs_sz +
                    b2*depression_binary +
                    b3*int_prsXdep +
                    c_prime*personality +
                    c_mod*int_personalityXdep +
                    age + sex

  # conditional indirect effect
  indirect_dep0 := a1 * b1
  indirect_dep1 := (a1 + a3) * (b1 + b3)

  # conditional direct  effect
  direct_dep0 := c_prime
  direct_dep1 := c_prime + c_mod

  # conditional total  effect
  total_dep0 := indirect_dep0 + direct_dep0
  total_dep1 := indirect_dep1 + direct_dep1

  # error covariances
  int_personalityXdep_1 ~~ int_personalityXdep_2 + int_personalityXdep_3
'
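
For reference, the arithmetic behind the conditional indirect effects defined in the model can be sketched in plain R. The coefficient values are made up for illustration (they are not estimates from my data), indirect_at and diff_indirect are hypothetical names, and the sketch assumes the moderator is coded 0/1 in the product terms, i.e., not mean-centered:

```r
# Hypothetical path coefficients (illustration only, not estimates)
a1 <- 0.5; a3 <- 0.2    # X -> Med, and its change when the moderator is 1
b1 <- 0.4; b3 <- -0.1   # Med -> Y, and its change when the moderator is 1

# Conditional indirect effect at moderator value w (assumes w coded 0/1)
indirect_at <- function(w) (a1 + a3 * w) * (b1 + b3 * w)

indirect_dep0 <- indirect_at(0)   # a1 * b1
indirect_dep1 <- indirect_at(1)   # (a1 + a3) * (b1 + b3)

# Difference between the two conditional indirect effects;
# algebraically this equals a1*b3 + a3*b1 + a3*b3
diff_indirect <- indirect_dep1 - indirect_dep0
all.equal(diff_indirect, a1 * b3 + a3 * b1 + a3 * b3)
```

In the lavaan syntax, the same difference could presumably be requested with an extra line such as diff_indirect := indirect_dep1 - indirect_dep0.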
Fit:
model_fit <- sem(
  model,
  data = depression_2,
  estimator = "WLSMV",
  parameterization = "theta",
  ordered = "quality_of_life",
  meanstructure = TRUE
)


summary(model_fit, fit.measures = TRUE, standardized = TRUE, ci = TRUE)

QUESTIONS:
1) Are error covariances required? I specified them as:
  int_personalityXdep_1 ~~ int_personalityXdep_2 + int_personalityXdep_3

Do I need to include any other error covariances?

2) I have obtained extremely high chi-square statistics:
lavaan 0.6-19 ended normally after 129 iterations

  Estimator                                       DWLS
  Optimization method                           NLMINB
  Number of model parameters                        37

  Number of observations                         39954

Model Test User Model:

  Test statistic                            175232.060
  Degrees of freedom                                38
  P-value (Chi-square)                           0.000

Model Test Baseline Model:

  Test statistic                            224160.775
  Degrees of freedom                                28
  P-value                                        0.000

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    0.218
  Tucker-Lewis Index (TLI)                       0.424

Root Mean Square Error of Approximation:

  RMSEA                                          0.340
  90 Percent confidence interval - lower         0.338
  90 Percent confidence interval - upper         0.341
  P-value H_0: RMSEA <= 0.050                    0.000
  P-value H_0: RMSEA >= 0.080                    1.000

Standardized Root Mean Square Residual:

  SRMR                                           0.025
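
As a sanity check, the reported fit indices can be reproduced from the test statistics in the lavaan output above. This is a sketch using the textbook formulas, which I am assuming here; lavaan's internals may differ in detail (for example, N versus N - 1 in the RMSEA):

```r
# Values copied from the lavaan output above
chisq_user <- 175232.060; df_user <- 38
chisq_base <- 224160.775; df_base <- 28
n <- 39954

# Noncentrality estimates (chi-square minus df, floored at zero)
ncp_user <- max(chisq_user - df_user, 0)
ncp_base <- max(chisq_base - df_base, 0)

# Textbook formulas (assumed, see note above)
rmsea <- sqrt(ncp_user / (df_user * (n - 1)))
cfi   <- 1 - ncp_user / max(ncp_base, ncp_user)
tli   <- ((chisq_base / df_base) - (chisq_user / df_user)) /
         ((chisq_base / df_base) - 1)

round(c(rmsea = rmsea, cfi = cfi, tli = tli), 3)
```

The three values round to 0.340, 0.218, and 0.424, matching the output.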
I believe the high chi-square is partly due to the large sample size (N = 39,954) and partly due to multicollinearity between the interaction terms. The largest modification indices (MIs) involve the interaction terms:

                lhs op             rhs         mi    epc sepc.lv sepc.all sepc.nox
int_personalityXdep  ~     int_prsXdep 162426.457 -0.187  -0.447   -0.735   -0.447
int_personalityXdep  ~ quality_of_life 128758.399 -3.286  -7.832   -8.568   -8.568

Any suggestions on how to solve this issue?

3) Another moderator that I wish to include in the model is extremely imbalanced (Mod_2: 0 = 39,704; 1 = 250). I assume that including such a moderator in the analysis would be problematic. Is there still a way (other than undersampling or oversampling) to examine its effect?

4) If some of the variables violate the assumption of linearity and I decide to include quadratic terms, at what level should these quadratic terms enter the model specification? Do I have to create interaction terms based on the quadratic terms using indProd?

5) Are my conditional effects defined appropriately?

Thanks!

Kind regards,

Dijana

Model_59.png