Best way to report indirect effects in a complex Structural Equation Model (SEM) with a large number of variables and paths


Charly Marie

Nov 25, 2024, 5:20:46 AM
to lavaan
Dear community,

I posted a question on CrossValidated a few days ago (which may not have been the best place to start). So I am reproducing my question below.

I am fitting a complex structural equation model (SEM) with 4 latent and 2 observed variables (15 direct and 36 indirect effects). Importantly, in this model, some indirect effects from a given predictor to a given outcome are positive and some are negative (e.g., total effect = 5, but indirect effect 1 = 50, indirect effect 2 = -40, and indirect effect 3 = -5).

To make my analysis easier to understand, I thought about plotting the indirect effects of interest so that the reader could see the main conclusions at a glance.

I have considered several ways of doing this, all of which have advantages and disadvantages, and I cannot decide which is the most appropriate:

  1. Plot the raw indirect effects with the total effect. Pros: It reports all raw results. Cons: In the case of suppression effects, it may be difficult to understand how much the most important indirect pathways contribute to the total effect.
  2. Plot the percentage of a given effect in the total effect. Advantages: Allows the reader to understand the contribution of each indirect effect to the total effect. Disadvantages: Some percentages may be greater than 100 due to suppression effects. This was already mentioned here.
  3. Solution two, but using the absolute total effect. In this solution, I would calculate the absolute total effect by summing the absolute value of the direct and indirect effects. Advantages: I can show the contribution of each mediation to the absolute total effect, with the direction of that mediation. Disadvantages: I sense disadvantages, but I cannot put my finger on them. Hence the question.

Each solution answers a different question, with advantages and disadvantages. I have come to mixed conclusions on my own, and definitely need an outside view on what you think is the most important information to convey to the reader in the main text (other interesting solutions may still find their way into the appendix).
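A quick base R check of the arithmetic behind options 2 and 3 may help here. It is only a sketch: it uses the example numbers from the post (indirect effects of 50, -40, and -5, total effect of 5), and the variable names are made up for illustration.

```r
# Example values from the post. The direct effect is implicitly 0 here,
# because the three indirect effects already sum to the total of 5.
indirect <- c(ind1 = 50, ind2 = -40, ind3 = -5)
total <- sum(indirect)

# Option 2: each effect as a percentage of the signed total effect
pct_of_total <- 100 * indirect / total

# Option 3: each effect as a signed percentage of the "absolute total",
# i.e. the sum of the absolute values of all effects
abs_total <- sum(abs(indirect))
pct_of_abs_total <- 100 * indirect / abs_total

print(round(pct_of_total, 1))
print(round(pct_of_abs_total, 1))
```

Option 2 yields 1000%, -800%, and -100%. Option 3 yields bounded shares (about 52.6%, -42.1%, and -5.3%), but its denominator of 95 is not a quantity the model estimates, and the signed shares add up to about 5.3% rather than 100%.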

Yago Luksevicius de Moraes

Nov 26, 2024, 9:06:44 AM
to lavaan
Hi, Charly

Sorry, but I cannot even imagine how you fit a mediation model with 4 latent and 2 manifest variables. The only model I know that can have more latent than manifest variables is the APE model for heritability estimation, and mediation makes no sense in that case.
Can you share a graphical representation of your model and/or its lavaan syntax?

Best regards,
Yago

Christian Arnold

Nov 26, 2024, 5:25:09 PM
to lav...@googlegroups.com
Hi Yago,

Why should it be a problem to fit a model with 4 latent and 2 manifest variables and why doesn't mediation make sense?

Best

Christian 


--
You received this message because you are subscribed to the Google Groups "lavaan" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lavaan+un...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/lavaan/1932906d-baf7-4ef6-822c-d7f5f329bdc8n%40googlegroups.com.

Rönkkö, Mikko

Nov 26, 2024, 11:54:08 PM
to lav...@googlegroups.com

Hi,

 

The original post probably meant that there are 6 variables of interest in the model, of which 2 are directly measured, and 4 are latent measured with indicators. Not that there are 4 latent variables that are measured with a total of 2 indicators.

 

To answer the original question, I do not think calculating sums of absolute effects makes sense. Assuming all variables can be scaled so that they can be meaningfully compared, I would do a forest plot where you group the effects by predictor and color them by the kind of effect (direct, indirect, or total).
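A minimal ggplot2 sketch of such a forest plot could look like the following. The data frame, its column names, and every number in it are hypothetical placeholders, not results from any fitted model; only the layout (effects grouped by predictor, colored by effect type) reflects the suggestion above.

```r
library(ggplot2)

# HYPOTHETICAL placeholder values, not estimates from a real model
effects <- data.frame(
  predictor = rep(c("latent_1", "latent_2"), each = 3),
  path      = rep(c("direct", "indirect via latent3", "total"), times = 2),
  type      = rep(c("Direct", "Indirect", "Total"), times = 2),
  est       = c(0.30, -0.12, 0.18, 0.10, 0.15, 0.25),
  ci.lower  = c(0.10, -0.25, 0.02, -0.05, 0.04, 0.08),
  ci.upper  = c(0.50,  0.01, 0.34,  0.25, 0.26, 0.42)
)

# One panel per predictor, one row per effect, colour by effect type
forest_plot <- ggplot(effects, aes(x = est, y = path, colour = type)) +
  geom_vline(xintercept = 0, linetype = "dashed", colour = "grey50") +
  geom_pointrange(aes(xmin = ci.lower, xmax = ci.upper)) +
  facet_grid(predictor ~ ., scales = "free_y", space = "free_y") +
  labs(x = "Standardized estimate", y = NULL, colour = "Effect type") +
  theme_minimal()

print(forest_plot)
```

Keeping positive and negative indirect effects on the same signed axis shows suppression directly, without needing an absolute-value denominator.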

 

Best regards,

 

Mikko

 

Christian Arnold

Nov 27, 2024, 4:14:09 PM
to lav...@googlegroups.com
Of course it was not about 2 manifest variables measuring 4 latent variables. That was clearly recognizable from the context. Hence my question to Yago about his confusing comments.

Regarding the original question: What exactly should be documented?



Charly Marie

Nov 28, 2024, 6:20:05 AM
to lavaan
Dear all,

To Yago,

I may have explained my model poorly. Here is a simplified syntax to make it easier to understand:

model <- '
observed_outcome_end ~ latent_1 + latent_2 + latent3 + latent4 + observed_outcome_middle
observed_outcome_middle~ latent_1 + latent_2 + latent3 + latent4

latent4 ~ latent_1 + latent_2 + latent3 
latent3 ~ latent_1 + latent_2 
latent_2 ~ latent_1
'
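For a recursive structural model like the one above, total and indirect effects follow from the matrix of direct effects: total = (I - B)^-1 - I, and indirect = total - direct. The base R sketch below illustrates this with the six variables from the simplified syntax. The coefficient value of 0.2 on every path is a purely hypothetical placeholder, and the object names are made up for illustration.

```r
# Variables in causal order, matching the simplified syntax above
vars <- c("latent_1", "latent_2", "latent3", "latent4",
          "observed_outcome_middle", "observed_outcome_end")

# B[i, j] holds the direct effect of variable j on variable i.
# HYPOTHETICAL: each of the 15 direct paths is set to an arbitrary 0.2.
B <- matrix(0, nrow = 6, ncol = 6, dimnames = list(vars, vars))
B[lower.tri(B)] <- 0.2

# Recursive model: total = (I - B)^-1 - I, indirect = total - direct
I6 <- diag(6)
total_effects <- solve(I6 - B) - I6
dimnames(total_effects) <- dimnames(B)
indirect_effects <- total_effects - B

# Effects of latent_1 on the final outcome
print(round(total_effects["observed_outcome_end", "latent_1"], 4))
print(round(indirect_effects["observed_outcome_end", "latent_1"], 4))
```

Each entry of `indirect_effects` is the sum over all indirect pathways between two variables, so positive and negative pathways can cancel inside it. Individual pathways still have to be defined separately if they are to be reported one by one.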

To Mikko,

Thanks for your feedback. I am also doubtful about calculating sums of absolute effects, hence this post. I am thinking more and more about plotting standardized direct and indirect effects directly.
 
To Christian,

You ask what to document: I would like to report clearly and understandably the mediation effects of my model, which contains positive and negative mediation (suppression) effects. The model is so complex (15 direct effects and thus many more indirect effects) that I thought about plotting the direct and indirect effects, grouped by predictor and outcome, to make them easier to understand and reporting the tables in the Supplementary Materials. 

Because of the suppression effects, I wondered if calculating the absolute effects might help readers understand the role of each effect in the overall effect. The more I think about it, the less I appreciate this solution, but I have not managed to put it into words.

Thank you all for your answers,

Charly 

Charly Marie

Feb 6, 2025, 5:40:28 AM
to lavaan
Hello, and thank you all for your feedback, which helped me weigh the pros and cons.

After many attempts, I have decided to show all direct, indirect, and total standardized effects in one figure. I think this gives the reader all the information they need to understand the results of the SEM.

For interested readers, here is R code to do this, with only one indirect effect. For the full complex SEM described above, the R code and data will be made available on OSF.

I use the built-in PoliticalDemocracy dataset.

    # Fit a SEM model
    library(lavaan)
   
    model <- '
    # measurement model
      ind60 =~ x1 + x2 + x3
      dem60 =~ y1 + y2 + y3 + y4
      dem65 =~ y5 + y6 + y7 + y8
     
    # regressions
      # direct effect
        dem65 ~ c*ind60
       
      # mediator
        dem60 ~ a*ind60
        dem65 ~ b*dem60
       
      # indirect effect (a*b)
        ind60_dem60_dem65 := a*b
       
      # total effect
        total := c + (a*b)
    '
   
    fit_model <- sem(model, PoliticalDemocracy)


Here is the code to plot the effects

    # Plot the standardized direct and indirect effects
    library(tidyverse)
    library(semTools)
   
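    # nRep is set explicitly so that it matches the 9,999 repetitions stated in the caption
    monteCarloCI(fit_model, nRep = 9999, standardized = TRUE) %>%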
      rownames_to_column(var = "Indirect_Paths") %>%
      mutate(Indirect_Paths = case_when(
        Indirect_Paths == "ind60_dem60_dem65" ~ "Ind in 1960 → Dem in 1960 → Dem in 1965",
        Indirect_Paths == "total" ~ "Ind in 1960 → Dem in 1965\n(Total effect)")) %>%
      bind_rows( # Direct effect, copied by hand from standardizedSolution(fit_model, ci = TRUE)
        tibble(
          Indirect_Paths = "Ind in 1960 → Dem in 1965\n(Direct effect)",
          est.std = 0.146,
          ci.lower = 0.008,
          ci.upper = 0.283
        )
      ) %>%
   
      ggplot(aes(x = Indirect_Paths, y = est.std)) +
      geom_hline(yintercept = 0, linetype = "dashed", color = "grey", linewidth = 0.5, alpha = 0.8) +
      geom_point(size = 0.5) +
      geom_errorbar(colour = "black", linewidth = 0.3,
                    aes(ymin = ci.lower, ymax = ci.upper),
                    position = position_dodge(width = 0.75),
                    width = 0.25) +
      coord_flip() +
      geom_text(aes(label = ifelse(round(ci.lower, 3) * round(ci.upper, 3) > 0, glue::glue("{round(est.std, 2)}*"), glue::glue("{round(est.std, 2)}"))),
                size = 2.5, vjust = -1.5) +
      theme_minimal() +  
      labs(x = "Path",
           y = "Standardized estimate",
           caption = paste(
             "Direct effect 95% confidence intervals calculated with the delta method.",
             "Indirect and total effects 95% confidence intervals calculated from 9,999 Monte Carlo repetitions.",
             "* Asterisks indicate that the 95% confidence interval does not include zero at the three-digit level.",
             sep = "\n")) +
      theme(axis.text.y = element_text(size = 6),
            axis.title.y = element_text(size = 8),
           
            axis.text.x = element_text(size = 6),
            axis.title.x = element_text(size = 8),
            plot.caption = element_text(size = 5, face = "italic")) +
      expand_limits(y = c(-0.1))
   
    # Export the plot
    ggplot2::ggsave("Plot.jpeg", height = 3, width = 4)
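The direct-effect row in the plotting code above is typed in by hand. As a possible alternative, the sketch below pulls the same row from `standardizedSolution()` programmatically. It refits the model from the post so that it runs on its own; the object name `direct_effect` is made up for illustration.

```r
library(lavaan)

# Same model as in the post, repeated so the sketch is self-contained
model <- '
  ind60 =~ x1 + x2 + x3
  dem60 =~ y1 + y2 + y3 + y4
  dem65 =~ y5 + y6 + y7 + y8

  dem65 ~ c*ind60
  dem60 ~ a*ind60
  dem65 ~ b*dem60

  ind60_dem60_dem65 := a*b
  total := c + (a*b)
'
fit_model <- sem(model, data = PoliticalDemocracy)

# Standardized estimates with delta-method confidence intervals
std <- standardizedSolution(fit_model, ci = TRUE)

# Keep only the direct path dem65 ~ ind60 and the columns used in the plot
is_direct <- std$lhs == "dem65" & std$op == "~" & std$rhs == "ind60"
direct_effect <- as.data.frame(std[is_direct, c("est.std", "ci.lower", "ci.upper")])
direct_effect$Indirect_Paths <- "Ind in 1960 → Dem in 1965\n(Direct effect)"

print(direct_effect)
```

A data frame built this way could be passed to `bind_rows()` in place of the hand-typed `tibble()`, so the figure stays in sync if the model or data change.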


This will give you the attached plot.
Plot.jpeg
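The figure caption in the code above refers to Monte Carlo confidence intervals for the indirect and total effects. For readers unfamiliar with the idea, here is a simplified base R sketch of the principle for a single product a*b. All numbers are hypothetical, and the sketch assumes zero sampling covariance between a and b. As far as I understand, `semTools::monteCarloCI()` instead draws from the full estimated covariance matrix of the parameters, so this is an illustration and not a replacement.

```r
set.seed(1)

# HYPOTHETICAL estimates and standard errors for paths a and b
a_hat <- 0.5; se_a <- 0.1
b_hat <- 0.4; se_b <- 0.1
n_rep <- 9999

# Assumption: a and b have zero sampling covariance, so draws are independent
a_draws <- rnorm(n_rep, mean = a_hat, sd = se_a)
b_draws <- rnorm(n_rep, mean = b_hat, sd = se_b)

# Simulated sampling distribution of the indirect effect a*b
ab_draws <- a_draws * b_draws

# Percentile 95% confidence interval
mc_ci <- quantile(ab_draws, probs = c(0.025, 0.975))
print(mc_ci)
```

Because the product of two normally distributed estimates is generally not normal, the resulting interval need not be symmetric around a*b, which is one reason to prefer it to a delta-method interval for indirect effects.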