Comparing path coefficients in SEM using lavTestLRT - can I compare standardized or unstandardized values?

Ingo Man

Apr 22, 2023, 5:39:02 AM
to lavaan
After having searched through a number of forum postings and tutorials, I am left with 2 unanswered questions at the moment. I am running a latent regression model with one dependent and 6 independent variables (3 observed variables, 3 latent factors). I am using lavaan 0.6.15.1838 with MLR-estimator. I would now like to see whether regression paths differ statistically from each other, i.e. determine which variable has the greatest effect, and calculate the Chi² differences between the unrestricted and restricted model using the lavTestLRT function.
- Does the result of the model comparison refer to the unstandardised or the standardised coefficients? So, am I allowed to say that the standardised path of V1 differs from that of V2 or am I only allowed to do this for the unstandardised paths? It is generally recommended to compare the unstandardised coefficients (https://stats.stackexchange.com/questions/513552/how-to-test-a-difference-between-two-regression-coefficients-in-sem-cfa-lavaan; https://www.researchgate.net/post/How_do_you_test_for_equality_of_path_coefficients_in_LISREL). But now I have different metrics in the independent variables. Does lavaan automatically include this in the calculations or do I have to do something else? (E.g. set factor variance to 1 instead of working with the reference indicator approach when scaling the latent factors)? According to this paper https://link.springer.com/article/10.3758/s13428-011-0088-6#Tab2 the phantom variable approach seems to be the right one and certainly integrated in lavaan.
- The second question relates to the approach for model comparisons. I would like to know whether all 6 paths are statistically significantly different from each other. So I would do 6 model comparisons by always equating 5 variables and comparing against a freely estimated one? Or do I need specific hypotheses about what I want to test? One possibility would be to sort the effects by size and then make the comparisons successively from the largest to the smallest effect.
Many thanks in advance.

Marcus

Christian Arnold

Apr 22, 2023, 7:15:50 AM
to lavaan
If you equate two or more betas, the constraint refers to the unstandardized coefficients. If you want to do a pairwise comparison of the regression coefficients, you don't need to run chi-square difference tests. For a pairwise comparison you can define the differences as parameters and fit everything in one model:

library(lavaan)

pop.model <- "y ~ 0.30 * x1 + 0.33 * x2 + 0.70 * x3"

set.seed(123)
data <- simulateData(pop.model, sample.nobs = 1000)

model <- "
y ~ b1 * x1 + b2 * x2 + b3 * x3
d12 := b1 - b2
d13 := b1 - b3
d23 := b2 - b3
"

fit <- sem(model, data)

The unstandardized differences:
parameterEstimates(fit)

The standardized results:
standardizedsolution(fit)
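
For readers who prefer the model-comparison route from the original question, here is a minimal sketch of the same pairwise test done with lavTestLRT and with lavTestWald. The object names (dat, free_model, equal_model, fit_free, fit_equal) are made up for this illustration, and the default ML estimator is used; with a robust estimator such as MLR, lavTestLRT should, to my understanding, report a scaled difference test instead. As stated above, the constraint applies to the unstandardized coefficients.

```r
library(lavaan)

set.seed(123)
dat <- simulateData("y ~ 0.30 * x1 + 0.33 * x2 + 0.70 * x3",
                    sample.nobs = 1000)

# Free model: every path gets its own label
free_model  <- "y ~ b1 * x1 + b2 * x2 + b3 * x3"
# Restricted model: sharing the label b forces the x1 and x2 paths to be equal
equal_model <- "y ~ b * x1 + b * x2 + b3 * x3"

fit_free  <- sem(free_model,  data = dat)
fit_equal <- sem(equal_model, data = dat)

# Chi-square difference test with 1 df (one equality constraint)
lrt <- lavTestLRT(fit_free, fit_equal)
print(lrt)

# Wald test of the same constraint, using the free model only
wald <- lavTestWald(fit_free, constraints = "b1 == b2")
print(wald)
```

The Wald test and the z test of d12 in the one-model approach address the same hypothesis, so they should agree closely; the likelihood-ratio statistic is usually similar but not identical in finite samples.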

I have a hard time classifying the questions. What exactly is your ultimate goal? If you want to decompose the effect sizes, you have to keep in mind that the IV are probably correlated. You could try Shapley decomposition, which can probably also be brought into latent space: https://wernerantweiler.ca/blog.php?item=2014-10-10 You might also find something useful in this contribution: https://www.tandfonline.com/doi/full/10.1080/10705511.2021.2025377

HTH

Ingo Man

Apr 22, 2023, 5:41:42 PM
to lavaan
Thank you very much for the helpful answer, that's more or less exactly what I needed. Sorry if I have expressed myself in a misleading way. My goal is to determine whether certain paths in my model differ from each other or not, at least I would like to know whether the two strongest paths differ from each other. I am doing this exploratively, so I don't have a specific hypothesis about it.
I have run the example and it works very nicely; it corresponds to the approach via contrasts (difference variable).
First, I wanted to know if I can also compare the coefficients with different metrics of the output variables in the reference indicator approach for identifying the measurement model, so I don't have to set the factor variance to 1 to identify the measurement model, as described in the link (https://stats.stackexchange.com/questions/513552/how-to-test-a-difference-between-two-regression-coefficients-in-sem-cfa-lavaan):

"Generally, you should use unstandardised estimates when making any statistical inferences in SEM. However, if you want to constrain covariances to equality, the unstandardised variables involved must have the same scale for the constraint to make sense. This is unproblematic here given that you are interested in factor covariances. It seems that you are using the lavaan default where factors inherit the scales of their first indicators. You can change this so that the variances of each factor are set to 1 instead. In lavaan, use the argument std.lv = TRUE when calling the cfa function. This will make it so that the factor covariances of the unstandardized model will be equal to the standardized covariances and can be interpreted as correlations."

This apparently works, as I see from your answer, since it is the unstandardised coefficients that are being equated.
One more question about this: which of the solutions (unstandardized vs. standardized) should I consider and report? In my data, the results differ between the unstandardized and the standardized solution, so that only half of the 6 contrasts are statistically significant in the unstandardized solution compared to the standardized one. Is this a matter of different estimation of the standard errors, or just the issue with the different metrics of the variables?

Thanks also for the tip about the methods for decomposing R². That would have been my next question, as I have built my model hierarchically (3 sub-models) and then wanted to determine whether the increase in R² is statistically significant in each case. But then I can determine that using one of these approaches, great.
Thank you very much for the quick help.

Shu Fai Cheung

Apr 22, 2023, 7:21:14 PM
to lav...@googlegroups.com
This is an interesting scenario. How about creating a reproducible example of your model and data, with simulated data and some random names like x1, x2, ...? How to address your questions may depend on other factors, especially when the model involves latent variables. The following example illustrates the complexity in a similar scenario.

Also note that std.lv works differently for a model with structural paths. In the first version, fit, with std.lv = TRUE, the error term of the dependent latent factor (visual) is standardized, while the dependent latent factor itself is not.

# Adapted from the example of cfa()
library(lavaan)
#> This is lavaan 0.6-15
#> lavaan is FREE software! Please report any bugs.
HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9
              visual ~ a1 * textual + a2 * speed
              diff := a2 - a1'

fit <- sem(HS.model, data = HolzingerSwineford1939,
           std.lv = TRUE)
parameterEstimates(fit)[c(10, 11, 21:25), ]
#>        lhs op     rhs label   est    se     z pvalue ci.lower ci.upper
#> 10  visual  ~ textual    a1 0.434 0.096 4.510  0.000    0.246    0.623
#> 11  visual  ~   speed    a2 0.455 0.110 4.129  0.000    0.239    0.671
#> 21  visual ~~  visual       1.000 0.000    NA     NA    1.000    1.000
#> 22 textual ~~ textual       1.000 0.000    NA     NA    1.000    1.000
#> 23   speed ~~   speed       1.000 0.000    NA     NA    1.000    1.000
#> 24 textual ~~   speed       0.283 0.069 4.117  0.000    0.148    0.418
#> 25    diff :=   a2-a1  diff 0.021 0.153 0.135  0.893   -0.279    0.320
standardizedSolution(fit)[c(10, 11, 21:25), ]
#>        lhs op     rhs label est.std    se     z pvalue ci.lower ci.upper
#> 10  visual  ~ textual    a1   0.354 0.069 5.127  0.000    0.218    0.489
#> 11  visual  ~   speed    a2   0.370 0.076 4.850  0.000    0.221    0.520
#> 21  visual ~~  visual         0.664 0.071 9.355  0.000    0.525    0.803
#> 22 textual ~~ textual         1.000 0.000    NA     NA    1.000    1.000
#> 23   speed ~~   speed         1.000 0.000    NA     NA    1.000    1.000
#> 24 textual ~~   speed         0.283 0.069 4.117  0.000    0.148    0.418
#> 25    diff :=   a2-a1  diff   0.017 0.124 0.135  0.893   -0.227    0.260

fit2 <- sem(HS.model, data = HolzingerSwineford1939,
            std.lv = FALSE)
parameterEstimates(fit2)[c(10, 11, 21:25), ]
#>        lhs op     rhs label   est    se     z pvalue ci.lower ci.upper
#> 10  visual  ~ textual    a1 0.321 0.067 4.776  0.000    0.190    0.453
#> 11  visual  ~   speed    a2 0.538 0.130 4.152  0.000    0.284    0.792
#> 21  visual ~~  visual       0.537 0.117 4.582  0.000    0.307    0.767
#> 22 textual ~~ textual       0.979 0.112 8.737  0.000    0.760    1.199
#> 23   speed ~~   speed       0.384 0.086 4.451  0.000    0.215    0.553
#> 24 textual ~~   speed       0.173 0.049 3.518  0.000    0.077    0.270
#> 25    diff :=   a2-a1  diff 0.216 0.161 1.348  0.178   -0.098    0.531
standardizedSolution(fit2)[c(10, 11, 21:25), ]
#>        lhs op     rhs label est.std    se     z pvalue ci.lower ci.upper
#> 10  visual  ~ textual    a1   0.354 0.069 5.127  0.000    0.218    0.489
#> 11  visual  ~   speed    a2   0.370 0.076 4.850  0.000    0.221    0.520
#> 21  visual ~~  visual         0.664 0.071 9.355  0.000    0.525    0.803
#> 22 textual ~~ textual         1.000 0.000    NA     NA    1.000    1.000
#> 23   speed ~~   speed         1.000 0.000    NA     NA    1.000    1.000
#> 24 textual ~~   speed         0.283 0.069 4.117  0.000    0.148    0.418
#> 25    diff :=   a2-a1  diff   0.017 0.124 0.135  0.893   -0.227    0.260
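
The pattern in these tables can also be checked programmatically. The following is only a sketch that refits the two models above and compares the contrast; the helper name get_diff is made up for this illustration. It confirms what the printed output shows: the unstandardized diff depends on how the latent variables are scaled, while the standardized diff does not.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9
              visual ~ a1 * textual + a2 * speed
              diff := a2 - a1 '

fit  <- sem(HS.model, data = HolzingerSwineford1939, std.lv = TRUE)
fit2 <- sem(HS.model, data = HolzingerSwineford1939, std.lv = FALSE)

# Hypothetical helper: pull the row of the defined parameter "diff"
get_diff <- function(tab) tab[tab$op == ":=" & tab$lhs == "diff", ]

# Unstandardized contrast: differs between the two scaling choices
unstd <- c(std.lv_TRUE  = get_diff(parameterEstimates(fit))$est,
           std.lv_FALSE = get_diff(parameterEstimates(fit2))$est)

# Standardized contrast: the same under both scaling choices
std <- c(std.lv_TRUE  = get_diff(standardizedSolution(fit))$est.std,
         std.lv_FALSE = get_diff(standardizedSolution(fit2))$est.std)

print(round(unstd, 3))
print(round(std, 3))
```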

Regards,
Shu Fai Cheung (張樹輝)



Christian Arnold

Apr 23, 2023, 7:35:17 AM
to lavaan
Hi Marcus and Shu Fai,

here are my 2 cents, which may be a bit off from your question (hopefully not too far). Others may strongly disagree, but it's nice and perhaps helpful to have a discussion like this. If you want to focus on the effect size (almost a definitional question) of each IV on the DV, then I would compare the standardized differences. The squared standardized beta can be interpreted as unique variance explained. See here in a slightly different context: https://quantpsy.org/pubs/lachowicz_preacher_kelley_2018.pdf. Implemented in the MBESS package: https://cran.r-project.org/web/packages/MBESS/index.html

I am seriously unsure whether this approach is ultimately satisfactory. Let's consider a simple case: multiple regression with 3 IVs explaining the DV. R^2 can then be calculated as follows:

R^2 = b1^2 + b2^2 + b3^2 + 2 * c12 * b1 * b2 +2 * c13 * b1 * b3 + 2 * c23 * b2 * b3

b... are the standardized betas and c... are the correlations between the corresponding IV. A demo with lavaan is attached below.

Unique variance explained may be defined as b...^2 (see Lachowicz, Preacher, Kelley linked above). Unfortunately, the b... also appear in other parts of the formula. It becomes particularly interesting if, for example, one of the b... is negative and a suppressor is present.

Against this (hopefully reasonably correctly) explained background, a number of methods have been developed to determine the relative effect sizes (importance) of the IV. The idea is (roughly) as follows: IV1 explains x1%, ... IVn explains xn% and sum(x1, ..., xn) = 1. Resampling (usually bootstrap) can be used to test whether x1, ... xn differ significantly. The main methods for this idea can be found in the relaimpo package: https://cran.r-project.org/web/packages/relaimpo/index.html

The bad news: relaimpo does not work with latent variables. I haven't read the article linked above (Gu, 2022, Assessing the Relative Importance of Predictors in Latent Regression Models), but it sounds like Shapley decomposition was implemented there, which is one of the methods to determine the relative importance (beware of my half-knowledge). So that *might* be interesting to you. Detached from the LV problem, here's my immature take on Shapley decomposition: the method is theoretically sound since it is based on the work of Shapley, and thus draws on ideas from cooperative game theory. Shapley himself did not develop the method for multiple regression, and I have doubts that all of Shapley's basic assumptions are actually met. Nevertheless, the method seems to produce quite useful results (which can certainly be viewed critically, though).


HTH

Christian



library(lavaan)

pop.model <- "
y ~ 0.30 * x1 + 0.4 * x2 + 0.3 * x3
x1 ~~ 0.7 * x2 + 0.3 * x3
x2 ~~ 0.4 * x3

"

set.seed(123)
data <- simulateData(pop.model, sample.nobs = 1000)

model <- "
y ~ b1 * x1 + b2 * x2 + b3 * x3
x1 ~~ c12 * x2 + c13 * x3
x2 ~~ c23 * x3
"

fit <- cfa(model, data)

# lavaan result
lavInspect(fit, "rsquare")["y"]    

# By hand
pe  <- parameterEstimates(fit, standardized = TRUE)
b1  <- pe[pe$label == "b1", "std.all"]
b2  <- pe[pe$label == "b2", "std.all"]
b3  <- pe[pe$label == "b3", "std.all"]
c12 <- pe[pe$label == "c12", "std.all"]
c13 <- pe[pe$label == "c13", "std.all"]
c23 <- pe[pe$label == "c23", "std.all"]

b1^2 + b2^2 + b3^2 + 2 * c12 * b1 * b2 +2 * c13 * b1 * b3 + 2 * c23 * b2 * b3
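
To make the relative-importance idea concrete for observed variables, here is a rough base-R sketch of the averaging-over-orderings (Shapley/LMG-type) decomposition of R^2 for three predictors. It is not the relaimpo implementation and not the latent-variable method from the linked article; the function names r2 and lmg_share and the simulated data are made up for this illustration.

```r
set.seed(123)
n  <- 1000
x1 <- rnorm(n)
x2 <- 0.7 * x1 + sqrt(1 - 0.7^2) * rnorm(n)   # correlated with x1
x3 <- 0.3 * x1 + sqrt(1 - 0.3^2) * rnorm(n)   # correlated with x1
y  <- 0.3 * x1 + 0.4 * x2 + 0.3 * x3 + rnorm(n)
dat <- data.frame(y, x1, x2, x3)

preds <- c("x1", "x2", "x3")

# R^2 of a regression of y on a subset of predictors (0 for the empty set)
r2 <- function(vars) {
  if (length(vars) == 0) return(0)
  summary(lm(reformulate(vars, response = "y"), data = dat))$r.squared
}

# All 3! = 6 orderings of the predictors
orders <- rbind(c(1, 2, 3), c(1, 3, 2), c(2, 1, 3),
                c(2, 3, 1), c(3, 1, 2), c(3, 2, 1))

# Average gain in R^2 when a predictor enters, over all orderings
lmg_share <- sapply(preds, function(p) {
  gains <- apply(orders, 1, function(o) {
    ord    <- preds[o]
    before <- ord[seq_len(which(ord == p) - 1)]
    r2(c(before, p)) - r2(before)
  })
  mean(gains)
})

print(round(lmg_share, 3))
print(sum(lmg_share))   # adds up to the full-model R^2
print(r2(preds))
```

Dividing lmg_share by its sum gives shares that add up to 1. Testing whether the shares differ would still require resampling, for example a bootstrap around this calculation.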



Christian Arnold

Apr 23, 2023, 8:23:36 AM
to lavaan
Oh dear, that may be misleading: "The idea is (roughly) as follows: IV1 explains x1%, ... IVn explains xn% and sum(x1, ..., xn) = 1". Sorry. x1%, .... xn% sum up to 1, because they should represent the contribution to R^2.  

Ingo Man

Jul 13, 2023, 6:15:34 AM
to lavaan
Dear Shu Fai, dear Christian, first of all, please excuse my late reply. I was on parental leave and have only now been able to continue working on the article.
For the publication, it was completely sufficient to test whether the unstandardized beta coefficients differ from each other using the contrast approach. I worked with std.lv = TRUE and now report standardised and unstandardised coefficients. Yes, my model involves latent as well as manifest variables. I cannot delve deeper into the matter at this point, but I am aware that I always need to check whether I am using the unstandardised or the standardised solution.

The individual contribution to R² is something I will look into further when I get the chance. I got Gu's paper (2021) and read it; unfortunately I don't have it digitally, so the R code would have to be typed out or reproduced via text recognition to try it out.

You have helped me a lot. Blessings.
Marcus

Jošt Bartol

Jul 13, 2023, 7:04:28 AM
to lav...@googlegroups.com
I really got interested in the part "I have different metrics in the independent variables" and how to compare the unstandardized effects of such variables. From the preceding discussion, I do not really see how this was solved (not to criticize, I just find this useful). I wonder whether a very simple solution could be to scale all variables to the same metric, for example so that all of them range from 1 to 100. Would this work?

Regards,
Jošt


Keith Markus

Jul 14, 2023, 8:59:04 AM
to lavaan
Jost,
You may find the article below of interest. It proposed coding continuous causal variables as the proportion of their maximum possible change.

Cohen, P., Cohen, J., Aiken, L. S., & West, S. G. (1999). The problem of units and the circumstance for POMP. Multivariate Behavioral Research, 34(3), 315-346.

Keith
------------------------
Keith A. Markus
John Jay College of Criminal Justice, CUNY
http://jjcweb.jjay.cuny.edu/kmarkus
Frontiers of Test Validity Theory: Measurement, Causation and Meaning.
http://www.routledge.com/books/details/9781841692203/

Prof. Gavin Brown

Jul 14, 2023, 1:00:55 PM
to lavaan
Hi Jost
I think Keith is introducing a solution that was documented quite some time earlier:
Hovland, C. I., Lumsdaine, A. A., & Sheffield, F. D. (1955). A baseline for measurement of percentage change. In P. F. Lazarsfeld & M. Rosenberg (Eds.), The language of social research: A reader in the methodology of social research (pp. 77-82). The Free Press.
This chapter says much the same as what I understood Keith to say: take the gain as a proportion of the possible distance to the maximum.
On a 100-point scale, a score of 90 has a possible increase of 10, while a score of 60 has 40 possible points.
If the former goes up 5 points, that is 50% of the possible gain. If the latter goes up 10 points, that is only 25% of the possible gain.
So it's a way to compare differences on a common scale: the proportion of maximum possible gain.
cheers
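
The arithmetic in this example can be written out as a small sketch. The helper name gain_share and the default maximum of 100 are assumptions made for this illustration, not something taken from the cited chapter.

```r
# Hypothetical helper: gain as a share of the room left below the scale maximum
gain_share <- function(before, after, max_score = 100) {
  (after - before) / (max_score - before)
}

share_high <- gain_share(90, 95)   # 5 of 10 possible points
share_low  <- gain_share(60, 70)   # 10 of 40 possible points

print(c(share_high, share_low))
```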

Keith Markus

Jul 15, 2023, 8:56:05 AM
to lavaan
Hi Gavin.  Thanks for that.  I checked the references in the Cohen et al. article, and they did not cite Hovland et al., suggesting that they were not aware of the precedent either, nor were their reviewers.  The only difference is one of scale: rather than 0 to 100, they used 0 to 1.  The motivation was situations where they would have a mix of continuous and dichotomous variables in a linear model and the regression weight for a dichotomous variable was larger, leading reviewers to object that it was more important.  However, the dichotomous variable can only change once from 0 to 1, whereas the continuous variable can increase several times over, let's say six times to be concrete.  So, by re-scaling the continuous variable to the 0 to 1 scale, the effect coefficient/regression weight now represents that sixfold possible increase rather than just one sixth of the possible increase in the variable.  This convinced the reviewers that focusing on the continuous variable had more utility than focusing on the dichotomous one.
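
A minimal sketch of this rescaling, assuming simulated data, a predictor on a 1 to 7 scale (so six possible unit increases), and a made-up helper pomp(). It only illustrates how the regression weight changes with the metric; it is not code from the Cohen et al. article.

```r
set.seed(1)
n <- 500
x <- sample(1:7, n, replace = TRUE)   # predictor on a 1-7 scale
d <- rbinom(n, 1, 0.5)                # dichotomous predictor coded 0/1
y <- 0.2 * x + 0.5 * d + rnorm(n)

# Hypothetical helper: proportion of maximum possible, on a 0-1 scale
pomp <- function(v, min_possible, max_possible) {
  (v - min_possible) / (max_possible - min_possible)
}

x01 <- pomp(x, min_possible = 1, max_possible = 7)

b_raw  <- coef(lm(y ~ x + d))     # weight of x per scale point
b_pomp <- coef(lm(y ~ x01 + d))   # weight of x01 across the full range

print(round(b_raw, 3))
print(round(b_pomp, 3))
```

After rescaling, the weight of x01 is six times the weight of x, while the weight of d is unchanged, so both coefficients now describe a move from the lowest to the highest possible value. In a lavaan model the observed variables would be rescaled in the same way before fitting.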

Jošt Bartol

Jul 17, 2023, 4:28:59 AM
to lav...@googlegroups.com
Keith and Gavin,

thanks for your suggestions and comments. I am currently not facing these problems, I just posted my inquiry out of curiosity. But I'm sure this will benefit me (and probably also others) in the future!

Thanks again,
Jošt


Jizhou Francis YE

Jul 17, 2023, 4:49:57 AM
to lav...@googlegroups.com
Hi, Jost

Currently, our team is exploring methods to enhance the accuracy and efficiency of coefficient interpretation, particularly in the context of transforming data from the original scale to a normalized 0-1 scale using min-max normalization. We believe that this approach is similar to the one presented in Cohen et al. (1999).

We have published some articles that you may find useful as references:
  1. Liu, LP., Ye, JF., Ao, HS., Sun, SX., Zheng, Y., Li, QR., Feng, GC., Wang, HY., Zhao, X. (2023). Effects of Electronic Personal Health Information Technology on American Women's Cancer Screening Behaviors Mediated through Cancer Worry: Differences and Similarities between 2017 and 2020. Digital Health, 9: 1–12.
  2. Liu, LP., Chang, A., Liu, MT., Ye, JF., Jiao W., Ao, HS., Hu, WR., Xu, K., Zhao, X. (2023). Effect of Information Encountering on Health Eating Concerns –Mediated through Body Comparison and Moderated by Body Mass Index or Body Satisfaction. BMC Public Health, 23, 254.
  3. Liu, LP., Zhao, X., Ye, JF. (2022). The Effect of the Use of Patient-Accessible Electronic Health Record Portals on Cancer Survivors’ Health Outcomes: Cross-sectional Survey Study. Journal of Medical Internet Research, 24 (10), e39614.
Bests,
Francis, Jizhou YE
University of Macau


Jošt Bartol

Jul 24, 2023, 6:00:08 AM
to lav...@googlegroups.com
Francis,

thanks for all the information and the linked articles, will have a look! :)

Kind regards,
Jošt
