Does one’s SEM latent variable scaling convention (unit variance vs. marker variable) impact the interpretation of unstandardized indirect effects?


Nick Rosemarino

Dec 23, 2022, 4:22:52 AM
to lavaan

Hi all,

This is my first time running a structural equation model, and my question involves understanding whether my approach to setting the scale of the latent factors has any implications for if or how I should interpret the unstandardized indirect effects. Specifically, I used the unit variance approach (also referred to as the "fixed factor method" or "reference factor method"), which fixes the latent factor's variance to 1 and its mean to 0, thus allowing all factor loadings to be freely estimated, rather than the marker variable approach, which sets the scale of the latent factor by forcing one indicator to have an intercept of 0 and a factor loading of 1.0.
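For reference, here is roughly how I set this up in lavaan (the factor and indicator names below, and "mydata", are simplified stand-ins for my actual variables):

```r
library(lavaan)

# Simplified stand-in mediation model: three parcels per latent factor
model <- '
  X =~ x1 + x2 + x3
  M =~ m1 + m2 + m3
  Y =~ y1 + y2 + y3
  M ~ a*X
  Y ~ b*M + cp*X
  ab := a*b      # indirect effect
'

# Unit variance ("fixed factor") scaling: std.lv = TRUE frees all
# loadings and identifies each factor through its (residual) variance
fit_uv <- sem(model, data = mydata, std.lv = TRUE)

# Marker variable scaling (lavaan's default): the first indicator's
# loading for each factor is fixed to 1
fit_marker <- sem(model, data = mydata)
```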

My question is: does the unit variance scaling convention that I used have any implications for whether I can report and interpret the unstandardized indirect effects? For context, in my current study I am planning to report both the standardized and unstandardized effects. And sure enough, my lavaan output produced different estimates for the unstandardized indirect effects (column "est") and the standardized indirect effects (columns "std.lv"/"std.all").

However, from my understanding, my unit variance scaling convention already standardized the latent variables, so I am unsure how to interpret the unstandardized indirect effects. Stated differently, I am unsure what exactly makes the standardized indirect effects in this case different from the unstandardized indirect effects, or even whether the unstandardized indirect effect can be interpreted at all.


Any clarification is greatly appreciated!

Screen Shot 2022-12-23 at 3.29.57 AM.png

Rönkkö, Mikko

Dec 23, 2022, 6:39:39 AM
to lav...@googlegroups.com

Hi,

You cannot scale the dependent variable by fixing its variance, because the dependent variable's variance is not a model parameter. Are you sure you are constraining the dependent variables' variances, and not rather the variances of the exogenous latent variables and the error-term variances of the endogenous latent variables? If you are doing the latter, the standardized results will not generally be the same as the unstandardized ones.

 

If you are interpreting the raw estimates, the scaling matters. I would not even try to interpret results that are obtained by scaling the variances of the error terms.

Mikko


Nick Rosemarino

Dec 23, 2022, 4:29:55 PM
to lavaan
Hi Mikko, 

Thanks so much for helping me work through this, it's very much appreciated. A couple of clarifying thoughts. Yes, I believe your description of the standardization procedure is exactly what I did. I used "std.lv = TRUE", which, to your point, standardized the latent dependent variables by fixing their intercepts to 0 and their residual variances to 1, and, for the exogenous latent variables, fixed the means to 0 and the variances to 1.

I think I am stuck on whether it would be appropriate to report the unstandardized indirect effects in the results section of my paper. So, for the first prediction: b = .08, CI [.038, .127]. In my tables, I report both the standardized and unstandardized indirect effect estimates. I guess I am still trying to figure out whether the scaling convention I used automatically rules out reporting the unstandardized indirect effects. Do you have any thoughts about that?

Thanks again for your help with this, Mikko. All the best,
Nick

Rönkkö, Mikko

Dec 27, 2022, 5:58:13 AM
to lav...@googlegroups.com

Hi,

I would not call the estimates standardized if you are scaling by constraining error variances. I also do not know of any useful interpretation that estimates scaled that way would have. Why do you want to scale the latent variables that way instead of fixing the first indicator loading, in which case you could interpret the coefficients on the scale of the indicators?
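In lavaan, the unit loading scaling is simply the default behavior, e.g. (factor and indicator names here are hypothetical):

```r
library(lavaan)

# Unit loading (marker variable) scaling is lavaan's default: without
# std.lv = TRUE, the first indicator's loading is fixed to 1, so the
# latent variable is on the scale of that indicator.
fit <- sem('F =~ y1 + y2 + y3', data = mydata)

# Equivalent, with the fixed loading written out explicitly:
fit2 <- sem('F =~ 1*y1 + y2 + y3', data = mydata)
```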

 

Mikko

Nick Rosemarino

Dec 28, 2022, 7:25:29 PM
to lavaan
Hi Mikko,

Good to know that you wouldn't consider the estimates "standardized" with the unit variance scaling approach that I used to identify the model. Regarding why I chose this method (rather than the marker variable method, which fixes one factor loading to 1): I didn't have a strong preference one way or the other. I simply learned the unit variance approach, which uses the lavaan argument std.lv = TRUE. I have a large model, so I used item parcels, and this approach allowed me to observe how each of the three parcels loaded onto their respective latent factors.

Upon additional reading, it appears that some statisticians discourage the use of the marker variable approach (although I'm sure there are disadvantages with the approach I used as well):
1. When more than one construct is estimated, [the fixed factor/unit variance] method has the added advantage of providing estimates of the between-construct relations in correlation metric (Little, 2013, p. 81)
2. the choice of marker variable and the metric that it introduces are arbitrary and provide little in terms of meaningful units to discuss or interpret in terms of the construct’s actual mean and variance (Little, 2013, p. 152)
3. [Regarding the marker variable method] The key issue here is that residual variance includes error variance and unique variance, so fixing the metric of the latent variable to an observed variable's common variance has dubious value (Steiger, 2002, p. 214).

Regarding interpretation, I am currently reporting both unstandardized and standardized indirect effects, and since the latter puts the metric in units of standard deviations, I was planning to interpret that one, since it's considered more of an effect size. However, I do report the unstandardized indirect effects in the results section. I just wanted to make sure that it is not considered "bad practice" to report the unstandardized indirect effect in my case.

-Nick

Rönkkö, Mikko

Dec 29, 2022, 1:41:00 AM
to lav...@googlegroups.com

Hi,

 

Thanks for the thoughtful comment. I must disagree on a couple of points. I assume Little (2013) refers to the longitudinal structural equation modeling book. Unfortunately, I have only a draft that I got from Todd, so the page numbers do not match and I must rely on what is available on Google Books.

> Upon additional reading, it appears that some statisticians discourage the use of the marker variable approach (although I'm sure there are disadvantages with the approach I used as well)

> 1. When more than one construct is estimated, [the fixed factor/unit variance] method has the added advantage of providing estimates of the between-construct relations in correlation metric (Little, 2013, p. 81)

This is true in CFA models where all latent variables are exogenous. It is not true if you are fixing the scale of a latent variable by fixing its error variance. Your results demonstrate this: the standardized estimates are on the correlation metric, and they are not the same as the estimates scaled by constraining the error variances of the latent variables.

> 2. the choice of marker variable and the metric that it introduces are arbitrary and provide little in terms of meaningful units to discuss or interpret in terms of the construct's actual mean and variance (Little, 2013, p. 152)

This argument is presented in article form in:

Little, T. D., Slegers, D. W., & Card, N. A. (2006). A non-arbitrary method of identifying and scaling latent variables in SEM and MACS models. Structural Equation Modeling, 13(1), 59–72. https://doi.org/10.1207/s15328007sem1301_3

The effects-coding technique that Little explains can be useful, but the conclusion that unit loading coding would be less useful is not warranted, for two reasons. First, all scaling decisions are arbitrary. We can measure length in centimeters or in inches (or millimeters or feet…), and our result varies depending on which scaling approach we use. Does this mean that all measures of length are arbitrary and not useful? Of course not. The point is that variables should be scaled in a way that is understandable to the person who uses the results (see https://doi.org/10.1080/10705511.2022.2134140). Second, loadings of indicators that belong to the same scale are often very similar. In this case, effects coding and unit loading coding produce very similar results, and it would not make a difference which one is used.
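For completeness, a sketch of effects coding in lavaan (a hypothetical three-indicator factor; the equality constraint forces the loadings to average 1, so no single indicator is privileged):

```r
library(lavaan)

# Effects coding: label the loadings (l1-l3) and constrain them to sum
# to the number of indicators, so they average 1 while all three
# remain estimated. Indicator names and mydata are hypothetical.
model <- '
  F =~ l1*x1 + l2*x2 + l3*x3
  l1 == 3 - l2 - l3
'
fit <- sem(model, data = mydata)
```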

 

For a discussion of the relationship between the scaling methods, see https://doi.org/10.1080/10705511.2020.1796673

> 3. [Regarding the marker variable method] The key issue here is that residual variance includes error variance and unique variance, so fixing the metric of the latent variable to an observed variable's common variance has dubious value (Steiger, 2002, p. 214).

The relevant part of Steiger is:

There seems to be some confusion in the literature about the latter point. Numerous sources (e.g., Kline, 1998, p. 204) have made a statement to the effect that a ULI constraint for the loading of a particular manifest variable fixes the scale of the latent variable to be the same as the manifest variable. This misconception has led to the use of the term reference variable to refer to the manifest variable with the ULI attached. This view is wrong—if a value of unity is used, the variance of the latent variable is fixed to the variance of the common part of the manifest variable that has the ULI constraint. Moreover, as we have already seen, all other loadings emanating from the latent variable move up or down in concert with the value selected for the ULI constraint, and the variance of the common part is itself determined by the choice of variables in the measurement model. The key issue here is that residual variance includes error variance and unique variance, so fixing the metric of the latent variable to an observed variable’s common variance has dubious value.

With unit loading scaling, the scale of the latent variable is set so that a one-unit increase in the latent variable leads to a one-unit increase in the expected value of the scaling indicator (with the caveat that the model applies to a population and not to any specific individual). Steiger says nothing that would challenge this interpretation, because the error and unique parts of the variance are constrained to be uncorrelated with the latent variable.

The main point of Steiger is that the variable scaling affects parameter tests. This is a valid concern.

 

> Regarding interpretation, I am currently reporting both unstandardized and standardized indirect effects, and since the latter puts the metric in units of standard deviations, I was planning to interpret that one, since it's considered more of an effect size. However, I do report the unstandardized indirect effects in the results section. I just wanted to make sure that it is not considered "bad practice" to report the unstandardized indirect effect in my case.

I would not recommend reporting the unstandardized effects if you are scaling based on latent variable error variances, because it is difficult to come up with any useful interpretation of those estimates.

In fact, I do not remember seeing any empirical article that presents results scaled this way. (Though I am sure there are articles that scale their estimates by constraining latent error variances and misrepresent these as standardized estimates, but that is difficult to identify from a published article.) I also do not remember seeing any SEM book or methodological article that recommends scaling endogenous latent variables by fixing their error variances. For example, the examples in Little's longitudinal SEM book (at least in the pre-print that I have) scale the latent variables by fixing their loadings (either one loading or through effects coding).

 

Note that with CFA models where all latent variables are exogenous, fixing the factor variances to unity makes a lot of sense.

Mikko

Nick Rosemarino

Dec 29, 2022, 3:39:50 AM
to lavaan
Hi Mikko, 

Wow, thanks for this generous response. I think you've provided a lot of great insight for my advisor and me to consider, and I will talk to him about potentially rethinking our scaling convention.

Thanks again for your insight,
Nick

Keith Markus

Dec 29, 2022, 9:24:03 AM
to lavaan
Nick,
Mikko has given you very helpful and detailed advice. I am just concerned that one point could be misinterpreted. Mikko said that he did not recommend reporting raw estimates based on fixing the disturbance variances of latent variables. This could be read as recommending that you report only the standardized estimates, but I do not think that is what Mikko meant.

It is generally recommended to always report raw estimates and to treat standardized estimates as supplements to aid interpretation. So, I think you should take Mikko's comment as a recommendation to report raw estimates based on either unit loading constraints or effects coding constraints, along with your standardized estimates. This is comparable to reporting both raw mean differences and a standardized effect size in an experiment.

One reason for doing this is that there are no free lunches in methodology; there are always trade-offs. In this case, standardized estimates make it easier to compare effect sizes at the cost of making it harder to evaluate how much change one should expect without intervention, because, by design, they obscure differences across variables in the amount of variability. As a result, it is helpful to look at the solution both ways, because each brings different aspects of the solution into relief.
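In lavaan, reporting both simply means tabulating two sets of estimates from the same fitted object (here called "fit", a hypothetical result of sem()):

```r
library(lavaan)

# Raw (unstandardized) estimates with standardized columns appended,
# from a fitted lavaan object called fit
parameterEstimates(fit, standardized = TRUE)

# Or the fully standardized solution on its own
standardizedSolution(fit)
```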

Keith
------------------------
Keith A. Markus
John Jay College of Criminal Justice, CUNY
http://jjcweb.jjay.cuny.edu/kmarkus
Frontiers of Test Validity Theory: Measurement, Causation and Meaning.
http://www.routledge.com/books/details/9781841692203/