Explaining differences between std.lv, std.all, and std.nox to a class


Amanda Pollitt

Feb 13, 2015, 1:59:05 PM2/13/15
to lav...@googlegroups.com
Hi everyone,

I'm a teaching assistant for an SEM class taught in R/lavaan. Students have asked about the standardized solutions in lavaan and though describing std.lv is relatively straightforward, we're not sure how to explain the std.all and std.nox options. We're also not sure how to interpret intercepts in the std.all solution. The documentation reads: "If "std.lv", the standardized estimates are on the variances of the (continuous) latent variables only. If "std.all", the standardized estimates are based on both the variances of both (continuous) observed and latent variables. If "std.nox", the standardized estimates are based on both the variances of both (continuous) observed and latent variables, but not the variances of exogenous covariates." This is somewhat difficult to explain to a classroom of students new to SEM.

Any explanations and/or references?

Thank you!

Mark Seeto

Feb 13, 2015, 6:27:27 PM2/13/15
to lav...@googlegroups.com
Hi Amanda,

When you say "This is somewhat difficult to explain to a classroom of students new to SEM", are you saying that you understand it but can't explain it, or are you saying that you don't understand it?

Maybe you could post a reproducible example, either with some made-up data or with one of the data sets included in lavaan, to provide something concrete to discuss.

A good way of working out what the different options do is to simulate some data (so you know the true parameter values) with a very large sample size, and see what lavaan gives you.
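A hedged sketch of that simulation approach (the model, parameter values, and variable names below are made up purely for illustration; it assumes the lavaan package):

```r
library(lavaan)

# Population model with known (illustrative) parameter values,
# including an exogenous covariate z with non-unit variance so that
# the std.all and std.nox solutions actually differ
pop.model <- '
  f =~ 0.8*x1 + 0.7*x2 + 0.6*x3
  f ~ 0.5*z
  z ~~ 4*z
'
set.seed(123)
dat <- simulateData(pop.model, sample.nobs = 100000)

fit <- sem('f =~ x1 + x2 + x3
            f ~ z', data = dat)

# Compare the three standardizations; with n this large, the
# estimates should sit close to the known population values
standardizedSolution(fit, type = "std.lv")
standardizedSolution(fit, type = "std.all")
standardizedSolution(fit, type = "std.nox")
```

The f ~ z row is where std.all and std.nox should visibly differ: std.nox leaves z in its raw metric, while std.all rescales by sd(z) as well.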

Also, note (if you haven't already) that there are two different sets of "std.lv" values: the "est" column from parameterEstimates() when the model was fitted with std.lv = TRUE, and the "std.lv" column from parameterEstimates() when called with standardized = TRUE.

Mark

Edward Rigdon

Feb 13, 2015, 9:16:57 PM2/13/15
to lav...@googlegroups.com
Amanda--
     The "nox" part is the easiest to explain.  Covariates stand partly outside the model: the distributional assumptions of maximum likelihood, for example, are conditional on the covariates.  Most relevant, it seems to me, is that these covariates are often experimental treatment indicators or dummy variables.  In particular, it makes no sense to standardize a dummy variable.  First, you obliterate the real information in the variable. Second, the variance of a dichotomous variable is p times 1 - p, where p is the fraction of 1's, and can never be equal to 1.  So standardization creates an impossibility.  These are especially good reasons to have the option to omit covariates when standardizing.
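Ed's arithmetic on the variance bound is easy to check in base R: p(1 - p) peaks at 0.25, well short of 1.

```r
p <- seq(0.05, 0.95, by = 0.05)
v <- p * (1 - p)   # population variance of a 0/1 variable with proportion p of 1's
max(v)             # 0.25, attained at p = 0.5; a dummy's variance never reaches 1
```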
      I don't like standardization generally, so I won't comment on the other options.
--Ed Rigdon 


kma...@aol.com

Feb 14, 2015, 11:13:32 AM2/14/15
to lav...@googlegroups.com
Hi Ed,
You wrote:
>>>
In particular, it makes no sense to standardize a dummy variable. First, you obliterate the real information in the variable. Second, the variance of a dichotomous variable is p times 1 - p, where p is the fraction of 1's, and can never be equal to 1. So standardization creates an impossibility.
>>>

I am not clear why you would say that. Consider the following.

> Flag <- c(rep(0,25), rep(1,75))
> mean(Flag); sd(Flag)
[1] 0.75
[1] 0.4351941
> zFlag <- (Flag - mean(Flag))/sd(Flag)
> mean(zFlag); sd(zFlag)
[1] -2.775558e-17
[1] 1
> table(Flag, zFlag)
    zFlag
Flag -1.72336879396141 0.574456264653803
   0                25                 0
   1                 0                75
>

zFlag has a variance of 1 and appears to retain all the information in Flag. What have I misunderstood?

I have always thought of the 0/1 coding as simply a matter of convenience that makes the effect coefficients easier to interpret (Cohen, Cohen, West & Aiken, 2002).

Keith
------------------------
Keith A. Markus
John Jay College of Criminal Justice, CUNY
http://jjcweb.jjay.cuny.edu/kmarkus
Frontiers of Test Validity Theory: Measurement, Causation and Meaning.
http://www.routledge.com/books/details/9781841692203/


Edward Rigdon

Feb 14, 2015, 12:04:39 PM2/14/15
to lav...@googlegroups.com
Keith--
Yes, you can transform the dummy into something with a variance of 1. But a dummy cannot have a variance of 1. I don't like assigning an impossible value to the variable.
Yes, you can mentally undo the transformations of standardized dummies, so my word "obliterate" was not correct--the information is not permanently lost. "Obscure" would have been a better word. You can avoid the mental effort of undoing the standardization by not standardizing. If there were a benefit to standardization, you could weigh it against the cost, but I don't see that benefit. A set of dummies, representing each of several groups, establishes a baseline group--the group for which all dummies have values of 0. The straightforwardness of that interpretation has its own value. No, this is not a rejection of effect codes, which establish a different baseline.
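Ed's baseline-group point can be illustrated with base R's default treatment coding, where one level (here a hypothetical group "a") becomes the all-zeros reference:

```r
g <- factor(rep(c("a", "b", "c"), each = 2))
model.matrix(~ g)   # dummy columns gb and gc; rows from group "a" are 0 on both
```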
--Ed Rigdon

kma...@aol.com

Feb 15, 2015, 10:41:44 AM2/15/15
to lav...@googlegroups.com
Ed,
Thanks for fleshing out what you were saying. Certainly I agree that a standardized dummy coded variable is no longer a dummy coded variable. However, I evaluate standardized solutions a little differently. Here is the way I would describe it. The purpose of dummy coding is to make the raw coefficients easy to interpret. The purpose of standardization is to provide a different view of the results than that offered by raw coefficients. So, somebody interested in examining a standardized solution should not be looking for the interpretation of raw coefficients in that solution.

While the raw coefficients focus on expressing effects in relation to the range of the variable, the standardized coefficients express effects in terms of variability. Two dummy codes with the same 0-1 range can differ in variability, and two causes with the same raw effect can differ in their relative impact on a common outcome because one varies more. So, standardized solutions can provide a useful additional perspective on the estimates. In such cases, raw estimates require mental arithmetic to juggle raw effects and estimated variances at the same time, and standardization simplifies the interpretation.
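Keith's point that equal ranges can hide unequal variability is quick to check with made-up proportions:

```r
# Two 0/1 dummies with the same 0-1 range but different proportions of 1's
d1 <- c(rep(0, 50), rep(1, 50))   # p = .50
d2 <- c(rep(0, 90), rep(1, 10))   # p = .10
c(var(d1), var(d2))               # roughly 0.25 vs 0.09
```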

I would not think that standardized estimates should ever replace raw estimates, but they can offer a useful supplement. As such, I would not describe a standardized value as an invalid value if one interprets it as a standardized value rather than as a raw dummy code. A valid standardized score need not be a valid raw score. At least, that is my somewhat more forgiving view of standardization, for what it is worth.

Incidentally, I typed my previous post with internet quotes around what you wrote, but Google Groups stripped them. It stripped indents from some code that I posted earlier too. So please pardon my clumsiness with Google Groups as I learn the quirks.