Continuous latent variables and binary dependent variables


Maya

Oct 11, 2015, 5:22:41 AM
to lavaan

Hello,


I am new to using Lavaan and would like to double check a few small things about using Lavaan with a mix of continuous dependent latent variables and a binary dependent variable.


I have a SEM with 4 structural equations, 3 of them with a continuous latent dependent variable (let’s call them L1, L2, L3), and one with a binary dependent variable (H). The binary variable H is also a right hand side variable in the L2 and L3 equations.


L1 ~ L2 + …

L2 ~ L3 + H + …

L3 ~ H + …

H ~ x1 + x2 + x3 + …

 

1. Is there any problem with the above specification, where the binary variable H is both a right hand side variable in some equations and a dependent variable in another equation, or is it an OK specification? The model estimated fine.

 

2. How can I specify in Lavaan that I want correlation/covariance parameters to be estimated between the error term in the structural equation of H and each of the error terms in the structural equations of L1, L2, and L3? Would I do this simply as follows?

H ~~ L1

H ~~ L2

H ~~ L3

 

3. Are all error terms in the 4 equations above assumed to be normally distributed?

 

4. In the equation of the H binary variable, would a positive parameter of a variable X mean that the larger X is, the more likely H is to be 1 than 0?

 

Thanks a lot.

Maya

 

yrosseel

Oct 12, 2015, 2:39:33 PM
to lav...@googlegroups.com
On 10/11/2015 11:22 AM, Maya wrote:
> 1. Is there any problem with the above specification where the binary
> variable H is both a right hand side variable in some equations and a
> dependent variable in another equation, or it is an OK specification?

It is OK, as long as 'H' has been declared as 'ordered'.
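A minimal sketch of what declaring H as ordered can look like, using simulated data. The variable names (x1, x2, y1 to y3), the single latent variable L, and all coefficient values are assumptions made for illustration; they are not taken from the original model.

```r
library(lavaan)

set.seed(1)
n  <- 1000
x1 <- rnorm(n)
x2 <- rnorm(n)

# Latent response H* and its observed binary version H (hypothetical values)
H_star <- 0.8 * x1 - 0.5 * x2 + rnorm(n)
H      <- as.integer(H_star > 0)

# One continuous latent variable L with three hypothetical indicators;
# in this sketch the latent response H* is what drives L
L  <- 0.5 * H_star + 0.3 * x1 + rnorm(n)
y1 <- L + rnorm(n)
y2 <- 0.8 * L + rnorm(n)
y3 <- 0.7 * L + rnorm(n)
dat <- data.frame(x1, x2, H, y1, y2, y3)

model <- '
  L =~ y1 + y2 + y3   # measurement model
  L ~ H + x1          # H on the right-hand side
  H ~ x1 + x2         # H as a dependent variable
'

# The ordered argument tells lavaan to treat H as categorical
fit <- sem(model, data = dat, ordered = "H")
summary(fit)
```

Passing `ordered = "H"` is one way to make the declaration; converting the column to an ordered factor in the data frame before fitting is another.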

> 2. How can I specify in Lavaan that I want correlation/covariance
> parameters to be estimated between the error term in the structural
> equation of H and each of the error terms in the structural equations of
> L1, L2, and L3? Would I do this simply as follows?
>
> H ~~ L1
> H ~~ L2
> H ~~ L3

Yes.

> 3. Are all error terms in the 4 equations above assumed to be normally
> distributed?

Yes for the first three. The fourth has a normally distributed error
term if you replace the binary variable 'H' by its underlying continuous
counterpart, H*.

> 4. In the equation of the H binary variable, would a positive
> parameter of a variable X mean that the larger X is, the more likely H
> is to be 1 than 0?

Yes. Think of equation 4 as probit regression.
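To make the probit reading concrete, here is a small base-R sketch with simulated data. It uses `glm()` rather than lavaan, and the intercept and slope values are assumptions chosen only for illustration.

```r
set.seed(42)
n <- 2000
x <- rnorm(n)

# Latent response with a positive slope on x; h is its binary version
h_star <- -0.2 + 0.8 * x + rnorm(n)
h      <- as.integer(h_star > 0)

# Ordinary probit regression of h on x
probit_fit <- glm(h ~ x, family = binomial(link = "probit"))
b <- coef(probit_fit)

# Predicted probability that h = 1 at a low and a high value of x
p_low  <- pnorm(b[["(Intercept)"]] + b[["x"]] * -1)
p_high <- pnorm(b[["(Intercept)"]] + b[["x"]] *  1)
c(p_low = p_low, p_high = p_high)
```

With a positive coefficient, the predicted probability of h = 1 increases with x, which is the interpretation asked about in question 4.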

Yves.

Maya

Oct 13, 2015, 6:36:56 AM
to lavaan
Dear Yves,

Thanks a lot for your response.
I just want to make sure I correctly understood what you said about H and H*. So it is correct to write the model as I did (where H ~ X1 + X2 + X3 + ...), and this equivalently means that H* = X1 + X2 + X3 + ..., and the error term in the H* equation (which is not explicitly written in the syntax) would be normal. Right?

Also one last question: the predict function (or lavPredict) would allow me to extract factor scores for all latent variables using the measurement model only. Right? So if I want to predict the new value of a latent variable because of a change in the value of an explanatory variable in its structural equation, how would I do it in Lavaan?


Thanks again,
Maya

yrosseel

Oct 15, 2015, 6:04:09 AM
to lav...@googlegroups.com
On 10/13/2015 12:36 PM, Maya wrote:
> Thanks a lot for your response.
> I just want to make sure I correctly understood what you said about H
> and H*. So it is correct to write the model as I did (where H ~ X1 + X2
> + X3 + ...)

Yes.

> and this equivalently means that H* = X1 + X2 + X3 + ...,
> and the error term in the H* equation (which is not explicitly written
> in the syntax) would be normal. Right?

Indeed. We think of 'H*' as the unobserved continuous variable
underlying the categorical (but ordered) observed variable H.
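The latent-response idea can be written out as follows. This is a sketch in generic notation: the symbols β_k, τ (threshold), σ and ε are labels introduced here, not lavaan output names.

```latex
\begin{aligned}
H^{*} &= \sum_{k} \beta_k X_k + \varepsilon,
  \qquad \varepsilon \sim N(0, \sigma^{2}), \\
H &= \begin{cases} 1 & \text{if } H^{*} > \tau, \\ 0 & \text{otherwise,} \end{cases} \\
P(H = 1 \mid X) &= \Phi\!\left( \frac{\sum_{k} \beta_k X_k - \tau}{\sigma} \right).
\end{aligned}
```

Because only H is observed, the scale σ of H* cannot be estimated separately from the β_k and τ, so it has to be fixed by a scaling convention.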

> Also one last question: the predict function (or lavPredict) would allow
> me to extract factor scores for all latent variables using the
> measurement model only. Right?

Correct. The current implementation (0.5-19) is rather slow, but it works.

> So if I want to predict the new value of
> a latent variable because of a change in the value of an explanatory
> variable in its structural equation, how would I do it in Lavaan?

For now, manually only. lavPredict() does not do this.

One way would be to 'augment' the data, and add a new case that is
identical to another case, except that for the explanatory variable,
you give it the changed value (say, x + 1); you fit the model again, and
unless the sample size is small, adding this new case will (almost) have
no effect on the parameter estimates; now, you can compare the output of
lavPredict() for the original case, and the new case.
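The augmentation idea can be sketched as follows. The example uses the HolzingerSwineford1939 data that ships with lavaan, and the one-factor model with `ageyr` as the explanatory variable is an assumption chosen only to have something runnable; it is not the model discussed in this thread.

```r
library(lavaan)

dat <- HolzingerSwineford1939

model <- '
  visual =~ x1 + x2 + x3   # measurement model
  visual ~ ageyr           # structural equation with one explanatory variable
'

# Copy case 1 and change only the explanatory variable (x + 1)
new_case       <- dat[1, ]
new_case$ageyr <- new_case$ageyr + 1
dat_aug        <- rbind(dat, new_case)

# Refit on the augmented data and compute factor scores
fit_aug <- sem(model, data = dat_aug)
scores  <- lavPredict(fit_aug)

# Compare the original case with its modified copy (last row)
n <- nrow(dat)
scores[c(1, n + 1), , drop = FALSE]
```

Fitting the model once on the original data as well, and comparing parameter estimates, is a simple way to confirm that the extra case had a negligible effect.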

Yves.

Maya

Oct 23, 2015, 2:34:57 PM
to lavaan
Dear Yves,

Thanks again for your reply. I'd like to ask you a couple more questions as I go through the different stages of my model:

1. After I estimated the model, I used the lavPredict function to compute factor scores of the latent variables. My dataset consists of 360 observations. The factor scores for 14 observations are predicted as “NA” even though I have no missing data. If I change the model specification slightly, I get another set of 16 observations with “NA” factor scores. When I augment the dataset with data from another city (1000 observations) and re-estimate the model, I don’t have any more NA’s when I apply the lavPredict function.

Do you know what is causing this issue and should I be worried about it?

2.  I also have another question. So I have a multi-group SEM, with the groups being two cities. I saw that I can do tests of Measurement Invariance in Lavaan. But how can I test whether there is also Structural invariance? That is, I am interested in testing whether the coefficients in the structural equations of the latent variables are different across the two cities. More specifically, I want to test if the coefficients in all of the model (structural + measurement models) are different for the two cities. How can I do that in Lavaan? Can I do it all at once, or do I have to test the hypothesis of equality of coefficients separately for the measurement model and the structural model? My indicators are a mix of continuous and ordinal.

Thank you.
Maya

Terrence Jorgensen

Oct 25, 2015, 6:30:17 AM
to lavaan
> I want to test if the coefficients in all of the model (structural + measurement models) are different for the two cities. How can I do that in Lavaan? Can I do it all at once, or do I have to test the hypothesis of equality of coefficients separately for the measurement model and the structural model?

You should test each hypothesis in sequence.  The introduction of this article describes the method:


See the ?lavaan help page for how to constrain different types of parameters using the group.equal argument.  You can also use labels to constrain parameters to equality manually, as described here:


This page has lavaan examples and some discussion of how to test invariance with categorical indicators, which is a bit less straightforward than with continuous indicators.
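A rough sketch of the sequential approach with `group.equal`, using the HolzingerSwineford1939 data that ships with lavaan. The model, the grouping variable `school`, and the choice of continuous indicators are assumptions for illustration; with ordinal indicators the sequence would also involve thresholds.

```r
library(lavaan)

model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
  textual ~ visual
  speed   ~ visual
'

# Step 1: equal loadings across groups (measurement part)
fit_meas <- sem(model, data = HolzingerSwineford1939,
                group = "school", group.equal = "loadings")

# Step 2: additionally constrain the structural regressions to be equal
fit_struct <- sem(model, data = HolzingerSwineford1939,
                  group = "school",
                  group.equal = c("loadings", "regressions"))

# Chi-square difference test of the added structural constraints
cmp <- lavTestLRT(fit_meas, fit_struct)
cmp
```

A significant difference test at step 2 would suggest that at least one structural coefficient differs between the groups, given that the loadings are held equal.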


Terry


yrosseel

Dec 5, 2015, 8:46:40 AM
to lav...@googlegroups.com
On 10/23/2015 08:34 PM, Maya wrote:
> 1. After I estimated the model, I used the lavPredict function to
> compute factor scores of the latent variables. My dataset consists of
> 360 observations. The factor scores for 14 observations are predicted as
> “NA” even though I have no missing data.

Can you try this again with lavaan 0.5-20? If you still get NA values,
please post (a snippet of) your dataset and a full script, so I can
investigate this further.

Yves.

Maya Abou Zeid

Dec 13, 2015, 11:58:48 AM
to lav...@googlegroups.com
Dear Yves,

Thank you for the follow-up. Yes, the NA is still there with the new version of Lavaan using my old model specification. But I improved the model specification and now it is gone. It may be a small sample issue since the sample consists of 360 observations and the model has many indicators.

I would also like to check with you a couple of things. My model includes ordered indicators. For the corresponding continuous latent response variable (y*), as I understand, the variance of its error term should be 1 for identification purposes. This is indeed what I get when I estimate the model on two datasets separately. But when I combine the datasets and use multi-group analysis (with equality constraints on the structural equation parameters and on intercepts and thresholds in the measurement equations), Lavaan fixes the scale of y* to 1 for one of the groups and estimates the scale of y* for the other group. So my questions are:

1. What is the exact definition of the scale of y* and how is it related to the variance of the error term? Can the scale of the y* be constrained to be the same for the two datasets/groups (or constrained to a particular value) and how to do so?

2. Lavaan prints variances for the ordinal indicators (but without standard errors and z statistics). How are these variances computed? I thought that the variance of the error term for the y* (corresponding to these ordinal indicators) is 1 for identification purposes.

3. And finally, in the models of the two groups/datasets that are estimated separately, the intercepts of the latent variables are all zero. But when I estimate a multi-group model, Lavaan prints zero intercepts for the latent variables of one group but estimates the intercepts of the latent variables of the other group. Just wanted to check if that's possible only for multi-group models.

Thanks a lot for any clarification you can provide.

Maya







--
You received this message because you are subscribed to a topic in the Google Groups "lavaan" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/lavaan/44q3DphAufk/unsubscribe.
To unsubscribe from this group and all its topics, send an email to lavaan+un...@googlegroups.com.
To post to this group, send email to lav...@googlegroups.com.
Visit this group at http://groups.google.com/group/lavaan.
For more options, visit https://groups.google.com/d/optout.

yrosseel

Dec 13, 2015, 2:35:45 PM
to lav...@googlegroups.com
On 12/13/2015 05:58 PM, Maya Abou Zeid wrote:
> Thank you for the follow-up. Yes, the NA is still there with the new
> version of Lavaan using my old model specification.

You get NA values (and a warning explaining why) if you have negative
variances.
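One way to look for negative variances in a fitted model is sketched below. The three-factor CFA on the HolzingerSwineford1939 data is only a stand-in for a real model.

```r
library(lavaan)

model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'
fit <- cfa(model, data = HolzingerSwineford1939)

# Variances are the "~~" rows where left- and right-hand side coincide
pe      <- parameterEstimates(fit)
neg_var <- pe[pe$op == "~~" & pe$lhs == pe$rhs & pe$est < 0, ]
neg_var   # an empty data frame means no negative variance estimates

# lavaan's own post-estimation check of the solution
lavInspect(fit, "post.check")
```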

> I would also like to check with you a couple of things. My model
> includes ordered indicators. For the corresponding continuous latent
> response variable (y*), as I understand, the variance of its error term
> should be 1 for identification purposes.

By default, lavaan will constrain the (total) variance of y* to be one.
The residual variance of y*, then, is not a free parameter, but a
function of other model parameters.

See:

Muthen, B. (1978). Contributions to factor analysis of dichotomous
variables. Psychometrika, 43, 551-560.
(http://www.statmodel.com/download/ContributionsToFactorAnalysis.pdf)

Equation 4 (page 552) shows how the (diagonal) elements of 'Psi'
(corresponding with the residual variances of the observed variables)
are a function of other model parameters. (Note: this is for a model
without a structural component)

For the multiple group setting, read this:

Muthen, B., & Christoffersson, A. (1981). Simultaneous factor analysis
of dichotomous variables in several groups. Psychometrika, 46(4), 407-419.

Next, read this webnote to understand the difference between 'delta' and
'theta' parameterization, and the role of the Delta scaling parameters:

http://www.statmodel.com/download/webnotes/CatMGLong.pdf
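The difference between the two parameterizations can be seen in a small simulated example. This is a sketch: the item names u1 to u4, the loadings, and the cut points are all assumptions.

```r
library(lavaan)

set.seed(123)
n <- 1000
f <- rnorm(n)

# Three-category items obtained by cutting a continuous latent response
make_item <- function(loading) {
  y_star <- loading * f + rnorm(n)
  as.integer(cut(y_star, breaks = c(-Inf, -0.5, 0.5, Inf)))
}
dat <- data.frame(u1 = make_item(0.9), u2 = make_item(0.8),
                  u3 = make_item(0.7), u4 = make_item(0.6))

items <- paste0("u", 1:4)
model <- ' f =~ u1 + u2 + u3 + u4 '

fit_delta <- cfa(model, data = dat, ordered = items, std.lv = TRUE,
                 parameterization = "delta")
fit_theta <- cfa(model, data = dat, ordered = items, std.lv = TRUE,
                 parameterization = "theta")

# Extract the residual variances of the latent responses y*
resid_var <- function(fit) {
  pe <- parameterEstimates(fit)
  pe[pe$op == "~~" & pe$lhs %in% items & pe$lhs == pe$rhs, c("lhs", "est")]
}
rv_delta <- resid_var(fit_delta)  # implied by the loadings, not free
rv_theta <- resid_var(fit_theta)  # fixed to 1
rv_delta
rv_theta
```

Under the delta parameterization the printed residual variances follow from the other parameters so that the total variance of each y* is one; under theta the residual variances themselves are fixed to one.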

> I estimate the model on two datasets separately. But when I combine the
> datasets and use multi-group analysis (with equality constraints on the
> structural equation parameters and on intercepts and thresholds in the
> measurement equations), Lavaan fixes the scale of y* to 1 for one of the
> groups and estimates the scale of y* for the other group.

Indeed. Because in the multiple group setting, the (residual) variances
in all but the first (reference) group are estimable.

> 1. What is the exact definition of the scale of y* and how is it related
> to the variance of the error term?

See the two references above for a good explanation.

> Can the scale of the y* be
> constrained to be the same for the two datasets/groups (or constrained
> to a particular value) and how to do so?

Yes. If x1, x2, x3 are indicators of a latent variable, you could write
something like:

x1 ~*~ c(d1,d1,d1)*x1
x2 ~*~ c(d2,d2,d2)*x2
x3 ~*~ c(d3,d3,d3)*x3

In this example, I have assumed three groups. By setting the labels
equal across groups, the delta factors will be constrained to be equal.
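For the other half of the question, constraining the scale to a particular value, numeric constants can be used in place of labels. The fragment below is a sketch for two groups; the data frame `dat` and the grouping variable `city` in the commented call are hypothetical.

```r
# Model syntax fixing the scaling factors to 1 in both groups
model <- '
  f =~ x1 + x2 + x3

  x1 ~*~ c(1, 1) * x1
  x2 ~*~ c(1, 1) * x2
  x3 ~*~ c(1, 1) * x3
'

# Hypothetical call (not run): x1-x3 must be declared as ordered
# fit <- cfa(model, data = dat, group = "city",
#            ordered = c("x1", "x2", "x3"))
```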

> 2. Lavaan prints variances for the ordinal indicators (but without
> standard errors and z statistics). How are these variances computed?

See above.

> I thought that the variance of the error term for the y* (corresponding to
> these ordinal indicators) is 1 for identification purposes.

A common misconception.

> 3. And finally, in the models of the two groups/datasets that are
> estimated separately, the intercepts of the latent variables are all
> zero. But when I estimate a multi-group model, Lavaan prints zero
> intercepts for the latent variables of one group but estimates the
> intercepts of the latent variables of the other group. Just wanted to
> check if that's possible only for multi-group models.

Under the usual parameterization (with free intercepts for all observed
indicators): yes.

Yves.


Maya Abou Zeid

Dec 15, 2015, 6:09:35 AM
to lav...@googlegroups.com
Thank you Yves for all these useful references.