Sensitivity analysis for regression with causal relationship between "independent" variables

Arun Luthra

Jan 21, 2019, 6:53:15 PM
to lavaan
I would like to run a regression using observed variables "A" and "B" to predict a target variable "y".

My goal is a sensitivity analysis for y, where I can use the coefficients to estimate the effect of A and B on y. However, I want to incorporate my theory that variation in B is partially caused by A.

There is a two step solution for this situation that does not involve SEM:
1) Do a linear regression to predict A as a function of B. Then, I can compute some new variable Z which is the residual of B. This residual is the part of B that is not caused by A.
2) Do a linear regression to predict "y" as a function of A and Z. Now, the sensitivity analysis will properly assign credit to A, and assign credit to the part of B that is not due to A.
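
In R, that two-step procedure might look like the following sketch (dat is a hypothetical data frame with columns y, A, and B; note that, per the correction in the next message, B is regressed on A):

step1 <- lm(B ~ A, data = dat)        # model B as a function of A
dat$Z <- resid(step1)                 # Z: the part of B not explained by A
step2 <- lm(y ~ A + Z, data = dat)    # assigns credit to A and to Z
summary(step2)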

Question: Is there a natural way to do this modeling in SEM in one step? Is SEM appropriate for this situation? Is there some slightly more complex situation where SEM would suddenly become preferable to the two-stage linear regression approach?

My possibly misguided idea of how to solve this sensitivity analysis with SEM is to do a regression to predict B using A. Since B is an observed variable, it will have a latent residual variable attached to it (an error term). I can add a regression to the graph, which is "y ~ A + residualB". Is it possible to access the error term of B in lavaan? Is this approach reasonable?

Arun Luthra

Jan 21, 2019, 7:14:16 PM
to lavaan
Correction: B is partially caused by A, so in step (1), I would model B as a function of A (not the other way around).

Edward Rigdon

Jan 21, 2019, 9:09:03 PM
to lav...@googlegroups.com
     There is an extensive literature on mediation analysis. However, causal analysis folks will fault your methodology because it does not account for the possibility that A is also endogenous--that A and y are jointly dependent on some excluded variable. Nothing in your plan goes beyond that textbook mediation analysis.
     But you certainly can turn the residual of B into a variable. Constrain B's residual variance to 0, and create a new latent variable residB, which is orthogonal to A and predicts B. Then residB will be B's residual, and if you want you can use it to predict y.

'
residB =~ 0*y        ! creates the new variable residB, which is not in the dataset
B ~ A + residB
B ~~ 0*B             ! B has 0 residual variance
residB ~~ 0*A        ! residB is uncorrelated with A
y ~ A + residB       ! A and residB predict y
'
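
For reference, fitting this might look like the following (a sketch only; dat is a hypothetical data frame containing y, A, and B, and fixing the residB loading on B at 1, to give residB a scale, is an extra assumption not in the syntax above, which leaves that path free):

library(lavaan)

model <- '
residB =~ 0*y        ! phantom latent variable, not in the dataset
B ~ A + 1*residB     ! path fixed at 1 to set residB's scale (assumption)
B ~~ 0*B             ! B has 0 residual variance
residB ~~ 0*A        ! residB is uncorrelated with A
y ~ A + residB       ! A and residB predict y
'

fit <- sem(model, data = dat, fixed.x = FALSE)   # fixed.x = FALSE so the residB ~~ A constraint applies to a modeled A
summary(fit)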

Arun Luthra

Jan 23, 2019, 5:01:00 PM
to lavaan
What is the meaning of residB =~ 0*y? It seems like you are defining a new latent variable with a single loading that is fixed at zero?

I had to make some changes to get lavaan to run without errors on my data:

myModel <- '
residB =~ 0*y
y ~ A + start(1.0)*residB
B ~ A + start(1.0)*residB
residB ~~ 0.001*A
B ~~ 0.001*B
'
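
A hypothetical fit call for this version (same assumed data frame dat as above):

fit <- sem(myModel, data = dat, fixed.x = FALSE)
summary(fit)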

Edward Rigdon

Jan 23, 2019, 5:13:11 PM
to lav...@googlegroups.com
Yes. The =~ operator defines a latent variable on the left-hand side. The observed variable on the right-hand side does not matter, as long as you fix the loading at 0. So this line creates a new latent variable, and y does not load on it.