1. Thanks, Denis, for the clarification. At least I now know what the issues are. BTW, sigma and the correlation from the gsem model above are retrieved as follows:
nlcom (sigma: sqrt(_b[/var(e.lwageCen)] + _b[lwageCen:L]^2)) ///
      (rho: _b[lwageCen:L]/sqrt((_b[/var(e.lwageCen)] + 1)*(_b[/var(e.lwageCen)] + _b[lwageCen:L]^2)))
Likewise, the coefficients of the probit part are retrieved by dividing the gsem selection-equation coefficients by sqrt(_b[/var(e.lwageCen)] + 1).
After transformation, both sigma and rho match R's Heckman selection estimates.
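For reference, here is a sketch of where these expressions come from. It assumes the Heckman gsem specification mirrors the usual setup: the latent variable L has variance 1, it enters the selection equation with loading 1, and both error variances are constrained to the same value. The symbols are shorthand introduced here, not Stata names: a stands for _b[/var(e.lwageCen)], lambda for _b[lwageCen:L], and gamma for the selection-equation coefficients as gsem reports them.

```latex
\begin{align*}
y   &= x\beta + \lambda L + e_1, \qquad e_1 \sim N(0, a) \\
s^* &= z\gamma + L + e_2, \qquad e_2 \sim N(0, a), \quad L \sim N(0, 1) \\
\sigma &= \sqrt{\operatorname{Var}(\lambda L + e_1)} = \sqrt{a + \lambda^2} \\
\operatorname{Var}(L + e_2) &= a + 1
  \quad\Rightarrow\quad \gamma_{\text{probit}} = \frac{\gamma}{\sqrt{a + 1}} \\
\rho &= \frac{\operatorname{Cov}(\lambda L + e_1,\, L + e_2)}
             {\sqrt{(a + 1)(a + \lambda^2)}}
      = \frac{\lambda}{\sqrt{(a + 1)(a + \lambda^2)}}
\end{align*}
```

With L, e_1 and e_2 mutually independent, the only shared term is L, so the covariance of the two composite errors is just lambda.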
2. As for the simultaneous model we discussed earlier, gsem also allows simultaneous modeling of mixed distributions by introducing latent variables into the equations and rescaling them if necessary. An instrumental-variable example with a binary right-hand-side variable might look something like this:
#Endogenous treatment model
y1 = a0 + a1*x1 + a2*y2 + u (y1 is Gaussian, e.g., wage; y2 is endogenous)
y2* = b0 + b1*z1 + v, with y2 = 1 if y2* > 0 (y2 is an endogenous probit outcome, e.g., union membership)
* Create the censoring limits first; gsem needs them to exist
generate llunion = 0 if union == 1 // lower limit 0 for members; missing otherwise
generate ulunion = 0 if union == 0 // upper limit 0 for non-members; missing otherwise
gsem (wage <- age grade i.smsa i.black tenure 1.union L) ///
     (llunion <- i.black tenure i.south L@1, ///
         family(gaussian, udepvar(ulunion))), ///
     var(L@1 e.wage@a e.llunion@a)
Note: unlike in the Heckman model, the continuous variable wage is observed here for both the treated and untreated samples (union and non-union members). The model allows correlation between the two equations through the latent variable L.
The output of a model like this (it follows Example 46g, "Endogenous treatment-effects model", in the Stata SEM manual) looks like this:
Generalized structural equation model              Number of obs = 1,210
Response: wage
Family:   Gaussian
Link:     Identity
                                                   Censoring of obs:
Lower response: llunion                            Uncensored     =   0
Upper response: ulunion                            Left-censored  = 957
Family:         Gaussian                           Right-censored = 253
Link:           Identity                           Interval-cens. =   0
Log likelihood = -3051.575
 ( 1)  [llunion]L = 1
 ( 2)  - [/]var(e.wage) + [/]var(e.llunion) = 0
 ( 3)  [/]var(L) = 1
                Coefficient   Std. err.        z   P>|z|     [95% conf. interval]
wage (outcome)
(coefficients, std. err., z-value, etc.)
  age              .1487409    .0193291     7.70   0.000     .1108566    .1866252
  grade            .4205658    .0293577    14.33   0.000     .3630258    .4781057
  1.smsa           .9117044    .1249041     7.30   0.000     .6668969    1.156512
  1.black         -.7882472    .1367077    -5.77   0.000    -1.056189    -.520305
  tenure           .1524015    .0369595     4.12   0.000     .0799621    .2248408
  1.union          2.945816    .2749549    10.71   0.000     2.406914    3.484718
  L               -1.706795    .1288024   -13.25   0.000    -1.959243   -1.454347
  _cons           -4.351572    .5283952    -8.24   0.000    -5.387207   -3.315936
llunion (treatment)
(coefficients, std. err., z-value, etc.)
  1.black          .6704049     .148057     4.53   0.000     .3802185    .9605913
  tenure           .1282024    .0357986     3.58   0.000     .0580384    .1983664
  1.south         -.8542673     .136439    -6.26   0.000    -1.121683   -.5868518
  L                       1  (constrained)
  _cons           -1.302676    .1407538    -9.25   0.000    -1.578548   -1.026804
  var(L)                  1  (constrained)
  var(e.wage)      1.163821    .2433321                      .7725324    1.753298
  var(e.llun~n)    1.163821    .2433321                      .7725324    1.753298
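As a quick arithmetic check, the sigma, rho and probit rescaling from point 1 can be applied to the estimates printed above. This is only a sketch in plain Python. It assumes the same formulas carry over to the treatment model because L has variance 1, enters the treatment equation with loading 1, and both error variances are constrained equal. The function name `rescale` is made up for illustration, and the numbers are copied by hand from the output.

```python
from math import sqrt


def rescale(var_e, lam):
    """Map gsem estimates to sigma, rho, and the probit scale factor.

    var_e : common error variance, var(e.wage) = var(e.llunion)
    lam   : loading of the latent variable L in the wage equation
    Assumes var(L) = 1 and a loading of 1 in the treatment equation.
    """
    sigma = sqrt(var_e + lam**2)                      # sd of the wage error
    rho = lam / sqrt((var_e + 1) * (var_e + lam**2))  # correlation of the two errors
    probit_scale = sqrt(var_e + 1)                    # divide treatment coefs by this
    return sigma, rho, probit_scale


# Estimates copied from the output above
var_e = 1.163821   # var(e.wage)
lam = -1.706795    # coefficient on L in the wage equation
sigma, rho, probit_scale = rescale(var_e, lam)

# Treatment-equation coefficients, moved to the usual probit scale
treatment = {"1.black": 0.6704049, "tenure": 0.1282024,
             "1.south": -0.8542673, "_cons": -1.302676}
probit = {name: b / probit_scale for name, b in treatment.items()}

print(f"sigma = {sigma:.4f}, rho = {rho:.4f}")
for name, b in probit.items():
    print(f"{name:8s} {b: .4f}")
```

With these inputs sigma comes out at roughly 2.02 and rho at roughly -0.57. In Stata the same quantities, with delta-method standard errors, would come from nlcom as in point 1, using wage in place of lwageCen.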
########################################
It would be wonderful to have these options in INLA through its u and v's.