Can INLA do joint likelihood with cross-eq correlation?


Alok Bohara

May 12, 2022, 12:36:09 PM
to R-inla discussion group
I am new to INLA and trying to replicate a few models I am aware of.  I tried to estimate a set of two likelihoods --one selection equation (binary) and one outcome equation (poisson or gaussian).  I managed to create a Y matrix stacking the two response variables --binary and continuous.  Using the binomial (probit link) and gaussian families, I managed to estimate the two likelihoods.  It turns out that the results are similar to estimating the two likelihoods independently.  I used Stata to verify the results.

I estimated the same selection/outcome model by joint maximum likelihood in Stata, allowing the covariance between the two errors to be non-zero.  Allowing the two errors to be correlated (with an efficiency gain) gave me results different from what I got from INLA.

Question:  In INLA, is it possible to allow the two errors (binomial/probit and gaussian outcome) to be correlated, so that the estimation is truly joint?  I saw a reference to this issue somewhere, but I did not find the answer or any details about how to accomplish it --how to combine the two types of errors in the formula, etc.  Thanks

Alok 

Helpdesk

May 14, 2022, 2:24:43 AM
to Alok Bohara, R-inla discussion group

There are models

iid2d

or

iidkd

(better implementation)

that you can use. Just let the first component be associated with
likelihood 1 and the second component be associated with likelihood 2.
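A minimal sketch of that pattern (untested here, with simulated illustrative data; the index layout, variable names, and family choices are assumptions, not taken from this thread):

```r
## Two likelihoods, n observations each, with correlated random effects
## via the iid2d model: indices 1..n belong to the first component and
## n+1..2n to the second, so elements i and i+n form a correlated pair.
library(INLA)
set.seed(1)
n  <- 200
y1 <- rbinom(n, 1, 0.5)          # binary response for likelihood 1
y2 <- rnorm(n)                   # gaussian response for likelihood 2

Y <- matrix(NA, 2 * n, 2)        # stacked two-column response
Y[1:n, 1]     <- y1
Y[n + 1:n, 2] <- y2

idx <- 1:(2 * n)                 # first half -> component 1, second -> 2

form <- Y ~ 1 + f(idx, model = "iid2d", n = 2 * n)
fit  <- inla(form, data = list(Y = Y, idx = idx),
             family = c("binomial", "gaussian"))
summary(fit)
```

The same layout works with `model = "iidkd", order = 2` in place of `iid2d`.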
--

--
Håvard Rue
he...@r-inla.org

Alok Bohara

May 14, 2022, 1:12:48 PM
to R-inla discussion group
Håvard: Thanks.  It is good to know that INLA can allow correlation between two or more likelihoods.  I have gone through quite a few INLA materials, including a couple of books I recently purchased.  Most of the joint models mentioned in these materials seem to estimate the two likelihoods independently (binomial-gaussian, binomial-poisson, etc.).  I did not see any reference to iid2d or iidkd.

Could you please send me a link to an example model set-up with iid2d and/or iidkd?  Thanks.

Alok

Alok Bohara

May 20, 2022, 12:20:34 PM
to R-inla discussion group
I went through the correlated random effects notes (iid2d, etc.).  It appears that iid2d internally assumes a single column vector of two stacked random-effect variables, whereas my examples have two columns for the two likelihoods --binomial and gaussian, or poisson and binomial, etc.-- and in some cases different sample sizes for the two likelihoods.  I have gone through the INLA books to estimate these models.  I succeeded in running them, but they all turned out to be independent likelihoods; I am interested in their correlations as well.

Examples of the types:  Y1 = X*B1 + d*Y2 + u, where Y1 is binomial (probit link), and Y2 = Z*B2 + v (gaussian or another binomial).  Y1 could also be poisson and Y2 binomial.
Or a system of two binomials (or poissons): Y1 = X*B1 + u and Y2 = Z*B2 + v, with non-zero cov(u,v).

That is, a correlated joint model with mixed likelihoods is what I am trying to estimate using INLA.  Tips or links to similar worked-out examples would be highly appreciated.

Alok

Denis Rustand

May 21, 2022, 3:19:58 AM
to R-inla discussion group
Hi Alok,

The first example you give is a bit tricky, since the linear predictor for Y1 includes Y2.  I am not sure that is what you meant, but it is possible to share a linear predictor between two likelihoods, although that is a bit trickier than just correlated random effects.  Attached is code that simulates and fits the second example you gave, with two poisson regression models.

Denis Rustand
he...@r-inla.org
JointLik.R

Alok Bohara

May 22, 2022, 3:42:49 PM
to R-inla discussion group
Thanks Denis.  The seemingly related poisson INLA code you sent with iidkd was very helpful.  I estimated it, and the results came out quite similar to other software's results (Stata's copula method).  A similar R program I tried (gjrm) is taking an extremely long time and a lot of memory; it is still running.  INLA is quite fast.  I am trying to find an explanation of the theta1, theta2, and theta3 that the script spits out.

1. The model with a feedback mechanism (the first example you referred to, with Y2 appearing in the Y1 equation) is what I actually meant.  Example:
Y1 = b1*X1 + d*Y2 + u   (Y1 is, say, ag production --count or gaussian; Y2 is intervention/technology adoption)
Y2 = b2*Z2 + v          (the technology adoption decision (1/0) depends on other factors like Z2 --binomial or gaussian); cov(u,v) is non-zero

I have not found any INLA code along these lines.

I can set up this model using GMM or a joint multivariate normal (gaussian-binomial) maximum likelihood with non-zero cov(u,v).  We encounter many models of this type in the social sciences, and my interest in using INLA is to add a spatial component, which is rather convenient in INLA, along with its many other powerful properties.

2. I used your two-count-variable INLA script and converted it into a two-binomial model using probit links.  The fixed coefficients of this seemingly unrelated probit model (as I call it) came out very close to what I got from other software (Stata and R).  The problem was the variance-covariance parameters: the diagonals from INLA were .013 and .012 (variances?) and the off-diagonal (covariance) was .0001, giving a very small correlation (.009), as opposed to the .14 to .2 the other software gave.  I don't know whether, this being a binomial family, we need to fix the precision parameter at 1, which is what we usually set the probit variance to.  But then we are adding random-effect parameters in the INLA setting, and their variances need not be 1 even in the probit case, unlike the usual MLE set-up.  I think?

Thanks.

Denis Rustand

May 29, 2022, 5:53:47 AM
to R-inla discussion group
Hi Alok,

In the code I provided, the theta1, theta2 and theta3 you see with the iidkd parameterization are the elements of the Cholesky matrix.  This is a new parameterization with better properties than the previously used iid2d (parameterized as precisions and correlation).  The end of the code, where I generate MC_samples, shows how to get the variance-covariance of the random effects from the Cholesky terms.
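The algebra behind that recovery can be sketched in base R.  This assumes L is the lower-triangular Cholesky factor of the precision matrix (the exact theta-to-L mapping is handled internally by inla.iidkd.sample(), and the numbers below are purely illustrative):

```r
## Illustrative values only: given a lower-triangular Cholesky factor L
## of the 2x2 precision matrix Q = L %*% t(L), invert Q to obtain the
## variance-covariance of the random effects, then the correlation.
L <- matrix(c(2.0, 0.0,
              0.5, 1.5), nrow = 2, byrow = TRUE)
Q     <- L %*% t(L)                                     # precision matrix
Sigma <- solve(Q)                                       # variance-covariance
rho   <- Sigma[1, 2] / sqrt(Sigma[1, 1] * Sigma[2, 2])  # correlation
```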

About the feedback model: I am not sure what you mean exactly by "Y2" in the first equation (the linear predictor for this outcome?).  With INLA you can share the linear predictor of Y2 in Y1's linear predictor, so I guess that is what you want to do?
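One hedged sketch of how such sharing is usually set up (the index names u.idx and u.copy and the covariate names are illustrative, not taken from the attached code): equation 2's linear predictor eta2 is tied to an almost-unpenalized iid effect u (via augmented zero "observations" fitted with a high-precision gaussian likelihood, not shown here), and u is then copied into equation 1 with an estimated scaling coefficient d:

```r
## Sketch only: f(u.idx) holds eta2 as a nearly flat iid effect
## (log-precision fixed at -6, i.e. a very large variance), and the
## copy term adds d * eta2 to equation 1's predictor, with d estimated.
form <- Y ~ 0 + mu.y + mu.z + cov.z.edu +
  f(u.idx, model = "iid",
    hyper = list(prec = list(initial = -6, fixed = TRUE))) +
  f(u.copy, copy = "u.idx", fixed = FALSE)
```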

About the binomial: it is indeed often an issue, because binary data is not very informative compared to continuous or count data.  The poor results are therefore not about INLA but about the fact that your data likely does not contain enough information to recover the parameter estimates, particularly the random effects.  Attached is an example of a univariate binomial model with 10 trials instead of 1, which works fine (alternatively, repeated individual measurements of a binary outcome can also provide enough information to properly recover the value of an individual random effect).  When I reduce the number of trials to 1 for this example (i.e., binary), the results are not good, but they match the results from alternative estimation strategies (e.g., the glmer() function from the lme4 R package); I also illustrate this in the attached code.
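A rough illustration of this information issue (my own sketch with simulated data, not the attached file), using lme4 for brevity: with 10 trials per individual the random-intercept standard deviation is typically recoverable, while with a single binary trial it often is not.

```r
## Simulate a random-intercept binomial model with 10 trials per
## individual, then refit it with glmer(); the estimated random-effect
## SD should land reasonably near the true value of 1.
library(lme4)
set.seed(1)
n.id     <- 200
n.trials <- 10
b <- rnorm(n.id, sd = 1)                      # true individual effects
p <- plogis(-0.5 + b)                         # success probabilities
y <- rbinom(n.id, size = n.trials, prob = p)
d <- data.frame(id = factor(1:n.id), succ = y, fail = n.trials - y)
fit <- glmer(cbind(succ, fail) ~ 1 + (1 | id), data = d, family = binomial)
summary(fit)  # compare the id SD estimate with the simulated sd = 1
```

Setting n.trials to 1 in the same sketch reproduces the binary case, where the random-effect variance is much harder to recover.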

If you have an example of a model that works with an algorithm other than INLA, please provide some details (code and output) so I can try to look into it.

Best,
Denis Rustand
he...@r-inla.org
JointLik_binary_univariate.R

Alok Bohara

May 29, 2022, 11:14:12 AM
to R-inla discussion group
Hi Denis,  Thanks for the response and your willingness to check my code.  I have created a small example of a selection model with labor force participation (selection) and wage (outcome).  I have pasted both the data set-up and all the code below.  The selection results from R and Stata match perfectly, including the correlation; both use maximum likelihood.  As for INLA, I wonder if my data stacking is wrong, or if some scaling is required (per Håvard's note) given the two different data-generating processes --binomial/probit and gaussian.
Much appreciated. 

################################################################################
###Var-Cov from MC_samples is
       [,1]   [,2]
[1,] 0.0105 0.0008
[2,] 0.0008 0.0104  => rho = .077
whereas Stata's and R's Heckman selection models give rho = .657.  The fixed coefficients also differ between the INLA and R results.

R's sampleselection: 

--------------------------------------------
Tobit 2 model (sample selection model)
Maximum Likelihood estimation
Newton-Raphson maximisation, 7 iterations
Return code 2: successive function values within tolerance limit (tol)
Log-Likelihood: -778.8733
753 observations (325 censored and 428 observed)
7 free parameters (df = 746)
Probit selection equation:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -1.19504    0.26577  -4.496 8.01e-06 ***
educ         0.13646    0.02230   6.120 1.51e-09 ***
huswage     -0.04056    0.01151  -3.524 0.000451 ***
Outcome equation:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.08324    0.17936  -0.464    0.643    
educ         0.10670    0.01182   9.024   <2e-16 ***
   Error terms:
      Estimate Std. Error t value Pr(>|t|)    
sigma  0.54357    0.03711   14.65  < 2e-16 ***
rho    0.65705    0.11137    5.90 5.53e-09 ***
---

STATA:
Heckman selection model                         Number of obs     =        753
(regression model with sample selection)              Selected    =        428
                                                      Nonselected =        325

                                                Wald chi2(1)      =      81.43
Log likelihood = -778.8733                      Prob > chi2       =     0.0000

------------------------------------------------------------------------------
    lwageCen |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
lwageCen     |
        educ |   .1066999   .0118239     9.02   0.000     .0835256    .1298743
       _cons |   -.083243   .1793643    -0.46   0.643    -.4347906    .2683047
-------------+----------------------------------------------------------------
select       |
        educ |   .1364611    .022296     6.12   0.000     .0927617    .1801605
     huswage |  -.0405632   .0115115    -3.52   0.000    -.0631253    -.018001
       _cons |  -1.195037   .2657708    -4.50   0.000    -1.715938   -.6741362
-------------+----------------------------------------------------------------
     /athrho |   .7876071   .1959807     4.02   0.000      .403492    1.171722
    /lnsigma |  -.6095971    .068275    -8.93   0.000    -.7434138   -.4757805
-------------+----------------------------------------------------------------
         rho |   .6570513   .1113726                      .3829329    .8248235
       sigma |   .5435698   .0371123                      .4754879    .6213999
      lambda |   .3571533   .0825198                      .1954175     .518889
------------------------------------------------------------------------------
LR test of indep. eqns. (rho = 0):   chi2(1) =     4.91   Prob > chi2 = 0.0268

INLA with iidkd
                mean    sd 0.025quant 0.5quant 0.975quant   mode kld
mu.y          -0.492 0.160     -0.806   -0.492     -0.179 -0.492   0
cov.y.edu      0.110 0.013      0.084    0.110      0.135  0.110   0
mu.z          -1.129 0.264     -1.650   -1.128     -0.615 -1.127   0
cov.z.edu      0.133 0.022      0.089    0.133      0.177  0.133   0
cov.z.huswage -0.043 0.012     -0.067   -0.043     -0.019 -0.043   0
#############################################################################################


Data and Codes:
######################################
library(sampleSelection)
data("Mroz87")
names(Mroz87)
data<-Mroz87
data$lwage = log(data$wage+1)

##R's sample selection Heckman model (matches with Stata's heckman)
## A simple example
R.HeckMle <- heckit(
selection = lfp ~  educ + huswage,
outcome = lwage ~ educ,
method = "ml",
data = data)
summary(R.HeckMle)

###############INLA ####################################
library(INLA)
n <-nrow(data)

### covariates and random effect data
#outcome eq
cov.y.edu <- data$educ
#selection eq
cov.z.edu <- data$educ
cov.z.huswage <- data$huswage
# Two r.e. (binomial and gaussian)
u <- c(1:n, rep(NA, n))    # binomial data
v <- c(rep(NA, n), n+(1:n)) # gaussian >0 data

#two likelihood data columns
Y <- matrix(NA,n+n,2) 
Y[1:n,1] <- data$lfp
Y[n+1:n,2] <- data$lwage   


#constants
mu.z <- c(rep(1,n),rep(NA,n))
mu.y <- c(rep(NA,n),rep(1,n))

#data preparation
ldat = list(
Y=Y,
mu.z=mu.z,
mu.y=mu.y,
# selection
cov.z.edu=  c(cov.z.edu, rep(NA,n)),
cov.z.huswage=  c(cov.z.huswage, rep(NA,n)),
#outcome
cov.y.edu = c(rep(NA, n), cov.y.edu),
u=u,
v=v)


#Mle Heckman INLA ?
for.Heck.inla = Y ~ 0+mu.y+ cov.y.edu +
              mu.z + cov.z.edu + cov.z.huswage +
                f(u, model="iidkd", order=2, n=2*n)+f(v, copy="u")

heckman.inla <- inla(for.Heck.inla, data=ldat,
    family=c("binomial", "gaussian"),
     control.inla = list(int.strategy = "eb"),
      control.family=list(list(control.link = list(model = "probit")),
                           list(control.link = list(model = "identity"))) )

summary(heckman.inla)
Model <-heckman.inla

# random effects summary
MC_samples <- inla.iidkd.sample(10^4, Model, "u", return.cov=T)
VarCov <- matrix(unlist(MC_samples), nrow = 2^2)
VarCovMeans = matrix(rowMeans(VarCov),2,2);round(VarCovMeans,4) # mean of variance-covariance


VarCovSD = matrix(apply(VarCov, 1, sd),2,2);round(VarCovSD,2) # sd
VarCov025 = matrix(apply(VarCov, 1, function(x) quantile(x, 0.025)),2,2);round(VarCov025,2) # quantiles
VarCov05 = matrix(apply(VarCov, 1, function(x) quantile(x, 0.5)),2,2);round(VarCov05,2)
VarCov975 = matrix(apply(VarCov, 1, function(x) quantile(x, 0.975)),2,2);round(VarCov975,2)




Alok Bohara

May 29, 2022, 12:22:23 PM
to R-inla discussion group
Sorry, I also tried putting NA (instead of 0) for the outcome wage in the Y matrix for INLA.  The INLA results did not change.

data$lwageNA <- ifelse(data$wage>0, data$lwage, NA)  # NA instead of 0 for non-participants
Y <- matrix(NA,n+n,2) #
Y[1:n,1] <- data$lfp
Y[n+1:n,2] <- data$lwageNA   
