BYM2 hurdle joint model with stacked responses

18 views

Skip to first unread message

Felipe Barletta

unread,

Oct 9, 2025, 1:02:57 PMOct 9

to r-inla-disc...@googlegroups.com

Hy my friends,

I am fitting a BYM2 hurdle joint model to hospitalization cancers in Portugal, where

it a bivariate hurdle model with spatial effects (BYM2) for two outcomes:

SevHigh (high severity hospitalizations count)
c.obitos (death hospitalization count)

Each outcome uses a two-part hurdle:

Binary part (is 0 if not count observed, 1 if observed): Binomial
Positive part (given > 0): censored Poisson with truncation at 0 (cenpoisson.I = c(0,0))

Both parts include BYM2 spatial random effects on Portuguese municipalities (278 areas).

Y_bin1 (binomial for SevHigh)
Y_bin2 (binomial for c.obitos)
Y_pos1 (censored Poisson for SevHigh)
Y_pos2 (censored Poisson for c.obitos)

Each vector has length L, with non-NAs only in its active block (size n) and NA elsewhere. Same pattern for covariates and random-effect indices.

Key pieces (abbreviated):

## --- Number of observations
n.l1 <- nrow(train)
n.l2 <- nrow(train)

## --- Auxiliary vectors
NAs.l1 <- rep(NA, n.l1)
NAs.l2 <- rep(NA, n.l2)

## --- Response variables
# Binary
Z1 <- as.integer(train$SevHigh > 0)
Z2 <- as.integer(train$c.obitos > 0)

# Positive part
Y1_pos <- ifelse(train$SevHigh > 0, train$SevHigh, NA)
Y2_pos <- ifelse(train$c.obitos > 0, train$c.obitos, NA)

# Stack each outcome in vector with NA to align
Y_bin1 <- c(Z1, NAs.l2, NAs.l1, NAs.l2) # binomial Y1
Y_bin2 <- c(NAs.l1, Z2, NAs.l1, NAs.l2) # binomial Y2
Y_pos1 <- c(NAs.l1, NAs.l2, Y1_pos, NAs.l2) # cenpoisson Y1
Y_pos2 <- c(NAs.l1, NAs.l2, NAs.l1, Y2_pos) # cenpoisson Y2

# Join in list

Y.joint <- list(Y_bin1, Y_bin2, Y_pos1, Y_pos2)

Covariates are stacked with the same block structure. I also create 4 index vectors for BYM2 effects (one per stacked part), with non-NAs only in the corresponding block:

# Covariates

covariates <- list(
inter = as.factor(c(rep("b.l1", n.l1), rep("b.l2", n.l2),
rep("p.l1", n.l1), rep("p.l2", n.l2))),

# month
month.l1 = c(train$mes_id, NAs.l2, train$mes_id, NAs.l2),
month.l2 = c(NAs.l1, train$mes_id, NAs.l1, train$mes_id),

# gender of patient
gender.l1 = c(train$gender, NAs.l2, train$gender, NAs.l2),
gender.l2 = c(NAs.l1, train$gender, NAs.l1, train$gender),

# age > 60
age60.l1 = c(train$age60, NAs.l2, train$age60, NAs.l2),
age60.l2 = c(NAs.l1, train$age60, NAs.l1, train$age60),

# average length of stay

avg.los.l1 = c(train$avg.los, NAs.l2, train$avg.los, NAs.l2),
avg.los.l2 = c(NAs.l1, train$avg.los, NAs.l1, train$avg.los),

#Ei
Ei.l1 = c( NAs.l2,NAs.l2,train$Ei2,NAs.l2),
Ei.l2 = c(NAs.l1, NAs.l2,NAs.l2,train$Ei3)
)

# Unique index
unique.id <- unique(train$city_id)
n.id <- length(unique.id)
idx.l1 <- match(train$city_id, unique.id)
idx.l2 <- match(train$city_id, unique.id)

## Random effects - spatial

r.effects <- list(
# binomial Y1
idx.sh.l1.1 = c(idx.l1, NAs.l2, rep(NA,n.l1), rep(NA,n.l2)),
# binomial Y2
idx.sh.l2.1 = c(NAs.l1, idx.l2, rep(NA,n.l1), rep(NA,n.l2)),
# cenpoisson Y1
idx.sh.l1.2 = c(rep(NA,n.l1), rep(NA,n.l2), idx.l1, NAs.l2),
# cenpoisson Y2
idx.sh.l2.2 = c(rep(NA,n.l1), rep(NA,n.l2), NAs.l1, idx.l2)
)

joint.data <- c(covariates, r.effects)
joint.data$Y <- Y.joint

#Offsets for the positive parts only:
E1 <- na.omit(joint.data$Ei.l1) # deve ter comprimento n.l1 = 26132
E2 <- na.omit(joint.data$Ei.l2) # idem

## model

m_hurdle <- inla(
Y ~ -1 + inter +
gender.l1 + age60.l1 + avg.los.l1 + month.l1 +
gender.l2 + age60.l2 + avg.los.l2 + month.l2 +
f(idx.sh.l1.1, model="bym2", graph=W.pt, scale.model=TRUE, hyper=prior1) +
f(idx.sh.l2.1, model="bym2", graph=W.pt, scale.model=TRUE, hyper=prior2) +
f(idx.sh.l1.2, model="bym2", graph=W.pt, scale.model=TRUE, hyper=prior1) +
f(idx.sh.l2.2, model="bym2", graph=W.pt, scale.model=TRUE, hyper=prior2),

data = joint.data,
family = c("binomial","binomial","cenpoisson","cenpoisson"),
Ntrials = list(rep(1,n.l1), rep(1,n.l2), NULL, NULL),
E = list(NULL, NULL, E1, E2),
control.family = list(
list(), list(),
list(cenpoisson.I=c(0,0)), # truncation in 0
list(cenpoisson.I=c(0,0))
),
control.compute = list(dic=TRUE, waic=TRUE, cpo=TRUE),
control.predictor = list(compute=TRUE),
control.inla = list(int.strategy="eb")
)

And that's the message error:

*** inla.core.safe: The inla program failed, but will rerun in case better initial values may help. try=1/1
Error in inla.core.safe(formula = formula, family = family, contrasts = contrasts, :
arguments imply differing number of rows: 26132, 0
The inla program failed and the maximum number of tries has been reached.

What I’ve checked

Lengths: every stacked vector (responses, covariates, index vectors)
Non-NAs per part: sum(!is.na(Y_bin1)) = n, same for Y_bin2, Y_pos1, Y_pos2.
Ntrials lengths match binomial non-NAs: length(rep(1,n)) = n.
Offsets E1, E2 have length n and contain no NAs.
Index ranges for the four BYM2 terms are within 1..278, with n_valid = n and n_levels = 278 for each stacked index.
Graph W.pt reports n = 278 and non-empty neighbor lists.

Given that, I suspect the error is with Ntrials and E but I'm not sure.

If any of the above are wrong, could that alone explain the 26132, 0 mismatch?

If someone has any recommendations or examples on correctly aligning Ntrials/E with stacked components, I’d be very grateful.
Thank you in advance for your time and help!

Finn Lindgren

unread,

Oct 9, 2025, 4:55:47 PMOct 9

to Felipe Barletta, r-inla-disc...@googlegroups.com

I think you're right in suspecting E/Ntrials: When using inla.stack() to organise the data for multiple observation models, E and Ntrials would be constructed to be vectors of the same length as the total predictor, so instead of

E = list(NULL, NULL, E1, E2)

you need
E = c(rep(1, length(Y_bin1) + length(Y_bin2)), E1, E2)

and similarly for Ntrials (the values won't be used by the parts of the models that don't involve them, so setting the to 1 is ok.
NA might also be ok, as I think that's what inla.stack() would do if you include the E/Ntrials vectors as part of the 'data' argument.

Note: Both inla.stack() and the inlabru::bru() interface would handle the data stacking for you.

Finn

--
You received this message because you are subscribed to the Google Groups "R-inla discussion group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion...@googlegroups.com.
To view this discussion, visit https://groups.google.com/d/msgid/r-inla-discussion-group/CAJz4ZjHQs2CAj0qdy8WRZAv%2B9spmZKWtLHPCPkuxJvZ25FD_Zg%40mail.gmail.com.

Finn Lindgren
email: finn.l...@gmail.com

Reply all

Reply to author

Forward

0 new messages