Left censoring with gaussian observations below LOD

84 views
Skip to first unread message

Jonas

unread,
May 1, 2026, 12:40:02 AM (13 days ago) May 1
to R-inla discussion group
Hi, 

I am currently working with a dataset where the outcome variable follows a Gaussian distribution but is subject to left censoring. Specifically, 35% of my observations fall below a limit of detection (LOD) (LOD is different for some of the outcomes).

I was unable to find a straightforward implementation for this specific scenario within INLA. However, since I am fitting a joint model with a spatial component and a large number of data points, I would like to remain within the INLA framework. Preferably a fully Bayesian approach or a likelihood-based adjustment (no classical imputation). 

Any easy or complicated suggestions would be greatly appreciated.

Kind regards,
Jonas

Helpdesk (Haavard Rue)

unread,
May 1, 2026, 4:19:55 AM (13 days ago) May 1
to Jonas, R-inla discussion group
do yy = exp(y), and then use 'lognormalsurv' for 'yy', which has the ability to
do all these cencoring stuff
> --
> You received this message because you are subscribed to the Google Groups "R-
> inla discussion group" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to r-inla-discussion...@googlegroups.com.
> To view this discussion, visit
> https://groups.google.com/d/msgid/r-inla-discussion-group/91de9b4e-0f57-4e8a-97e1-8b5665907473n%40googlegroups.com
> .

--
Håvard Rue
he...@r-inla.org

Helpdesk (Haavard Rue)

unread,
May 1, 2026, 4:34:33 AM (13 days ago) May 1
to Jonas, R-inla discussion group
something like this

## do linear regression where data are subject to left censoring
n <- 300
x <- rnorm(n)
eta <- 1 + 0.2 * x
s <- 0.1
y <- eta + rnorm(n, sd = s)
L <- 1
y <- pmax(L, y)

idx <- which(y <= L)
e <- rep(1, n)
e[idx] <- 2
YY <- inla.surv(time = exp(y), event = e)

r <- inla(YY ~ 1 + x,
data = list(YY = YY, x = x),
family = "lognormalsurv")
summary(r)
--
Håvard Rue
he...@r-inla.org

Jonas

unread,
May 5, 2026, 4:17:38 AM (9 days ago) May 5
to R-inla discussion group
Thank you for your reply and the example!

Kind regards,
Jonas

Op vrijdag 1 mei 2026 om 10:34:33 UTC+2 schreef Helpdesk (Haavard Rue):

Jonas

unread,
May 6, 2026, 2:12:18 PM (7 days ago) May 6
to R-inla discussion group
Hi, 

I have a follow up question. 
I am trying to fit a joint model in INLA that includes spatial random effects with gaussian outcome (blood measurements) and a censored gaussian outcome (soil measurements).
However, when I try to combine my stacks using:

> stk_joint <- inla.stack(stk_blood, stk_soil) Error in `dplyr::bind_rows()`: ! Can't combine `..1$y.1` <double> and `..2$y.1` <list>. Run `rlang::last_trace()` to see where the error occurred.I get an error. From what I understand, there is a mismatch in the structure of the response variables between the two stacks (numeric vs list via inla.surv).
I'm using the Rcode below with simulated data to check my method implementation.

Any suggestions on how to circumvent this problem?


# ------------------------------------------------------------------------------
#
# JOINT SPATIAL MODEL:
#   log(soil)  = intercept_soil + w(s)
#   log(blood) = intercept
#                         + fixed covariates                   (gender, age, education, eggs)
#                         + beta_scaling * w(s)             [baseline spatial field]
#                         + beta_eggs  * lok_ei * w(s)  [egg x space interaction]
# ------------------------------------------------------------------------------


rm(list = ls())


### Load packages
#----------------
library(INLA)
library(sp)
library(ggplot2)
library(gridExtra)
library(tidyr)
library(patchwork)


### Set seed
#-----------
set.seed(42)


# ------------------------------------------------------------------------------
# 1. SPATIAL DOMAIN & MESH
# ------------------------------------------------------------------------------


domain_size <- 100000
n_soil  <- 3000
n_blood <- 500


soil_xy <- data.frame(
  x = runif(n_soil,  0, domain_size),
  y = runif(n_soil,  0, domain_size)
)


blood_xy <- data.frame(
  X = runif(n_blood, 0, domain_size),
  Y = runif(n_blood, 0, domain_size)
)


mesh <- inla.mesh.2d(
  loc      = rbind(as.matrix(soil_xy), as.matrix(blood_xy)),
  max.edge = c(15000, 40000),
  cutoff   = 3000,
  offset   = c(10000, 30000)
)


# ------------------------------------------------------------------------------
# 2. TRUE PARAMETER VALUES
# ------------------------------------------------------------------------------


### Spatial field
#----------------
true_range_w <- 30000
true_sigma_w <- 1.2


### Copy scalings
#----------------
true_beta_blood <- 0.8
true_beta_eggs  <- 0.9


### Fixed effects
#----------------
true_intercept_soil  <- 2.0
true_intercept_blood <- 1.4


true_beta_gender   <- 0.3
true_beta_age2     <- 0.2
true_beta_age3     <- 0.5
true_beta_educ2    <- -0.1
true_beta_educ3    <- 0.3
true_beta_eggs_fix <- 0.4


### Observation noise
#--------------------
true_sigma_obs_soil  <- 0.1
true_sigma_obs_blood <- 0.1


# ------------------------------------------------------------------------------
# 3. SIMULATE SPATIAL FIELD
# ------------------------------------------------------------------------------


## Define projection matrices
#----------------------------
A_soil  <- inla.spde.make.A(mesh, loc = as.matrix(soil_xy))
A_blood <- inla.spde.make.A(mesh, loc = as.matrix(blood_xy))


### Define SPDE object
#---------------------
spde_sim_obj <- inla.spde2.pcmatern(mesh = mesh,
                                    prior.range = c(true_range_w, 0.5),
                                    prior.sigma = c(true_sigma_w, 0.5))

### Precision matrix
#-------------------
Q_sim <- inla.spde2.precision(spde_sim_obj,
                              theta = c(log(true_range_w), log(true_sigma_w)))


### Sample on MESH nodes
#-----------------------
w_true_nodes <- as.vector(inla.qsample(n = 1, Q = Q_sim))


### Project to observation locations
#-----------------------------------
field_w_soil  <- as.vector(A_soil  %*% w_true_nodes)
field_w_blood <- as.vector(A_blood %*% w_true_nodes)


#-------------------------------------------------------------------------------
# 4. SIMULATE COVARIATES
#-------------------------------------------------------------------------------


soil <- data.frame(
  x_ml72   = soil_xy$x,
  y_ml72   = soil_xy$y
)


blood <- data.frame(
  X        = blood_xy$X,
  Y        = blood_xy$Y,
  Gender   = rbinom(n_blood, 1, 0.5),
  age      = sample(1:3, n_blood, replace = TRUE),
  isced_hh = sample(1:3, n_blood, replace = TRUE),
  lok_ei   = rbinom(n_blood, 1, 0.5)
)


#-------------------------------------------------------------------------------
# 5. SIMULATE OUTCOMES
#-------------------------------------------------------------------------------


### Soil
#-------
log_soil <- true_intercept_soil + field_w_soil + rnorm(n_soil, 0, true_sigma_obs_soil)


log_soil <- pmax(exp(log_soil), 1)          # apply the censoring
idx <- which(log_soil <= 1)                 # select the censored ones
e <- rep(1, n_soil)                         # 1 for uncensored
e[idx] <- 2                                 # 2 for censored
YY <- inla.surv(time = log_soil, event = e) # create INLA survival object


### Blood
#--------
age_b  <- ifelse(blood$age == 2, true_beta_age2,
                 ifelse(blood$age == 3, true_beta_age3, 0))
edu_b  <- ifelse(blood$isced_hh == 2, true_beta_educ2,
                 ifelse(blood$isced_hh == 3, true_beta_educ3, 0))

log_blood <- true_intercept_blood +
  true_beta_gender * blood$Gender +
  age_b + edu_b +
  true_beta_eggs_fix * blood$lok_ei +
  true_beta_blood * field_w_blood +
  true_beta_eggs  * blood$lok_ei * field_w_blood +
  rnorm(n_blood, 0, true_sigma_obs_blood)

blood$pfos <- exp(log_blood)


#-------------------------------------------------------------------------------
# 6. FIT THE JOINT MODEL
#-------------------------------------------------------------------------------


spde <- inla.spde2.pcmatern(
  mesh,
  prior.range = c(10000, 0.1),
  prior.sigma = c(1, 0.05)
)


idx_w     <- inla.spde.make.index("idx_w",     spde$n.spde)
idx_w_bld <- inla.spde.make.index("idx_w_bld", spde$n.spde)
idx_w_egg <- inla.spde.make.index("idx_w_egg", spde$n.spde)


A_eggs <- inla.spde.make.A(
  mesh,
  loc     = as.matrix(cbind(blood$X, blood$Y)),
  weights = blood$lok_ei
)


stk_blood <- inla.stack(
  data = list(y = cbind(log(blood$pfos), NA)),
  A = list(A_blood, A_eggs, 1),
  effects = list(
    idx_w_bld,
    idx_w_egg,
    data.frame(
      Intercept.blood = 1,
      Gender = blood$Gender,
      Eggs   = blood$lok_ei,
      Age2   = as.integer(blood$age == 2),
      Age3   = as.integer(blood$age == 3),
      Education2 = as.integer(blood$isced_hh == 2),
      Education3 = as.integer(blood$isced_hh == 3)
    )
  ),
  tag = "blood"
)


# !!! ERROR with the SURVIVAL OBJECT as LIST !!!
# NA changed to --> rep(NA,n_soil)
stk_soil <- inla.stack(
  data = list(y = cbind(rep(NA,n_soil), YY)), 
  A = list(A_soil, 1),
  effects = list(
    idx_w,
    data.frame(
      Intercept.soil = rep(1, n_soil)
    )
  ),
  tag = "soil"
)

# !!! ERROR when joining the two inla.stack !!!
#                                  ???
stk_joint <- inla.stack(stk_blood, stk_soil)


formula_joint <- y ~ -1 +
  Intercept.soil + Intercept.blood +
  Gender + Age2 + Age3 + Eggs + Education2 + Education3 +
  f(idx_w, model = spde) +
  f(idx_w_bld, copy = "idx_w", fixed = FALSE) +
  f(idx_w_egg, copy = "idx_w", fixed = FALSE)


res <- inla(
  formula_joint,
  family = c("gaussian", "lognormalsurv"),
  data = inla.stack.data(stk_joint),
  control.predictor = list(A = inla.stack.A(stk_joint), compute = TRUE),
  control.compute = list(dic = TRUE, waic = TRUE, config = TRUE)
)


#-------------------------------------------------------------------------------
# 7. VALIDATION
#-------------------------------------------------------------------------------


summary(res)



Kind regards,
Jonas

Op dinsdag 5 mei 2026 om 10:17:38 UTC+2 schreef Jonas:

Finn Lindgren

unread,
May 6, 2026, 2:18:20 PM (7 days ago) May 6
to Jonas, R-inla discussion group
For multiple mixed response types, inla.stack has a special mechanism; see ?inla.stack to start.
But I would _really_ urge you to consider switching to the inlabru interface instead, that handles this automatically, in addition to simplifying spatial models (and entirely removes the need for the user to call inla.stack).
Finn

On 6 May 2026, at 19:12, 'Jonas' via R-inla discussion group <r-inla-disc...@googlegroups.com> wrote:

Hi, 
You received this message because you are subscribed to the Google Groups "R-inla discussion group" group.

To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion...@googlegroups.com.

Jonas MEIJERINK

unread,
May 7, 2026, 6:02:06 AM (7 days ago) May 7
to Finn Lindgren, R-inla discussion group
Hi,

Thank you for the suggestion!
I switched to INLABRU which is indeed working easily.

However, is it possible that the weight statement with a copy effect is not working?
As I'm using two copies of a spatial effect that interact with a dummy variable, I implemented this as one normal copy and a weighted copy.
Previously this was directly applied to the projection matrix within the inla.stack.

See part of my code below based on the same simulated data (and the old/full code in my previous message) 



spde <- inla.spde2.pcmatern(
  mesh,
  prior.range = c(10000, 0.1),
  prior.sigma = c(1, 0.05)
)


### Convert to spatialpoints
#---------------------------
soil_sp  <- SpatialPoints(cbind(soil$x_ml72, soil$y_ml72))
blood_sp <- SpatialPoints(cbind(blood$X, blood$Y))


### Build dataframe
#------------------
blood$log_pfos    <- log(blood$pfos)
blood$Age2        <- as.integer(blood$age      == 2)
blood$Age3        <- as.integer(blood$age      == 3)
blood$Education2  <- as.integer(blood$isced_hh == 2)
blood$Education3  <- as.integer(blood$isced_hh == 3)
blood$Intercept.blood <- 1
soil$Intercept.soil <- 1
soil_spdf  <- SpatialPointsDataFrame(soil_sp,  data = soil)
blood_spdf <- SpatialPointsDataFrame(blood_sp, data = blood)


cmp <- ~ -1 +
  Intercept.soil(Intercept.soil,   model = "linear") +
  Intercept.blood(Intercept.blood, model = "linear") +
  Gender(Gender,         model = "linear") +
  Age2(Age2,             model = "linear") +
  Age3(Age3,             model = "linear") +
  Eggs(lok_ei,           model = "linear") +
  Education2(Education2, model = "linear") +
  Education3(Education3, model = "linear") +
  idx_w(geometry,     model = spde) +
  idx_w_bld(geometry, copy = "idx_w", fixed = FALSE) +
  idx_w_egg(geometry, copy = "idx_w", fixed = FALSE, weights = lok_ei)            #  !!!  this is not working  !!!


### Define GAUSSIAN part
#----------------
lik_blood <- like(
  formula = log_pfos ~ Intercept.blood +

    Gender + Age2 + Age3 + Eggs + Education2 + Education3 +
    idx_w_bld + idx_w_egg,
  family  = "gaussian",
  data    = blood_spdf
)

### Define CENSORED GAUSSIAN part
#----------------
lik_soil <- like(
  formula = YY ~ Intercept.soil + idx_w,
  family  = "lognormalsurv",
  data    = soil_spdf
)


res <- bru(
  cmp,
  lik_blood,
  lik_soil,
  options = bru_options(

    control.compute = list(dic = TRUE, waic = TRUE, config = TRUE)
  )
)

Kind regards,
Jonas


Op wo 6 mei 2026 om 20:18 schreef Finn Lindgren <finn.l...@gmail.com>:

Finn Lindgren

unread,
May 7, 2026, 6:24:05 AM (7 days ago) May 7
to Jonas MEIJERINK, R-inla discussion group
Hi,

At first I thought it was using the inputs from the idx_w component also for the copy, but this old reported issue,
https://github.com/inlabru-org/inlabru/issues/147
had the _opposite_ issue, that they wanted it use the same inputs as the main component.
But I see in the code that it uses the wrong mapper that doesn't handle weights, as it using the mapper from the main component that wasn't setup to handle weights.
Can you try adding an explicit weights argument to the main component as well, so it's setup to handle weights?
Like this:
 idx_w(geometry,     model = spde, weights = 1) +
  idx_w_bld(geometry, copy = "idx_w", fixed = FALSE, weights = 1) +

  idx_w_egg(geometry, copy = "idx_w", fixed = FALSE, weights = lok_ei)   

In the past, all components handled weights, whether they were supplied or not, defaulting to weights=1.
That was removed at some point to speed things up, but apparently there wasn't a package test using copy+weights to catch the issue.
If you're able, please make a small reproducible example triggering the bug, and report on
https://github.com/inlabru-org/inlabru/issues
so I can deal with it when I have time?

Another workaround is to move the weighting to the predictor expression itself.
That will make it treat the model as non-linear, so set options=list(bru_max_iter=1) to avoid extra calculations (or = 2, as one extra iteration can be helpful also for linear models)
It will run slightly slower, but would avoid the bug in copy+weights. (the workaround with weights=1 above is better)

Finn
--
Finn Lindgren
email: finn.l...@gmail.com

Jonas MEIJERINK

unread,
May 7, 2026, 7:20:44 AM (6 days ago) May 7
to Finn Lindgren, R-inla discussion group
Hi Finn,

Thank you for the solution! This indeed seems to solve the problem for my simulation example.

I will make an issue on Git!

Kind regards,
Jonas

Op do 7 mei 2026 om 12:24 schreef Finn Lindgren <finn.l...@gmail.com>:

Finn Lindgren

unread,
May 7, 2026, 9:09:51 AM (6 days ago) May 7
to Jonas MEIJERINK, R-inla discussion group
For others following along, as discovered on the git issue https://github.com/inlabru-org/inlabru/issues/298
the issue is a bit more complicated, and the weights=1 workaround doesn't fully solve the problem as it then still may ignore the different weights in the copy component; there is a discrepancy between different views of the model, and some more investigation is needed to determine the correct fix.
The slower workaround is not affected by the problem.

Finn

Finn Lindgren

unread,
May 7, 2026, 9:26:49 AM (6 days ago) May 7
to Jonas MEIJERINK, R-inla discussion group
Good news, the weights=1 workaround _does_ appear to work as intended, but there is still an internal bug in that the internal model storage is inconsistent, and it's not entirely clear in what cases it will use the correct weights input; this will be investigated and fixed.

Finn

Finn Lindgren

unread,
May 7, 2026, 2:24:18 PM (6 days ago) May 7
to Jonas MEIJERINK, R-inla discussion group
inlabru 2.14.1.9002 should resolve the issue properly.
Finn

Jonas MEIJERINK

unread,
May 8, 2026, 4:14:43 AM (6 days ago) May 8
to Finn Lindgren, R-inla discussion group
Hi Finn, 

Thank you for solving this issue so quickly!

Kind regards,
Jonas

Op do 7 mei 2026 om 20:24 schreef Finn Lindgren <finn.l...@gmail.com>:
Reply all
Reply to author
Forward
0 new messages