How to specify β × S(s) term shared across two likelihoods in inlabru

47 views
Skip to first unread message

Brian

unread,
Nov 12, 2025, 12:36:57 AMNov 12
to R-inla discussion group
Hello,

I’m implementing a preferential sampling model based on the framework Pati, Reich, and Dunson (2011). In this approach, I jointly model the spatial point process (representing the sampling locations) and the continuous outcome, linking them through a shared spatial field that accounts for the preferential sampling mechanism. The joint model has two components:

Intensity model :  \Lambda(x) = \exp{\alpha + S_1(x)}
Observational model: Y_i = \mu + \beta S_1(x_i) + S_2(x_i) + Z_i

Where S_1 and S_2 are two independent Gaussian processes and the parameter \beta controls the degree of preferentiality in the sampling of the Y_i.

I wanted to ask  how I can specify  \beta S_1(x_i) in the model components. I have tried using copy = "S1" and fixed = FALSE, but the model runs very slowly, and I have to stop it before it finishes. Is this the correct way to implement it, or is there a more efficient approach?

```
intensity_model <- bru_obs(
  family = "cp",
  formula = geometry ~
    Intercept_intensity +
    S1,
  domain = list(geometry = mesh),
  data = dm_shp,
  tag = "Intensity"
)

yi_model <- bru_obs(
   "gaussian",
  formula = y_i ~
    Intercept_obs +
    beta +
    S2 +
    epsilon,
  data = dm_shp,
  tag = "Observations"
)

cmp <-  ~
  Intercept_intensity(1) +
  Intercept_obs(1) +
  S1(main = geometry, model = spde_intensity) +
  beta(main = geometry, copy = "S1", fixed = FALSE) +
  S2(main = geometry, model = spde_y) +
  epsilon(ID, model = "iid")

mod <- bru(
  cmp,
  intensity_model,
  yi_model,
  options = list(
    control.inla = list(
      int.strategy = "eb"
    ),
    bru_max_iter = 1,
    num.threads = "5:1"
  )
  )
```


Helpdesk (Haavard Rue)

unread,
Nov 12, 2025, 1:56:26 AMNov 12
to Brian, R-inla discussion group

what is the dimensions of this model (data, S1 S2) ?
> --
> You received this message because you are subscribed to the Google Groups "R-
> inla discussion group" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to r-inla-discussion...@googlegroups.com.
> To view this discussion, visit
> https://groups.google.com/d/msgid/r-inla-discussion-group/01bafd4b-50eb-428f-a16c-045d3b80771en%40googlegroups.com
> .

--
Håvard Rue
he...@r-inla.org

Brian

unread,
Nov 12, 2025, 2:57:07 AMNov 12
to R-inla discussion group
The data has n = 3949 sampled locations with observations, and both S1 and S2 are represented as SPDEs on meshes with 5420 nodes each.

Finn Lindgren

unread,
Nov 12, 2025, 3:03:41 AMNov 12
to R-inla discussion group
Is “epsilon” the Z component in your model?
Then if ID is the observation index you have double iid noise terms, as the gaussian observation model already handles that part; it’s inherent to that model. Remove epsilon from the model code.

Also, you need to specify the “samplers” polygon in the “cp” model. It’s almost never correct to have the mesh cover only the observation domain for the point process, and since the true intensity is zero outside of the observation polygon, you need to tell it what that polygon is, so it doesn’t need to attempt estimating a log-intensity of minus infinity in the outer parts of the mesh.

Finn

On 12 Nov 2025, at 05:37, Brian <brian...@gmail.com> wrote:


--
You received this message because you are subscribed to the Google Groups "R-inla discussion group" group.

Brian

unread,
Nov 12, 2025, 8:05:16 AMNov 12
to R-inla discussion group
Hi Finn,

Yes, 'epsilon'  represents  Z component in the model, and I have now removed.

I have also included the "samplers" polygon (which is my country's shapefile) in the "cp" model, since I had extended the mesh outer boundary. I am currently running the model. 

Thank you for your inputs and guidance on this. 

```
intensity_model <- bru_obs(
  family = "cp",
  formula = geometry ~
    Intercept_intensity +
    S1,
  domain = list(geometry = mesh),
  data = dm_shp,
  samplers = kenya_shapefile,

  tag = "Intensity"
)

yi_model <- bru_obs(
   "gaussian",
  formula = y_i ~
    Intercept_obs +
    beta +
    S2,

  data = dm_shp,
  tag = "Observations"
)

cmp <-  ~
  Intercept_intensity(1) +
  Intercept_obs(1) +
  S1(main = geometry, model = spde_intensity) +
  beta(main = geometry, copy = "S1", fixed = FALSE) +
  S2(main = geometry, model = spde_y)

mod <- bru(
  cmp,
  intensity_model,
  yi_model,
  options = list(
    control.inla = list(
      int.strategy = "eb"
    ),
    bru_max_iter = 1,
    num.threads = "5:1"
  )
  )
```

Brian

unread,
Nov 12, 2025, 12:10:18 PMNov 12
to R-inla discussion group
The model ran successfully. Now suppose that the preferential sampling happened in some areas but not in others where in some locations the sampling was independent. Say I have an indicator variable with 1 for preferential sampling and 0 for independent sampling. How can I incorporate this indicator to specify that the preferential sampling (the beta component linking S1 to the intensity) should only be active in areas where sampling was actually preferential?

Finn Lindgren

unread,
Nov 12, 2025, 12:53:58 PMNov 12
to Brian, R-inla discussion group
For this, there are several options, and it depends on the precise details of the model you want.
I _think_ you're saying you want to set beta=0 where your indicator is zero?
Then you can use it as a component weight. Assuming you have it stored as a `terra` raster:

components = ~ ... + beta(geometry, indicator_raster, copy="S1", fixed = FALSE)

Alternatively, define it as a constant component and multiply by it in the predictor expression:

components = ~ ... + beta(geometry, copy="S1", fixed = FALSE) +
  indicator(indicator_raster, model="const")
...
bru_obs(formula = y_i ~ Intercept_obs + beta * indicator + S2, ...)

Note, I'd recommend renaming "beta" to "beta_S1", and removing the "main = ";
the "main" component input is always the first argument, and there is no need to name it.

For the family="cp" part of the model, you may need to use
  domain = list(geometry = fm_subdivide(mesh, something))
to increase the integration scheme resolution to approximately match or exceed that of the indicator raster.

Finn



--
Finn Lindgren
email: finn.l...@gmail.com

Brian

unread,
Nov 13, 2025, 1:10:33 AMNov 13
to R-inla discussion group
Thank you Finn, this is really helpful. Yes, I want beta = 0. where the indicator is zero. Also, since  the mesh was defined with `max.edge = c(20, 40)` in km, and my raster resolution is ~5km, then I can use something like domain = list(geometry = fm_subdivide(mesh, 2.5))

Finn Lindgren

unread,
Nov 13, 2025, 1:24:18 AMNov 13
to Brian, R-inla discussion group
The subdivide argument is how many additional points to add per edge, so youe need something like 4, or 5 or maybe 6, not 2.5.

Finn Lindgren
email: finn.l...@gmail.com

Brian

unread,
Nov 13, 2025, 3:55:29 AMNov 13
to R-inla discussion group
Thank you, Finn! I understand if say the raster was ~1KM, then I would need something like 20?

Finn Lindgren

unread,
Nov 13, 2025, 4:51:39 AMNov 13
to Brian, R-inla discussion group
In principle, yes, but in that case you might consider smoothing the raster a bit instead, to make it less sharp/binary.
Finn

Finn Lindgren
email: finn.l...@gmail.com

Brian

unread,
Nov 13, 2025, 5:02:40 AMNov 13
to R-inla discussion group
Okay. This is well understood. Thank you for the support.
Reply all
Reply to author
Forward
0 new messages