Prediction in inlabru

21 views
Skip to first unread message

Brian

unread,
Dec 10, 2025, 11:55:46 PM (6 days ago) Dec 10
to R-inla discussion group
Hi,

I am fitting a binomial spatial model with covariates, an SPDE spatial effect, and an IID "nugget" term. The model fits successfully, but I'm struggling with prediction and especially when I have to include the covariates in my prediction formula, and it produces a warning of `longer object length is not a multiple of shorter object length`. Any guidance would be appreciated. Happy to provide more details if needed.

```
formula <- as.formula(paste(
  "n ~ ",
  paste(stepwise_covariates[[y]], collapse = " + "),
  "+ spde + epsilon + Intercept"
))

obs_model <- bru_obs(
  family = "binomial",
  formula = formula,
  Ntrials = df$N,
  domain = list(geometry = mesh),
  data = df,
  samplers = kenya_shapefile2,
)

cmp <- as.formula(
  paste(
    "~ ",
    paste(
      sprintf(
        "%s(%s)",
        stepwise_covariates[[y]],
        stepwise_covariates[[y]]
      ),
      collapse = " + "
    ),
    " + spde(as.matrix(coords.clusters), model = spde) + " ,
    'epsilon(cluster_number, model = "iid")',
    " + Intercept(1)"
  )
)

mod <- bru(
  cmp,
  obs_model,
  options = list(
    control.fixed = list(mean = 0, prec = 1),
    num.threads = "7:1"
  )
)

predict(
  mod,
  pred,
  ~ spde + Intercept + my_covariate_1 + my_covariate_2 + ...my_covariate_n
)

```

Finn Lindgren

unread,
Dec 11, 2025, 2:12:53 AM (6 days ago) Dec 11
to Brian, R-inla discussion group
Can you show the full output?

I’m not sure why you are specifying domain&samolers? Those are for point process models, and you have a binomial observation model that doesn’t involve that.

But the likely problem is the inout to the SPDE component; I can’t tell from the code, but in order for the predict call to work in this way, the coords.clusters variable must be part of the data objects (df and pred) and not a fixed global variable.

Finn

On 11 Dec 2025, at 04:55, Brian <brian...@gmail.com> wrote:

Hi,
--
You received this message because you are subscribed to the Google Groups "R-inla discussion group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion...@googlegroups.com.
To view this discussion, visit https://groups.google.com/d/msgid/r-inla-discussion-group/f84e9cbd-ebe3-4217-a634-b64e6c592b12n%40googlegroups.com.

Brian

unread,
Dec 11, 2025, 4:27:17 AM (5 days ago) Dec 11
to R-inla discussion group
Thank you for the response.

 I have removed the domain and samplers arguments. Regarding the coordinates for the SPDE component,  I' have now included them directly in the data object.

I no longer get the warnings. 

```
formula <- as.formula(paste(
  "n ~ ",
  paste(stepwise_covariates[[y]], collapse = " + "),
  "+ spde + epsilon + Intercept"
))
formula
n ~ aridity_index + built_up_volume + distance_to_major_roads +
    elevation + ndvi + prec + temp + travel_time_cities + ken_pop_count +
    IsUrban + spde + epsilon + Intercept

obs_model <- bru_obs(
  family = "binomial",
  formula = formula,
  Ntrials = df$N,
  data = df
)
cmp <- as.formula(
  paste(
    "~ ",
    paste(
      sprintf(
        "%s(%s)",
        stepwise_covariates[[y]],
        stepwise_covariates[[y]]
      ),
      collapse = " + "
    ),
    " + spde(cbind(km_x, km_y), model = spde) + " ,

    'epsilon(cluster_number, model = "iid")',
    " + Intercept(1)"
  )
)

mod <- bru(
  cmp,
  obs_model,
  options = list(
    control.fixed = list(mean = 0, prec = 1),
    num.threads = "7:1"
  )
)

mod
inlabru version: 2.13.0.9008
INLA version: 25.10.19
Components:
Latent components:
aridity_index: main = linear(aridity_index)
built_up_volume: main = linear(built_up_volume)
distance_to_major_roads: main = linear(distance_to_major_roads)
elevation: main = linear(elevation)
ndvi: main = linear(ndvi)
prec: main = linear(prec)
temp: main = linear(temp)
travel_time_cities: main = linear(travel_time_cities)
ken_pop_count: main = linear(ken_pop_count)
IsUrban: main = linear(IsUrban)
spde: main = spde(cbind(km_x, km_y))
epsilon: main = iid(cluster_number)
Intercept: main = linear(1)
Observation models:
  Family: 'binomial'
    Tag: <No tag>
    Data class: 'data.table', 'data.frame'
    Response class: 'numeric'
    Predictor:
        n ~ aridity_index + built_up_volume + distance_to_major_roads +
            elevation + ndvi + prec + temp + travel_time_cities + ken_pop_count +
            IsUrban + spde + epsilon + Intercept
    Additive/Linear: TRUE/TRUE
    Used components: effects[aridity_index, built_up_volume, distance_to_major_roads, elevation, ndvi, prec, temp, travel_time_cities, ken_pop_count, IsUrban, spde, epsilon, Intercept], latent[]
Time used:
    Pre = 1.22, Running = 6.93, Post = 0.46, Total = 8.62
Fixed effects:
                          mean    sd 0.025quant 0.5quant 0.975quant   mode kld
aridity_index            0.337 0.185     -0.025    0.337      0.702  0.337   0
built_up_volume         -0.679 0.163     -0.999   -0.679     -0.358 -0.679   0
distance_to_major_roads  0.061 0.076     -0.087    0.061      0.210  0.061   0
elevation               -0.033 0.339     -0.703   -0.031      0.628 -0.031   0
ndvi                     0.043 0.139     -0.230    0.043      0.315  0.043   0
prec                    -0.117 0.220     -0.548   -0.117      0.316 -0.117   0
temp                     0.471 0.317     -0.151    0.471      1.093  0.471   0
travel_time_cities       0.757 0.099      0.563    0.756      0.952  0.756   0
ken_pop_count            0.095 0.141     -0.182    0.095      0.372  0.095   0
IsUrban                 -2.841 0.113     -3.065   -2.841     -2.620 -2.841   0
Intercept                0.184 0.382     -0.575    0.182      0.950  0.182   0

Random effects:
  Name   Model
    spde SPDE2 model
   epsilon IID model

Model hyperparameters:
                         mean     sd 0.025quant 0.5quant 0.975quant    mode
Range for spde        186.896 48.976    112.946  179.555    304.102 164.147
Stdev for spde          1.279  0.164      0.988    1.268      1.634   1.245
Precision for epsilon   0.444  0.024      0.397    0.443      0.493   0.443

Marginal log-Likelihood:  -5803.52
 is computed
Posterior summaries for the linear predictor and the fitted values are computed
(Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')

results <- predict(
  mod,
  pred,
  ~ plogis(
      aridity_index + built_up_volume + distance_to_major_roads +
      elevation + ndvi + prec + temp + travel_time_cities + ken_pop_count +
      IsUrban + spde + epsilon + Intercept
  )
)
```
I have made sure that the km_x and km_y  are in both the prediction and the observed data and not a global object. I have also included the `cluster_number` variable in my prediction data to be able to use the epsilon/

```
df

Rows: 1,690
Columns: 16
$ cluster_number          <dbl> …
$ county                  <fct> …
$ aridity_index           <dbl> …
$ built_up_volume         <dbl> …
$ distance_to_major_roads <dbl> …
$ elevation               <dbl> …
$ ndvi                    <dbl> …
$ prec                    <dbl> …
$ temp                    <dbl> …
$ travel_time_cities      <dbl> …
$ ken_pop_count           <dbl> …
$ IsUrban                 <dbl> …
$ n                       <dbl> …
$ N                       <dbl> …
$ km_x                    <dbl> …
$ km_y                    <dbl> …

pred

Rows: 23,834
Columns: 13
$ aridity_index           <dbl> …
$ built_up_volume         <dbl> …
$ distance_to_major_roads <dbl> …
$ elevation               <dbl> …
$ ndvi                    <dbl> …
$ prec                    <dbl> …
$ temp                    <dbl> …
$ travel_time_cities      <dbl> …
$ ken_pop_count           <dbl> …
$ km_x                    <dbl> …
$ km_y                    <dbl> …
$ IsUrban                 <dbl> …
$ cluster_number          <int> …
Reply all
Reply to author
Forward
0 new messages