Areal data and preferential sampling with inlabru

Nicola Criscuolo

Feb 2, 2022, 1:09:08 PM
to R-inla discussion group
Dear INLA team,

I am working with a regular grid which aggregates counts of individuals in pixels with an area of approximately 10 km². Through the inlabru package, I defined the SPDE and I was able to fit a log Gaussian Poisson regression model to predict the overall abundance of my individuals in the whole study area.

However, my dataset might suffer from selection bias, because I sampled my data mostly where they were more readily available. This should be a case of preferential sampling. As Finn pointed out to me, inlabru already offers a way to account for preferential sampling when dealing with an LGCP.

The problem is that my type of response is different, because I only consider counts. So far, I have only found this paper, in which the authors build a joint model using INLA and TMB to account for preferential sampling in areal data.

Since the procedure illustrated in inlabru for the LGCP model looked more straightforward, I was wondering whether something similar has already been implemented in a more "user-friendly" inlabru way.

Thank you as always for your help and best regards.

Nico

Finn Lindgren

Feb 2, 2022, 1:21:44 PM
to Nicola Criscuolo, R-inla discussion group
Not quite what you’re asking, but have you considered just replacing the lgcp model with a count likelihood? In the inlabru interface, that just means using a different observation model; the associated second likelihood would probably be the same as before.

Finn

--
You received this message because you are subscribed to the Google Groups "R-inla discussion group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/r-inla-discussion-group/02c21a03-1469-4d1b-a525-8b3d416ff292n%40googlegroups.com.

Nicola Criscuolo

Feb 2, 2022, 1:44:46 PM
to R-inla discussion group
Hi Finn,

I am basically working under the assumption that my counts come from preferential sampling, which might depend on the underlying GRF. The attached image of Switzerland shows higher abundance in the city areas, because in those areas there was more information available from which to collect counts. If I have understood correctly, this might lead to underestimating what is happening in the rural areas. For this reason, I was thinking of applying the LGCP example to my counts.

Following your answer, I would still define my SPDE in the usual way, but should the likelihoods change as follows?

lik1 <- like(data = grid, # SpatialPixelsDataFrame
             family = "poisson",
             formula = abundance ~ covariate(s) + # I have them at the pixel level
               spde +
               Intercept)

lik2 <- like(data = grid,
             family = "poisson",
             domain = list(coordinates = mesh),
             formula = coordinates ~ spdeCopy + countIntercept)

Thanks again for your answer!

Nico
Screenshot 2022-02-02 at 19.34.38.png

Finn Lindgren

Feb 3, 2022, 9:49:13 AM
to Nicola Criscuolo, R-inla discussion group
Yes, something like that.
Finn
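
For readers following along, here is a minimal sketch of how the two likelihoods discussed above might be combined into one joint fit. It is not code posted in the thread. It assumes that `grid` holds only the surveyed pixels, that `matern` is an SPDE model object built beforehand (for example with `inla.spde2.pcmatern()`), and that the second likelihood uses `family = "cp"`, the point-process family that `lgcp()` relies on, rather than the `"poisson"` written in the draft above. The wrapper name `fit_preferential` and the iteration limit are hypothetical, and the function names follow the inlabru interface in use at the time of this thread.

```r
library(inlabru)

# Sketch only; object and component names are assumptions.
# grid   : SpatialPixelsDataFrame with only the surveyed pixels and a
#          count column `abundance`
# mesh   : the mesh the SPDE is built on
# matern : an SPDE model object, e.g. from INLA::inla.spde2.pcmatern()
fit_preferential <- function(grid, mesh, matern) {
  # The field enters the count model directly and the sampling model
  # through a copy; fixed = FALSE lets the scaling of the copy be estimated.
  cmp <- ~ spde(coordinates, model = matern) +
    spdeCopy(coordinates, copy = "spde", fixed = FALSE) +
    Intercept(1) +
    countIntercept(1)

  # Observation model for the counts (pixel-level covariates omitted here)
  lik1 <- like(
    data = grid,
    family = "poisson",
    formula = abundance ~ spde + Intercept
  )

  # Point-process model for where the counts were collected
  lik2 <- like(
    data = grid,
    family = "cp",
    domain = list(coordinates = mesh),
    formula = coordinates ~ spdeCopy + countIntercept
  )

  # Both likelihoods share the components, so they are fitted jointly
  bru(components = cmp, lik1, lik2, options = list(bru_max_iter = 20))
}
```

If every pixel of a complete regular grid were passed as `data` to the second likelihood, the sampling locations would carry no preferential signal, which is why the sketch assumes that only the surveyed pixels are included.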



--
Finn Lindgren
email: finn.l...@gmail.com

Nicola Criscuolo

Feb 18, 2022, 9:00:00 AM
to R-inla discussion group
Hi Finn,

Could you please explain why, when using INLA/inlabru to account for preferential sampling, as in the shrimp example, it is necessary to perform multiple iterations of the algorithm?

Thanks for your help,

Nico

Finn Lindgren

Feb 18, 2022, 10:03:12 AM
to Nicola Criscuolo, R-inla discussion group
Just a practical interface issue. Iterations are necessary for non-linear predictors but not for linear ones. However, inlabru needs to detect whether the model is linear, and currently it only does that with simple checks on how the formulas are written:

Does not detect linearity:
components = ~ A + B + C
likelihood 1: formula = resp1 ~ A + B
likelihood 2: formula = resp2 ~ A + C

Detects linearity:
components = ~ A + B + C
likelihood 1: formula = resp1 ~ ., include = c("A", "B")
likelihood 2: formula = resp2 ~ ., include = c("A", "C")

Finn
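
As an illustration of the second form above, the following sketch writes the two likelihoods with `~ .` and `include`, so that no iterations should be needed. The component names A, B and C and the responses `resp1` and `resp2` come from the example above; the Poisson families, the data objects and the wrapper name `fit_linear` are assumptions made only for this sketch. Newer inlabru releases may spell these arguments differently.

```r
library(inlabru)

# Hypothetical wrapper: `cmp` is a components formula defining A, B and C;
# `data1` and `data2` hold the responses resp1 and resp2.
fit_linear <- function(cmp, data1, data2) {
  # `~ .` together with `include` states which components are summed,
  # which lets inlabru treat the predictor as linear
  lik1 <- like(
    formula = resp1 ~ .,
    include = c("A", "B"),
    family = "poisson", # assumed family
    data = data1
  )
  lik2 <- like(
    formula = resp2 ~ .,
    include = c("A", "C"),
    family = "poisson", # assumed family
    data = data2
  )
  bru(components = cmp, lik1, lik2)
}
```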

Nicola Criscuolo

Feb 18, 2022, 11:49:54 AM
to R-inla discussion group
Thanks for the prompt reply!

Nicola Criscuolo

Feb 28, 2022, 3:08:16 AM
to R-inla discussion group
Dear Finn,

I moved the scale of my analysis with preferential sampling to a continental level (Europe). With around 4,000 mesh vertices, everything works correctly. I can fit the normal model and also the preferential sampling model (again, similar to the shrimp example).

However, when I increase the number of vertices to almost 40,000, I can only fit the normal model, while the preferential sampling model fails without even starting. I can see from the output that the covariates are loaded, and then the run simply stops.

Please find attached the output files. I have also included one from a successful run (with lower resolution).

Thanks a lot for your help!
pref_sampl_output_fail.txt
pref_sampl_error_fail.txt
pref_sampl_output_success.txt

Helpdesk

Feb 28, 2022, 11:30:16 AM
to Nicola Criscuolo, R-inla discussion group

you're running out of memory

try with a recent testing version and add

library(INLA)
inla.setOption(inla.mode="experimental")

you can also reduce the number of threads to lower memory usage, or add

control.inla=list(int.strategy="eb")
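
A sketch of how these suggestions might be combined in an inlabru run is given below. It assumes that `cmp`, `lik1` and `lik2` are the components and likelihoods of the preferential sampling model, that a thread setting of `"4:1"` suits the machine, and that INLA options are passed on through the `options` list of `bru()`; the wrapper name `refit_low_memory` is hypothetical. The `inla.mode = "experimental"` setting applies to the INLA versions current at the time of this thread.

```r
library(INLA)
library(inlabru)

# Switch to the experimental mode suggested above
inla.setOption(inla.mode = "experimental")

# Use fewer threads to lower memory usage ("4:1" is an assumed value:
# 4 outer threads, 1 inner thread)
inla.setOption(num.threads = "4:1")

# Hypothetical wrapper: refit with the empirical Bayes integration strategy
refit_low_memory <- function(cmp, lik1, lik2) {
  bru(
    components = cmp, lik1, lik2,
    options = list(control.inla = list(int.strategy = "eb"))
  )
}
```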

--
Håvard Rue
he...@r-inla.org

Nicola Criscuolo

Mar 1, 2022, 5:27:34 AM
to R-inla discussion group
Thanks a lot for the answer. By default, INLA was using all the threads on the cluster (128), so I will also run a test with fewer threads.

Nico
