Neighborhood graph given lots of NA values


Tim Meehan

Nov 28, 2022, 5:23:28 PM
to R-inla discussion group

[Attached image: Capture.PNG]

Hi all,

Say you have data that looks like this map, where there is a lattice
with grid cells and many grid cells have no data. Say you have to make
a CAR model with this data, so you need to deal with a neighborhood
graph. You have a few choices here. You can make a neighborhood graph
with many disconnected subgraphs, and lots of islands (and proceed as
Freni-Sterrantino et al.), or you can make a neighborhood graph using
the full lattice that covers the study area, but enter NA for Y values
in cells with no data. What would you do and why? Do you have any
citations for how to make this decision? I have only found a few
references for dealing with disconnected graphs and they usually focus
on a few singletons or don't otherwise talk about using NA values with
the full lattice.

Thanks,
Tim

Finn Lindgren

Nov 28, 2022, 6:19:34 PM
to R-inla discussion group
Hi Tim,

If we (for a moment) exclude the bit in Alaska, I would construct a model of the unknown field based on the full grid, provided the grid cells with no data still conceptually exist.
(Depending on why the data there are missing, you might also need to model the missingness mechanism to avoid bias; there's a paper by Diggle et al. from a couple of years ago, I think, that discusses issues around preferential sampling.)
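
A minimal Python sketch of the full-grid construction, assuming rook (shared-edge) neighbours on a rectangular lattice. The function name, lattice size, and observed values are hypothetical and chosen only for illustration; `None` plays the role of NA in the response.

```python
def rook_neighbours(n_rows, n_cols):
    """Adjacency lists for rook (shared-edge) neighbours on a lattice.

    Cells are numbered row by row, starting at 0.
    """
    nbrs = {}
    for r in range(n_rows):
        for c in range(n_cols):
            cell = r * n_cols + c
            nbrs[cell] = [
                rr * n_cols + cc
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < n_rows and 0 <= cc < n_cols
            ]
    return nbrs


# Hypothetical toy data: a 3 x 4 lattice observed at only five cells.
n_rows, n_cols = 3, 4
observed = {0: 2.1, 1: 1.8, 5: 2.4, 10: 0.9, 11: 1.2}

# The graph covers every cell of the lattice, observed or not.
graph = rook_neighbours(n_rows, n_cols)

# The response has one entry per cell; unobserved cells are None (NA).
y = [observed.get(cell) for cell in range(n_rows * n_cols)]
```

The graph is defined by the lattice alone, so it does not change when the set of observed cells changes; only the positions of the missing responses do.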

Reasons:

CAR(1) models only have a well-defined spatial interpretation in the aggregate; see Besag & Mondal 2005 and an earlier Besag paper, both cited in our 2011 SPDE/GMRF paper.
So the only situation in which a basic equal-neighbour-weights CAR(1) model really makes sense is when space has been split into regions; this is an indirect result from the Besag papers and our SPDE/GMRF paper, which link discrete-space models to continuous limits. So for a process that is _modelled_ on a grid (not necessarily _observed_!) a CAR model can make sense: space is split into a regular pattern of rectangles, and a CAR(1) model is somewhat interpretable.
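
For concreteness, here is a small sketch of what an equal-neighbour-weights CAR structure looks like on a regular grid. It assumes the intrinsic (Besag-type) form, where the precision is proportional to D - W with W the 0/1 neighbour matrix and D the diagonal matrix of neighbour counts; that choice, and all names below, are assumptions made for illustration rather than something stated in the thread.

```python
def rook_neighbours(n_rows, n_cols):
    """Adjacency lists for rook neighbours; cells numbered row by row."""
    nbrs = {}
    for r in range(n_rows):
        for c in range(n_cols):
            nbrs[r * n_cols + c] = [
                rr * n_cols + cc
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < n_rows and 0 <= cc < n_cols
            ]
    return nbrs


def icar_structure(nbrs):
    """Structure matrix D - W for an equal-weights intrinsic CAR model.

    Diagonal entries hold the number of neighbours of each cell;
    off-diagonal entries are -1 for neighbouring cells and 0 otherwise.
    """
    n = len(nbrs)
    q = [[0] * n for _ in range(n)]
    for i, js in nbrs.items():
        q[i][i] = len(js)
        for j in js:
            q[i][j] = -1
    return q


Q = icar_structure(rook_neighbours(3, 3))
row_sums = [sum(row) for row in Q]  # every row of D - W sums to zero
```

Because every row sums to zero, this structure penalizes only differences between neighbouring cells. The cells themselves, and hence the lattice they tile, are what give the parameters their meaning.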

If one instead builds a CAR model only on the grid boxes that happen to have been observed, any link to the underlying _full_ spatial domain process is lost, and you get a purely ad hoc model construction that does induce some dependence between the data sites, but has no consistency with respect to the observation pattern.
Consider the case of two years of observation, with a separate model for each year. If the CAR model is based on the observed sites _only_, the two models will be two completely different models, and even the interpretation of the CAR model parameters can differ.
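
The two-year example can be sketched in a few lines of Python. The lattice size and the sets of observed cells are hypothetical; the point is only that graphs restricted to observed cells differ from year to year, whereas the full-lattice graph is shared.

```python
def rook_neighbours(n_rows, n_cols):
    """Adjacency lists for rook neighbours; cells numbered row by row."""
    nbrs = {}
    for r in range(n_rows):
        for c in range(n_cols):
            nbrs[r * n_cols + c] = [
                rr * n_cols + cc
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < n_rows and 0 <= cc < n_cols
            ]
    return nbrs


def induced_subgraph(nbrs, cells):
    """Keep only the given cells and the links between them."""
    keep = set(cells)
    return {i: [j for j in nbrs[i] if j in keep] for i in sorted(keep)}


def count_components(graph):
    """Number of connected components, found by depth-first search."""
    seen, n_components = set(), 0
    for start in graph:
        if start in seen:
            continue
        n_components += 1
        stack = [start]
        while stack:
            i = stack.pop()
            if i in seen:
                continue
            seen.add(i)
            stack.extend(graph[i])
    return n_components


full = rook_neighbours(3, 4)

# Hypothetical observation patterns for two years on the same lattice.
year1 = induced_subgraph(full, [0, 1, 5, 10, 11])
year2 = induced_subgraph(full, [0, 2, 5, 6, 7])

# Cell 0 has one neighbour in year 1 but is an island in year 2,
# and cell 5 is linked to different cells in the two years.
print(year1[0], year2[0])
print(year1[5], year2[5])
print(count_components(full), count_components(year1), count_components(year2))
```

With the full lattice, both years would use `full` as the graph and differ only in which responses are NA.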

When possible, I much prefer to work with hierarchical model constructions where the estimands exist independently of any observations.
This is also a framework in which, conceptually, "mis-aligned data" is simply converted into multiple ways in which a common latent "truth" can be connected to different observation systems.

Finally, what about Alaska? One could make a grid with a slightly irregular outline that stretches out to Alaska. This basically is the same thing as an SPDE model with alpha=1 (no pointwise evaluation meaning, but well defined for spatial integration such as pixels; see the Besag papers), so from there it isn't a big step to being able to use models with other process smoothnesses. The SPDE/GMRF connection was partly developed to show how CAR(p) model parameters _should_ be constructed so that the resulting models become as invariant to the source locations as possible.
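
A grid with an irregular outline needs no special machinery for the neighbourhood graph: it is enough to list the cells that belong to the domain and link those that share an edge. The sketch below uses a made-up outline (a rectangular block with a narrow arm), not the actual study area, and the function name is hypothetical.

```python
def neighbours_from_cells(cells):
    """Rook neighbours among an arbitrary set of (row, col) grid cells."""
    domain = set(cells)
    return {
        (r, c): [
            nb
            for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            if nb in domain
        ]
        for (r, c) in sorted(domain)
    }


# Hypothetical domain: a 3 x 4 block (rows 1-3, columns 2-5) plus a
# three-cell arm along row 0 that reaches out from the block's corner.
block = [(r, c) for r in range(1, 4) for c in range(2, 6)]
arm = [(0, 0), (0, 1), (0, 2)]
graph = neighbours_from_cells(block + arm)
```

Every cell inside the outline is part of the graph whether or not it has data, so the arm stays connected to the main block through `(0, 2)` and `(1, 2)`.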

Finn



Tim Meehan

Nov 28, 2022, 7:07:15 PM
to R-inla discussion group
Awesome, Finn, thanks!