Fixed effect results very sensitive to mesh 'max.edge' in SPDE

429 views
Skip to first unread message

Joris Wiethase

unread,
Jun 1, 2021, 11:11:04 AM6/1/21
to R-inla discussion group
Hi all,

Some background:
I have been working on specifying a SPDE model, to analyse factors explaining per-pixel rates of a degradation index, in a pre-defined study region (approx. 350km x 170km).

Based on yearly degradation maps, I calculated the overall pixel rate (at 120m resolution), and built a model with Gaussian family,  specifying priors with 'inla.spde2.pcmatern' (see code below).

Following a previous question (link here), I have used CPO and maps of predictions to try and evaluate model fit. I used the 'INLA::meshbuilder()' function to choose and assess mesh parameters (mesh attached).

The problem:
When trying different mesh parameters, I noticed that slight changes in the 'max.edge' parameter of the 'inla.mesh.2d()' function lead to drastic changes in some of the fixed effects output. 

More specifically, changing "max.edge" from 0.03 to 0.035 essentially reverses the direction of some fixed effects (all other mesh parameters kept constant). The hyperparameter estimates appear to be fairly similar.

The questions:
I'm very much at a loss as to what might be causing this. From what I have read, it seems that the results should not be too sensitive to the mesh resolution. So I think my questions are:
  1. Should I be worried about this behaviour?
  2. If this sensitivity is normal, how do I choose the "correct" mesh parameters?
  3. If this sensitivity in not normal, what might be causing it?
Any help is greatly appreciated!

Joris


Fixed effects and hyperparameters for 'max.edge = 0.035':

mesh_maxEdge_0_035_fixed.png
mesh_maxEdge_0_035.png
Fixed effects and hyperparameters for 'max.edge = 0.03':

mesh_maxEdge_0_03_fixed.pngmesh_maxEdge_0_03.png

Mesh using 'max.edge =0.035': 

mesh_maxEdge_0_035_assess.pngmesh_maxEdge_0_035_plot.png





The code 
-----------------------------------------------------------------------
deg <- a raster stack, containing 20 rasters with 120m pixel resolution
load("BG_raw_spdf.RData"))   a SpatialPointsDataFrame, containing the covariates
annRain <- yearly rainfall for the time period
studyArea <- shapefile for the study area

### ----------------------------------------------------------------------------
time <- 1:nlayers(deg)
# Compute the overall slope of degradation, accounting for rainfall in a given year
slowfun <- function(y) {
      if(all(is.na(y))){
            NA
      } else {
            m = lm(y ~ time + annRain$mean); summary(m)$coefficients[2]   # get time
      }
}

deg_rate <- calc(deg, slowfun)

### Basic INLA model predicting rate from various covariates:
### ----------------------------------------------------------------------------

## Add the rate to the raw_spdf:
deg_rate_spdf <- as(deg_rate, "SpatialPointsDataFrame")
deg_rate_spdf <- deg_rate_spdf[!is.na(deg_rate_spdf@data$layer), ]

xy_raw  <- apply(round(coordinates(raw_spdf), 6), 1, paste, collapse = "_")
xy_rate <- apply(round(coordinates(deg_rate_spdf), 6), 1, paste, collapse = "_")

raw_spdf <- raw_spdf[order(xy_raw), ]
deg_rate_spdf <- deg_rate_spdf[xy_rate %in% xy_raw, ]
deg_rate_spdf <- deg_rate_spdf[order(xy_rate[xy_rate %in% xy_raw]), ]
raw_spdf$deg_rate <- deg_rate_spdf$layer * 10  # Small decimal numbers will use more memory to get to integer. Also easier to read, and index is arbitrary anyways

## Transform and scale covariates:
raw_spdf$lpop <- log(raw_spdf$pop + 1)  # log(x+1) for right skewed data containing zeros
raw_spdf$ltlu <- log(raw_spdf$tlu + 1)  
raw_spdf$pop_s <- c(scale(raw_spdf$lpop))
raw_spdf$rain_s <- c(scale(raw_spdf$rain))
raw_spdf$tlu_s <- c(scale(raw_spdf$ltlu))

## Make a mesh:
  boundary.loc <- as(raw_spdf, "SpatialPoints")
  boundary <- list(
    list(inla.nonconvex.hull(coordinates(boundary.loc), 0.335),
         as.inla.mesh.segment(studyArea)),
    inla.nonconvex.hull(coordinates(boundary.loc), 0.61))

  ## Build the mesh:
  mesh <- inla.mesh.2d(boundary=boundary,
                       max.edge=c(0.035, 0.175),
                       min.angle=c(30, 21),
                       max.n=c(48000, 16000), ## Safeguard against large meshes.
                       max.n.strict=c(128000, 128000), ## Don't build a huge mesh!
                       cutoff=0.007, ## Filter away adjacent points.
                       offset=c(0.335, 0.61)) ## Offset for extra boundaries, if needed.
  
# and the spde:
A = inla.spde.make.A(mesh=mesh, loc=data.matrix(coordinates(raw_spdf)))  # Combine actual points with mesh

# covariates:
Xcov = data.frame(intercept=1, 
                  pop_s = raw_spdf$pop_s,
                  rain_s = raw_spdf$rain_s,
                  tlu_s = raw_spdf$tlu_s,
                  CA1 = (raw_spdf$CA == 1) * 1,  # CCRO
                  CA2 = (raw_spdf$CA == 2) * 1,  # NP
                  CA3 = (raw_spdf$CA == 3) * 1)  # WMA

Xcov = as.matrix(Xcov)
colnames(Xcov)
stck <- inla.stack(tag='est',     # stack is INLA data structure
                   data=list(deg_rate=raw_spdf$deg_rate),
                   effects=list(
                         s=1:mesh$n,     # spatial random effect
                         Xcov=Xcov),
                   A = list(A, 1)
                   
## define the model:
# - sd(raw_spdf@coords[, 2])/10
# - 0.5*diff(range(raw_spdf@coords[, 2]))

spde = inla.spde2.pcmatern(mesh, 
                           prior.range = c(1.6, 0.5),  
                           prior.sigma = c(0.08, 0.1))

formula = deg_rate ~ -1 + Xcov +    
      f(s, model=spde)                 # spatial random effect

## make some linear combinations, for effect plots:
pop_lc <- inla.make.lincombs(Xcov1 = rep(1, 100),
                             Xcov2 = seq(min(raw_spdf$pop_s), max(raw_spdf$pop_s), len = 100))
names(pop_lc) <- paste0("pop", 1:100)
rain_lc <- inla.make.lincombs(Xcov1 = rep(1, 100),
                              Xcov3 = seq(min(raw_spdf$rain_s), max(raw_spdf$rain_s), len = 100))
names(rain_lc) <- paste0("rain", 1:100)
tlu_lc <- inla.make.lincombs(Xcov1 = rep(1, 100),
                             Xcov4 = seq(min(raw_spdf$tlu_s), max(raw_spdf$tlu_s), len = 100))
names(tlu_lc) <- paste0("tlu", 1:100)
CA_lc <- inla.make.lincombs(Xcov1 = rep(1, 4),
                            Xcov5 = c(0,1,0,0),
                            Xcov6 = c(0,0,1,0),
                            Xcov7 = c(0,0,0,1))
names(CA_lc) <- paste0("CA", 1:4)
all_lc <- c(pop_lc, rain_lc, tlu_lc, CA_lc)

res <- inla(formula, data=inla.stack.data(stck),
            control.predictor=list(A = inla.stack.A(stck), compute=TRUE),
            family = family,
            lincomb = all_lc,
            control.compute = list(cpo = TRUE),
            control.inla = list(int.strategy='eb'),    # empirical bayes (cheap computation for testing)
            verbose = TRUE)

summary(res)

CPO model fit:

model_fit_CPO_maxEdge_0_035.png



Joris Wiethase

unread,
Jun 2, 2021, 5:17:51 AM6/2/21
to R-inla discussion group
Hi all,

since yesterday, I have created a shared folder, containing a shorter script and the files used. Hopefully this will help clarify the problem:

At the 120m resolution, this takes quite a long time to compute, but I can share the model results. At 500m, it should take just a few minutes.

I have noticed that, at 500m resolution, the difference in the fixed effect when using the two different meshes is not as pronounced (although it is there, particularly at 'Xcov6').

Please let me know if there is anything else I can provide, to help with the question.


Kind regards,
Joris

Thierry Onkelinx

unread,
Jun 2, 2021, 7:59:59 AM6/2/21
to Joris Wiethase, R-inla discussion group
Dear Joris,

I'm not sure if this is related to the problem but I've noticed that the estimated range is smaller than 2 times max.edge. The model haq a hard time fitting a smooth SPDE when the range is that close to max.edge. Personally I prefer to have a max.edge that is 1/3 to 1/10 times smaller than the range. max.edge > range / 3: hard to fit a smooth SPDE, max.edge < range / 10: more detail than you really need.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry....@inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey
///////////////////////////////////////////////////////////////////////////////////////////




Op wo 2 jun. 2021 om 11:17 schreef 'Joris Wiethase' via R-inla discussion group <r-inla-disc...@googlegroups.com>:
--
You received this message because you are subscribed to the Google Groups "R-inla discussion group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/r-inla-discussion-group/a66f3506-4c5c-4dc9-9902-5cfa39c1a335n%40googlegroups.com.

Joris Wiethase

unread,
Jun 2, 2021, 9:17:59 AM6/2/21
to R-inla discussion group
Dear Thierry,

thank you for taking the time to answer, this seems very helpful! I will fit the model again using max.edge = range / 5 (so roughly 0.01), to keep computation times a little lower. I will see if the results are still sensitive, and report back.

Many thanks,
Joris

Joris Wiethase

unread,
Jun 2, 2021, 10:18:07 AM6/2/21
to R-inla discussion group
I should have clarified in my initial email: The map unit is WGS84, so the resolution of the data at 120m is roughly 0.001 degrees. 

Finn Lindgren

unread,
Jun 2, 2021, 10:26:37 AM6/2/21
to Joris Wiethase, R-inla discussion group
Hi,
1) WGS84 is not a length unit; CRS info can specify either metres or kilometres.
2) You’re not specifying a CRS when building the mesh, so the mesh will sue the raw coordinates as they are stored (I.e. in whatever projection&scale specified by the input objects).

I recommend using kilometre units when possible for large scale domains, as the numerics of metres quickly becomes difficult. Log/lat coordinates in degreees are anisotropic, so if that matters, avoid building meshes in longlat. (I haven’t checked your code/files for the details of what you do; these are general recommendations.)

Finn

On 2 Jun 2021, at 15:18, 'Joris Wiethase' via R-inla discussion group <r-inla-disc...@googlegroups.com> wrote:

I should have clarified in my initial email: The map unit is WGS84, so the resolution of the data at 120m is roughly 0.001 degrees. 

Joris Wiethase

unread,
Jun 2, 2021, 10:48:57 AM6/2/21
to R-inla discussion group
Hi Finn,

thanks for your input, I will convert things to UTM, so I can work with kilometre units from here on.

Regards,
Joris

Reply all
Reply to author
Forward
0 new messages