Intercept estimate for SPDE model

simon brewer

unread,

Jun 22, 2021, 8:38:54 PM6/22/21

to R-inla discussion group

Hi all

I’m trying to a fit a SPDE model to a set of observations in the Himalayas. When I run a simple intercept only model with the SPDE term, the estimate of the intercept is quite different (~ -0.21) to a model with an intercept but no SPDE (~ -0.17). The mean of the response variable (mb_mwea) is about -0.17.

At first I thought that this was due to either me using the default mesh settings, or the clustered nature of the data (like this post: https://groups.google.com/g/r-inla-discussion-group/c/RqZ5v8jCkzo). So I've tried to improve on the mesh and to incorporate priors on the range and sd, but I still get roughly the same estimate for the intercept term.

I’d appreciate any advice on what I might be doing wrong here. The data are relatively normal looking, but somewhat leptokurtic, so it could be that using a Gaussian likelihood is not the best choice.

I’ve attached the dataset and the R script here.

Thanks in advance!

Simon

simon brewer

unread,

Jun 22, 2021, 8:44:29 PM6/22/21

to R-inla discussion group

hma_glaciers.zip

Finn Lindgren

unread,

Jun 23, 2021, 3:11:29 AM6/23/21

to simon brewer, R-inla discussion group

Hi Simon,

Unless the amount of observation locations is balanced across space, one should not expect the intercept to be the same for those two models.

Clustering is mostly an issue if one tries to resolve both small details within clusters at the same time as the large scale behavior, but the overall density variation in locations matters even without the complication of the mesh.

As a basic example, consider the following simplified setting:

On a 1d interval domain, imagine that we have a small number of observations near the left endpoint, with values around 10, and that we have a large number of observations near the right endpoint with values around 20.

An intercept-only model will then estimate an intercept close to 20, since that’s the plain average across the observations.

But if we take an intercept plus spatial linear slope model, the intercept should instead become close to 15, since that is the _spatial_ average of the estimated model.

The similar thing should happen with an intercept+SPDE model. The intercept will typically become closer to the spatial average of the estimated model, since the SPDE component takes into account the spatial correlation between nearby observations.

So in your case, it’s entirely possible that the difference in intercept estimates is entirely reasonable.

Now, since the mesh is often extended beyond the region of interest, one shouldn’t normally treat the intercept as a parameter of direct interest. If what you’re looking for is an estimate if the spatial average in a subregion of the mesh domain, you instead need to evaluate the spatial average of intercept+SPDE in that subregion. (This can be done via lincomb or posterior samples. The inlabru::ipoints() function can be used to construct the needed integration points&weights)

Finn

On 23 Jun 2021, at 01:38, simon brewer <brew...@gmail.com> wrote:

--
You received this message because you are subscribed to the Google Groups "R-inla discussion group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/r-inla-discussion-group/a2553098-1f05-4e07-ba4b-f32a902aea3dn%40googlegroups.com.

Finn Lindgren

unread,

Jun 23, 2021, 3:15:02 AM6/23/21

to simon brewer, R-inla discussion group

Note: in the illustrative basic example, I forgot to say the domain was centered around location zero.

Finn

On 23 Jun 2021, at 08:11, Finn Lindgren <finn.l...@gmail.com> wrote:

Hi Simon,

Finn Lindgren

unread,

Jun 23, 2021, 3:20:54 AM6/23/21

to simon brewer, R-inla discussion group

If the interval was placed elsewhere, the behaviour would depend on the parameterisation of the slope term; for the example, think intercept+beta*location

Alternatively, intercept +beta*(location-midpoint) which would make it invariant to the interval placement.

The main idea I tried to get across is that one should not expect the intercept to remain unchanged when introducing other model components, especially not when the locations aren’t spread uniformly across space. Just because a component has the same name, it doesn’t mean that it’s interpretation is the same across different models.

Finn

On 23 Jun 2021, at 08:14, Finn Lindgren <finn.l...@gmail.com> wrote:

Note: in the illustrative basic example, I forgot to say the domain was centered around location zero.

simon brewer

unread,

Jun 23, 2021, 9:13:26 AM6/23/21

to R-inla discussion group

Dear Finn

Thank you for the quick and detailed response – this is a really helpful explanation. And this explains why the intercept doesn’t change when running a model on the same mesh with a randomly generated outcome. This still has the same clustering of sites, but the random generation of values removes any spatial structure.

The intercept wasn’t a value I was really interested in, the question arose as I was worried about the change in values. However, having as estimate as a spatial average could be very useful, and I will try your suggestion to evaluate this over a subregion.