INLA-SPDE vs. Kriging

973 views
Skip to first unread message

Juan Ossa-Moreno

unread,
May 17, 2017, 9:12:47 PM5/17/17
to R-inla discussion group
Dear R-Inla group

Am currently working with INLA-SPDE and Kriging. I did a review in both methods but am still far from mastering the topic and there are some things I still do not understand completely. I wanted to suggest a discussion of both methods, that could help clarifying the differences between them. Hopefully this is useful not only for me but for other users with similar doubts. I summarise my points as follows, feel free to comment on them or add more. I write some questions as well, I would be very happy if anyone could help me solving them. 

1.       “Kriging is a generic name for a family of generalized least-square regression algorithms.” (Goovaerts, 1999).

INLA is a method to perform “Bayesian Inference in a subclass of structured additive regression models, named latent Gaussian models”. Rue et al (2009)

From this I would understand that Kriging is still a structured additive regression model, thus could it be analysed inside the INLA approach? How would the Bayes approach be included in Kriging?

Would the Z*(u) kriging estimator (shown below) be analogous , to the y_i, miu_i or nu_iin the structured additive regression model? How could the right hand side of the Kriging equation be integrated in the INLA one?

Equation 1 in Rue et al (2009)

μ_i=η_i=α+∑_(j=1)^(n_f)〖f^((j) ) (u_ji ) 〗+∑_(k=1)^(n_β)〖β_k (z_ki ) 〗+ε_i

Equation 13 in Goovaerts (1999)

Z^* (u)-m(u)=∑_(α=1)^n(u)〖λ_α (u) 〗 [Z(u_α )-m(u_α)]

I understand that the i and u both denote the location of the variables, but I wanted to keep both equations exactly as reported by authors.



2.       The variograms are key in Kriging.

Equation 1 in Goovaerts (1999)

γ ̂(h)=1/(2N(h)) ∑_(α=1)^(N(h))[z(u_α )-z(u_α+h)]^2 

Instead of analysing covariance, INLA uses precision matrices of random fields with a neighbourhood structure, specifically those with a Matern covariance function.

Equation 1 in Lindgren et al (2011)

r(u,v)=σ^2/(2^(υ-1) Γ(υ) ) (κ‖v-u‖)^ν Κ(κ‖v-u‖)

This means that the concept of variogram is analogous to that of the Matern Covariance function. The difference would be that while Kriging uses the variogram directly, INLA uses the inverse covariance function (precision matrix) in the SPDE approach.



3.       Co-kriging  and Kriging with External Drift allow introducing covariates in a Kriging analysis. However, it seems that this is way more limited than what INLA allows in terms of including covariates and in terms of including functions of covariates.



4.       Finally, while Kriging objective is to “minimise the estimation or error variance  〖σ^2〗_E=Var{Z^* (u)-Z(u)} ” Goovaerts (1999), “the objective of the SPDE approach is to find a GMRF, with local neighbourhood and sparse precision matrix Q, that best represents the Matern Field.” (Cameletti et al, 2013)

 

Thanks a lot in advance for your help!


Regards


Juan


References


Rue, Håvard, Sara Martino, and Nicolas Chopin. "Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations." Journal of the royal statistical society: Series b (statistical methodology) 71.2 (2009): 319-392.

Goovaerts, Pierre. "Geostatistics in soil science: state-of-the-art and perspectives." Geoderma 89.1 (1999): 1-45.

Lindgren, Finn, Håvard Rue, and Johan Lindström. "An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach." Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73.4 (2011): 423-498.

Blangiardo, Marta, et al. "Spatial and spatio-temporal models with R-INLA." Spatial and spatio-temporal epidemiology 7 (2013): 39-55.


Equations can be seen in more detail in the attached file.

INLASPDEvsKriging.docx

Finn Lindgren

unread,
May 18, 2017, 5:15:18 AM5/18/17
to Juan Ossa-Moreno, R-inla discussion group
Hi,
[warning: long!]
[note: some of the issues are included in
http://www.maths.ed.ac.uk/~flindgre/tmp/gmrf.pdf ]

These an similar questions appear from time to time, and unfortunately the
problem is usually that the question of "What?" gets mixed up in the
question of "How?".

Kriging is not a model.
Kriging is a procedure for constructing an estimator, which is
valid/relevant for some models.
(Your quote from Goovaerts even says that.)
"The kriging equations" is a set of formulas that can be used to
evaluate the estimator, but
the same estimator can be evaluated in more than one way.

For example, using the matrix inversion lemma provides a set of
equations expressed in
terms of precision matrices instead of covariance matrices. When
evaluated (exactly)
they will provide (exactly) the same estimator.

Assume a fully Gaussian model
x(s) = Gaussian process with mean \mu(s) and covariance R(s,s')
Y = {y_i, i=1,...,N}, y_i = x(s_i) + e_i
e_i = iid N(0, \sigma_e^2)

\mu() and R(), and the e_i-distribution depend on parameters \theta

In a fully Gaussian model, the Kriging estimator simply evaluates the
conditional expectation

E(x(s) | Y, theta)

by explicitly handling the joint expectation and covariance matrix for
{x(s), x(s_1), ..., x(s_N), y_1, ..., y_N}

The above is all about "What?", except for the detail of how the
conditional expectation is computed, which
is an answer to "How?".

Now, the SPDE/GMRF approach _also_ fits the above. The difference to
covariance or variogram
Kriging only appears in which subset of models we consider, and how
the computations are carried out.
In fact, a much wider class of models follows the same principle; the
SPDE/GMRF part is just to obtain
sparse matrices.

Restrict the model class to fields that can be written
x(s) = \sum_{k=1}^n \psi_k(s) w_k
where \psi_k(s) are some basis functions, and w_k have a joint N(m, S)
multivariate
Gaussian distribution. then the mean and covariance for x(s) are

\mu(s) = \sum_k \psi_k(s) m_k
R(s, s') = \sum_i \sum_j \psi_i(s) \psi_j(s') S_{ij}

Entering this into the classical Kriging equations will yield the same answer as
using the precision based expressions, where Q_w = S^{-1}, Q_e = I/\sigma_e^2,

Y = A w + e, where A_ij = \psi_j(s_i), e=(e_1,...,e_N)
Q_{w|Y} = Q_w + A' Q_e A
\mu_{w|Y} = m + Q_{w|Y} A' Q_e ( Y - A m )
\mu_{x|Y}(s) = \sum_k \psi_k(s) (\mu_{m|Y})_k
(etc for the kriging variance)

The SPDE/GMRF approach is simply a specific choice of basis functions \psi_k(s),
and choice of Q_w, such that x(s) is approximately a solution to a given SPDE.

What about "simple", "universal" kriging etc, or variogram models with
singular covariance?
These are all special cases, where e.g. a global intercept and
covariates are used as basis functions.
One needs to be careful with the singularities in deriving some of the
expressions, but on a
fundamental level they are just limiting cases, here some prior
variances go to infinity.

So, where does INLA fit into the above?
It doesn't. It's an orthogonal concept that involves some of the same
computations,
since that's what probability theory tells us is the right thing to do.
INLA is a method for Bayesian inference in Latent Gaussian Models, and again the
separation of What? and How? is important. INLA attempts to find
approximations to
the posterior densities
p(\theta | Y)
and
p(w_k | Y) (and also p(x(s) | Y) )

Even in a fully Gaussian model, note the difference between
E(x(s) | Y) and E(x(s) | Y, \theta).
Kriging deals with the latter, whereas the former is something else.

For non-Gaussian models, the inner Laplace approximation in INLA deals with
p(w | Y, \theta)
by computing a Gaussian approximation. In a fully Gaussian model, that step is
equivalent to Kriging.

When trying to understand the differences between these models and methods,
please remember the difference between a model and a computational method.
Sometimes computational methods follow directly from model statements, but often
there are several methods for computing the same thing.

I'll add some more comments interspersed below:

On 18 May 2017 at 02:12, Juan Ossa-Moreno <js.os...@gmail.com> wrote:
> From this I would understand that Kriging is still a structured additive
> regression model, thus could it be analysed inside the INLA approach? How
> would the Bayes approach be included in Kriging?

The Kriging equations provide one method for computing estimates of parameters
in additive regression models, conditionally on covariance parameters, yes.
Kriging says nothing about where the covariance parameters come from.
INLA is a method for estimating those in a Bayesian setting (and using the resul
to get spatial predictions, for example).
(Note: the kriging equations can be computed regardless of whether there
is a Gaussian model or not; it's just that in a Bayesian setting they appear
automatically when Gaussian variables are involved)

>
> Would the Z*(u) kriging estimator (shown below) be analogous , to the y_i,
> miu_i or nu_iin the structured additive regression model? How could the
> right hand side of the Kriging equation be integrated in the INLA one?

No. y_i are observations, and the others I believe are model parameters in
your formulation? Never confuse model parameters with their estimators!

See above for more on how all of this comes from basic theory
for multivariate Gaussian distributions.

> 2. The variograms are key in Kriging.

In the "kriging equations", yes. But you can either use the matrix inversion
lemma to reformulate the equations to covariances or precisions (except for
intrinsic stationary models with singular covariances, but those tend to have
well-defined precisions if one just handles limits properly; See Rue
and Held (2005) )
without changing what is being computed.

> Instead of analysing covariance, INLA uses precision matrices of random
> fields with a neighbourhood structure, specifically those with a Matern
> covariance function.

INLA doesn't care what the covariance is, and the R-INLA package just
sees the precision matrix. Even the SPDE based GMRFs go beyond Matern,
e.g. on the sphere, and oscillating covariances.

> This means that the concept of variogram is analogous to that of the Matern
> Covariance function. The difference would be that while Kriging uses the
> variogram directly, INLA uses the inverse covariance function (precision
> matrix) in the SPDE approach.

Yes. Except that it doesn't matter if it's Matern or not.

> 3. Co-kriging and Kriging with External Drift allow introducing
> covariates in a Kriging analysis. However, it seems that this is way more
> limited than what INLA allows in terms of including covariates and in terms
> of including functions of covariates.

Yes, those are just specific models where someone wrote down the
explicit formulas.
The latent Gaussian formulation includes all linear (Gaussian) models.

> 4. Finally, while Kriging objective is to “minimise the estimation or
> error variance 〖σ^2〗_E=Var{Z^* (u)-Z(u)} ” Goovaerts (1999), “the objective
> of the SPDE approach is to find a GMRF, with local neighbourhood and sparse
> precision matrix Q, that best represents the Matern Field.” (Cameletti et
> al, 2013)

These are two completely unrelated statements.

The Goovaerts statement concerns a characterisation of the kriging estimator;
Using the conditional expectation E(x|Y,theta) as an estimator minimises
the expected squared deviation. It basically means "probability theory provides
the answer".

The Cameletti et al statement relates to how the Q_w matrix should be chosen to
get the distribution of x(.) close to a Gaussian field with a specific
covariance function.
Once that done, again "probability theory provides the answer".

Finally, sorry if this came across as a bit "gruff"; I'm just a bit
annoyed that these
basic things aren't taught properly almost anywhere. ;-)

Finn

Finn Lindgren

unread,
May 18, 2017, 5:26:05 AM5/18/17
to Juan Ossa-Moreno, R-inla discussion group
Addendum:

I have recently learned that in some sciences (some parts of
geosciences at least), the word "model" is sometimes used to mean what
in statistics would be called an "estimate", which is fundamentally
different from a statistical _model_.

I you're in such a field of science, beware!

Finn
--
Finn Lindgren
email: finn.l...@gmail.com

Juan Ossa-Moreno

unread,
May 18, 2017, 9:52:24 PM5/18/17
to R-inla discussion group, js.os...@gmail.com

Hi, thanks a lot for your answer.


Taking into account the model:

x(s)  ~ N(μ(s),R(s,s^' ))

Y={y_i,i=1,…,N},y_i=x(s_i )+ e_i

link function g(.),g(μ_i )=η_i

the y_i are observations, \mu() and R()  are parameters that depend on hyper-parameters \theta (such as \sigma_e^2)?, and η_i is the predictor. Kriging would then be one of the posible methods to compute the expected conditional value

E(x(s)│Y,\theta)

The output of Kriging represents an estimate of the parameters of the model, however, if Kriging does not say anything about covariance parameters, where do these come from. Could perhaps these be related to the fitting of the theoretical variogram on the experimental one? Perhaps parameters of the variogram such as the nugget effect?


Regards Juan

Finn Lindgren

unread,
May 19, 2017, 12:23:48 AM5/19/17
to Juan Ossa-Moreno, R-inla discussion group
1) When there is a link function, the conditional distribution x|Y,theta is non-Gaussian, and the traditional kriging equations do not give E(x|Y,theta)
2) Non-Bayesian estimation of \theta and x aren't really on topic for this list; the INLA package does Bayesian estimation.
3) The parameters \theta (which includes all covariance parameters of the model) are the "hyperparameters" in INLA, and INLA finds the posterior distributions of them and of x. This is how they are estimated. See Rue et al 2009 for details.
4) See Lindgren, Rue, Lindström 2011 for more on the connection between covariances and GMRFs. Variograms (actual variograms, not the empirical ones) are just functions of covariance functions.

Finn
--
You received this message because you are subscribed to the Google Groups "R-inla discussion group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion...@googlegroups.com.
To post to this group, send email to r-inla-disc...@googlegroups.com.
Visit this group at https://groups.google.com/group/r-inla-discussion-group.
For more options, visit https://groups.google.com/d/optout.
Reply all
Reply to author
Forward
0 new messages