Trouble predicting on a grid from a model with rw2 covariates

872 views
Skip to first unread message

Virginia Morera Pujol

unread,
Jul 6, 2018, 6:54:42 AM7/6/18
to R-inla discussion group

Dear all,


I am trying to fit a point pattern model with inla that involves a spde spatial field and one (for now) covariate that doesn’t have a linear relation to the intensity, and so it is modelled as an “rw2” model. Because I am new at inla, I am working by steps of increasing complexity to try and understand what is going on.

 

First of all, mi data, mesh, boundary and covariate values at the mesh nodes and data points are in the attached datafile.

 

I set the spde model like so:


spde <- inla.spde2.pcmatern(mesh = mesh, alpha = 2,### mesh and smoothness parameter

                            prior.range = c(300, 0.01), ### P(practic.range<300)=0.01

                            prior.sigma = c(1, 0.1)) ### P(sigma>3)=0.1

 

And construct my “response” variable (with 0 weights outside the boundary) with the following code:

 

# 1.2 calculate weights for each node ####

dmesh <- inla.mesh.dual(mesh)

 

# number of meshnodes and number of datapoints

(nv <- mesh$n)

(n <- nrow(ds.df3)) #ds.df3 is the dataset containing the points

 

# calculate weights for each meshnode depending on whether they are inside or outside the boundary

sum(w <- sapply(1:length(dmesh), function(i) {

  if (gIntersects(dmesh[i,], bound))

    return(gArea(gIntersection(dmesh[i,], bound)))

  else return(0)

}))

 

table(w > 0) # false = nodes with weight 0 (outside the boundary)/true = nodes with some weight (inside the boundary)

 

# plot mesh, weights, and boundary. As we are working with lgcp we don't mind the integration points being in the nodes of the mesh

par(mar = c(2,2,1,1), mgp = 2:0)

plot(mesh$loc, asp = 1, col = (w == 0) + 1, pch = 19, xlab = '', ylab = '')

plot(mesh, add = TRUE)

plot(bound, border = 3, add = T)

 

# 1.3 Construct response ####

length(y.pp <- rep(0:1, c(nv, n))) # response (0 for all meshnodes, 1 for all data points)

summary(y.pp) # all 0 and 1

# expected (weights at each mesh node and 0 for all data points)

length(e.pp <- c(w, rep(0, n)))

summary(e.pp) # different values

 

The A matrix is generated like this:


lmat <- inla.spde.make.A(mesh, ds.df3@coords)

imat <- Diagonal(nv, rep(1, nv))

A.pp <- rBind(imat, lmat)

 

I now load the dataset with the value of the covariates at the mesh/datapoints. For the example, I’ll use the first column

 

load("covars.Rdata")

 

Now I construct the data stack and prepare the values for the hyperparameters of the rw2 model


stk.data <- inla.stack(data = list(y = y.pp, e = e.pp),

                       A = list(A.pp,1), tag = 'dat',

                       effects = list(list(sp_rdm = 1:spde$n.spde),

                                    data.frame(b0 = 1,

                                               gCovar = inla.group(covars[,1]))))

 

# 1.7 set priors for the covariate model in a list ####

pcprec <- list(prior = "pcprec",

               param = c(1, 0.01))

 

 

As an intermediate step, I run the model without a prediction stack, just to check the covariate’s effect.

 

# 2.1 formula for the model with covariate ####

frml_covar <- y ~ 0 + b0 +

                  f(gCovar, model = 'rw2', scale.model = T, hyper = list(prec = list(param = pcprec$param))) +

                  f(sp_rdm, model = spde)

 

# 2.2 Run the model ####

cov.model <- inla(frml_covar, family = 'poisson',

                              data = inla.stack.data(stk.data),

                              control.predictor = list(A = inla.stack.A(stk.data),

                                                       compute = FALSE, link = 1),

                              E = inla.stack.data(stk.data)$e,

                              control.compute = list(dic = F, waic = F),

                              control.inla = list(int.strategy = "eb"),

                              verbose = TRUE)

 

You can check the output of the model in this image.



Everything seems more or less correct, as well as the spatial field you can see in this other image.



On with step 2, I try and do a prediction stack. For this, following the SPDE tutorial, I first create a dataset of prediction points based on a lattice, and load the covariate values at these new set of points (done previously)

 

# 3.1 prepare the prediction points ####

stepsize <- 10 # this will control the resolution (smaller step size, more resolution, but more computing time)

 

# dimensions of the lattice

(nxy <- round(c(diff(poly.barrier@bbox[1,]), # range of longitude

                diff(poly.barrier@bbox[2,]))/stepsize)) # range of latitude

 

 

projgrid <- inla.mesh.projector(mesh, xlim = poly.barrier@bbox[1,], ylim = poly.barrier@bbox[2,], dims = nxy)

 

# create a Spatial Points dataset where we'll predict

predpts <- SpatialPoints(projgrid$lattice$loc, bbox = bound@bbox)

length(predpts)

 

# check these points are inside the boundary

plot(predpts, pch = ".")

plot(poly.barrier, add = T)

 

load("covarspred.Rdata")

 

I am interested in predicting the effect of the covariate on its own, without the spatial field or the intercept, and so I set the prediction stack like so:

 

stk.pred <- inla.stack(data = list(y = NA), #because we're making predictions

                       tag = "pred",

                       A = list(1),

                       effects = list(data.frame(Intercept = 1,

                                                gcovarpred = inla.group(covarspred[,1]))))

 

 

And now I stack the stacks

 

stk.all <- inla.stack(stk.data, stk.pred)

 

Finally, I run the model including the prediction

 

# 3.4 formula for the predictions

form_all <- y ~ 0 + Intercept +

                f(gcovarpred, model = 'rw2', scale.model = T, hyper = list(prec = list(param = pcprec$param))) +

                f(sp_rdm, model = spde)

 

 

# 3.6 Run the model ####

pred.model <- inla(form_all, family = 'poisson',

                   data = inla.stack.data(stk.all),

                   E = inla.stack.data(stk.all)$e,

                   control.predictor = list(A = inla.stack.A(stk.all), compute = TRUE, link = 1),

                   control.compute = list(dic = FALSE, waic = FALSE),

                          control.inla = list(int.strategy = "eb"),

                          verbose = TRUE)

 

However, when I plot the output of the model, the range and sd for the spatial field, the intercept, the precision for the covariate, and the effect of the covariate all are different. The quantiles of the effect are so far away from the mean that don’t even appear in the plot.



When plotting the mean (top) and sd (bottom) effect at each of the integration points I get complete nonsensical results.



 

Out of experience I am pretty sure I have missed something along the way, but after checking and rechecking the SPDE tutorial and several posts on this group, I still can’t figure out what it is, so any tips will be greatly appreciated!

 

I attach full code and data necessary to rerun the code.

 

 

Thanks in advance and sorry for the lengthy post!

INLA_fitting.R
INLA_helper_functions.R
covars.Rdata
covarspred.Rdata
data_mesh_boundary.Rdata
Auto Generated Inline Image 1
Auto Generated Inline Image 2
Auto Generated Inline Image 3
Auto Generated Inline Image 4

Finn Lindgren

unread,
Jul 6, 2018, 7:43:29 AM7/6/18
to Virginia Morera Pujol, R-inla discussion group
First of all,

I would very strongly recommend using the _same_ formula and varibale names when you estimate the spatial+covariate model as when you try to do prediction in that model, instead of changing the name of the covariates (gCovar vs gcovarpred). Even though you define a new formula with the new names, your existing stack.data uses the original names.
Corrected code:

gCv <- inla.group(covarspred[,1])

stk.pred <- inla.stack(data = list(y = NA), #because we're making predictions
                       tag = "pred",
                       A = list(1),
                       effects = list(data.frame(b0 = 1,
                                                gCovar = gCv)))

stk.all <- inla.stack(stk.data, stk.pred)

form_all <- frml_covar

Now, that change itself isn't enough, since inla then complains about the relative differences between covariate levels being too small, despite the use on inla.group. This is due to a fundamental problem with the inla.group-workaround, in that it needs to be applied simultaneously to _all_ the covariate values to work as intended (in the current code you apply it separately for the estimation and prediction values).  A better (in my opinion) solution is to use a proper basis function representation of the smooth effect.  This can be achieved by replacing the "rw2" model with a 1D spde model (with fixed long range).  I don't have a good working example of this at hand, but 1D spde examples are available here:
See the Tokyo example, with "free" boundary conditions instead of cyclic. degree=2 gives piecewise quadratic basis functions, which plays well the alpha=2 1D spde model.
This decouples the covariate values from the representation of the smooth effect, which mostly eliminates the "observations too close" problem.

Finn



--
You received this message because you are subscribed to the Google Groups "R-inla discussion group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion-group+unsub...@googlegroups.com.
To post to this group, send email to r-inla-discussion-group@googlegroups.com.
Visit this group at https://groups.google.com/group/r-inla-discussion-group.
For more options, visit https://groups.google.com/d/optout.



--
Finn Lindgren
email: finn.l...@gmail.com

Virginia Morera Pujol

unread,
Jul 6, 2018, 9:32:40 AM7/6/18
to Finn Lindgren, R-inla discussion group
Hi Finn,

I did try fitting a 1d spde model for the covariate back when I was trying this with inlabru (this thread), but decided to use rw2 when I had to go back to inla, to make things a bit simpler. The problem now is that I'm not quite sure how to create my stack with the two spde (the 2d for the spatial field and the 1d for the covariate).

My approach so far:

# create mesh
knots <- seq(round(min(covars[,1], na.rm = T) - 5), round(max(covars[,1], na.rm = T) + 5), length = 25) # leave an "outer boundary" of 5
mesh1d <- inla.mesh.1d(knots, interval = c(-10, 15), degree = 2, boundary = "free")

# define model
spde1d <- inla.spde2.pcmatern(mesh = mesh1d, alpha = 2,### mesh and smoothness parameter
                            prior.range = c(20, NA), ### Prange == NA means range is fixed at 20

                            prior.sigma = c(1, 0.1)) ### P(sigma>3)=0.1

# make A matrix for the 1d spde?
A.1d <- inla.spde.make.A(mesh1d, covars[,1])

# create index
covar.index <- inla.spde.make.index("Covar", n.spde = spde1d$n.spde)

# 1.6 stack all the things we've created ####

# we include here the covariates, and choose which effects to use for each model later
# make a stack with data, A = observation matrix (projector matrix), effects = predictors (as many as obs matrices)


stk.data <- inla.stack(data = list(y = y.pp, e = e.pp),
                       A = list(A.pp, A.1d, 1), tag = 'dat',
                       effects = list(list(sp_rdm = 1:spde$n.spde),
                                      list(Covar = covar.index),
                                      data.frame(b0 = 1)))

Which gives the error:

Error in (function (data, A, effects, tag = "", compress = TRUE, remove.unused = TRUE)  : Row count mismatch for A: 4658,4658,1

I am pretty sure the syntax is wrong but can't figure out were...



Virginia Morera-Pujol
PhD Student
BEECA
University of Barcelona


To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion...@googlegroups.com.
To post to this group, send email to r-inla-disc...@googlegroups.com.

Finn Lindgren

unread,
Jul 6, 2018, 10:32:10 AM7/6/18
to Virginia Morera Pujol, R-inla discussion group
You encountered a special case where inla.stack cannot divine the proper length of the b0 covariate.
To fix that, use b0=rep(1, nrow(A.1d)) in the stack constructions, or supply Diagonal(nrow(A.1d), 1.0) as A-matrix;
All three A matrices need to have the same number of rows, which is what the error message is informing about.

Finn

Finn Lindgren

unread,
Jul 6, 2018, 10:35:29 AM7/6/18
to Virginia Morera Pujol, R-inla discussion group
Aaaah, sorry, not Diagonal. I meant
Matrix(1.0, nrow(...), 1)
The number of rows should be the same for all the A matrices (within an inla.stack call), and the number
of columns equal to the number of nodes of the effect, which for the intercept is just 1.

Finn

Finn Lindgren

unread,
Jul 6, 2018, 10:59:20 AM7/6/18
to Virginia Morera Pujol, R-inla discussion group
To make the spde version behave more like the rw2 model wrt identifiability, you’ll also need
  constr=TRUE
for the 1d spde, to impose an integrate-to-zero constraint.

Finn

Virginia Morera Pujol

unread,
Jul 6, 2018, 11:21:56 AM7/6/18
to Finn Lindgren, R-inla discussion group
Hi Finn,

Lucky me, with all the special cases... I tried your suggestion of using "Matrix(1.0, nrow(...), 1) ":

A.1d <- inla.spde.make.A(mesh1d, covars[,1])
A.1dM <- Matrix(1, nrow(A.1d), 1)

So str(A.1dM) gives:

Formal class 'dgeMatrix' [package "Matrix"] with 4 slots
  ..@ x       : num [1:4658] 1 1 1 1 1 1 1 1 1 1 ...
  ..@ Dim     : int [1:2] 4658 1
  ..@ Dimnames:List of 2
  .. ..$ : NULL
  .. ..$ : NULL
  ..@ factors : list()

This gives the error:

Error in (function (data, A, effects, tag = "", compress = TRUE, remove.unused = TRUE)  : 
  ncol(A) does not match nrow(effect) for block 2: 1 != 28

As you said that the number of columns should be equal to the number of nodes of the effect, I guess that means the ncol should be equal to spde1d$n.spde, so I've also tried

A.1d <- inla.spde.make.A(mesh1d, covars[,1])
A.1dM <- Matrix(1, nrow(A.1d), 28) # 28 instead of 1 (as Finn suggested) to make the matrix have spde1$n.spde columns.

So str(A.1dM) gives:

Formal class 'dgeMatrix' [package "Matrix"] with 4 slots
  ..@ x       : num [1:130424] 1 1 1 1 1 1 1 1 1 1 ...
  ..@ Dim     : int [1:2] 4658 28
  ..@ Dimnames:List of 2
  .. ..$ : NULL
  .. ..$ : NULL
  ..@ factors : list()


So A.1dM@Dim[2] is equal to spde1d$n.spde

However, trying to stack this gives another error:

Error in rbind(deparse.level, ...) : numbers of columns of arguments do not match


Which doesn't give me a lot of info to work with. 

Sorry if I seem a bit clueless but of all the things I don't understand about inla (and believe me, there are many), A matrices are the thing I understand the least! 


Virginia Morera-Pujol
PhD Student
BEECA
University of Barcelona

Finn Lindgren

unread,
Jul 6, 2018, 11:43:35 AM7/6/18
to Virginia Morera Pujol, R-inla discussion group
In your original code you had 2 stack calls, one for estimation and one for prediction. You need different A matrices for theb0 component in the two calls. In both cases they should be matrices of ones with a single column.
(Need to see both the code and the error message to know what’s going on)

Finn

Virginia Morera Pujol

unread,
Jul 6, 2018, 1:02:35 PM7/6/18
to Finn Lindgren, R-inla discussion group
Hi again,

I hadn't even gotten to the prediction call with the two stacks, I was just trying to change the covariate model in the estimation stack from a rw2 to a 1d spde model. I have updated the code with my attempts, the updated part is between the headings #### UPDATED TO ADD SPDE #### and #### IT CRASHES HERE #### The rest of the code and data is the same


Virginia Morera-Pujol
PhD Student
BEECA
University of Barcelona

INLA_fitting.R

Finn Lindgren

unread,
Jul 6, 2018, 1:12:32 PM7/6/18
to Virginia Morera Pujol, R-inla discussion group
You need to go back to the slightly earlier code that used covar=index; you’ve commented out that version, but
that’s the correct approach, and that’s what needs debugging. And please provide the precise error message, not just where it happened.

Finn
<INLA_fitting.R>

Virginia Morera Pujol

unread,
Jul 6, 2018, 1:47:30 PM7/6/18
to Finn Lindgren, R-inla discussion group
Hi Finn,

Sorry for not being more precise! I have made several attempts at making the covar=index approach work:

# 1.6 Create 1D SPDE model for the covariates ####


# create mesh
knots <- seq(round(min(covars[,1], na.rm = T) - 5), round(max(covars[,1], na.rm = T) + 5), length = 25) # leave an "outer boundary" of 5
mesh1d <- inla.mesh.1d(knots, interval = c(-10, 15), degree = 2, boundary = "free")

# define model
spde1d <- inla.spde2.pcmatern(mesh = mesh1d, alpha = 2,### mesh and smoothness parameter
                            prior.range = c(20, NA), ### Prange == NA means range is fixed at 20
                            prior.sigma = c(1, 0.1), ### P(sigma>1)=0.1
                            constr = TRUE)
spde1d$n.spde

# make A matrix for the 1d spde
A.1d <- inla.spde.make.A(mesh1d, covars[,1])
str(A.1d) #4658 x 28 correct?

# A.1dM <- Matrix(1.0, nrow(A.1d), 1)
A.1dM <- Matrix(1.0, nrow(A.1d), 28)
str(A.1dM) # 4658*28 correct?


# create index
covar.index <- inla.spde.make.index("Covar", n.spde = spde1d$n.spde) # I had done this following the Tokyo example, but I think this is only for temporal models?


# 1.6 stack all the things we've created ####

# OPTION A with A.pp for the 2dspde, A.1d for the 1d spde and 1 for the A matrix of the intercept
stk.data <- inla.stack(data = list(y = y.pp, e = e.pp),
                       A = list(A.pp,  # 4658*564
                                A.1d,  # 4658*28
                                1),    # for the intercept
                       tag = 'dat',
                       effects = list(list(sp_rdm = 1:spde$n.spde),   # 564
                                      list(Covar = covar.index),       # 28
                                      list(b0 = rep(1, nrow(A.1d))))) # 4658??


# this gives the error
# Error in rbind(deparse.level, ...) :
#   numbers of columns of arguments do not match

# OPTION B (intercept with the A for the 1d spde)
stk.data <- inla.stack(data = list(y = y.pp, e = e.pp),
                       A = list(A.pp,  # 4658*564
                                A.1dM),  # 4658*28
                       tag = 'dat',
                       effects = list(list(sp_rdm = 1:spde$n.spde),    # 564
                                      list(Covar = covar.index,        # 28
                                           b0 = rep(1, nrow(A.1d)))))  # 4658??

# this gives the error
# Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  :
#   arguments imply differing number of rows: 28, 4658

# OPTION C (intercept with the A for the 2d spde)
stk.data <- inla.stack(data = list(y = y.pp, e = e.pp),
                       A = list(A.pp,  # 4658*564
                                A.1dM),  # 4658*28
                       tag = 'dat',
                       effects = list(list(sp_rdm = 1:spde$n.spde,   # 564
                                           b0 = rep(1, nrow(A.pp))), # 4658??
                                      list(Covar = covar.index)))    # 28

# this gives the error
# Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  :
#   arguments imply differing number of rows: 564, 4658

# OPTION D (intercept with the A for the 2d spde, but length = 564)
stk.data <- inla.stack(data = list(y = y.pp, e = e.pp),
                       A = list(A.pp,  # 4658*564
                                A.1dM),  # 4658*28
                       tag = 'dat',
                       effects = list(list(sp_rdm = 1:spde$n.spde, # 564
                                           b0 = rep(1, 564)),      # 564??
                                      list(Covar = covar.index)))  # 28


# this gives the error
# Error in rbind(deparse.level, ...) :
#   numbers of columns of arguments do not match

Attached is the complete code with the updates


Virginia Morera-Pujol
PhD Student
BEECA
University of Barcelona

INLA_fitting.R

Finn Lindgren

unread,
Jul 6, 2018, 1:57:33 PM7/6/18
to Virginia Morera Pujol, R-inla discussion group
Instead of
  list(Covar=covar.index)
you should have just
  covar.index

The output from inla.spde.make.index is already the required list!
(It works the same as for 2D meshes, but now I see you didn’t use the index function for that...)

Finn
<INLA_fitting.R>

Finn Lindgren

unread,
Jul 6, 2018, 1:58:45 PM7/6/18
to Virginia Morera Pujol, R-inla discussion group
...and that refers to your option A.

Finn

Virginia Morera Pujol

unread,
Jul 6, 2018, 2:18:51 PM7/6/18
to Finn Lindgren, R-inla discussion group
Ok.

I did fix the problem for the estimation stack, and now I am able to estimate the effect of the covariate on my pattern.

I still get different effects for the model with the prediction stack and without, which was in fact the original problem, but I will look deeper into that under this new conditions and see if I have solved it. I'll just leave the correct code here in case anyone has the same problem and finds this thread.

Thank you again for all the help!


Virginia Morera-Pujol
PhD Student
BEECA
University of Barcelona

INLA_fitting.R

Finn Lindgren

unread,
Jul 6, 2018, 2:24:19 PM7/6/18
to Virginia Morera Pujol, R-inla discussion group
In your original code you had different formulas and component names in the estimation and prediction,
so make sure you fixed that problem.
When adding the prediction part, the only thing you should need to change is to add the prediction stack.
Don’t change the formula.

Finn
<INLA_fitting.R>

Sylvan Benaksas

unread,
Feb 20, 2023, 6:37:08 AM2/20/23
to R-inla discussion group
Hi Finn, 

Just to follow up on this as I am trying to apply this code. If I normally scale my covariates (eg scale(x)), when applying a 1d spde smoother, should I use scaled values of the covariate or the regular value intervals as the knots for the 1d mesh? All other covariates in the model will be fit as linear scaled fixed effects.

Cheers,
Sylvan 

Sylvan Benaksas

unread,
Feb 20, 2023, 10:22:16 AM2/20/23
to R-inla discussion group
Hi Finn,

Thanks for the advice I have tried to fit the spde effect with appropriate scaled data. however I am having an issue when trying to set the prior range for the 1d spde pc priors.

the following code gives me an error

knots<-seq(min(dat1$depth.std, na.rm = T) - 1,
           max(dat1$depth.std, na.rm = T) + 1, length = 15)
                  # leave an "outer boundary" of 1

mesh1d <- inla.mesh.1d(knots, interval = c(min(knots),max(knots)),
                       degree = 2, boundary = "free", alpha=2)

A_dep <- inla.spde.make.A(mesh1d, dat1$depth.std)

spde_dep <- inla.spde2.matern(mesh1d,
                              prior.range = c(5, NA),
                              prior.sigma = c(0.05,0.01),
                              constr = T)

"Error in qr.coef(a, b) : 'qr' and 'y' must have the same number of rows"

the inla.spde2.matern function only runs if I remove the prior.range argument, do you have any thoughts on what is causing that problem?

Thanks again,

Sylvan 
Reply all
Reply to author
Forward
0 new messages