fitted values change in the same model with/without prediction stack

1,004 views

Skip to first unread message

Virginia Morera Pujol

unread,

Jul 9, 2018, 10:48:21 AM7/9/18

to R-inla discussion group

Dear all,

This is a continuation of this thread, but as the problem evolved and doesn’t fit the subject any more, I am starting a new one.

The general idea: I am running a point pattern model with a 2d spde for the spatial effect and a 1d spde to fit the covariate. I want to achieve two things:

1 – To check the effect the covariate has on the intensity of my point pattern.

2 – To plot this effect mean and sd on a grid.

But I have two problems:

First, I get a slightly different output (fitted values, range, spatial variance, precision) with and without the prediction stack, even if the rest of the code and the formula are the same, and

Second, I can’t project the fitted values onto a grid.

I have been jumping between this spde tutorial and the code for Blangiardo et al 2012. The code to set up my model is here, and the rest of the code, and data, are attached.

#### 0.3 load data, mesh and boundary ####

load("data_mesh_boundary.Rdata")

## Remove CRS information

proj4string(bound) = inla.CRS()

proj4string(ds.df3) = inla.CRS()

mesh$crs <- NULL

# set seed

set.seed(2016)

set.inla.seed = 2016

#### 0.4 Source helper functions ####

source("C:/Users/Virginia/Dropbox/Virginia/GPS_multi/scripts/INLA_helper_functions.R")

#### 1. PREPARE NECESSARY DATA ####

# 1.1 calculate weights for each node ####

dmesh <- inla.mesh.dual(mesh)

# par(mfrow = c(1,2));plot(mesh);plot(dmesh)

# par(mfrow = c(1,1))

# number of meshnodes and number of datapoints

(nv <- mesh$n)

(n <- nrow(ds.df3))

# calculate weights for each meshnode depending on whether they are inside or outside the boundary

sum(w <- sapply(1:length(dmesh), function(i) {

if (gIntersects(dmesh[i,], bound))

return(gArea(gIntersection(dmesh[i,], bound)))

else return(0)

}))

table(w > 0) # false = nodes with weight 0 (outside the boundary)/true = nodes with some weight (inside the boundary)

# plot mesh, weights, and boundary. As we are working with lgcp we don't mind the integration points being in the nodes of the mesh

par(mar = c(2,2,1,1), mgp = 2:0)

plot(mesh$loc, asp = 1, col = (w == 0) + 1, pch = 19, xlab = '', ylab = '')

plot(mesh, add = TRUE)

plot(bound, border = 3, add = T)

# 1.2 Construct response ####

length(y.pp <- rep(0:1, c(nv, n))) # response (0 for all meshnodes, 1 for all data points)

summary(y.pp) # all 0 and 1

# expected (weights at each mesh node and 0 for all data points)

length(e.pp <- c(w, rep(0, n)))

summary(e.pp) # different values

# 1.3 projector matrix ####

lmat <- inla.spde.make.A(mesh, ds.df3@coords)

imat <- Diagonal(nv, rep(1, nv))

A.pp <- rBind(imat, lmat)

str(A.pp) # 4658 x 564, correct

# 1.4 collect all covariates at points and mesh vertex ina single dataset ####

# this contains all covars at mesh nodes and locations scaled using base::scale

load("covars.Rdata")

# 1.5 create spde model and indices ####

spde <- inla.spde2.pcmatern(mesh = mesh, alpha = 2,### mesh and smoothness parameter

prior.range = c(300, 0.01), ### P(practic.range<300)=0.01

prior.sigma = c(1, 0.1)) ### P(sigma>3)=0.1

field.indices <- inla.spde.make.index("sp_rdm", n.spde = spde$n.spde)

# 1.6 Create 1D SPDE model for the covariates ####

# create mesh

knots <- seq(round(min(covars[,1], na.rm = T) - 5), round(max(covars[,1], na.rm = T) + 5), length = 25) # leave an "outer boundary" of 5

mesh1d <- inla.mesh.1d(knots, interval = c(-10, 15), degree = 2, boundary = "free")

# define model

spde1d <- inla.spde2.pcmatern(mesh = mesh1d, alpha = 2,### mesh and smoothness parameter

prior.range = c(20, NA), ### Prange == NA means range is fixed at 20

prior.sigma = c(1, 0.1), ### P(sigma>1)=0.1

constr = TRUE)

# make A matrix for the 1d spde

A.1d <- inla.spde.make.A(mesh1d, covars[,1])

str(A.1d) #4658 x 28 correct?

# create index

covar.index <- inla.spde.make.index("Covar", n.spde = spde1d$n.spde) # I had done this following the Tokyo example, but I think this is only for temporal models?

# 1.6 stack all the things we've created ####

# we include here the covariates, and choose which effects to use for each model later

# make a stack with data, A = observation matrix (projector matrix), effects = predictors (as many as obs matrices)

stk.data <- inla.stack(data = list(y = y.pp, e = e.pp),

A = list(A.pp, # 4658*564

A.1d), # 4658*28

tag = 'dat',

effects = list(c(field.indices, list(b0 = 1)), # 564

covar.index)) # 28

Now, without creating a prediction stack I can run this model and check the effects of the covariate like this:

#### 2. MODEL WITH COVARIATE ####

# 2.1 formula for the model with covariate ####

frml_covar <- y ~ 0 + b0 +

f(Covar, model = spde1d) +

f(sp_rdm, model = spde)

# check it is correct

frml_covar

# 2.2 Run the model ####

cov.model <- inla(frml_covar, family = 'poisson',

data = inla.stack.data(stk.data),

control.predictor = list(A = inla.stack.A(stk.data),

compute = FALSE, link = 1),

E = inla.stack.data(stk.data)$e,

control.compute = list(dic = F, waic = F),

control.inla = list(int.strategy = "eb"),

verbose = TRUE)

# 2.3 Output of the model ####

m <- matrix(c(1,2,3,4,5,5),nrow = 2,byrow = TRUE) # 5 plotting areas, without scale

layout(mat = m, heights = c(0.5, 0.5))

plot(cov.model$marginals.fix[[1]], type = 'l', xlab = 'Intercept', ylab = 'Density')

plot(cov.model$marginals.hyperpar[[3]], type = 'l', xlab = expression(sigma^2~"of the spatial random field"), ylab = 'Density')

plot(cov.model$marginals.hyperpar[[2]], type = 'l', xlab = 'Range for the spatial random field', ylab = 'Density')

plot(cov.model$marginals.hyperpar[[1]], type = 'l', xlab = expression(tau~"for the covariate"), ylab = 'Density')

plot(cov.model$summary.random[[1]][,1:2], type = 'l', xlab = 'Covariate effect', ylab = 'Effect')

abline(h = 0, lty = 3)

for (i in c(4,6)) {lines(cov.model$summary.random[[1]][,c(1,i)], lty = 2) }

# spatial field

par(mfrow = c(1,1))

p.mean <- my.levelplot(proj, cov.model$summary.random$sp_rdm$mean, col.regions = plasma(20), main = "Posterior mean")

(p.mean <- p.mean + latticeExtra::layer(sp.polygons(poly.barrier, lwd = 1, fill = "gray")))

p.sd <- my.levelplot(proj, cov.model$summary.random$sp_rdm$sd, col.regions = plasma(20), main = "Posterior mean")

(p.sd <- p.sd + latticeExtra::layer(sp.polygons(poly.barrier, lwd = 1, fill = "gray")))

grid.arrange(p.mean, p.sd, ncol = 2)

But if I add the prediction stack, and run the model again, using exactly the same formula and inla call, just changing to compute = TRUE in the control.predictor() call:

####3. PREDICTIVE MODELS WITH COVARIATES ####

# 3.1 Create the prediction stack ####

stk.pred <- inla.stack(data = list(y = NA, e = NA),

tag = "pred",

A = list(A.1d, 1), # for the intercept

effects = list(covar.index,

list(b0 = rep(1, nrow(A.1d))))) # 4658??

# 3.2 Stack the stacks ####

stk.all <- inla.stack(stk.data, stk.pred)

# 3.3 Rerun the model with the prediction stack ####

#exactly the same model but with stk.all instead of stk.data, na in the data call, and compute = TRUE in the predictor call

pred.model <- inla(frml_covar, family = 'poisson',

data = inla.stack.data(stk.all),

control.predictor = list(A = inla.stack.A(stk.all),

compute = TRUE, link = 1),

E = inla.stack.data(stk.all)$e,

control.compute = list(dic = FALSE, waic = FALSE),

control.inla = list(int.strategy = "eb"),

verbose = TRUE)

Here you can see the slightly different output. For the covariate effect, for example, the shape is the same but the values change.

# 3.4 Outputs of the model ####

m <- matrix(c(1,2,3,4,5,5),nrow = 2,byrow = TRUE) # 5 plotting areas, without scale

layout(mat = m, heights = c(0.5, 0.5))

plot(pred.model$marginals.fix[[1]], type = 'l', xlab = 'Intercept', ylab = 'Density')

plot(pred.model$marginals.hyperpar[[3]], type = 'l', xlab = expression(sigma^2~"of the spatial random field"), ylab = 'Density')

plot(pred.model$marginals.hyperpar[[2]], type = 'l', xlab = 'Range for the spatial random field', ylab = 'Density')

plot(pred.model$marginals.hyperpar[[1]], type = 'l', xlab = expression(tau~"for the covariate"), ylab = 'Density')

plot(pred.model$summary.random[[1]][,1:2], type = 'l', xlab = 'Covariate effect', ylab = 'Effect')

abline(h = 0, lty = 3)

for (i in c(4,6)) {lines(pred.model$summary.random[[1]][,c(1,i)], lty = 2) }

# spatial field

p2.mean <- my.levelplot(proj, pred.model$summary.random$sp_rdm$mean, col.regions = plasma(20), main = "Posterior mean")

(p2.mean <- p2.mean + latticeExtra::layer(sp.polygons(poly.barrier, lwd = 1, fill = "gray")))

p2.sd <- my.levelplot(proj, pred.model$summary.random$sp_rdm$sd, col.regions = plasma(20), main = "Posterior mean")

(p2.sd <- p2.sd + latticeExtra::layer(sp.polygons(poly.barrier, lwd = 1, fill = "gray")))

grid.arrange(p.mean, p2.mean, p.sd, p2.sd, ncol = 2) #for comparison

And here's my attempt so far to project the fitted values onto a grid:

# create index from stack

id.prd <- inla.stack.index(stk.all,'pred')$data

# use the index to extract the effect values

mean_effect <- pred.model$summary.fitted.values$mean[id.prd]

m.eff <- my.levelplot(proj, mean_effect)

# generate grid to project into

seq.x.grid <- seq(from = poly.barrier@bbox[1,1], to = poly.barrier@bbox[1,2],

length = round(diff(poly.barrier@bbox[1,])/10)) #10 km cells

seq.y.grid <- seq(from = poly.barrier@bbox[2,1],to = poly.barrier@bbox[2,2],

length = round(diff(poly.barrier@bbox[2,])/10)) #10 km cells

pred.grid = expand.grid(seq.x.grid,seq.y.grid)

str(pred.grid)

# dimensions of the grid (dividing range of long and lat by stepsize to get ncol and nrow)

(nxy <- round(c(diff(poly.barrier@bbox[1,]), # range of longitude

diff(poly.barrier@bbox[2,]))/stepsize)) # range of latitude

# Create mapping between mesh and grid

proj_grid <- inla.mesh.projector(mesh,

xlim = range(pred.grid[,1]),

ylim = range(pred.grid[,2]),

dims = nxy)

# Project the linear predictor on the grid

lp.mean.grid = inla.mesh.project(proj_grid, mean_effect)

At this step I get the error:

Error in projector$proj$A %*% as.vector(field) :

Cholmod error 'X and/or Y have wrong dimensions' at file ../MatrixOps/cholmod_sdmult.c, line 90

Any help will be greatly appreciated!

Virginia.

INLA_fitting_last.R

INLA_helper_functions.R

barrier_model_data.Rdata

covars.Rdata

Auto Generated Inline Image 1

Auto Generated Inline Image 2

Auto Generated Inline Image 3

Auto Generated Inline Image 4

Finn Lindgren

unread,

Jul 9, 2018, 11:52:15 AM7/9/18

to Virginia Morera Pujol, R-inla discussion group

From the output of one of the inla runs (the last one, as shown below), the optimiser appears to run into trouble, and the "stupid local search strategy" ends up reducing the log-density instead of increasing it; but that seems odd so not sure that's what's happening.

I did note that in the first call you include the covariate in the stack despite it not being used in the formula. I think that's best avoided; I'm not quite sure if that may affect the estimation or not.

Also, in the first estimate the sd was clearly above 1, which in your prior only has 10% chance, so you may want to consider your priors more carefully.

max.logdens= -7528.9427 fn= 50 theta= 1.0981 2.4923 4.0527 range=[-200.77 239.01]

max.logdens= -7493.0055 fn= 51 theta= 1.0981 2.4923 4.0627 range=[-201.34 239.89]
max.logdens= -7320.3504 fn= 53 theta= 1.1081 2.4923 4.0527 range=[-204.11 242.91]
max.logdens= 893.5897 fn= 57 theta= 1.3968 1.8926 4.7722 range=[-340.42 403.63]
max.logdens= 1228.7672 fn= 59 theta= 1.4068 1.8926 4.7722 range=[-345.91 409.80]
max.logdens= 10421.2328 fn= 64 theta= 1.6194 1.4457 5.3082 range=[-500.71 587.59]
max.logdens= 10565.8237 fn= 65 theta= 1.6194 1.4457 5.3182 range=[-502.91 590.96]
max.logdens= 10886.4144 fn= 69 theta= 1.6294 1.4457 5.3082 range=[-508.38 595.95]
max.logdens= 19396.2424 fn= 71 theta= 1.7761 1.1313 5.6855 range=[-652.83 758.45]
max.logdens= 19973.4707 fn= 73 theta= 1.7861 1.1313 5.6855 range=[-662.51 768.65]
Iter=1 |grad|=2.66e+06 |x-x.old|=5.27 |f-f.old|=1.43e+04 Reached numerical limit!
Number of function evaluations = 94
Compute the Hessian using central differences and step_size[0.1]. Matrix-type [dense]
Mode not sufficient accurate; switch to a stupid local search strategy.
max.logdens= -7868.3716 fn= 235 theta= 2.4986 1.3392 3.5621 range=[-307.38 245.91]

max.logdens= -3296.2866 fn= 249 theta= 2.7640 1.3392 3.5621 range=[-524.41 341.64]

Finn

--
You received this message because you are subscribed to the Google Groups "R-inla discussion group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion-group+unsub...@googlegroups.com.
To post to this group, send email to r-inla-discussion-group@googlegroups.com.
Visit this group at https://groups.google.com/group/r-inla-discussion-group.
For more options, visit https://groups.google.com/d/optout.

Finn Lindgren
email: finn.l...@gmail.com

Virginia Morera Pujol

unread,

Jul 10, 2018, 4:40:41 AM7/10/18

to R-inla discussion group

Hi Finn,

I have tried changing the prior for the sigma for both spde models to more reasonable values. I guess this will require more thought, but the outputs of both models are even more different now. It is really surprising to me and I am sure there's something in the specification of the stacks or the inla call I am doing incorrectly, otherwise the output should surely be the same with and without the prediction stack?

SET UP:

# 1.5 create spde model and indices ####

spde <- inla.spde2.pcmatern(mesh = mesh, alpha = 2,### mesh and smoothness parameter
prior.range = c(300, 0.01), ### P(practic.range<300)=0.01

prior.sigma = c(3, 0.1)) ### P(sigma>3)=0.1

field.indices <- inla.spde.make.index("sp_rdm", n.spde = spde$n.spde)

# 1.6 Create 1D SPDE model for the covariates ####

# create mesh
knots <- seq(round(min(covars[,1], na.rm = T) - 5), round(max(covars[,1], na.rm = T) + 5), length = 25) # leave an "outer boundary" of 5
mesh1d <- inla.mesh.1d(knots, interval = c(-10, 15), degree = 2, boundary = "free")

# define model
spde1d <- inla.spde2.pcmatern(mesh = mesh1d, alpha = 2,### mesh and smoothness parameter
prior.range = c(20, NA), ### Prange == NA means range is fixed at 20

prior.sigma = c(4, 0.1), ### P(sigma>1)=0.1

constr = TRUE)

# make A matrix for the 1d spde
A.1d <- inla.spde.make.A(mesh1d, covars[,1])
str(A.1d) #4658 x 28 correct?

# create index
covar.index <- inla.spde.make.index("Covar", n.spde = spde1d$n.spde)

# stack

stk.data <- inla.stack(data = list(y = y.pp, e = e.pp),
                       A = list(A.pp, # 4658*564
                                A.1d), # 4658*28
                       tag = 'dat',
                       effects = list(c(field.indices, list(b0 = 1)),   # 564
                                      covar.index)) # 28

RUN ESTIMATION MODEL

#### 2. MODEL WITH COVARIATE ####

# 2.1 formula for the model with covariate ####
frml_covar <- y ~ 0 + b0 +
                  f(Covar, model = spde1d) +
                  f(sp_rdm, model = spde)

# check it is correct
frml_covar

# 2.2 Run the model ####
cov.model <- inla(frml_covar, family = 'poisson',
                              data = inla.stack.data(stk.data),
                              control.predictor = list(A = inla.stack.A(stk.data),
                                                       compute = FALSE, link = 1),
                              E = inla.stack.data(stk.data)$e,
                              control.compute = list(dic = F, waic = F),
                              control.inla = list(int.strategy = "eb"),
                              verbose = TRUE)

m <- matrix(c(1,2,3,4,5,5),nrow = 2,byrow = TRUE) # 5 plotting areas, without scale
layout(mat = m, heights = c(0.5, 0.5))
plot(cov.model$marginals.fix[[1]], type = 'l', xlab = 'Intercept', ylab = 'Density')
plot(cov.model$marginals.hyperpar[[3]], type = 'l', xlab = expression(sigma^2~"of the spatial random field"), ylab = 'Density')
plot(cov.model$marginals.hyperpar[[2]], type = 'l', xlab = 'Range for the spatial random field', ylab = 'Density')

plot(cov.model$marginals.hyperpar[[1]], type = 'l', xlab = expression(sigma~"for the covariate"), ylab = 'Density')

plot(cov.model$summary.random[[1]][,1:2], type = 'l', xlab = 'Covariate effect', ylab = 'Effect')
abline(h = 0, lty = 3)
for (i in c(4,6)) {lines(cov.model$summary.random[[1]][,c(1,i)], lty = 2) }

par(mfrow = c(1,1))

SET AND RUN PREDICTION MODEL

stk.pred <- inla.stack(data = list(y = NA, e = NA),
                       tag = "pred",
                       A = list(A.1d, 1),    # for the intercept
                       effects = list(covar.index,
                                      list(b0 = rep(1, nrow(A.1d))))) # 4658??

# 3.2 Stack the stacks ####
stk.all <- inla.stack(stk.data, stk.pred)

# 3.3 Rerun the model with the prediction stack ####

#exactly the same model but with stk.all instead of stk.data, na in the data call, and compute = TRUE in the predictor call
pred.model <- inla(frml_covar, family = 'poisson',
                   data = inla.stack.data(stk.all),
                   control.predictor = list(A = inla.stack.A(stk.all),
                                            compute = TRUE, link = 1),
                   E = inla.stack.data(stk.all)$e,
                   control.compute = list(dic = FALSE, waic = FALSE),
                   control.inla = list(int.strategy = "eb"),
                   verbose = TRUE)

# 3.4 Outputs of the model ####

m <- matrix(c(1,2,3,4,5,5),nrow = 2,byrow = TRUE) # 5 plotting areas, without scale
layout(mat = m, heights = c(0.5, 0.5))
plot(pred.model$marginals.fix[[1]], type = 'l', xlab = 'Intercept', ylab = 'Density')
plot(pred.model$marginals.hyperpar[[3]], type = 'l', xlab = expression(sigma^2~"of the spatial random field"), ylab = 'Density')
plot(pred.model$marginals.hyperpar[[2]], type = 'l', xlab = 'Range for the spatial random field', ylab = 'Density')

plot(pred.model$marginals.hyperpar[[1]], type = 'l', xlab = expression(sigma~"for the covariate"), ylab = 'Density')

plot(pred.model$summary.random[[1]][,1:2], type = 'l', xlab = 'Covariate effect', ylab = 'Effect')
abline(h = 0, lty = 3)
for (i in c(4,6)) {lines(pred.model$summary.random[[1]][,c(1,i)], lty = 2) }

COMPARE SPATIAL FIELDS FOR BOTH MODELS

p.mean <- my.levelplot(proj, cov.model$summary.random$sp_rdm$mean, col.regions = plasma(20), main = "Posterior mean")

p.mean <- p.mean + latticeExtra::layer(sp.polygons(poly.barrier, lwd = 1, fill = "gray"))

p.sd <- my.levelplot(proj, cov.model$summary.random$sp_rdm$sd, col.regions = plasma(20), main = "Posterior sd")

p.sd <- p.sd + latticeExtra::layer(sp.polygons(poly.barrier, lwd = 1, fill = "gray"))

p2.mean <- my.levelplot(proj, pred.model$summary.random$sp_rdm$mean, col.regions = plasma(20), main = "Posterior mean pred")

p2.mean <- p2.mean + latticeExtra::layer(sp.polygons(poly.barrier, lwd = 1, fill = "gray"))

p2.sd <- my.levelplot(proj, pred.model$summary.random$sp_rdm$sd, col.regions = plasma(20), main = "Posterior sd pred")

p2.sd <- p2.sd + latticeExtra::layer(sp.polygons(poly.barrier, lwd = 1, fill = "gray"))

grid.arrange(p.mean, p2.mean, p.sd, p2.sd, ncol = 2) # for comparison

I am at a loss as to what can be happening, really.

El dilluns, 9 juliol de 2018 17:52:15 UTC+2, Finn Lindgren va escriure:

To post to this group, send email to r-inla-disc...@googlegroups.com.

Visit this group at https://groups.google.com/group/r-inla-discussion-group.
For more options, visit https://groups.google.com/d/optout.

Auto Generated Inline Image 1

Auto Generated Inline Image 2

Auto Generated Inline Image 3

Finn Lindgren

unread,

Jul 10, 2018, 6:29:32 AM7/10/18

to Virginia Morera Pujol, R-inla discussion group

[Edit after writing the below: the pred.model estimate seems to have lost the fixed effect b0 completely! Weird!

I reappears if I set e=1 in the prediction stack, but the estimate is again unreasonable.]

In theory, the results should be the same, but due to the way the numerics are done, having the prediction part there does affect the numerics, in this case derailing the optimiser.

For the non-prediction run, you can restart it to get a more accurate mode:

cov.model2 <- inla.rerun(cov.model)
cov.model3 <- inla.rerun(cov.model2)
> rbind(cov.model$mode$theta,
+       cov.model2$mode$theta,
+       cov.model3$mode$theta)
     log(Stdev) for Covar log(Range) for sp_rdm log(Stdev) for sp_rdm
[1,]             1.723732              7.938504              2.108551
[2,]             2.160237              7.142892              3.656166
[3,]             2.273326              7.550832              3.551317

It's clear that the first run didn't find the mode accurately. The second and third are close to eachother, and likely close to the correct mode.

(but those range estimates correspond to 1260, 1900, and 2100, so even the last two estimates are far apart!

Note: A major issue with the model parameter estimation is that the intercept appears to be non-identifiable (ranges between -100 and 0 approximately!)

This is likely the cause of the unstable parameter estimates, and the grossly underestimated sd values for them.

You can then take the estimated parameters from the last non-prediction run and use them as starting values for the prediction run to reduce the risk:

pred.model <- inla(...,

control.mode = list(restart = TRUE,
theta = cov.model3$mode$theta))

> rbind(cov.model3$mode$theta,
+       pred.model$mode$theta)
     log(Stdev) for Covar log(Range) for sp_rdm log(Stdev) for sp_rdm
[1,]             2.273326              7.550832              3.551317
[2,]             2.272669              7.650832              3.656981

Much better!

An alternative that would require only running cov.model and cov.model2 is to use posterior sampling instead of having inla() compute the predictions.

For large models, this tends to be faster and more stable. inla.posterior.sample() and/or inlabru::generate() (See ?inlabru::generate.inla for details).

Finn

To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion-group+unsubscr...@googlegroups.com.

To post to this group, send email to r-inla-disc...@googlegroups.com.
Visit this group at https://groups.google.com/group/r-inla-discussion-group.
For more options, visit https://groups.google.com/d/optout.

--
Finn Lindgren
email: finn.l...@gmail.com

--
You received this message because you are subscribed to the Google Groups "R-inla discussion group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion-group+unsub...@googlegroups.com.

To post to this group, send email to r-inla-discussion-group@googlegroups.com.

Visit this group at https://groups.google.com/group/r-inla-discussion-group.
For more options, visit https://groups.google.com/d/optout.

Virginia Morera Pujol

unread,

Jul 11, 2018, 11:10:19 AM7/11/18

to Finn Lindgren, R-inla discussion group

Hi,

I've been playing around with generate.inla a bit, but I don't get what I expected.

I've generated a new SpatialPointsDataframe with points on a regular grid, and with the value for the covariate at each point (pred.pts, attached). Then, from the last of the rerun models (cov.model3, also attached) I've tried to generate samples using:

samples <- inlabru:::generate.inla(cov.model3, data = pred.pts,
formula = ~ 0 + b0 + Covar + sp_rdm
n.samples = 1)

What I get is a numerical vector with as many elements as pred.pts, but the same value in all of them:

summary(samples[[1]])


    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  88.26   88.26   88.26   88.26   88.26   88.26

length(samples[[1]])

[1] 22848

length(pred.pts)

[1] 22848

I've tried with different formulas: only the covariate (which is what I would ideally want), or only the spatial random effect; but to no avail. I don't know if that's because that's not the function I need or because I'm using it wrong... Ideally I would need something that would do a similar job as predict(fit, pixels(mesh), ~ sp_rdm + Covar + Intercept) would do in inlabru. I expected I would be able to use inla.mesh.projector/inla.mesh.project functions to take the output of the generate function and project it to my 2d study area, but it's proving harder than I thought!

cov.model3.Rdata

Virginia Morera-Pujol

PhD Student

BEECA

University of Barcelona

To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion...@googlegroups.com.

To post to this group, send email to r-inla-disc...@googlegroups.com.
Visit this group at https://groups.google.com/group/r-inla-discussion-group.
For more options, visit https://groups.google.com/d/optout.

--
Finn Lindgren
email: finn.l...@gmail.com

--
You received this message because you are subscribed to the Google Groups "R-inla discussion group" group.

To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion...@googlegroups.com.

To post to this group, send email to r-inla-disc...@googlegroups.com.
Visit this group at https://groups.google.com/group/r-inla-discussion-group.
For more options, visit https://groups.google.com/d/optout.

pred.pts.Rdata

covars.Rdata

data_mesh_boundary.Rdata

INLA_fitting_last.R

Finn Lindgren

unread,

Jul 11, 2018, 11:21:25 AM7/11/18

to Virginia Morera Pujol, R-inla discussion group

I haven’t played around much with generate.inla with a formula,

just run it with only the inla result object and no data or formula.

This should gives you each component separately, so you can then calculate what you need,

Ex by adding b0 and an A matrix multiplied with the 1d spde coefficients.

Finn

<mime-attachment.png>

COMPARE SPATIAL FIELDS FOR BOTH MODELS

p.mean <- my.levelplot(proj, cov.model$summary.random$sp_rdm$mean, col.regions = plasma(20), main = "Posterior mean")
p.mean <- p.mean + latticeExtra::layer(sp.polygons(poly.barrier, lwd = 1, fill = "gray"))
p.sd <- my.levelplot(proj, cov.model$summary.random$sp_rdm$sd, col.regions = plasma(20), main = "Posterior sd")
p.sd <- p.sd + latticeExtra::layer(sp.polygons(poly.barrier, lwd = 1, fill = "gray"))
p2.mean <- my.levelplot(proj, pred.model$summary.random$sp_rdm$mean, col.regions = plasma(20), main = "Posterior mean pred")
p2.mean <- p2.mean + latticeExtra::layer(sp.polygons(poly.barrier, lwd = 1, fill = "gray"))
p2.sd <- my.levelplot(proj, pred.model$summary.random$sp_rdm$sd, col.regions = plasma(20), main = "Posterior sd pred")
p2.sd <- p2.sd + latticeExtra::layer(sp.polygons(poly.barrier, lwd = 1, fill = "gray"))

grid.arrange(p.mean, p2.mean, p.sd, p2.sd, ncol = 2) # for comparison

<mime-attachment.png>

<pred.pts.Rdata>

<covars.Rdata>

<data_mesh_boundary.Rdata>

<INLA_fitting_last.R>

Virginia Morera Pujol

unread,

Jul 11, 2018, 11:24:59 AM7/11/18

to Finn Lindgren, R-inla discussion group

samples <- inlabru:::generate.inla(cov.model3, n.samples = 1)

Error in generate.bru(object, ...) : 
  argument "data" is missing, with no default

samples <- inlabru::generate(cov.model3, n.samples = 1)

Error in generate.bru(object, ...) : 
  argument "data" is missing, with no default

Doesn't like it much either way

Virginia Morera-Pujol

PhD Student

BEECA

University of Barcelona

Auto Generated Inline Image 1

Virginia Morera Pujol

unread,

Jul 16, 2018, 7:19:26 AM7/16/18

to Finn Lindgren, R-inla discussion group

Hi,

Seeing that "generate" is not producing what I expected, I have been trying to use inla.posterior.sample() to obtain samples of the linear predictor. I have been following Haakon's example here https://haakonbakka.bitbucket.io/btopic113.html#5_draw_posterior_samples even though it's not for a spatial model. So far I have managed to successfully plot the covariate effect and the spatial random field from the posterior sample, and checked that is equivalent to what I obtain by plotting it from model$summary.random (example follows). However, I get stuck when I try to obtain the prediction. I attach the model output (rdata) to avoid pasting here the full script to set up the model, but it is attached).

To recap: I am modelling a Point Pattern using a 2d matérn as spatial random effect and a 1d matérn to fit a covariate (ideally there will be more than 1, but let's go step by step). I want to plot the effect of the covariate, and the linear predictor.

So far I've done:

#### 4. POSTERIOR SAMPLING FROM THE MODEL ####
load(cov.model3)

# 4.1 Generate 1000 posterior samples from the model
psample <- inla.posterior.sample(n = 1000, result = cov.model3, use.improved.mean = T)

# 4.2 Identify, inside the model object, where are the values we want
contents <- cov.model3$misc$configs$contents
effect <- "Covar" # first we extract from the posterior sample the covar values
id.effect <- which(contents$tag == effect) # - the numerical id of the effect
ind.effect <- contents$start[id.effect] - 1 + (1:contents$length[id.effect]) # - all the indices for the effect

# extract the sample from "Covar" from the posterior sample
cov_sample <- lapply(psample, function(x) x$latent[ind.effect])

cov_sample <- matrix(unlist(cov_sample), byrow = T, nrow = 1000)
# this contains a 18*1000 matrix: a col for each knot of the 1d mesh and a row for each iteration (1000)

mean_post <- colMeans(cov_sample)
sd_pred <- colSds(cov_sample)
library(matrixStats)
median_post <- matrixStats::colMedians(cov_sample)
quan_post <- matrixStats::colQuantiles(cov_sample, probs = c(0.025, 0.975))

# plot both side by side to compare
par(mfrow = c(2,1))
plot(cov.model3$summary.random[[1]][,1:2], type = 'l', xlab = paste(names(cov_list)[i], 'effect', sep = " "), ylab = 'Effect')

abline(h = 0, lty = 3)

for (j in c(4,6)) {lines(cov.model3$summary.random[[1]][,c(1,j)], lty = 2) }
plot(mean_post, type = 'l', xlab = paste(names(cov_list)[i], 'effect', sep = " "), ylab = 'Effect')

abline(h = 0, lty = 3)

for (j in c(1,2)) {lines(quan_post[,j], lty = 2) }
par(mfrow = c(1,1))

# same procedure for the spatial random effect
effect <- "sp_rdm"
id.effect <- which(contents$tag == effect) # - the numerical id of the effect
ind.effect <- contents$start[id.effect] - 1 + (1:contents$length[id.effect]) # - all the indices for the effect

# extract the sample from sp_rdm from the posterior sample
sp_sample <- lapply(psample, function(x) x$latent[ind.effect])

sp_sample <- matrix(unlist(sp_sample), byrow = T, nrow = 1000)
# this contains a 564*1000 matrix: a col for each node of the spatial field (564) and a row for each iteration (1000)

mean_sp_post <- colMeans(sp_sample)
sd_sp_post <- colSds(sp_sample)

# plots from the model output summary
sp.mean <- my.levelplot(proj, cov.model3$summary.random$sp_rdm$mean, col.regions = plasma(20),
                        main = "Posterior mean spatial field", xlab = "", ylab = "")
sp.mean <- sp.mean + latticeExtra::layer(sp.polygons(poly.barrier, lwd = 1, fill = "gray"))
sp.sd <- my.levelplot(proj, cov.model3$summary.random$sp_rdm$sd, col.regions = plasma(20),
                      main = "Posterior sd spatial field", xlab = "", ylab = "")
p.sd <- sp.sd + latticeExtra::layer(sp.polygons(poly.barrier, lwd = 1, fill = "gray"))

# plots from the posterior sample
psample.mean <- my.levelplot(proj, mean_sp_post, col.regions = plasma(20),
                       main = "Posterior mean spatial field", xlab = "", ylab = "")
psample.mean <- psample.mean + latticeExtra::layer(sp.polygons(poly.barrier, lwd = 1, fill = "gray"))
psample.sd <- my.levelplot(proj, sd_sp_post, col.regions = plasma(20),
                     main = "Posterior sd spatial field", xlab = "", ylab = "")
psample.sd <- psample.sd + latticeExtra::layer(sp.polygons(poly.barrier, lwd = 1, fill = "gray"))

grid.arrange(p.mean, psample.mean, p.sd, psample.sd, ncol = 2)

So, ideally, to plot the predictor I should do the same but selecting the "Predictor" object. However, there are two, the "Predictor" and the "APredictor". This is what we can find inside my posterior.sample object:


$tag
[1] "APredictor" "Predictor"  "Covar"      "sp_rdm"     "b0"        

$start
[1]    1 4659 5241 5259 5823

$length
[1] 4658  582   18  564    1

"APredictor" has as many elements as my response (i.e. datapoints + meshnodes), and "Predictor" has as many elements as the nodes of the 2d mesh for the spatial field (564) + the nodes for the 1d mesh for the covariate (18). I am pretty sure the answer is somewhere in there, but I can't figure out how to access and plot it. Also, I am not sure what to do with the "b0" which is the intercept. Any help would be, as always, much appreciated!

cov.model3.Rdata

covars.Rdata

data_mesh_boundary.Rdata

Virginia Morera-Pujol

PhD Student

BEECA

University of Barcelona

INLA_helper_functions.R

INLA_fitting_last.R

Reply all

Reply to author

Forward

0 new messages