Attempting SPDE model for binomial spatial (point) data

645 views
Skip to first unread message

Paul Lantos

unread,
Jan 15, 2017, 11:25:09 PM1/15/17
to R-inla discussion group
Hi,
 I'm quite new to INLA modeling. I have a spatial dataset with 3500 points that have coordinate pairs and some binary and continuous variables. I've copied a variety of code examples, but keep running into one error or another when I try it with my own data. My latest error is posted below. Would appreciate any help.

Thanks,
Paul


 ### Define coordinates
 coords<-as.matrix(data[,1:2])
 
 ### Boundary
 bnd<-inla.nonconvex.hull(coords, convex=-0.1)
 
 ### Mesh
 cmv.mesh<-inla.mesh.2d(boundary=bnd, offset=c(.06,.1), max.edge=c(.05,.2))
 
 
 ### Create SPDE object
 cmv.spde<-inla.spde2.matern(mesh=cmv.mesh, alpha=2)
 
 ### List of indexes for the spatial effect
 s.index<-inla.spde.make.index("spatial", n.spde = cmv.spde$n.spde)
 
 ### Projector matrix
 A.est<-inla.spde.make.A(mesh=cmv.mesh, loc=coords, index=rep(1:nrow(coords)))
 dim(A.est)
[1] 3504  800
 
 cmv.stack.est<-inla.stack(
+   data=list(y=data$result),
+   tag="est",
+   A=list(A.est),
+   effects=list(c(s.index, list(intercept=1))))
                
 formula<-  y~ -1 + intercept + f(spatial, model=cmv.spde)
 
output1<- inla(formula, data = list(y=data$result), family="binomial", Ntrials=1)
Error in eval(expr, envir, enclos) : object 'intercept' not found

INLA help

unread,
Jan 16, 2017, 12:29:13 AM1/16/17
to Paul Lantos, R-inla discussion group

doing

> browseVignettes()

you'll see SPDEhowto, This should answer your question ;-)

Best,
H
> -- 
> You received this message because you are subscribed to the Google
> Groups "R-inla discussion group" group.
> To unsubscribe from this group and stop receiving emails from it,
> send an email to r-inla-discussion...@googlegroups.com
> .
> To post to this group, send email to r-inla-discussion-group@googlegr
> oups.com.
> Visit this group at https://groups.google.com/group/r-inla-discussion
> -group.
> For more options, visit https://groups.google.com/d/optout.

--
Håvard Rue
he...@r-inla.org
Screenshot from 2017-01-16 08-27-09.png

Paul Lantos

unread,
Jan 16, 2017, 10:14:09 AM1/16/17
to R-inla discussion group
Great, thank you!! That really helped.

I've now gotten as far as trying to project the models on a grid as described in the SPDEhowto, but I'm running into an error:


> gproj <- inla.mesh.projector(cmv.mesh, xlim = 0:1, ylim = 0:1, dims = c(300, 300))
> 
> g.mean <- inla.mesh.project(gproj, model_1b$summary.random$s$mean)
> 
> g.sd <- inla.mesh.project(gproj, model_1b$summary.random$s$sd)
> 
> library(lattice); library(gridExtra)
> 
> trellis.par.set(regions=list(col=terrain.colors(16)))
> 
> grid.arrange(levelplot(g.mean, scales=list(draw=F), xlab='', ylab='', main='mean'),
+              levelplot(g.sd, scal=list(draw=F), xla='', yla='', main='sd'), nrow=1)
Error in seq.default(zrng[1], zrng[2], length.out = cuts + 2) : 
  'from' cannot be NA, NaN or infinite
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf


This is the summary of the model object I'm trying to plot:

> summary(model_1b)

Call:
c("inla(formula = formula_1b, family = \"binomial\", data = inla.stack.data(stk.e), ",  "    control.compute = list(dic = TRUE), control.predictor = list(A = inla.stack.A(stk.e), ",  "        compute = TRUE))")

Time used:
 Pre-processing    Running inla Post-processing           Total 
         0.9201         78.0915          2.8592         81.8707 

Fixed effects:
      mean      sd 0.025quant 0.5quant 0.975quant    mode kld
b0 16.4385 24.9048   -33.1165  16.6287    64.8924 17.0072   0
x   0.2089  0.3159    -0.4197   0.2113     0.8235  0.2160   0

Random effects:
Name	  Model
 s   SPDE2 model 
fips   IID model 

Model hyperparameters:
                        mean        sd 0.025quant  0.5quant 0.975quant     mode
Theta1 for s          -3.579 4.658e-01     -4.526    -3.565     -2.699   -3.523
Theta2 for s           3.261 4.172e-01      2.471     3.249      4.111    3.212
Precision for fips 18216.870 1.807e+04   1217.048 12879.568  65805.303 3301.420

Expected number of effective parameters(std dev): 36.89(9.628)
Number of equivalent replicates : 94.98 

Deviance Information Criterion (DIC) ...: 4710.81
Effective number of parameters .........: 38.01

Marginal log-Likelihood:  -2376.97 
Posterior marginals for linear predictor and fitted values computed





Finn Lindgren

unread,
Jan 16, 2017, 10:26:59 AM1/16/17
to Paul Lantos, R-inla discussion group
Does your projection domain fall in the mesh domain?
What's the output from
summary(cmv.mesh)
summary(g.mean)
summary(g.sd)
?

Finn
--
You received this message because you are subscribed to the Google Groups "R-inla discussion group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion...@googlegroups.com.
To post to this group, send email to r-inla-disc...@googlegroups.com.

Paul Lantos

unread,
Jan 16, 2017, 10:39:26 AM1/16/17
to R-inla discussion group
Looks like for some reason I'm getting NAs in g.mean and g.sd

>
summary(cmv.mesh) Manifold: R2 Vertices: 800 Triangles: 1547 Boundary segm.: 51 (1 group: 0) Interior segm.: 92 (1 group: 1) xlim: -79.43167 -78.27963 ylim: 35.39146 36.56701 zlim: 0 0

> summary(g.mean) V1 V2 V3 V4 V5 V6 Min. : NA Min. : NA Min. : NA Min. : NA Min. : NA Min. : NA 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA Median : NA Median : NA Median : NA Median : NA Median : NA Median : NA Mean :NaN Mean :NaN Mean :NaN Mean :NaN Mean :NaN Mean :NaN 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA Max. : NA Max. : NA Max. : NA Max. : NA Max. : NA Max. : NA NA's :300 NA's :300 NA's :300 NA's :300 NA's :300 NA's :300 V7 V8 V9 V10 V11 V12 Min. : NA Min. : NA Min. : NA Min. : NA Min. : NA Min. : NA


> summary(g.sd)
       V1            V2            V3            V4            V5            V6     
 Min.   : NA   Min.   : NA   Min.   : NA   Min.   : NA   Min.   : NA   Min.   : NA  
 1st Qu.: NA   1st Qu.: NA   1st Qu.: NA   1st Qu.: NA   1st Qu.: NA   1st Qu.: NA  
 Median : NA   Median : NA   Median : NA   Median : NA   Median : NA   Median : NA  
 Mean   :NaN   Mean   :NaN   Mean   :NaN   Mean   :NaN   Mean   :NaN   Mean   :NaN  
 3rd Qu.: NA   3rd Qu.: NA   3rd Qu.: NA   3rd Qu.: NA   3rd Qu.: NA   3rd Qu.: NA  
 Max.   : NA   Max.   : NA   Max.   : NA   Max.   : NA   Max.   : NA   Max.   : NA  
 NA's   :300   NA's   :300   NA's   :300   NA's   :300   NA's   :300   NA's   :300  
       V7            V8            V9           V10           V11           V12     
 Min.   : NA   Min.   : NA   Min.   : NA   Min.   : NA   Min.   : NA   Min.   : NA  
To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion-group+unsub...@googlegroups.com.

Finn Lindgren

unread,
Jan 16, 2017, 10:45:14 AM1/16/17
to Paul Lantos, R-inla discussion group

On 16 Jan 2017, at 15:39, Paul Lantos <paul....@gmail.com> wrote:

Looks like for some reason I'm getting NAs in g.mean and g.sd

Yes, your mesh has
xlim: -79.43167 -78.27963
ylim: 35.39146 36.56701
but your projection used xlim=c(0, 1) and ylim=c(0, 1), so you'll
need to change that.  The simplest option is to not specify xlim and
ylim for inla.mesh.projector(), which will make it use the mesh limits instead.
You only need to specify them if you want finer control.

You may also want to use gproj$x and gproj$y in the levelplot calls if you want
to have meaningful axes.

Finn

To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion...@googlegroups.com.

Paul Lantos

unread,
Jan 16, 2017, 11:25:21 AM1/16/17
to Finn Lindgren, R-inla discussion group
Thanks, that worked. Where do I put the gproj$x and gproj$y?

Finn

To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion-group+unsubscr...@googlegroups.com.

To post to this group, send email to r-inla-disc...@googlegroups.com.
Visit this group at https://groups.google.com/group/r-inla-discussion-group.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "R-inla discussion group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion-group+unsub...@googlegroups.com.
To post to this group, send email to r-inla-discussion-group@googlegroups.com.

Finn Lindgren

unread,
Jan 16, 2017, 11:27:26 AM1/16/17
to Paul Lantos, R-inla discussion group
It's been a while since I used the lattice package. Look in the levelplot documentation.  I think the relevant parameters had "row" and "column" as part of their names.
Finn

Finn

Daniel Simpson

unread,
Jan 16, 2017, 11:46:01 AM1/16/17
to Finn Lindgren, Paul Lantos, R-inla discussion group
Some of Finn's old plotting code (obviously won't run, but it should tell you where to put things):

levelplot(row.values=proj$x, column.values=proj$y, x=plotdata,
                 mm=mm, panel=levelplotmap,
                 col.regions=my.palette,
                 xlim=range(proj$x), ylim=range(proj$y), aspect="iso",
                 contour=TRUE, cuts=11, labels=FALSE, pretty=TRUE,
                 xlab="Easting",ylab="Northing")

Dan

To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion-group+unsub...@googlegroups.com.

Paul Lantos

unread,
Jan 16, 2017, 1:20:38 PM1/16/17
to Daniel Simpson, Finn Lindgren, R-inla discussion group
Thanks again -
 Might be best if I could figure it out in ggplot2, or better yet compile x, y, mean, and sd into a data frame I can use in GIS.

 I have a couple other parallel questions if you don't mind - I'm using this analysis as an alternative to generalized additive modeling, with a smoothed term for the x,y coordinate pairs. I'd like to figure out how to do the following things that I can do using GAMs (this would come after doing a prediction to a grid):

1 - calculate pointwise / local odds ratios comparing the local odds prediction to the global odds (i.e. the same model but without spatial terms). I'm assuming that I can leave the f(spatial, model=spde) part out of the formula to get a global model, then if it's predicted in log-odds just subtract the local from the global odds predictions at each point?

2 - draw "significance" contours where local odds is significantly different from global odds. I'm still trying to wrap my head around what this means in a Bayesian context!

Thanks for any help,
Paul

Finn Lindgren

unread,
Jan 16, 2017, 1:26:40 PM1/16/17
to Paul Lantos, Daniel Simpson, R-inla discussion group
Short answer for now:
The models in inla are GAMs, they just have "Bayesian" in front. ;-)

1. Yes, you can make linear combination in the log-scale and get the posteriors.
2. Yes. The "excursions" package and associated papers by Bolin & Lindgren 
Is meant for such things, and can interpret inla output.

Finn
To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion...@googlegroups.com.

Paul Lantos

unread,
Jan 16, 2017, 1:28:08 PM1/16/17
to Finn Lindgren, Daniel Simpson, R-inla discussion group
That's great, thanks again for all the help -- really cool stuff, just a different way of thinking for me!

Paul

To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion-group+unsub...@googlegroups.com.

Daniel Simpson

unread,
Jan 16, 2017, 1:39:18 PM1/16/17
to Paul Lantos, Finn Lindgren, R-inla discussion group
Some more plotting code (this time from me).

Here adm0 is a  SpatialPolygonsDataFrame that I want to clip the spatial plot to.

Again, code won't compile (and is ugly) but it might be useful.

D




proj = inla.mesh.projector(mesh,dims=c(200,200))
S = result$summary.fitted.values$mean[ind.fit]

field.plot = inla.mesh.project(proj,field=S,dims=c(200,200))


x = proj$x
y=proj$y
all = (expand.grid(x,y,KEEP.OUT.ATTRS = F))
names(all) = c("x","y")

plotdata = all %>%mutate(field = round(as.numeric(field.plot)) )


coordinates(plotdata)=~x+y #make plotdata a spatialPointsDataFrame!

##adm0 is a SpatialPolygonsDataFrame
#the points need to have the same coordinate system. 
#there's almost certainly a better way to do this

plotdata@proj4string = adm0@proj4string 
ppp=plotdata[adm0,] #this is it! subset the SpatialPoints object by the polygon!! 

plotdata.clip= as.data.frame(ppp)

ggplot(plotdata.clip,aes(x,y)) + geom_tile(aes(fill=field)) 

Paul Lantos

unread,
Jan 16, 2017, 7:40:42 PM1/16/17
to Daniel Simpson, Finn Lindgren, R-inla discussion group
Thanks again, Daniel and Finn.

When I run INLA using a binomial formula for a dichotomous outcome variable, are the values of the posterior mean log-odds?

If I predict the INLA model to an x,y grid, how do I extract a matrix of x, y, mean, and sd? It's hard to see where the x and y values reside in the output structure.

Thanks,
Paul

Finn Lindgren

unread,
Jan 16, 2017, 7:57:20 PM1/16/17
to Paul Lantos, Daniel Simpson, R-inla discussion group
Hmmm, in a binomial model the linear predictor is in logit-scale.
I'm not sure what the log-odds are in the model.

The x and y coordinate for spatial variables are used when defining
the predictor and are not themselves present in the output.

Usually, a "prediction stack" can be used, and inla.stack.index() be
used to extract the required indices needed for extracting the
corresponding values from summary.linear.predictor and/or summary.fitted.values.

Have you read the "read this first" paper? Page 19&20 has some example code:
To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion...@googlegroups.com.

Paul Lantos

unread,
Jan 16, 2017, 8:23:11 PM1/16/17
to Finn Lindgren, Daniel Simpson, R-inla discussion group
So can I use this function on the posterior estimate to generate probabilities, then convert those to odds, i.e. antilogit(x) / (1 - antilogit(x))? This is from the R-INLA book:

Inline image 1


I've spent quite some time with your excellent paper!

The prediction code I'm using, which looks similar, is from the R-INLA book:

tcoo <- rbind(c(0.3, 0.3), c(0.5, 0.5), c(0.7, 0.7))
Ap <- inla.spde.make.A(mesh = cmv.mesh, loc = tcoo)
x0 <- c(0.5, 0.5, 0.5)
stk.pred <- inla.stack(tag='pred', A=list(Ap, 1), data=list(y=NA), ## response as NA
                       effects=list(s=1:cmv.spde$n.spde, data.frame(x=x0, b0=1)))
stk.full <- inla.stack(stk.e, stk.pred)
p.res <- inla(formula_1a, data=inla.stack.data(stk.full), ## full stack
              control.predictor=list(compute=TRUE, ## compute the predictor
                                     A=inla.stack.A(stk.full))) ## using full stack data

pred.ind <- inla.stack.index(stk.full, tag = "pred")$data
ypost <- p.res$marginals.fitted.values[pred.ind]

Still, I think what I'm having difficulty with here is 1) getting the posterior mean and sd into a form I understand, 2) attaching them to their corresponding coordinate pairs in a grid, and 3) getting those into a format I can export.


This is how I make predictions using GAMs. In that case I just manually make a say 100x100 grid, then predict the GAM to the grid:

row=100
col=100
grid=as.data.frame(expand.grid(x=seq(from=min(x), to=max(x),len=col),
                                  y=seq(from=min(y),to=max(y),len=row)))
gam1<-gam(result ~ s(x,y), family = binomial(logit))
grid$pred <- predict(gam1, grid)

That gets me all I need to export it as a table to GIS. If I have additional predictors in the model, say result ~ s(x,y) + age, I add a column with median age to the grid. I assume I would do the same using inla.stack for a grid with predictors?

To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion-group+unsub...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages