binomial spatial example with a projected mesh

838 views
Skip to first unread message

Kath O'Reilly

unread,
Jun 30, 2014, 5:02:21 PM6/30/14
to r-inla-disc...@googlegroups.com
Dear INLA,

first of all thank you so much for developing the code for this suite of models. I'm fairly new to the INLA approach, but the models are really interesting, so apologies for the naive questions.

I have binomially distributed data, and I would like to estimate the model parameters and project onto a grid. I've been following some examples, and patching together where necessary. This example is inspired by  http://www.math.ntnu.no/~daniesi/Roros.pdf

I first create some random data, with a bit of variability. That's fine and the model is fitted. However I'm unsure how to project the results onto a grid. I pasted some code from the Parana rainfall example, but I've set up "inla.mesh.project" incorrectly, but I'm unsure of the solution.

Any assistance would be greatly appreciated, or an alternative...the grid plot is to view the mean of the random field. In the example the following code is used plot(old.mesh.class(mesh),r$summary.random$data_points$mean)
however old.mesh.class(mesh) is an old function I think. I suspect this plots the mean value. Any thoughts on alternatives?

In my real example I had set up the data as an INLA stack. However with use of the Ntrials in the binomial likelihood, I'm unsure how to set up the data. Could anyone direct me to an example where the data stack is used for a binomial model, and where estimates are projected?

Many thanks,

Kath 

library(INLA)
library(boot) #for inverse logit
# Make some fake data
N=100
points = matrix(runif(2*N),ncol=2)
#The ’true’ surface
fcn = function(loc){
  return( cos(loc[,1])*sin(4*loc[,2]))
}
noisy_data = rbinom(N,rep(1,N),inv.logit(fcn(points)))
plot(points,cex=noisy_data+0.5)
## Build a mesh
bnd = inla.mesh.segment(matrix(c(0,0,1,0,1,1,0,1),
                               ncol=2,byrow=TRUE))
mesh = inla.mesh.create(points,boundary=bnd,
                        refine=list(max.edge=0.1))
plot(mesh)
##Make the SPDE
# imatern = 
spde = inla.spde.create(mesh,model="imatern")

## Fit the model
fake_data = list(y=noisy_data, data_points = mesh$idx$loc)
formula = y ~ f(data_points, model=spde)-1
r = inla(formula, family="binomial",Ntrials=rep(1,N),
         data=fake_data,verbose=TRUE)
summary(r)
plot(r,plot.random.effects=F)
# out of date function...
#plot(old.mesh.class(mesh),r$summary.random$data_points$mean)

# project a fit to a grid
(stepsize <- 0.1)
(nxy <- round(c(diff(c(0,1)),diff(c(0,1)))/stepsize))
projgrid <- inla.mesh.projector(mesh, xlim=c(0,1),ylim=c(0,1), dims=nxy)
# make the predictions
xmean <- inla.mesh.project(projgrid, r$summary.random$i$mean)
xsd <- inla.mesh.project(projgrid, r$summary.random$i$sd)

Finn Lindgren

unread,
Jul 1, 2014, 4:19:38 AM7/1/14
to Kath O'Reilly, r-inla-disc...@googlegroups.com
Hi Kath,

the "Extended tutorial" and "Examples" texts at

https://sites.google.com/a/r-inla.org/www/examples/tutorials

contains much more updated code than the one you included.
In the "Extended tutorial", see section 3.1 for an example using
Binomial data and a 1D model; 2D models work in the same way, as
described in the rest of that text, and in the much more extensive
"Examples" document.

Finn L
> --
> You received this message because you are subscribed to the Google
> Groups "R-inla discussion group" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to r-inla-discussion...@googlegroups.com
> <mailto:r-inla-discussion...@googlegroups.com>.
> To post to this group, send email to
> r-inla-disc...@googlegroups.com
> <mailto:r-inla-disc...@googlegroups.com>.
> Visit this group at http://groups.google.com/group/r-inla-discussion-group.
> For more options, visit https://groups.google.com/d/optout.

Kath O'Reilly

unread,
Jul 1, 2014, 7:00:07 AM7/1/14
to r-inla-disc...@googlegroups.com
Dear Finn,

thanks for pointing me in the right direction. Below is some code that simulates data, estimates the model with a covariate, and projects to a grid, should anyone be interested.

#binomial example of INLA
remove(list=ls())
library(INLA)
library(boot) #for inverse logit
# Make some fake data
N=500
points = matrix(runif(2*N),ncol=2)
#The ’true’ surface
fcn = function(loc){
  return( cos(loc[,1])*sin(4*loc[,2]))
}
noisy_data = rbinom(N,rep(100,N),inv.logit(fcn(points)))
plot(points,cex=noisy_data/10)
## Build a mesh
bnd = inla.mesh.segment(matrix(c(0,0,1,0,1,1,0,1),
                               ncol=2,byrow=TRUE))
mesh = inla.mesh.create(points,boundary=bnd,
                        refine=list(max.edge=0.1))
plot(mesh)

##Make the SPDE
# imatern = 
spde = inla.spde2.matern(mesh=mesh)

## organise the data into a stack
A <- inla.spde.make.A(mesh,loc=points)

#fake_data = list(y=noisy_data, data_points = mesh$idx$loc)
stack = inla.stack(data=list(y = noisy_data, link=1, Ntrials=rep(100,N)),
                   A=list(A,1),
                   effects=list(list(i=1:spde$n.spde),
                   data.frame(Intercept=1,long=inla.group(points[,2]))),
                   tag="est")
data = inla.stack.data(stack)
                   
formula = y ~ 0 + Intercept + f(i, model=spde) + f(long,model="rw1")

r = inla(formula, family="binomial",Ntrials=data$Ntrials,
         data=inla.stack.data(stack), 
         control.predictor=list(A=inla.stack.A(stack),compute=TRUE))

summary(r)

r.f <- inla.spde2.result(r, 'i', spde, do.transf=TRUE)
par(mfrow=c(2,3), mar=c(3,3.5,0,0), mgp=c(1.5, .5, 0), las=0) 
# r$marginals.fix is the density function of the intercept 
plot(r$marginals.fix[[1]], type='l', xlab='Intercept', ylab='Density') 
# names(r$summary.random) = "i"    "long"
plot(r$summary.random[[2]][,1:2], type='l',xlab='Long', ylab='Coeficient')
abline(h=0, lty=3) 


# this is the fitted predictor
# names(r$summary.fitted.values)
# [1] "mean"       "sd"         "0.025quant" "0.5quant"   "0.975quant" "mode" 
r$summary.fitted.values$mean[1:N]
plot(noisy_data,r$summary.fitted.values$mean[1:N])

# project a fit to a grid
(stepsize <- 0.1)
(nxy <- round(c(diff(c(0,1)),diff(c(0,1)))/stepsize))
projgrid <- inla.mesh.projector(mesh, xlim=c(0,1),ylim=c(0,1), dims=nxy)
# make the predictions
xmean <- inla.mesh.project(projgrid, r$summary.random$i$mean)
xsd <- inla.mesh.project(projgrid, r$summary.random$i$sd)

pdrcoo <- projgrid$lattice$loc
Aprd <- projgrid$proj$A
stk.prd <- inla.stack(data=list(y=NA, Ntrials=rep(100,nrow(pdrcoo))), A=list(Aprd,1),
                      effects=list(i=1:spde$n.spde,
                      data.frame(Intercept=1, long=pdrcoo[,2])), tag='prd')

stk.all <- inla.stack(stack, stk.prd)

r.pd = inla(formula, family="binomial",Ntrials=data$Ntrials,
         data=inla.stack.data(stk.all), 
         control.predictor=list(A=inla.stack.A(stk.all),
         compute=TRUE), quantiles=NULL,
         control.results=list(return.marginals.random=FALSE,
                              return.marginals.predictor=FALSE))

id.prd <- inla.stack.index(stk.all, 'prd')$data
sd.prd <- m.prd <- matrix(NA, nxy[1], nxy[2]) 
m.prd <- matrix(r.pd$summary.fitted.values$mean[id.prd], nxy[1], nxy[2])
sd.prd <- matrix(r.pd$summary.fitted.values$sd[id.prd],nxy[1], nxy[2])

# visualise the prediction
#install.packages("gridExtra")
library(gridExtra) 
library(lattice)
do.call('grid.arrange',
        lapply(list(xmean, xsd, m.prd, sd.prd), # 
               levelplot, col.regions=terrain.colors(16), xlab='', ylab='',
               scales=list(draw=FALSE)))

# end

Benny Kamau

unread,
May 20, 2016, 12:06:38 PM5/20/16
to R-inla discussion group

Dear Finn and Kath,

 

Thank you for the useful code.

 

I followed the example (plus other readings) and applied to point prevalence disease data on a particular country.

My goal is to explore the relationship between the disease and some environmental and demographic raster layers (temperature, rainfall, land cover, DEM, population density etc.). 

I was able to estimate the model with a 10 covariates. Results from the posterior mean indicates that 4 covariates are significant.

 

I am trying to use these 4 covariates to create a smooth surface map to show the predicted prevalence in the country, I followed this paper (http://www.math.ntnu.no/inla/r-inla.org/case-studies/Cameletti2012/Cameletti_et_al_2012_acc.pdf).

 

However, this doesn't work for me. I am getting a smooth map with intervals from so many NA's values to the highest value (e.g. -9999 to 100%). If I delete the NA’s, then the map shows 100% prediction for the entire country, which does make any sense.

 

Any idea or some useful references on this problem?

 

Let me know if you need more clarification

Reply all
Reply to author
Forward
0 new messages