marked LGCP model and area level model


Soraia Pereira

unread,
Dec 12, 2016, 6:02:40 PM12/12/16
to R-inla discussion group

Hi,


I'm using a marked LGCP model with SPDE effects, and I would like to compare the results for a specific geographical area level with the results obtained from an area-level model. The area-level model uses the same distribution as the LGCP model, but with Besag effects. When I compare the results in terms of coefficients of variation, relative bias and RRMSE, the area-level model produces better results, contrary to what I expected. I have already tried to improve the LGCP model, but the results remain worse than those of the area-level model. Is there any reason for this?


Thank you very much.



Haakon Bakka

unread,
Dec 15, 2016, 2:30:08 PM12/15/16
to Soraia Pereira, R-inla discussion group
Hi Soraia,

The answer to this question would depend very much on the details. (You may want to share the data and the code with the discussion group, or only me.)

I would also expect the LGCP SPDE model to perform better than the Besag model.

What I would need to check first is 
1. The result$logfile for both the models (please save as .txt files and attach to an email)
2. summary(result) for both models
3. Can you plot the mesh, with axes?

One thing I like to try is to fix the range parameter in the LGCP model, as my experience is that this parameter can be unstable (leading to over- or under-fitting). So look at your spatial scale, plot the data, think about the longest range at which interaction occurs, and fix the spatial range to that value.
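In code, fixing the range might look like this (a hedged sketch, not from the original thread: `mesh` is your existing mesh object, and the value of `range0` is a placeholder you would choose from your own data; in `inla.spde2.pcmatern`, setting the second element of `prior.range` to NA fixes the range at the first element):

```r
## Sketch: fix the spatial range of the SPDE model at a chosen value.
library(INLA)

range0 <- 100  # placeholder: the longest range at which interaction occurs
spde.fixed <- inla.spde2.pcmatern(mesh,
                                  prior.range = c(range0, NA),  # NA fixes the range at range0
                                  prior.sigma = c(1, 0.5))      # P(sigma > 1) = 0.5
```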

Kind regards,
Haakon Bakka






Soraia Pereira

unread,
Dec 16, 2016, 6:22:55 AM12/16/16
to R-inla discussion group, soraia....@gmail.com

Hi Haakon,


Thank you for your kind answer. 

I will send the logfiles by email.

Please find attached the mesh used in the marked LGCP model.


### The marked LGCP model used:


jform68 <- y ~ 0 + b0.pp + b0.y + offset(log(dens.pp)) +
  f(i, model = spde) + f(j, copy = 'i', fixed = FALSE) + f(k, model = spde)

j.res68 <- inla(jform68, family = c('poisson', 'poisson'),
                data = inla.stack.data(j.stk2), E = inla.stack.data(j.stk2)$e,
                control.predictor = list(A = inla.stack.A(j.stk2), compute = TRUE, link = 1),
                control.compute = list(config = TRUE, dic = TRUE, cpo = TRUE, waic = TRUE),
                quantiles = c(0.025, 0.5, 0.975),
                control.results = list(return.marginals.random = F, return.marginals.predictor = F),
                control.inla = list(int.strategy = "eb"))



### The summary for marked LGCP model:


> summary(j.res68)

Call:
c("inla(formula = jform68, family = c(\"poisson\", \"poisson\"), data = inla.stack.data(j.stk2), ",  "    quantiles = c(0.025, 0.5, 0.975), E = inla.stack.data(j.stk2)$e, ",  "    control.compute = list(config = TRUE, dic = TRUE, cpo = TRUE, ",  "        waic = TRUE), control.predictor = list(A = inla.stack.A(j.stk2), ",  "        compute = TRUE, link = 1), control.inla = list(int.strategy = \"eb\"), ",  "    control.results = list(return.marginals.random = F, return.marginals.predictor = F))" )

Time used:
 Pre-processing    Running inla Post-processing           Total 
         3.6422       3093.8768          6.6850       3104.2040 

Fixed effects:
         mean     sd 0.025quant 0.5quant 0.975quant    mode kld
b0.pp -8.4515 0.0709    -8.5910  -8.4514    -8.3126 -8.4512   0
b0.y  -1.9791 0.0756    -2.1278  -1.9791    -1.8309 -1.9790   0

Random effects:
Name  Model
 i   SPDE2 model 
 k   SPDE2 model 
 j   Copy 

Model hyperparameters:
                mean     sd 0.025quant 0.5quant 0.975quant    mode
Theta1 for i -1.4276 0.0756    -1.6008   -1.416    -1.3116 -1.3812
Theta2 for i -0.7983 0.0575    -0.8874   -0.807    -0.6665 -0.8332
Theta1 for k  2.7517 0.3195     2.1228    2.753     3.3774  2.7558
Theta2 for k -2.9486 0.3377    -3.6046   -2.952    -2.2775 -2.9625
Beta for j    0.0032 0.0149    -0.0228    0.002     0.0351 -0.0014

Expected number of effective parameters(std dev): 1947.26(0.00)
Number of equivalent replicates : 16.69 

Deviance Information Criterion (DIC) ...: 48542.68
Effective number of parameters .........: 1907.54

Watanabe-Akaike information criterion (WAIC) ...: 68050.93
Effective number of parameters .................: 15486.26

Marginal log-Likelihood:  -62685.19 
CPO and PIT are computed
Posterior marginals for linear predictor and fitted values computed



### The area level model used:


formula.ST2_poisson <- obs ~ f(ID.area, model = "besag", graph = portugal.adj)

model.inla.ST2_poisson <- inla(formula.ST2_poisson, family = "poisson",
                               data = data_paper2, offset = log(n),
                               control.predictor = list(compute = TRUE),
                               control.compute = list(config = TRUE, waic = TRUE, dic = TRUE, cpo = TRUE))



### the summary for the area level model:


> summary(model.inla.ST2_poisson)

Call:
c("inla(formula = formula.ST2_poisson, family = \"poisson\", data = data_paper2, ",  "    offset = log(n), control.compute = list(config = TRUE, waic = TRUE, ",  "        dic = TRUE, cpo = TRUE), control.predictor = list(compute = TRUE))" )

Time used:
 Pre-processing    Running inla Post-processing           Total 
         3.0668          0.3126          0.2059          3.5853 

Fixed effects:
               mean     sd 0.025quant 0.5quant 0.975quant    mode kld
(Intercept) -2.8027 0.0297    -2.8621  -2.8024    -2.7454 -2.8016   0

Random effects:
Name  Model
 ID.area   Besags ICAR model 

Model hyperparameters:
                       mean    sd 0.025quant 0.5quant 0.975quant  mode
Precision for ID.area 16.94 9.189      5.757    14.80      40.53 11.46

Expected number of effective parameters(std dev): 14.28(1.922)
Number of equivalent replicates : 1.96 

Deviance Information Criterion (DIC) ...: 206.20
Effective number of parameters .........: 14.74

Watanabe-Akaike information criterion (WAIC) ...: 205.82
Effective number of parameters .................: 10.92

Marginal log-Likelihood:  -135.15 
CPO and PIT are computed
Posterior marginals for linear predictor and fitted values computed




Thank you.


Soraia



Rplot.png

Haakon Bakka

unread,
Dec 16, 2016, 6:47:33 AM12/16/16
to Soraia Pereira, R-inla discussion group
Hi,

The model you have in "summary(j.res68)" does not seem right:
1. Do you actually want 3 spatial terms and 5 hyperparameters?
2. According to the logfile, the model did not converge.
3. Why do you have "+ f(k, model=spde)"?

Suggestions:
1. Remove the "+ f(k, model=spde)" term (unless you need it for a specific reason).
2. Use PC priors for any spatial effects:

## PC prior for the spatial effect
domain.size = 400
spde = inla.spde2.pcmatern(mesh,
                           prior.range = c(domain.size, 0.5),
                           prior.sigma = c(1, 0.5))


Let me know how it goes!
(That is, send me the model summaries and the logfiles.)

Kind regards,
Haakon Bakka





Soraia Pereira

unread,
Dec 19, 2016, 4:43:52 PM12/19/16
to R-inla discussion group, soraia....@gmail.com

Hi Haakon,


I followed your suggestions, but the result of model$mode$mode.status is still greater than zero. I've attached the log file. The summary is:


> summary(j.res75)

Call:
c("inla(formula = jform75, family = c(\"poisson\", \"poisson\"), data = inla.stack.data(j.stk), ",  "    quantiles = c(0.025, 0.5, 0.975), E = inla.stack.data(j.stk)$e, ",  "    control.compute = list(config = TRUE, dic = TRUE, cpo = TRUE, ",  "        waic = TRUE), control.predictor = list(A = inla.stack.A(j.stk), ",  "        compute = TRUE, link = 1), control.inla = list(int.strategy = \"eb\"), ",  "    control.results = list(return.marginals.random = F, return.marginals.predictor = F))" )

Time used:
 Pre-processing    Running inla Post-processing           Total 
         3.4717      25325.2797          4.6753      25333.4267 

Fixed effects:
         mean     sd 0.025quant 0.5quant 0.975quant    mode kld
b0.pp -8.2997 0.0651    -8.4278  -8.2996    -8.1722 -8.2995   0
b0.y  -1.7028 0.0210    -1.7443  -1.7027    -1.6618 -1.7026   0

Random effects:
Name  Model
 i   SPDE2 model 
 j   Copy 

Model hyperparameters:
               mean     sd 0.025quant 0.5quant 0.975quant    mode
Range for i  6.4807 0.2140      6.071   6.4771     6.9115  6.4701
Stdev for i  2.3760 0.0037      2.369   2.3760     2.3833  2.3760
Beta for j  -0.0428 0.0128     -0.068  -0.0428    -0.0176 -0.0428

Expected number of effective parameters(std dev): 1848.26(0.00)
Number of equivalent replicates : 17.59 

Deviance Information Criterion (DIC) ...: 48686.36
Effective number of parameters .........: 1805.29

Watanabe-Akaike information criterion (WAIC) ...: 67885.37
Effective number of parameters .................: 15214.72

Marginal log-Likelihood:  -25739.38 
CPO and PIT are computed
Posterior marginals for linear predictor and fitted values computed


###


If I remove the offset term, model$mode$mode.status is zero, but the fitted values for the area level that I need are similar to those obtained from the first model. I've attached the log file. The summary of this model is:


> summary(j.res75semoffset)

Call:
c("inla(formula = jform75semoffset, family = c(\"poisson\", \"poisson\"), ",  "    data = inla.stack.data(j.stk), quantiles = c(0.025, 0.5, ",  "        0.975), E = inla.stack.data(j.stk)$e, control.compute = list(config = TRUE, ",  "        dic = TRUE, cpo = TRUE, waic = TRUE), control.predictor = list(A = inla.stack.A(j.stk), ",  "        compute = TRUE, link = 1), control.inla = list(int.strategy = \"eb\"), ",  "    control.results = list(return.marginals.random = F, return.marginals.predictor = F))" )

Time used:
 Pre-processing    Running inla Post-processing           Total 
         3.4388       6134.3440          7.2729       6145.0557 

Fixed effects:
         mean     sd 0.025quant 0.5quant 0.975quant    mode kld
b0.pp -7.7642 0.4424    -8.6341  -7.7638    -6.8974 -7.7630   0
b0.y  -2.3786 0.0390    -2.4553  -2.3786    -2.3022 -2.3785   0

Random effects:
Name  Model
 i   SPDE2 model 
 j   Copy 

Model hyperparameters:
               mean     sd 0.025quant 0.5quant 0.975quant    mode
Range for i 34.1695 3.6547    26.9510  34.2472    41.1772 34.7230
Stdev for i  3.8088 0.3341     3.1362   3.8209     4.4359  3.8782
Beta for j   0.0811 0.0169     0.0514   0.0798     0.1172  0.0755

Expected number of effective parameters(std dev): 1696.60(0.00)
Number of equivalent replicates : 19.16 

Deviance Information Criterion (DIC) ...: 55233.21
Effective number of parameters .........: 1602.79

Watanabe-Akaike information criterion (WAIC) ...: 85199.83
Effective number of parameters .................: 20382.41

Marginal log-Likelihood:  -28982.59 
CPO and PIT are computed
Posterior marginals for linear predictor and fitted values computed


###


I think the problem in my case is over-smoothing. The estimates for the areas with higher values are lower than the observed values, and the estimates for the areas with lower values are higher than the observed values. I think this smoothing is due to the SPDE effects, but I don't know how I can change it.


Can you help me with some advice?


Thank you.

j.res75$logfile.txt
j.res75semoffset$logfile.txt

Haakon Bakka

unread,
Dec 23, 2016, 4:30:28 PM12/23/16
to Soraia Pereira, R-inla discussion group
Hi,

See inline comments.

Happy holidays!
Haakon Bakka



On 19 December 2016 at 22:43, Soraia Pereira <soraia....@gmail.com> wrote:

Hi Haakon,


I followed your suggestions but the result of model$mode$mode.status is also higher than zero. I’ve attached the log file. The summary is


I do not know what that mode.status is, unfortunately. I looked at your logfiles, and the first one (j.res75) looks very bad: it appears the algorithm diverges. Is this the model you are talking about here? This is probably because the posterior is flat or badly shaped.
 

> summary(j.res75)

Model hyperparameters:
               mean     sd 0.025quant 0.5quant 0.975quant    mode
Range for i  6.4807 0.2140      6.071   6.4771     6.9115  6.4701
Stdev for i  2.3760 0.0037      2.369   2.3760     2.3833  2.3760
Beta for j  -0.0428 0.0128     -0.068  -0.0428    -0.0176 -0.0428


###


If I remove the offset term, the result of model$mode$mode.status is zero, but the fitted values for the area level that I need are similar to the obtained values from the first model. I’v attached the log file. The summary of this model is:


The offset should not change the model if it is constant, as it just shifts the intercept around (assuming you have an intercept). I don't see your model here; did you have an intercept?

The second log-file looks OK, not great, but not bad either.

 

> summary(j.res75semoffset)

Model hyperparameters:
               mean     sd 0.025quant 0.5quant 0.975quant    mode
Range for i 34.1695 3.6547    26.9510  34.2472    41.1772 34.7230
Stdev for i  3.8088 0.3341     3.1362   3.8209     4.4359  3.8782
Beta for j   0.0811 0.0169     0.0514   0.0798     0.1172  0.0755


###


I think the problem in my case is an over-smoothing. The estimates for the areas with higher values are lower than the observed values and the estimates for the areas with lower values are higher than the observed values. I think this smoothing is due to the spde effects but I don’t know how can I change it. 


Hmm. You have a posterior range of 36, compare that to your study region. It seems like a short spatial range. On the other hand, you have a large amount of data, and just 1 spatial field?

## Other things to try 1: Rerun
Re-run the model with control.mode = list(restart = TRUE, theta = theta0) inside the inla call, where theta0 = result$internal.summary.hyperpar$mean from the previous call. This should make the algorithm more stable.

I usually keep these initial values in my code (as numbers), as this also speeds up the running time significantly!
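In code, the restart pattern might look like this (a sketch using the object names from earlier in the thread; the remaining arguments stay as in the original inla call):

```r
## Sketch: restart the optimisation from the hyperparameter mode of a previous fit.
theta0 <- j.res75$internal.summary.hyperpar$mean

j.res75.restart <- inla(jform75,
                        family = c("poisson", "poisson"),
                        data = inla.stack.data(j.stk),
                        E = inla.stack.data(j.stk)$e,
                        control.predictor = list(A = inla.stack.A(j.stk),
                                                 compute = TRUE, link = 1),
                        control.mode = list(theta = theta0, restart = TRUE))
```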

## Other things to try 2: Overdispersion
It is reasonable to think that your model is not correct, and the Poisson likelihood does not have any flexibility, so you get oversmoothing.

You can add an iid random effect to allow for overdispersion (or to account for other errors and some model misspecification): add f(iidx, model='iid', hyper=hyper.iid) to your formula, with the shrinkage prior hyper.iid = list(prec = list(prior = 'pc.prec', param = c(3, 0.01))).


Let me know if it helps,
Haakon

 

Haakon Bakka

unread,
Dec 23, 2016, 4:34:05 PM12/23/16
to Soraia Pereira, R-inla discussion group
PS, on the overdispersion: you may want to add this to only one of the likelihood models (I recommend the abundance model). You may also want to add a separate iid effect to each likelihood model, as you have enough data for this to be sensible.
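As a sketch, separate iid effects for the two likelihoods could look like this (`iidx.pp` and `iidx.y` are hypothetical index columns; in a two-likelihood stack each column is NA on the rows belonging to the other likelihood, so each effect enters only its own model):

```r
## Sketch: one iid overdispersion term per likelihood.
hyper.iid <- list(prec = list(prior = "pc.prec", param = c(3, 0.01)))

jform.od2 <- y ~ 0 + b0.pp + b0.y +
  f(i, model = spde) +
  f(j, copy = "i", fixed = FALSE) +
  f(iidx.pp, model = "iid", hyper = hyper.iid) +  # point-process part only
  f(iidx.y,  model = "iid", hyper = hyper.iid)    # abundance/mark part only
```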

Haakon