lgcp not working well with experimental mode


frousseu

Sep 28, 2022, 9:05:10 AM
to R-inla discussion group

Hello INLA people!

I have been using the experimental mode for a little while now, based on suggestions here and my own experience, where it seemed more stable and faster. However, I have had a lot of difficulty fitting LGCP models, and it turns out the experimental mode does not seem to work as well as the classic mode. Basically, point patterns with spiky distributions and large areas of low intensity seem very difficult for the experimental mode to fit. This is partly expected, since such distributions are wildly variable, but the classic mode does a much better job, whereas the experimental mode seems to run into numerical issues.

Here are different scenarios with varying levels of clustering, with predictions from the classic mode on top and predictions from the experimental mode on the bottom. Predictions are logged to facilitate visualization. The tolerance to clustering seems much lower for the experimental mode: predictions seem to make sense just around the observations, but they quickly explode in areas without observations. The mesh size is obviously too coarse to describe the most clustered patterns, but I would expect both modes to behave the same way. One thing I noticed is that if I add a small number of random points across the whole domain, it is much easier to obtain reasonable results. My interpretation is that low intensities are very difficult to fit, since you can always get closer to 0, which drives the sd higher. Hence, even a small number of observations can help anchor the predictions.

Below is the code for reproducing the figure. I am using inlabru to make the code simpler, but I am facing the same issue with my own INLA implementation of the LGCP model. Any idea what might be going on and how to solve this problem?

Thanks for your input!
François

lgcp_inlabru.png



library(sp)
library(terra)
library(INLA)
library(inlabru)

set.seed(12345)
ncenters <- 10
npoints <- 100
mult <- c(20, 40, 60, 80, 100, 150)

edge <- 100
mesh <-
  inla.mesh.2d(
    loc.domain = cbind(c(-300, 1300, -300, 1300), c(-300, -300, 1300, 1300)),
    max.edge = c(edge, edge * 2),
    cutoff = edge,
    offset = c(edge, edge * 2)
  )

matern <- inla.spde2.pcmatern(mesh,
                              prior.range = c(10, 0.1),
                              prior.sigma = c(3, 0.1))

cmp <- coordinates ~ mySmooth(coordinates, model = matern) + Intercept(1)


layout(matrix(1:(length(mult) * 2), nrow = 2, byrow = FALSE))


l <- lapply(mult, function(i) {
  cx <- runif(ncenters, 0, 1000)
  cy <- runif(ncenters, 0, 1000)
  x <- i * rnorm(ncenters * npoints)
  y <- i * rnorm(ncenters * npoints)
  p <- SpatialPoints(cbind(cx + x, cy + y))
  ### Sprinkle uniform observations
  #add<-SpatialPoints(matrix(runif(2*50,-500,1500),ncol=2))
  #p<-rbind(p,add)
 
  lapply(c("classic", "experimental"), function(j) {

    bru_options_set(inla.mode = j, verbose = FALSE)
    fit <- lgcp(cmp, p, domain = list(coordinates = mesh))
   
    preds <- predict(
      fit,
      pixels(
        mesh,
        nx = 100,
        ny = 100,
        mask = TRUE
      ),
      ~ data.frame(lambda = exp(mySmooth + Intercept),
                   field = mySmooth),
      n.samples = 50
    )
   
    ras <- rast(preds$lambda[, "mean"])
    plot(log(ras), mar = c(1, 1, 1, 5), axes = FALSE)
    plot(
      p,
      pch = 16,
      cex = 0.5,
      col = gray(0, 0.5),
      axes = FALSE,
      add = TRUE
    )
    mtext(
      side = 3,
      adj = 0.5,
      line = -2,
      text = j
    )
  })
 
})

Finn Lindgren

Sep 28, 2022, 9:24:21 AM
to frousseu, R-inla discussion group
It would be clearer to plot the log(lambda) posterior properties directly instead of the log of the lambda properties, and with the same colour scales; some of the results seem to be much closer than they appear, with edge effects skewing the visual comparison.

When the posterior for the linear predictor eta is approx N(m,s^2), then
  E(lambda) = E(exp(eta)) \approx exp(m + s^2/2)
so that
  log(E(lambda)) \approx m + s^2/2
This means that the plot mixes aspects of posterior mean and posterior variance for the linear predictor; e.g. m might be accurate but s^2 overestimated.
But as you note, the variance on the log scale should be expected to be higher in regions with small intensity.
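As a quick Monte Carlo sanity check of the identity above (the values of m and s are illustrative only, not taken from the fits):

```r
set.seed(1)
m <- -12; s <- 1.6           # illustrative posterior mean and sd of eta
eta <- rnorm(1e6, m, s)      # samples from N(m, s^2)
mean(exp(eta))               # Monte Carlo estimate of E(lambda)
exp(m + s^2 / 2)             # lognormal identity: approximately the same value
mean(log(exp(eta)))          # E(log(lambda)) = m, free of the s^2/2 term
```

The gap between the last two quantities, s^2/2, is exactly the variance contribution that makes log(E(lambda)) plots mix mean and variance information.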

What typically happens with the experimental mode is that the results are closer to the true posterior for the given model; this isn't the same thing as "the result we think is more reasonable".
There are two main differences between the classic and experimental modes: a more compact internal model representation, which reduces numerical near-singular-matrix issues (this affects the optimisation, generally leading to a more accurate posterior mode), and the use of a variational approximation (this can have a slight bias effect on the total spatial integral of lambda, but we haven't seen noticeable pointwise issues; then again, we haven't looked at data of your type).

The different results might (or might not) be linked to differences in covariance parameter estimates.
Can you show the summary outputs as well?

Finn



--
Finn Lindgren
email: finn.l...@gmail.com

Finn Lindgren

Sep 28, 2022, 10:23:12 AM
to frousseu, R-inla discussion group
I tried to run the code, but with the latest INLA version (22.09.26) I get a gsl library core dump with the experimental mode. Not sure if that's related to your issue specifically or not (all regular testing code still runs fine).
Finn

Finn Lindgren

Sep 28, 2022, 11:20:59 AM
to frousseu, R-inla discussion group
I checked with Haavard as well, and the main problem is that the model is close to singular and not a good fit to the data;
while log-Gaussian Cox processes were introduced partly to model local clustering, they are not good at capturing all types of strong clustering, and the effect of that on model estimation is numerical problems.

I've attached my modified code that sets an integrate-to-zero constraint on mySmooth, tightens the covariance parameter priors (10% probability of std.dev. above 3 is _very_ large on the log scale!),
and also increases the prior precision for the intercept.
With those changes, the two modes give comparable results for all except the first test case: the one least similar to what an actual stationary log-Gaussian Cox process output would look like.
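Since the code.R attachment is not inlined in the archive, here is a hedged sketch of those changes; the exact prior values are placeholders I chose for illustration, not necessarily Finn's numbers:

```r
# Integrate-to-zero constraint plus tighter PC priors (values are assumptions):
matern <- inla.spde2.pcmatern(
  mesh,
  prior.range = c(200, 0.5),  # P(range < 200) = 0.5 (placeholder)
  prior.sigma = c(1, 0.1),    # P(sigma > 1) = 0.1; sigma above 3 is huge on the log scale
  constr = TRUE               # integrate-to-zero constraint on the field
)

# Higher prior precision on the intercept (placeholder value):
cmp <- coordinates ~ mySmooth(coordinates, model = matern) +
  Intercept(1, mean.linear = 0, prec.linear = 1)
```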

In general, there's not much to be done about this except possible methods for checking the output, and possibly improving the internal computation robustness and/or problem detection. It would be nice if the results were more robust under misspecified models, but this is a very challenging problem.

Finn
code.R

frousseu

Sep 28, 2022, 11:26:50 AM
to R-inla discussion group
Thanks for your quick response. I forgot to mention I was using the latest stable INLA version 22.05.07 and inlabru 2.5.3. I will update the code tomorrow to plot the linear predictor with both modes on a common scale.

Finn Lindgren

Sep 28, 2022, 11:30:23 AM
to frousseu, R-inla discussion group
My code version already plots the E(log(lambda)) values (though not on a common scale), and it appears that the variational correction method diverges (also seen in the verbose=TRUE output).
Haavard might be able to find out whether that can be improved, so that it can still return the "best estimate, even though it's bad due to data-model mismatch".
Finn

Finn Lindgren

Sep 28, 2022, 11:41:03 AM
to frousseu, R-inla discussion group
For the benefit of the list, here are the E(log(lambda)) plots with the modified code (the mesh was also given half the max.edge value, but that only had a minor impact).

Finn
Rplot.png

Helpdesk (Haavard Rue)

Sep 28, 2022, 12:08:00 PM
to Finn Lindgren, frousseu, R-inla discussion group
without the constraint on the smooth, the smooth essentially
captures the intercept, like here

[0] x_mean[382] = -18.1007561954
[0] x_mean[383] = -17.4394735305
[0] x_mean[384] = -17.4133457156
[0] x_mean[385] = -17.5490516442
[0] x_mean[386] = -18.294247792
[0] x_mean[387] = -0.15195994289

the [387] is the intercept,

but with constr=T, it's the other way around

[0] x_mean[381] = -0.652092830049
[0] x_mean[382] = -0.605291747515
[0] x_mean[383] = -0.680238088313
[0] x_mean[384] = -0.649780685316
[0] x_mean[385] = -0.674308324682
[0] x_mean[386] = -0.418722776152
[0] x_mean[387] = -11.9482383956


Intuitively, you may want the intercept to capture the 'intercept'
(whatever we mean by that), and the smooth to be variations
around it.

I guess this happens since you're pushing the range towards infinity
(which *is* a good thing), but the other side of the story is that this
behaviour gets into a bad spin and starts to take over the intercept
role, which pushes the range higher, and there we go.

So the problem, I think, has several 'solutions', and inla() picks one.


*Why* the VB correction crashes is another issue, which I am looking at
now.

Helpdesk (Haavard Rue)

Sep 28, 2022, 12:28:25 PM
to Finn Lindgren, frousseu, R-inla discussion group
But the uncertainty in the 'wrong mode' is way higher than in the 'good
mode', and this breaks the VB correction.

I think it's wise to use constr=T for the smooth in these cases (at least
in cases with little data), so we do not end up in the bad mode.

this is the mean and sd for the linear predictor in the 'good mode'

[0] i 1 (mean,sd) = -12.578198 1.644492
[0] i 2 (mean,sd) = -12.642153 1.673452
[0] i 3 (mean,sd) = -12.687935 1.664141
[0] i 4 (mean,sd) = -12.541333 1.707002

while in the 'bad mode', it's

[0] i 1 (mean,sd) = -18.693001 10.785496
[0] i 2 (mean,sd) = -17.355483 6.970717
[0] i 3 (mean,sd) = -16.703274 6.074258
[0] i 4 (mean,sd) = -18.448008 11.043165



frousseu

Sep 30, 2022, 7:39:05 AM
to R-inla discussion group
Below is the modified code (a bit messy...) for showing predictions on a common scale and their differences. I used the same parameters as in Finn's code. Both modes agree well where the density of points is high, but they differ a lot away from high-density areas when points are clustered, and the sd is also much higher with the experimental mode. As noted, there also seems to be a much stronger fight between the intercept and the smooth in the experimental mode.

I completely agree the LGCP model is not a good fit to this simulated data. The way it is simulated corresponds more to a Neyman-Scott process (I think?) with parent points and children. Hopefully, in my real application, heavy clustering should be at least partially explained by my covariates, making the LGCP a more reasonable approximation.

To remove the effect of modeling the patterns with an inappropriate model, I simulated 3 point patterns with an LGCP, albeit with rather clustered scenarios. Again, the experimental mode seems to struggle a lot more with low-density areas compared to the classic mode. In each scenario, I'm able to get back "correct" predictions if I use a tighter prior on the intercept precision (prec.linear=1/0.1^2 instead of prec.linear=1/10^2), even with a nonsense mean.linear=0. My guess is that this reduces the fight between the intercept and the smooth which leads to the problem. When predictions from the experimental mode do look OK, they indeed look better than the ones from the classic mode, which appear more smoothed out, especially for higher-density areas. Less clustered scenarios do not seem to show any problems. I'm not totally sure, however, whether the predicted intensities should be identical to the simulated intensity or whether only relative intensities should be compared. The code is also below and is for lambda.

Thank you for the time spent on this and let me know if I can be of further help with the simulations.
François

lgcp_clustered.png


lgcp_simulation.png
lgcp_clustered.R
lgcp_simulation.R

Helpdesk (Haavard Rue)

Oct 1, 2022, 6:35:35 AM
to frousseu, R-inla discussion group
On Fri, 2022-09-30 at 04:39 -0700, frousseu wrote:
> scenario, I'm able to get back "correct" predictions if I use a
> tighter prior on the intercept precision (prec.linear=1/0.1^2 instead
> of prec.linear=1/10^2), even with a nonsense mean.linear=0.

also with constr=T on the smooth?

with experimental mode, you might try to turn off the VB correction

control.inla=list(control.vb=list(enable=TRUE))

let me know...


--
Håvard Rue
he...@r-inla.org

frousseu

Oct 1, 2022, 7:42:21 AM
to R-inla discussion group
Yes, both modes are with constr=TRUE. When removing the VB correction with control.inla=list(control.vb=list(enable=FALSE)) (you meant enable=FALSE ?), everything looks fine. I also tried with enable=TRUE and constr=FALSE and it also looks fine. Thus, in these specific cases the problem appears with enable=TRUE and constr=TRUE. This is still with the latest stable INLA version 22.05.07.
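For reference, disabling the VB correction through inlabru can be done with something like this (a sketch, using the option names discussed in this thread):

```r
# Experimental mode with the variational (VB) correction turned off:
bru_options_set(
  inla.mode = "experimental",
  control.inla = list(control.vb = list(enable = FALSE))
)
fit <- lgcp(cmp, p, domain = list(coordinates = mesh))
```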

Francois

lgcp_simulation_noVB.png

Helpdesk (Haavard Rue)

Oct 1, 2022, 8:51:09 AM
to frousseu, R-inla discussion group

can you try with the most recent testing version?



frousseu

Oct 1, 2022, 10:21:32 AM
to R-inla discussion group
With 22.09.28 and VB disabled it works OK, but with VB enabled it is even worse, with very high values for high-density areas (min ~ 15, 370, ...) and values running to infinity for low-density areas.

lgcp_simulation_VB_testing version.png

Helpdesk (Haavard Rue)

Oct 1, 2022, 1:40:58 PM
to frousseu, R-inla discussion group
with constr=T and a proper model for the intercept?


frousseu

Oct 2, 2022, 4:44:40 AM
to R-inla discussion group
Yes, this is all with constr=TRUE, but with the default intercept=0 and sd=10. The intercept used for the simulation is -8. I tested further cases with varying priors on the intercept. All models with "flexible" priors on the intercept seem problematic (although an sd of 2 for the prior on the intercept is not very flexible). Using very tight priors leads to OK models, even with mean.linear=+10, far away from the "truth", and prec.linear=1/0.1^2. I show the sd since I find it a bit more intuitive. All with constr=TRUE and VB enabled. If I disable VB, all models look OK, with either constr=TRUE or FALSE. The code is attached below if needed.

François

lgcp_simulation_intercept_priors.png

lgcp_simulation_intercept_priors.R

Helpdesk (Haavard Rue)

Oct 5, 2022, 5:48:16 AM
to frousseu, R-inla discussion group
The interplay between the smooth and the intercept is the issue, and the
model gives too many options for interpretation.

I added an emergency option, so that these cases will be detected and
the vb.correction disabled (and no correction done).

https://github.com/hrue/r-inla/commit/b040ac0e5db64e51b7d004dbc88012d90fc86572

It will be available in the next build. Let me know if this does not
work well. (PS: you can set the 'emergency' value in control.vb, and
it backs off if max|mean[i] - mean.old[i]| / sd[i] > emergency; this is
checked at each iteration.)
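If I read that description correctly, the threshold would presumably be adjusted along these lines (the value 20 is an illustrative assumption, not the documented default):

```r
# Back off the VB correction when max|mean - mean.old| / sd exceeds 'emergency':
fit <- lgcp(cmp, p, domain = list(coordinates = mesh),
            options = list(control.inla = list(
              control.vb = list(enable = TRUE, emergency = 20))))
```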


thank you François, this discussion (and your examples) were very
useful!

Best
H


Helpdesk (Haavard Rue)

Oct 5, 2022, 9:20:31 AM
to frousseu, R-inla discussion group

the new version with the emergency option is out now.



frousseu

Oct 5, 2022, 11:20:43 AM
to R-inla discussion group
Thank you very much for quickly looking at this. I will let you know if I find anything else with the newer version and I'll be happy to run more simulations if needed. A lot of the work I do relies on INLA so I'm always happy to see it improved!

Cheers,
François
