Question on the code related to measurement error models in Muff et al. 2015

165 views
Skip to first unread message

Andrea Gilardi

unread,
Mar 6, 2021, 10:33:24 AM3/6/21
to R-inla discussion group
Dear INLA developers, thank you very much for your work and your assistance. I use INLA more or less every day, and it is extremely beneficial for my research projects.

I'm writing you this message since I have a question regarding the INLA code used in example 2 of the paper Bayesian analysis of measurement error models using INLA (Muff. et al., 2015). The code can be found here: https://sites.google.com/a/r-inla.org/www/examples/case-studies/muff-etal-2014 More precisely, I don't understand why the authors set the initial value of the hyperprior associated to idx_x equal to -15 (I attach here the file, which was downloaded from INLA website. See line 64). In the paper (page 10/37), the authors wrote: "The precision, fixed to some large value controls the similarity between x* and beta_x * x. Default 10^9". The hyperprior is specified in the internal scale (i.e. log-precision) and log(10^9) = 20 (more or less), so I was surprised when I checked the R code and noticed "initial = -15".

I run the code using "initial = 15", and the results are (not surprisingly) totally different from the previous configuration. Could you explain why my reasoning is wrong?

Thank you very much
Andrea



Steffi

unread,
Mar 7, 2021, 12:34:27 PM3/7/21
to R-inla discussion group
Hi Andrea,

I think the answer to your question can be found the supplementary material that we then in the end published:

INLA_suppmat2.png

Best regards,
Steffi

Finn Lindgren

unread,
Mar 7, 2021, 2:42:34 PM3/7/21
to Steffi, R-inla discussion group
Hi both,

this seems a bit unclear; "fixing the precision to a large value" and "fixing the precision to exp(-15)" are polar opposites; To avoid the random effect playing a role,
I would expect the former, which would mean fixing it to perhaps exp(+15), not exp(-15). With exp(-15), the random effect will instead be allowed to be very large,
so I would agree that having initial = -15 seems to be the opposite of what was intended!

Finn


--
You received this message because you are subscribed to the Google Groups "R-inla discussion group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/r-inla-discussion-group/d17349e9-a557-4830-9379-57343e1a6f31n%40googlegroups.com.


--
Finn Lindgren
email: finn.l...@gmail.com

Andrea Gilardi

unread,
Mar 8, 2021, 10:30:11 AM3/8/21
to R-inla discussion group
Dear Prof. Muff and Prof. Lindgren, thank you very much for your answers.

Finn perfectly summarised my doubts, i.e. I don't understand why setting initial = -15 implies "fixing the precision to a large value". I think that is not the same behaviour as the "regular" models. For example, when I run a Gaussian model with INLA including an iid Gaussian random effect, then I need to set the precision of the hyperparameter of the Gaussian family to a large (and fixed) value; otherwise, I could  face confounding effects:

# packages
library(INLA)

# simulate from the following model
# yi ~ N(b_0 + b_z * z, 1), i = 1, ..., 200
# z ~ N(0, 1); b_0 = 1; b_z = -1
set.seed(08032021)
n <- 200; b_0 <- 1; b_z <- -1
z <- rnorm(n)
mu <- b_0 + b_z * z
y <- rnorm(n, mu)

summary(inla(
  formula = y ~ z,
  data = list(y = y, z = z)
))

# If I add an iid Gausian random effect, then I need to set the precision of the
# gaussian family to a very large and fixed value, otherwise I think there are
# confouding effects.
ID <- 1:n
summary(inla(
  formula = y ~ z + f(ID),
  data = list(y = y, z = z, ID = ID),
  control.family = list(
    hyper = list(
      prec = list(
        initial = log(10e5),
        fixed = TRUE
      )
    )
  )
))


On the other hand, it looks like the behaviour of the classical measurement error model is the opposite:

# packages
library(INLA)

# Simulate data from the following model:
# 1) Regression model
# y_i ~ Pois(lambda_i), i = 1, ..., 500
# log(lambda_i) = b_0 + b_z * z + b_x * x
# 2) Exposure model
# x = a_0 + a_z * z + eps
# 3) Error model
# w = x + u
# where
# b_0 = 1, b_z = -1 and b_x = 1;
# a_0 = 1; a_z = -1
# z ~ N(0, 1); eps ~ N(0, 1); u ~ N(0, 1)

# Constants
set.seed(08032021)
n <- 500
b_0 <- 1; b_z <- -1; b_x <- 1
a_0 <- 1; a_z <- -1

# Exposure model
z <- rnorm(n)
eps <- rnorm(n)
x <- a_0 + a_z * z + eps

# Error model
u <- rnorm(n)
w <- x + u

# Regression model
lambda <- b_0 + b_z * z + b_x * x
y <- rpois(n, exp(lambda))

# Build the Y matrix
Y <- matrix(NA_real_, nrow = 3 * n, ncol = 3)
Y[1:n, 1] <- y
Y[n + 1:n, 2] <- 0
Y[2 * n + 1:n, 3] <- w

# Reformat the data
my_data <- list(
  # Regression model
  Y = Y,
  b_0 = c(rep(1, n), rep(NA, 2 * n)),
  b_z = c(z, rep(NA, 2 * n)),
  b_x = c(1:n, rep(NA, 2 * n)),
  weight_x = c(rep(1, n), rep(-1, n), rep(1, n)),
  idx_x = c(rep(NA, n), 1:n, 1:n),
  # Exposure model
  a_0 = c(rep(NA, n), rep(1, n), rep(NA, n)),
  a_z = c(rep(NA, n), z, rep(NA, n))
)

# Run the INLA model
summary(inla(
  formula = Y ~ -1 +
    f(b_x, copy = "idx_x", hyper = list(beta = list(param = c(0, 0.1), fixed = FALSE))) +
    f(idx_x, weight_x, model = "iid", values = 1:500, hyper = list(
      prec = list(initial = -15, fixed = TRUE)
    )) +
    b_0 + b_z + a_0 + a_z,
  family = c("poisson", "gaussian", "gaussian"),
  data = my_data
))

When I run the same code setting initial = 15, then the inla routine stops with an error. I reported the output of the R code at the following gist: https://gist.github.com/agila5/90b13440dff7c2cc727c97ca04476a19. If you want, I can send you the R script to reproduce that (super small) simulation.

Just to be clear: I'm not saying that the code reported in the paper is wrong, but I just don't understand the influence and the effects of the precision parameter. Thank you very much for your assistance.

Andrea




Steffi

unread,
Mar 8, 2021, 1:18:09 PM3/8/21
to R-inla discussion group
Hi Finn and Andrea,

Sorry, I agree that it was unclear. I must admit I had now to go back and dig into this. The thing is that the f(idx.x...) component encodes for the actual (unobserved) covariate, and not for the paramter of the copy feature - that could be modified in the f(beta.x,...) component, but we leave the default value there, that is, 10^9. The reason why the uncertainty in the f(idx.x...) component is fixed to a large value (small precision) is for another reason: we have another layer where x is modelled, namely the layer where we specify the exposure model (level 2 in the code, corresponding to the second column in the stacked Y matrix). That is what we tried to say in the explanation that I copied out of the SuppMat above.

I hope that was clearer now!
Steffi

Andrea Gilardi

unread,
Mar 10, 2021, 3:45:53 AM3/10/21
to R-inla discussion group
Dear Prof. Muff, thank you very much again for your answer.

I still don't understand the importance behind the hyperprior assigned to idx.x, but now I understand that the argument precision models the relationship between x* and beta_x * x in the f() function and that the default value is exp(14) (i.e. x* and betax*x are the same thing).

Thanks.
Andrea
Reply all
Reply to author
Forward
0 new messages