Obtain summary of posterior predictive distribution when using offset

380 views
Skip to first unread message

fieeuvaan

unread,
Jul 23, 2014, 10:27:42 PM7/23/14
to r-inla-disc...@googlegroups.com
I follow the guideline on this post, replicate my data and refit the mode. However, I have also the offset E which is also varies over time. How should I treat the E when replicate the data, as NA or as the original.
Thanks.

Finn Lindgren

unread,
Jul 23, 2014, 10:46:58 PM7/23/14
to r-inla-disc...@googlegroups.com
On 23/07/14 20:27, fieeuvaan wrote:
>
> I follow the guideline on this post
>
> <https://groups.google.com/forum/#!searchin/r-inla-discussion-group/posterior$20predictive$20distribuiton/r-inla-discussion-group/2FYu8lAqid4/s227YnCcpIcJ>,
>
> replicate my data and refit the mode. However, I have also the offset E
> which is also varies over time. How should I treat the E when replicate
> the data, as NA or as the original.


The post you refer two only deals with the case of Gaussian observations,
but if you have "E" in your model I assume that you have Poisson
observations? In that case, I don't think the technique from that post
applies. It's not clear to me precisely what you are trying to do (the
"replicate my data and refit the mode"), but if you're trying to obtain the
posterior predictive distribution of the observation model, a common method
is to sample from the posterior of the inla-model (using
inla.posterior.sample() ), and then sample data conditionally on that. The
"E" to use when doing that depends entirely on what it means in you model;
usually it's "population at risk" or "exposure", and in both cases you need
to use the E that is appropriate for the posterior predictions you are
looking for. In inla, the E is part of the observation likelihood model,
and is not used for the transformed linear model part of the posterior
distributions.

Finn L

> Thanks.
>
> --
> You received this message because you are subscribed to the Google
> Groups "R-inla discussion group" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to r-inla-discussion...@googlegroups.com
> <mailto:r-inla-discussion...@googlegroups.com>.
>
> To post to this group, send email to
> r-inla-disc...@googlegroups.com
> <mailto:r-inla-disc...@googlegroups.com>.
>
> Visit this group at
> http://groups.google.com/group/r-inla-discussion-group.
> For more options, visit https://groups.google.com/d/optout.

--
Finn Lindgren
email: finn.l...@gmail.com

Kinh Nguyen

unread,
Jul 24, 2014, 7:35:45 PM7/24/14
to Finn Lindgren, r-inla-disc...@googlegroups.com
On Thu, Jul 24, 2014 at 5:13 AM, Finn Lindgren <f.lin...@bath.ac.uk>
wrote:
>>> obtain the
>>> posterior predictive distribution of the observation model,
>> exactly, I need the PPD, more specifically the p(y_{iT} |
>> y_{i1},...,y_{i,T-1}), marginalizing out the parameter and
>> hyperparameter.
>
> OK, then the sampling method should work; just add the generalised
> linear model part for y_{iT} to the model, with NA "data" to get the
> posterior for the (transformed) linear predictor , \eta_{iT} in
> "inla-speak", so that inla.posterior.sample gives you samples of
> that, and then sample Poisson variables based on that. You may need
> to specify link=1 to get the transformed predictor (search the FAQ
> for example) but I'm not sure if that affects the sampler or not.

I've read some code and obtained the PPD like this:

- Extract a subset data up to time point T, set Y_T <- NA
- Fitted a Poisson model with `control.predictor = list(compute = TRUE)`
- Transform the predicted fitted marginal for time point *T* with
Mtrans <- inla.tmarginal(function(x) exp(x),
fit$marginals.fitted.values[[T]])
- Sampling from the above
Msamp <- inla.rmarginal(n = 100, Mtrans)
- Sampling again as Poisson to obtain PPD for time point *T*
newY <- rpois(n = 10, lambda = E * Msamp) #as I have the population
as risk E

The procedure described above run within a loop for the whole range of
time index. Is this somehow equivalent with what you described?

> If the expected Poisson counds are large enough, it's possible to
> derive approximate posterior distributions without sampling (using
> various Normal approximation properties), but to my knowledge noone
> has done that with inla yet.
>
> Finn

Finn Lindgren

unread,
Jul 24, 2014, 8:26:29 PM7/24/14
to r-inla-disc...@googlegroups.com
> On 24 Jul 2014, at 17:35, Kinh Nguyen <nguye...@ytecongcong.com> wrote:
> I've read some code and obtained the PPD like this:
>
> - Extract a subset data up to time point T, set Y_T <- NA

Why not use all the data?

> - Fitted a Poisson model with `control.predictor = list(compute = TRUE)`
> - Transform the predicted fitted marginal for time point *T* with Mtrans <- inla.tmarginal(function(x) exp(x), fit$marginals.fitted.values[[T]])

More efficient to let inla apply the link function internally.
Set link=1 in control.predictor, I believe.

> - Sampling from the above
> Msamp <- inla.rmarginal(n = 100, Mtrans)

If you only need this for a single variable, yes, that looks ok. For
multiple variables, that gives the wrong joint distribution. Use
inla.posterior.sample in that case.

> - Sampling again as Poisson to obtain PPD for time point *T*
> newY <- rpois(n = 10, lambda = E * Msamp) #as I have the population as risk E
>
> The procedure described above run within a loop for the whole range of time index. Is this somehow equivalent with what you described?

It's unclear what your goal is. A correct algorithm can only be
devised if one knows precisely what mathematical object is sought.

Finn L

>
>> If the expected Poisson counds are large enough, it's possible to derive approximate posterior distributions without sampling (using various Normal approximation properties), but to my knowledge noone has done that with inla yet.
>> Finn
>


Reply all
Reply to author
Forward
0 new messages