The difference between lppd_i and loo_i has been used as a sensitivity measure
(see, e.g., Gelfand et al. 1992). The Pareto shape parameter estimate k is
likely to be large if the difference between lppd_i and loo_i is large. It's
not yet clear to me whether the Pareto shape estimate k is a better diagnostic
than lppd_i - loo_i, but at least we know that the estimate of lppd_i - loo_i
is itself too small if k is close to 1 or larger, so it might be better to look
at k.
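As a rough sketch of what I mean (a toy normal-mean setup I made up for
illustration, not the actual example data), one can compute both quantities
for a single observation: lppd_i from the full-posterior draws, loo_i by raw
importance sampling, and k by fitting a generalized Pareto to the largest
importance ratios:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Toy posterior draws for a normal-mean model (assumed setup for
# illustration only): theta approximates draws from p(theta | y).
theta = rng.normal(0.0, 0.3, size=4000)

# One held-out observation, chosen to be moderately surprising.
y_i = 2.5
log_lik = stats.norm.logpdf(y_i, loc=theta, scale=1.0)

# lppd_i: log pointwise predictive density using full-data draws.
lppd_i = np.log(np.mean(np.exp(log_lik)))

# Raw importance-sampling LOO: weights r_s = 1 / p(y_i | theta_s),
# stabilized by subtracting the max on the log scale.
log_r = -log_lik
log_r -= log_r.max()
w = np.exp(log_r)
loo_i = np.log(np.sum(w * np.exp(log_lik)) / np.sum(w))

# Pareto shape k: fit a generalized Pareto to the top 20% of the
# importance ratios (a crude stand-in for the full PSIS procedure).
tail = np.sort(w)[int(0.8 * w.size):]
k_hat, _, _ = stats.genpareto.fit(tail - tail.min(), floc=0.0)

print("lppd_i - loo_i:", lppd_i - loo_i, " k:", k_hat)
```

Note lppd_i - loo_i is always nonnegative here (by Jensen's inequality the
IS estimate of loo_i can't exceed lppd_i), and when k grows the loo_i
estimate itself becomes unreliable, which is the point above.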
In the stack example with the normal model, k for one observation is large,
but with the Student-t model k is smaller. The normal model is the same as the
Student-t model but with a very strong prior on the degrees of freedom. So
it's not just about having a strong prior or more shrinkage, but about having
a model which can describe the observations well. With increased shrinkage and
a non-robust observation model, that one observation could still be
surprising.
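To illustrate the normal vs. Student-t contrast (again with an assumed toy
setup, not the actual stack data): for an outlying observation, the importance
ratios 1/p(y_i | theta_s) are much heavier-tailed under a normal likelihood
than under a t likelihood, and the fitted Pareto shape reflects that:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Assumed toy posterior draws for a location parameter, plus one
# outlying observation y_i (hypothetical numbers for illustration).
theta = rng.normal(0.0, 0.5, size=4000)
y_i = 4.0

def tail_k(log_lik, tail_frac=0.2):
    """Fit a generalized Pareto to the largest importance ratios
    r_s = 1 / p(y_i | theta_s) and return the shape estimate."""
    log_r = -log_lik
    log_r -= log_r.max()          # stabilize on the log scale
    r = np.exp(log_r)
    tail = np.sort(r)[int((1 - tail_frac) * r.size):]
    k, _, _ = stats.genpareto.fit(tail - tail.min(), floc=0.0)
    return k

# Normal observation model: log-density drops quadratically in theta,
# so the ratios are heavy-tailed and k comes out larger.
k_normal = tail_k(stats.norm.logpdf(y_i, loc=theta, scale=1.0))

# Student-t observation model (df=4): log-density drops only
# logarithmically, so the ratios are much better behaved.
k_t = tail_k(stats.t.logpdf(y_i, df=4, loc=theta, scale=1.0))

print("k (normal):", k_normal, " k (Student-t):", k_t)
```

The df=4 and the 20% tail fraction are arbitrary choices for the sketch; the
qualitative ordering k_normal > k_t is what mirrors the stack example.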
Naturally it's not always the best solution to change to a more robust
observation model allowing "outliers". Instead it might be better to make the
regression function more nonlinear (that is, use a less strong prior), or to
transform covariates, or to add more covariates.
So I do recommend looking at the Pareto shape parameter values, but I don't
recommend increasing shrinkage just because the values are large.
Aki
On 03.09.2015 00:36, Jonah Sol Gabry wrote:
> I'll defer to Aki for any definitive answer, but this makes sense to me. In
> particular for regression models where if the leave one out distribution is a
> bad approximation to the distribution for just point i then this could indicate
> leverage issues and that more shrinkage would be good.
...