On Apr 18, 2014, at 11:49 PM, Brian Hayden <bha...@lbl.gov> wrote:
> What is the difference between doing log_lik in the transformed parameters block and using increment_log_prob in the model block, versus doing it in a generated quantities block like this:
>
> generated quantities {
>   real log_lik;
>   log_lik <- 0;
>   for (n in 1:N) {
>     log_lik <- log_lik + normal_log(y[n], X[n] * b, sigma);
>   }
> }
>
> I am asking mainly because I am trying to implement waic and I want to make sure the generated quantities version doesn't do something different.
They'll give you the same answer, because Stan only cares about
the log probability function up to a constant as defined over
the parameters --- everything else is about what gets output.
But there are some subtle differences in what happens under the hood.

Transformed parameters are derived from parameters, which means they
are computed with the heavyweight automatic-differentiation types. That
computation happens on every log prob evaluation, which is once per
leapfrog step in HMC/NUTS, or roughly 2^treedepth times per iteration.
Generated quantities, by contrast, are computed only once per iteration,
using plain double-precision floating point in C++ instead of auto-diff
variables, and are thus faster to evaluate.
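For contrast, here's a sketch of what the transformed parameters version
of your program would look like, assuming the same y, X, b, sigma, and N
declarations as in your generated quantities version:

transformed parameters {
  real log_lik;
  log_lik <- 0;
  for (n in 1:N) {
    log_lik <- log_lik + normal_log(y[n], X[n] * b, sigma);
  }
}
model {
  increment_log_prob(log_lik);
}

Here log_lik is an auto-diff variable, so that loop gets re-executed
and differentiated through at every single leapfrog step.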
Another issue is that the function normal_log() does not drop constant
terms, whereas the sampling-statement version with ~ does. We may clean
this all up in the future so they do the same thing, or ideally make it
configurable, but that's how it works now, and it means a bit more
overhead for calling normal_log() than for using sampling statements.
And if you want to reuse these per-observation pieces to compute the
log prob, you have to forgo the built-in vectorization in the sampling
distributions.
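For example, if you didn't need the per-observation terms, the whole
likelihood could be written as a single vectorized sampling statement
(assuming X is a matrix and b a vector, as your X[n] * b suggests):

model {
  y ~ normal(X * b, sigma);
}

That drops the constant terms and evaluates faster than the equivalent
loop over normal_log() calls.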
Andrew prefers the generated quantities version because (a) it's faster,
and (b) the model block can be written exactly as it would be if you
weren't computing WAIC. The transformed parameter approach, in contrast,
is nice because you only have to write the likelihood once, even if it's
coded a bit differently than it otherwise would be.
This general advice is all buried in the manual in the description of
which variables get computed where.
You probably want an array of log likelihoods, one for each data point,
too, since WAIC is defined in terms of the pointwise log likelihood
rather than its sum.
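Something like this sketch (again assuming your y, X, b, sigma, and N):

generated quantities {
  real log_lik[N];
  for (n in 1:N) {
    log_lik[n] <- normal_log(y[n], X[n] * b, sigma);
  }
}

Each log_lik[n] gets saved with the output, and the WAIC computation can
then be done on the resulting draws afterward.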
- Bob