On 6/25/13 11:11 AM, Julien Cornebise wrote:
> ...
> I have a somewhat stupid follow-up: is there a way to give Stan the closed-form derivatives to use (i.e., use NUTS with
> manual derivatives), since they are easy-ish in neural-net models with the so-called "backpropagation", and would that
> be likely to speed anything up, or does the auto-diff always use the most efficient way of computing the derivatives?
To elaborate a bit on Michael's answer, it depends on what the model is.
Our auto-diff is pretty efficient, but there are certainly cases
where it can be made more efficient and more arithmetically stable.
For example, rather than computing poisson(exp(alpha)) directly, we
provide a distribution poisson_log(alpha) =def= poisson(exp(alpha)),
and we have customized (analytic) derivatives for poisson_log.
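Concretely, working on the log scale keeps the computation arithmetically stable and makes the derivative trivial. Here is a minimal sketch (illustrative only, not Stan's actual C++ source) of the log pmf and its closed-form derivative:

```cpp
#include <cmath>

// Sketch of poisson_log(y | alpha) = Poisson(y | exp(alpha)), kept on the
// log scale throughout:
//   log p(y | alpha) = y * alpha - exp(alpha) - lgamma(y + 1)
double poisson_log_lpmf(int y, double alpha) {
  return y * alpha - std::exp(alpha) - std::lgamma(y + 1.0);
}

// The derivative with respect to alpha collapses to a closed form,
//   d/dalpha log p = y - exp(alpha),
// so there is no need to auto-diff through exp(alpha) and the division
// that the naive poisson(exp(alpha)) parameterization would introduce.
double poisson_log_lpmf_grad(int y, double alpha) {
  return y - std::exp(alpha);
}
```

Note that on the log scale there is no division by exp(alpha) anywhere, which is where the naive parameterization loses both speed and stability.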
The auto-diff needs to be defined in C++ along the same lines
as our other functions.
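To give the flavor of what "defined in C++" means here, consider a toy reverse-mode tape (this is purely illustrative; it is not Stan's actual vari/var machinery). A hand-coded derivative just registers the closed-form partials for a node directly, so the reverse sweep never has to step through exp(), lgamma(), and so on operation by operation:

```cpp
#include <cmath>
#include <memory>
#include <utility>
#include <vector>

// Toy reverse-mode tape. Each node records its inputs together with the
// partial derivative of its value with respect to each input; backward()
// then accumulates adjoints by the chain rule in reverse order.
struct Node {
  double val = 0.0;
  double adj = 0.0;  // accumulated adjoint d(output)/d(this node)
  std::vector<std::pair<Node*, double>> parents;  // (input, d out / d in)
};

struct Tape {
  std::vector<std::unique_ptr<Node>> nodes;

  Node* leaf(double v) {
    nodes.push_back(std::make_unique<Node>());
    nodes.back()->val = v;
    return nodes.back().get();
  }

  // A custom function supplies its value and hand-coded partials in one
  // shot, instead of building a subgraph of elementary operations.
  Node* custom(double v, std::vector<std::pair<Node*, double>> partials) {
    Node* n = leaf(v);
    n->parents = std::move(partials);
    return n;
  }

  void backward(Node* out) {
    out->adj = 1.0;
    for (auto it = nodes.rbegin(); it != nodes.rend(); ++it)
      for (auto& [p, d] : (*it)->parents)
        p->adj += (*it)->adj * d;
  }
}; 
```

A custom poisson_log would then register the single partial y - exp(alpha) for its one input, and the reverse sweep propagates it in one multiply-add.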
In the particular case of logistic regression, even if we had
a custom GLM distribution with hand-coded derivatives,
y ~ logistic_regression(x, beta);
where y is an N-vector, x is an (N x K) predictor matrix,
and beta is a K-vector, a careful implementation probably
wouldn't be much more efficient than the current idiom:
y ~ bernoulli_logit(x * beta);
The main saving comes from eliminating intermediate expressions by
reducing the derivatives analytically (partially evaluating them, in
computational terms).
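For logistic regression that analytic reduction is exactly the textbook "backpropagation" formula: the gradient of the log-likelihood with respect to beta is one residual-weighted pass over the data. A plain C++ sketch (names are illustrative, not Stan's internals):

```cpp
#include <cmath>
#include <vector>

double inv_logit(double u) { return 1.0 / (1.0 + std::exp(-u)); }

// Numerically stable log pmf for y ~ bernoulli_logit(u):
//   log p(y=1 | u) = -log1p(exp(-u)),  log p(y=0 | u) = -log1p(exp(u)).
double bernoulli_logit_lpmf(int y, double u) {
  double s = (y == 1) ? u : -u;
  return s < 0 ? s - std::log1p(std::exp(s)) : -std::log1p(std::exp(-s));
}

// Fully reduced gradient of sum_n log p(y_n | x_n . beta) w.r.t. beta:
//   g_k = sum_n x[n][k] * (y_n - inv_logit(x_n . beta)).
// All the intermediate chain-rule terms have been eliminated analytically.
std::vector<double> grad_beta(const std::vector<std::vector<double>>& x,
                              const std::vector<int>& y,
                              const std::vector<double>& beta) {
  std::vector<double> g(beta.size(), 0.0);
  for (std::size_t n = 0; n < y.size(); ++n) {
    double u = 0.0;
    for (std::size_t k = 0; k < beta.size(); ++k) u += x[n][k] * beta[k];
    double resid = y[n] - inv_logit(u);  // y - predicted probability
    for (std::size_t k = 0; k < beta.size(); ++k) g[k] += x[n][k] * resid;
  }
  return g;
}
```

Since the reduced form is this simple, an auto-diff through bernoulli_logit(x * beta), which already has the logit link's derivative built in, ends up doing essentially the same arithmetic.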
- Bob