Stan Math Library - Newbie Question on AutoDiff Operator Overloading


Adam Hartshorne

Mar 15, 2017, 6:38:05 PM
to Stan users mailing list
Hi,

I am trying to use the autodiff functionality in the Stan Math Library to compute the Jacobian as part of a complex energy optimization. I have a reasonable understanding of how the autodiff approach works in theory.

The complication is that some of the variables in one energy term are non-trivial, and their derivatives are hidden within a third-party library (they are not trivial to calculate from first principles). They relate to the change in curvature of patches representing a surface.

As an example, let's call two of the variables u and v, which are 3x1 vectors I am trying to optimize over. For known u and v, the API can return the first, second, and third derivatives, i.e. du, dv, duu, dvv, duv, etc., so I am able to ascertain these in addition to my current u and v values.

So my question is whether it is possible to overload a simple example, e.g.

stan::math::var lp = 0.5 * cross(u, v);
lp.grad();
std::cout << "d.f/d.u = " << u.adj() << " d.f/d.v = " << v.adj() << std::endl;


such that, when the Stan Math Library tries to find the partial derivatives with respect to u and v, rather than attempting a standard differentiation on what appears to be a simple pair of variables, it replaces the nodes in the graph with either my known du and dv values or a function which calls the API to get them.

I presume that if this is possible, it is via some sort of overloading of the stan::math::var type (the type you normally define your standard variables as)?

Thanks in advance,

Adam



Michael Betancourt

Mar 15, 2017, 6:51:00 PM
to stan-...@googlegroups.com
You’ll want to wrap the proprietary calculations in a custom Stan function. See https://github.com/stan-dev/stan/wiki/Contributing-New-Functions-to-Stan.

--
You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
To post to this group, send email to stan-...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Adam Hartshorne

Mar 15, 2017, 7:07:02 PM
to Stan users mailing list
Are there any examples of this or any functions in the library that do something similar that I could use as a possible template?

Bob Carpenter

Mar 15, 2017, 7:47:29 PM
to stan-...@googlegroups.com
I think we're just talking about the math lib, not using
Stan for sampling, so much of that doc won't be relevant.

To plug in an external black box function for use with the rest of
our autodiff, you need to be able to compute its value
and its Jacobian.

An alternative is to convert the double types to stan::math::var
types and make sure all the operations bottom out in functions the
Stan math library supports (all the C++ built-ins plus a lot more).

You want to check out the arXiv paper for details on how
our reverse-mode autodiff works:

https://arxiv.org/abs/1509.07164

It's very thoroughly tested.

Forward mode isn't nearly as thoroughly tested.

- Bob



Adam Hartshorne

Mar 15, 2017, 7:59:15 PM
to stan-...@googlegroups.com
Correct, I am talking about the math lib.

> To plug in an external black box function for use with the rest of
> our autodiff, you need to be able to compute its value
> and its Jacobian.

My problem is that the Jacobian I want to calculate is a complex combination of the variable values and their derivatives from the function. I only have access to those, not the Jacobian; i.e., for variables u and v, I also have du, dv, duu, duv, dvv, etc.

To give a more concrete example, the type of function I want to calculate the Jacobian of is

K = (E*G - F*F)

where

E = dot(du,du)
F = dot(du,dv)
G = dot(dv,dv)

where du, dv are first derivatives. So obviously, if I were to expand this out via the chain rule, I would be using the second-order derivatives such as duu, duv, etc.

Looking through the code, I was wondering if something like the tgamma function might be a good starting template?


Bob Carpenter

Mar 17, 2017, 8:01:12 PM
to stan-...@googlegroups.com
If you have the first-order derivatives of the output values
with respect to the input values, you have the Jacobian, so I'm
confused by your message here. I also can't tell what du and dv
are, as your function seems to be

f(E, G, F) = E * G - F^2

which has a very simple gradient (here that's just the Jacobian
because there's only one output).

- Bob


Adam Hartshorne

Mar 18, 2017, 7:43:30 AM
to Stan users mailing list
Sorry for being unclear. 

I have a function, f(u,v), hidden in a third-party API; I don't have access to what it computes internally, and it returns a [3,1] vector of values. There are also methods in the API: GetFirstDerivatives(u,v), which returns the partial derivatives of f(u,v), i.e. du and dv, and GetSecondDerivatives(u,v), which returns duu, dvv, duv. Thus,

du is the partial derivative of f with respect to u.
dv is the partial derivative of f with respect to v.

duu is the second-order partial derivative of f with respect to u, etc.

Say I have a function K = (E*G - F*F),

where 

E = dot(du,du) 
F = dot(du,dv) 
G = dot(dv,dv) 

Which is all fine.

However, now I want to create the Jacobian of K, which will obviously require combinations of second-order partial derivatives (which I can numerically get from the functions mentioned above). The autodiff library as it stands would not be able to cope with this, because it won't know the partial derivative of u, let alone du, as the base formula for f(u,v) is hidden from us.

So when it comes to calculating the Jacobian of K, there is going to be a need somewhere in the graph to calculate things like d/du (F), which written out in full is dot(duu, dv) + dot(du, duv).

Autodiff won't know how to calculate duu or duv. So there are going to be nodes in the graph I need to overload with special operators, such that in this example GetSecondDerivatives(u,v) is called in order to insert the known value of duu or duv.

Adam Hartshorne

Mar 18, 2017, 3:39:46 PM
to Stan users mailing list
Thought it was worth mentioning that the Adept autodiff library has a very useful feature for problems such as mine: append_derivative_dependence, which allows you to specify a derivative you already know, e.g. as calculated by a third-party library.

Section 2.7

Daniel Lee

Mar 18, 2017, 9:04:02 PM
to stan-users mailing list
Hi Adam,

It's still very unclear. If you can compute the function value and the gradients with respect to the parameters, there are ways to create Stan autodiff variables with the right structure. That said, you'll have to be pretty familiar with how the math library works. It's outlined in the arXiv paper (with working examples). The next place to look is the code and in particular, the unit tests that show exactly how to instantiate these things.

Mind trying to describe:
- what you're trying to do (the overall goal)
- what pieces you have available to you; be specific here down to function signatures if you can

Hopefully that's enough for us to help out.


Daniel




Adam Hartshorne

Mar 19, 2017, 10:34:23 AM
to stan-...@googlegroups.com
My ultimate goal is a least-squares optimization via the Levenberg-Marquardt method, so I need to form the Jacobian of a cost function. The cost function I am using includes a term which is the difference of Gaussian curvatures. For the other terms I have the actual function defined in a traditional way that fits with autodiff, but for the difference of Gaussian curvatures I don't have access to the underlying function.

Gaussian curvature can be defined as a combination of the first and second fundamental forms of a surface.

K = (e*g - f^2) / (E*G - F^2)

where

N = cross(du, dv) / norm(cross(du, dv))
e = dot(N, duu)
f = dot(N, duv)
g = dot(N, dvv)
E = dot(du, du)
F = dot(du, dv)
G = dot(dv, dv)

where, for example, du is the partial derivative of a hidden function f(u,v) with respect to u, etc., as stated before.


So I want to incorporate the Jacobian of this function in my LM optimization. 

I am using Pixar's OpenSubDiv library to model a surface. I have modified this set of functions such that it returns up to third derivatives. However, I don't have access to the base function f(u,v), only the ability to call functions which return things like du, duv, ...


What it does is a little bit more complicated, but for ease of explanation let's say that it effectively takes a set of triples (faceId, u, v), from which you can get the values of the first-, second-, and third-order partial derivatives. The faceId isn't important; just think of it as a function f(u,v). The exact function is a black box hidden in OpenSubDiv (it is actually different functions depending on various conditions, which is why I don't want to open the black box up directly to autodiff).

So what I want to do is override some operators in Stan so that, when it is calculating the Jacobian and differentiating K, for every partial derivative that will be required I know the value and can insert it into the graph directly.

As stated in my last post, Adept appears to have exactly this built in: append_derivative_dependence, which is explained in Section 2.7 of their documentation with an example not a million miles from what I am trying to achieve.


Michael Betancourt

Mar 19, 2017, 11:19:48 AM
to stan-...@googlegroups.com
You are overcomplicating the problem, which is already
complicated enough.

First, let’s assume that you are studying a surface that
admits a global coordinate system so that the Gaussian
curvature equations can be defined self-consistently 
everywhere.

Then you can write your cost function as 

C : R^N \times R^N -> R
(u, v) \mapsto K(u) - K(v)

The function K can be written entirely in terms of the
first and second-order coordinate surface tangents 
made available by this third-party library.  

Then you can compute the gradient of C analytically
in terms of the first-, second-, and third-order surface
tangents, which you claim are also available.

So now you have equations for C and its gradient
and you can evaluate them using information provided
by the third-party library.

_This_ is what you implement as a function in the Stan
Math Library.  If you cannot provide that gradient
analytically and can’t or won’t open up the black box
then we literally do not have the information necessary
to try to compute the gradient automatically.



Bob Carpenter

Mar 19, 2017, 6:01:18 PM
to stan-...@googlegroups.com

> On Mar 19, 2017, at 11:19 AM, Michael Betancourt <betan...@gmail.com> wrote:
>
> You are overcomplicating the problem, which is already
> complicated enough.

@Michael: I think Adam was just replying to what Daniel asked about, which is
to fully specify the problem being solved. The problem people have
is that until they know the guts of autodiff in general or our system
in particular, it's hard to break these things down into the minimal
required components. Thanks for figuring this out and responding!

@Adam: You can do something very similar in Stan for gradients. It's
harder if you need second-order derivatives, because we need a templated
first-order derivative function in order to write our mixed-mode
(reverse nested in forward) second-order derivatives.

I added a note on how to code up a function with known gradients here:

https://github.com/stan-dev/math/wiki/Adding-a-new-function-with-known-gradients

that parallels that section of the Adept doc (thanks for the pointer).

It doesn't really solve Adam's original problem, though, which needs
higher-order derivatives. The problem we have is that there's not an easy
way to implement a forward-mode function.

- Bob

Michael Betancourt

Mar 19, 2017, 6:14:46 PM
to stan-...@googlegroups.com
Yup, that’s how I interpreted Adam’s detailed response, which
was very helpful and welcome!

That’s the thing — from Adam’s description he doesn’t actually
need higher-order autodiff. He has all of the inputs he needs
from the third-party code, he just has to abstract the problem
a bit and think about the cost function, which requires first
and second-order tangents, as the function to be implemented
in Stan. Then it’s a straightforward problem, although the
necessary analytic calculations certainly wouldn’t be trivial!