Thanks for the in-depth reply, it'll take me a few reads to fully digest...
I have no particular opinions guiding my preference for one AD tool over another. I'm happy to be educated as to the benefits of your approach.
There are some slides on the general code layout here
These are probably best used together -- watch the video and click through the updated slides to get an approximation of an updated experience...
OpenVSP is a large event driven GUI application. Including libraries that are 'ours', it entails about 300,000 lines of C++. Since it is event driven, there is no simple way to describe the flow of execution.
Most numerical codes that one would consider using AD tools have relatively simple control flow.
1) Set up the problem (set variables or read a text file)
2) Do the algorithm
3) Dump our results to file.
This is clearly not the case for OpenVSP. In the application, control flow can go in any direction at any time -- it isn't useful to think of differentiating that.
OpenVSP's code is roughly divided into three chunks:
1) Geometry core
2) GUI
3) 3D Graphics stuff (OpenGL)
It is possible to compile just the geometry core into a stand-alone executable, or a library that can be linked into other programs.
I think we would not want to do any AD on parts 2) and 3). There really is no point and they are hopelessly complex and irrelevant to the differentiation workflow that matters to us.
Ideally, we don't need to touch much of 1) (only parts of it). Unfortunately, some of our core libraries will undoubtedly get pulled into the mix.
The relevant use case is based around optimization. Basically speaking, the non-differentiated code path is something like this...
1) Build a model up, or read in a *.vsp3 file.
2) Update() the model -- this generates the Piecewise Bezier Surfaces that are the true geometry, and then also generates a wireframe from that to be displayed on screen.
3) Do some analysis starting from the Bezier surfaces or the Wireframe, or export data based on those to some file format.
A Bezier surface is a parametric surface. Consequently (x,y,z)=f(u,v) -- the x,y,z coordinates of a point on the surface are a function of u,v. The Bezier surface is defined by a finite number of control points (Cp_i,j) each, a point in x,y,z. If we enumerate the x,y,z as k, then we have C_p,i,j,k for the control points. The surface evaluation could more properly be written as X_k=f(C_pk, u, v) -- where C_pk is a matrix of control points across i,j for a given surface.
So, the derivative codepath will be something like this...
1) Build a model up, or read in a *.vsp3 file.
2) Identify one or more Parms (OpenVSP variables are called parameters) to take derivatives with respect to (call them P_a) All Parms in a model have a unique ID. At the time OpenVSP is compiled, we would not know which Parms will be selected. If it made a huge difference, we could theoretically know which Parms at compilation time, but this means that ordinary users would need to be able to easily re-compile OpenVSP on the fly.
3) Update() the model -- this generates the Piecewise Bezier Surfaces
4) UpdateDerivative() -- calculate the derivative of the control points with respect to the identified Parms. i.e. d C_p,i,j,k / d P_a . It should be obvious that the product ni*nj*nk is going to be much larger than na, so traditionally a forward mode would be preferred. Also, if we serialized i,j,k to say index b, I would expect dCpb/dPa to be very sparse.
5) Evaluate the derivative of the Bezier surface at a set of user supplied u,v points. d Xk / d P_a = f( d C_pb / d P_a, u, v)
That is probably where we will stop for an initial proof of concept. Taking the derivative of all the possible downstream analyses is intractable. We'll have to handle them one at a time as demand requires.
We would want an API call to do 5) repeatedly, or at least for a large vector of u,v points. Likewise, we would want to be able to do 4,5 for different Parms, or at least a moderate size vector Pa.
I can almost convince myself that we only ever need to calculate derivatives from the API -- never from an interactive session. Any interactive session needs could dump a *.vsp3 file and call the library version. It would be faster to use the same model that aready exists in memory, but the simplification may be worth it.
I believe that use cases that include normal operation with occasional derivative operation either result in some overhead due to duplicate computation and holding the model in memory twice -- or elaborate lengths to prevent such things. But it may be un-preventable.
One idea would involve wrapping the Geometry core in a namespace at compile time. That way we could have normal::foo() and derivative::foo() at the same time. We would still have to duplicate things in memory and re-compute, but it could help to keep the code touched by AD contained.
This is the sort of thing I'm most interested in your thoughts -- how do you approach modifying a huge complex program for automatic differentiation when vast swaths of code do not need to be involved in AD?
Rob