I have setup a wiki describing details of our ODE sub system. I hope this lays out what we have, in what direction I would like to change things and what changes I propose.
https://github.com/stan-dev/math/wiki/ODE-System-Refactor
Best,
Sebastian
It would be great to proceed on this one here. Maybe little steps first. As a first step I am suggesting to route the coupled_ode_system autodiff operations through the ode_system class and since we are changing this thing already remove the shifting operation from it. The branch which does this is here
https://github.com/stan-dev/math/tree/feature/issue-310-coupled_ode_system-refactor-only
In brief the changes are:
- move rev/{arr => mat}/functor/coupled_ode_system.hpp
- make ode_system a member of coupled_ode_system whenever any Jacobian is needed
- set the initial state differently to make shifting unnecessary
- adjust tests to include rev/mat.hpp instead of rev/arr.hpp
Hence, I leave the ODE sub system as it is designed, but route autodiff call through ode_system. The use case of this is to enable in an easy way the possibility to supply analytic Jacobians.
If this is deemed ready for inclusion, I would proceed and file a pull request.
Best,
Sebastian
--
You received this message because you are subscribed to the Google Groups "stan development mailing list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to stan-dev+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Thanks for taking the time to go through this. I am a bit worried where this is heading. My proposed changes to not aim at a full blown rewrite of the ODE subsystem, which I do not think is needed. As such the wiki page I wrote was not intended to be a full design spec, but rather it explains on a high-level where we are and where I think we should go. All elements are there which are needed to make an efficient ODE subsystem, at least to my understanding; we just need to make better use of our existing code. Doing so will be much less work in comparison to throw everything away and start from scratch.
In brief my goals are:
1. unify rk45 and bdf design in terms of how stuff is calculated => I am proposing effectively to go back to a design as it has been proposed by Michael, i.e. let the bdf integrator use the coupled_ode_system.
2. make it easy to provide via partial specializations analytic Jacobians to the ODE subsystem (stiff and non-stiff case). This requires to route all AD calls through the ode_system object => this is implemented in the branch I pointed out earlier in this thread. So users would have to provide a small C++ code snippet to make this work. This aligns well with the plans we have to let users include C++ code. I do not intend to make this feature accessible via the Stan language at this point; this needs further consideration which is above my head as it involves changes to the language.
I do think that the above goals can be achieved with what we have and we do not need a full blown rewrite of the ODE subsystem which would be a huge amount of work.
Answering all your other questions would make sense to do in a wiki form which I think is what we need should we really go for a full blown rewrite, so let me for the moment answer your questions to my proposed changes first:
- what does this help with? "take away all Stan types out of objects being passed to integrators"
I am intending to make the integration process free of AD calls for the case of a supplied analytic Jacobian. This way the AD stack does not get modified during integration and hence the ODE integration becomes a double only operation which allows parallelization of the ODE code. So, to refine this - using type information at compile time for whatever reason is still OK, but not more if we want to achieve that the AD stack is never touched during integration if the analytic Jacobians are given.
- why is this important? "have a single point where AD is done; in the ode_system object which then allows to plug in analytic Jacobians easily" What is being done now? What are you trying to do to make this better? (what changes are associated with this bullet point?)
In order to provide the analytic Jacobian information in an easy way we need to have a single place where derivatives are being calculated. If I wanted to supply analytic Jacobians to the coupled_ode_system, I would have to write those out in 3 different places (var/double, double/var, var/var). Having ode_system in place and making use of it wherever a Jacobian is needed allows to easily replace all those calls with analytic variants.
- when is coupled_ode_system used? Are you planning to remove this?
Not at all. In fact, I would like to route the execution during the rhs_sens call in the bdf integrator through the coupled_ode_system object, just like Michael did it. This will replace the rhs_sens* functions inside the cvodes_ode_data object.
Best,
Sebastian
Its only the ode_system object which should stay constant - and I do not see why this should be changed any time soon. The ode_system object contains all what we need from the math side in order to construct everything desired, i.e. we need Jy and Jtheta of the ODE RHS.
As we are going to enable C++ extensions, this would nicely fit into the concept.
The benefits are much larger than just a >2X speedup:
1. My example was a non-stiff case which was small (N=3 states and S=7 sensitivities). This means that there is much more to get whenever system size is larger (AD cost should grow faster in comparison to a plain analytic Jacobian) and whenever the system is stiff (triggering more frequent Jacobian evaluations).
2. There is another huge opportunity here: Enabling a double-only (that is no AD use) ODE integration process clears the way for embarrassingly parallelization schemes for ODE integration. At first this may only be accessible again via C++, but it will be a huge speedup.
I expect that 1 will make stuff like PBPK possible and 2 will be a huge step forward for all among us who have a cluster available. Problems will go from statistical research to statistical practice.
In addition, at least in my field, inference is usually around a handful of ODE equation systems. As such, working out the Jacobians for those would already help a lot (it would nicely align with plans for C++ extensions). Finally, generating the code is really not so hard - there are a number of tools out there which help immensely with this.
Best,
Sebastian
PS: What I have in mind would leave the general design as it is. In fact, I would align the design of the bdf integrator with the one from rk45, but the interfaces being exposed stay (about) the same.
- Bob
--
You received this message because you are subscribed to the Google Groups "stan development mailing list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to stan-dev+unsubscribe@googlegroups.com.
> >
> > Sigh... I have tried to discuss the center-piece, which is ode_system, a while ago. Back then it appeared that everyone was on-board. Redoing things now looks like overhead, but ok...
>
> I'm sorry this feels like overhead. Think of it as proactively paying
> off technical debt. It's sometimes hard to see what's being proposed
> by a design, but we like to start there.
>
> This whole project is a ton of overhead to keep it in a maintainable
> state. We're being much more careful now up front given the abominations
> we've created in the past. Because we've had to go through and refactor
> them over and over to get them into a manageable state.
No offense here... this was just a very human comment. Stan is big which is what makes it sometimes not super efficient by construction; I can well understand.
>
> Are you saying that you can save time computing Jy and Jtheta together?
> That would argue for having two things:
>
> Jy for cases where you just need Jy
>
> (Jy, Jtheta) for when you need both
>
> Remember that constructing the expression is the fast part of autodiff.
> It's the reverse passes computing derivatives that are slow, and the
> speed is based on number of expressions.
I do not know if it is more efficient to do Jy and Jtheta always in a single sweep, but I do think so. Doing Jy and Jtheta in a single sweep should be faster for AD in comparison to doing them sequentially; this would be my intuition, but this has to do with how AD works and this tricked me already a few times.
>
> >> Instead of building god objects, I think we can do this through composition. That is, if we have the ode system specified by the language, say ode_rhs, then we can do something like:
> >>
> >>
> >> template <class ode_rhs>
> >> class Jy {
> >> Jy(ode_rhs& ode_rhs);
> >> ___ operator()(...);
> >> };
> >> and the same for Jtheta.
> >
> > Looks nice. As I said, we need Jy only or Jy and Jtheta, so I am not sure if we want to split out Jtheta independently.
>
> Yes. The point is that we don't want to build up big objects
> that sometimes have components in them and sometimes don't.
> It leads to bad combinatorial testing properties and makes the
> code hard to read because at any given point you can't tell what
> you have on hand in an object.
Yup, makes sense.
Sebastian
>
> - Bob