Hi Bob!
Yes, sure, we can move this to stan-dev. Let me answer inline:
> Can we move this disucssion to stan-dev?
Sure.
>
> Yes, I think we should consider CVODES.
Great.
>
> The main issues with inclusion are getting the builds to work on
>
> * every platform (Mac, Windows, Linux)
> * through every compiler (g++, clang++, MSVC, mingw, Intel, ...?)
> * through every interface's installation (CmdStan, RStan, PyStan)
>
> Ideally that would mean with only using make so we don't have to add on
> any more build dependencies.
The current status is that CVODES builds with g++, clang++ and Intel. Moreover, only make is used to perform the build and install. For Rstan I figured how I get R to build CVODES - it would need Ben to look into what is best here (dynamic, static linking). For Python there are a couple of packages which make CVODES available for use in Python so I suppose this is doable as well (a nice one with a neat c++ interface is pycvodes).
> Of course, we also need it to pass our library tests. And it needs to be
> demonstrably better than whatever is already implemented.
>
> I'm all sorts of confused about which systems we're talking about and what
> they're capable of, along at least the following dimensions:
>
> * switching between stiff and non-stiff regimes
Yes, CVODES can switch between stiff and non-stiff, but not automatically. The user has to say upfront what the problem is. Here LSODA is nice in that it switches on heuristics between stiff (BDF) and non-stiff (Adams).
>
> * using finite diffs for Jacobians to handle stiffness
If you do not give CVODES the Jacobian, it will use finite diffs. Same for LSODA.
>
> * allowing us to pass in system Jacobians for stiffness
Yup, that's possible either CVODES. I guess (don't know) same is true for LSODA.
>
> * what kind of sparseness is implied by the system Jacobians
> (I think all the parameter w.r.t. other parameter components
> are zero)
> and how the solvers deal with them
CVODES has support for sparse structures. To my knowledge LSODA has not, but LSODAS has. Note CVODES = CVODE with Sensitivity while Sparseness is always included if switched on; LSODAS = LSODA with Sparseness.
>
> * what underlying ODE solver algorithms are being used
CVODES: BDF for stiff, Adams for non-stiff both with variable order
LSODA: same as CVODES, but with fixed order to my knowledge.
>
> * how CVODE relates to CVODES relates to LSODA and whether there
> are other acronyms I should be throwing around here
This is the important one:
CVODE: ODE integrator which solves the problem type y' = f(t,y) ; y(t0) = y0
So you specify a y0 and the rhs side of the ODE and you get y(t) calculated. NO SPECIAL SENSITIVITY SUPPORT HERE. In order to get the gradients dy(t)/dp for K parameters p, you have to give the integrator as rhs the coupled_ode system. Therefore, the Jacobian (as seen from CVODE) becomes very large and in order to get at least some useful performance you must take advantage of the sparseness property of the large Jacobian.
CVODES: ODE integrator which solves the problem type y' = f(t,y,p) ; y(t0) = y0(p). The dependency on parameters p is made explicit within CVODES! The integrator takes as input the rhs of the ODE (just as CVODE), but in addition you can ask for the additional solution of the sensitivity equations. So you give CVODES the rhs which is of size N and you give CVODES another rhs_sensitivity function for each parameter p for which you want the gradients dy(t)/dp solved along with y(t). The rhs_sensitivity function must use internally the Jacobian of the original system. Now, you also give to CVODES the Jacobian, but you give CVODES the Jacobian of the original system (with N states only). CVODES then takes automatically for you advantage of the sparse structure of the problem - there is no need any more to program a sparse Jacobian for the coupled system as all of this is taken care of within CVODES. This is why I refer to CVODES as "integrated ODE-sensitivity" solver. Another advantage is that CVODES provides another approach to solve the sensitivity problem. This other approach scales much better with N or K getting large according to the docs.
LSODA: Essentially like CVODE with the convenience to switch between stiff/non-stiff automatically. Solves the problem y' = f(y,t); y(t0) = y0. Hence, NO support to solve the sensitivity system in an integrated way. You have to code the sparseness yourself (which requires to use LSODAS). Note that CVODE is the direct successor of LSODA.
>
> * whether our current non-stiff solver is OK or you want to
> swap that out with a different Sundials component
Solving non-stiff problems with an enlarged coupled ode system is not such a big performance problem, because the Jacobian of the coupled system is not needed. The need to get rid of the boost odeint solver is now much relaxed as the boost solver is now robust (means the solver throws an exception instead of making Stan hang). However, CVODES should perform better for non-stiff for two reasons: (i) it uses the Adams stepper which supposedly evaluates the rhs less frequently and (ii) you cannot take advantage of the alternative way of solving the sensitivity problem.
>
> * availability of interpolation for "dense" solutions
I don't think this is important. My understanding is that this matters if the internal step size dt is much larger than the typical delta t observations are apart from one another.
>
> * whether we can throttle the stepper so as not to go into what
> feels like an infinite loop
CVODES gives up whenever maxSteps is reached. So yes, CVODES has the important robust property to discard (extreme) draws.
>
> * ability to control absolute and relative tolerances
Sure, every solver can do this. From my performance tests this does not really matter what you set there.
>
> * "integrated ODE+sensitivity solver" (Sebastian brought this up, but
> I don't know what it means)
CVODES = integrated ODE+sensitivity solver, solves y'=f(t,y,p) y(t0) = y0(p) and has built-in support to get the gradients dy(t)/dp while taking advantage of sparseness of the Jacobian automatically.
CVODE and LSODA = solvers for y'=f(t,y) y(t0) = y0; to get the sensitivities you must give it the coupled ode system as rhs and you have to program sparse Jacobians yourself (and then the performance is still by a factor of 5-500 lower in comparison to CVODES).
>
> * ability to provide (sparse) user-defined Jacobians
> -- support for sparse matrix Jacobian vs. dense
This removes the need of AD and I found this very important. If the user can even specify the sparsity, this would even be better, of course - but the big gain is to remove AD (8x speed increase for N=3, K=7 case for the gradient evaluation).
>
> * ability to provide user-defined sensitivities
Isn't that the same as providing the Jacobian by the user? Or do you mean that users provide analytic dy(t)/dp functions?
Sebastian
>
> - Bob
>
> > On Oct 2, 2015, at 6:13 AM, Weber, Sebastian
> >
> > Hi Bob & Daniel!
> >
> >
> >
> > I think that we should consolidate in a wiki all the knowledge accumulated
> for solving ODE sensitivities – otherwise I will start to forget about things
> again and I have some ideas in my head which would be nice. I can try to start
> a wiki page on that.
> >
> >
> >
> > Now, before I start that – where are we with a possible inclusion of
> CVODES? I mean, some of my ideas depend on its feasibility to include it. In
> other words, do you consider this as feasible given that with some time we
> carefully think through this? If you agree that this route is worthwhile to
> explore, then it makes sense to start documenting things. It would also be
> desirable to define criteria which are needed to take on CVODES, i.e. under
> what circumstances are we (or better Daniel who decides) good to seriously
> consider an inclusion.
> >
> >
> >
> > FYI, the current state with deploying cvode: It can now be installed with a
> simple make and it uses only g++/clang++/icpc. I also already tried to roll this
> into Rstan and my first attempts look very promising, i.e. R just builds cvode
> during installation. For PyStan: I am not a Python expert, but there are a
> couple of packages out there which make CVODES available to Python users.
> >
> >
> >
> > To make things even more fun, I looked into odeSD which is the
> presumably the fastest ODE sensitivy integrator (according to
>
http://www.biomedcentral.com/content/pdf/1752-0509-6-46.pdf).
> >
> > The good about odeSD:
> >
> > - It is just one ~1000lines C code file which includes a single function ->
> there are chances to convert this into a C++ header-only code, I think
> >
> > - About 2x faster than cvodes according to the benchmarks; it’s main
> speedup comes from requiring fewer calls to the system Jacobian which is
> good news as autodiff is so expensive
> >
> > - The authors are friendly and would consider dual-licensing odeSD
> under the BSD as well (I asked)
> >
> > - Compiles under windows
> >
> > The bad:
> >
> > - LGPL – we have to wait for dual-licensing if one were to modify it
> >
> > - By far not as proven in the wild; cvodes has been much more used
> >
> > - The code looks scary to me – lots of hard coded constants, but
> possibly that’s normal for numerics code
> >
> > - Dependency on BLAS and LAPACK – here one could try to use Eigen’s
> implementation, but this requires a Fortran compiler; or one changes the
> BLAS calls to Eigen calls – no idea how straightforward this is. Since Eigen
> implements BLAS, there would a good template available to do that; lots of
> work possibly.
> >
> > - Interface looks a bit messy, but they provide examples
> >
> > - Not as many techniques to solve ODE sensitivities as in CVODES
> >
> >
> >
> > BTW, to me the need for a stiff solver is much reduced once the robust
> patch is in Stan along with the patch from Michael. In fact, if I were to decide,
> I would opt for user gradients first by now, since many systems are non-stiff
> and these would be made a lot faster by removing AD with user gradients.
> For non-stiff systems there is not a huge gain with CVODES, at least this is
> what I expect (whereas for stiff systems there is almost no way around
> CVODES as the Jacobian explodes).
> >
> >
> >
> > Best,
> >
> >
> >
> > Sebastian Weber