ode tolerances / MCMC bias

53 views
Skip to first unread message

Sebastian Weber

unread,
Apr 4, 2016, 1:57:21 PM4/4/16
to stan...@googlegroups.com
Hi!

Probably better on the list rather than a comment on the running CVODES issue...

So here are the results for a hierarchical PK/PD model. To be specific, I simulated 50 patients of a 2-cmt model oral dosing situation for typical parameter values I use to see. Two doses (t=0 and t=24) are given and I compare against an analytic solution which I placed on the very left of the plots with abs_tol=1E-16 for machine precision. Now, I had to use the CVODES implementation as the CVODE+coupled integrator which is currently in develop did freeze Stan for 3 days while CVODES returned the results in a few hours, depending on the abs+rel tolerance set. I used a 1E-9, 1E-6 and a 1E-3 abs+rel tolerance. Have a look yourself, but I can't find any threshold of the ODE solver which seems to invalidate/bias results. I used 4 chains and per chain I did 500 warmups and 500 iterations. True values are

   theta_true <-  c(0.326634259978281, 0, 2.30258509299405, 1.6094379124341, 3.68887945411394)
  omega_true <-  c(0.3, 0.4)
  sigma_y_true <- 0.2

Any comments for improvements on the comparisons are very welcome.

Sebastian
metrics.pdf

Michael Betancourt

unread,
Apr 5, 2016, 9:33:07 AM4/5/16
to stan...@googlegroups.com
Regardless of the settings it looks like the MCMC estimators using CVODES vary by between 
0.1 and 1%, which is not huge but still significant.  What is particularly troubling is that this is
larger than the standard error in each of the example, which means that the CVODES variation
is the dominate source of error.  This completely changes how we would have to report fits
that use ODE solvers.  Is there any intuition to why this variation is so strong and there is no
apparently convergence?

--
You received this message because you are subscribed to the Google Groups "stan development mailing list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to stan-dev+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
<metrics.pdf>

Sebastian Weber

unread,
Apr 5, 2016, 10:48:16 AM4/5/16
to stan development mailing list
Hi!

if you want to go to that precision, we should discuss what simulation (warmup/sampling) size to use and there are a few details here which should probably be addressed, let's discuss later in the meeting. These runs take a lot of CPU time and I don't want to draw too many resources from our cluster, but I am happy to run a few well defined experiments.

From what I take, this is the type of graphs you were looking for. Right?

This morning I decreased even further the abs tolerances to 1E-2 (still all ok to me) and then to 1E-1. The 1E-1 was interesting in that now Stan is still computing, i.e. the run takes much longer in comparison to 1E-3! So there seems to be a natural limit at which point the ODE output is not any more usable for NuTS.

Sebastian

Krzysztof Sakrejda

unread,
Apr 5, 2016, 1:28:13 PM4/5/16
to stan development mailing list
Hey,

The group here at UMass is doing interacting (multi-serotype) SIR models, mostly with particle filtering at this point but wanting to use Stan for parameter estimation so I'm very interested in where this discussion goes. I should've said something at the meeting but it's a little confusing with so many people in the room. If the discussion can stay here or somewhere else that's not a private channel it would help me and the group here get a sense of where things are at. Thanks,

Krzysztof

Sebastian Weber

unread,
Apr 5, 2016, 3:20:15 PM4/5/16
to stan development mailing list
Hi!

@Krysztof: Great to know that you also have a stake at ODEs.

From the discussion earlier I figured the quickest thing to do is to repeat the plots I have done, but this time I split by chain which is now color coded. So each color corresponds to the same seed and the same chain id at different values of abs+rel tol passed to the ODE integrator (500 warmup/500 iterations). Note that the seed for the parameters is the same for all runs, only the chain_id varied while the stan_seed is also the same everywhere. Moreover, in the plots I am sending around what is declared in the Stan model as parameter is

theta, omega, sigma

The other quantities are derived from these. This time I also include the low precision runs from this morning. I think one can see a threshold which is 1E-2 where inferences seem to be consistently pulled into one direction. For the other abs/rel tolerances, from my view on it, I think it is hard to say anything. For me from 1E-6 (including) and onwards everything looks really ok. Although, one would probably have to run more chains and run an ANOVA against it to see if the abs/rel tolerance has an impact.

Cheers,
Sebastian
metrics2.pdf

Bob Carpenter

unread,
Apr 5, 2016, 3:32:37 PM4/5/16
to stan...@googlegroups.com
I take it this is CVODES and perfect behavior would
be overlayed horizontal lines at the true answer. So
the question is what's good enough?

What I'd like is some criterion by which we
can make decisions on which of these to keep:

* Boost
* CVODE
* CVODES

Asked another way, what does Sebastian need to do to convince
everyone that CVODEs should go in? At that point, is there
any reason to keep Boost or CVODE if they're slower and less
robust?

- Bob
> <metrics2.pdf>

Michael Betancourt

unread,
Apr 5, 2016, 4:02:47 PM4/5/16
to stan...@googlegroups.com
The point of this study was to identify how the CVODES numerical
error affects the MCMC output. Having the results separated by
chain is really helpful as I think it supports Bob’s hypothesis that
the small differences in the solver output as the tolerances are
varied doesn’t affect the states much but introduces enough variation
to make the RNGs incoherent. In particular, the variation across
tolerances can be accounted for by the expected MCMC standard
error.

I am not happy with setting the boost/CVODE/CVODES tolerances
around 1e-6 to 1e-8 with the tests requiring 1e-6 or 1e-7 accuracy
compared to analytic solutions. In particular, Sebastian you should
update the CVODES tests to test for, say, 1e-6 accuracy with
whatever CVODES tolerances achieve this accuracy. Then we
can merge the CVODES changes once the tests pass.

CVODES can largely replace CVODE, but we should keep the boost
solver in as it is _much_ faster for non stiff problems. It will be important
to have both until we get a solver that can switch between multiple
approaches like LSODA.

Sebastian Weber

unread,
Apr 5, 2016, 5:26:26 PM4/5/16
to stan development mailing list

So 1E-6 is the tolerance we take forward, great!

@Michael: I assume you refer to the integrate_ode_cvode_prim_test.cpp and integrate_ode_cvode_grad_rev_test.cpp tests. I will double check again, but at least integrate_ode_cvode_grad_rev_test.cpp does test against an analytical solution at a precision of at least 1E-7.

Moreover, I have ported ALL tests which have been written for CVODE to CVODES and they do pass.

Finally, I have already brought the CVODES branch up-to-date with develop such that these printf patches to CVODE are integrated into CVODES by now.

I fully agree with Michael that we need a non-stiff solver in addition to a stiff solver. This is why I enabled a solver option for the cvodes integrator object which allows to set if a non-stiff (=0) or stiff (=1) solver should be used.

I would like to expose the solver option to the user - should I change the existing integrate_ode_cvode function or add a new one integrate_ode_cvodes? The new function signature would have same arguments as integreate_ode_cvode PLUS the solver integer-argument.

Sebastian


On Tuesday, April 5, 2016 at 10:02:47 PM UTC+2, Michael Betancourt wrote:
> The point of this study was to identify how the CVODES numerical
> error affects the MCMC output. Having the results separated by
> chain is really helpful as I think it supports Bob’s hypothesis that
> the small differences in the solver output as the tolerances are
> varied doesn’t affect the states much but introduces enough variation
> to make the RNGs incoherent. In particular, the variation across
> tolerances can be accounted for by the expected MCMC standard
> error.
>
> I am not happy with setting the boost/CVODE/CVODES tolerances
> around 1e-6 to 1e-8 with the tests requiring 1e-6 or 1e-7 accuracy
> compared to analytic solutions. In particular, Sebastian you should
> update the CVODES tests to test for, say, 1e-6 accuracy with
> whatever CVODES tolerances achieve this accuracy. Then we
> can merge the CVODES changes once the tests pass.
>
> CVODES can largely replace CVODE, but we should keep the boost
> solver in as it is _much_ faster for non stiff problems. It will be important
> to have both until we get a solver that can switch between multiple
> approaches like LSODA.
>
> On Apr 5, 2016, at 8:31 PM, Bob Carpenter
>
> > I take it this is CVODES and perfect behavior would
> > be overlayed horizontal lines at the true answer. So
> > the question is what's good enough?
> >
> > What I'd like is some criterion by which we
> > can make decisions on which of these to keep:
> >
> > * Boost
> > * CVODE
> > * CVODES
> >
> > Asked another way, what does Sebastian need to do to convince
> > everyone that CVODEs should go in? At that point, is there
> > any reason to keep Boost or CVODE if they're slower and less
> > robust?
> >
> > - Bob
> >
> >> On Apr 5, 2016, at 3:20 PM, Sebastian Weber
> >>

Michael Betancourt

unread,
Apr 5, 2016, 5:48:31 PM4/5/16
to stan...@googlegroups.com
The solver option switches between Adams and Newton?

Sebastian Weber

unread,
Apr 5, 2016, 5:53:34 PM4/5/16
to stan development mailing list
Yes, it does. However, it does not switch during integration as LSODA does. So you have to know upfront if the problem is stiff or not and then choose which solver to use (or just try).

Sebastian

Bob Carpenter

unread,
Apr 5, 2016, 6:10:33 PM4/5/16
to stan...@googlegroups.com
See inline.

> On Apr 5, 2016, at 5:26 PM, Sebastian Weber <sdw....@gmail.com> wrote:

> I fully agree with Michael that we need a non-stiff solver in addition to a stiff solver.

OK, that's good to know.

> This is why I enabled a solver option for the cvodes integrator object which allows to set if a non-stiff (=0) or stiff (=1) solver should be used.

I now see from later email that it's two different solvers
built into CVODES. Do we have:

stiff: CVODES > CVODE
non-stiff: CVODES > Boost

If so, I don't see any reason to keep CVODE or the Boost solver.

> I would like to expose the solver option to the user

This is just a stiffness option --- not a direct specification
of solver.

> - should I change the existing integrate_ode_cvode function

I don't want to break backward compatibility unless absolutely
necessary, so we can't change existing function argument structure.
At most you can deprecate integrate_ode_cvode and introduce
something new.

> or add a new one integrate_ode_cvodes?

> The new function signature would have same arguments as integreate_ode_cvode PLUS the solver integer-argument.

How about

integrate_ode(...)
integrate_ode_stiff(...)

and we deprecate

integrate_ode_cvode(...)

and leave open the option to swap in new solvers under the hood (out of
user sight and function naming) later.

I think we'll need to also add a new integrate_ode() with user-defined
tolerances. So there may be four functions under the hood (stiff/non-stiff,
specify control parameters/default tolerances).

Or do you think we're really going to need to give people a whole host
of options of which solver to use? In which case we might still want
a default (one without a solver name attached) and we may want to flag
these other instances as unsupported (as in they might go away
in the future).

The primary issue is that we don't want to minimize the dependent pacakges
we have to distribute (including libraries within packages like Boost).

The secondary issue is that we don't want to give users inferior choices as
options, because their tendency will be to try them all if there's more than
one given, especially if they think some might be faster.

- Bob

Michael Betancourt

unread,
Apr 5, 2016, 6:34:32 PM4/5/16
to stan...@googlegroups.com

On Apr 5, 2016, at 11:09 PM, Bob Carpenter <ca...@alias-i.com> wrote:

I now see from later email that it's two different solvers
built into CVODES.  Do we have:

  stiff:      CVODES > CVODE
  non-stiff:  CVODES > Boost

Not quite.  Both CVODE and CVODES can switch between
stiff (Netwon) and non stiff (Adams) solvers.  I just used the
Newton solver in my original CVODE implementation as
we didn’t have any stiff capabilities yet.

I’d say we keep the boost functionality in integrate_ode
for the moment, deprecate integrate_ode_cvode and
add 

integrate_ode_cvodes(…, tolerances, max steps) // adams
integrate_ode_cvodes_stiff(…, tolerances, max steps) // newton


Bob Carpenter

unread,
Apr 5, 2016, 6:36:32 PM4/5/16
to stan...@googlegroups.com
I can live with that.

- Bob

Sebastian Weber

unread,
Apr 5, 2016, 6:43:18 PM4/5/16
to stan development mailing list
So yes, CVODES brings all those solvers we need (stiff / non-stiff).

I like the idea of the 4 proposed functions, i.e. integrate_ode/integrate_ode_stiff with/without control of abs+rel tolerances + maximum steps.

It feels bad to ditch RK45 from boost... I will try to benchmark Adams vs RK45 on a few cases to get an opinion on that, but probably you are right in that boost odeint is not needed anymore.

In what sense do you want to maintain back-ward compatibility? I mean the integrate_ode_cvode function only ever existed on the develop branch. This is why I was thinking it would be ok to change it even now.

@Michael: I think Bob meant that CVODES has a built-in non-stiff and a stiff solver. I agree that the terminology is confusing. We have the CVODES solver suite which includes Newton (stiff) and Adams Moulton (non-stiff).

You both were too fast for me in posting (and I am too tired). We are cross-posting.

I can live with that proposal too. Still, I would like to understand at what point you guys consider that we need to maintain back-wards compatibility...


On Wednesday, April 6, 2016 at 12:10:33 AM UTC+2, Bob Carpenter wrote:
> See inline.
>
> > On Apr 5, 2016, at 5:26 PM, Sebastian Weber
>

Michael Betancourt

unread,
Apr 5, 2016, 7:01:37 PM4/5/16
to stan...@googlegroups.com
Right now we maintain the existing integrate_ode. Perhaps we can
deprecate it later if CVODES non stiff proves superior.

For now just add four new functions,

integrate_ode_cvodes(…) // adams, default tolerances/max steps
integrate_ode_cvodes(…, tolerances, max steps) // adams, explicit tolerances/max steps
integrate_ode_cvodes_stiff(…, tolerances) // newton, default tolerances/max steps
integrate_ode_cvodes_stiff(…, tolerances, max steps) // newton, explicit tolerances/max steps

The defaults would just call the explicit versions. Bob — should the
defaults be defined in the language or in the math library?

In a following PR we can remove the current CVODE implementation
if we want.

Bob Carpenter

unread,
Apr 5, 2016, 7:06:34 PM4/5/16
to stan...@googlegroups.com
Either way's OK with me. The doc will probably be in
the language, so that might make a bit more sense.

- Bob

Bob Carpenter

unread,
Apr 5, 2016, 7:11:32 PM4/5/16
to stan...@googlegroups.com

> On Apr 5, 2016, at 6:43 PM, Sebastian Weber <sdw....@gmail.com> wrote:
> ...

> I will try to benchmark Adams vs RK45 on a few cases to get an opinion on that, but probably you are right in that boost odeint is not needed anymore.

Great. We can keep the old one for backward compatibility if
we introduce new function names. At least dependencies on Boost
are header only.

> In what sense do you want to maintain back-ward compatibility? I mean the integrate_ode_cvode function only ever existed on the develop branch.

It was released as part of CmdStan.

> This is why I was thinking it would be ok to change it even now.

So let's just deprecate it rather than change it.

> @Michael: I think Bob meant that CVODES has a built-in non-stiff and a stiff solver. I agree that the terminology is confusing. We have the CVODES solver suite which includes Newton (stiff) and Adams Moulton (non-stiff).
>
> You both were too fast for me in posting (and I am too tired). We are cross-posting.
>
> I can live with that proposal too. Still, I would like to understand at what point you guys consider that we need to maintain back-wards compatibility...

Everything we expose and document should be maintained until we hit
a major version number, the next one of which will be Stan 3 (and
that probably won't arrive for a year, and even then we don't need
to change things so much as deprecate them).

It is really annoying to users to find code not working that
used to work.

I am, on the other hand, OK with removing the doc for things that
are deprecated as long as they still work.

- Bob

Daniel Lee

unread,
Apr 5, 2016, 7:25:07 PM4/5/16
to stan-dev mailing list
It never made it to a released version of CmdStan. It's post 2.9.0.


Sebastian, I'll be following up with an email tonight regarding code integration with our code base. One of the difficulties of swapping out ODE integrators that are not header-only is that there's an additional burden on each of the interfaces to get the builds working.



Daniel



- Bob

Bob Carpenter

unread,
Apr 5, 2016, 7:27:33 PM4/5/16
to stan...@googlegroups.com
Great. No issue removing something we haven't released.
I don't want to get into the business of supporting snapshots
of develop!

- Bob

Sebastian Weber

unread,
Apr 6, 2016, 3:34:56 AM4/6/16
to stan development mailing list
Hi Daniel!

Yes, let me know how I can help. I noticed the printf oddness raised by Ben (R does not want printf in code) and the CVODES includes already the patches.

Other than that, I have local modifications of the build system to Stan and CmdStan which align it with CVODES instead of CVODE.

Sebastian

Sebastian Weber

unread,
Apr 6, 2016, 3:38:23 AM4/6/16
to stan development mailing list
That's why I always assumed it is safe to dump integrate_ode_cvode as it only ever lived on develop.

However, I just recalled, that the function got official mentioning on the blog:

http://andrewgelman.com/2015/12/16/working-stiff/

But I guess this does not change the fact that we can dump it now or does the blog make the function worthy to maintain?

Sebastian

On Wednesday, April 6, 2016 at 1:27:33 AM UTC+2, Bob Carpenter wrote:
> Great. No issue removing something we haven't released.
> I don't want to get into the business of supporting snapshots
> of develop!
>
> - Bob
>
> > On Apr 5, 2016, at 7:25 PM, Daniel Lee
> >

Bob Carpenter

unread,
Apr 6, 2016, 3:49:33 AM4/6/16
to stan...@googlegroups.com
I wouldn't worry about the blog citation. I think
maintaining backward compatibility for the official
releases is enough.

- Bob

Michael Betancourt

unread,
Apr 6, 2016, 3:51:32 AM4/6/16
to stan...@googlegroups.com
There are no guarantees of stability on develop, although we
strive to make it as stable as possible. Early adopters of the
CVODE implementation can add an extra “s” to their code
easily enough.
Reply all
Reply to author
Forward
0 new messages