ODE Refactoring

Sebastian Weber

unread,

Aug 13, 2016, 6:53:54 AM8/13/16

to stan development mailing list

Hi!

I have setup a wiki describing details of our ODE sub system. I hope this lays out what we have, in what direction I would like to change things and what changes I propose.

https://github.com/stan-dev/math/wiki/ODE-System-Refactor

Best,
Sebastian

Sebastian Weber

unread,

Aug 22, 2016, 3:25:35 PM8/22/16

to stan development mailing list

*bump*

It would be great to proceed on this one here. Maybe little steps first. As a first step I am suggesting to route the coupled_ode_system autodiff operations through the ode_system class and since we are changing this thing already remove the shifting operation from it. The branch which does this is here

https://github.com/stan-dev/math/tree/feature/issue-310-coupled_ode_system-refactor-only

In brief the changes are:

- move rev/{arr => mat}/functor/coupled_ode_system.hpp
- make ode_system a member of coupled_ode_system whenever any Jacobian is needed
- set the initial state differently to make shifting unnecessary
- adjust tests to include rev/mat.hpp instead of rev/arr.hpp

Hence, I leave the ODE sub system as it is designed, but route autodiff call through ode_system. The use case of this is to enable in an easy way the possibility to supply analytic Jacobians.

If this is deemed ready for inclusion, I would proceed and file a pull request.

Best,
Sebastian

Daniel Lee

unread,

Aug 30, 2016, 4:03:25 PM8/30/16

to stan-dev mailing list

Hi Sebastian,

Thanks for taking the effort to start the document. Unfortunately, it doesn't actually discuss what needs to be talked about.

I'll list a few things I would like help understanding. I really don't want to add more work to anyone's load, but we need to be able to talk about what we're trying to accomplish in order for us to make group decisions.

In the meantime, if you wanted some resources on what's useful to spec:

- http://www.joelonsoftware.com/articles/fog0000000035.html

- http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/32713.pdf

(no need to read through them, but they are useful)

Here's what I'd want to see in the doc:

What does a ode_rhs system look like coming out of Stan's compiler?

What are the methods?
What are the types of arguments?
What does it return?

For the non-stiff, rk45 integrator:

what are the requirements for the ode_rhs system?

does this need more than what the ode_rhs system looks like from Stan's compiler?
does this need Jy?
does this need Jtheta?

what does an ode_rhs look like for it to be used in the integrator?

Please provide the actual function signatures required.
If it needs a Jy, please provide the function signatures for that.
If it needs a Jtheta, please provide the function signatures for that.

after the integration, how are the results put together?

For the stiff, bdf integrator:

same questions as above

Are there additional requirements?

I think you want to design this with writing analytic Jacobians in mind. Can you provide examples?

Answering these questions will help with design. I've been staring at the proposed changes and I really can't tell what can be reused, what can't, whether the design that's there now is a union of all the requirements (which I guess it has to if it wants to work), why it's being done in one way or another.

Additional questions:

For the non-stiff, rk45 integrator, are we planning to use the ode_rhs_sensitivity system? The one that's expanded with sensitivity variables?
For the stiff, bdf integrator, are we planning to use the ode_rhs_sensistivity system?
If the requirements for the two solvers are different in terms of providing Jy vs Jtheta, could we simplify design by composition? (could we have one class that accepts the original ode_rhs and then exposes just Jy? And another that just exposes Jtheta?)
Does the Stan language need to change to provide any additional information?
Is the generated ode_system enough? Is there anything easy that could be provided? Is there anything else that's necessary?
What is required for parallelization?

We should be working on the design to match the requirements -- stuff we need to provide. If the integrators are expecting different forms of the ode_system (or coupled_ode_system or whatever), those sorts of things shouldn't really creep into this design. We can easily use the adapter pattern to handle the matching of the interface:

https://en.wikipedia.org/wiki/Adapter_pattern

I'm not sure I understand what you're trying to do with the "Proposed Changes" section within the wiki. In particular:

- what does this help with? "take away all Stan types out of objects being passed to integrators"

- why is this important? "have a single point where AD is done; in the ode_system object which then allows to plug in analytic Jacobians easily" What is being done now? What are you trying to do to make this better? (what changes are associated with this bullet point?)

- when is coupled_ode_system used? Are you planning to remove this?

For an additional requirement from the math library:

- all functions must be able to be called with all doubles. This is not optional. (it shouldn't really be a problem, right?)

I think that's it for now.

Daniel

--
You received this message because you are subscribed to the Google Groups "stan development mailing list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to stan-dev+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Sebastian Weber

unread,

Sep 3, 2016, 4:44:34 AM9/3/16

to stan development mailing list

Hi Daniel!

Thanks for taking the time to go through this. I am a bit worried where this is heading. My proposed changes to not aim at a full blown rewrite of the ODE subsystem, which I do not think is needed. As such the wiki page I wrote was not intended to be a full design spec, but rather it explains on a high-level where we are and where I think we should go. All elements are there which are needed to make an efficient ODE subsystem, at least to my understanding; we just need to make better use of our existing code. Doing so will be much less work in comparison to throw everything away and start from scratch.

In brief my goals are:

1. unify rk45 and bdf design in terms of how stuff is calculated => I am proposing effectively to go back to a design as it has been proposed by Michael, i.e. let the bdf integrator use the coupled_ode_system.

2. make it easy to provide via partial specializations analytic Jacobians to the ODE subsystem (stiff and non-stiff case). This requires to route all AD calls through the ode_system object => this is implemented in the branch I pointed out earlier in this thread. So users would have to provide a small C++ code snippet to make this work. This aligns well with the plans we have to let users include C++ code. I do not intend to make this feature accessible via the Stan language at this point; this needs further consideration which is above my head as it involves changes to the language.

I do think that the above goals can be achieved with what we have and we do not need a full blown rewrite of the ODE subsystem which would be a huge amount of work.

Answering all your other questions would make sense to do in a wiki form which I think is what we need should we really go for a full blown rewrite, so let me for the moment answer your questions to my proposed changes first:

- what does this help with? "take away all Stan types out of objects being passed to integrators"

I am intending to make the integration process free of AD calls for the case of a supplied analytic Jacobian. This way the AD stack does not get modified during integration and hence the ODE integration becomes a double only operation which allows parallelization of the ODE code. So, to refine this - using type information at compile time for whatever reason is still OK, but not more if we want to achieve that the AD stack is never touched during integration if the analytic Jacobians are given.

- why is this important? "have a single point where AD is done; in the ode_system object which then allows to plug in analytic Jacobians easily" What is being done now? What are you trying to do to make this better? (what changes are associated with this bullet point?)

In order to provide the analytic Jacobian information in an easy way we need to have a single place where derivatives are being calculated. If I wanted to supply analytic Jacobians to the coupled_ode_system, I would have to write those out in 3 different places (var/double, double/var, var/var). Having ode_system in place and making use of it wherever a Jacobian is needed allows to easily replace all those calls with analytic variants.

- when is coupled_ode_system used? Are you planning to remove this?

Not at all. In fact, I would like to route the execution during the rhs_sens call in the bdf integrator through the coupled_ode_system object, just like Michael did it. This will replace the rhs_sens* functions inside the cvodes_ode_data object.

Best,
Sebastian

Bob Carpenter

unread,

Sep 4, 2016, 4:22:20 PM9/4/16

to stan...@googlegroups.com

The motivations make sense to me.

I think Daniel may be uncomfortable with refactoring
to a capability we're not going to expose in the Stan
language directly (analytic Jacobians). I know you
really really want to get those in for the two times
speedups you've mentioned in the overall model fit
time.

What worries me is if we start urging users to do this
either through a hack or through C++, then we're going
to have to support their doing it. We have enough trouble
supporting what we have already and this just seems like
asking for trouble.

Also, it looks like a pretty major rewrite of everything
that's there. So it may not be a complete rewrite, but
it's close enough as to be painful to review. A lot of
that might be easier if we can set up google hangouts to
go through the code---it's much easier in person, but
a hangout might work.

How confident are you that if you put these changes in
that they'll be stable going forward? What we don't want
is to have this major subsystem constantly refactored.
That also seem slike asking for trouble.

- Bob

> --
> You received this message because you are subscribed to the Google Groups "stan development mailing list" group.

> To unsubscribe from this group and stop receiving emails from it, send an email to stan-dev+u...@googlegroups.com.

Sebastian Weber

unread,

Sep 5, 2016, 6:22:44 AM9/5/16

to stan development mailing list

Hi!

Its only the ode_system object which should stay constant - and I do not see why this should be changed any time soon. The ode_system object contains all what we need from the math side in order to construct everything desired, i.e. we need Jy and Jtheta of the ODE RHS.

As we are going to enable C++ extensions, this would nicely fit into the concept.

The benefits are much larger than just a >2X speedup:

1. My example was a non-stiff case which was small (N=3 states and S=7 sensitivities). This means that there is much more to get whenever system size is larger (AD cost should grow faster in comparison to a plain analytic Jacobian) and whenever the system is stiff (triggering more frequent Jacobian evaluations).

2. There is another huge opportunity here: Enabling a double-only (that is no AD use) ODE integration process clears the way for embarrassingly parallelization schemes for ODE integration. At first this may only be accessible again via C++, but it will be a huge speedup.

I expect that 1 will make stuff like PBPK possible and 2 will be a huge step forward for all among us who have a cluster available. Problems will go from statistical research to statistical practice.

In addition, at least in my field, inference is usually around a handful of ODE equation systems. As such, working out the Jacobians for those would already help a lot (it would nicely align with plans for C++ extensions). Finally, generating the code is really not so hard - there are a number of tools out there which help immensely with this.

Best,
Sebastian

PS: What I have in mind would leave the general design as it is. In fact, I would align the design of the bdf integrator with the one from rk45, but the interfaces being exposed stay (about) the same.

Bob Carpenter

unread,

Sep 5, 2016, 12:19:51 PM9/5/16

to stan...@googlegroups.com

> On Sep 5, 2016, at 12:22 PM, Sebastian Weber <sdw....@gmail.com> wrote:
>
> Hi!
>
> Its only the ode_system object which should stay constant - and I do not see why this should be changed any time soon. The ode_system object contains all what we need from the math side in order to construct everything desired, i.e. we need Jy and Jtheta of the ODE RHS.
>
> As we are going to enable C++ extensions, this would nicely fit into the concept.
>
> The benefits are much larger than just a >2X speedup:
>
> 1. My example was a non-stiff case which was small (N=3 states and S=7 sensitivities). This means that there is much more to get whenever system size is larger (AD cost should grow faster in comparison to a plain analytic Jacobian) and whenever the system is stiff (triggering more frequent Jacobian evaluations).

AD cost shouldn't grow faster in comparison to analytic.
If you look at our autodiff paper, there's not much more
headroom than a factor of 2, 4, or 8, depending on the
ratio of transcendental operations to plain arithmetic,
with more transcendental meaning a lower overhead (because
the transcendentals take up much more of the time).

> 2. There is another huge opportunity here: Enabling a double-only (that is no AD use) ODE integration process clears the way for embarrassingly parallelization schemes for ODE integration. At first this may only be accessible again via C++, but it will be a huge speedup.

We can parallelize AD at a cost of about 20% extra compute
time. We just need to declare the stacks as thread local.
We used to have that all in the code and it'd be easy to
put back.

It' the urging people to do this and then trying to support
all their questions that we're worried about, since nobody else
knows how to do this other than you.

> I expect that 1 will make stuff like PBPK possible and 2 will be a huge step forward for all among us who have a cluster available. Problems will go from statistical research to statistical practice.

That's an Andrew quote!

> In addition, at least in my field, inference is usually around a handful of ODE equation systems. As such, working out the Jacobians for those would already help a lot (it would nicely align with plans for C++ extensions). Finally, generating the code is really not so hard - there are a number of tools out there which help immensely with this.

There are other ways this can be done than writing C++.

And we don't want to link in those analytical solvers because
of licensing issues.

> Best,
> Sebastian
>
> PS: What I have in mind would leave the general design as it is. In fact, I would align the design of the bdf integrator with the one from rk45, but the interfaces being exposed stay (about) the same.

The interfaces to users is exactly the same, right? If not, this
is a much bigger deal.

- Bob

Bob Carpenter

unread,

Sep 5, 2016, 12:22:33 PM9/5/16

to stan...@googlegroups.com

> On Sep 5, 2016, at 12:22 PM, Sebastian Weber <sdw....@gmail.com> wrote:
>
> Hi!
>
> Its only the ode_system object which should stay constant - and I do not see why this should be changed any time soon. The ode_system object contains all what we need from the math side in order to construct everything desired, i.e. we need Jy and Jtheta of the ODE RHS.
>
> As we are going to enable C++ extensions, this would nicely fit into the concept.
>
> The benefits are much larger than just a >2X speedup:
>
> 1. My example was a non-stiff case which was small (N=3 states and S=7 sensitivities). This means that there is much more to get whenever system size is larger (AD cost should grow faster in comparison to a plain analytic Jacobian) and whenever the system is stiff (triggering more frequent Jacobian evaluations).

AD cost shouldn't grow faster in comparison to analytic.
If you look at our autodiff paper, there's not much more
headroom than a factor of 2, 4, or 8, depending on the
ratio of transcendental operations to plain arithmetic,
with more transcendental meaning a lower overhead (because
the transcendentals take up much more of the time).

> 2. There is another huge opportunity here: Enabling a double-only (that is no AD use) ODE integration process clears the way for embarrassingly parallelization schemes for ODE integration. At first this may only be accessible again via C++, but it will be a huge speedup.

We can parallelize AD at a cost of about 20% extra compute
time. We just need to declare the stacks as thread local.
We used to have that all in the code and it'd be easy to
put back.

It' the urging people to do this and then trying to support
all their questions that we're worried about, since nobody else
knows how to do this other than you.

> I expect that 1 will make stuff like PBPK possible and 2 will be a huge step forward for all among us who have a cluster available. Problems will go from statistical research to statistical practice.

That's an Andrew quote!

> In addition, at least in my field, inference is usually around a handful of ODE equation systems. As such, working out the Jacobians for those would already help a lot (it would nicely align with plans for C++ extensions). Finally, generating the code is really not so hard - there are a number of tools out there which help immensely with this.

There are other ways this can be done than writing C++.

And we don't want to link in those analytical solvers because
of licensing issues.

> Best,
> Sebastian
>
> PS: What I have in mind would leave the general design as it is. In fact, I would align the design of the bdf integrator with the one from rk45, but the interfaces being exposed stay (about) the same.

Sebastian Weber

unread,

Sep 5, 2016, 3:06:45 PM9/5/16

to stan...@googlegroups.com

Hi!

>> 2. There is another huge opportunity here: Enabling a double-only (that is no AD use) ODE integration process clears the way for embarrassingly parallelization schemes for ODE integration. At first this may only be accessible again via C++, but it will be a huge speedup.
>
> We can parallelize AD at a cost of about 20% extra compute
> time. We just need to declare the stacks as thread local.
> We used to have that all in the code and it'd be easy to
> put back.

Sounds easy. If so, then I would be interested to look into it… is there a stan-math version which has these declarations, I mean some git has tag?

>
> It' the urging people to do this and then trying to support
> all their questions that we're worried about, since nobody else
> knows how to do this other than you.

Well, this is all about a bit of documentation… and realistically it would mean that I (or whoever else wants to) creates some models which can be used in a ready made fashion; just like an extension. This should lead to a library of ODE models.

>
>> I expect that 1 will make stuff like PBPK possible and 2 will be a huge step forward for all among us who have a cluster available. Problems will go from statistical research to statistical practice.
>
> That's an Andrew quote!

Yup. It is - I like it lot. I should quote it next time.

>
>> In addition, at least in my field, inference is usually around a handful of ODE equation systems. As such, working out the Jacobians for those would already help a lot (it would nicely align with plans for C++ extensions). Finally, generating the code is really not so hard - there are a number of tools out there which help immensely with this.
>
> There are other ways this can be done than writing C++.

Like what? Do you mean coming up with another DSL for this?

>
> And we don't want to link in those analytical solvers because
> of licensing issues.

Now that I do not know, but I guess adding the usual BSD quote to such code (if intendend) would not hurt.

>
>> Best,
>> Sebastian
>>
>> PS: What I have in mind would leave the general design as it is. In fact, I would align the design of the bdf integrator with the one from rk45, but the interfaces being exposed stay (about) the same.
>
> The interfaces to users is exactly the same, right? If not, this
> is a much bigger deal.

Interfaces stay the same, yes.

Sebastian

>
> - Bob
>
> --
> You received this message because you are subscribed to a topic in the Google Groups "stan development mailing list" group.
> To unsubscribe from this topic, visit https://groups.google.com/d/topic/stan-dev/iGLkuhEz3yg/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to stan-dev+u...@googlegroups.com.

Bob Carpenter

unread,

Sep 6, 2016, 7:20:17 AM9/6/16

to stan...@googlegroups.com

> On Sep 5, 2016, at 9:06 PM, Sebastian Weber <sdw....@gmail.com> wrote:
>
>> ...

>> parallelize AD at a cost of about 20% extra compute
>> time. We just need to declare the stacks as thread local.
>> We used to have that all in the code and it'd be easy to
>> put back.
>
> Sounds easy. If so, then I would be interested to look into it… is there a stan-math version which has these declarations, I mean some git has tag?

Yes, that part's easy. You just need a threadlocal declaration
on the memory structures for autodiff.

It's putting everyting back together that's harder.

>>> In addition, at least in my field, inference is usually around a handful of ODE equation systems. As such, working out the Jacobians for those would already help a lot (it would nicely align with plans for C++ extensions). Finally, generating the code is really not so hard - there are a number of tools out there which help immensely with this.
>>
>> There are other ways this can be done than writing C++.
>
> Like what? Do you mean coming up with another DSL for this?

Yes, I meant within the Stan language. Functions defined in a
Stan program get translated to C++.

>
>>
>> And we don't want to link in those analytical solvers because
>> of licensing issues.
>
> Now that I do not know, but I guess adding the usual BSD quote to such code (if intendend) would not hurt.

We only want to distribute with Stan things that are
BSD or similarly licensed (no GPL, no proprietary).

Any linking that way will have to be from a package
that includes Stan as a component. BSD is OK that
way, which is why I want to maintain it.

- Bob

Daniel Lee

unread,

Sep 10, 2016, 11:31:41 PM9/10/16

to stan-dev mailing list

Hi, just finally got enough time to really digest what's going on and able to respond to the thread. I'll just comment on a few things here. I think working on the specification on the wiki is going to be more productive.

I'm on board with the motivation.

The issue I have with the code was the design. The suggested changes tangled different pieces together into a global functions. Rather than having a coherent design where different classes had different functions, it was moving towards having the objects be god objects by design and also having global functions that switch.

I understand the functional impact of the code and the motivation for writing the code in this way. There's a reason why there's a term anti-pattern and this one is called the god object. It's really tempting to design code this way and it's easy. The problem is further down the line it becomes much hard to tease anything apart and make things better.

Based on the prototype, I see an alternative way to put it together that will take care of the speed, the flexibility of writing custom C++ analytic gradients, and making it easy to maintain. There are some key pieces of information that would be useful to make the design right. The first is what the integrators are expecting as inputs. And these questions (from the original email):

If the requirements for the two solvers are different in terms of providing Jy vs Jtheta, could we simplify design by composition? (could we have one class that accepts the original ode_rhs and then exposes just Jy? And another that just exposes Jtheta?)

Is the generated ode_system enough? Is there anything easy that could be provided? Is there anything else that's necessary?

Instead of building god objects, I think we can do this through composition. That is, if we have the ode system specified by the language, say ode_rhs, then we can do something like:

template <class ode_rhs>

class Jy {

Jy(ode_rhs& ode_rhs);

___ operator()(...);

};

and the same for Jtheta.

We don't need the objects to get more complicated. We can make them more focused and compose multiple objects together and accomplish the same thing, just easier. And hopefully you can see how the default implementation would be autodiff, but could be written analytically by replacing these classes (not really hacking anything together).

I'll stop here cause I feel like I'm starting to ramble.

Daniel

- Bob

--

You received this message because you are subscribed to the Google Groups "stan development mailing list" group.

To unsubscribe from this group and stop receiving emails from it, send an email to stan-dev+unsubscribe@googlegroups.com.

Sebastian Weber

unread,

Sep 12, 2016, 6:17:10 AM9/12/16

to stan development mailing list

Hi!

>
> I'm on board with the motivation.
>

Great.

> The issue I have with the code was the design. The suggested changes tangled different pieces together into a global functions. Rather than having a coherent design where different classes had different functions, it was moving towards having the objects be god objects by design and also having global functions that switch.
>

Sigh... I have tried to discuss the center-piece, which is ode_system, a while ago. Back then it appeared that everyone was on-board. Redoing things now looks like overhead, but ok...

>
>
> Based on the prototype, I see an alternative way to put it together that will take care of the speed, the flexibility of writing custom C++ analytic gradients, and making it easy to maintain. There are some key pieces of information that would be useful to make the design right. The first is what the integrators are expecting as inputs. And these questions (from the original email):

> If the requirements for the two solvers are different in terms of providing Jy vs Jtheta, could we simplify design by composition? (could we have one class that accepts the original ode_rhs and then exposes just Jy? And another that just exposes Jtheta?)Is the generated ode_system enough? Is there anything easy that could be provided? Is there anything else that's necessary?

Both integrators need the sensitivity RHS system (our coupled_ode_system). To construct this you need Jy and/or Jtheta which depends on the specific case. The bdf integrator needs in addition Jy separately to do its stiff integration.

The only thing what would worry me when doing Jy and Jtheta via composition is performance, but you can answer this quickly: Whenever we need Jy AND Jtheta my understanding is that it is more efficient to do that in a single sweep should we use AD, right? How can we ensure this? We never need Jtheta alone. It will always be Jy only or Jy and Jtheta.

>
>
> Instead of building god objects, I think we can do this through composition. That is, if we have the ode system specified by the language, say ode_rhs, then we can do something like:
>
>
> template <class ode_rhs>
> class Jy {
> Jy(ode_rhs& ode_rhs);
> ___ operator()(...);
> };
> and the same for Jtheta.

Looks nice. As I said, we need Jy only or Jy and Jtheta, so I am not sure if we want to split out Jtheta independently.

>
> We don't need the objects to get more complicated. We can make them more focused and compose multiple objects together and accomplish the same thing, just easier. And hopefully you can see how the default implementation would be autodiff, but could be written analytically by replacing these classes (not really hacking anything together).
>

The ode_system glues together the ode RHS with Jy and Jtheta which is all we ever need... looks not so different to me... but I agree your proposal looks like a good one, although I would like to see a concrete example (not necessarily working though, of course).

>
> I'll stop here cause I feel like I'm starting to ramble.

Some things need more words... NP

How should we move forward?

Sebastian

> To unsubscribe from this group and stop receiving emails from it, send an email to stan-dev+u...@googlegroups.com.

Bob Carpenter

unread,

Sep 12, 2016, 10:36:27 AM9/12/16

to stan...@googlegroups.com

> On Sep 12, 2016, at 6:17 AM, Sebastian Weber <sdw....@gmail.com> wrote:
>
> Hi!
>
>>
>> I'm on board with the motivation.
>>
>
> Great.
>
>> The issue I have with the code was the design. The suggested changes tangled different pieces together into a global functions. Rather than having a coherent design where different classes had different functions, it was moving towards having the objects be god objects by design and also having global functions that switch.
>>
>
> Sigh... I have tried to discuss the center-piece, which is ode_system, a while ago. Back then it appeared that everyone was on-board. Redoing things now looks like overhead, but ok...

I'm sorry this feels like overhead. Think of it as proactively paying
off technical debt. It's sometimes hard to see what's being proposed
by a design, but we like to start there.

This whole project is a ton of overhead to keep it in a maintainable
state. We're being much more careful now up front given the abominations
we've created in the past. Because we've had to go through and refactor
them over and over to get them into a manageable state.

>> Based on the prototype, I see an alternative way to put it together that will take care of the speed, the flexibility of writing custom C++ analytic gradients, and making it easy to maintain. There are some key pieces of information that would be useful to make the design right. The first is what the integrators are expecting as inputs. And these questions (from the original email):
>> If the requirements for the two solvers are different in terms of providing Jy vs Jtheta, could we simplify design by composition? (could we have one class that accepts the original ode_rhs and then exposes just Jy? And another that just exposes Jtheta?)Is the generated ode_system enough? Is there anything easy that could be provided? Is there anything else that's necessary?
>
> Both integrators need the sensitivity RHS system (our coupled_ode_system). To construct this you need Jy and/or Jtheta which depends on the specific case. The bdf integrator needs in addition Jy separately to do its stiff integration.

Not sure what you mean by Jy separately. You mean
a computation that just computes Jy? That's reason enough
to break it out.

> The only thing what would worry me when doing Jy and Jtheta via composition is performance, but you can answer this quickly: Whenever we need Jy AND Jtheta my understanding is that it is more efficient to do that in a single sweep should we use AD, right? How can we ensure this? We never need Jtheta alone. It will always be Jy only or Jy and Jtheta.

Are you saying that you can save time computing Jy and Jtheta together?
That would argue for having two things:

Jy for cases where you just need Jy

(Jy, Jtheta) for when you need both

Remember that constructing the expression is the fast part of autodiff.
It's the reverse passes computing derivatives that are slow, and the
speed is based on number of expressions.

>> Instead of building god objects, I think we can do this through composition. That is, if we have the ode system specified by the language, say ode_rhs, then we can do something like:
>>
>>
>> template <class ode_rhs>
>> class Jy {
>> Jy(ode_rhs& ode_rhs);
>> ___ operator()(...);
>> };
>> and the same for Jtheta.
>
> Looks nice. As I said, we need Jy only or Jy and Jtheta, so I am not sure if we want to split out Jtheta independently.

Yes. The point is that we don't want to build up big objects
that sometimes have components in them and sometimes don't.
It leads to bad combinatorial testing properties and makes the
code hard to read because at any given point you can't tell what
you have on hand in an object.

- Bob

Sebastian Weber

unread,

Sep 12, 2016, 10:51:35 AM9/12/16

to stan development mailing list

Hi!

> >
> > Sigh... I have tried to discuss the center-piece, which is ode_system, a while ago. Back then it appeared that everyone was on-board. Redoing things now looks like overhead, but ok...
>
> I'm sorry this feels like overhead. Think of it as proactively paying
> off technical debt. It's sometimes hard to see what's being proposed
> by a design, but we like to start there.
>
> This whole project is a ton of overhead to keep it in a maintainable
> state. We're being much more careful now up front given the abominations
> we've created in the past. Because we've had to go through and refactor
> them over and over to get them into a manageable state.

No offense here... this was just a very human comment. Stan is big which is what makes it sometimes not super efficient by construction; I can well understand.

>
> Are you saying that you can save time computing Jy and Jtheta together?
> That would argue for having two things:
>
> Jy for cases where you just need Jy
>
> (Jy, Jtheta) for when you need both
>
> Remember that constructing the expression is the fast part of autodiff.
> It's the reverse passes computing derivatives that are slow, and the
> speed is based on number of expressions.

I do not know if it is more efficient to do Jy and Jtheta always in a single sweep, but I do think so. Doing Jy and Jtheta in a single sweep should be faster for AD in comparison to doing them sequentially; this would be my intuition, but this has to do with how AD works and this tricked me already a few times.

>
> >> Instead of building god objects, I think we can do this through composition. That is, if we have the ode system specified by the language, say ode_rhs, then we can do something like:
> >>
> >>
> >> template <class ode_rhs>
> >> class Jy {
> >> Jy(ode_rhs& ode_rhs);
> >> ___ operator()(...);
> >> };
> >> and the same for Jtheta.
> >
> > Looks nice. As I said, we need Jy only or Jy and Jtheta, so I am not sure if we want to split out Jtheta independently.
>
> Yes. The point is that we don't want to build up big objects
> that sometimes have components in them and sometimes don't.
> It leads to bad combinatorial testing properties and makes the
> code hard to read because at any given point you can't tell what
> you have on hand in an object.

Yup, makes sense.

Sebastian

>
> - Bob

Bob Carpenter

unread,

Sep 12, 2016, 11:08:44 AM9/12/16

to stan...@googlegroups.com

> On Sep 12, 2016, at 10:51 AM, Sebastian Weber <sdw....@gmail.com> wrote:
>
> Hi!
>
>>>
>>> Sigh... I have tried to discuss the center-piece, which is ode_system, a while ago. Back then it appeared that everyone was on-board. Redoing things now looks like overhead, but ok...
>>
>> I'm sorry this feels like overhead. Think of it as proactively paying
>> off technical debt. It's sometimes hard to see what's being proposed
>> by a design, but we like to start there.
>>
>> This whole project is a ton of overhead to keep it in a maintainable
>> state. We're being much more careful now up front given the abominations
>> we've created in the past. Because we've had to go through and refactor
>> them over and over to get them into a manageable state.
>
> No offense here... this was just a very human comment. Stan is big which is what makes it sometimes not super efficient by construction; I can well understand.

We're trying to optimize efficiency in the future. It's
easy to get things in, but it's hard to maintain them if
they're not organized modularly. Sorry that Daniel and I
are the bottlenecks here in reading all of this code. I
think we need to do more in-person/online chats and less
email, because I think things are getting lost in translation
that might be more easily sorted face-to-face.

- Bob

Reply all

Reply to author

Forward