Removing duplication of evaluation and refine logic

90 views
Skip to first unread message

JSS95

unread,
Mar 31, 2021, 2:24:38 AM3/31/21
to sympy
In SymPy, we have two ways to get the mathematical property of an object : the old assumptions system and the new assumptions system.
With old assumptions system, the property is accessed by `.is_[...]` attribute. Here, property of the object is intrinsic.
```
In [1]: x = Symbol('x', positive=True)
In [2]: x.is_positive
Out[2]: True
```
With new assumption system, the property is accessed by assumption predicate and `ask()` function. Here, property of the object can be extrinsically assumed.
```
In [1]: x = Symbol('x')
In [2]: ask(Q.integer(x), Q.even(x))
Out[2]: True
```

Old assumption system is used in evaluation logic because it is faster. In refinement logic where user can pass local assumptions, new assumption system is used.
This results the duplication in two methods. For example, evaluation logic of `Abs` is vaguely like this:
```
def eval_Abs(x):
    if x.is_nonnegative:
        return x
    ...
```
And refinement logic of `Abs` is like this:
```
def eval_Abs(x, assumptions=True):
    if ask(Q.nonnegative(x), assumptions):
        return x
    ...
```

As you can see, two codes are mostly identical and the only difference is old assumptions vs new assumptions. By introducing a function which can switch between two behaviors, we can remove this duplication.

The first suggestion is to pass a getter function.
```
def eval_Abs(x, getter):
    if getter(x, 'nonnegative'):
        return x
    ...
```
Here, we can pass `lambda x, key : getattr(x, "is_%s" % key)` to getter for evaluation, and `lambda x, key : ask(getattr(Q, key)(x), assumptions)` for refinement.
Pros :
- Easy to implement
Cons:
- `getattr(x, "is_%s" % "positive")` is slower than `x.is_positive`.
- `getter(x, '...')` everywhere looks ugly.

The second suggestion is proposed by oscarbenjamin in https://github.com/sympy/sympy/pull/21203#issuecomment-810134012.
If I understood correctly, the idea is to introduce `AssumptionsWrapper` class like this:
```
class AssumptionsWrapper(Basic):
    def __new__(cls, expr, assumptions=True):
        if assumptions == True:
            return expr
        obj = super().__new__(cls, expr, assumptions)
        obj.expr = expr
        obj.assumptions = assumptions
        return obj
    def _eval_is_positive(self):
        return ask(Q.positive(self.expr), self.assumptions)
```
And `eval_Abs` will be like this:
```
def eval_Abs(x, assumptions=True):
    _x = AssumptionsWrapper(x, assumptions)
    if _x.is_positive:
        return x
    ...
```
Pros :
- No slowdown for retrieving `.is_positive` attribute with old assumptions
Cons:
- Generating wrapper objects is slow.
- Having wrapped objects everywhere makes the code messy

And here is the third suggestion, which is to modify `ManagedProperties` (metaclass for `Basic`) to automatically define `check_[...]` methods for every keys.
The design will be like this:
```
...
    # This method will be automatically generated by metaclass
    def check_positive(self, assumptions=True):
        if assumptions is True:
            return self.is_positive
        else:
            return ask(Q.positive(self), assumptions)
```
And `eval_Abs` will be:
```
def eval_Abs(x, assumptions=True):
    if x.check_nonnegative(assumptions):
        return x
    ...
```
Pros:
- Easy to implement. I tried this, and it required only <20 more lines in core/assumptions file
- Fastest
- Cleaner code
Cons:
- Basic will be cluttered up with more methods

JSS95

unread,
Mar 31, 2021, 2:29:30 AM3/31/21
to sympy
Here is the performance for three suggestions. Suggestion 3 is the fastest, and for me it looks cleaner than the others.

suggestions.PNG

2021년 3월 31일 수요일 오후 3시 24분 38초 UTC+9에 JSS95님이 작성:

Oscar Benjamin

unread,
Mar 31, 2021, 8:13:48 AM3/31/21
to sympy
Hi Jissoo,

I think that the real question is how are we going to refactor the evaluation code to make it possible to supply an assumptions argument?

At the moment the evaluation is typically in __new__ and no __new__ methods take an assumptions argument. We can't pass an assumptions argument to __new__ because downstream code will not be expecting it.

I think that the real issue is we need a redesign of evaluation something like this:

Oscar

JSS95

unread,
Mar 31, 2021, 10:05:01 AM3/31/21
to sympy
Assumptions are used only in ask() and refine(), so  __new__() does not need to take assumptions. Besides, using new assumption in __new__() will make everything extremely slow so I'd say __new__() must not take assumptions.

My idea is to make eval() method, not __new__(), able to take assumptions. __new__() will call eval() without assumption, using old assumptions system (and thus without any speed issue).
ask() or refine() will call eval() with assumption, using new assumptions system.

IMO this topic needs to be resolved first in order to implement _eval_new() method suggested in #17280.

Jisoo

Oscar Benjamin

unread,
Mar 31, 2021, 11:29:08 AM3/31/21
to sympy
On Wed, 31 Mar 2021 at 15:05, JSS95 <jeeso...@snu.ac.kr> wrote:
>
> Assumptions are used only in ask() and refine(), so __new__() does not need to take assumptions. Besides, using new assumption in __new__() will make everything extremely slow so I'd say __new__() must not take assumptions.

Agreed

> My idea is to make eval() method, not __new__(), able to take assumptions. __new__() will call eval() without assumption, using old assumptions system (and thus without any speed issue).
> ask() or refine() will call eval() with assumption, using new assumptions system.

I agree. That's what I was hinting at.

> IMO this topic needs to be resolved first in order to implement _eval_new() method suggested in #17280.

I think that the two ideas go together. There needs to be a hook for
refine to call into and that needs to be shared with __new__. Then
what we are talking about is moving the code that is currently in
__new__ into that hook. How easy that is to do without churning
everything depends on the answer to your questions above. Version 3
that you have shown requires hefty churn because every
expr.is_positive needs to be changed to
expr.check_positive(assumptions).

Note that your benchmark is flawed since it creates an
AssumptionsWrapper each time in the timing loop. If we only want to
make a single call to is_positive then we could have a function that
does that without creating a new Basic instance. Otherwise the
AssumptionsWrapper is created so that multiple assumptions queries can
be made. We can also optimise AssumptionsWrapper to inherit the
assumptions dict of the object it wraps.

We could make a method like eval as already used by Function. Would it
need to be a classmethod or an instance method?

Note that a classmethod would only have access to the args whereas an
instance method can also use the methods, attributes and properties of
the instance. I think that the reason a classmethod is used for eval
is because it potentially avoids the cost of Basic.__new__ but I'm not
sure how significant that actually is in any reasonable benchmark
(maybe it's more significant for Function than for other Basic
classes).


Oscar

JS S

unread,
Mar 31, 2021, 9:52:44 PM3/31/21
to sympy
> Note that your benchmark is flawed since it creates an
AssumptionsWrapper each time in the timing loop. If we only want to
make a single call to is_positive then we could have a function that
does that without creating a new Basic instance. Otherwise the
AssumptionsWrapper is created so that multiple assumptions queries can
be made. We can also optimise AssumptionsWrapper to inherit the
assumptions dict of the object it wraps.

Okay. That sounds good.

> We could make a method like eval as already used by Function. Would it
need to be a classmethod or an instance method?

I have no idea for that yet. After we decide how we should implement AssumptionsWrapper here, we can discuss the design for eval() in #17280.

Jisoo

Aaron Meurer

unread,
Apr 1, 2021, 3:02:20 PM4/1/21
to sympy
Indeed. It seems you are proposing to fix the problem of there being
two different ways to do things by adding a third way
https://xkcd.com/927/. I'm -1 to adding another syntax for checking
assumptions.

I've always seen the preferred solution to this problem being to merge
the two assumptions systems, so that x.is_positive and
ask(Q.positive(x)) do the same thing. The former would just be a
convenient shorthand for the cases that it can represent. If you need
to pass in assumptions in a way that isn't supported by the "old
assumptions", you can use ask() instead. This has already partly been
done in that ask() checks the assumptions set of Symbol. The reverse
is harder because ask() is currently too slow. I think we can improve
the performance of it. One thing we need to do is devise an evaluation
scheme where queries can be faster but less accurate. That is, a fast
query might return None when a slower query might return True or
False. These fast queries would be preferred for places where
performance matters.

Probably a more relevant design question is the problem that the old
assumptions system stores assumptions on objects, whereas ask() stores
them as separate predicates. So the question of how is_... calls ask()
isn't so obvious. Previously the idea for this was global assumptions
contexts (with the assuming() context manager). I think this is a good
idea, but passing assumptions through to functions as with Oscar's
idea may also be more convenient, although it does represent a pretty
significant API shift in SymPy.

Another problem, which also relates to performance, is that we need to
significantly reduce the use of assumptions in constructors. Automatic
evaluation should only happen when it can be fast, and checking
assumptions is antithetical to that. To me, your _eval_Abs above
should actually be Abs.doit() or Abs._eval_refine().

Aaron Meurer

>
> --
> You received this message because you are subscribed to the Google Groups "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/63e8baab-e299-4c15-a4c5-bc862fe406e3n%40googlegroups.com.

Oscar Benjamin

unread,
Apr 1, 2021, 6:31:44 PM4/1/21
to sympy
On Thu, 1 Apr 2021 at 20:02, Aaron Meurer <asme...@gmail.com> wrote:
>
> On Wed, Mar 31, 2021 at 12:24 AM JSS95 <jeeso...@snu.ac.kr> wrote:
> >
> > In SymPy, we have two ways to get the mathematical property of an object : the old assumptions system and the new assumptions system.
> > With old assumptions system, the property is accessed by `.is_[...]` attribute. Here, property of the object is intrinsic.
...
> > - Basic will be cluttered up with more methods
>
> Indeed. It seems you are proposing to fix the problem of there being
> two different ways to do things by adding a third way
> https://xkcd.com/927/. I'm -1 to adding another syntax for checking
> assumptions.
>
> I've always seen the preferred solution to this problem being to merge
> the two assumptions systems, so that x.is_positive and
> ask(Q.positive(x)) do the same thing. The former would just be a
> convenient shorthand for the cases that it can represent. If you need
> to pass in assumptions in a way that isn't supported by the "old
> assumptions", you can use ask() instead. This has already partly been
> done in that ask() checks the assumptions set of Symbol. The reverse
> is harder because ask() is currently too slow.

If we have the new assumptions call the old and the old call the new
then we're left with something very circular. We could have infinite
recursion everywhere.

> I think we can improve
> the performance of it. One thing we need to do is devise an evaluation
> scheme where queries can be faster but less accurate. That is, a fast
> query might return None when a slower query might return True or
> False. These fast queries would be preferred for places where
> performance matters.

This is more or less the distinction between the two assumption
systems in how they can be used as I see it.

> Probably a more relevant design question is the problem that the old
> assumptions system stores assumptions on objects, whereas ask() stores
> them as separate predicates. So the question of how is_... calls ask()
> isn't so obvious. Previously the idea for this was global assumptions
> contexts (with the assuming() context manager). I think this is a good
> idea, but passing assumptions through to functions as with Oscar's
> idea may also be more convenient, although it does represent a pretty
> significant API shift in SymPy.

Yes, but it's a shift in the right direction.

> Another problem, which also relates to performance, is that we need to
> significantly reduce the use of assumptions in constructors. Automatic
> evaluation should only happen when it can be fast, and checking
> assumptions is antithetical to that. To me, your _eval_Abs above
> should actually be Abs.doit() or Abs._eval_refine().

I think Jissoo's proposal is that it would literally become the
implementation of _eval_refine. The fact that it would also be the
implementation of __new__ is just a reflection of the status quo. We
shouldn't get too tied up in removing assumptions-based evaluation at
the same time as trying to make the new assumptions usable. Removing
assumptions based evaluation is much harder. If we want to make a
BasicV2 and reinvent everything from scratch then we can do both. If
we want to find a path of incremental improvement then the proposals
here are one way.

While the new assumptions system is in some ways better defined it's problem is:

1) It doesn't yet have any substantial capability that improves over
the old assumptions for any significant practical use case.
2) It isn't really used anywhere (the cause of this is largely point 1),
3) The most useful part for end users is refine but making refine good
would require reimplementing all of the logic that already exists for
evaluation as a bunch of _eval_refine handlers

The proposal here is about avoiding duplicating everything from point
3. If instead we were to ignore that then a good path forwards would
be to implement inequality handling and make a good refine(Piecewise)
function. That alone would be enough to justify adding an assumptions=
keyword argument to a bunch of APIs and would make it possible to use
the new assumptions internally in a few key places.


Oscar

JSS95

unread,
Apr 1, 2021, 6:44:04 PM4/1/21
to sympy
One thing I'm concerned with is import issue. If we make ask(Q.positive(x)) and x.is_positive do the same thing
and use the former inside the evaluation methods, ask() and Q will be imported in core module. My idea to avoid
this was to pass getter function instead of assumptions, but there may be a better way.

AssumptionWrapper seems good to me, and if it gets more approval I am willing to make a prototype PR to discuss
the detailed design.

Jisoo

Aaron Meurer

unread,
Apr 1, 2021, 6:55:01 PM4/1/21
to sympy
On Thu, Apr 1, 2021 at 4:31 PM Oscar Benjamin
<oscar.j....@gmail.com> wrote:
>
> On Thu, 1 Apr 2021 at 20:02, Aaron Meurer <asme...@gmail.com> wrote:
> >
> > On Wed, Mar 31, 2021 at 12:24 AM JSS95 <jeeso...@snu.ac.kr> wrote:
> > >
> > > In SymPy, we have two ways to get the mathematical property of an object : the old assumptions system and the new assumptions system.
> > > With old assumptions system, the property is accessed by `.is_[...]` attribute. Here, property of the object is intrinsic.
> ...
> > > - Basic will be cluttered up with more methods
> >
> > Indeed. It seems you are proposing to fix the problem of there being
> > two different ways to do things by adding a third way
> > https://xkcd.com/927/. I'm -1 to adding another syntax for checking
> > assumptions.
> >
> > I've always seen the preferred solution to this problem being to merge
> > the two assumptions systems, so that x.is_positive and
> > ask(Q.positive(x)) do the same thing. The former would just be a
> > convenient shorthand for the cases that it can represent. If you need
> > to pass in assumptions in a way that isn't supported by the "old
> > assumptions", you can use ask() instead. This has already partly been
> > done in that ask() checks the assumptions set of Symbol. The reverse
> > is harder because ask() is currently too slow.
>
> If we have the new assumptions call the old and the old call the new
> then we're left with something very circular. We could have infinite
> recursion everywhere.

Only if deduction is implemented in a naive way, by recursively
calling the top-level assumptions APIs (which it currently
unfortunately is in the ask handlers). Deduction should be a
completely separate thing, which would be done with all the known
facts collected from whatever source.

>
> > I think we can improve
> > the performance of it. One thing we need to do is devise an evaluation
> > scheme where queries can be faster but less accurate. That is, a fast
> > query might return None when a slower query might return True or
> > False. These fast queries would be preferred for places where
> > performance matters.
>
> This is more or less the distinction between the two assumption
> systems in how they can be used as I see it.

I agree that we can make it so that the current "old assumptions"
syntax only does a limited deduction (it has to be limited anyway
because it is limited in what it can even represent). This makes sense
given its prevalence, that we would want to keep it as one of the
faster APIs.

>
> > Probably a more relevant design question is the problem that the old
> > assumptions system stores assumptions on objects, whereas ask() stores
> > them as separate predicates. So the question of how is_... calls ask()
> > isn't so obvious. Previously the idea for this was global assumptions
> > contexts (with the assuming() context manager). I think this is a good
> > idea, but passing assumptions through to functions as with Oscar's
> > idea may also be more convenient, although it does represent a pretty
> > significant API shift in SymPy.
>
> Yes, but it's a shift in the right direction.
>
> > Another problem, which also relates to performance, is that we need to
> > significantly reduce the use of assumptions in constructors. Automatic
> > evaluation should only happen when it can be fast, and checking
> > assumptions is antithetical to that. To me, your _eval_Abs above
> > should actually be Abs.doit() or Abs._eval_refine().
>
> I think Jissoo's proposal is that it would literally become the
> implementation of _eval_refine. The fact that it would also be the
> implementation of __new__ is just a reflection of the status quo. We
> shouldn't get too tied up in removing assumptions-based evaluation at
> the same time as trying to make the new assumptions usable. Removing
> assumptions based evaluation is much harder. If we want to make a
> BasicV2 and reinvent everything from scratch then we can do both. If
> we want to find a path of incremental improvement then the proposals
> here are one way.

I'm not sure if we can separate them, given the performance
implications of calling assumptions in constructors. The more the
assumptions are doing, the more that will affect any constructor that
calls them.

>
> While the new assumptions system is in some ways better defined it's problem is:
>
> 1) It doesn't yet have any substantial capability that improves over
> the old assumptions for any significant practical use case.

I disagree with this. Even though refine is almost completely useless,
the fact that you can do refine(sqrt((x - 1)**2), Q.positive(x - 1))
and it works is a huge boon to people in practice.

> 2) It isn't really used anywhere (the cause of this is largely point 1),

Just the one refine(sqrt((x - 1)**2), Q.positive(x - 1)) example,
which is almost the only thing in refine() currently that actually
works, is used. I've seen many instances in stackoverflow and
elsewhere where users need to eliminate square roots and refine is one
way to do this.

> 3) The most useful part for end users is refine but making refine good
> would require reimplementing all of the logic that already exists for
> evaluation as a bunch of _eval_refine handlers

Agreed.

>
> The proposal here is about avoiding duplicating everything from point
> 3. If instead we were to ignore that then a good path forwards would
> be to implement inequality handling and make a good refine(Piecewise)
> function. That alone would be enough to justify adding an assumptions=
> keyword argument to a bunch of APIs and would make it possible to use
> the new assumptions internally in a few key places.

I'm not sure I follow what you are saying here. Making refine and the
assumptions work better on inequalities would also be a huge boon,
although orthogonal to the benefits of the sorts of things that
currently work in constructors (or at least work in cases simple
enough to represent in the old assumptions).

Aaron Meurer

>
>
> Oscar
>
> --
> You received this message because you are subscribed to the Google Groups "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/CAHVvXxRnmwJ9y-9pOEdgnQ-2HeQ4RkdhAvCs62YdmBY%2B07uOrA%40mail.gmail.com.

Aaron Meurer

unread,
Apr 1, 2021, 6:58:49 PM4/1/21
to sympy
On Thu, Apr 1, 2021 at 4:44 PM JSS95 <jeeso...@snu.ac.kr> wrote:
>
> One thing I'm concerned with is import issue. If we make ask(Q.positive(x)) and x.is_positive do the same thing
> and use the former inside the evaluation methods, ask() and Q will be imported in core module. My idea to avoid
> this was to pass getter function instead of assumptions, but there may be a better way.

I'm not sure I follow why this is a huge concern. Is it a question of
avoiding circular imports? Anyway, if the two systems are merged, then
ask() and Q would only be needed for cases where the query can't be
represented using the old assumptions. The simpler is_predicate or
Symbol('x', predicate=True) syntax is much easier to type for the
cases where it can represent what you want to do.

>
> AssumptionWrapper seems good to me, and if it gets more approval I am willing to make a prototype PR to discuss
> the detailed design.

I hadn't noticed this in your original email. I agree we can
experiment with it. It does seem like an interesting idea, and it
doesn't modify any existing core or assumptions code, so there's no
cost to playing with it and seeing if it works.

Aaron Meurer

>
> Jisoo
>
> --
> You received this message because you are subscribed to the Google Groups "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/60b55bc5-f45a-407d-a6dc-3afb59646a87n%40googlegroups.com.

Oscar Benjamin

unread,
Apr 1, 2021, 7:19:36 PM4/1/21
to sympy
On Thu, 1 Apr 2021 at 23:54, Aaron Meurer <asme...@gmail.com> wrote:
>
> On Thu, Apr 1, 2021 at 4:31 PM Oscar Benjamin
> <oscar.j....@gmail.com> wrote:
> >
> > On Thu, 1 Apr 2021 at 20:02, Aaron Meurer <asme...@gmail.com> wrote:
> >
> > If we have the new assumptions call the old and the old call the new
> > then we're left with something very circular. We could have infinite
> > recursion everywhere.
>
> Only if deduction is implemented in a naive way, by recursively
> calling the top-level assumptions APIs (which it currently
> unfortunately is in the ask handlers). Deduction should be a
> completely separate thing, which would be done with all the known
> facts collected from whatever source.

Collecting "all known facts" can be very expensive. The advantage of
recursive calling is that we only collect "all needed facts".

> > > Another problem, which also relates to performance, is that we need to
> > > significantly reduce the use of assumptions in constructors. Automatic
> > > evaluation should only happen when it can be fast, and checking
> > > assumptions is antithetical to that. To me, your _eval_Abs above
> > > should actually be Abs.doit() or Abs._eval_refine().
> >
> > I think Jissoo's proposal is that it would literally become the
> > implementation of _eval_refine. The fact that it would also be the
> > implementation of __new__ is just a reflection of the status quo. We
> > shouldn't get too tied up in removing assumptions-based evaluation at
> > the same time as trying to make the new assumptions usable. Removing
> > assumptions based evaluation is much harder. If we want to make a
> > BasicV2 and reinvent everything from scratch then we can do both. If
> > we want to find a path of incremental improvement then the proposals
> > here are one way.
>
> I'm not sure if we can separate them, given the performance
> implications of calling assumptions in constructors. The more the
> assumptions are doing, the more that will affect any constructor that
> calls them.

Currently assumptions are one of many things that can be slow in
constructors. There are others e.g. sorting args, computing
simplification etc. In the longer term I do wonder if BasicV2 is the
way to go.

In the meantime it would be good to make the new assumptions used and
usable and being able to patch their capabilities into evaluation
would make a huge difference if it boosts the capabilities of refine.

> > While the new assumptions system is in some ways better defined it's problem is:

> > 2) It isn't really used anywhere (the cause of this is largely point 1),
>
> Just the one refine(sqrt((x - 1)**2), Q.positive(x - 1)) example,
> which is almost the only thing in refine() currently that actually
> works, is used. I've seen many instances in stackoverflow and
> elsewhere where users need to eliminate square roots and refine is one
> way to do this.

I mean that the new assumptions aren't used anywhere in the sympy
codebase. There would be more of an incentive to do so if they had
more generically useful capabilities. Your example makes sense for
interactive use but not as part of a programmatica API.

> > The proposal here is about avoiding duplicating everything from point
> > 3. If instead we were to ignore that then a good path forwards would
> > be to implement inequality handling and make a good refine(Piecewise)
> > function. That alone would be enough to justify adding an assumptions=
> > keyword argument to a bunch of APIs and would make it possible to use
> > the new assumptions internally in a few key places.
>
> I'm not sure I follow what you are saying here. Making refine and the
> assumptions work better on inequalities would also be a huge boon,
> although orthogonal to the benefits of the sorts of things that
> currently work in constructors (or at least work in cases simple
> enough to represent in the old assumptions).

What I mean is that just being able to do something like
expr = Piecewise((x, abs(x) < 1), (y, True))
expr = refine(expr, (0 < x) & (x < 1))
would be really useful and would make it useful to have an assumptions
argument to integrate or other functions. That is orthogonal to the
discussion here but it would make the system usable in practice.


Oscar

JSS95

unread,
Apr 3, 2021, 11:51:43 AM4/3/21
to sympy
Here is the PR which implements AssumptionsHandler.

Reply all
Reply to author
Forward
0 new messages