A plan for the assumptions

217 views
Skip to first unread message

Aaron Meurer

unread,
Aug 24, 2014, 1:46:40 PM8/24/14
to sy...@googlegroups.com
Some of you may have seen Sergey started a pull request to remove uses of the new assumptions in the code, because it's confusing to users who set the old assumptions and it doesn't work.

He asked me there

How the random mix of old and new assumptions can look for you as this transition (or a step in this direction, whatever)? I think - this only can surprise our users. And that's a bad idea to do so, esp. in the core modules (e.g., the sets module).

I wrote up a response there, but then I realized that it would be better to discuss this on the mailing list, so I am posting it here instead. 

You make a good point. However, if you are going to spend time on this, it would be better spent improving the new assumptions (and let's finally merge my branch as a start to this), or merging the two. I still maintain that this pull request is a step backwards. With these changes, the new assumptions aren't used at all, meaning they are tested much less, and more importantly, there is no more motivation to improve them. The old assumptions syntax isn't going away, so why don't we focus on making it so that the new assumptions can read the old assumptions. 

This is what @mrocklin and I attempted to do last summer (https://github.com/sympy/sympy/pull/2210). It didn't work, because we tried to do a two-way merge of the assumptions, i.e., so that both `x.is_integer` and `ask(Q.Integer(x))` were always identical (the logic I used was https://github.com/sympy/sympy/pull/2210/files#r5463892, which required both assumptions to be the same). 

Maybe an easier approach would be along the lines of what I have done at #2508. Have ask call a bunch of things, in order, until it gets an answer. In my branch, `ask` calls the current new assumptions, and then my satask. We could make it so that ask first calls the old assumptions, then the new assumptions. Furthermore, the new assumptions should gain trivial handlers for Symbols to so that `ask(Q.real(Symbol('x', real=True)))` is True. 

The first step though will be to go through all the assumptions and make sure they are identical in all systems. This is a good opportunity to clean up some of the contradictory or unexpected assumptions, so it should involve some community input. So I would do it on the mailing list. 

So I what I would recommend doing:

- [ ] Merge my branch #2508. We can maybe wait until after the release to do this so it is minimally disruptive, but this is important, as it will be much easier to add new facts to the new assumptions with satask (and I also want people to start thinking about how to improve the speed of it).
- [ ] Go through all the assumptions facts and make a note of what is different, and also what is confusing. A lot of this is already written down in various places (https://github.com/sympy/sympy/issues/5976, https://github.com/sympy/sympy/pull/2538, probably some of the issues mentioned in https://github.com/sympy/sympy/issues/6730, https://github.com/sympy/sympy/pull/2508/files#diff-e0f95401c86d07d90903363122990de0R172, https://github.com/sympy/sympy/wiki/assumptions_handlers), but we should gather them all down in one place. I think the main thing will be how we handle infinity. 
- [ ] Discuss with the community on the mailing list how we should handle the differences, until we come to a consensus. 
- [ ] Make the new assumptions call the old assumptions first. Also make the handlers access the old assumptions, so that everything sees `Symbol('x', real=True)` as real. This will be the hard part, and lead to lots of test failures. It might be worth trying this first, to really see what facts are different in the two systems.

Note that absent from my plan is making `x.is_real` call the new assumptions. This is what I tried at #2210, and it failed, because it requires everything to work perfectly, and there are so many places that the assumptions are used that it's too much work to get it all in one go. We should get the above working first before we consider this. I don't think there's a big issue with this, because already `ask(Q.real(x))` and `x.is_real` are different. My proposal is to have the first call the second but not the other way around.  

Aaron Meurer

Sergey Kirpichev

unread,
Aug 25, 2014, 5:37:56 AM8/25/14
to sy...@googlegroups.com


On Sunday, August 24, 2014 9:46:40 PM UTC+4, Aaron Meurer wrote:
You make a good point. However, if you are going to spend time on this, it would be better spent improving the new assumptions (and let's finally merge my branch as a start to this), or merging the two. I still maintain that this pull request is a step backwards.

It's a much more difficult task then the simple pr in question.  But I did some attempts
in the past, see my branches assumptions-merge-new-old-DRAFT and assumptions-merge-new-old-DRAFT-REBASED (recent rebase).
 
With these changes, the new assumptions aren't used at all, meaning they are tested much less, and more importantly, there is no more motivation to improve them.

I don't think they are much tested now.  For example, I have seen only one test failure after my modifications in the sets module.
 
I fail to see reasons to keep such confusing code in the master.  We should use one interface for the assumptions. In fact, this can simplify the mirgation too.

F. B.

unread,
Aug 25, 2014, 12:41:09 PM8/25/14
to sy...@googlegroups.com
What about creating a new branch on github to gradually integrate the new assumptions? This could be merged with master once the integration is complete.

Aaron Meurer

unread,
Aug 25, 2014, 8:10:28 PM8/25/14
to sy...@googlegroups.com
Keeping a long running branch is not the way to go. The only way to get such a large change is to split it up into a much smaller changes that can be merged into master as soon as possible.

Aaron Meurer



On Mon, Aug 25, 2014 at 11:41 AM, F. B. <franz....@gmail.com> wrote:
What about creating a new branch on github to gradually integrate the new assumptions? This could be merged with master once the integration is complete.

--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
To post to this group, send email to sy...@googlegroups.com.
Visit this group at http://groups.google.com/group/sympy.
To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/caf7d4ae-82d9-499e-b46b-1067414b9287%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Chris Smith

unread,
Aug 26, 2014, 1:32:29 AM8/26/14
to sy...@googlegroups.com
Where is it clearly stated what the difference is in the two systems? One thing I have noted is what a fog it is writing the routines since it is not clear what the routine must do and what will mathemagically be handled by the inference system -- and that, especially, I do not understand the difference in the two. I know they keep evaluating propositions until a clear answer is found but I don't know if there is a significant difference in the two in that regard. I know that the new system can handle "is_true" expressions and can handle assumptions involving Or (while the old assumptions only allow And-based assumptions: var('x', integer=True, positive=True).

One thing I noted today was that the old system can make a calculation based on arg(expression) which depends on the old assumptions. I don't know how that could be done in the new system. Specifically, I needed to compute, for a power, arg(base)*exponent/pi and see whether that was an integer or half integer. But arg() needs assumptions to return a value and that's where the problem arises. arg uses the old assumptions so if you try to do the calculation in the new assumption system (where the symbolic assumptions are not available with the symbols) it fails.

As far as passing symbol-based assumptions along to the old system from the new, wouldn't something like this be a start?

>>> def assumption(a):
...     if a.func is Not:
...         b = False
...         a = a.args[0]
...     else:
...         b = True
...     if isinstance(a, AppliedPredicate):
...         return a.func.name, a.args[0], b
...
>>> def update(k, v, d):
...     if v is not None:
...         if k in d:
...             assert v == d[k]
...         else:
...             d[k] = v
...
>>> def know(a):
...     s = {}
...     d = {}
...     if a.func is And:
...         p = a.args
...     elif isinstance(a, AppliedPredicate):
...         p = [a]
...     else:
...         raise NotImplementedError(a)
...     for i in p:
...         k, x, v = assumption(i)
...         if x not in s:
...             s[x] = {}
...         update(k, v, s[x])
...     return s
...
>>> know(Q.positive(x)&Q.negative(y))
{x: {'positive': True}, y: {'negative': True}}

Sergey Kirpichev

unread,
Aug 26, 2014, 10:07:26 AM8/26/14
to sy...@googlegroups.com
On Tuesday, August 26, 2014 9:32:29 AM UTC+4, Chris Smith wrote:
I know that the new system can handle "is_true" expressions and can handle assumptions involving Or (while the old assumptions only allow And-based assumptions: var('x', integer=True, positive=True).

Not just Or-ed, but the new system allow to specify arbitrary assumptions on expressions.  Something like Q.positive(log(x)).
 
One thing I noted today was that the old system can make a calculation based on arg(expression) which depends on the old assumptions. I don't know how that could be done in the new system.

I think, exactly as in the old one.  No difference.  But, you will have to use different syntax to query assumptions while computing arg(expr).  ask(Q.positive(foo)) instead of foo.is_positive and so on.

Or I don't understand your problem.

Aaron Meurer

unread,
Aug 26, 2014, 2:35:30 PM8/26/14
to sy...@googlegroups.com
On Tue, Aug 26, 2014 at 12:32 AM, Chris Smith <smi...@gmail.com> wrote:
>
> Where is it clearly stated what the difference is in the two systems?


It's not written down anywhere, that's why I suggested my first few
bullet points. Ideally there should be no differences, at least
mathematically.

>
> One thing I have noted is what a fog it is writing the routines since it is not clear what the routine must do and what will mathemagically be handled by the inference system -- and that, especially, I do not understand the difference in the two. I know they keep evaluating propositions until a clear answer is found but I don't know if there is a significant difference in the two in that regard. I know that the new system can handle "is_true" expressions and can handle assumptions involving Or (while the old assumptions only allow And-based assumptions: var('x', integer=True, positive=True).
>

The new system is, ironically, less smart about inferences. It only
actually makes automatic deductions based on the simple known facts at
https://github.com/sympy/sympy/blob/56b613c1394e7b86646574133c0cdfdb7a4cf064/sympy/assumptions/ask.py#L327-L374
(which are compiled into ask_generated.py).

The old assumptions always make deductions from the facts at
https://github.com/sympy/sympy/blob/56b613c1394e7b86646574133c0cdfdb7a4cf064/sympy/core/assumptions.py#L68-L102.

The main difference (aside from the difference in the facts
themselves) is that the old assumptions always compute every fact
every time any assumptions are used on an object. The new assumptions
compute known_facts from the SAT solver, but this is only the very
simple facts in that known_facts dict. Most of it is in the handlers
module, which doesn't use the SAT solver at all.

The new assumptions handlers don't (and really can't) use any
inference, which is why I feel that they are broken. They do do some
recursive calling of ask and other handlers, but this is not
inference. Inference works with any kind of logical expression.

The example of this I like to use is ask(Q.zero(x*y), Q.zero(x) |
Q.zero(y)). It is impossible for the handlers to give True for this,
even though that is the correct answer, because the Mul zero handler
has something like any(ask(Q.zero(i)) for i in expr.args). But given
the assumption Q.zero(x) | Q.zero(y), both Q.zero(x) and Q.zero(y) are
None. You know that one is zero, but you don't know which one.

That's what my branch does better. To write down a fact, you need to
understand it as a logical statement, like (Mul, Equivalent(Q.zero,
AnyArgs(Q.zero))) (for a Mul object, the expression being zero is
logically equivalent to any of the args being zero).

There is a cost to understanding assumptions as precise logical
expressions, but the advantage is that *all* possible inferences are
done. If satask doesn't reach a conclusion, it means that it cannot be
deduced from the facts that it has.

There are other advantages as well, like automatic detection of
logical inconsistencies and clear separation of the facts (data) and
the operations on those facts (code).

>
> One thing I noted today was that the old system can make a calculation based on arg(expression) which depends on the old assumptions. I don't know how that could be done in the new system. Specifically, I needed to compute, for a power, arg(base)*exponent/pi and see whether that was an integer or half integer. But arg() needs assumptions to return a value and that's where the problem arises. arg uses the old assumptions so if you try to do the calculation in the new assumption system (where the symbolic assumptions are not available with the symbols) it fails.

This is what I am suggesting to fix with my last bullet point above.
The new assumptions handlers should pull in facts from the old
assumptions, so that ask(Q.real(Symbol('x', real=True))) gives True.
The satask system from my branch handles this at
https://github.com/sympy/sympy/pull/2508/files#diff-e0f95401c86d07d90903363122990de0R172,
although that could be made much more streamlined (especially if the
old and new assumptions had the same definitions!).

>
> As far as passing symbol-based assumptions along to the old system from the new, wouldn't something like this be a start?

I'm not clear where these functions would be used. My suggestion is to
add code to sympy/assumptions/handlers so that expr.is_integer would
be checked for ask(Q.integer(expr)).

Aaron Meurer
> --
> You received this message because you are subscribed to the Google Groups "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
> To post to this group, send email to sy...@googlegroups.com.
> Visit this group at http://groups.google.com/group/sympy.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/63b54e60-6538-4c7e-8f1f-155990dd6ac1%40googlegroups.com.

Chris Smith

unread,
Aug 26, 2014, 11:03:50 PM8/26/14
to sy...@googlegroups.com

Or I don't understand your problem [about the problem with calling for arg in the new assumptions handler]

The problem is that arg is written using old assumptions, so unless you rewrite arg to use the new assumptions and pass assumptions along to it, you won't get the benefit of its computations. Here's a trivial example using foo instead of arg but hopefully the point comes across:

def foo(e):
    if e.is_positive:  return True
    return False

Say that in a NEW handler you want to try

   ...
   if foo(e): return 1
   else: return 2
   ...

You will always get 2 if e is a symbol because when you call, say,

ask(Q.Bar(x), Q.positive(x))

x has no direct assumption so False will be returned from foo and thus 2 will result from the ask. So unless we either 1) assign assumptions to symbols (but that will not be sufficient to handle things like Q.positive(x + 2) or 2) start passing assumptions to all functions which have a dependence on assumptions (so something like def foo(e) -> def foo(e, assumptions={})) I don't know how we are going to get the old-assumptions-API-based functions to start working with the new assumptions. What am I missing? How will foo above need to be modified so it works?

Chris Smith

unread,
Aug 26, 2014, 11:11:59 PM8/26/14
to sy...@googlegroups.com

On Tuesday, August 26, 2014 1:35:30 PM UTC-5, Aaron Meurer wrote:
  I like to use is ask(Q.zero(x*y), Q.zero(x) | Q.zero(y))

When encountering an Or I would have done something like return all(ask(Q.zero(x*y), clause) for clause in Or.args).

Aaron Meurer

unread,
Aug 26, 2014, 11:15:28 PM8/26/14
to sy...@googlegroups.com
This is asking for the reverse of what I am suggesting. I am suggesting making the new assumptions recognize assumptions made in the old assumptions. You want the old assumptions to recognize assumptions made in the new assumptions. 

I tried doing it both ways, and it failed. It's too much work. I think just doing it in one direction is better, as then it only affects calls to ask(). We could even limit it further so that ask() only looks at the old assumptions on symbols. I suspect this change would lead to no test failures. 

Aaron Meurer

--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
To post to this group, send email to sy...@googlegroups.com.
Visit this group at http://groups.google.com/group/sympy.

Aaron Meurer

unread,
Aug 26, 2014, 11:35:48 PM8/26/14
to sy...@googlegroups.com
You mean any, not all, right?

Anyhow, you're right, that does work. I got the example backwards (because Q.zero(x*y) is equivalent to Q.zero(x) | Q.zero(y)). What it should be is ask(Q.zero(x) | Q.zero(y), Q.zero(x*y)). 

Simply put, satask lets you ask about things that aren't single predicates. You can imagine this comes in handy. For instance, we might say x**2 is real if x is real or if x is imaginary. So we would ask(Q.real(x) | Q.imaginary(x), assumptions).  This is not the same as ask(Q.real(x), assumptions) or ask(Q.imaginary(x), assumptions), because the latter is weaker. For example, if x = I**n where n is an integer, we know that x is real or x is imaginary, but we don't know which, so both ask(Q.real(I**n), Q.integer(n)) and ask(Q.imaginary(I**n), Q.integer(n)) would give None, but ask(Q.real(I**n) | Q.imaginary(I**n), Q.integer(n)), assuming the system is powerful enough, should give True. satask is a system that is powerful enough. 

Aaron Meurer

--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
To post to this group, send email to sy...@googlegroups.com.
Visit this group at http://groups.google.com/group/sympy.

Aaron Meurer

unread,
Aug 26, 2014, 11:44:16 PM8/26/14
to sy...@googlegroups.com
On Tue, Aug 26, 2014 at 10:15 PM, Aaron Meurer <asme...@gmail.com> wrote:



On Tue, Aug 26, 2014 at 10:03 PM, Chris Smith <smi...@gmail.com> wrote:

Or I don't understand your problem [about the problem with calling for arg in the new assumptions handler]

The problem is that arg is written using old assumptions, so unless you rewrite arg to use the new assumptions and pass assumptions along to it, you won't get the benefit of its computations. Here's a trivial example using foo instead of arg but hopefully the point comes across:

def foo(e):
    if e.is_positive:  return True
    return False

Say that in a NEW handler you want to try

   ...
   if foo(e): return 1
   else: return 2
   ...

You will always get 2 if e is a symbol because when you call, say,

ask(Q.Bar(x), Q.positive(x))

x has no direct assumption so False will be returned from foo and thus 2 will result from the ask. So unless we either 1) assign assumptions to symbols (but that will not be sufficient to handle things like Q.positive(x + 2) or 2) start passing assumptions to all functions which have a dependence on assumptions (so something like def foo(e) -> def foo(e, assumptions={})) I don't know how we are going to get the old-assumptions-API-based functions to start working with the new assumptions. What am I missing? How will foo above need to be modified so it works?

This is asking for the reverse of what I am suggesting. I am suggesting making the new assumptions recognize assumptions made in the old assumptions. You want the old assumptions to recognize assumptions made in the new assumptions. 

I tried doing it both ways, and it failed. It's too much work. I think just doing it in one direction is better, as then it only affects calls to ask(). We could even limit it further so that ask() only looks at the old assumptions on symbols. I suspect this change would lead to no test failures.

I gave this a shot, and, well, the tests take a long time to run. But so far, there has only been one test failures, because x.is_commutative is true by default, so the test ask(Q.commutative(x), ~Q.commutative(x)) fails. I guess one would also have to allow the new assumptions to override the old, or else check that they don't conflict. 

I pushed a proof of concept to https://github.com/sympy/sympy/pull/7925. It probably shouldn't be merged yet, one because of the issue I noted above, and two because the new and old assumptions still disagree on some things.  I hope this at least gives an idea of what I am talking about, though. 

Aaron Meurer

Sergey Kirpichev

unread,
Aug 27, 2014, 6:20:55 AM8/27/14
to sy...@googlegroups.com
>> Or I don't understand your problem [about the problem with calling for
>> arg in the new assumptions handler]
>>
> The problem is that arg is written using old assumptions, so unless you
> rewrite arg to use the new assumptions and pass assumptions along to it,

Yes. In particular, if we add a compatibility layer (is_foo predicates
call the new assumptions). Nothing new here, of course.

> x has no direct assumption so False will be returned from foo and thus 2
> will result from the ask. So unless we either 1) assign assumptions to
> symbols (but that will not be sufficient to handle things like Q.positive(x
> + 2)

Or add new global assumptions in the Symbol constructor (see my last PR). Of
course, this is sufficient to handle Q.positive(x + 2):

In [1]: p = Symbol('p', positive=True)

In [2]: ask(Q.positive(p + 2))
Out[2]: True

In [3]: ask(Q.positive(p - 2))

In [4]:

> or 2) start passing assumptions to all functions which have a
> dependence on assumptions (so something like def foo(e) -> def foo(e,
> assumptions={})) I don't know how we are going to get the
> old-assumptions-API-based functions to start working with the new
> assumptions. What am I missing? How will foo above need to be modified so
> it works?

To alter the global_assumptions context we have context managers.

An example from my branch:

In [10]: r = Symbol('r')

In [11]: ask(Q.real(r))

In [12]: with assuming(Q.real(r)):
...: print(ask(Q.real(r)))
...:
True

In [13]: ask(Q.real(r))

In [14]:


But, optionally, we can provide also assumptions kwarg to override
the current assumptions context.

Sergey Kirpichev

unread,
Aug 30, 2014, 7:21:36 AM8/30/14
to sy...@googlegroups.com
On Sunday, August 24, 2014 9:46:40 PM UTC+4, Aaron Meurer wrote:
Some of you may have seen Sergey started a pull request to remove uses of the new assumptions in the code, because it's confusing to users who set the old assumptions and it doesn't work.

BTW, if we decide to use this approach - I think that change is a good candidate for the next release.

So, please help with review.
 
You make a good point. However, if you are going to spend time on this, it would be better spent improving the new assumptions (and let's finally merge my branch as a start to this), or merging the two. I still maintain that this pull request is a step backwards.

We must offer for users a consistent interface.  But if some parts use foo.is_real assumption, and some ask(Q.real(foo)) - this is not the case. Second, I don't think there are regressions in the functionality of sympy. 
 
With these changes, the new assumptions aren't used at all, meaning they are tested much less, and more importantly, there is no more motivation to improve them.

We still have refine* helpers, we have some code in the unify module.  And we have our tests for the assumptions module.  Also, it seems we have bad test coverage for the code, changed in my pr.
 
The old assumptions syntax isn't going away, so why don't we focus on making it so that the new assumptions can read the old assumptions.

Yes, but this variant is much more complex then my PR.  Basically, we must first make our assumptions identical in both systems.
 
The first step though will be to go through all the assumptions and make sure they are identical in all systems. This is a good opportunity to clean up some of the contradictory or unexpected assumptions, so it should involve some community input. So I would do it on the mailing list. 

I think, first we should cleanup and document (see https://github.com/sympy/sympy/pull/7936) old assumptions.  Then freeze old assumptions.  No new facts, no new features - only bugfixes.  First step.
 
- [ ] Go through all the assumptions facts and make a note of what is different, and also what is confusing. A lot of this is already written down in various places (https://github.com/sympy/sympy/issues/5976, https://github.com/sympy/sympy/pull/2538, probably some of the issues mentioned in https://github.com/sympy/sympy/issues/6730, https://github.com/sympy/sympy/pull/2508/files#diff-e0f95401c86d07d90903363122990de0R172, https://github.com/sympy/sympy/wiki/assumptions_handlers), but we should gather them all down in one place. I think the main thing will be how we handle infinity. 

We must do all this in the old assumptions first (my first step above).

- [ ] Discuss with the community on the mailing list how we should handle the differences, until we come to a consensus. 
- [ ] Make the new assumptions call the old assumptions first. Also make the handlers access the old assumptions, so that everything sees `Symbol('x', real=True)` as real. This will be the hard part, and lead to lots of test failures. It might be worth trying this first, to really see what facts are different in the two systems.

Second step. We can make the new assumptions call the old assumptions on symbols (i.e. ask(Q.real(x)) returns True for x = Symbol('x', real=True)).  And we can start add new facts in a way, fully compatible with the old system.

Only at this stage we can start experimenting with new assumptions in the sympy codebase (i.e. not only in sympy.assumptions).

I don't think there's a big issue with this, because already `ask(Q.real(x))` and `x.is_real` are different. My proposal is to have the first call the second but not the other way around.

That's impossible now.  Simple example: x.is_real = Q.extended_real(x), not Q.real.

F. B.

unread,
Sep 3, 2014, 6:01:34 AM9/3/14
to sy...@googlegroups.com


On Saturday, August 30, 2014 1:21:36 PM UTC+2, Sergey Kirpichev wrote:

I don't think there's a big issue with this, because already `ask(Q.real(x))` and `x.is_real` are different. My proposal is to have the first call the second but not the other way around.

That's impossible now.  Simple example: x.is_real = Q.extended_real(x), not Q.real.

SymPy is still version zero, some API breaks may be acceptable. Otherwise on the next release (0.7.6), one could add a warning that x.is_real will undergo some logical modifications in version 0.7.7. After this, make the old and new assumptions equivalent in the master branch as soon as 0.7.6 is out.
Reply all
Reply to author
Forward
0 new messages