My GSOC Proposal for New Assump Module

186 views
Skip to first unread message

Shipra Banga

unread,
Mar 11, 2014, 2:47:30 AM3/11/14
to sy...@googlegroups.com
Hello,

My proposal is based on the project that is under - way in sympy : The newassump module . The work is progressing at
https://github.com/sympy/sympy/pull/2508

This is a very basic draft of my proposal. Please go through it and suggest changes .

https://github.com/sympy/sympy/wiki/GSOC-2014-Application-Shipra-Banga-Extending-the-New-Assumptions-Module

I am still to add the organisation of the module to my proposal. I will do that once I am done discussing it with the people working on it and finalised it.
Thank you so much!!

Shipra

Aaron Meurer

unread,
Mar 15, 2014, 2:18:32 PM3/15/14
to sy...@googlegroups.com
For anyone looking, the url is now
https://github.com/sympy/sympy/wiki/GSOC-2014-Application-Shipra-Banga-Building-the-New-Assumptions-Module.

Here are my comments on the proposal:

- You seem to be focusing pretty heavily on adding new facts to the
system. But I think this is actually a minor point. It will take a
considerable effort, to be sure, because there are quite a few facts,
but there are real issues to be considered which you haven't
addressed, like:

- How do you plan to make the system scale? You don't really make it
clear, but you are planning to extend the work I began at
https://github.com/sympy/sympy/pull/2508. This system already has
performance issues, and there are only a handful of facts implemented.
The old assumptions system has its limitations, but its actually quite
fast. This is clear from the fact that it is used everywhere in the
core. Take this very simple query:

Old assumptions:

In [1]: time Symbol('x', positive=False, real=True).is_negative
CPU times: user 160 µs, sys: 9 µs, total: 169 µs
Wall time: 176 µs

New assumptions:

In [5]: time ask(Q.negative(x), ~Q.positive(x) & Q.real(x))
CPU times: user 42.1 ms, sys: 4.01 ms, total: 46.1 ms
Wall time: 47.7 ms

Newask (my branch):

In [4]: time newask(Q.negative(x), ~Q.positive(x) & Q.real(x))
CPU times: user 86.9 ms, sys: 1.96 ms, total: 88.9 ms
Wall time: 87.8 ms

The new assumptions take two orders of magnitude longer than the old
assumptions, and my branch takes twice as long as the new assumptions.
If you look at Tom's proposal from last year, there's a reason that he
spends a lot of time talking about things like caching.

- Regarding the final switch (which, granted, you may not even get to
by the end of the summer), I think you understate its difficulty. I
give you some credit here: no one can really appreciate how difficult
this is until they attempt swapping out the assumptions system. I
didn't get it myself until I did
https://github.com/sympy/sympy/pull/2210. The assumptions are really
used by *everything*. That's the entire code-base of SymPy. You can
get a good ways by swapping out the syntax, so that the current
is_positive and so on calls the new assumptions, but then you have to
make sure that all the facts are either identical to the way they were
before, or places where they differ are changed (like with the
is_imaginary thing).

You also don't really discuss how you are going to handle the existing
old assumptions code. Does it make sense to keep around things like
the current implementation of pi.is_positive (it's just set explicitly
on the class)? I currently "use" the old assumptions for some classes
in my branch at
https://github.com/sympy/sympy/pull/2508/files#diff-bb2c25daed16ce71d13be11d912725edR217,
because this is more efficent than the new assumptions way, which just
calls evalf().

- I also doubt you will be able to get to this in the summer, but
another dream of the new assumptions is the ability to represent
relational assumptions, like x > 2. You should think about how to
implement this sort of thing.

In the end, don't consider the implementation at
https://github.com/sympy/sympy/pull/2508/ to be the truth handed down
from God on how the assumptions should be implemented. It's just
something that I hacked together. The earlier iterations were
definitely less elegant than what is there now, but I don't claim that
it can't be done even better, and I really don't claim that it should
be done differently in order to obtain scalability (which I think is
the most serious problem that needs to be solved in that branch).

Now, going back to adding facts to the system, this is still
important. There are some facts that need to be thought about a little
harder. For instance, consider the fact: "if all terms in an addition
are integers, the addition is even iff an even number of the terms are
odd". We might represent this symbolically as (Add,
Implies(AllArgs(Q.integer, Equivalent(Q.even,
EvenNumberOfArgs(Q.odd))))). The problem here is the EvenNumberOfArgs
part. To represent this, you need to use Xor, which requires an
exponential number of clauses. Are there tricks we can use (Tseitin
maybe) to get around this? Should we just put a default limitation on
the number of args that can be used? Or should we look at somehow
shipping this sort of thing off to a better suited solver than the SAT
solver?

Also think about how you would represent the following fact, from
https://github.com/sympy/sympy/pull/2952:

exp(x) is real if the imaginary part of x is a multiple of pi*I

And another interesting one. How would you make ask(Q.negative(x - 3),
Q.prime(x) & Q.even(x)) work?

Aaron Meurer
> --
> You received this message because you are subscribed to the Google Groups
> "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sympy+un...@googlegroups.com.
> To post to this group, send email to sy...@googlegroups.com.
> Visit this group at http://groups.google.com/group/sympy.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sympy/8a6d5105-7a67-400f-b5f7-dd8e41739a28%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Shipra Banga

unread,
Mar 16, 2014, 2:54:08 AM3/16/14
to sy...@googlegroups.com
Aaron,
Thank you so much for such a detailed review on my proposal. It helped me a lot and needless to say it changed the direction on which I was thinking completely.
On the new lines of thought I edited my proposal . This is the link to the new draft .

https://github.com/sympy/sympy/wiki/GSOC-2014-Application-Shipra-Banga-Building-the-New-Assumptions-Module

I hope I addressed all the issues you suggested in my proposal. If you feel there should be more further changes , please suggest them and I would work on them.

Thanking You
Shipra

Shipra Banga

unread,
Mar 19, 2014, 5:00:18 AM3/19/14
to sy...@googlegroups.com
Hello,
I have reached the final stage in drafting my proposal as we are nearing the deadline. So this is it :

https://github.com/sympy/sympy/wiki/GSOC-2014-Application-Shipra-Banga-Building-the-New-Assumptions-Module

It would be great if you could tell me where I can improve on my proposal.

Thanking You
Regards
Shipra

Richard Fateman

unread,
Mar 20, 2014, 12:43:06 AM3/20/14
to sy...@googlegroups.com
Unless I've missed something in your proposal, you intend to do forward chaining from
miscellaneous facts.
This  naive approach suffers from many problems, including inadequacy as well as failure to scale.
The only comparably bad, but different, approach is to reduce problems to satisfiability.

There are many more reasonable ideas in the MKM (mathematical knowlege manipulation)
literature,  geometric reasoning, theorem proving, etc.  Instead of duplicating the worst naive
stuff around, with the only contribution that now it is in python,  you might read up on some
of the literature.

If you have already done so and are proposing to do something that at least recognizes the
current state of the art, I have missed it in the scan of your description.  In that case, (or
in the other case too, I guess)  good luck.


Shipra Banga

unread,
Mar 20, 2014, 4:07:46 AM3/20/14
to sy...@googlegroups.com
The thing is assumptions is supposed to give result for queries like :
newask(Q.positive(2*x + y),Q.positive(x) & Q.positive(y))

So , in this case, we have to assume that x and y are positive and qyery for 2*x+y using the fact that addition of two positive numbers gives a positive number . By forward chaining I meant this process of reasoning . @Richard Could you please elaborate how theorem proving or geometric reasoning might work in this case ? All we are using is the simple closure properties of multiplication.

Shipra Banga

unread,
Mar 20, 2014, 4:21:08 AM3/20/14
to sy...@googlegroups.com
@ Richard I agree there is an issue with scalability and other performance issues. Could you please suggest other ways of implementing the assumptions as it works now , if not by forward chaining ?

Shipra Banga

unread,
Mar 20, 2014, 4:27:37 AM3/20/14
to sy...@googlegroups.com
My solution to the scalability problem was that that use buffering with forward chaining to store the derivations in a buffer so that repeatedly the computation does not happen and also call on the relevant subset of all facts not the entire global set.

Richard Fateman

unread,
Mar 20, 2014, 11:34:42 AM3/20/14
to sy...@googlegroups.com


On Thursday, March 20, 2014 1:27:37 AM UTC-7, Shipra Banga wrote:
My solution to the scalability problem was that that use buffering with forward chaining to store the derivations in a buffer so that repeatedly the computation does not happen and also call on the relevant subset of all facts not the entire global set.

Here's the problem:
  You have totally ignored the last 50 years of research in addressing this problem.
I gave you a tag to start your search, and now you want me to type 50 years of research into this
box for your education?

since you are unwilling to type mkm  into google, maybe you can click on this link
http://www.mkm-ig.org/

.  
Please try to first learn some math and computer science, and write code later.
 

Shipra Banga

unread,
Mar 20, 2014, 12:29:20 PM3/20/14
to sy...@googlegroups.com
Thanks for your suggestions. I will look into it and as regards learning thats what I am here for and am always willing to learn. This topics is again very welcome and I will do as much research as is possible on this. Thanking You once again.

Regards
Shipra

Ondřej Čertík

unread,
Mar 20, 2014, 1:42:24 PM3/20/14
to sympy
Hi Richard,
Thank you for your suggestions. I appreciate that you are helping
review the applications
and providing feedback about weaknesses and pointers to other work
that has been done in this area.

If possible, please try to be gentle to our our prospective students,
in majority of cases, they never contributed
to any opensource project before and are new to computer algebra as
well, typically are undergrads. They are
here to learn and Google Summer of Code provides the opportunity, if
their proposal is accepted. It's a competitive
process and only few will be accepted, but everybody is here to learn
and ultimately, every student has to start
and then keep improving by following their passion. I've been there
myself, when I started SymPy, I had no
clue about any computer algebra research, I just wanted to make a
useful library, and over the years I tried
to learn as much as I could and also many talented people have joined,
way better than I am. We are definitely
trying to implement the state of the art algorithms and learn from all
the published work out there.

I appreciate all feedback and pointers to literature if we missed something.

Ultimately, what really matters, at least to me, is to figure out the
best way to solve the technical problem.
If the proposed solution is not optimal, it would be nice if we can
figure out a better one.

Ondrej

Shipra Banga

unread,
Mar 20, 2014, 2:11:38 PM3/20/14
to sy...@googlegroups.com
According to suggestions made by Richard , I looked into previous papers on MKM and did some research . I found geometrical reasoning of interest and I added a very brief description of this idea in my proposal under the head Extra Research . I realise its extremely theoretical and general, but after discussing it with sympy mentors and through there guidance, I would be willing to put this basic idea into action.

This is the link to my proposal
https://github.com/sympy/sympy/wiki/GSOC-2014-Application-Shipra-Banga-Building-the-New-Assumptions-Module

Regards
Shipra
Message has been deleted

Richard Fateman

unread,
Mar 20, 2014, 7:45:11 PM3/20/14
to sy...@googlegroups.com
My feeling is that it does not do anyone a service to have a student embark on a project where he or she does not look at the state of the art in that area. 
The student does not learn much and the project does not benefit from the effort. Worse, any "senior" effort to guide this project further detracts from the overall set of tasks.

I do not view writing a bunch of code in python as a necessary good.

I do not know if GSOC will fund such things, but I would be embarrassed
to propose projects where the proposers have not done any homework
and are just proposing to write code pretty much off the top of their head.

Sorry I don't have the patience to be gentle.

RJF

Matthew Rocklin

unread,
Mar 20, 2014, 7:58:36 PM3/20/14
to sy...@googlegroups.com
Hi Richard,

This is a reasonable concern.

I recommend that you bring this to the attention of the core development team directly, rather than to the student.

The recommended way to accomplish this is by making an issue on github.  These issues are much more visible over the long term and serve as a lasting record for design conversations.  If you're up for this you can head to https://github.com/sympy/sympy/issues and click the green "New Issue" button.

-Matthew


--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
To post to this group, send email to sy...@googlegroups.com.
Visit this group at http://groups.google.com/group/sympy.

Ondřej Čertík

unread,
Mar 21, 2014, 9:02:34 PM3/21/14
to sympy
Hi Richard,

On Thu, Mar 20, 2014 at 5:45 PM, Richard Fateman <fat...@gmail.com> wrote:
> My feeling is that it does not do anyone a service to have a student embark
> on a project where he or she does not look at the state of the art in that
> area.
> The student does not learn much and the project does not benefit from the
> effort. Worse, any "senior" effort to guide this project further detracts
> from the overall set of tasks.
>
> I do not view writing a bunch of code in python as a necessary good.
>
> I do not know if GSOC will fund such things, but I would be embarrassed
> to propose projects where the proposers have not done any homework
> and are just proposing to write code pretty much off the top of their head.
>
> Sorry I don't have the patience to be gentle.

We welcome and appreciate all feedback. If you don't have patience to
be gentle, then please send us such feedback offlist, privately (I
welcome such feedback, you can write me whatever you want, no rules).

But all emails to this list must be gentle and polite, this is how we
built our community.

Ondrej
Reply all
Reply to author
Forward
0 new messages