Hello! My name is Tilo and I am a sophomore at Pomona College in Claremont, California. Currently, I am double majoring in math and CS. I am highly interested in participating in GSoC with SymPy because I believe it would be a great opportunity to enhance my programming skills and give back to the open-source community.
In the past two years, I have gained extensive experience in Python through personal projects, a Machine Learning club, LeetCode, and college classes. Although Python is my most comfortable language, I also have substantial experience in Java through classes I have taken.
I have a strong interest in statistics and machine learning, and have taken several classes in these subjects. I’m currently taking a class on natural language processing, which has further sparked my interest in the field.
In addition to these areas, I also have experience with discrete math. I have taken semester-long courses in both Number Theory and Combinatorics, as well as learned about discrete math through computer science courses.
I learned about SymPy in my first semester of college when I was taking linear algebra. As I do all of my math homework in LaTeX, I found the ability to instantly convert matrices into LaTeX particularly useful. This semester I’ve been using SymPy to help a professor test conjectures related to Schur Polynomials.
I am uncertain about which GSoC Ideas would suit me best, given my background. However, I have some tentative ideas:
Currently, I am enrolled in an introductory course on languages and the theory of computation, where we have recently started exploring parsers. Additionally, I am taking a natural language processing class that involves programming-intensive assignments on probabilistic context-free grammars and sentence parsing. However, I have limited experience in some of the potentially required languages and tools such as Fortran, C, C++, Julia, Rust, LLVM, Octave, and Matlab. I think a project that focuses on existing LaTeX functionality would be a good fit.
This idea seems very challenging, but it also interests me a lot. A bug I found today made me want to learn more about how SymPy’s assumption system works. It’s not exactly clear to me what the prerequisites for working on this idea are. I do have experience with number theory, which is listed as one of the prerequisites (I’m not sure why). Also, I’ve taken a class on functional programming with Coq, which seems like it could be relevant.
If the other ideas seem unrealistic or impractical, this project seems well-suited to my capabilities. I have experience with HTML, JavaScript, and CSS. (Actually, JavaScript was the first language I learned. I even taught a lesson on using JS to approximate integrals in my high school calculus class. Hey! Python would’ve been better but JavaScript worked!) While working on an issue related to polygons, I noticed that there was no convenient way to plot polygons and other geometric objects, so maybe some of the work on this idea could add functionality related to that.
I would appreciate your opinion on which of these ideas to explore further, and whether there are any others better suited to my background.
Thank you for your response Oscar. I think I’m interested in taking a stab at improving the capabilities of the new assumption system to deal with inequalities (and perhaps relations between symbols in general?). I’ve spent a lot of time reading through the documentation, code, discussions, and previous proposals related to assumptions and come to some understanding. I’ve tried to explain what I understand so far below. Could you read through what I’ve written and correct any mistakes in my thinking?
For around a decade the SymPy community has wanted to transition from the “old assumptions” to the “new assumptions.” The old assumptions are seen as fundamentally limited compared to the new assumptions for reasons I’m trying to figure out. This is my understanding:
The old assumption system is deeply connected to the core of SymPy. I think this is bad because the tight coupling is responsible for some of the old system’s other problems and violates OOP principles.
One fundamental flaw of the old assumption system is that it’s not possible to make assumptions about relations between different variables. The new assumption system has this capability, and even though it is not very feature rich, it has a lot of potential.
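To make the contrast concrete, here is a minimal sketch, assuming a recent SymPy version that provides relational predicates such as Q.gt. The variable names are only for illustration, and the value returned by the last ask() call depends on the SymPy version, so I have not written one down.

```python
from sympy import Q, Symbol, ask

# Old assumptions: a fact is attached to a single symbol when it is created.
x_old = Symbol("x", positive=True)
print(x_old.is_positive)  # True

# There is no keyword for "x is greater than y"; the old system only
# stores properties of one symbol at a time.

# New assumptions: facts are separate predicate objects, so a relation
# between two symbols can at least be written down and passed to ask().
x, y = Symbol("x"), Symbol("y")
relation = Q.gt(x, y)
result = ask(Q.positive(x), relation & Q.positive(y))
print(relation, result)  # result may be None, depending on the version
```

Being able to state the relation is not the same as being able to reason with it, which is the gap described next.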
The old assumption system slows down SymPy. I’m not sure how true this still is. It seems like the core and the old assumption system have gotten many improvements to make them faster. For example, because I accidentally read the SymPy 1.11 instead of the SymPy 1.13 documentation I learned about the ManagedProperties metaclass and then saw it was removed to improve efficiency. (By the way, both of these sections from the documentation say that “This explanation is written as of SymPy 1.7.” even though the writing clearly changed between 1.11 and 1.13).
As for improving how the new assumptions deal with inequalities, there are some very basic features they’re missing. For example, ask(Q.negative(x), Q.ge(x, 1)) returns None under the current system, even though it seems like it would be really simple to implement the rule that if x ≥ 1 then x > 0, so x is not negative. Similarly, ask(Q.gt(x, 0), Q.gt(x, 1)) returns None. A more complicated query that returns None is ask(Q.eq(x, 1), Q.integer(x) & Q.gt(x, 0) & Q.lt(x, 2)).
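For anyone who wants to reproduce these, here is a small script that runs the three queries above. The labels are my own descriptions, and the results depend on the installed SymPy version (I got None for all three), so the script just prints whatever ask() returns.

```python
from sympy import Q, Symbol, ask

x = Symbol("x")

# (proposition, assumptions) pairs taken from the paragraph above.
queries = {
    "x >= 1, is x negative?": (Q.negative(x), Q.ge(x, 1)),
    "x > 1, is x > 0?": (Q.gt(x, 0), Q.gt(x, 1)),
    "integer x with 0 < x < 2, is x == 1?": (
        Q.eq(x, 1),
        Q.integer(x) & Q.gt(x, 0) & Q.lt(x, 2),
    ),
}

# ask() returns True, False, or None (None means "could not decide").
results = {
    label: ask(proposition, assumptions)
    for label, (proposition, assumptions) in queries.items()
}

for label, value in results.items():
    print(f"{label} -> {value}")
```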
I also noticed that even equality is somewhat limited in the new assumption system. For example, ask(Q.positive(y), Q.eq(x, y) & Q.positive(x)) returns None. I’m a bit unsure why this hasn’t been implemented. One guess I have is that implementing this functionality would be too slow because maybe you would have to use the solver?
Frankly, although I’ve spent a lot of time reading the files in the new assumption system, I only have a surface-level understanding of them:
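Here is a rough sketch of what I mean. The substitution step is purely my own hypothetical idea for a cheap rewrite and is not something ask() does as far as I know: since x = y is assumed, the remaining facts about x are rewritten in terms of y before asking.

```python
from sympy import Q, Symbol, ask

x, y = Symbol("x"), Symbol("y")

# The query from the paragraph above; the result depends on the SymPy version.
direct = ask(Q.positive(y), Q.eq(x, y) & Q.positive(x))
print("direct query:", direct)

# Hypothetical preprocessing (an assumption of this sketch, not SymPy's
# behaviour): use the assumed equality x == y to replace x with y in the
# other facts, then ask again.
rewritten_facts = Q.positive(x).subs(x, y)
after_substitution = ask(Q.positive(y), rewritten_facts)
print("after substituting x -> y:", after_substitution)  # True
```

This only handles the trivial case where one side of the equality is a bare symbol, so it says nothing about whether a general solution would need a solver.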
relation/equality.py
I probably understand this file the best out of all the files in the new assumptions. It contains implementations of the binary relations >, ≥, ≤, <, =, and ≠.
ask.py
Heart of the new assumptions. Defines ask() and Q. In general, ask tries to find computationally cheap ways to evaluate propositions. If it can’t do it in a cheap way it calls the SAT solver. Also, all of the assumptions are converted into a normal form which I think is analogous to the role Chomsky normal form plays in allowing a parser to parse a grammar efficiently.
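To check my understanding of the SAT step, here is a toy version written with the generic tools in sympy.logic. This is not the code in ask.py; toy_ask and entails are names I made up, and I am assuming the normal form in question is conjunctive normal form. The idea is that a proposition follows from the facts exactly when "facts and not proposition" is unsatisfiable.

```python
from sympy import symbols
from sympy.logic.boolalg import And, Implies, Not, is_cnf, to_cnf
from sympy.logic.inference import satisfiable

a, b, c = symbols("a b c")


def entails(facts, proposition):
    """Return True if facts can never hold while proposition is false."""
    # satisfiable() returns False when no satisfying assignment exists.
    return satisfiable(And(facts, Not(proposition))) is False


def toy_ask(proposition, facts):
    """Three-valued answer in the style of ask(): True, False, or None."""
    if entails(facts, proposition):
        return True
    if entails(facts, Not(proposition)):
        return False
    return None


facts = And(a, Implies(a, b))

# Conjunctive normal form: an AND of OR-clauses, the input shape SAT solvers expect.
cnf_facts = to_cnf(facts)
print(cnf_facts, is_cnf(cnf_facts))

print(toy_ask(b, facts))       # True: b follows from a and a -> b
print(toy_ask(Not(b), facts))  # False: the facts rule out "not b"
print(toy_ask(c, facts))       # None: the facts say nothing about c
```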
How difficult would it be to implement something that deals with inequalities better? What about the specific cases I outline above? Would this be enough for a summer project? Do you think this task is too advanced for someone of my background?
To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/96242267-2259-400e-a529-42c20994e2c9n%40googlegroups.com.
Especially with regard to mechanics, how many problems could be
solved with piecewise functions on a fixed grid?