[GSoC 2026] Seeking Feedback on Assumptions system project.

242 views
Skip to first unread message

Sujal Kumar

unread,
Mar 19, 2026, 9:21:54 AMMar 19
to sympy
Hi SymPy community, 

I am Sujal Kumar, a second-year student at IIT Madras (GitHub handle: SujalKumar06). I have recently been working on the assumptions module in SymPy, and some of my contributions have been merged. My Work.

I am interested in continuing to work on this module. In particular, I am considering improving lra_satask as a potential project idea. I would greatly appreciate hearing from the community about other fundamental issues or areas within the assumptions module that could benefit from further work.

I understand that there is a lot of scope for improvement in this module, and I would like to make sure I am not overlooking any important directions.

Thank you for you time. I look forward to your suggestions.

Regards, 
Sujal Kumar

Tilo RC

unread,
Mar 23, 2026, 6:26:48 PMMar 23
to sympy
Hi Sujal, 

There's a lot of things that can be worked on with the assumption system. Currently there are several big and
important open PRs that you should take a look at:

- Krishnav Bajoria's GSoC work:
- My PRs:
  5. (more experimental / perhaps less important) Double the speed of ask by implementing an SMT theory solver that understands SymPy's unary predicat 
     + this implements an algorithm I came up with (or really a as far as I know novel combination of existing algorithms) as part of my undergrad thesis

Krishnav's PRs are 95% done. The main reason they haven't been merged yet is just because they were pretty big changes
that were hard to review all at once. It could be pretty impactful to finish them up and integrate them with the rest of the assumption system. 
My GSoC work actually has similar problems even though it got merged; PR #4 in the list above addresses some of these problems. (By the way, 
it would be great if in your GSoC proposal you discussed how you might avoid falling into the trap Sujal and I fell in to of having one or two massive PRs with the bulk of
our GSoC work that are hard to review). 

There's a lot of things that could be improved so you have a lot of flexibility in what you could write your GSoC proposal about. However, my suggestion is that you
focus on smoothing over the many rough edges of the assumption system so everything works more harmoniously together. I know that sounds pretty vague.
What I mean by this is that rather than implementing a new algorithm like Krishnav and I did you could improve upon and better integrate existing algorithms and 
assumption subsystems.  For example, I think the following issue could be the type of thing you devote one or two weeks to during GSoC (or maybe even fix beforehand):
- lra_theory: Implement more preprocessing steps
Another (probably more difficult) issue that would be great to fix would be to revive Krishnav's PRs and get them merged. 

Something to keep in mind is that the tests for the assumption system are not great. Also, beyond the performance improvements, PR #3 is very significant because it will
make testing the different assumption subsystems separately possible. So improving the testing system could be an important part the "roughing out the smooth edges"
of the assumption system idea. It could probably also just be its own separate GSoC project. You might want to take a look at this issue about improving the way the 
assumption system subsystems are tested:
- Improve testing for assumptions subsystems
You might also want to take a look at the following (AI written) file from PR #4:
That file is a WIP and entirely AI written so I would take everything in it with a grain of salt (though it has found some bugs). However, I think that style of testing could be used
for the rest of the assumption system. I'm not super familiar with Hypothesis python library but Aaron Meurer is and it could be good to read about it. You might also want to 
read about fuzzing

Please let me know of you have any questions. Asking smart questions is a great way to show that you would do well if you did GSoC. 

Sincerely, 

Tilo RC

Sujal Kumar

unread,
Mar 26, 2026, 10:39:31 AMMar 26
to sympy
Hey Tilo, 

Thanks a lot for writing such a detailed feedback, it has been quite a while I have been going through the things you shared.


> There's a lot of things that could be improved so you have a lot of flexibility in what you could write your GSoC proposal about. However, my suggestion is that you
focus on smoothing over the many rough edges of the assumption system so everything works more harmoniously together.

I agree on this, In fact I would really like to go on this for my proposal ("smoothing the rough edges of assumptions").  I think the last two GSoC work on assumptions (you and Krishnav) had lot of additions. Specifically Krishnav's work in PR#2 is really tough to go through currently for me. Though I see you have recently added `Please Take Over` tags to both PRs. I feel its better if we get them in but I think a lot of my time would go in actually understand what the additions were supposed to do and reading through the code. Is it ok if substantial amount of time of my GSoC period is being spent on that, given you already have an idea with the PR, won't it be much faster for you to go through the PR yourself, the issue I feel here is while working/reviving the PRs, I might end up slowing the process of merging them?


 > By the way,  it would be great if in your GSoC proposal you discussed how you might avoid falling into the trap Sujal and I fell in to of having one or two massive PRs with the bulk of our GSoC work that are hard to review
 
Sure I will look into it. (Also I think you meant someone else and not me :p)
I highly think the issue with the PR#1 and PR#2 is the amount of stuff written in one single PR. I feel this really slows the workflow as it might really get chaotic reviewing through such a big PR. I think the first step in merging these is to divide them properly into much smaller PRs. 

For your work in 2023 especially adding lra_satask, I went through your report, the future work had two specific linked issues, One of them still seems open Issue, do you think working with this should also be considered as smoothing out the rough edges. 


> For example, I think the following issue could be the type of thing you devote one or two weeks to during GSoC (or maybe even fix beforehand):

Can you tell what else do you have in mind, like which might lie under what you suggested as "improve upon and better integrate existing algorithms and 
assumption subsystems" especially from the perspective of lra_satask/lra_theory.


> Something to keep in mind is that the tests for the assumption system are not great. Also, beyond the performance improvements, PR #3 is very significant because it will
make testing the different assumption subsystems separately possible. So improving the testing system could be an important part the "roughing out the smooth edges"
of the assumption system idea. 

This seems like a pretty valid point, the issues even I saw and raised on GitHub should have been easily caught with proper testing. I am actually up for this Hypothesis testing thing, I had a go through hypothesis testing and fuzzing and I think this seems reasonable to me. Given the big PRs, I think it is very easy to miss out on tests that might lead to some bug surviving. I think after part of merging PRs for the previous said things are done, it is good for us to implement this slowly and fundamentally in assumptions module.

But won't these testing make the tests relatively slower, is it worth to make the tests slower rather than just adding more tests.

I think I have jotted my points and doubts above, looking forward for your reply. I feel your points are good for my proposal, I think I should divide my proposal in two parts (and so my GSoC period) one being smoothing out on things and integrating everything properly and the next being of hypothesis testing.

Thanks, 
Sujal  Kumar

Sujal Kumar

unread,
Mar 27, 2026, 12:48:56 AMMar 27
to sympy
Hey Tilo,

I just realised even in Krishnav's final report, there are a bunch of future work, do you think this can also be considered in integrating his work properly (smoothing the rough edges), because I see you raised #29507 which is a point in his future work. 

Sorry for missing this point earlier.

Thanks,
Sujal Kumar

Tilo RC

unread,
Mar 27, 2026, 6:10:49 PMMar 27
to sy...@googlegroups.com
> I feel its better if we get them in but I think a lot of my time would go in actually understand what the additions were supposed to do and reading through the code. Is it ok if substantial amount of time of my GSoC period is being spent on that, given you already have an idea with the PR,

Your concern about it taking a very long time to understand Krishnav's pull requests is valid.
To understand the algorithms involved, you would probably have to read the linked CS papers
many times. Papers like that are typically written for an audience of other computer scientists
rather than undergraduate students so they can be pretty dense. When I did GSoC, I spent
at least half of my time doing research and reading.

I'll take some time to look over Krishnav's pull requests and remember what was going on with
them. Maybe I can give you more detailed instructions on what you could work on. Ultimately,
you probably don't need to understand how the algorithms involved actually work; what's important
is understanding what they do. That means understanding how SMT theory solvers work in tandem
with a SAT solver and the details of the particular theories involved, namely LRA and EUF. 

You should keep in mind that this type of work typically involves more time spent reading and understanding 
things rather than writing code. Many things won't become clear until after you spend significant time learning 
things. In practice, what you write in your proposal might not be what you end up working on during 
GSoC. 

You must submit something to Google by the deadline to participate in GSoC. However, to get the most out of
GSoC and do the most impactful work possible, this sort of discussion we're
having is probably more important than the actual proposal. 


> won't it be much faster for you to go through the PR yourself, the issue I feel here is while working/reviving the PRs, I might end up slowing the process of merging them?

One idea I just had is that we could approach reviving the PRs more collaboratively. I could be the one 
to actually make a new PR reviving them and do most of the work. You would focus on reviewing my work
by writing tests and looking for bugs. 

Or perhaps since I can make commits to your PRs and you can't do that with
mine, you could be the one to open the PR but I could make most of the commits?

That way you wouldn't need to worry about the implementation details of the algorithms. You would still have to 
have a deep understanding of what they were supposed to do and their APIs, though. 

> Can you tell what else do you have in mind, like which might lie under what you suggested as "improve upon and better integrate existing algorithms and 
assumption subsystems" especially from the perspective of lra_satask/lra_theory.

A lot of bugs and problems are likely to come up if you write more tests for `ask`, its subsystems, and maybe
Krishnav's and my GSoC work. So in general, you could try to alternate between identifying problems with how things
are currently implemented and fixing some of those problems. 

> But won't these testing make the tests relatively slower, is it worth to make the tests slower rather than just adding more tests.

I'm not sure about this. Maybe you could mark these kinds of tests with the slow decorator. I found this issue
which is relevant:
- Hypothesis testing #20914 
Maybe you could ask Aaron questions about this. 


> I just realised even in Krishnav's final report, there are a bunch of future work, do you think this can also be considered in integrating his work properly (smoothing the rough edges)

I looked over the future work section. These points seem relevant:

  1. Integrate EUF solver with the ask system / satask
       + Replace provisional euf_ask file with the production API and retire temporary glue code.
  2. Remove the provisional euf_ask and fully wire theory solver into satask 
       + Ensure satask performs the single-call conversion of queries into CNF and sends the batch to the SAT+theory pipeline (no intermediate per-recursive-call satask invocations).
  7. Benchmarks & scaling tests 
       + Run the solver on larger CNFs extracted from real SymPy workloads and measure SAT decisions, learned clause size distribution, and total solve time.

Additionally, these points from my GSoC future work section seem the most relevant:
- Implement speed improvements from paper
Keep in mind that it sounds like I was a bit confused about a few things in my GSoC final report lol. Also, 
I've suggested a lot of things, so you should probably pick one of them to focus on at least at first as 
it's not feasible to do everything I'm suggesting in the 350 hour timeframe.

Let me know if you have any questions. 

Sincerely,
Tilo Reneau-Cardoso

--
You received this message because you are subscribed to a topic in the Google Groups "sympy" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/sympy/SU1PJ7xfzZc/unsubscribe.
To unsubscribe from this group and all its topics, send an email to sympy+un...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/sympy/75d33130-8e30-4969-9b06-acb1442e324an%40googlegroups.com.

Sujal Kumar

unread,
Mar 31, 2026, 1:54:49 PM (13 days ago) Mar 31
to sympy
Hey Tilo,

I have submitted my final draft of the proposal, you can have look over it. I tried to include all the things from our discussion whichever seemed feasible. I am looking forward to work with you and under SymPy for my GSoC :)

My Proposal

Thanks a lot for the help :)

Thanks,
Sujal Kumar

Tilo RC

unread,
Mar 31, 2026, 4:53:08 PM (13 days ago) Mar 31
to sy...@googlegroups.com
Hi Sujal,

I took a brief look at it and I think it looks good. I'll read it more carefully later (in the next few days after the deadline).

Sincerely,

Tilo Reneau-Cardoso 

Reply all
Reply to author
Forward
0 new messages