AI generated pull requests

Oscar Benjamin

unread,

Oct 25, 2025, 6:46:10 PM10/25/25

to sympy

Hi all,

I am increasingly seeing pull requests in the SymPy repo that were
written by AI e.g. something like Claude Code or ChatGPT etc. I don't
think that any of these PRs are written by actual AI bots but rather
that they are "written" by contributors who are using AI tooling.

There are two separate categories:

- Some contributors are making reasonable changes to the code and then
using LLMs to write things like the PR description or comments on
issues.
- Some contributors are basically just vibe coding by having an LLM
write all the code for them and then opening PRs usually with very
obvious problems.

In the first case some people use LLMs to write things like PR
descriptions because English is not their first language. I can
understand this and I think it is definitely possible to do this with
LLMs in a way that is fine but it needs to amount to using them like
Google Translate rather than asking them to write the text. The
problems are that:

- LLM summaries for something like a PR are too verbose and include
lots of irrelevant information making it harder to see what the actual
point is.
- LLMs often include information that is just false such as "fixes
issue #12345" when the issue is not fixed.

I think some people are doing this in a way that is not good and I
would prefer for them to just write in broken English or use Google
Translate or something but I don't see this as a major problem.

For the vibe coding case I think that there is a real problem. Many
SymPy contributors are novices at programming and are nowhere near
experienced enough to be able to turn vibe coding into outputs that
can be included in the codebase. This means that there are just spammy
PRs with false claims about what they do like "fixes X", "10x faster"
etc where the code has not even been lightly tested and clearly does
not work or possibly does not even do anything.

I think what has happened is that the combination of user-friendly
editors with easy git/GitHub integration and LLM agent plugins has
brought us to the point where there are pretty much no technical
barriers preventing someone from opening up gibberish spam PRs while
having no real idea what they are doing.

Really this is just inexperienced people using the tools badly which
is not new. Low quality spammy PRs are not new either. There are some
significant differences though:

- I think that the number of low quality PRs is going to explode. It
was already bad last year in the run up to GSOC (January to March
time) and I think it will be much worse this year.
- I don't think that it is reasonable to give meaningful feedback on
PRs where this happens because the contributor has not spent any time
studying the code that they are changing and any feedback is just
going to be fed into an LLM.

I'm not sure what we can do about this so for now I am regularly
closing low quality PRs without much feedback but some contributors
will just go on to open up new PRs. The "anyone can submit a PR model"
has been under threat for some time but I worry that the whole idea is
going to become unsustainable.

In the context of the Russia-Ukraine war I have often seen references
to the "cost-exchange problem". This refers to the fact that while
both sides have a lot of anti-air defence capability they can still be
overrun by cheap drones because million dollar interceptor missiles
are just too expensive to be used against any large number of incoming
thousand dollar drones. The solution there would be to have some kind
of cheap interceptor like an automatic AA gun that can take out many
cheap drones efficiently even if much less effective against fancier
targets like enemy planes.

The first time I heard about ChatGPT was when I got an email from
StackOverflow saying that any use of ChatGPT was banned. Looking into
it the reason given was that it was just too easy to generate
superficially reasonable text that was low quality spam and then too
much effort for real humans to filter that spam out manually. In other
words bad/incorrect answers were nothing new but large numbers of
inexperienced people using ChatGPT had ruined the cost-exchange ratio
of filtering them out.

I think in the case of SymPy pull requests there is an analogous
"effort-exchange problem". The effort PR reviewers put in to help with
PRs is not reasonable if the author of the PR is not putting in a lot
more effort themselves because there are many times more people trying
to author PRs than review them. I don't think that it can be
sustainable in the face of this spam to review PRs in the same way as
if they had been written by humans who are at least trying to
understand what they are doing (and therefore learning from feedback).
Even just closing PRs and not giving any feedback needs to become more
efficient somehow.

We need some sort of clear guidance or policy on the use of AI that
sets clear explanations like "you still need to understand the code".
I think we will also need to ban people for spam if they are doing
things like opening AI-generated PRs without even testing the code.
The hype that is spun by AI companies probably has many novice
programmers believing that it actually is reasonable to behave like
this but it really is not and that needs to be clearly stated
somewhere. I don't think any of this is malicious but I think that it
has the potential to become very harmful to open source projects.

The situation right now is not so bad but if you project forwards a
bit to when the repo gets a lot busier after Christmas I think this is
going to be a big problem and I think it will only get worse in future
years as well.

It is very unfortunate that right now AI is being used in all the
wrong places. It can do a student's homework because it knows the
answers to all the standard homework problems but it can't do the more
complicated more realistic things and then students haven't learned
anything from doing their homework. In the context of SymPy it would
be so much more useful to have AI doing other things like reviewing
the code, finding bugs, etc rather than helping novices to get a PR
merged without actually investing the time to learn anything from the
process.

--
Oscar

gu...@uwosh.edu

unread,

Oct 25, 2025, 9:06:55 PM10/25/25

to sympy

Here's a brainstorming idea for how to implement something to address your valid concerns.

How about the following policy?

No review of a pull request will occur unless it meets certain minimum requirements:

1) It passes all pre-existing tests;

2) It includes test coverage for all new code;

3) It includes tests covering any bug fixes.

I can see how to implement #1 automatically. Could #2 be implemented using one of the coverage testing tools? My experience with those is limited. It also would require some work to make sure new tests cover all changed code. I think this would clear out a lot of the very low quality, doesn't work or does nothing code. However, I see a couple of problems as well:

1) What happens if the bug fix is to an erroneous test?

2) This does not address low quality descriptions of the PR and its goals.

3) People who are just learning the code base will need a way to get help on running and fixing issues with testing. I think contributors might have to be in the position of asking for help on this list with issues of that sort or maybe there should be a specific venue for that.

Just some ideas to help start the ball rolling.

Jonathan

Jason Moore

unread,

Oct 26, 2025, 2:16:04 AM10/26/25

to sy...@googlegroups.com

Hi Oscar,

Thanks for raising this. I agree, this problem will grow and it is not good. I think we should have a policy about LLM generated contributions. It would be nice if a SYMPEP was drafted for one.

Having a standard way to reject spam PRs would be helpful. If we could close a PR and add a label to trigger sympybot to leave a comment that says "This PR does not meet SymPy's quality standards for AI generated code and comments, see policy <link>" could be helpful. It still requires manual steps from reviewers.

I also share the general concern expressed by some in the scipy ecosystem here:

https://github.com/scientific-python/summit-2025/issues/35#issuecomment-3038587497

which is that LLMs universally violate copyright licenses of open source code. If this is true, then PRs with LLM generated code are polluting SymPy's codebase with copyright violations.

Jason

moorepants.info
+01 530-601-9791

--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/sympy/CAHVvXxQ1ntG0EWBGihrXErLhGuABHH7Kt5RmGJvp9bHcqaC5%3DQ%40mail.gmail.com.

Jason Moore

unread,

Oct 26, 2025, 2:27:53 AM10/26/25

to sy...@googlegroups.com

Here is scipy's discussion about a policy: https://discuss.scientific-python.org/t/a-policy-on-generative-ai-assisted-contributions/1702/18

Jason

moorepants.info
+01 530-601-9791

Oscar Benjamin

unread,

Oct 26, 2025, 3:24:40 PM10/26/25

to sy...@googlegroups.com

On Sun, 26 Oct 2025 at 01:06, 'gu...@uwosh.edu' via sympy
<sy...@googlegroups.com> wrote:
>
> Here's a brainstorming idea for how to implement something to address your valid concerns.
> How about the following policy?
> No review of a pull request will occur unless it meets certain minimum requirements:
> 1) It passes all pre-existing tests;
> 2) It includes test coverage for all new code;
> 3) It includes tests covering any bug fixes.

I think that what will happen is that the author will pass these
instructions to the LLM agent and then the agent will generate some
code that superficially resembles meeting these criteria. Then the PR
description will have a bunch of emoji-filled bullet points
redundantly stating that it meets those criteria.

I'm not going to point to the specific PR but I closed one where the
description had a statement in it like

"You can run the tests with `pytest sympy/foo/bar`"

That is literally an instruction from the LLM to the user for how they
can test the generated code and if you actually run the test command
it clearly shows that the code doesn't work. It was still submitted in
that form as a spam PR though.

Of course the tests in CI did not pass and it is not hard to see the
problem in that case but other cases can be more subtle than this. It
is not hard to generate code that passes all existing tests, includes
coverage etc while still being gibberish and this is really the
problem with using LLMs. There isn't any substitute in this situation
for actual humans doing real thinking.

--
Oscar

Oscar Benjamin

unread,

Oct 26, 2025, 3:30:05 PM10/26/25

to sy...@googlegroups.com

Yes, the copyright is a big problem. I don't think I would say that
LLMs universally violate copyright e.g. if used for autocompleting an
obvious line of code or many other tasks. There are certain basic
things like x += 1 that cannot reasonably be considered to be under
copyright even if they do appear in much code. Clearly though an LLM
can produce a large body of code where the only meaningful
interpretation is that the code has been "copied" from one or two
publicly available codebases.

The main difficulty I think with having a policy about the use of LLMs
is that unless it begins by saying "no LLMs" then it somehow needs to
begin by acknowledging what a reasonable use can be which means
confronting the copyright issue up front.

> To view this discussion visit https://groups.google.com/d/msgid/sympy/CAP7f1AhXNE-UapwEm1bQW9de%3Di%2BWixFen5sTp8MsCMScsqA-%3Dg%40mail.gmail.com.

Aaron Meurer

unread,

Oct 30, 2025, 2:08:12 PM10/30/25

to sy...@googlegroups.com

I like the Ghostty policy, which is that AI coding assistance is
allowed, but it must be disclosed
https://github.com/ghostty-org/ghostty/blob/main/CONTRIBUTING.md#ai-assistance-notice.
It should also be our policy that the person submitting the code is
ultimately responsible for it, regardless of what tools were used to
create it.

I think it would be a mistake to ban AI usage entirely because AI can
be very useful if used properly, i.e., you review the code it writes
before submitting it.

For me the copyright question doesn't really pass the smell test, at
least for the majority of the use-cases where I would use AI in SymPy.
For example, if I use AI to generate some fix for some part of SymPy,
say the polynomials module, then where would that fix have "come from"
for it to be a copyright violation? Where else in the world is there
code that looks like the SymPy polynomials module? Most code in SymPy
is very unique to SymPy. The only place it could have possibly come
from is SymPy itself, but if SymPy already had it then the code
wouldn't be needed in the first place (and anyways that wouldn't be a
copyright violation). I think there's a misconception that LLMs can
only generate text that they've already seen before, and if you
believe that misconception then it would be easy to believe that
everything generated by an LLM is a copyright violation. But this is
something that is very easily seen to not be true if you spend any
amount of time using coding tools.

As for PR descriptions, I agree those should always be hand-written.
But that's always been a battle, even before AI. And similarly almost
no one writes real commit messages anymore.

Aaron Meurer

> To view this discussion visit https://groups.google.com/d/msgid/sympy/CAHVvXxSW_5u4Qvj5kddZUQzzNdkteTZ5GJX46D_c3Gko87Dj%2Bg%40mail.gmail.com.

Jason Moore

unread,

Oct 30, 2025, 2:50:55 PM10/30/25

to sy...@googlegroups.com

I don't think it is a terrible idea to simply have a "no LLMs" policy at this point in time. We can always change it in the future as things get clearer. People will still use them in their LLM enhanced editors, of course, and we can never detect the basic uses of the tools. But if people submit large chunks of text and code that have hallmarks of full generation from an LLM, then we can reject and point to the policy.

As for the smell test and misconceptions about what an LLM can produce, this may depend on whether you think only a literal copy of something violates copyright or if also a derivative of something violates copyright. I think the essential question lies in whether the code a LLM produces is a derivative of copyrighted code. There are many past court cases ruling that derivatives are copyright violations in the US and the OSS licenses almost all state that derivatives fall under the license. I doubt the LLM can produce a fix to the polynomials module if the only training data was the polynomials module. An LLM relies entirely on training on a vast corpus of works and generating code from all of that large body. Now, is that output then a derivative of one, some, or all of the training data? That is to be determined by those that rule on laws (hopefully). Given that we have spent about 40 years collectively trying to protect open source code with copyright licenses, it seems terribly wrong that if you can make your copy source large enough that you no longer have to abide by the licenses.

Paul Ivanov and Matthew Brett have done a good job explaining this nuance here: https://github.com/matthew-brett/sp-ai-post/blob/main/notes.md

My personal opinion is that the LLMs should honor the licenses of the training set and if they did, then all is good. I have no idea how they can solve that from a technical perspective, but the companies are simply ignoring copyright and claiming they are above such laws and that all that they do is fair use. We plebes do not get that same ruling.

Jason

moorepants.info
+01 530-601-9791

To view this discussion visit https://groups.google.com/d/msgid/sympy/CAKgW%3D6LO6iAs22%2BRuwR893ef-8b6WpiexqoBF4f%2ByyjDQGcF3A%40mail.gmail.com.

Francesco Bonazzi

unread,

Nov 5, 2025, 5:50:23 PM11/5/25

to sympy

Maybe it should be made mandatory to disclose any usage of LLM when opening PRs.

Banning usage of LLM completely is a bit extreme, but it may be necessary if vibe spammers keep flooding github with useless PRs.

Oscar Benjamin

unread,

Nov 12, 2025, 9:01:46 AM11/12/25

to sy...@googlegroups.com

On Wed, 5 Nov 2025 at 22:50, Francesco Bonazzi <franz....@gmail.com> wrote:
>
> Maybe it should be made mandatory to disclose any usage of LLM when opening PRs.

There needs to be at the very least a policy that bans using LLMs
without saying that they were used. There should be checkboxes when
opening a PR:

[x] I have or have not used an LLM
[x] I have checked the code from the LLM
[x] I have tested the code myself
[x] I do/don't understand all of the code that was generated by the LLM.

> Banning usage of LLM completely is a bit extreme, but it may be necessary if vibe spammers keep flooding github with useless PRs.

Using LLMs to generate comments and PR descriptions should just be
banned outright with an exception for google translate type use only.
Obviously this cannot be enforced but the messaging needs to be clear:
don't dump LLM output as if it is a comment from yourself.

As for use of LLMs for writing code I don't necessarily object to
simple autocomplete type LLMs for convenience but I think if you look
at typical PRs and contributors right now they really should not be
using LLMs at all. The LLMs only seem to help people to make
time-wasting vibe-code PRs where the "author" has no understanding of
the code they are submitting and has not even done basic testing.

--
Oscar

Daiki Takahashi

unread,

Nov 13, 2025, 7:03:52 AM11/13/25

to sympy

Let me make this clear upfront: all of my posts on GitHub, including this one, rely on translation by an LLM.

I believe it would be reasonable to explicitly state in the policy that spam-like PRs and PRs relying heavily on LLMs are prohibited.
Along with that, the policy should also clarify that such PRs may be proactively closed without prior notice,
and that there should be a clear process for appealing an incorrect closure.

To reduce the review burden, one possible approach would be to require all PRs to undergo an initial review by Copilot before human review.
However, I am not sure how capable Copilot actually is.

2025年11月12日水曜日 23:01:46 UTC+9 Oscar:

Oscar Benjamin

unread,

Nov 13, 2025, 8:36:30 AM11/13/25

to sy...@googlegroups.com

On Thu, 13 Nov 2025 at 12:03, Daiki Takahashi <har...@gmail.com> wrote:
>
> Let me make this clear upfront: all of my posts on GitHub, including this one, rely on translation by an LLM.
>
> I believe it would be reasonable to explicitly state in the policy that spam-like PRs and PRs relying heavily on LLMs are prohibited.

It is very difficult to define what is meant by "spam-like" and I
doubt that someone submitting a PR would understand this in the same
way as reviewers would.

There are different ways of using LLMs and the way that you use them
is absolutely fine. The way that many novice contributors use them is
not useful at all though and at least right now is harmful to sympy
development. I'm not sure how to define the difference between those
in a policy though.

> Along with that, the policy should also clarify that such PRs may be proactively closed without prior notice,
> and that there should be a clear process for appealing an incorrect closure.

Realistically I think in some cases this is the only option. Just
deciding to close them is still a burden though.

> To reduce the review burden, one possible approach would be to require all PRs to undergo an initial review by Copilot before human review.
> However, I am not sure how capable Copilot actually is.

I don't know about Copilot specifically and actually there are many
things called "copilot". I have used "GitHub Copilot" which is an
editor plugin for autocomplete but now there is a "Copilot" button on
the GitHub website that is something different (more like ChatGPT).
Does anyone have any experience of using that?

I see better potential in using AI to help out with reviewing PRs than
having people use AI to write the PRs. Many PRs need quite simple
feedback like "this should have tests. Please add a test in file f"
that could easily be handled by AI (and probably in a more patient,
friendly and helpful way than feedback from human reviewers such as
myself).

Somewhere someone suggested using CodeRabbit which I have seen on some
other repos. I haven't seen it produce anything useful but supposedly
it gets better if you "teach" it.

--
Oscar

Daiki Takahashi

unread,

Nov 14, 2025, 9:24:50 AM11/14/25

to sympy

I created a PR and tried having GitHub Copilot review it as a test.
We can request a Copilot review directly from the PR page.
However, it requires Premium requests, so not everyone can use this feature.

I'm not sure how useful it really is, but it’s certainly convenient to try.

https://github.com/sympy/sympy/pull/28598

document:

https://docs.github.com/en/copilot/how-tos/use-copilot-agents/request-a-code-review/use-code-review

--

haru-44

2025年11月13日木曜日 22:36:30 UTC+9 Oscar:

Message has been deleted

Francesco Bonazzi

unread,

Nov 14, 2025, 2:15:51 PM11/14/25

to sympy

Let's remember that LLMs may write copyrighted material. There is some risk associated with copy-pasting from an LLM output into SymPy code.

Furthermore, what practical algorithmic improvements can an LLM do to SymPy? Can an LLM finish the implementation of the Risch algorithm? I doubt it.

On Friday, November 14, 2025 at 3:24:50 p.m. UTC+1 har...@gmail.com wrote:

However, it requires Premium requests, so not everyone can use this feature.

Most of these AI-assisted tools are designed to take money from developers. I would strongly advise against paying for these services.

LLMs look good at first because most questions had answers in their training set, as soon as you ask an LLM to do anything non-standard or just fix existing code in a way that is not trivial, they fail miserably.

Aaron Meurer

unread,

Nov 15, 2025, 1:06:56 PM11/15/25

to sy...@googlegroups.com

On Fri, Nov 14, 2025 at 12:15 PM Francesco Bonazzi
<franz....@gmail.com> wrote:
>
> Let's remember that LLMs may write copyrighted material. There is some risk associated with copy-pasting from an LLM output into SymPy code.
>
> Furthermore, what practical algorithmic improvements can an LLM do to SymPy? Can an LLM finish the implementation of the Risch algorithm? I doubt it.

"Finishing the Risch algorithm" is an enormous task. Of course an LLM
cannot one-shot that, and a human couldn't do it in one PR either. But
I have actually been using Claude Code to do some improvements to the
Risch algorithm and it's been working. So the statement that an LLM
cannot help with algorithmic improvements in SymPy is false.

>
> On Friday, November 14, 2025 at 3:24:50 p.m. UTC+1 har...@gmail.com wrote:
>
> However, it requires Premium requests, so not everyone can use this feature.
>
>
> Most of these AI-assisted tools are designed to take money from developers. I would strongly advise against paying for these services.

I can't speak to the specific tool being mentioned here, but the best
LLMs right now do require you to pay for them. If a tool is good and
actually improves developer quality of life, we shouldn't be afraid to
pay for it (that applies even outside of AI tools).

FWIW, when it comes to code review, my suggestion would be to use a
local LLM tool like claude code or codex to do the review for you. It
wouldn't be completely automated, but that would give you the best
results. I also agree that writing down the sorts of things you're
looking for in a SymPy review somewhere in the context is going to
make it work better. I would start by having an agent analyze recent
reviews (say, the 50 most recent PR reviews by Oscar), and use that to
construct a Markdown document listing the sorts of things that a good
reviewer is looking for in a review of a SymPy pull request.

>
> LLMs look good at first because most questions had answers in their training set, as soon as you ask an LLM to do anything non-standard or just fix existing code in a way that is not trivial, they fail miserably.

This was true three years ago with GPT 3 but it is not true anymore. I
encourage you to try using GPT-5 codex or Claude 4.5 Sonnet, ideally
using a modern tool like codex CLI, claude code, or Cursor. These
models are very good and can reason about problems they've never seen
before. They still have holes and you have to check everything they do
still, but you can't just assume that something isn't going to work
without trying it.

Even if you have moral qualms against AI (which I personally do not
share), you shouldn't let those give you the wrong impression about
the capabilities of these tools, especially the best-in-class models
like Claude.

Aaron Meurer

>
> --
> You received this message because you are subscribed to the Google Groups "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.

> To view this discussion visit https://groups.google.com/d/msgid/sympy/3c225245-11be-49d4-8f50-f7c2f0010c44n%40googlegroups.com.

Oscar Benjamin

unread,

Nov 15, 2025, 2:25:03 PM11/15/25

to sy...@googlegroups.com

On Sat, 15 Nov 2025 at 18:06, Aaron Meurer <asme...@gmail.com> wrote:
>
> On Fri, Nov 14, 2025 at 12:15 PM Francesco Bonazzi
> <franz....@gmail.com> wrote:
> >
> > Let's remember that LLMs may write copyrighted material. There is some risk associated with copy-pasting from an LLM output into SymPy code.
> >
> > Furthermore, what practical algorithmic improvements can an LLM do to SymPy? Can an LLM finish the implementation of the Risch algorithm? I doubt it.
>
> "Finishing the Risch algorithm" is an enormous task. Of course an LLM
> cannot one-shot that, and a human couldn't do it in one PR either. But
> I have actually been using Claude Code to do some improvements to the
> Risch algorithm and it's been working. So the statement that an LLM
> cannot help with algorithmic improvements in SymPy is false.

There is a huge difference between someone who knows what they are
doing, knows the codebase, understands the algorithm etc using an LLM
and reviewing the results as compared to a typical new SymPy
contributor using an LLM to write the code for them. If you are
someone who could write or review the code without using an LLM then
using an LLM and checking the results is reasonable.

By the way your most recent PR is here Aaron:
https://github.com/sympy/sympy/pull/28464
The PR looks good to me apart from the one bit where you said "Claude
generated this fix for the geometry test failure. It would be good to
review". I reviewed it and decided that it looks like a hacky fix and
showed a counterexample of the type that would break that code. The
LLM output cannot be trusted and does not substitute for humans
investigating and writing the code.

From my perspective as a reviewer "this is LLM code. It would be good
to review" is asking reviewers to do what would usually be expected to
be done by the author of the PR in the first instance. That is only a
small example but it shows the broader problem that I think LLMs will
cause in sympy by shifting greater burden onto reviewers while making
it easier for authors to generate more and more PRs to review.

If you are new to programming in general and new to a particular
codebase and then use an LLM to generate code that you don't
understand then the results are not going to be good. There have
always been PRs where the author clearly does not understand the
codebase well or does not understand all of the implications of the
particular choices made in the code. Now though there are PRs where
the author has no idea why the LLM wrote any of the code that it did
and has not even done the most basic of testing and can only respond
to feedback by copying it into an LLM.

What I think people submitting these PRs right now don't realise is
that when I see the LLM-generated PR description or comments, or the
LLM-generated code that they probably don't understand, it removes all
motivation to review any of their PRs and removes any level of trust
that I might give them from the usual benefit of the doubt. I am
mentally blacklisting contributors based on what I consider acceptable
even if there is not an agreed general policy.

I refuse to review PRs from newer contributors if this is the way that
it is going to happen so each of these needs to be reviewed by someone
else or can join the ever growing pile of unreviewed PRs.

> > On Friday, November 14, 2025 at 3:24:50 p.m. UTC+1 har...@gmail.com wrote:
> >
> > However, it requires Premium requests, so not everyone can use this feature.
> >
> >
> > Most of these AI-assisted tools are designed to take money from developers. I would strongly advise against paying for these services.
>
> I can't speak to the specific tool being mentioned here, but the best
> LLMs right now do require you to pay for them. If a tool is good and
> actually improves developer quality of life, we shouldn't be afraid to
> pay for it (that applies even outside of AI tools).
>
> FWIW, when it comes to code review, my suggestion would be to use a
> local LLM tool like claude code or codex to do the review for you. It
> wouldn't be completely automated, but that would give you the best
> results. I also agree that writing down the sorts of things you're
> looking for in a SymPy review somewhere in the context is going to
> make it work better. I would start by having an agent analyze recent
> reviews (say, the 50 most recent PR reviews by Oscar), and use that to
> construct a Markdown document listing the sorts of things that a good
> reviewer is looking for in a review of a SymPy pull request.

I was thinking more that an LLM bot on GitHub could handle basic
things like how to write the .mailmap file or how to add a test or run
the tests, fix trailing whitespace, interpret CI output and so on. It
would be good to have things like some tooling to identify older
related PRs or issues and say "maybe this fixes gh-1234 so a test
could be added for that" and other things of that nature as well.

I don't actually want an LLM to review the real code changes but I can
see the value in having LLMs help with some of the tedious back and
forth so that a contributor gets rapid help and when a human reviewer
gets to it the PR is more likely in a state that is ready to merge.

--
Oscar

Daiki Takahashi

unread,

Nov 16, 2025, 12:25:48 AM11/16/25

to sympy

> I was thinking more that an LLM bot on GitHub could handle basic
> things like how to write the .mailmap file or how to add a test or run
> the tests, fix trailing whitespace, interpret CI output and so on.

I was thinking along these lines as well. Thanks to the current automated checks in CI,

reviewers no longer need to spend time thinking about things that don't require human judgment.

For the time being, I'd like to see LLMs further expand that area -- handling more of the tasks that

can be automated so reviewers can focus on the parts that really matter.

--

haru-44

2025年11月16日日曜日 4:25:03 UTC+9 Oscar:

Francesco Bonazzi

unread,

Nov 16, 2025, 4:21:15 AM11/16/25

to sympy

The best way to use LLMs is as smart lookups. They have been trained on many books containing details of known algorithms, so LLMs can be used to get some clues from their knowledge, which would otherwise require a lot of time dedicated to reading and finding the correct paragraph explaining what you need in a book.

I have tried using LLMs for coding, the main problems I see are:

they keep hallucinating and making up non-existent APIs quite often, or sometimes even utterly wrong code,
LLMs apparently have no clues on understanding and continuing the existing code... each time they keep rewriting a lot of stuff.

Querying an LLM for suggestions should be seen as some evolution of looking up on Stackoverflow. However, care should be taken, because, unlike answers on Stackoverflow that are human-verified, LLM answers may be just wrong.

If LLMs are used the right way, they may help with being more productive. Unfortunately, my fear is that many developers will start using them without proper supervision. Average number of lines of code written will very likely increase, but so will the number of bugs.

Oscar Benjamin

unread,

Dec 1, 2025, 12:48:23 PM12/1/25

to sy...@googlegroups.com

On Sun, 16 Nov 2025 at 09:21, Francesco Bonazzi <franz....@gmail.com> wrote:
>
> If LLMs are used the right way, they may help with being more productive. Unfortunately, my fear is that many developers will start using them without proper supervision. Average number of lines of code written will very likely increase, but so will the number of bugs.

I think that for a novice SymPy contributor the only right way to use
LLMs can be something like to ask questions about the codebase. Using
them to write the code just means skipping the thinking process that
would be a prerequisite for being able to supervise the LLMs in
writing and checking the code properly.

So far I have avoided pointing at individual pull requests in this
discussion but this one jumped out at me just now:

https://github.com/sympy/sympy/pull/28681

It was opened 6 hours ago by an entirely new contributor and in under
an hour grew to 800 lines of new code.

The author of the PR has also opened a PR in their own repo using the
same branch and over there you can see it being reviewed by
coderabbit:

https://github.com/Cprakhar/sympy/pull/1

There is a comment from coderabbit there that says "Here are the
copyable unit test edits:" and then shows hundreds of lines of unit
test code that seem to have been copied into the PR.

I don't know whether the code in the PR is reasonable. It looks very
LLM-style verbose/duplicative but besides that I don't want to review
it in any detail if the author hasn't spent time doing that
themselves.

Oscar

Francesco Bonazzi

unread,

Dec 8, 2025, 6:15:55 PM12/8/25

to sympy

I fear that AI bots will start opening PRs soon (or maybe they are already doing it). AI can impersonate human conversation pretty well. The purpose of such bots is to use human feedback just to collect data.

Oscar Benjamin

unread,

Dec 9, 2025, 5:58:08 PM12/9/25

to sy...@googlegroups.com

On Mon, 8 Dec 2025 at 23:15, Francesco Bonazzi <franz....@gmail.com> wrote:
>
> I fear that AI bots will start opening PRs soon (or maybe they are already doing it). AI can impersonate human conversation pretty well. The purpose of such bots is to use human feedback just to collect data.

I am actually getting emails roughly once a week right now from AI
companies offering to pay me to review AI generated PRs but I have not
replied to any of them.

I don't think that we are seeing AI bots though. It is just humans
using AI tools sometimes in a reasonable way but more often badly.

We absolutely need to have a policy about this that insists that use
of AI to write the code needs to be disclosed. A policy should clearly
state that it is not acceptable to submit AI generated code if it is
not code that you understand yourself and should explain why this is
bad and what you should do instead.

Regardless of whether the policy is enforceable I think people need to
see a clear statement of what is a reasonable way of going about
things. Honestly I don't blame people for thinking that having an AI
just write all the code is the modern way with all the hype around
this.

Right now the majority of PRs opened are from people who have used
some AI tool to write the code. They have trusted the code in
deference to the AI's seemingly superior capabilities and knowledge
and just launched it into a PR.

The end result is that most PRs now are unchecked LLM output. It is a
waste of time to review these as long as the author thinks that
submitting unchecked LLM output is reasonable because any review
comments are just typed into the LLM and the LLM even writes their
comments in reply.

If we were talking about this in the context of software developers
working in a company together then I think that there could be all
sorts of ways of managing this. In the context of an open source
project having loads of people appearing from nowhere and spewing LLMs
into PRs is unmanageable.

--
Oscar

Anand Bansal

unread,

Dec 16, 2025, 6:37:26 AM12/16/25

to sympy

The problem of AI generated code is happening all across the open source. One case that I am close is https://github.com/paradigmxyz/reth. What I think they are doing is just adding a very strict CI for everything and again reviewers have to put so much effort.

Arka Saha

unread,

Dec 16, 2025, 12:48:15 PM12/16/25

to sympy

I read this discussion on AI bots performing PRs and indeed it's an issue, and would probably be a major issue in the next few months. But again, if the code so generated, even by a AI bot, if it performs well, benefits the organisation and solves the issue, then I don't see why we should demotivate such cases. Also yes, it would need proper fine tuning of LLM if someone aims to achieve this, training it to make proper comments, PRs, branch names etc.

And also yes sometimes a adversary might misuse such a bot, that's what we need to prevent.

Again, opinions might vary, and yeah I would always prefer to write my own code lol and not use any bot to make PRs, I prefer getting my hands dirty.

Oscar Benjamin

unread,

Dec 16, 2025, 1:47:55 PM12/16/25

to sy...@googlegroups.com

On Tue, 16 Dec 2025 at 17:48, Arka Saha <i.am.ar...@gmail.com> wrote:
>
> I read this discussion on AI bots performing PRs and indeed it's an issue, and would probably be a major issue in the next few months. But again, if the code so generated, even by a AI bot, if it performs well, benefits the organisation and solves the issue, then I don't see why we should demotivate such cases.

It isn't good code though and it doesn't benefit the organisation or
solve any issues.

What we are seeing is really just spam with many more pull requests of
much lower general quality. Even if some of them are good, the review
process is overloaded to separate the good ones from the rest.

The problem is that AI is enabling novices to generate low effort PRs
much more easily at the same time as making it harder to review those
PRs because superficially they look good but actually everything about
them is wrong in ways that are hard to predict or understand without
close attention.

Because the AI can write the code and open the pull request and answer
all of the questions about how to do all of those steps, people think
it is acceptable to do that without having done basic things like:

- Reading any of the code (before or after the changes).
- Thinking about any of the code or changes themselves.
- Knowing what changes are even in the PR that they have submitted.
- Knowing how to test changes to the code or how to run the test suite.

Previously it was not really possible to get to the point of having a
PR that passes CI checks without spending some time doing these
things. Now it is possible to skip all of those steps and then produce
a garbage PR that superficially looks reasonable while actually being
entirely wrong.

People at this level really are not benefitting from the use of AI. If
they learned how to do things without AI then they might become
capable of using the AI to produce something good in future.

--
Oscar

Sham S

unread,

Dec 17, 2025, 8:28:42 AM12/17/25

to sy...@googlegroups.com

I agree with Oscar on AI generated PRs. Not only sympy many other open source projects are experiencing this issue, I see most of them are closed without merging as they are verbose and often contain false claims about what they do or haven't passed basic unit tests. I think having a clear policy on disclosure and understanding of the code is the only way to manage this moving forward.

--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.

To view this discussion visit https://groups.google.com/d/msgid/sympy/CAHVvXxRzz9ucfNJqT81xw81aNtRkSjywztq0n%2BKVKZJKFkDomA%40mail.gmail.com.

Peter Stahlecker

unread,

Dec 17, 2025, 10:44:38 AM12/17/25

to sympy

I have been following this discussion for a while.

What I do not understand is this: why would anybody want to push a PR which he does not understand?

This seems to take out all the fun.

I

Francesco Bonazzi

unread,

Dec 17, 2025, 12:53:28 PM12/17/25

to sympy

On Wednesday, December 17, 2025 at 4:44:38 p.m. UTC+1 peter.st...@gmail.com wrote:

What I do not understand is this: why would anybody want to push a PR which he does not understand?
This seems to take out all the fun.

This is a good question indeed. I have some theories:

People opening PR using AI seem to also chat using AI. I suspect some of the might be full AI-bots. Why? Maybe someone testing some product and looking at Github as a way to collect human feedback to further train their model.
many seem to have an overdecorated Github account, often with links at their LinkedIn or other networking sites. In this case, they may simply be trying to bolster their Github account by getting a lot of code merged into major projects, in order to look like fruitful developers. I suspect that in some cases this may help getting job contracts if the recruiters aren't careful enough.

These are my suspicions. Both cases are bad for our community and these kind of people are doing a lot of damage to open source communities.

Peter Stahlecker

unread,

Dec 17, 2025, 2:23:07 PM12/17/25

to sympy

Your point 2. makes emminent sense to me, more so since over 95% of these PRs seem to come from India, where such statistics are considered to be important. (I was in India over 100 times in my job as a salesman before I retired).

My worry is that key people, you, Oscar, Jason, others get tired of these AI - PRs and stop taking care of sympy - which would be the end of sympy.

NB: I am too old and too ignorant to ever push a PR to sympy, but i enjoy the community a lot.

Oscar Benjamin

unread,

Dec 17, 2025, 2:53:01 PM12/17/25

to sy...@googlegroups.com

On Wed, 17 Dec 2025 at 17:53, Francesco Bonazzi <franz....@gmail.com> wrote:
>
> On Wednesday, December 17, 2025 at 4:44:38 p.m. UTC+1 peter.st...@gmail.com wrote:
>
> What I do not understand is this: why would anybody want to push a PR which he does not understand?
> This seems to take out all the fun.
>
> This is a good question indeed. I have some theories:
>
> People opening PR using AI seem to also chat using AI. I suspect some of the might be full AI-bots. Why? Maybe someone testing some product and looking at Github as a way to collect human feedback to further train their model.
> many seem to have an overdecorated Github account, often with links at their LinkedIn or other networking sites. In this case, they may simply be trying to bolster their Github account by getting a lot of code merged into major projects, in order to look like fruitful developers. I suspect that in some cases this may help getting job contracts if the recruiters aren't careful enough.

I don't think that these are AI bots. They are humans who are using AI
for everything including writing the code and writing comments and
things.

The reason for doing this is the google summer of code (GSOC)
programme. SymPy enters that programme every year and a few people
(usually students) will do projects where they get paid by Google to
work on something in SymPy. This is what they want on their CV.
SymPy's rules are that someone has to have a PR merged to be
considered for GSOC so every year at this time large numbers of people
turn up and start opening PRs, many of which are low quality.

This is not anything new but the difference now is that it is much
easier to open a PR as I said above:

> I think what has happened is that the combination of user-friendly
editors with easy git/GitHub integration and LLM agent plugins has
brought us to the point where there are pretty much no technical
barriers preventing someone from opening up gibberish spam PRs while
having no real idea what they are doing.

Previously there were some barriers like you can't edit the code
without first looking at the code or you have to figure out git etc.
These barriers had two effects:

- They would filter out many PRs before they even existed.
- They required greater effort so that the person opening the PR was
forced to think more about what they were doing resulting in a better
PR.

Removing those barriers means having more PRs of a lower quality. AI
in particular makes it possible to reduce the time taken to generate a
low quality PR massively and the effect of this is obvious if you look
at how quickly some people are opening multiple PRs in succession.

--
Oscar

Oscar Benjamin

unread,

Dec 17, 2025, 3:26:27 PM12/17/25

to sy...@googlegroups.com

On Wed, 17 Dec 2025 at 19:52, Oscar Benjamin <oscar.j....@gmail.com> wrote:
>
> I don't think that these are AI bots. They are humans who are using AI
> for everything including writing the code and writing comments and
> things.
>
> The reason for doing this is the google summer of code (GSOC)
> programme.

Maybe there should be something about this in the GSOC rules/guidance.
The reality is that when it comes to ranking candidates for GSOC
spammy AI stuff is going to go in the bin so if someone is opening AI
PRs now in the hope of getting accepted for GSOC then I think they
have misunderstood the situation. We should probably make that clear
at the outset to anyone thinking of applying.

--
Oscar

Francesco Bonazzi

unread,

Dec 19, 2025, 3:49:37 AM12/19/25

to sympy

On Wednesday, December 17, 2025 at 8:53:01 p.m. UTC+1 Oscar wrote:

I don't think that these are AI bots. They are humans who are using AI
for everything including writing the code and writing comments and
things.

Let's do a simple test. Instead of commenting these PRs by typing text in, let's just attach an image containing the comment. This is no problem for human beings, but I expect AI-bots to fail in understanding the comment, unless they are connected with an OCR or use vision-language models.

The reason for doing this is the google summer of code (GSOC)
programme. SymPy enters that programme every year and a few people
(usually students) will do projects where they get paid by Google to
work on something in SymPy. This is what they want on their CV.
SymPy's rules are that someone has to have a PR merged to be
considered for GSOC so every year at this time large numbers of people
turn up and start opening PRs, many of which are low quality.

SymPy isn't the only project that's being spammed by AI-generated PRs. Apparently this problem is quite common.

Let's keep an eye on the preventive measures that other projects are taking.

Ralf Schlatterbeck

unread,

Dec 19, 2025, 4:00:59 AM12/19/25

to sy...@googlegroups.com

On Fri, Dec 19, 2025 at 12:49:36AM -0800, Francesco Bonazzi wrote:
>
> Let's do a simple test. Instead of commenting these PRs by typing text in,
> let's just attach an image containing the comment. This is no problem for
> human beings, but I expect AI-bots to fail in understanding the comment,
> unless they are connected with an OCR or use vision-language models.

I've successfully uploaded photographed pages from a book describing
contrapunct (music) rules and asked Claude (the AI from anthropic.com)
to write code for a project. So OCR is not a test for an AI these days.
Many AIs have OCR built in.

Ralf
--
Dr. Ralf Schlatterbeck Tel: +43/2243/26465-16
Open Source Consulting www: www.runtux.com
Reichergasse 131, A-3411 Weidling email: off...@runtux.com

Oscar Benjamin

unread,

Dec 27, 2025, 9:16:05 AM12/27/25

to sy...@googlegroups.com

On Wed, 17 Dec 2025 at 19:52, Oscar Benjamin <oscar.j....@gmail.com> wrote:
>

> On Wed, 17 Dec 2025 at 17:53, Francesco Bonazzi <franz....@gmail.com> wrote:
> >
> > On Wednesday, December 17, 2025 at 4:44:38 p.m. UTC+1 peter.st...@gmail.com wrote:
> >
> > What I do not understand is this: why would anybody want to push a PR which he does not understand?
> > This seems to take out all the fun.
> >
> > This is a good question indeed. I have some theories:
> >
> > People opening PR using AI seem to also chat using AI. I suspect some of the might be full AI-bots. Why? Maybe someone testing some product and looking at Github as a way to collect human feedback to further train their model.
> > many seem to have an overdecorated Github account, often with links at their LinkedIn or other networking sites. In this case, they may simply be trying to bolster their Github account by getting a lot of code merged into major projects, in order to look like fruitful developers. I suspect that in some cases this may help getting job contracts if the recruiters aren't careful enough.
>
> I don't think that these are AI bots. They are humans who are using AI
> for everything including writing the code and writing comments and
> things.

You can see an actual sympy repo full of AI bots opening issues and PRs here:

https://github.com/agyn-sandbox/sympy/issues

That seems to be from an AI company. Their bot tagged me while
hallucinating a non-existent bug:
https://github.com/agyn-sandbox/sympy/issues/70

One of the bots there also commented on the SymPy repo in the last 24 hours:
https://github.com/sympy/sympy/pull/17#issuecomment-3693378386

I'm not sure whether to report them as spam/abuse but I don't want to
get notifications from their AI bots and we should take care to see if
they are commenting on the sympy repo.

--
Oscar

Oscar Benjamin

unread,

Dec 27, 2025, 9:25:39 AM12/27/25

to sy...@googlegroups.com

In fact there was even a pull request from an AI bot:
https://github.com/sympy/sympy/issues/28843

--
Oscar

Oscar Benjamin

unread,

Dec 30, 2025, 6:31:10 PM12/30/25

to sy...@googlegroups.com

This one here looks like a real AI bot:
https://github.com/sympy/sympy/pull/28862

Sangyub Lee

unread,

Jan 1, 2026, 8:29:39 AMJan 1

to sympy

> Let's do a simple test. Instead of commenting these PRs by typing text in, let's just attach an image containing the comment. This is no problem for human beings, but I expect AI-bots to fail in understanding the comment, unless they are connected with an OCR or use vision-language models.

I don't think that it is difficult problem to solve than using Captcha for submitting a PR. That does not 100% solve the problem but at least raises barrier to professional spammers and scammers who can bypass that.
Many social media services should have endured much more chaos than SymPy or Github for this stuff, so we can use their experiences. There are always more stricter options like policies allowing stricter face/ID verified accounts.
Github may not have submitting captcha before PR, but we can implement something like CI check with 3rd party captcha service for new contributors.

Jason Moore

unread,

Jan 14, 2026, 7:46:24 AMJan 14

to sy...@googlegroups.com

Dear all,

I opened this PR with a draft of a SymPy AI code contribution policy:

https://github.com/sympy/sympy/pull/28941

I suggest that people review and I will improve the text based on the feedback there until we have some consensus.

This will not solve all issues associated with the onslaught of AI slop, but can at least get some kind of policy codified that we can use to help fend off the slop.

Jason

moorepants.info
+01 530-601-9791

--

You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.

To view this discussion visit https://groups.google.com/d/msgid/sympy/5810a4bc-b237-4ee7-a707-cb29cddea2d2n%40googlegroups.com.

Jason Moore

unread,

Jan 16, 2026, 2:22:09 AMJan 16

to sy...@googlegroups.com

Dear all,

We now have a second draft of the AI policy:

https://github.com/sympy/sympy/pull/28941

based on a round of feedback. Please review the 2nd draft and leave any more feedback.

I would like to have it merged after this coming round of feedback. Please have a look.

Thanks,

Jason

moorepants.info
+01 530-601-9791

peter.st...@gmail.com

unread,

Jan 16, 2026, 2:45:29 AMJan 16

to sy...@googlegroups.com

I like it. If some reviewer rejects an AI generated PR / issue he/she has a “policy reason” to point to.

B.t.w.: I did not once see that a submitter, whose PR was rejected on AI grounds, objected to this – apparently it was AI generated.

Peter

To view this discussion visit https://groups.google.com/d/msgid/sympy/CAP7f1AiQqbW5CL%3DOyk6h3jnBCjrN4oBbv_wW4kxHt-yy7U3zPg%40mail.gmail.com.

Jason Moore

unread,

Jan 28, 2026, 8:39:54 AMJan 28

to sy...@googlegroups.com

Dear all,

Thanks for all the input. The AI policy is now live:

https://docs.sympy.org/dev/contributing/ai-generated-code-policy.html

Jason

moorepants.info
+01 530-601-9791

To view this discussion visit https://groups.google.com/d/msgid/sympy/061b01dc86bc%240d11db80%2427359280%24%40gmail.com.

Chris Smith

unread,

Jan 30, 2026, 3:28:21 PMJan 30

to sympy

I’d like to raise a process question regarding the application of the AI policy to reviewed pull requests.

A recent PR addressing a long-standing performance issue in degree() was closed with the label “AI slop,” despite the following:

The change was technically reviewed (by myself)
The correctness and performance improvements were verified
The author disclosed AI usage and stated that finding the source of the problem and design were manual
The PR appears to comply with current AI policy as written (or at least I don't see the violation)

I’m not arguing that the change must be merged. My concern is procedural: when a pull request has received substantive technical review and endorsement, it seems problematic for it to be summarily dismissed without a technical rationale. If there is no review and the code clearly doesn't address the problem in a meaningful way then I don't see a problem with closing it after a couple of days.

As the AI policy currently stands, it permits AI-assisted contributions provided the author understands and takes responsibility for the code. And having reviewed the code, I can't see why labelling it as “AI slop” is a sufficient basis for closure in the absence of technical objections.

I’d like to ask whether we should clarify policy or process here, for example by distinguishing between unreviewed submissions and those that have received substantive technical review. At minimum, it would be helpful to document whether reviewed PRs are expected to receive a technical disposition, even when AI assistance is involved.

I’m happy to help draft a clarification if that would be useful.

/c

Oscar Benjamin

unread,

Jan 30, 2026, 5:12:06 PMJan 30

to sy...@googlegroups.com

On Fri, 30 Jan 2026 at 20:28, Chris Smith <smi...@gmail.com> wrote:
>
> I’d like to raise a process question regarding the application of the AI policy to reviewed pull requests.
>
> A recent PR addressing a long-standing performance issue in degree() was closed with the label “AI slop,”

I definitely should have been more polite so I will acknowledge that.
I guess I must have been getting annoyed with there being so many low
quality and AI PRs.

If you want to review and merge that PR then go ahead.

> The author disclosed AI usage and stated that finding the source of the problem and design were manual
>
> The PR appears to comply with current AI policy as written (or at least I don't see the violation)

I think that this is a bit disingenuous and that you know that the
code was all written by AI and that this was not honestly disclosed.
The AI policy says that you should explain how you have used AI and
that is in the PR template but what was written there was just "review
suggestions were provided by an AI tool". Maybe you read that
differently from me but what it should honestly say is "the code was
all written by Claude".

Much like I can see a few lines of code and know immediately that it
was written by Christopher Smith I can also see a few lines of code
and know that it was written by Claude or its ilk. I'm pretty sure
that you can also read a few lines of code and say the same things.

> As the AI policy currently stands, it permits AI-assisted contributions provided the author understands and takes responsibility for the code. And having reviewed the code, I can't see why labelling it as “AI slop” is a sufficient basis for closure in the absence of technical objections.

I think that you are misunderstanding what I would consider to be the
problem of AI PRs. It is not about the actual code and its quality.
Many AI tools can produce better code than many of the people
currently opening SymPy PRs. The problem with the AI PRs is that they
are harder to review and are overloading us with spam. The problem
also is that if people don't explain honestly how they have used AI
then that in itself makes it harder to review the code because you
have to pick apart the AI hallucinations from the human
misunderstandings.

The other problem is that if we really want AI PRs then we don't
actually need new contributors to bring them. It would be far more
efficient for those of us who would have reviewed the PR to make AI
PRs directly without the other person getting in the way.

Technically the main problem with the PR is just the fact that it is
classic more code on top of code creating more space for bugs without
actually delivering much value. Unfortunately this is exactly what AI
makes easy: you ask it to do something and it just spits out more and
more code. The code might seem to work but if we merge it into the
codebase then it needs to be maintained and soon there will be bug
reports saying that degree doesn't work in this or that corner case
and then someone will have to review the bug reports and then someone
else will have an AI spit out more code on top of code and so on.

The PR may "fix" some issue but how much actual value are you getting?
It doesn't actually make it safe to call degree on an expression
because it still has a fallback case where it would effectively go
into an infinite loop like degree((x - 1)**1000000 - (x +
1)**1000000). The proper fix would be to have a version of Poly that
does not expand everything and that would be something very useful
that could be used in lots of places for much more than just degree
like it could also get the leading coefficient, support different
domains, evaluate and so on.

In any case all of the low quality PRs around now are seriously
getting me down. I think for now I will stop reviewing PRs from anyone
but a select list of people whose PRs don't annoy me (people who can
consistently produce something that does not require many iterations
unless it delivers significant actual value). This will mean that
unless other people do a lot more PR reviewing most GSOC-related PRs
are not going to get reviewed.

--
Oscar

Jason Moore

unread,

Jan 31, 2026, 1:56:54 AMJan 31

to sy...@googlegroups.com

A core dev was engaging with the PR, so closing it over top of them is not polite. Ideally we would not do such a thing at all (AI involved or not). We should also be more polite in the comments to the contributor, even if we are talking to an AI bot.

My suggestion would be that any core dev who wants to apply the policy should ask the other core devs involved in the PR discussion before making the universal call.

Jason

moorepants.info
+01 530-601-9791

--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.

To view this discussion visit https://groups.google.com/d/msgid/sympy/CAHVvXxTi4x5ZBAeUmWhLZsU%2B_ffYFCL7Y87HnKhRAjMd0XL3sQ%40mail.gmail.com.

Oscar Benjamin

unread,

Jan 31, 2026, 7:15:16 AMJan 31

to sy...@googlegroups.com

Let's just clarify the individual points here:

- Do we agree that the code is AI generated?
- Do we agree that this was not honestly disclosed?
- Do we agree that the AI policy prohibits that and says we will close
a PR in that case?

I agree that it should be done more politely but if we don't have a
clear "undisclosed AI means close PR" rule then I think that the
policy is a failure.

Any suggestion that more effort, thought, discussions etc goes into
the decision to close an AI PR is missing the point that the problem
they cause is the time that it takes to sift and review them. The
effort ratio where someone can type a prompt and have an AI make a PR
and then a reviewer has to spend actual time considering its technical
merit is precisely the problem and we need a solution that is minimal
effort on the maintainer side.

I'm not sure if I even noticed that Chris was reviewing the PR. I just
saw people mentioning LLMs, looked and saw that the code was AI
generated and then looked and saw that the AI disclosure was
dishonest.

It is perhaps worth pointing out that the sympy repo has been drowning
in PRs for a long time. There are 921 open PRs and that number only
goes up. This was already unsustainable before the AI PRs came along.

Oscar

> To view this discussion visit https://groups.google.com/d/msgid/sympy/CAP7f1AjKhWt1yq4aP1rOHxBr-6ctdA7NCJ7eqCeqNSgNVPMGNA%40mail.gmail.com.

Rushabh Mehta

unread,

Jan 31, 2026, 7:31:15 AMJan 31

to sympy

I was the first person to comment under that PR and mention LLMs. There were couple reasons I did this, first: It is obvious looking at the code that its AI generated (example: the tests have numbered comments, classic sign of LLMs like ChatGPT and Gemini)
Second:
Knowing how newer contributors act, the person raising the PR must have inputted the old issue and the related file into an LLM, and simply raised a PR out of it, also generating a PR description in the process.

I am in total favour of the original decision of it being closed.

Jason Moore

unread,

Jan 31, 2026, 11:03:30 AMJan 31

to sy...@googlegroups.com

I agree that this PR can be closed based on the new policy. It sounds like you missing Chris's review was then just an oversight.

Jason

moorepants.info
+01 530-601-9791

To view this discussion visit https://groups.google.com/d/msgid/sympy/CAHVvXxRx49c8U07sH6-2foEWA04LdrSjT82nCmX0M8pkXLuCKg%40mail.gmail.com.

Peter Stahlecker

unread,

Jan 31, 2026, 11:38:00 AMJan 31

to sympy

My comment may only be tangent to the topic.

Oscar mentioned that there are over 900 PRs. I checked and the oldest one is from 2012!

My question: If a PR sits around for over a year and has not been merged, does this not indicate that it might not be a great one?

Would it make sense to just close untouched PRs that are beyond a certain age?

I am just a user, not a contributor, just following the discussions.

Peter

Aaron Meurer

unread,

Jan 31, 2026, 11:38:48 AMJan 31

to sy...@googlegroups.com

If a PR seems like it was generated by AI and doesn't disclose it, but
seems otherwise valuable, we should just ask the contributor if it was
written by AI. If they don't respond, or if they lie, we can close it.
The policy doesn't prohibit AI, just not disclosing it.

Of course, if a PR is just completely not valuable we should close it.
This is something we've gotten away with not doing so much in the
past, because it is a little rude, but with so many AI PRs we can't
really afford to be so lax anymore. Also, it's much less of a slap in
the face, even when a PR does have an actual human behind it, if they
didn't actually spend that much time on the PR (but note that AI
doesn't necessarily = no time spent. That's just the default).

In the case of this PR, the code does indeed look "slop-y", like it
could be much simpler. This is not unlike what I would expect from a
new contributor normally, actually, except that AI tends to write way
more comments than a human would. If the code were much simpler I
don't think it would be that bad to include, i.e., the whole thing
could probably be only a handful of lines instead of 60. My take is
that if a maintainer wants to take the time to push this to something
mergeable, they should. But if no one does, it should be closed.

Aaron Meurer

> To view this discussion visit https://groups.google.com/d/msgid/sympy/CAP7f1Ai-zEjOaJJ9rBXzXA%2B3i9o5hGhmfO2VPBrt_mtZNRdviQ%40mail.gmail.com.

Oscar Benjamin

unread,

Jan 31, 2026, 1:00:42 PMJan 31

to sy...@googlegroups.com

On Sat, 31 Jan 2026 at 16:38, Peter Stahlecker
<peter.st...@gmail.com> wrote:
>
> My comment may only be tangent to the topic.
> Oscar mentioned that there are over 900 PRs. I checked and the oldest one is from 2012!
> My question: If a PR sits around for over a year and has not been merged, does this not indicate that it might not be a great one?

Some are good and some are not. All you can really say is that they
have been forgotten about and there are just too many for anyone to go
back through.

> Would it make sense to just close untouched PRs that are beyond a certain age?

Yes it would and I think that this is what sympy should do now. We
should have some automation that closes a PR after 2 months of no
activity. The notification of it being closed can be a trigger that a
reviewer can quickly check if it is worth not closing or it can remind
the contributor that they should actually finish it or something.

There are a few reasons that a PR can end up in that state. Basically
when a new PR is opened or someone pushes new commits or comments etc
I get a notification by email. I regularly swipe through the messages
in that mailbox which shows me most activity happening in the repo as
it happens. At the time of someone opening/updating a PR I may choose
to look at it and if I do then I can decide between these options:

1. Merge the PR.
2. Make a comment suggesting changes.
3. Make a comment suggesting to close the PR.
4. Close it immediately (unusual but much more common recently).
5. Ignore the PR (perhaps just until further changes).

A lot of PRs end up in a space where you can choose between options 2,
3 and 5 depending on how much effort you as a reviewer want to put in.
How much work it would be to suggest changes that improve the PR
varies a lot depending on who is submitting the PR and what they are
trying to change. Often then I will just ignore the PR based on its
contents and who the author is.

Of course I am not the only person reviewing PRs and anyone is free to
ignore a PR. What that means is that if I do choose to ignore a PR I
don't actually know that that means that no one else is going to
review the PR although often that is the case. To some extent there is
a notion that there are certain people who would review PRs for some
particular parts of the codebase but there are also quite large parts
of the codebase that no one really actively maintains so a PR for
those can easily be ignored.

Another point that is probably worth mentioning here is that if you
are the first person to comment on a PR then that is often interpreted
like you are taking responsibility for that PR. Then the author will
assume that you are going to interact with them repeatedly rather than
just commenting once and ignoring it afterwards. Likewise other
potential reviewers may be discouraged from reviewing the PR by the
fact that someone else seems to be already doing that.

I think that automatically closing PRs after a certain time is
realistically the only good way of handling this and is just a
necessary admission of our resource limitations.

There are a lot of PRs where I know that it is not good enough to be
merged but I don't think that the effort (on our part) of helping the
author improve the PR is worth it. If I was saying my thoughts out
loud then it would be things like:

- This person is not capable of making this PR correct.
- It would be too much effort to explain to them what is needed.

The difference with the AI PRs now is that probably anyone can do
anything with enough prompting but PR review is now like prompting a
really broken AI (worse than using an actual AI directly). The fact
that people are using AI (for anything, not just writing the code)
increases the work involved from my perspective to do anything other
than just ignore the PR.

Another thing that happens particularly now with lots of new
contributors making mostly trivial PRs is that I would generally be
quite strict about making sure that someone's first or second PR must
be exactly correct and follow all standards. In the past what that
would mean is that if you give clear and detailed feedback to
someone's first PR then their second PR will not have any of the
issues that were discussed in the previous one.

Now though with AI PRs each PR comes from a different AI context
window. Everything you said in the previous PR like "use X not Y" will
have to be said again because the AI forgot all of your previous
feedback. A human actually thinking about and typing the code would
have remembered that feedback when typing it but a human using an AI
and not closely reviewing its output will just produce the same wrong
AI code again and again. Unfortunately a lot of new sympy contributors
seem to think that it is okay to expect that all the reviewing is done
by someone else and that means that they are just not experienced
enough to be able to use AI in any reasonable way.

I watched a talk from someone at Google who was something like
"director of AI infrastructure" and was basically overseeing how
Google developers use AI tools for programming. He literally said that
it meant that junior people were opening PRs in half the time but PR
reviews were taking twice as long and then ruminated about how to
manage the increased burden for senior people doing the reviewing. His
suggestions for managing that burden were reasonable things to do in a
company with employees, teams, line managers, and so on. None of his
ideas were remotely useful for an open source project where the
critical resource is always maintainer time.

I think that we need to question whether the open-to-anyone model of
open source collaboration is actually viable any more. In particular
for SymPy I think that GSOC is not worth it any more because it means
all of these low quality PRs now (not just the AI ones).

--
Oscar

gu...@uwosh.edu

unread,

Jan 31, 2026, 5:37:42 PMJan 31

to sympy

As a minor contributor, I strongly agree with the idea that PRs without activity for some period of time should be automatically closed. I am not sure the core maintainers and reviewers need to be notified. The originator of the PR should be notified with a message explaining that it was closed because of inactivity over the last XX period of time. They should be encouraged to review the PR carefully and decide if they have the time and interest in adjusting the code so that it meets all requirements for merging and address any concerns raised in the PR before it was closed. If so, they should open a new pull request with code updated to pass all tests when built with the current development branch and refer to the old PR in case reviewers want to look at the history.

I would be in favor of the period of inactivity being in the range of 6 months to a year. This would potentially close both pull requests I currently have open (https://github.com/sympy/sympy/pull/28258 and https://github.com/sympy/sympy/pull/24574). This seems reasonable, because although I would be interested in pursuing both of them, the reality is that my primary job is as a Chemistry Professor/Computational Quantum Chemist and I am unlikely to have much time for work on either of these until at least the end of the current semester. I do not object to opening a new PR or reopening the old one when I get back to being able to consider the code.

AI generated slop should definitely be closed when noticed. My experience as an instructor is that people will only do what you want if you enforce the expectations.

Jonathan

Oscar Benjamin

unread,

Jan 31, 2026, 6:10:44 PMJan 31

to sy...@googlegroups.com

On Sat, 31 Jan 2026 at 22:37, 'gu...@uwosh.edu' via sympy
<sy...@googlegroups.com> wrote:
>
> As a minor contributor, I strongly agree with the idea that PRs without activity for some period of time should be automatically closed. I am not sure the core maintainers and reviewers need to be notified. The originator of the PR should be notified with a message explaining that it was closed because of inactivity over the last XX period of time. They should be encouraged to review the PR carefully and decide if they have the time and interest in adjusting the code so that it meets all requirements for merging and address any concerns raised in the PR before it was closed. If so, they should open a new pull request with code updated to pass all tests when built with the current development branch and refer to the old PR in case reviewers want to look at the history.
>
> I would be in favor of the period of inactivity being in the range of 6 months to a year. This would potentially close both pull requests I currently have open (https://github.com/sympy/sympy/pull/28258 and https://github.com/sympy/sympy/pull/24574). This seems reasonable, because although I would be interested in pursuing both of them, the reality is that my primary job is as a Chemistry Professor/Computational Quantum Chemist and I am unlikely to have much time for work on either of these until at least the end of the current semester. I do not object to opening a new PR or reopening the old one when I get back to being able to consider the code.

I'm not sure how this would work in practice but for example if
someone else closes your PR then I think it isn't possible for you to
reopen it unless you are a member (some who can merge PRs). You could
of course comment that you would like to reopen and then a member
could do it and we could make sure that the bot that closes the PR
would leave a message explaining that.

I'm not sure about 6 months. I think that basically after 1 month of
inactivity the PR is usually forgotten but the amount of time passed
is still small enough that the PR could easily be revived if people
were reminded and actually wanted to continue with it. If all
maintainers simply happened to overlook a PR then 1 month is probably
a good amount of time for there to be some kind of notification so
perhaps a bot could comment then that it has been inactive for 1 month
and then if no one does anything then at the 2 month mark it can be
closed.

It would probably be helpful for some people if the 1 month bot
message explains some common reasons to the author like "if CI checks
have failed and there are red crosses everywhere then that might
explain why no one has reviewed your PR". This is something that is
more common right now in the AI age that someone has opened a PR with
broken code and then all of the checks have failed but it almost seems
like the author has not seen that all the tests have failed. I would
not generally bother commenting to say something like "as you can see
all the tests have failed" but perhaps after 1 month it might be good
to point that out to the author.

If we wait longer than 2 months then what are we actually waiting for?
No one is actually going to go back more than 2 months looking for PRs
to review. If anyone wants to revive it further in the future then it
can just be reopened but if no one expresses any positive interest in
doing that then what benefit do we get from keeping a 2 month old PR
in the "open" state rather than the "closed" state? The closed PRs are
still there with all their code and message for everyone to see.

--
Oscar

gu...@uwosh.edu

unread,

Jan 31, 2026, 9:53:39 PMJan 31

to sympy

I like Oscar's idea of a pre-warning that the PR will be closed in a month with suggestions about things to check if the author wishes to pursue getting the PR merged. I can understand from the maintainers' view not wanting the time frame for automatic closure of to be too long.

I only have short bursts of time I can spend contributing to the various open-source projects I am involved in, so am likely to have PRs that would not be completed in a two month time frame. However, from my perspective that could easily be worked around by opening a new PR referring back to the old one, when I am able to cycle back around to work on it again. Thus, I think that having PRs that I am continuing to work on closed after 2 months of inactivity would be fine. It is unlikely to deter me from continuing on projects I am interested in over the long term. I do think the shorter time frame will clean up the repo and may be effective at discouraging people who are not truly serious about working on sympy.

I strongly suggest that automatic closure of PRs after a period of inactivity be implemented. From my perspective as a sometime contributor, I am not sure that the time frame matters as much as the fact that it should happen. The time frame should be chosen to work best for the development cycle the core maintainers can manage.

Jonathan

Jason Moore

unread,

Feb 1, 2026, 1:53:28 AMFeb 1

to sy...@googlegroups.com

In the past we've used the "closed" designation on a PR to mean: 1) this is merged into master and 2) this will definitely not be merged into master. If we close PRs based on inactivity time, then we have PRs labeled "closed" which are neither 1 or 2, they still have the state "could be or might be merged to master or might be rejected" but now we've labeled them with "closed" which would seemingly imply 1 or 2. So it seems to me if you close based on inactivity time, then the meaning of "open" or "closed" PR no longer has distinct meanings.

Jason

moorepants.info
+01 530-601-9791

--

You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.

To view this discussion visit https://groups.google.com/d/msgid/sympy/e5ae466e-1cf1-40db-bfdc-5a8ba525ae5bn%40googlegroups.com.

peter.st...@gmail.com

unread,

Feb 1, 2026, 2:28:33 AMFeb 1

to sy...@googlegroups.com

Would a ‚time based closure’ not be close to no 2, weakened to something like likely not to be merged?

If a PR was really excellent, would it not be looked at by somebody in good time?

Oscar made an additional point about GSoC

I am convinced that the flood of low level (Oscar’s judgement. I do not have the skills to judge them) PRs is due to the fact that the submitters want to participate in GSoC.

I cannot judge the pros of sympy being “in” GSoC vs. the drawback of the flood of PRs, but surely the experts must have opinions based on their experience on this question.

Peter

From: sy...@googlegroups.com <sy...@googlegroups.com> On Behalf Of Jason Moore
Sent: Sunday, February 1, 2026 7:53 AM
To: sy...@googlegroups.com
Subject: Re: [sympy] AI generated pull requests

In the past we've used the "closed" designation on a PR to mean: 1) this is merged into master and 2) this will definitely not be merged into master. If we close PRs based on inactivity time, then we have PRs labeled "closed" which are neither 1 or 2, they still have the state "could be or might be merged to master or might be rejected" but now we've labeled them with "closed" which would seemingly imply 1 or 2. So it seems to me if you close based on inactivity time, then the meaning of "open" or "closed" PR no longer has distinct meanings.

Jason

moorepants.info
+01 530-601-9791

To view this discussion visit https://groups.google.com/d/msgid/sympy/CAP7f1AhcGTF_wTD96CyUFk3uPRpE3d_kxWm3-sW_5dLe20AXVg%40mail.gmail.com.

Jason Moore

unread,

Feb 1, 2026, 2:46:38 AMFeb 1

to sy...@googlegroups.com

GSoC has played a very significant role in the history of SymPy. I don't think that can be understated. We have always gotten lots of poor quality contributions and interactions during the annual application phase. AI slop causes a rise in this.

We should have some careful deliberation before deciding to not participate in GSoC in the future. My opinion is that GSoC results reflect proportionally the time/quality of mentorship in most cases.

Jason

moorepants.info
+01 530-601-9791

To view this discussion visit https://groups.google.com/d/msgid/sympy/050001dc934c%24551688d0%24ff439a70%24%40gmail.com.

Vedant Dusane

unread,

Feb 1, 2026, 4:24:15 AMFeb 1

to sympy

Hi everyone,

It was really helpful to understand the maintainer perspective and the review burden caused by low-quality or undisclosed AI-generated PRs.

As a newer contributor trying to learn SymPy through small, focused changes (mainly typing and cleanup), this clarified expectations a lot. I want to make sure my future PRs show real understanding and are worth reviewer time.

This discussion has helped me understand the review burden much better.
I was wondering if ‘lightweight’ approaches such as better AI disclosure or short reasoning notes in PRs could help reduce reviewer time without adding much overhead.”

What are the main signals you look for that indicate a PR is based on genuine understanding rather than automated generation?

Jason Moore

unread,

Feb 1, 2026, 5:42:00 AMFeb 1

to sy...@googlegroups.com

HI Vedant,

> What are the main signals you look for that indicate a PR is based on genuine understanding rather than automated generation?

My suggestion to you, as a new contributor, is to not use AI. Then your contribution is by default genuine because you created it using your own brain and effort.

If you are trying to figure out how to make your not-genuine PR look genuine via signals, then you are missing the point.

Jason

moorepants.info
+01 530-601-9791

To view this discussion visit https://groups.google.com/d/msgid/sympy/6ff6b01d-c1c6-4287-898c-1b8979eb98f5n%40googlegroups.com.

Vedant Dusane

unread,

Feb 1, 2026, 5:59:28 AMFeb 1

to sy...@googlegroups.com

Thank you sir for your response,

Currently I am trying my best. Reading codebase understand logic, what the function exactly do here and also I am able or not to explain the changes in PR.

Thank you again

To view this discussion visit https://groups.google.com/d/msgid/sympy/CAP7f1Ai%2BL%2BksWOf47E%2Bpt3cBpmkxkQnAh4RQEyCQgazgprAd6w%40mail.gmail.com.

Oscar Benjamin

unread,

Feb 1, 2026, 7:53:55 AMFeb 1

to sy...@googlegroups.com

On Sun, 1 Feb 2026 at 06:53, Jason Moore <moore...@gmail.com> wrote:
>
> In the past we've used the "closed" designation on a PR to mean: 1) this is merged into master and 2) this will definitely not be merged into master. If we close PRs based on inactivity time, then we have PRs labeled "closed" which are neither 1 or 2, they still have the state "could be or might be merged to master or might be rejected" but now we've labeled them with "closed" which would seemingly imply 1 or 2. So it seems to me if you close based on inactivity time, then the meaning of "open" or "closed" PR no longer has distinct meanings.

I think that currently open vs closed does not have distinct meanings.
Most PRs in the open state should really be in the closed state. It is
just that no one has closed them. Even if the PR is closed that is
usually because the author decided to close the PR which does not
necessarily reflect a decision from the project that the PR was the
wrong approach.

If we close based on inactivity then an open PR has an objective
meaning that there is some recent activity. A PR that is closed would
have a message saying that it was closed because of inactivity and
then it is clear that that is not necessarily a rejection of what is
in the PR.

Most of the time the reason a PR has not been merged is not really to
do with making a decision about what the PR is trying to do but just
because the author hasn't done it properly and at the time when anyone
looked at it it was not clear if the author was going to fix the
problems or not. There may or may not be a comment from a reviewer
explaining what the problem with the PR is.

--
Oscar

Oscar Benjamin

unread,

Feb 1, 2026, 8:23:48 AMFeb 1

to sy...@googlegroups.com

On Sun, 1 Feb 2026 at 07:46, Jason Moore <moore...@gmail.com> wrote:
>
> GSoC has played a very significant role in the history of SymPy. I don't think that can be understated. We have always gotten lots of poor quality contributions and interactions during the annual application phase. AI slop causes a rise in this.
>
> We should have some careful deliberation before deciding to not participate in GSoC in the future. My opinion is that GSoC results reflect proportionally the time/quality of mentorship in most cases.

I agree but I think that reviewing all these PRs right now is not
possible. I have now decided to change my own approach to doing this
in such a way that almost all of the PRs opened by new contributors
hoping to do GSOC will be ignored by me. The new basic rule is that if
the PR is not perfect or requires any effort to review then I will
ignore it and I will not leave any feedback about what is wrong with
the PR.

I don't think that we can reasonably select from GSOC candidates
without these PRs but I also don't think it is manageable to actually
review these PRs in a way that is supportive of novices. There are
just too many people who would require too much coaching.

I think it would be better if sympy gave up on GSOC and focussed on
recruiting higher quality contributors rather than training an army of
novices almost all of whom will leave as soon as the GSOC selections
are announced. Even those who do GSOC rarely contribute again in
future. I know that GSOC has been different in the past but I think
that things have changed and it isn't worth it any more.

The time spent with all these PRs would be better spent just doing
what the GSOC project was suggested to do. New contributors would
still come but they would be people with a broader range of
experience. They would be more likely to hang around long term making
it more worthwhile to coach them and their good PRs would not be
ignored because of being buried in a sea of bad PRs.

--
Oscar

Jason Moore

unread,

Feb 1, 2026, 8:44:19 AMFeb 1

to sy...@googlegroups.com

This reminds me that there is more nuance than I originally wrote.

Github has 4 PR states and as far as I can gather from the last 15 years of community practice this is how we've treated them:

- Open (green): request for review and author desires to have it merged

- Closed (red): ether a core dev closed it to signal it will not be merged or the submitter self-closes it to signal they will not pursue it further

- Merged (purple): a core dev merged the PR into master

- Draft (grey): pull request shared so others can view the work or collaborate but not ready for review and/or merging (we used to use "WIP" in the title, as this a relatively new GH feature)

I do think these have been distinct meanings and arose through many past conversations and practices. At one point in the past, we even used labels to designate "author's turn" or "reviewer's turn" to indicate who's responsibility it is to take the next steps in moving a PR forward. The green open PRs stall because we are waiting on one of these turns. This is not the first time we've discussed the fact that SymPy has a large number of open PRs and whether we should close them for other reasons than above. We can introduce closing a PR due to inactivity, but I do not see why doing this anything other than causing you to have to click an extra tab to see these PRs. I have always thought the stalebot tool in some repositories to be obnoxious and annoying. Some times it takes a long time to get a PR merged. I just searched "stalebot" and this was the first article that popped up: https://jacobtomlinson.dev/posts/2024/most-stale-bots-are-anti-user-and-anti-contributor-but-they-dont-have-to-be/. I agree with it being a turn-off to new contributors (and also just annoying to standing contributors). The second part of the article gives some tips not unlike our turns method we used in the past.

I think it is also ok that we don't get to every PR or issue and that accepting that issues/PRs are an unwinnable Whac-A-Mole game. We've been staring at a huge list of issues and PRs for decades now. I'm not sure what closing a bunch for inactivity will change.

Jason

moorepants.info
+01 530-601-9791

--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.

To view this discussion visit https://groups.google.com/d/msgid/sympy/CAHVvXxSphOcZ%2BDYen_Z27FtUwcSVmO9iR2S55B%3D%2BYL%2BjNmX_Lg%40mail.gmail.com.

Jason Moore

unread,

Feb 1, 2026, 8:49:27 AMFeb 1

to sy...@googlegroups.com

In fact, these are the list of the first hits on my search of "stalebot":

- Most stale bots are anti-user and anti-contributor, but they don't have ...

- No Stale Bots

- GitHub - actions/stale: Marks issues and pull requests that have

- GitHub stale bot considered harmful : r/programming - Reddit

- Don't use stale bots

- Understanding the Helpfulness of Stale Bot for Pull-based Development

- Github Stale Bots: A False Economy - blog.benwinding

So, it seems that using such a practice may or may not be a positive thing.

Jason

moorepants.info
+01 530-601-9791

Oscar Benjamin

unread,

Feb 1, 2026, 8:55:12 AMFeb 1

to sy...@googlegroups.com

Most of those things are talking about stale bots for closing issues
rather than pull requests:
https://drewdevault.com/2021/10/26/stalebot.html

> To view this discussion visit https://groups.google.com/d/msgid/sympy/CAP7f1AiX-Am_uWixUtCHDm6ZFq-7zcuwO0E9JwLSyXBSpq1XMg%40mail.gmail.com.

Jason Moore

unread,

Feb 1, 2026, 9:05:08 AMFeb 1

to sy...@googlegroups.com

Thanks for sharing your opinion on GSoC. It is good to see.

If we have at least 2 core devs that want to support GSoC, I think I'm of the opinion we should let them do so and they can mentor at least 1 student. If no one wants to champion it, then, yes, it should be abandoned.

Note that GSoC may also be the sole reason we have any money owned by the project.

I have always been proud of SymPy's welcoming of new contributors and use of GSoC in that process. Long back we used to show a graph showing the number of contributors that have made varying counts of git commits and compared it to other projects like NumPy, SciPy, etc. It seemed we were less susceptible to the bus factor and that our approach with new contributors did pay off. I still believe this is true and it would be a shame if we lost this aspect of the SymPy community. I always admired Ondrej for his approach and instilling this behavior in the community, which has been linked with our GSoC participation.

Jason

moorepants.info
+01 530-601-9791

--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.

To view this discussion visit https://groups.google.com/d/msgid/sympy/CAHVvXxSoBT4%3DKS3AuGMpcx%2BLtVBMNTGZqR8%2BLvJDAYaNNJ85QQ%40mail.gmail.com.

Oscar Benjamin

unread,

Feb 1, 2026, 9:18:02 AMFeb 1

to sy...@googlegroups.com

On Sun, 1 Feb 2026 at 13:44, Jason Moore <moore...@gmail.com> wrote:
>
> This reminds me that there is more nuance than I originally wrote.
>
> Github has 4 PR states and as far as I can gather from the last 15 years of community practice this is how we've treated them:
>
> - Open (green): request for review and author desires to have it merged
> - Closed (red): ether a core dev closed it to signal it will not be merged or the submitter self-closes it to signal they will not pursue it further
> - Merged (purple): a core dev merged the PR into master
> - Draft (grey): pull request shared so others can view the work or collaborate but not ready for review and/or merging (we used to use "WIP" in the title, as this a relatively new GH feature)

No one has ever used these consistently and if we had a system for
this then it would just generate more admin like manually marking
someone's PR as draft or having to explain to people what the system
is. The same problems happened with the "author's turn" labels.

> I do think these have been distinct meanings and arose through many past conversations and practices. At one point in the past, we even used labels to designate "author's turn" or "reviewer's turn" to indicate who's responsibility it is to take the next steps in moving a PR forward. The green open PRs stall because we are waiting on one of these turns.

This is why I think a 1 month reminder would be useful. It is a prompt
to figure out whose turn it is. If all the reviewers overlooked it for
1 month then it is reasonable to be reminded of that. The 1 month
reminder can also be a prompt for someone to say "I would like to
review this and I think it is good but I just don't have the time to
review it in full right now" which makes the situation clear to
everyone (and resets the inactivity timer).

On the other hand whether you use labels or whatever else I think a
lot of contributors just don't understand that it is their turn and
the 1 month reminder can have a message that prompts them to think
about this like:

- If someone has requested changes to the PR then they are probably
waiting for you to make those changes.
- If the CI checks have failed then reviewers are probably not going
to bother commenting to say that changes are needed since they would
expect that to be obvious.

On the other hand if we have a 1 month reminder and then nothing
happens in the next month then chances are high that the maintainers
don't want it or the contributor is not willing or able to complete
it. At that point leaving it open is usually just misleading. No one
is likely coming back to it and if they do come back to it then it can
be reopened to signal that change in status.

--
Oscar

gu...@uwosh.edu

unread,

Feb 1, 2026, 11:07:06 AMFeb 1

to sympy

I just want to state that I think Oscar has a point in his last paragraph:

On Sunday, February 1, 2026 at 8:18:02 AM UTC-6 Oscar wrote:

On the other hand if we have a 1 month reminder and then nothing
happens in the next month then chances are high that the maintainers
don't want it or the contributor is not willing or able to complete
it. At that point leaving it open is usually just misleading. No one
is likely coming back to it and if they do come back to it then it can
be reopened to signal that change in status.

--
Oscar

I am one of those contributors with very limited time and have open PRs that I currently do not have time to work on. As long as the reminder messages are clear about what the automatic closure means I would be fine with having my PRs closed and having to make a new PR (or reopen an old one) with updates when I am able to work on them. I think it would be useful to new people coming to the project to see a smaller number PRs that are actively being worked on.

In contrast to PRs, Issues should be left open until resolved in some way. I think a community push (say an "issue month") each year to review and resolve issues starting with the oldest would be a good idea. My quick look at some of the older issues suggests something of this sort has not been done in 4 or 5 years.

Jonathan

Oscar Benjamin

unread,

Feb 1, 2026, 12:37:31 PMFeb 1

to sy...@googlegroups.com

On Sun, 1 Feb 2026 at 14:17, Oscar Benjamin <oscar.j....@gmail.com> wrote:
>
> On Sun, 1 Feb 2026 at 13:44, Jason Moore <moore...@gmail.com> wrote:
> >
>
> > I do think these have been distinct meanings and arose through many past conversations and practices. At one point in the past, we even used labels to designate "author's turn" or "reviewer's turn" to indicate who's responsibility it is to take the next steps in moving a PR forward. The green open PRs stall because we are waiting on one of these turns.

Thinking about this a bit I think that most of the time it is not
really a clear binary state here. A lot of them are in an in-between
sort of state where the PR is not good enough but the author does not
know how to improve it and no reviewer is willing to coach it to
success. A lot of the PRs being opened as we speak will enter this
state.

From a process perspective it might seem like "reviewer's turn"
because no one has reviewed it and told the author what they should
do. Actually though, I (and probably others) have looked at the PR and
just decided not to reply telling them what to do because it is just
below some level where I effectively consider it still the "author's
turn" to produce something better without needing a reviewer to tell
them exactly what to do. (My decision now about reviewing PRs is just
to increase this threshold to a level that few new contributors will
reach.)

I know that I could say something like "oh, by the way Python code
needs to be properly indented and have valid syntax" and then with
enough back and forth it could likely turn into a merged PR but I have
judged that it would require unreasonable effort to do that.

In this situation it usually does not make sense to just close the PR because:

- Another reviewer might be willing to coach it.
- The author could come back to it and improve the PR.

The last point is important. In general the reason that we don't
immediately close PRs that are not perfect is because at the time of
activity a PR is generally a work in progress.

This is why I think closing after two months is reasonable. In the
current workflow there is no clear notification at a time when you can
conclude that probably no one is going to do what would be needed for
the PR to succeed. You can be notified when the PR is opened or after
a push but it is always possible (quite likely in fact) that there
will be a subsequent push soon after so even if it is not good enough
now it might be later. Two months is enough time to wait for that
though before concluding that the PR is just not being actively worked
on by the author or reviewers.

I also think it is better if a bot does the closing rather than a human.

If you think that it is not nice that a bot does this, how do you
think it would feel if a human explained the situation honestly:

"Look this code is not good enough but based on various factors (very
much including our perception of your personal capabilities) we don't
think it is worth our time to help you improve it so we think it is
better to give up on this one."

Alternatively is it nicer just to leave the PR permanently open but
ignored? I'm not sure it is.

Also I think that having one month and two month notifications would
actually result in more things being merged. A lot of maintainer PRs
get lost in the noise of all the others and end up waiting for review
before eventually being self-merged. The one month notification would
remind people when those have been overlooked for review. The two
month mark is probably a reasonable basis for saying "look I tried to
wait for a review on this but no review is coming so I'm just going to
merge it".

--
Oscar

Oscar Benjamin

unread,

Feb 1, 2026, 2:57:54 PMFeb 1

to sy...@googlegroups.com

In case anyone is interested in looking these are the PRs that are
currently inactive for about 2 months:
https://github.com/sympy/sympy/pulls?page=5&q=is%3Apr+is%3Aopen+sort%3Aupdated

My suggestion here is that those should have got a comment
notification at the 1 month mark and then if nothing happened in the
next month up to now those PRs would get closed automatically right
around now.

In many cases I would expect that the 1 month ping would have
triggered something to happen like someone might review it or the
contributor might close it or fix something or reply to some question
that was previously unanswered.

Most of those I think are going nowhere as it stands. Some are waiting
for a reviewer that isn't going to come. Some were just experiments
(e.g. one of mine) that can be closed and then reopened in future if
needed.

Most of them are in the author's turn state but it is not always clear
if the author understands that or actually knows how to do what is
needed. If they did understand and know how to do it then I think it
is reasonable to say that if they haven't done it by now then they are
just not actively working on it and we may as well close it until they
do want to come back to it.

To put it a different way: if we don't close those then what do we
expect to happen with them? I certainly don't think that anyone is
going to periodically go that far back through the PR backlog and
check whether anything can be done to help them out so what benefit do
they bring that they would not bring if they were closed?

--
Oscar

Oscar Benjamin

unread,

Feb 1, 2026, 6:19:43 PMFeb 1

to sy...@googlegroups.com

On Sun, 1 Feb 2026 at 19:57, Oscar Benjamin <oscar.j....@gmail.com> wrote:
>
> In case anyone is interested in looking these are the PRs that are
> currently inactive for about 2 months:
> https://github.com/sympy/sympy/pulls?page=5&q=is%3Apr+is%3Aopen+sort%3Aupdated

I went through these 25 PRs and summarised them. One thing to bear in
mind is that these are mostly from around the time that I started this
thread about AI generated pull requests and obviously predate having
any AI policy. Also there would have been many more pull requests at
the time but what we are looking at here is the PRs that did not get
closed or merged. Out of all time 93% of PRs got closed or merged so
these are something like 7% of the PRs and are obviously a skewed
sample.

This is my summary of each PR:

- A reasonable looking PR that has been overlooked by reviewers.
- Merge conflicts - AI description - Code mispasted from AI
- AI assisted description - not sure if code is correct but don't
trust probable AI
- An old GSOC PR from Summer - 4000 lines. I guess mentors don't have time.
- Reviewer said that the changes are not a good idea - author probably
doesn't understand.
- AI description and code - not sure if code is good. Was reviewed
once but not again after changes.
- AI everything - Messing with test runner but I don't trust author or
AI to do that.
- Experimental PR marked as draft.
- AI everything - Feedback given by reviewer and author has not addressed it.
- AI everything - Reviewer already told the author that the PR is pointless.
- Experimental PR described as draft (not actually marked as such).
- AI everything - Author has already been told that the PR is wrong.
- Reasonable PR - Reviewer agreed to merge but there are conflicts and
author needs to fix them.
- Not correct but it is unclear exactly what would be correct so
difficult to give feedback.
- Reviewer has already said that the PR is wrong. Further changes have been
made but I think the reviewer does not feel the need to explain again.
- Old GSOC PR - Reviewers seem unsure about the PR.
- Weird (wrong) AI PR - Not even sure what kind of AI is being used.
No one has reviewed it.
- AI PR - Probably okay but not sure if it is a good idea.
- AI everything - possibly good but no one has reviewed.
- AI everything - added tests have failed but author doesn't know why.
- Duplicate of weird AI PR.

I think that only a few of these are actually reasonable PRs where the
author is ready, capable and willing to work on it but is waiting for
review or where there are some good reviews but a little more is
needed from the author. My hope would be that a 1 month reminder PR
would handle some fraction of those by reminding the author/reviewers
to finish/review the PR.

The rest I think should just be closed but it is a question of who
takes the decision to close and when. It also matters how the decision
to close is explained to the author and for anyone else looking to
make a PR for the same issue in future. I think that just closing due
to inactivity after 2 months is a reasonable approach here. If someone
doesn't like it then they can ask for it to be reopened or ask for
feedback or something but there shouldn't be any obligation for anyone
to actually respond to that. A lot of this is just AI spam and we
should not need to spend effort giving feedback when the effort is not
being given on the other side.

This particular PR probably is a good illustration of how PRs
(especially AI PRs) can easily stall:
https://github.com/sympy/sympy/pull/28508

The issue is to add docstrings to some code in the LLVM code printer.
The author clearly used an AI to write the docstrings. Then the
doctests have failed and the author (via AI) said:
"""
The current doctest failure seem to be coming from llvmlite not being
installed in the CI environment (ModuleNotFoundError: No module named
'llvmlite'), which causes cascading NameError exceptions in examples
using ir and ll.

I believe these errors are unrelated to my docstring changes. Could
you please confirm if these are expected/known failures for optional
dependencies?
Thanks!
"""
If they had any idea what the LLVM printing code is for then they
would have understood the connection with llvmlite. Even if you just
think a little bit about what you are saying then the name "llvmlite"
is a big clue (maybe that's like counting the r's in strawberry
though). I guess that they also probably don't understand how doctests
work.

They can't have tested any of the code that they are putting in the
docstrings if they didn't install llvmlite themselves.

So I commented at the bottom of the PR "The llvmlite issue is
obviously not unrelated." and then no one has replied to that. Now I
could put more effort into explaining what llvmlite is but why should
I? What effort has this person put in? My comment is *deliberately*
terse because I don't want to have anything to do with their PR unless
they are going to put more effort at least to have some understanding
of the code they are "writing".

It would be possible to install llvmlite in CI to fix the doctest
failure although a bit more is needed. Why bother coaching this
particular PR to do that though? There is no value in that PR when it
is just a few AI generated docstrings pasted in by someone who has no
idea what they are doing. They are not the only person to do this
either: there have been 8 PRs trying to add those docstrings!

I already gave feedback on that PR and did not close it, leaving open
the possibility that the author could put in a bit more effort to get
it working or at least demonstrate some understanding. Now after 3
months I think it is reasonable to just close it and I don't think
anything we can say is particularly better than just a bot saying
"closing due to inactivity".

--
Oscar

Oscar Benjamin

unread,

Feb 4, 2026, 11:42:04 AM (14 days ago) Feb 4

to sy...@googlegroups.com

An article yesterday in the register talking about AI spam PRs on GitHub:
https://www.theregister.com/2026/02/03/github_kill_switch_pull_requests_ai/

GitHub are apparently looking into whether anything can be done to improve this:
https://github.com/orgs/community/discussions/185387

The article quotes someone summarizing the problems. I agree with all
of these points:

- Review trust model is broken: reviewers can no longer assume authors
understand or wrote the code they submit.
- AI-generated PRs can look structurally "fine" but be logically
wrong, unsafe, or interact with systems the reviewer doesn't fully
know.
- Line-by-line review is still mandatory for shipped code, but does
not scale with large AI-assisted or agentic PRs.
- Maintainers are uncomfortable approving PRs they don't fully
understand, yet AI makes it easy to submit large changes without deep
understanding.
- Increased cognitive load: reviewers must now evaluate both the code
and whether the author understands it.
- Review burden is higher than pre-AI, not lower.

The article quotes someone saying
"""
I'm generally happy to help curious people in issues and guide them
towards contributions/solutions in the spirit of social coding," he
wrote. "But when there is no widespread lack of disclosure of LLM use
and increasingly automated use – it basically turns people like myself
into unknowing AI prompters. That's insane, and is leading to a huge
erosion of social trust.
"""
That's basically how I feel about the situation although I would go
further. Reviewing these PRs is not like being an AI prompter because
the human using the AI behaves effectively like a broken AI. You would
get better, more trustworthy results much more quickly if you were
prompting the AI directly yourself.

--
Oscar

Peter Stahlecker

unread,

Feb 4, 2026, 12:28:57 PM (14 days ago) Feb 4

to sympy

I am a user to sympy only (actually of sympy.physics.mechanics), too old and too ignorant to contribute, but I follow the discussions.

I had wondered before, why anybody would push a PR he/she did not do him/herself -and might not even understand- , but Jason told me people are so eager to get into GSoC,

and they need at least one PR merged.

I can understand people like Oscar: They are willing to teach others to improve, but surely are not interested in conversing with some LLM, a non-person.

My concern is this: if key members / reviewers get too frustrated with AI and reduce their work, sypmy will suffer.

So, I think, reviewers should be very strict, even erring on the "wrong" side: If a PR looks like created by AI, close it!

But, as I said above, just the opinion of an interested old user,

Peter

Rushabh Mehta

unread,

Feb 4, 2026, 1:07:31 PM (14 days ago) Feb 4

to sympy

I am also a previous GSoC contributor (2025), and a new reviewer, my opinion does not hold the same weight as some of the other people who have commented under this thread, nonetheless I want to share my thoughts. I totally agree with these points mentioned by Oscar and Peter's opinion too. Why should good reviewers waste their time, energy and mental bandwidth on a suspicious/totally AI slop PR? Especially when the people making the PR have not put even a fraction of the same effort?

Newer contributors (aiming for GSoC ones) should understand that their AI nonsense is not going to get them anywhere. Only if the reviewing becomes strict (someone could say even a bit harsh) will the situation improve. To be clear: I am suggesting to close total AI slop PRs without any explanation.

Reviewing of the normal (totally human or AI assisted but sufficiently sensible PRs) can continue as it always was.

On another note: Maybe we can consider removing the patch requirement (atleast 1 open PR or merged PR) from GSoC. We should only select candidates with solid proposals. Someone who can actually write a good proposal, documenting their design, implementation details, links to relevant parts of the code etc says a lot more than someone who got an easy PR merged. Getting a merged PR does not necessarily imply a good candidate, but someone with a good proposal but no PRs merged could still be someone solid enough to get a project through. This notion of GSoC aspirants that "if I have a lot of contributions then I have a better chance of getting selected" is what is the crux of these many PRs.

Sorry if some sentences are long and confusing, I didn't put this through an LLM :)

Oscar Benjamin

unread,

Feb 4, 2026, 1:50:05 PM (14 days ago) Feb 4

to sy...@googlegroups.com

On Wed, 4 Feb 2026 at 18:07, Rushabh Mehta <mehtarus...@gmail.com> wrote:
>
> On another note: Maybe we can consider removing the patch requirement (atleast 1 open PR or merged PR) from GSoC. We should only select candidates with solid proposals. Someone who can actually write a good proposal, documenting their design, implementation details, links to relevant parts of the code etc says a lot more than someone who got an easy PR merged.

The proposals are all going to be written with AI as well. OpenAI has
just released an AI powered version of Overleaf:

https://openai.com/prism/

I'm pretty sure you can just go there and prompt it to write the whole proposal.

I actually think that the opposite is the case now that we cannot
trust the proposals and the PRs are the only reasonable way of
discriminating between the candidates. The way that this has always
worked though is that people have coached the applicants to be able to
produce some reasonable PRs. Then some GSOC applicants would have some
nontrivial PRs merged by the time GSOC applications are being ranked
and then that gives some confidence that they would be able to do the
project.

It has been getting worse in recent years and it was already bad last
year but the problem is that there are just too many different people
with low quality PRs (whether AI or not) where the amount of coaching
that would be needed to turn them into good PRs would be unmanageable.
If we don't do that though then none of the candidates can start out
with basic PRs and then improve to the point of making good/nontrivial
PRs and then be ready to do a GSOC project.

--
Oscar

Oscar Benjamin

unread,

Feb 4, 2026, 2:08:21 PM (14 days ago) Feb 4

to sy...@googlegroups.com

On Wed, 4 Feb 2026 at 17:29, Peter Stahlecker
<peter.st...@gmail.com> wrote:
>
> I had wondered before, why anybody would push a PR he/she did not do him/herself -and might not even understand- , but Jason told me people are so eager to get into GSoC,
> and they need at least one PR merged.

I think it is important to understand that AI gives people false
confidence. The people doing this think that they do understand the
code. They also believe that the code made with the help of the AI is
better than what they would have produced without the AI. Actually the
real problem here is not that they used AI to write the code, it is
that they used AI instead of *reading* any of the existing code. If
you didn't have AI you would have to read the code before you could
write anything.

The AI gives them some little bit of code that looks understandable
and they think they understand it but you can't truly understand a
small piece of code without understanding all the code around it. The
true understanding of some code is not just understanding just what it
does but why it is the way it is rather than any of a number of
alternatives that might superficially seem similar in the same
context. You don't get that understanding if the AI takes you straight
to seemingly working code.

There are empirical studies now comparing programmers using AI and not
using AI. It has been shown more than once I think that even
experienced programmers using AI will produce more bugs but at the
same time have more confidence in the code. It has also been shown
that people/teams using AI can have reduced productivity but at the
same time believe that their productivity has increased.

Have you ever tried using something like ChatGPT Peter?

ChatGPT is a sickening thing to talk to. I don't think that people
using LLMs to write emails and things understand just how much its
language upsets me. I imagine that you can have a conversation like:

You: Hey ChatGPT I want to make a PR for GSOC and I want to fix issue
12345. I think maybe we can fix it by adding some code to make the
thing negative.

ChatGPT: Wow that is an amazing idea Peter -- you are a genius! I'll
write some code for that write now. Here you go:
(lots of generic samey looking code)
This will certainly fix the issue based on your very innovative and
creative suggestion and the code is
* professionally written
* passes all relevant requirements and coding standards
This code based on your insightful idea will make an excellent PR --
the SymPy maintainers will surely love this PR!

--
Oscar

Peter Stahlecker

unread,

Feb 4, 2026, 3:24:57 PM (14 days ago) Feb 4

to sympy

As a matter of fact, yes some experience with chatgPT:

I want an ellipse to roll on a smooth line without slipping. I tried to get the correct function linking

the speed of the contact point to the rotational speed of the ellipse. Whatever I tried did not work.

So I asked chatgPT.

- it gave an answer it seemed fully confident of. I could easily try the answer, it was wrong.

- I told it it was wrong. The answer: very good observation! Now I give the fully correct answer. Wrong again.

- several more cylces like this, still no correct answer (I still do not have it)

Mine is an exceedingly simply application of LLM. How does a novice want to check whether some reply given by LLM

does solve the sympy issue on hand?

Peter

Sangyub Lee

unread,

Feb 4, 2026, 6:27:53 PM (13 days ago) Feb 4

to sympy

> The AI gives them some little bit of code that looks understandable and they think they understand it but you can't truly understand a small piece of code without understanding all the code around it.

I rather have very different opinion about this:
If AI doesn't generate the code well, now programming language itself looks like a problem, and have to be greatly improved in terms of safety.

Houses, bridges, skyscrapers, or even working at shipyard or ironworks are much more safer than before, even though not everyone fully understands the system.

So it's much more absurd to hear that only programming languages and software engineering are complaining with errors,
and keep bashing to the people (and AI) to have to understand such fragile system.

Maybe there needs another logical layer of 'garbage collection' like how we understand memory level garbage collection, that AI can operate on safe languages,
and humans should not bother with lower level, like how we treat assembly.

ChatGPT is anyway undoubtedly one of the greatest achievement of computer science,
like how landing in Mars is going to be greatest achievement of natural science,
and now if we are being bottleneck for ChatGPT, it means we are the problem.

Oscar Benjamin

unread,

Feb 4, 2026, 6:42:57 PM (13 days ago) Feb 4

to sy...@googlegroups.com

On Wed, 4 Feb 2026 at 23:27, Sangyub Lee <syle...@gmail.com> wrote:
>
> ChatGPT is anyway undoubtedly one of the greatest achievement of computer science,
> like how landing in Mars is going to be greatest achievement of natural science,
> and now if we are being bottleneck for ChatGPT, it means we are the problem.

What solution do you propose to that problem?

Just forget about SymPy and move on because the AI will make a better version?

ChatGPT even uses SymPy both for training and during inference as I
understand it.

--
Oscar

Peter Stahlecker

unread,

Feb 5, 2026, 4:46:28 AM (13 days ago) Feb 5

to sympy

Dear Sangyub,

of course chatGPT, or AI in general is a GREAT thing, and surely will improve at rapid pace.

For the example I cited I used the free version of chatGPT, maybe it is dumber than the paid for one, which I do not have.

I feel the PRs for sympy (and likely for other packages, too) serve a dual purpose:

1. make sympy even better

2. When the reviewers discuss with submitters, the (mostly young) submitters learn something.

Point 1.could be taken over by AI sometime, but them Point 2. would be dead,.

At least in my view point 2 is a VERY important one, and should not be given up lightly.

Peter

Oscar Benjamin

unread,

Feb 5, 2026, 4:32:53 PM (13 days ago) Feb 5

to sy...@googlegroups.com

On Thu, 5 Feb 2026 at 09:46, Peter Stahlecker
<peter.st...@gmail.com> wrote:
>
> I feel the PRs for sympy (and likely for other packages, too) serve a dual purpose:
>
> 1. make sympy even better
> 2. When the reviewers discuss with submitters, the (mostly young) submitters learn something.
>
> Point 1.could be taken over by AI sometime, but them Point 2. would be dead,.
> At least in my view point 2 is a VERY important one, and should not be given up lightly.

I was thinking about this a bit. If you look at the most recent PRs
then almost half of them are from one person and all of their PRs have
been merged by me (except one that got closed):
https://github.com/sympy/sympy/pulls?q=is%3Apr+is%3Amerged

That person is a new contributor and it says in the AI disclosure that
he is using AI but that he checks and edits the code afterwards. I
actually can't tell that he is using AI because he is using it in a
reasonable way: the end result is still what you would have ended up
with if the code was written by someone who knows what they are doing
and is not using AI. Note that he does not use LLMs to write comments
or messages so the communication is all human.

Those pull requests are mostly for a part of the codebase that I don't
know very well so I'm not really able to verify that everything is
correct (I could with more effort but don't want to). There are no
maintainers who know it well but I am happy to review and merge those
PRs based largely on trust that he knows what he is doing and is
improving things. You can see the beginning of that trust developing
through the human communication here:
https://github.com/sympy/sympy/pull/28994

Over time through human communication and PRs the trust and mutual
understanding grows and we end up at a point where it becomes easy to
review and merge the PRs.

The problems with a lot of the other PRs are that:

- It is hard to build trust when there is no actual human
communication (people using LLMs to write messages).
- Many of those people are demonstrably less experienced and so cannot
really be trusted in the competency sense either.
- They are often trying to make changes in parts of the codebase that
might typically be considered "more advanced" where you actually
needed the author to have a higher level of competency/trust (the AI
made them over-confident, without AI they would have looked at the
code and decided not to mess with it).
- Before AI they would have been able to earn some trust just based on
the fact that they figured out how to produce seemingly working code
and make the tests pass but that is now meaningless.

Basically we can't just merge all PRs on trust but we also can't
review all PRs on a zero trust basis (that is far too much work).
Right now we are getting too many PRs on a zero trust basis and at the
same time it has become really hard to build trust.

I think I am realising now how damaging the LLM messages are. Not
having actual human-to-human communication is a massive barrier to
trust building.

--
Oscar

Reply all

Reply to author

Forward