On Sat, 15 Nov 2025 at 18:06, Aaron Meurer <asme...@gmail.com> wrote:
>
> On Fri, Nov 14, 2025 at 12:15 PM Francesco Bonazzi <franz....@gmail.com> wrote:
> >
> > Let's remember that LLMs may write copyrighted material. There is some risk associated with copy-pasting from an LLM output into SymPy code.
> >
> > Furthermore, what practical algorithmic improvements can an LLM do to SymPy? Can an LLM finish the implementation of the Risch algorithm? I doubt it.
>
> "Finishing the Risch algorithm" is an enormous task. Of course an LLM
> cannot one-shot that, and a human couldn't do it in one PR either. But
> I have actually been using Claude Code to do some improvements to the
> Risch algorithm and it's been working. So the statement that an LLM
> cannot help with algorithmic improvements in SymPy is false.
There is a huge difference between someone who knows what they are
doing, knows the codebase, understands the algorithm, etc., using an
LLM and reviewing the results, and a typical new SymPy contributor
using an LLM to write the code for them. If you are someone who could
write or review the code without using an LLM, then using an LLM and
checking the results is reasonable.
By the way, your most recent PR is here, Aaron:
https://github.com/sympy/sympy/pull/28464
The PR looks good to me apart from the one bit where you said "Claude
generated this fix for the geometry test failure. It would be good to
review". I reviewed it, decided that it looked like a hacky fix, and
showed a counterexample of the type that would break that code. LLM
output cannot be trusted and does not substitute for humans
investigating and writing the code.
From my perspective as a reviewer, "this is LLM code. It would be good
to review" is asking reviewers to do what would usually be expected of
the author of the PR in the first instance. That is only a small
example, but it shows the broader problem that I think LLMs will cause
in SymPy by shifting a greater burden onto reviewers while making it
easier for authors to generate more and more PRs to review.
If you are new to programming in general and new to a particular
codebase, and you then use an LLM to generate code that you don't
understand, the results are not going to be good. There have always
been PRs where the author clearly does not understand the codebase
well or does not understand all of the implications of the particular
choices made in the code. Now, though, there are PRs where the author
has no idea why the LLM wrote any of the code that it did, has not
done even the most basic testing, and can only respond to feedback by
copying it into an LLM.
What I think people submitting these PRs right now don't realise is
that when I see the LLM-generated PR description or comments, or the
LLM-generated code that they probably don't understand, it removes all
motivation to review any of their PRs and removes any trust that I
might otherwise extend them as the usual benefit of the doubt. I am
mentally blacklisting contributors based on what I consider
acceptable, even though there is no agreed general policy.
I refuse to review PRs from newer contributors if this is the way it
is going to happen, so each of these needs to be reviewed by someone
else or can join the ever-growing pile of unreviewed PRs.
> > On Friday, November 14, 2025 at 3:24:50 p.m. UTC+1 har...@gmail.com wrote:
> >
> > However, it requires Premium requests, so not everyone can use this feature.
> >
> >
> > Most of these AI-assisted tools are designed to take money from developers. I would strongly advise against paying for these services.
>
> I can't speak to the specific tool being mentioned here, but the best
> LLMs right now do require you to pay for them. If a tool is good and
> actually improves developer quality of life, we shouldn't be afraid to
> pay for it (that applies even outside of AI tools).
>
> FWIW, when it comes to code review, my suggestion would be to use a
> local LLM tool like claude code or codex to do the review for you. It
> wouldn't be completely automated, but that would give you the best
> results. I also agree that writing down the sorts of things you're
> looking for in a SymPy review somewhere in the context is going to
> make it work better. I would start by having an agent analyze recent
> reviews (say, the 50 most recent PR reviews by Oscar), and use that to
> construct a Markdown document listing the sorts of things that a good
> reviewer is looking for in a review of a SymPy pull request.
I was thinking more that an LLM bot on GitHub could handle basic
things like how to write the .mailmap file, how to add a test or run
the tests, fixing trailing whitespace, interpreting CI output, and so
on. It would also be good to have some tooling that identifies older
related PRs or issues and says "maybe this fixes gh-1234, so a test
could be added for that", and other things of that nature.
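(For reference, on the .mailmap point: the file follows the standard
git mailmap format, where each line maps an alternate commit identity
to a canonical name and address. The names and addresses below are
made up purely for illustration:)

```
# Canonical Name <canonical@email> [Old Name] <commit@email>
Jane Doe <jane@example.org> <jdoe@users.noreply.github.com>
Jane Doe <jane@example.org> janedoe123 <jane.old@example.com>
```

A bot could plausibly check a new contributor's commit author details
against the existing file and suggest a line like this, which is
exactly the kind of mechanical back and forth that currently takes up
reviewer time.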
I don't actually want an LLM to review the real code changes, but I
can see the value in having LLMs help with some of the tedious back
and forth, so that a contributor gets rapid help and, by the time a
human reviewer gets to the PR, it is more likely to be in a state that
is ready to merge.
--
Oscar