SWE-agent paper uses SymPy


Aaron Meurer

May 18, 2024, 8:41:24 PM
to sympy
The SWE-agent project uses LLMs to try to automatically fix issues in
GitHub repositories. I found their paper interesting, mostly because
they make extensive use of SymPy as a test repository.
https://swe-agent.com/

Apparently there are quite a few SymPy issues in the SWE-bench
dataset, which is a dataset of issues and corresponding pull requests
in open source projects. https://www.swebench.com/

Their model was able to fix 10% of SymPy issues in the dataset. That's
obviously not going to replace human developers any time soon, but
it's still interesting. From what I could tell, the issues seem to be
biased towards easier, more straightforward ones (i.e., ones where it
is easy to verify whether a fix is correct). But still, if LLMs are
reaching a point where they can fix bugs completely automatically,
that could be very useful.

Aaron Meurer

Sangyub Lee

Jul 24, 2024, 12:43:24 PM
to sympy
I noticed a PR created today that I suspect was generated by an LLM (sorry if this is a false accusation):
Title: Fix is_zero Attribute Handling in Expression SimplificationDes… by Devansh-46 · Pull Request #26850 · sympy/sympy (github.com)
However, it was not successful anyway: the tests fail, and the PR description is quite terse but nonsensical.

Although the SWE-agent paper reports significant performance improvements over ChatGPT and older models,
I am still quite skeptical that LLMs can write code or fix issues in code that is poorly organized and needs a lot of contextual understanding, like the simplify function.
It may be easier for an LLM to fix issues in code that is easy to understand and well structured by experienced programmers, so it won't easily threaten the jobs of good programmers.
However, it may be difficult to experience these performance improvements firsthand, because these tools are still at the research stage and are not used in production like Copilot.

Aaron Meurer

Jul 24, 2024, 1:59:34 PM
to sy...@googlegroups.com
Just so we are clear, I am opposed to using AI to automatically open
pull requests in this way, and I would ask that people please don't do
this in the SymPy repository.

I think that AI tools can be useful to help find and fix bugs, and if
that can be streamlined more, for instance, by making the AI
automatically read the issue, that is great. But the thing that
shouldn't be automated is the actual opening of the pull request.
There still needs to be a human in the process who actually looks at
the code and can take responsibility for it. The accuracy of these
tools is not high enough yet that the whole process should be done
automatically. All this does is waste maintainer time.

Aaron Meurer