Intent and transparency about my research on automated regression testing

公主宁

Mar 16, 2026, 1:14:37 PM
to sympy
Hello everyone,

I am Fu Yiwen, a PhD student at Beihang University. My research focuses on automated regression testing, aiming to help improve software quality.

I have been experimenting with tools for automatically detecting issues in open-source projects, but I now realize I did not clearly communicate my intent and research background before opening issues on GitHub. This was my oversight, and I apologize for any confusion or disruption this has caused to the community.

I am fully committed to following community norms, guidelines, and proper communication channels. Moving forward, I will ensure all my contributions and activities are transparent, respectful, and aligned with what the community expects.

Thank you for your hard work on SymPy and for your understanding.

Best regards,
Fu Yiwen

Oscar Benjamin

Mar 16, 2026, 2:31:49 PM
to sy...@googlegroups.com
Hi Fu Yiwen,

Thank you for coming to the mailing list to explain this.

It is not enough, though, for you to email here saying that you will
be transparent etc. There needs to be a discussion about whether
it is reasonable for you to open these kinds of issues at all.

For the benefit of others reading, this follows these issues:

https://github.com/sympy/sympy/issues/29358
https://github.com/sympy/sympy/issues/29360
https://github.com/sympy/sympy/issues/29361
https://github.com/sympy/sympy/issues/29416
https://github.com/sympy/sympy/issues/29417
https://github.com/sympy/sympy/issues/29418
https://github.com/sympy/sympy/issues/29419

Also there have been other issues in other repositories:

https://github.com/pylint-dev/pylint/issues/10910
https://github.com/pylint-dev/pylint/issues/10909
https://github.com/pylint-dev/pylint/issues/10907
https://github.com/pylint-dev/pylint/issues/10906
https://github.com/pylint-dev/pylint/issues/10905

I understand that you are a PhD student and that you are hoping that
your software will be helpful but it should be quite clear from the
linked issues that you either need to stop doing this or you need to
do it very differently.

Firstly, you are posting the output of an AI tool as if it is from
yourself. You need to make it very clear what is human communication
and what is AI-generated output because otherwise you look like an AI
bot and it isn't clear which human is actually in control of the bot.

Secondly, you are using an open source repo for software engineering
research but you have not sought any consent from the project. This
email from you here now is a statement from you saying that you will
improve your conduct going forwards but you have not asked the
question: does anyone consent to you doing this research?

Thirdly, you should not post issues like this based on your research
tool without vetting them very carefully as a human and it is clear
that you are not doing that to a high enough standard. It is not hard
for open source repos to end up buried under AI-generated rubbish so
you need to be much more careful about ensuring that the issues you
open are valid.

In principle I like the idea of using AI to identify issues (much more
than using AI to write PRs) but the signal-to-noise ratio needs to be
high and the issues need to be about important things. What I don't
like is wasting time filtering AI slop to provide data for someone's
research project.

Does Beihang University follow a research ethics process?

There is no way that an ethics panel in my University would have
authorised anything like the way that you have behaved in the linked
issues.

I am open to discussing how you could use your tool in a way that
might be actually useful to sympy. The emphasis needs to be on what is
useful for sympy though and that means that a very low false positive
rate is needed and any issues need to be communicated very clearly for
the benefit of anyone reading. Otherwise it is just not useful to get
lots of AI-generated bug reports of questionable quality.

--
Oscar
> --
> You received this message because you are subscribed to the Google Groups "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
> To view this discussion visit https://groups.google.com/d/msgid/sympy/tencent_EFF5799CA0086F254BDDFE456BBEBF0E000A%40qq.com.

公主宁

Mar 16, 2026, 10:52:59 PM
to sympy
Dear Oscar and all,

Thank you for your careful and honest feedback. I truly appreciate you pointing out these issues.

I want to clarify the facts sincerely:

The earlier issues I submitted were not reviewed by multiple developers. That was my fault, and I take full responsibility for not verifying them strictly before opening them. I understand this has created noise and extra work for the community, and I am sorry for that.

However, the two latest issues I opened:

https://github.com/sympy/sympy/issues/29419
https://github.com/sympy/sympy/issues/29420

have been carefully discussed and verified by two developers, and we both confirmed that real regressions were introduced. These are not automated outputs; they are manually validated issues.

I fully accept your requirements:

I will stop opening any new issues until the community agrees.
I will clearly mark every issue with verification details.
I will only submit issues that are carefully reviewed and confirmed by real developers.
I will respect the community’s needs and will not conduct research without consent.

Helping find real bugs to improve SymPy is one of the most important aims of my research. I am sorry for the inappropriate approach I took before. I am willing to communicate and improve in the way that is most helpful to the project.

Best regards,
Fu Yiwen


------------------ Original Message ------------------
From: "sympy" <oscar.j....@gmail.com>;
Sent: Tuesday, March 17, 2026, 2:31 AM
To: "sympy"<sy...@googlegroups.com>;
Subject: Re: [sympy] Intent and transparency about my research on automated regression testing

Oscar Benjamin

Mar 17, 2026, 8:42:34 AM
to sy...@googlegroups.com
On Tue, 17 Mar 2026 at 02:52, '公主宁' via sympy <sy...@googlegroups.com> wrote:
>
> However, the two latest issues I opened:
> https://github.com/sympy/sympy/issues/29419
> https://github.com/sympy/sympy/issues/29420
> have been carefully discussed and verified by two developers, and we both confirmed that real regressions were introduced. These are not automated outputs—they are manually validated issues.

Okay, can you open new issues for those two and make sure that they
are properly formatted and communicated clearly? The link to a PR
should be a clickable link, which GitHub's markdown can produce
automatically if you use the appropriate markup. The issues should
include clear, properly formatted code for how to reproduce the problem
and should be carefully but also concisely explained so that even a
novice can understand what the issue is saying and what the problem is.

If you do that and then reply here with the links to the new issues
then I will reply to each of them on GitHub.

--
Oscar

公主宁

Mar 17, 2026, 9:14:20 AM
to sympy
Hi Oscar,

I have opened the two new properly formatted issues as you requested:

https://github.com/sympy/sympy/issues/29434
https://github.com/sympy/sympy/issues/29435

Each issue includes a clear minimal reproducible code example, a concise explanation, and the corresponding PR link.
Thank you for your help!

Best regards,
Yiwen Fu


------------------ Original Message ------------------
From: "sympy" <oscar.j....@gmail.com>;
Sent: Tuesday, March 17, 2026, 8:42 PM
To: "sympy"<sy...@googlegroups.com>;
Subject: Re: [sympy] Intent and transparency about my research on automated regression testing

Oscar Benjamin

Mar 18, 2026, 9:23:08 AM
to sy...@googlegroups.com
On Tue, 17 Mar 2026 at 13:14, '公主宁' via sympy <sy...@googlegroups.com> wrote:
>
> Hi Oscar,
>
> I have opened the two new properly formatted issues as you requested:
>
> https://github.com/sympy/sympy/issues/29434
> https://github.com/sympy/sympy/issues/29435

One of these issues is about something that is already fixed on
master. You should check that before opening an issue.

The other issue is questionable. It isn't really a bug. It could be
argued that it is a case where something could be improved but I think
if you are going to use an AI tool to open issues you should focus
more on things that are about actual mathematical correctness to keep
a low false positive rate.

Oscar