Request for comment: LLM contribution policy


Andrew Helwer

Apr 27, 2026, 12:46:51 PM (3 days ago) Apr 27
to tla...@googlegroups.com
The past few weeks have suggested that the TLA+ project needs an LLM contribution policy. I have a draft PR open for the TLAPM repo on which I would welcome your feedback: https://github.com/tlaplus/tlapm/pull/264

The basic outline is that all design-level discussion (TLA+ mailing list, issues, PRs, community meetings, etc.) must be mediated by humans. I read enough slop at my day job, where I'm paid to do it; I don't want to read it in open source, where I derive happiness from interacting with you all!

For actual code contributions, we can follow the Linux kernel policy: AI assistance is allowed, the code is subject to the same standards as anything else, and LLM usage must be disclosed in a git commit trailer, down to the specific model version: https://github.com/torvalds/linux/blob/254f49634ee16a731174d2ae34bc50bd5f45e731/Documentation/process/coding-assistants.rst

My actual personal opinion is that we shouldn't use LLM-generated code at all for development of the core tools, since our true goal should be to improve the understanding of how the tools work in the minds of the humans who work on them. Reviewing an LLM-generated PR does not give the same understanding as writing the code yourself. I think this would manifest as a short-term boost in feature & bug work throughput at the cost of reducing the ability of humans to contribute to the tools in the longer term.

Andrew Helwer

Arian Ott

Apr 27, 2026, 5:06:25 PM (3 days ago) Apr 27
to tla...@googlegroups.com, Andrew Helwer
Hi Andrew,

On Monday, 27 April 2026 17:46:29 British Summer Time Andrew Helwer wrote:

> Basic outline is that all design-level discussion (TLA+ mailing list,
> issues, PRs, community meeting, etc.) must be mediated by humans. I have to
> read enough slop at my day job where I'm paid to do it and don't want to do
> it in open source, where I derive happiness from interacting with you all!

+1


> For actual code contributions we can follow the Linux kernel policy where
> AI assistance is allowed, the code is subject to the same standards as
> anything else, and LLM usage must be disclosed in a git commit trailer
> (down to specific model version):
> https://github.com/torvalds/linux/blob/254f49634ee16a731174d2ae34bc50bd5f45e731/Documentation/process/coding-assistants.rst

That would work too. There are different schools of thought; below I list some OSS projects and their stances:

Gentoo - AI is prohibited. [1]
Fedora - AI is allowed when disclosed; code must be reviewed by a human in the loop. [2]
curl - Allowed, but very carefully; code must meet high quality standards. [3]
Python - AI-generated PRs are prohibited; the policy suggests not using AI to write code blocks. [4]
Apache Software Foundation - AI is allowed as long as it is copyrightable and meets certain criteria. [5]

Debian has decided not to decide on an AI policy; discussions are still ongoing. Here, the term "AI" refers to any code generated by an artificial intelligence.

> My actual personal opinion is that we shouldn't use LLM-generated code at
> all for development on the core tools, as our true goal should be to
> improve our understanding of how the tools work in the minds of the humans
> who work on them. Reviewing a LLM-generated PR does not give the same
> understanding as writing the code yourself. I think this would manifest as
> a short-term boost in feature & bug work throughput at the cost of reducing
> the ability for humans to contribute to the tools in the longer term.

Totally agree. There are many different ways a project can handle AI contributions, and I'm convinced the truth is always somewhere in between. Banning AI contributions is difficult, since there is no reliable way (beyond experience) to tell whether code is AI output or not. Personally, I would base it on the quality of the contribution: if the quality is terrible and the author is unable to comment on the PR, it can be closed.

TLA+ specifically is in a somewhat different position. When looking into the AI topic and TLA+, I came across the paper "Can Large Language Models Model Programs Formally?" [6].

The paper shows that LLMs mostly fail at generating valid TLA+ code, which should support the standpoint of not allowing LLM contributions.

Hope that helps.

References:
[1] https://wiki.gentoo.org/wiki/Project:Council/AI_policy
[2] https://docs.fedoraproject.org/en-US/council/policy/ai-contribution-policy/
[3] https://curl.se/dev/contribute.html#on-ai-use-in-curl
[4] https://devguide.python.org/getting-started/generative-ai/
[5] https://www.apache.org/legal/generative-tooling.html
[6] https://arxiv.org/pdf/2604.01851

Regards,
--
Arian Ott
aria...@ieee.org

Zachary Policzer

Apr 28, 2026, 8:25:24 PM (2 days ago) Apr 28
to tla...@googlegroups.com, Andrew Helwer
> I'm convinced the truth is always somewhere in between. Banning
> AI contributions is difficult, since there is not a reliable way (except
> experience) to tell if code is AI Output or not. Personally, I would base it
> on the quality of the contribution. If the quality is terrible and the author
> is unable to comment on the PR, it can be closed.

I think I agree with Arian. The policy I had championed for Ray was the notion that if I was the first human to read the code, then it wasn't getting shipped (if you can't be bothered to read it, why should I?).

I'm not against AI code full stop; ultimately what we want is a high-quality product at a reasonable pace. Ideally, efficiency tools should give you more time to properly explore the problem space (alternative approaches and the like) and therefore let you be an advocate for the patch you're posting.

That said, a policy I wanted (but which was vetoed) was a three-strikes rule: if you were caught slop-cannoning us three times, you'd be banned and we would auto-close your PRs.

-Zac


Andrew Helwer

Apr 29, 2026, 1:53:01 PM (21 hours ago) Apr 29
to Zachary Policzer, tla...@googlegroups.com
Zig has an interesting post about why they ban AI contributions, which basically comes down to this: PRs are extra load on maintainers, and that load only makes sense to take on if it is also an investment in the person who sent the PR: 

Andrew

Petru Mirzenco

Apr 29, 2026, 3:46:47 PM (19 hours ago) Apr 29
to tlaplus
Given that LLMs generate lots of code that "somehow works", human-in-the-loop should be the default approach for approving any AI-generated contribution.
At some point, the TLA+ project repos might be flooded with low-quality pull requests, and humans might be overwhelmed going through all of them.
Having clear guidelines on what counts as acceptable quality will help a lot (limits, expectations, things to avoid, etc.).
Disclosing AI usage should be mandatory, though in our times "non-AI code" is becoming a rarity.

I think any software tool used to develop stable systems should prioritize quality over speed and volume.

Petru