The Claude Code leak that spurred 8,100 DMCA takedown notices

Georgia Jenkins

Apr 30, 2026, 3:50:33 AM

Georgia Jenkins Thursday, April 30, 2026 - AI, Artificial Intelligence, clean-room, computer software, fair use, Georgia Jenkins, interoperability, notice and takedown, reverse engineering, source code, USA

A few months ago this Kat reported on the ‘IP theft’ of Claude Code via model distillation. As it turns out, this was only the beginning of Anthropic’s IP woes. A month ago, Anthropic accidentally released some of Claude Code’s internal source code due to “human error”.

Chaofan Shou, a security researcher, was quick to spot the release given its sheer size (a 59.8 MB file) and found that it contained 512,000 lines of source code. As the release was available on Claude Code’s public repository, it was then mirrored on GitHub. This involves GitHub users creating forks (i.e. personal copies of a repository that can be kept in sync with the original). GitHub explains that ‘forks’ are ‘often used’ to propose changes to the original repository or to use it ‘as a starting point for your own idea’, the latter being in line with open source software values.

The real Claw-Code
Photo by Aleksandar Cvetanovic on Unsplash

In response, Anthropic submitted 8,100 Digital Millennium Copyright Act (DMCA) takedown notices relating to ‘direct copies’ of Claude Code written in TypeScript that require a Claude license. Overnight, a ‘clean’ version appeared (a.k.a. ‘Claw-Code’) that rewrote Claude Code in Python.

What was copied

The leaked files did not relate to Anthropic’s models (e.g. Opus or Sonnet) but to software that interacts with its system prompts. These are somewhat ‘static’ prompts that instruct an LLM on how to reply to a user’s prompt (e.g. tone, tools and contextual information). This is perhaps what makes Claude Code so valuable; Anthropic appears to offer the most effective interaction with its models for coding. Though others had partially reverse-engineered the source code previously, this leak was the first time it became clear ‘how Claude Code assembles a context’ for a user. It also exposed Claude’s agentic framework, which utilises context and turns a user’s query into a series of actions, updating its own instructions along the way.

Agentic loops are essential to AI infrastructures that move beyond LLM-based workflows that operate ‘through predefined code paths’. They allow LLMs to ‘dynamically direct their own processes and tool usage’ meaning that they have autonomy over how they complete tasks. Thus Claude Code interacts with LLMs in an iterative process to incorporate feedback gathered from accessing relevant tools to understand and perform user requests. So, as Anthropic (and others) positions itself as the ultimate AI-coding assistant, this leak provides competitors with the same functionality to improve comparable agentic systems at scale.
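To make the idea of an agentic loop concrete, the following is a minimal Python sketch of the general pattern described above. It is not taken from Claude Code or Claw-Code and makes no claim about how either is implemented. Every name in it (TOOLS, Action, stub_model, agent_loop) is hypothetical, and the ‘model’ is a hard-coded stand-in rather than a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical tool registry: names and behaviour are illustrative only.
TOOLS: dict[str, Callable[[str], str]] = {
    "read_file": lambda path: f"<contents of {path}>",
    "run_tests": lambda _arg: "2 passed, 0 failed",
}


@dataclass
class Action:
    kind: str           # either "tool" or "finish"
    name: str = ""      # tool name when kind == "tool"
    argument: str = ""  # tool argument when kind == "tool"
    answer: str = ""    # final reply when kind == "finish"


def stub_model(context: list[str]) -> Action:
    """Stand-in for an LLM call (assumption): chooses the next action
    from the context gathered so far instead of following a fixed path."""
    tool_results = [entry for entry in context if entry.startswith("tool:")]
    if len(tool_results) == 0:
        return Action(kind="tool", name="read_file", argument="app.py")
    if len(tool_results) == 1:
        return Action(kind="tool", name="run_tests")
    return Action(kind="finish", answer="Done after reading the file and running tests.")


def agent_loop(system_prompt: str, user_query: str,
               model: Callable[[list[str]], Action] = stub_model,
               max_steps: int = 5) -> tuple[str, list[str]]:
    """Ask the model for an action, run the chosen tool, feed the result
    back into the context, and repeat until the model finishes."""
    context = [f"system: {system_prompt}", f"user: {user_query}"]
    for _ in range(max_steps):
        action = model(context)
        if action.kind == "finish":
            return action.answer, context
        result = TOOLS[action.name](action.argument)
        # Tool feedback becomes part of the context for the next step.
        context.append(f"tool:{action.name} -> {result}")
    return "Stopped: step limit reached.", context


if __name__ == "__main__":
    answer, history = agent_loop("You are a coding assistant.", "Fix the failing test")
    print(answer)
    for entry in history:
        print(entry)
```

The point of the sketch is the contrast with a predefined code path: the sequence of tool calls is decided step by step from the accumulated context, and the step limit is simply one plausible safeguard against a loop that never finishes.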

But is all this protected by copyright? Reportedly, “around 90% of Claude Code is written with Claude Code” in some strange chicken-egg vibe-coding nightmare. Following the US Supreme Court’s denial of certiorari in Thaler v. Perlmutter, the D.C. Circuit Court of Appeals judgment appears to rule out fully autonomous generative AI creativity. The question is therefore whether coding with Claude Code is comparable to a human using a tool in a way that meets the originality threshold. Part 2 of the US Copyright Office’s Report on Copyright and AI outlines that protection will be refused where ‘the traditional elements of authorship’ are determined by the system rather than the human (cf. Zarya of the Dawn).

The nature of rewriting (copying?) AI-generated code

Assuming copyright subsists in the source code, there are questions relating to infringement and fair use. Following an overnight panic over Anthropic’s anticipated response, the software community rallied. In the early hours of the morning, Korean developer Sigrid Jin used ‘Open AI’s Codex to rewrite the core agent architecture from scratch in Python’, pushing it to GitHub as ‘Claw-Code’, an open-source repository that was mirrored to such an extent that it is the fastest growing repository in GitHub history (i.e. 100,000 bookmarks). Today Claw-Code contains a disclaimer that while the leak is ‘part of the project’s background, […] the tracked repository is now centred on Python source rather than the exposed TypeScript snapshot’.

For DMCA takedown notices, disclaimers have no effect. Instead, one might reasonably expect that Anthropic’s legal team, which openly uses Claude to assist legal work, has considered whether Claw-Code infringes the right of reproduction (17 U.S.C. §106), the relevance of interoperability (17 U.S.C. §117), and the extent to which fair use may be applicable (17 U.S.C. §107). Much of this analysis rests on whether Claw-Code is an independent work and whether it is original. The former, independent creation through a clean-room process, has yet to be tested in an AI context, but as some reflect, here ‘the separation is architectural, not cognitive’, referencing IBM developers who were physically separated for activities related to accessing, studying, and creating code based on its functionality (e.g. Computer Associates v. Altai). As for the latter, following the discussion above, Claw-Code likely lacks originality, making it a derivative work.

There is also not much to go on for interoperability. Sega and Sony allowed copying and reverse engineering as it was necessary for the defendants to create their own products that work with other systems. Slightly broader, Google v. Oracle allowed API implementation so that developers can ‘put their accrued talents to work in a new and transformative program’. It is not clear whether Claw-Code is sufficiently transformative following Warhol, or even proportionate to the extent of copying (as one could argue this is an adaptation of Claude Code). There is also uncertainty over whether Claw-Code’s open-source objectives outweigh its impact on Anthropic’s actual or potential subscription market for Claude. So while Claw-Code alleges that ‘[i]ndependent code audits confirm that the project contains no Anthropic proprietary source code, no model weights, no API keys, and no user data’, without a direct comparison to Claude Code, this remains legally untested.

The takedown notices

Anthropic initially issued DMCA takedown notices that temporarily disabled 8,100 repositories comprising many legitimate forks relating to their own public Claude Code repository. Within hours, the request was narrowed to one repository and 96 forks, allowing repositories to be reinstated that fell outside the direct copies. This problem is not unique to Anthropic’s (possibly unintended) misuse of takedowns pursuant to 17 U.S.C. § 512. GitHub’s Transparency Report records a 50% increase in DMCA-affected repositories in 2025 from 2024. Despite only 2,661 DMCA takedown notices, 47,288 repositories were impacted.

Some point to the odd copyright nexus that sits behind 17 U.S.C. § 512, as it ‘allow[s] the removal of material without any kind of evidence let alone a judicial order’. Effectively, it incentivises intermediaries to act ‘expeditiously’ on takedowns, a process which the developer community suggests should operate alongside a ‘refundable deposit’ scheme. Naturally, similar arguments circle the implementation of Article 17 of the Digital Single Market Directive, whose review is due in June 2026.

Comment

Just as we thought the haze might clear a little on the relationship between humans, generative AI and copyright law, this leak catapults the discussion to an entirely new level. One where both Claude Code and Claw-Code challenge basic copyright elements centred upon being human. The added layer of DMCA takedown notices further complicates the picture as infringement is simply presumed without (much needed) judicial oversight on the application of legal doctrines to new contexts.

This is not to say that Anthropic’s innovation should not be protected, just that copyright is perhaps completely unsuited to respond to generative AI’s chicken-egg trajectory, particularly when AI-coding assistants are used to reverse engineer source code. Indeed, it is not lost on this Kat that just as Grok seemed out of the AI agent coding game, xAI (the AI company building Grok) gifted Grok credit which ‘greatly helped with the rewrite of Claw Code’ according to Jin.

As another developer remarked, ‘These compute potentates wield far too much power and so I welcome whoever left a window open at Anthropic's digital Versailles.’ Marie Antoinette might never have actually said ‘let them eat cake’ but we seem determined to offer copyright as the same kind of out-of-touch solution.

For now, Claw-Code remains.
