You wouldn’t steal a car or clone Claude - or is model distillation just healthy competition?

Georgia Jenkins

Mar 21, 2026


POV: You reported the other cat for fishing while standing on the same pier

Anthropic recently identified three ‘industrial-scale campaigns’ by DeepSeek, Moonshot, and MiniMax to ‘illicitly extract Claude’s capabilities to improve their own models.’ Referred to as ‘model distillation’, this technique allows a smaller or less capable model to be trained on the outputs of a stronger model. However, these three China-based AI labs allegedly used ‘approximately 24,000 fraudulent accounts’ to generate ‘16 million exchanges with Claude’. While Anthropic acknowledged that model distillation is a ‘legitimate training method’, they characterised this activity as illicit and a violation of their terms of service.
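To make the mechanics concrete, here is a minimal sketch of such a distillation loop in Python, assuming the official anthropic SDK; the prompts, model name, and output file are purely illustrative assumptions, not any lab’s actual pipeline.

```python
# A minimal sketch of a distillation loop, assuming the official
# `anthropic` Python SDK. The prompts, model name, and file name are
# illustrative assumptions, not any lab's actual pipeline.
import json
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

prompts = [
    "Explain the difference between a patent and a trade secret.",
    "Draft a clause restricting reverse engineering of a SaaS product.",
]

with open("teacher_outputs.jsonl", "w") as f:
    for prompt in prompts:
        # 1. Query the stronger 'teacher' model through its public API,
        #    exactly as any paying user would.
        reply = client.messages.create(
            model="claude-sonnet-4-5",  # illustrative model name
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        # 2. Record the prompt/response pair as supervised training data.
        f.write(json.dumps({
            "prompt": prompt,
            "completion": reply.content[0].text,
        }) + "\n")

# 3. The JSONL pairs are then used to fine-tune a smaller 'student'
#    model so that it imitates the teacher's behaviour - all without
#    access to the teacher's weights, architecture, or training data.
```

Note that nothing in this loop touches the teacher’s weights or code; everything flows through the same public interface available to any paying user, which is precisely what makes the legal characterisation below so awkward.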

The déjà vu is hard to ignore. Anthropic’s arguments are markedly similar to those made by authors, artists and performers whose own ‘output’ was used to train these very models. For Anthropic, however, this feels different. They warn that competitors (oh no!) can use it to acquire powerful capabilities from other labs at a fraction of the time and cost it would take to develop them independently. Maybe they should recall the arguments made by creators across the creative industries...
 

Is model distillation ‘IP theft’?

Google’s Threat Intelligence Group likewise reported last month that ‘as organizations increasingly integrate LLMs into their core operations, the proprietary logic and specialized training of these models have emerged as high-value targets’. Where previously more conventional ‘computer-enabled intrusion’ was used to access ‘data containing trade secrets’, the LLM-as-a-service business model has made this far easier and more efficient. They explain that ‘actors can use legitimate API access to attempt to “clone” select AI model capabilities’, allowing a competitor to ‘gain insight into a model’s underlying reasoning and chain-of-thought processes’.

They do this by using ‘legitimate’ accounts to extract information with which to train a new model. Somewhat ironically, Google characterises this as a form of ‘IP theft’ even though access is limited to model outputs, not to undisclosed source code or model weights. From a copyright perspective, even if these outputs meet the threshold for originality, the model provider does not usually own them, but transfers them to the user (IPKat here). Further, patents related to model distillation are owned not by Anthropic but by Google, amongst others, and even if Anthropic were the owner, it would need to show that the exact claimed process was used. Though LLM architecture patents exist, infringement likely requires proving that the new model’s internal design falls within the scope of the patent claims (allowing for equivalents), not merely that it behaves similarly.

That leaves trade secrets and breach of confidence. From a UK/EU perspective, this requires finding that the information in question (e.g. model weights, internal architectures, and training data selection) is secret, has commercial value because it is secret, and is subject to reasonable steps to keep it confidential. However, model distillation does not require direct access to any of these: it uses outputs generated through public or paid interfaces. Arguing that outputs accessible to thousands of users are secret or confidential is a bit of a stretch. Further, recital 16 of the Trade Secrets Directive allows a degree of reverse engineering of lawfully acquired products, unless it is contractually restricted. It is not obvious why probing the behaviour of a publicly offered AI service via prompts should be treated differently.

Vibe coding contractual fences around the Claude ecosystem

Anthropic’s Consumer Terms of Service prohibit users from sharing account-related information (e.g. the Anthropic API key) and make users responsible for all activity undertaken with their account. Users cannot use Anthropic’s services to ‘develop any products or services that compete’ with Anthropic, ‘including to develop or train any artificial intelligence or machine learning algorithms or models’. The terms also prohibit decompiling, reverse engineering, and disassembling Anthropic’s services, alongside prohibitions on crawling or scraping data or information and on accessing the services through automated means. There is also a general prohibition against ‘bypassing their systems or protective measures’, and a Usage Policy that prohibits users from engaging in ‘fraudulent, abusive or predatory practices’ and generally from ‘abus[ing] our platform’.

Whether these clauses would qualify as unfair contract terms (e.g. under the EU Unfair Terms Directive) is debatable. However, they certainly challenge interoperability (see the CMA AI Foundational Models Guidance and Article 6(7) of the Digital Markets Act). In effect, they likely complicate third-party tool integration and genuine ‘multi-homing’ across competing models by business users (think of a driver using Uber and Lyft simultaneously). If a provider like Anthropic holds significant market power, restrictions that ring-fence front-end access warrant further scrutiny, as upstream dominance is being used to preference Anthropic’s own tools downstream. The restrictions shut out neutral multi-model tools in software development that allow developers to use multiple LLMs within a single workflow.

Last month Anthropic doubled down on excluding external tools (e.g. OpenCode, OpenClaw, Cline, and Roo Code) from accessing their services. Previously, developers could use the OAuth token from one of the cheaper Claude subscriptions alongside open-source tools. By banning this activity, it appears that developers must either pay Anthropic’s API prices or subscribe to Claude Code. The value is in the detail: every time a developer sends a request, the tool sends a large proportion of the codebase to the model for context, so how Anthropic prices this activity is critical. To further cloud the picture, Anthropic is heavily subsidising Claude Code, likely to push developers towards Anthropic tools and make them reliant on its ecosystem. While Anthropic’s Head of Product, Tariq Shihipar, attempted to respond to developer concerns on X, the text of the legal compliance note remains the same.
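To see why the pricing detail matters, here is a back-of-the-envelope sketch, assuming the common rough heuristic of ~4 characters per token and an illustrative input price; real tokenisers and Anthropic’s actual rates will differ.

```python
# A rough sketch of per-request cost for a coding tool that resends
# source files as context. The ~4-chars-per-token heuristic and the
# per-token price below are illustrative assumptions, not real rates.
from pathlib import Path

PRICE_PER_MILLION_INPUT_TOKENS = 3.00  # illustrative USD figure

def estimate_request_cost(repo_dir: str, question: str) -> float:
    """Estimate the cost of one request that packs the repository's
    Python files into the prompt alongside the developer's question."""
    context = question
    for path in Path(repo_dir).rglob("*.py"):
        context += path.read_text(errors="ignore")
    approx_tokens = len(context) / 4  # rough chars-to-tokens conversion
    return approx_tokens / 1_000_000 * PRICE_PER_MILLION_INPUT_TOKENS

# A 2 MB codebase resent on every request is ~500,000 tokens, i.e. about
# $1.50 per request at the illustrative rate above - and agentic tools
# can fire off dozens of requests per task.
```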

Comment: You really cannot make this up (generate it) if you tried

This Kat may have laughed (hard) when she heard that Anthropic was concerned about IP theft related to the alleged model distillation ‘attack’, particularly its plea that:
The window to act is narrow, and the threat extends beyond any single company or region. Addressing it will require rapid, coordinated action among industry players, policymakers, and the global AI community. 
Setting aside the arguments in Bartz v. Anthropic that the very data used to train Claude infringes copyright (IPKat here), a closer look at Anthropic’s terms of use and enterprise offerings hints at something much bigger. While Anthropic may have missed the boat on consumer-facing LLM integration (e.g. ChatGPT), they appear to have set their sights firmly on the enterprise software market. This Kat would suggest reading between (and beyond) the lines of the model distillation ‘attack’ statement. The picture is inherently more complex as Anthropic forges ahead in their quest to bolster their responsible AI reputation and to tighten contractual and technical control over their services, to the detriment of agentic open-source tools and multi-model development environments.

So maybe you would download a car after all or perhaps even, distill Claude.