Whenever this Kat writes about AI, responses have started to move away from talk about hallucination, confidentiality or quality. The new concern is one of economics, i.e. will the whole AI thing just become too expensive and are we putting ourselves at risk by outsourcing to AI. It is argued that AI tools are currently being provided at a loss, that the investment propping them up cannot flow forever, and that patent firms will soon not be able to afford the tokens. The implication is that firms racing to outsource work to AI are building on sand, and that we should be planning for the day the bubble pops, tokens escalate in price and the cost-efficiency case collapses. However, for this Kat, the concern that AI use is destined to become unaffordable for firms is based on some assumptions that misunderstand the current and future economics of both AI and the patent industry.
Are we in an AI bubble (and does it matter)?
The argument that we are in an AI bubble points to OpenAI's reported deep negative margins, Anthropic's staggering IPO valuation approaching $1 trillion, and the view of popular journalists such as Ed Zitron, who argue that these are "dangerous, lossy companies" kept alive only by investors who will eventually want their money back.
[Image: A divergence in approach?]
However, the first point to note is that, even if there is a bubble, this tells us almost nothing about whether the underlying technology is real and/or whether firms can afford to ignore it. The relevant historical comparison here is the internet and the dot-com crash. The bursting of the dot-com bubble wiped out about $5 trillion in market value and sent the Nasdaq down 77% from its peak. Despite this, the internet did not turn out to be a fad and disappear. The companies and valuations were a bubble, but adoption of the technology itself was inexorable. Indeed, many of the companies involved in the crash resurfaced. Amazon itself fell about 90% from its peak and did not turn a profit until the end of 2001, and then went on to reshape global commerce and, via the cloud, the entire computing industry. A patent firm in 2001 that had concluded that, because the dot-coms were losing money, the internet was a flash in the pan and could be safely ignored would have been catastrophically wrong, bubble or no bubble. The lesson of a bubble is therefore not to ignore the technology.
It is also not at all clear that the foundational labs are actually a bubble. The losses cited by commentators are not, in fact, the losses of a business that cannot make money selling its product. They are, overwhelmingly, the R&D cost of building the next product. The vast bulk of the spend by the foundational LLM labs is research, development and the compute used to train ever-better models, not the cost of serving the models customers are actually paying for today. The labs are locked in an arms race to build the best model, and as long as that race is running, they will put every available dollar towards the next training run rather than bank a profit. Strip out the R&D and the picture of unit economics looks very different.
Critically, the costs of training and running models also happen to be decreasing for an equivalent standard of intelligence, not increasing. For the frontier labs this drive towards cost efficiency is becoming all the more necessary with the emergence of the Chinese models such as DeepSeek. A significant focus of LLM labs at the moment is therefore how to train and run models with the greatest efficiency. This involves optimizing the model architectures in ways that allow much more cost-effective training of the model, with the focus being on achieving more intelligence from less compute. Clearly, this competition will therefore not make AI more expensive for an equivalent level of intelligence but will instead force the incumbents to become more efficient and to cut their prices. All of this is good news for AI users worried about cost.
AI is cheap, attorneys are expensive
It has also been argued that firms should be cautious about their adoption of AI and how much they outsource to AI tools, because AI prices are rising and will become so expensive that firms will have to ration it, cap attorney use, and/or eventually give it up.
This argument, this Kat would suggest, gets the economics of the patent profession entirely upside down. The relevant comparison is not one of AI cost today versus AI cost in three years. The relevant comparison is instead the AI cost versus the cost of the human hour it replaces. On that comparison, AI is and is likely to remain extraordinarily cheap compared to attorney time.
Anthropic's flagship model, Claude Opus 4.8, currently costs about $5 per million input tokens. A million tokens translates to roughly 750,000 words, or between 2,500 and 3,000 pages. For less than the price of a London pint, you can therefore have the most capable AI model on the market read and analyse the CIPA Guide to the Patents Acts (9th Edition), twice. Now, we can ask ourselves, what would it cost to have a senior partner read and digest 3,000 pages? The disparity in cost is so massive it is almost absurd.
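To put rough numbers on that disparity, here is a minimal back-of-the-envelope sketch. The token price and pages-per-million-tokens figures come from the paragraph above. The partner's hourly rate and reading speed are purely illustrative assumptions, not figures from this post, and the calculation covers input tokens only (the model's written output would be billed on top).

```python
# Figures taken from the text above.
PRICE_PER_MILLION_INPUT_TOKENS_USD = 5.00
PAGES_PER_MILLION_TOKENS = 3_000  # upper end of the 2,500-3,000 page range

# Hypothetical assumptions for illustration only (not from the article).
PARTNER_RATE_USD_PER_HOUR = 600.0
PARTNER_PAGES_PER_HOUR = 30.0


def ai_reading_cost(pages: float) -> float:
    """Input-token cost in USD for a model to read the given number of pages."""
    millions_of_tokens = pages / PAGES_PER_MILLION_TOKENS
    return millions_of_tokens * PRICE_PER_MILLION_INPUT_TOKENS_USD


def partner_reading_cost(pages: float) -> float:
    """Billable cost in USD for a partner to read the given number of pages."""
    hours = pages / PARTNER_PAGES_PER_HOUR
    return hours * PARTNER_RATE_USD_PER_HOUR


if __name__ == "__main__":
    pages = 3_000
    ai = ai_reading_cost(pages)
    human = partner_reading_cost(pages)
    print(f"AI input cost:      ${ai:,.2f}")
    print(f"Partner time cost:  ${human:,.2f}")
    print(f"Ratio (human / AI): {human / ai:,.0f}x")
```

Changing the assumed rate or reading speed by a factor of two or three in either direction does not alter the conclusion; the gap remains several orders of magnitude.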
The worry about the potential cost of AI therefore entirely misses the point and the value that AI offers. If the senior partners in a firm are spending their valuable time doing things that an AI can now do competently in seconds, then the problem is not that AI is too expensive. It is that you are deploying your most expensive resource on your cheapest tasks. Partner time should be spent on the complex strategic judgement that actually justifies a partner's billing rate, the things AI cannot do. Used in this way, AI does not threaten the economics of the firm. It rescues them. Even priced against a trainee or a junior associate, AI is cheap for what it achieves per hour.
Not all tokens are equal
It also seems that there is a technical confusion buried in the worry about the cost of AI for patent firms. There appears to be an assumption that a token is a fixed unit of value, so that a rising price per token means a rising cost to get a job done. However, this is not how the AI models work.
The first thing to understand is that, as the models improve, the capability you get per token keeps rising. As the models get smarter, they accomplish more with the same number of tokens. A frontier model can now read a 70-page patent specification and produce a clause-by-clause claim analysis in a single pass, where a weaker model needed the document fed in chunks, with re-prompting each time it lost the thread or miscounted the claims, and with a fee-earner checking every iterative output and prompting corrections. A more expensive but more capable model can finish a task in a fraction of the tokens a cheaper, clumsier one would use for the same task. If tokens are the petrol, and the model is the car, newer models are faster, more fuel-efficient cars.
Second, there is no longer one model. There is a whole spectrum of models to choose from, even from a single provider, from tiny fast models to flagship reasoning models with extended thinking. A large part of the skill of using AI well is therefore choosing the right model for the job, so that you are not using a sledgehammer to crack a nut.
Finally, the LLM labs are constantly engineering token efficiency behind the scenes. Prompt caching, for example, can cut costs by up to 90% for repeated context, batch processing offers further savings, and techniques such as quantisation, mixture-of-experts routing and distillation mean the same answer is delivered for ever fewer real compute cycles.
Put those together and the claim that something cheap to do with AI last year could be eye-wateringly expensive in a few years is, on the evidence, simply incorrect. The exact opposite has been happening, year after year, at an incredible pace.
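The prompt-caching point can be made concrete with a small sketch. It assumes a simplified pricing model in which repeated (cached) context is billed at a 90% discount after the first request, in line with the "up to 90%" figure mentioned above. Real providers' pricing differs in detail (for example, some charge extra to write to the cache), and the token counts and request numbers below are hypothetical.

```python
def input_cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost in USD of sending `tokens` input tokens at the given price."""
    return tokens / 1_000_000 * price_per_million


def session_cost_usd(
    shared_tokens: int,
    fresh_tokens: int,
    n_requests: int,
    price_per_million: float,
    cache_discount: float = 0.0,
) -> float:
    """Total input cost of n_requests that all reuse the same shared context.

    The first request pays full price for everything. Later requests pay the
    discounted rate on the shared context and full price on the fresh tokens.
    """
    first = input_cost_usd(shared_tokens + fresh_tokens, price_per_million)
    later_each = (
        input_cost_usd(shared_tokens, price_per_million) * (1.0 - cache_discount)
        + input_cost_usd(fresh_tokens, price_per_million)
    )
    return first + (n_requests - 1) * later_each


if __name__ == "__main__":
    # Hypothetical workload: a 100,000-token specification plus instructions,
    # queried 20 times with 1,000 new tokens per question, at $5 per million.
    without_cache = session_cost_usd(100_000, 1_000, 20, 5.0)
    with_cache = session_cost_usd(100_000, 1_000, 20, 5.0, cache_discount=0.90)
    saving = 1.0 - with_cache / without_cache
    print(f"Without caching: ${without_cache:.2f}")
    print(f"With caching:    ${with_cache:.2f}")
    print(f"Saving:          {saving:.0%}")
```

The overall saving is a little under the headline 90% because the first request and the fresh tokens are always paid for in full.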
The cost of AI is going down, not up
Contrary to the popular view that AI costs are increasing, the price of a given level of AI capability has actually been collapsing. By analogy with the famous Moore’s law (whereby the cost of a given amount of computing power roughly halves every couple of years), the venture firm Andreessen Horowitz has coined the term "LLMflation" for the rapid fall in the cost of a given level of AI performance. For a fixed level of performance, the cost of inference has been falling by roughly an order of magnitude, about 10x, every single year. For instance, GPT-4-equivalent performance that cost around $20 per million tokens in late 2022 now costs in the region of $0.40, and economy-tier models deliver comparable quality for a tenth of that again.
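As a rough illustration of what a sustained tenfold annual decline means, the sketch below compounds the rate over a few years. The $20 starting price and the 10x factor come from the paragraph above. The assumption that the rate holds steady, and the function name itself, are illustrative only.

```python
def price_after(start_price: float, years: float, annual_factor: float = 10.0) -> float:
    """Price of a fixed level of capability after `years` of steady decline.

    Assumes the price is divided by `annual_factor` every year, which is a
    simplification of the "roughly 10x per year" trend described in the text.
    """
    return start_price / annual_factor ** years


if __name__ == "__main__":
    start = 20.0  # USD per million tokens, starting figure quoted in the text
    for years in (1, 2, 3):
        print(f"After {years} year(s): ${price_after(start, years):.2f} per million tokens")

    # The fall quoted in the text, from about $20 to about $0.40, as a multiple.
    print(f"Quoted overall fall: {20.0 / 0.40:.0f}x")
```

Even if the true rate turned out to be half as fast, the direction of travel would be the same: the price of a fixed level of capability shrinks by orders of magnitude over a handful of years.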
The reason for the reduction in costs is that running a model is a completely different problem from training one. Training is the huge, headline-grabbing expense. Inference (i.e. actually using the trained model) is comparatively cheap, and getting cheaper as hardware improves (newer chips, 4-bit quantisation, single-GPU serving of large models) and as the models themselves are made leaner. The well-supported industry expectation is that within a few years you will be able to run genuinely capable models on your own hardware.
We can also be fairly confident that this trend will continue. As noted above, the market is ferociously competitive, and the competition is increasingly on price. As this Kat has argued before, when it comes to AI you should not believe the hype, you should believe the data (IPKat). The data to look at in this case is Artificial Analysis's Intelligence vs. Price analysis. Two things stand out from this analysis. First, at the very top, AI intelligence has begun to plateau. Claude Opus 4.8 leads the Intelligence Index at 61.4, with GPT-5.5 at 60.2 and Gemini 3.1 Pro at 57. This is a tight cluster and not the runaway gaps of a couple of years ago. Second, precisely because the leaders are bunched on capability, the live competition is shifting to delivering that intelligence more cheaply, and to offering tailored, cost-effective models for particular tasks. When the frontier models are all roughly as smart as each other, the way you win customers is on price. That is clearly a market structure that will drive costs for users down, not up.
The token-billing scare stories should be read in this light. Many AI providers are moving to token-based billing. However, this is a story of exploding demand, not of unit prices rising. This is the so-called Jevons paradox, whereby when something useful gets cheaper, people use dramatically more of it (e.g. building more roads does not decrease traffic, it just increases road use). Enterprise generative-AI spending grew from $1.7 billion in 2023 to roughly $37 billion in 2025, a more than 20-fold rise, whilst simultaneously the price per token fell by more than 90%. In other words, bills are going up because usage is going up, because companies are recognising that the value-add is real. All of this is the sign of a technology that is transformative for the legal industry, not one that is going to burst and disappear.
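The arithmetic behind that claim can be sketched as follows. It uses the spending figures and the 90% price fall quoted above, and makes the simplifying assumption that total spend equals a single blended price per token multiplied by the number of tokens used. Since the price fell by "more than" 90%, the result is best read as a lower bound.

```python
def implied_usage_growth(
    spend_start: float, spend_end: float, price_drop_fraction: float
) -> float:
    """Multiple by which token usage must have grown.

    Assumes spend = price_per_token * tokens_used, so that
    usage growth = spend growth / price ratio.
    """
    spend_growth = spend_end / spend_start
    price_ratio = 1.0 - price_drop_fraction  # e.g. a 90% drop leaves 10% of the price
    return spend_growth / price_ratio


if __name__ == "__main__":
    # Figures from the text: $1.7bn (2023) to roughly $37bn (2025), price down 90%.
    growth = implied_usage_growth(1.7, 37.0, 0.90)
    print(f"Spend grew about {37.0 / 1.7:.1f}x")
    print(f"Implied token usage grew at least about {growth:.0f}x")
```

On these figures, usage grew roughly ten times faster than spend, which is exactly the pattern you would expect if demand, rather than unit price, were driving the bills.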
Final thoughts
The capabilities of AI are rising fast, and, crucially, the cost of any given level of that capability is simultaneously falling fast, by an order of magnitude a year on the best independent measures. The fear of the unaffordability of AI gets the profession backwards, given that AI is and will remain vanishingly cheap compared to the expert attorney time it frees up. The prediction that costs will spiral upwards is also contradicted by the data showing that the cost of intelligence is falling. In this Kat’s view, therefore, scare stories over rising costs of AI misunderstand both the AI industry and the opportunities it offers patent attorneys. For firms thinking about the long-term economics, the biggest risk by far is ignoring AI rather than learning to use it. The firms that get into affordability trouble will not be the ones using AI; they will be the ones using attorney time like it's still 2024.
Acknowledgements: Thanks, as always, to Mr PatKat (Laurence Aitchison, Head of Reasoning at Mistral) for his invaluable AI-industry insights.