Use of AI in the patent industry: The spectre of hallucination

Rose Hughes

Oct 5, 2025

Last time, this Kat covered some practical steps on how to ensure client confidentiality when using AI tools (IPKat). In this post, we will look at a second concern many patent attorneys have with generative AI: its propensity to simply make up facts and present them as truth. What are the risks that the output from an AI will include fabricated facts, and how can patent attorneys using AI tools understand and mitigate this risk?

There is already evidence of LLM-generated errors creeping into both science articles and patent applications, with numerous examples of AI artefacts. However, these are only the errors we can see. These are also the errors that could have been easily corrected with some simple proofreading (after all, obvious errors creeping into patent applications are not a new problem!). The real concern about the use of AI in the patent industry is the possibility for LLMs to generate content that looks prima facie accurate, but is in fact entirely fabricated. All AI software tools for patents that use LLMs (this Kat is not aware of one that is not based on LLMs, IPKat) may produce hallucinations. Indeed, hallucinations are particularly problematic, and even likely, in complex technical areas such as patents. The solution is choosing the right tool for the right job and recognising that LLMs are a useful tool, but not a replacement for technical expertise.

Obviously AI-generated content is just the tip of the iceberg

All patents are based on science, and so science is a good first place to look for the risks of relying on AI-generated content. There is already considerable evidence of errors resulting from the use of AI to write academic and peer-reviewed journal articles, examples of which are collated by the very helpful Academ-AI website. One particularly memorable example is the 2024 review article, "Cellular functions of spermatogonial stem cells in relation to JAK/STAT signaling pathway", published in Frontiers in Cell and Developmental Biology (Guo et al.). The article included a clearly AI-generated and downright disturbing Figure 1 (parental guidance recommended), and was subsequently retracted for not meeting "the standards of editorial and scientific rigor" for the journal.

Whilst the use of AI in Guo et al. was declared and obvious, there are also now hundreds of instances of the undeclared use of AI in academic articles and journals. This use is apparent from the obvious AI-generated errors creeping into the text. To choose just one recent example from many hundreds, the chapter on RNAi in the book Biotechnological Advances in Agriculture, Healthcare, Environment and Industry (published in 2024) includes the telling sentence "Please note that the status of FDA approvals may have changed since my last update, so it's a good idea to consult the most recent sources or official FDA announcements for the latest information on RNAi‑based therapeutics". Similarly, an article by Abbas et al. in Cancer Research Review includes the statement "Certainly! Here are 15 references related to 'The role of artificial intelligence in revolutionizing cancer research'".

Is that a Kat I see before me? 

We also now have similar examples of careless AI use in patent drafting. The patent application AU2023233168A1, for example, is clearly and unashamedly written by AI, apparently without any attempt by a human to edit the output. In the background section, the AI proclaims that "As of my last knowledge update in September 2021, I do not have access to specific patent numbers or details of similar inventions that may have been filed after that date."

Just a proofreading problem? 

The examples given above show that AI is being used to write science content, both in academic articles and patents. For patents, there is nothing intrinsically bad about using AI to write content. What is shocking is that so little care was taken to proofread the text. Of course, this problem did not begin with AI. Before LLMs, patents were already full of errors. The description of the granted US patent US10942253 B2 includes a note "QUESTION to inventor: is that correct?", whilst US20040161257 A1 includes the following dependent claim: "The method of providing user interface displays in an image forming apparatus which is really a bogus claim included amongst real claims, and which should be removed before filing; wherein the claim is included to determine if the inventor actually read the claims and the inventor should instruct the attorneys to remove the claim". This Kat's favourite example remains US9346394 B1, which includes the memorable embodiment wherein "the sensor 408 works with a relay in a known manner I'm sorry babe, but I may actually have to be here late. I've got to get this patent application filed today. Thankfully, Traci is willing to stay late to help me get it done".

Both before and after the rise of LLMs, it has always been necessary to take care with your descriptions. Patent specifications are long, wordy and repetitive, and so can be difficult and time-consuming to proofread. Ironically, this is where AI may actually add value, by helping attorneys screen for obvious errors before filing. However, these obvious errors are clearly only the tip of the iceberg with respect to LLM use in patents. LLMs are improving all the time, and it is becoming increasingly difficult to detect AI-generated content, as the rise and fall of the infamous AI-ism "delve" tells us (for some reason, LLMs love the word "delve", and it was one of the surest signs that text had been written by AI, until everyone realised this and prompted it away...).

Flying under the radar

Obvious AI errors are great for light entertainment and good fodder for the AI sceptics. However, the real risk of AI to patent quality comes from the errors that fly under the radar, in the form of highly believable hallucinations. Hallucination is the term given to an AI-generated response that contains false or misleading information dressed up as fact. The problem of hallucinations is no mere academic concern. Lawyers, including IP lawyers, have already found themselves in serious trouble for submitting legal briefs containing entirely fabricated case citations conjured up by an LLM, as this helpfully compiled online database of cases records. The recent UK decision in O/0559/25, relating to a trade mark case, is a timely reminder of the dangers.

The risks for patents themselves are even greater. There is not only the risk of dodgy citations but also the more serious issue of hallucinated facts and data creeping into the patent application itself. LLMs are perfectly capable not only of generating the text but also of manufacturing false data, leading to an increased risk of patent applications containing fabricated examples.

Understanding why hallucinations happen

To understand how to address the problem of hallucinations, it is first helpful to understand why they happen. LLMs are initially trained on vast quantities of data. The sheer amount of data available to LLMs on the internet is one of the reasons they are so effective. However, the underlying biases and characteristics of these data influence the outputs. One important characteristic of the internet is that people very rarely say "I don't know". The internet is a resource of information presenting itself as authoritative. Most questions posted on the internet will have been answered by someone, in some form, and there are no Wikipedia articles saying "sorry, no-one really knows much about this topic". At their core, LLMs are therefore not trained in a way that makes them likely to admit that they cannot answer a question. Instead, an LLM will often consider an answer, any answer, a more probable response than a simple "I don't know".
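
For readers who like to see the mechanics, the effect can be illustrated with a short toy Python sketch. The candidate continuations and their scores below are entirely invented for illustration (they do not come from any real model); the point is simply that a fluent, confident-sounding continuation ends up far more probable than an honest admission of ignorance.

```python
import math
import random

# Toy illustration of why an LLM rarely abstains: sampling favours whichever
# continuation scores highest given the training data, and "I don't know" is
# rare on the internet. The scores below are invented for illustration only;
# they are not any real model's numbers.
candidate_logits = {
    "The closest prior art is document D1, which discloses feature X.": 2.1,
    "A similar compound was approved for clinical use in 2019.": 1.8,
    "I don't know.": -1.5,
}

def softmax(logits):
    """Convert raw scores into a probability distribution over continuations."""
    exps = {text: math.exp(score) for text, score in logits.items()}
    total = sum(exps.values())
    return {text: value / total for text, value in exps.items()}

probabilities = softmax(candidate_logits)
for text, p in sorted(probabilities.items(), key=lambda item: -item[1]):
    print(f"{p:.2f}  {text}")

# Sampling from this distribution almost never yields the honest "I don't know",
# even though the confident-sounding continuations may be entirely fabricated.
sampled = random.choices(list(probabilities), weights=list(probabilities.values()))[0]
print("Sampled continuation:", sampled)
```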

Hallucinations are particularly likely in the patent industry

Recognising hallucinations as a core problem with LLMs that needs solving, the foundational LLM labs are making huge efforts to train their models not to make up facts and present them as genuine. One strategy is human-led fine-tuning of the models (Reinforcement Learning from Human Feedback, RLHF) to reduce their propensity to produce errors. In this training, humans are given a series of LLM prompts and responses and tasked with identifying errors and hallucinations. These data can then be fed back to train and update the model so as to make the errors less likely.

Human error labelling works well for reducing errors that can be clearly identified by a human. However, RLHF is far less effective at combating hallucinations involving complex prompts and technical areas. From a purely psychological perspective, if humans are not themselves confident of the answer, they are likely to reinforce an LLM that sounds certain of its conclusion. After all, we all tend to believe someone based on the level of confidence they express. This leads to the phenomenon whereby LLMs are trained to become increasingly confident and intransigent in their positions the more wrong they are, to the extent that LLMs will even begin fabricating citations to support their positions.
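
For the technically minded, the following minimal sketch shows the kind of pairwise preference loss commonly used when training a reward model for RLHF (a Bradley-Terry style logistic loss). The scores are invented for illustration; the point is only that whichever response the human labeller prefers is the one the model learns to favour, regardless of whether it is actually correct.

```python
import math

def pairwise_loss(score_chosen: float, score_rejected: float) -> float:
    """Logistic loss that is small when the chosen response outscores the rejected one."""
    return -math.log(1.0 / (1.0 + math.exp(-(score_chosen - score_rejected))))

# If the labeller cannot verify the technical facts, they will tend to "choose"
# the response that merely sounds more confident, so a confident hallucination
# can be rewarded over a cautious but accurate answer.
confident_but_wrong = 1.4    # hypothetical reward-model score
hedged_but_accurate = 0.6    # hypothetical reward-model score

print("Loss when the confident answer is preferred:",
      round(pairwise_loss(confident_but_wrong, hedged_but_accurate), 3))
print("Loss when the accurate answer is preferred:",
      round(pairwise_loss(hedged_but_accurate, confident_but_wrong), 3))
```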

Additionally, there are simply not enough domain experts to adequately evaluate and train LLMs to perform well in complex technical areas. LLM labs are trying their best to recruit experts to help them train their models, including from the legal industry. However, expert time is expensive and in relatively short supply. On top of this, the amount of training data available, such as outputs from sophisticated prompts, is also generally inadequate for the scale of training required. Given that patents are an area that is both highly technical and relatively niche, hallucinations remain highly likely.

Using an LLM to generate patent descriptions, or to suggest arguments in response to office actions, based entirely on the LLM's own general knowledge base and not on any specific data or input is thus fraught with hallucination risk, especially when working within highly specialist technical areas. However, this is how many of the existing AI tools for IP operate. This is also why many of the current AI tools for patent work are not fit for purpose without considerable input and optimisation by an expert user. When an LLM is providing you with new facts from its own knowledge base, rather than from facts you have supplied it with, it is absolutely necessary to check the accuracy of every one of those facts.

Interestingly, whilst many of the more established AI tool providers address the issue of confidentiality head-on, the problem of hallucination often gets only a passing mention. The majority of AI tool providers emphasise that the attorney remains "in the driving seat", and that it is up to the user to check the AI output. One provider states that its "models are designed by IP and Machine Learning experts to minimise hallucinations and produce reliable outputs". But as we covered in the last post, these AI providers are very much not developing their own LLMs; they are ChatGPT-wrappers prompting one or more foundational LLMs. The issue of hallucinations thus lies, to some degree, outside the AI tool providers' control (however much they might like to imply otherwise!).

The patent attorney's professional code of conduct remains unchanged

The possibility of factual error is not a new problem for the patent industry. Before an attorney relies on any information for the provision of legal advice or a patent application draft, it has always been necessary to check the original source material or to consult an expert in the technical field. This requirement has not changed. Before LLMs, the internet was already full of advice and blog articles on science and patent law of dubious veracity. Indeed, prior to LLMs, Wikipedia was the misinformation whipping boy.

Guideline 3 of the epi guidelines on the Use of Generative AI in the Work of Patent Attorneys states that "Members remain at all times responsible for their professional work, and cannot cite the use of generative AI as any excuse for errors or omissions". This Kat would be shocked if any patent attorney thought differently. But how does a patent attorney using an LLM, or an AI tool using an LLM, ensure that there are no errors? Is the solution simply never to use LLMs? The explanatory note from epi reveals a certain scepticism that AI will bring any efficiencies, given the time needed for checking the outputs. epi thinks instead that Members "should be prepared to explain to clients that the checking requirements associated with use of generative AI may not result in net savings of time in specific instances". 

The right tool for the right job

This Kat is far from an AI expert. Nonetheless, one distinction seems clear: whilst relying on LLMs to tell you accurate facts about subjects in which you are not an expert is fraught with risk, LLMs are excellent at processing and analysing language data provided to them by the user. Having a very clear set of user inputs and source material for the LLM to process also renders the validation process far more efficient. There are innumerable tasks within a patent workflow to which the appropriate use of LLMs can add considerable value, by improving not only efficiency but also quality. These processes and tasks are very far removed from asking an LLM to generate de novo content based on its own broad and generalist knowledge base.
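
As a rough illustration of the difference between these two modes of use, the Python sketch below contrasts an open-ended prompt, which invites the model to supply facts from its own knowledge base, with a grounded prompt built around attorney-supplied source material. The call_llm function is a hypothetical stand-in for whichever model interface or tool is actually used; none of the names here refer to a real product.

```python
from typing import Callable

def open_ended_prompt(invention_title: str) -> str:
    # Risky: invites the model to supply prior art, data and citations from its
    # own training data, every item of which would need independent verification.
    return f"Write the background section of a patent application for: {invention_title}"

def grounded_prompt(source_material: str, instruction: str) -> str:
    # Safer: the model is asked only to reorganise material the attorney has
    # already verified, and to flag anything it cannot find in that material.
    return (
        "Using ONLY the source material below, " + instruction + "\n"
        "If something is not stated in the source material, say so explicitly "
        "rather than filling the gap.\n\n"
        "SOURCE MATERIAL:\n" + source_material
    )

def draft_with_review(call_llm: Callable[[str], str], source_material: str) -> str:
    """call_llm is a hypothetical stand-in for whatever model interface is used."""
    prompt = grounded_prompt(source_material, "draft a first-pass summary of the invention.")
    draft = call_llm(prompt)
    # The output is still reviewed line by line against the source material;
    # grounding makes that validation quicker, it does not make it optional.
    return draft
```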

The danger of LLM hallucinations in the patent industry is therefore very real. However, it is also important that we do not throw the baby out with the bathwater. Patent attorneys looking to use LLMs to cut corners and generate entire work products without expert direction and validation are likely to come unstuck. At the same time, as one of the most language-based industries, the patent field is one of the technical fields to which LLMs undoubtedly have considerable value to add, if used appropriately and based on expert-led optimisation and sophisticated prompt engineering. It is, and has always been, essential that patent attorneys understand and have a deep technical knowledge of the subject matter within the work product they produce. This does not change just because AI is used.

