On 2026-05-11 08:18, 'WARREN SMITH' via Honyaku E<>J translation list wrote:
> It is somewhat unusual to find myself on the consumer side of
> translation, rather than acting as the translator.
> I am currently handling a portfolio of approximately 30 recently
> published Chinese patent applications that need to be filed in the
> United States. As part of an initial assessment, I conducted a
> controlled comparison using one application. I reviewed the machine
> translation available on patents.google.com and compared it against the
> source Chinese text. I then provided both the original text and the
> machine translation, along with the PDF of the source document, to
> ChatGPT for evaluation.
> The results were mixed. On the one hand, the machine translation was
> generally serviceable and, with attorney-level edits, could form a
> workable basis for direct U.S. filing. However, I observed several
> issues that are difficult to justify given the maturity of current
> translation systems. For example, a numerical value of 1.035 in the
> source text appeared as 1.065 in the English translation. There was also
> at least one instance of a fully omitted sentence. These are not the
> types of errors one would expect in a relatively controlled technical
> domain.
I would say these are not the types of errors one would expect given a
modernist conception of AI (e.g., the Star Trek computer).
LLMs, however, operate on post-postmodern principles, under which
unexpected behavior is entirely to be expected (even though it is
unexpected).
I have recently noticed an increase in cases of omitted text and
untranslated source text appearing in the translation output of several
different LLM/MT systems, which I don't recall seeing at all in the past.
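A swapped number like the 1.035/1.065 case can at least be caught
mechanically. What follows is only a minimal sketch in Python, not a
tool I am claiming anyone uses: the function name and the sample strings
are made up for illustration, and it assumes both texts write numbers
with ASCII digits (Chinese numerals or full-width digits would need
normalizing first).

```python
import re
from collections import Counter

# Matches integers and simple decimals such as 250 or 1.035.
NUMBER = re.compile(r"\d+(?:\.\d+)?")

def numeric_mismatches(source: str, translation: str):
    """Return (missing, unexpected) numeric literals as Counters.

    missing:    numbers in the source that the translation lacks
    unexpected: numbers in the translation that the source lacks
    """
    src = Counter(NUMBER.findall(source))
    tgt = Counter(NUMBER.findall(translation))
    return src - tgt, tgt - src

# Hypothetical snippets, invented for this example.
source = "压力比为1.035，温度为250"
translation = "The pressure ratio is 1.065 and the temperature is 250"

missing, unexpected = numeric_mismatches(source, translation)
print(sorted(missing), sorted(unexpected))  # ['1.035'] ['1.065']
```

A check like this only flags numbers; an omitted sentence that contains
no figures would pass unnoticed, so it complements a human or LLM review
and does not replace one.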
> ChatGPT performed well as a secondary review tool. It identified not
> only the numerical discrepancy and omission, but also flagged internal
> inconsistencies in the source material itself, such as a mismatch
> between “nitrogen” in the abstract and “argon” in the claims.
> What was more surprising was the variability across the portfolio. A
> second application, similar in subject matter and filed within a
> comparable timeframe, had a machine translation of significantly lower
> quality, with numerous substantive errors. This raises a practical
> concern: why is translation quality inconsistent across closely related
> documents, even when the underlying technology and filing dates are
> similar?
The most obvious explanation would be that different models were used
for the different translations, but even the same model can produce
highly divergent results.
An LLM has no object model of translation and always solves just one
problem: given the input (including the user's prompt and the LLM's own
previously generated output), what is the most likely next token (or set
of n tokens)? So on the whole, it behaves like a chaotic system, i.e.,
one that is potentially highly sensitive to slight differences in the
starting conditions.
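To make the feedback loop concrete, here is a toy sketch in Python. It
is emphatically not a real LLM: the vocabulary, the function names, and
the hash-based weights are all invented for illustration, and the hash
is deliberately input-sensitive, standing in for a model whose next-token
distribution depends on the entire context. The only point is the
structure of the loop, in which each generated token becomes part of the
input for the next step.

```python
import hashlib
import random

# Hypothetical toy vocabulary.
VOCAB = ["the", "value", "is", "1.035", "1.065", "argon", "nitrogen", "."]

def next_token_weights(context):
    """Toy stand-in for a model: weights depend on the whole context."""
    digest = hashlib.sha256(" ".join(context).encode("utf-8")).digest()
    # One positive weight per vocabulary entry.
    return [1 + b for b in digest[:len(VOCAB)]]

def generate(prompt, n_tokens, seed):
    """Sample n_tokens one at a time, feeding each one back as input."""
    rng = random.Random(seed)
    context = list(prompt)
    for _ in range(n_tokens):
        weights = next_token_weights(context)
        token = rng.choices(VOCAB, weights=weights, k=1)[0]
        context.append(token)  # output becomes part of the next input
    return context[len(prompt):]

prompt = ["translate", ":"]
print(generate(prompt, 8, seed=1))          # one sampling run
print(generate(prompt, 8, seed=2))          # same prompt, different draw
print(generate(prompt + ["!"], 8, seed=1))  # slightly different prompt
```

The same prompt and seed always reproduce the same output, but once a
single token differs, every later step is conditioned on a different
context, so the runs have no particular reason to converge again.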
Herman