Google Translate Woes...


Warren Smith

Nov 8, 2023, 3:53:10 PM
to hon...@googlegroups.com

At least I think that Google Translate is the culprit (as I assume that Google Patents uses Google Translate).

 

This is the first time I have had to deal with this from the other side of the divide (from the patent agent side, rather than the translator side). I am doing a prior art search for a patent I am writing. Unfortunately, the Chinese have been filing literally millions of patents in the last several years (1.58 million patents from China in 2022). Most of these patents are junk, so I have not worried about them. Now, all of a sudden, I have to consider them as potential prior art.

 

In my current search, most of the patent applications that I find (by combining keywords with international patent classifications) to be potentially relevant are from China. Fortunately, Google Patents provides translations of the patents I am looking at. Unfortunately, the translations are completely incomprehensible. Check out a title: "Protection that inserted scaffold frame was used turns over board." What the heck does that mean? No clue. Perhaps the Chinese would have been more easily understood, even though I don't read Chinese. Fortunately, there are usually pictures to look at....

 

It is shocking to me that compensation in the patent translation industry (at least from Japanese into English) has largely collapsed when machine translation is so BAD. How do we let our clients know that it is far premature to hop on to this machine translation bandwagon?

 

W

 

Matthew Schlecht

Nov 8, 2023, 5:24:16 PM
to hon...@googlegroups.com
On Wed, Nov 8, 2023 at 3:53 PM Warren Smith <warren...@comcast.net> wrote:

 

It is shocking to me that compensation in the patent translation industry (at least from Japanese into English) has largely collapsed when machine translation is so BAD. How do we let our clients know that it is far premature to hop on to this machine translation bandwagon?


To date, East Asian languages have proven less amenable to LLM/MT, but that will change as LLMs are designed for and populated with East Asian-language source content.
AI-driven cars come with a caveat that a human supervisor must always be in the loop, and AI-derived MT content should be offered to end users with a similar caveat.
During the ATA National Conference two weeks ago, I spoke with James Phillips of the PCT Translation Division at WIPO about WIPO abstracts and the translations thereof offered at the site. Specifically, whether anything can be done about the quality.
WIPO doesn't do translation of the abstracts (no budget for that). When available, those translations have been supplied by the applicant. WIPO also doesn't do any QC on the translations (no budget for that, either). So, the fidelity of those translations is understandably variable. The fidelity of searches based thereon will then certainly suffer.
I know from experience that many English versions of source German or source Japanese patent abstracts are quite poor, and in some cases are not translations at all but a brief summary in English of what the summarizer thinks are the key points in the abstract. This approach is not conducive to effective searchability.
I know that it doesn't relate to the point you made in your post, but searching abstracts in the source language, or all the major source languages including Chinese, is probably the most effective approach to doing prior art searches.

Matthew Schlecht, PhD
Word Alchemy Translation, Inc.
Newark, DE, USA
wordalchemytranslation.com

Robin Capps

Nov 9, 2023, 9:48:04 PM
to Honyaku E<>J translation list
Matthew, could you please clarify your statement "WIPO doesn't do translation of the abstracts (no budget for that). When available, those translations have been supplied by the applicant. WIPO also doesn't do any QC on the translations (no budget for that, either)."?
You perhaps misunderstood something in your conversation with James? If you're referring to PCT applications, a small percentage of abstracts are translated into English in-house at WIPO, and the remaining volume is translated using agencies contracted directly by WIPO. WIPO does perform QC on the translations, but again, only a small percentage of the total volume due to limited staff numbers. Abstract translations are not supplied by the applicant...

Thank you,
Robin Capps

Matthew Schlecht

Nov 11, 2023, 11:55:59 AM
to hon...@googlegroups.com
On Thu, Nov 9, 2023 at 9:48 PM Robin Capps <robin...@gmail.com> wrote:
Matthew, could you please clarify your statement "WIPO doesn't do translation of the abstracts (no budget for that). When available, those translations have been supplied by the applicant. WIPO also doesn't do any QC on the translations (no budget for that, either)."?
You perhaps misunderstood something in your conversation with James? If you're referring to PCT applications, a small percentage of abstracts are translated into English in-house at WIPO, and the remaining volume is translated using agencies contracted directly by WIPO. WIPO does perform QC on the translations, but again, only a small percentage of the total volume due to limited staff numbers. Abstract translations are not supplied by the applicant...

Thanks for your critical reading of my comments.
I thought the best thing would be to get back to James Phillips to see whether I had understood him correctly.
I did get back to him, and it turns out that I did misinterpret some of what he told me in the brief discussion. The clarification:

We do translate the abstracts. It is stipulated in the PCT Treaty that this is our responsibility, together with written opinions and international search reports (written opinions are actually the bulk of the work).
The client can provide a translation of the abstract but whether it is used or not will depend on the accuracy. We do have a budget for all of this and we do perform quality control.
The quality control involves sampling a statistically calculated percentage of the documents.
We are trying to move to a more risk-based approach using business intelligence applications, as described in [his ATA64] presentation.

The title of the presentation was "Post-editing at an International Organization", and dealt with WIPO's analyses of the post-editing of machine translations of patents and related documents, some of which is done in-house and much of which is contracted out. Some of the points made were that the post-editing effort was largely proceeding well, although a percentage of post-editors were found to be doing retranslation.
The QC is done on small samples according to statistical guidelines.

I regret that my comments were misleading.

The main point of my post was that the integrity of the content of abstract translations/summaries should not be assumed, and that searching patent abstracts in the source language, or in all the major source languages including Chinese, is probably the most effective approach to doing prior art searches.

Dan Lucas

Nov 12, 2023, 2:14:46 AM
to Matthew Schlecht, Honyaku E<>J translation list
Matthew, I have no real interest in patents, but I appreciate the conscientious update in response to Robin's query.

Dan Lucas

Philip Schnell

Nov 16, 2023, 8:03:09 PM
to Honyaku E<>J translation list
Yes, I can confirm that the WIPO abstracts are all hand-translated—I do some of them on a weekly basis myself. I can’t vouch for the Chinese ones, but I’ve read thousands of the translations of Japanese abstracts over the years, and while the quality can be variable, they’re generally of decent quality, especially considering that the originals tend to be elliptical to a fault. Translators don’t have access to the full patent text (or even more than one drawing) when translating the abstracts, so at times it can be ... challenging.

The full-text translations on Google Patents, on the other hand, are machine-generated (unless there is a pre-existing translation) and are quite poor, though they don’t seem to be generated on the fly the way the JPO's are. The Google algorithm can’t handle even simple claim structures adequately.

Philip Schnell

Matthew Schlecht

Nov 16, 2023, 10:35:37 PM
to hon...@googlegroups.com
On Thu, Nov 16, 2023 at 8:03 PM Philip Schnell <fish...@gmail.com> wrote:
The Google algorithm can’t handle even simple claim structures adequately.

I believe that at least part of the problem is that the JA source text input often carries hard returns at the line breaks within a single claim, and consequently the MT input algorithm makes them into separate segments.
It is *patently* obvious that parsing individual phrases from within a claim in this way will lead to gibberish.
In going from Japanese to English, content at the end of the source claim string must appear at the beginning of the target claim string, which is impossible if these portions are in separate segments.
If soft returns are used at line breaks and the entire text of the claim is contained within a single segment, there is a much higher likelihood that the target machine-generated text will make sense.
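To illustrate the point about segmentation, here is a minimal Python sketch of a pre-processing step that joins hard-wrapped lines so that each claim reaches the MT engine as a single segment. It is only an illustration, not a description of what Google Patents or any other service actually does. The function name `merge_claim_segments` is hypothetical, and the sketch assumes that every claim starts with a header of the form 【請求項1】 at the beginning of a line.

```python
import re

# Assumption: each claim begins with a header such as 【請求項1】 or 【請求項１】
# (ASCII or full-width digits) at the start of a line.
CLAIM_HEADER = re.compile(r"^【請求項[0-9０-９]+】")


def merge_claim_segments(raw_text: str) -> list[str]:
    """Join hard-wrapped lines so that each claim becomes one segment.

    Lines are concatenated without a separator, because Japanese text
    does not put spaces between words.
    """
    segments: list[str] = []
    current: list[str] = []
    for line in raw_text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines left over from the layout
        if CLAIM_HEADER.match(line) and current:
            # A new claim header closes the segment collected so far.
            segments.append("".join(current))
            current = []
        current.append(line)
    if current:
        segments.append("".join(current))
    return segments
```

Passing the full claims text through such a function before translation gives the engine the whole claim at once, so content at the end of the Japanese claim is at least available when the beginning of the English claim is generated.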

Philip Schnell

Nov 17, 2023, 4:00:51 AM
to Honyaku E<>J translation list
Yes, I think you’re right, though in my experience MT tends to fail pretty badly at parsing complex syntactical structures in patents anyway, even when the segments have been aligned properly.