NYT article on improved Google Translate


Tom Gally

Dec 14, 2016, 8:26:08 PM
to hon...@googlegroups.com
The New York Times has an article on the wonders of artificial intelligence that begins by discussing the apparent vast recent improvements in the quality of the output of Google Translate between Japanese and English:


I tried running an editorial from today's Asahi Shimbun through it, and the English translation was remarkably good—some dummy “it” subjects and other problems, but understandable overall. A few sentences looked better than what an inexperienced human translator might produce. I can imagine that some clients who had been paying human translators to translate texts for information purposes might switch to Google for most of their work.

Back in 2010, I posted some remarks here about minimal sentences I have used to test the real-world knowledge of machine translation systems:


Though seemingly improved in other ways, the new AI-boosted Google Translate still cannot identify the relative ages of siblings based on their birth years. Here are the results I got today:

Original: I was born in 1993, and my sister was born in 1994.
GT: 私は1993年に生まれ、私の妹は1994年に生まれました。

Original: I was born in 1993, and my sister was born in 1992.
GT: 私は1993年に生まれ、私の妹は1992年に生まれました。
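
For concreteness, the inference this test looks for can be written in a few lines of Python. This is only an illustrative sketch: the function name `sister_term` is made up here, and it says nothing about how GT works internally.

```python
def sister_term(speaker_birth_year: int, sister_birth_year: int) -> str:
    """Pick the Japanese word for "my sister" from the two birth years."""
    if sister_birth_year < speaker_birth_year:
        return "姉"   # born earlier, so an older sister
    if sister_birth_year > speaker_birth_year:
        return "妹"   # born later, so a younger sister
    return "姉/妹"    # same year: the years alone do not decide it


# The two test sentences above: speaker born in 1993.
for sister_year in (1994, 1992):
    print(sister_year, sister_term(1993, sister_year))
```

By this rule the second sentence calls for 姉, where GT produced 妹 both times.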

I was a freelance J-E translator from 1986 to 2005. Although there was fuss about MT during that period, too, I never worried about losing work to it. I will be retiring from my current university position in six years and have started thinking about what to do for my next career. Should I think twice about returning to translation?

Tom Gally
Yokohama, Japan

Jon Johanning

Dec 15, 2016, 2:02:58 PM
to Honyaku E<>J translation list
Surveying the state of MT is one of my interests, and I can see from all over that the MT folks are bursting with pride at what AI is supposedly doing for them.

The old older-younger-sibling problem in Japanese<>English is not going to be easy for them to solve even with the snazziest neural nets. At least for a while yet. And that's just one snag out of many for machines trying to take over from us humans in bridging the two languages. It seems that MT'ing two languages from quite different cultures and histories is a lot harder than working between the European languages, for example.

As for returning to translation, I would say that the gap between very good, experienced freelancers and less experienced ones is growing pretty rapidly. The offers I keep getting in my email are mostly only suitable for people who probably aren't on this list. The agencies I have been getting jobs from for 30+ years are coming up mostly dry at this point, so I'm scrambling to get more direct clients whose needs I can meet, and that's not easy. I think you're right that clients who aren't particularly choosy about the quality of translations are switching, and have already switched, to getting free, instant results from not only Google, but other web sites.

By the way, I have not found the advice from the usual freelance translator gurus about getting more clients terribly helpful, though that might just be me. I recently got some mentoring from SCORE, the Small Business Administration service, which I found much more helpful. In general, I am trying to get guidance from people outside the translation industry who know small business and freelancing in general.

Jon Johanning // jjoha...@igc.org

S Christenson

Dec 15, 2016, 8:17:07 PM
to Honyaku E<>J translation list
I've also been feeling a bit palpitant with each new announcement regarding the leaps and bounds of modern machine translation.

As lukewarm consolation, consider the following experiment:

① Abstract from a clinical trial on mice, sourced from Google Scholar:
In vivoにおける抗生物質の治療効果は, 主としてマウスの全身感染症の治療成績によつて評価されている。しかし, 抗生物質の生体内濃度は, 動物種によつて異なり, 抗生物質のin vivo効果を1種の感染実験系だけで評価するのは望ましくない。さきにわれわれは, クロトン油によつて作成したラットの無菌炎症Pouch浸出液中への抗生物質の移行性について報告したが1), 今回このPouch内に菌を接種して1種の局所感染系を確立し, これに対する2, 3のセファロスポリン誘導体の効果を検討した。
 
The therapeutic effect of antibiotics in vivo is mainly assessed by the outcome of treating murine systemic infection. However, the in vivo concentration of the antibiotic depends on the species of the animal, and it is not desirable to evaluate the in vivo effect of antibiotics only in one type of infection experiment system. In the past we reported on the transferability of antibiotics into sterile inflammatory Pouch's leachate of rats prepared by Croton oil 1), this time we inoculated bacteria in this Pouch and treated one kind of local infectious system The effect of 2, 3 cephalosporin derivatives on this was investigated.

Needs some grammar cleanup, doesn't quite grasp the "Pouch" usage, and presumably one would want to crosscheck its other glosses. But overall, quite readable. Especially those first two sentences.

② Literary excerpt (something I happened across some time ago on the web -- a note about Birnbaum's choices in Haruki Murakami's Hard-Boiled Wonderland: http://howtojaponese.com/2015/11/ )
「…しかし今となっては選り好みはできんようになった。あんたが不死の世界をまぬがれる手はひとつしかないです」
「どんな手ですか?」
「今すぐ死ぬことです」と博士は事務的な口調で言った。「ジャンクションAが結線する前に死んでしまうのです。そうすれば何も残らない」
深い沈黙が洞窟の中を支配した。博士が咳払いし、太った娘がため息をつき、私はウィスキーを出して飲んだ。誰もひとことも口をきかなかった。
「それは……どんな世界なんですか?」と私は博士にたずねてみた。「その不死の正解のことです」(412)

First, consider Birnbaum先生: 
“…But if you act now, you can choose, if choice is what you want. There’s on last hand you can play.”
“And what might that be?”
“You can die right now,” said the Professor, very business-like. “Before Junction A links up, just check out. That leaves nothing.”
A profound silence fell over us. The Professor coughed, the chubby girl sighed, I took a slug of whiskey. No one said a word.
***
“That…uh, world…what is it like?” I brought myself to voice the question. “That immortal world?” (285-286)

グーグル先生 falls just a wee bit short of the mark here:
"... but now it has become possible to do preference, there is only one hand you can leave the world of immortality."
"What kind of hand?"
"To die right now," Dr. said in an administrative tone. "Junction A dies before it gets connected, so there is nothing to do"
A deep silence dominated the inside of the cave. My doctor cleared his throat, a fat girl sighed, I drank whiskey and drank. Nobody talked to anyone.
I asked Dr. "What kind of world is it ...? "It is about that immortality correct answer" (412)

Dialogue and, well, any flowery language you plug in there just does not come out well. Double the pity, considering demand and usual rates for literary.

(If you want a regular shock to your system in this vein, ACM TechNews reports on machine translation breakthroughs with some frequency. And, incidentally, several months ago at the day job, we were treated to a scaremongering meeting about the tens or hundreds of thousands of wholesome, hardworking citizens coming from fields across the board who are all to be left miserably unemployed by the inexorable march of AI.)

Steve Christenson

S Christenson

Dec 15, 2016, 8:22:44 PM
to Honyaku E<>J translation list
Hmmm.. couple probable typos in that copy-paste from howtojaponese, sorry.

正解→世界
on last hand→one last hand

Dan Kanagy

Dec 16, 2016, 7:48:06 PM
to hon...@googlegroups.com
There are issues to consider here on ownership and privacy. I tried finding the terms of service for Google Translate, but Google's algorithms didn't reveal them in the first page of search results.

I'm assuming that when you upload text to Google Translate, you forfeit some rights to it, that Google Translate can do what it wants with it, and that Google can keep the text for as long as it wants. Will orderers of translation be happy with this? Perhaps the document is something they want to keep private or to make public at their choosing.

Another question is, who owns the resulting translation? Are you free to use it for commercial purposes? Who holds the copyright?

I've heard of translators being asked to sign agreements not to use services like Google Translate to maintain the confidentiality of what is being translated.

Where translation is headed I do not know, but I imagine that demand for work-for-hire translations where the client has clear ownership will not disappear.

Dan Kanagy
--
You received this message because you are subscribed to the Google Groups "Honyaku E<>J translation list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to honyaku+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Christopher Frederick

Dec 17, 2016, 4:15:58 AM
to hon...@googlegroups.com
> Where translation is headed I do not know, but I imagine that demand
> for work-for-hire translations where the client has clear ownership
> will not disappear.

This is an excellent point that I had not considered, Dan. Though Google
Translate may be helpful for purely informational purposes, there is
certainly something to be said for owning your words.

— Chris

Stephen Suloway

Dec 18, 2016, 1:21:40 AM
to hon...@googlegroups.com
It was scary to see this thread and the first few paragraphs of the NYT article.

Then I ran a dozen sentences from some company president messages I’m translating at the moment — and breathed a sigh of relief.

Some sentences did come out more accurately and naturally than I would have expected in the past. Others were not accurate, and/or no more natural than I’ve come to expect.

I think we can still say that if the text has any sophistication of content or style, machine translation is good mainly for discovering the nature (not the meaning) of the content, for deciding whether or not to have a proper translation (or a thorough revision) done by a proper translator. (Disclaimer: I have no project management experience.)

By the way, the Hemingway example in the NYT article was sort of cheating. Hemingway is noted for *plain language* — masterful of course, but simple.

Regards,
Stephen

~ ~ ~ ~ ~ ~ ~ ~ ~
Stephen Suloway


Kirill Sereda

Dec 18, 2016, 3:54:29 AM
to hon...@googlegroups.com
I just used GoogleTranslate to translate a relatively short sentence (186 kanji) from a patent abstract. The Japanese sentence I used was:

本発明は、a)試料を、水酸化ナトリウムを含む溶液と接触させて、硫黄を硫酸ナトリウムに変換させる工程と、b)炉の中で工程a)の試料を燃焼させて、本質的にすべての有機材料を除去して残渣を生成する工程と、c)残渣を濃硝酸に溶解させる工程と、d)ICP発光分光法を用いて、試料の硫黄含有量を決定する工程とを含む、繊維又はポリマー樹脂の試料における硫黄含有量を測定する方法に関する。

A human translator would produce something like this:

"This invention relates to a method for measuring sulfur content in a fiber or polymer resin sample comprising: a) contacting the sample with a solution containing sodium hydroxide to convert sulfur to sodium sulfate, b) combusting the sample of step a) in a furnace to remove essentially all organic materials and produce a residue; c) dissolving the residue in concentrated nitric acid; and d) determining the sulfur content of the sample using ICP emission spectrometry."

Here is what GoogleTranslate produced:

"A) contacting a sample with a solution comprising sodium hydroxide to convert sulfur to sodium sulphate, b) burning the sample of step a) in a furnace to produce essentially all Removing the organic material to form a residue, c) dissolving the residue in concentrated nitric acid, and d) determining the sulfur content of the sample using ICP emission spectroscopy. Or a method of measuring the sulfur content in a sample of polymer resin."

Let's assume that the human translation above is the ideal standard and let's base our calculations on kanji.

Comparing the two translations, we can see that, from the standpoint of intelligibility:
(1) GT perfectly translated (a) with 46 kanji, (c) with 18 kanji, and (d) with 31 kanji (total: 95 kanji making sense);
(2) GT miserably failed to make any sense of the rest of the sentence, including (b);
(3) Therefore, GT got only 51% of the text right, leaving the rest in the form of a stinking garbage dump;
(4) For no apparent reason, GT simply decided to ignore extremely important words in the sentence, such as "This invention relates to" or "fiber";
(5) It could not "get" the overall structure of the sentence and had to break it up into smaller chunks (as some human translators do);
(6) It does not know the difference between British and US spelling.
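
The 51% figure in point (3) follows directly from the kanji counts in point (1). A quick check in Python, as a sketch only, using the counts exactly as stated above (the variable names are made up for illustration):

```python
# Kanji counts as given in point (1); 186 is the stated sentence length.
total_kanji = 186
intelligible = {"a": 46, "c": 18, "d": 31}

kanji_ok = sum(intelligible.values())
share_ok = kanji_ok / total_kanji

print(kanji_ok)           # 95
print(f"{share_ok:.0%}")  # 51%
```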

Verdict.
(a) It is too early for the GoogleTranslate team to drink champagne. They have made decent progress, but THEY HAVE NOT CREATED ANY KIND OF "INTELLIGENCE" YET.
(b) Human translators should focus on COMPLICATED STUFF (not restaurant menus).

Take heart!

Kirill Sereda

Wolfgang Bechstein

Dec 18, 2016, 4:48:51 AM
to hon...@googlegroups.com
Kirill Sereda wrote:

> Human translators should focus on COMPLICATED STUFF (not restaurant menus).

Yes, by all means let's leave menus, signs and other simple tasks to those wonderful zombies on autopilot:

https://www.youtube.com/watch?v=o63SIc8hvYk

Wolfgang Bechstein

Frank Apps

Dec 18, 2016, 7:27:28 AM
to hon...@googlegroups.com
One issue to bear in mind is that what is served up by, for example, GT or other services that provide instant translations online is not necessarily MT. It could be a human translation that Google found or that was prepared while using Google's translation environment. Or it could be human-revised MT. In some services one can see a translation that starts off incredibly well but then apparently deteriorates into more or less pure MT.
David Apps




Christopher Frederick

Dec 18, 2016, 8:10:11 AM
to Frank Apps
> One issue to bear in mind is that what is served up by for example GT
> or other services that provide instant translations online is not
> necessarily MT.

It’s interesting to think about translation as just another type of
Turing Test for our modern machines. Apparently Google Translate
hasn’t quite cleared that bar just yet.

— Chris

Kirill Sereda

Dec 18, 2016, 4:05:44 PM
to hon...@googlegroups.com
Here is another small test. This is a very short sentence from a medical text:

Japanese:
末梢静脈を視診して,静脈瘤、動静脈奇形、動静脈シャント、血栓性静脈炎による炎症や圧痛がないか確認する。

Human output:
"The peripheral veins are observed for varicosities, arteriovenous malformations, arteriovenous shunts, and inflammation and tenderness due to thrombophlebitis."

GT output:
"Observe the peripheral veins and check for inflammation or tenderness due to varicose veins, arteriovenous malformations, arteriovenous shunts, thrombophlebitis."

As we can see, GT was unable to correctly parse even such a short sentence.

It inexplicably ignored "による" and consequently failed to connect "inflammation and tenderness" to "thrombophlebitis".

In addition, it does not know that the preposition "for" can be used with "observe" to dispense with "確認する" and make the sentence lighter and smoother.

No sign of AI. None whatsoever.

Kirill Sereda

PS. I almost spilled my coffee on the keyboard when I saw the translation of the human sentence above into Russian. Absolutely terrifying.

Kirill Sereda

Dec 18, 2016, 4:17:45 PM
to hon...@googlegroups.com
I must correct myself. GT did translate "による".

However, because it is not a true AI (it does not possess background knowledge and cannot make logical conclusions), it connected "炎症や圧痛" to the entire list (静脈瘤、動静脈奇形、動静脈シャント、血栓性静脈炎), rather than to "血栓性静脈炎" only, as a human would have done.

Kirill Sereda

Christopher Frederick

Dec 18, 2016, 8:13:47 PM
to hon...@googlegroups.com
> As we can see, GT was unable to correctly parse even such a short
> sentence.

To be fair, in procedural text I often translate Japanese infinitive
verbs (like 確認する) using the English imperative mood
(“check”) rather than the passive voice (“are observed”), so in
that sense I might actually prefer the output from Google Translate.
Still, I agree that “observe” and “check” sound a bit
redundant—not to mention the missing word ‘and’ at the end of the
list.

> However, because it is not a true AI (it does not possess background
> knowledge and cannot make logical conclusions), it connected
> "炎症や圧痛" to the entire list
> (静脈瘤、動静脈奇形、動静脈シャント、血栓性静脈炎),
> rather than to "血栓性静脈炎" only, as a human would have done.

For what it’s worth, if there is any ambiguity about whether the
entire list modifies the word that follows it I will sometimes retain
that ambiguity in my translation. For example:

“Examine the peripheral veins, checking for inflammation or tenderness
due to thrombophlebitis, arteriovenous shunts, arteriovenous
malformations, and varicose veins.”

Just my 2¢...

— Chris

Kirill Sereda

Dec 18, 2016, 9:10:30 PM
to hon...@googlegroups.com
Christopher Frederick wrote:

>> To be fair, in procedural text I often translate Japanese infinitive verbs (like 確認する) using the English imperative mood (“check”) rather than the passive voice (“are observed”), so in that sense I might actually prefer the output from Google Translate.

Yes, the imperative mood is possible, but I don't think it should be automatically used in sentences with verbs in the dictionary form when the sentences are translated out of context. In my opinion, and I might be wrong here, from the standpoint of probability, direct instructions to the reader are less likely than a general description, so I tend to convert the verbs to the passive voice in such cases.

>> For what it’s worth, if there is any ambiguity about whether the entire list modifies the word that follows it I will sometimes retain that ambiguity in my translation.

In patents I always do so, too, even when semantics can help disambiguate the sentence.

Kirill Sereda

David J. Littleboy

Dec 18, 2016, 9:28:48 PM
to hon...@googlegroups.com

>From: Kirill Sereda
>
>No sign of AI. None whatsoever.

FWIW, this is exactly correct. Back in the day (I studied/worked in AI in
the 1980s), AI had two factions: the scruffies and the neats. The scruffies'
approach was to look how people did things and try to simulate that, the
neats' was to try to write programs that did hard/interesting things
regardless of the techniques used. Both sides of AI had fizzled something
fierce by about 1990. But the scruffies completely disappeared while the
neats puttered away in the shadows, doing "corpus based" MT, among other
things.

Fast forward to the last few years, and the neats' end of AI is back. What's
different, one might ask.

IMHO, it's real simple: between 1991 and 2006, single-chip computers
(starting with the Intel 486 through the high-end i7) became about 1,000
times faster. The neats now have inconceivably more computer power than
anyone in AI had during the "5th Generation" period. But there is no concern
for seriously asking how people do things. There's lip service and hype. But
the whole gestalt of the field is one of expecting simple tricks repeated a
zillion times (neural nets and deep learning) to cause intelligence to
magically emerge. IMHO, of course, that's not going to happen.

In computation, 1,000 is a big number. Back in the mid 1970s, when I was
interested in chess and other game programs, people working on chess (with
programs barely as strong as a club player) realized that if they had gobs
of computation, a computer could play really good chess. MIT, Bell Labs, and
IBM all had hardware chess machine projects, and by 1997, IBM managed to
beat the world champion with a special-purpose computer that was about 1,000
times faster than the research machines of the day; nowadays any PC is world
champion strength. Consider Go. None of us trying to write game programs had
a clue as to how to write a decent Go program. Determining that a game is
over and counting the score was such a hard problem, that around 1980
someone got an MS in Comp. Sci. from MIT (under Ron Rivest) for solving that
(I think she also showed that professional Go players' endgame technique
actually was mathematically correct). In the early 2000s, with not 1 MIPS
(VAX 780), not 10 MIPS (Intel 486), but 20,000 MIPS (i7) about to become
available, it was found that a statistical technique (MCTS; Monte Carlo tree
search) found good moves in Go, and the best PC program (Zen 6) today on a
fast PC is better than most club players. Here's where the number 1,000
comes in. Google's Alpha Go, according to Google, "runs on 100 or so
servers". Well, what's a "server"? It's a 19-inch rack with 20 or so slots,
each one of which has multiple PC-class computers. So throw (well over)
1,000 times the computation at a game, and a good club playing program
suddenly becomes world class. No surprise there, or at least there shouldn't
have been had Google been honest about the technological issues involved.
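
The back-of-the-envelope arithmetic here can be spelled out in Python. This is a rough sketch using only the round figures quoted above; the per-slot count of 2 is an assumption standing in for "multiple".

```python
# Round MIPS figures as quoted in the post.
mips = {"VAX 780": 1, "Intel 486": 10, "i7": 20_000}

speedup_486_to_i7 = mips["i7"] / mips["Intel 486"]
speedup_vax_to_i7 = mips["i7"] / mips["VAX 780"]

servers = 100           # "100 or so servers"
slots_per_server = 20   # "20 or so slots"
pcs_per_slot = 2        # assumed lower bound for "multiple"

pc_equivalents = servers * slots_per_server * pcs_per_slot

print(speedup_486_to_i7, speedup_vax_to_i7, pc_equivalents)
```

Even with the conservative per-slot assumption, the server count works out to several thousand PC-class machines, which is where "well over 1,000 times" comes from.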

But there's no "AI" in there...

--
David J. Littleboy
Tokyo, Japan

Christopher Carr

Dec 18, 2016, 9:44:05 PM
to hon...@googlegroups.com
Point 1. 

The passive voice is generally shunned by English speakers. With the passive voice, more words are required for less meaning, since the subject can be omitted. This is a problem that is sometimes encountered by me when texts are being translated by me and it is thought by the client that more is known about English by him than is known by me.

Regarding the assertion that MT doesn't count as MT if it is based on "found" human translation, I do not see the relevance. I would claim that the Turing test represents a high bar to clear. Economic considerations represent a somewhat lower bar.


Point 2.

Turing Test:

Which of the following translations did a human produce, and which did Google Translate produce?

血行再建としてカテーテル治療も多く行われているが、バイパス手術は今もなお、主な治療手段である。

Translation A: For revascularization, catheter treatments are becoming common, but bypass surgery is still the primary approach to therapy.

Translation B: Many catheter treatments are performed as revascularization, but bypass surgery is still the main treatment method.

Economic Test:

How much did Translation A cost, and how much did Translation B cost to produce?


Point 3. 

It is the Japanese after all who believe that even rocks have "souls". This is a point that Alan Turing himself agreed with. For Turing, humans were just very sophisticated machines. The distinction between "real" AI and apparently not real AI that is being bandied about here is self-serving and meaningless. 







--
Christopher Carr
ENS, MC, USNR
MD/MPH candidate, class of 2018
Tulane University School of Medicine
Tulane University School of Public Health & Tropical Medicine, Department of Epidemiology
Japanese < > English translator

Kirill Sereda

Dec 18, 2016, 11:28:13 PM
to hon...@googlegroups.com

Christopher Carr wrote:

>> The passive voice is generally shunned by English speakers.

The degree to which English writers shun the passive voice depends on the type of text. The passive voice is a common feature in scientific writing. Medicine is one of the areas of scientific writing where the passive voice is very popular. The example was not taken from a mystery novel.

 

>> This is a problem that is sometimes encountered by me when texts are being translated by me and it is thought by the client that more is known about English by him than is known by me.

In the sentence in question, using the passive voice increases the length of the sentence by _just one_ word if we compare it to the sentence in the imperative mood and by _zero_ words, at the most, if the writer uses an indicative mood/active voice combination (because the writer will need a subject).
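
The one-word difference is easy to check by counting. In the sketch below, the passive sentence is the human translation quoted earlier in the thread; the imperative version is a hypothetical rewording made up purely for this comparison.

```python
# Passive-voice human translation quoted earlier in the thread.
passive = ("The peripheral veins are observed for varicosities, "
           "arteriovenous malformations, arteriovenous shunts, and "
           "inflammation and tenderness due to thrombophlebitis.")

# Hypothetical imperative rewording, invented here for comparison only.
imperative = ("Observe the peripheral veins for varicosities, "
              "arteriovenous malformations, arteriovenous shunts, and "
              "inflammation and tenderness due to thrombophlebitis.")

# Whitespace-separated word counts.
print(len(passive.split()), len(imperative.split()))  # 18 17
```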

 

>> Point 2.

With GT mistranslating every third or fourth sentence, human translation is still the only option for serious translation.

 

>> For Turing, humans were just very sophisticated machines. The distinction between "real" AI and apparently not real AI that is being bandied about here is self-serving and meaningless.

The distinction is between AI and non-AI.  The current level of sophistication of GT is light years away from AI.

 

Kirill Sereda

Tom Gally

Dec 19, 2016, 8:40:37 AM
to hon...@googlegroups.com
After a few more days to reflect on that New York Times article and to try out GT a bit more, two points stand out in my mind. One may be relevant to the people on this list. The other one probably isn’t, but I will raise it here anyhow.

The first is that, according to the article, “The A.I. system had demonstrated overnight improvements roughly equal to the total gains the old one had accrued over its entire lifetime.” Neural network techniques have supposedly been yielding similarly rapid improvements in other areas as well, such as voice and image recognition. If this is true and not hype, and if that pace of improvement continues for a while longer, then we may indeed be seeing the first glimmerings of a true disruption. Maybe, as Google keeps feeding texts to the GT servers and lets them chew on them, those neural nets will, on their own, make the necessary connections so that they seem to “know” that, for example, sisters born later are younger sisters. Or maybe not. We should know in a few years.

The point irrelevant to most of you but very important to me in my current job (director of a large language-education unit at a Japanese university) is that, although GT cannot yet reliably produce translations good enough for most commercial purposes, its English output from Japanese is already at least as good as the English written by typical Japanese university students. I spent some time today running through it Japanese texts on topics that English-writing teachers are likely to assign, and much of the output was pretty damn good when compared with typical college-student writing. Unlike the usually-gibberish output of other MT systems, it would also be harder for teachers to detect. In fact, the clearest sign that intermediate-level students might be using GT for their writing assignments would be that there are fewer surface errors, such as misspellings, omitted articles, and number agreement mistakes, than the students normally produce, and that the vocabulary is, in places, more idiomatic.

This raises all sorts of thorny issues for educators: Should students who are supposed to be trying to acquire foreign-language ability be allowed to use tools like GT? How does a teacher evaluate and give feedback on student writing that is produced with GT? If GT does continue to improve and soon can be used for most day-to-day purposes (shopping, asking directions, introducing oneself, reading signs and other simple texts, composing e-mails, etc.), should foreign language education no longer emphasize such practical communicative tasks? Should language teachers start looking for other jobs?

Again, this second point doesn’t matter to professional translators. If you read this far, thank you for your indulgence.

Tom Gally
Yokohama, Japan


Carl Freire

Dec 19, 2016, 9:50:54 AM
to hon...@googlegroups.com
On 12/19/16 10:40 PM, Tom Gally wrote:
> This raises all sorts of thorny issues for educators: Should students
> who are supposed to be trying to acquire foreign-language ability be
> allowed to use tools like GT? How does a teacher evaluate and give
> feedback on student writing that is produced with GT? If GT does
> continue to improve and soon can be used for most day-to-day purposes
> (shopping, asking directions, introducing oneself, reading signs and
> other simple texts, composing e-mails, etc.), should foreign language
> education no longer emphasize such practical communicative tasks? Should
> language teachers start looking for other jobs?

Well, in terms of judging students' abilities in language education the
answer on the face of it is rather plain, I should think: decrease your
reliance on grading materials that students produce at home and increase
your reliance on what they do in front of you verbally or on exams and
other grading schemes where they cannot rely on the crutch of IT.

The issues are less what you do in the classroom (even granting it's
easier to put one thing down on paper than it is to implement it in
reality) and more the issues in the background. Convincing colleagues,
university bureaucracy, etc. of the need for this change in pedagogy to
accommodate changing realities. Convincing potential students that it
is still worth it to choose to actually learn a new language and eschew
use of a super-convenient crutch. And, of course, figuring out how the
hell you are going to reconstruct your curricula and rewrite the syllabi
to cope with the new reality.

None of these are trivial tasks, but the need to adapt to massive
environmental changes does come with the territory from time to time in
one form or another regardless of field, so you can take some small
comfort in being part of a long tradition. (Which I'm sure will be a
comfort to you after another day of leaving campus at 9 or 10 p.m. to
get back for morning lecture at 9:10 a.m. the next day . . .)

Carl@was once on track to be an educator/academic (history) and still
gets the journals as well as the pedagogical-tip-laden newsletters

Herman

Dec 19, 2016, 11:39:28 AM
to hon...@googlegroups.com
On 18/12/16 20:28, Kirill Sereda wrote:

> The distinction is between AI and non-AI. The current level of
> sophistication of GT is light years away from AI.
>

A problem with the sort of AI used in Google Translate, just like Google
Search, is that the system improves to a point but then begins to
degrade due to the inevitable signal/noise issues arising from the fact
that, within the Internet-linked information space, noisy or bogus
output from the Translate or Search system inevitably re-enters the
system as new data.

For reference, the GT output for the above:

Google検索と同じようにAIの種類が問題になるのは、Google検索と同じように、システムはある時点まで改善されていますが、インターネットにリンクされている情報の中には不可避的な信号/
翻訳や検索システム自体からの空間、雑音、または偽の出力は、必然的に新しいデータとしてシステムに再入力されます。


Herman Kahn


Kirill Sereda

Dec 19, 2016, 4:20:40 PM
to hon...@googlegroups.com
Dear David,

Thank you for the very interesting "inside" information on AI research! As you correctly pointed out:

"But there is no concern for seriously asking how people do things. There's lip service and hype. But the whole gestalt of the field is one of expecting simple tricks repeated a zillion times (neural nets and deep learning) to cause intelligence to magically emerge. IMHO, of course, that's not going to happen."

Neural network-based computer systems have been known for more than 60 years, if I am not mistaken, and the addition of (1) big data and (2) extremely fast computing (which happened about 10 years ago) has not yielded anything approaching human intelligence so far. And, yes, it is very unlikely to be the magic wand that will yield artificial _general_ intelligence.

The same combination of neural networks, big data, and fast computing has been recently used by Matthew Lai at Imperial College London to create a strong chess engine:
https://www.technologyreview.com/s/541276/deep-learning-machine-teaches-itself-chess-in-72-hours-plays-at-international-master/

As the article shows, the strength of the chess engine reaches a rather high level (International Master), but then the growth basically stops. This, I guess, is what awaits the neural network-based GT system. In other words, in a few years, it will reach a rather high level comparable to that of a beginning translator who often misses things and occasionally makes atrocious mistakes, and then the growth will stop.

Despite the obvious failure to produce perfect AI-based translation, this will, of course, lead to serious changes in the profession.

No, human translation and interpretation will not become extinct! However, over the next decade we can expect:

(1) a massive rate collapse and market shrinkage (75%? 90%?);
(2) translation/interpretation to become a rare niche profession, with very few young people, if any people at all, considering translation/interpretation as a career choice;
(3) the remaining niches for human translation/interpretation to be rare, highly specialized, very hard to find, and very hard to get into. It would be interesting to know what exactly is going to remain protected from the onslaught.

Kirill Sereda



Geoffrey Trousselot

Dec 19, 2016, 5:51:53 PM12/19/16
to hon...@googlegroups.com

One idea is to get students to write initially in Japanese, get the Japanese machine translated and mark them on their ability to change the machine translation for the better. It would put everyone on an even footing, only their work would be assessed and it would create a skill set that might be in demand in the new era of translation needs.
Ultimately, the skill set that is going to be required is the ability to know when the machine translation is wrong or inappropriate and to be able to make it right.

Geoffrey Trousselot

David J. Littleboy

Dec 19, 2016, 8:08:52 PM12/19/16
to hon...@googlegroups.com

>From: Geoffrey Trousselot
>
>One idea is to get students to write initially in Japanese, get the
>Japanese machine translated and mark them on their ability to change the
>machine translation for the better. It would put everyone on an even
>footing, only their work would be assessed and it would create a skill set
>that might be in demand in the new era of translation needs.
>Ultimately, the skill set that is going to be required is the ability to
>know when the machine translation is wrong or inappropriate and to be able
>to make it right.

Sorry to be argumentative here, but I doubt this is of much pedagogical
value. I don't write much Japanese, but when I do, it's horrible. On the
other hand, I find problems (the rare typo, the somewhat less rare unnatural
turn of phrase) in native-written or spoken Japanese fairly easily, and when
I bounce said found problem off a native speaker, they pretty much always
agree that it's a problem. (Except for ridiculously long run-on sentences,
which the Japanese love to write and readers have no trouble with; go
figure.) Also, we've all had experiences with our customers' pet Madogiwazoku
finding violations of stupid prescriptive grammatical rules in our
translations.

Also, the point that GT is already way better than what the students are
able to produce on their own means that doing this exercise isn't what they
need, and is going to be really depressing for the poor students...

If I wanted to be able to write Japanese better, I'd have to actually do it
every day and get it checked, see what was wrong, and work on the places I
was weak. But I'd rather spend that time reading novels.

David J. Littleboy

Dec 19, 2016, 8:33:17 PM12/19/16
to hon...@googlegroups.com

>From: Kirill Sereda
>Dear David,
>Thank you for very interesting "inside" information on AI research! As you
>correctly pointed out:
>
>"But there is no concern for seriously asking how people do things. There's
>lip service and hype. But the whole gestalt of the field is one of
>expecting simple tricks repeated a zillion times (neural nets and deep
>learning) to cause intelligence to magically emerge. IMHO, of course,
>that's not going to happen."

I probably come off as more than a slight crank, so here's someone who is
saying similar things: Gary Marcus.

I missed him the first time around, because I disliked his book "Guitar
Zero". (As an ex-classical player, current jazz learner, I thought he missed
the interesting issues in music, which wasn't a particularly reasonable
criticism for a bloke trying to learn rock as his first attempt at doing
music.) Oops.

https://www.edge.org/conversation/gary_marcus-is-big-data-taking-us-closer-to-the-deeper-questions-in-artificial

Kirill Sereda

Dec 20, 2016, 12:53:34 AM12/20/16
to hon...@googlegroups.com
Gary Marcus:

"All of this apparent progress is being driven by the ability to use brute force techniques on a scale we've never used before."

Ditto! Same old brute force.

Kirill Sereda



JimBreen

Dec 21, 2016, 7:40:18 PM12/21/16
to Honyaku E<>J translation list
On Tuesday, 20 December 2016 00:40:37 UTC+11, Tom Gally wrote:
The point irrelevant to most of you but very important to me in my current job (director of a large language-education unit at a Japanese university) is that, although GT cannot yet reliably produce translations good enough for most commercial purposes, its English output from Japanese is already at least as good as the English written by typical Japanese university students. I spent some time today running through it Japanese texts on topics that English-writing teachers are likely to assign, and much of the output was pretty damn good when compared with typical college-student writing. Unlike the usually-gibberish output of other MT systems, it would also be harder for teachers to detect. In fact, the clearest sign that intermediate-level students might be using GT for their writing assignments would be that there are fewer surface errors, such as misspellings, omitted articles, and number agreement mistakes, than the students normally produce, and that the vocabulary is, in places, more idiomatic.

A very interesting comment. Food for much thought.

Jim

Laurie Berman

Dec 25, 2016, 12:33:11 PM12/25/16
to hon...@googlegroups.com
I realize I’m chiming in rather late here, but . . .

> On Dec 19, 2016, at 8:40 AM, Tom Gally <tomg...@gmail.com> wrote:
>
> Maybe, as Google keeps feeding texts to the GT servers and lets them chew on them, those neural nets will, on their own, make the necessary connections so that they seem to “know” that, for example, sisters born later are younger sisters. Or maybe not. We should know in a few years.

To me, this is a fairly elementary challenge compared with the kind I face many times a day, which involves figuring out what the writer really MEANT to say when the Japanese text is either highly ambiguous, hopelessly convoluted, or patently illogical. The writing seems to get sloppier all the time (don’t get me started on the factual errors I feel compelled to point out), and meanwhile I get the feeling that my agencies are doing their best to dispense with in-house editors altogether (else why are they sending me their clients' style questions to answer?). Of course, this means that I’m spending more time on my translations and am therefore making less money. But it also means that I’m not competing with Google Translate and the like.

If AI ever gets to the point where it can make such judgments, then the AI developers themselves will be in trouble.


Laurie Berman





Mark Spahn

Dec 25, 2016, 2:33:49 PM12/25/16
to hon...@googlegroups.com
Laurie Berman writes: "The writing [of Japanese texts to be translated]
seems to get sloppier all the time (don’t get me started on the factual
errors I feel compelled to point out), and meanwhile I get the feeling
that my agencies are doing their best to dispense with in-house editors
altogether (else why are they sending me their clients' style questions
to answer?)."

Laurie, could you give an example of the translation customers' style
questions you are asked? Do they refer to writing in Japanese, or in
English? You further write: "If AI ever gets to the point where it can
make such judgments [about writing style], then the AI developers
themselves will be in trouble."

Speaking of AI, take a look at this 2-minute video mispunctuatedly
titled "George Clooney, Amal Alamuddin divorce up dates":

https://www.youtube.com/watch?v=VaHwwNw2dus

The big news here is not the celebrity divorce, but how it is reported:
by an artificially intelligent female voice reading a text, with
reasonably good (but still unnatural) text-to-voice intonation. No
narrator need be hired. (At 0:37, "Among the reason [singular]
enumerated ..." is probably an error in the text, not in its enunciation.)

-- Mark Spahn (West Seneca, NY)


Laurie Berman

Dec 25, 2016, 3:15:57 PM12/25/16
to hon...@googlegroups.com

> On Dec 25, 2016, at 2:35 PM, Mark Spahn <mark...@twc.com> wrote:
>
>
> Laurie, could you give an example of the translation customers' style questions you are asked? Do they refer to writing in Japanese, or in English?

English. Questions about when it’s okay to use an abbreviation, for example.

Laurie Berman





Tom Gally

Dec 27, 2016, 11:40:11 PM12/27/16
to hon...@googlegroups.com
Laurie Berman remarks on "the kind [of challenge] I face many times a
day, which involves figuring out what the writer really MEANT to say
when the Japanese text is either highly ambiguous, hopelessly
convoluted, or patently illogical."

MT systems' inability (so far) to deal with ambiguity—whether that
ambiguity is due to stupid writing or is an inevitable result of using
human language—does seem to be one of their biggest weak points. As we
all know, a big part of the human-translation process is wondering
what meaning was intended by the writer and wondering whether one's
translation conveys that meaning correctly and appropriately. While I
continue to be impressed by the improvements in GT, I see no sign that
it is "wondering."

A fairly obvious improvement it could make would be to flag
expressions that can reasonably be translated in more than one way and
ask the user which meaning is intended. For example, if I input "My
sister was born in 1954" for a translation into Japanese, it could ask
"Is 'sister' an older sister or a younger sister?" and maybe give me a
pull-down menu to choose one or the other. Right now, it just spits
out 妹.

It wouldn't be feasible to do this with all potential ambiguities, of
course. But the system could rate the probability of ambiguity based
on the context and ask such questions only when that probability is
reasonably high. For example, in a previous post here I wrote about
"fewer surface errors, such as misspellings, omitted articles, and
number agreement mistakes." When I ran the entire paragraph through
GT, "omitted articles" came out as "省略された記事." Because the surrounding
text does contain words relating to writing, 記事 is not necessarily an
unreasonable guess. (A human translator might make the same mistake.)
But the system's reference data should also contain similar contexts
in which "articles" is translated as 冠詞. Instead of just choosing one
translation, GT could ask whether the intended meaning is "a written
text" or "a function word, such as 'the' or 'a'." Alternatively, if
the surrounding text had contained words like "shipping date,"
"container," and "invoice," then it could ask if the intended meaning
is "item, thing" and ignore the possibility of 冠詞.

This ask-the-user-the-meaning function would only work, of course,
when the user is inputting text in a language he or she understands.
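
A minimal sketch of the ask-the-user idea described above, in Python. Everything in it is an assumption made for illustration: the SENSES table, its cue words, the resolve function, and the margin threshold are hypothetical, and none of it reflects how GT actually works. It counts how many context words support each candidate translation and asks a question only when no candidate is clearly ahead.

```python
import re

# Hypothetical sense inventory: ambiguous English word -> candidate Japanese
# translations, each with a gloss to show the user and a set of context cues.
SENSES = {
    "articles": [
        ("記事", "a written text", {"newspaper", "published", "writing"}),
        ("冠詞", "a function word, such as 'the' or 'a'", {"grammar", "writing"}),
        ("品物", "item, thing", {"shipping", "container", "invoice"}),
    ],
    "sister": [
        ("姉", "older sister", set()),
        ("妹", "younger sister", set()),
    ],
}


def resolve(word, context, margin=1):
    """Pick a translation for `word`, or return the options to ask about.

    Returns (translation, []) when one sense leads every other sense by at
    least `margin` cue words, otherwise (None, [(translation, gloss), ...]).
    """
    tokens = set(re.findall(r"[a-z]+", context.lower()))
    # Score each sense by the number of its cue words found in the context.
    scored = [(len(cues & tokens), ja, gloss) for ja, gloss, cues in SENSES[word]]
    scored.sort(key=lambda s: s[0], reverse=True)
    top = scored[0][0]
    # Senses within `margin` of the best score are still in contention.
    close = [(ja, gloss) for score, ja, gloss in scored if top - score < margin]
    if len(close) == 1:
        return close[0][0], []
    return None, close


if __name__ == "__main__":
    print(resolve("articles", "The shipping date, container, and invoice list the omitted articles."))
    print(resolve("sister", "My sister was born in 1954"))
```

With this table, a context that mentions "shipping date," "container," and "invoice" settles on 品物 without a question, while "My sister was born in 1954" returns both 姉 and 妹 as options that could populate a pull-down menu. A real system would need a far better estimate of ambiguity than counting cue words.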

Tom Gally

David J. Littleboy

Dec 28, 2016, 1:12:53 AM12/28/16
to hon...@googlegroups.com

>From: Tom Gally

>While I
>continue to be impressed by the improvements in GT, I see no sign that
>it is "wondering."

This is the point Gary Marcus and I are making: AI isn't trying to be
"intelligent", it's trying to do kewl things _without_ being intelligent.
Siri really isn't any smarter than ELIZA was back in the day. (Back in the
day, the bloke who wrote ELIZA completely freaked out that people would type
their deepest secrets at it (it was actually quite a fun toy), and became a
vociferous critic of AI. He was a really sweet nice guy (he used to be one
of the regulars in the crowd I ate lunch with (ca 1974 or so) when I was an
undergrad), but really lost it when thinking about AI. Of course, back in
the day, people like Winograd were trying to write programs that did
understand what people typed at it, so Weizenbaum's criticisms seemed off
the mark. Back in the day. (I buttonholed one of Minsky's PhD students at an
AI conference (ca 1983 or so) and said "This "truth maintenance" stuff you
are doing is really neat, but it ain't the way people think about Go or
Chess." "That's what Marvin said, and I disagree." Which is to say, that
Minsky (and Schank as well) tried to persuade their students into thinking
about how people did things, but didn't always succeed. (Actually, Schank
kicked people out if they tried to do mechanistic things in their
research.))

Whatever.

>A fairly obvious improvement it could make would be to flag
>expressions that can reasonably be translated in more than one way and
>ask the user which meaning is intended.

One thing the linguistics-oriented AI types freaked out about, again, back
in the day, was how amazingly syntactically ambiguous English is. They'd
write parsers that would try to disentangle the syntactic structure of
sentences they were fed, but they found that sentences from even cut and
dried newspaper writing would have thousands of syntactically correct but
different interpretations.

Language is way more ambiguous than we think it is...

>It wouldn't be feasible to do this with all potential ambiguities, of
>course. But the system could rate the probability of ambiguity based
>on the context and ask such questions only when that probability is
>reasonably high. For example, in a previous post here I wrote about
>"fewer surface errors, such as misspellings, omitted articles, and
>number agreement mistakes." When I ran the entire paragraph through
>GT, "omitted articles" came out as "省略された記事." Because the surrounding
>text does contain words relating to writing, 記事 is not necessarily an
>unreasonable guess. (A human translator might make the same mistake.)
>But the system's reference data should also contain similar contexts
>in which "articles" is translated as 冠詞. Instead of just choosing one
>translation, GT could ask whether the intended meaning is "a written
>text" or "a function word, such as 'the' or 'a'." Alternatively, if
>the surrounding text had contained words like "shipping date,"
>"container," and "invoice," then it could ask if the intended meaning
>is "item, thing" and ignore the possibility of 冠詞.

You're about to reinvent "selectional restrictions", which the linguists
(and after them, the AI types) have been worrying about since 1965...

Geoffrey Trousselot

Dec 28, 2016, 7:08:40 AM12/28/16
to hon...@googlegroups.com
Tom Gally <tomg...@gmail.com> wrote:
A fairly obvious improvement it could make would be to flag
expressions that can reasonably be translated in more than one way and
ask the user which meaning is intended. For example, if I input "My
sister was born in 1954" for a translation into Japanese, it could ask
"Is 'sister' an older sister or a younger sister?" and maybe give me a
pull-down menu to choose one or the other. Right now, it just spits
out 妹.

Along the same lines, currently the translation engine seems to be taking context from the sentence. It would be nice to add conditions to a translation query, such as specifying the field, adding keywords, etc. Even choosing a style of writing.

Also, being able to change a few words in the translated sentence and send it back for retranslation would be good, for the case when the reader knows the target language.

And for users who only know the source language, there could be a back-translate function that could allow adjustments.

It would be nice for the advancements in technology to be shared so that other companies could produce competing services. A monopolization of such technology seems a bit pointless.

If they started experimenting with such options, it would eventually create a unique markup language that would lead to a universal markup language that software could be designed to use to convert, with human assistance, natural language to "translatable" language.

Reading this conversation this morning, it occurred to me there has always been a kind of hubris when it comes to machine translation. Rather than identifying the functions that are available and trying to use what is there for efficiency saving, there seems to be a desire to create the illusion that the translation engine can do it all alone.

I notice that conversational Japanese to English is still comparable to the statistics-based engine. In fact, I have been putting in a lot of sentences ranging from economic papers and art papers to contemporary fiction, and the contemporary fiction is not even understandable. About 0% of the story seems to be relayed. I thought conversation-style language would be targeted for improvement, as a lot of translation demand would be social media conversations, but it seems that you need to be thinking to reproduce the meaning, and as computers don't think, the translations just seem meaningless.

Apart from the risk of taking away jobs, the reality is that different languages create language barriers, which obviously restrict the flow of ideas. The idea that people from all over the world can use translation software to read all kinds of information written in all kinds of languages seems more attainable now.

The key element of a translator's work is to act as a medium to enable such communication. The actual work involved might change over the years, but I think there will always be a need for people who are experts in translation. 

I still haven't grasped the timelines of improvement. 
Maybe a 20% improvement every 20 years? 

Geoffrey Trousselot

S Christenson

Jan 26, 2017, 8:31:48 PM1/26/17
to Honyaku E<>J translation list
I happened across another article in this vein the other day.

The language pairs of the author are English>Dutch and Japanese>Dutch, and his market is video games; he goes into quite a bit of detail about a personal experiment with MT and the results of that experiment.

His conclusions offer a strong reason against MT, in that the unpredictable nature of MT ultimately required the result to be checked against the source and fixed by a competent human translator at a pace no better than human-plus-CAT translation of the text from the get go, even for a "close" language pair like English to Dutch.


Steve Christenson

Rene

Jan 26, 2017, 11:20:26 PM1/26/17
to hon...@googlegroups.com

On Fri, Jan 27, 2017 at 10:31 AM, S Christenson <transl...@accessj.com> wrote:
His conclusions offer a strong reason against MT, in that the unpredictable nature of MT ultimately required the result to be checked against the source and fixed by a competent human translator at a pace no better than human-plus-CAT translation of the text from the get go, even for a "close" language pair like English to Dutch.


Without reading the article, this statement is a non-sequitur. It makes it sound as if CAT and MT are mutually exclusive. But they are not. You can treat MT as simply another TM source (this function is built into most CAT tools).

So the alternative is: human translation only, or human+CAT/MT translation plus human checking. And the difference is so stark, it is not even a contest.

Rene von Rentzell

Roland Hechtenberg

Jan 27, 2017, 12:43:52 AM1/27/17
to hon...@googlegroups.com
On 1/27/2017 12:16 PM, Rene wrote:

> So the alternative is: human translation only, or human+CAT/MT
> translation plus human checking. And the difference is so stark,
> it is not even a contest.

Isn't the alternative rather: MT only or human translation + CAT
(with or without MT)?

Have fun,

Roland
--
Roland Hechtenberg
Technical translator
Japanese > English <> German
rol...@ictv.ne.jp


Rene

Jan 27, 2017, 11:26:08 AM1/27/17
to hon...@googlegroups.com

On Fri, Jan 27, 2017 at 2:12 PM, Roland Hechtenberg <rol...@ictv.ne.jp> wrote:
Isn't the alternative rather: MT only or human translation + CAT (with or without MT)?


No.

Rene von Rentzell

Stephen Suloway

Jan 27, 2017, 10:54:16 PM1/27/17
to hon...@googlegroups.com
Here is another article on Google Translate, with road-testing of the latest wrinkle. See opening sentence below.

http://www.theverge.com/2017/1/27/14409142/google-translate-japanese-word-lens-update

Jan. 27, 2017: "Google updated its Translate app this week with Japanese-to-English functionality. Through its Word Lens technology, English-speaking users can now translate Japanese words just by pointing their phone’s camera in their direction…."

Regards,
Stephen

~ ~ ~ ~ ~ ~ ~ ~ ~
Stephen Suloway