Google Translator Toolkit


Jeroen Ruigrok van der Werven

Jun 10, 2009, 1:38:02 AM
to hon...@googlegroups.com
It was mentioned before on this list, but Google has now released it. See
http://googleblog.blogspot.com/2009/06/translating-worlds-information-with.html
for a background and http://translate.google.com/toolkit/ for the
application.

So Google is moving into the translation memory side of things. Currently only
English is available as a source language.

--
Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
イェルーン ラウフロック ヴァン デル ウェルヴェン
http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B
Possession is nine points of the law...

Kevin Kirton

Jun 10, 2009, 2:21:50 AM
to hon...@googlegroups.com
Jeroen Ruigrok van der Werven wrote:
> So Google is moving into the translation memory side of things.
I wonder how many human translator friends they'll make with
understatements on their website like this: "While we think Google
Translate <http://translate.google.com/>, our automatic translation
system, is pretty neat, sometimes machine translation could use a human
touch." Although I guess that is what a lot of people who pay for
translation are hoping... if only machine translation could be brushed up...

Kevin Kirton
Australia

Kevin Kirton

Jun 10, 2009, 2:41:45 AM
to hon...@googlegroups.com
Actually, now that I've had a closer read of what they're offering, it
got me thinking about a thread in early 2005 about whether machine
translation would ever be possible. I suggested then that it probably
would (my post in the thread is here:
http://honyaku-archive.org/posts/172377/ -- the original thread was on
the JAT list, which I can't seem to search)
but I think that I said then that a collaborative effort involving
something like Wikipedia might just work. If a wave of collaborative
effort does materialize (as in the case of Wikipedia) then it certainly
would seem to offer a solid way of dramatically improving the results of
machine translation.

Kevin Kirton
Australia

Kevin Kirton

Jun 10, 2009, 3:16:13 AM
to hon...@googlegroups.com
I tried out the Toolkit. Pretty impressive I thought. Google have always
been good at interfaces I suppose.

It would seem to be a perfect fit for things like Wikipedia, which is
one of the things it is aimed at, at least initially.

The default settings are interesting too:

*Pre-translation*
When you upload a document into Google Translator Toolkit, Google
Translator Toolkit translates each sentence in your document using a
combination of previous human translations, machine translation, and
source text. By default, Google Translator Toolkit pre-translates each
sentence as follows:

* *Exact match translation*: If there are previous human
translations of the source sentence, use the best available human
translation.
* *Machine translation*: If there is no previous human translation
of the source sentence, use the machine generated translation.
* *Source Text*: If there is no previous human translation of the
source sentence and machine translation is not available, pre-fill
with the untranslated source sentence.
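
The fallback order quoted above can be sketched roughly as follows. This is only an illustration of the described defaults, not Google's code: the function name `pretranslate`, the dictionary-style translation memory, and the `machine_translate` callable are all made up for the example, and the memory is assumed to already hold the "best available" human translation for each sentence.

```python
from typing import Callable, Mapping, Optional, Tuple


def pretranslate(
    sentence: str,
    translation_memory: Mapping[str, str],
    machine_translate: Callable[[str], Optional[str]],
) -> Tuple[str, str]:
    """Pre-fill one source sentence; returns (text, origin).

    Hypothetical sketch: `translation_memory` maps a source sentence to
    its best previous human translation, and `machine_translate` returns
    None when no machine translation is available.
    """
    # 1. Exact match: reuse a previous human translation if there is one.
    human = translation_memory.get(sentence)
    if human is not None:
        return human, "exact_match"

    # 2. Machine translation: used only when no human translation exists.
    machine = machine_translate(sentence)
    if machine is not None:
        return machine, "machine_translation"

    # 3. Source text: pre-fill with the untranslated sentence.
    return sentence, "source_text"
```

Calling this once per sentence of an uploaded document would give the translator a pre-filled draft, with the origin label showing which segments still need the most attention.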

Jake Dunlap

Jun 10, 2009, 7:39:29 PM
to hon...@googlegroups.com
Hmm... doesn't seem like this is a very popular topic on honyaku. ;)

While it's going to be a while yet before the robots take over all our
work, stuff like this (especially this, probably, considering Google's
resources) is without a doubt going to drastically alter the landscape
of the translation industry within, I think, the next decade. I also
wonder if a free tool like this will entice more agencies away from
the proprietary and pricey CAT tools such as Trados, with the benefit
being ongoing improvement in MT and thus self-perpetuating
improvements in efficiency.

It might be nice to have the toolkit available offline for those of us
who cannot disclose what we are translating but would still like to
have a go at the Google way of doing things. (Maybe I just missed
that part of it, but.)

Jacob Dunlap

Kevin Kirton

Jun 10, 2009, 8:18:33 PM
to hon...@googlegroups.com
Jake Dunlap wrote:
> Hmm... doesn't seem like this is a very popular topic on honyaku. ;)
>
I'm surprised at that too. I think Google's approach here though is
something that could be groundbreaking.

Aside from its very smooth interface and ease of use ("ease of use" is
not something usually associated with things like Trados), one of the
settings is "use global shared TM." I had to look into what they meant
by "global" but it's definitely "global" in the sense of "Google world
domination." I thought it may have been a TM that you can share with a
closed set of translators of your choice online, and I think it's
possible to change the settings to achieve that kind of privacy, but the
help pages state:

"By default, we save your translations to a shared, publicly searchable
translation memory. By contributing your translations to this public
translation memory, you help other users bring content more quickly into
your language."

So theoretically, the size and searchability of the TM is of Google
proportions.

Of course, some may say that professional translators won't give up
their work that easily, but I can imagine circumstances changing such
that it would be senseless to swim against the tide.

Lots of big changes ahead I think. Interesting times.

Kevin Kirton
Australia

Richard Thieme

Jun 10, 2009, 8:21:55 PM
to hon...@googlegroups.com
I agree heartily. It certainly gives one pause. The matrix comes to
translating.

Regards,

Richard Thieme

Ginstrom IT Solutions (GITS)

Jun 10, 2009, 8:54:54 PM
to hon...@googlegroups.com
> [mailto:hon...@googlegroups.com] On Behalf Of Kevin Kirton

> So theoretically, the size and searchability of the TM is of
> Google proportions.

As a developer of CAT software, I'm certainly looking forward to what this
will entail. Combine this with something like Wave (secure, collaborative
document editing and publishing), and it could turn into something powerful
indeed.

I think that where this will be most useful is for non-commercial
translation, since most paying clients probably won't be too happy about you
sharing their documents with the world. Google Translate could cause a major
change to this landscape as well, though.

Regards,
Ryan

--
Ryan Ginstrom
trans...@ginstrom.com
http://ginstrom.com/

Jake Dunlap

Jun 10, 2009, 8:43:39 PM
to hon...@googlegroups.com
On Thu, Jun 11, 2009 at 9:18 AM, Kevin Kirton<kpki...@gmail.com> wrote:
>
> Of course, some may say that professional translators won't give up
> their work that easily, but I can imagine circumstances changing such
> that it would be senseless to swim against the tide.

Yes, I wonder about this. The most resistance to something like this
will surely come from professional translators, who naturally view it
as a threat to their livelihood. I think Google will be the one
company that can counter this because they have the resources to make
the service free. Not only does this encourage everyone in the world
to get involved, but like I mentioned above, perhaps they could market
it to agencies as an alternative to something like Trados, with the
added benefit of MT that improves the more it's used.

Google also has the entire USPTO database in searchable (in other
words, translatable) format. That could have significant impact on
the patent market, particularly for E-J translations in the short
term, and, if Google starts archiving the Japanese patent office's
archives, for J-E litigation-related translation as well.

That being said this is still not applicable for those who want to
keep their information secret, but perhaps Google will introduce a
sort of "leech mode" where you can use Google's archives of TMs during
the translation process but not upload any of your results to the
server.

It's wonderful and terrible at the same time. ;)

Jacob Dunlap

Marc Adler

Jun 10, 2009, 9:00:54 PM
to hon...@googlegroups.com
On Thu, Jun 11, 2009 at 9:18 AM, Kevin Kirton <kpki...@gmail.com> wrote:
> Lots of big changes ahead I think. Interesting times.

Maybe I'm not seeing the big picture, but I'm not too worried. The simple fact that such tools exist doesn't mean there will no longer be a need for translators. As translation becomes easier (faster), the unit price might go down, but you're doing it faster! So it balances out in the end, doesn't it?

--
Marc Adler
www.adlerpacific.com
nirebloga.wordpress.com
mudawwanatii.wordpress.com
blogsheli.wordpress.com

Kevin Kirton

Jun 10, 2009, 11:11:28 PM
to hon...@googlegroups.com
Marc Adler wrote:
> Maybe I'm not seeing the big picture, but I'm not too worried. The
> simple fact that such tools exist doesn't mean there will no longer be
> a need for translators. As translation becomes easier (faster), the
> unit price might go down, but you're doing it faster! So it balances
> out in the end, doesn't it?
I didn't mean to imply that there will no longer be a need for
translators, it's more the "big picture" I was thinking of. For example,
if someone had approached Britannica or Encarta in 1999 or 2000 and said
"I plan to start an online encyclopedia. It will be based on volunteer
input and editing, and I'm pretty sure it's going to be big. It will face
various problems along the way, but it'll have more traffic, be more
popular, and more useful than your offerings within 10 years," then
that person would be considered crazy and/or arrogant. But Wikipedia has
happened, and despite all the criticism and problems it has faced, I
think there's little doubt that it is revolutionary and very useful in a
way that Britannica or Encarta can no longer match.

I think a similar potential exists in Google's translator toolkit.
Imagine the enormous amount of group-edited bilingual data that may be
produced and what statistical or example-based machine translation will
be able to do with that. At the moment, at least for J<>E, it's
impossible to rely on MT for meaningful output. But with Google's
approach, it may very well be possible (eventually) to produce readable
and accurate translations instantly. A lot of very good translators on
this list are already primarily checking, editing, and improving the
work of human translators. This approach may make it possible to reach a
tipping point where the first stage of translation is always MT.

And, still on the big picture, I think the world desperately needs
better and quicker communication among our various populations. It has
been easy to argue that MT will never be possible, but I think it's at
least a little easier now to argue that it could be. The analogy of
photography is still applicable though. These days it's not unusual to
have a fairly good camera on a phone, but professional photographers
haven't all gone out of business yet.

Kevin Kirton
Australia

Marc Adler

Jun 10, 2009, 11:45:41 PM
to hon...@googlegroups.com
On Thu, Jun 11, 2009 at 12:11 PM, Kevin Kirton <kpki...@gmail.com> wrote:
> that person would be considered crazy and/or arrogant. But Wikipedia has
> happened, and despite all the criticism and problems it has faced, I
> think there's little doubt that it is revolutionary and very useful in a
> way that Britannica or Encarta can no longer match.

I see your point, but one thing that's undeniable (at least in my opinion ;-) ) is that Wikipedia is more useful solely for the quantity of information it contains, and *not* for the quality. I've recently gotten access to Britannica online through my local library, and reading the articles after a couple of years of reading only Wikipedia is eye-opening.

Similarly, I don't think the quality of future Google-type J<>E MT will ever be any good, for the same reasons current J<>E MT is no good -- there's just too big a difference between the languages. Unless a text has been specifically written to fit a certain translation memory, the absence of nouns, etc. will always cause problems on the target side. Hence, there will always be a need for someone who can read both languages to at least check the output, as you mention.
 
> least a little easier now to argue that it could be. The analogy of
> photography is still applicable though. These days it's not unusual to
> have a fairly good camera on a phone, but professional photographers
> haven't all gone out of business yet.

At least not all of them. ;-)

Jeroen Ruigrok van der Werven

Jun 11, 2009, 5:24:19 AM
to hon...@googlegroups.com
-On [20090611 05:45], Marc Adler (marc....@gmail.com) wrote:
>I see your point, but one thing that's undeniable (at least in my opinion ;-) )
>is that Wikipedia is more useful solely for the quantity of information it
>contains, and *not* for the quality. I've recently gotten access to Britannica
>online through my local library, and reading the articles after a couple of
>years of reading only Wikipedia is eye-opening.

The wisdom of the crowds is only as good as the wise members of the crowd
who actually read and adjust said articles as needed. (Not to mention
writing in clear language.)

The whole process at Wikipedia lacks decent (copy) editors.
But it is still a great treasure trove of information.

--
Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
イェルーン ラウフロック ヴァン デル ウェルヴェン
http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B

Open your Heart and push the limits...
