Moving to gettext format (POT/PO files) for translations

Dan Scott

Jan 24, 2008, 9:23:48 AM

to Evergreen 中国

Hello!

First of all, this is really exciting work (and much more challenging
than the fr-CA localization that I started with).

Second, I wanted to point out that in Evergreen trunk I've been
putting together an internationalization framework that will rely on
the GNU gettext format for translations. The process works as follows:
at build time, we generate a set of POT files from the English DTD
files, properties files, database seed data, ILS events, etc.

From those POT files, we generate a set of PO files for each
localization that is desired. You can then use any PO editor to
provide the translated version of each English string: KBabel is
popular in the KDE environment, Poedit runs on many different
platforms, there are Web frameworks for PO translation like Pootle,
and you can even use a plain text editor.
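
To make the DTD-to-POT step described above more concrete, here is a
minimal sketch in Python. It is not the actual Evergreen build script:
the function names and the sample entity names are made up for
illustration, and it only handles simple double-quoted entity
declarations. Each entity value becomes a msgid with an empty msgstr
for the translator to fill in.

```python
import re

# Matches simple DTD entity declarations such as:
#   <!ENTITY opac.login "Log in">
# Assumption: values are double-quoted and contain no embedded double quotes.
ENTITY_RE = re.compile(r'<!ENTITY\s+([\w.\-]+)\s+"([^"]*)"\s*>')


def escape_po(text):
    """Escape backslashes and double quotes for a PO string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def dtd_to_pot(dtd_text):
    """Return POT-style text with one entry per entity found in dtd_text."""
    entries = []
    for name, value in ENTITY_RE.findall(dtd_text):
        entries.append(
            f"#: {name}\n"                      # entity name kept as a source reference
            f'msgid "{escape_po(value)}"\n'     # English source string
            f'msgstr ""\n'                      # left empty for the translator
        )
    return "\n".join(entries)


if __name__ == "__main__":
    # Hypothetical entity names, for illustration only.
    sample = (
        '<!ENTITY opac.search.go "Search">\n'
        '<!ENTITY opac.login "Log in">\n'
    )
    print(dtd_to_pot(sample))
```

A PO file for a given locale has the same shape, with each msgstr
filled in by the translator.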

The great advantage to gettext is that the existing base of
translation support tools means that when you move from, say, version
1.2 to 1.4, the tools will automatically supply exact translation
matches for strings that haven't changed, and will provide "fuzzy"
matches for strings that may have changed only slightly - so
translators can immediately focus their attention on only the changed
and new strings, rather than having to manually compare the new
English source file with the old translated version on a file-by-file
basis.
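
The exact and "fuzzy" matching described above can be illustrated
with a small Python sketch. This is only an approximation of the idea
using the standard library's difflib; it is not how the gettext tools
are implemented, and the function name, the 0.6 cutoff, and the sample
strings are assumptions chosen for illustration.

```python
import difflib


def merge_translations(old, new_msgids, cutoff=0.6):
    """Carry translations from an old catalog over to a new list of msgids.

    old: dict mapping old English msgid -> translated msgstr.
    Returns a dict mapping each new msgid -> (msgstr, status), where
    status is "exact", "fuzzy", or "untranslated".
    """
    merged = {}
    old_ids = list(old)
    for msgid in new_msgids:
        if msgid in old:
            # Unchanged string: reuse the translation as-is.
            merged[msgid] = (old[msgid], "exact")
            continue
        # Slightly changed string: suggest the closest old translation,
        # flagged so a translator knows it needs review.
        close = difflib.get_close_matches(msgid, old_ids, n=1, cutoff=cutoff)
        if close:
            merged[msgid] = (old[close[0]], "fuzzy")
        else:
            # Brand-new string: nothing to reuse.
            merged[msgid] = ("", "untranslated")
    return merged


if __name__ == "__main__":
    # Hypothetical strings standing in for a 1.2 -> 1.4 upgrade.
    old_catalog = {
        "Search": "Rechercher",
        "Log in to your account": "Ouvrir une session dans votre compte",
    }
    new_strings = ["Search", "Log in to my account", "Place hold"]
    for msgid, (msgstr, status) in merge_translations(old_catalog, new_strings).items():
        print(f"{status:12} {msgid!r} -> {msgstr!r}")
```

With a merge like this, a translator only needs to review the entries
marked fuzzy or untranslated.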

The good news is that the tools support a migration path from a
translated DTD file to a PO file, so the work that has already been
done will be preserved under the new translation framework.

I'm still working through the conversion of the staff client interface
to enable full localization, but making progress - the client side is
complete, I think, and the server/admin directory is almost done (just
a few straggling pieces that I need to think through there).

Dan

Jason Zou

Jan 24, 2008, 12:59:50 PM

to Evergreen 中国

Hi Dan,

Grace and I have finished the first draft of the Chinese versions of
lang.dtd and opac.dtd. Grace is now doing the final check of those
DTD files.

In order to work collaboratively and efficiently, we retrieved the
English versions of lang.dtd and opac.dtd from the Evergreen trunk.
Then we set up a Pootle server for translating the two DTD files. On
this server, we have also installed Evergreen version 1.2.1.
Furthermore, we created a cron job that converts the .PO files into
.DTD files 24 times a day, so we always have the latest Chinese
version.

We did try KBabel and Poedit. Frankly, KBabel is all right, but
Poedit is not as good. With those tools, however, we don't have the
convenience of always having the latest Chinese version of Evergreen.

For the staff client, we had a quick look at the .xml, .js, and .xul
files. It seems that strings are scattered across all of those files.
It is great to hear that you have finished the client side. So after
Grace finishes the final check, we can move on to the staff client
part.

For the searching and indexing part, I have found a solution for
handling Chinese encoded in UTF-8. Although the solution has some
limitations, it is working like a charm. Now we are trying to figure
out the GB18030 and CNMARC issues.

Jason

Dan Scott

Jan 24, 2008, 1:13:11 PM

to everg...@googlegroups.com

Oops - by "the client side is done" I actually meant "the offline
interface part of the client". I don't know what I was thinking when I
typed that - wishful thinking, I guess. There's actually still a fair
amount of staff client to convert, but I'm plugging away.

It's great to hear about all of your progress on infrastructure and
on searching and indexing!