
From 50 to 100 locales


Axel Hecht

Nov 2, 2006, 11:36:22 AM
Hi all,

we'd like to collect some data and ideas on how to get from 50
locales, 40 of them released, to a hundred.

Clearly, everybody feels the load of releasing a Mozilla product in 40
languages, and while we got the job done, it's about time to look ahead
and point at the places where going further leads to trouble.

So to start the discussion, it'd be a good idea to collect how much one
locale of the 40 currently 'costs'. How much time is spent on a locale
day-to-day (tinderbox build times come to mind), and how much man power
and computing resources do we need for a release? How much for an
update, how much for a major new release? I would hope to get
guesstimates or numbers from folks like localizers, drivers, build, QA, BD
and product.

In parallel, we should collect ideas on how to make l10n easier, and
improve the quality overall. I have some, but I'll keep those for a
follow-up.

This discussion should include options on the goal and expectations to
be set on the beast at large, too.

Axel

Cédric Corazza

Nov 2, 2006, 7:45:24 PM
Axel Hecht wrote:

> Hi all,
>
> we'd like to collect some data and idea on how to get from the 50
> locales and releasing 40 to a hundred.

Ambitious goal :)

> So to start the discussion, it'd be a good idea to collect how much one
> locale of the 40 currently 'costs'. How much time is spent on a locale
> day-to-day (tinderbox build times come to mind), and how much man power
> and computing resources do we need for a release? How much for an
> update, how much for a major new release? I would hope to get
> guestimates or number from folks like localizers, drivers, build, QA, BD
> and product.

From a localizer's point of view:
Minor releases don't affect us, as they are only about stability and
security, not localized strings.
Day to day (I guess you are only talking about Firefox), I would say
it takes about 30 minutes a day: watching for new check-ins, comments
from our community, and so on. That is an average at cruising speed (I
mean not just before or after a major release); the time spent can be
two or three times higher before a release, especially when watching
l1...@mozilla.com, because watching that address can help a locale
avoid problems that other locales have already run into.

> In parallel, we should collect ideas on how to make l10n easier, and
> improve the quality over all. I have some, but I'll keep those for a
> follow up.

No idea

> This discussion should include options on the goal and expectations to
> be set on the beast at large, too.

Yes, most of us handle the translation of several products.

Regards

Channy Yun

Nov 2, 2006, 8:32:08 PM
to Axel Hecht, dev-...@lists.mozilla.org
On 11/3/06, Axel Hecht <ax...@pike.org> wrote:
> Hi all,
>
> we'd like to collect some data and idea on how to get from the 50
> locales and releasing 40 to a hundred.

Good! Because we're worldwide software.

> So to start the discussion, it'd be a good idea to collect how much one
> locale of the 40 currently 'costs'. How much time is spent on a locale
> day-to-day (tinderbox build times come to mind), and how much man power
> and computing resources do we need for a release? How much for an
> update, how much for a major new release? I would hope to get
> guestimates or number from folks like localizers, drivers, build, QA, BD
> and product.

I have 30 minutes to one hour per day to follow up on check-ins. But I
must check the tinderbox builds afterwards, which takes about half an
hour. I also need to read the l10n mailing list and follow up on bugs
related to my locale. I cannot do all of this every day, so my peer
helps me and we split the work week by week.

In fact, string changes are not the problem. Strings not being frozen
until the very end is the real problem. It impacts help files. If I miss
a change after the BETA period, it takes more time to file a bug and
follow up on it, as you know. I guess most translators are not code
developers, so they may not be good at that process.

> In parallel, we should collect ideas on how to make l10n easier, and
> improve the quality over all. I have some, but I'll keep those for a
> follow up.

My suggestion is to divide locales into two groups. One is the group
that can afford to follow the process. The other is the group that
cannot. For the latter, we offer the en-US locale files to be
translated, together with a guideline and a deadline, after a major
release. Once the locale files are submitted, they are converted
automatically into CVS and go out in periodic releases. (In the case of
IE7, locales other than the major languages will be released several
weeks later.) This will help novice translators of Mozilla products,
who would otherwise need a lot of time to get good at the whole process.

After a major release, there is almost nothing to do for tier-1
localizers, including me. So they could help such tier-2 localizers.

Channy

João Miguel Neves

Nov 3, 2006, 4:40:21 AM
to Axel Hecht, dev-...@lists.mozilla.org
On Thu, 2006-11-02 at 17:36 +0100, Axel Hecht wrote:
> So to start the discussion, it'd be a good idea to collect how much one
> locale of the 40 currently 'costs'. How much time is spent on a locale
> day-to-day (tinderbox build times come to mind), and how much man power
> and computing resources do we need for a release? How much for an
> update, how much for a major new release? I would hope to get
> guestimates or number from folks like localizers, drivers, build, QA, BD
> and product.
>
I'll get back to you on this. I want to check with a couple of people
just to get you better numbers.

> In parallel, we should collect ideas on how to make l10n easier, and
> improve the quality over all. I have some, but I'll keep those for a
> follow up.

Here's my wishlist:

1) If you want to go from 40 to 100 locales, more attention must be paid
to registration bugs. For a localizer, waiting for a registration bug to
get through is painful: you've already done a ton of work to get that
localization going, and then everything suddenly stops for weeks. 60
more locales will mean something like 180 such bugs (thinking only of
Firefox, Thunderbird and Calendar). If you're heading a team, the team
loses interest.

2) Simplify localization (a "localization package" is the set of files
that should be localized):

a) There should be an en-US localization package that would serve as
the basis for a localization.

b) Localization packages should be localizable: no fixed content that
isn't supposed to be translated because of trademarks or MoCo agreements
or any other issue. No control keys, if they really are supposed to be
standard throughout all versions (there was some discussion against
this). These strings that shouldn't be localized were the most difficult
to learn and communicate, and resulted in a lot of repeated checking on
my part before committing.

c) Separate "content packs" that need a separate authorization process
(searchplugins, region.properties changes, feeds, feed readers).

d) Use a single format. Right now we have three argument formats:
&brandShortName; in DTDs, %s in properties, $Version$ in installer.inc.
The same applies to newlines and access keys. This is a barrier to entry
and makes the localization job a lot harder, because you always need to
remember where you are when checking a change. One possibility would be
to generate the "real" localization from localization packages.

e) Separate interface issues (window sizing) from the flow of
localization. I don't know how many times I had to un-translate width
and height.

3) Keep a single schedule and keep it updated. For Firefox 2.0 I was told to
follow an ics file. Then the file changed and I didn't know (I was
following the old one). Then it was the schedule at Planet Mozilla. Even
so, that was just a schedule for Firefox. There should be one place
where a localization team can go to find the answers to questions like:
how much time do we have? what's the next product to be released? when is
the string freeze? These answers are critical for deciding where we should
work. With 5/6 people per locale, this means 500/600 people asking these
questions... And keep the schedule updated. Any change or delay should
show up there first. To cope with the current situation we tend not to
believe dates and to delay starting work on something (if you recall,
freezes haven't happened on schedule except for RC2/RC3 - with work done
in the first few days being wasted).
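
Item 2d above names three placeholder syntaxes (&brandShortName; in DTDs, %s in properties, $Version$ in installer.inc). A minimal Python sketch of what "generate the real localization from a single format" could start from: mapping each syntax to one neutral {name} form. The regular expressions, the {argN} naming and the function name are assumptions made for illustration only; they are not taken from any Mozilla tool, and literal braces in the text are not handled.

```python
import re

# Hypothetical patterns, derived only from the three examples in the post.
PLACEHOLDER_PATTERNS = {
    "dtd": re.compile(r"&([A-Za-z_][\w.\-]*);"),       # &brandShortName;
    "properties": re.compile(r"%(?:(\d+)\$)?[sS]"),    # %S or %1$S
    "installer": re.compile(r"\$([A-Za-z_]\w*)\$"),    # $Version$
}

# Predefined XML entities are markup, not placeholders, so leave them alone.
XML_BUILTINS = {"amp", "lt", "gt", "quot", "apos"}


def to_neutral(text, fmt):
    """Rewrite format-specific placeholders as {name} or {argN}."""
    pattern = PLACEHOLDER_PATTERNS[fmt]
    counter = 0

    def repl(match):
        nonlocal counter
        if fmt == "properties":
            if match.group(1):                 # explicit position, e.g. %2$S
                return "{arg%s}" % match.group(1)
            counter += 1                       # positional, numbered in order
            return "{arg%d}" % counter
        name = match.group(1)
        if fmt == "dtd" and name in XML_BUILTINS:
            return match.group(0)
        return "{%s}" % name

    return pattern.sub(repl, text)


print(to_neutral("Welcome to &brandShortName;", "dtd"))
print(to_neutral("%S could not be saved to %S", "properties"))
print(to_neutral("Setup $Version$", "installer"))
```

The reverse direction (neutral form back to each real file format) would be the build step the post alludes to; it is left out here to keep the sketch short.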

I think this is all of it.

Best regards,
João Miguel Neves


victory

Nov 3, 2006, 12:46:58 PM
#I have no relation to any of l10n teams, though ;-)

I think it's currently hard to get started with l10n work.

People must find which file in the l10n repos corresponds to which file
in the main repos, because the structures differ somewhat.

The procedure for creating a language pack is too complex,
so people need to read many, many pages.

So I believe you're forcing them to waste their precious time on it :-)

--
victory

Novica Nakov

Nov 11, 2006, 12:37:03 PM

> we'd like to collect some data and idea on how to get from the 50
> locales and releasing 40 to a hundred.

Well, it might be better to ask the l10n groups on other (major) FLOSS
projects. They already have more than 40 locales.

http://l10n.kde.org/teams-list.php - around 110
http://developer.gnome.org/projects/gtp/teams.html - around 125
http://projects.openoffice.org/native-lang.html#star - around 70.

> So to start the discussion, it'd be a good idea to collect how much one
> locale of the 40 currently 'costs'.

It costs a lot I think and since we are creating an official product of
the Mozilla Corporation, maybe we should start asking for a fee.

> In parallel, we should collect ideas on how to make l10n easier,

This is an easy one. Provide official POT files.


--
Novica

Ognyan Kulev

Nov 13, 2006, 3:46:56 AM
to dev-...@lists.mozilla.org
Novica Nakov wrote:
>> In parallel, we should collect ideas on how to make l10n easier,
>
> This is an easy one. Provide official POT files.

I would stress the same thing. It's great that MT is actively
developed, but IMHO it's a waste of time and energy to rewrite already
great PO editors like KBabel or Emacs+gettext.el. Some parts of the l10n
are a bit weird, but they can be solved - like multi-line entries for the
search engine list (.1, .2, ...) and http://po4a.alioth.debian.org/ for
XHTML files. Please provide a familiar and all-loved framework (SCM+PO)
for translating. Debian-installer l10n, led by Christian Perrier, is an
excellent example of success in providing a very comfortable "atmosphere"
for translating.

Regards,
ogi

Ognyan Kulev

Nov 13, 2006, 3:52:28 AM
to dev-...@lists.mozilla.org
Ognyan Kulev wrote:
> Debian-installer l10n, led by Christian Perrier, is an
> excellent example for success in providing very comfortable "atmosphere"
> for translating.

Interesting pages in this regard are
http://d-i.alioth.debian.org/l10n-stats/translation-status.html and
http://d-i.alioth.debian.org/i18n-doc/

Regards,
ogi

Robert Kaiser

Nov 13, 2006, 7:50:18 AM
Ognyan Kulev wrote:

> Please provide familiar and all-loved framework (SCM+PO)
> for translating.

Who says that everyone loves that framework? In fact, I know multiple
people (including myself) that hate it and would always prefer the
current Mozilla model to it if they could choose.

Robert Kaiser

Ricardo Palomares Martinez

Nov 13, 2006, 5:34:18 PM
Ognyan Kulev wrote:

> Novica Nakov wrote:
>>> In parallel, we should collect ideas on how to make l10n easier,
>>
>> This is an easy one. Provide official POT files.
>
> I would stress on the same thing. It's great that MT is actively
> developed, but IMHO it's a waste of time and energy to rewrite already
> great PO editors like KBabel or Emacs+gettext.el. (...)


What many PO fans don't seem to acknowledge (I've had this same
discussion with some es-ES team members) is that PO is NOT *the*
standard; in fact, I don't think that there is such a thing as an l10n
standard. Why is PO better than the XUL model? Why is it better than
just the Java Properties i18n model?

I'm pretty sure most localizers will agree to switch to a single tool
when either a single standard appears and really becomes "the single"
one in use, or that tool is able to deal nicely with _any_ l10n file
format without needing a conversion process.

OTOH, AFAIK there is nothing to prevent you from using PO files to
translate Mozilla products. Translate toolkit is regarded as a great
solution by many people here, so you should give it a try.

What I'm sure about is that if [your favourite PO l10n tool] can't
manage the Mozilla i18n model, it is not Mozilla that should switch to
PO: we need an i18n-model-neutral l10n tool, or if that's not possible,
one that works with the Mozilla i18n model.

Ricardo.

--
If it's true that we are here to help others,
then what exactly are the OTHERS here for?

Erdal Ronahi

Nov 13, 2006, 6:48:47 PM
to dev-...@lists.mozilla.org
> What many of PO fans don't seem to acknowledge (I've had this same
> discussion with some es-ES team members) is that PO is NOT *the*
> standard; in fact, I don't think that there is such thing as a l10n
> standard. Why PO is better than XUL model? Why is it better than just
> Java Properties i18n model?

Depends on how you define a standard. No one said PO is *better* than
the XUL model, but from a translator's point of view it's the standard,
because it's much more widespread.

I am heading a team that translates a complete Linux roundup including
GNOME, KDE, XFCE and other applications. 98% use PO. Only the Mozilla
products and OpenOffice.org are different.

There sure is a reason that the translation-toolkit provides moz2po,
ooo2po and vice versa and not, say, "mozilla to openoffice" or
"mozilla to Java Properties".

The whole team knows how to handle PO files, but for a different model
like Mozilla's one has to learn a few things from scratch. That has
nothing to do with "PO is *better*". It is just more familiar because
there are hundreds of applications in the Linux world using PO.

Regards,
Erdal

Axel Hecht

Nov 13, 2006, 11:20:53 PM

I think it's fair to acknowledge that people localizing Mozilla
applications have quite a few new things to learn, independent of the
actual file format.
And I'm pretty confident that we want it to stay that way; there are
just a few things about a successful web-facing application that don't
work straight out of the box. Quite a few product decisions need compromises
that should be somewhat equivalent across locales, and those need a
significant amount of communication back and forth.

In addition, I'm not sure that the translate toolkit is currently
bug-free or at least pitfall-free.

And you may insert my standard rant about l10n tools being supposed to be
good editors. Think source-format preserving, being RCS-friendly etc.

Axel

João Miguel Neves

Nov 14, 2006, 3:11:16 AM
to Axel Hecht, dev-...@lists.mozilla.org
On Mon, 2006-11-13 at 20:20 -0800, Axel Hecht wrote:
> I think it's fair to acknowledge that people localizing Mozilla
> applications have quite a few new things to learn, independent of the
> actual file format.

Agreed. But there are a lot of unneeded details that need to be learned.
Those are the problems that should be solved. Making it easy doesn't
mean making it stupid. It means avoiding errors by design. Different
file formats are just one of the issues.

> And I'm pretty confident that we want it to stay that way, there are
> just a few things to an successful web-faced application that don't go
> straight out of the box. Quite a few product decisions need compromises
> that should be somewhat equivalent across locales, and those need a
> significant amount of communication back and forth.
>
> In addition, I'm not sure that the translate toolkit is currently
> bugfree or at least pitfall-free.

It isn't. Worse, it can't be bug-free or pitfall-free, for the simple
reason that Mozilla file formats are: 1) not documented and 2)
ever-changing. They tried to use the same code for Firefox 2.0 and 1.5,
for instance, until they realised the encoding differences and other
details. My experience is that their support is great. Usually you get a
bug fixed in minutes or hours if you go through their IRC channel.

As far as I've noted so far, there are 4 different file formats for Mozilla
(dtd, general properties, shellservice.properties and installation
properties). These differ in argument formats for strings, formatting
rules and, in the case of the installer, encoding. You'll know that the
translate-toolkit is more pitfall-free the day they have special code
for each of the file formats.

Mozilla localization doesn't feel to me like a localization. It feels
more like a part of the interface was taken out of the project and handed
to localizers with the words: translate this. One of my biggest difficulties
is all the filtering needed on top of documenting things for my team. But I've
said this before and pointed out what I'd like to see done. So no point
repeating.


Marek Stępień

Nov 14, 2006, 8:17:22 AM
João Miguel Neves wrote:

> It isn't. Worst, it can't be bugfree or pitfall-free. For the simple
> reason that mozilla file-formats are: 1) not documented

The standard SGML/XML DTD is documented, so are the Java *.properties
files (the only difference between Java and Mozilla properties is that
we allow unescaped Unicode characters).

> As far as I've noted so far, there 4 different file formats for mozilla
> (dtd, general properties, shellservice.properties and installation
> properties). These differ in argument formats for strings,

This applies to .po files as well, the argument formats are different
depending on the language the program was written in. For example, while
localizing Novell's gnome-patch.po for SLED 10 I found many different
formats (from C/C++, C#, Perl etc.).

> formatting rules and in the case of the installer, encoding.

NSIS installer.properties use UTF-8, just like any other .properties file.

--
Marek Stępień <mar...@aviary.pl>
AviaryPL - Polish Mozilla localization team
http://www.firefox.pl/ | http://www.mozilla.org.pl/

João Miguel Neves

Nov 14, 2006, 9:09:23 AM
to dev-...@lists.mozilla.org
On Tue, 2006-11-14 at 14:17 +0100, Marek Stępień wrote:
> João Miguel Neves wrote:
> > It isn't. Worst, it can't be bugfree or pitfall-free. For the simple
> > reason that mozilla file-formats are: 1) not documented
>
> The standard SGML/XML DTD is documented, so are the Java *.properties
> files (the only difference between Java and Mozilla properties is that
> we allow unescaped Unicode characters).

Both those are just the base formats, not the localization formats. If
you just use that, you lose critical information, like "don't translate"
comments or the meaning of the arguments. You also lose all the
conventions for accelerators (.key vs .accesskey vs .commandKey). This
information is critical to create a working conversion of a
localization.

> > As far as I've noted so far, there 4 different file formats for mozilla
> > (dtd, general properties, shellservice.properties and installation
> > properties). These differ in argument formats for strings,
>
> This applies to .po files as well, the argument formats are different
> depending on the language the program was written in. For example, while
> localizing Novell's gnome-patch.po for SLED 10 I found many different
> formats (from C/C++, C#, Perl etc.).
>

Correct, PO doesn't solve it. And I never claimed it did (I'm pointing
out problems with the current state of the localization, I'm not
claiming that PO will solve all the problems - you'll notice that I
didn't even refer to PO in my initial post on this thread).

I just wonder why Firefox, using 2 languages (XUL for the interface,
which uses the DTDs, and C++ - if this is wrong, please correct me, I don't
remember what the language is), has to have different types of
arguments.

> > formatting rules and in the case of the installer, encoding.
>
> NSIS installer.properties use UTF-8, just like any other .properties file.
>

I was referring to toolkit/installer/windows/install.it, not
installer.properties.

It is still in CP1252 and, AFAIK, is going to remain that way. This is the
only example, but it's the fact that we have these little differences
that I'm complaining about. Converting from UTF-8 to CP1252 during the build
isn't that hard, and would mean that localizers only have one encoding to
work with. Some conversions would fail, it's true. At the moment the
builds fail if you have this file's encoding wrong, so it would not make
that much of a difference.
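
The build-time conversion described above can be sketched in a few lines of Python. This is only an illustration of the idea (decode UTF-8, refuse to continue if a character has no CP1252 equivalent, otherwise re-encode); the function names are hypothetical and nothing here reflects the actual Mozilla build system.

```python
def find_unencodable(text, encoding="cp1252"):
    """Return (line_number, character) pairs that the target encoding lacks."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for ch in line:
            try:
                ch.encode(encoding)
            except UnicodeEncodeError:
                problems.append((lineno, ch))
    return problems


def convert_utf8_bytes(data, encoding="cp1252"):
    """Convert UTF-8 bytes to the target encoding, failing loudly if lossy."""
    text = data.decode("utf-8")
    problems = find_unencodable(text, encoding)
    if problems:
        raise ValueError(
            "characters not representable in %s: %r" % (encoding, problems)
        )
    return text.encode(encoding)


# A Western European string converts cleanly.
print(convert_utf8_bytes("Instalação".encode("utf-8")))
# A Cyrillic string is reported instead of being silently mangled.
print(find_unencodable("ok\nМозила"))
```

Reporting the line number of each offending character is the point of the sketch: a failed conversion then tells the localizer where to look, rather than just breaking the build.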

I just want the work to be simplified. People can deal with incredible
amounts of complexity. But they work better with the complexity taken
away. And this thread is about getting more people to do the work
without getting Axel in a sanatorium, right?

Simplifying the little things I pointed out in

http://groups.google.com/group/mozilla.dev.l10n/browse_thread/thread/5c6d1913b319c6c9/51ff133d45e4a494?lnk=raot#44bf50f7f9260793

means:
* Less documentation needed.
* Fewer questions to get a localization up to speed.
* Fewer bugs in the localizations.

It won't make localization something doable by a five-year-old. It
won't make someone who says "I'd like to translate Camino to Elvish"
able to do it in 5 minutes. But it will avoid work for those integrating
the work. It will avoid bugs that took hours of our time just to
identify (and I've seen a bunch of them in the work done by my team, and
some of them were even corrected by others on this list - my sincere
thanks).


Novica Nakov

Nov 14, 2006, 11:55:00 AM
> I think it's fair to acknowledge that people localizing Mozilla
> applications have quite a few new things to learn, independent of the
> actual file format.


We can acknowledge that - no problem. However, you are asking (or the
topic is) how to get from 50 to 100 locales. So, really, how do you do that?

Have you ever considered why GNOME has 3 times more locales than Firefox?

1. GNOME is an older project. Can this be a good explanation? Maybe, to
some extent. But Firefox has much bigger user base, it makes sense to
have it localized no matter of the operating system one uses. And
Firefox has just a handful of strings compared to GNOME.

2. GNOME uses gettext. Can this be a good explanation? Maybe. But it's
more important to note that GNOME provides a tool to localize things.
The tool (it's called GTranslator I think) is part of the GNOME project.
And since Ubuntu (and Rosetta), all the translator needs to know is
how to click on a web page and fill in empty fields (so she/he doesn't
have to know how to work with gettext at all). I would say that these
projects have l10n high on their priority lists. Where is Firefox?
Mozilla doesn't use gettext, so people can't use existing localization
tools. Mozilla doesn't provide other official tools for doing the job.
And, AFAIK, Mozilla doesn't help the development of other
Mozilla-specific tools such as MT. Moreover, localizers have to learn CVS
and follow way too many Mozilla web pages with different info and... Huh,
if I feel like contributing to FLOSS, I'll probably choose Ubuntu.


3. Contributing to GNOME gives much more satisfaction than contributing
to Mozilla. Can this be a good explanation? Maybe. GNOME doesn't wave
the corporate flag and volunteers doing l10n work feel much more at
home. Mozilla is quite a different story. And after all, no one is
contributing localization to the Microsoft Corporation web browser, and
even if it is possible I would bet that no one will.


I could probably add a few more, but I think this gives the general picture.

Now a true story:
I've been doing l10n work on Mozilla stuff for quite some time now, and I'm
getting a bit tired and I don't really have as much time as I used to.
So I decided to ask some people to step in and replace me. And I'm
talking to this guy - an experienced Linux user, familiar with l10n
stuff, with CVS and all the other things one must know to do this
job - and he understands everything I tell him. His answer? - No way I'm
doing this.
And I'm still doing it because I don't want to see my past efforts go to
waste - but sooner or later I will have to call it a day (like I said, I
don't have as much time as I used to), and then probably no one will
replace me. At the same time the GNOME l10n team has changed leaders, a
few more people have joined, and they are working at full speed.


So, you want to get to 100 locales? Maybe gettext can help, but it is
not mandatory, right? I mean, what the hell, .pot, .properties, .java or
.whatever. Who cares? Rosetta users don't. Why should Mozilla
translators? If you want more locales you should make people feel at
home. Provide tools, help them, make things easier, send T-shirts, give
them honorary positions, put more effort into the whole process.

Or, simply, pay people to do it and tell them to shut up and use Notepad
for translating.


--
Novica

Marek Stępień

Nov 14, 2006, 1:20:21 PM
João Miguel Neves wrote:

> Both those are just the base formats, not the localization formats. If
> you just use that, you lose critical information, like "don't translate"
> comments or the meaning of the arguments.

Right.

> I just wonder why firefox, using 2 languages (XUL for the interface,
> that uses the DTDs) and C++ (if this is wrong, please correct - I don't
> remember what the language is), has to have different types of
> arguments.

It's not only C++, it's also JS. Also, some parts, like the installer
files, are actually not for Mozilla, but for different software that
needs to be bundled, like the NSIS installer.


>>> formatting rules and in the case of the installer, encoding.
>> NSIS installer.properties use UTF-8, just like any other .properties file.
> I was refering to: toolkit/installer/windows/install.it, not
> installer.properties.

This file is from the old installer, which was replaced with NSIS
for Fx/Tb 2. I don't know why this wasn't removed from CVS, but nobody
needs to translate this file anymore. (Though it needs to be present,
so that the build doesn't fail)

You're right on shellservice.properties, though. The reason for using
"&" for accesskeys is that those strings are used for some native
Windows stuff, and not for XUL. These 2 *Label strings actually show up
in Windows dialogs, not in Firefox itself. (Though it should be possible
to break those two strings into the .label and .accesskey one, and
generate the Windows-friendly ones during build time or even runtime).
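
The idea in the paragraph above (keep a separate label and accesskey, and generate the "&"-style string Windows expects) could look roughly like the following Python sketch. The function name is hypothetical. Two conventions are assumptions of this sketch rather than anything stated in the thread: doubling a literal "&" so Windows shows it as text, and appending the key in parentheses when the label does not contain it.

```python
def to_windows_label(label, accesskey):
    """Combine a label and a one-character accesskey into an '&'-marked label."""
    if len(accesskey) != 1:
        raise ValueError("accesskey must be a single character")

    def escape(text):
        # Assumption: a literal ampersand is written as '&&' in Windows labels.
        return text.replace("&", "&&")

    # Mark the first character that matches the accesskey, ignoring case.
    for index, ch in enumerate(label):
        if ch.lower() == accesskey.lower():
            return escape(label[:index]) + "&" + escape(label[index:])

    # Assumption: if the key is not part of the label, append it as "(&K)".
    return "%s (&%s)" % (escape(label), accesskey.upper())


print(to_windows_label("Options", "o"))
print(to_windows_label("Save As", "a"))
print(to_windows_label("設定", "s"))
```

With something like this run at build time, localizers would only ever see the .label/.accesskey pair, which is the simplification being suggested.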

> means:
> * Less documentation needed.
> * Less questions to get a localization up to speed.
> * Less bugs in the localizations.
>
> It won't make the localization something doable by a five year-old. It
> won't make someone who says "I'd like to translate Camino to elvish",
> able to do it in 5 minutes.

Camino is a different thing. Since it's a native app, and not a
XUL-based one, it uses the standard Mac OS X localization techniques
(for most of the app, at least). So most of the things you
talk about here don't actually apply to this browser. :)

Filip Miletic

Nov 14, 2006, 7:56:16 PM
Ricardo Palomares Martinez wrote:
> standard. Why PO is better than XUL model? Why is it better than just
> Java Properties i18n model?
> [...] What I'm sure about is that if [your favourite PO l10n tool] can't

> manage Mozilla i18n model, is not Mozilla the one to switch to PO: we

My 0.02$ --- right now, PO handles issues like multiple plurals and
declensions, which arise with some 'exotic' languages. As far as I can
see, XUL does not. I guess if you don't happen to need an l10n for such
a language, you don't get to appreciate the difference. I am interested
in such a language though, so this PO feature is a significant plus for me.

f

Robert Kaiser

Nov 14, 2006, 8:38:56 PM
Filip Miletic wrote:

> My 0.02$ --- right now, PO handles issues like multiple plurals and
> declinations, arising with some 'exotic' languages.

No. gettext does to some extent. PO is just a file format, and it
doesn't handle that stuff by itself. The tools/code that apply and
integrate the localized content into the app need to support it. So
using PO format in Mozilla or converting Mozilla L10n files to PO format
doesn't help anything. And supporting such stuff inside XML (XUL) is
quite hard to do. In the JS/C/C++ code where we use stringbundles
(.properties files) that might be easier and doable, but where we're
using .dtd (i.e. in XML files) I don't see how it can be done easily.

So it's no question of the file format, it's a question of integrating
the L10n files into the code.
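
To make the distinction above concrete: the plural handling lives in code that picks one of several translated forms based on a number, and the file format only has to store those forms. A minimal Python sketch follows. The three-form rule is the one commonly given in gettext Plural-Forms headers for Serbian and Russian; the catalog layout, the key name and the sample strings are illustrative assumptions, not Mozilla or gettext API.

```python
def plural_index_en(n):
    """Two forms: singular for exactly 1, plural otherwise."""
    return 0 if n == 1 else 1


def plural_index_sr(n):
    """Three forms, following the usual Plural-Forms rule for Serbian."""
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if 2 <= n % 10 <= 4 and not (10 <= n % 100 < 20):
        return 1
    return 2


# Hypothetical catalog: one list of forms per key, whatever the file format.
CATALOG_EN = {"files.count": ["{n} file", "{n} files"]}
CATALOG_SR = {"files.count": ["{n} фајл", "{n} фајла", "{n} фајлова"]}


def format_plural(catalog, key, n, index_fn):
    """Pick the form for n using the locale's rule and fill in the number."""
    forms = catalog[key]
    return forms[index_fn(n)].format(n=n)


for n in (1, 2, 5, 21):
    print(format_plural(CATALOG_EN, "files.count", n, plural_index_en),
          "|", format_plural(CATALOG_SR, "files.count", n, plural_index_sr))
```

The same lookup could sit behind a stringbundle-style call in JS or C++; doing it for entities substituted inside XML is the part the post calls hard.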

Robert Kaiser

Erdal Ronahi

Nov 14, 2006, 8:47:12 PM
to Robert Kaiser, dev-...@lists.mozilla.org
On 11/15/06, Robert Kaiser <ka...@kairo.at> wrote:

> So it's no question of the file format, it's a question of integrating
> the L10n files into the code.

Yes, I think "switch to PO" should be read as "switch to the gettext
framework". From a translator's point of view, the two are equivalent.
It is a fact that abilities like a very sophisticated approach to plurals
and other things are important if you want to get from 50 to 100
languages. Obviously PO/gettext has advantages here.

Erdal

Axel Hecht

Nov 14, 2006, 9:28:35 PM

I don't really like PO, I looked at it again today.

One thing that makes my head turn round is the plural stuff in
particular, and I'm not sure that that's gonna be good for tools like
compare-locales etc. Or l10n tools in general.

Another thing that I miss in gettext is variable expansion.

Benjamin and I have been talking about doing something for resolving
entities in expat which would help us to do partial localizations and
regional builds and stuff, and in my head that currently calls for just
a different format. PO just has stuff in there that isn't needed for a
localization, at least not in the centralized tooling environment we
provide.

I'll still need to draft something out, on the other hand I'd be curious
to see a real requirements list on a l10n file format if you guys want
to fill in.

* RCS friendly
* utf-8 encoded
* symbolic key - value mapping
* variable syntax with different formats and parameter ordering,
parameter naming
* distributed source, composing a locale from multiple source files
* working plural syntax, comparable against the reference locale and the
plural grammar of a particular language
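
Two of the requirements listed above, symbolic key-value mapping and named parameters that can be reordered, are easy to illustrate together. The Python sketch below parses a simple key = value format, then compares a localization against the reference locale by key and by parameter names. It is a toy in the spirit of the comparison the thread calls compare-locales, not that tool's implementation; the {name} parameter syntax, the function names and the sample strings are assumptions.

```python
import re

PARAM = re.compile(r"\{(\w+)\}")


def parse_entries(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            entries[key.strip()] = value.strip()
    return entries


def compare(reference, localized):
    """Report missing keys, obsolete keys and parameter-name mismatches."""
    ref_keys, loc_keys = set(reference), set(localized)
    mismatched = sorted(
        key for key in ref_keys & loc_keys
        if set(PARAM.findall(reference[key])) != set(PARAM.findall(localized[key]))
    )
    return {
        "missing": sorted(ref_keys - loc_keys),
        "obsolete": sorted(loc_keys - ref_keys),
        "param_mismatch": mismatched,
    }


reference = parse_entries(
    "# reference locale\n"
    "save.label = Save {file} to {folder}\n"
    "quit.label = Quit\n"
)
localized = parse_entries(
    "save.label = {folder}: {file} speichern\n"
    "old.label = Alt\n"
)
print(compare(reference, localized))
```

Because parameters are named rather than positional, the localized string can put {folder} before {file} and still compare cleanly against the reference.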

Or call this a draft of a draft.

Axel

Filip Miletic

Nov 14, 2006, 9:55:50 PM
Axel Hecht wrote:
> to see a real requirements list on a l10n file format if you guys want
Here are two more:

* Support for declensions. (i.e. allowing several context-dependent
translations to exist for the same string. In English, the string
"Mozilla" will stay "Mozilla" forever, but for instance in Serbian it
might be "Мозила", "Мозиле", "Мозили", "Мозилу", "Мозило", "Мозилом", "о
Мозили", depending upon the context in which the string appears)
* Convertibility to PO (even if only to use the existing toolchest for
manipulating PO files)
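
The first requirement above can be sketched as a term that carries one form per grammatical case, with each string saying which case it needs. The forms below are copied from the Serbian example in the list; the case labels attached to them, the {term:case} slot syntax and the function name are assumptions of this Python sketch, not an existing format.

```python
import re

# Forms taken from the example above; the case names are this sketch's
# assumption about which form belongs to which case.
BRAND_FORMS = {
    "nominative": "Мозила",
    "genitive": "Мозиле",
    "dative": "Мозили",
    "accusative": "Мозилу",
    "vocative": "Мозило",
    "instrumental": "Мозилом",
}

SLOT = re.compile(r"\{(\w+):(\w+)\}")


def render(template, terms):
    """Replace each {term:case} slot with the requested form of the term."""
    def repl(match):
        name, case = match.group(1), match.group(2)
        forms = terms[name]
        # Fall back to the nominative when a case form is not provided.
        return forms.get(case, forms["nominative"])
    return SLOT.sub(repl, template)


terms = {"brand": BRAND_FORMS}
print(render("{brand:genitive}", terms))
print(render("{brand:instrumental}", terms))
```

An English catalog would simply define one form and rely on the fallback, so the extra machinery costs nothing for languages that do not decline.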

HTH,
f

João Miguel Neves

Nov 15, 2006, 3:02:06 AM
to Axel Hecht, dev-...@lists.mozilla.org
On Tue, 2006-11-14 at 18:28 -0800, Axel Hecht wrote:
> One thing that makes my head turn round is the plural stuff in
> particular, and I'm not sure that that's gonna be good for tools like
> compare-locales etc. Or l10n tools in general.
>
Unfortunately plurals will always be complex, thanks to the languages.
It will never be easy on tools.

I think you should take a look at XLIFF (XML Localization Interchange
File Format). Version 1.1 is at:
http://www.oasis-open.org/committees/xliff/documents/xliff-specification.htm

As it's XML it could probably be used by XUL.

I think the Extensibility section covers everything we need to deal with
plurals and variables.

I was given this reference by Pootle's team, who are working on
supporting it.

> I'll still need to draft something out, on the other hand I'd be curious
> to see a real requirements list on a l10n file format if you guys want
> to fill in.
>
> * RCS friendly

What do you with this?

> * utf-8 encoded
> * symbolic key - value mapping
> * variable syntax with different formats and parameter ordering,
> parameter naming
> * distributed source, composing a locale from multiple source files
> * working plural syntax, comparable against the reference locale and the
> plural grammar of a particular grammar
>
> Or call this a draft of a draft.

I think it's usually called a requirements doc ;).

I'd add the impossible feature:
* No more complicated than the language demands.
(This is the feature where I have my doubts about XLIFF)

Other things that should not be forgotten:
* Annotations (notes to the localizers).
* Place (some kind of identification of where the string is - po uses
the source file, which is far from ideal but better than nothing - mozilla
uses the file structure and that's also a solution for this).

I've always thought that XML is not the right format for people to
write. I've always seen it as machine-writable, human-readable. It is
too fragile, particularly when we're trying to increase the number of
locales. I can remember at least two situations where this has bitten me
(the help search bug that you and Marcoos corrected for pt-PT and an &
in a .button that caused Firefox not even to display when we started
translating - that one took more than a week to find).

But that's just my take on it.


João Miguel Neves

unread,
Nov 15, 2006, 3:08:54 AM11/15/06
to Axel Hecht, dev-...@lists.mozilla.org
On Wed, 2006-11-15 at 08:02 +0000, João Miguel Neves wrote:
> > * RCS friendly
> What do you with this?

Sorry, I meant: What do you mean with this?


Dwayne Bailey

unread,
Nov 15, 2006, 3:14:39 AM11/15/06
to dev-...@lists.mozilla.org
This thread seemed to go all technical and I wanted to bring it back to
a less technical level:

Caveat: I work on the WordForge project which programs on the Translate
Toolkit and Pootle. But then I've also managed 11 languages across:
OpenOffice.org, Mozilla, GNOME and KDE... so I must know something.

I think we should restate some things...

Mozilla wants to do more localisation. The question is why? Is it
important? I assume it is for Mozilla. OK, so let's all assume it's
important. If it's important then this needs to receive some real and
critical support. And then Mozilla needs to actually _listen_ to people
with real and broad localisation experience.

Can we assume we're doing well now? Clearly not, since others are doing
much, much better for software that is used by far fewer people.

Are the people we want to ask even here? I don't think so. The people
who find it too difficult haven't even started; the people who have
experience on projects with better processes are probably sitting in
other projects. The people sitting on this list have already overcome and
forgotten their newbie problems or have become so myopic that they
cannot see beyond their technical solution to rethink it from scratch
and examine their own assumptions and prejudices. And the people who
have experience get shot down around technical debates.

Why aren't we doing well? Here are my thoughts:

* Mozilla acts as an island. We have not learnt anything from other
localisation projects. There is a real NIMBY attitude.

* The process is driven by technical people, not localisers. Notice how
quickly this thread became a debate about Gettext and PO. I saw only one
mail that addressed the problems someone has encountered in localising.

* It takes too much time. Too much QA is manual. Why must people wait
for tinderbox? Why are people having to spend so much time on lists
instead of translating and improving translations?

* We are too point release focused. There is no space to release and
update translations quickly. No space for release early. No space for
localisers to get a quality improvement pack out.

* QA is technical QA. All QA is about trademarks, broken UI stuff,
search and English text. All of that is wasted time and should be solved
by other means.

* There is no language QA. I'm not sure many teams do real QA, i.e.
check their translations, and they certainly don't have tools that can
help them to do this well (unless of course they use PO, in which case
they're in a slightly better position). How many teams have a glossary or
use TM? My guess is close to none, indicating we're on the first rung of
good l10n.

* Firefox is big. 35,000 words (including toolkit/); OOo is 80,000.
If you have to translate something that big before you see it, man, that
is discouraging. A professional translator working full-time would do
that in 35 working days, and that would exclude reviews. 7 weeks!
Almost 2 months with no reward.

* Localisation should not break Firefox. If it does we're doing
something wrong. We should be catching all of those things before we
make an XPI.

Solutions:

* Use the Translate Toolkit, dammit :). Even just to publish official POT
files. Even allow people to commit PO files to Mozilla l10n/ and
automatically convert those files to .dtd and friends. We thus open
Mozilla to the world of existing FOSS localisers without changing the
current process for other people. Quick win, no arguments about formats
or anything.

* Reduce the channels we need to follow to find information. My
suggestion, a simple blog that focuses only on announcing deadlines,
requirements and points back to the wiki if there is a need for more
detail.

* Make it possible to easily build .xpi and installable builds for
testing. Waiting for tinderbox and nightly builds is silly. Even
better, as soon as I commit stuff to my l10n/ dir Mozilla builds an XPI
for me for testing. Better yet, an XPI that knows how to upgrade so that
anyone using a testing XPI will get it and can use it magically.

* Use Pootle. This will allow very raw newbies to begin translating
using a simple web interface. This would even allow the registration
process to happen in parallel. Would allow community bug fix
contributions and encourage new users. It even allows a Help ->
Translate option on the browser so that people can localise.

* Realise that most good translators will actually not use MT or even a
PO editor for that matter, since insisting on one would be the same as
insisting that all people who code on Firefox should use vim. Have any of
the more technical people thought about how having such choices forced on
you by others is the equivalent of the editor wars? Our best translators
use commercial tools like Wordfast. If we're serious about language QA
then we need to use real l10n tools.

* Use pofilter in the toolkit (this does mean you need PO files, but I'm
sure Mozilla can be adapted). It picks up translated variables, missing
escapes, etc. across 41 checks. As a localiser, if you're not using
this I'm not so sure you care as much about quality as you say you do.
We use it and are pretty confident we don't have any technically induced
translation problems, i.e. our XPIs aren't going to break things. The aim
here is to make sure we can never build a broken XPI because of problems
in the translations.

* Create a system to properly mark entries in .dtd and .properties files
so that we can automate checks on things that should not be translated.
If we ever see a translator translate a config variable then we must be
honest that the problem is with us not them. If 100 people have to ask
should 'true' be translated, sigh, then we'll never get it. My ideal
solution but probably harder to implement is to move all config
information out of these files into a config file. Thus only things
that must be translated are in the DTD or .properties.

* Create more tiers with well defined goals. This would allow us to
define a newbie tier where a team can release anything at any time.
It's never officially released but is available for the brave and for
testing.

* Direct translators. Since it is so big, what should people do first so
that they see quick results? This would also allow us to define better
requirements for completeness for a release. So even with someone at
80% we'd know better whether that will still make a good UI experience.
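
As a rough illustration of the automated checks argued for in the
pofilter item above, here is a minimal sketch of one such check: making
sure that variables and entity references in a source string survive
translation unchanged. This is not pofilter's actual implementation; the
function name and the placeholder pattern are assumptions chosen for the
example, and a real check would need per-format rules.

```python
import re
from collections import Counter
from typing import List, Tuple

# Assumed placeholder syntax: printf-style variables such as %S or %1$S,
# and DTD-style entity references such as &brandShortName;
PLACEHOLDER = re.compile(r"%(?:\d+\$)?[sSd]|&[A-Za-z][\w.\-]*;")


def placeholder_mismatch(source: str, translation: str) -> Tuple[List[str], List[str]]:
    """Return (missing, extra) placeholders in the translation."""
    src = Counter(PLACEHOLDER.findall(source))
    dst = Counter(PLACEHOLDER.findall(translation))
    missing = sorted((src - dst).elements())  # in source, not in translation
    extra = sorted((dst - src).elements())    # in translation, not in source
    return missing, extra


if __name__ == "__main__":
    # The translator replaced the entity with a literal product name.
    print(placeholder_mismatch("Open %S in &brandShortName;",
                               "Otwórz %S w programie Firefox"))
```

Run over every string before an XPI is packaged, a check like this would
refuse to build when the first list is non-empty, which is the "never
build a broken XPI" goal stated above.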

One last thought. Mozilla Corp, through Axel, has asked a question. If
you don't like the answers, is that because the answer is wrong or
because you see the world through your own tinted spectacles? Every
person who raised issues that made Mozilla l10n hard for them has raised
a valid point; think about that before dismissing ideas.

--
Dwayne Bailey
Translate.org.za

+27-12-460-1095 (w)
+27-83-443-7114 (cell)

Ognyan Kulev

unread,
Nov 15, 2006, 3:44:00 AM11/15/06
to dev-...@lists.mozilla.org
> On Tue, 2006-11-14 at 18:28 -0800, Axel Hecht wrote:
>> I'll still need to draft something out, on the other hand I'd be curious
>> to see a real requirements list on a l10n file format if you guys want
>> to fill in.

Please don't forget the ability to quickly focus only on untranslated or
changed strings (like fuzzy/untranslated in the PO format). AFAIK MT has
this information but it's kept locally, not in CVS.

BTW the Bulgarian l10n has used PO with the Translate Toolkit since the
beginning (Firefox 1.5), but it took me much more time dealing with the
Translate Toolkit than translating Firefox, so for 2.0 I use a kind of
"stable branch" of my own of the toolkit, combined with a script for
synchronization: ml-sync in
http://openfmi.net/plugins/scmsvn/cgi-bin/viewcvs.cgi/scripts/?root=mozilla-bg

Regards,
ogi

Erdal Ronahi

unread,
Nov 15, 2006, 4:11:13 AM11/15/06
to Dwayne Bailey, dev-...@lists.mozilla.org
On 11/15/06, Dwayne Bailey <dwa...@translate.org.za> wrote:

[...]

Wonderfully explained, thanks a lot, Dwayne.

Erdal

Marek Stępień

unread,
Nov 15, 2006, 6:27:57 AM11/15/06
to
Axel Hecht wrote:

> One thing that makes my head turn round is the plural stuff in
> particular, and I'm not sure that that's gonna be good for tools like
> compare-locales etc. Or l10n tools in general.

Well, my proof-of-concept solution for plurals in .properties from
https://bugzilla.mozilla.org/show_bug.cgi?id=177097#c6
doesn't differ much from e.g. the contentHandlers or search.order in
region.properties, which is already supported by compare-locales...

As for plurals in .PO files, most of the l10n tools support these
perfectly.

Axel Hecht

unread,
Nov 15, 2006, 1:56:38 PM11/15/06
to
João Miguel Neves wrote:
> On Tue, 2006-11-14 at 18:28 -0800, Axel Hecht wrote:
>> One thing that makes my head turn round is the plural stuff in
>> particular, and I'm not sure that that's gonna be good for tools like
>> compare-locales etc. Or l10n tools in general.
>>
> Unfortunately plurals will always be complex, thanks to the languages.
> It will never be easy on tools.
>
> I think you should take a look at XLIFF (XML Localization Interchange
> File Format). Version 1.1 is at:
> http://www.oasis-open.org/committees/xliff/documents/xliff-specification.htm
>
> As it's XML it could probably be used by XUL.
>
> I think the Extensibility section covers everything we need to deal with
> plurals and variables.
>
> I was given this reference by pootle's team which are working on
> supporting it.
>
>> I'll still need to draft something out, on the other hand I'd be curious
>> to see a real requirements list on a l10n file format if you guys want
>> to fill in.
>>
>> * RCS friendly
> What do you with this?

Revision Control System, something like for example CVS.

>> * utf-8 encoded
>> * symbolic key - value mapping
>> * variable syntax with different formats and parameter ordering,
>> parameter naming
>> * distributed source, composing a locale from multiple source files
>> * working plural syntax, comparable against the reference locale and the
>> plural grammar of a particular grammar
>>
>> Or call this a draft of a draft.
>
> I think it's usual called a requirements doc ;).
>
> I'd had the impossible feature:
> * No more complicated than the language demands.
> (This is the feature where I have my doubts about XLIFF)

Actually, this is one of the biggest points that I missed in my list,
thanks for bringing that up.
Any character that is just grammar is a useless source of bugs.

In addition, and that's not really a requirement but my personal taste
to some extent, I don't want process data in the l10n file format. Last
time I looked at xliff, it was full of that. Same for PO. Those files
are to some extent designed to be self-contained and mailed around.
Stuff like that is not really RCS friendly, though, as you check in
changes that are not really changing your work, but just the context of it.

> Other things that should not be forgotten:
> * Annotations (notes to the localizers).
> * Place (some kind of identification of where is the string - po uses
> source file, which is far from ideal but better than nothing - mozilla
> uses the file structure and that's also a solution for this).

I'm not sure I understand your point here.

> I've always thought that XML is not the right format for people to
> write. I've always seen it as machine-writable, human-readable. It is
> too fragile, particularly when we're trying to increase the number of
> locales. I can remember at least two situations where this has bitten me
> (the help search bug that you and Marcoos corrected for pt-PT and an &
> in a .button that caused Firefox not even to display when we started
> translating - that one took more than a week to find).
>
> But that's just my take on it.

Thanks for the input here.

Axel

Marek Stępień

unread,
Nov 15, 2006, 3:54:31 PM11/15/06
to
Axel Hecht wrote:

>> * Place (some kind of identification of where is the string - po uses
>> source file, which is far from ideal but better than nothing - mozilla
>> uses the file structure and that's also a solution for this).
>
> I'm not sure I understand your point here.

I suppose João wanted to say that in a typical .PO entry you usually
have lots of information about the string, like in this example (taken
from a real .PO from Novell YaST localization done by Aviary.pl):

#. screen title for update options
#. this is a heading
#: src/clients/inst_update.ycp:26 src/clients/update_proposal.ycp:366
msgid "Update Options"
msgstr "Opcje aktualizacji"

So, in the line starting with "#:" you have information of where that
string comes from precisely - source file location and line number.

IMO, as in Mozilla we usually have a 1:1 relationship between foo.xul
and foo.dtd, it's not that much of a problem.
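
For readers who have not worked with the format, the structure of the
entry above can be pulled apart with a few lines of Python. This is a
minimal sketch only: parse_po_entry is a hypothetical helper written for
this example, not part of gettext or any PO library, and it ignores
multi-line strings, escapes, flags and plurals.

```python
from typing import Dict, List, Tuple, Union

Entry = Dict[str, Union[str, None, List[str], List[Tuple[str, int]]]]


def _unquote(value: str) -> str:
    """Strip one pair of surrounding double quotes (escapes not handled)."""
    value = value.strip()
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1]
    return value


def parse_po_entry(text: str) -> Entry:
    """Split a single simple PO entry into comments, references and strings."""
    comments: List[str] = []
    references: List[Tuple[str, int]] = []
    entry: Entry = {"comments": comments, "references": references,
                    "msgid": None, "msgstr": None}
    for line in text.strip().splitlines():
        line = line.strip()
        if line.startswith("#."):          # note extracted from the source code
            comments.append(line[2:].strip())
        elif line.startswith("#:"):        # source file and line number
            for ref in line[2:].split():
                path, _, lineno = ref.rpartition(":")
                references.append((path, int(lineno)))
        elif line.startswith("msgid "):
            entry["msgid"] = _unquote(line[len("msgid "):])
        elif line.startswith("msgstr "):
            entry["msgstr"] = _unquote(line[len("msgstr "):])
    return entry
```

Fed the YaST entry quoted above, this gives two comments, two
(file, line) references, the msgid "Update Options" and the msgstr
"Opcje aktualizacji".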

Axel Hecht

unread,
Nov 15, 2006, 6:05:18 PM11/15/06
to
Marek Stępień wrote:
> Axel Hecht wrote:
>>> * Place (some kind of identification of where is the string - po uses
>>> source file, which is far from ideal but better than nothing - mozilla
>>> uses the file structure and that's also a solution for this).
>> I'm not sure I understand your point here.
>
> I suppose João wanted to say that in a typical .PO entry you usually
> have lots of information about the string, like in this example (taken
> from a real .PO from Novell YaST localization done by Aviary.pl):
>
> #. screen title for update options
> #. this is a heading
> #: src/clients/inst_update.ycp:26 src/clients/update_proposal.ycp:366
> msgid "Update Options"
> msgstr "Opcje aktualizacji"
>
> So, in the line starting with "#:" you have information of where that
> string comes from precisely - source file location and line number.
>
> IMO, as in Mozilla we usually have a 1:1 relationship between foo.xul
> and foo.dtd, it's not that much of a problem.
>

I think this is an artifact of sending files around the world,
which we're not going to do. The other side of this is that we do have
lxr, which indexes our code, so I don't see a real use for that, nor any
value in keeping that information alive. For example, just because
someone adds a function before the localizable code, we don't want to
change 50-100 localization files.

Axel

João Miguel Neves

unread,
Nov 15, 2006, 6:18:51 PM11/15/06
to Axel Hecht, dev-...@lists.mozilla.org
On Wed, 2006-11-15 at 10:56 -0800, Axel Hecht wrote:

> João Miguel Neves wrote:
> >> * RCS friendly
> > What do you with this?
>
> Revision Control System, something like for example CVS.
>
I know what RCS is (as well as CVS, SCCS, Mercurial, Subversion and
others that I use). What I don't know is what RCS friendly means.

> In addition, and that's not really a requirement but my personal taste
> to some extent, I don't want process data in the l10n file format. Last
> time I looked at xliff, it was full of that. Same for PO. Those files
> are to some extent designed to be self-contained and mailed around.
> Stuff like that is not really RCS friendly, though, as you check in
> changes that are not really changing your work, but just the context of it.
>

OK, now I think I got it. What you want is that changes in version
control mean changes in the localization, and not anything else. In
other words, make the commit logs meaningful.

> > Other things that should not be forgotten:
> > * Annotations (notes to the localizers).
> > * Place (some kind of identification of where is the string - po uses
> > source file, which is far from ideal but better than nothing - mozilla
> > uses the file structure and that's also a solution for this).
>
> I'm not sure I understand your point here.
>

One of the biggest problems we have is mapping the interface to the
translation file. In the first translation pass, either one has a deep
knowledge of the application or its technology, or a lot of the
translation will be incorrect, and that'll become obvious once you look
at the interface. Nowadays I use grep to find where a certain string is,
because I don't know how to map from the interface to the files.


João Miguel Neves

unread,
Nov 15, 2006, 6:25:31 PM11/15/06
to Axel Hecht, dev-...@lists.mozilla.org
Without discussing the merits of PO files, I'd just like to point out a
couple of things:

* I know of no project that uses another format with more than 50 or 100
locales.
* Epiphany and Konqueror, which in terms of usage are minimal, have
more localizations than Firefox.
* Several teams use the PO format to translate Firefox.
* Thousands of localizers (not just programmers) are able to work with
the format.
* The localization of a project like KDE, with a major release a year,
costs less than 40 hours a year to the pt-PT localizers.

Whatever can be said about the format, these facts can't and shouldn't
be dismissed.


João Miguel Neves

unread,
Nov 15, 2006, 6:29:14 PM11/15/06
to dev-...@lists.mozilla.org
On Wed, 2006-11-15 at 21:54 +0100, Marek Stępień wrote:
> Axel Hecht wrote:
> >> * Place (some kind of identification of where is the string - po uses
> >> source file, which is far from ideal but better than nothing - mozilla
> >> uses the file structure and that's also a solution for this).
> >
> > I'm not sure I understand your point here.
>
> I suppose João wanted to say that in a typical .PO entry you usually
> have lots of information about the string, like in this example (taken
> from a real .PO from Novell YaST localization done by Aviary.pl):
>
> #. screen title for update options
> #. this is a heading
> #: src/clients/inst_update.ycp:26 src/clients/update_proposal.ycp:366
> msgid "Update Options"
> msgstr "Opcje aktualizacji"
>
> So, in the line starting with "#:" you have information of where that
> string comes from precisely - source file location and line number.
>
> IMO, as in Mozilla we usually have a 1:1 relationship between foo.xul
> and foo.dtd, it's not that much of a problem.
>
Almost there, but not quite. PO's solution is not enough. My problem is
context: how does one find out the context of a translation? The ideal
is what I've seen in some webapps: you click on a phrase in the
interface and translate it. This is not possible with most programs, so
identifying the source is the least bad solution.

Marek is correct that in Mozilla the file structure maps the interface,
so information like the one in PO would be redundant. But it's still not
a perfect solution for context.


Robert Kaiser

unread,
Nov 15, 2006, 6:53:36 PM11/15/06
to
Marek Stępień wrote:

> IMO, as in Mozilla we usually have a 1:1 relationship between foo.xul
> and foo.dtd, it's not that much of a problem.

That relationship is a very good and often necessary one, and I found
that it doesn't work too well with PO/gettext, at least in my PHP work.
IMHO that makes the framework unusable for decent L10n.

I think our problem is not the formats though, it's some flexibility in
integration into the code.

Robert Kaiser

Robert Kaiser

unread,
Nov 15, 2006, 7:06:19 PM11/15/06
to
João Miguel Neves wrote:

> * I know of no project that uses other format with more than 50 or 100
> locales.

Is there any project using PO format that has 50 *fully complete*
localizations, like Firefox has now with our supposedly inferior formats?

> * Epiphany and Konqueror, which in terms of usage are minimal, have
> more localizations than Firefox.

Again, how many of them are *fully complete*?


As the PO/gettext framework allows incomplete translations and ours
doesn't at the moment, I think comparing the pure number of localizations
is unfair. I'm pretty sure the numbers wouldn't differ much if we
could allow partial localizations (and that includes en-CA, en-AU, etc.)


Just my 2c,

Robert Kaiser

Ankit Patel

unread,
Nov 16, 2006, 12:23:08 AM11/16/06
to Robert Kaiser, dev-...@lists.mozilla.org
----- Original Message ----
From: Robert Kaiser <ka...@kairo.at>
To: dev-...@lists.mozilla.org
Sent: Thursday, November 16, 2006 5:36:19 AM
Subject: Re: PO format usage

>João Miguel Neves wrote:
>> * I know of no project that uses other format with more than 50 or 100
>> locales.

>Is there any project using PO format that has 50 *fully complete*
>localizations, like Firefox has now with our supposedly inferior formats?

Open source projects like GNOME, KDE, Xfce, Fedora, etc. use the PO format for their localization support, and I think each of these projects has at least 20 locales that are *fully complete*, at least 50 locales at 80% and certainly 10 locales at 50%. I agree that there is no point in using a 50% localized application, but at least translators all over the world are encouraged to start their work, because they find the process of getting their language included in these projects easy, which is quite tough in Mozilla as far as I know.

>> * Epiphany and Konqueror, which in terms of usage are minimal, have
>> more localizations than Firefox.

>Again, how many of them are *fully complete*?

I think 80% is enough for the end user to feel that the application is localized.

>As the PO/gettext framework allows inclomplete translations and our
>doesn't at the moment, I don't think a pure number of localizations is
>just unfair. I'm pretty sure the numbers wouldn't differ much if we
>could allow partial localizations (and that includes en-CA, en-AU, etc.)


Two important points from my side:
1. Translators prefer to work on PO format files
2. The .dtd & .properties files don't show minor changes; we always have to compare the English files with our language files using the compare-locales.pl script before we start working, which I think is not ideal.
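
To show what such a comparison amounts to, here is a simplified sketch
of a key-level diff between an en-US .properties file and a localized
one. It is not the actual compare-locales.pl; the function names are
made up for this example, and the parser ignores line continuations,
escapes and the ":" separator that real .properties files allow.

```python
from typing import Dict, List


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and comments."""
    entries: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            entries[key.strip()] = value.strip()
    return entries


def compare_locales(reference_text: str, locale_text: str) -> Dict[str, List[str]]:
    """Report keys that are missing, obsolete, or identical to the reference."""
    ref = parse_properties(reference_text)
    loc = parse_properties(locale_text)
    return {
        # in en-US but not in the locale: still to be translated
        "missing": sorted(k for k in ref if k not in loc),
        # in the locale but no longer in en-US: can be removed
        "obsolete": sorted(k for k in loc if k not in ref),
        # same text as en-US: possibly untranslated, possibly fine
        "unchanged": sorted(k for k in ref if k in loc and ref[k] == loc[k]),
    }
```

Note what a diff like this cannot see: if the English text of an
existing key changes, the key still matches and nothing is reported.
That may be the kind of minor change meant above, and it is what the
fuzzy flag in PO files tracks.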

Suggestions:
1. I guess we could survey this matter among the mozilla translators' community itself to know what translators want.
2. We could develop some tool to convert .dtd & .properties files to .po files and .po back to .dtd & .properties; that would give translators the flexibility to choose either the .dtd & .properties or the .po format.

>Just my 2c,

>Robert Kaiser

More suggestions are always welcome...


Ankit Patel



pi

unread,
Nov 16, 2006, 2:22:08 AM11/16/06
to

> 2. We could develop some tool to conver .dtd & .properties files to .po files & .po to .dtd & .properties, that may give flexibility to translators to choose either .dtd & .properties or .po format

I know of a project that does these conversions: moz2po & po2moz
(http://www.wordforge.org/static/toolkit-moz2po.html). The Basque group
(I am a member) used these scripts for the ff2 translation, and they
basically run fine.

Axel Hecht

unread,
Nov 16, 2006, 3:03:23 AM11/16/06
to
Ankit Patel wrote:
> ----- Original Message ----
> From: Robert Kaiser <ka...@kairo.at>
> To: dev-...@lists.mozilla.org
> Sent: Thursday, November 16, 2006 5:36:19 AM
> Subject: Re: PO format usage
>
>> João Miguel Neves wrote:
>>> * I know of no project that uses other format with more than 50 or 100
>>> locales.
>
>> Is there any project using PO format that has 50 *fully complete*
>> localizations, like Firefox has now with our supposedly inferior formats?
>
> Open source projects like, Gnome, Kde, Xfce, Fedora, etc. are using PO formats to get the localizataion support & and i think in these projects there will be at least 20 locales which has *fully complete* localization & at least 50 locales which has 80% of localization & definitely 10 locales which has 50% of localization. I agree that there is no point of using 50% localized application, but at least translators all over the world are encouraged to start their work as they find the easy process of getting their language inclusion in these projects, which is quite tough in mozilla as far as i know.
>
>>> * Epiphany and Konqueror, which in terms of usage are minimal, have
>>> more localizations than Firefox.
>
>> Again, how many of them are *fully complete*?
>
> I think 80% is enough for the end user to feel the localization of application.

Or they feel like total crap. It really depends on the visibility of the
20%.

But I do agree, we need to find a way to get partial localizations out
there for testing purposes, for a whole bunch of reasons.

>> As the PO/gettext framework allows inclomplete translations and our
>> doesn't at the moment, I don't think a pure number of localizations is
>> just unfair. I'm pretty sure the numbers wouldn't differ much if we
>> could allow partial localizations (and that includes en-CA, en-AU, etc.)
>
>
> Two important points from my side:
> 1. Translators prefer to work on PO format files
> 2. java .dtd & .properties files doesn't show the minor chages, we always have to compare the english files with our lang files using compare-locales.pl file & start working, which i think is not perfect.

I don't know what you mean with "minor changes" here.

> Suggestions:
> 1. I guess we could survey this matter among the mozilla translators' community itself to know what translators want.
> 2. We could develop some tool to conver .dtd & .properties files to .po files & .po to .dtd & .properties, that may give flexibility to translators to choose either .dtd & .properties or .po format
>
>> Just my 2c,
>
>> Robert Kaiser
>
> More suggestions are always welcome...
>

yep

Axel

João Miguel Neves

unread,
Nov 16, 2006, 4:01:16 AM11/16/06
to Robert Kaiser, dev-...@lists.mozilla.org
This is a very long post. It includes rants, data, concerns, a
translation process description and a couple of other things.

You've been warned.

I've never felt code integration to be an issue. Rushing things,
not getting approval for localizations in a timely fashion and having to
wait almost a day to test a localization are the problems for me.

For Firefox 2 there were times where we had 3 people working not
fulltime, but overtime on it to make the deadlines. Then we'd have to
wait for the next build because we hadn't the resources to build
ourselves (the way the builds are working, if we commit at the end of
the day - around 6pm here, we can only test it the next afternoon). The
result is that some localization work was almost not tested (the Windows
installer comes to mind). The help changes and other late work made us
cut back on testing and QA. We had about one hundred known issues
by the time of RC1 (not counting help). Most of them were small things;
some of them got lost along the way (we found them again later, with user
feedback).

The present pt-PT started because there was no pt-PT l10n for Firefox
1.5 and we wanted/needed it. We did one, and Axel said they wouldn't use
it. Blame it on wrong expectations, but there was a cost to me and my
team: for 1.5 we had 13 translators working on it, for 2.0 we had 4.

The thunderbird localization has stopped for several weeks now. It took
2 months to get the registration bug approved. It could have taken a
month less if it wasn't for issues on our side. The interesting part is
that those issues could have been avoided, if people weren't scared of
the bureaucracy (read non-localization work). That's why I was asked to
make the connection with Mozilla, instead of the people doing the real
translation. At the moment, after more than 1.5 months of work on it, we
don't have a Thunderbird build to test. Our Thunderbird team is not the
same as the Firefox team, so they do feel like 2nd class citizens to be
kept waiting because Firefox 2.0 was taking most of the resources.

For a good localization(*) we need this cycle:
1) Translate the files.
2) Do translation sessions:
a) Update the translation (aka, synchronize with en-US).
b) Test the translation (build the app or xpi with the translation,
run the translation through QA).
c) Correct the errors.

For Firefox, checking context errors (where a translation is wrong for
that context, even if it otherwise seems correct), correcting them, and
verifying that the corrections are right is a job for at least half a
week, not minutes or hours (think of correcting, waiting a day, finding
out you made a mistake, correcting, waiting another day for the build,
verifying that the solution is good).

We can do work in the meantime because our l10n QA infrastructure
doesn't depend on a build (which obviously doesn't capture the context
errors). But we also keep our own repositories and we're starting to
think that we also need to have our own build infrastructure if we want
to have an Excellent l10n. Is this the amount/kind of resources we are
asking each team to have in order to do good work? This means system
administration work, at least. Not the kind of thing I'd ask of a
professional translator.

By the way, our repositories allow us to keep working even when
mozilla's are closed. That has worked great, because that allows us to
be proactive in bug fixing, and makes the bug requesting/approval bit
more manageable (when the approvals open for a new release, we have them
ready to commit).

I do think Mozilla has a serious problem. Among current and previous
translators there's a feeling of burnout, Axel included. In projects like
KDE and GNOME the pt-PT localizers complain of burnout sometimes, but
they've been able to keep doing their work for 3+ years so far. That
hasn't happened with Mozilla pt-PT translations. And we see people who
work on several localizations complaining about the amount of time a
project like Firefox requires.

After what I experienced for Firefox 2.0 and from the feedback that this
was the least problematic release so far, I don't want to imagine the
previous problems. But as far as the road has been traveled, there are
still a lot of things to fix:
1) Making it easier to do the translation cycle will make correcting
bugs faster.
2) With partial translations and a QA infrastructure, basically all the
translation process will be this cycle (bug count will include coherency
errors and number of untranslated strings). So correcting bugs faster
will mean better and faster localization work.
3) Fast bureaucratic procedures: this means lower entry requirements
and approvals must be a 1st level priority. At the moment this scares
people away (I was the 2nd team leader for my localization team - guess
what drove the other one away). If you have doubts check how many bugs
for registration teams haven't been followed through. Some disappear
(that's normal), but I still find it troubling that Firefox's
localization numbers are comparable with more complex free software with
less than 1/10th or 1/100th of the users.
4) All the rest of the stuff I've said before - sorry for not having a
simple list, I've tried that, and I just keep adding stuff.

(*) Not complete (and we include help in that definition) - I'd classify
the current Firefox's pt-PT localization as Good, but not Very Good or
Excellent, and it was Lousy when it was complete - things like correct
access keys, coherence in translation, not having the same terms in the
same context translated differently, spell-checking on all the
translation make the difference between a complete and an Excellent
translation.

Best regards,
João Miguel Neves

PS: I'm going to step away from this discussion for a while. My normal
work has more requests than I can handle right now, so I'm going to lay
down on this for a while. Keep it alive, please.


Dwayne Bailey

Nov 16, 2006, 4:54:40 AM11/16/06
to dev-...@lists.mozilla.org
On the numbers game Robert is right. Since we use PO for our Firefox
translations we are able to create partially complete translations. PO
allows us to quite simply track what is untranslated and what has
changed. So that simple fact would add 11 languages since we're 100%
for 1.5 but about 90% for 2.0.

The point that is missed is that somehow these projects encourage lively
contribution. PO just happens to be a de facto standard which allows
people to easily transfer skills. I'm afraid that is not so with Mozilla
l10n. That is when the figures make sense.

Damjan Georgievski

Nov 16, 2006, 4:56:18 AM11/16/06
to
>> Please provide familiar and all-loved framework (SCM+PO)
>> for translating.
>
> Who says that everyone loves that framework? In fact, I know multiple
> people (including myself) that hate it and would always prefer the
> current Mozilla model to it if they could choose.

See the number of KDE translations? See the number of Gnome translations?
How much bigger KDE and Gnome are than Firefox/Thunderbird?
How much smaller is the KDE+Gnome user community compared to the Firefox
community.

Now think... maybe... are you the only one not liking .PO framework?
Think.

BTW What is the "current" model? No tools?
Confusingly two (or more) file formats?
(Not to speak of the Mozilla bureaucracy)

--
damjan

Damjan Georgievski

Nov 16, 2006, 4:57:35 AM11/16/06
to
>>>> In parallel, we should collect ideas on how to make l10n easier,
>>>
>>> This is an easy one. Provide official POT files.
>>
>> I would stress on the same thing. It's great that MT is actively
>> developed, but IMHO it's a waste of time and energy to rewrite already
>> great PO editors like KBabel or Emacs+gettext.el. (...)
>
>
> What many of PO fans don't seem to acknowledge (I've had this same
> discussion with some es-ES team members) is that PO is NOT *the*
> standard; in fact, I don't think that there is such thing as a l10n
> standard. Why PO is better than XUL model? Why is it better than just
> Java Properties i18n model?

For a simple reason, there are tools, several of them. Useful tools, desktop
tools, web tools, lint checks.. etc.. TOOLS.


--
damjan

Dwayne Bailey

Nov 16, 2006, 5:00:40 AM11/16/06
to dev-...@lists.mozilla.org
On Wed, 2006-11-15 at 10:56 -0800, Axel Hecht wrote:

<snip>

> > * No more complicated than the language demands.
> > (This is the feature where I have my doubts about XLIFF)
>
> Actually, this is one of my biggest point that I missed in my list,
> thanks for bringing that up.
> Any character that is just grammar is a useless source of bugs.
>
> In addition, and that's not really a requirement but my personal taste
> to some extent, I don't want process data in the l10n file format. Last
> time I looked at xliff, it was full of that. Same for PO. Those files
> are to some extent designed to be self-contained and mailed around.
> Stuff like that is not really RCS friendly, though, as you check in
> changes that are not really changing your work, but just the context of it.

PO is noisy in RCS - XLIFF would be worse. But unfortunately l10n is
not programming and does require tracking process. I guess there are
ways to track process apart from the file but then things get
complicated when you are trying to keep things in sync.

Dwayne Bailey

Nov 16, 2006, 5:08:33 AM11/16/06
to dev-...@lists.mozilla.org
On Wed, 2006-11-15 at 21:23 -0800, Ankit Patel wrote:
> ----- Original Message ----
> From: Robert Kaiser <ka...@kairo.at>
> To: dev-...@lists.mozilla.org
> Sent: Thursday, November 16, 2006 5:36:19 AM
> Subject: Re: PO format usage
>
> >João Miguel Neves schrieb:
> >> * I know of no project that uses other format with more than 50 or 100
> >> locales.
>
> >Is there any project using PO format that has 50 *fully complete*
> >localizations, like Firefox has now with our supposedly inferior formats?
>
> Open source projects like GNOME, KDE, Xfce, Fedora, etc. use the PO format for localization support, and I think these projects will have at least 20 locales that are *fully complete*, at least 50 locales at 80% localization, and definitely 10 locales at 50% localization. I agree that there is no point in using a 50% localized application, but at least translators all over the world are encouraged to start their work because they find it easy to get their language included in these projects, which is quite tough in Mozilla as far as I know.
>
> >> * Epiphany and Konqueror, which in terms of usage are minimal, have
> >> more localizations than Firefox.
>
> >Again, how many of them are *fully complete*?
>
> I think 80% is enough for the end user to feel the localization of the application.

Thanks for saying that. The notion of 100% is a farce. We need to
translate the strings that people will see 80% of the time. 100% is
ideal but I'm sure you have translated error messages that no one has
ever ever seen.

> >As the PO/gettext framework allows incomplete translations and ours
> >doesn't at the moment, I think comparing the pure number of
> >localizations is just unfair. I'm pretty sure the numbers wouldn't
> >differ much if we could allow partial localizations (and that includes
> >en-CA, en-AU, etc.)
>
>

> Two important points from my side:
> 1. Translators prefer to work on PO format files
> 2. java .dtd & .properties files don't show the minor changes; we always have to compare the English files with our language files using the compare-locales.pl script before we start working, which I think is not perfect.

I call these files monolingual. They will always have a problem of not
being able to track their own changes. Bilingual files like PO and
XLIFF do not suffer from this problem.

In PO when a new message arrives it's blank for me and I translate it.
If the English changes it goes fuzzy (and thus will not be used by
po2moz) and I can change it. The latest release of Gettext adds a way
to store the previous translation so your editor could then easily give
you a diff between the previous and current English text.
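
As a rough sketch of the behaviour described above (not gettext's actual
implementation, which matches on the English text itself), the following
Python models a merge keyed by a hypothetical string ID. The names Entry,
merge and usable are invented for this illustration only.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Entry:
    source: str                            # English text (msgid)
    target: str = ""                       # translation (msgstr); empty if untranslated
    fuzzy: bool = False                    # True when the English changed under the translation
    previous_source: Optional[str] = None  # old English text, kept so an editor can show a diff


def merge(old: Dict[str, Entry], new_sources: Dict[str, str]) -> Dict[str, Entry]:
    """Merge existing translations (keyed by string ID) with the new English text."""
    merged: Dict[str, Entry] = {}
    for key, source in new_sources.items():
        prev = old.get(key)
        if prev is None or not prev.target:
            # New or never-translated string: arrives blank.
            merged[key] = Entry(source)
        elif prev.source == source:
            # English unchanged: carry the translation over as-is.
            merged[key] = Entry(source, prev.target, prev.fuzzy, prev.previous_source)
        else:
            # English changed: keep the old translation but mark it fuzzy.
            merged[key] = Entry(source, prev.target, True, prev.source)
    return merged


def usable(entries: Dict[str, Entry]) -> Dict[str, str]:
    """Translations a converter would actually ship: translated and not fuzzy."""
    return {k: e.target for k, e in entries.items() if e.target and not e.fuzzy}
```

Strings that disappear from the new English set are simply dropped here; a
real tool would more likely keep them around as obsolete entries.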

>
> Suggestions:
> 1. I guess we could survey this matter among the mozilla translators' community itself to know what translators want.
> 2. We could develop some tool to convert .dtd & .properties files to .po files & .po to .dtd & .properties; that may give translators the flexibility to choose either the .dtd & .properties or the .po format

Now that would be a great idea! Oh wait it was done about 4 years
ago ;)

http://translate.sourceforge.net/wiki/toolkit/moz2po
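
To make the idea concrete, here is a minimal Python sketch of what such a
conversion does in principle: pair each English string with its translation
and write a bilingual entry. This is an illustration only, not moz2po's
code or its exact output. The function names are hypothetical, and the
.properties parsing is simplified (only "key = value" lines, no line
continuations).

```python
def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blanks and comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        entries[key.strip()] = value.strip()
    return entries


def po_quote(s):
    """Quote a string for a PO file (backslashes and double quotes only)."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def properties_to_po(english, translated):
    """Build PO-style text from an English file and its translated counterpart."""
    en = parse_properties(english)
    tr = parse_properties(translated)
    blocks = []
    for key, source in en.items():
        # The string ID goes into a location comment so it can be mapped back.
        blocks.append(
            f"#: {key}\nmsgid {po_quote(source)}\nmsgstr {po_quote(tr.get(key, ''))}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    print(properties_to_po("open = Open\nsave = Save\n", "open = Abrir\n"))
```

Untranslated strings come out with an empty msgstr, which is what lets a PO
editor show the translator what is still left to do.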

Marek Stępień

Nov 16, 2006, 5:24:43 AM11/16/06
to
João Miguel Neves napisał(a):

> For Firefox 2 there were times where we had 3 people working not
> fulltime, but overtime on it to make the deadlines. Then we'd have to
> wait for the next build because we hadn't the resources to build
> ourselves (the way the builds are working, if we commit at the end of
> the day - around 6pm here, we can only test it the next afternoon).

So, you were waiting for the nightly builds, right? There are tinderbox
builds built every hour or so:

http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox/latest-mozilla1.8-l10n/
http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox/latest-trunk-l10n/

João Miguel Neves

Nov 16, 2006, 5:57:50 AM11/16/06
to dev-...@lists.mozilla.org
Great. I've just added that to MDC's create a new localization's page.
I've been updating it with the things I've learned to make it faster for
others to get up to speed. If you know of anything else that would be
useful, please also add it in:

http://developer.mozilla.org/en/docs/Create_a_new_localization

Best regards,
João Miguel Neves


Wim

Nov 16, 2006, 6:22:13 AM11/16/06
to dev-...@lists.mozilla.org
Marek Stępień wrote:

>João Miguel Neves napisał(a):
>
>
>>For Firefox 2 there were times where we had 3 people working not
>>fulltime, but overtime on it to make the deadlines. Then we'd have to
>>wait for the next build because we hadn't the resources to build
>>ourselves (the way the builds are working, if we commit at the end of
>>the day - around 6pm here, we can only test it the next afternoon).
>>
>>
>So, you were waiting for the nightly builds, right? There are tinderbox
>builds built every hour or so:
>
>http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox/latest-mozilla1.8-l10n/
>http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox/latest-trunk-l10n/
>

And if you are in a hurry and really can't wait, put the file into your
local <language>.jar file and restart Firefox. Of course you need to
make back-ups.
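
A small Python sketch of that manual step, under the assumption that the
.jar is an ordinary zip archive and that you know the member path of the
file you changed. The function name and the example paths in the comments
are hypothetical.

```python
import shutil
import zipfile
from pathlib import Path


def replace_in_jar(jar_path, member, new_content: bytes):
    """Replace one file inside a .jar, keeping a .bak copy of the original."""
    jar = Path(jar_path)
    backup = jar.with_name(jar.name + ".bak")
    shutil.copy2(jar, backup)  # the back-up mentioned above

    # Zip members cannot be overwritten in place, so rebuild the archive
    # from the backup, swapping in the new content for the chosen member.
    tmp = jar.with_name(jar.name + ".tmp")
    with zipfile.ZipFile(backup) as src, \
            zipfile.ZipFile(tmp, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename != member:
                dst.writestr(info, src.read(info.filename))
        dst.writestr(member, new_content)
    tmp.replace(jar)
    return backup


# Example call (paths are illustrative only):
# replace_in_jar("chrome/ab-CD.jar", "locale/browser/browser.dtd",
#                Path("browser.dtd").read_bytes())
```

Firefox should be closed while the archive is rewritten and restarted
afterwards, as described above.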

I don't know how things work with PO-files, but what about a
translate-system similar (not equal!) to Babelzilla?

Wim

Robert Kaiser

Nov 16, 2006, 7:20:47 AM11/16/06
to
Ankit Patel schrieb:
> From: Robert Kaiser <ka...@kairo.at>

>> João Miguel Neves schrieb:
>>> * I know of no project that uses other format with more than 50 or 100
>>> locales.
>
>> Is there any project using PO format that has 50 *fully complete*
>> localizations, like Firefox has now with our supposedly inferior formats?
>
> Open source projects like GNOME, KDE, Xfce, Fedora, etc. use the PO format for localization support, and I think these projects will have at least 20 locales that are *fully complete*, at least 50 locales at 80% localization, and definitely 10 locales at 50% localization. I agree that there is no point in using a 50% localized application, but at least translators all over the world are encouraged to start their work because they find it easy to get their language included in these projects, which is quite tough in Mozilla as far as I know.

Thanks for telling us that our 50 complete languages are much more than
the ~20 complete languages of others. Now the argument of people not
being able to work on our stuff is out of the way finally. That doesn't
mean we don't need to support incomplete/partial L10n, in fact, we very
much have a need for that. But the argument that just because of PO
format other projects have lots more localizers and localizations
should finally be history now. Thanks for making that clear.


> 1. Translators prefer to work on PO format files

I am a translator, and I don't prefer those. I much more prefer
MozillaTranslator to KBabel, and much more prefer monolingual L10n files
to that bilingual waste of resources.


> 2. java .dtd & .properties files don't show the minor changes; we always have to compare the English files with our language files using the compare-locales.pl script before we start working, which I think is not perfect.

MozillaTranslator shows every single string change in a very nice way.
Of course, one has to compare using a tool there with using a tool here,
and they do exist on both sides ;-)

Sure, MT could be improved a lot, and I'm very thankful we now have
someone working on it again, as we now can hope for real improvements.

> 2. We could develop some tool to convert .dtd & .properties files to .po files & .po to .dtd & .properties; that may give translators the flexibility to choose either the .dtd & .properties or the .po format

That tool does exist already, see other posts.

Robert Kaiser

Robert Kaiser

Nov 16, 2006, 7:26:41 AM11/16/06
to
Damjan Georgievski schrieb:

> See the number of KDE translations? See the number of Gnome translations?

See other places in this thread, where we finally could clear that
issue. They apparently don't have more full localizations than us - and
as we only take full localizations into the tree at the moment
(something we can and should improve upon, but that's one of the reasons
for this thread) you can only compare full localizations.

> How much bigger KDE and Gnome are than Firefox/Thunderbird?

Not as much as you might think.

> How much smaller is the KDE+Gnome user community compared to the Firefox
> community.

Not much, actually, I'd bet they are even bigger when counting active
contributors.

> Now think... maybe... are you the only one not liking .PO framework?
> Think.

I'm sure I'm not alone. But anyway, I can't determine where the project
heads. I might stop working on German SeaMonkey if we change to
some crappy format like gettext/PO, but I'll think about that when it's
done.

> BTW What is the "current" model? No tools?

Wrong. Multiple tools, actually, ranging from MozillaTranslator via
mozptools to pootle, not to mention the source-level tools that ensure
top quality.

> Confusingly two (or more) file formats?

The two clearly defined formats we're using are not any more confusing
than the confusing PO format.

> (Not to speak of the Mozilla bureaucracy)

I can't tell, but I don't think that's much easier in other serious
projects.

Robert Kaiser

Robert Kaiser

Nov 16, 2006, 7:40:03 AM11/16/06
to
João Miguel Neves schrieb:

> I've never felt code integration to be an issue. Rushing away things,
> not getting approval for localizations in a timely fashion, having to
> wait almost days to test a localization are problems for me.

Now, I think we're getting somewhere. Much more than changes in the
code, we need improvements in our processes. That's what I've been
saying ever since this thread started.

Can you find out and give us some concrete steps that you'd need/want to
happen so that your cycles work better? Do we need some simple build
script that packs up an ab-CD.jar just with your localization, without
needing to (re)build a whole Firefox/Thunderbird? Then you could just
copy that into your Firefox installation and test it.
Would that help? Would requiring the "make" and "perl" tools to be
installed for that be an option? (We're currently using those in the
build process to pack the .jars, so we could reuse the same code.)
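
As a rough idea of how small such a packing step could be, here is a Python
sketch that zips a locale directory into an ab-CD.jar. It assumes the
directory already has the layout Firefox expects inside the jar; the real
build derives that layout from its manifests, which this sketch ignores.
The function name is hypothetical.

```python
import zipfile
from pathlib import Path


def pack_locale_jar(locale_dir, jar_path):
    """Zip every file under locale_dir into jar_path; return the member names."""
    root = Path(locale_dir)
    names = []
    with zipfile.ZipFile(jar_path, "w", zipfile.ZIP_DEFLATED) as jar:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Store paths relative to the locale directory, with
                # forward slashes as zip archives expect.
                name = path.relative_to(root).as_posix()
                jar.write(path, name)
                names.append(name)
    return names


# Example call (paths are illustrative only):
# pack_locale_jar("l10n-staging/ab-CD", "ab-CD.jar")
```

The output file should live outside the directory being packed, otherwise
the archive would try to include itself.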

This is just one idea for helping with that.

Of course, for those that don't have the needed tools installed, we
could trigger a rebuild from the source in CVS by accessing some web
page or the checkin itself.

Should we have some always-open CVS branch, and only merge approved
changes to the really used branch on approval, or something similar?
Actually, that could be done quite easily, and probably even now, but it
needs some CVS knowledge from someone in each team that uses that approach.

What other concrete steps we could take can you think of?

I think that's the way we really need to work on, much more than the
technical file format topic we always tend to stumble into.

Robert Kaiser

João Miguel Neves

Nov 16, 2006, 10:11:13 AM11/16/06
to Robert Kaiser, dev-...@lists.mozilla.org
Qui, 2006-11-16 às 13:40 +0100, Robert Kaiser escreveu:
> João Miguel Neves schrieb:

> > I've never felt code integration to be an issue. Rushing away things,
> > not getting approval for localizations in a timely fashion, having to
> > wait almost days to test a localization are problems for me.
>
> Can you find out and give us some concrete steps that you'd need/want to
> happen so that your cycles work better? Do we need some simple build
> script that packs up an ab-CD.jar just with your localization, without
> needing to (re)build a whole Firefox/Thunderbird? Then you could just
> copy that into your Firefox installation and test it.

Yes, that would help. I was able to do it for version 1.5 and that meant
we got up-to-date xpi files that normal people could use to give us
feedback.

> Would that help? Would requiring the "make" and "perl" tools to be
> installed for that be an option? (We're currently using those in the
> build process to pack the .jars, so we could reuse the same code.)
>

That would work for me. For others, a service to generate it from an
up-to-date localization would work, but if you provide me those tools, I
can provide that.

> This is just one idea for helping with that.
>
> Of course, for those that don't have the needed tools installed, we
> could trigger a rebuild from the source in CVS by accessing some web
> page or the checkin itself.
>

Yes, this would be great.

> Should we have some always-open CVS branch, and only merge approved
> changes to the really used branch on approval, or something similar?
> Actually, that could be done quite easily, and probably even now, but it
> needs some CVS knowledge from someone in each team that uses that approach.
>
> What other concrete steps we could take can you think of?
>
> I think that's the way we really need to work on, much more than the
> technical file format topic we always tend to stumble into.

I tried to start the discussion like that (separating content packs,
cleaning up the localization files). Nobody seemed to care... Could
someone make a description (You, Axel, Marek are probably the most
suited) of how to accomplish the following tasks (I think they're
missing from the docs) for non-programmers?

1) Find a particular string in a file (to answer a complaint, or to
correct a mistake). I think LXR is the answer, but not sure.
2) Create a build(or something else) that allows a localized app test.
3) How to commit a localization (a mini tutorial on CVS and branching
would help).

The separation of content packs would simplify a lot of the process, by
separating a lot of the needed authorizations from the localization.

A possible complete redoing of the process would be separating l10n
releases from the major. This could be done this way:

1) Create a group of tests for accepting a localization (Axel's work
for 2.0+litmus tests for localizations is a very good start).
2) Any localization that passes the tests can be released at that
point, mark it in CVS as FIREFOX_2.0.0.1_pt-PT, or something that allows
us to identify which release was that.
3) Give time in the normal process to have a number of localizations
ready for the .0 release (a real string freeze as early as possible -
working 6 hours after waiting 2 days after a string freeze and then
finding out more changed strings after doing the commit should not
happen - if the string freeze is delayed, even for a couple of hours,
l10n should be notified).

The measures above would reduce the bugzilla entries to bugs found by
users and the need for approvals for content packs. A localizer would
only need to create a single bug for a pair (version, product) and note
the results of the tests (aka sign-off). It would reduce Axel's job to
coaching, tutoring, managing sign-offs and team registrations. Axel, do
you think this reduction would be enough to manage 100 locales?

Localizations of older versions are useful and this way they could be
done. Requests for a pt-PT localization of Firefox 1.5 are still our
number one FAQ even after the release of 2.0.

A proposal for ending the file format flamewar: Assuming responsibility
for defining the file formats (not the base ones, the ones we use) and
for keeping them up to date would end the file format flamewar. A lot of us
use PO for several projects. A lot of us use tools like pootle or
services like wordforge or rosetta. Keeping moz2po working would simply
put mozilla projects as one more project for most of the teams. PO is
used to translate mozilla's apps, no matter how good other formats or
tools are available. Recognizing that and facilitating the translation
would cause less problems for everyone. Clearly defining the format
would help MT and the Translate Toolkit, which will be more than happy
to work on top of such a definition.

Best regards and going back to work,
João Miguel Neves



Damjan Georgievski

Nov 16, 2006, 12:08:06 PM11/16/06
to
>> How much bigger KDE and Gnome are than Firefox/Thunderbird?
>
> Not as much as you might think.

33,870 strings in Gnome core, 100% translated in Macedonian - without much
effort bar the translation itself... The same people that did this,
complain that the Mozilla process is confusing and hard, and don't
contribute at all to Mozilla products.

kdelibs + kdebase is around 20000 strings, 98% translated in Macedonian.
These people have the same comments about mozilla. (Total translated
strings is 65000 out of 103000).


--
damjan

Marek Stępień

Nov 16, 2006, 1:28:49 PM11/16/06
to
João Miguel Neves napisał(a):

> I tried to start the discussion like that (separating content packs,
> cleaning up the localization files). Nobody seemed to care... Could
> someone make a description (You, Axel, Marek are probably the most
> suited) of how to accomplish the following tasks (I think they're
> missing from the docs) for non-programmers?
>
> 1) Find a particular string in a file (to answer a complaint, or to
> correct a mistake). I think LXR is the answer, but not sure.

LXR is the answer if you want to find it through a web page.

Locally, a simple grep is usually enough:

cd l10n/ab-CD
grep "Some string" * -R

and often it's faster than LXR. :)

For missing/obsolete strings compare-locales usually tells you what is
missing or should be removed.
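
For readers who have not seen it, the kind of check compare-locales does
for missing and obsolete strings can be sketched in a few lines of Python.
This is a simplified illustration for .properties-style files only, not the
actual compare-locales script, and the function names are made up.

```python
def keys_of(properties_text):
    """Collect the string IDs from simple 'key = value' lines."""
    keys = set()
    for line in properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        keys.add(line.split("=", 1)[0].strip())
    return keys


def compare_locales(english, localized):
    """Return (missing, obsolete) string IDs for a localized file."""
    en, loc = keys_of(english), keys_of(localized)
    missing = sorted(en - loc)    # in en-US but not yet in the localization
    obsolete = sorted(loc - en)   # still in the localization but gone from en-US
    return missing, obsolete


if __name__ == "__main__":
    en_us = "open = Open\nsave = Save\n"
    ab_cd = "open = Abrir\nprint = Imprimir\n"
    print(compare_locales(en_us, ab_cd))
```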

> 2) Create a build(or something else) that allows a localized app test.

I don't understand what you mean here. Any build allows testing the
localization, if only you install the xpi. (Which is a pain to generate
currently, as it wants almost all of the build process, but that's
another problem)

If you mean how to build a localized app from scratch, there should
be a doc on wiki or devmo on that. It's just a normal build with the
"--enable-ui-locale=ab-CD" option.

> 3) How to commit a localization (a mini tutorial on CVS and branching
> would help).

There is a tutorial here:
http://developer.mozilla.org/en/docs/Mozilla_Source_Code_Via_CVS

That's a general CVS tutorial, not a l10n-oriented one, though.

There are two CVS-related pages for localizers:
http://wiki.mozilla.org/L10n:Updating_Localizations_in_CVS
http://wiki.mozilla.org/L10n:Localizing_Using_CVS

but they badly need an update and are probably just too confusing. :(

Robert Kaiser

Nov 16, 2006, 6:44:32 PM11/16/06
to
João Miguel Neves schrieb:

> The separation of content packs would simplify a lot of the process, by
> separating a lot of the needed authorizations from the localization.

I'm not sure if that would accomplish what you think it would...
Actually, it even depends on what you think those magic "content packs"
are...

Robert Kaiser

João Miguel Neves

Nov 16, 2006, 7:16:19 PM11/16/06
to Robert Kaiser, dev-...@lists.mozilla.org

From my previous email:

c) Separate "content packs" that need a separate authorization process
(searchplugins, region.properties changes, feeds, feed readers).


Axel Hecht

Nov 16, 2006, 8:55:41 PM11/16/06
to
Dwayne Bailey wrote:
> This thread seemed to go all technical and I wanted to bring it back to
> less technical level:
>
> Caveat: I work on the WordForge project which programs on the Translate
> Toolkit and Pootle. But then I've also managed 11 languages across:
> OpenOffice.org, Mozilla, GNOME and KDE... so I must know something.
>
> I think let's restate some things....
>
> Mozilla wants to do more localisation. The question is why? Is it
> important? I assume it is for Mozilla. OK so let's all assume it's
> important. If it's important then this needs to receive some real and
> critical support. And then Mozilla needs to actually _listen_ to people
> with real and broad localisation experience.

Let me jump in here and restate that the work you submitted on the 1.8
branch was just plain unacceptable, so there must be something that you
did wrong.
Note, I did catch that, very late, and had to bring that up. And I did
take the heat internally from the product team at Mozilla for those
locales just not being up to snuff without any warning signs before.

> Can we assume we're doing well now? Clearly not since others are doing
> much much better for software that is used by many many fewer people.

Maybe that software is used by very few people for a reason? Maybe
Mozilla does something right in terms of processes etc which enables us
to produce software that is actually attractive enough to tens of
millions of users around the globe to be used on a regular basis?

> Are the people we want to ask even here? I don't think so; the people
> who find it too difficult haven't even started, and the people who have
> experience on projects with better process are probably sitting in other
> projects. The people sitting on this list have already overcome and
> forgotten their newbie problems or have become so myopic that they
> cannot see beyond their technical solution to rethink it from scratch
> and examine their own assumptions and prejudices. And the people who
> have experience get shot down around technical debates.
>
> Why aren't we doing well? Here are my thoughts
>
> * Mozilla acts as an island. We have not learnt anything from other
> localisation projects. There is a real NIMBY attitude.

Whatever a NIMBY would be.

> * The process is driven by technical people not localisers. Notice how
> quick this thread became a debate about Gettext and PO. I saw only one
> mail that addressed the problems someone has encountered in localising.

Please note that the other half of this thread is actually in .planning,
where this discussion was supposed to happen. Sadly, the mailing list
part of the newsgroup didn't catch that.

> * It takes too much time. Too much QA is manual. Why must people wait
> for tinderbox. Why are people having to spend so much time on lists
> instead of translating and improving translations.
>
> * We are too point release focused. There is no space to release and
> update translations quickly. No space for release early. No space for
> localisers to get a quality improvement pack out.

That's not true.

> * QA is technical QA. All QA is about Trademarks, broken UI stuff,
> search and English text. All that is wasted time and should be solved
> by other means.
>
> * There is no language QA. I'm not sure many teams do real QA, ie
> check their translations, and they certainly don't have tools that can
> help them to do this well (unless of course they use PO) then they're in
> a slightly better position. How many teams have a glossary, use TM? My
> guess close to none, indicating we're on the first rung of good l10n.

Says who? You? See above. Sorry for that, but there's a line to cross,
and you're on the wrong side.
That's pretty rude towards all the communities around the world that
managed to get together great localizations, in particular,
localizations that are much more successful in their regions than the
en-US one is in the US.
And we're in the process to get native language speakers outside of the
communities to do some testing, too. We're not aggressively spreading
that data, as we're currently evaluating how that feedback is coming in
and how to present it. Nevertheless, there are lots of QA efforts on the
localization going on, and being opensource and popular is bringing us
up to great speed here.

> * Firefox is big. 35,000 words (including toolkit/), OOo is 80,000.
> If you have to translate something that big before you see it. Man that
> is discouraging. A professional translator working full-time would do
> that in 35 working days and that would exclude reviews. 7 weeks!
> Almost 2 months with no reward.
>
> * Localisation should not break Firefox. If it does we're doing
> something wrong. We should be catching all of those things before we
> make an XPI.
>

Yes to the two above, and we're working on it. Benjamin Smedberg has
some ideas, I have some, Rob Helmer and preed are interested in helping,
too. Stuff on the map are fallback strings for builds, though those will
definitely not be releases, those would be just for testing purposes.
And for making l10n on the trunk somewhat feasible. Other items include,
probably more short-term, speeding up the l10n build process to get the
development cycle down. Part of this is likely a rewrite of the
repackaging in python (only), which would make it easier for localizers
to build themselves.

> Solutions:
>
> * Use the Translate Toolkit dammit :). Even just to publish official POT
> files. Even allow people to commit PO files to Mozilla l10n/ and
> automatically convert those files to .dtd and friends. We thus open
> Mozilla to the world of existing FOSS localisers without changing the
> current process for other people. Quick win, no arguments about formats
> or anything.

Thanks, but no thanks. I've seen too many bugs as result of this.

> * Reduce the channels we need to follow to find information. My
> suggestion, a simple blog that focuses only on announcing deadlines,
> requirements and points back to the wiki if there is a need for more
> detail.

As I have been saying elsewhere, this newsgroup is the place to be.
Sadly, whenever I ask for feedback on a particular draft on either devmo
or wiki, there is sudden silence, which makes it hard to finish pages
like the ownership page I already linked to without totally going despot.

> * Make it possible to easily build .xpi and installable builds for
> testing. Waiting for tinderbox and nightly builds is silly. Even
> better, as soon as I commit stuff to my l10n/ dir Mozilla builds an XPI
> for me for testing. Better yet an XPI that knows how to upgrade so that
> anyone using a testing XPI will get it and can use it magically.

Yep.

> * Use Pootle. This will allow very raw newbies to begin translating
> using a simple web interface. This would even allow the registration
> process to happen in parallel. Would allow community bug fix
> contributions and encourage new users. It even allows a Help ->
> Translate option on the browser so that people can localise.

Pootle is just an editor like any other. Web interfaces only work for
some. It seems to me that you're proposing a change in process, which,
see above, may not result in a quality that we're aiming for.

That said, I have seen discussions going actively in the summit here in
MV about using a more P2P-focused RCS, which might move the pain in l10n
from one point to another.

> * Realise that most good translators will actually not use MT or even a
> PO Editor for that matter, since that would be the same as insisting
> that all people who code on Firefox should use vim. Have any of the
> more technical people thought about how your choices if forced on you by
> others is the equivalent of the editor wars? Our best translators use
> commercial tools like Wordfast. If we're serious about language QA then
> we need to use real l10n tools.

Sad for KaiRo and Ricardo, they apparently suck. Though last I checked,
they did translate the preference dialog.

> * Use pofilter in the toolkit (this does mean you need PO files, but I'm
> sure Mozilla can be adapted). This picks up translated variables,
> missing escapes, etc., over 41 checks. As a localiser, if you're not using
> this I'm not so sure you care as much about quality as you say you do.
> We use it and are pretty confident we don't have any technically induced
> translation problems, i.e. our XPIs aren't going to break things. The aim
> here is to make sure we can never build a broken XPI because of problems
> in the translations.

None of those tests have anything to do with po. That's just the format
that you wrote them in. Your choice.

> * Create a system to properly mark entries in .dtd and .properties files
> so that we can automate checks on things that should not be translated.
> If we ever see a translator translate a config variable then we must be
> honest that the problem is with us not them. If 100 people have to ask
> should 'true' be translated, sigh, then we'll never get it. My ideal
> solution but probably harder to implement is to move all config
> information out of these files into a config file. Thus only things
> that must be translated are in the DTD or .properties.
>
> * Create more tears with well defined goals. This would allow us to
> define an newbie tear where a team can release anything at any time.
> It's never officially released but is available for the brave and for
> testing.

This is likely going to happen, in one way or another.

> * Direct translators. Since it is so big what should people do first so
> that they see quick results. This would also allow us to define better
> requirements for completeness for a release. So even with someone on
> 80% we'd know better if that will still make a good UI experience.
>
> One last thought. Mozilla Corp through Axel has asked a question. If
> you don't like the answers is that because the answer is wrong or
> because you see the world through your own tinted spectacles. Every
> person who raised issues that made Mozilla l10n hard for them has raised
> a valid point, think about that before dismissing ideas.

Sure. Not that a "do it my way, dammit" is gonna work. We're a
successful project, with lots of priorities. And we're not going to drop
the mindshare of a few hundred people if we can avoid it.

And surely we're not going to do so for yet another crappy compromise,
like PO in CVS.

Dropping source l10n completely seems totally out of the question for
me; if others disagree, I'd like to see rationales.

Axel

> On Thu, 2006-11-02 at 17:36 +0100, Axel Hecht wrote:
>> Hi all,
>>
>> we'd like to collect some data and idea on how to get from the 50
>> locales and releasing 40 to a hundred.
>>
>> Clearly, everybody feels the load of releasing a Mozilla product in 40
>> languages, and while we got the job done, it's about time to look ahead
>> and point at the places where going further leads to trouble.
>>
>> So to start the discussion, it'd be a good idea to collect how much one
>> locale of the 40 currently 'costs'. How much time is spent on a locale
>> day-to-day (tinderbox build times come to mind), and how much man power
>> and computing resources do we need for a release? How much for an
>> update, how much for a major new release? I would hope to get
>> guestimates or number from folks like localizers, drivers, build, QA, BD
>> and product.
>>
>> In parallel, we should collect ideas on how to make l10n easier, and
>> improve the quality over all. I have some, but I'll keep those for a
>> follow up.
>>
>> This discussion should include options on the goal and expectations to
>> be set on the beast at large, too.
>>
>> Axel

Axel Hecht

Nov 16, 2006, 9:06:28 PM
to
João Miguel Neves wrote:
> This is a very long post. It includes rants, data, concerns a
> translation process description and a couple of other things.
>
> You've been warned.
>

Wasn't that bad.

I'd answer with a "yes". The build cycle just sucks, and we're going to
fix it. Details elsewhere in the thread. Even with an hourly, 5 hours
for a Windows build just doesn't cut it.

I have been doing too much dot-release work lately, and I'll get that
part of my job on a rotation. That should leave more time to actually do
something constructive, like fixing build, or registering locales.

I also got some priorities redone for my work, and we're getting better
at not burning that much core-team time and effort on dot-releases in
general. I also did some l10n-friendly-coding evangelism at the summit
here which should help in not running into situations where we have to
fix the l10n-issues of features post-mortem.

Axel

Novica Nakov

Nov 17, 2006, 1:38:44 AM
to

> Thanks for telling us that our 50 complete languages are much more than
> the ~20 complete languages of others. Now the argument of people not
> being able to work on our stuff is out of the way finally. That doesn't
> mean we don't need to support incomplete/partial L10n, in fact, we very
> much have a need for that. But the argument that just because of PO
> format other projects have lots more localizers and localizations
> should finally be history now. Thanks for making that clear.

Surely, the 50 complete languages for Firefox are not completely
complete. I left maybe 100 or 200 strings in English which cannot be
translated; maybe others have done the same. But I don't think there is
any way Mozilla can check how many strings are actually left in the original.
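
That said, a rough count is at least mechanically possible for
.properties files. The following is only a minimal sketch, not an
existing Mozilla tool: the function names are made up for illustration,
the parser handles plain key=value lines only (no line continuations or
escapes), and strings that are legitimately identical in both languages
(brand names, values like "true") would be counted as untranslated.

```python
def parse_properties(text):
    """Parse simple key=value lines; skips blank lines and # or ! comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            entries[key.strip()] = value.strip()
    return entries


def untranslated_keys(en_text, loc_text):
    """Keys that are missing from the locale or still equal to the English value."""
    en = parse_properties(en_text)
    loc = parse_properties(loc_text)
    # loc.get(k, v) falls back to the English value, so missing keys count too.
    return sorted(k for k, v in en.items() if loc.get(k, v) == v)


def completeness(en_text, loc_text):
    """Percentage of English keys whose locale value differs from English."""
    en = parse_properties(en_text)
    if not en:
        return 100.0
    left = len(untranslated_keys(en_text, loc_text))
    return 100.0 * (len(en) - left) / len(en)
```

A figure like this could only ever be an upper bound on what is left in
English, which is why it would need a list of known exceptions before
anyone used it as a release requirement.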

The argument of PO format is that past practices show that more people
join in because of different reasons around it. If the current majority
of translators on the Internet finds it easier or simpler to use PO,
it's much easier to change the format than to convince people that PO
isn't that great for whatever reasons.

--
Novica

Novica Nakov

Nov 17, 2006, 1:58:06 AM
to

> Wrong. Multiple tools, actually, ranging from MozillaTranslator via
> mozptools to pootle, not to mention the source-level tools that ensure
> top quality.

But none provided by Mozilla corp.

> The two clearly defined formats we're using are not any more confusing
> than the confusing PO format.

The rest of the l10n world doesn't seem to share that opinion.

>> (Not to speak of the Mozilla bureaucracy)
>
> I can't tell, but I don't think that's much easier in other serious
> projects.

It's funny how every now and then we have this kind of a discussion on
the list. Someone from Mozilla asks what should be done to make things
better and people start talking: PO, gettext, difficult, etc, etc...

And then we receive an answer: well, you should put in more effort, Mozilla
tools are great, java files are great, the process allows this and that...

And if this is the opinion, why ask questions at all?
Just wait, and 100 complete locales will show up.

:)

--
Novica

Dwayne Bailey

Nov 17, 2006, 2:26:22 AM
to dev-...@lists.mozilla.org
I was going to give up after the first paragraph and let Mozilla go and
reinvent the wheel. I'm sorry you take things so personally. But after
seeing Erdal's response, who I respect as a good cross-cutting
localiser, I realised I couldn't be all wrong and I'd just trundle
through the personal stuff.

On Thu, 2006-11-16 at 17:55 -0800, Axel Hecht wrote:
> Dwayne Bailey wrote:
> > This thread seemed to go all technical and I wanted to bring it back to
> > less technical level:
> >
> > Caveat: I work on the WordForge project which programs on the Translate
> > Toolkit and Pootle. But then I've also managed 11 languages across:
> > OpenOffice.org, Mozilla, GNOME and KDE... so I must know something.
> >
> > I think let's restate some things....
> >
> > Mozilla wants to do more localisation. The question is why? Is it
> > important? I assume it is for Mozilla. OK so let's all assume it's
> > important. If it's important then this needs to receive some real and
> > critical support. And then Mozilla needs to actually _listen_ to people
> > with real and broad localisation experience.
>
> Let me jump in here and restate that the work you submitted on the 1.8
> branch was just plain unacceptable, so there must be something that you
> did wrong.
> Note, I did catch that, very late, and had to bring that up. And I did
> take the heat internally from the product team at Mozilla for those
> locales just not being up to snuff without any warning signs before.

I'll avoid justifying anything. But has anyone wondered why we have no
way of quickly determining what a satisfactory localised UI experience
is? Even Microsoft's LIP program only aims at 80% of the programs that
people most use.

I'm sorry you took the flak - you helped us greatly to get it into CVS.
That was in fact my main goal. Then to see how good the experience
would be.

But why is that not automatically picked up? KDE, GNOME etc. all have
gradations and minimum requirements. We should be able to define a
minimal acceptable l10n percentage. That doesn't happen, so people aim at
this illusory 100%.

> > Can we assume we're doing well now? Clearly not since others are doing
> > much much better for software that is used by many many fewer people.
>
> Maybe that software is used by very few people for a reason? Maybe
> Mozilla does something right in terms of processes etc which enables us
> to produce software that is actually attractive enough to tens of
> millions of users around the globe to be used on a regular basis?

Or have you ever considered that software used by so many people should
already be localised into 200 languages since, as you say, the others are
used by far fewer? We would expect these millions of users to breed a
few thousand localisers, and those in turn to breed some very, very
competent localisers.

So no, I don't agree with your argument, although I agree it's a point we
need to consider.

> > Are the people we want to ask even here? I don't think so, the people
> > who find it too difficult haven't even started, and the people who have
> > experience on projects with better process are probably sitting in other
> > projects. The people sitting on this list have already overcome and
> > forgotten their newbie problems or have become so myopic that they
> > cannot see beyond their technical solution to rethink it from scratch
> > and examine their own assumptions and prejudices. And the people who
> > have experience get shot down around technical debates.
> >
> > Why aren't we doing well? Here are my thoughts
> >
> > * Mozilla acts as an island. We have not learnt anything from other
> > localisation projects. There is a real NIMBY attitude.
>
> Whatever a NIMBY would be.

Sorry wrong expression. I meant 'not made here'. When I talked to
localisers at FOSDEM I was shocked at what they had to do to achieve
simple things. They had no idea that other localisation projects had no
similar problems.

Yes I say that. There are always teams that will do well. But the
title says it all: 100, not 'let's pat the good guys on the back and hope
everything else magically happens'. There is no line, you asked for my
input and I'm giving it. You can choose not to listen if you like.

How would you, as in someone who is not a native speaker - not as in
Axel, even know what a great localisation is? You have absolutely no
way of knowing as this is not code that you can read. You have ZERO way
of telling whether the translations are good, whether they are
consistent, whether they conform to the team glossary, whether there are
spelling mistakes or whether they are stylistically correct for the
language.

We cannot pretend that 100% localised UI = good localisation.

> And we're in the process to get native language speakers outside of the
> communities to do some testing, too. We're not aggressively spreading
> that data, as we're currently evaluating how that feedback is coming in
> and how to present it. Nevertheless, there are lots of QA efforts on the
> localization going on, and being opensource and popular is bringing us
> up to great speed here.

So are you going to hire across 100 locales? I doubt it, so I'm not
concerned about that intervention as it probably only affects tier 1
localisation. I'd rather see teams equipped to do this job themselves
and for their community to give easy feedback. No, Bugzilla is not easy
feedback.

> > * Firefox is big. 35,000 words (including toolkit/), OOo is 80,000.
> > If you have to translate something that big before you see it. Man that
> > is discouraging. A professional translator working full-time would do
> > that in 35 working days and that would exclude reviews. 7 weeks!
> > Almost 2 months with no reward.
> >
> > * Localisation should not break Firefox. If it does we're doing
> > something wrong. We should be catching all of those things before we
> > make an XPI.
> >
>
> Yes to the two above, and we're working on it. Benjamin Smedberg has
> some ideas, I have some, Rob Helmer and preed are interested in helping,
> too. Stuff on the map are fallback strings for builds, though those will
> definitely not be releases, those would be just for testing purposes.
> And for making l10n on the trunk somewhat feasible. Other items include,
> probably more short-term, speeding up the l10n build process to get the
> development cycle down. Part of this is likely a rewrite of the
> repackaging in python (only), which would make it easier for localizers
> to build themselves.

All I can say is great. This would probably be the biggest step towards
getting people started and maintaining their enthusiasm.

Ironically though we've already achieved all of this with our PO tools,
PO checker and some hacks to the build scripts.

> > Solutions:
> >
> > * Use the Translate Toolkit damit :). Even just to publish official POT
> > files. Even allow people to commit PO files to Mozilla l10n/ and
> > automatically convert those files to .dtd and friends. We thus open
> > Mozilla to the world of existing FOSS localisers without changing the
> > current process for other people. Quick win, no arguments about formats
> > or anything.
>
> Thanks, but no thanks. I've seen too many bugs as a result of this.

And those bugs would be... hard for people to fix? Easily trapped?
Please be specific, instead of disparaging a tool.

Then again I honestly thought, silly me, that this was a thread to help
localisers and get their input. But amazing how in one line the door is
shut and the champion for all us localisers is essentially saying, bugger
off and do things my way. Sorry, you wanted our input and when we give
it we just get door-closing rubbish.

> > * Reduce the channels we need to follow to find information. My
> > suggestion, a simple blog that focuses only on announcing deadlines,
> > requirements and points back to the wiki if there is a need for more
> > detail.
>
> As I have been saying elsewhere, this newsgroup is the place to be.
> Sadly, whenever I ask for feedback on a particular draft on either devmo
> or wiki, there is sudden silence, which makes it hard to finish pages
> like the ownership page I already linked to without totally going despot.

Let me restate this one. The idea of a blog is that there is less
noise. This newsgroup is noisy, so noisy that I can't keep up and thus
miss important announcements. There is no simple place to see what
exactly the rules are now. I don't care about rules shifting, I just
want to know what they are now, what the deadlines are, etc.

In terms of the wiki, I must be honest: I find it so hard to find info on
it, as I get confused all the time and I don't have loads of time to
unravel it.

> > * Make it possible to easily build .xpi and installable builds for
> > testing. Waiting for tinderbox and nightly builds is silly. Even
> > better, as soon as I commit stuff to my l10n/ dir Mozilla builds an XPI
> > for me for testing. Better yet an XPI that knows how to upgrade so that
> > anyone using a testing XPI will get it and can use it magically.
>
> Yep.
>
> > * Use Pootle. This will allow very raw newbies to begin translating
> > using a simple web interface. This would even allow the registration
> > process to happen in parallel. Would allow community bug fix
> > contributions and encourage new users. It even allows a Help ->
> > Translate option on the browser so that people can localise.
>
> Pootle is just an editor like any other. Web interfaces only work for
> some. It seems to me that you're proposing a change in process, which,
> see above, may not result in a quality that we're aiming for.

No, not really. Pootle also allows goals, hosting by Mozilla, user
suggestions, will eventually manage process tracking, does TM and
manages glossaries. So no, it's not just another editor.

> That said, I have seen discussions going actively in the summit here in
> MV about using a more P2P-focused RCS, which might move the pain in l10n
> from one point to another.
>
> > * Realise that most good translators will actually not use MT or even a
> > PO Editor for that matter, since that would be the same as insisting
> > that all people who code on Firefox should use vim. Have any of the
> > more technical people thought about how your choices if forced on you by
> > others is the equivalent of the editor wars? Our best translators use
> > commercial tools like Wordfast. If we're serious about language QA then
> > we need to use real l10n tools.
>
> Sad for KaiRo and Ricardo, they apparently suck. Though last I checked,
> they did translate the preference dialog.

Instead of attacking me how about actually addressing the issue raised.

Let me restate it as perhaps you didn't get it. Most translators are
not cross-skilled. Giving the example of 2 people is silly; we want
100s of people localising. In my environment the best localisers are
professional localisers, and I can almost guarantee you that holds in all
our other languages. How many people localising Mozilla are professional
localisers? Asking someone who wants to give a little input to abandon
all their skills and tools to fit into our approach means you get
nobody. That is one reason why Pootle is actually a good idea: we get
localisers contributing over lunch from the office, and these are
professional translators who translate for parliament and for the
police.

> > * Use pofilter in the toolkit (this does mean you need PO files, but I'm
> > sure Mozilla can be adapted). This picks up translated variables,
> > missing escapes, etc., over 41 checks. As a localiser, if you're not using
> > this I'm not so sure you care as much about quality as you say you do.
> > We use it and are pretty confident we don't have any technically induced
> > translation problems, i.e. our XPIs aren't going to break things. The aim
> > here is to make sure we can never build a broken XPI because of problems
> > in the translations.
>
> None of those tests have anything to do with po. That's just the format
> that you wrote them in. Your choice.

And... exactly, we're adapting them to work with XLIFF. So there is no
reason for anyone not to adapt them to work against .properties and .dtd
files.

The fact is that the tool exists and that no one needs to reinvent the
wheel, just adapt it. Or use it against PO files; I know one person who
does that and then edits the native files. The issue is that without
such tools your QA is, well, not QA.
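
To make that concrete: a check for dropped or translated variables needs
nothing from PO, only a source string and a target string. The sketch
below is not pofilter's code, just a minimal illustration of the idea.
The regular expression and the function names are my own assumptions; it
covers printf-style placeholders such as %S and %1$S as used in
.properties files and entity references such as &brandShortName; as used
in .dtd files.

```python
import re
from collections import Counter

# printf-style placeholders (%S, %1$S, %d) and DTD entity references (&name;).
# A bare "%" followed by a letter in ordinary text would be a false positive.
PLACEHOLDER = re.compile(r"%(?:\d+\$)?[A-Za-z]|&[A-Za-z][\w.-]*;")


def placeholders(text):
    """Count each placeholder occurring in a string."""
    return Counter(PLACEHOLDER.findall(text))


def check_placeholders(source, target):
    """Return (missing, extra) placeholders of target relative to source."""
    src, tgt = placeholders(source), placeholders(target)
    missing = sorted((src - tgt).elements())
    extra = sorted((tgt - src).elements())
    return missing, extra


# Reordering is fine, translating a variable name is not.
print(check_placeholders("Downloading %1$S of %2$S", "%2$S de %1$S"))
print(check_placeholders("Welcome to &brandShortName;", "Bienvenue dans &nomCourt;"))
```

Because the function only sees pairs of strings, the same check could be
fed from PO, XLIFF, .properties or .dtd readers alike.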

I'll always sing my way of doing it. Simply because I see localisers
struggling with issues that we have never ever had to deal with. And I
see them not achieving their best possible because all the localisation
decisions are led from a technical "we won't budge" point of view.

Unfortunately with this approach there is the reality that you won't
gain the mindshare of the 1000s of other people who are already active
and you just need to harness them.

That's why I started my reply by raising the strategic issues. And I
think those are either not clearly defined or are being ignored as from
where I sit Mozilla cherry picks ideas - ones that I now realise are all
relatively simple and need coding which makes them cool. Yet
steadfastly refuses to look at solutions that make localisation easier
and with higher quality.

> And surely we're not going to do so for yet another crappy compromise,
> like PO in CVS.
>
> Dropping source l10n completely seems totally out of the question for
> me; if others disagree, I'd like to see rationales.

Ah yes, always open to new ideas. The request for PO in CVS is simple.
It gives teams the option of using PO and the tools that they can use.
It gives Mozilla the option to host Pootle and not have to see all
their teams move over to Rosetta where they will have even bigger
issues.

It makes the localisations stay within the framework of the Mozilla
project.

There was no request, well certainly not from me, to move away from
source l10n. My request is simple: make PO an accepted localisation
medium in CVS. Enhance the tools to automate the po2moz roundtrip
and make teams themselves responsible for the .properties and .dtd's
that they produce.

That would be a first step, see how it goes, you lose nothing. We as in
people who now use PO gain a lot and you also gain the potential new
contributors.

Cédric Corazza

Nov 17, 2006, 4:04:32 AM
to
Axel Hecht wrote:

> Wasn't that bad.
>
> I'd answer with a "yes". The build cycle just sucks, and we're going to
> fix it. Details elsewhere in the thread. Even with an hourly, 5 hours
> for a Windows build just doesn't cut it.

About the build cycles: why compile l10n locales again and again when
no changes have happened to the l10n files of these locales since the last
build? Maybe a tag which says "no changes happened, no compiling
necessary, moving on to the other locales" would be worthwhile. This might
save many cycles I guess. But maybe it's not trivial to implement.
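
To illustrate what I mean, here is a rough sketch, not how tinderbox
actually works. It assumes one directory per locale under a common l10n
root and a stored dictionary of fingerprints from the previous run; the
function names are hypothetical. It also ignores changes on the en-US
side, which would of course force a rebuild of every locale.

```python
import hashlib
import os


def locale_fingerprint(locale_dir):
    """Hash the relative paths and contents of all files under a locale dir."""
    digest = hashlib.sha1()
    for root, dirs, files in os.walk(locale_dir):
        dirs.sort()  # make the traversal order deterministic
        for name in sorted(files):
            path = os.path.join(root, name)
            digest.update(os.path.relpath(path, locale_dir).encode("utf-8"))
            with open(path, "rb") as fh:
                digest.update(fh.read())
    return digest.hexdigest()


def locales_to_rebuild(l10n_root, previous):
    """Return (changed locales, current fingerprints) given the previous state."""
    current = {}
    changed = []
    for locale in sorted(os.listdir(l10n_root)):
        path = os.path.join(l10n_root, locale)
        if not os.path.isdir(path):
            continue
        current[locale] = locale_fingerprint(path)
        if previous.get(locale) != current[locale]:
            changed.append(locale)
    return changed, current
```

The build would then loop over the changed locales only and save the
returned fingerprints for the next cycle.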

Dwayne Bailey

Nov 17, 2006, 4:22:36 AM
to dev-...@lists.mozilla.org
On Thu, 2006-11-16 at 13:20 +0100, Robert Kaiser wrote:
> Ankit Patel wrote:
> > From: Robert Kaiser <ka...@kairo.at>
> >> João Miguel Neves wrote:
> >>> * I know of no project that uses other format with more than 50 or 100
> >>> locales.
> >
> >> Is there any project using PO format that has 50 *fully complete*
> >> localizations, like Firefox has now with our supposedly inferior formats?
> >
> > Open source projects like GNOME, KDE, Xfce, Fedora, etc. are using the PO format for localization support, and I think in these projects there will be at least 20 locales which have *fully complete* localization, at least 50 locales which have 80% localization, and definitely 10 locales which have 50% localization. I agree that there is no point in using a 50% localized application, but at least translators all over the world are encouraged to start their work, as they find it easy to get their language included in these projects, which is quite tough in Mozilla as far as I know.
>
> Thanks for telling us that our 50 complete languages are much more than
> the ~20 complete languages of others. Now the argument of people not
> being able to work on our stuff is out of the way finally. That doesn't
> mean we don't need to support incomplete/partial L10n, in fact, we very
> much have a need for that. But the argument that just because of PO
> format other projects have lots more localizers and localizations
> should finally be history now. Thanks for making that clear.

Of course you'd have to compare apples with apples, since KDE is way
bigger than Mozilla. Plus many of these people work across projects and
so are contributing not only to, for instance, KDE but to other projects.
So unfortunately it does not put it to bed. Those people can do that
because they use one tool, not a new tool for each project that they
localise.

> > 1. Translators prefer to work on PO format files
>
> I am a translator, and I don't prefer those. I much more prefer
> MozillaTranslator to KBabel, and much more prefer monolingual L10n files
> to that bilingual waste of resources.

And I know many translators who simply use Word. Translators who
program are a small and rare breed. Don't confuse your preferences with
what is needed to ramp l10n up to 100 locales.

> > 2. Java .dtd & .properties files don't show the minor changes; we always have to compare the English files with our language files using compare-locales.pl and then start working, which I think is not perfect.
>
> MozillaTranslator shows every single string change in a very nice way.
> Of course, one has to compare using a tool there with using a tool here,
> and they do exist on both sides ;-)
>
> Sure, MT could be improved a lot, and I'm very thankful we now have
> someone working on it again, as we now can hope for real improvements.

There is nothing wrong with MT per se. Except that it is an editor for
one task. It is not going to inherit the good work that others have done
and are doing on other localisation tools. I'd rather see someone put
effort into a tool that doesn't just do Mozilla.

> > 2. We could develop some tool to convert .dtd & .properties files to .po files & .po to .dtd & .properties, which may give translators the flexibility to choose either .dtd & .properties or .po format
>
> That tool does exist already, see other posts.
>
> Robert Kaiser


Gervase Markham

Nov 17, 2006, 5:32:19 AM
to
Novica Nakov wrote:
>
>> Wrong. Multiple tools, actually, ranging from MozillaTranslator via
>> mozptools to pootle, not to mention the source-level tools that ensure
>> top quality.
>
> But none provided by Mozilla corp.

Perhaps because the Mozilla Corporation doesn't need to provide tools,
because they already exist?

> It's funny how every now and then we have this kind of a discussion on
> the list. Someone from Mozilla asks what should be done to make things
> better and people start talking: PO, gettext, difficult, etc, etc...

If someone wants to translate Mozilla using .po, they should investigate
Pootle, which (I believe) has two-way translation tools between our
formats and .po.

Gerv

Gervase Markham

Nov 17, 2006, 5:34:07 AM
to João Miguel Neves, Axel Hecht, dev-...@lists.mozilla.org
João Miguel Neves wrote:
> It isn't. Worse, it can't be bugfree or pitfall-free. For the simple
> reason that mozilla file-formats are: 1) not documented and 2)
> ever-changing.

That's so not true. The properties file format is, I believe, defined by
the Java standard, and the DTD format by an XML-related standard.
Neither of them is defined by us, both are documented, and neither changes.

Gerv

João Miguel Neves

Nov 17, 2006, 5:42:19 AM
to Gervase Markham, Axel Hecht, dev-...@lists.mozilla.org

From a previous email in this discussion (yes, someone else said
that before - Marek - and I replied):

Both those are just the base formats, not the localization formats. If
you just use that, you lose critical information, like "don't translate"
comments or the meaning of the arguments. You also lose all the
conventions for accelerators (.key vs .accesskey vs .commandKey). This
information is critical to create a working conversion of a
localization.
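
As an illustration of the kind of convention that lives on top of the
base format, here is a minimal sketch of an access key check for .dtd
files. It is not an existing tool. It assumes entities are paired as
foo.label / foo.accesskey, which is only one of the naming patterns in
use, and the regular expression handles simple double-quoted entity
values only.

```python
import re

# Matches simple declarations of the form <!ENTITY name "value">.
ENTITY = re.compile(r'<!ENTITY\s+([\w.-]+)\s+"([^"]*)"\s*>')


def parse_dtd(text):
    """Return a dict of entity name -> value for simple ENTITY declarations."""
    return dict(ENTITY.findall(text))


def accesskey_problems(entities):
    """List .accesskey entities whose key is not a single letter of the label."""
    problems = []
    for name, key in sorted(entities.items()):
        if not name.endswith(".accesskey"):
            continue
        label = entities.get(name[: -len(".accesskey")] + ".label")
        if label is None:
            continue  # no matching label under the assumed naming pattern
        if len(key) != 1 or key.lower() not in label.lower():
            problems.append(name)
    return problems
```

A converter that knows nothing about this pairing cannot run such a
check, and cannot warn a translator who changes the label but keeps the
English access key. Hits from it are things to review rather than hard
errors.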


Axel Hecht

Nov 17, 2006, 5:46:39 AM
to

On the list. It's not a flag, it's called "checking your dependencies".
We're just working around conflicts between make and jar packaging here,
and doing that right should fix that bug.

Axel

Gervase Markham

Nov 17, 2006, 5:46:15 AM
to
Dwayne Bailey wrote:
> * Create more tears with well defined goals. This would allow us to
> define an newbie tear where a team can release anything at any time.
> Its never officially released but is available for the brave and for
> testing.

Not to be rude, but because it took me about five minutes to work out
what you were saying here, I just thought I'd mention: it's spelt
"tier". A tear is either something which comes out of your eye when you
are sad, or a rip in a piece of material. :-)

Gerv

Gervase Markham

Nov 17, 2006, 5:52:40 AM
to
Dwayne Bailey wrote:
> Are the people we want to ask even here? I don't think so, the people
> who find it too difficult haven't even started, and the people who have
> experience on projects with better process are probably sitting in other
> projects.

I think this is an important point. To work out how to go from 50 to
100, we need to find the people who tried localising our stuff and gave
up, or who never started, because these are the people we want to
comprise the next 50 teams.

So asking existing teams "what needs to be easier" may well get you the
wrong answers, because they will be focussed on incremental improvements
to a process they have learned to live with.

> * Mozilla acts as an island. We have not learnt anything from other
> localisation projects. There is a real NIMBY attitude.

He means NIH :-)

> * Use the Translate Toolkit damit :). Even just to publish official POT
> files. Even allow people to commit PO files to Mozilla l10n/ and
> automatically convert those files to .dtd and friends.

Axel: you seem to be very set against this idea, but I haven't seen a
message where you explain why. Could you point me to one?

> * Reduce the channels we need to follow to find information. My
> suggestion, a simple blog that focuses only on announcing deadlines,
> requirements and points back to the wiki if there is a need for more
> detail.

To expand: this newsgroup is not a good substitute for the existence of
the channel recommended above.

It should be possible for an established localiser to localise Firefox
performing only the following tasks:

- Monitoring a blog/mailing list with 1-2 messages per week about deadlines
- Making translations and checking them in to CVS by those deadlines
- Sending one quick final sign-off message at the time of each release
- Replying to emails from the l10n coordinator in exceptional circumstances

Anything else is overhead.

Gerv

João Miguel Neves

Nov 17, 2006, 5:59:01 AM
to Gervase Markham, dev-...@lists.mozilla.org
On Fri, 2006-11-17 at 10:32 +0000, Gervase Markham wrote:
> If someone wants to translate Mozilla using .po, they should investigate
> pootle, which (I believe) has two-way translation tools between our
> formats and .po.
>
That's what people refer to when they talk about the Translate Toolkit,
of which Dwayne Bailey is one of the main developers (you can also see
him in this discussion). It's also the tools Axel complains about having
caused several problems in the localization (I have to agree with him
there, because I've also felt it).

The current state of the Translate Toolkit is also the reason why I
complain of a non-defined file format.


David Fraser

Nov 17, 2006, 6:52:45 AM
to Gervase Markham, dev-...@lists.mozilla.org, Dwayne Bailey
Three cheers for tiers that tear down barriers for localizers that bring
tears :-)

David

David Fraser

Nov 17, 2006, 7:19:31 AM11/17/06
to dev-...@lists.mozilla.org, Clytie Sidall
Interesting point - are there statistics on this?

I'll make one contribution to this thread, simply observations:

There are different areas to discuss, I've outlined some ideas below.
There are a lot of people supporting using PO format because of its
advantages, and a few people decrying it.
I just want to point out: although Firefox has successfully released 50
languages, which is brilliant, a fair number of the people advocating
usage of a different mechanism are the translators who have helped
produce those translations. And there seems to be a clear majority of
opinion inclined in this direction. That doesn't mean they're right, it
just means that there are a whole bunch of their concerns that should be
listened to - probably those who are struggling are the ones who have
the clues as to where the issues lie.
And many of those people are those who are doing localization across
large projects, and thus find the advantage in sharing tools, formats,
etc. There is a large open source localization community that could be
more effectively drawn on, and that should be a key aspect of the
discussion about expansion.
In fact my key recommendation would be to get Clytie Sydall to volunteer
to do Firefox in Vietnamese - if you know her from Gnome, Debian,
Ubuntu, OpenOffice.org you'd know she's a great example of someone who
is not technical but has co-ordinated localizations across multiple
large projects successfully, and I'm sure she'd have great
suggestions... given that apparently there have been at least three
attempts to translate into Vietnamese that have failed, that community
may also have ideas about the stumbling blocks (see
http://vi.mozdev.org/ and
http://wiki.mozilla.org/L10n:Localization_Teams#Vietnamese_.28vi-VN.29)

== technical implications of using different formats from a development
/ building point of view ==
* errors that may arise and the effects they have
* how much work has to be done on the build end
* checks that are available - as Axel pointed out, this should be
independent of the format

== benefits to the translators of different formats ==
The point here is that some of the information in the PO and XLIFF
formats is there because it's useful to localizers, whether it's useful
to developers/builders or not.
Firstly a lot of localizers find that having the original text alongside
the translation in the document is a Good Thing.
In my mind you need to treat a set of translations as a document that
contains comments, annotations, etc, etc. These are useful to localizers,
and as long as the format is sane other tools can strip them out.
For something like XLIFF you may not want to have all the workflow in
CVS - in that case you could have tools that strip it out and maintain
the workflow elsewhere.
Support for plurals and declensions is the kind of thing that is hard to
hack onto existing formats, but it is really important for the
languages that need it.

== different places formats could be used ==
* as intermediate formats vs as native formats
* whether stored in central revision control or not - note that this is
independent of whether the format is intermediate or native e.g. PO or
xliff could be allowed in CVS whilst requiring conversion for actual
translation

== different tools to translate with ==
* clearly there are some Mozilla tools that work for some people
* something like Pootle will probably end up supporting the Mozilla
formats, but most other tools will not because they are Mozilla-specific

== different libraries ==
This is what is used to implement the actual translation mechanism
(gettext vs Mozilla internals)

João Miguel Neves

Nov 17, 2006, 7:46:32 AM11/17/06
to David Fraser, Clytie Sidall, dev-...@lists.mozilla.org
Fri, 2006-11-17 at 14:19 +0200, David Fraser wrote:
> I just want to point out: although Firefox has successfully released 50
> languages, which is brilliant, a fair number of the people advocating
> usage of a different mechanism are the translators who have helped
> produce those translations. And there seems to be a clear majority of

Just one correction. Some of the 50 locales really use PO for the
translation work, not the dtd and properties. I think this is something
that's been missing from this discussion.


Djihed Afifi

Nov 17, 2006, 8:27:49 AM11/17/06
to João Miguel Neves, Clytie Sidall, dev-...@lists.mozilla.org, David Fraser
Well, here is a live example.

The Arabic team has never been able to produce a 100% translated Firefox
until we decided to work on po files. Complex scripts do the conversion
back and forth. But translators work solely on po. Although the
conversion takes place, there are a lot of problems associated with it.
The maintainer is dragged with extra overhead instead of translating,
and there is not much anybody else can do because.

And no, we can't work on other file formats. And we can't switch the
whole process to Pootle (although it is a great package and I've been
working to make it more user friendly, somewhat lazily). Both of these
solutions shift the problem onto *translators*, whom, after all, we're
trying to help, no?

I really don't understand the whole opposition to gettext. It's been
proven to be reliable and working. It's efficient for *translators* and
*they* would like to work with it.

Just another vote for adopting po. It will make life a lot easier for
many people.

Djihed

Abdulkadir Topal

Nov 17, 2006, 9:22:30 AM11/17/06
to
Robert Kaiser wrote:
> Ankit Patel wrote:

>> 1. Translators prefer to work on PO format files
>
> I am a translator, and I don't prefer those. I much more prefer
> MozillaTranslator to KBabel, and much more prefer monolingual L10n files
> to that bilingual waste of resources.


You are a programmer first and a translator second, and you have to admit
that's pretty different from the situation of the majority. Working on
the Firefox translation, I feel like I'm back in the stone age of
localization. We can't use real translation memories and all the other
tools that the community has already developed and that would make it so
much easier to contribute. Again we have to go our own way with MT,
while it's clear that it can never reach the quality of the other tools,
which have a lot more dev power behind them. It's absolutely
frustrating to work with three different people contributing to
/toolkit and to have to use a simple text editor as the lowest common
denominator. At least we don't have to use that horrible escaped Unicode
anymore, where every editor has its own way of escaping (capitalized or
not).
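
To make the escaping complaint concrete, here is a minimal Python sketch.
It assumes the \uXXXX escape form used by Java-style .properties files;
the helper names are made up for illustration, and corner cases such as
escaped backslashes or surrogate pairs are ignored. It shows how two
editors that differ only in the case of the hex digits can be made to
compare equal:

```python
import re

# Matches a \uXXXX escape with either upper- or lower-case hex digits.
ESCAPE_RE = re.compile(r"\\u([0-9a-fA-F]{4})")


def normalize_escapes(text):
    """Upper-case the hex digits of every \\uXXXX escape (hypothetical helper)."""
    return ESCAPE_RE.sub(lambda m: "\\u" + m.group(1).upper(), text)


def decode_escapes(text):
    """Replace every \\uXXXX escape with the character it stands for."""
    return ESCAPE_RE.sub(lambda m: chr(int(m.group(1), 16)), text)


editor_a = r"greeting=Gr\u00fc\u00dfe"   # one editor writes lower-case hex
editor_b = r"greeting=Gr\u00FC\u00DFe"   # another writes upper-case hex

print(editor_a == editor_b)                                        # False
print(normalize_escapes(editor_a) == normalize_escapes(editor_b))  # True
print(decode_escapes(editor_a))                                    # greeting=Grüße
```

Without some normalization of this kind, every switch of editor shows up
as a change to each escaped string in a diff, even though nothing in the
translation changed.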

--Abdulkadir

Gia Shervashidze

Nov 17, 2006, 4:47:36 PM11/17/06
to Gervase Markham, dev-...@lists.mozilla.org
Gervase Markham wrote:
> _______________________________________________
> dev-l10n mailing list
> dev-...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-l10n
>
Even that language, joke, logics is bit mozilla different.
g.\

Novica Nakov

Nov 17, 2006, 6:43:02 PM11/17/06
to

> Perhaps because the Mozilla Corporation doesn't need to provide tools,
> because they already exist?

I'll just ask a dumb question: if tools exist as you claim and if
everything else about localizing Firefox is as great as several other
posters here claim, then why are we having this discussion?

Like I said, wait and the 100 locales will show up.


> If someone wants to translate Mozilla using .po, they should investigate
> pootle, which (I believe) has two-way translation tools between our
> formats and .po.

But that was not really my point in the previous post. See above.


--
Novica

Damjan Georgievski

Nov 19, 2006, 1:51:13 PM11/19/06
to
>> Thanks for telling us that our 50 complete languages are much more than
>> the ~20 complete languages of others. Now the argument of people not
>> being able to work on our stuff is out of the way finally. That doesn't
>> mean we don't need to support incomplete/partial L10n, in fact, we very
>> much have a need for that. But the argument that just because of PO
>> format other projects have lots more of localizers and localizations
>> should finally be history now. Thanks for making that clear.
>
> Of course you'd have to compare apples with apples and since KDE is way
> bigger than Mozilla.

Not only that, but those KDE statistics also include KDevelop and similar
packages that are usually not translated (since the target group of the
software will NEVER use the translation, at least in mk)


--
damjan

Damjan Georgievski

Nov 19, 2006, 2:20:12 PM11/19/06
to
>> Can we assume we're doing well now? Clearly not since others are doing
>> much much better for software that is used by many many fewer people.
>
> Maybe that software is used by very few people for a reason?

This comment really disgusts me... Are you trying to get from 50 to 10
locales?!?

You know he is talking about KDE and Gnome and similar projects... Projects
that make great and LIBRE software. Their only handicap is the Windows
platform monopoly... and you had to throw a poisonous remark like this.


--
damjan

Giacomo Magnini

Nov 20, 2006, 1:39:45 AM11/20/06
to
Damjan Georgievski wrote:
> You know he is talking about KDE and Gnome and similar projects... Projects
> that make great and LIBRE software.

Even such projects can take some criticism, especially since they are so
good at fighting each other. It's not like you are killing someone if
you criticize other OSS projects. Freedom of speech is something I would
not trade for a better version of Gnome or KDE, you know.
And we are talking about Mozilla, which makes software that is not only
great and open, but also much more successful (since it's OS independent).

> Their only handicap is the Windows
> platform monopoly... and you had to throw a poisonous remark like this.

<rant about IT locale>
Their only handicap is not Windows: it's the arrogant people that are
translating them. I'm talking about Gnome here since I've practically
never used KDE in the last 3 years. Well, the Italian translation is
still incomplete and the quality is disgusting.
Two examples:
1) they use "bottone" instead of "pulsante" to translate "button": the
first thing you think about "bottone" is the button on your shirt.
2) they use "arrestare il computer" to translate "turn off the
computer": the first thing you think about "arrestare" is the police
coming to get you.
No one ever listens to those criticisms, nobody cares.
And since we are in contact with a KDE localizer, I gather that
the situation is not better over there: if you argue with his
suggestions (and show that a suggestion was not correct), he ends up
saying that you are too used to Windows translations, so you don't
understand the "new" style and wave...
[Note: Never translated a single Windows program in my life]
</rant>

Now please explain why those other great programs have a better
translation process: I can't see it. Oh, and even OOo is not a good
counter-example: the Italian version takes from one to eight weeks
to come out after a new release.

I've never been "soft" with the mozilla localization process, but since
they are now willing to hear suggestions, I can't understand why people
insist on suggesting other formats instead of improving the current
process. Faster l10n builds for testing are already a good improvement: 2
or 3 suggestions made here are heading in the right direction, too.

Since nobody has stressed this enough yet, one thing has not been clear
to the people involved in this discussion so far: moving to gettext is
not possible without a major rewrite of the interface and the way of
dealing with it. IIRC, even removing .properties files and using only
.dtd's would need some heavy work (but is doable).

Is MoCo willing to hire 1-2 people just to do this? Is it worth the pain?
What about the change of programming style for the rest of the developers?

I'd much prefer that MoCo hires 1-2 people to improve the tools we have
(eg. MT) to deal with all the formats (xhtml, images, and the rest) and
to add translation memories to it (eg. taken from OmegaT), so that even
professional tools can help in the translation process (by contributing
already proven glossaries and improved support for collaborative work).

Those glossaries would be uploaded from time to time to mozilla.org/.com
(eg. when a new release comes out) so that they are not "lost" in case a
team or a member resigns, and so not a single translation can regress
much, if at all.

I personally will never trust scripts/tools that convert 2-3 formats
into a different one and then convert the translated version
back into 2-3 formats again, while dealing with different charsets or
encodings: far too error-prone, thanks.

Want to throw MT (because of Java and anything else) in the trash in
order to develop a good web-based tool? That's fine, but there is no
reason to move to gettext for this, for example.
Ciao, Giacomo.

Ognyan Kulev

Nov 21, 2006, 4:18:14 AM11/21/06
to dev-...@lists.mozilla.org
Gervase Markham wrote:
> Perhaps because the Mozilla Corporation doesn't need to provide tools,
> because they already exist?

These external tools are forced to follow every weird use that shows up.
I still have a modified file in my Mozilla checkout that has something
like "<!-- <! ENTITY" instead of the repository's "<!-- <!ENTITY" because
my old modified Translate Toolkit thinks that "<!ENTITY" is a real entity.
The use of "<!ENTITY %" is another example. The random use of
.accesskey, .akey and others is another problem with external tools.
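
A rough sketch of why a commented-out entity trips up a naive parser,
and one way around it. This is not how Translate Toolkit or the Mozilla
build actually parses DTD files; the regular expressions and the
function name parse_entities are simplifications made up for
illustration (double-quoted values only, "<!ENTITY %" parameter
entities are simply skipped):

```python
import re

# Simplified patterns, for illustration only: real DTD syntax also allows
# single-quoted values and parameter entities ("<!ENTITY % ...").
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)
ENTITY_RE = re.compile(r'<!ENTITY\s+([\w.\-]+)\s+"([^"]*)"\s*>')


def parse_entities(dtd_text):
    """Return {entity name: value}, ignoring anything inside comments."""
    without_comments = COMMENT_RE.sub("", dtd_text)
    return dict(ENTITY_RE.findall(without_comments))


sample = '''
<!-- <!ENTITY old.label "Commented out"> -->
<!ENTITY file.label "File">
<!ENTITY file.accesskey "F">
'''

# A naive scan over the raw text also picks up the commented-out entity.
print(len(ENTITY_RE.findall(sample)))   # 3
# Stripping comments first leaves only the two live entities.
print(parse_entities(sample))           # {'file.label': 'File', 'file.accesskey': 'F'}
```

The point is only that each such quirk has to be handled explicitly by
every external tool, which is the maintenance burden described above.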

I don't know how it is with recent versions, but I had many problems with
old versions of Translate Toolkit. Fortunately, I'm a developer and fixed
things myself. I don't know what the state is now - I read that it's much
better. Personally, I would use Debian's po4a
<http://po4a.alioth.debian.org/>, and I even started to write modules for
it, but I don't have enough time even for my l10n (bg).

I confirm that using a non-PO-based translation process chases away
translators, and I'm sure the majority of localizers here can confirm this
too. In most cases, Mozilla localizers participate in other
localizations too, and these other localizations mostly use PO files.

Regards,
ogi

Tsahi Asher

Nov 21, 2006, 2:56:10 PM11/21/06
to
Quoting João Miguel Neves:
>
> e) Separate interface issues (window sizing) from the flow of
> localization. I don't know how many times I had to un-translate width
> and height.
>
however you separate UI from l10n, it should still be part of the l10n
package, because different languages have different lengths (in pixels)
of strings. sometimes a particular language needs to increase a certain
window size.

--
Tsahi Asher
Hebrew L10n Team
http://www.mozilla.org.il

Axel Hecht

Nov 22, 2006, 11:43:48 AM11/22/06
to
João Miguel Neves wrote:
> Thu, 2006-11-16 at 13:40 +0100, Robert Kaiser wrote:
>> João Miguel Neves wrote:

>>> I've never felt code integration to be an issue. Rushing away things,
>>> not getting approval for localizations in a timely fashion, having to
>>> wait almost days to test a localization are problems for me.
>> Can you find out and give us some concrete steps what you'd need/want to
>> happen so that your cycles work better? Do we need some simple build
>> script that packs up an ab-CD.jar just with your localization, without
>> needing to (re)build a whole Firefox/Thunderbird? Then you could just
>> copy that into your Firefox installation and test it.
>
> Yes, that would help. I was able to do it for version 1.5 and that meant
> we got up-to-date xpi files that normal people could use to give us
> feedback.
>
>> Would that help? Would requiring the "make" and "perl" tools to be
>> installed for that be an option? (We're currently using those in the
>> build process to pack the .jars, so we could reuse the same code.)
>>
> That would work for me. For others, a service to generate it from an
> up-to-date localization would work, but if you provide me those tools, I
> can provide that.

The up-to-date nightly langpacks for the builds done by tinderbox are on
ftp://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mozilla1.8-l10n/windows-xpi,
the hourlys at
ftp://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox/latest-mozilla1.8-l10n/windows-xpi.

>> This is just one idea for helping with that.
>>
>> Of course, for those that don't have the needed tools installed, we
>> could trigger a rebuild from the source in CVS by accessing some web
>> page or the checkin itself.
>>
> Yes, this would be great.
>
>> Should we have some always-open CVS branch, and only merge approved
>> changes to the really used branch on approval, or something similar?
>> Actually, that could be done quite easily, and probably even now, but it
>> needs some CVS knowledge from someone in each team that uses that approach.
>>
>> What other concrete steps we could take can you think of?
>>
>> I think that's the way we really need to work on, much more than the
>> technical file format topic we always tend to stumble into.
>
> I tried to start the discussion like that (separating content packs,
> cleaning up the localization files). Nobody seemed to care... Could
> someone make a description (You, Axel, Marek are probably the most
> suited) of how to accomplish the following tasks (I think they're
> missing from the docs) for non-programmers?
>
> 1) Find a particular string in a file (to answer a complaint, or to
> correct a mistake). I think LXR is the answer, but not sure.

It is.

> 2) Create a build(or something else) that allows a localized app test.

I pointed out elsewhere how to make a langpack.

> 3) How to commit a localization (a mini tutorial on CVS and branching
> would help).

That documentation is readily available on the cvs wiki, I'd rather not
create a partial copy of that. Maintaining forks is just a pain and
source of errors.

> The separation of content packs would simplify a lot of the process, by
> separating a lot of the needed authorizations from the localization.

Yes, I should pick up the factorization of BD-related content into a
single source location again.

> A possible complete redoing of the process would be separating l10n
> releases from the major. This could be done this way:
>
> 1) Create a group of tests for accepting a localization (Axel's work
> for 2.0+litmus tests for localizations is a very good start).

I'm actually an admin on litmus now (yac), so if you point me at tests
that you think should be run, I can set those up. The litmus solution is
probably to create a new subgroup distinct from the l10n tests right
now, which are really a) old, b) targeted at non-native speakers.

> 2) Any localization that passes the tests can be released at that
> point, mark it in CVS as FIREFOX_2.0.0.1_pt-PT, or something that allows
> us to identify which release was that.

Locales that are supposed to be released are part of shipped-locales,
which is the basis for the build team to tag them at release point.

The release tags are in the form of FIREFOX_2_0_RELEASE.

I'll point out elsewhere why we can't get rid of the distributed
sign-off process easily. Somehow, the use of bugzilla and dependencies
therein for 2.0 was an attempt to centralize and formalize that, which
worked to some degree. I guess it'd be interesting to know to which
degree, by the view of l10n teams.

> 3) Give time in the normal process to have a number of localizations
> ready for the .0 release (a real string freeze as early as possible -
> working 6 hours after waiting 2 days after a string freeze and then
> finding out more changed strings after doing the commit should not
> happen - if the string freeze is delayed, even for a couple of hours,
> l10n should be notified).

Let's put this the other way around. Maybe we should make a clarifying
statement of deadlines. Deadlines are serious deadlines, that is, really
really nothing should land after that. And of course, in theory, there
is no difference between theory and practice, but in practice, there is.

I don't think that we blew the string freeze deadline apart from one
major help landing. That was a project management bug, and we intend to
not make that bug again. At the summit, we worked on a checklist to be
gone through for new features (and reworks of existing features, too,
I'd suppose). Both "end-user documentation" and "l10n" will be on that
checklist, l10n is even a category on that list that I'll fill out.
"localizable" vs. "localizers know what to do about this" will be a big
point here, for example. I'll ask for comments on that section in this
newsgroup (new thread, though) once I get that set up.

> The measures above would reduce the bugzilla entries to bugs found by
> users and the need for approvals for content packs. A localizer would
> only need to create a single bug for a pair (version, product) and note
> the results of the tests (aka sign-off). It would reduce Axel's job to
> coaching, tutoring, managing sign-offs and team registrations. Axel, do
> you think this reduction would be enough to manage 100 locales?
>
> Localization of older versions are useful and this way they could be
> done. Requests for a pt-PT localization of Firefox 1.5 is still our
> number one FAQ even after the release of 2.0.
>
> A proposal for ending the file format flamewar: Assuming responsibility
> in defining file formats (not the base ones, the ones we use) and of
> keeping it up to date would end the file format flamewar. A lot of us
> use PO for several projects. A lot of us use tools like pootle or
> services like wordforge or rosetta. Keeping moz2po working would simply
> put mozilla projects as one more project for most of the teams. PO is
> used to translate mozilla's apps, no matter how good other formats or
> tools are available. Recognizing that and facilitating the translation
> would cause less problems for everyone. Clearly defining the format
> would help MT and the Translate Toolkit, which will be more than happy
> to work on top of such a definition.
>

I think that my follow up to Gerv's post will detail on this part a bit.

Axel

Axel Hecht

Nov 22, 2006, 12:16:41 PM11/22/06
to
"Thanks" pretty much rounds my comments up on this one :-)

Axel

Axel Hecht

Nov 22, 2006, 12:22:59 PM11/22/06
to
Dwayne Bailey wrote:
<...>

>
> * Create more tears with well defined goals. This would allow us to
> define an newbie tear where a team can release anything at any time.
> Its never officially released but is available for the brave and for
> testing.

Cherry-picking this one.

A separate tier for not-so-vital locales has come up before, and I
recall that Gerv wasn't very fond of the idea back then.

I'm not sure yet how to set that up though, and how to make it cheap.
Say, the Frisian locale would be one of those. Now, for the trademarks
part, it largely follows the dutch locale, but the wikipedia link is
actually Frisian. Should we rip that out? Should we make the not-so-tier
use en-US, or are other locales fine?

It seems that we could get away with en-US product content for quite a
few of them, but I'm not sure if that's going to really be a huge
step forward.

Maybe having a decent way to set up incubators for new locales is much
more like it.

Axel

Axel Hecht

Nov 22, 2006, 12:44:10 PM11/22/06
to
Gervase Markham wrote:
> Dwayne Bailey wrote:
>> Are the people we want to ask even here? I don't think so, the people
>> who find it too difficult haven't even started the people who have
>> experience on projects with better process are probably sitting in other
>> projects.
>
> I think this is an important point. To work out how to go from 50 to
> 100, we need to find the people who tried localising our stuff and gave
> up, or who never started, because these are the people we want to
> comprise the next 50 teams.
>
> So asking existing teams "what needs to be easier" may well get you the
> wrong answers, because they will be focussed on incremental improvements
> to a process they have learned to live with.

The number of suggestions with incremental changes hasn't been
dominating this thread, though :-/.

>> * Mozilla acts as an island. We have not learnt anything from other
>> localisation projects. There is a real NIMBY attitude.
>
> He means NIH :-)

http://en.wikipedia.org/wiki/NIH says that that'd be the National
Institutes of Health.

>> * Use the Translate Toolkit dammit :). Even just to publish official POT
>> files. Even allow people to commit PO files to Mozilla l10n/ and
>> automatically convert those files to .dtd and friends.
>
> Axel: you seem to be very set against this idea, but I haven't seen a
> message where you explain why. Could you point me to one?

I'm a tools-darwinist, for one. I don't think we are at the point where
we would want to bless one tool of choice.

For problems with the Translate Toolkit, see the new thread by Dwayne.
The major problem I see is the somewhat confuzzled help: there is one
DTD used in the help files, which leads to funny-looking help content.
And so far the Translate Toolkit doesn't really make that obvious, AFAICT.
Accesskeys are another biggie. The approach of merging accesskeys into the
translated string and extracting them out again may work for most
locales, but that's a side effect of the tool that I don't see as the
final solution. Basically, the assumption that access keys are part of
the localized content is a po file format weakness.
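
To make the accesskey point concrete, here is a small Python sketch of
the merge-and-extract round trip. It assumes an "&" marker placed in
front of the accesskey letter, which is a common convention; the
function names are hypothetical and this is not Translate Toolkit's
actual code. Labels that contain a literal "&" are not handled:

```python
def merge_accesskey(label, accesskey, marker="&"):
    """Merge a separate accesskey into its label, e.g. ("Save As", "A") -> "Save &As".

    Returns None when the accesskey does not occur in the label, because
    the inline form has no way to express that case.
    """
    if not accesskey:
        return label
    idx = label.find(accesskey)
    if idx == -1:
        # Fall back to a case-insensitive match.
        idx = label.lower().find(accesskey.lower())
    if idx == -1:
        return None
    return label[:idx] + marker + label[idx:]


def split_accesskey(merged, marker="&"):
    """Split "Save &As" back into ("Save As", "A")."""
    idx = merged.find(marker)
    if idx == -1 or idx == len(merged) - 1:
        return merged, ""
    return merged[:idx] + merged[idx + 1:], merged[idx + 1]


print(merge_accesskey("Save As", "A"))   # Save &As
print(split_accesskey("Save &As"))       # ('Save As', 'A')
print(merge_accesskey("Datei", "X"))     # None
```

The last call shows the weak spot: a locale whose accesskey letter does
not appear in the translated label cannot be represented inline at all,
so the round trip only works as long as every locale happens to avoid
that case.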

More than those reasons though, blessing any tool at this point in time
says that our main developers are not going to check anything in that
breaks those tools, and I don't see us being at that point so far. All
the coding conventions are pretty subtle so far, and optional, and it
will take much more incremental education of developers to get to a
point where we would want to do that.
.label and .accesskey alone probably have a bunch of open bugs, and
probably a lot more unfiled ones. This is something where the l10n
community needs to step in and provide patches. That is not going to get
fixed by a "we said you must not do that, I turn your build red" kinda
thing. It would certainly over-stretch my powers.

>> * Reduce the channels we need to follow to find information. My
>> suggestion, a simple blog that focuses only on announcing deadlines,
>> requirements and points back to the wiki if there is a need for more
>> detail.
>
> To expand: this newsgroup is not a good substitute for the existence of
> the channel recommended above.
>
> It should be possible for an established localiser to localise Firefox
> performing only the following tasks:
>
> - Monitoring a blog/mailing list with 1-2 messages per week about deadlines
> - Making translations and checking them in to CVS by those deadlines
> - Sending one quick final sign-off message at the time of each release
> - Replying to emails from the l10n coordinator in exceptional circumstances
>
> Anything else is overhead.

I don't think so.

I am not sure that localizers that are not interested in Firefox to the
extent that they're willing to at least ignore the threads in this
newsgroup will invest enough love into their localization to make it
rock. I'm leaving out other programs on purpose here, just because the
ratio of strings vs users is not that good.
Items 2-3 have been proven to not work; sad, but true. I'm not going to
point at locales here, but it's not working.
Re -4, I think that a bugzilla component is fair enough. We need to
ensure that I'm not a single point of failure on the mozilla side, and
bugzilla is an established and working tool to ensure that. Plus, it
gives some public tracking/archive of those messages. Or to put it the
other way around, any message in my inbox is tentatively lost knowledge.

Axel

Axel Hecht

Nov 22, 2006, 1:04:08 PM11/22/06
to
Novica Nakov wrote:
>
>> Thanks for telling us that our 50 complete languages are much more
>> than the ~20 complete languages of others. Now the argument of people
>> not being able to work on our stuff is out of the way finally. That
>> doesn't mean we don't need to support incomplete/partial L10n, in
>> fact, we very much have a need for that. But the argument that just
>> because of PO format other projects have lots more of localizers and
>> localizations should finally be history now. Thanks for making that
>> clear.
>
> Surely, the 50 complete languages for Firefox are not completely
> complete. I left maybe a 100 or 200 strings in English which can not be
> translated, maybe others have done the same. But, I don't think there is
> any way mozilla can check how many strings are actually left in original.

It'd be interesting to know which locale you're talking about ;-).

And there is a way for us to check for unchanged strings in a locale,
but unchanged has nothing to do with non-localized. Nor are there good
ways to prevent spoofing that number by introducing worse bugs.
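
As a rough illustration of such a check (a sketch with made-up helper
names and sample strings, not the tooling Mozilla actually uses, and
with a deliberately simplified .properties parser that ignores line
continuations and ":" separators):

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            entries[key.strip()] = value.strip()
    return entries


def unchanged_keys(en_us, locale):
    """Return the keys whose localized value is identical to en-US."""
    return sorted(k for k, v in locale.items() if en_us.get(k) == v)


en_us = parse_properties("""
# en-US
open.label=Open
brand.name=Firefox
quit.label=Quit
""")

de = parse_properties("""
# de
open.label=Öffnen
brand.name=Firefox
quit.label=Quit
""")

print(unchanged_keys(en_us, de))   # ['brand.name', 'quit.label']
```

In this sample brand.name is unchanged on purpose, while quit.label may
simply have been left untranslated. The count cannot tell the two
apart, and a localizer could lower it by changing strings that should
have stayed as they are.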

Thus I'm not looking at those numbers, on purpose.

> The argument of PO format is that past practices show that more people
> join in because of different reasons around it. If the current majority
> of translators on the Internet finds it more easy or simple to use PO,
> it's much easier to change the format than to convince people that PO
> isn't that great for whatever reasons.
>

I'd say the contrary, see Giacomo's reply, for example.

Axel

Ricardo Palomares Martinez

Nov 22, 2006, 4:10:30 PM11/22/06
to
Giacomo Magnini wrote:
>


I'm not really arguing about Giacomo's opinions here, but it looked like
the best message to hang my own reply on... :-) I've been very quiet
these days, mainly due to lack of time, also gathering opinions.


>
> Since nobody has stressed this enough yet, one thing has not been clear
> to the people involved in this discussion so far: moving to gettext is
> not possible without a major rewrite of the interface and the way of
> dealing with it. IIRC, even removing .properties files and using only
> .dtd's would need some heavy work (but is doable).


I see the PO issue from a different point of view. It has been
suggested that Mozilla Foundation should provide official POT files
generated from DTD/properties. That sounds easy, since there is
already a tool doing so, but I think such a move would oblige MoFo to
persist in providing functional POT files and making sure that the
"MozFormat" <-> POT works all right all the time. I don't think MoFo
is willing to do it, because of reasons similar to why Axel says that
MoFo is not endorsing nor promoting any particular L10n tool so far.

For Translate Toolkit, the fact that moz2po stops working or does
something wrong is just an issue to be solved as time permits (just
like MT's unsolved issues are right now). If MoFo committed itself to
providing POT files on a regular basis, broken POT files would break
their whole l10n process, since breaking them would be on the same level
as breaking DTD/properties themselves.

Of course, one can argue that not being willing to take the plunge will
prevent MoFo from reaching 100 locales, and that may be right. Or not.


>
> I'd much prefer that MoCo hires 1-2 people to improve the tools we have
> (eg. MT) to deal with all the formats (xhtml, images, and the rest) and
> to add translation memories to it (eg. taken from OmegaT), so that even
> professional tools can help in the translation process (by contributing
> already proven glossaries and improved support for collaborative work).


I've been learning and thinking about MT current lack of features. I
must first state that I've never intended or promised to develop MT in
the long-term. Instead, I intend(ed) to write a replacement for MT
while keeping MT functional enough to be usable.

Sadly, I'm not advancing on the replacement at a decent pace (young
people: believe it or not, it's far easier to live when you are in your
20s than in your 30s...) :-)

So, either I stop at all at MT development, and maybe most of es-ES
l10n, or the replacement will arrive a lot later than I originally
expected. BTW, I always thought of the replacement as a file format
indepent tool; it would start being functional with Mozilla file
formats, but PO files were the next one in the list.

OK, stop vapourware chatting... :-) I'm not really sure that people
care as much about the PO vs. DTD/properties issue as they care about
features in PO tools not found in DTD/properties tools (namely, MT).

Then, there are people like Robert or Dwayne who have a firm view on
the "right file format", and there may be people who like one format
more than another (I don't particularly like the PO format after having
read about it, but I probably wouldn't jump off a bridge if the Mozilla
l10n process switched to it).

So, what would MT need to be more appealing to those people who feel
more comfortable with PO-based l10n processes?

- fixing bugs: there is a fair number of them filed at
bugzilla.mozilla.org, and most seem to be fixable.
- more QA features: there are some, but there are a lot more to be
introduced (I have some of my own, I've read about and like some of
KBabel's, and I don't like others).
- translation memory / auto-translating: yeah, that sounds good. More
on this later.
- web-based translation: sorry, that's outside MT's ambitions. And I'm
not interested myself in working in a web-based environment, so I
won't work on it. Still, as a long-term plan, I'd like to provide
web services in the replacement, but that's far, far away...
- collaboration: that's an RCS issue. As long as the l10n tool can
work with a CVS/SVN-based directory structure in a not too annoying
way, I think it would suffice.

Now, the way I see it, either MoFo develops its own l10n tool, or
current choices go on their own to provide what volunteers devoting
time to them find useful/interesting. Since I'm doing some things in
MT, I'll speak about it:

- the first thing I'm starting to feel is needed is getting the MT
sources back into the sourceforge.net repository. In the current
situation I can't get help, and I'm pretty sure there are things easy
enough to contribute to. I've sent an e-mail to Serhiy asking him
(again) to find a solution; since I've got no answer so far, I'll write
to Henrik Lynggaard. If everything fails, we'll have to create a new
project, but I don't like the idea of people searching for MT sources
and finding two different projects.

- Enhancing QA queries is just a matter of manpower, as fixing most
current bugs is.

Now, translation memory is a completely different thing: MT loads
*everything* in memory, and does it in a tree structure that seems very
logical for browsing files, but that doesn't fit with the idea of
searching for similar strings in real time while editing, for instance.

Also, I don't much like the idea of an "editor replacement" (*)
needing more than 256 MB of RAM (Robert Kaiser's Glossary.zip is so
big that MT needs special parameters to run).

(*) In the end, what MT, Pottle or KBabel do can be done using a plain
UTF-8 editor. Sure, the tools provide us with more, but how much
computer power are you willing to devote for the benefits?

Switching from the current "all-in-memory" datamodel in
MozillaTranslator to a DB-based one looks like a major rewrite to me,
and that's where my replacement entered into the play.

However, the memory issue is a matter of personal taste: maybe most of
you find it a minor consideration. At first sight, I think it's the
only real technical consideration preventing the implementation of
translation memory features. Besides that, I'd say only manpower can be
regarded as an issue. So, let me know what you think.
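
To make the "DB-based" idea above a bit more concrete, here is a
minimal sketch of a translation memory kept in a database instead of an
in-memory tree. It assumes SQLite as the store and Python's difflib for
similarity scoring; the table layout, function names and threshold are
hypothetical and are not taken from MT or from the planned replacement.

```python
import sqlite3
from difflib import SequenceMatcher


def open_memory(path=":memory:"):
    """Open (or create) a translation memory stored in SQLite."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tm ("
        " source TEXT PRIMARY KEY,"
        " target TEXT NOT NULL)"
    )
    return conn


def remember(conn, source, target):
    """Store or update one source/target pair."""
    conn.execute(
        "INSERT OR REPLACE INTO tm (source, target) VALUES (?, ?)",
        (source, target),
    )
    conn.commit()


def suggest(conn, text, threshold=0.75, limit=5):
    """Return up to `limit` (score, source, target) tuples, best first.

    Rows are read from the database one at a time, so only the
    candidates that pass the threshold are kept in memory.
    """
    matches = []
    for source, target in conn.execute("SELECT source, target FROM tm"):
        score = SequenceMatcher(None, text.lower(), source.lower()).ratio()
        if score >= threshold:
            matches.append((score, source, target))
    matches.sort(key=lambda m: m[0], reverse=True)
    return matches[:limit]
```

Scanning every row for each lookup is still linear in the size of the
glossary, so a real tool would probably want an index or a pre-filter;
the point here is only that the glossary itself no longer has to live
in RAM.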


>
> I personally will never trust scripts/tools that convert 2-3 formats
> into a different one, then the translated version is being converted
> back into 2-3 formats again while dealing with different charsets or
> encodings: too much error prone, thanks.
>


I agree with this. Again, this is my own opinion, and it may contain a
touch of irrational distrust, but I *feel* that's not the right
approach to a nice l10n tool/process.


> Want to throw MT (because of Java and anything else) in the trash for
> developing a good web based tool? That's fine, but there is no reason to
> move to gettext for this, for example.


Well, that's not fine to me. :-) I don't like web-based tools at all,
and Java shouldn't be an issue even for GPL die-hards now that it has
been completely opensourced by Sun. It may be a point for personal
reasons, just as I don't want to have anything at all resembling
.NET/Mono in my computer. :-)

Last, a quick review on other ideas not related to tools or file-formats:

- most of the time, I think that l10n full builds could
be replaced by just building the XPI (or even a ZIP with the JARs).
Full builds just once a day, XPIs/JARs the rest of the time. It would
save a lot of computer power and would pay off with faster results for
localizers testing their changes.

- all brand and trademark issues should really be treated separately
from the rest of the localization. They involve a lot of
administrative stuff and, except for the final packaging, don't
interfere too much with the localization tests and quality.

- (this is a requirement for us, the localizers) Provide feedback and
help with the docs, especially with entry docs. Axel is still waiting
for feedback on http://wiki.mozilla.org/L10n:Ownership, AFAIK, and I
blame myself for this, as I read it but didn't say anything, neither
good nor bad (I'll try to fix this in a minute).

- a blog could be nice as a "clean view" of dates, deadlines and other
important stuff, but it wouldn't replace either this newsgroup or bmo.

- I'm pretty sure many teams have written docs and small tools for
easing their work. However (at least this is our case in es-ES), both
things get written in the mother tongue instead of in en-US. Sure, it
is easier to write in your own language, but writing it in English
will help to make your work useful for everyone, not just for your
teammates. For instance, I have more or less equivalent batch scripts
(bash and .BAT) to easily prepare SeaMonkey installers customized for
es-ES taste, along with the corresponding docs. I haven't published
them just because I thought they wouldn't be of much help to other
teams, but since I need between two and five minutes to get a
localized installer, they could deserve a look by other people. Except
that the docs and embedded comments are now in Spanish. :-( Do
localizers recognize themselves in this story?

- Provide more useful help for the Windows environment. I'm afraid a lot
of us are used to working in Linux environments (bash scripts, KBabel,
command line cvs, etc.) and that prevents Windows people from helping,
since they are kindly pushed to switch to Linux to be able to take
advantage of existing docs and know-how.

Ricardo.

--
If it's true that we are here to help others,
then what exactly are the OTHERS here for?

Marek Stępień

Nov 22, 2006, 6:18:13 PM
Axel Hecht wrote:

>>> * Mozilla acts as an island. We have not learnt anything from other
>>> localisation projects. There is a real NIMBY attitude.
>>
>> He means NIH :-)
>
> http://en.wikipedia.org/wiki/NIH says that that'd be the National
> Institutes of Health.

or... http://en.wikipedia.org/wiki/Not_Invented_Here

;-)

--
Marek Stępień <marcoos at aviary dot pl>
Aviary.pl Team

Axel Hecht

Nov 22, 2006, 7:03:05 PM

"Willing" is not really the right term, it's just that the benefits of
po-files as an intermediate don't justify the efforts, as far as I can
tell. The efforts would be significant; for example, we'd have to turn
all tinderboxen red if someone checked in some XUL where the .accesskey
and .label conventions aren't followed, which might have to be
protected by npob or something like that.
I really think that that manpower can be spent elsewhere much more usefully.

>> I'd much prefer that MoCo hires 1-2 people to improve the tools we have
>> (eg. MT) to deal with all the formats (xhtml, images, and the rest) and
>> to add translation memories to it (eg. taken from OmegaT), so that even
>> professional tools can help in the translation process (by contributing
>> already proven glossaries and improved support for collaborative work).
>
>
> I've been learning and thinking about MT current lack of features. I
> must first state that I've never intended or promised to develop MT in
> the long-term. Instead, I intend(ed) to write a replacement for MT
> while keeping MT functional enough to be usable.
>
> Sadly, I'm not advancing in the replacement at a decent pace (young
> people: believe or not, it's far easier to live when you are in the
> 20s than in the 30s...) :-)
>
> So, either I stop at all at MT development, and maybe most of es-ES
> l10n, or the replacement will arrive a lot later than I originally
> expected. BTW, I always thought of the replacement as a file format
> indepent tool; it would start being functional with Mozilla file
> formats, but PO files were the next one in the list.
>
> OK, stop vapourware chatting... :-) I'm not really sure that people
> really care so much about PO vs. DTD/properties issue as they care
> about features in PO tools not found in DTD/properties tools (namely, MT).

A side-tracking note about PO-based tools. There are so many of them
that I'm tempted to assume that each and every one sucks at least
partially. Which may be due to the fact that some stuff is just done on
top of PO, like variable substitution (I think, from looking at the docs).

As we're allowed to rant, the popularity of PO is probably really just
focused on the GNU world, because there's GNU gettext. In the larger
picture, I wouldn't be surprised to find this an island, just due to the
licensing island that GNU is (as per internet population, at least). Not
that I can find the license on either the docs or the download page, or
on the project page :-(.

If there is anything I can help you with, I'll give it a shot. Like, if
we don't get an answer from the current maintainers, I could try to
approach the sourceforge guys, or even make Frank Hecker do that. It
might be easier if you have a @mozilla email address.

> - Enhancing QA queries is just a matter of manpower, as fixing most
> current bugs is.

I really think that the QA tests should be done independently of a
particular tool. What requirements that induces is unclear; at the
least, we'd need to standardize the output of a test, I would think.
I personally write tests mostly as python modules, which would enable at
least python-based tools to hook into them directly.
Or, if runtime checks are necessary (there are some, e.g. help and
accesskeys), I'd write them as extensions.

I would think that running QA tests separately shouldn't be that much of
an issue.
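
As an illustration of a tool-independent QA test with a standardized
output, here is a minimal sketch of such a python module. It checks
access keys in a DTD file and reports every finding as a plain
(level, entity, message) tuple that any tool could consume. The
simplified DTD parsing, the function names and the exact rules are
assumptions made for the example, not an existing Mozilla test.

```python
import re

# Simplified: matches <!ENTITY name "value"> or <!ENTITY name 'value'>.
# It does not skip comments or handle parameter entities.
ENTITY = re.compile(r"""<!ENTITY\s+([\w.-]+)\s+(?:"([^"]*)"|'([^']*)')\s*>""")


def parse_dtd(text):
    """Map entity names to their values."""
    entities = {}
    for name, double_quoted, single_quoted in ENTITY.findall(text):
        entities[name] = double_quoted or single_quoted
    return entities


def check_accesskeys(entities):
    """Yield (level, entity, message) for each suspicious access key."""
    suffix = ".accesskey"
    for name, key in sorted(entities.items()):
        if not name.endswith(suffix):
            continue
        label_name = name[: -len(suffix)] + ".label"
        label = entities.get(label_name)
        if label is None:
            yield ("warning", name, "no matching %s" % label_name)
        elif len(key) != 1:
            yield ("error", name, "access key should be one character")
        elif key.lower() not in label.lower():
            yield ("warning", name,
                   "'%s' does not occur in '%s'" % (key, label))
```

Because the result is just an iterable of tuples, an editor, a command
line script or a build step could all run the same check and present
the findings in their own way.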

Another note: running automatic QA tests on a generated or intermediate
format relies on the assumption that the conversion, at least in the
back direction, is error-free. But that conversion is software :-(.

> Now, the memory translation is a completely different thing: MT loads
> *everything* in memory, and do it in a tree structure that seems very
> logical for browsing files, but that doesn't fit with the idea of
> searching similar strings in real-time while editing, for instance.
>
> Also, I don't like too much the fact of an "editor replacement" (*)
> needing more than 256 MB of RAM (Robert Kaiser's Glossary.zip is so
> big that MT needs special parameters to run).

... eclipse .... lol. Sorry, had to say that. I just learned that there
are people who hate eclipse. Judging by the runtime perf, that may be
justified, but as for the editing architecture, I'm still impressed by
what they have set up.

> (*) In the end, what MT, Pottle or KBabel do can be done using a plain
> UTF-8 editor. Sure, the tools provide us with more, but how much
> computer power are you willing to devote for the benefits?
>
> Switching from the current "all-in-memory" datamodel in
> MozillaTranslator to a DB-based one looks like a major rewrite to me,
> and that's where my replacement entered into the play.
>
> However, the memory issue is my personal taste: maybe most of you find
> it a minor consideration, and, at first sight, I think that's the only
> real technical consideration preventing the implementation of memory
> translation features. Besides that, I'd say only manpower can be
> regarded as an issue. So, let me know about this.
>

Fixing manpower has proven to be tough :-).

Regarding the autotranslate stuff, I wrote an email to sherman who did
an analysis of plain text for some of our docs before, I'll try to get a
list of stuff that is common in en-US.

I'd really like us to have a web-based service to support this, and by
this, I mean really a web-based service. Web interface, too, but also
some REST interface and a database export for offline use.

>> I personally will never trust scripts/tools that convert 2-3 formats
>> into a different one, then the translated version is being converted
>> back into 2-3 formats again while dealing with different charsets or
>> encodings: too much error prone, thanks.
>>
>
>
> I agree with this. Again, this is my own opinion, and it may have a
> piece of irrational disbelief, but I *feel* that's not the right
> approach to a nice l10n tool/process.

As I said above, that is a software-based process, and assuming that
that is bug-free is not really justified by software practice. Not even
rocket science manages to work on bug-free software, and they spend so
much more time and money on being able to prove bug-freeness.
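
One cheap way to at least catch such conversion bugs is a round-trip
check: convert, convert back, and compare the entries. The sketch below
assumes a very simplified .properties parser (no line continuations or
escapes); `convert` and `convert_back` stand for whatever converter
pair is under test and are hypothetical placeholders, not the API of
any existing tool.

```python
def parse_properties(text):
    """Parse key=value lines (simplified: no continuations or escapes)."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        entries[key.strip()] = value.strip()
    return entries


def roundtrip_errors(text, convert, convert_back):
    """Compare entries before and after a convert/convert_back cycle.

    Both converters take text and return text. The result maps each key
    that was lost, added or altered to a (before, after) pair.
    """
    before = parse_properties(text)
    after = parse_properties(convert_back(convert(text)))
    errors = {}
    for key in sorted(set(before) | set(after)):
        if before.get(key) != after.get(key):
            errors[key] = (before.get(key), after.get(key))
    return errors
```

An empty result only shows that this particular file survived the
cycle; it says nothing about files that were not tested, which is
exactly the limitation described above.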

>> Want to throw MT (because of Java and anything else) in the trash for
>> developing a good web based tool? That's fine, but there is no reason to
>> move to gettext for this, for example.
>
>
> Well, that's not fine to me. :-) I don't like web-based tools at all,
> and Java shouldn't be an issue even for GPL die-hards now that it has
> been completely opensourced by Sun. It may be a point for personal
> reasons, just as I don't want to have anything at all resembling
> .NET/Mono in my computer. :-)
>
> Last, a quick review on other ideas not related to tools or file-formats:
>
> - for the most part of the time, I think that l10n full builds could
> be replaced by just building up the XPI (or even a ZIP with the JARs).
> Full builds just once a day, XPIs/JARs the rest of the time. It would
> save a lot of computer power and would pay with faster results for
> localizers in order to test changes.

I know there is a bunch of low- to medium-low-hanging fruit to pick
when doing l10n repacks. Doing real depend builds is one. Going
through the libs target once instead of twice would be another one.
Doing a more buildbot-based cycle (push changes) instead of a tinderbox
one (pull changes) is likely going to gain perf, too.
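
To illustrate the depend-build idea for repacks, here is a minimal
sketch that zips a locale directory into a JAR-style archive and skips
the work when nothing has changed since the last run. The directory
layout and function names are hypothetical, the archive is assumed to
live outside the locale directory, and this covers only the zip step,
not everything a real XPI or installer needs.

```python
import os
import zipfile


def newest_mtime(root):
    """Return the latest modification time of any file under `root`."""
    latest = 0.0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            latest = max(latest, os.path.getmtime(path))
    return latest


def repack(locale_dir, archive):
    """Zip `locale_dir` into `archive` unless the archive is up to date.

    Returns True if the archive was rebuilt, False if it was skipped.
    """
    if (os.path.exists(archive)
            and os.path.getmtime(archive) >= newest_mtime(locale_dir)):
        return False
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as jar:
        for dirpath, _dirs, files in os.walk(locale_dir):
            for name in sorted(files):
                path = os.path.join(dirpath, name)
                # Store paths relative to the locale directory.
                jar.write(path, os.path.relpath(path, locale_dir))
    return True
```

A timestamp comparison like this is the same trick make uses; it does
not notice deleted files, so a full rebuild once in a while would still
be needed.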

> - all brand and trademark issues should really be treated in separate
> way than the rest of the localization. They involve a lot of
> administrative stuff and, except for the final packaging, doesn't
> interfere too much with the localization tests and quality.

I'm dropping the term "trademarks" here. I don't think it fits the
bigger picture. I use "product" instead, because it really comes down
to what Mozilla considers a feature of the product, and how to carry
that over into another locale.

I think that Firefox 2 has gotten much better here, merely by removing
stuff from l10n. We have removed most of the URLs, for example. Though
probably too late to actually make stuff feel easier this time around.
IMHO, product-related stuff should be in other-licenses/branding. Would
that be distinct enough?
I don't think that there is a good way to have stuff localizable and
outside of l10n at the same time, but I do think that there is
low-hanging fruit to actually make it obvious what you can just change
and where you should look more closely. Again, tools that aim to
create a full localization may blur borders in the source.

This is somewhat related to localization vs. translation. Really, stuff
that is part of product features needs localization instead of
translation. And this part is not only harder to do, but can also do
real-life harm. Pointing a locale to a particular feed is hard to
distinguish from a DoS attack, and as a prominent member of the web,
Mozilla has to make sure that that doesn't happen.
Exposing the trademarks of other companies is part of the same story.

It's not that we make the rules up here because we're bored, these are
real-world problems. We've had complaints about both feeds and searches
exposed in localized Firefox versions, asking us to remove them merely
because of the amount of traffic they generated.

Sometimes it just sucks to be good :-).

> - (this is a requirement for us, the localizers) Provide feedback and
> help with the docs, specially with entry docs. Axel is still waiting
> for feedback on http://wiki.mozilla.org/L10n:Ownership, AFAIK, and I
> blame myself for this, as I read it but didn't say anything, nor good
> or bad (I'll try to fix this in a minute).

Thanks. Really, it's hard to write the docs that other people need.

> - a blog could be nice as a "clean view" of dates, deadlines and other
> important stuff, but it wouldn't replace neither this newsgroup nor bmo.

I have to agree that I myself don't really know which calendar I
should look at. For example, the bonecho calendar used to be just full
of meetings, but didn't have the real deadlines.

Maybe we should try to set up a basic calendar. It'd be interesting to
really know the scope, though. (I wonder if we could tag events.) Like,
should it have Firefox, Thunderbird, Gecko1.9, stable branches (1.8
and/or 1.8.0?)

> - I'm pretty sure many teams have written docs and small tools for
> easing their work. However, (at least this is our case in es-ES), both
> things get written in the mother tongue, instead of in en-US. Sure, it
> is easier to write in your own language, but writing it in english
> will help to make you work useful for everyone, not just for your
> teammates. For instance, I have more or less equivalent batch scripts
> (bash and .BAT) to easily prepare SeaMonkey installers customized for
> es-ES taste, along with the corresponding docs. I haven't published
> them just because I thought it wouldn't be of much help to other
> teams, but since I need between two and five minutes to get a
> localized installer, they could deserve a look by other people. Except
> that the docs and embedded comments are now in spanish. :-( Do
> localizers feel themselves recognized in this story?

I bet.

> - Provide more useful help for Windows environment. I'm afraid a lot
> of us are used to work with Linux environments (bash scripts, KBabel,
> command line cvs, etc.) and that prevents Windows people from helping,
> since they are kindly pushed to switch to Linux to be able to take
> advantage of existing docs and know-how.

Basically anything we have right now starts with a unix shell, either
requiring /bin/sh or /usr/bin/env. That's a problem, yes.

I filed https://bugzilla.mozilla.org/show_bug.cgi?id=361583.

Axel

Axel Hecht

Nov 22, 2006, 7:59:12 PM
Dwayne Bailey wrote:
> I was going to give up after the first paragraph and let Mozilla go and
> reinvent the wheel. I'm sorry you take things so personally. But after
> seeing Erdal's response, who I respect as a good cross-cutting
> localiser, I realised I couldn't be all wrong and I'd just trundle
> through the personal stuff.

I'm not taking it personally. I really don't spend days doing QA (I
spent only one). But I'm confident that the localization teams do. I'm
just being blunt and speaking up for all of those.

> On Thu, 2006-11-16 at 17:55 -0800, Axel Hecht wrote:
>> Dwayne Bailey wrote:
>>> This thread seemed to go all technical and I wanted to bring it back to
>>> less technical level:
>>>
>>> Caveat: I work on the WordForge project which programs on the Translate
>>> Toolkit and Pootle. But then I've also managed 11 languages across:
>>> OpenOffice.org, Mozilla, GNOME and KDE... so I must know something.
>> >
>>> I think let's restate some things....
>>>
>>> Mozilla wants to do more localisation. The question is why? Is it
>>> important? I assume it is for Mozilla. OK so let's all assume it's
>>> important. If it's important then this needs to receive some real and
>>> critical support. And then Mozilla needs to actually _listen_ to people
>>> with real and broad localisation experience.
>> Let me jump in here and restate that the work you submitted on the 1.8
>> branch was just plain unacceptable, so there must be something that you
>> did wrong.
>> Note, I did catch that, very late, and had to bring that up. And I did
>> take the heat internally from the product team at Mozilla for those
>> locales just not being up to stuff without any warning signs before.
>
> I'll avoid justifying anything. But has anyone wondered why we have no
> idea of quickly determining what a satisfactory localised UI experience
> is. Even Microsoft's LIP program only aims at 80% of the programs that
> people most use.
>
> I'm sorry you took the flack - you helped us greatly to get it into CVS.
> That was in fact my main goal. Then to see how good the experience
> would be.
>
> But why is that not automatically picked up? KDE, GNOME etc all have
> graduations and minimum requirements. We should be able to define a
> minimal acceptable l10n percentage, that doesn't happen so people aim at
> this illusionary 100%.

More on that in a separate reply.

>>> Can we assume we're doing well now? Clearly not since others are doing
>>> much much better for software that used by many many fewer people.

>> Maybe that software is used by very few people for a reason? Maybe
>> Mozilla does something right in terms of processes etc which enables us
>> to produce software that is actually attractive enough to tens of
>> millions of users around the globe to be used on a regular basis?
>
> Or have you ever considered that software used by so many people should
> already be localised into 200 languages since as you say the others are
> used by far less. So we would expect these millions of users to breed a
> few thousand localisers and then itself breed some very very competent
> localisers.

The language count is likely not going to give us any significant market
share. We want to do it because it's the right thing to do. The market
share comes from probably some 20 locales, to give an upper-upper limit.
More likely a handful. en-US, pause, de, pause, fr, pause, en-GB, pl,
es-ES, pause, pt-BR, it, ja ... to give active user counts. That doesn't
really equate to market share yet, as the users have different surfing
habits based on locale.

> So no I don't agree with your argument, although I agree its a point we
> need to consider.

I merely refer to Giacomo's post about the status of l10n in Italian, too.

>>> Are the people we want to ask even here? I don't think so, the people
>>> who find it too difficult haven't even started; the people who have
>>> experience on projects with better process are probably sitting in other
>>> projects. The people sitting on this list have already overcome and
>>> forgotten their newbie problems or have become so myopic that they
>>> cannot see beyond their technical solution to rethink it from scratch
>>> and examine their own assumptions and prejudices. And the people who
>>> have experience get shot down around technical debates.
>>>
>>> Why aren't we doing well? Here are my thoughts
>>>
>>> * Mozilla acts as an island. We have not learnt anything from other
>>> localisation projects. There is a real NIMBY attitude.
>> Whatever a NIMBY would be.
>
> Sorry wrong expression. I meant 'not made here'. When I talked to
> localisers at FOSDEM I was shocked at what they had to do to achieve
> simple things. They had no idea that other localisation projects had no
> similar problems.

Yes, the fact that we don't aggressively share knowledge with each other
is unfortunate. I must admit, I expected the meeting last year at FOSDEM
to have a bigger impact on this.

>>> * The process is driven by technical people not localisers. Notice how
>>> quick this thread became a debate about Gettext and PO. I saw only one
>>> mail that addressed the problems someone has encountered in localising.
>> Please note that the other half of this thread is actually in .planning,
>> where this discussion was supposed to happen. Sadly, the mailing list
>> part of the newsgroup didn't catch that.
>>
>>> * It takes too much time. Too much QA is manual. Why must people wait
>>> for tinderbox. Why are people having to spend so much time on lists
>>> instead of translating and improving translations.
>>>
>>> * We are too point release focused. There is no space to release and
>>> update translations quickly. No space for release early. No space for
>>> localisers to get a quality improvement pack out.
>> That's not true.
>>
>>> * QA is technical QA. All QA is about Trademarks, broken UI stuff,
>>> search and English text. All that wasted time and should be solved by
>>> other means.
>>>
>>> * There is no language QA. I'm not sure many teams do real QA, ie
>>> check their translations, and they certainly don't have tools that can
>>> help them to do this well (unless of course they use PO) then they're in
>>> a slightly better position. How many teams have a glossary, use TM? My
>>> guess close to none, indicating we're on the first rung of good l10n.
>> Says who? You? See above. Sorry for that, but there's a line to cross,
>> and you're on the wrong side.
>> That's pretty rude towards all the communities around the world that
>> managed to get together great localizations, in particular,
>> localizations that are much more successful in their regions than the
>> en-US one is in the US.
>
> Yes I say that. There are always teams that will do well. But the
> title says it all 100 not lets 'pat the good guys on the back and hope
> everything else magically happens'. There is no line, you asked for my
> input and I'm giving it. You can choose not to listen if you like.
>
> How would you, as in someone who is not a native speaker - not as in
> Axel, even know what a great localisation is? You have absolutely no
> way of knowing as this is not code that you can read. You have ZERO way
> of telling whether the translations are good, whether they are
> consistent, whether they conform to the team glossary, whether there are
> spelling mistakes or whether they are stylistically correct for the
> language.

Actually, there are a few things that I can check, as well as input from
other sources that do speak the languages in question.
Firstly, we have a good idea about active users in a particular locale,
and market share in the region.
So there are clear indicators of "Firefox sucks". Together with habits
on working on the localization as well as known intl bugs, I have at
least a gut feeling on how good a localization would be. At least I
would know if it sucks.

> We cannot pretend that 100% localised UI = good localisation.

That's why I never use those numbers to talk about quality.
And that's why you won't see me letting anyone get along with a lower
percentage, as that's just going to make things worse.

>> And we're in the process to get native language speakers outside of the
>> communities to do some testing, too. We're not aggressively spreading
>> that data, as we're currently evaluating how that feedback is coming in
>> and how to present it. Nevertheless, there are lots of QA efforts on the
>> localization going on, and being opensource and popular is bringing us
>> up to great speed here.
>
> So are you going to hire across 100 locales? I doubt it so I'm not
> concerned about that intervention as it probably only affects tier 1
> localisation. I'd rather see teams equipped to do this job themselves
> and for their community to give easy feedback. No bugzilla is not easy
> feedback.

We don't hire in a single locale. We contract some QA resources in some
locales, and that goes way beyond tier-1 or 2. We may contract
localization resources, if appropriate, as Seth Bindernagel pointed out
in his blog. This would be along with other outstanding members of our
engineering and QA community.

>>> * Firefox is big. 35,000 words (including toolkit/), OOo is 80,000.
>>> If you have to translate something that big before you see it. Man that
>>> is discouraging. A professional translator working fulltime would do
>>> that in 35 working days and that would exclude reviews. 7 weeks!
>>> Almost 2 months with no reward.
>> >
>>> * Localisation should not break Firefox. If it does we're doing
>>> something wrong. We should be catching all of those things before we
>>> make an XPI.
>>>
>> Yes to the two above, and we're working on it. Benjamin Smedberg has
>> some ideas, I have some, Rob Helmer and preed are interested in helping,
>> too. Stuff on the map are fallback strings for builds, though those will
>> definitely not be releases, those would be just for testing purposes.
>> And for making l10n on the trunk somewhat feasible. Other items include,
>> probably more short-term, speeding up the l10n build process to get the
>> development cycle down. Part of this is likely a rewrite of the
>> repackaging in python (only), which would make it easier for localizers
>> to build themselves.
>
> All I can say is great. This would probably be the biggest step towards
> getting people started and maintaining their enthusiasm.
>
> Ironically though we've already achieved all of this with our PO tools,
> PO checker and some hacks to the build scripts.
>
>>> Solutions:
>>>
>>> * Use the Translate Toolkit dammit :). Even just to publish official POT
>>> files. Even allow people to commit PO files to Mozilla l10n/ and
>>> automatically convert those files to .dtd and friends. We thus open
>>> Mozilla to the world of existing FOSS localisers without changing the
>>> current process for other people. Quick win, no arguments about formats
>>> or anything.
>> Thanks, but no thanks. I've seen too many bugs as result of this.
>
> And those bugs would be... hard for people to fix, easily trapped.
> Please be specific, instead of disparaging a tool.

Did that elsewhere.

> Then again I honestly thought, silly me, that this was a thread to help
> localisers and get their input. But amazing how in one line the door is
> shut and the champion for all us localisers is essentially saying, bugger
> off and do things my way. Sorry you wanted our input and when we give
> it we just get door closing rubbish.

To quote my original message,

> I would hope to get guestimates or number from folks like localizers, drivers,
> build, QA, BD and product.

Localizers are just one aspect of our process. From the other teams, so
far only build has responded, in .planning, though.

Note, there's nothing magical about 100, either in locale count or in
ratio of changed strings. It's merely a catch phrase to set out a bigger
goal: we need to adjust our process so that we can actually handle the
requests and release the locales that are eager to work within it.

>>> * Reduce the channels we need to follow to find information. My
>>> suggestion, a simple blog that focuses only on announcing deadlines,
>>> requirements and points back to the wiki if there is a need for more
>>> detail.

>> As I have been saying elsewhere, this newsgroup is the place to be.
>> Sadly, whenever I ask for feedback on a particular draft on either devmo
>> or wiki, there is sudden silence, which makes it hard to finish pages
>> like the ownership page I already linked to without totally going despot.
>
> Let me restate this one. The idea of a blog is that there is less
> noise. This newsgroup is noisy, so noisy that I can't keep up and thus
> miss important announcements. There is no simple place to see what
> exactly the rules are now. I don't care about rules shifting I just
> want to know what they are now, what the deadlines are, etc

That's not really working that well, as rules are changing, so there's
more than "those were the rules when I last read them". You also need to
watch the modifications to the rules.
And as said in my reply to Ricardo, where do you draw the line between
vital information and noise? Firefox? Which branches? Thunderbird?
Branches? Calendar? Targets? Branches? SeaMonkey? Minimo? Composer?

> In terms of the wiki I must be honest I find it so hard to find info on
> it as I get confused all the time and I don't have loads of time to
> unravel it.
>
>>> * Make it possible to easily build .xpi and installable builds for
>>> testing. Waiting for tinderbox and nightly builds is silly. Even
>>> better, as soon as I commit stuff to my l10n/ dir Mozilla builds an XPI
>>> for me for testing. Better yet an XPI that knows how to upgrade so that
>>> anyone using a testing XPI will get it and can use it magically.
>> Yep.
>>
>>> * Use Pootle. This will allow very raw newbies to begin translating
>>> using a simple web interface. This would even allow the registration
>>> process to happen in parallel. Would allow community bug fix
>>> contributions and encourage new users. It even allows a Help ->
>>> Translate option on the browser so that people can localise.
>> pootle is just an editor like any other. web interfaces only work for
>> some. It seems to me that you're proposing a change in process, which,
>> see above, may not result in a quality that we're aiming for.
>
> No not really. Pootle also allows goals, hosting by Mozilla, user
> suggestions, will eventually manage process tracking, does TM and
> manages glossary. So no, it's not another editor.
>
>> That said, I have seen discussions going actively in the summit here in
>> MV about using a more P2P-focused RCS, which might move the pain in l10n
>> from one point to another.
>>
>>> * Realise that most good translators will actually not use MT or even a
>>> PO Editor for that matter, since that would be the same as insisting
>>> that all people who code on Firefox should use vim. Have any of the
>>> more technical people thought about how your choices if forced on you by
>>> others is the equivalent of the editor wars? Our best translators use
>>> commercial tools like Wordfast. If we're serious about language QA then
>>> we need to use real l10n tools.
>> Sad for KaiRo and Ricardo, they apparently suck. Though last I checked,
>> they did translate the preference dialog.
>
> Instead of attacking me how about actually addressing the issue raised.
>
> Let me restate it as perhaps you didn't get it. Most translators are
> not cross skilled. Giving the example of 2 people is silly, we want
> 100's of people localising. In my environment the best localisers are
> professional localisers, and I can almost guarantee you that in all our
> other languages. How many people localising Mozilla are professional
> localisers? Asking someone who wants to give a little input to abandon
> all their skills and tools to fit into our approach means you get
> nobody. That is one reason why Pootle is actually a good idea, we get
> localisers contributing over lunch from the office and these are
> professional translators who translate for parliament and for the
> police.
>
>>> * Use pofilter in the toolkit (this does mean you need PO files, but I'm
>>> sure Mozilla can be adapted). This picks up translated variables,
>>> missing escapes, etc over 41 checks. As a localiser if you're not using
>>> this I'm not so sure you care as much about quality as you say you do.
>>> We use it and are pretty confident we don't have any technical induced
>>> translation problems ie our XPIs aren't going to break things. The aim
>>> here is to make sure we can never build a broken XPI because of problems
>>> in the translations.
>> None of those tests have anything to do with po. That's just the format
>> that you wrote them in. Your choice.
>
> And... exactly, we're adapting them to work with XLIFF. So there is no
> reason for anyone not to adapt them to work against .properties and .dtd
> files.
>
> The fact is that the tool exists and that no one needs to reinvent the
> wheel, just adapt it. Or use it against PO files, I know one person who
> does that and then edits the native files. The issue is without using
> such tools your QA is well not QA.
>
>>> * Create a system to properly mark entries in .dtd and .properties files
>>> so that we can automate checks on things that should not be translated.
>>> If we ever see a translator translate a config variable then we must be
>>> honest that the problem is with us not them. If 100 people have to ask
>>> should 'true' be translated, sigh, then we'll never get it. My ideal
>>> solution but probably harder to implement is to move all config
>>> information out of these files into a config file. Thus only things
>>> that must be translated are in the DTD or .properties.
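The two proposals above (running pofilter-style checks directly against the native files, and marking entries that must not be translated) can be sketched together. This is a minimal illustration, not pofilter itself: the `# DONT_TRANSLATE` marker comment, the simplified `.properties` parsing and the function names are all assumptions made for the example.

```python
import re

# Hypothetical marker: a comment line directly above an entry flags it as config.
NO_TRANSLATE = "# DONT_TRANSLATE"
# printf-style variables (%S, %1$S, %d) and DTD-style entity references (&name;).
PLACEHOLDER = re.compile(r"%(?:\d+\$)?[sSd]|&[A-Za-z][\w.-]*;")


def parse_properties(text):
    """Return {key: (value, locked)} for a simplified .properties file."""
    entries, locked = {}, False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == NO_TRANSLATE:
            locked = True  # applies to the next entry only
            continue
        if line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        entries[key.strip()] = (value.strip(), locked)
        locked = False
    return entries


def check_locale(en_text, loc_text):
    """Return a list of (key, problem) pairs for a locale checked against en-US."""
    en = parse_properties(en_text)
    loc = parse_properties(loc_text)
    problems = []
    for key, (en_value, locked) in en.items():
        if key not in loc:
            continue  # missing keys belong in a separate report
        loc_value = loc[key][0]
        if locked and loc_value != en_value:
            problems.append((key, "config value was translated"))
        elif sorted(PLACEHOLDER.findall(en_value)) != sorted(PLACEHOLDER.findall(loc_value)):
            problems.append((key, "placeholders differ"))
    return problems


EN = "# DONT_TRANSLATE\nbrowser.rtl=false\nwelcome=Welcome to %S\nsaved=Saved %1$S in &folder;\n"
DE = "browser.rtl=falsch\nwelcome=Willkommen bei %S\nsaved=Gespeichert\n"

for key, problem in check_locale(EN, DE):
    print(key, "->", problem)
```

Only the en-US file carries the marker in this sketch, so a localizer cannot unlock an entry by accident; a real implementation would also need to handle escapes, line continuations and .dtd entities.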


>>>
>>> * Create more tiers with well defined goals. This would allow us to
>>> define a newbie tier where a team can release anything at any time.
>>> It's never officially released but is available for the brave and for
>>> testing.

>> This is likely going to happen, in one way or another.
>>
>>> * Direct translators. Since it is so big what should people do first so
>>> that they see quick results. This would also allow us to define better
>>> requirements for completeness for a release. So even with someone on
>>> 80% we'd know better if that will still make a good UI experience.
>>>
>>> One last thought. Mozilla Corp through Axel has asked a question. If
>>> you don't like the answers is that because the answer is wrong or
>>> because you see the world through your own tinted spectacles. Every
>>> person who raised issues that made Mozilla l10n hard for them has raised
>>> a valid point, think about that before dismissing ideas.
>> Sure. Not that a "do it my way, dammit" is gonna work. We're a
>> successful project, with lots of priorities. And we're not going to drop
>> the mindshare of a few hundred people if we can avoid it.
>
> I'll always sing my way of doing it. Simply because I see localisers
> struggling with issues that we have never ever had to deal with. And I
> see them not achieving their best possible because all the localisation
> decisions are led from a technical "we won't budge" point of view.
>
> Unfortunately with this approach there is the reality that you won't
> gain the mindshare of the 1000s of other people who are already active
> and you just need to harness them.
>
> That's why I started my reply by raising the strategic issues. And I
> think those are either not clearly defined or are being ignored as from
> where I sit Mozilla cherry picks ideas - ones that I now realise are all
> relatively simple and need coding which makes them cool. Yet
> steadfastly refuses to look at solutions that make localisation easier
> and with higher quality.
>
>> And surely we're not going to do so for yet another crappy compromise,
>> like PO in CVS.
>>
>> Dropping source l10n completely seems totally out of the question for
>> me, if others disagree, I'd like to see rationales.
>
> Ah yes, always open to new ideas. The request for PO in CVS is simple.
> It gives teams the option of using PO and the tools that they can use.
> It gives Mozilla the option to host Pootle and not have to see all
> their teams move over to Rosetta where they will have even bigger
> issues.
>
> It makes the localisations stay within the framework of the Mozilla
> project.
>
> There was no request, well certainly not from me, to move away from
> source l10n. My request is simple, make PO an accepted localisation
> medium in CVS. Enhance the tools to automate the po2moz roundtrip
> and make teams themselves responsible for the .properties and .dtd's
> that they produce.
>
> That would be a first step, see how it goes, you lose nothing. We as in
> people who now use PO gain a lot and you also gain the potential new
> contributors.

As Ricardo properly realized, I don't see any resources to guarantee
that this process stays up.

Multilingual files in CVS (or any VCS) together with an active
development process is just a bug. I don't see how anybody could think
otherwise, changing a hundred files for a typo-fix in en-US is just
unfixably broken. Multilingual files are cool for a freeze-then-l10n
process, they totally blow for a cooperative development process. Which
we need in order to find our l10n bugs earlier in the development
process and to make our features work as well as possible globally.

Axel

Momar Dieng

Nov 22, 2006, 8:07:17 PM11/22/06
to dev-...@lists.mozilla.org
I have been reading this thread with a lot of interest, and decided to
raise my hand since I am one of those people Dwayne is asking about
below. I have to preface anything I am going to say by warning that I am
a complete newbie to all of this and I may say things veterans/experts
find laughable. But that's one of the points of this discussion, right? :-)

Background: I have been trying to gather/lead a diverse, west-African
l10n team for the past couple of months. The other people are mostly
linguists/language experts, and I am the "tech person" in the group even
though I have just started learning most of it in the past few months.
The idea is to produce free/open source software in Senegalese (and
later other west-African) languages that we hope will complement
initiatives like MIT's "$100 laptop project". Many of us feel that the
stars are aligned to make basic education/computing in native african
languages an imminent reality. Apart from economic advantages which I
won't go into, we have very strong ideological feelings about why our
countries should try harder to make at least basic primary education and
computing in the native languages an option for our populations. We
think making this happen will also take a stab at some of the structural
issues that underlie "underdevelopment". We would like to convince our
government(s) to invest in the open source movement by showing them what
is possible with (close to) no money and a lot of will, strategic
thinking and community spirit. Our group is starting with modest test
projects. The first is Abiword (http://www.abisource.com/) in the Wolof
language of Senegal. Firefox was tied at #1 on my list but the learning
curve seemed just too great (remember: complete newbie). In comparison I
had no trouble grabbing Abiword sources, following the l10n instructions
on the site and creating the .pot file. I do realize we are talking
about different scales and a similar process for Firefox could be
harder to streamline just because the size/structure is probably more
daunting. But I think it is well worth the effort. From the newbie
perspective the whole PO process makes a lot of sense. The bits I could
gather online made me decide that trying to do the same for Firefox was
not (yet) worth the effort because it seems very ad-hoc and much more
experimental than I am ready to attempt. Meanwhile, in 2 or 3 weeks of
learning from scratch I got a web-based Abiword l10n effort on the way
(the Abiword part actually took an hour or so, after I understood what
PO files were; the web-based part is what took a couple of weeks).

Why should your/our community care? I can only speak for (West) Africa
but I am sure this applies to other places. Apart from South Africa, and
maybe a few other exceptions, the overwhelming majority of African
countries are in the dark when it comes to computing (no pun intended).
However that is not going to be forever and in fact it is changing right
now. For one thing broadband internet connectivity is increasing quite
rapidly in cities and pushing into rural areas. Several reasons make our
countries prime candidates to join the open source community and make it
bigger stronger and better (I'm a strong believer in case that wasn't
clear). For one it is financially the soundest choice for us, for
obvious reasons. Also, because we are building much from scratch, we
don't have the inertia factors other countries would have when
confronted with the option to switch to open source products both at the
institutional and individual level. For instance we are lobbying the
Senegalese government to adopt some Linux desktop flavor system-wide by
showing them what they would save in the long run, and it just might work.

Next, Africa has so many languages that l10n is not a luxury, but a
necessity for us even within the smallest countries. Senegal has only 11
million people but 3 different main native languages and about 8 or 9
more that are spoken by minorities. Again, open source software with a
streamlined l10n environment is really the only way I see to handle the
diversity and avoid internal frictions by enabling every language
community to (help) develop its own l10n. We could never get the kind of
flexibility we need from proprietary software and the market is just not
there. An approach that would convince the "powers that be" that not
only is computing in local languages possible, but it can be done
cheaper and relatively easier in the many native languages using open
source could be very successful. For this the l10n process will have to
be streamlined to a level that makes it more straightforward than it is,
not just for Firefox. There are close to a billion people (in Africa)
who are an easy target for the open source community if the environment
is right when the time is right. The effort to make things streamlined
now could be repaid manifold when all these people jump on the
bandwagon, and start contributing back.

One last thing about web-based l10n tools... Although some people
earlier expressed dislike/reluctance about them, they are a necessity for
us, at least at the beginning. The reality is that most of the people
who would be leading/doing this on the African side would probably be in
the Diaspora between Europe and North America. Our own group is split
between France and the US. It just makes more sense to have a web-based
community effort so we can pool our limited resources more effectively.
We settled on using Entrans, a simple but nice web-based PO-editing tool
developed by an Indian community for the purposes of their own thriving
web-based l10n effort (see http://kannada.sampada.net/ ; Entrans can
be seen in action at http://translate.sampada.net/main.php).
Sorry, this was longer than I intended, but hope this gives another
perspective.

Momar

Axel Hecht

Nov 22, 2006, 8:19:22 PM11/22/06
to
Dwayne Bailey wrote:
> On Wed, 2006-11-15 at 21:23 -0800, Ankit Patel wrote:
>> ----- Original Message ----
>> From: Robert Kaiser <ka...@kairo.at>
>> To: dev-...@lists.mozilla.org
>> Sent: Thursday, November 16, 2006 5:36:19 AM
>> Subject: Re: PO format usage
>>
>>> João Miguel Neves schrieb:
>>>> * I know of no project that uses other format with more than 50 or 100
>>>> locales.
>>> Is there any project using PO format that has 50 *fully complete*
>>> localizations, like Firefox has now with our supposedly inferior formats?
>> Open source projects like Gnome, KDE, Xfce, Fedora, etc. are using PO
>> formats to get localization support, and I think in these projects there
>> will be at least 20 locales which have *fully complete* localization, at
>> least 50 locales which have 80% localization, and definitely 10 locales
>> which have 50% localization. I agree that there is no point in using a 50%
>> localized application, but at least translators all over the world are
>> encouraged to start their work, as they find it an easy process to get
>> their language included in these projects, which is quite tough in Mozilla
>> as far as I know.
>>
>>>> * Epiphany and Konqueror, which in terms of usage are minimal, have
>>>> more localizations than Firefox.
>>> Again, how many of them are *fully complete*?
>> I think 80% is enough for the end user to feel the localization of application.
>
> Thanks for saying that. The notion of 100% is a farce. We need to
> translate the strings that people will see 80% of the time. 100% is
> ideal but I'm sure you have translated error messages that no one has
> ever ever seen.

Hrm. What is that number supposed to mean? Whenever a user sees a
translated string, you tick up the counter, and 20% of those ticks are
allowed to fail? That means that there is at least one bug in the main
menubar, and probably one in the bookmarks menu. Or so.

We should really get away from the "you need foo%", I personally don't
want to give out "a magic number" of translation coverage, as it's not
going to be 100% changed strings. Things like "ltr" as DOM direction for
example may be perfectly localized as the same string, and there are
others. Feature names could stay in English, too, or not. Accesskeys
depend largely on locale and language group. So where the number for
"complete" is depends on the language. And I don't want people to hack
in invisible chars randomly in the English copy of a string, just to
confuse a test.
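To make the point about unchanged strings concrete, here is a minimal sketch of such a count. The key names and the `same_ok` allow-list are assumptions for illustration; the list would have to be maintained per locale, which is exactly why a single percentage says little.

```python
def unchanged_strings(en, loc, same_ok=frozenset()):
    """Split keys whose localized value equals en-US into expected and suspicious."""
    expected, suspicious = [], []
    for key, value in en.items():
        if loc.get(key) != value:
            continue  # changed or missing, not this report's concern
        (expected if key in same_ok else suspicious).append(key)
    return expected, suspicious


# Hypothetical entries: a DOM direction, a feature name and a menu label.
EN = {"intl.dir": "ltr", "feature.liveTitles": "Live Titles", "menu.file": "File"}
DE = {"intl.dir": "ltr", "feature.liveTitles": "Live Titles", "menu.file": "Datei"}

expected, suspicious = unchanged_strings(EN, DE, same_ok={"intl.dir"})
print("unchanged on purpose:", expected)
print("needs a human look:", suspicious)
```

The "suspicious" list is only a prompt for review. A feature name kept in English may be a deliberate choice, and nothing here detects a string that was changed but is still wrong.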

On the other hand, I do think that in particular for odd edges in our
product, suddenly turning over to English is likely a really bad user
experience.

Test builds are something different here, but for releasing Firefox in a
particular language, "really complete" is the target for me.

Like all software, localizations are going to have bugs. Bugs are OK, "I
don't bother translating this" is not.

>>> As the PO/gettext framework allows incomplete translations and ours
>>> doesn't at the moment, I think a pure number of localizations is
>>> just unfair. I'm pretty sure the numbers wouldn't differ much if we
>>> could allow partial localizations (and that includes en-CA, en-AU, etc.)
>>
>> Two important points from my side:
>> 1. Translators prefer to work on PO format files
>> 2. java .dtd & .properties files don't show the minor changes, we always have to compare the English files with our lang files using the compare-locales.pl file & start working, which I think is not perfect.
>
> I call these files monolingual. They will always have a problem of not
> being able to track their own changes. Bilingual files like PO and
> XLIFF do not suffer from this problem.

You can't track your own changes? Or do you think about not being able
to sign-off on changesets? If you want that, the right tools to support
that are VCSs with atomic commits, and tools that show which of those
affect your files and then have a database on your own to notch them
off. This process data does not belong into the source, though. No need
to go back to 20th century flat-file-eddies, as much as we love them.

Not that I want to make such a database part of our process. It might be
something we could do in the mozilla2 timeframe, though, if we find
web-service hacker resources to support it.

> In PO when a new message arrives its blank for me and I translate it.
> If the English changes it goes fuzzy (and thus will not be used by
> po2moz) and I can change it. The latest release of Gettext adds a way
> to store the previous translation so your editor could then easily give
> you a diff between the previous and current English text.

Now apply a typo fix in en-US here, or add an ellipsis. Yuck.
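Both sides of this exchange can be seen in a small sketch of the fuzzy mechanism. It is a simplified model, not gettext or po2moz: entries are keyed by a stable string id (real gettext matches on the English text), and the export falling back to English for fuzzy entries is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    msgid: str                            # current English source text
    msgstr: str = ""                      # translation, empty if untranslated
    fuzzy: bool = False                   # True when the English changed underneath
    previous_msgid: Optional[str] = None  # old English text, kept for diffing


def merge(old, new_sources):
    """Carry translations over to a new set of English strings."""
    merged = {}
    for key, source in new_sources.items():
        prev = old.get(key)
        if prev is None or not prev.msgstr:
            merged[key] = Entry(source)   # new or never translated: blank
        elif prev.msgid == source:
            merged[key] = prev            # English unchanged: keep as is
        else:
            merged[key] = Entry(source, prev.msgstr, fuzzy=True,
                                previous_msgid=prev.msgid)
    return merged


def export(entries):
    """Use a translation only if it exists and is not fuzzy, else English."""
    return {key: (e.msgstr if e.msgstr and not e.fuzzy else e.msgid)
            for key, e in entries.items()}


old = {"save.label": Entry("Save As", "Speichern unter")}
merged = merge(old, {"save.label": "Save As...", "open.label": "Open"})
print(merged["save.label"])
print(export(merged))
```

Adding an ellipsis to the English is enough to mark the translation fuzzy, so the exported build shows English until a localizer confirms the entry. That is the tracking benefit one side describes and the churn the other side objects to.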

Really, PO-files are a horrible format, and really force you to use a
tailored editor to get them right. That's not the way to go. Going from
lock-out to lock-in is just not the right decision to make.

I'm not resisting change here, I'm just resisting change for the popular
worse.

Axel

Novica Nakov

Nov 22, 2006, 8:44:50 PM11/22/06
to

> It'd be interesting to know which locale you're talking about ;-).

Sure: mk.

> And there is a way for us to check for unchanged strings in a locale,
> but that has nothing to do with non-localized. Nor are there good ways
> to prevent spoofing that number by introducing worse bugs.
>
> Thus I'm not looking at those numbers, on purpose.

Nice. But the strings are left because at the moment there is no way
those can be translated. We are waiting for the linguistics people to come
up with the words.

> I'd say the contrary, see Giacomo's reply, for example.

I don't need to see Giacomo's reply. So Italian GNOME l10n is not that
good. OK. Again and again, your question is how to get to 100 locales.
It is obvious that the current model (that includes all this stuff like
MT, .properties, cvs, etc.) doesn't seem like the way to do it
(otherwise you wouldn't have asked the question). So maybe, just maybe,
POT can help, or maybe something else (see my other posts here).

Actually what I would like to see is a summary of this discussion and an
official position of MoCo. MoCo asked a question, the l10n community
replied. So what is MoCo's position on getting to 100 locales after this
discussion?

--
Novica

Axel Hecht

Nov 22, 2006, 9:02:53 PM11/22/06
to
Momar Dieng wrote:
> I have been reading this thread with a lot of interest, and decided to
> raise my hand since I am one of those people Dwayne is asking about
> below. I have to preface anything I am going to say by warning that I am
> a complete newbie to all of this and I may say things veterans/experts
> find laughable. But that's one of the points of this discussion, right? :-)

Yep.

Sounds like we really need to fix our documentation story much more than
anything else here.

Coming from Europe, in particular from Germany, I know exactly how
valuable l10n is for the success of a product like ours.
There are good reasons for Firefox being more popular in Germany than in
the US, and a top-notch localization simultaneously shipped is one of
the key reasons.

The other important part though is, "top-notch" and "simultaneously
shipped" are not just nice to have, they're essential. Miss one, and you
won't be able to reach out to a whole bunch of users.
We should be totally aware that users won't go out and give the en-US
version a try if the l10n version is bad.

> One last thing about web-based l10n tools... Although some people
> earlier expressed dislike/reluctance about them, they are a necessity for
> us, at least at the beginning. The reality is that most of the people
> who would be leading/doing this on the African side would probably be in
> the Diaspora between Europe and North America. Our own group is split
> between France and the US. It just makes more sense to have a web-based
> community effort so we can pool our limited resources more effectively.
> We settled on using Entrans, a simple but nice web-based PO-editing tool
> developed by an Indian community for the purposes of their own thriving
> web-based l10n effort (see http://kannada.sampada.net/ ; Entrans can
> be seen in action at http://translate.sampada.net/main.php).

Entrans has a compelling aspect, it's demoed on a server where you can
actually see something without waiting for ages. I like the suggestion
part of it. Which makes me think that they may not be working on PO
itself. Pootle reports to have something like that, too, but I guess I'd
have to register. Not sure why. I had a hard time finding documentation
on the actual tool, can you help out there? Dwayne could report on where
suggestions and votes are stored in pootle, too. (Isn't it fun to see
that the west african team uses a tool from india instead of one from
ZA? Talk about losses by lack of agreement on standards.)

About the cooperation, could you point out which process you're actually
using? It seems to me that you're in something that I would call a
"scratch-pad" phase, and that the web-based part is really just saying
that you want something peer-to-peer without glamorous fancy notations
like you find in tools like mercurial.

On the diaspora side, you said diaspora between europe and US, which
would be africa (to neglect the atlantic here for a moment, and some
north/south aspects of "between"). On the other hand, your team is in
the US and France.
I'm really curious on this part, as it basically says that localizations
for african languages may for the most part not come from the African
continent. This is really interesting in the light of webtools, as
online time is much more rare there. In particular for african locales,
I would have expected off-line editing to be an important aspect for a
localization.

> Sorry, this was longer than I intended, but hope this gives another
> perspective.

It's a tad shorter if you remove the quoted part :-).

Axel

Axel Hecht

Nov 22, 2006, 11:40:23 PM11/22/06
to
Novica Nakov wrote:
>
>> It'd be interesting to know which locale you're talking about ;-).
>
> Sure: mk.

Didn't know that you contributed there. Nice.

>> And there is a way for us to check for unchanged strings in a locale,
>> but that has nothing to do with non-localized. Nor are there good ways
>> to prevent spoofing that number by introducing worse bugs.
>>
>> Thus I'm not looking at those numbers, on purpose.
>
> Nice. But the strings are left because at the moment there is no way
> those can be translated. We are waiting for the linguistics people to come
> up with the words.

This is one of the interesting parts of l10n. Yes, there are words that
make sense to translate, or not. As for feature names, leaving them with
the en-US names has the advantage that users speaking English don't have
to rely on, say, Macedonian documentation, but can just search the web.
Take "Live Titles" for example, you'll hardly find something for that in
Macedonian. And non-English speakers are usually not aware of the fact
that "live title" doesn't convey *any* meaning to a native English
speaker. Like, zilch. Same for tabbed browsing. Once you know what that
is, you can remember those names, but the names themselves don't hint you at
anything. Translating those really only makes sense for locales that
have a strong opinion about using English in everyday language, and that
are used to find funny islands of terms instead, like France.

>> I'd say the contrary, see Giacomo's reply, for example.
>
> I don't need to see Giacomo's reply. So Italian GNOME l10n is not that
> good. OK. Again and again, your question is how to get to 100 locales.
> It is obvious that the current model (that includes all this stuff like
> MT, .properties, cvs, etc.) doesn't seem like the way to do it
> (otherwise you wouldn't have asked the question). So maybe, just maybe,
> POT can help, or maybe something else (see my other posts here).

My question was where our current process has points that keep us from
growing significantly. There are enough locales in the queue that would
be eager to start, but don't really have a good way to start. So fixing
that will be a priority.
Making it more discoverable on what localizers should do with a
particular string is another, and we'll severely change our feature
development process to fix that. That's not online yet, but we decided
on that last week, and I expect that to materialize once the dust
settles on all those poor turkeys. That would be some of the product part.
One big chunk that I haven't gotten to yet is QA.
And I'll probably have to poke Mic about the BD part of things, as she's
not so familiar with our communication habits. Bad me, I should have
just CCed here.
Builds is probably full of somewhat low-hanging fruit to speed up the
process, making builds with fall-back en-US content is more involved. I
don't yet know how to usefully display build logs for l10n, if anybody
has a clue, that'd be good. My status page is better than nothing, but
really only works for one product.
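As a rough illustration of what a build with fall-back en-US content involves at the string level, here is a minimal sketch. It is not how the Mozilla build system does it; the function name and the flat key/value model are assumptions, and real packaging works on whole files and chrome registration.

```python
def merge_with_fallback(en, loc):
    """Build a complete string table for a locale, filling gaps from en-US.

    Returns the merged table, the keys that fell back to English, and the
    keys present in the locale but no longer in en-US.
    """
    merged, fallbacks = {}, []
    for key, en_value in en.items():
        if key in loc:
            merged[key] = loc[key]
        else:
            merged[key] = en_value
            fallbacks.append(key)
    obsolete = sorted(set(loc) - set(en))
    return merged, fallbacks, obsolete


# Hypothetical strings for a partially translated locale.
EN = {"menu.file": "File", "menu.edit": "Edit", "menu.view": "View"}
MK = {"menu.file": "Датотека", "menu.old": "Старо"}

merged, fallbacks, obsolete = merge_with_fallback(EN, MK)
print(f"{len(fallbacks)} of {len(EN)} strings fall back to en-US: {fallbacks}")
print("obsolete in locale:", obsolete)
```

The two extra lists are the kind of per-locale summary a build log or status page could show, instead of the raw build output.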

> Actually what I would like to see is a summary of this discussion and an
> official position of MoCo. MoCo asked a question, the l10n community
> replied. So what is MoCo's position on getting to 100 locales after this
> discussion?

As indicated above, I don't consider this discussion to be done. Apart
from that, anything I say may not be MoCo's decided point of view, but
the group determining the view will include me. If you'd ask anyone at
MoCo to round this up, it'd end on my lap. And anything that I don't
fight for has a much smaller chance to happen.

One thing I can probably say yet another time, multilanguage files as a
l10n source format is not anywhere near the process I want to see us
going. The comments on that are spread out across the thread, but that's
fair. The reasons to switch to one are just as spread out, and probably
not completely spoken out. Not by attitude, but merely because the "it
works with PO" is so much easier to think with than a "I could really
use this particular process".
I'm pretty sure that at least some of those processes are not applicable
for Firefox, at least not in the product stage of a localization.
So whenever I hear PO or just multilingual file format, I try to find
out what the process is behind that.

One thing that might not be stated as verbosely as it should be, nobody
is interested in shipping a localization of a particular version of some
product/project inside Mozilla. What we're interested in is having
members in our community that maintain a localization across multiple
partially co-existing versions. That excludes l10n projects that drop by
every other week, at least for the immediate future. I don't think that
that should keep us from having many more locales though, as Firefox,
once it is in your language, is really something that gives you a rush.
A totally new dimension of rush, if you did other projects so far.

I am still convinced that the changes in the processes that come from
using, say, PO, are minor compared to the impact of that change to the
development process at large. And I am totally positively sure that if
we did a change of that magnitude, it should really offer all that
today's knowledge offers, which is not PO. Tracking changesets for
example is something that today's VCSs can provide much more information
about, and I expect us to use one that can in the timeframe that would
be appropriate for a major architecture move.
In addition, this is not really just about Firefox, but we'd likely need
to move over the much more change-resistant extension ecosystem.
At this point, I sacrifice a few small languages for the success of the
product we have and that covers roughly (stats by Gerv as I recall them)
95% of the internet population.

Another summary point would be, Mozilla decided that documentation
doesn't get written by request, but by people. That's why we moved our
websites to hold that to wikis, as the slightly-more-formal process of
getting write access to mozilla.org wasn't working.
Documentation of any kind is a community effort, apart from the small
part that is actually tracking policy decisions.

The l10n community needs to share the knowledge they have across teams.
Looking at the islands that popped up in the thread, it seems that the
l10n community only transfers knowledge if there's one team working on
multiple projects, or just, not really. Even the PO-based community
seems to be totally fragmented and busy reinventing wheels.

That makes me think that if the l10n community within the Mozilla
project more actively shared information on our island, enabling other
islands would not look as attractive as they do for some now. And there
is so much stuff that's special for a successful webbrowser, that that
sharing needs to happen, come rain or shine.

Axel

Ankit Patel

Nov 23, 2006, 12:00:13 AM11/23/06
to Axel Hecht, dev-...@lists.mozilla.org
Just one thing I wanted to say at last. "Change is a fact of life". Everything needs change. What's the problem if we try at least once to follow .PO formats? If we aren't comfortable then let's go back to .dtd & .properties again. At least we should give it a try once.

----- Original Message ----
From: Axel Hecht <l1...@mozilla.com>
To: dev-...@lists.mozilla.org
Sent: Thursday, November 23, 2006 10:10:23 AM
Subject: Re: PO format usage

Axel



João Miguel Neves

Nov 23, 2006, 3:30:33 AM11/23/06
to Axel Hecht, dev-...@lists.mozilla.org
Qui, 2006-11-23 às 01:03 +0100, Axel Hecht escreveu:
> "Willing" is not really the right term, it's just that the benefits of
> po-files as an intermediate don't justify the efforts as I can tell. The
> efforts would be significant, like, we had to turn all tinderboxens red
> if someone checked in some xul where .accesskey and .label aren't
> followed, which may have been to be protected by npob or something like
> that.
> I really think that that manpower can be spent elsewhere much more usefully.
>
Just to note that by not doing that, you're forcing the locales who ARE
using po files NOW to do that work, instead of having it done by the
machine in a coherent way. Please don't just talk like po files aren't
being used for mozilla projects. They are. If you can help us, please
do, but don't say that reducing work and complexity for some of the l10n
teams is not worth it.

Thanks,
João Miguel Neves
