The L10N team is about to start working on its current primary target:
support for Locale::Maketext, localizable templates and Perl code.
There are several questions which I would like to discuss, such as:
Custom template support;
Directory structure;
How to maintain backwards compatibility;
Transition problems;
Build tools, test suites, etc.
Please take a look at
https://wiki.mozilla.org/Bugzilla:L10n:Roadmap#Use_Locale::Maketext
and bug https://bugzilla.mozilla.org/show_bug.cgi?id=407752
Waiting for your feedback,
Regards,
Vitaly.
The Maketext description was recently expanded (and moved to a separate
wiki page). I also tried to address some concerns already expressed by
developers.
https://wiki.mozilla.org/Bugzilla:L10n:Maketext
I would most appreciate your feedback on some questions that are still
unanswered:
* Directory structure and .po files placement
* How many .po files per language we need?
* Is Locale::Maketext::Fuzzy practical?
* Work breakdown needs review
Thanks in advance.
Regards,
Vitaly.
I think they should go into a new directory, something like
lang/<language>/
> * How many .po files per language we need?
Well, we're going to have an enormous number of strings, I
believe. I think we should probably have one .po per top-level template
directory that we have now. Otherwise we'd be reading in a huge file
for every page load, which would be a big problem under mod_perl (which
would never release that memory).
That is, unless we just pre-compile translated templates, in
which case it could all be in one file (but I think having such a huge
file would still be a problem for some text editors).
> * Is Locale::Maketext::Fuzzy practical?
No, we should avoid it if we can, I think. The standard
%quant() system used by most Maketext implementations seems better to
me.
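
To make the quant idea concrete, here is a minimal sketch of how a "%quant(%1,singular,plural)" placeholder could be expanded. It is written in Python purely as an illustration and is not Locale::Maketext's implementation; the function names `quant` and `render` and the sample strings are hypothetical.

```python
import re


def quant(count, singular, plural=None, zero=None):
    """Simplified stand-in for a Maketext-style quant: pick a noun form by count."""
    if count == 0 and zero is not None:
        return zero
    if count == 1:
        return f"{count} {singular}"
    # With no explicit plural, fall back to naive "add an s" pluralization.
    return f"{count} {plural if plural is not None else singular + 's'}"


_QUANT = re.compile(r"%quant\(%(\d+),([^)]*)\)")
_PARAM = re.compile(r"%(\d+)")


def render(template, *args):
    """Expand %quant(%N,...) first, then plain %N placeholders (1-based)."""

    def expand_quant(match):
        count = args[int(match.group(1)) - 1]
        forms = match.group(2).split(",")
        return quant(count, *forms)

    text = _QUANT.sub(expand_quant, template)
    return _PARAM.sub(lambda m: str(args[int(m.group(1)) - 1]), text)
```

With this sketch, `render("%quant(%1,bug,bugs) found in %2", 3, "TestProduct")` gives "3 bugs found in TestProduct". The translator sees one whole sentence in the lexicon and supplies the noun forms for the target language.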
-Max
--
http://www.everythingsolved.com/
Competent, Friendly Bugzilla and Perl Services. Everything Else, too.
> Bugzilla/* means Linux packagers will have to separate po files to
> another directory, as it contains only Perl code for now, and code goes
> to /usr/lib/perl5/...
Agreed, valid reason for not going this way.
> default/custom separation is from the days when no flexible version
> control systems existed, IMHO. Personally I have no problem maintaining
> customizations on top of default/* hierarchy under git.
And what about per-project and per-extension templates?
> VF> * Is Locale::Maketext::Fuzzy practical?
> Will xgettext.pl create entries for special cases (and special cases
> will grow in combinatorial numbers)? If not, then it will be hard to
> use.
Well, when the code that builds a full sentence is long and sophisticated
(as can be seen in the "auth_failure" error message), it may be practical to
wrap the entire snippet in [% |l %]...[% END %] rather than struggle
with single words translated without context.
xgettext.pl will not help here, for obvious reasons. I'd suggest adding
pseudo-comments that list the variants of text this code snippet can generate.
These variants may be handled by good old static lexicon entries (yes,
they may grow in combinatorial numbers) or by fuzzy entries. As fuzzy
entries will predictably be FAR fewer than static entries,
perhaps we could keep TWO lexicons, 'static' and 'fuzzy', and use explicit
fuzzy calls like '[% |l_fuzzy %]...[% END %]'.
This would save us from a 'charge of the light brigade': iterating through
thousands of keys each time a fuzzy match is required.
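
The two-lexicon idea could look roughly like the sketch below. It is Python used only for illustration, not Locale::Maketext::Fuzzy itself; the names `STATIC`, `FUZZY`, `l` and `l_fuzzy` and the sample strings are assumptions. The point is that the plain filter stays a single hash lookup, and only the explicit fuzzy filter walks the (short) list of patterns.

```python
import re

# 'static' lexicon: exact string -> translation, one hash lookup per call.
STATIC = {
    "Log in": "Anmelden",
}

# 'fuzzy' lexicon: a short list of (pattern, translation template) pairs.
# Hypothetical entry; captured groups fill the {0}, {1}, ... slots.
FUZZY = [
    (
        re.compile(r"^You must wait (\d+) minutes before trying again\.$"),
        "Bitte {0} Minuten warten.",
    ),
]


def l(text):
    """Static lookup only; untranslated text passes through unchanged."""
    return STATIC.get(text, text)


def l_fuzzy(text):
    """Try the fuzzy patterns first, then fall back to the static lexicon."""
    for pattern, translation in FUZZY:
        match = pattern.match(text)
        if match:
            return translation.format(*match.groups())
    return l(text)
```

Because fuzzy matching is requested explicitly, the cost of iterating over patterns is paid only for the few snippets that need it, and it scales with the size of the fuzzy lexicon rather than the static one.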
File size is not a problem, there are Translate Toolkit tools and
special .po file editors (and yes Emacs major mode, too).
>> * How many .po files per language we need?
> Well, we're going to have an enormous number of strings, I
> believe. I think we should probably have one .po per top-level template
> directory that we have now. Otherwise we'd be reading in a huge file
> for every page load, which would be a big problem under mod_perl (which
> would never release that memory).
Mikhail Gusarov wrote:
> Why separate web/cli/email messages?
My primary concern here was lexicon loading overhead, not its memory
footprint. If we can reduce from one loading per request to, say, one
per single persistent http connection -- worth it? Single Bugzilla
object would cache only relevant lexicons, best matched by Accept-Language.
The size and loading overhead of a single lexicon may also be reduced by
splitting it into several 'caller contexts':
- web (CGI scripts, pm modules, web templates)
- email (cron scripts, pm modules, email templates)
- command line (.pl scripts, pm modules, global templates only)
This leads to approximately six lexicons:
- Generic templates (global/*)
- Email templates
- Web templates (the rest)
- Bugzilla.pm and below
- .cgi scripts
- .pl scripts
This split serves no other purpose but to avoid loading all of them
every time.
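
A rough sketch of the caching this split is meant to enable, in Python for illustration only. The mapping from caller context to lexicons is one possible reading of the lists above, and `load_lexicon`, `lexicons_for` and `LOAD_LOG` are hypothetical names; the loader is a stand-in for parsing a real .po file.

```python
# One possible mapping of caller context -> lexicons it needs (an assumption).
CONTEXT_LEXICONS = {
    "web": ("global", "web", "modules", "cgi"),
    "email": ("global", "email", "modules", "pl"),
    "cli": ("global", "modules", "pl"),
}

LOAD_LOG = []  # records every real load, so the saving can be observed
_cache = {}    # (lexicon name, language) -> dict of entries


def load_lexicon(name, lang):
    """Stand-in for reading and parsing one .po file from disk."""
    LOAD_LOG.append((name, lang))
    return {f"{name}:hello": f"hello ({lang})"}


def lexicons_for(context, lang):
    """Return merged entries for a context, loading each lexicon at most once."""
    merged = {}
    for name in CONTEXT_LEXICONS[context]:
        key = (name, lang)
        if key not in _cache:
            _cache[key] = load_lexicon(name, lang)
        merged.update(_cache[key])
    return merged
```

In a persistent process the cache outlives a single request, so a second call to `lexicons_for("cli", "de")` loads nothing, and a later `lexicons_for("web", "de")` loads only the two lexicons not already cached.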
Regards,
Vitaly.
>> Why separate web/cli/email messages?
VF> My primary concern here was lexicon loading overhead, not its
VF> memory footprint. If we can reduce from one loading per request
VF> to, say, one per single persistent http connection -- worth it?
Anyway, IIRC gettext loads entries from .mo files on demand, so
splitting looks like a premature optimization even without
per-http-connection trick.
Maybe it is time to measure a bit?
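
As a starting point for such a measurement, here is a small timing sketch. It is Python for illustration only and does not exercise the Perl modules under discussion; the parser handles just single-line msgid/msgstr pairs, and `make_po`, `parse_po` and `time_load` are hypothetical names. A real measurement would time the actual lexicon loader on the actual .po files.

```python
import time


def make_po(n):
    """Generate a synthetic .po-like text with n single-line entries."""
    return "".join(
        f'msgid "string {i}"\nmsgstr "translation {i}"\n\n' for i in range(n)
    )


def parse_po(text):
    """Very simplified parser: single-line msgid/msgstr pairs only."""
    lexicon, msgid = {}, None
    for line in text.splitlines():
        if line.startswith('msgid "'):
            msgid = line[len('msgid "'):-1]
        elif line.startswith('msgstr "') and msgid is not None:
            lexicon[msgid] = line[len('msgstr "'):-1]
            msgid = None
    return lexicon


def time_load(n, repeats=5):
    """Best-of-repeats wall-clock time, in seconds, to parse n entries."""
    text = make_po(n)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        parse_po(text)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    for n in (1000, 10000):
        print(f"{n} entries: {time_load(n) * 1000:.2f} ms")
```

Comparing the time for one large lexicon against the sum for the smaller per-context ones would show whether the split pays for itself.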
--
I had a look at the sources. In fact there is no on-demand loading: both
Locale::Maketext::Gettext::read_mo() and
Locale::Maketext::Lexicon::Gettext::parse() seem to read their entire
input at once.
Locale::Maketext::Lexicon::Tie could be used as an on-demand loader, if
someone writes a backend.
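
The general shape of such an on-demand loader, sketched in Python for illustration (this is not the Locale::Maketext::Lexicon::Tie interface; `LazyLexicon` and the `fetch` callback are hypothetical): the lexicon behaves like a hash, but an entry is fetched from the backend only the first time it is looked up.

```python
class LazyLexicon(dict):
    """Dict that asks a backend for a key on first access, then caches it."""

    def __init__(self, fetch):
        super().__init__()
        # fetch(key) -> translation; stand-in for a disk or database lookup.
        self._fetch = fetch

    def __missing__(self, key):
        # Called by lexicon[key] only when key is not cached yet.
        # Note: .get() and "in" do not trigger the backend.
        value = self._fetch(key)
        self[key] = value
        return value
```

Memory then grows with the number of strings actually used rather than with the size of the whole lexicon, at the cost of one backend lookup per new string.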
Regards,
Vitaly.
What is the performance impact of using Maketext (memory usage, CPU load,
responsiveness)? I would hate to see Bugzilla become a monster in any of
these areas. Some users have reported that Bugzilla 3.2 is slower than
3.0. I don't know whether they have valid data to confirm this or
just wanted to say something, but I would hate to discover that they are
right, even if 3.2 has more features than 3.0.
LpSolit