
Re: L20n and platform - early questions


Zbigniew Braniecki

Mar 10, 2010, 12:41:22 PM
to Benjamin Smedberg
Benjamin Smedberg wrote:

> In what scope does this file run? Does it have access to full chrome
> privileges or just standard JS objects? I'm concerned that giving these
> files full chrome access may cause unexpected re-entry when getting
> properties.

It's going to run in an isolated scope.
We consider the fact that l20n will be compiled to JS one of the most
important security concerns, so we'll definitely work with the security
team to make sure it's safe and isolated.


> Do you have a description of the new APIs? How will JS and binary
> callers interact with l20n? Will there be a migration period where the
> existing nsIStringBundle APIs work with the new l20n format? Will
> nsIStringBundle go away (please)?

nsIStringBundle will go away; we're just not sure when. And I don't
think we have a migration plan yet, but we'll have to build one.

The API question is still pretty open. We're experimenting, but the raw
version may look something like this:
http://hg.mozilla.org/users/zbraniecki_mozilla.com/l20n/file/fdd0a9e35b61/testension/content/testension.xul#l13

For XUL we want one entity to provide all attributes. For example, l20n
may look like this:

<butId "value"
  accesskey: "k"
  tooltip: "Tooltip">

will get compiled to a JS file, and then in XUL you will just write:

<button l10n_id="butId">

My goal is to be able to provide much more information as well, like a
list of missing entities, the number of entities used by a given XUL
file, etc.
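Nothing about the compiled form is final yet; purely as an illustration
(the entity shape, function names, and the expansion hook below are all
hypothetical, not the real .j20n output), the example above could
compile to something like:

```javascript
// Hypothetical compiled form of the l20n entity above -- a plain JS
// object keyed by entity id. The real .j20n format is not finalized.
var entities = {
  butId: {
    value: "value",
    attrs: { accesskey: "k", tooltip: "Tooltip" }
  }
};

// Sketch of the XUL-side expansion: <button l10n_id="butId"> gets its
// label from the entity value, plus one attribute per entity attribute.
function expand(l10nId, setAttribute) {
  var entity = entities[l10nId];
  setAttribute("label", entity.value);
  for (var name in entity.attrs) {
    setAttribute(name, entity.attrs[name]);
  }
}
```

The same table would also make the stats above cheap: missing entities
are just ids not present in `entities`.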

But we're early enough to accept any API suggestions you may have. And
I'll make sure to blog about it so this discussion happens in the open,
but we'd like to have a final format structure first, to minimize
confusion.


> There's not a binary choice here, and it's hard to say without seeing
> the API. Marshaling arguments across XPConnect is pretty expensive, but
> you could implement the code in JavaScript and have a custom C++ bridge
> without a whole lot of effort.

The patch I provided is what I use to launch the examples from the
./testension/ extension linked above. That's my proof of concept for now.
If you have particular questions whose answers would help you answer
mine, I'll do my best to answer them :)

Greetings,
gandalf
--

Mozilla (http://www.mozilla.org)


Robert Kaiser

Mar 10, 2010, 12:56:22 PM
Zbigniew Braniecki wrote:
> We (l10n-drivers) are moving forward with our goal to introduce new
> localization platform for Gecko codenamed L20n.

\o/ \o/ \o/
gandalf++
Pike++

> Currently I'm binding l20n into three places:
> - intl/l20n - XPCOM JS component
> - content/xul/document/src/nsXULContentSink.cpp - binding l10n_id XUL
> attribute to l20n entitity
> - toolkit/content/widgets/stringbundle.xml - recognizing l20n file from
> .properties and providing the right XPCOM (StringBundle vs. L20n)

I hope this will also do away with the different function names we
currently have for XUL-loaded and JS-loaded stringbundles, and have them
all use the L20n syntax to retrieve strings in the future?

Robert Kaiser

Axel Hecht

Mar 10, 2010, 1:07:53 PM

Yes.

Axel


Robert Kaiser

Mar 10, 2010, 4:06:20 PM
Neil Deakin wrote:
> It seems reasonable for scripted access, but most localized values are
> simple and static. Dropping into js while a xul document is being parsed
> would only be desirable in those cases where it was actually needed.

The problem is that the entity-based L10n scheme sucks majorly and
never was really good - though it might be fast.
We need something that is flexible enough to allow plural forms and
similar stuff even in XUL, and we need to find a concept for that. Also,
the current loose binding of labels, accesskeys, tooltip texts, and
possibly commandkeys to each other has been a major pain for localizers
at times; we should improve it. And this work has a chance of actually
achieving that.

>> Yeah, we'd like to do sth like this:
>>
>> <button l10n_id="butId">
>
> I'm assuming that this is just a placeholder, but this seems like it
> should just use 'id', rather than inventing something else.

'id' already has other uses; we have elements that don't need to have a
XUL id but do need localization.

> Otherwise, this makes more sense to me now. Some issues:
>
> - it makes the xul code harder to read. It isn't clear in the example
> above which attributes are or will be set, nor which attributes the
> localizer is expected to set.
> - it allows a localizer to specify any attribute. I wouldn't want a
> localizer to be able to modify the 'oncommand' attribute, for example.
> - it doesn't allow localization of values that aren't specified in
> attributes.

Let's formulate these as requirements for the L20n implementation, or
questions to the implementors, instead of negative points. Something
like this:
- Can we make it clearer which attributes are, can be, or should be set
by the L20n code?
- We need to restrict which attributes the L20n code can set on XUL
elements.
- We need to find a possibility to set text nodes as well as attributes.

How does that sound?

Robert Kaiser

Robert Strong

Mar 10, 2010, 4:22:34 PM
to dev-pl...@lists.mozilla.org
That sounds fine, but performance should be a priority as important as,
if not more so than, the list above, as it is with everything else we do.

Robert


Axel Hecht

Mar 10, 2010, 5:30:16 PM

I have a bunch of post-it notes on my office door for the whitepaper,
and the sticker with "perf" is at the very top of it.

In other words, yes, perf is essential.

I should note that the l20n substitution happens before fastload, i.e.,
if a XUL doc is in the fastload cache, none of this will be hit. So
we're probably talking about first-run performance. But yes, we
shouldn't make that suck either.

Axel

Zbigniew Braniecki

Mar 10, 2010, 5:33:42 PM
to Robert Kaiser

Robert Kaiser wrote:
> Neil Deakin wrote:
>> It seems reasonable for scripted access, but most localized values are
>> simple and static. Dropping into js while a xul document is being parsed
>> would only be desirable in those cases where it was actually needed.
>
> The problem is that the entity-based L10n scheme sucks majorly and
> never was really good - though it might be fast.
> We need something that is flexible enough to allow plural forms and
> similar stuff even in XUL, and we need to find a concept for that. Also,
> the current loose binding of labels, accesskeys, tooltip texts, and
> possibly commandkeys to each other has been a major pain for localizers
> at times; we should improve it. And this work has a chance of actually
> achieving that.

This is another open topic for me, but it's not the highest priority
just yet. Let me explain why.

L20n will definitely allow setting a list of arguments that influence
the returned value.
In a naive world, this translates to:

<element l10n_id="brandShortName" l10n_arg="5"/> being expanded to
<element label="5 Firefoxes"/> or <element label="1 Firefox"/>
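As a sketch of how such an argument could drive the value (the entity
body, the entity name, and the plural rule below are all invented for
illustration, not real l20n syntax), a compiled entity could simply be a
function of its arguments:

```javascript
// en-US plural rule; real l20n would let each locale define its own.
function plural(n, one, many) {
  return n === 1 ? one : many;
}

// Hypothetical compiled entity: the returned string depends on the
// argument handed in via l10n_arg.
var entities = {
  firefoxCount: function (args) {
    return args.n + " " + plural(args.n, "Firefox", "Firefoxes");
  }
};
```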

Now the issue is that it would then sound very natural to be able to
modify l10n_arg and have the label updated live.

I'd love to see that, and it would fit my dream of being able to update
entities live (restart-free language switching, here I come), but it's
beyond the scope of the L20n project alone.

I will experiment with that, but we shouldn't hold our breath just yet,
however sexy that sounds :)

Zbigniew Braniecki

Mar 10, 2010, 5:37:40 PM
to John J Barton

John J Barton wrote:

> One suggestion: cause the localization to do translation. By that I mean
> the XUL contains (possibly.stylized.)English strings so it's easy to
> read and falls back to the common case. Then the localization updates by
> looking up the strings. We've done this in a crude way in Firebug and it
> helps because the XUL is readable, translators have more clues, and the
> error path is more forgiving.

Oh. We may do much more than that.

We could have a l10n fallback story in XUL. :)

And for error recovery I'm thinking about giving Firebug-like tools a
whole stack of stats: the number of entities used in the file, the
number of missing ones, a node+entity pair list.

That is a really nice opportunity to fix some long-standing PITAs.

Neil Deakin

Mar 10, 2010, 6:08:58 PM
Zbigniew Braniecki wrote:
>> It seems reasonable for scripted access, but most localized values are
>> simple and static. Dropping into js while a xul document is being parsed
>> would only be desirable in those cases where it was actually needed.
>
> The thing is that the strings are in JS file anyway. That's what we're
> going to ship.

I don't follow. None of the strings are currently in a JS file.


> Localizers will localize so called "l20n" files which will be our custom
> format. Then, we will compile those to "j20n" files (js file) which will
> be shipped.
>

So localizers aren't expected to edit the used localized string/data
scripts directly?


> We could load all simple entities from a context into memory at once
> instead of querying for each as we process through the document.
>
> Do you have other ideas?


>> I'm assuming that this is just a placeholder, but this seems like it
>> should just use 'id', rather than inventing something else.
>

> I'm not sure 'id' is the right thing. There may be cases when two
> elements have the same l10n_id, or when an id-less element has l10n_id.

Could you give an example of the former?

For the latter, if an element has no id, why would one not just be able
to give it one, rather than adding a different identifying attribute?

Zbigniew Braniecki

Mar 10, 2010, 6:32:22 PM

Neil Deakin wrote:

> I don't follow. None of the strings are currently in a JS file.

I mean, we will store entities in .js files.

> So localizers aren't expected to edit the used localized string/data
> scripts directly?

That's correct.

Localizers will operate on a custom format that we haven't yet
finalized, one focused on readability and easy error spotting.

This will be compiled (in Gecko's case) to .j20n (JavaScript) files that
will be shipped; we then read them while parsing the XUL file and expand
the entities into attributes.

> Could you give an example of the former?

Ten 'save' buttons in the last column of a grid.

> For the latter, if an element has no id, why would one not just be able
> to give it one, rather than adding a different identifying attribute?

That may be an option. I'd like to hear Axel's opinion, but I don't
think I see your concern with a custom attribute.

Neil Deakin

Mar 10, 2010, 7:26:15 PM
Zbigniew Braniecki wrote:
>
>
> Neil Deakin wrote:
>
>> I don't follow. None of the strings are currently in a JS file.
>
> I mean, we will store entities in .js files.
>

I still don't follow. Your argument is that using JS isn't an impact
because "the strings are in JS file anyway". Yet, we *don't* store
entities in a JS file. You are proposing that we do that.

I'm not opposed to that, but I do think that the majority of localized
data doesn't need that kind of extra work, much in the same way that
most of prefs.js doesn't.

Just to clarify, I do like the general idea, and also combining dtd and
property files into one file.


> Localizers will operate on a custom format that we haven't yet finalized, one focused on readability and easy error spotting.
>
> This will be compiled (in Gecko's case)

Who will compile it? The localizer, or Mozilla code? If the former, why
does it matter whether dtd/property files end up being the result?


> That may be an option. I'd like to hear Axel's opinion, but I don't think I see your concern with custom attribute.

In any case, I'd suggest 'localeid' as a name (or perhaps localeref) if
that's the direction that is needed.

Zbigniew Braniecki

Mar 10, 2010, 8:01:21 PM

Neil Deakin wrote:

> I still don't follow. Your argument is that using JS isn't an impact
> because "the strings are in JS file anyway". Yet, we *don't* store
> entities in a JS file. You are proposing that we do that.

I was talking about the L20n format.
In the L20n approach, strings are stored as JS.

> I'm not opposed to that, but I do think that the majority of localized
> data doesn't need that kind of extra work, much in the same way that
> most of prefs.js doesn't.

I understand what you're saying. But in order to separate those two
cases we'd have to split the files - a separate file for simple
entities, another for complex ones.

I'm not sure we really want that, and I'm not sure how much additional
cost we'd incur here.

Currently we have to parse a DTD file to grab a set of simple entities.
In the L20n scenario we parse a .js file and get both simple and complex
entities.

Do you expect that storing simple ones in some different format would be
a big perf win?

> Who will compile it? The localizer, or Mozilla code? If the former, why
> does it matter whether dtd/property files end up being the result?

dtd/property files give us plain key-value pairs, and they still have
to be parsed.
I expect that one .js file loaded with (for example) 80% simple strings
and 20% complex ones should be faster than loading two files and
juggling between them, but I'd like to hear your suggestion.

Do you suggest keeping the .dtd file for simple entities, or some other
format that's faster to parse than JS?

Justin Dolske

Mar 10, 2010, 10:10:56 PM
On 3/10/10 7:10 AM, Zbigniew Braniecki wrote:

> In the shortest possible words, L20n is supposed to replace .properties
> and .dtd files in Gecko

I assume the older formats will still be available to use?

There are at least two cases (video controls and the plugin-problem UI),
and maybe more (various about: pages?), where we use DTDs for strings
because things run with content privileges, not chrome. So these things
can't use JS to do L10n work, because they don't have access, and JS
might not even be enabled.

Justin

johnjbarton

Mar 10, 2010, 11:58:04 PM
On 3/10/2010 2:37 PM, Zbigniew Braniecki wrote:
>
>
> John J Barton wrote:
>
>> One suggestion: cause the localization to do translation. By that I mean
>> the XUL contains (possibly.stylized.)English strings so it's easy to
>> read and falls back to the common case. Then the localization updates by
>> looking up the strings. We've done this in a crude way in Firebug and it
>> helps because the XUL is readable, translators have more clues, and the
>> error path is more forgiving.
>
> Oh. We may do much more than that.
>
> We could have a l10n fallback story in XUL. :)

Sorry, I don't understand what that means. By 'fallback' I meant simply
that if the JS translator fails, the XUL still works because it has
English string literals embedded.

>
> And for error recovery I'm thinking about giving firebug like things a
> whole stack of stats: number of entities used in the file, number of
> missing ones, node+entity pair list.

I guess this would be 'Chromebug-like', that is, a platform-level rather
than web-level tool? But I don't want any entities, so we can't have any
missing ones. Just strings. Maybe a tool to show/extract them all easily.

>
> That is a really nice opportunity to fix some long standing PITAs.

Like entities for example ;-)
jjb


Zbigniew Braniecki

Mar 11, 2010, 4:48:58 AM

johnjbarton wrote:

>> We could have a l10n fallback story in XUL. :)
>
> Sorry I don't understand what that means. By 'fallback' I meant simply
> that if the JS translator fails, the XUL still works because it has
> English string literals embedded.

Yeah, and I responded by saying that we can make L20n fall back from any
locale to any other locale in case of a missing entity.

So, for example, the Mexican Spanish locale will fall back to the
Spanish locale.
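That fallback walk can be sketched in plain JS (the bundle data and the
`getValue` helper below are invented for illustration; they're not the
real L20n API):

```javascript
// Toy per-locale entity sets standing in for compiled .j20n files.
var bundles = {
  "es-MX": { save: "Guardar" },
  "es":    { save: "Guardar", cancel: "Cancelar" },
  "en-US": { save: "Save", cancel: "Cancel", help: "Help" }
};

// Return the value from the first locale in the chain that defines
// the entity, so es-MX falls back to es, then to en-US.
function getValue(id, chain) {
  for (var i = 0; i < chain.length; i++) {
    var bundle = bundles[chain[i]];
    if (bundle && id in bundle) {
      return bundle[id];
    }
  }
  throw new Error("missing entity: " + id);
}
```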

Zbigniew Braniecki

Mar 11, 2010, 4:52:22 AM

Justin Dolske wrote:
> On 3/10/10 7:10 AM, Zbigniew Braniecki wrote:
>
>> In the shortest possible words, L20n is supposed to replace .properties
>> and .dtd files in Gecko
>
> I assume the older formats will still be available to use?

I don't see a reason why not, but we'd like to have localizers localize
only one format.

> There are at least two cases (video controls and plugin-problem UI), and
> maybe more (various about: pages?) where we use DTDs for strings,
> because things run with content privileges and not chrome. So these
> things can't use JS to do L10N work, because they don't have access and
> JS might not even be enabled.

Does such a case mean that there's no access to JS XPCOM components
either? What if we switch our nsIL20n to C++?

Or is the problem that we store entities in a JS file, so we have to
read them, and even a C++ XPCOM component would not be able to do that
in such a case?

Thanks,
Zbigniew Braniecki
--

Mozilla (http://www.mozilla.org)


Axel Hecht

Mar 11, 2010, 5:06:40 AM
On 11.03.10 01:26, Neil Deakin wrote:
> Zbigniew Braniecki wrote:
>>
>>
>> Neil Deakin wrote:
>>
>>> I don't follow. None of the strings are currently in a JS file.
>>
>> I mean, we will store entities in .js files.
>>
>
> I still don't follow. Your argument is that using JS isn't an impact
> because "the strings are in JS file anyway". Yet, we *don't* store
> entities in a JS file. You are proposing that we do that.
>
> I'm not opposed to that, but I do think that the majority of localized
> data doesn't need that kind of extra work, much in the same way that
> most of prefs.js doesn't.
>
> Just to clarify, I do like the general idea, and also combining dtd and
> property files into one file.

One of the major challenges with l10n is that the en-US coder in general
doesn't know if a string is simple or not. That decision is made on the
l10n side of the l20n scheme.

Thus separating the output of the compiler could break things.

Another point: in the common case of a XUL element with an accesskey,
the localizable object isn't doing a whole lot of magic, but it's
already not a simple string. That is, one thing we intend to do along
the way is to drop the convention of using key names to tie accesskeys
to their element's strings, and actually put them into one localizable
thingie.

But yes, figuring out whether the format we're using is making us pay in
perf is something we have to look into. I'm hoping that the JS property
lookup code isn't worse than the DTD ENTITY lookup code in expat; both
of them need to hit a hashtable or something somewhere.

>> Localizers will operate on a custom format that we're not yet
>> finalized, that will be focused on readability and easy error spotting.
>>
>> This will be compiled (in Gecko case)
>
> Who will compile it? The localizer, or Mozilla code? If the former, why
> does it matter whether dtd/property files end up being the result?

For the code we ship, we intend to do the compilation at build time. The
idea is that we include similar code in Gecko itself so that it can do
runtime compilation as well, to keep the burden on extension developers
down. But having to do that on the fly is definitely going to hit the
pre-fastload startup time, hence the idea to compile as much as possible
at build time.

>> That may be an option. I'd like to hear Axel's opinion, but I don't
>> think I see your concern with custom attribute.
>
> In any case, I'd suggest 'localeid' as a name (or perhaps localeref) if
> that's the direction that is needed.

The thing we're referencing is the localization; a locale is a different
thing. Is that suggestion just to keep the numbers out of the ID?

Axel

Axel Hecht

Mar 11, 2010, 5:11:20 AM
On 11.03.10 04:10, Justin Dolske wrote:
> On 3/10/10 7:10 AM, Zbigniew Braniecki wrote:
>
>> In the shortest possible words, L20n is supposed to replace .properties
>> and .dtd files in Gecko
>
> I assume the older formats will still be available to use?

Technically, I don't see a way to break DTDs. Whether ripping out the
stringbundle impls is a good idea or not, I don't know. I don't think
it's something we need to worry about now; I'd expect a phase of
explicit deprecation before doing that.

> There are at least two cases (video controls and plugin-problem UI), and
> maybe more (various about: pages?) where we use DTDs for strings,
> because things run with content privileges and not chrome. So these
> things can't use JS to do L10N work, because they don't have access and
> JS might not even be enabled.
>

We know that we need to have privilege-free code for both XUL and XHTML
docs. We'll hook up the logic in the content sink such that both will
support this.

How this maps to XBL is a good question. How's XBL2, by the way? ;-)

The neterror pages are another example where we'll need to spend some
thought on how to make them really better.

Axel

Gervase Markham

Mar 11, 2010, 5:20:24 AM
On 11/03/10 00:26, Neil Deakin wrote:
> I still don't follow. Your argument is that using JS isn't an impact
> because "the strings are in JS file anyway". Yet, we *don't* store
> entities in a JS file. You are proposing that we do that.

There seems to be a lot of confusion in this thread, with the
explanation of how l20n works coming out in bits and pieces.

Is there a design document that Neil, bsmedberg, and crew could read to
get an overview of how you want things to work, and what the goals are?

Gerv

Axel Hecht

Mar 11, 2010, 6:02:54 AM

We're in the process of writing a whitepaper, but I'm not sure that's
the same thing you're asking for.

For the record, I don't think there's a "lot" of confusion, given the
size of the change. Most of the questions are probably detailed enough
that they couldn't be answered in docs without making the doc so large
that folks won't read it ;-). Having a public discussion might just be
the right thing for those.

Axel

Benjamin Smedberg

Mar 11, 2010, 9:25:52 AM
On 3/11/10 6:02 AM, Axel Hecht wrote:

> For the record, I don't think there's a "lot" of confusion, given the
> size of the change. The majority of the questions are probably even
> detailed enough to not be explained in docs without making that doc so
> large that folks won't read it ;-). Having a public discussion might
> just be the right thing for those.

The proposal seems to be "store localizations in .js files and use them...
somehow". I'm sure you have a much better story than that, but I really
don't understand it yet. I have very basic questions such as, "how will
binary code get localized strings" and "how will non-XUL JS code get
localized strings". Obviously if locales are going to provide better
translations that vary according to gender and number, they are going to
need more context than just "give me string with key X". But I don't
understand how we're giving them the additional context, and how much that's
going to cost.

--BDS

Zbigniew Braniecki

Mar 11, 2010, 9:38:47 AM

Benjamin Smedberg wrote:
> On 3/11/10 6:02 AM, Axel Hecht wrote:
>
>> For the record, I don't think there's a "lot" of confusion, given the
>> size of the change. The majority of the questions are probably even
>> detailed enough to not be explained in docs without making that doc so
>> large that folks won't read it ;-). Having a public discussion might
>> just be the right thing for those.
>
> The proposal seems to be "store localizations in .js files and use
> them... somehow".

Well, not "somehow". Just use them. Use them in JS via the API I gave
examples of earlier in the thread, and in XUL via the l10n_id attribute,
which will be expanded.

> I'm sure you have a much better story than that, but I
> really don't understand it yet.

Then we just have to give you the right answers, and this thread will
hopefully shape our blog posts, papers, and wiki articles so that they
answer the kinds of questions you're asking.

> I have very basic questions such as,
> "how will binary code get localized strings" and "how will non-XUL JS
> code get localized strings".

Via an API. For the moment it looks something like this:

l20n = Components.classes['@mozilla.org/intl/l20n;1']
           .createInstance(Components.interfaces.nsIL20n);
l20n.addReference('chrome://foo/locale/file.j20n');
l20n.getValue('entityId');

or l20n.getValue('entityId', {'attr1': 'val', 'attr2': 5});

> Obviously if locales are going to provide
> better translations that vary according to gender and number, they are
> going to need more context than just "give me string with key X".

Often that is valid, but not always.

For example, a gender-aware locale may add a gender to brandShortName
and then use it to adjust an entity that references brandShortName. The
developer doesn't have to provide any context.
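To make that concrete, here is a plain-JS sketch of the gender case (the
object shape, the `ready` entity, and the Spanish strings are all my own
illustration, not project syntax):

```javascript
// The localizer attaches a gender to brandShortName; the developer
// never passes it in as context.
var entities = {
  brandShortName: { value: "Firefox", gender: "masculine" },
  // A dependent entity branches on that gender for correct agreement
  // (Spanish-style: "listo" vs. "lista").
  ready: function (ctx) {
    var suffix = ctx.brandShortName.gender === "masculine"
      ? " está listo" : " está lista";
    return ctx.brandShortName.value + suffix;
  }
};
```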

More often than not, yes, the dev will need to provide context. It will
be a set of arguments in the case of C++/JS code. In the case of XUL,
we're not sure yet how it will work.

Greetings,

Benjamin Smedberg

Mar 11, 2010, 10:05:42 AM
On 3/11/10 9:38 AM, Zbigniew Braniecki wrote:

> l20n = Components.classes['@mozilla.org/intl/l20n;1']
>            .createInstance(Components.interfaces.nsIL20n);
> l20n.addReference('chrome://foo/locale/file.j20n')
> l20n.getValue('entityId')
>
> or l20n.getValue('entityId', {'attr1': 'val', 'attr2': 5})

As an API point, then, I strongly object to this using XPCOM (at least
as a visible API). The actual .getValue call looks OK. I think that the
C++ API should be idiomatic C++, perhaps something like the sketch at
the end of this message:

When you .addReference, do all the values from the various j20n files get
flattened together? Is there a risk of name collision there, and/or are we
going to encourage the names to be longer, with some sort of namespace
scheme, e.g.:

browser_OpenNewTag
localeswitcher_LocalesMenu

Will key names not normally have dots in them (because dots are property
separators in JS)?

Are arguments always treated as strings, or does the JS expect some of them
to be actual numbers?

#include <map>
#include <stdint.h>
#include <string>

namespace mozilla {

class LocalizedStrings
{
public:
  LocalizedStrings();
  void AddReference(const char* uri);
  std::string GetValue(const char* name);

  /**
   * An enumerated union storing either strings or numbers to be passed
   * as a GetValue argument.
   */
  class Value
  {
    union {
      const char* stringval_; // std::string can't live in a union
      int32_t intval_;
      double floatval_;
    } val_;
    enum {
      TYPE_STRING,
      TYPE_INT,
      TYPE_FLOAT,
    } type_;

    // add appropriate constructors/initializers here
  };

  std::string GetValue(const char* name,
                       const std::map<std::string, Value>& arguments);
};

}


--BDS


Robert Kaiser

Mar 11, 2010, 10:34:01 AM
Zbigniew Braniecki wrote:
> L20n will definitely allow setting a list of arguments that influence
> the returned value.
> In a naive world, this translates to:
>
> <element l10n_id="brandShortName" l10n_arg="5"/> being expanded to
> <element label="5 Firefoxes"/> or <element label="1 Firefox"/>
>
> Now the issue is that it would then sound very natural to be able to
> modify l10n_arg and have the label updated live.

Well, sure, that would be cool, and even just being able to do it
statically would be nice already. But even just being able to use
correctly gendered variants of "%(brandShortName) is applying
updates..." in a toolkit XUL context would already be cool, as you know
but many readers here might not. ;-)

It also would be cool to find a solution for pref-style "Delete items
that are more than <input> days old" constructs in a way that doesn't
force us to split sentences into multiple parts - but that might as well
be something to do later (having a clue about the syntax might be good,
though).

In general, I think we should have the syntax settled somewhat before we
push an implementation to wider use, even if not all parts of that
syntax are actually implemented from the beginning.

Robert Kaiser

Robert Kaiser

Mar 11, 2010, 10:36:10 AM
johnjbarton wrote:
> Sorry I don't understand what that means. By 'fallback' I meant simply
> that if the JS translator fails, the XUL still works because it has
> English string literals embedded.

Actually, falling back to a default locale is fine, but embedding any
kind of UI strings into XUL or code is one of the worst things we could
do (and it generally makes the markup or code less readable, especially
when it comes to longer strings).

Robert Kaiser

Axel Hecht

Mar 11, 2010, 10:47:20 AM
On 11.03.10 16:05, Benjamin Smedberg wrote:
> On 3/11/10 9:38 AM, Zbigniew Braniecki wrote:
>
>> l20n = Components.classes['@mozilla.org/intl/l20n;1']
>>            .createInstance(Components.interfaces.nsIL20n);
>> l20n.addReference('chrome://foo/locale/file.j20n')
>> l20n.getValue('entityId')
>>
>> or l20n.getValue('entityId', {'attr1': 'val', 'attr2': 5})
>
> As an API point then, I strongly object to this using XPCOM (at least as
> a visible API). The actual .getValue calls looks ok. I think that the
> C++ API should be idiomatic to C++, perhaps something like:

For most code consumers, I expect them to hit either a jsm or a jetpack
API feature in the end. For declarative markup, it'd be a set of
processing instructions setting up the context for the document in
question.

> When you .addReference, do all the values from the various j20n files
> get flattened together? Is there a risk of name collision there, and/or
> are we going to encourage the names to be longer, with some sort of
> namespace scheme, e.g.:
>
> browser_OpenNewTag
> localeswitcher_LocalesMenu

This is no different from what we do right now when including multiple
DTDs in a single XUL doc. So yes, entity names are required to be unique
across all contexts in which an l20n file (working title: .lol,
localizable object list) is used.
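That uniqueness requirement over a flattened context can be sketched
like this (the `flatten` helper is invented for illustration, not part
of any proposed API):

```javascript
// Merge several compiled files into one context, refusing duplicate
// entity ids, much as clashing DTD entities would be a bug today.
function flatten(files) {
  var context = {};
  files.forEach(function (file) {
    for (var id in file) {
      if (id in context) {
        throw new Error("duplicate entity id: " + id);
      }
      context[id] = file[id];
    }
  });
  return context;
}
```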

> Will key names not normally have dots in them (because dots are property
> separators in JS)?

I think so, yes. This is less a problem of "js" than that, within l20n,
you can reference object properties via '.'.

> Are arguments always treated as strings, or does the JS expect some of
> them to be actual numbers?

l20n is expected to support string and number literals, as well as
hashes and lists. That's easy to map into scripting languages like JS,
but challenging to do in C++. It might be that the C++ API won't be able
to express everything l20n can do.

> namespace mozilla {
>
> class LocalizedStrings
> {
> LocalizedStrings();
> AddReference(const char* uri);
> GetValue(const char* name);
>
>
> /**
> * An enumerated union storing either strings or numbers to be passed
> * as a GetValue argument.
> */
> class Value
> {
> union {
> string stringval_;
> int32_t intval_;
> double floatval_;
> } val_;
> enum {
> TYPE_STRING,
> TYPE_INT,
> TYPE_FLOAT,
> } type_;
>
> // add appropriate constructors/initializers here
> };
>
> GetValue(const char* name, const std::map<std::string, Value>& arguments);
> };
>
> }

Sounds interesting. Is there a reason why you're using char* instead of
our string classes?

Axel

johnjbarton

Mar 11, 2010, 10:50:01 AM

Here is an example of how it looks in practice:
<menuitem id="menu_clearActivationList"
          label="firebug.menu.Clear Activation List"
          command="cmd_clearActivationList"/>

I can't see how this is worse &than.the.alternative.

jjb

Zack Weinberg

Mar 11, 2010, 11:17:25 AM
to Benjamin Smedberg, dev-pl...@lists.mozilla.org
Benjamin Smedberg <benj...@smedbergs.us> wrote:
> On 3/11/10 9:38 AM, Zbigniew Braniecki wrote:
>
> As an API point then, I strongly object to this using XPCOM (at least
> as a visible API).

Seconded.

> When you .addReference, do all the values from the various j20n files
> get flattened together? Is there a risk of name collision there,
> and/or are we going to encourage the names to be longer, with some
> sort of namespace scheme, e.g.:
>
> browser_OpenNewTag
> localeswitcher_LocalesMenu

I wonder if an explicit namespacing scheme would be worthwhile.

LocalizedStrings ls;
ls.BindTextDomain("css", "chrome://foo/locale/css.j20n")
str = ls.Get("css", "PEImportBadURI", aURLSpec.get());

"bindtextdomain" borrowed from gettext; don't actually care what it's
called.

Also,

> GetValue(const char* name, const std::map<std::string, Value>&
> arguments);

this is still going to be unpleasantly verbose to call, because there's
no such thing as an anonymous std::map literal (well, maybe there is in
bleeding edge c++0x but we're not gonna be able to use that for a
while). I could live with this as the most-general form that you
have to use for complicated cases, but I hope there will be a bunch of
overloads for passing in one or two scalar parameters ("scalar" in the
Perl sense). I might even argue for a varargs-based interface a la
printf().
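
[Editor's note] For illustration, here is a minimal sketch of what such convenience overloads might look like. The class name, the Value type, and the Get() signatures are all hypothetical, not the actual l20n API:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch of the overloads suggested above; names and
// signatures are assumptions, not the real l20n interface.
struct Value {
    enum Type { STRING, INT } type;
    std::string s;
    long i;
    Value(const char* v) : type(STRING), s(v), i(0) {}
    Value(long v) : type(INT), i(v) {}
};

class LocalizedStrings {
public:
    // Most-general form: a full keyword-argument map.
    std::string Get(const std::string& name,
                    const std::map<std::string, Value>& args) {
        // A real implementation would substitute args into the translation;
        // here we just echo the key and argument count for illustration.
        return name + "(" + std::to_string(args.size()) + " args)";
    }
    // Convenience overload for the no-parameter case.
    std::string Get(const std::string& name) {
        return Get(name, {});
    }
    // Convenience overload for the common one-parameter case.
    std::string Get(const std::string& name,
                    const std::string& kw, const Value& v) {
        return Get(name, {{kw, v}});
    }
};
```

The general map-taking form stays available for complicated cases, while the common one-parameter call collapses to a single extra keyword/value pair.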

zw

Axel Hecht

Mar 11, 2010, 11:41:48 AM
to
On 11.03.10 17:17, Zack Weinberg wrote:
> Benjamin Smedberg<benj...@smedbergs.us> wrote:
>> On 3/11/10 9:38 AM, Zbigniew Braniecki wrote:
>>
>> As an API point then, I strongly object to this using XPCOM (at least
>> as a visible API).
>
> Seconded.
>
>> When you .addReference, do all the values from the various j20n files
>> get flattened together? Is there a risk of name collision there,
>> and/or are we going to encourage the names to be longer, with some
>> sort of namespace scheme, e.g.:
>>
>> browser_OpenNewTag
>> localeswitcher_LocalesMenu
>
> I wonder if an explicit namespacing scheme would be worthwhile.
>
> LocalizedStrings ls;
> ls.BindTextDomain("css", "chrome://foo/locale/css.j20n")
> str = ls.Get("css", "PEImportBadURI", aURLSpec.get());
>
> "bindtextdomain" borrowed from gettext; don't actually care what it's
> called.

Not sure which problem you're solving here. Sounds like you're adding a
third layer of grouping. There is already the chrome protocol offering
the same thing as text domain does, AFAICT.

And in general, you'll use one context per file, unless the two files
actually provide cross-referencing entries, just like we do today.

The only difference is that with l20n, you can add, for example,
brand.lol to your context and use the brandShortName directly in your
code.lol, without having to code that in two different .properties
accesses and doing the string replacement yourself.

> Also,
>
>> GetValue(const char* name, const std::map<std::string, Value>&
>> arguments);
>
> this is still going to be unpleasantly verbose to call, because there's
> no such thing as an anonymous std::map literal (well, maybe there is in
> bleeding edge c++0x but we're not gonna be able to use that for a
> while). I could live with this as the most-general form that you
> have to use for complicated cases, but I hope there will be a bunch of
> overloads for passing in one or two scalar parameters ("scalar" in the
> Perl sense). I might even argue for a varargs-based interface a la
> printf().
>

l20n is unlikely to support positional params, just kwargs-style
params. I suspect that a varargs implementation would lose a bunch of
compile-time knowledge and possibly even open the door to easy runtime
errors, for rather little gain.

I'd love to see some existing C++ code samples that actually need
complex params, so that we don't end up designing an API for academia
but actually make our lives easier.

Axel

Benjamin Smedberg

Mar 11, 2010, 11:47:10 AM
to
On 3/11/10 10:47 AM, Axel Hecht wrote:

> For most code consumers, I expect them to hit either a jsm or a jetpack
> api feature in the end. For declarative markup, it'd be a set processing
> instructions setting up the context for the document in question.

Yes, it's the end-API I'm worried about. The internal implementation can
then change over time.

> This is no different to what we do right now when including multiple
> DTDs into a single xul doc. So yes, the names of entities are required

Typically, however, extension localization currently happens per-overlay, so
you could have a browser entity and an identical extension entity which
never conflicted. It seems at first glance that this is no longer the case:
won't l20n references be handled as DOM manipulations?

>> Are arguments always treated as strings, or does the JS expect some of
>> them to be actual numbers?
>
> l20n is expected to support string and number literals, as well as
> hashes and lists. That's easy to map into scripting languages like JS,
> but challenging to do in C++. It may be that the C++ API won't be able
> to cover everything l20n can do.

What is the use case for supporting hashes and lists? It sounds like
basically you want arguments to be any structured data which can be
represented as JSON, and that sounds like a fair bit of complexity that we
should avoid unless we have no choice.


> Sounds interesting. Is there a reason why you're using char* instead of
> our string classes?

Just thinking about the standard call pattern, where you'd know the key and
just pass it as a "string literal" instead of having to create a wrapper
with NS_LITERAL_CSTRING. It doesn't much matter.

--BDS

Zack Weinberg

Mar 11, 2010, 12:29:08 PM
to Axel Hecht, dev-pl...@lists.mozilla.org
Axel Hecht <l1...@mozilla.com> wrote:
> On 11.03.10 17:17, Zack Weinberg wrote:
> >
> > I wonder if an explicit namespacing scheme would be worthwhile.
> >
> > LocalizedStrings ls;
> > ls.BindTextDomain("css", "chrome://foo/locale/css.j20n")
> > str = ls.Get("css", "PEImportBadURI", aURLSpec.get());
> >
> > "bindtextdomain" borrowed from gettext; don't actually care what
> > it's called.
>
> Not sure which problem you're solving here. Sounds like you're adding
> a third layer of grouping. There is already the chrome protocol
> offering the same thing as text domain does, AFAICT.

This is *instead of* embedding the namespace in the identifiers;
browser_OpenNewTag becomes ("browser", "OpenNewTag"). From the C++
perspective, having the two identifiers be separate strings is much
more convenient. It also avoids collisions.
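
[Editor's note] A minimal sketch of the (domain, identifier) idea, using an in-memory table in place of real .lol loading; every name here is hypothetical, not part of any proposed l20n API:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Sketch of the explicit-namespace idea: the (domain, identifier) pair is
// the lookup key, so "browser"/"OpenNewTab" can never collide with an
// extension's identically named "OpenNewTab".
class LocalizedStrings {
    std::map<std::pair<std::string, std::string>, std::string> table_;
public:
    void BindTextDomain(const std::string& domain, const std::string& uri) {
        // A real implementation would load and parse the file at `uri`;
        // here we just record the binding for illustration.
        table_[{domain, "__source__"}] = uri;
    }
    void Set(const std::string& domain, const std::string& id,
             const std::string& value) {
        table_[{domain, id}] = value;
    }
    std::string Get(const std::string& domain, const std::string& id) const {
        auto it = table_.find({domain, id});
        return it == table_.end() ? "" : it->second;
    }
};
```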

> The only difference is that with l20n, you can add, for example,
> brand.lol to your context and use the brandShortName directly in your
> code.lol, without having to code that in two different .properties
> accesses and doing the string replacement yourself.

I'm sorry, I have no idea what this means.

> > I could live with this as the most-general form
> > that you have to use for complicated cases, but I hope there will
> > be a bunch of overloads for passing in one or two scalar parameters
> > ("scalar" in the Perl sense). I might even argue for a
> > varargs-based interface a la printf().
>
> l20n is unlikely to support positional params, just kwargs-style
> params. I suspect that a varargs implementation would lose a bunch
> of compile-time knowledge and possibly even open the door to easy
> runtime errors, for rather little gain.
>
> I'd love to see some existing c++ code samples that actually need
> complex params so that we're not ending up designing an API for
> academia, but make our actual life easier.

The C++ code that I care about always substitutes a single parameter
into the translated string; that parameter is mostly a string,
sometimes a single character. This *should* be easily covered by
method overloads on Get().

I also need a better way of handling sentences with a common prefix:
currently css.properties has

PEUnexpEOF2=Unexpected end of file while searching for %1$S.

PESkipAtRuleEOF=end of unknown at-rule
PECharsetRuleEOF=charset string in @charset rule
PEGatherMediaEOF=end of media list in @import or @media rule
[many more]

PEUnexpEOF2 and one of the other *EOF strings are separately localized
and then the results are pasted together at runtime. I *could* just
delete PEUnexpEOF2 and copy the common prefix into all the other *EOF
strings but then they risk getting out of sync with each other.

zw

Axel Hecht

Mar 11, 2010, 12:42:46 PM
to
On 11.03.10 17:47, Benjamin Smedberg wrote:
> On 3/11/10 10:47 AM, Axel Hecht wrote:
>
>> For most code consumers, I expect them to hit either a jsm or a jetpack
>> api feature in the end. For declarative markup, it'd be a set processing
>> instructions setting up the context for the document in question.
>
> Yes, it's the end-API I'm worried about. The internal implementation can
> then change over time.
>
>> This is no different to what we do right now when including multiple
>> DTDs into a single xul doc. So yes, the names of entities are required
>
> Typically, however, extension localization currently happens
> per-overlay, so you could have a browser entity and an identical
> extension entity which never conflicted. It seems at first glance that
> this is no longer the case: won't l20n references be handled as DOM
> manipulations?

Yes, but during parsing. I don't know the exact details of our overlay
logic, but I expect the documents to be merged post-parsing (otherwise,
we'd have the same issue with entity refs, I guess).

Technically, the content sink holds one l20n context per document it parses.

I think we should be fine here.

>>> Are arguments always treated as strings, or does the JS expect some of
>>> them to be actual numbers?
>>
>> l20n is expected to support string and number literals, as well as
>> hashes and lists. That's easy to map into scripting languages like JS,
>> but challenging to do in C++. It may be that the C++ API won't be able
>> to cover everything l20n can do.
>
> What is the use case for supporting hashes and lists? It sounds like
> basically you want arguments to be any structured data which can be
> represented as JSON, and that sounds like a fair bit of complexity that
> we should avoid unless we have no choice.

Conceptually, params just contribute to the l20n context like source
files do, and within l20n, hashes and lists are known good data
types. It's what the system uses to actually do its work behind the
scenes. Thus, params could have the same values. Whether that's
practically relevant, and whether the C++ API has to expose that
feature, I don't know.

>> Sounds interesting. Is there a reason why you're using char* instead of
>> our string classes?
>
> Just thinking about the standard call pattern, where you'd know the key
> and just pass it as a "string literal" instead of having to create a
> wrapper with NS_LITERAL_CSTRING. It doesn't much matter.

'k

Axel

Robert Kaiser

Mar 11, 2010, 12:50:16 PM
to

Other than your proposal being nothing but a mess that not even I can
decipher?

Robert Kaiser

Axel Hecht

Mar 11, 2010, 12:51:41 PM
to
On 11.03.10 18:29, Zack Weinberg wrote:
> Axel Hecht<l1...@mozilla.com> wrote:
>> On 11.03.10 17:17, Zack Weinberg wrote:
>>>
>>> I wonder if an explicit namespacing scheme would be worthwhile.
>>>
>>> LocalizedStrings ls;
>>> ls.BindTextDomain("css", "chrome://foo/locale/css.j20n")
>>> str = ls.Get("css", "PEImportBadURI", aURLSpec.get());
>>>
>>> "bindtextdomain" borrowed from gettext; don't actually care what
>>> it's called.
>>
>> Not sure which problem you're solving here. Sounds like you're adding
>> a third layer of grouping. There is already the chrome protocol
>> offering the same thing as text domain does, AFAICT.
>
> This is *instead of* embedding the namespace in the identifiers;
> browser_OpenNewTag becomes ("browser", "OpenNewTag"). From the C++
> perspective, having the two identifiers be separate strings is much
> more convenient. It also avoids collisions.

Collisions within your own file? This is not gettext, which usually just
throws one big translation memory at the application. This is very much
like Gecko, where you scope entities by putting them into the same file.
And then you load one file into one context, unless you really need
something from a different file in one particular string.

The key point of l20n is that it's not the C++ coder who does that.
Whether this is a fair optimization or not is something that the
language expert should decide. In your case, it might make sense to
actually have the error messages in one object, say:

<eof_error
_prefix: "Unexpected end of file while searching for",
SkipAtRule: "%(eof_error._prefix)s end of unknown at-rule",
CharsetRule: "%(eof_error._prefix)s charset string in @charset rule",
...
>

and the C++ code would just

GET("eof_error.SkipAtRule")

Disclaimer: String subst and other grammar details TBD.

I'm not sure if there are better ways to write down shared patterns in a
bunch of strings, maybe there are. Gandalf?

The bottom line is, you shouldn't glue strings together in code, even
though you should be able to factor strings in the source.

Axel

johnjbarton

Mar 11, 2010, 1:14:06 PM
to

Hmm. Well currently we put this in a XUL file:
label="&copyCmd.label;"

The &...; construct has special meaning involving complex files ending
in ".dtd". Any mistake in these obscure files completely breaks Firefox.
If successful then the value of "&...;" will be replaced with a value in
another language.

If instead we put this in the XUL file:
label="copyCmd.label"
then we don't need the .dtd file so it's possible errors are not
catastrophic. The XUL file produces a usable UI readable by developers
and translators. The translation technology then has to operate from
"copyCmd.label" rather than "&copyCmd.label". Then we can simply and
naturally change to eg
label="firebug.Copy"
or whatever.

Anyway since it seems unlikely that my suggestion will be useful to you,
thanks for posting about the technology.

jjb

>
> Robert Kaiser

Benjamin Smedberg

Mar 11, 2010, 1:25:17 PM
to
On 3/11/10 12:42 PM, Axel Hecht wrote:

> Yes, but during parsing. I don't know the exact details of our overlay
> logic, but I expect the documents to be merged post-parsing (otherwise,
> we'd have the same issue with entity refs, I guess).

I'm surprised by this. Why did we decide to resolve strings during XML
parsing? I figured that was one of the problems with the current scheme
which we would solve with the rewrite. Admittedly dynamic locale switching
is not a high priority, but having locales be part of XML fastload data
means that we can't precompile/ship fastload data as part of the app.

> Conceptually, params just contribute to the l20n context like source
> files do, and within l20n, hashes and lists are known good data
> types. It's what the system uses to actually do its work behind the

Yes, I find this unfortunate. That means that anyone implementing l20n has
to implement something like JSON to copy the data, instead of passing data
around as simple maps.

--BDS

Axel Hecht

Mar 11, 2010, 1:45:58 PM
to
On 11.03.10 19:25, Benjamin Smedberg wrote:
> On 3/11/10 12:42 PM, Axel Hecht wrote:
>
>> Yes, but during parsing. I don't know the exact details of our overlay
>> logic, but I expect the documents to be merged post-parsing (otherwise,
>> we'd have the same issue with entity refs, I guess).
>
> I'm surprised by this. Why did we decide to resolve strings during XML
> parsing? I figured that was one of the problems with the current scheme
> which we would solve with the rewrite. Admittedly dynamic locale
> switching is not a high priority, but having locales be part of XML
> fastload data means that we can't precompile/ship fastload data as part
> of the app.

I don't think we stand a chance of not regressing perf noticeably
without doing the l10n work before putting documents into fastload.
Thus, IMHO, live locale switching stays what it was: an issue of us
being able to reliably clear the fastload cache. At least that's the
status of it in my brain.

We do keep the l10n source information within the document, though. That
enables better l10n tooling, and in principle would allow us to
re-process a document in a different locale. Not sure how to reproduce
other state-changes that code did to the DOM, though.

I'm open to alternative suggestions on how to do the conflict resolution
with overlays, too. Thinking briefly, I came up with putting the lol
references in attributes of elements and having them only be valid for
the subtree. That reminds me pretty much of all the bustages we have with
xmlns, though, so I'm not sure it's a good trade-off.

>> Conceptually, params just contribute to the l20n context like source
>> files do, and within l20n, hashes and lists are known good data
>> types. It's what the system uses to actually do its work behind the
>
> Yes, I find this unfortunate. That means that anyone implementing l20n
> has to implement something like JSON to copy the data, instead of
> passing data around as simple maps.

I don't think that having the same features being equally accessible in
all language bindings should be a design goal. I'm much more a friend of
"easy things should be easy, complex things should be possible". And
that "possible" piece might be hard to do in C++. Sorry for that ;-).

If the straight C++ API doesn't support full values, fine. We can add a
"JSON" variant, and make the library decode that on the fly, for
example. That's not easy, and not fast, but it'll make complex things at
least possible.

Axel

Zack Weinberg

Mar 11, 2010, 1:48:43 PM
to dev-pl...@lists.mozilla.org
Axel Hecht <l1...@mozilla.com> wrote:
>
> Collisions within your own file? This is not gettext, which usually
> just throws one big translation memory at the application. This is
> very much like gecko, where you scope entities by putting them into
> the same file. And then you load one file into one context, unless
> you really need something from a different file in one particular string.

If you're totally confident that each source file will only need to
refer to one set of translated messages, why are we talking about
namespacing at all?

But if not, explicit namespacing is much preferable to namespacing
embedded in the message labels, IMO.

[...]


> > I also need a better way of handling sentences with a common prefix:
> > currently css.properties has
> >
> > PEUnexpEOF2=Unexpected end of file while searching for %1$S.
> >
> > PESkipAtRuleEOF=end of unknown at-rule
> > PECharsetRuleEOF=charset string in @charset rule
> > PEGatherMediaEOF=end of media list in @import or @media rule
> > [many more]
> >
> > PEUnexpEOF2 and one of the other *EOF strings are separately
> > localized and then the results are pasted together at runtime. I
> > *could* just delete PEUnexpEOF2 and copy the common prefix into all
> > the other *EOF strings but then they risk getting out of sync with
> > each other.
> >
>
> The key of l20n is, it's not the C++ coder that does that. Whether
> this is a fair optimization or not is something that the language
> expert should decide.

Right, but here I am with a .properties file and C++ code that does do
that (it was that way when I got here, for the record); what's my
migration route?

zw

Zbigniew Braniecki

Mar 11, 2010, 2:29:02 PM
to

johnjbarton wrote:

> Hmm. Well currently we put this in a XUL file:
> label="&copyCmd.label;"
>
> The &...; construct has special meaning involving complex files ending
> in ".dtd". Any mistake in these obscure files completely breaks Firefox.
> If successful then the value of "&...;" will be replaced with a value in
> another language.
>
> If instead we put this in the XUL file:
> label="copyCmd.label"
> then we don't need the .dtd file so it's possible errors are not
> catastrophic. The XUL file produces a usable UI readable by developers
> and translators. The translation technology then has to operate from
> "copyCmd.label" rather than "&copyCmd.label". Then we can simply and
> naturally change to eg
> label="firebug.Copy"
> or whatever.

It seems to me that you're conflating localization API goals with
limitations of our current approach.
The fact that we end up with what you call a catastrophic error when
something goes wrong is not something inherent to the localization
approach that we should mitigate, but due to the nature of XML/DTD,
which was never meant to be used for localization at all.

Now, we're working on a new approach that takes XML/DTD out of the
localization picture and allows us to do almost _whatever_, and by that I
mean it allows us to do the _right_ things.

What you're doing is a nice dirty hack, but what I'm saying is that with
L20n you will not need that. We will allow for error recovery and locale
fallbacks.

Please, re-read the L20n examples I gave and the explanations given by
Axel and me without DTD in mind. DTD is wrong, and we know that. We're
not aiming at improving DTD; we're aiming at providing a unified
localization experience. If I extract the core concept you're putting
forward, it's something like:

<element label="lorem.Ipsum"/>, which will get replaced by
"lorem.Ipsum"'s value. That is much more limiting than the approach in
which we want to cover all localizable attributes of an element in one
entity and make it so that the UI does not break if the entity is missing.

I can promise you that what you're achieving with your workarounds will
be incorporated in L20n experience. That's for sure :)

I started this thread to get help on how to lay out the code for our
library inside Gecko directory structure, and what approach we should
take on a source code level.

The discussion you're having is about XUL bindings of L20n and how they
should work. I believe that it is a discussion worth having and I'll
make sure we have a space for it, but first I want to test if what we're
trying is doable, and if we hit any performance regression walls on the
way there. At the same time, Axel is trying to finalize our file format
proposal and get the grand vision written down.

Once we have this, we'll go public, we'll be blogging, tweeting, singing
and dancing to make sure everyone interested may comment on our proposal
and we can get the best out of it :)

Greetings,
gandalf
--

Mozilla (http://www.mozilla.org)

Zbigniew Braniecki

Mar 11, 2010, 2:34:59 PM
to

Zack Weinberg wrote:
> Right, but here I am with a .properties file and C++ code that does do
> that (it was that way when I got here, for the record); what's my
> migration route?

The migration route will be to first migrate .properties hashes to l20n
simple entities and then allow you to improve on that.

What works in .properties will work with .l20n. It's just that here you
can do so much more and, what's more important, localizers can tune the
localized message to be much more natural without you having to know.

Zbigniew Braniecki

Mar 11, 2010, 2:42:57 PM
to

Axel Hecht wrote:
> On 11.03.10 19:25, Benjamin Smedberg wrote:
>> On 3/11/10 12:42 PM, Axel Hecht wrote:
>>
>>> Yes, but during parsing. I don't know the exact details of our overlay
>>> logic, but I expect the documents to be merged post-parsing (otherwise,
>>> we'd have the same issue with entity refs, I guess).
>>
>> I'm surprised by this. Why did we decide to resolve strings during XML
>> parsing? I figured that was one of the problems with the current scheme
>> which we would solve with the rewrite. Admittedly dynamic locale
>> switching is not a high priority, but having locales be part of XML
>> fastload data means that we can't precompile/ship fastload data as part
>> of the app.

Just a side note: I'd be all for moving l10n to post-XML-parsing, but we
still haven't got reliable perf numbers that would tell us how far
behind, or ahead, we are.

I'll work on that now and will try to come up with numbers, but I
wanted to know whether what I'll be measuring (my POC) will give us
reliable results despite later changes. So, if you're saying that we
should switch the l20n module to C++ code that loads the .j20n file,
then I'm not confident that any perf measures I can do with the current
POC will be valuable.

We haven't decided on anything yet; we're testing approaches, and now
we're asking you guys for help in turning the POC into the right POC :)

johnjbarton

Mar 11, 2010, 2:58:19 PM
to

Perhaps my confusion comes from your use of the word 'entity'. If
you take a look at:
https://developer.mozilla.org/en/XUL_Tutorial/Localization
you will see why I assumed that you meant 'entity'.

>
> I can promise you that what you're achieving with your workarounds will
> be incorporated in L20n experience. That's for sure :)

Maybe one person's workaround is another person's solution to a problem.

>
> I started this thread to get help on how to lay out the code for our
> library inside Gecko directory structure, and what approach we should
> take on a source code level.

Ok, sorry if my suggestion was off topic.
jjb

Neil Deakin

Mar 11, 2010, 3:03:30 PM
to

Indeed, the confusion stems from a solution being posted to comment on
without any description of the problem.

I will instead await the forthcoming problem, goals and design description.

Benjamin Smedberg

Mar 11, 2010, 3:30:31 PM
to
On 3/11/10 1:45 PM, Axel Hecht wrote:

>> Yes, I find this unfortunate. That means that anyone implementing l20n
>> has to implement something like JSON to copy the data, instead of
>> passing data around as simple maps.
>
> I don't think that having the same features being equally accessible in
> all language bindings should be a design goal. I'm much more a friend of
> "easy things should be easy, complex things should be possible". And
> that "possible" piece might be hard to do in C++. Sorry for that ;-).

I'm not talking primarily about language bindings here. Even if this were
entirely in JS, you would have to copy the parameters passed in by value
before the l20js file saw them. In terms of implementation complexity,
copying an arbitrary large object structure (a la JSON) is much more complex
than supporting at most a single map of limited datatypes.
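
[Editor's note] To illustrate the point: with a flat map of scalar-only values, isolating the localization scope is just the container's copy constructor, whereas arbitrary JSON-like values would need a recursive clone. The types below are illustrative only, not the proposed API:

```cpp
#include <cassert>
#include <map>
#include <string>

// Illustration of the complexity argument: with a flat map of scalar
// values, handing the localization scope its own copy is just the map's
// copy constructor. Nested JSON-like values would instead need a
// recursive clone.
struct Scalar {
    bool isNumber;
    double num;
    std::string str;
};

using Args = std::map<std::string, Scalar>;

// Copy-by-value: the isolated scope gets its own map, so later mutation
// by the caller cannot leak into the localization context.
Args CopyForContext(const Args& caller) {
    return caller;  // already deep, because every value is a scalar
}
```

Because every value is a scalar with value semantics, the copy is already deep; no JSON-style serialization or recursive traversal is needed.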

--BDS

Robert Kaiser

Mar 11, 2010, 4:00:41 PM
to
Neil Deakin wrote:
> Indeed, the confusion stems from a solution being posted to comment on
> without any description of the problem.

I guess we in the L10n community know all the problems so well that we
tend to not name them any more when talk comes to this point. ;-)

Robert Kaiser

Robert Strong

Mar 11, 2010, 4:08:59 PM
to dev-pl...@lists.mozilla.org
Sure, but the specific problems this work is going to address should be
specified so the scope and expectations for this work are known. I doubt
it is going to fix everything, though I guess it *might* be possible.

Robert

>
> Robert Kaiser
> _______________________________________________
> dev-platform mailing list
> dev-pl...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform

Axel Hecht

Mar 11, 2010, 5:18:58 PM
to

Why do we need to copy? Is that a security thing?

Axel

Axel Hecht

Mar 11, 2010, 5:26:10 PM
to
On 11.03.10 22:08, Robert Strong wrote:
> On 3/11/2010 1:00 PM, Robert Kaiser wrote:
>> Neil Deakin wrote:
>>> Indeed, the confusion stems from a solution being posted to comment on
>>> without any description of the problem.
>>
>> I guess we in the L10n community know all the problems so well that we
>> tend to not name them any more when talk comes to this point. ;-)
> Sure but the specific problems this work is going to address should be
> specified so the scope and expectations for this work are known. I doubt it
> is going to fix everything though I guess it *might* be possible.

We have a set of problems that we indeed want to address. There are a
few problems that we would like to get in as we go.

As for anything else, only discussing them will actually clean up what
the folks behind l20n consider to be a feature, what's a bug, and what's
just life or out of scope. And whether we're wrong or right.

But yeah, it'd be nice to have the whitepaper before any discussion, but
life is, we don't have anyone that's good at writing and has a link to
the brains of those not good at writing. We held back on the discussion
for years, and had internal discussions for about as long, and that's
just not constructive.

I chatted with gandalf in the meantime on the phone, and the way he
phrased the question wasn't totally aligned with the answers that I would
expect. That doesn't make the discussion bad, and both gandalf and I are
in fact getting useful feedback here, both in terms of what we want to
do and in terms of which things we should try to address in the
whitepaper.

Axel

Robert Strong

Mar 11, 2010, 5:37:59 PM
to dev-pl...@lists.mozilla.org
On 3/11/2010 2:26 PM, Axel Hecht wrote:
> On 11.03.10 22:08, Robert Strong wrote:
>> On 3/11/2010 1:00 PM, Robert Kaiser wrote:
>>> Neil Deakin wrote:
>>>> Indeed, the confusion stems from a solution being posted to comment on
>>>> without any description of the problem.
>>>
>>> I guess we in the L10n community know all the problems so well that we
>>> tend to not name them any more when talk comes to this point. ;-)
>> Sure but the specific problems this work is going to address should be
>> specified so the scope and expectations for this work are known. I doubt it
>> is going to fix everything though I guess it *might* be possible.
>
> We have a set of problems that we indeed want to address. There are a
> few problems that we would like to get in as we go.
Sounds good, and it would be great if someone created a list of the
problems that are planned to be addressed by this work and another list
of problems that might be addressed (e.g. would like to get in as you
go). It would also be great if the problems that aren't going to be
addressed by this work were listed too. It doesn't have to be fancy by
any means.

Cheers,
Robert

Axel Hecht

Mar 11, 2010, 5:59:10 PM
to
On 11.03.10 19:48, Zack Weinberg wrote:
> Axel Hecht<l1...@mozilla.com> wrote:
>>
>> Collisions within your own file? This is not gettext, which usually
>> just throws one big translation memory at the application. This is
>> very much like gecko, where you scope entities by putting them into
>> the same file. And then you load one file into one context, unless
>> you really need something from a different file in one very string.
>
> If you're totally confident that each source file will only need to
> refer to one set of translated messages, why are we talking about
> namespacing at all?

Tbh, I don't know why we're talking about namespacing. That might be
because Benjamin had the question about conflicts really close to his
use of "namespace mozilla" for the C++ API, though those were different
issues.

> But if not, explicit namespacing is much preferable to namespacing
> embedded in the message labels, IMO.
>
> [...]
>>> I also need a better way of handling sentences with a common prefix:
>>> currently css.properties has
>>>
>>> PEUnexpEOF2=Unexpected end of file while searching for %1$S.
>>>
>>> PESkipAtRuleEOF=end of unknown at-rule
>>> PECharsetRuleEOF=charset string in @charset rule
>>> PEGatherMediaEOF=end of media list in @import or @media rule
>>> [many more]
>>>
>>> PEUnexpEOF2 and one of the other *EOF strings are separately
>>> localized and then the results are pasted together at runtime. I
>>> *could* just delete PEUnexpEOF2 and copy the common prefix into all
>>> the other *EOF strings but then they risk getting out of sync with
>>> each other.
>>>
>>
>> The key of l20n is, it's not the C++ coder that does that. Whether
>> this is a fair optimization or not is something that the language
>> expert should decide.
>
> Right, but here I am with a .properties file and C++ code that does do
> that (it was that way when I got here, for the record); what's my
> migration route?

We're hoping that we can use rewriting tools for the majority of the
migration work. In cases like yours, where we can actually use new
features, there are folks to help. We're hiring here, too;
optimistically speaking, there will be someone in the office to actually
talk to.

Axel

Benjamin Smedberg

Mar 12, 2010, 9:05:23 AM
to

Absolutely. Without either complicated security wrappers or JSON/copying,
the localization would have direct access to the chrome scope, which is
against your security goal (I hope, anyway). We should treat these
parameters at least the same way we treat web workers.

--BDS
