Questions about bindings for L20n

zbran...@mozilla.com

Jun 10, 2016, 4:51:30 AM

While working on the new localization API (see the Intent to Implement post from yesterday), we're developing bindings into the UI languages used by Firefox, and we have some decisions to make that could be better answered by this group.

The general API is declarative and DOM-based. Instead of forcing developers to programmatically create string bundles, request raw strings from them and manually interpolate variables, L20n uses a Mutation Observer which is notified about changes to data-l10n-* attributes. The complexity of the language negotiation, resource loading, error fallback and string interpolation is hidden in the mutation handler. Most of our questions in this email relate to what the best way to declare resources is.

1) HTML API

Our HTML API has to allow us to create a set of localization bundle objects, each with a unique name, that aggregate a set of localization sources. It also has to allow us to annotate elements with L10n ID/Args pairs and potentially with L10n Bundle reference id.

Currently, our proposal looks like this:

<html>
<head>
<link rel="localization" name="main" href="./locales/resource1.ftl"/>
<link rel="localization" name="main" href="./locales/resource2.ftl"/>

<link rel="localization" name="menu" href="./locales/resource3.ftl"/>
<link rel="localization" name="menu" href="./locales/resource4.ftl"/>
</head>
<body>
<h1 data-l10n-id="mainTitle" data-l10n-args="{\"user\": \"John\"}" data-l10n-bundle="main" />
</body>
</html>

Resource URIs are identifiers resolved by a localization registry which -- similar to the chrome registry -- knows which languages are available in the current build and optionally knows about other locations to check for resources (other Gecko packages, langpacks, remote services etc.). Localization bundles can query the registry multiple times to get alternative versions of a resource, a feature which makes it possible to provide a runtime fallback mechanism for missing or broken translations.
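
To illustrate the fallback idea described above, here is a minimal sketch in plain JavaScript. The `ToyRegistry` class, its `candidates` method, the `resolve` helper and the in-memory resource map are all hypothetical; they stand in for the real registry, whose interface is not specified in this post.

```javascript
// Hypothetical in-memory stand-in for the localization registry.
class ToyRegistry {
  constructor(resources) {
    // resources: { [locale]: { [uri]: { [id]: string } } }
    this.resources = resources;
  }

  // Yield every available version of a resource, in order of preference.
  *candidates(uri, requestedLocales) {
    for (const locale of requestedLocales) {
      const byUri = this.resources[locale];
      if (byUri && byUri[uri]) {
        yield { locale, messages: byUri[uri] };
      }
    }
  }
}

// Ask the registry for the next version whenever a translation is missing.
function resolve(registry, uri, id, requestedLocales) {
  for (const { locale, messages } of registry.candidates(uri, requestedLocales)) {
    if (typeof messages[id] === 'string') {
      return { locale, value: messages[id] };
    }
  }
  return { locale: null, value: id }; // last resort: show the identifier
}

const registry = new ToyRegistry({
  pl: { './locales/resource1.ftl': { mainTitle: 'Witaj' } },
  'en-US': {
    './locales/resource1.ftl': { mainTitle: 'Welcome', subtitle: 'Hello there' },
  },
});

const locales = ['pl', 'en-US'];
const hit = resolve(registry, './locales/resource1.ftl', 'mainTitle', locales);
const fallback = resolve(registry, './locales/resource1.ftl', 'subtitle', locales);
```

Here `hit` comes from the first requested locale, while `fallback` is served from the second one because the first lacks that message.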

We're considering allowing names to be omitted which would imply the "default" bundle to reduce the noise for scenarios where only a single l10n bundle is needed. There's also a document.l10n collection which stores all localization bundles by name, manages the Mutation Observer and listens to languagechange events.

The open questions are:

* Would it be better to instead use custom elements like <l10n-bundle> <l10n-source src="…"/> </l10n-bundle>?
* Are data-l10n-* for attributes OK?
* Is there a better way to store arguments than stringified JSON? We considered storing arguments as separate attributes (e.g. data-l10n-arg-user="John") but that would make it impossible for the Mutation Observer to know what to observe.
* Any other feedback on the design?

2) XUL API

For XUL, we would like to use custom elements for bundles which are bound by XBL. The binding looks for <source> elements and creates a localization bundle object which is also available via the document.l10n collection.

<window>
<localization name="browser">
<source src="./locales/resource1.ftl" />
<source src="./locales/resource2.ftl" />
</localization>
<label data-l10n-bundle="browser" data-l10n-id="foo"></label>
</window>

The open questions are:

* Can we use custom elements like <localization> in XUL?
* Is there a more canonical way to do this?
* Are there plans to replace XBL components with Web Components?
* Is it okay to use the "name" attribute in XUL for the <localization> object?
* Is it okay to use data-l10n-* attributes for localizable elements? Or perhaps l10n-* would be sufficient?

3) XBL API

For XBL, we plan to use the same XUL bindings but inside of the anonymous content. Again, this creates a localization bundle object which is available via the document.l10n collection.

<content>
<xul:localization name="tabbrowser">
<xul:source src="/browser/tabbrowser.ftl"/>
</xul:localization>
<xul:label data-l10n-bundle="tabbrowser" data-l10n-id="foo"></xul:label>
</content>

Open questions:

* We understand that this creates and destroys the element each time the parent is bound/unbound. Is there UI that does that on a timing-sensitive path extensively? That'd be good to measure.

* Mutations inside of the anonymous content are caught by the document.l10n's observer; are there plans to unify this with how mutations are handled in shadow DOM, where observers observing non-anonymous content aren't notified about mutations in the anonymous content?

4) Performance measuring

We need to evaluate the performance impact of the change. So far we've been able to measure the loading time of about:support with DTD/StringBundle vs L20n using the Performance Timing API and the results are promising (perf win!), but we don't know how representative it is for Firefox startup and memory.

Question: Which performance tests should we run to ensure that L20n is indeed not regressing performance of Firefox?

That's it for now. We appreciate your feedback and comments!
Your L10n Team

Gijs Kruitbosch

Jun 10, 2016, 5:49:16 AM

On 10/06/2016 09:51, zbran...@mozilla.com wrote:
> While working on the new localization API (See Intent to Implement
> post from yesterday), we're developing bindings into UI languages
> used by Firefox and we have some decisions to make that could be
> better answered by this group.
>
> The general API is declarative and DOM-based. Instead of forcing
> developers to programmatically create string bundles, request raw
> strings from them and manually interpolate variables, L20n uses a
> Mutation Observer which is notified about changes to data-l10n-*
> attributes. The complexity of the language negotiation, resource
> loading, error fallback and string interpolation is hidden in the
> mutation handler. Most of our questions in this email relate to what
> the best way to declare resources is.

Mutation observers or mutation events? How do you decide which elements
you observe? Observing the entire DOM tree seems like it'd likely be
terrible for performance once we start mutating the DOM. Have you done
any measurements on the performance of this approach when large amounts
of DOM are inserted (ie not about:support :-) )? How do you decide on
which documents you add these observers to?

MutationObservers are async, and dtd localization in XHTML is currently
synchronous on parsing. That seems like a large change that will cause a
lot of problems relating to reflow / flashes of unlocalized content
(keep in mind we use l10n data for style as well), tests that expect
synchronous changes as a result of actions, as well as issues where we
would want the localized changes in elements that aren't in the page DOM
(so constructed in JS, but not included in the DOM (yet)). You don't
mention a JS/C++ API, which we need for e.g. strings we pass to message
boxes or into the argument strings for about:neterror/about:certerror.
What are your plans in that department?

Less markup is better, so please don't wrap in more custom elements.

I don't have a strong opinion on custom elements over <link> ones,
though I'd note that there's existing architecture for link elements
being added/modified/removed that fire specific events to chrome code
that you may be able to leverage.

> * Are data-l10n-* for attributes OK?

Seems OK to me.

> * Is there a better way to store
> arguments than stringified JSON? We considered storing arguments as
> separate attributes (e.g. data-l10n-arg-user="John") but that would
> make it impossible to the Mutation Observer to know what to observe.
> * Any other feedback on the design?

The escaped-JSON-in-markup looks very painful. In fact, it looks wrong
as it is, the correct escaping in HTML would be something like:

data-l10n-args="{&quot;user&quot;: &quot;John&quot;}"
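
As a rough illustration of that escaping, the sketch below serializes an arguments object into an attribute string. The helper names `escapeAttribute` and `l10nArgsAttribute` are made up for this example and are not part of any proposed API.

```javascript
// Escape a string for use inside a double-quoted HTML attribute value.
function escapeAttribute(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Serialize localization arguments into a data-l10n-args attribute.
// (Hypothetical helper, for illustration only.)
function l10nArgsAttribute(args) {
  return `data-l10n-args="${escapeAttribute(JSON.stringify(args))}"`;
}

const attr = l10nArgsAttribute({ user: 'John' });
// attr is: data-l10n-args="{&quot;user&quot;:&quot;John&quot;}"
```

Nobody would want to write that by hand, which is the readability problem in a nutshell.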

It's not clear to me why we need a key/value object rather than a
sequence as we use now. Perhaps just a semicolon-separated string with
\; as an escape for literal ; ? That'd certainly be easier to read/write.
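
A possible parser for such a format could look like the sketch below. The function name `parseArgList` and the exact escaping rules (only \; is treated specially) are assumptions made for the sake of the example.

```javascript
// Split "a;b\;c" into ["a", "b;c"]: ";" separates, "\;" is a literal ";".
// (Hypothetical helper; any other backslash is kept as-is.)
function parseArgList(raw) {
  const args = [];
  let current = '';
  for (let i = 0; i < raw.length; i++) {
    const ch = raw[i];
    if (ch === '\\' && raw[i + 1] === ';') {
      current += ';';
      i++; // skip the escaped semicolon
    } else if (ch === ';') {
      args.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  args.push(current);
  return args;
}

// The attribute value here is: John;3 new messages\; 1 draft
const parsed = parseArgList('John;3 new messages\\; 1 draft');
// parsed is ["John", "3 new messages; 1 draft"]
```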

Otherwise, it also seems wrong to require the bundle name
(data-l10n-bundle) on every localized element. The observer should be
able to simply iterate through the stringbundles in declaration order
until it finds a matching symbol.
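
Roughly, the lookup could work like the sketch below. The bundle objects and the `findMessage` function are hypothetical and only model the idea; they do not reflect how L20n stores bundles.

```javascript
// Hypothetical bundles, listed in the order they were declared.
const bundles = [
  { name: 'main', messages: new Map([['mainTitle', 'Welcome']]) },
  {
    name: 'menu',
    messages: new Map([['fileMenu', 'File'], ['mainTitle', 'Shadowed']]),
  },
];

// Return the first bundle (in declaration order) that defines the id.
function findMessage(orderedBundles, id) {
  for (const bundle of orderedBundles) {
    if (bundle.messages.has(id)) {
      return { bundle: bundle.name, value: bundle.messages.get(id) };
    }
  }
  return null; // no bundle knows this id
}

const fromMenu = findMessage(bundles, 'fileMenu');
const fromMain = findMessage(bundles, 'mainTitle');
```

With this approach an id defined in two bundles resolves to the earlier declaration, much like the first matching entity wins today.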

> 2) XUL API
>
> For XUL, we would like to use custom elements for bundles which are
> bound by XBL. The binding looks for <source> elements and creates a
> localization bundle object which is also available via the
> document.l10n collection.
>
> <window>
> <localization name="browser">
> <source src="./locales/resource1.ftl" />
> <source src="./locales/resource2.ftl" />
> </localization>
> <label data-l10n-bundle="browser" data-l10n-id="foo"></label>
> </window>
>
> The open questions are:
>
> * Can we use custom elements like <localization> in XUL?

I think so. Again, I'd prefer not to have a wrapper element.

> * Is there a more canonical way to do this?

Besides "use a DTD file"? Not that I'm aware of. Note that XUL also
supports <stringbundle>, and I don't know if there isn't a reason to
reuse those tag names.

> * Are there plans to replace XBL components with Web Components?

Last I checked web components were not ready yet (ie bits were either
not implemented or not enabled by default all the way through to
release). I am not aware of concrete plans, but that is by no means
conclusive proof that such plans do not exist.

> * Is it okay to use the "name" attribute in XUL for the <localization> object?

I don't see any reason not to, but Neil Deakin would be able to give you
a more authoritative answer.

> * Is it okay to use data-l10n-* attributes for localizable elements? Or perhaps l10n-*
> would be sufficient?

data- has the nice property in HTML that there is automatic easy access
as element properties (on element.dataset) but this doesn't seem to
exist/happen for XUL. At that point, I would expect that not using the
prefix might be easier to read. On the other hand, inconsistency with
the HTML case might also be confusing. YMMV depending on whether you
use/need/like the "data-" prefix / dataset properties for HTML.

> 3) XBL API
>
> For XBL, we plan to use the same XUL bindings but inside of the
> anonymous content. Again, this creates a localization bundle object
> which is available via the document.l10n collection.
>
> <content>
> <xul:localization name="tabbrowser">
> <xul:source src="/browser/tabbrowser.ftl"/>
> </xul:localization>
> <xul:label data-l10n-bundle="tabbrowser" data-l10n-id="foo"></xul:label>
> </content>
>
> Open questions:
>
> * We understand that this creates and destroys the element each time
> the parent is bound/unbound. Is there UI that does that on a
> timing-sensitive path extensively? That'd be good to measure.

I'm not sure. I'd assume that you can/want/should just cache the
contents of bundles in a JSM or equivalent, though, so that the actual
element instantiation should be reasonably quick once the resource has
loaded? This would also help in the case where you have 500 identical
XUL elements that are all bound to include the same localization resource...

> 4) Performance measuring
>
> We need to evaluate the performance impact of the change. So far
> we've been able to measure the loading time of about:support with
> DTD/StringBundle vs L20n using the Performance Timing API and the
> results are promising (perf win!), but we don't know how
> representative it is for Firefox startup and memory.
>
> Question: Which performance tests should we run to ensure that L20n
> is indeed not regressing performance of Firefox?

tpaint, ts_paint, sessionrestore, sessionrestore_no_auto_restore,
tsvg-opacity, tart, cart

(and their e10s equivalents, of course) would be my prime suspects. IIRC
these come under "other,svgr" in try syntax, but trychooser can confirm
this for you (hover over suites tells you what tests they run).

See https://wiki.mozilla.org/Buildbot/Talos/Tests for descriptions,
which also explains why I'm listing tsvg-opacity. :-)

~ Gijs

Neil Deakin

Jun 10, 2016, 9:25:15 AM

to zbran...@mozilla.com

On 2016-06-10 4:51 AM, zbran...@mozilla.com wrote:
> While working on the new localization API (See Intent to Implement post from yesterday), we're developing bindings into UI languages used by Firefox and we have some decisions to make that could be better answered by this group.
>
> The general API is declarative and DOM-based. Instead of forcing developers to programmatically create string bundles, request raw strings from them and manually interpolate variables, L20n uses a Mutation Observer which is notified about changes to data-l10n-* attributes. The complexity of the language negotiation, resource loading, error fallback and string interpolation is hidden in the mutation handler. Most of our questions in this email relate to what the best way to declare resources is.
>

The one thing I would recommend you do (and I mentioned this many years
ago when this was first proposed) is to not use the jargon terms 'l10n'
or 'l20n' anywhere. They are hard to type and hard to read. Three of the
four characters in 'l10n' are indistinguishable from similar characters
in some fonts which can lead to hard to detect bugs.

You should instead spell out words such as 'locale' making it clearer to
a casual reader the meaning.

> <window>
> <localization name="browser">
> <source src="./locales/resource1.ftl" />
> <source src="./locales/resource2.ftl" />
> </localization>
> <label data-l10n-bundle="browser" data-l10n-id="foo"></label>
> </window>
>
> The open questions are:
>
> * Can we use custom elements like <localization> in XUL?

Yes. Is there a need for two different elements? The HTML case doesn't
have this. You could also just use the existing stringbundle element and
make it handle a different file syntax, perhaps.

> * Is it okay to use the "name" attribute in XUL for the <localization> object?

Yes

> * Is it okay to use data-l10n-* attributes for localizable elements? Or perhaps l10n-* would be sufficient?
>

You should just use bundle="browser". I'm not sure what the data
attribute is meant to represent here, but the attribute name should be
simpler as mentioned above (XUL attributes don't have hyphens in them
either)

It's hard to give concrete suggestions when only some examples are
given. It would help if you posted a more descriptive proposal of what
all these elements/attributes do for each language. Maybe this is
already available?

smalo...@mozilla.com

Jun 10, 2016, 12:26:26 PM

Thanks for a lot of great feedback, Gijs!

On Friday, June 10, 2016 at 11:49:16 UTC+2, Gijs Kruitbosch wrote:

> Mutation observers or mutation events? How do you decide which elements
> you observe? Observing the entire DOM tree seems like it'd likely be
> terrible for performance once we start mutating the DOM. Have you done
> any measurements on the performance of this approach when large amounts
> of DOM are inserted (ie not about:support :-) )? How do you decide on
> which documents you add these observers to?

We use Mutation Observers. Actually, just one per document, which observes document.documentElement for subtree inserts and changes to the data-l10n-* attributes. I don't currently have perf numbers for large DOM inserts but generally we've been very satisfied with this approach in Firefox OS. We saw perf wins even on low-memory handsets. I'm sure we'll measure a lot more once we're closer to having a working fork of Firefox.
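
For reference, a single observer configured that way could be set up roughly as follows. The `MutationObserver` constructor and the `observe()` options are the standard DOM API; the `elementsToTranslate` helper is a simplified stand-in written for this example, not our actual handler.

```javascript
// attributeFilter takes exact attribute names; it has no wildcard support.
const L10N_ATTRIBUTES = ['data-l10n-id', 'data-l10n-args', 'data-l10n-bundle'];

const observerOptions = {
  childList: true, // nodes inserted or removed
  subtree: true,   // anywhere below the observed root
  attributes: true,
  attributeFilter: L10N_ATTRIBUTES,
};

// Collect the elements a batch of mutation records asks us to (re)translate.
// Simplification: a full handler would also visit descendants of added nodes.
function elementsToTranslate(records) {
  const targets = new Set();
  for (const record of records) {
    if (record.type === 'attributes') {
      targets.add(record.target);
    } else if (record.type === 'childList') {
      for (const node of record.addedNodes) {
        // nodeType 1 is an element node
        if (node.nodeType === 1) targets.add(node);
      }
    }
  }
  return targets;
}

// Only runs where a DOM is available (for example in a browser).
if (typeof MutationObserver !== 'undefined' && typeof document !== 'undefined') {
  const observer = new MutationObserver(records => {
    for (const element of elementsToTranslate(records)) {
      // The actual translation step is not defined in this sketch.
    }
  });
  observer.observe(document.documentElement, observerOptions);
}
```

The exact-name requirement of attributeFilter is also why per-argument attributes such as data-l10n-arg-user are hard to observe.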

> MutationObservers are async, and dtd localization in XHTML is currently
> synchronous on parsing. That seems like a large change that will cause a
> lot of problems relating to reflow / flashes of unlocalized content
> (keep in mind we use l10n data for style as well), tests that expect
> synchronous changes as a result of actions, as well as issues where we
> would want the localized changes in elements that aren't in the page DOM
> (so constructed in JS, but not included in the DOM (yet)). You don't
> mention a JS/C++ API, which we need for e.g. strings we pass to message
> boxes or into the argument strings for about:neterror/about:certerror.
> What are your plans in that department?

These are all valid concerns. The API is async by design, which also allows us to do runtime fallback on missing or broken translations.

document.l10n.get('main').formatValue('hello', { user: '…' }).then(hello => …);

In this example 'main' is the name of the bundle and 'hello' is an identifier of a translation. We used this API in Firefox OS and it has proven to be versatile and easy to use. It turns out that there are very few use cases where it's necessary to use this API. Usually it's OK to pass string _identifiers_ around and only call elem.setAttribute('data-l10n-id', id) when it's time to show the string to the user.
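
To make the link between async and fallback more concrete, here is a toy model in plain JavaScript. The `ToyBundle` class, its loader functions and the {name} placeholder syntax are all invented for this example; only the shape of the formatValue call mirrors the line above.

```javascript
// Hypothetical bundle: sources are tried in order; each "loads" asynchronously.
class ToyBundle {
  constructor(loaders) {
    this.loaders = loaders; // array of functions returning Promise<object>
  }

  async formatValue(id, args = {}) {
    for (const load of this.loaders) {
      try {
        const messages = await load();
        if (typeof messages[id] === 'string') {
          // Replace {name} placeholders with the matching argument.
          return messages[id].replace(/\{(\w+)\}/g, (match, name) =>
            name in args ? String(args[name]) : match);
        }
      } catch (error) {
        // Broken source: fall through to the next one.
      }
    }
    return id; // nothing matched, so surface the identifier itself
  }
}

const bundle = new ToyBundle([
  () => Promise.reject(new Error('first resource failed to load')),
  () => Promise.resolve({ hello: 'Hello, {user}!' }),
]);

const greeting = bundle.formatValue('hello', { user: 'Ann' });
// greeting is a Promise; it resolves once a working source has been found.
```

Because the caller only ever sees a Promise, the bundle is free to fetch another version of the resource before answering.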

Same goes for tests: if we have good tests for the Mutation Observer, then in other tests we could just test if the data-l10n-id was properly set.

We could expose something like document.l10n.translateFragment to deal with node trees which are not yet attached to the DOM, although working on data-l10n-id and then just translating once (via the observer) when the nodes are inserted would be our preferred approach.

FOUCs can also be a problem and something that's constantly on our radar. The Firefox OS experience allows us to be optimistic, and we're sure to follow up with much more perf testing in the near future to make sure we don't regress.

> > 1) HTML API

> >
> > The open questions are:
> >
> > * Would it be better to instead use custom elements like
> > <l10n-bundle> <l10n-source src="…"/> </l10n-bundle>?
>
> Less markup is better, so please don't wrap in more custom elements.

The idea of <l10n-bundle> element came about when considering where to expose the JS API (like formatValue above). Similar to <stringbundle> in XUL, one idea was to expose it on the element itself: document.querySelector('l10n-bundle').formatValue(…). As you can see, we solved it with document.l10n.get() for now.

Having it on an element has one more interesting consequence: it's widely understood that you need to wait for DOMContentLoaded or document.readyState === 'interactive' to query the element and start using the API. With document.l10n.get() it might look like those bundles are available synchronously while in reality we still need to query the <link> elements to even create them.

> I don't have a strong opinion on custom elements over <link> ones,
> though I'd note that there's existing architecture for link elements
> being added/modified/removed that fire specific events to chrome code
> that you may be able to leverage.

That's great to know, we're definitely interested in this!

> It's not clear to me why we need a key/value object rather than a
> sequence as we use now. Perhaps just a semicolon-separated string with
> \; as an escape for literal ; ? That'd certainly be easier to read/write.

That's an interesting alternative. One reason we went for named args is that they make it much easier for localizers to understand what they're translating.

> Otherwise, it also seems wrong to require the bundle name
> (data-l10n-bundle) on every localized element. The observer should be
> able to simply iterate through the stringbundles in declaration order
> until it finds a matching symbol.

Yes, that's definitely the plan and it's even implemented right now. The default bundle is called 'main' and data-l10n-bundle="main" is implied if missing from the localizable node.

> > 2) XUL API

> I think so. Again, I'd prefer not to have a wrapper element.

Would you say introducing a new XUL element <link> is OK, or perhaps we should reuse <html:link> here as well?

> > 3) XBL API
> >
> > For XBL, we plan to use the same XUL bindings but inside of the
> > anonymous content. Again, this creates a localization bundle object
> > which is available via the document.l10n collection.
> >
> > <content>
> > <xul:localization name="tabbrowser">
> > <xul:source src="/browser/tabbrowser.ftl"/>
> > </xul:localization>
> > <xul:label data-l10n-bundle="tabbrowser" data-l10n-id="foo"></xul:label>
> > </content>

One more question here about XBL, related to the XUL API. If we go for <link> elements in XUL, I think it would make sense to keep things consistent in XBL as well. If I understand correctly, XBL doesn't support <script> tags which would allow us to add the required behavior to make those <link> elements work. With <localization> we can bind the desired behavior via more XBL. Are there other ways to include external scripts in XBL other than 1) more XBL, and 2) Cu.import() in <constructor>?

> > Open questions:
> >
> > * We understand that this creates and destroys the element each time
> > the parent is bound/unbound. Is there UI that does that on a
> > timing-sensitive path extensively? That'd be good to measure.
>
> I'm not sure. I'd assume that you can/want/should just cache the
> contents of bundles in a JSM or equivalent, though, so that the actual
> element instantiation should be reasonably quick once the resource has
> loaded? This would also help in the case where you have 500 identical
> XUL elements that are all bound to include the same localization resource...

Great point. Right now bundle names are expected to be unique and if one already exists in the document.l10n collection, the XBL binding won't create a new one. And we only remove them when no other bindings are in use.
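
In other words, the collection behaves like a reference-counted map keyed by name. A minimal sketch of that behaviour follows; the `BundleCollection` class and its `acquire`/`release` methods are hypothetical names chosen for the example, not the actual document.l10n interface.

```javascript
// Hypothetical name-keyed collection with reference counting.
class BundleCollection {
  constructor() {
    this.entries = new Map(); // name -> { bundle, refs }
  }

  // Called when a binding is attached: reuse the bundle if the name exists.
  acquire(name, createBundle) {
    let entry = this.entries.get(name);
    if (!entry) {
      entry = { bundle: createBundle(), refs: 0 };
      this.entries.set(name, entry);
    }
    entry.refs++;
    return entry.bundle;
  }

  // Called when a binding is detached: drop the bundle with its last user.
  release(name) {
    const entry = this.entries.get(name);
    if (!entry) return;
    entry.refs--;
    if (entry.refs === 0) this.entries.delete(name);
  }

  has(name) {
    return this.entries.has(name);
  }
}
```

With 500 identical bound elements, createBundle would run once and the bundle would stay alive until the last of them is unbound.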

> > 4) Performance measuring

> > Question: Which performance tests should we run to ensure that L20n
> > is indeed not regressing performance of Firefox?
>
> tpaint, ts_paint, sessionrestore, sessionrestore_no_auto_restore,
> tsvg-opacity, tart, cart
>
> (and their e10s equivalents, of course) would be my prime suspects. IIRC
> these come under "other,svgr" in try syntax, but trychooser can confirm
> this for you (hover over suites tells you what tests they run).
>
> See https://wiki.mozilla.org/Buildbot/Talos/Tests for descriptions,
> which also explains why I'm listing tsvg-opacity. :-)

Thanks, this is very helpful indeed!

-Staś

Gijs Kruitbosch

Jun 10, 2016, 1:37:04 PM

to smalo...@mozilla.com

On 10/06/2016 17:26, smalo...@mozilla.com wrote:
> These are all valid concerns. The API is async by design, which also allows us to do runtime fallback on missing or broken translations.
>
> document.l10n.get('main').formatValue('hello', { user: '…' }).then(hello => …);

This async-ness will not be acceptable in all circumstances. As a
somewhat random example: how would we localize the 'slow script' dialog,
for which we have to pause script and then show the dialog? Another
example: in docshell, some error page URLs are currently generated
synchronously in some circumstances (invalid host/uris, for instance).
Making such a location change asynchronous just because of localization
is going to break a *lot* of assumptions, not to mention require
rewriting a bunch of yucky docshell code that will then probably break
some more assumptions... It's much easier to just say "we'll make
everything async" when you have a greenfield project like b2g than to
retrospectively jam it into 20 years of history (ie Gecko).

Not all JS and C++ code that will want to localize things has access to
a document object, and for all consumers to have to create one just to
use localization features would be cumbersome (and, as I understand it,
would not work without also inserting all the stringbundle things you'd
need). Please can we make sure that we have a pure-JS/C++ API that is
usable without having to have a document? (Currently, you can create
nsIStringBundle instances via XPCOM, and PluralForm can be used as a jsm
but not from C++, which also already causes headaches.)

> In this example 'main' is the name of the bundle and 'hello' is an identifier of a translation. We used this API in Firefox OS and it has proven to be versatile and easy to use. It turns out that there are very few use cases where it's necessary to use this API. Usually it's OK to pass string _identifiers_ around and only call elem.setAttribute('data-l10n-id', id) when it's time to show the string to the user.
>
> Same goes for tests: if we have good tests for the Mutation Observer, then in other tests we could just test if the data-l10n-id was properly set.

I'm quite worried some of this won't be workable. For instance, XUL
panels make decisions about how big they need to be based on their
contents. We'll need to ensure that the content in such panels is
present and localized before attempting to show the panel. We can't just
add the attributes, show the panel, and hope for the best. If we insert
extra turns of the event loop in here because we're ending up waiting
for localization, that'll make it harder to deal with state changes (I
clicked this button twice, is the popup open or closed? etc. etc.)

> Having it on an element has one more interesting consequence: it's widely understood that you need to wait for DOMContentLoaded or document.readyState === 'interactive' to query the element and start using the API. With document.l10n.get() it might look like those bundles are available synchronously while in reality we still need to query the <link> elements to even create them.

This isn't really true. If I have a <script> at the bottom of a
document, I expect to be able to modify the preceding DOM without
waiting for DOMContentLoaded. Likewise, if I'm in an XBL binding, I can
do certain things with the bound element immediately without attaching
event listeners.

>> It's not clear to me why we need a key/value object rather than a
>> sequence as we use now. Perhaps just a semicolon-separated string with
>> \; as an escape for literal ; ? That'd certainly be easier to read/write.
>
> That's an interesting alternative. One reason we went for named args is that they make it much easier for localizers to understand what they're translating.

There's no reason not to have the names, but the order would have to be
defined, potentially just implicitly by the order in the string in the
canonical/default language (English in our case).

>> Otherwise, it also seems wrong to require the bundle name
>> (data-l10n-bundle) on every localized element. The observer should be
>> able to simply iterate through the stringbundles in declaration order
>> until it finds a matching symbol.
>
> Yes, that's definitely the plan and it's even implemented right now. The default bundle is called 'main' and data-l10n-bundle="main" is implied if missing from the localizable node.

This is still problematic in terms of markup though. It's not uncommon
to have 3 or more DTDs in a file, and I can just use an entity without
asking what bundle it's from. Having to specify it for any "non-main"
bundle would be problematic. Why can't we just fall back to using the
other available bundles?

> Would you say introducing a new XUL element <link> is OK, or perhaps we should reuse <html:link> here as well?

I defer to Neil on XUL.

>>> 3) XBL API
>>>
>>> For XBL, we plan to use the same XUL bindings but inside of the
>>> anonymous content. Again, this creates a localization bundle object
>>> which is available via the document.l10n collection.
>>>
>>> <content>
>>> <xul:localization name="tabbrowser">
>>> <xul:source src="/browser/tabbrowser.ftl"/>
>>> </xul:localization>
>>> <xul:label data-l10n-bundle="tabbrowser" data-l10n-id="foo"></xul:label>
>>> </content>
>
> One more question here about XBL, related to the XUL API. If we go for <link> elements in XUL, I think it would make sense to keep things consistent in XBL as well. If I understand correctly, XBL doesn't support <script> tags which would allow us to add the required behavior to make those <link> elements work. With <localization> we can bind the desired behavior via more XBL. Are there other ways to include external scripts in XBL other than 1) more XBL, and 2) Cu.import() in <constructor>?

You can load anything with the subscript loader, or invoke other XPCOM
interfaces, or use attributes that get interpreted as script. So the
answer depends a bit on what you mean by "include external script".

>>> Open questions:
>>>
>>> * We understand that this creates and destroys the element each time
>>> the parent is bound/unbound. Is there UI that does that on a
>>> timing-sensitive path extensively? That'd be good to measure.
>>
>> I'm not sure. I'd assume that you can/want/should just cache the
>> contents of bundles in a JSM or equivalent, though, so that the actual
>> element instantiation should be reasonably quick once the resource has
>> loaded? This would also help in the case where you have 500 identical
>> XUL elements that are all bound to include the same localization resource....
>
> Great point. Right now bundle names are expected to be unique and if one already exists in the document.l10n collection, the XBL binding won't create a new one. And we only remove them when no other bindings are in use.

This is very confusing. So you could have multiple bundle elements that
refer to the same bundle? You said earlier that "main" is the default
bundle, but if I have a XUL document with 50 different bindings loaded
(not exactly a lot / surprising in XUL) then how are they supposed to
have stringbundles - are they all supposed to have globally unique
identifiers/names ? It seems much more sensible to determine uniqueness
based on the resource URI, and irrespective of the name. I don't really
see the point of addressing bundles by the name apart from "it's more
convenient than URIs", and even that would go away if we actually don't
need to do that specifically and can just ask for "string foo".
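
To sketch what I mean by URI-based uniqueness (the `getResource` function and the fake loader below are hypothetical, written only to show the idea):

```javascript
// Hypothetical cache keyed by resource URI rather than by bundle name.
const resourcesByUri = new Map();

// Return the cached resource for a URI, loading it only on first use.
function getResource(uri, load) {
  if (!resourcesByUri.has(uri)) {
    resourcesByUri.set(uri, load(uri));
  }
  return resourcesByUri.get(uri);
}

// Stand-in loader that counts how often it is actually invoked.
let loadCount = 0;
const fakeLoad = uri => {
  loadCount++;
  return { uri, messages: { foo: 'Foo' } };
};

// Two bindings asking for the same URI share one resource.
const fromBindingA = getResource('/browser/tabbrowser.ftl', fakeLoad);
const fromBindingB = getResource('/browser/tabbrowser.ftl', fakeLoad);
```

Neither binding needs to agree on a name with the other; the URI alone decides whether the resource is shared.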

The other way of dealing with the name uniqueness problem would be
considering each XBL binding its own 'namespace' of sorts, so that
'main' in one binding isn't the same as 'main' in another, but that has
issues with your using these names/ids as a perf optimization point. :-\

~ Gijs

zbran...@mozilla.com

Jun 10, 2016, 5:49:07 PM

Hi Gijs,

On Friday, June 10, 2016 at 2:49:16 AM UTC-7, Gijs Kruitbosch wrote:

> Mutation observers or mutation events? How do you decide which elements
> you observe? Observing the entire DOM tree seems like it'd likely be
> terrible for performance once we start mutating the DOM. Have you done
> any measurements on the performance of this approach when large amounts
> of DOM are inserted (ie not about:support :-) )? How do you decide on
> which documents you add these observers to?

We're using Mutation Observers, and we haven't observed (no pun intended) any performance impact yet. We've been using them on the slowest devices that FxOS has been designed for, and they performed surprisingly well.

While working with Mutation Observers I tried to evaluate the potential to optimize them to increase the signal/noise ratio of callbacks, and talked to people like Olly and Anne about potential improvements that would work better for our use case [0].

The general response to my questions was -
a) Seems like Microsoft's NodeWatch proposal [1]
b) They asked us to show them an example of where the current API is slow for our use case and they'll help us develop a better one.

So far we have failed to find a case where MutationObserver would have a noticeable negative impact on performance.

Would you by any chance know any piece of Firefox which does large amounts of DOM insertions that we could test against?

> MutationObservers are async, and dtd localization in XHTML is currently
> synchronous on parsing. That seems like a large change that will cause a
> lot of problems relating to reflow / flashes of unlocalized content
> (keep in mind we use l10n data for style as well)

Correct. It's a major change.

As with the performance concerns, FOUCs are on our mind, and we've been working on this technology initially targeting very slow devices. We've been able to get a no-FOUC experience so far, but we know it's not deterministic.

We're in a position similar to many other developers who want to use JS to alter DOM before frame creation and layout happen. [2]

> , tests that expect synchronous changes as a result of actions

We'll have to fix the tests. Yes.

> , as well as issues where we would want the localized changes in elements that aren't in the page DOM (so constructed in JS, but not included in the DOM (yet)).

That's actually fairly well solved in our approach. By default, localization happens only when you inject your DOMFragment into the DOM, but you can also manually call "translateFragment", which will do this on a disconnected fragment.

> You don't mention a JS/C++ API, which we need for e.g. strings we pass to message boxes or into the argument strings for about:neterror/about:certerror.
> What are your plans in that department?

Twofold.

First of all, we are planning a pure JS API. In fact, we have Node as our target, which obviously doesn't use any DOM.

The API is not finalized, but it'll allow you to do the same thing you do in DOM from JS:

var bundle = new LocalizationBundle([
  'path/to/source1',
  'path/to/source2'
]);

bundle.formatValue('entityId').then(val => console.log(val));

On top of that we'll probably provide some synchronous way to get the value, if only for a compatibility mode, but we'll actively discourage using it, since code that uses it won't benefit from the features of the framework.

Secondly, we'll be encouraging people to move localization to the front-end of their code. Except for a few cases, there's no reason to localize a message deep in your code and carry a translated string around; instead, the entityId should be carried around and resolved only in the UI.
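A small sketch of what carrying the ID instead of the string could look like. Everything here is hypothetical: fakeBundle is a stand-in that only mimics a formatValue(id, args) method returning a promise, the {file} placeholder syntax is a toy and not the real resource format, and checkDownload/showMessage are made-up names.

```javascript
// Stand-in for a real bundle; only formatValue(id, args) -> Promise<string> is assumed.
const fakeBundle = {
  messages: { 'download-failed': 'Download of {file} failed' },
  formatValue(id, args = {}) {
    const pattern = this.messages[id];
    if (pattern === undefined) {
      return Promise.reject(new Error(`missing entity: ${id}`));
    }
    // Toy interpolation: replace {name} with args[name].
    const value = pattern.replace(/\{(\w+)\}/g, (_, name) => String(args[name]));
    return Promise.resolve(value);
  },
};

// Deep in the code: describe the message, do not translate it.
function checkDownload(file) {
  return { l10nId: 'download-failed', l10nArgs: { file } };
}

// In the UI layer: resolve the ID at the last moment and display the result.
function showMessage(bundle, message, show) {
  return bundle.formatValue(message.l10nId, message.l10nArgs).then(show);
}

showMessage(fakeBundle, checkDownload('update.mar'), text => console.log(text));
```

The lower layer stays locale-agnostic, and the UI can re-resolve the same ID later, for example after a language change.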

> Less markup is better, so please don't wrap in more custom elements.

So, you're saying that:

<link href="./source1"> // implicit bundle 'main'
<link href="./source2"> // implicit bundle 'main'
<link l10n-bundle="menu" href="./source3">
<link l10n-bundle="menu" href="./source4">

is preferred over:

<l10n-bundle>
  <source href="./source1">
  <source href="./source2">
</l10n-bundle>
<l10n-bundle name="menu">
  <source href="./source3">
  <source href="./source4">
</l10n-bundle>

?

> It's not clear to me why we need a key/value object rather than a
> sequence as we use now. Perhaps just a semicolon-separated string with
> \; as an escape for literal ; ? That'd certainly be easier to read/write.

A semicolon-separated string would be flat. Stringified JSON allows us to build deeper structures.

We provide a wrapper API to facilitate that:

document.l10n.setAttributes(element, 'l10nId', {
  user: {
    'name': "John",
    'gender': "male"
  }
});

will assign data-l10n-id and data-l10n-args to the element, while

const {
  l10nId,
  l10nArgs
} = document.l10n.getAttributes(element);

handles the reverse.
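For illustration, a wrapper like this could be as thin as a JSON round trip over the two attributes. The functions below are a hypothetical re-implementation, not the actual document.l10n code, and makeStubElement is a tiny stand-in so the sketch runs outside a browser.

```javascript
// Hypothetical re-implementation of the wrapper, for illustration only.
function setAttributes(element, id, args) {
  element.setAttribute('data-l10n-id', id);
  if (args) {
    // Nested structures survive because the args are stored as stringified JSON.
    element.setAttribute('data-l10n-args', JSON.stringify(args));
  } else {
    element.removeAttribute('data-l10n-args');
  }
}

function getAttributes(element) {
  const rawArgs = element.getAttribute('data-l10n-args');
  return {
    l10nId: element.getAttribute('data-l10n-id'),
    l10nArgs: rawArgs ? JSON.parse(rawArgs) : null,
  };
}

// Minimal stand-in for a DOM element (only the three methods used above).
function makeStubElement() {
  const attrs = new Map();
  return {
    setAttribute: (name, value) => attrs.set(name, String(value)),
    getAttribute: name => (attrs.has(name) ? attrs.get(name) : null),
    removeAttribute: name => attrs.delete(name),
  };
}

const stub = makeStubElement();
setAttributes(stub, 'mainTitle', { user: { name: 'John', gender: 'male' } });
console.log(stub.getAttribute('data-l10n-args'));
// {"user":{"name":"John","gender":"male"}}
console.log(getAttributes(stub).l10nArgs.user.name); // John
```

A flat semicolon-separated list could not represent the nested user object without inventing a second level of escaping.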

> Otherwise, it also seems wrong to require the bundle name
> (data-l10n-bundle) on every localized element. The observer should be
> able to simply iterate through the stringbundles in declaration order
> until it finds a matching symbol.

It will iterate over sources in a single l10n-bundle.
In most cases, you will only have one l10n-bundle per document, so no need to explicitly name it or refer to it.

If you want a separate bundle, you have to name it and refer to it from the elements that are supposed to use it (instead of the main one).
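The lookup described above can be sketched as follows. The in-memory model (a Map from bundle name to an ordered list of sources) and the "first source that has the entity wins" rule are assumptions made for the example; resolveEntity is a made-up name.

```javascript
// Hypothetical in-memory model: bundle name -> sources in declaration order.
const bundles = new Map([
  ['main', [
    { href: './source1', entities: { mainTitle: 'Hello' } },
    { href: './source2', entities: { mainTitle: 'Shadowed', footer: 'Bye' } },
  ]],
  ['menu', [
    { href: './source3', entities: { open: 'Open' } },
  ]],
]);

// Look an entity up in one bundle only, defaulting to the implicit 'main' bundle.
function resolveEntity(allBundles, id, bundleName = 'main') {
  const sources = allBundles.get(bundleName) || [];
  for (const source of sources) {
    if (Object.prototype.hasOwnProperty.call(source.entities, id)) {
      return source.entities[id];
    }
  }
  return undefined; // not found in this bundle; other bundles are not consulted
}

console.log(resolveEntity(bundles, 'mainTitle'));    // Hello
console.log(resolveEntity(bundles, 'open'));         // undefined
console.log(resolveEntity(bundles, 'open', 'menu')); // Open
```

An element without data-l10n-bundle would use the default, so documents with a single bundle never need to mention a name.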

> > * Is there a more canonical way to do this?
>
> Besides "use a DTD file"? Not that I'm aware of. Note that XUL also
> supports <stringbundle>, and I don't know if there isn't a reason to
> reuse those tag names.

Our API is so fundamentally different, and based on such different paradigms, that we would prefer not to attempt to reuse the same HTML API.

We'll also live for a while in a world where the old method and the new one co-exist, so choosing different names allows us to avoid conflicts.

> tpaint, ts_paint, sessionrestore, sessionrestore_no_auto_restore,
> tsvg-opacity, tart, cart
>
> (and their e10s equivalents, of course) would be my prime suspects. IIRC
> these come under "other,svgr" in try syntax, but trychooser can confirm
> this for you (hover over suites tells you what tests they run).
>
> See https://wiki.mozilla.org/Buildbot/Talos/Tests for descriptions,
> which also explains why I'm listing tsvg-opacity. :-)

Thanks! That's super helpful :)

zb.

[0] https://groups.google.com/d/topic/mozilla.dev.platform/z4_iYqIAG-A/discussion
[1] https://www.w3.org/2008/webapps/wiki/MutationReplacement#NodeWatch_.28A_Microsoft_Proposal.29
[2] https://groups.google.com/d/topic/mozilla.dev.platform/F3Mp6dZonMA/discussion

smaug

Jun 10, 2016, 5:56:15 PM6/10/16

to

On 06/10/2016 12:49 PM, Gijs Kruitbosch wrote:
> On 10/06/2016 09:51, zbran...@mozilla.com wrote:
>> While working on the new localization API (See Intent to Implement
>> post from yesterday), we're developing bindings into UI languages
>> used by Firefox and we have some decisions to make that could be
>> better answered by this group.
>>
>> The general API is declarative and DOM-based. Instead of forcing
>> developers to programmatically create string bundles, request raw
>> strings from them and manually interpolate variables, L20n uses a
>> Mutation Observer which is notified about changes to data-l10n-*
>> attributes. The complexity of the language negotiation, resource
>> loading, error fallback and string interpolation is hidden in the
>> mutation handler. Most of our questions in this email relate to what
>> the best way to declare resources is.
>
> Mutation observers or mutation events? How do you decide which elements you observe? Observing the entire DOM tree seems like it'd likely be terrible
> for performance once we start mutating the DOM. Have you done any measurements on the performance of this approach when large amounts of DOM are
> inserted (ie not about:support :-) )? How do you decide on which documents you add these observers to?
>
> MutationObservers are async

From the JS side, yes, but they are sync from the browser engine's point of view, since MO callbacks are called at a microtask checkpoint (end of outermost script execution or end of task).
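A minimal sketch of that ordering, using a promise reaction as a stand-in for a MutationObserver callback (both are delivered at the microtask checkpoint). The variable names are made up for the example.

```javascript
const order = [];

// A new task: runs only after the current task has finished.
setTimeout(() => order.push('task'), 0);

// A microtask: runs at the checkpoint right after the current script ends,
// before the engine moves on to the next task.
Promise.resolve().then(() => order.push('microtask'));

order.push('script');

// Queued after the first timer, so it reports once everything above has run.
setTimeout(() => console.log(order.join(' -> ')), 0);
// script -> microtask -> task
```

The script cannot see the microtask's effect before it finishes, but no other task gets to run in between.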

> , and dtd localization in XHTML is currently synchronous on parsing. That seems like a large change that will cause a lot
> of problems relating to reflow / flashes of unlocalized content

Why would it cause reflows or flashes of unlocalized content?

zbran...@mozilla.com

Jun 10, 2016, 6:01:03 PM6/10/16

to

On Friday, June 10, 2016 at 10:37:04 AM UTC-7, Gijs Kruitbosch wrote:

> This async-ness will not be acceptable in all circumstances. As a
> somewhat random example: how would we localize the 'slow script' dialog,
> for which we have to pause script and then show the dialog?

Agreed, there are exceptions, and we may have to provide a sync version (which will have limited functionality) for such cases.

For this particular case, the way we approached it in FxOS was to do something like replacing:

window.alert(l10n.get("slowScriptTitle"));

with:

l10n.formatValue("slowScriptTitle").then(val => window.alert(val));

Would that not work?

> Another example: in docshell, some error page URLs are currently generated
> synchronously in some circumstances (invalid host/uris, for instance).
> Making such a location change asynchronous just because of localization
> is going to break a *lot* of assumptions, not to mention require
> rewriting a bunch of yucky docshell code that will then probably break
> some more assumptions...

Yay! :)

If you're saying that we're generating URLs with localized messages in them, then I'd question the design...

But as I said, we may have to provide a compatibility layer with a sync variant for those scenarios, and discourage it for new code.

> It's much easier to just say "we'll make
> everything async" when you have a greenfield project like b2g than to
> retrospectively jam it into 20 years of history (ie Gecko).

It probably is, but you don't want to know how much time it took me to transition even the relatively young project from sync to async! ;)

> Not all JS and C++ code that will want to localize things has access to
> a document object, and for all consumers to have to create one just to
> use localization features would be cumbersome (and, as I understand it,
> would not work without also inserting all the stringbundle things you'd
> need). Please can we make sure that we have a pure-JS/C++ API that is
> usable without having to have a document? (Currently, you can create
> nsIStringBundle instances via XPCOM, and PluralForm can be used as a jsm
> but not from C++, which also already causes headaches.)

We'll definitely have pure JS code. We're going to land JSM code, and as I said, Intl stuff (like PluralRules) will be available straight from SpiderMonkey (Intl.PluralRules). Although in the L20n world, as an engineer, you won't ever need to use PluralRules manually :)

For C++, we may wrap the JS API and expose it in C++, but we may also try to move l10n up a layer, so that C++ code carries l10nIds and JS UI code localizes them.

> I'm quite worried some of this won't be workable. For instance, XUL
> panels make decisions about how big they need to be based on their
> contents. We'll need to ensure that the content in such panels is
> present and localized before attempting to show the panel. We can't just
> add the attributes, show the panel, and hope for the best. If we insert
> extra turns of the event loop in here because we're ending up waiting
> for localization, that'll make it harder to deal with state changes (I
> clicked this button twice, is the popup open or closed? etc. etc.)

That's a great point. As I said in my previous email I'd love a way to prevent frame creation until JS init code is done.

We may also decide to move the MutationObserver part in Gecko to ContentSink, or design an API that we'll plug into our DOM that will work better for us than Mutation Observer.

So far MO works well and gives us the results we need.

> This is still problematic in terms of markup though. It's not uncommon
> to have 3 or more DTDs in a file, and I can just use an entity without
> asking what bundle it's from. Having to specify it for any "non-main"
> bundle would be problematic. Why can't we just fall back to using the
> other available bundles?

By default you will have all your "DTD"s in the "main" bundle, and we'll loop over them to localize your elements. So that works the way you expect it.

On top of that, you'll be able to also specify more bundles with more source files. That's where named ones come in.

Thanks,
zb.

Gijs Kruitbosch

Jun 13, 2016, 4:39:32 AM6/13/16

to

On 10/06/2016 22:56, smaug wrote:
> , and dtd localization in XHTML is currently synchronous on parsing.
> That seems like a large change that will cause a lot
>> of problems relating to reflow / flashes of unlocalized content
>
> Why would it cause reflows or flashes of unlocalized content?

AIUI, if I insert content that needs localizing with these attributes
into the DOM tree of a document, and then read any salient property of
that document/tree that triggers a reflow (scrollHeight, anything off
the bounding client rect, etc.), that will cause a sync reflow before
the localized content is there, and it will likely produce different
values from what we want when localization is in place.

Separately, the documentation put forth so far seems to indicate that
the localization itself is also async, on top of the asyncness of the
mutationobserver approach, and that could potentially result in flashes
of unlocalized content, depending on "how" asynchronous that API really
ends up being. (AFAIK, if the API returned an already-resolved promise,
there might be less chance of that than if it actually went off and did
IO off-main-thread, then came back with some results.)

~ Gijs

Gijs Kruitbosch

Jun 13, 2016, 4:52:31 AM6/13/16

to zbran...@mozilla.com

On 10/06/2016 23:01, zbran...@mozilla.com wrote:
> On Friday, June 10, 2016 at 10:37:04 AM UTC-7, Gijs Kruitbosch wrote:
>
>> This async-ness will not be acceptable in all circumstances. As a
>> somewhat random example: how would we localize the 'slow script' dialog,
>> for which we have to pause script and then show the dialog?
>
> Agree, there are exceptions, and we may have to provide sync version (which will have limited functionality) for such cases.
>
> For this particular case, the way we approached it in FxOS was to do something like replacing:
>
> window.alert(l10n.get("slowScriptTitle"));
>
> with:
>
> l10n.formatValue("slowScriptTitle").then(val => window.alert(val));
>
>
> Would that not work?

I don't know, you'd have to ask people who are more intimately familiar
with how our slow script interrupt code works. It's in C++, so I'm not
sure how your example maps. Intuitively, I am worried it might not work
- we're trying to stop script from running, and so waiting on a promise
and then running script on the same (UI) thread feels interesting. Maybe
it's solvable with *shudder* another nested event loop to wait for that
promise, or something.

>> Another example: in docshell, some error page URLs are currently generated
>> synchronously in some circumstances (invalid host/uris, for instance).
>> Making such a location change asynchronous just because of localization
>> is going to break a *lot* of assumptions, not to mention require
>> rewriting a bunch of yucky docshell code that will then probably break
>> some more assumptions...
>
> Yay! :)
>
> If you're saying that we're generating URLs with localized messages in them, then I'd question to design...

http://searchfox.org/mozilla-central/rev/ff5673acd6a38a43bc250a2baa47df8fe6ef7859/docshell/base/nsDocShell.cpp#5150-5151

http://searchfox.org/mozilla-central/rev/ff5673acd6a38a43bc250a2baa47df8fe6ef7859/docshell/base/nsDocShell.cpp#5291-5294

Nobody I know of particularly likes this code (I believe this to be true
much more generally than this particular quirk of it, for that matter -
error page handling is "interesting".), but it does exist, yes. The
problem is that the error page is unprivileged and so doesn't have
access to the stringbundles one would like to use, so it's not entirely
trivial to fix. Certainly not unchangeable, though...

~ Gijs

zbran...@mozilla.com

Jun 14, 2016, 12:06:16 AM6/14/16

to

On Monday, June 13, 2016 at 9:39:32 AM UTC+1, Gijs Kruitbosch wrote:
> Separately, the documentation put forth so far seems to indicate that
> the localization itself is also async, on top of the asyncness of the
> mutationobserver approach, and that could potentially result in flashes
> of unlocalized content, depending on "how" asynchronous that API really
> ends up being. (AFAIK, if the API returned an already-resolved promise,
> there might be less chance of that than if it actually went off and did
> IO off-main-thread, then came back with some results.)

The DOM localization that is used in response to MutationObserver is sync.

zb.

wax miguel

Jun 14, 2016, 12:33:35 AM6/14/16

to zbran...@mozilla.com, dev-pl...@lists.mozilla.org

looking forward! thanks!

Axel Hecht

Jun 14, 2016, 1:56:59 AM6/14/16

to

... unless strings trigger a load, either if the initial suite of
localizations isn't loaded yet, or the loaded strings trigger a runtime
error, which requires more l10n files to be loaded. That's obviously
cached, so it only happens on the first occasion.

Axel

Joe Walker

Jun 14, 2016, 6:51:16 AM6/14/16

to Axel Hecht, dev-pl...@lists.mozilla.org

I don't think you can say "It's sync unless <something> in which case it's
async".
If that's the case, then from the API consumer's point of view (deep
voodoo notwithstanding) it's async.

Joe.

zbran...@mozilla.com

Jun 15, 2016, 4:31:45 AM6/15/16

to

On Tuesday, June 14, 2016 at 11:51:16 AM UTC+1, Joe Walker wrote:
> I don't think you can say "It's sync unless <something> in which case it's
> async".
> If that's that case then from the API consumers point of view, then (deep
> voodoo withstanding) it's async.

As weird as it sounds, I believe that you actually can in this case.

Because the API is declarative, we can translate DOM synchronously and if we encounter an error, we can either synchronously or asynchronously get the fallback.

Which means that we're only dealing with async (potentially) when we hit an error scenario.

Dealing with worse performance in error scenarios is still significantly better than the current situation where we just crash.

And as Axel pointed out, we can handle the error scenario sync or async; that's a decision that doesn't affect our architecture.
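A rough sketch of the shape of that argument, under stated assumptions: translateElement, the Map-based locale data and loadFallback are all hypothetical, and the fallback is shown as async only to illustrate one of the two options. Because the caller only declares what should be translated, it never has to deal with a return value that is sometimes a string and sometimes a promise.

```javascript
// Hypothetical: translate one element from already-loaded data, falling back on a miss.
function translateElement(element, id, primary, loadFallback) {
  if (primary.has(id)) {
    // Common case: the translation is applied synchronously.
    element.textContent = primary.get(id);
    return;
  }
  // Error case: load the next locale and apply its value when it arrives.
  loadFallback().then(fallback => {
    element.textContent = fallback.has(id) ? fallback.get(id) : id;
  });
}

const primary = new Map([['title', 'Tytuł']]);
const loadFallback = () =>
  Promise.resolve(new Map([['title', 'Title'], ['footer', 'Footer']]));

const title = { textContent: '' };
translateElement(title, 'title', primary, loadFallback);
console.log(title.textContent); // Tytuł

const footer = { textContent: '' };
translateElement(footer, 'footer', primary, loadFallback);
console.log(footer.textContent === ''); // true, the fallback has not arrived yet
```

Swapping loadFallback for a synchronous lookup would change only the error branch, which is the sense in which the choice does not affect the architecture.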

zb.