RFD: L20n Context Data

Zibi Braniecki

Aug 7, 2015, 7:02:02 PM

to mozilla-t...@lists.mozilla.org

I'd like to bring back the concept of context data from the L20n 1.0.x line.

We didn't add it in 2.x and we don't have it in 3.x yet, but I'd like to get us all on the same page regarding the value of this feature in the future.

The idea goes like this.

We have a concept that a set of strings can operate within one context.
We call it an l10n context.

It may look like this (using pseudo-API for simplicity):

var ctx = new Context('diaspora.org/board');
ctx.addResource('./menu.{locale}.resource');
ctx.addResource('./settings.{locale}.resource');
ctx.addResource('./board.{locale}.resource');

ctx.get('latestNews');
ctx.get('newMessages', {
  messageCount: 5
});
ctx.get('events', {
  count: 10
});

Now, the view that this context is used for is a particular person's board on the Diaspora social network.
That person has characteristics, such as gender or name, that may affect the translation of some UI elements in this view.

In another case, we can imagine a download manager window that has a number of files being downloaded currently.

Now, there are two aspects in play here. First of all, a lot of elements may be affected by those variables in one way or another.

It can be easily addressed by adding these bits of data to the relevant entity calls:

ctx.get('newMessages', {
  messageCount: 5,
  user: {
    name: 'John',
    gender: 'male'
  }
});

ctx.get('events', {
  count: 5,
  user: {
    name: 'John',
    gender: 'male'
  }
});

but that's tedious, and on top of that it increases complexity in the HTML bindings scenario, where you'd have to set data-l10n-args on all affected elements.

Second of all, it is possible that the developer doesn't know which entities may be affected.

Do we know which messages on the Diaspora screen should be affected by the user's gender? Or maybe the Pause button in the download manager window should be translated differently in some languages depending on whether any files are currently being downloaded?

Context data comes to the rescue.

Context data is a set of variables that the developer defines on the whole localization context and can be accessed by any entity resolved within that context.

It may look something like this:

ctx.data = {
  user: {
    name: 'John',
    gender: 'male'
  }
};

ctx.get('latestNews');
ctx.get('newMessages', {
  messageCount: 5
});
ctx.get('events', {
  count: 10
});

and localizers can use user.name and user.gender from any of the entities.
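
As a rough illustration of the idea (not the actual L20n API), the sketch below models a context whose `data` is merged into the arguments of every `get()` call. `addEntity`, the function-style entities, and the rule that per-call arguments win over context data are assumptions made for the example; the resource files from the earlier snippet are replaced by inline functions so that it runs on its own.

```javascript
// Hypothetical sketch: Context, addEntity and the function-based entities
// are illustrative stand-ins, not the real L20n API.
class Context {
  constructor(id) {
    this.id = id;
    this.data = {};            // context-wide data, visible to every entity
    this.entities = new Map(); // entity id -> function(args) => string
  }

  addEntity(id, format) {
    this.entities.set(id, format);
  }

  get(id, args = {}) {
    const format = this.entities.get(id);
    if (!format) {
      throw new Error(`Unknown entity: ${id}`);
    }
    // Assumption: per-call args shadow context data on a name clash.
    return format({ ...this.data, ...args });
  }
}

const ctx = new Context('diaspora.org/board');
ctx.data = { user: { name: 'John', gender: 'male' } };

// The developer never passes `user` to these calls; the "translation"
// picks it up from the context on its own.
ctx.addEntity('newMessages', ({ user, messageCount }) =>
  `${user.name} has ${messageCount} new messages`);
ctx.addEntity('latestNews', ({ user }) =>
  (user.gender === 'female' ? 'Her latest news' : 'His latest news'));

console.log(ctx.get('newMessages', { messageCount: 5 })); // John has 5 new messages
console.log(ctx.get('latestNews'));                       // His latest news
```

Neither call passes `user`, yet both entities can read it, which is the point of defining it once on the context.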

Now, you may ask, what about HTML API? How do we provide context data for HTML localization which happens before JS code is executed?

In 1.x we provided a <script> tag with JSON data that stored this, but I believe now that it was an unreasonable approach because it required build time variable resolution for client-side UI.

Instead, I suggest that we don't resolve entities that use context data until context data is resolved.

I'm not sure yet exactly what it would look like, but in my vision it works like this:

1) HTML has l10n-ids
2) Some of them require context data (I don't know yet how they'd be identified)
3) When we load the HTML, we resolve all entities except for the ones that require context data
4) When JS is launched it adds context data which in turn localizes the remaining elements
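
The four steps above could be sketched roughly as follows. This is only a model under stated assumptions: plain objects stand in for DOM elements, `DeferredView` and `setData` are made-up names, and the `needsCtxData` flag is one hypothetical answer to the open question in step 2.

```javascript
// Hypothetical sketch: plain objects stand in for DOM elements, and the
// `needsCtxData` flag is an assumed way to mark entities (step 2 is left
// open in the proposal).
class DeferredView {
  constructor(entities) {
    this.entities = entities; // id -> { needsCtxData, format(data) }
    this.ctxData = null;
    this.pending = [];        // elements waiting for context data
  }

  // Step 3: localize everything that does not depend on context data.
  localize(elements) {
    for (const el of elements) {
      const entity = this.entities[el.l10nId];
      if (entity.needsCtxData && this.ctxData === null) {
        this.pending.push(el);
      } else {
        el.textContent = entity.format(this.ctxData || {});
      }
    }
  }

  // Step 4: context data arrives from JS; localize what was deferred.
  setData(data) {
    this.ctxData = data;
    const waiting = this.pending;
    this.pending = [];
    this.localize(waiting);
  }
}

const entities = {
  title: { needsCtxData: false, format: () => 'Board' },
  greeting: { needsCtxData: true, format: (d) => `Welcome back, ${d.user.name}` },
};
const elements = [
  { l10nId: 'title', textContent: '' },
  { l10nId: 'greeting', textContent: '' },
];

const view = new DeferredView(entities);
view.localize(elements);                  // only `title` is filled in here
view.setData({ user: { name: 'John' } }); // now `greeting` is resolved too
```

The developer only calls `setData` once; no element has to be localized by hand from JS.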

I think it would actually be really useful for all scenarios where devs currently have to localize HTML elements from JS just because they need to pass a single variable resolved at runtime.

I don't have details on how exactly HTML API would work, but I think it's pretty clear how the JS API should.

What do you think?
zb.

Staś Małolepszy

Aug 10, 2015, 11:39:30 AM

to Zibi Braniecki, mozilla-t...@lists.mozilla.org

Hey Zibi, thanks for starting the thread. This is something that I've
been meaning to tackle for some time now.

On Sat, Aug 8, 2015 at 1:01 AM, Zibi Braniecki <zbigniew....@gmail.com> wrote:

>
> It can be easily addressed by adding these bits of data to the relevant
> entity calls […] but that's mundane and on top of that, it increases
> complexity for HTML bindings scenario where you'd have to set
> data-l10n-args on all affected elements.
>
> Second of all, it is possible that the developer doesn't know which
> entities may be affected.
>

This is the key argument for having a system in which some data is
available context-wide. This is how we unlock new possibilities in
translations. I'm really glad that we're coming back to this idea.

> Context data is a set of variables that the developer defines on the whole
> localization context and can be accessed by any entity resolved within that
> context.
>

I wonder if there's an opportunity here to make L20n simpler by providing
context data in the form of either a) global(s), b) a special var, or even c)
a special entity.

What bothers me right now is that we have a special syntax for globals (@),
which are context-wide, and a special syntax for vars ($), which can either
be context-wide or local to the entity. This isn't consistent and also
makes it harder for tools to know what's going on. For instance, detecting
a missing reference to a var is hard because the tool has to know that a
particular reference might be a context-wide arg not used in the source
language but required in the translation.

Perhaps we should design our data split (and the syntax) around the concept
of being context wide vs. local to the entity? This would put global and
the context-wide data in one group, and local vars in the other.

I don't have answers to the specifics yet, but maybe the user gender should
be exposed as @gender (a custom global defined by the developer), or
perhaps @ctx.gender or @ctx('gender') to avoid name collisions.

Or, we could go the opposite direction: have a special var which is the
namespace for all context-wide data: $ctx.gender. Which would also be a
way to remove globals altogether ;) $ctx.plural or $ctx.deviceType could
work well, too.

What do you think about this grouping-by-scope instead of the current
grouping-by-provider?

> In 1.x we provided a <script> tag with JSON data that stored this, but I
> believe now that it was an unreasonable approach because it required build
> time variable resolution for client-side UI.
>

I don't think it's that unreasonable. For things like context-wide
user-gender, we're likely talking about the currently logged-in user. We
can have the server provide this kind of data to the client in the form of
HTML. For all-clientside apps, we could still insert the required <script>
before l20n.js is initialized, or pass a promise with the context data to
the initialization of the document.l10n View.

> I don't have details on how exactly HTML API would work, but I think it's
> pretty clear how the JS API should.
>

Do we still want to have a method for updating the ctx data
(ctx.updateData(…)), like we used in v1.x? The rationale was that we
wanted to know when some ctx arg changed value and possibly react to it.
Can we use Object.observe() already? :)
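
For what it's worth, the Object.observe() proposal was later withdrawn from standardization, so an explicit update method is the safer shape to illustrate. The sketch below is a minimal take on it, not the v1.x implementation: `ObservableContext` and `onChange` are hypothetical names, and only top-level keys are compared.

```javascript
// Hypothetical sketch: ObservableContext, updateData and onChange are
// illustrative names, not the v1.x implementation.
class ObservableContext {
  constructor() {
    this.data = {};
    this.listeners = [];
  }

  onChange(listener) {
    this.listeners.push(listener);
  }

  // Merge `patch` into the context data and report which top-level keys
  // actually changed value, so callers can re-translate only when needed.
  updateData(patch) {
    const changed = Object.keys(patch).filter(
      (key) => this.data[key] !== patch[key]
    );
    if (changed.length === 0) {
      return;
    }
    this.data = { ...this.data, ...patch };
    for (const listener of this.listeners) {
      listener(changed, this.data);
    }
  }
}

const ctx = new ObservableContext();
const log = [];
ctx.onChange((changed) => log.push(changed));

ctx.updateData({ gender: 'female', unread: 3 }); // notifies: gender, unread
ctx.updateData({ unread: 3 });                    // same value, no notification
ctx.updateData({ unread: 4 });                    // notifies: unread
```

A view could register a listener like this and re-localize only the elements whose entities depend on the changed keys.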

-stas

Zibi Braniecki

Aug 11, 2015, 6:09:20 PM

to mozilla-t...@lists.mozilla.org

On Monday, August 10, 2015 at 8:39:30 AM UTC-7, Staś Małolepszy wrote:
> What bothers me right now is that we have a special syntax for globals (@),
> which are context-wide, and a special syntax for vars ($), which can either
> be context-wide or local to the entity. This isn't consistent and also
> makes it harder for tools to know what's going on. For instance, detecting
> a missing reference to a var is hard because the tool has to know that a
> particular reference might be a context-wide arg not used in the source
> language but required in the translation.
>
> Perhaps we should design our data split (and the syntax) around the concept
> of being context wide vs. local to the entity? This would put global and
> the context-wide data in one group, and local vars in the other.
>
> I don't have answers to the specifics yet, but maybe the user gender should
> be exposed as @gender (a custom global defined by the developer), or
> perhaps @ctx.gender or @ctx('gender') to avoid name collisions.
>
> Or, we could go the opposite direction: have a special var which is the
> namespace for all context-wide data: $ctx.gender. Which would also be a
> way to remove globals altogether ;) $ctx.plural or $ctx.deviceType could
> work well, too.
>
> What do you think about this grouping-by-scope instead of the current
> grouping-by-provider?

I don't have a strong opinion on @ vs. $ctx for globals. I slightly prefer '@', just because $ctx brings minor annoyances like having to deal with name overlap or special-casing variable names.
But I'm ok with doing that if we have a reason to avoid '@'.

On the other hand, I do have a strong opinion on grouping-by-provider. I feel that the dual nature of variables (coming from the context or from the call) is actually a feature, and a very desirable one for me, while globals are context-scoped by nature.

I must say that I don't see anything confusing in that.

The dual nature of variables makes sense to me because it preserves the separation of concerns. The localizer doesn't care how the developer provides the variable, and the developer can define a context-wide one and then override it locally for a particular call.

> I don't think it's that unreasonable. For things like context-wide
> user-gender, we're likely talking about the currently logged-in user. We
> can have the server provide this kind of data to the client in the form of
> HTML. For all-clientside apps, we could still insert the required <script>
> before l20n.js is initialized, or pass a promise with the context data to
> the initialization of the document.l10n View.

I don't think server-side code should inject content like that into HTML. I believe it should be fetched through an API at runtime.
So I still think that a <script> tag is not helpful here.

> Do we still want to have a method for updating the ctx data
> (ctx.updateData(…)), like we used in v1.x? The rationale was that we
> wanted to know when some ctx arg changed value and possibly react to it.

I think so.

> Can we use Object.observe() already? :)

Yes.

zb.
p.s. I'd like to make sure that Axel, you and I agree on the big-picture idea of having data from the developer provided to the whole context before we delve into syntactic details.

Staś Małolepszy

Aug 12, 2015, 8:38:54 AM

to Zibi Braniecki, mozilla-t...@lists.mozilla.org

On Wed, Aug 12, 2015 at 12:09 AM, Zibi Braniecki <zbigniew....@gmail.com> wrote:

>
> On the other hand I do have a strong opinion on grouping-by-provider. I
> feel that dual-nature of variable (coming from context or from the call) is
> actually a feature and a very desired one for me while globals are
> context-scoped by nature.
>
> I must say that I don't see anything confusing in that.
>
> The dual-nature of variables makes sense to me because they are preserving
> the isolation of concerns. Localizer doesn't care how the developer
> provides the variable, and the developer can define a context one, and then
> override it locally for a particular call.
>

What the localizer cares about is which variable can be used in the string
and which can't. It's not clear to me what our plan to make this easier
is. How is the localizer supposed to know if they can use $gender or $n in
other strings? What can we do to make tools help the localizers?

If context data was part of the global scope (whatever the syntax might be
for it; let's put this aside for now, as you say), the distinction is
clear. $n is a local arg passed to the entity. It can't be used freely in
other entities. @gender and @plural are global, on the other hand, and can
be used in all entities. Another way of thinking about that is that
globals are context-wide data that's always available in every context. I
also don't expect there to be many use-cases for context-wide data provided
by the developer. The gender is probably *the* most important one. Do we
know of any other?
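
To make the tooling argument concrete, here is a small sketch of the kind of check a tool could run if scope were visible in the syntax. It assumes the `@name` (context-wide) and `$name` (local) split discussed above; `findUnknownRefs` and the simple regular expression are illustrative only and are not the L20n parser.

```javascript
// Hypothetical sketch: assumes `@name` is always context-wide and `$name`
// is always local to the entity, as proposed in this message.
function findUnknownRefs(translation, localArgs, globals) {
  const refs = translation.match(/[$@][A-Za-z_]\w*/g) || [];
  const unknown = refs.filter((ref) => {
    const name = ref.slice(1);
    // The sigil alone tells the tool which list to check against.
    const scope = ref[0] === '@' ? globals : localArgs;
    return !scope.includes(name);
  });
  return [...new Set(unknown)]; // report each unknown reference once
}

const globals = ['gender', 'plural'];

// `$n` was passed to this entity, `@gender` is available everywhere.
console.log(findUnknownRefs('@gender has $n new messages', ['n'], globals)); // []

// This entity receives no local args, so `$n` is flagged.
console.log(findUnknownRefs('You have $n events', [], globals)); // [ '$n' ]
```

With the current dual-nature `$` variables, the same tool would additionally need a list of context-wide `$` names before it could tell a typo from a legitimate reference.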

In fact, I'm starting to think that maybe all globals should be
developer-provided. That is, the developer should have the control over
which globals they want to provide. There would be sane defaults for the
given environment (@screen for browsers etc), and the developer could add
their own "globals", like gender. The benefit of this approach would be
that all global references would be grouped together which would make it
easier to understand where they can be used and also it would be easier to
instrument tools about globals available in the current context. Right now
we'd need to instrument twice: once for globals and once for context-wide
data provided by the developer.
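
A possible shape for this, sketched under assumptions: `createGlobals` is a made-up helper, globals are modelled as zero-argument provider functions, and the `screen` default returns a fixed stand-in value instead of querying the browser.

```javascript
// Hypothetical sketch: createGlobals and the provider functions are
// illustrative; a real `screen` global would query the environment.
function createGlobals(defaults, custom = {}) {
  // Developer-provided entries are layered over the environment defaults.
  const providers = { ...defaults, ...custom };
  const has = (name) => Object.prototype.hasOwnProperty.call(providers, name);
  return {
    // A single list that tools can be pointed at.
    names: () => Object.keys(providers).sort(),
    get: (name) => {
      if (!has(name)) {
        throw new Error(`Unknown global: @${name}`);
      }
      return providers[name]();
    },
  };
}

const browserDefaults = {
  screen: () => ({ width: 1280 }), // stand-in for a real screen query
};

const globals = createGlobals(browserDefaults, {
  gender: () => 'male',            // developer-provided "global"
});

console.log(globals.names());       // [ 'gender', 'screen' ]
console.log(globals.get('gender')); // male
```

Because defaults and developer additions end up in one registry, a tool only has to be instrumented once to learn every name that is valid context-wide.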

-stas

Axel Hecht

Aug 13, 2015, 7:00:34 AM

to mozilla-t...@lists.mozilla.org

Hi,

tbh, I'm not so fond of global context data. I find it confusing and
probably error prone.

I do see value in creating views(?) with data associated to them, just
so that folks don't need to repeat argument passing a few times.

I can see a similar aspect in allowing l10n data to be associated with
any parent element of a localizable html element, too. The benefit of
not enforcing this to be global is that you can actually put l10n data
onto things like contacts cards, and have multiple per document.

Axel

zbran...@mozilla.com

Sep 28, 2015, 3:56:26 PM

to mozilla-t...@lists.mozilla.org

On Wednesday, August 12, 2015 at 5:38:54 AM UTC-7, Staś Małolepszy wrote:
> What the localizer cares about is which variable can be used in the string
> and which can't. It's not clear to me what our plan to make this easier
> is. How is the localizer supposed to know if they can use $gender or $n in
> other strings? What can we do to make tools help the localizers?

I have no idea.

> If context data was part of the global scope (whatever the syntax might be
> for it; let's put this aside for now, as you say), the distinction is
> clear. $n is a local arg passed to the entity. It can't be used freely in
> other entities. @gender and @plural are global, on the other hand, and can
> be used in all entities. Another way of thinking about that is that
> globals are context-wide data that's always available in every context.

Ok, I see your point. That doesn't sound bad.

> I also don't expect there to be many use-cases for context-wide data provided
> by the developer. The gender is probably *the* most important one. Do we
> know of any other?

No, and I think it's hard to predict at this point. It's just too early. We still haven't started using the L20n file format in any app. Once we do, and once we get more UIs that use L20n features, I expect we'll start seeing more duplication of l10n-args that calls for de-duplication, and that's where ctxdata comes in. :)

As a result, I'm ok with holding off on this feature and not adding it yet. I just wanted to make sure that we have a chance to discuss it before we get to the point where we start trying to solve a problem with it.

> In fact, I'm starting to think that maybe all globals should be
> developer-provided. That is, the developer should have the control over
> which globals they want to provide. There would be sane defaults for the
> given environment (@screen for browsers etc), and the developer could add
> their own "globals", like gender. The benefit of this approach would be
> that all global references would be grouped together which would make it
> easier to understand where they can be used and also it would be easier to
> instrument tools about globals available in the current context. Right now
> we'd need to instrument twice: once for globals and once for context-wide
> data provided by the developer.

I'm not sure if I like that. I see a difference between environment variables like @screen or @hour, and UI-specific ones like @gender or @user.name.

But those are likely implementation details that we can narrow down once we accumulate more examples of where ctxdata would solve problems.

zb.

zbran...@mozilla.com

Sep 28, 2015, 4:03:59 PM

to mozilla-t...@lists.mozilla.org

On Thursday, August 13, 2015 at 4:00:34 AM UTC-7, Axel Hecht wrote:
> I do see value in creating views(?) with data associated to them, just
> so that folks don't need to repeat argument passing a few times.

I believe that that's exactly what we're thinking about. :)

The way I think about it is that we could have multiple views per context (while currently we have 1-1), and each view could have its own ctxdata (so maybe view data?).

For example, the document's View would have its own data, while a web component's view (while sharing the l10n context) would provide its own data.

I can then see how a navigation bar could in the future have its own data etc.

> I can see a similar aspect in allowing l10n data to be associated with
> any parent element of a localizable html element, too. The benefit of
> not enforcing this to be global is that you can actually put l10n data
> onto things like contacts cards, and have multiple per document.

Yeah, I like it. I would like to tinker more with this thought, but I can imagine something like:

<element l10n-view-args="">
<h1 l10n-id="foo"></h1>
</element>

where foo has access to viewArgs defined on the element.
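
One way such a lookup might work is sketched below, purely as a thought experiment. Plain objects with a `parent` link stand in for DOM elements, `viewArgs` stands in for a parsed `l10n-view-args` attribute, and both `collectViewArgs` and the rule that a closer ancestor shadows an outer one are assumptions.

```javascript
// Hypothetical sketch: plain objects with a `parent` link stand in for DOM
// elements, and `viewArgs` stands in for a parsed l10n-view-args attribute.
function collectViewArgs(element) {
  const chain = [];
  for (let node = element; node; node = node.parent) {
    if (node.viewArgs) {
      chain.unshift(node.viewArgs); // outermost ancestor first
    }
  }
  // Assumption: args on a closer ancestor shadow those further out.
  return Object.assign({}, ...chain);
}

// Two contact cards in one document, each carrying its own data.
const doc = { parent: null, viewArgs: { appName: 'Contacts' } };
const cardA = { parent: doc, viewArgs: { name: 'Anna', gender: 'female' } };
const cardB = { parent: doc, viewArgs: { name: 'John', gender: 'male' } };
const headingA = { parent: cardA, l10nId: 'foo' };
const headingB = { parent: cardB, l10nId: 'foo' };

console.log(collectViewArgs(headingA));
// { appName: 'Contacts', name: 'Anna', gender: 'female' }
console.log(collectViewArgs(headingB));
// { appName: 'Contacts', name: 'John', gender: 'male' }
```

Both headings use the same l10n-id, yet each resolves against the data of the card it sits in, which matches the contact cards case Axel describes.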

zb.