Yesterday Jono and I discussed some of the remaining Big Issues to
resolve before making Parser 2 the default parser for Ubiquity [1] and
it was very clear that deciding on an approach for making Ubiquity
commands and nountypes localizable is our number one priority right
now. We got a good conversation started in our meeting, but as it was
getting long, we're moving it back to the listhost.
Here's a recap of our discussion:
[1] http://mitcho.com/blog/projects/big-issues-and-small-issues-with-parser-2/
--
# OUR GOAL
We would like to make two types of data localizable:
1. Ubiquity commands. These are Ubiquity "verbs" which may take
certain arguments and whose actions can be previewed and executed.
2. Ubiquity nountypes. Each nountype defines a class of argument
strings which may be accepted as an argument to a verb. For example,
built-in nountypes include number, language, URL, date.
Note that "localization of commands" and "localization of nountypes"
are two fundamentally different things. The commands require some
localized strings (the verb "names", messages in the preview and
execute functions, etc.), but a localization should not be able to change the
command's fundamental preview and execute actions (logic). Nountype
localization, however, requires updated logic: for example,
noun_type_date may accept different sets of strings when running in
different languages, due to the differing date formats of the locales.
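To illustrate why nountype localization involves logic and not just strings, here is a rough sketch. It is not Ubiquity's actual nountype API: dateParsers, parseDate, and the locale keys are hypothetical names, and the parsers only check the shape of the string, not whether the date is valid.

```javascript
// Hypothetical per-locale parsers for a date nountype. Each parser returns
// {year, month, day} for a string it accepts, or null otherwise.
const dateParsers = {
  // US English: month/day/year
  "en-US": (text) => {
    const m = /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/.exec(text);
    return m ? { year: +m[3], month: +m[1], day: +m[2] } : null;
  },
  // French: day/month/year
  fr: (text) => {
    const m = /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/.exec(text);
    return m ? { year: +m[3], month: +m[2], day: +m[1] } : null;
  },
};

// Parse `text` with the parser for `locale`; null if there is no parser
// for that locale or the string is not accepted.
function parseDate(text, locale) {
  const parser = dateParsers[locale];
  return parser ? parser(text) : null;
}
```

The same input, parseDate("03/04/2009", ...), is read as March 4 under "en-US" and as April 3 under "fr", which is a difference no string table alone can express.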
We can break down the question of "how to make these localizable" into
two subproblems:
1. What will be the data structure of localized commands/nountypes
within Ubiquity?
2. How do we distribute/share these localizations?
I (mitcho) believe these subproblems are orthogonal.
# THE DATA STRUCTURE QUESTION: two approaches
There are broadly two approaches to the data structure question:
gettext-style string replacement vs. a unified object.
## gettext-style string replacement
In this approach, a verb might look like {name: _('move'),...} and the
underscore function uses the base string (here, 'move') as a key and
replaces it with the active locale's version at runtime. This
"dictionary" could be provided in the regular gettext-style (po or mo)
or in JSON.
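A minimal sketch of what the underscore function could look like, assuming a JSON-style dictionary. The names catalogs and activeLocale are made up for illustration and are not existing Ubiquity code.

```javascript
// Hypothetical catalogs: for each locale, base string -> localized string.
// In practice this could be loaded from a po/mo file or from JSON.
const catalogs = {
  fr: { move: "porte" },
};

// Assumed global setting for the active locale (illustration only).
let activeLocale = "fr";

// Use the base string as a key into the active locale's catalog;
// fall back to the base string itself when there is no translation.
function _(base) {
  const catalog = catalogs[activeLocale] || {};
  return Object.prototype.hasOwnProperty.call(catalog, base)
    ? catalog[base]
    : base;
}

const verb = { name: _("move") };
```

Note that in this sketch the lookup happens once, when the verb object is created, so switching activeLocale afterwards would not change verb.name.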
PROS:
1. People are used to it (esp. in the Unix world).
2. Cleanly separates strings from logic.
3. Doesn't require (much) knowledge of JS.
CONS:
1. Requires (unless we use some magic) command authors to use _() to
make strings localizable.
2. Doesn't allow localization of logic (js).
3. Some things are complicated: How would you gettext an array of
options, say? eval(_("[list]"))? _("list").split('|')? Would we use
templates for messages like "translating the selection from (source
language) to (goal language)"?
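To make con 3 concrete, here is one possible sketch of the split('|') idea and of a template for messages. The option list, the ${name} placeholder syntax, and the fillTemplate helper are all assumptions for illustration, not a proposal for the final format.

```javascript
// Hypothetical French catalog; keys are the base (English) strings.
const catalog = {
  "small|medium|large": "petit|moyen|grand",
  "translating the selection from ${source} to ${goal}":
    "traduction de la sélection de ${source} vers ${goal}",
};

// Same idea as a gettext lookup: base string in, localized string out.
function _(base) {
  return Object.prototype.hasOwnProperty.call(catalog, base)
    ? catalog[base]
    : base;
}

// Fill ${name} placeholders from a values object; unknown names are left as-is.
function fillTemplate(template, values) {
  return template.replace(/\$\{(\w+)\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(values, name) ? values[name] : match
  );
}

// A list of options is localized as one delimited string, then split.
const sizes = _("small|medium|large").split("|");

// A message with arguments is localized as a template, then filled in.
const message = fillTemplate(
  _("translating the selection from ${source} to ${goal}"),
  { source: "anglais", goal: "japonais" }
);
```

Both tricks work, but they depend on the translator preserving the delimiter and the placeholder names exactly, which is part of why this counts as a con.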
## Unified object approach
In this approach, a verb might look like {name: {en: 'move', fr:
'porte',...},...}. If I write a command like {name: {en: 'move'},...},
someone else could make a French copy {name: {fr: 'porte'},...} and
the objects could be unified.[2]
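One way the unification step could work is a recursive merge, sketched below. The unify and localizedName functions are hypothetical, and falling back to English is an assumption rather than something we decided.

```javascript
// True for plain (non-array, non-null) objects, which are merged key by key.
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Recursively merge `extra` into a copy of `base`. Where both sides hold a
// plain object the two are merged; otherwise the value from `extra` wins.
function unify(base, extra) {
  const result = { ...base };
  for (const [key, value] of Object.entries(extra)) {
    result[key] =
      isPlainObject(value) && isPlainObject(result[key])
        ? unify(result[key], value)
        : value;
  }
  return result;
}

const original = { name: { en: "move" } };
const frenchCopy = { name: { fr: "porte" } };
const unified = unify(original, frenchCopy);
// unified.name now has both an `en` and a `fr` entry.

// Pick the name for a locale, falling back to English (an assumed default).
function localizedName(verb, locale) {
  return verb.name[locale] !== undefined ? verb.name[locale] : verb.name.en;
}
```

Because values that are not plain objects are simply overridden, the same merge would also let a localization replace a function on a nountype, which is what makes localizing logic possible in this approach.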
PROS:
1. Enables localization of logic.
2. Doesn't require diligent wrapping of all strings with some function
(cf. gettext's _()).
CONS:
1. Requires some knowledge of JS.
2. Logic and strings are mixed.
[2] How exactly this happens depends on the distribution question, but
manual unification (by the command author) and automated unification
(via a centralized repository/authority (the herd) or in Ubiquity on
the client) are both possible.
## Thoughts
At the end of our conversation we tentatively concluded that the
gettext approach might be better suited to verbs, which only need
localized strings, while the unified object approach is better suited
to nountypes, which may need localized logic.
# THE DISTRIBUTION/SHARING QUESTION
In our meeting we didn't get around to discussing the
distribution/sharing question much, but I'll jot down some initial thoughts:
1. Don't require the command author to collect/redistribute
localizations.
2. Don't require the user to subscribe to the command + localizations
separately.
3. There are benefits and downsides to both centralizing (for example,
doing it on the herd) and decentralizing (as current commands are).
--
I hope we can get this discussion rolling on the distribution
question... I'd also love to get some feedback on the two approaches
to the data structure question from some of our l10n folks.
Thanks!
mitcho
--
mitcho (Michael 芳貴 Erlewine)
mit...@mitcho.com
http://mitcho.com/
linguist, coder, teacher