Locale support proposal


Pavel Kunc

Jul 29, 2008, 7:25:05 PM
to merb_global
Now that we have L10n translations in place, let's give some love to
Locale. I looked at some resources and at other frameworks/languages,
and here are some ideas.

We need locale information that is available both to merb_global itself
and to users, in order to support internationalization. In particular,
for each language/region we need:
* Number, Date, DateTime, Time formats
* Language information (codes, direction, name translations)
* Translations for days and months
* Measurement units
* Collation?
* Fallback locale?
* Currency?

We can get all of this information from http://www.unicode.org/cldr in
XML format.

To provide users with the locale information we would create a Locale
class which carries all of the current locale information. The first
question is whether we provide a rich API (Locale.days, Locale.months,
Locale.number_format), a flat API that just returns a Hash, or some query
function to get the desired info.

We need to provide a means to set/get the current Locale, or to get Locale
information based on ISO 639-1/2 and ISO 3166-1 codes.

Methods:
Locale.current
Locale.get(lang_code)
Locale.set(lang_code)

Setting the locale will set Thread.current_locale.
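
A minimal sketch of the three methods above, assuming the current locale is kept in a per-thread slot. Here Thread.current[:locale] stands in for the Thread.current_locale shorthand; the lang_code attribute, the :locale key and the 'en' default are assumptions made for illustration, not part of merb_global.

```ruby
# Hypothetical sketch; not the merb_global implementation.
class Locale
  attr_reader :lang_code

  def initialize(lang_code)
    @lang_code = lang_code.to_s
  end

  # Look up (and memoize) the Locale object for a language code.
  def self.get(lang_code)
    @locales ||= {}
    @locales[lang_code.to_s] ||= new(lang_code)
  end

  # Make the given language the current locale for the calling thread only.
  def self.set(lang_code)
    Thread.current[:locale] = get(lang_code)
  end

  # Current locale for the calling thread; 'en' is an assumed default.
  def self.current
    Thread.current[:locale] || get('en')
  end
end

Locale.set('cs')
puts Locale.current.lang_code
```

Because the value lives in Thread.current, a locale set while handling one request does not leak into a request served by another thread.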

We currently have providers for Dates and Numbers, so we could keep
them and get the information needed for proper formatting from the Locale
object. That would be one set of providers; we could then also keep the
current ones, which rely on the OS.

This proposed RICH Locale information could also be one of the
Locale information providers, for users who need more info. Another
provider could supply simple locale information, for example only
Locale.language_code.

We need to store the locale information somehow. I think it would be best
to find some native Ruby representation which needs few dependencies,
if any. Ideas:

YAML - native support
JSON - needs a gem, so not a good one
XML - needs lots of parsing
Marshaling - Simple
PStore - simple, fast?
Hash - simple, fast

I think that PStore, Marshaling or Hash are good candidates as we save
and get native objects. And we can have a rake task which generates our
representation from the XML distribution of the CLDR.
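
To make the Marshaling option concrete, here is a small sketch of the round trip: a generator step dumps a plain Hash to a file and the library reads it back. The helper names, the file name and the sample German data are all made up for illustration; real data would come from the CLDR XML, which this sketch does not parse.

```ruby
require 'tmpdir'

# Hypothetical helper for the generator side (e.g. called from a rake task).
def dump_locale(path, data)
  File.open(path, 'wb') { |f| Marshal.dump(data, f) }
end

# Hypothetical helper for the runtime side: returns the native Ruby object.
def load_locale(path)
  File.open(path, 'rb') { |f| Marshal.load(f) }
end

# Hand-written sample; not taken from CLDR.
de = {
  :days   => %w[Montag Dienstag Mittwoch Donnerstag Freitag Samstag Sonntag],
  :number => { :decimal => ',', :group => '.' }
}

Dir.mktmpdir do |dir|
  path = File.join(dir, 'de.dump')
  dump_locale(path, de)
  puts load_locale(path)[:number][:decimal]
end
```

One trade-off worth keeping in mind: the Marshal format is tied to a Marshal version, so dumped files are a generated artifact to be rebuilt rather than hand-edited.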

What needs to be localized - format?

Date, DateTime and Time objects.
Numbers
Measurement units
Currency

All of these should be localized by the #localize method using the correct provider.

About the API for the end user: we currently don't alter the String,
Fixnum or Date classes, which is really good.

Current:
#_() method for both translations and localizations

It's good that we have just one method, but the parameters are
significant and differ between translations and localizations.

Globalize plugin:
#_() or .t
.localize

It currently adds .t and .localize methods to String and other
classes, which should be avoided.

Rails new API:
#translate
#localize

Just module methods. It separates the API for Translations and Localization.
I personally like that the most, also because it could clean up the _()
method. Of course we could have _() and #localize.
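
As a rough illustration of the "just module methods" style, the sketch below keeps translation and localization as two separate module functions and leaves String and Date untouched. The module name Global, the method signatures and the tiny in-memory tables are assumptions; the date formats are placeholders and are not taken from CLDR.

```ruby
require 'date'

# Hypothetical module; names and data are illustrative only.
module Global
  TRANSLATIONS = { 'cs' => { 'Hello' => 'Ahoj' } }
  DATE_FORMATS = { 'cs' => '%d.%m.%Y', 'en' => '%m/%d/%Y' }

  module_function

  # Translation: message in, message out; falls back to the original string.
  def translate(msgid, lang)
    TRANSLATIONS.fetch(lang, {}).fetch(msgid, msgid)
  end

  # Localization: formats a value (here only dates) for the given language.
  def localize(date, lang)
    date.strftime(DATE_FORMATS.fetch(lang, '%Y-%m-%d'))
  end
end

puts Global.translate('Hello', 'cs')
puts Global.localize(Date.new(2008, 7, 29), 'cs')
```

With two entry points, each method can take the parameters that make sense for it, instead of one _() method handling both cases.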

Sorry for the long post!

Pavel Kunc

Jul 29, 2008, 7:37:32 PM
to merb_global
A Hash is only feasible if we have a separate hash for each language,
in a separate file, so that we load information only for that
language. Otherwise it would eat a lot of memory.

That is how CLDR is distributed: separate XML files per language.
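
A sketch of the per-language idea: data for a language is loaded only when it is first requested, then cached. The LocaleData class and its loader block are hypothetical; a real loader might read a per-language file from disk, while here a stand-in block returns a small Hash.

```ruby
# Hypothetical sketch: one data set per language, loaded on first use.
class LocaleData
  # The block is any loader taking a language code and returning a Hash.
  def initialize(&loader)
    @loader = loader
    @cache  = {}
  end

  # Load a language's data the first time it is requested, then reuse it.
  def [](lang_code)
    key = lang_code.to_s
    @cache[key] ||= @loader.call(key)
  end

  # Languages currently held in memory.
  def loaded_languages
    @cache.keys
  end
end

# Stand-in loader; the returned Hash is placeholder data.
data = LocaleData.new { |lang| { :lang => lang, :months => [] } }
data['pl']
puts data.loaded_languages.inspect
```

Memory use then grows with the number of languages actually requested rather than with the number of languages shipped.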

Pavel

Maciej Piechotka

Jul 30, 2008, 8:43:33 AM
to merb_...@googlegroups.com
On Tue, 2008-07-29 at 16:25 -0700, Pavel Kunc wrote:
> So as we have L10n translations in place. Let's give some love to
> Locale. I looked for some resources and to other frameworks/languages
> and here are some ideas.
>
> We need some locale information which needs to be available for
> merb_global itself and users as well to support internationalization.
> Especially we need for each language/region:
> * Number, Date, DateTime, Time formats

We could do this with the current providers (I'm not sure about the
date/time format). It could be a front-end.

> * Language information (codes, direction, name translations

It would be worth

> * Translations for days and months
> * Measurement units

Do we need it? I think that SI is currently a standard?

> * Collation?
> * Fallback locale?
> * Currency?
>
> All these information can we get from http://www.unicode.org/cldr in
> XML format.
>

1. IANAL. What about the license?
2. In many cases we can provide substitutes from the current providers. We
can use the Locale object to load data from the providers. Optionally it can
provide a fallback.

> To provide users with the locale information we create Locale class
> which will all carry current locale information. First question is if
> we provide some rich API (Locale.days, Locale.months,
> Locale.number_format) or some flat API to just get Hash or some query
> function to get desired info.
>

I would prefer to keep it near Date. But I would like to hear the
pros/cons of storing it in Locale.

> We need to provide means to set/get current Locale or get Locale
> information based on ISO-639-1/2 and ISO-3166-1.
>
> Methods:
> Locale.current
> Locale.get(lang_code)
> Locale.set(lang_code)
>

Yes. It should also provide Locale.current.lang_code.
1. What should Locale.get(lang_code) do? Shouldn't it be
Locale.new(lang_code)?
2. Instead of set, Locale.current=(locale) might be better.

> Setting locale will set Thread.current_locale.
>

Yes

> We currently have providers for Dates and Numbers so we could keep
> them and get information for proper formation from Locale object. That
> would be one set of providers than we could have current one which
> rely on the OS.
>

Yes. However we would have to see in what format this information is
stored.

> This proposed RICH Locale information could as well be one of the
> Locale information provider for users who needs more info. Other
> provider could provide simple locale information for example with only
> Locale.language_code.
>

I guess a better solution would be to fall back to this data if the
provider does not support it.
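
One way to read this fallback idea, as a sketch only: ask the active provider first and use the bundled locale data when the provider has no answer. SimpleProvider, LocaleInfo and the query method are hypothetical names, and the fallback Hash is placeholder data.

```ruby
# Hypothetical provider that only knows the language code.
class SimpleProvider
  def initialize(lang_code)
    @lang_code = lang_code
  end

  def language_code
    @lang_code
  end
end

# Hypothetical wrapper: ask the provider first, then the bundled data.
class LocaleInfo
  def initialize(provider, fallback_data)
    @provider = provider
    @fallback = fallback_data
  end

  def query(name)
    return @provider.public_send(name) if @provider.respond_to?(name)
    @fallback.fetch(name) { raise ArgumentError, "no locale data for #{name}" }
  end
end

info = LocaleInfo.new(SimpleProvider.new('pl'), { :direction => :ltr })
puts info.query(:language_code)
puts info.query(:direction)
```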

> We need to store somehow locale information. I think it would be best
> to find some native Ruby representation which will not need much
> dependencies if any. Ideas:
>
> YAML - native support
> JSON - needs gem so not good one
> XML - needs lot's of parsing
> Marshaling - Simple
> PStore - simple, fast?
> Hash - simple, fast
>
> I think that PStore, Marshaling or Hash are good candidates as we save/
> get native objects. And we can have rake task which will generate our
> representation from the XML distribution of the CLRD.
>

It depends on what cache will be used. However, I guess we need low maximum
latency as well.

> What needs to be localized - format?
>
> Date, DateTime and Time objects.
> Numbers
> Measurement units
> Currency
>
> All this should localize #localize method using correct provider.
>

I'd prefer:

<%= @d.localize %>

or @d.currency and @d.localize(format) for currency and
Date/DateTime/Time.

If we had this information, @date.localize could also work for
Date/DateTime/Time (returning the default format for the locale).

> About the API for the end user. We currently don't alter String,
> Fixnum or Date class which is really good.
>

Why is it good?

> Current:
> #_() method for both translations and localizations
>
> That's good that we have just one method, but parameters are
> significant and different translations and localizations.
>
> Globalize plugin:
> #_() or .t
> .localize
>
> They currently adds .t and .localize method to String and other
> classes which should be avoided.
>
> Rails new API:
> #translate
> #localize
>
> Just module methods. Separates API for Translations and Localization.
> I personally like that the most. Also because it could cleanup _()
> method. Of course we could have _() and #localize.
>

I guess we have 2 separate problems:
1. The API
2. What the API should provide

> Sorry for long post!

No problem.


Pavel Kunc

Jul 31, 2008, 8:01:18 AM
to merb_global
CLDR is free. See http://www.unicode.org/copyright.html. We just have
to include the copyright notice...

"Permission is hereby granted, free of charge, to any person obtaining
a copy of the Unicode data files and any associated documentation (the
"Data Files") or Unicode software and any associated documentation
(the "Software") to deal in the Data Files or Software without
restriction, including without limitation the rights to use, copy,
modify, merge, publish, distribute, and/or sell copies of the Data
Files or Software, and to permit persons to whom the Data Files or
Software are furnished to do so, provided that (a) the above copyright
notice(s) and this permission notice appear with all copies of the
Data Files or Software, (b) both the above copyright notice(s) and
this permission notice appear in associated documentation, and (c)
there is clear notice in each modified Data File or in the Software as
well as in the documentation associated with the Data File(s) or
Software that the data or software has been modified."

Pavel

Pavel Kunc

Jul 31, 2008, 8:17:17 AM
to merb_global
About measurement units... The CLDR only provides information about
the measurement system (e.g. metric) used by the locale. I would not go
as far as providing a list of units or translations.

Pavel

On Jul 31, 1:01 pm, Pavel Kunc <pavel.k...@gmail.com> wrote:
> CLDR is free. See http://www.unicode.org/copyright.html. We just have