RFC: Enterprise-ready internationalization (i18n) for Rails

geekQ

Apr 21, 2009, 5:08:19 AM

to Ruby on Rails: Core

We have been working on a Rails application for a big company that will be
rolled out in many countries. We have carefully followed Rails development
over the last 8 months and tried out different tool sets for
internationalization: Rails 2.1 together with the gettext 1.93 gem, and
ActiveSupport.I18n as part of Rails 2.3, with and without fast_gettext.

All our approaches still required extensive monkey patching, resulting in
high, unexpected effort. The solutions work in 95% of all cases, which is
probably sufficient for most Rails applications, but in our case it is not.

We think it is possible to implement a reliable, 100% correct localization
with reasonable effort, but some important principles have to be considered
in the Rails core.

# Internationalization principles

### Separation of roles

Often software developers are not able to or are not allowed to create
translations of their applications' texts for several reasons:

* There are only a few developers who perfectly speak several languages.
* For commercial-grade applications the exact wording usually is controlled
  by the marketing department.
* In open source projects it should be possible for enthusiastic
  non-technical users to contribute improvements or complete translations.
  Consequently, some open source projects have defined even
  [more roles][gnome-roles].

Important note: non-developers need [tools][poedit] for editing
language files!

### Streamlining the development and translation process

Identifying messages by a symbol in a YAML file (like in Rails 2.3) is
problematic, because it breaks the developer's flow: you have to stop coding,
come up with a good identifier (symbol) name for your message, go to a YAML
file, and type in the message.

Later on the translator does not know a message's context and needs to open
two YAML files side by side - one contains the context and the other one
gets filled with the translations. In contrast to that, the
[gettext approach][gettext-approach] works smoothly - for C and for Python;
for open source and for commercial projects.

    _('Archive is invalid')
    _('%{attribute} must not be empty') % attr

are both easier to write and easier to translate.

With command line tools such as [msgfmt -cv][msgfmt] you can also check the
well-formedness and completeness of your translations as part of the
*continuous build process*.
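
To make the idea of a build-time check concrete, here is a minimal Ruby sketch. It is not how msgfmt works internally; it assumes the catalog has already been loaded into a Hash of msgid to msgstr, and the names `catalog_problems` and `GERMAN_CATALOG` are made up for illustration. It reports untranslated entries and placeholders that got lost or renamed in translation.

```ruby
# Hypothetical helper, not part of gettext: returns a list of problems found
# in a catalog given as { msgid => msgstr }.
PLACEHOLDER = /%\{(\w+)\}/

def catalog_problems(catalog)
  problems = []
  catalog.each do |msgid, msgstr|
    if msgstr.nil? || msgstr.strip.empty?
      problems << "untranslated: #{msgid}"
      next
    end
    # Every named placeholder of the original must reappear in the translation.
    expected = msgid.scan(PLACEHOLDER).flatten.sort
    actual   = msgstr.scan(PLACEHOLDER).flatten.sort
    problems << "placeholder mismatch: #{msgid}" unless expected == actual
  end
  problems
end

# Illustrative data only; the second entry contains a deliberate typo.
GERMAN_CATALOG = {
  'Archive is invalid'             => 'Archiv ist ungültig',
  '%{attribute} must not be empty' => '%{attribut} darf nicht leer sein',
  'Please try again later.'        => ''
}.freeze

puts catalog_problems(GERMAN_CATALOG)
```

A continuous build could fail whenever the returned list is non-empty.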

A reliable, high-quality, feature-rich parsing tool for Ruby and for Rails
still needs to be implemented, but [ruby-gettext][] is a good starting point.

### Linguistically correct translations

ActiveRecord validations support the concept of error messages and full
error messages. From a linguistic point of view this does not work: there is
no way to infer a correct full message from its short message counterpart
and vice versa. The string concatenation approach used by Rails (almost)
works for English but rarely for other languages.

If you cannot infer one message from another, the distinction does not make
sense. You only need one kind of message, preferably with a placeholder for
the attribute name (see the example above).

The current implementation is both overengineered and not sufficient:

    # lib/active_record/validations.rb:196
    full_messages << attr_name + I18n.t('activerecord.errors.format.separator',
                                        :default => ' ') + message

The main problem with this solution is: if a language needs a different
separator for different parts of the sentence, then it will probably also
differ in more vital aspects. For example, it might insist on a different
order of words in a sentence.

A message can only be translated as a whole. Hence, it should be possible to
provide custom ActiveRecord validation messages at any time. For us it was
only possible with [a dirty hack][custom-validation-messages].

Usage of string concatenation for building error messages in the framework
makes it [extremely complicated][remove-prefix] to avoid the corruption of
error messages with a prefix derived from attribute/relation names.

*String concatenation should never be used to create human-readable
messages. Use string interpolation instead (as it has been used in other
frameworks and platforms for decades).*
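
A small sketch of the difference, using only plain Ruby's `format` with named placeholders. The `TEMPLATES` table and `error_message` helper are hypothetical and the German wording is just an illustration; the point is that each locale supplies a whole sentence and decides for itself where the attribute name goes.

```ruby
# Whole-sentence templates per locale (illustrative data, not Rails' own).
TEMPLATES = {
  en: '%{attribute} must not be empty',
  de: 'Bitte füllen Sie das Feld %{attribute} aus'
}.freeze

# Interpolates the attribute name into the translated sentence.
def error_message(locale, attribute)
  format(TEMPLATES.fetch(locale), attribute: attribute)
end

puts error_message(:en, 'Name')
puts error_message(:de, 'Name')
```

With concatenation (`attr_name + separator + message`) the German sentence above cannot be produced at all, because the attribute name does not come first.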

In addition, ActiveRecord should allow for proc-based validation messages:

    validates_format_of :account, :message => proc {...}
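
What a proc-valued `:message` would buy is late evaluation: the text is looked up when the error is rendered, under the locale of the current request, rather than once when the class is loaded. A framework-free sketch of that idea follows; `MESSAGES`, `resolve_message` and the `current` hash are hypothetical stand-ins, not ActiveRecord API.

```ruby
# Illustrative translations only.
MESSAGES = {
  en: 'Account number has an invalid format',
  de: 'Die Kontonummer hat ein ungültiges Format'
}.freeze

# Accepts either a plain String or anything responding to #call.
def resolve_message(message)
  message.respond_to?(:call) ? message.call : message
end

current = { locale: :en }                     # stand-in for per-request state

eager = MESSAGES[current[:locale]]            # looked up once, at definition time
lazy  = proc { MESSAGES[current[:locale]] }   # looked up each time it is needed

current[:locale] = :de
puts resolve_message(eager)   # still the English text
puts resolve_message(lazy)    # follows the locale switch
```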

Of course, a robust pluralization implementation, as provided by gettext, is
important, too.
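
For readers unfamiliar with gettext-style plurals, the following sketch shows the principle: a per-language rule maps a count to an index into a list of translated forms. The Russian rule mirrors the commonly published gettext Plural-Forms expression; the `PLURAL_RULES`, `FORMS` and `pluralize` names are hypothetical and this is not ruby-gettext's implementation.

```ruby
# Each rule maps a count to an index into that language's list of forms.
PLURAL_RULES = {
  en: ->(n) { n == 1 ? 0 : 1 },
  ru: ->(n) {
    if n % 10 == 1 && n % 100 != 11
      0
    elsif (2..4).cover?(n % 10) && !(12..14).cover?(n % 100)
      1
    else
      2
    end
  }
}.freeze

# Illustrative translations for "file".
FORMS = {
  en: ['%{count} file', '%{count} files'],
  ru: ['%{count} файл', '%{count} файла', '%{count} файлов']
}.freeze

def pluralize(locale, count)
  index = PLURAL_RULES.fetch(locale).call(count)
  format(FORMS.fetch(locale)[index], count: count)
end

[1, 2, 5, 11, 21].each { |n| puts pluralize(:ru, n) }
```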

### Locale selection

All the different localization libraries try to select an appropriate locale
according to their own rules and in a transparent way. The corresponding
logic is often buried deep in the library's implementation and cannot be
fixed using monkey patching.

Even if our application only offers English and Italian, for example, the
gettext library with its ActiveRecord extensions sometimes shows validation
messages in Greek (depending on the user's browser settings). Of course,
libraries and Rails should be able to provide translations in plenty of
languages, but the application should have the last word in deciding which
subset of possible languages is offered to the user.

A callback in the application controller which can be overridden by an
application developer would be an advantage. A before_filter would also do,
but it has to be executed before all other before_filters.

    # initializer/internationalization.rb
    offer_locales :en_UK, :en_ZA, :nl
    default_text_domain 'myapp'

    # application_controller.rb
    class ApplicationController
      def compute_effective_locale
        # application specific implementation, that uses
        #
        #   params[:lang]
        #   cookies[:lang]
        #   request.headers['Accept-language']
        #   default_locale
        #
        # to compute the effective locale
      end
    end

A default implementation with priority order [query parameter, cookie,
browser setting] can be provided, but almost any non-trivial application
needs its own rules.
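
Such a default could look roughly like the sketch below. It is a standalone function rather than a controller method, the signature and the `OFFERED`, `DEFAULT` and `accepted_locales` names are assumptions made for illustration, and the Accept-Language parsing is deliberately simplistic. The important property is that only locales the application offers can ever win.

```ruby
OFFERED = [:en_UK, :en_ZA, :nl].freeze   # what the application offers
DEFAULT = :en_UK

# Parses an Accept-Language header into locale symbols, best quality first.
def accepted_locales(header)
  entries = header.to_s.split(',').map(&:strip).reject(&:empty?)
  ranked = entries.each_with_index.map do |entry, index|
    tag, quality = entry.split(';q=')
    [tag.tr('-', '_').to_sym, (quality || '1').to_f, index]
  end
  ranked.sort_by { |_, quality, index| [-quality, index] }.map(&:first)
end

# Priority order: query parameter, cookie, browser setting, default.
def compute_effective_locale(params, cookies, accept_language)
  candidates = [params[:lang], cookies[:lang]].compact.map do |value|
    value.to_s.tr('-', '_').to_sym
  end
  candidates.concat(accepted_locales(accept_language))
  candidates.find { |locale| OFFERED.include?(locale) } || DEFAULT
end

puts compute_effective_locale({}, {}, 'el,nl;q=0.8')            # Greek is not offered
puts compute_effective_locale({ lang: 'en-ZA' }, {}, 'nl')      # parameter wins
puts compute_effective_locale({}, {}, 'el')                     # falls back to default
```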

### Syntax

If the handling of text messages needs to be refactored anyway, it would be
advantageous to switch to the less invasive, proven, and familiar gettext
syntax:

    _("The billing system is not available. Please, try again later.")

instead of

    I18n.t(:billing_not_available)

Providing context for translation:

    "Gadget|Title" => (German) "Bezeichnung"

The word "Title" is translated differently depending on its context.
Hierarchical contexts are not needed; that is, YAML files with deeper
nesting, as in Rails 2.3, do not make sense.
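
A sketch of how a flat `Context|Text` key can behave, assuming the common convention that the context prefix is stripped when no translation exists. The `s_` helper here is a hypothetical standalone function taking the catalog as an argument, and the German entries are illustrative.

```ruby
# Illustrative catalog: the same English word, two contexts.
GERMAN = {
  'Gadget|Title' => 'Bezeichnung',
  'Person|Title' => 'Anrede'
}.freeze

# Looks up "Context|Text"; without a translation it falls back to the text
# after the last "|", so the user never sees the context prefix.
def s_(msgid, catalog)
  catalog.fetch(msgid) { msgid.split('|').last }
end

puts s_('Gadget|Title', GERMAN)
puts s_('Person|Title', GERMAN)
puts s_('Book|Title', GERMAN)     # untranslated, prints the plain default
```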

### Implementation backends

The current interface for plugging in different localization storage
backends is a nice intention, but in this case flexibility is not needed. A
perfectly designed and working backend would be sufficient. Other - less
successful - frameworks and platforms such as Django, Pylons, and
Microsoft .NET have much more powerful internationalization/localization
features and they all support only one backend. Localization in Python
frameworks is based on gettext; .NET uses resource files. Both technologies
are mature and they are supported by a large set of tools for maintaining
translations. The obvious choice for Rails would be gettext, then.

# Conclusion

*It is not possible to implement a sustainable internationalization solution
as a gem, as a plugin, or as a collection of monkey patches. Important
principles must be considered in the Rails core, especially in
ActiveRecord/ActiveModel, to make applications fully internationalizable.
This would be an important step to make Rails enterprise-ready.*

We offer an industrial-strength internationalization implementation for
Rails 3 and all the needed refactoring of validation code. But we wanted to
check upfront whether the community is interested in such an implementation
and whether there's a chance that these changes would be integrated into the
Rails trunk.

Vladimir Dobriakov (vladimir....@innoq.com)
http://blog.geekq.net

Maik Schmidt (maik.s...@vodafone.com)
http://maik-schmidt.de

[gnome-roles]: http://live.gnome.org/TranslationProject/LocalisationGuide#head-99ad8844377d7c12dcff787e4701d6109bdce69b
[poedit]: http://www.poedit.net/
[gettext-approach]: http://www.gnu.org/software/gettext/manual/gettext.html#Mark-Keywords
[msgfmt]: http://www.gnu.org/software/gettext/manual/gettext.html#msgfmt-Invocation
[remove-prefix]: http://blog.geekq.net/2009/04/09/i18n-remove-validation-message-prefix/
[custom-validation-messages]: http://blog.geekq.net/2009/04/08/activerecord-i18n-validation-message/
[ruby-gettext]: http://rubyforge.org/projects/gettext/

Sven Fuchs

Apr 22, 2009, 6:18:49 AM

to rubyonra...@googlegroups.com

Hey guys,

thanks a lot for your proposal. I think we all agree that rails-i18n
can be improved and input on that is highly appreciated.

This is a very long mail, so I'll cut some of it.

On 21.04.2009, at 11:08, geekQ wrote:
> All our approaches still required extensive monkey patching resulting
> in high, unexpected efforts.

Could you please list the bits you had to monkey patch, maybe
providing some code we can look at? Or does this only refer to AR
messages?

> The solutions work in 95% of all cases, which is
> probably sufficient for most Rails applications, but in our case it
> is not.

Yes. The initial goal of the rails-i18n project was to a) provide a
common API all I18n solutions could build on while b) providing an
implementation (simple backend) that works for English. It turned out
that this implementation seems to work for (as you say) 95% of all
use cases, which is much more than we expected.

We are currently seeing competing implementations (backends) which
IMO is a good thing but doesn't necessarily mean we need to integrate
all of them into Rails core right now.

> Often software developers are not able to or are not allowed to create
> translations of their applications' texts for several reasons:

Agreed.

> Important note: non-developers need [tools][poedit] for editing
> language files!

I agree that currently a solid tool for managing large collections of
translations is missing. There are efforts to build such tools though.
(E.g. see http://github.com/newsdesk/translate)

> Identifying messages by a symbol in a YAML file (like in Rails 2.3) is
> problematic, because it breaks the developer's flow: you have to stop
> coding, come up with a good identifier (symbol) name for your
> message, go to a
> YAML file, and type in the message.

This is certainly a highly debated topic.

We are referring to this as "default translations as keys" vs "symbols
as keys". Gettext also embeds "contexts" (scopes) to the default
translation. There are several variations of these concepts and there
are pros and cons to all of them.

You name one of the problems with "defaults as keys":

> * For commercial-grade applications the exact wording usually is
> controlled by the marketing department.

"You don't want to let those guys mess with your code." seems like a good
reason to use Symbols as keys.

Afaik another reason is that the initially picked default translation
might need to be changed during the process so there's a risk for keys
getting out of sync. Also, with "defaults as keys" there's no way to
compute defaults (fallbacks) (which of course affects another highly
debated topic: reusing keys). E.g. you cannot fall back
to :"errors.model.invalid" when :"errors.article.invalid" is missing.

In the end we agreed to go with Symbols as keys because we felt that
they a) are a better fit for framework needs and b) provide even
better means for separating roles (i.e. devs mess with Symbols,
translators mess with translations).

There's no reason though why you could not add a helper method _() to
your application and then use the same syntax as in your examples:

> _('Archive is invalid')
> _('%{attribute} must not be empty') % attr

In Rails I18n Strings can be used as keys. The only drawback here is
that you'd need to escape/unescape dots to something else in your _()
implementation because they'd be interpreted as scope (context)
separators. (This might be an opportunity to improve the API. We've
never discussed this further.)

> With command line tools such as [msgfmt -cv][msgfmt] you can also
> check the
> well-formedness and completeness of your translations as part of the
> *continuous build process*.

This again concerns the tools layer and isn't necessarily related to
the API and/or backend implementation.

Afaik these Gettext tools rely on the keys not being computed though.
E.g. devs must stick to using _('Archive is invalid') instead of
_(foo.msg), right? This obviously is a limitation that we might not
want to rely on for Rails itself. It might be a perfect fit for
userland apps though so I can't see what's holding anybody back from
using Rails I18n like this.

> ActiveRecord validations support the concept of error messages and
> full error messages. From a linguistical point of view this does not
> work: there
> is no way to infer a correct full message from its short message
> counterpart
> and vice versa. The string concatenation approach used by Rails
> (almost)
> works for English but rarely for other languages.

I'm actually not sure about the current status of this issue, but it's
been a known issue when we implemented Rails I18n. AR error messages
were subject to an ongoing discussion at that time so we simply ported
the existing functionality even though it's suboptimal.

> A message can only be translated as a whole. Hence, it should be
> possible to provide custom ActiveRecord validation messages at any
> time. For us it
> was only possible with [a dirty hack][custom-validation-messages].

Allowing Procs for AR messages seems like a good idea to me. It's not
the only place where Rails translates stuff though so it's probably
not sufficient for replacing Rails I18n with "something else" (i.e.
Gettext in your case). I feel the more appropriate way would be to use
the API and use a Gettext enabled backend instead.

> *String concatenation should never be used to create human-readable
> messages. Use string interpolation instead (as it has been used in
> other
> frameworks and platforms for decades).*

I think we all agree on this.

> Of course, a robust pluralization implementation, as provided by
> gettext, is important, too.

What's wrong or not robust with the current pluralization API derived
from CLDR?

> All the different localization libraries try to select an appropriate
> locale corresponding to their own rules and in a transparent way. The
> corresponding logic is often buried deep in the library's
> implementation and cannot
> be fixed using monkey patching.

Rails I18n does not ship locale detection/selection, so there's
nothing to monkey patch?

But yeah, you're laying out why we decided to leave features like
these to plugin land for now.

> If the handling of text messages needs to be refactored anyway, it
> would be advantageous to switch to the less invasive, proven, and
> familiar
> gettext syntax:
>
> _("The billing system is not available. Please, try again later.")
>
> instead of
>
> I18n.t(:billing_not_available)
>
> Providing context for translation:
>
> "Gadget|Title" => (German) "Bezeichnung"

Tbh I don't fully understand how this is less invasive than

t('gadget.title', :default => 'Title')

It's a bit shorter, sure, but that comes at a price, too. And you can
always add your own accessor layer/helper on top of I18n#t, no?

> The word "Title" is translated differently depending on its context.
> Hierarchical contexts are not needed, that is YAML files with deeper
> nesting as in Rails 2.3 do not make sense.

You don't need to nest your keys/scopes/contexts if you don't want to.
Even the GNU gettext manual though seems to suggest that there are
situations where this makes sense: http://www.gnu.org/software/hello/manual/gettext/Contexts.html

Am I missing your point here?

> The current interface for plugging in different localization storage
> backends is a nice intention, but in this case flexibility is not
> needed. A
> perfectly designed and working backend would be sufficient.

That's a strong statement as it suggests that there's a silver-bullet
solution for all needs. Our experience with several concurrent I18n
solutions in Rails' history rather seemed to suggest the opposite.

I believe the way forward can't be to force everybody to use Gettext
but instead make sure Rails I18n supports a full stack Gettext
solution through the API as seamlessly as possible. That might mean:
implement a Gettext backend, provide some helper syntax, maybe make
the scope separator configurable.

All that said, thanks again for bringing this up!

Sven

geekQ

Apr 22, 2009, 4:56:09 AM

to Ruby on Rails: Core

I noticed that the formatting somehow got screwed up.
You may find the html version easier to read
http://github.com/geekq/rails/blob/854140d9401ee25fae5b5e0f8c1436818507e796/rfc-internationalization.markdown

Regards,

Vladimir

Chris Cruft

Apr 22, 2009, 8:46:59 AM

to Ruby on Rails: Core

Get this man hooked up to the wagon! It's great to see this level of
thought going into Rails I18n, and equally wonderful to know that his
efforts could positively impact the framework.

wvk

Apr 25, 2009, 9:52:08 AM

to Ruby on Rails: Core

Greetings, Folks!

having implemented a few "internationalized" applications with
increasing complexity in the past, I have to agree with some valid
points Vladimir names in his post that do not (yet?) seem to have been
solved in Rails. But let's take a look first at what we as web
developers mostly expect from multilingual applications. Obviously, we
do not want to introduce new code into our existing application if a
new language has to be supported by it. By "code" I mean everything
from "logic concerning timezone calculation", "formatting rules
concerning different punctuation, date, time, currency, pluralization"
to "everything that cannot be edited with M$ WORD without significant
danger of losing important information". Important information can be
anything like indentation and quotation that is easily screwed as soon
as the average non-tech guy from YourBigComp Inc.'s marketing
department starts to translate things using his favourite "text
editor".

So while most of us do write internationalized applications using no
more than two languages targeted at an audience within one timezone
and exchanging the same currency most of the time, that "separation of
roles" is, as Vladimir pointed out, a non-issue for us most of the
time. This picture however changes drastically as soon as we're lucky
enough to catch a "big fish" client that is actually willing to pay us
huge amounts of money for developing an "enterprisey", yet web2.0-ish
application that will serve business content to a target audience
spread across the globe. Huge amounts of money come, however, with a
catch: that application has to conform to certain rules that we only
find in enterprise level companies:

* You're not going to be the one that will deploy the application and
decide which platform it will run on
* You're not going to be the one that will run the application
* You're not going to be the one that will maintain the application
two years from now
* You're certainly not going to be the one that is going to translate
the whole UI into half a gazillion languages (by "languages" I mean
"locales"!)

In larger companies, interfacing with other departments that usually
are far less competent in technical issues is a major concern.
"technical issues" can really be things like using a different "text
editor" than WORD or conforming to any formatting rules that are not
like "insert a bulleted list here" or "make this font look bold" --
sad but true, but writing YAML means "programming" to most folks out
there! So you definitely do not want to offer anyone outside your own
developer team anything else than the simplest possible type of plain
text, "ideally" even encapsulated in a .doc- or .xls-file.

This non-technical world out there seems infinitely far away, but for
some of us it's a daily struggle we have to live with. In my current
project we have actually managed to get translation done for all
different countries using gettext and by sending our gettext PO-files
to the translators. After a few training iterations, they were able to
translate the strings marked with "msgid" into the line marked with
"msgstr" -- that really IS more than one is actually allowed to expect
from people in those positions (no offense meant at this point).

The major advantages of the gettext PO-file format over any other
format currently used in different i18n frameworks are, as I see them:

* it's a not-so technical looking format (go on, flame me for that)
* it's incredibly robust: it has enough "syntactical overhead" to
allow automated sanity checks and _error correction_! Even if the
translator screws up whitespace and newlines, you'd still be able to
recover a valid file only by parsing for "msgstr" etc. This is a HUGE
advantage over YAML files.
* it is only as verbose as it needs to be: XML would of course be
more robust than anything else (validation and re-formatting tools
everywhere...), but it is way too much overhead (tech stuff! OMG!) to
be handled by a non-techie.
* there's _excellent_, robust tool support and huge dictionary
resources for translating Gettext files. Hey, GNU Gettext has been
around for about 15 years or so, tools are mature and platform
independent - Poedit, KBabel, Gorm, ... you name it.

Some more remarks on that last point: Even if, at some point in time,
there will be "tool support" for the Rails I18n API: tools take time
to mature. And if there's one thing about big companies that really
gets annoying at times, it's that big companies love solid tools.
Companies like (or rather call it 'demand') tools if you want to
introduce new technologies, and they want them right away. It is at
times not acceptable to hope for someone to come up with a half-baked
translation tool that has to be used by those aforementioned marketing
guys. An 80% solution just doesn't do the job, neither does a 95%
solution -- even if that is "far more than _we_ expected": It's either
a 100% support or none at all -- which would mean, either a working
solution right now or you're gonna do it with J2EE or C++ or Perl and
some whacky CGI based framework, because that's what they've been
working with for the past couple of decades -- and chances are you'll
end up using Gettext anyway.

I do not intend to offend anyone contributing to Rails, but it has to
be said that Rails itself is, in large parts, far from mature.
Especially its new I18n API, which basically seems to me like a re-
invention of the wheel. It would be ignorant to say that a rock solid
tool for managing internationalization in its full complexity can be
expected to emerge from the community within the next year(s), because
it is an incredibly complex subject.
Things take time, so at this point time would be wisely spent on
adopting proven technology into the rails core.

Bottom line: either we offer those translators a simple file format
they can handle using the big four-letter-WORD, or we provide them the
rock solid tool chain necessary to handle our "strange files". A
minimal-effort solution would indeed be to depend on Gettext as a
whole (not just as an optional backend, more reasons for that later),
since it is neither code-invasive nor visually invasive and it has
actually worked really, really well for major OSS projects for decades
(and will continue to do so, my bet ;)).

Convention over configuration has always been a really helpful,
productive and, for that matter, cool design philosophy of Rails.
Conventions should not be broken, because they make it easy to
implement things using a simple and well-known set of basic rules.
Pulling in at least something that looks like the Gettext API into the
Rails core would probably make it easier and more productive for the
"senior" SW developers amongst us (well, I'm not counting myself in
here ;)) to start with I18n right away -- just because it would be
done in Rails in the exact same way as it has been done in dozens of
other frameworks and applications (<flamebait>even the Python guys
over at Pylons use it</flamebait>). And the really neat thing would
be, that you could migrate from your old Perl/CGI based web app that
has to be replaced with a nifty new Rails app without even touching
your translation files, granted the UI would look roughly alike. Think
companies with a strong need for continuity at their customer care
level. Why bother with scripts for moving the one file format into the
other -- just take what you already have and what everybody already
knows.

Some words targeted at Sven's remarks: I18n goes much further than
just translating strings, applying a new date and currency format and
possibly determining the timezone of the current user. You also have
to account for some languages actually having an arbitrary amount of
pluralization formats, even different pluralization rules depending on
whether the subject is male or female and possibly the grammatical
gender of the object spoken about. To take that even further: there
exist languages in this world that have different words for things
that come in two, three or more and additionally depending on the
shape of those things. Other languages use "measure words", i.e.
classifiers used in conjunction with quantities and different
"classes" of objects. Those are concepts unheard of in English or
German, but such languages are actually being spoken in target
locations of some companies: just take Japanese or Russian, for
example. I'm sure Vladimir could tell us more about the latter -- and
drop Matz himself a line to hear more about the former, I guess ;-)
The point I want to make here is: if we look at e.g. the ActiveRecord
validation mechanism, we have to consider more than just a field name,
possibly a number (length, range, size) and some default message. It
requires logic that nobody amongst us wants to re-invent from scratch
as soon as we need it. However, some of us need it right now.

I personally would very much appreciate a solid Gettext based I18n
Implementation making it into the Rails core. If whatever company or
individual has the resources to actually implement it in a way an
enterprise level application could work flawlessly with it without
having to monkey patch Rails with every new release, go for it!

May the source be with you

Willem

Hongli Lai

Apr 26, 2009, 2:31:37 PM

to Ruby on Rails: Core

(Note: I'm biased towards the Gettext approach, after having used it
to translate desktop applications)

After reading all the replies I think the issue boils down to "default
translations as key" vs "symbols as keys".

"default translations as keys":
1. Pro: lots of existing, mature Gettext tools for creating
translations, detecting stale/outdated translations, generating
translation statistics, etc.
For example non-tech savvy translators can use Poedit to create the
translations, which should be the most fool-proof thing after a web
interface.
2. Pro: can easily fall back to the default translation.
This is a huge benefit if you don't have a reliable translation
team, i.e. not all translations are always kept up-to-date. This way
the user interface can at least fall back to an English string, which
is still better than presenting the user with an empty string, a
symbol or an error message. Incomplete translations are common in
many open source projects, but probably less so for enterprise
developers.
3. Pro: the default translation makes the code easier to understand.
Symbols are usually a lot more opaque.
4. Con: it's not 100% straightforward. Developers who implement a
localization framework themselves for the first time would probably
use the symbol approach. Developers need some training in order to get
used to Gettext's workflow of marking strings for translation,
extracting them with tools, editing the translation files and
compiling the translation files.
5. Con: not possible for translators to change the default text
without editing the source code.
6. Con: Ruby-related Gettext tools still suck. For example Ruby-
Gettext Rails plugin cannot extract strings from Haml templates. I've
seen someone reinvent his own localization framework based on symbols
because of this.

"symbols as keys":
7. Pro: easy to understand for new developers. Most people who
implement a localization framework for the first time would probably
use this approach.
8. Pro: possible for translators to change the default text without
editing the source code.
9. Pro: allows falling back to a related string, e.g.
"errors.article.invalid" => "errors.model.invalid".
10. Con: very limited tool support, even worse than Ruby-Gettext.
11. Con: makes code more opaque; the meaning of a symbol is not always
immediately obvious until the programmer sees the associated string.

It's arguable whether 9 really is a pro. Has there even been any need
for this feature? I figure that in most applications, most symbols
have no related fallback symbol, and so a missing translation usually
results in an error. Gettext falls back to the default string which is
usually English, which is still better than presenting the user with
the symbol or with an empty string.

4 is pretty awkward for developers who are just getting into
localization, but I blame it on documentation. There shouldn't be any
problems if the documentation is good. The need to manually
compile .po files to .mo files can be solved with the right code.
Rails could, for example, auto-compile modified .po files during
startup, or someone could write a .po parser and load .po files
directly.
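
To give a feel for how small such a .po reader can start out, here is a deliberately minimal sketch. It handles single-line and continued `msgid`/`msgstr` strings and tolerates stray indentation, but ignores plural forms, contexts and escape sequences, all of which a real parser needs. The `parse_po` name and `SAMPLE_PO` data are made up for illustration.

```ruby
# Minimal PO reader: returns { msgid => msgstr }, skipping the header entry.
def parse_po(text)
  entries = []
  field = nil
  text.each_line do |raw|
    line = raw.strip
    if (m = line.match(/\Amsgid\s*"(.*)"\z/))
      entries << { msgid: ''.dup, msgstr: ''.dup }
      field = :msgid
    elsif (m = line.match(/\Amsgstr\s*"(.*)"\z/))
      field = :msgstr
    else
      m = line.match(/\A"(.*)"\z/)   # continuation of the previous field
    end
    entries.last[field] << m[1] if m && field && !entries.empty?
  end
  entries.each_with_object({}) do |entry, catalog|
    catalog[entry[:msgid]] = entry[:msgstr] unless entry[:msgid].empty?
  end
end

SAMPLE_PO = <<~'PO'
  msgid ""
  msgstr ""
  "Content-Type: text/plain; charset=UTF-8\n"

  # odd indentation and spacing, as a hurried translator might leave it
     msgid   "Archive is invalid"
  msgstr "Archiv ist ungültig"

  msgid "Please, "
  "try again later."
  msgstr    ""
PO

p parse_po(SAMPLE_PO)
```

Loading the resulting Hash at startup would remove the separate compile step, at the cost of parsing on every boot.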

6 and 10 can be fixed given enough effort.

So this leaves 5/8 and 3/11 as the only fundamental issues, which are
also mutually exclusive: the ability to change the default string
without letting translators mess with the source code (Rails I18n),
and whether embedding default strings in the source code makes it
easier to understand than using symbols (Gettext).
3/11 might be arguable, I'm sure there are developers out there who
don't think that using symbols makes their code more opaque.

Sven Fuchs

Apr 26, 2009, 5:13:32 PM

to rubyonra...@googlegroups.com

Hi Hongli,

cool, that's a great writeup! Thanks for turning this discussion
towards more practical points :)

Perhaps it helps when I also add some disclaimer about myself. I'm not
biased towards or against Gettext in any way. I used it a lot
quite some years ago. In fact it was me who repeatedly tried to get
Gettext people on board while we worked on Rails I18n. I do think
though that the API in fact is the best of breed among all the solutions
for Rails we had previously, including ruby-gettext. That of course
doesn't mean it cannot be improved, but to me it means we should not
go back to a less flexible API.

In your list I'd suggest that 9.) is just an example of a more
abstract point: "Symbols as keys" makes it possible to compute keys.
You can not compute default translations. Rails itself leverages that
for validation messages, I've seen people using it for "resourceful
controllers" (e.g. flash messages) and there are tons of other
situations where this is useful. Computing keys allows you to define a
generic translation that works for most of the situations and
overwrite that for particular situations where you need something
special - thus effectively reducing the amount of repetition a lot. You
can also react to contexts (e.g. pick a particular translation
depending on the type of an object) flexibly where you'd otherwise
need to use generic translations/messages. Thus, I believe that
"Symbols as keys" allow for a more abstract way of coding.

I'd also suggest adding another pair of pro/con arguments to the list.
Using "defaults as keys" usually means that you have the actual
translations cluttered throughout your code. Of course, Gettext allows
you to "announce" translated strings through gettext_noop when you
want to collect messages at a central place but that requirement
really feels much more like jumping through hoops than just using
Symbols in the first place.

(Also, I'd like to remind everyone of the motivations that led to such
solutions as Gibberish, Globalite (not Globalize) and
SimpleLocalization. People wanted a simple and clean API, they
explicitly did not want to mess with Gettext which was designed,
let's face it, in 1994 for C. It feels old and awkward to many.)

I really wonder though if both approaches actually are mutually
exclusive, or mutually exclusive in all areas (Rails core, plugin
land, user/dev/app land).

Imagine a helper like this:

def _(msg)
  I18n.t(msg, :default => msg)
end

For pluralization there could be a similar helper. This should work
for all messages that do not contain a dot. I wonder if we can get rid
of this limitation. Approaches that come to my mind:

1. Make the scope separator (dot) configurable. That might mean that
Rails core should not continue using dots as separators (but instead
just use Arrays for scopes).
2. Escape/unescape dots in the helper.
3. ?
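
Option 2 (escaping dots in the helper) could look roughly like the sketch below. It does not use the real I18n backend: `DOT_STORE`, `scoped_lookup`, `store_message` and the placeholder character are assumptions made up for illustration, standing in for a store that splits keys on dots the way scoped keys are split.

```ruby
# Hypothetical nested store, similar in spirit to a YAML translation file.
DOT_STORE = { "de" => { "Save" => "Speichern" } }

# Assumed placeholder: any character that never occurs in real messages works.
ESCAPED_DOT = "\u0001"

# Replace dots so the message is never mistaken for a scoped key.
def escape_key(msg)
  msg.gsub(".", ESCAPED_DOT)
end

# Dotted keys are split into scopes: "a.b" means DOT_STORE[locale]["a"]["b"].
def scoped_lookup(locale, key)
  key.split(".").inject(DOT_STORE[locale]) do |node, part|
    node.is_a?(Hash) ? node[part] : nil
  end
end

# Store a translation under the escaped form of its default message.
def store_message(locale, msg, translation)
  DOT_STORE[locale][escape_key(msg)] = translation
end

# Gettext-style accessor: the default message is both key and fallback.
def _(msg, locale = "de")
  found = scoped_lookup(locale, escape_key(msg))
  found.is_a?(String) ? found : msg
end

store_message("de", "File saved. Continue?", "Datei gespeichert. Weiter?")
puts _("File saved. Continue?")
puts _("Unknown. Message")
```

The price of this approach is that every tool reading or writing the translation files has to apply the same escaping, which is one argument for option 1 instead.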

Btw. re "Gettext falls back to the default string which is usually
English" - you can do the same thing easily with Rails I18n. It just
wasn't part of our original requirements ("the simple backend works
for English") so we left locale fallbacks to the plugin land.
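
As a rough illustration of what such a plugin-land locale fallback might do, here is a sketch over plain Hashes. `FALLBACK_MESSAGES`, `fallback_chain` and `translate_with_fallback` are hypothetical names, and the chain order (specific locale, parent locale, default locale) is an assumption, not something the simple backend provides.

```ruby
# Hypothetical per-locale tables; "en" acts as the default locale.
FALLBACK_MESSAGES = {
  "en"    => { "greeting" => "Hello", "farewell" => "Goodbye" },
  "de"    => { "greeting" => "Hallo" },
  "de-AT" => { "greeting" => "Servus" }
}.freeze

# Build the list of locales to try, e.g. "de-AT" -> ["de-AT", "de", "en"].
def fallback_chain(locale, default = "en")
  parts = locale.split("-")
  chain = parts.length.downto(1).map { |n| parts.first(n).join("-") }
  (chain + [default]).uniq
end

# Return the first translation found along the chain.
def translate_with_fallback(locale, key)
  fallback_chain(locale).each do |candidate|
    text = FALLBACK_MESSAGES.fetch(candidate, {})[key]
    return text if text
  end
  key # last resort: show the key rather than an empty string
end

puts translate_with_fallback("de-AT", "greeting")
puts translate_with_fallback("de-AT", "farewell")
```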

Hongli Lai

Apr 26, 2009, 5:35:39 PM

to Ruby on Rails: Core

The remaining issues are *really* subjective.
- Putting default translations in the code is clutter in your opinion.
In my opinion it's the opposite: it makes the code easier to read. :)
gettext_noop is a bit weird at first but I don't see it as any worse
than what all the other localization frameworks provide.
- You view the fact that SimpleLocalization, Globalite and co are not
designed with the Gettext style as proof that people want something
simple and clean. The way I see it is that they haven't seriously
tried Gettext. I think their view of "simple" is like coding a web
application without using MVC - it's simpler but it gives you more
headache down the road. I find it "interesting" that pretty much all
open source desktop applications use Gettext. Gettext has been used to
translate hundreds, if not thousands, of desktop applications to
dozens of languages. Yet the web applications world seems to
completely ignore Gettext. For PHP I can understand, everybody's
reinventing the wheel there. But Rails?

Your idea regarding the computability of symbols is interesting. On an
abstract level it does seem to fit within the Rails philosophy, but it
remains to be seen how useful it is in practice and whether anyone can
come up with a good implementation.

In any case, what is clear is that, at the very least, Rails should have
better I18n tools. There should be tools that alert translators which
translations need to be updated, how many strings still need
translations, for writing the translations, etc.

Sven Fuchs

Apr 26, 2009, 7:30:12 PM

to rubyonra...@googlegroups.com

Hi Hongli,

On 26.04.2009, at 23:35, Hongli Lai wrote:
> The remaining issues are *really* subjective.
> - Putting default translations in the code is clutter in your opinion.

I didn't say it's my opinion :) My role in the Rails I18n group was
more the one of being a moderator.

> - You view the fact that SimpleLocalization, Globalite and co are not
> designed with the Gettext style as proof that people want something
> simple and clean. The way I see it is that they haven't seriously
> tried Gettext.

I'm pretty sure they did.

> I think their view of "simple" is like coding a web
> application without using MVC - it's simpler but it gives you more
> headache down the road.

Regarding the API quite the opposite is true. You just don't have this
feature set with gettext's _(). Regarding the tools layer I agree, but
hey, if you want to use poedit for 95% of your messages you can just
do that, no? Just add fast_gettext and use it. Also, I bet a converter
that takes a flat yaml translations file and converts it to po should
not be that hard to do.

> I find it "interesting" that pretty much all
> open source desktop applications use Gettext. Gettext has been used to
> translate hundreds, if not thousands, of desktop applications to
> dozens of languages. Yet the web applications world seems to
> completely ignore Gettext. For PHP I can understand, everybody's
> reinventing the wheel there. But Rails?

Look at the history of Rails. There were tons of competing
implementations, Gettext being one of them. Gettext hasn't been able
to win the race in any way and I think that's for a reason.

Also, we haven't reinvented the wheel. We've extracted what we (based
on the experience of several implementors) believed to be the best ideas.

> Your idea regarding the computability of symbols is interesting. On an
> abstract level it does seem to fit within the Rails philosophy, but it
> remains to be seen how useful it is in practice and whether anyone can
> come up with a good implementation.

Hm? People are doing stuff like this.

flash[:notice] = t(:"flash.#{controller_name}.#{action}.success",
                   :default => :"flash.#{action}.success")

Do that in gettext. Obviously, flash messages are only one place where
computability of keys is quite useful.
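
A minimal sketch of the computed-key lookup, written against a plain Hash rather than the real I18n backend. `FLASH_TRANSLATIONS`, `lookup_first` and `flash_message` are hypothetical names introduced for illustration; only the key layout (a controller-specific key with a generic per-action fallback) is taken from the example above.

```ruby
# Hypothetical in-memory translation table; not the real I18n backend.
FLASH_TRANSLATIONS = {
  "flash.create.success"       => "Successfully created.",
  "flash.posts.create.success" => "Your post has been published."
}.freeze

# Try each candidate key in order and return the first translation found.
def lookup_first(*keys)
  keys.each do |key|
    text = FLASH_TRANSLATIONS[key]
    return text if text
  end
  nil
end

# Compute the specific key first, then the generic one as a fallback.
def flash_message(controller_name, action)
  lookup_first("flash.#{controller_name}.#{action}.success",
               "flash.#{action}.success")
end

puts flash_message("posts", "create")    # specific translation
puts flash_message("comments", "create") # generic translation
```

One generic entry covers every controller, and a specific entry is only needed where the wording has to differ.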

Again, there was a reason why so many people weren't happy with
gettext before and invented their own APIs for years.

> In any case, what is clear that at the very least, Rails should have
> better I18n tools. There should be tools that alert translators which
> translations need to be updated, how many strings still need
> translations, for writing the translations, etc.

I agree.

Aside from that though I think some thought should be put into how to
integrate a gettext style accessor _('foo') and a gettext backend.

Sven Fuchs

Apr 26, 2009, 7:41:34 PM

to rubyonra...@googlegroups.com

I forgot to add the "default argument" against "default as keys" as
another pro/contra pair: keys can easily get out of sync. If it's hard
for a developer to come up with a good key for a translation (while
focussing on development) then it's even harder to come up with the
final English message at this point: there's a good chance for it to
change, so one has to propagate that change to translation files.
(Again, depending on your setup and environment that might be more or
less hassle.)

Btw having default translations in your code, no matter how clean or
cluttered that seems to anybody, will fight one of the major original
points that brought up this discussion: separation of roles (dev vs
editors vs marketing vs translators etc.)

On 26.04.2009, at 20:31, Hongli Lai wrote:

>

geekQ

Apr 27, 2009, 12:02:13 PM

to Ruby on Rails: Core

Hi,

= Executive summary ;-)

the most important question is, whether the core team would sacrifice
*some* parts of humanize, pluralize and other string concatenation
voodoo, especially in ActiveRecord to allow for 100% linguistically
correct translations and smooth, enterprise-ready localization
workflow.

This, together with other improvements, would make broader adoption
of Rails in a more traditional environment, outside start-ups,
possible.
Currently we have to put a lot of effort into monkey patching to work
around the opinionated decisions baked into Rails. My hope was
that Rails3 is planned to become more of a general-purpose
web framework.

Other questions are only technical details, supporting such
a decision.

= Details

Regarding default human readable string as a key Hongli Lai listed
some pros and cons, let me turn the three remaining cons to pros
and we get a solution, that has only advantages ;-)

> 1. Pro: lots of existing, mature Gettext tools for creating
> translations, detecting stale/outdated translations, generating
> translation statistics, etc.
> For example non-tech savvy translators can use Poedit to create the
> translations, which should be the most fool-proof thing after a web
> interface.
> 2. Pro: can easily fallback to the default translation.
> This is a huge benefit if you don't have a reliable translation
> team, i.e. not all translations are always kept up-to-date. This way
> the user interface can at least fallback to an English string, which
> is still better than presenting the user with an empty string, a
> symbol or an error message. This is
> the case for many open source projects, but probably not so for
> enterprise
> developers.
> 3. Pro: the default translation makes the code easier to understand.
> Symbols are usually a lot more opaque.
> 4. Con: it's not 100% straightforward. Developers who implement a
> localization framework themselves for the first time would probably
> use the symbol approach. Developers need some training in order to get
> used to Gettext's workflow of marking strings for translation,
> extracting them with tools, editing the translation files and
> compiling the translation files.

I've found the gettext workflow easy to grasp for new developers
in every team I have worked with so far.
For the self-taught, high-quality documentation (to be written) should
be enough.

> 5. Con: not possible for translators to change the default text
> without editing the source code.

The missing possibility of changing the default string has never been
an issue, neither in my personal decade of writing international
applications (different open source platforms, Microsoft.net and
pre-dot-net) nor for big open source projects with a longer history.
For commercial-grade applications the marketing department or release
team is going to translate our hacker English to marketing-conformant
English anyway. BTW, this is done separately for every English-speaking
country to account for cultural differences, e.g. translations differ
between the UK and South Africa.

The case with a typo in the default message can be handled the same
way as a typo in the symbol-name - as a bug, it can be corrected
in all the relevant files. There is even some tool support in gettext
for this - fuzzy matching and checking for missing translations.

> 6. Con: Ruby-related Gettext tools still suck. For example Ruby-
> Gettext Rails plugin cannot extract strings from Haml templates. I've
> seen someone reinvent his own localization framework based on symbols
> because of this.

Masao is currently rewriting ruby-gettext. I personally currently
prefer fast_gettext - not because it is fast, but because it has a
more straightforward implementation. Unlike ruby-gettext, it does
not attempt to monkey-patch Rails.

Regarding the parsing of Haml templates - the implementation will
likely be up to the Haml users. It can be based on ruby-gettext's
typical parsing of source code for string literals.

Sven Fuchs wrote:

> It turned out
> that this implementation seems to work for (as you say) 95% of all
> usecases which is much more than we expected.

So this solution does not qualify for any serious enterprise or
governmental (European Union) application. 100% linguistically correct
translations are required. 95% is much less than is expected from us.

> In your list I'd suggest that 9.) is just an example of a more abstract point: "Symbols as keys" makes it possible to compute keys. You can not compute default translations. Rails itself leverages that for validation messages, I've seen people using it for "resourceful controllers" (e.g. flash messages) and there are tons of other situations where this is useful. Computing keys allows you to define a generic translation that works for most of the situations and overwrite that for particular situations where you need something special - thus effectively reducing the amount of repetition a lot. You can also react to contexts (e.g. pick a particular translation depending on the type of an object) flexibly where you'd otherwise need to use generic translations/messages. Thus, I believe that "Symbols as keys" allow for a more abstract way of coding.

All kinds of hierarchically organized scopes (computed keys)
and (optional) translation inheritance do not work in an environment
with role separation. Inheritance and method overriding
work in OOP, but they do not work for translations.
A translation agency needs a comprehensive and flat list
of strings to be translated. To be able to decide whether to
override, and where, they would need to analyse the application
source code. Only a manually created and obligatory translation
scope makes sense. This is a kind of message from the developer to
the translation team. The gettext convention is to use the pipe
character:

_("Search|I'm feeling lucky")
_("Mood poll|I'm feeling lucky")
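
A small sketch of how a context-aware accessor could resolve such msgids. The helper is named `s_` after the convention used by Ruby gettext libraries, but the body is an illustration over a plain Hash, not their actual implementation; `CONTEXT_CATALOG_DE` and the German strings are made up.

```ruby
# Hypothetical German catalog keyed by the full "context|message" msgid.
CONTEXT_CATALOG_DE = {
  "Search|I'm feeling lucky"    => "Direkt zum ersten Treffer",
  "Mood poll|I'm feeling lucky" => "Ich bin gut gelaunt"
}.freeze

# Look up the full msgid; if it is untranslated, strip the context
# prefix so the user sees the plain default message.
def s_(msgid, catalog = CONTEXT_CATALOG_DE)
  catalog.fetch(msgid) { msgid.split("|", 2).last }
end

puts s_("Search|I'm feeling lucky")
puts s_("Mood poll|I'm feeling lucky")
puts s_("Menu|Open") # untranslated: falls back to "Open"
```

The same English text gets two different translations, and the translator sees the context in the flat list without reading the source code.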

> People wanted a simple and clean API, they explicitly did not want to mess with Gettext which was designed, let's face it, in 1994 for C. It feels old and awkward to many.)

And since then it has been adapted for 20 different programming
languages. Bindings for dynamic languages, e.g. Python, are very nice.
The same API can be used for Ruby too.

> For pluralization there could be a similar helper. This should work for all messages that do not contain a dot. I wonder if we can get rid of this limitation. Approaches that come to my mind:
>
> 1. Make the scope separator (dot) configurable. That might mean that Rails core should not continue using dots as separators (but instead just use Arrays for scopes).
> 2. Escape/unescape dots in the helper.
> 3. ?

Sounds complicated...

BTW,
* pluralization rules are different for different languages; gettext
uses a formula, written in a programming language, per language for
that
* some languages have 3 or 5 plural forms as opposed to 2 in English
and German
* only a complete sentence can be pluralized, not a single word

Gettext accounts for all that, in code and in the tool chain;
Rails I18n does not. So before spending much more time and effort on
yet another I18n implementation, we should focus on integrating a
perfectly solid solution that is known to work. Fixing and patching
and hoping for a >95% solution won't get Rails where many would like
to see it in the near future, i.e. in bigger enterprise-y environments.
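
To make the plural-form points concrete, here is a sketch of per-language plural selection. The Russian rule follows the formula commonly used in gettext PO headers for Russian; the table names, the `n_sentence` helper and the example sentences are assumptions for illustration only.

```ruby
# One rule per language: maps a count to the index of the plural form.
PLURAL_RULES = {
  "en" => lambda { |n| n == 1 ? 0 : 1 },
  "ru" => lambda { |n|
    if n % 10 == 1 && n % 100 != 11
      0
    elsif n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20)
      1
    else
      2
    end
  }
}.freeze

# Whole sentences per form, not single words glued together.
PLURAL_SENTENCES = {
  "en" => ["%d file was deleted", "%d files were deleted"],
  "ru" => ["Удалён %d файл", "Удалено %d файла", "Удалено %d файлов"]
}.freeze

def plural_index(locale, count)
  PLURAL_RULES.fetch(locale).call(count)
end

def n_sentence(locale, count)
  format(PLURAL_SENTENCES.fetch(locale)[plural_index(locale, count)], count)
end

[1, 2, 5, 11, 21, 22].each do |n|
  puts "#{n_sentence('en', n)} / #{n_sentence('ru', n)}"
end
```

Note that in both languages more than the noun changes between forms, which is why only complete sentences can be pluralized reliably.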

The question remains whether this really is the direction in which
Rails is heading. If so, we would contribute a solid Gettext-based I18n
implementation that addresses the aforementioned issues. This however
requires some breaking changes within the Rails core and a consensus
about the necessity of them being addressed.

Best Regards,

Vladimir

Sven Fuchs

Apr 27, 2009, 5:19:40 PM

to rubyonra...@googlegroups.com

Hi Vladimir,

On 27.04.2009, at 18:02, geekQ wrote:
> the most important question is, whether the core team would sacrifice
> *some* parts of humanize, pluralize and other string concatenation
> voodoo, especially in ActiveRecord to allow for 100% linguistically
> correct translations and smooth, enterprise-ready localization
> workflow.

Exactly which parts are you referring to?

> Currently we have to put a lot of effort into monkey patching to work
> around the opinionated decisions baked in into Rails.

Again, it would be great if you could list the exact places that you
found need monkeypatching.

> Missing possibility of changing the default string has never been an
> issue,
> neither in my personal decade of writing international applications
> (different open source platforms, Microsoft.net and pre-dot-net) nor
> for big
> open source projects with longer history.

It has been an issue which is why people implemented key based
solutions.

>> It turned out
>> that this implementation seems to work for (as you say) 95% of all
>> usecases which is much more than we expected.
> So this solution does not qualify for any serious enterprise or
> governmental (European Union) application. 100% linguistically correct
> translations are required. 95% is much less that is expected from us.

Right. Which is why we have a pluggable backend so you can implement
your needs in plugin land. If you need patching to core, please list
the places that need patching. If you need changes to the API, please
do so, too.

> All the kinds of hierarchically organized scopes (computed keys)
> and (optional) translation inheritance do not work in environment
> with role separation. Inheritance and method overriding
> work in OOP. But it does not work for translations.
> A translation agency needs a comprehensive and flat list
> of strings to be translated. To be able to make a
> decision about to override or not to override or where to override
> they would need to analyse the application source code. Only manually
> created and obligatory translation scope makes sense.

Maybe they don't make sense for the most part of translation agencies.
That doesn't mean they don't make sense for the rest.

>> People wanted a simple and clean API, they explicitly did not
>> want to mess with Gettext which was designed, let's face it, in
>> 1994 for C. It feels old and awkward to many.)
> And since then adapted for 20 different programming languages.
> Bindings
> for dynamic language, e.g. Python are very nice. Same API can be used
> for Ruby too.

Yeah, still assuming a C'ish API and compilation stage though.

> * pluralization rules are different for different languages,
> gettext uses a formula in a programming language per language for
> that
> * some languages have 3 or 5 plural forms as opposite to 2 in English
> and German
> * only complete sentence can be pluralized, not a single word

Yup. The API covers that.

> The question remains, whether this really is the direction towards
> Rails
> is heading. If so, we would contribute a solid Gettext based I18n
> implementation that addresses the aforementioned issues. This however
> requires some breaking changes within the Rails core and a consensus
> about the necessity of them being addressed.

I believe this ship has sailed about 1 year ago. It's not the question
anymore whether or not we want that API. The question is if everybody
who has good ideas rolls up their sleeves and implements them *using*
this common API. If you want to do that for Gettext I'm absolutely
sure the community will welcome that with big applause.

Sven Fuchs

Apr 27, 2009, 5:28:40 PM

to rubyonra...@googlegroups.com

>> For pluralization there could be a similar helper. This should
>> work for all messages that do not contain a dot. I wonder if we
>> can get rid of this limitation. Approaches that come to my mind:
>>
>> 1. Make the scope separator (dot) configurable. That might mean
>> that Rails core should not continue using dots as separators (but
>> instead just use Arrays for scopes).
>> 2. Escape/unescape dots in the helper.
>> 3. ?
>
> Sounds complicated...

Btw I've just pushed some experiments with gettext'ish accessors on
top of Rails I18n:

http://github.com/svenfuchs/i18n/tree/gettext

You might particularly want to look at the helper layer and the tests:

http://github.com/svenfuchs/i18n/blob/49220ce667fe542041c97bd38a0190e27a9581d6/lib/i18n/gettext.rb
http://github.com/svenfuchs/i18n/blob/49220ce667fe542041c97bd38a0190e27a9581d6/test/gettext_test.rb

For a fullstack gettext support that uses the Rails I18n API there
seem to be three things missing:

- complete the helpers (trivial)
- implement a gettext backend (anybody?)
- figure out a gettext'ish way to announce expected translations for
computed keys

Any help and/or feedback would be appreciated!

Lawrence Pit

Apr 28, 2009, 2:06:53 AM

to rubyonra...@googlegroups.com

Hi Sven

Nice.

I'd use this even without having gettext as the backend.

By helpers, do you mean the ability to accept named arguments? Would
you add that to the _ method or would you do it ruby-gettext style by
extending String with a % method?

(I'd prefer _ to accept the arguments directly, saves polluting the
String object)

Cheers,
Lawrence

Sven Fuchs

Apr 28, 2009, 4:24:19 AM

to rubyonra...@googlegroups.com

Hi Lawrence,

On 28.04.2009, at 08:06, Lawrence Pit wrote:
> I'd use this even without having gettext as the backend.

heh :)

> With helpers you refer to the ability to accept named arguments? Would
> you add that to the _ method or would you do it ruby-gettext style by
> extending String with a % method?

No idea, I was just checking this out for some kind of proof of concept.

> (I'd prefer _ to accept the arguments directly, saves polluting the
> String object)

Sure. I guess the question would be whether one wants to rebuild the
exact gettext api with all of its C'ish methods (sgettext, pgettext,
psgettext, ngettext, nsgettext, ...) or not.

Sven Fuchs

Apr 29, 2009, 1:28:39 PM

to rubyonra...@googlegroups.com

I've continued playing with this stuff and added an experimental
Gettext backend:

http://github.com/svenfuchs/i18n/commit/fb7fcfff5e94510dbc1cb0b9b12a374c6828fb6f

It extends from the Simple backend, reads PO files using Masao Mutoh's
poparser [1] and simply loads the translations to the standard Hash
format. They can then be read both using the gettext'ish helper
methods I've played with recently as well as the standard I18n gem API.

I18n.load_path = [File.dirname(__FILE__) + '/../locale/de.po']
I18n.backend = I18n::Backend::Gettext.new
I18n.locale = :de
assert_equal 'Auto', _('car')

Please note that this is really just an experimental proof of concept
thing. I want to show that it's possible but don't have a real use for
that myself right now. So, any feedback or help with this is highly
appreciated!
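
For readers who want a feel for the "read PO, load into a Hash" step, here is a deliberately tiny sketch. It is not Masao Mutoh's poparser and not the backend from the commit above: `parse_simple_po` is a hypothetical helper that only understands single-line `msgid`/`msgstr` pairs and ignores escapes, multi-line strings, plural forms and contexts.

```ruby
# Parse a very small subset of the PO format into { msgid => msgstr }.
def parse_simple_po(text)
  translations = {}
  msgid = nil
  text.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?("#")
    if (m = line.match(/\Amsgid\s+"(.*)"\z/))
      msgid = m[1]
    elsif msgid && (m = line.match(/\Amsgstr\s+"(.*)"\z/))
      # Skip the header entry (empty msgid) and untranslated entries.
      translations[msgid] = m[1] unless msgid.empty? || m[1].empty?
      msgid = nil
    end
  end
  translations
end

PO_SAMPLE = <<~'PO'
  # German translations
  msgid ""
  msgstr ""
  "Content-Type: text/plain; charset=UTF-8\n"

  msgid "car"
  msgstr "Auto"

  msgid "bus"
  msgstr ""
PO

p parse_simple_po(PO_SAMPLE)
```

A Hash like this can be nested under a locale key and handed to any store that expects the standard translations format.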

Also, maybe this is a good time to take this discussion over to the
rails-i18n mailinglist [2] to work out implementation details? I'll
just post a follow-up over there.

Lemme also point out that there are efforts from other people to
improve Gettext integration for or use alongside with Rails I18n.
Maybe most notably:

- Masao Mutoh's gettext_rails [3]
- Sam Lown's i18n_gettext [4]
- Michael Grosser's fast_gettext [5]

[1] http://github.com/mutoh/gettext/blob/d36e97af7dc801af1b1ceb5a47450cab90ed078f/lib/gettext/poparser.rb
[2] http://groups.google.com/group/rails-i18n
[3] http://github.com/mutoh/gettext_rails
[4] http://github.com/ferblape/i18n_gettext
[5] http://github.com/grosser/fast_gettext

geekQ

Apr 30, 2009, 4:27:16 AM

to Ruby on Rails: Core

Hi,

Sven Fuchs wrote:
> Hi Vladimir,
>
> On 27.04.2009, at 18:02, geekQ wrote:
>> the most important question is, whether the core team would sacrifice
>> *some* parts of humanize, pluralize and other string concatenation
>> voodoo, especially in ActiveRecord to allow for 100% linguistically
>> correct translations and smooth, enterprise-ready localization
>> workflow.
>
> Exactly which parts are you referring to?

The problem is best visible in the following line:

http://github.com/rails/rails/blob/09a976ac58d2d7637003b92d51637f59f647b53a/activerecord/lib/active_record/validations.rb#L207

full_messages << attr_name +
  I18n.t('activerecord.errors.format.separator', :default => ' ') +
  message

The counterpart in Rails3 is
http://github.com/rails/rails/blob/bab2bfa69220ca1b6c7b56dccc79cf8e41245306/activemodel/lib/active_model/errors.rb#L65

errors_with_attributes << (attribute.to_s.humanize + " " + error)

This makes the whole ActiveRecord validation subsystem impossible to
use for linguistically correct validation messages.
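
One way to avoid the concatenation is to let each locale supply a complete format string, so translators control word order and can leave out the attribute name entirely. The sketch below is an assumption about how that could look, not code from Rails; `FULL_MESSAGE_FORMATS` and `full_message` are hypothetical names.

```ruby
# Hypothetical per-locale templates for a full validation message.
FULL_MESSAGE_FORMATS = {
  # English keeps the familiar "Attribute message" order.
  "en" => "%{attribute} %{message}",
  # Here the translator writes complete sentences, so no prefix is added.
  "de" => "%{message}"
}.freeze

# Build the full message from the locale's template instead of
# gluing attribute name, separator and message together.
def full_message(locale, attribute, message)
  format(FULL_MESSAGE_FORMATS.fetch(locale),
         attribute: attribute, message: message)
end

puts full_message("en", "Name", "can't be blank")
puts full_message("de", "Name", "Bitte geben Sie Ihren Namen ein.")
```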

There are probably more places where string concatenation is used,
but *validation* causes trouble all the time.

The second issue with ActiveRecord validations is using custom
messages. Gettext cannot be used at this place without
monkey-patching that adds lambda support.

= Known Monkeys

* in our project we monkey patched as follows
http://blog.geekq.net/2009/04/09/i18n-remove-validation-message-prefix/

* Masao Mutoh pointed out that we do not need any monkey patching
if we use N_ from his gettext library, because he has already
monkey patched everything.

* more monkey patching from Masao
http://github.com/mutoh/*

* the following library also overrides full_messages()
http://github.com/yaroslav/russian/blob/7960596ede5159462c41d5dcd07b137953bf1b3d/lib/russian/active_record_ext/custom_error_message.rb

>> * pluralization rules are different for different languages,
>> gettext uses a formula in a programming language per language for
>> that
>> * some languages have 3 or 5 plural forms as opposite to 2 in English
>> and German
>> * only complete sentence can be pluralized, not a single word
>
> Yup. The API covers that.

Did not find documentation for that in activesupport-2.3 / I18n.
Now I've found some hints in the current Rails guide.

But people are still forced to do a lot of programming per language,
like in
http://github.com/yaroslav/russian/blob/7960596ede5159462c41d5dcd07b137953bf1b3d/lib/russian/backend/advanced.rb

> I believe this ship has sailed about 1 year ago. It's not the question
> anymore whether or not we want that API. The question is if everybody
> who has good ideas rolls up their sleeves and implements them *using*
> this common API. If you want to do that for Gettext I'm absolutely
> sure the community will welcome that with big applause.

Could you point to at least one complete backend implementation,
that is entirely based on the Rails.I18n public API, without the
need for extensive monkey patching?

> Lemme also point out that there are efforts from other people to
> improve Gettext integration for or use alongside with Rails I18n.
> Maybe most notably:
>
> - Masao Mutoh's gettext_rails
> - Sam Lown's i18n_gettext
> - Michael Grosser's fast_gettext

No, it is not possible to implement serious Gettext or serious
internationalization on the basis of the Rails I18n API, which is why

- gettext_rails is a pure monkey patch solution, without any usage
of the mentioned API
- Michael Grosser's fast_gettext does not use the mentioned API
in any way
- Sam Lown's i18n_gettext is a Rails plugin that simply wraps
Masao's library and uses it as a fallback in addition to
the Rails simple backend. i18n_gettext is not a standalone
internationalization solution

> Also, maybe this is a good time to take this discussion over to the
> rails-i18n mailinglist [2] to work out implementation details? I'll
> just post a follow-up over there.

No, I was discussing the ActiveRecord and Rails core issues
here, not rails-i18n issues. If there is no interest here,
I'll not bother with further mails.

= Conclusion

I've noticed that

1. Internationalization is out of scope for the Rails core team
2. Those responsible for Rails.I18n do not grasp the important
concepts of internationalization
3. All the cool hackers who need real internationalization currently
do this by monkey patching Rails, especially ActiveRecord

So I'll concentrate on doing the third until something changes on
the first.

Best Regards and good-bye,

Vladimir

Michael Koziarski

Apr 30, 2009, 5:49:35 AM

to rubyonra...@googlegroups.com

Vladimir,

Sven has tried repeatedly to get specifics out of you throughout this
thread. Until this message there's been nothing but vague statements
and rehashing of discussions which came to conclusion months ago.
Sven and the rails-i18n team *do* grasp the issues that you've
mentioned and have their i18n patches applied straight to rails. The
guys on that list are responsible for directing the rails i18n effort
and we listen to them and take their patches. You've clearly
identified a few key points where the existing i18n api is lacking,
and the ActiveRecord code is inflexible. Let's address those issues,
and the right place to do those is the rails-i18n list and sven and co
are the ones to talk with.

Rather than throwing your toys out of the cot and feeling
self-satisfied in the superior enterprisiness of your approach, you
should try to work with Sven and the team to iron out all the issues
with the existing api and let everyone benefit from the amount of work
you've clearly put into this. If you're genuinely interested in
enabling 'true gettext' support and removing the string
concatenations in the validations API, then it will surely be a few
small, targeted patches.

If on the other hand you're looking to dump wiki markup into mailing
list threads and talk dismissively about the work of other
programmers, then perhaps you should do that elsewhere.

We're all working towards the same goal here, just because you've
found some shortcomings doesn't mean that the people who did the
existing work are evil or clueless.

--
Cheers

Koz

Sven Fuchs

May 1, 2009, 8:37:36 AM

to rubyonra...@googlegroups.com

Hi Vladimir,

On 30.04.2009, at 10:27, geekQ wrote:
> The problem is best visible in the following line:
>
> http://github.com/rails/rails/blob/09a976ac58d2d7637003b92d51637f59f647b53a/activerecord/lib/active_record/validations.rb#L207
>
> full_messages << attr_name +
> I18n.t('activerecord.errors.format.separator', :default => ' ') +
> message
>
> The counterpart in Rails3 is
> http://github.com/rails/rails/blob/bab2bfa69220ca1b6c7b56dccc79cf8e41245306/activemodel/lib/active_model/errors.rb#L65
>
> errors_with_attributes << (attribute.to_s.humanize + " " + error)

Great, thanks for pointing that out. This is a known issue and I agree
that we should get that fixed. There are a few options to do that and
discussion about that has already started over at http://groups.google.com/group/rails-i18n

Please join in! We're keen on hearing your opinions.

> The second issue with ActiveRecord validations is using custom
> messages. Gettext can not be used at this place without monkey-
> patching,
> that adds lambda support.

Integrating lambda support into the I18n API has been a request for a
long time. It's also useful for localizing dates according to rather
funky rules and such.

I've worked with Clemens yesterday on integrating and polishing his
contributions and pushed it to a branch: http://github.com/svenfuchs/i18n/commits/lambda

So this should then be possible:

validates_format_of :account, :messages => lambda { _("foo") }

Another option to solve this situation might be:

validates_format_of :account, :messages => gettext_noop("foo").to_sym

This is also being discussed on the rails-i18n list. Please let us
know about your opinion.
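
The reason a lambda helps is timing: a message translated when the class is loaded is frozen in whatever locale was active then, while a lambda is translated at the moment the message is needed. The sketch below shows only that difference; `TinyI18n` and `resolve_message` are hypothetical stand-ins, not part of Rails or the lambda branch.

```ruby
# Hypothetical miniature translator with a switchable locale.
module TinyI18n
  CATALOGS = { "de" => { "is too short" => "ist zu kurz" } }.freeze

  class << self
    attr_accessor :locale
  end
  self.locale = "en"

  # Return the translation for the current locale, or the message itself.
  def self.translate(msg)
    CATALOGS.fetch(locale, {}).fetch(msg, msg)
  end
end

# Call the message if it is callable, otherwise use it as-is.
def resolve_message(message)
  message.respond_to?(:call) ? message.call : message
end

eager = TinyI18n.translate("is too short")            # translated once, now
lazy  = lambda { TinyI18n.translate("is too short") } # translated on demand

TinyI18n.locale = "de" # e.g. set per request
puts resolve_message(eager) # still the English text
puts resolve_message(lazy)  # German text
```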

> = Known Monkeys
>
> * in our project we monkey patched as follows
> http://blog.geekq.net/2009/04/09/i18n-remove-validation-message-prefix/
>
> * Masao Mutoh pointed out, that we do not need any monkey patching,
> if we use N_ from his gettext library because he has already
> monkey patched everything.
>
> * more monkey patching from masao
> http://github.com/mutoh/*
>
> * following library also overrides the full_messages()
> http://github.com/yaroslav/russian/blob/7960596ede5159462c41d5dcd07b137953bf1b3d/lib/russian/active_record_ext/custom_error_message.rb

Great list, this is helpful. Thanks!

>>> * pluralization rules are different for different languages,
>>> gettext uses a formula in a programming language per language for
>>> that
>>> * some languages have 3 or 5 plural forms as opposite to 2 in
>>> English
>>> and German
>>> * only complete sentence can be pluralized, not a single word
>>
>> Yup. The API covers that.
>
> Did not find documentation for that in activesupport-2.3 / I18n.
> Now I've found some hints in the current Rails guide.
>
> But people are still forced to do a lot of programming per language,
> like in
> http://github.com/yaroslav/russian/blob/7960596ede5159462c41d5dcd07b137953bf1b3d/lib/russian/backend/advanced.rb

Sure. Please distinguish the API from its implementations
(backends). This is on purpose.
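
As a rough sketch of what "a formula per language" means in practice, the plain-Ruby example below keeps one pluralization rule per locale and selects a complete sentence by the resulting key. The names `PLURAL_RULES`, `plural_key`, `MESSAGES`, and `pluralize_message` are hypothetical and not part of the I18n API; the Russian rule follows the commonly used gettext plural formula for that language.

```ruby
# Hypothetical rule table for illustration; not part of the I18n API.
PLURAL_RULES = {
  # English: singular for exactly 1, plural otherwise.
  :en => lambda { |n| n == 1 ? :one : :other },
  # Russian: three forms, chosen by the last one or two digits.
  :ru => lambda { |n|
    if n % 10 == 1 && n % 100 != 11
      :one
    elsif (2..4).include?(n % 10) && !(12..14).include?(n % 100)
      :few
    else
      :many
    end
  }
}

# Returns the plural category for a count in the given locale.
def plural_key(locale, count)
  PLURAL_RULES.fetch(locale).call(count)
end

# Complete sentences are stored per plural form, never single words.
MESSAGES = {
  :en => { :one => "%d file deleted", :other => "%d files deleted" }
}

# Picks the sentence for the count and fills in the number.
def pluralize_message(locale, count)
  format(MESSAGES.fetch(locale).fetch(plural_key(locale, count)), count)
end

puts pluralize_message(:en, 1)
puts pluralize_message(:en, 3)
puts [1, 2, 5, 11, 21, 22].map { |n| plural_key(:ru, n) }.inspect
```

Whether such rules live in a backend, in a locale data file, or in the API itself is exactly the design question discussed in this thread.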

There are a few more backend implementations in Globalize2:
http://github.com/joshmh/globalize2/tree/a46ab1e885c37aff435823d992cc8c919b0e3c50/lib/globalize/backend

Please think about the I18n API in Rails as similar to the Rack API
support. Rack allows for previously unseen extensibility and
exchangeability of concurrent implementations of rather focussed
features.

Even though Rails 2.x now supports that API, it doesn't leverage
all of the features it provides. E.g. Rack routing/url_generation is
not supported yet (it will be there in Rails 3, afaik). Nobody's arguing
Rails should stop supporting Rack for this reason though. Similarly,
the fact that Rails does not perfectly support all features required
for proper I18n/L10n does not mean it should stop supporting the I18n
API.

>> I believe this ship has sailed about 1 year ago. It's not the
>> question
>> anymore whether or not we want that API. The question is if everybody
>> who has good ideas rolls up their sleeves and implements them *using*
>> this common API. If you want to do that for Gettext I'm absolutely
>> sure the community will welcome that with big applause.
> Could you point to at least one complete backend implementation,
> that is entirely based on the Rails.I18n public API, without the
> need for extensive monkey patching?

If by "extensive monkey patching" you mean the bug/shortcoming in
AR#full_messages then, no.

> No, it is not possible to implement serious Gettext or serious
> internationalization on the basis of Rails I18n API, that is why
>
> - gettext_rails is a pure monkey patch solution, without any usage
> of the mentioned API
> - Michael Grosser's fast_gettext does not use the mentioned API
> in any way
> - Sam Lown's i18n_gettext is a Rails plugin, that simply wraps
> the Masao's library and uses it as a fallback in addition to
> the Rails simple backend. i18n_gettext is not a stand alone
> internationalization solution

Yeah, I know these are different approaches from what you have in mind.

I've listed them because I have received some angry private messages
that were based on the perception that I didn't know about, concealed,
or downplayed these efforts. Just wanted to make sure people know that
I don't; these are great contributions.

Thanks again!

Vladimir Dobriakov

May 1, 2009, 9:48:09 AM

to rubyonra...@googlegroups.com

Hi Sven,

I am happy to hear that we totally agree on the important items.

We should tackle the problems as you describe them:

1. use string interpolation instead of concatenation everywhere

> Great, thanks for pointing that out. This is a known issue and I agree
> that we should get that fixed. There are few options to do that and
> discussion about that has already started over at
> http://groups.google.com/group/rails-i18n

2. introducing lambda support for error messages for maximum flexibility

> Integrating lambda support to the I18n API has been a request for a
> long time. It's also useful for localizing dates to rather funky rules
> and such.

3. discuss/improve the API. My opinion has always been that supporting
different *storage* backends is a good idea. Every developer is
comfortable with YAML files. Others can use the gettext-specific
.mo or .po files (bypassing, as you pointed out,
the dated compilation approach). On the other hand, some things
should not be optional and cannot be plugged in through the API,
but let's discuss this later, after we have succeeded with the first
two things.
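
For item 1 in the list above, here is a minimal plain-Ruby sketch of the difference between concatenation and interpolation. The format table and both helper methods are hypothetical, and the `:de` entry only demonstrates a different word order; it is not a real translation and not how Rails builds full messages today.

```ruby
# Hypothetical per-locale formats for a full error message.
FULL_MESSAGE_FORMATS = {
  :en => "%{attribute} %{message}",
  :de => "%{message}: %{attribute}"  # illustrative ordering only
}

# Concatenation hard-codes "attribute first, then message" for every language.
def full_message_by_concatenation(attribute, message)
  attribute + " " + message
end

# Interpolation lets each locale decide where the parts go.
def full_message_by_interpolation(locale, attribute, message)
  FULL_MESSAGE_FORMATS.fetch(locale) % { :attribute => attribute, :message => message }
end

puts full_message_by_concatenation("Account", "is invalid")
puts full_message_by_interpolation(:en, "Account", "is invalid")
puts full_message_by_interpolation(:de, "Account", "is invalid")
```

With a format string per locale, a translator can reorder or drop the attribute prefix without any monkey patching of the code that assembles the message.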

> Please think about the I18n API in Rails as similar to the Rack API
> support. Rack allows for previously unseen extensibility and
> exchangeability of concurrent implementations of rather focussed
> features.

This is an excellent example!

It illustrates two things: successful technical design and the importance
of experience and solution maturity. Instead of reinventing the wheel, the
Ruby community adopted a successful and proven solution from the Python
world, where it is known under the name of WSGI, and further improved it.

I have been talking about gettext the whole time not because I admire the
obscure .mo file format
http://www.gnu.org/software/gettext/manual/gettext.html#MO-Files , but
because the folks at GNU have already seen all the possible problems and
addressed them in the design, the tools, and the best practices.
http://www.gnu.org/software/gettext/manual/gettext.html#Why

Best Regards and see you
on the http://groups.google.com/group/rails-i18n shortly,