Internationalization suggestions

Jan Rychter

Feb 23, 2008, 10:13:17 AM
to webl...@googlegroups.com
I'm banging my head into a number of weblocks built-in strings, so I
thought I'd provide a general suggestion to make life easier for those
of us who write non-English applications. There is basically one rule
worth sticking to:

"You should not assume that you can generate any part of any string
visible to the user without the full context."

What this means is that for strings presented to users it isn't enough
to provide a localizable *default-message-value*; you have to allow
delegation to the most specific piece of the code dealing with the
string, because only that piece of code will have appropriate context to
properly produce the string. In case of Weblocks, that piece of code is
usually in the view, or very close to it.

This is something that is very familiar to people speaking Slavic
languages and I've seen it discussed on this list. Let me provide some
examples:

-- the string "(required)" appearing next to required fields. In Polish,
this will be either "wymagane", "wymagany" or "wymagana", depending on
the gender of the noun before it, so you have to let me override
this in my view definition on a per-field basis, unless you enhance
humanize-name to guess the gender of field names :-)

-- *required-field-message*: it isn't enough to provide it as a
defparameter. I can make it work in Polish, but it sounds artificial
and I would much rather specialize this error on a per-field basis
(depending on gender and plurality). Also, please do not assume that
you can just stick the field name into the ~A in the error
string. In general, you can't, because some combinations will sound
silly.

-- whenever you want to produce a string that contains a number and a
noun, delegate the whole thing to the programmer writing the final
application, and enjoy watching as he screams in pain and agony. It
has been mentioned here that numerals are difficult in Slavic
languages, but I don't think people fully grasp how crazy it can
get. I think Polish is the worst here, as the form of the noun as
well as that of the numeral depends on the number, gender and
case. Given that we have 7 singular cases (plus 7 for plural), three
genders, all multiplied by (at least) several forms that depend on
the particular number (for example, 21 produces a different form from
22, and it gets different again at 25), you really really don't want
to get into that. It's completely insane. If you still don't believe
me, take a look at this concise 252-page introduction to Polish
numerals:
http://www.amazon.com/Liczebnik-grammar-numerals-exercises-language/dp/832420234X
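
To give a flavour of even the simplest case, here is a minimal sketch of
choosing a noun form from a count. It covers only plain nominative
counting of one noun and ignores grammatical case, numeral gender and
everything else described above. The names polish-plural-form and
count-minutes are illustrative assumptions, not part of Weblocks.

```lisp
;; Hypothetical helper, not part of Weblocks. Handles only the
;; nominative counting case for a single noun.
(defun polish-plural-form (n one few many)
  "Return ONE for 1, FEW for counts ending in 2-4 (except 12-14), else MANY."
  (let ((last-digit (mod n 10))
        (last-two (mod n 100)))
    (cond ((= n 1) one)
          ((and (<= 2 last-digit 4)
                (not (<= 12 last-two 14)))
           few)
          (t many))))

;; Example noun: "minuta" (minute), chosen for illustration.
(defun count-minutes (n)
  (format nil "~D ~A" n (polish-plural-form n "minuta" "minuty" "minut")))

;; (count-minutes 21) => "21 minut"
;; (count-minutes 22) => "22 minuty"
;; (count-minutes 25) => "25 minut"
```

Even this small rule has to live in application code that knows the noun;
a framework that builds "<number> <noun>" on its own cannot get it right.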

I'm writing this because I believe sticking to the above suggestion is
more important than using solutions such as cl-i18n. Don't get me wrong,
cl-i18n will work for lots of people, but if you don't make a design
decision to provide a way to override the strings locally where full
context is available, there will be lots of cases where it won't work.
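
One possible shape for such a local override, sketched under assumptions:
a generic function supplies the English default and the application
specializes it per field. The function required-indicator and the field
names are hypothetical and do not reflect the actual Weblocks API.

```lisp
;; Hypothetical protocol, not the actual Weblocks API.
(defgeneric required-indicator (field-name)
  (:documentation "Text shown next to a required field.")
  (:method (field-name)
    (declare (ignore field-name))
    "(required)"))

;; The application overrides the string where the noun's gender is known.
(defmethod required-indicator ((field-name (eql :name)))     ; "nazwa", feminine
  "(wymagana)")

(defmethod required-indicator ((field-name (eql :address)))  ; "adres", masculine
  "(wymagany)")

(defmethod required-indicator ((field-name (eql :surname)))  ; "nazwisko", neuter
  "(wymagane)")

;; (required-indicator :name)  => "(wymagana)"
;; (required-indicator :email) => "(required)"   ; falls back to the default
```

The point of the design is that the default stays in the framework while
the complete, final string is produced by the code that has the context.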

--J.

Vyacheslav Akhmechet

Feb 23, 2008, 2:39:12 PM
to webl...@googlegroups.com
On 2/23/08, Jan Rychter <j...@rychter.com> wrote:
> I'm banging my head into a number of weblocks built-in strings, so I
> thought I'd provide a general suggestion to make life easier for those
> of us who write non-English applications.
It is by no means my intention to ignore users that write non-English
applications. The reason why I didn't put effort into making weblocks
internationalization friendly is that it adds an entire dimension of
complexity and at the time I wanted to get something working first,
and expand later.

I am fluent in Russian (in fact, it's my native language), so the
comments you made about Polish don't come as a surprise to me. It
probably makes sense to come up with a unified internationalization
strategy so that when people run into these issues they can deal with
them in a consistent way on a case by case basis. I suppose a unified
solution will involve a combination of cl-i18n and context-specific
string specialization as you're suggesting.

I suppose the question is, what needs to be done to get over the
issues you're facing now?

P.S. This is somewhat unrelated, but one thing I've been thinking is
that I am not convinced that including string literals into a
programming language is actually a good idea. It makes it easy to say
(print "hello world") but also makes things incredibly difficult the
moment internationalization enters the picture.

Jan Rychter

Feb 23, 2008, 3:11:10 PM
to webl...@googlegroups.com
"Vyacheslav Akhmechet" <coff...@gmail.com> writes:
> On 2/23/08, Jan Rychter <j...@rychter.com> wrote:
>> I'm banging my head into a number of weblocks built-in strings, so I
>> thought I'd provide a general suggestion to make life easier for those
>> of us who write non-English applications.
> It is by no means my intention to ignore users that write non-English
> applications. The reason why I didn't put effort into making weblocks
> internationalization friendly is that it adds an entire dimension of
> complexity and at the time I wanted to get something working first,
> and expand later.
>
> I am fluent in Russian (in fact, it's my native language), so the
> comments you made about Polish don't come as a surprise to me.

I actually guessed everything you wrote above, and I didn't mean my post
as a criticism, rather as suggestions on how to improve things.

> It probably makes sense to come up with a unified internationalization
> strategy so that when people run into these issues they can deal with
> them in a consistent way on a case by case basis. I suppose a unified
> solution will involve a combination of cl-i18n and context-specific
> string specialization as you're suggesting.

That is exactly what I'm thinking.

> I suppose the question is, what needs to be done to get over the
> issues you're facing now?

Well, one thing that would help is to go through weblocks and, every time
you see a string literal, think how it can be "exported" to the view, or
close to it. I have nothing against these literals sitting there, but
I'd like there to be a way for me to override them, fully.

Unfortunately, this is a difficult task, because it often involves
redesign. Just wrapping all string literals in some sort of i18n calls
won't really solve anything.

> P.S. This is somewhat unrelated, but one thing I've been thinking is
> that I am not convinced that including string literals into a
> programming language is actually a good idea. It makes it easy to say
> (print "hello world") but also makes things incredibly difficult the
> moment internationalization enters the picture.

But it depends on where those string literals sit. If you write a
single-language application (which probably covers more than 95% of
usage scenarios) and you use the string literals within your UI
definitions, everything is just fine. It all fits together nicely and is
very simple. I wouldn't sacrifice that simplicity.

For those of us that really need multi-language applications, having
string literals close to the UI context really helps. You still need an
i18n approach, but you get more information to work with, because you
know the exact context.

I'm suggesting that sticking to the design principle I described in my
previous E-mail is probably a good direction to take. If I could
override all strings from within my view definitions, I wouldn't need
anything else at this point -- I'm writing a single-language
application, it's just that the language happens not to be English.

--J.

Vyacheslav Akhmechet

Feb 23, 2008, 3:49:11 PM
to webl...@googlegroups.com
On 2/23/08, Jan Rychter <j...@rychter.com> wrote:
> I'm writing a single-language
> application, it's just that the language happens not to be English.
Right, that's another can of worms. An application that's done in one
language which isn't English, and an application that needs to support
multiple languages are two related but completely different beasts. I
guess a combination of your approach and cl-i18n can help in the
second case, but supporting multiple languages will probably involve
pushing the abstraction of strings to a significantly more involved level (I
am guessing you'll need very different abstractions depending on
whether you're writing your application in Russian or Arabic).

brian

Feb 23, 2008, 11:42:42 PM
to weblocks
You might look at the condition system in CL.

I'd agree that applications should not contain string literals --
symbolic expressions which can be rendered as text is a good approach.
I don't think that you should need very different abstractions, as
long as your symbolic expression is sufficiently uncomposed -- i.e.,
on the order of whole sentences or paragraphs. It may be possible to
join multiple expressions with 'because', or 'then', or 'alternately'
clauses, but that becomes more complex.

(past (because (missing file "fred.txt") fail-submit))

The submit failed because the file "fred.txt" could not be found.
fred.txt 파일을 찾을수 없어서 수락이 안됀데요.

Anyhow, it should be possible to handle things like missing-file or
field-required or whatever, as long as they produce disjoint textual
expressions.
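
A minimal sketch of the whole-sentence idea, under assumptions: each
message kind gets one complete sentence per language, selected by generic
function dispatch, so nothing is assembled from fragments. The names
render-message and render, the message kind, and the German wording are
illustrative choices, not an existing library.

```lisp
;; Hypothetical sketch: one whole-sentence method per (kind, language).
(defgeneric render-message (kind language &rest args)
  (:documentation "Render the message KIND with ARGS as text in LANGUAGE."))

(defmethod render-message ((kind (eql :submit-failed-missing-file))
                           (language (eql :en)) &rest args)
  (destructuring-bind (file) args
    (format nil "The submit failed because the file \"~A\" could not be found."
            file)))

(defmethod render-message ((kind (eql :submit-failed-missing-file))
                           (language (eql :de)) &rest args)
  (destructuring-bind (file) args
    (format nil "Das Absenden ist fehlgeschlagen, weil die Datei \"~A\" nicht gefunden wurde."
            file)))

(defun render (expression language)
  "EXPRESSION is a list (KIND ARG...); render it as text in LANGUAGE."
  (apply #'render-message (first expression) language (rest expression)))

;; (render '(:submit-failed-missing-file "fred.txt") :en)
;; => "The submit failed because the file \"fred.txt\" could not be found."
```

Composing kinds with 'because' or 'then' would need more than this, which
is where the extra complexity mentioned above comes in.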

Vyacheslav Akhmechet

Feb 24, 2008, 1:07:17 PM
to webl...@googlegroups.com
On 2/23/08, brian <brian.s...@gmail.com> wrote:
> (past (because (missing file "fred.txt") fail-submit))
This is a nice system, but likely very difficult to implement. In
English you have to deal with all kinds of exceptions, in Russian
words change in unpredictable ways, etc. Also, I think Asian languages
don't have the same constructs of past-present-future as Germanic and
Slavic languages do.

This might be a good approximation though...
