More on unicode


Uwe Feldtmann

Feb 1, 2007, 1:46:45 AM
to pylons-...@googlegroups.com
I've cross posted this on the Mako list as well.

Now I'm not sure if this is a question for this list or the Mako list.

The scenario:-

A template contains translatable strings and is rendered by the Pylons
controller via Mako.
The translated and compiled template ends up in data/templates/xxx.py

user A wants the page in English
user B wants it in ForeignLang

What happens if both users request the page at the same time? I assume
it will be generated twice (which seems wasteful), once for each
language, overwriting the one before it.

Is there a way to avoid overwriting an already generated page?

James Gardner

Feb 1, 2007, 10:18:40 AM
to pylons-...@googlegroups.com
Hi Uwe,

The translation should occur at run time so this shouldn't be a problem.
How are you doing the translation?

If you are using the Pylons _() function in the template everything
should be fine surely?

Cheers,

James

Shannon -jj Behrens

Feb 1, 2007, 9:40:48 PM
to pylons-...@googlegroups.com

I don't think that the user's desired language affects how the page
gets compiled.

Best Regards,
-jj

--
http://jjinux.blogspot.com/

Uwe Feldtmann

Feb 2, 2007, 1:06:16 AM
to pylons-...@googlegroups.com
Hi James.

James Gardner wrote:
> The translation should occur at run time so this shouldn't be a problem.
> How are you doing the translation?
>

On closer inspection it doesn't seem to be a problem. I was thinking
that the pre-compiled templates were what was being sent to the browser.
My mistake.


> If you are using the Pylons _() function in the template everything
> should be fine surely?
>

All is translating fine using _() although it would be nice if there was
some quick way to get all the strings from a template.

Another question however:

Is request.environ['HTTP_ACCEPT_LANGUAGE'] the best way to get access
to the languages acceptable to the browser?

This is what I get when I execute the above:-
en-GB,en;q=0.9,en-us;q=0.8,en-US;q=0.6,ar-AE;q=0.5,ar;q=0.4,en-gb;q=0.3,en;q=0.1

Is there a further breakdown or list or should I parse the line manually?

request.environ.languages returning a list would be cool.

The list for the above might look like
['en-GB','en','en-us','en-US','ar-AE','ar','en-gb']
leaving the duplicate 'en' off the end.
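
For anyone who does end up parsing the header by hand, here is a minimal sketch in plain Python 3. It is not the Paste implementation, and the name parse_accept_language is made up for illustration. It assumes each entry is a language tag with an optional q parameter, that a missing q means 1.0, and that q=0 marks a tag as not acceptable. The q-values are independent weights between 0 and 1, so nothing requires them to add up to any total. Only exact duplicates are dropped, so case variants such as 'en-us' and 'en-US' both survive, as in the list above.

```python
def parse_accept_language(header):
    """Return language tags ordered by descending q-value, without exact duplicates."""
    entries = []
    for position, item in enumerate(header.split(',')):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(';')
        quality = 1.0  # assumption: q defaults to 1 when omitted
        for param in params.split(';'):
            name, _, value = param.strip().partition('=')
            if name.strip() == 'q':
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat an unreadable q as "not acceptable"
        # Sort key: highest q first, original position breaks ties.
        entries.append((-quality, position, tag.strip()))
    entries.sort()

    seen = set()
    ordered = []
    for neg_quality, _, tag in entries:
        if neg_quality == 0 or tag in seen:
            continue  # skip q=0 entries and exact repeats
        seen.add(tag)
        ordered.append(tag)
    return ordered


header = ('en-GB,en;q=0.9,en-us;q=0.8,en-US;q=0.6,'
          'ar-AE;q=0.5,ar;q=0.4,en-gb;q=0.3,en;q=0.1')
languages = parse_accept_language(header)
print(languages)
```

Lowercasing each tag before the duplicate check would merge the case variants as well, if that is preferred.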

Uwe

Uwe Feldtmann

Feb 2, 2007, 1:07:14 AM
to pylons-...@googlegroups.com
Shannon -jj Behrens wrote:
> I don't think that the user's desired language affects how the page
> gets compiled.
>
You are right. I wasn't thinking straight on this one. Too many late
nights.


James Gardner

Feb 2, 2007, 8:39:47 AM
to pylons-...@googlegroups.com
Hi Uwe,

Uwe Feldtmann wrote:
> All is translating fine using _() although it would be nice if there was
> some quick way to get all the strings from a template.

As luck would have it I was documenting this yesterday around line 844 here:

http://pylonshq.com/project/pylonshq/browser/Pylons/trunk/docs/internationalization.txt?order=name

You can do something like this to extract strings. If you give it a go,
I'd be interested to hear if you have any problems.

find translate_demo -type f -name '*myt' > translate_demo/i18n/filelist
find translate_demo -type f -name '*mak' >> translate_demo/i18n/filelist
find translate_demo -type f -name '*py' >> translate_demo/i18n/filelist
cat translate_demo/i18n/filelist | xargs xgettext -o \
    translate_demo/i18n/messages.pot \
    --language=Python --from-code=utf-8 \
    --keyword=_ --keyword=N_ --keyword=ugettext \
    --keyword=gettext --keyword=ngettext --keyword=ungettext
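
On a system without find, the file list could be built from Python instead. This is only a sketch of the first three commands: the helper name collect_sources is hypothetical, and it matches file names by their ending in the same loose way as the '*myt', '*mak' and '*py' patterns (no dot required). Unlike the find commands it returns one sorted list rather than grouping by extension.

```python
import os


def collect_sources(root, suffixes=('myt', 'mak', 'py')):
    """Return paths under root whose file name ends with one of the suffixes."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            # str.endswith accepts a tuple of candidate endings.
            if name.endswith(suffixes):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

Writing the result to translate_demo/i18n/filelist, one path per line, gives xargs the same input as before.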

> Another question however:
>
> Is request.environ['HTTP_ACCEPT_LANGUAGE'] the best way to get access
> to the languages acceptable to the browser?
>
> This is what I get when I execute the above:-
> en-GB,en;q=0.9,en-us;q=0.8,en-US;q=0.6,ar-AE;q=0.5,ar;q=0.4,en-gb;q=0.3,en;q=0.1
>
> Is there a further breakdown or list or should I parse the line manually?
>
> request.environ.languages returning a list would be cool.
>
> The list for the above might look like
> ['en-GB','en','en-us','en-US','ar-AE','ar','en-gb']
> leaving the duplicate 'en' off the end.

Believe it or not you are in luck again. Ben has just implemented this.
If you upgrade to the latest paste and pylons dev you should be able to
access all the languages as request.languages.

One other thing I implemented yesterday, but have yet to test properly
and write up, was language fallbacks, so that if a word doesn't exist in
one catalog you can look it up in a fallback or the source instead. This
means you can set up fallbacks for all the languages in
request.languages. You'll need Pylons dev again, but here's how it works:

from helloworld.lib.base import *
from pylons.i18n.translation import add_fallback

class HelloController(BaseController):
    def index(self):
        h.set_lang('en')
        add_fallback('es')
        return Response(_('Hello')+' '+_('World')+_('!'))

If "Hello" is in the "en" .mo file as "Hi", "World" is only in "es" as
"Mundo", and none of the catalogs define "!", you will get the English,
then the Spanish, then the source words. So the message would be "Hi Mundo!".
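
The same lookup order can be tried out with the standard library alone. The sketch below does not use the Pylons add_fallback helper; it uses gettext.NullTranslations.add_fallback from Python 3's gettext module, and the DictTranslations class is a made-up stand-in for real .mo catalogs. It assumes the Pylons helper chains catalogs in a comparable way, which the thread does not confirm.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Toy catalog backed by a dict instead of a compiled .mo file."""

    def __init__(self, messages):
        super().__init__()
        self._messages = messages

    def gettext(self, message):
        if message in self._messages:
            return self._messages[message]
        # Not found here: NullTranslations asks the fallback chain and
        # returns the source string if no fallback has it either.
        return super().gettext(message)


en = DictTranslations({'Hello': 'Hi'})     # "en" only knows "Hello"
es = DictTranslations({'World': 'Mundo'})  # "es" only knows "World"
en.add_fallback(es)

_ = en.gettext
greeting = _('Hello') + ' ' + _('World') + _('!')
print(greeting)  # prints: Hi Mundo!
```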

Cheers,

James

Max Ischenko

Feb 2, 2007, 12:00:02 PM
to pylons-discuss

On Feb 2, 3:39 pm, James Gardner <j...@pythonweb.org> wrote:
> Believe it or not you are in luck again. Ben has just implemented this.
> If you upgrade to the latest paste and pylons dev you should be able to
> access all the languages as request.languages.

Interesting. Look forward to seeing 0.9.5 release.

> One other thing I implemented yesterday, but have yet to test properly
> and write up, was language fallbacks so that if a word doesn't exist in
> one catalog you can look it up in a fallback or the source instead. This
> means you can setup fallbacks for all the languages in
> request.languages. You'll need Pylons dev again but here's how it works:
>
> from helloworld.lib.base import *
> from pylons.i18n.translation import add_fallback
>
> class HelloController(BaseController):
>     def index(self):
>         h.set_lang('en')
>         add_fallback('es')
>         return Response(_('Hello')+' '+_('World')+_('!'))
>
> If "Hello" is in the "en" .mo file as "Hi", "World" is only in "es" as
> "Mundo" and none of the catalogs defined "!" you will get the english,
> spanish then the source words. So the message would be "Hi Mundo!".

Is there a way to set this up in the config file?

E.g.: "lang = ru en"

If not, how can I set up this fallback globally? Add this to
BaseController?

Max.

James Gardner

Feb 2, 2007, 12:30:55 PM
to pylons-...@googlegroups.com
Hi Max,

>> from helloworld.lib.base import *
>> from pylons.i18n.translation import add_fallback
>>
>> class HelloController(BaseController):
>>     def index(self):
>>         h.set_lang('en')
>>         add_fallback('es')
>>         return Response(_('Hello')+' '+_('World')+_('!'))
>>
>> If "Hello" is in the "en" .mo file as "Hi", "World" is only in "es" as
>> "Mundo" and none of the catalogs defined "!" you will get the english,
>> spanish then the source words. So the message would be "Hi Mundo!".
>
> Is there a way to set this up in the config file?
>
> E.g.: "lang = ru en"

Well, you can do lang=en already to specify the main language, but it's a
good idea to be able to specify the fallbacks too. I'll implement that.

> If not, how can I setup this fallback globally? Add this to
> BaseController?

The best place is probably the __init__() method of the Globals object in
lib/app_globals.py, actually. I'd have thought it was best not to do it
in a controller action like I described above, otherwise you will be
adding numerous fallbacks. I'll also write some code so that fallbacks
are only added if they aren't in place already.
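
Until something like that exists, a space-separated setting could be split by hand. The sketch below is hypothetical throughout: the "lang = ru en" form is only Max's proposal, not an existing Pylons option, and parse_lang_setting is an invented helper. It returns the main language and the fallbacks, dropping repeats so that no fallback is registered twice.

```python
def parse_lang_setting(value):
    """Split a setting such as "ru en" into (primary, fallbacks), dropping repeats."""
    ordered = []
    for lang in value.split():
        if lang not in ordered:
            ordered.append(lang)
    if not ordered:
        raise ValueError('no language given')
    return ordered[0], ordered[1:]


primary, fallbacks = parse_lang_setting('ru en')
print(primary, fallbacks)
```

The primary value would go to h.set_lang() and each fallback to add_fallback(), once at startup rather than per request.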

Cheers,

James

Damjan

Feb 2, 2007, 2:48:43 PM
to pylons-discuss
While on the topic ... can someone take a look at
http://routes.groovie.org/trac/routes/ticket/37

James Gardner

Feb 2, 2007, 3:20:20 PM
to pylons-...@googlegroups.com
Hi Damjan,

Damjan wrote:
> While on the topic ... can someone take a look at
> http://routes.groovie.org/trac/routes/ticket/37

This is fixed in the latest routes. Try:

easy_install -U "routes==dev"

I've closed the ticket.

Cheers,

James

Damjan

Feb 2, 2007, 3:51:35 PM
to pylons-discuss
Thanks, it works... I also notice that with this version of Routes
(1.6.3dev-r325) controller arguments are unicode now.

For example, I have

def show(self, pagename):
    ...

pagename was a byte string before, but now it's unicode... which is
great.

Shannon -jj Behrens

Feb 2, 2007, 5:01:29 PM
to pylons-...@googlegroups.com
On 2/1/07, Uwe Feldtmann <u...@microshare.com.au> wrote:
>
> Hi James.
>
> James Gardner wrote:
> > The translation should occur at run time so this shouldn't be a problem.
> > How are you doing the translation?
> >
> On closer inspection it doesn't seem to be a problem. I was thinking
> that the pre-compiled templates were what was being sent to the browser.
> My mistake.
> > If you are using the Pylons _() function in the template everything
> > should be fine surely?
> >
> All is translating fine using _() although it would be nice if there was
> some quick way to get all the strings from a template.

If you're using Mako, then the answer is easy. Just compile the
templates down to Python and then run xgettext on the Python.

> Another question however:
>
> Is request.environ['HTTP_ACCEPT_LANGUAGE'] the best way to get access
> to the languages acceptable to the browser?

There's some stuff that I submitted to the most recent version of
Paste for parsing that.

> This is what I get when I execute the above:-
> en-GB,en;q=0.9,en-us;q=0.8,en-US;q=0.6,ar-AE;q=0.5,ar;q=0.4,en-gb;q=0.3,en;q=0.1
>
> Is there a further breakdown or list or should I parse the line manually?
>
> request.environ.languages returning a list would be cool.
>
> The list for the above might look like
> ['en-GB','en','en-us','en-US','ar-AE','ar','en-gb']
> leaving the duplicate 'en' off the end.

Yep. I hope that code works for you.

Shannon -jj Behrens

Feb 2, 2007, 5:04:13 PM
to pylons-...@googlegroups.com

If I'm understanding the question correctly, the code that I sent to
Ben from Aquarium "ULTIMATE_FALLBACK", or something like that,
addresses this problem :)

Shannon -jj Behrens

Feb 2, 2007, 5:06:45 PM
to pylons-...@googlegroups.com
On 2/2/07, Max Ischenko <isch...@gmail.com> wrote:

By the way, in case it isn't obvious, it's very bad practice to break
up sentences into multiple translations this way. What if a language
translates "Hello World!" into "World Hello!"? In general, you should
keep sentences and phrases together.
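
A minimal sketch of the alternative, using only the standard library: the whole sentence is one message with a named placeholder, so a translator is free to reorder it. The greet function is hypothetical, and NullTranslations is used here as an identity translation so the example runs without any catalog.

```python
import gettext

# Identity translation for the demo; a real app would install its catalog.
_ = gettext.NullTranslations().gettext


def greet(name):
    # One msgid for the whole sentence; a translation may move %(name)s.
    return _('Hello %(name)s!') % {'name': name}


print(greet('World'))  # prints: Hello World!
```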

Happy Hacking!
-jj

--
http://jjinux.blogspot.com/

Ben Bangert

Feb 4, 2007, 9:26:21 PM
to pylons-...@googlegroups.com
On Feb 1, 2007, at 10:06 PM, Uwe Feldtmann wrote:

> Is request.environ['HTTP_ACCEPT_LANGUAGE'] the best way to get access
> to the languages acceptable to the browser?
>
> This is what I get when I execute the above:-
> en-GB,en;q=0.9,en-us;q=0.8,en-US;q=0.6,ar-AE;q=0.5,ar;q=0.4,en-gb;q=0.3,en;q=0.1
>
> Is there a further breakdown or list or should I parse the line
> manually?

Yep, the latest Paste now has a languages attribute on the request
object, and the latest dev Pylons can handle being passed a list of
languages (such as the list request.languages builds from the HTTP
Accept-Language header).

So you could do (with latest Paste and svn Pylons):
h.set_lang(request.languages)

Which will set the languages up with the ones from the browser.
Generally, a good scheme for setting the language is to let the user
choose and store it in the session and set it based on that. If no
language was chosen, then load it up with request.languages.
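
That scheme can be sketched independently of Pylons. In the sketch below the session is any dict-like object, the 'lang' key and the function name languages_for_request are assumptions made for illustration, and the returned list is what would be handed to h.set_lang().

```python
def languages_for_request(session, browser_languages, key='lang'):
    """Prefer the user's stored choice; otherwise use the browser's list."""
    chosen = session.get(key)
    if chosen:
        return [chosen]
    return list(browser_languages)


# A user who picked Arabic earlier keeps it, whatever the browser sends.
print(languages_for_request({'lang': 'ar'}, ['en-gb', 'en']))
# No stored choice: fall back to the browser's preferences.
print(languages_for_request({}, ['en-gb', 'en']))
```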

Cheers,
Ben

Uwe Feldtmann

Feb 4, 2007, 10:36:14 PM
to pylons-...@googlegroups.com
Thanks all. I've updated to the latest Devs of Pylons and Paste and
will test later today.

Ben Bangert wrote:
> Yep, the latest Paste now has a languages attribute on the request
> object, and the latest dev Pylons can handle being passed a list of
> languages (like the request.languages will yield from HTTP Accept
> Language).
>
> So you could do (with latest Paste and svn Pylons):
> h.set_lang(request.languages)
>
> Which will set the languages up with the ones from the browser.
> Generally, a good scheme for setting the language is to let the user
> choose and store it in the session and set it based on that. If no
> language was chosen, then load it up with request.languages.
>

What I want is to pick up the languages supported by the browser in the
order in which they are specified by the browser. It would appear that
Firefox changes the sequence of the languages based on the currently
selected locale - at least my copy does.

Once I have that list I want to check against languages the app supports
and if there is a match at the top of the list I'll switch to the
appropriate language. English will be the fallback and the user will be
able to set his/her preferred language thereafter.
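
One way to read that plan, as a sketch: walk the browser's list in order and take the first tag the application supports, trying the exact tag first and then its primary subtag. The name pick_language, the case-insensitive comparison, and the primary-subtag step are assumptions, not something Pylons or Paste provides.

```python
def pick_language(accepted, supported, default='en'):
    """Return the first supported language in the browser's order, else default."""
    by_lower = {tag.lower(): tag for tag in supported}
    for tag in accepted:
        tag = tag.lower()
        if tag in by_lower:
            return by_lower[tag]
        primary = tag.split('-')[0]  # 'ar-ae' -> 'ar'
        if primary in by_lower:
            return by_lower[primary]
    return default


print(pick_language(['ar-ae', 'ar', 'en-gb', 'en'], ['en', 'ar']))  # prints: ar
```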

I'll let you know how my testing goes.

Thanks again.

Uwe.

Damjan

Feb 4, 2007, 10:55:48 PM
to pylons-discuss

> What I want is to pick up the languages supported by the browser in the
> order in which they are specified by the browser. It would appear that
> Firefox changes the sequence of the languages based on the currently
> selected locale - at least my copy does.

Most probably ... since that string comes from the locale.
(intl.accept_languages in ./toolkit/chrome/global/intl.properties -
I'm helping with a Firefox localization and have the source here :) )

Uwe Feldtmann

Feb 4, 2007, 11:09:22 PM
to pylons-...@googlegroups.com
Thanks for that.

I've installed the latest Pylons and Paste (dev) and here's what I get.

print request.environ['HTTP_ACCEPT_LANGUAGE'] returns
en-GB,en;q=0.9,en-US;q=0.8,en-us;q=0.6,ar-AE;q=0.5,ar;q=0.4,en-gb;q=0.3,en;q=0.1

print request.languages returns
['en-gb', 'en', 'en-us']

It appears to be dropping the ar-AE and ar but only until I switch to Arabic in the browser.

This is what I get when switched to Arabic

print request.environ['HTTP_ACCEPT_LANGUAGE'] returns
ar-AE,ar;q=0.9,en-GB;q=0.8,en;q=0.6,en-US;q=0.5,en-us;q=0.4,en-gb;q=0.3,en;q=0.1

print request.languages returns
['ar-ae', 'ar', 'en-gb', 'en', 'en-us']

Is this to be expected or is it filtering the languages somehow?

Uwe.

David Smith

Feb 4, 2007, 11:30:46 PM
to public-pylons-discuss-...@ciao.gmane.org
Uwe Feldtmann <uwe-t63v2z5pxu/Ts8Xqbr2Lf...@public.gmane.org> writes:

How are you setting that header that way in your browser? Which
browser are you using? The q=0.X values are supposed to be
preference quotients that sum to 1.0 used for sorting. I didn't
think we'd have to handle the case of them summing to far more
than 1.0 or having so many duplicates, but then again it
wouldn't be hard to support and we should be that flexible.

And something else seems funny because I just tested your
original top accept language string against current paste and I
got back

['en', 'en-us', 'en-us', 'ar-ae', 'ar', 'en-gb', 'en']

More details about your setup, please
--
David D. Smith

Ben Bangert

Feb 4, 2007, 11:38:32 PM
to pylons-...@googlegroups.com
On Feb 4, 2007, at 8:09 PM, Uwe Feldtmann wrote:

> print request.environ['HTTP_ACCEPT_LANGUAGE'] returns
> en-GB,en;q=0.9,en-US;q=0.8,en-us;q=0.6,ar-AE;q=0.5,ar;q=0.4,en-gb;q=0.3,en;q=0.1
>
> print request.languages returns
> ['en-gb', 'en', 'en-us']
>
> It appears to be dropping the ar-AE and ar but only until I switch
> to Arabic in the browser.
>
> This is what I get when switched to Arabic
>
> print request.environ['HTTP_ACCEPT_LANGUAGE'] returns
> ar-AE,ar;q=0.9,en-GB;q=0.8,en;q=0.6,en-US;q=0.5,en-us;q=0.4,en-gb;q=0.3,en;q=0.1
>
> print request.languages returns
> ['ar-ae', 'ar', 'en-gb', 'en', 'en-us']
>
> Is this to be expected or is it filtering the languages somehow?

It is, for a good reason that is in the doc string but will need to
be mentioned elsewhere. It's assumed that you write your original
source code in the main 'fallback' language. From the docstring:

    The ``language`` default value is considered the fallback during i18n
    translations to ensure in odd cases that mixed languages don't occur should
    the ``language`` file contain the string but not another language in the
    accepted languages list. The ``language`` value only applies when getting
    a list of accepted languages from the HTTP Accept header.

    This behavior is duplicated from Aquarium, and may seem strange but is
    very useful. Normally, everything in the code is in "en-us". However,
    the "en-us" translation catalog is usually empty. If the user requests
    ``["en-us", "zh-cn"]`` and a translation isn't found for a string in
    "en-us", you don't want gettext to fallback to "zh-cn". You want it to
    just use the string itself. Hence, if a string isn't found in the
    ``language`` catalog, the string in the source code will be used.
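
Read that way, the filtering amounts to cutting the accepted list off once the fallback language is reached. The sketch below is not taken from the Paste source; truncate_at_fallback is an invented name, and the default of 'en-us' is assumed from the docstring. It reproduces the two request.languages results reported earlier in the thread when given the header entries in q-value order.

```python
def truncate_at_fallback(languages, fallback='en-us'):
    """Lowercase the tags, drop repeats, and stop after the fallback language."""
    result = []
    for tag in languages:
        tag = tag.lower()
        if tag not in result:
            result.append(tag)
        if tag == fallback:
            break  # nothing after the fallback can ever be used
    return result


english_first = ['en-GB', 'en', 'en-US', 'en-us', 'ar-AE', 'ar', 'en-gb', 'en']
arabic_first = ['ar-AE', 'ar', 'en-GB', 'en', 'en-US', 'en-us', 'en-gb', 'en']
print(truncate_at_fallback(english_first))  # ['en-gb', 'en', 'en-us']
print(truncate_at_fallback(arabic_first))   # ['ar-ae', 'ar', 'en-gb', 'en', 'en-us']
```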

If you would like to change the fallback language that's used, you
can change it in your environment.py like so:
return pylons.config.Config(myghty, map, paths, request_settings=dict(
    charset='UTF-8', errors='replace', language='ar-AE'))

Hopefully that makes sense? If not, I'm sure JJ will help shed some
light on it as well.

Cheers,
Ben

Uwe Feldtmann

Feb 4, 2007, 11:45:53 PM
to pylons-...@googlegroups.com
David Smith wrote:
> How are you setting that header that way in your browser? Which
> browser are you using? The q=0.X values are supposed to be
> preference quotients that sum to 1.0 used for sorting. I didn't
> think we'd have to handle the case of them summing to far more
> than 1.0 or having so many duplicates, but then again it
> wouldn't be hard to support and we should be that flexible.
>
> And something else seems funny because I just tested your
> original top accept language string against current paste and I
> got back
>
> ['en', 'en-us', 'en-us', 'ar-ae', 'ar', 'en-gb', 'en']
>
> More details about your setup, please
>
I upgraded using
easy_install -U http://pylonshq.com/svn/Pylons/trunk

Current installation.
Python 2.4
Pylons-0.9.5dev_r1793
Paste-1.2.1
PasteDeploy-1.1
PasteScript-1.1.1dev_r6177

Uwe.

Uwe Feldtmann

Feb 4, 2007, 11:50:04 PM
to pylons-...@googlegroups.com
David Smith wrote:
> How are you setting that header that way in your browser? Which
> browser are you using? The q=0.X values are supposed to be
> preference quotients that sum to 1.0 used for sorting. I didn't
> think we'd have to handle the case of them summing to far more
> than 1.0 or having so many duplicates, but then again it
> wouldn't be hard to support and we should be that flexible.
>
I'm not setting anything in the browser. I'm using the Quick Locale
Switcher Firefox plugin and only have a few of the languages ticked.
There doesn't appear to be a way to specify a sequence preference value.

Uwe.

Uwe Feldtmann

Feb 4, 2007, 11:56:02 PM
to pylons-...@googlegroups.com
Hi Ben.
That's great but it still filtered the ar and ar-AE entries. At least it did at my end. 
David got what I would have expected so I'll see what happens.

> This behavior is duplicated from Aquarium, and may seem strange but is
> very useful. Normally, everything in the code is in "en-us". However,
> the "en-us" translation catalog is usually empty. If the user requests
> ``["en-us", "zh-cn"]`` and a translation isn't found for a string in
> "en-us", you don't want gettext to fallback to "zh-cn". You want it to
> just use the string itself. Hence, if a string isn't found in the
> ``language`` catalog, the string in the source code will be used.
>
> If you would like to change the fallback language that's used, you
> can change it in your environment.py like so:
> return pylons.config.Config(myghty, map, paths, request_settings=dict(
>     charset='UTF-8', errors='replace', language='ar-AE'))
>
> Hopefully that makes sense? If not, I'm sure JJ will help shed some
> light on it as well.
  
All that makes sense and is what I want to do.  I think I may have some version mismatch at this end.  Perhaps with Paste. I'll take a look and try again.

Uwe.

Shannon -jj Behrens

Feb 5, 2007, 3:55:40 PM
to pylons-...@googlegroups.com

Your explanation seems just fine. I'm here to answer any additional questions.
