language code in subdomain


SamBull

Sep 23, 2008, 7:41:29 PM
to django-multilingual
Hey all,

I've been reading about django-localeurl and it looks pretty solid. I
have an idea for a variation on it where the locale is encoded as a
subdomain instead of as the first part of the path.

I'm preparing to branch localeurl to support this functionality but I
wanted to check in with you guys and see if this made sense.

I hope this is the right place to discuss this. I couldn't find a
discussion group for localeurl specifically.

I'm also nursing some wacky ideas for how to support translated URLs
(I mean, translations of the actual URLs so the words in them
correspond to the active locale). I haven't been able to find any
discussion on how people do this, so it's either a bad idea or a
really hard problem or both. Still, if anybody's interested, drop me a
line. I can share my crackpot schemes. :)

Sam

Joost Cassee

Sep 24, 2008, 5:16:46 AM
to django-mu...@googlegroups.com
On 24-09-08 01:41, SamBull wrote:

> I've been reading about django-localeurl and it looks pretty solid. I
> have an idea for a variation on it where the locale is encoded as a
> subdomain instead of as the first part of the path.
>
> I'm preparing to branch localeurl to support this functionality but I
> wanted to check in with you guys and see if this made sense.

As the author of localeurl: this makes a lot of sense. I would love to
see this functionality in the localeurl application itself. Would you
consider contributing it instead of branching?

> I hope this is the right place to discuss this. I couldn't find a
> discussion group for localeurl specifically.

There isn't one. I did set one up when I started the project, but I
thought there would be too little traffic. Maybe (if Marcin doesn't
mind) I could point to this list as 'the' discussion list.

> I'm also nursing some wacky ideas for how to support translated URLs
> (I mean, translations of the actual URLs so the words in them
> correspond to the active locale). I haven't been able to find any
> discussion on how people do this, so it's either a bad idea or a
> really hard problem or both. Still, if anybody's interested, drop me a
> line. I can share my crackpot schemes. :)

Sounds interesting, what do you mean exactly? Something like
'category/1' versus 'kategorie/1'?


Regards,

Joost

--
Joost Cassee
http://joost.cassee.net


Marcin Kaszynski

Sep 24, 2008, 2:17:51 PM
to django-multilingual
Hi,

On 24 Sep, 11:16, Joost Cassee <jo...@cassee.net> wrote:
> > I hope this is the right place to discuss this. I couldn't find a
> > discussion group for localeurl specifically.
>
> There isn't one. I did set-up one when I started the project, but I
> thought there would be too little traffic. Maybe (if Marcin doesn't
> mind) I could point to this list as 'the' discussion list.

Of course, feel free to :)

> > I'm also nursing some wacky ideas for how to support translated URLs
> > (I mean, translations of the actual URLs so the words in them
> > correspond to the active locale). I haven't been able to find any
> > discussion on how people do this, so it's either a bad idea or a
> > really hard problem or both. Still, if anybody's interested, drop me a
> > line. I can share my crackpot schemes. :)
>
> Sounds interesting, what do you mean exactly? Something like
> 'category/1' versus 'kategorie/1'?

It does sound interesting. Sam: please, do share :)

Regards,
-mk

Joost Cassee

Sep 24, 2008, 3:46:09 PM
to django-mu...@googlegroups.com
On 24-09-08 01:41, SamBull wrote:

> I'm also nursing some wacky ideas for how to support translated URLs
> (I mean, translations of the actual URLs so the words in them
> correspond to the active locale). I haven't been able to find any
> discussion on how people do this, so it's either a bad idea or a
> really hard problem or both. Still, if anybody's interested, drop me a
> line. I can share my crackpot schemes. :)

What I really dislike about localeurl is that it requires
monkey-patching urlresolvers.reverse(). Malcolm Tredinnick dropped a
remark the other day on django-developers:

On 17 sep, 02:14, Malcolm Tredinnick <malc...@pointy-stick.com> wrote:

> You can already put any object you like that has a resolve() method into
> urlpatterns(). That object can see every single pattern that passes
> through if it is matched for the pattern ''. We have to have *some* kind of
> root object and it happens to be RegexURLResolver.

So here is my idea: localeurl could be based on some class pretending to
be a RegexPattern / RegexURLResolver combination. Not sure yet how that
would work, but this would fit in with other special URL mangling
functionality.

This could work something like this:

urls.py:
urlpatterns = patterns(
    localepatterns(
        url(...),
    ),
    url(...),
)

Locale dependent paths would go into localepatterns. This would also
make the LOCALE_INDEPENDENT_PATHS setting obsolete.
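
A minimal sketch of what such an object might look like, written in plain Python so it runs without Django. The class name LocalePatterns, the (regex, view name) pairs standing in for url(...), and the idea of passing the locale to the view as a keyword argument are all assumptions made for illustration; none of this is existing localeurl or Django API.

```python
import re


class LocalePatterns:
    """Hypothetical resolver-like object: strips a locale prefix from the
    path, then tries the wrapped patterns. Not part of localeurl or Django."""

    def __init__(self, locales, patterns):
        self.locales = set(locales)
        # Each pattern is a (regex, view_name) pair, standing in for url(...).
        self.patterns = [(re.compile(rx), view) for rx, view in patterns]

    def resolve(self, path):
        # Expect paths like "nl/category/1/"; anything else is not ours.
        locale, sep, rest = path.partition('/')
        if not sep or locale not in self.locales:
            return None
        for regex, view in self.patterns:
            match = regex.match(rest)
            if match:
                kwargs = match.groupdict()
                kwargs['locale'] = locale
                return view, kwargs
        return None


resolver = LocalePatterns(
    ['en', 'nl'],
    [(r'^category/(?P<pk>\d+)/$', 'category_detail')],
)
```

Paths without a known locale prefix simply fall through (resolve() returns None), which is roughly how locale-independent paths would be handled without a separate setting.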


SamBull

Sep 24, 2008, 5:52:25 PM
to django-multilingual
> What I really dislike about localeurl is that it requires
> monkey-patching urlresolvers.reverse().

I agree.

> So here is my idea: localeurl could be based on some class pretending to
> be a RegexPattern / RegexURLResolver combination. Not sure yet how that
> would work, but this would fit in with other special URL mangling
> functionality.
>
> This could work something like this:
>
> urls.py:
> urlpatterns = patterns(
>     localepatterns(
>         url(...),
>     ),
>     url(...),
> )

I think something like this makes more sense and provides cleaner
code.

It also dovetails with what I'd like to do with translatable URLs.

To answer your earlier question, yes, it would be allowing a resource
to be accessed via either "/category/1" or "/kategorie/1", depending
on the locale of the requester.

My friends and I here at work were talking about this earlier today
and it's a tricky issue. As far as I can tell there's no agreed-upon
best practice for how the URLs should work on a multi-language site.

Here are my basic ideals:
1. I don't want to be reminded of the language I'm reading a site in.
I already know that I'm reading it in English, so it's redundant
information to show it in the URL.
2. The structure of the site should be presented to me entirely in my
language or not at all. I don't want to visit the English version of a
German website and have to suffer German words in the URLs. Of course,
I assume this is something the non-English-speaking majority out there
has to suffer a lot more than I do.
3. Every resource on a page should have a URL that represents it. This
conflicts with #1 because it means the language will probably have to
appear in at least some of the URLs because there are no other
telltales to distinguish the URL in language A from the URL in language B.
(The home page of a site is the most obvious example of this.)

The best solution I can think of, from my point of view, stores the
language code in a subdomain the way wikipedia does. Attempting to
load the site without a language subdomain either prompts you to choose
a language or chooses one for you the way Django already does and
redirects you to the appropriate subdomain.

URLs should use only the language of the active locale. All the words of
the URL, whether they come from the URL pattern or from the slug of an
object, should be locale-appropriate.

Here are the URLs that point to the English and German versions of some
imaginary resource:
en.mysite.com/category/frogs/
de.mysite.com/kategorie/frösche/ (An interesting side-question: In
German, are nouns capitalized when they appear in URLs?)

Implementing this requires two mechanisms:
1: A modified version of localeurl that sets the request's language
based on the subdomain
2: A modified version of Django's URL resolver that can allow a URL
to have one or more translations

I think swapping "/en/..." for "en...." will be fairly easy, but I
know there will probably be issues. I haven't taken a crack at
actually writing this yet. I'll let you know what problems I have.
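
For the first mechanism, the core step is probably just reading the first label of the Host header. A rough sketch, assuming a fixed set of supported locales and a helper name (language_from_host) that is made up here; a real middleware would also have to activate the language for the request:

```python
SUPPORTED_LANGUAGES = {'en', 'de'}  # assumed set of locales for the example


def language_from_host(host, supported=SUPPORTED_LANGUAGES):
    """Return the language code encoded as the first label of the host,
    or None when the host is locale-neutral (e.g. www.mysite.com)."""
    hostname = host.split(':', 1)[0].lower()  # drop an optional port
    first_label = hostname.split('.', 1)[0]
    return first_label if first_label in supported else None
```

A None result is the case where the site would fall back to prompting or to Django's usual language detection.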

My idea for multi language URLs is to use the gettext system:

from django.conf.urls.defaults import patterns
from django.utils.translation import ugettext as _

from localeurl.patterns import turl  # proposed helper, doesn't exist yet
...

urlpatterns = patterns('',
    turl(_(r'^category/(?P<slug>[-\w]+)/$'), 'path.to.view',
         extras_dict, name="example_url"),
    ...
)

Every URL pattern defined in this way would turn up in the django.po
file and you could define the equivalents in your other locales there.
I have an idea for how I'd like to implement "turl" as a locale-aware
version of URL, but I'll write more about that another time. The idea
is just that even though the patterns in the urls.py files will all
probably be in English, the URLs being matched against will be the
ones for the current locale.
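
To make the matching step concrete, here is a small sketch in plain Python. The CATALOGS dict stands in for the per-locale django.po files, and translate_pattern and resolve are hypothetical names; this is not how "turl" is implemented, only one way the lookup could behave.

```python
import re

# Stand-in for the per-locale django.po catalogs: msgid -> msgstr.
CATALOGS = {
    'de': {r'^category/(?P<slug>[-\w]+)/$': r'^kategorie/(?P<slug>[-\w]+)/$'},
}


def translate_pattern(pattern, locale):
    """Look up the pattern like a gettext msgid; fall back to the original."""
    return CATALOGS.get(locale, {}).get(pattern, pattern)


def resolve(path, locale, patterns):
    """Match path against the patterns as translated for the given locale."""
    for pattern, view in patterns:
        match = re.match(translate_pattern(pattern, locale), path)
        if match:
            return view, match.groupdict()
    return None


PATTERNS = [(r'^category/(?P<slug>[-\w]+)/$', 'category_view')]
```

Note that the translation is looked up per request here. A pattern translated once at import time would be fixed to whatever language was active when urls.py was loaded, so some form of lazy lookup seems necessary.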

I hadn't seen Malcolm Tredinnick's comment before. This sounds like
the smartest way to build the thing.

There are almost certainly issues with this part of my solution as
well. Please tell me about them as you see them!

The biggest issue has to do with a fourth ideal which I didn't list
above: Every resource with a URL should know what its equivalent is
in every other available locale (i.e. There should be a "switch
language" widget on every page of the site that does The Right
Thing(tm)). This isn't so bad for resources that are for specific
Model instances. That's what django-multilingual is all about, right?
The problem is dealing with resources that are lists of objects, or in
some other way don't correspond neatly to a single object in the db.
I'll write more about that in a follow-up post. Right now I'm out of
time to write this, so I'm done!
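
For the easy case (a single object with a translated slug), the widget could be fed by something like the sketch below. The HOSTS and PREFIXES tables and the equivalent_urls helper are invented for the example; in practice the slugs would come from translated model fields.

```python
# Hypothetical per-locale settings for the example site.
HOSTS = {'en': 'en.mysite.com', 'de': 'de.mysite.com'}
PREFIXES = {'en': 'category', 'de': 'kategorie'}


def equivalent_urls(slugs_by_locale):
    """Build the URL of the same resource in every locale that has a slug."""
    return {
        locale: 'http://%s/%s/%s/' % (HOSTS[locale], PREFIXES[locale], slug)
        for locale, slug in slugs_by_locale.items()
    }


frogs = {'en': 'frogs', 'de': 'frösche'}
links = equivalent_urls(frogs)
```

This does nothing for list pages or other resources that have no single object behind them, which is the hard part.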

Let me know what you think. I hope this is coherent. I wrote it over a
few different sittings throughout the day.

Cheers,

Sam

Nicola Larosa

Sep 25, 2008, 3:22:38 AM
to django-mu...@googlegroups.com
SamBull wrote:
> Here are my basic ideals:
> 1. I don't want to be reminded of the language I'm reading a site in.
> I already know that I'm reading it in English, so it's redundant
> information to show it in the URL

Agreed: in *any* part of the URL, though. :-)


> 2. The structure of the site should be presented to me entirely in my
> language or not at all. I don't want to visit the English version of a
> German website and have to suffer German words in the URLs. Of course,
> I assume this is something the non-English-speaking majority out there
> has to suffer a lot more than I do.

Again, agreed: the keyword is "entirely".


> 3. Every resource on a page should have a URL that represents it. This
> conflicts with #1 because it means the language will probably have to
> appear in at least some of the URLs because there are no other
> telltales to distinguish the URL in language A from the URL in language B
> (The home page of a site is the most obvious example of this)

Not necessarily, see below.


> The best solution I can think of, from my point of view, stores the
> language code in a subdomain the way wikipedia does. Attempting to
> load the site without a language subdomain either prompts you to choose
> a language or chooses one for you the way Django already does and
> redirects you to the appropriate subdomain.

I can think of another one: use a different domain name altogether,
without specifying the language at all.


> URLs should use only the language of the active locale. All the words of
> the URL, whether they come from the URL pattern or from the slug of an
> object, should be locale-appropriate.

Again, agreed, with emphasis on "All". :-)


> Here are the URLs that point to the English and German versions of some
> imaginary resource:
> en.mysite.com/category/frogs/
> de.mysite.com/kategorie/frösche/

But notice how "mysite" and "com" are still English language words. :-)

I'd go with these:

www.mysite.com/category/frogs/
www.meinewebsite.de/kategorie/frosche/ (the umlaut should be escaped, or
substituted)
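
The two options could look roughly like this in Python. The function names are made up for the example; quote() and unicodedata.normalize() are standard library calls.

```python
import unicodedata
from urllib.parse import quote


def escape_slug(slug):
    """Percent-encode the UTF-8 bytes of any non-ASCII character."""
    return quote(slug)


def substitute_slug(slug):
    """Strip diacritics by decomposing and dropping the combining marks."""
    decomposed = unicodedata.normalize('NFKD', slug)
    return decomposed.encode('ascii', 'ignore').decode('ascii')
```

Here escape_slug('frösche') gives 'fr%C3%B6sche' and substitute_slug('frösche') gives 'frosche'. A German reader might expect the transliteration 'froesche' instead, which would need a per-language substitution table rather than plain stripping.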

Yes, I know the whole country vs. language distinction: alas, top level
domains denote countries, that's what we have to work with.

Technically speaking, adding third level domains implies changing DNS
anyway, and domain names are fairly inexpensive these days.


> (An interesting side-question: In
> German, are nouns capitalized when they appear in URLs?)

Best practices suggest that URL elements should not be capitalized, for
easier typing and spelling. Where that criterion conflicts with language
rules, I would still stick to it, but that's a fairly subjective choice.


> The biggest issue has to do with a fourth ideal which I didn't list
> above: Every resource with a URL should know what its equivalent is
> in every other available locale (i.e. There should be a "switch
> language" widget on every page of the site that does The Right
> Thing(tm)). This isn't so bad for resources that are for specific
> Model instances. That's what django-multilingual is all about, right?
> The problem is dealing with resources that are lists of objects, or in
> some other way don't correspond neatly to a single object in the db.

Yes, that's a problem that should be solved too.


Let's also make use of content negotiation, where available. The browsers
have a preferred language setting, that makes them send a header with the
user preferred language in it.

However, the user of an English-preferring browser may actually want to
read the German version of a site, and we should allow for that. The
"switch language" widget should override content negotiation, possibly
via a cookie or, uglier, via a query string param.
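
A sketch of that precedence, assuming the widget stores its choice in a cookie whose value is passed in as cookie_value. The header parsing is deliberately simplified (it reads only the q parameter) and the function names are invented for the example.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    weighted = []
    for part in header.split(','):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(';')
        quality = 1.0
        params = params.strip()
        if params.startswith('q='):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            weighted.append((quality, tag.strip().lower()))
    # sorted() is stable, so tags with equal weight keep header order.
    return [tag for quality, tag in sorted(weighted, key=lambda w: -w[0])]


def choose_language(accept_header, cookie_value, supported, default='en'):
    """An explicit choice stored in a cookie wins over the browser header."""
    if cookie_value in supported:
        return cookie_value
    for tag in parse_accept_language(accept_header):
        base = tag.split('-', 1)[0]  # treat "de-at" as "de"
        if base in supported:
            return base
    return default
```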

--
Nicola Larosa - http://www.tekNico.net/

TV is all about instilling a desire in you to buy something. Instead,
why not take stock of the good things you already have, and say
"thanks." Then say "no thanks" to the nonsense on TV and let the
falling ratings work the change. - John Schettler, January 2008

SamBull

Sep 25, 2008, 11:55:05 AM
to django-multilingual
> > Here are the URLs that point to the English and German versions of some
> > imaginary resource:
> > en.mysite.com/category/frogs/
> > de.mysite.com/kategorie/frösche/
>
> But notice how "mysite" and "com" are still English language words. :-)
>
> I'd go with these:
>
> www.mysite.com/category/frogs/
> www.meinewebsite.de/kategorie/frosche/ (the umlaut should be escaped, or substituted)

That's a fair point. I don't see why a domain-based language selector
couldn't support any arbitrary variation of domains, be it via
subdomains or TLDs or via an entirely different domain. I wonder if it
would make sense to use Django's contrib.sites app here.

A side note: I've heard different things about non-ascii characters in
URLs, but I've certainly seen some high profile usage of non-ascii
characters (see: http://ja.wikipedia.org/wiki/国際化と地域化, the Japanese
wikipedia entry for internationalization :) ) and if it's supported I
hope we can use it, because it feels like we wouldn't be able to
represent more than a handful of locales properly without it.

> Technically speaking, adding third level domains implies changing DNS
> anyway, and domain names are fairly inexpensive these days.

I agree. I don't think supporting any scheme for varying domains will
be much harder than implementing specific support for one scheme
(locale codes as subdomains). Each project will need to select the
domain naming scheme based on their specific requirements. One reason
(and a whole other thorny aspect to this stuff) is that Google treats
multiple subdomains of a site as aspects of that one site, whereas I
assume sites with different TLDs will never be considered aspects of
the same site. I really wish I knew more about how to best represent a
multi-language site to Google.

> Let's also make use of content negotiation, where available. The browsers
> have a preferred language setting, that makes them send a header with the
> user preferred language in it.
>
> However, the user of an English-preferring browser may actually want to
> read the German version of a site, and we should allow for that. The
> "switch language" widget should override content negotiation, possibly
> via a cookie or, uglier, via a query string param.

The behaviour I was imagining was that you would have a locale-neutral
domain for the page (say, www.mysite.com, in my example). If a user
arrives on that page the site would try to determine their language
the same way Django's regular locale system does and then forward the
user to the locale-specific domain (say, en.mysite.com). The language
selection widget would always be there, letting you jump to the
other-locale equivalents of whatever page you're on. This does override the
automatic content negotiation because that would only ever be applied
if you arrived at the locale-neutral domain.

So, for example, I arrive at www.mysite.com. I'm using my German
friend's computer so the content negotiation system determines that
I'd like to see the site in the German locale and redirects me to
de.mysite.com (or meinesite.de, or whatever it is). I don't speak
German so I look for the language/locale selection widget and choose
English. I get sent to en.mysite.com (or whatever the English locale's
domain is). The site's not going to try and send me back to the German
version, despite what my browser is telling it because I'm no longer
on the locale-neutral domain so the locale is entirely determined by
the domain itself.
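
The whole flow boils down to a small decision function. In this sketch LOCALE_HOSTS, NEUTRAL_HOST, the English fallback, and the route name are all assumptions for the example; negotiated_language is whatever the regular detection came up with.

```python
# Assumed mapping from locale to its own host; any scheme would do here
# (subdomains, country TLDs, or unrelated domains).
LOCALE_HOSTS = {'en': 'en.mysite.com', 'de': 'de.mysite.com'}
NEUTRAL_HOST = 'www.mysite.com'


def route(host, negotiated_language):
    """Return (language, redirect_host). redirect_host is None when the
    request should be served as is."""
    for language, locale_host in LOCALE_HOSTS.items():
        if host == locale_host:
            # On a locale-specific host the domain alone decides.
            return language, None
    if host == NEUTRAL_HOST and negotiated_language in LOCALE_HOSTS:
        return negotiated_language, LOCALE_HOSTS[negotiated_language]
    # Unknown host or unsupported language: fall back to English (assumed).
    return 'en', LOCALE_HOSTS['en']
```

In the example above, route('www.mysite.com', 'de') sends me to de.mysite.com, and once I've picked English, route('en.mysite.com', 'de') keeps serving English no matter what the browser says.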

I know this inspires an obvious question: how do we get new users to
arrive at the locale-neutral domain? I don't know. I think my best
solution would be to get Google to index that instead of the normal
locale-specific domain for the site's primary language, but that could
easily be tricky to do technically and without making Google
suspicious, for all I know.

Thanks for all the feedback. I feel like I'm getting a lot closer to
being able to define at least my own personal best practice for this.
I hope it's at least interesting to others.

Cheers,

Sam

Joost Cassee

Sep 25, 2008, 1:18:33 PM
to django-mu...@googlegroups.com
On 25/09/2008 17:55, SamBull wrote:

> That's a fair point. I don't see why a domain-based language selector
> couldn't support any arbitrary variation of domains, be it via
> subdomains or TLDs or via an entirely different domain. I wonder if it
> would make sense to use Django's contrib.sites app here.

I was also wondering what the interaction with contrib.sites would be.

> I know this inspires an obvious question: how do we get new users to
> arrive at the locale-neutral domain?

From Google? Isn't that precisely what you would *not* want? If a user
searched for 'küche' you would want them to arrive at your German site.
Users would only come to the locale-neutral domain by typing it in.

> Thanks for all the feedback. I feel like I'm getting a lot closer to
> being able to define at least my own personal best practice for this.
> I hope it's at least interesting to others.

It is interesting and I am looking forward to your implementation. Would
you like to get svn access to develop something in a branch at the
django-localeurl project location?
