Handling international date format - ticket #4147


orestis

Apr 26, 2007, 3:25:40 PM
to Django I18N
I've searched and this hasn't been brought up before, so I think it's
time to discuss this. I'll post a copy of a ticket I've submitted:

http://code.djangoproject.com/ticket/4147
----------------------------------------
Django's handling of i18n is fairly good, but one major point that it
doesn't handle well (together with almost everything else out there)
is the formatting of dates.

The most obvious example is the display of the month names. While in
English nouns have only one case, in many other European
languages there are more, which have different spellings. For example,
in Greek, in order to be able to display full-word months and capture
all possible sentence formats, one needs three cases:

1. The subjective case (eg. en: January, 2007 - el: Ιανουάριος,
2007)
2. The possessive case (eg. en: 23rd of January - el: 23η
Ιανουαρίου)
3. The objective case (eg. en: Entries posted on January - el:
Δημοσιεύσεις που έγιναν τον Ιανουάριο)

I'm sure this is common in most European languages, but I'm not an
expert; please, everybody, comment on this.

To implement this in django, I suggest the following:

* Add MONTHS_POS, MONTHS_OBJ to django.utils.dates. These should
read "of January" and "on January" in English.
* Add a custom extension in django.utils.dateformat: Q for
MONTHS_POS, V for MONTHS_OBJ. Any available letter should do.
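The two bullets above can be sketched roughly as follows. This is only an illustration, not a patch against Django: format_date is a hypothetical stand-in for the real formatter, the tables hold January only (using the Greek forms from the examples above), and Q and V are simply the letters suggested in the ticket.

```python
import datetime

# Hypothetical per-case month tables, named as in the ticket.
# Only January is filled in; a real table would list all twelve months.
MONTHS = {1: "Ιανουάριος"}      # subjective case
MONTHS_POS = {1: "Ιανουαρίου"}  # possessive case
MONTHS_OBJ = {1: "Ιανουάριο"}   # objective case

# Format letter -> table it reads from. F is the existing full month name;
# Q and V are the new letters proposed in the ticket.
CASE_TABLES = {"F": MONTHS, "Q": MONTHS_POS, "V": MONTHS_OBJ}


def format_date(value, fmt):
    """Toy formatter: handles F, Q, V, j (day) and Y (year) only."""
    out = []
    for ch in fmt:
        if ch in CASE_TABLES:
            out.append(CASE_TABLES[ch][value.month])
        elif ch == "j":
            out.append(str(value.day))
        elif ch == "Y":
            out.append(str(value.year))
        else:
            # Anything else is copied through unchanged.
            out.append(ch)
    return "".join(out)
```

With these tables, format_date(datetime.date(2007, 1, 23), "j Q Y") gives "23 Ιανουαρίου 2007", while "F, Y" gives "Ιανουάριος, 2007".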

That's all. There is still an issue about the format 'S' that adds the
ordinals (1st, 2nd etc) but I don't know how other languages deal with
this.

I can submit a patch for this...
----------------------------------
To which malcolm replied:
----------------------------------
I haven't thought about this enough to really know if this is the
right approach or not. Please have some patience, you are only one of
many people requesting things be added to the code and you only opened
the ticket 24 hours ago.

Write a patch if you want, it can't hurt. However, I'm trying to think
of some way to do this that maybe doesn't involve creating a bunch of
new settings and format modifiers. It is very important that we also
keep things easy to write the code in the first place. Might be worth
having a discussion on django-i18n first before writing this. Tickets
aren't the best place to have a discussion about a feature.
----------------------------------


So I think I'll start the discussion here.

Could the translators that have similar issues with cases, genders
etc. point them out?

I'll post my thoughts on Greek in another post, to keep things clear.

orestis

Apr 26, 2007, 4:02:59 PM
to Django I18N
I'll push this a little further:

In order to handle international sites, one should be able to mark
date format strings as translatable. The ordering, punctuation and
wording of a date for a specific placement will not be the same for
each language, so essentially the date format string is something that
should change according to the locale, ergo it should be translatable.

So this points to having one date filter, that is changed to handle
translation strings, and since it will know about locales, it would be
able to pull custom extensions from the localflavor applications.

Thoughts on this?

orestis

Apr 26, 2007, 3:55:27 PM
to Django I18N
So, for Greek:

I'll stand by my choice of having more options in the
django.utils.dateformat.

But, since Malcolm seems reluctant to change a functionality so basic
and simple, maybe something can be put in localflavor, so that each
language/country can handle their own needs and peculiarities without
bothering everyone else.

Maybe a "ldate" or "localdate" filter that:

a) Tries to find an implementation in the current locale - localflavor
and passes the arguments there and
b) failing that, either gives an error message or returns the default
date format, using the current django.utils.dateformat

This way we free the project administrators from unnecessary decisions
about matters they really can't judge as well and leave the current
functionality working as before.

There could also be a setting like USE_I18N (or it could be the same
one) that decides which kind of filter is used. That would lead to a
bit more complex/magic behavior, but would gain in ease of use.

Radek Svarz

Apr 26, 2007, 4:11:22 PM
to Djang...@googlegroups.com
Hi,

from the Czech language point of view I see the same issues as
in Greek.

Radek

Radek Svarz

Apr 26, 2007, 4:57:10 PM
to Djang...@googlegroups.com
Orestis,

please, take into consideration several types of websites:

- English only (common for djangoproject admins)

- non-English using one different language (eg. local German website)

- several languages (eg. Czech as local language and English for
other visitors) - quite common in Europe

I believe that the implementation should be ready for the 3rd option,
ie. not tied to a single server-wide Django setting for choosing the
date format.

Radek


Malcolm Tredinnick

Apr 27, 2007, 3:25:37 AM
to Djang...@googlegroups.com
On Thu, 2007-04-26 at 19:55 +0000, orestis wrote:
> So, for Greek:
>
> I'll stand by my choice of having more options in the
> django.utils.dateformat.
>
> But, since Malcolm seems reluctant to change a functionality so basic
> and simple,

Hold on... you're putting words into my mouth here. :-)

As I wrote in the initial report, I simply don't have an opinion on a
preferred solution yet. No reluctance, no enthusiasm, just lots of
thinking about it. What I'm not prepared to do is rush to a decision. If
it takes us two weeks or a month to come up with a workable solution,
the world won't stop turning in the meantime.

This is a hard problem, not because there is a lack of solutions, but
because every solution has a potential downside and there are a number
of different audiences affected: translators, developers (both core and
third-party) and users.

I am completely in favour of solving this problem. Partly because the
majority of people on the planet have a first language that is not
English, so we have some obligation to them, and partly because I get a
kick out of being able to send friends in Sweden or China a screenshot
showing a fully translated, accurate page that blows away anything they
get from the closed source world.

> maybe something can be put in localflavor, so that each
> language/country can handle their own needs and peculiarities without
> bothering everyone else.
>
> Maybe a "ldate" or "localdate" filter that:
>
> a) Tries to find an implementation in the current locale - localflavor
> and passes the arguments there and
> b) failing that, either gives an error message or returns the default
> date format, using the current django.utils.dateformat
>
> This way we free the project administrators from unnecessary decisions
> about matters they really can't judge as well and leave the current
> functionality working as before.

So that we're all thinking about the same set of problems, let me lay
out the developer side of the issue here (as well as the perspective of
somebody who has done i18n development for quite a few years, so I have
some sympathy for the potential problems that arise for both translators
and developers):

(1) The fundamental reason this (date formats) is a problem at all is
because we are attempting to construct grammatical sentences out of
short fragments. A guiding principle in creating translatable strings is
to create complete strings as often as possible because constructing
sentences from fragments is *very* locale specific.

The reason we cannot use full string extracts here, of course, is
because these fragments go together to create every date in the year.
366 of them! Multiplied by two or three different forms. So we need to
work out a solution that is based on building from fragments.

(2) Ending and even whole word changes due to the role played by a word
in a sentence (object, subject, noun, adjective-form, ....) are common in
many languages, but not so much in English. Unfortunately, because of
the family tree of languages, the necessary changes vary a lot based on
locale. Even within the Indo-European languages there are a lot of
variations. Trying to catalogue all variations on our particular problem
phrases here might be challenging. Or it might not be that complicated
for the case at hand. This suggests that pushing more intelligence down
to the locale-specific functions is not a crazy approach (again, I have
no opinion yet, I'm just laying out the various sides of the problem).

(3) Adding lots of extra format markers to indicate objective,
subjective, dative, possessive and other sentence roles for nouns is
something that has been proposed, although I'm not sure if we will get a
clash between the role to assign to a word in different, quite separated
languages.

This does add to the burden of the author. I (as a software developer)
would now need to mark up my previously relatively simple date string in
a way that I probably won't be able to understand without reading a
reference page when I come to debug it. I can remember the common POSIX
date format strings, because I use them a lot. I cannot remember the PHP
ones (which we use in Django templates), so I have to look them up each
time. Multiplying the possibilities still further starts to get really
challenging. A PHP developer or web developer is probably going to have
a slightly different point of view (they will be more familiar with
Django's template date format strings), but they will also be dragged
into unfamiliar country when we add new format markers. Not a death blow
for this idea, but something to bear in mind.

There's also the side-effect that I -- and people like me -- probably
will not be able to get the markup correct for a date string on the
first one, two or six tries. That's pretty normal for complicated i18n
markup, though, and translators and other knowledgeable people can
usually help to fix the odd cases.

-- end of summary --

Please feel free to add any points that I may have missed here. I think
the above three are a summary of all the things floating around in my
head. I thought it was originally going to be about 10 points, but when
I merged all the similar ideas, there are only a few big issues.

So we need to find a way of allowing translators to mark up strings in
each of the forms required for their language and substituting in the
right value at localisation time. Ideally, without a huge impact on
content authors (although some impact is unavoidable, so let's not be
too restrictive). The solution should definitely allow for dynamic (on a
per-page request basis) changes in the language used to present the
page, although I don't think any of the suggestions so far have made
that impossible.

At the moment, I think it might be productive to keep hearing
suggestions. Basically, "no idea is too stupid" for a few days. I
suspect there might be a good idea that nobody has mentioned yet. How
about we leave this thread going for a little while to give people a
chance to think and then look at what we've got. I'm not going to
contribute much unless I see anything that is clearly technically
infeasible.

This is a hard problem. I do not know of any truly perfect solution in
other projects, either.

Regards,
Malcolm

orestis

Apr 27, 2007, 8:53:40 AM
to Django I18N
Malcolm: I didn't mean to put words in your mouth :) I'm sorry if I
offended you somehow.

I've been giving some more thought to this, have tried some
implementations and here is the summary:

a) As Radek said, this solution should target sites that have a
multilingual audience, since the single-language problem is quite
different and can be solved today without changes in the django code
(IMO).

b) I still think that translatable date format strings are the way to
go, since this approach allows a translator (who has the domain
knowledge) to translate the date as well as the rest of the site.

c) Also, it seems to me, as Malcolm said, that the complexities of the
problem should be pushed down to localflavors, so the maintainers (who
have the domain knowledge) can do their magic as needed. This will
need some changes in places of django that handle date formatting (and
humanize and possibly other things), in order to take the localflavors
into account, using a robust mechanism with fallbacks etc. I've
started to develop a proof of concept locally.

d) However, this still doesn't solve the problem of a user/developer
who doesn't have the domain knowledge. Django-admin springs to mind:
we can't expect the django developers to know all the peculiarities of
each country's preferences, so they can't go and translate the date
format string for each supported language. Possibly the community can
help on this, but for personal projects, it's a no go. So I step back
and speculate:

Most of the time, we don't *need* all these date format strings. We
basically show date and time in 3-4 forms:

1. the full date/time (eg. 22:56 Monday, April 18, 2007)
2. the abbreviated date/time (eg. 22:56 mon, 18 apr 2007)
3. the numerical date/time (eg. 22:56 4/18/2007)
4. ?

This thinking also matches the current approach, where you can define
in the settings this kind of formatting for different situations.
We can take this to the next step, by enriching and putting these
format strings as technical messages in the translation files. So now,
if I want to display a full date in, say, French, I can just specify
that I want the full date, and the date filter will find the preferred
locale, fetch the format string, apply the local dateformat and
present me a date in perfect French. All this without me knowing how
the French prefer to spell their dates, the position of the month, the
case of the months etc.
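As a rough sketch of that idea: the table below stands in for the translation files, and the format name "numeric", the NAMED_FORMATS table and format_named are all made up for illustration. It uses strftime syntax with numeric directives only, so it runs without Django and without depending on the system locale.

```python
import datetime

# Hypothetical per-language table of named formats. In the proposal these
# strings would live in each language's translation file instead.
NAMED_FORMATS = {
    "en": {"numeric": "%m/%d/%Y %H:%M"},
    "fr": {"numeric": "%d/%m/%Y %H:%M"},
}


def format_named(value, name, language, fallback="en"):
    """Format `value` with the named format of `language`.

    Falls back to the `fallback` language when the requested language,
    or the named format within it, is missing.
    """
    formats = NAMED_FORMATS.get(language, NAMED_FORMATS[fallback])
    fmt = formats.get(name, NAMED_FORMATS[fallback][name])
    return value.strftime(fmt)
```

The template author only asks for "numeric"; which order the day and month come in is decided by whoever maintains the table for that language.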

----

That's my thinking, please comment on it!

PS. I'll next post my thoughts about how to expand the date with
custom modifiers without affecting existing functionality.


orestis

Apr 27, 2007, 9:04:26 AM
to Django I18N
And here is my proposed mechanism for extending the date filter:

1. datefilter is invoked with a format string and a date object
2. if i18n is active, it tries to locate the active (per-request)
localflavor dateformat module
3. if there is no localflavor dateformat module, continue to parse the
date and format string as usual -> END
4. if there is, invoke the localflavor dateformat function. This
function takes a date and a format string, and returns a string that:
a) has filled in any locale specific peculiarities (gender, case
etc)
b) leaves alone any already defined placeholder characters
5. The date filter now parses the date with the new format string. ->
END

Example: Suppose we want to output a full date in Greek. The format
string for this would be:
"d (Fp) Y"
where d is the day, (Fp) is the possessive form for the month and Y is
the year. I used parentheses to make clear that this is a custom local
filter.
After invoking the Greek localflavor dateformat, which can understand
(Fp) - but not "d" or "Y" - the format string becomes:
"d Απριλίου Y"
and it is returned to the date filter. The date filter now fills in
"d" and "Y", but ignores the "Απριλίου" part and returns the formatted
date:
"27 Απριλίου 2007"
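The example can be traced in a few lines of standalone Python. This is a sketch under assumptions, not Django code: greek_prepass, LOCAL_PREPASS, base_format and date_filter are hypothetical names, the month table holds April only, and the base formatter knows just "d" and "Y".

```python
import datetime

# Hypothetical Greek possessive month names; only April is filled in.
GREEK_MONTHS_POS = {4: "Απριλίου"}


def greek_prepass(value, fmt):
    """Step 4: resolve the custom "(Fp)" marker, leave the rest alone."""
    return fmt.replace("(Fp)", GREEK_MONTHS_POS[value.month])


# Hypothetical registry: language code -> locale-specific pre-pass.
LOCAL_PREPASS = {"el": greek_prepass}

# Stand-in for the standard placeholders (only "d" and "Y" here).
STANDARD = {
    "d": lambda v: "%02d" % v.day,  # zero-padded day of month
    "Y": lambda v: str(v.year),     # four-digit year
}


def base_format(value, fmt):
    """Step 5: fill standard placeholders, copy other characters through."""
    return "".join(STANDARD[ch](value) if ch in STANDARD else ch for ch in fmt)


def date_filter(value, fmt, language=None):
    """Steps 1-3: run the locale pre-pass if one exists, then format."""
    prepass = LOCAL_PREPASS.get(language)
    if prepass is not None:
        fmt = prepass(value, fmt)
    return base_format(value, fmt)
```

Here date_filter(datetime.date(2007, 4, 27), "d (Fp) Y", "el") returns "27 Απριλίου 2007". The Greek letters survive the second pass only because none of them is a format character. A month name written in Latin letters would be reinterpreted by the second pass, so a real implementation would presumably have to escape whatever the pre-pass substitutes.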

I don't see anything flawed in this, so please everyone give it a
thought.

An added plus is that the translation maintainers can add both the
general purpose dateformat strings and the localflavor module that
implements them, so nothing is loose.


Radek Svarz

Apr 27, 2007, 1:02:20 PM
to Djang...@googlegroups.com
Hi,

as I see, you have thought a lot about the presentation of local flavor dates.

Please take the entry and validation of dates into consideration as well.

Radek
