#9974: Allow pluralize filter to consider zero as a singular form

David Larlet

Jan 6, 2009, 4:26:32 AM
to Djang...@googlegroups.com
Hello,

I've just submitted a ticket about an issue I've had for ages, but
the decision mainly depends on who is affected by it. Are there
other languages which treat zero as a singular form?

Thanks,
David

Malcolm Tredinnick

Jan 6, 2009, 7:23:07 PM
to Djang...@googlegroups.com

I think that ticket is a bit misguided. The pluralize filter is designed
to work for English only, at the moment. It doesn't support multiple
plural forms -- languages with more than one plural form for quantities
greater than one -- for example. Adding special cases for things like
languages where zero is singular leads into "make it support every i18n
plural form under the sun" and things get really complicated really
quickly.

You need a different approach in that case. There's nothing wrong with
the goal of supporting plural forms properly, but that's not the
pluralize filter. Piecemeal patches to that to support the French but
not the Poles will lead to tears. We don't want crying here. I'm
thinking about it, but I'm inclined to wontfix that ticket at the moment
in favour of somebody coming up with a more comprehensive solution as a
separate filter.

Regards,
Malcolm


David Larlet

Jan 6, 2009, 7:37:33 PM
to Djang...@googlegroups.com

On 7 Jan 2009, at 01:23, Malcolm Tredinnick wrote:

I understand perfectly, and that's why I asked for feedback and
proposed, in the ticket's description, creating a new filter (per
language) in contrib.localflavor, though that depends on needs. I agree
that the title of the ticket and of this email doesn't reflect that
solution, but I can propose a pluralize_fr filter, for instance, if you
prefer this approach (and rename the ticket instead of marking it
wontfix)?

In fact, I need feedback because I don't know whether this only
concerns French speakers or whether more languages are affected by this
added "s". In the latter case it'll be more complicated to find the
right place in localflavor.

Regards,
David

Malcolm Tredinnick

Jan 6, 2009, 8:54:36 PM
to Djang...@googlegroups.com
Hi David,

On Wed, 2009-01-07 at 01:37 +0100, David Larlet wrote:
[...]


> I understand perfectly, and that's why I asked for feedback and
> proposed, in the ticket's description, creating a new filter (per
> language) in contrib.localflavor, though that depends on needs. I agree
> that the title of the ticket and of this email doesn't reflect that
> solution, but I can propose a pluralize_fr filter, for instance, if you
> prefer this approach (and rename the ticket instead of marking it
> wontfix)?

Yeah, I understand. I would prefer a solution that didn't end up with
pluralize_fr, pluralize_de, pluralize_pl, ... It's adding a lot of code
and not necessarily making things easier. I realise that would be a good
parallel with the existing pluralize filter working just for English,
but that's there for historical reasons and we can, and should, do
better now that we have i18n/l10n support in Django.

In an ideal world, somebody would think really hard until steam came out
of their ears and then come up with a nice pluralize_l10n filter that
somehow "just worked". Sadly, I have no idea what that would look like
or how it would work.

It's going to be like art: we'll know the right solution when we see it,
because it will look simple and functional. :-)

Right at the moment, I'd hesitate to commit anything to Django's
codebase whilst we don't have a good general solution, since it's not
really going to be a showstopper for anybody (you can use filters and
template tags that aren't in core, etc). So I think it's worth putting
in the effort to find a better solution. And I do think it's worth
putting in the effort, but I'm not necessarily saying it must be you or
anybody else who does so. If you're motivated, though, I'll certainly
read any proposals with interest and attention. I really, really want
all this stuff to work well in Django.

> In fact, I need feedback because I don't know whether this only
> concerns French speakers or whether more languages are affected by this
> added "s". In the latter case it'll be more complicated to find the
> right place in localflavor.

My feeling is that this problem is exactly as complicated as plural
forms in translations (PO files). The bottom of this page has a partial
summary of the situation (it classifies the French case correctly, too):

http://www.gnu.org/software/automake/manual/gettext/Plural-forms.html

Using (u)ngettext as inspiration for a solution might provoke some
ideas. That precise API would look a bit ugly in templates, but there
might be something that falls out of the sky when you think along those
lines.

Regards,
Malcolm

David Larlet

Jan 6, 2009, 9:57:54 PM
to Djang...@googlegroups.com
Hi Malcolm,

Thanks for your detailed answer.

On 7 Jan 2009, at 02:54, Malcolm Tredinnick wrote:


> Using (u)ngettext as inspiration for a solution might provoke some
> ideas. That precise API would look a bit ugly in templates, but there
> might be something that falls out of the sky when you think along
> those lines.

The sky was really cloudy after reading your link, but fortunately I
found this one:
http://scratchpad.cmlenz.net/c8e1af24481df087217bf12a57862949/

My proposal is to have a human-readable dictionary mapping locales to
expressions:
{ 'en': '(n != 1)', 'fr': '(n>1)', and so on }
and then to add an l10n argument to pluralize (naive first iteration
for the syntax):
d{{ comment_count|pluralize:"u,es,l10n=fr" }} commentaire{{ comment_count|pluralize:"l10n=fr" }}
It looks simple and functional, but I'm afraid that it will hurt
performance... I can provide a benchmark to verify, but let's discuss
first :-)

Regards,
David

Malcolm Tredinnick

Jan 8, 2009, 4:00:27 AM
to Djang...@googlegroups.com
On Wed, 2009-01-07 at 03:57 +0100, David Larlet wrote:
> Hi Malcolm,
>
> Thanks for your detailed answer.
>
> On 7 Jan 2009, at 02:54, Malcolm Tredinnick wrote:
> > Using (u)ngettext as inspiration for a solution might provoke some
> > ideas. That precise API would look a bit ugly in templates, but there
> > might be something that falls out of the sky when you think along
> > those lines.
>
> The sky was really cloudy after reading your link, but fortunately I
> found this one:
> http://scratchpad.cmlenz.net/c8e1af24481df087217bf12a57862949/

The only slightly complex thing you have to know to understand the PO
file expressions is the C ternary operator:

<test> ? <result-if-true> : <result-if-false>

The final result has to be a number between 0 and N - 1, where N is the
number of plural forms in the language; it is the index of the form to
use, and 0 is normally the singular form. PO files take a very limited
set of things for the <test>. Basically, only numerical comparisons
(n == 2, n > 100, etc.), possibly combined with % and the logical
operators.
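
A minimal sketch of how such expressions read once the C ternary is rewritten in Python. The function names are invented for this illustration; the English and French expressions are the ones quoted in this thread, and the Polish one is the three-form rule listed in the gettext manual.

```python
# Python renderings of three PO-file plural expressions.
# Each function returns the index of the plural form to use.

# C: n != 1
def plural_index_en(n):
    return int(n != 1)

# C: n > 1  (zero and one both select form 0, the singular)
def plural_index_fr(n):
    return int(n > 1)

# C: n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2
def plural_index_pl(n):
    if n == 1:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2

print(plural_index_en(0), plural_index_fr(0), plural_index_pl(5))  # 1 0 2
```

The printed values show the difference behind the ticket: for a quantity of zero, English selects the plural form (index 1) while French selects the singular (index 0).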

> My proposal is to have a human-readable dictionary mapping locales to
> expressions:
> { 'en': '(n != 1)', 'fr': '(n>1)', and so on }

It would be nicer if we could extract that information directly from the
PO files to save on having to keep it updated in two places at once, if
at all possible. Not sure what the license is on Christian's code, but
writing something similar from scratch wouldn't be too hard, I suspect.
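
One way the extraction could look, as a rough sketch only: it assumes the PO header has already been read into a plain string (real PO files wrap it in quoted msgstr lines), and the helper name read_plural_forms is hypothetical. It returns the expression as text and does not evaluate it.

```python
import re

# Matches a header line such as:
#   Plural-Forms: nplurals=2; plural=(n > 1);
PLURAL_FORMS_RE = re.compile(
    r"Plural-Forms:\s*nplurals\s*=\s*(\d+)\s*;\s*plural\s*=\s*([^;\n]+)"
)

def read_plural_forms(po_header):
    """Return (number of forms, expression text), or None if absent."""
    match = PLURAL_FORMS_RE.search(po_header)
    if match is None:
        return None
    return int(match.group(1)), match.group(2).strip()

header = (
    "Content-Type: text/plain; charset=UTF-8\n"
    "Plural-Forms: nplurals=2; plural=(n > 1);\n"
)
print(read_plural_forms(header))  # (2, '(n > 1)')
```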

For prototyping, though, you can assume such a dictionary exists however
you want to create it. We can fill in the holes later. I'd certainly
make the values in the dictionary function objects, though, so that you
can pass in the quantity and get back the index or offset of the suffix
to use.
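
For prototyping, the dictionary of function objects might be as small as this. PLURAL_RULES and plural_index are placeholder names, and only the two rules quoted in this thread are filled in.

```python
# Hypothetical registry: locale code -> function mapping a quantity
# to the index of the suffix (or word form) to use.
PLURAL_RULES = {
    "en": lambda n: int(n != 1),
    "fr": lambda n: int(n > 1),
}

def plural_index(locale, n):
    """Look up the rule for the locale and apply it to the quantity."""
    return PLURAL_RULES[locale](n)

print(plural_index("en", 0), plural_index("fr", 0))  # 1 0
```

Storing callables rather than expression strings means nothing has to be parsed or evaluated at render time.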

> and then to add an l10n argument to pluralize (naive first iteration
> for the syntax):
> d{{ comment_count|pluralize:"u,es,l10n=fr" }} commentaire{{ comment_count|pluralize:"l10n=fr" }}

Take your hands off the keyboard and back away from the existing
pluralize filter! :-)

Let's not make the existing one more complex, just yet. At least whilst
developing. If you create a new filter -- say pluralize_i18n filter --
you can require that the first argument is always the locale and is
required, for example. Passing arguments to filters is a bit ugly,
trying to work in optional keyword arguments as well starts to get
pretty messy.

> It looks simple and functional, but I'm afraid that it will hurt
> performance... I can provide a benchmark to verify, but let's discuss
> first :-)

It shouldn't be too slow, relative to everything else that is going on.
You're doing a key lookup, a function call to a pretty simple function,
and an array lookup.

So that all looks reasonable.
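
The three steps named above (key lookup, function call, list lookup) can be sketched as a plain Python function. This is an assumption-laden prototype, not an existing Django filter: the name pluralize_i18n and the "locale,form0,form1,..." argument layout follow the suggestion in this message, the rules table is a stand-in, and in a real project the function would still need registering with a django.template.Library instance.

```python
# Stand-in rules table; a real one would cover many more locales.
PLURAL_RULES = {
    "en": lambda n: int(n != 1),
    "fr": lambda n: int(n > 1),
    "de": lambda n: int(n != 1),
}

def pluralize_i18n(value, arg):
    """arg is "locale,form0,form1,...": the locale comes first and is required."""
    locale, *forms = arg.split(",")
    try:
        rule = PLURAL_RULES[locale]      # key lookup
        index = rule(int(value))         # call to a simple function
        return forms[index]              # list lookup
    except (KeyError, IndexError, ValueError, TypeError):
        # Unknown locale, too few forms or a non-numeric value:
        # fail quietly with an empty string.
        return ""

print(pluralize_i18n(0, "fr,commentaire,commentaires"))  # commentaire
print(pluralize_i18n(0, "en,comment,comments"))          # comments
```

Passing whole word forms instead of suffixes also covers plurals that change the stem.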

There's an elephant in the room that we haven't mentioned yet. Appending
some kind of string suffix to a word works for plural forms (mostly) in
English. But it doesn't internationalise particularly well. For example,
in German, "dream" is "Traum", "dreams" is "Träume". But I don't think
that's a reason not to work on this. The pluralize filter is an aid for
some cases, not a do-everything filter, even in English. A template
author can also write things like

Ich habe {{ count|as_a_word_in_german }} {{ count|pluralize_i18n:"de,Traum,Träume" }}.

I'm bringing this up partly so that we remember to note it in
any final documentation.

I think you're going in the right direction here. It's a little bit
fiddly and I'll keep thinking about what might go wrong. I'm also hoping
(hint!!) that some of the other code contributors on this list (Ramiro, the
Marcs, Jannis, Ludvig, ...) will reply if they have any great ideas or
see any huge problems (again, keeping in mind that we're not trying to
solve every pluralisation problem on the planet -- just provide an aid
for the simple cases).

Best wishes,
Malcolm
