I've just submitted a ticket related to an issue I've had for ages, but the decision mainly depends on who is affected by this. Are there other languages which consider zero as a singular form?
On Tue, 2009-01-06 at 10:26 +0100, David Larlet wrote:
> Hello,
> I've just submitted a ticket related to an issue I've had for ages, but the decision mainly depends on who is affected by this. Are there other languages which consider zero as a singular form?
I think that ticket is a bit misguided. The pluralize filter is designed to work for English only, at the moment. It doesn't support multiple plural forms -- languages with more than one plural form for quantities greater than one -- for example. Adding special cases for things like languages where zero is singular leads into "make it support every i18n plural form under the sun" and things get really complicated really quickly.
You need a different approach in that case. There's nothing wrong with the goal of supporting plural forms properly, but that's not the pluralize filter. Piecemeal patches to that to support the French but not the Poles will lead to tears. We don't want crying here. I'm thinking about it, but I'm inclined to wontfix that ticket at the moment in favour of somebody coming up with a more comprehensive solution as a separate filter.
> On Tue, 2009-01-06 at 10:26 +0100, David Larlet wrote:
>> Hello,
>> I've just submitted a ticket related to an issue I've had for ages, but the decision mainly depends on who is affected by this. Are there other languages which consider zero as a singular form?
> I think that ticket is a bit misguided. The pluralize filter is designed to work for English only, at the moment. It doesn't support multiple plural forms -- languages with more than one plural form for quantities greater than one -- for example. Adding special cases for things like languages where zero is singular leads into "make it support every i18n plural form under the sun" and things get really complicated really quickly.
> You need a different approach in that case. There's nothing wrong with the goal of supporting plural forms properly, but that's not the pluralize filter. Piecemeal patches to that to support the French but not the Poles will lead to tears. We don't want crying here. I'm thinking about it, but I'm inclined to wontfix that ticket at the moment in favour of somebody coming up with a more comprehensive solution as a separate filter.
I understand perfectly, and that's why I asked for feedback and proposed, in the ticket's description, creating a new filter (per language) in contrib.localflavor, but it depends on needs. I agree that the ticket/email title doesn't reflect this solution, but I can propose a pluralize_fr filter, for instance, if you prefer this approach (and rename the ticket instead of marking it wontfix)?
In fact, I need feedback because I don't know if this only concerns French-speaking people or if more languages are affected by this added "s". In that case it'll be more complicated to find the right place in localflavor.
On Wed, 2009-01-07 at 01:37 +0100, David Larlet wrote:
[...]
> I understand perfectly, and that's why I asked for feedback and proposed, in the ticket's description, creating a new filter (per language) in contrib.localflavor, but it depends on needs. I agree that the ticket/email title doesn't reflect this solution, but I can propose a pluralize_fr filter, for instance, if you prefer this approach (and rename the ticket instead of marking it wontfix)?
Yeah, I understand. I would prefer a solution that didn't end up with pluralize_fr, pluralize_de, pluralize_pl, ... It's adding a lot of code and not necessarily making things easier. I realise that would be a good parallel with the existing pluralize filter working just for English, but that's there for historical reasons and we can, and should, do better now that we have i18n/l10n support in Django.
In an ideal world, somebody would think really hard until steam came out their ears and then come up with a nice pluralize_l10n filter that somehow "just worked". Sadly, I have no idea what that would look like or how it would work.
It's going to be like art: we'll know the right solution when we see it, because it will look simple and functional. :-)
Right at the moment, I'd hesitate to commit anything to Django's codebase whilst we don't have a good general solution, since it's not really going to be a showstopper for anybody (you can use filters and template tags that aren't in core, etc). So I think it's worth putting in the effort to find a better solution. And I do think it's worth putting in the effort, but I'm not necessarily saying it must be you or anybody else who does so. If you're motivated, though, I'll certainly read any proposals with interest and attention. I really, really want all this stuff to work well in Django.
> In fact, I need feedback because I don't know if this only concerns French-speaking people or if more languages are affected by this added "s". In that case it'll be more complicated to find the right place in localflavor.
My feeling is that this problem is exactly as complicated as plural forms in translations (PO files). The bottom of this page has a partial summary of the situation (it classifies the French case correctly, too):
Using (u)ngettext as inspiration for a solution might provoke some ideas. That precise API would look a bit ugly in templates, but there might be something that falls out of the sky when you think along those lines.
On 7 Jan 2009, at 02:54, Malcolm Tredinnick wrote:
> Using (u)ngettext as inspiration for a solution might provoke some ideas. That precise API would look a bit ugly in templates, but there might be something that falls out of the sky when you think along those lines.
My proposal is to keep a human-readable dictionary mapping locales to expressions:
{ 'en': '(n != 1)', 'fr': '(n>1)', and so on }
and then to add an l10n argument to pluralize (naive first iteration for the syntax):
d{{ comment_count|pluralize:"u,es,l10n=fr" }} commentaire{{ comment_count|pluralize:"l10n=fr" }}
It looks simple and functional, but I'm afraid that it will hurt performance. I can provide a benchmark to verify, but let's discuss first :-)
On Wed, 2009-01-07 at 03:57 +0100, David Larlet wrote:
> Hi Malcolm,
> Thanks for your detailed answer.
> On 7 Jan 2009, at 02:54, Malcolm Tredinnick wrote:
> > Using (u)ngettext as inspiration for a solution might provoke some ideas. That precise API would look a bit ugly in templates, but there might be something that falls out of the sky when you think along those lines.
The only slightly complex thing you have to know to understand the PO file expressions is the C ternary operator
<test> ? <result-if-true> : <result-if-false>
The final result has to return a number between 0 (sometimes 0 is omitted and 1 is returned for the singular case) and "n" (for some value of n corresponding to the number of plural forms in the language). PO files take a very limited set of things for the <test>. Basically, only numerical comparisons (n==2, n > 100, etc.).
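To make the ternary form concrete, here is a rough Python translation of three plural expressions as they are commonly written in gettext catalogs. The function names are made up for illustration. Note that the Polish rule also leans on the modulo and boolean operators, so "only numerical comparisons" is a simplification.

```python
def plural_en(n):
    # C form: plural=(n != 1)
    return int(n != 1)


def plural_fr(n):
    # C form: plural=(n > 1), so zero takes the singular form
    return int(n > 1)


def plural_pl(n):
    # C form:
    #   n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2
    # Each "test ? a : b" becomes an if/return pair, read left to right.
    if n == 1:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2
```

Each function returns the index of the plural form to use, which is the number the PO expression is expected to produce.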
> My proposal is to keep a human-readable dictionary mapping locales to expressions:
> { 'en': '(n != 1)', 'fr': '(n>1)', and so on }
It would be nicer if we could extract that information directly from the PO files to save on having to keep it updated in two places at once, if at all possible. Not sure what the license is on Christian's code, but writing something similar from scratch wouldn't be too hard, I suspect.
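As a starting point for the extraction step, a minimal sketch that pulls the rule out of a Plural-Forms header value, assuming the header has already been read from the catalog as a string. The helper name and the regular expression are illustrative only; turning the C expression into something callable is a separate step that is not attempted here.

```python
import re

# Matches e.g. "nplurals=2; plural=(n > 1);"
PLURAL_FORMS_RE = re.compile(
    r"nplurals\s*=\s*(?P<nplurals>\d+)\s*;\s*plural\s*=\s*(?P<plural>[^;]+)"
)


def parse_plural_forms(header_value):
    """Return (nplurals, expression) from a Plural-Forms header value."""
    match = PLURAL_FORMS_RE.search(header_value)
    if match is None:
        raise ValueError("no plural rule found in %r" % header_value)
    return int(match.group("nplurals")), match.group("plural").strip()
```

For the French header this gives the pair (2, "(n > 1)"), which is the same string that would otherwise be typed by hand into the dictionary.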
For prototyping, though, you can assume such a dictionary exists however you want to create it. We can fill in the holes later. I'd certainly make the values in the dictionary function objects, though, so that you can pass in the quantity and get back the index or offset of the suffix to use.
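A prototype of that dictionary might look like the following sketch. PLURAL_RULES and pick_suffix are hypothetical names, and the two rules are hard-coded here rather than read from any catalog.

```python
# Hypothetical prototype: locale -> function returning the suffix index.
PLURAL_RULES = {
    "en": lambda n: int(n != 1),
    "fr": lambda n: int(n > 1),
}


def pick_suffix(locale, count, suffixes):
    """Look up the rule, call it with the quantity, index into the suffixes."""
    index = PLURAL_RULES[locale](count)
    return suffixes[index]
```

With suffixes ["", "s"], a count of zero picks "s" for "en" and "" for "fr", which is exactly the difference the ticket is about.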
> and then to add an l10n argument to pluralize (naive first iteration for the syntax):
> d{{ comment_count|pluralize:"u,es,l10n=fr" }} commentaire{{ comment_count|pluralize:"l10n=fr" }}
Take your hands off the keyboard and back away from the existing pluralize filter! :-)
Let's not make the existing one more complex, just yet. At least whilst developing. If you create a new filter -- say, a pluralize_i18n filter -- you can require that the first argument is always the locale, for example. Passing arguments to filters is a bit ugly; trying to work in optional keyword arguments as well starts to get pretty messy.
> It looks simple and functional, but I'm afraid that it will hurt performance. I can provide a benchmark to verify, but let's discuss first :-)
It shouldn't be too slow, relative to everything else that is going on. You're doing a key lookup, a function call to a pretty simple function, and an array lookup.
So that all looks reasonable.
There's an elephant in the room that we haven't mentioned yet. Appending some kind of string suffix to a word works for plural forms (mostly) in English. But it doesn't internationalise particularly well. For example, in German, "dream" is "Traum", "dreams" is "Träume". But I don't think that's a reason not to work on this. The pluralize filter is an aid for some cases, not a do-everything filter, even in English. A template author can also write things like
Ich habe {{ count|as_a_word_in_german }} {{ count|pluralize_i18n:"de,Traum,Träume" }}.
I'm only bringing this up partially so that we remember to note it in any final documentation.
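For the whole-word variant, a rough sketch of what the filter body could do, written as a plain function so it runs on its own. The argument layout ("locale,form0,form1,...") and the PLURAL_RULES table are assumptions for illustration; in a real project the function would be registered on a django.template.Library instance in the usual way.

```python
# Hypothetical rule table; German uses the same rule as English.
PLURAL_RULES = {
    "en": lambda n: int(n != 1),
    "de": lambda n: int(n != 1),
    "fr": lambda n: int(n > 1),
}


def pluralize_i18n(count, arg):
    """Pick a whole word form; arg is "locale,form0,form1,..."."""
    parts = [part.strip() for part in arg.split(",")]
    locale, forms = parts[0], parts[1:]
    rule = PLURAL_RULES.get(locale)
    if rule is None or not forms:
        # Unknown locale or no forms given: fail quietly, as filters tend to.
        return ""
    try:
        index = rule(int(count))
    except (TypeError, ValueError):
        return ""
    # Clamp so that a short list of forms never raises IndexError.
    return forms[min(index, len(forms) - 1)]
```

Passing full words avoids the suffix problem entirely, at the cost of a longer argument string in the template.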
I think you're going in the right direction here. It's a little bit fiddly and I'll keep thinking about what might go wrong. I'm also hoping (hint!!) that some of the other code contributors on this list (Ramiro, the Marcs, Jannis, Ludvig, ...) will reply if they have any great ideas or see any huge problems (again, keeping in mind that we're not trying to solve every pluralisation problem on the planet -- just provide an aid for the simple cases).