The two main arguments I see against it are, firstly, that it only works
for English words, so it would be of little use to developers using
foreign languages, and secondly, that it perhaps wouldn't be as widely
used as the other filters in there.
Many thanks,
Harry
It sounds like a potentially interesting addition to contrib.humanize,
but you have hit both of the objections that I would raise.
The foreign language limitation is particularly important - if we're
going to introduce a tag like this, then it should be usable for
languages other than English. If you present some research to
demonstrate how this tag could/would work for non-English languages,
it would be a lot more compelling.
Yours,
Russ Magee %-)
Regards,
Harry
On Jan 6, 3:45 am, Russell Keith-Magee <freakboy3...@gmail.com> wrote:
Hmm, can it handle the following?
an honest man
a history book
an historical book (debatable)
My gut instinct is that it's not possible to work this out
programmatically. When it comes to other languages, I imagine it's
going to be even harder (if it's possible to get harder than
'impossible'), because you have things like gender and case to worry
about, which certainly cannot be worked out by an algorithm.
To give some examples, in French, the choice is between 'un' and
'une', depending on whether the word is masculine or feminine. In
Greek, the choice is between ἑις, ἑνα, ἑνος, ἑνι, μια, μιαν,
μιας, μια, ἑν, ἑν, ἑνος, ἑνι, depending on whether the word is
masculine, feminine or neuter, and in nominative, accusative, genitive
or dative case. Although in many cases you would probably omit the
article altogether - the above words often mean "one" rather than "a".
(That's NT Koine Greek, it might be different/simpler/more complicated
in modern Greek).
I imagine there are plenty of languages where this gets even worse,
violating almost every assumption you don't even know you are making
(like whether the article comes before or after or in the middle, or
exists at all, etc. etc.)
To summarise: if I were you, I would give up now.
Luke
--
"Mediocrity: It takes a lot less time, and most people don't
realise until it's too late." (despair.com)
Luke Plant || http://lukeplant.me.uk/
It can't. The rules for the indefinite article around 'h' are complex
and depend on the etymology of the word used. To add complexity, the
lexicographic rules are often different from the rules for speech, and
UK rules differ from US rules (and possibly Oz too, but I don't
know).
> If you present some research to
> demonstrate how this tag could/would work for non-English languages,
> it would be a lot more compelling.
That's not going to work, in any meaningful sense. That peculiarity of
the article is highly English-specific. The generalization would
surely be something like
{% if /some-regex/.matches(word) %}{{ form1 }} {{ word }}{% else %}
{{ form2 }} {{ word }}{% endif %}
where the regex is language and context dependent. There are various
regex replacement filters/tags out in the djangosphere. Could you use
one of them?
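As a rough illustration of that generalization, here is a minimal Python
sketch. The names ArticleRule, RULES and with_article are hypothetical and
not part of Django, and the English regex is an assumption that looks only
at the first letter, so it gets words like "honest" wrong.

```python
import re
from typing import NamedTuple


class ArticleRule(NamedTuple):
    """Hypothetical per-language rule: a regex plus the two word forms."""
    pattern: re.Pattern
    form1: str  # used when the pattern matches the word
    form2: str  # used otherwise


# Assumption: a deliberately naive English rule keyed on the first letter.
RULES = {
    "en": ArticleRule(re.compile(r"^[aeiou]", re.IGNORECASE), "an", "a"),
}


def with_article(word: str, lang: str = "en") -> str:
    """Prefix `word` with form1 if the language's regex matches, else form2."""
    rule = RULES[lang]
    form = rule.form1 if rule.pattern.search(word) else rule.form2
    return f"{form} {word}"
```

Calling with_article("honest man") returns "a honest man", which is the
sort of case a single regex per language does not cover.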
> (That's NT Koine Greek, it might be different/simpler/more complicated
> in modern Greek).
What is it about Django and NT scholars - have you come across James
Tauber (of Pinax fame)?
Ian.
There are at least three Django committers who can list one or another
ancient Greek dialect among their studies. Not sure why that is, but
it does make for fun conversation over drinks.
--
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."
But I agree that this would be far too difficult (or impossible) to
make multi-lingual, and so is perhaps not appropriate for inclusion in
Django.
Harry
Disclaimer: I have a master's degree in Computational Linguistics. This
is a simplified account of "last year of bachelor" stuff:
Human language cannot (mathematically proven) be modelled by a mere
regexp, as human language is not only context-free (needing a full
parser) but context-sensitive (needing parsers we don't really have
yet). Nice, yes?
It cannot go in humanize but it could go in localflavor for English.
It would need a stemmer and a replaceable wordlist, though, as which
words get "an" and which get "a" depends not only on country but also
on specific publishing styles - and all of this has a tendency to
change over time.
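A minimal sketch of the replaceable-wordlist idea, in Python. Everything
here is an assumption for illustration: the names DEFAULT_OVERRIDES and
indefinite_article are hypothetical, the stem lists are tiny and
incomplete, and a plain prefix match stands in for a real stemmer.

```python
# Assumption: a tiny, incomplete override list; a real one would be much
# larger and maintained per country and per publishing style.
DEFAULT_OVERRIDES = {
    "an": ("honest", "honour", "honor", "hour", "heir"),
    "a": ("univers", "unicorn", "europe"),
}

VOWELS = ("a", "e", "i", "o", "u")


def indefinite_article(word, overrides=DEFAULT_OVERRIDES):
    """Return "a" or "an" for `word`, consulting the wordlist first."""
    lowered = word.lower()
    # Prefix match against listed stems (a crude stand-in for a stemmer).
    for article, stems in overrides.items():
        if lowered.startswith(tuple(stems)):
            return article
    # Fallback: first-letter heuristic for words not in the wordlist.
    return "an" if lowered.startswith(VOWELS) else "a"
```

A style that prefers "an historical" could pass its own list, for example
indefinite_article("historical", {"an": ("historic",)}), without touching
the defaults.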
HM
--
You received this message because you are subscribed to the Google Groups "Django developers" group.
Here's a snippet I wrote a while back you may want to check out too:
www.djangosnippets.org/snippets/1519/