.order_by([fieldnames], collation = 'et_EE')
another version of it could be automatical, based on django language
settings. Model often has ordering set in its Meta. So why not include
automatical ordering there too.
Is it possible to include this into some of next versions of Django?
'''Some background'''
If django is used to power some international site like a newspaper then
there is rarely need for sorting data based on sorting rules of language
other than english. If you use it to power a site that hosts data in
several languages then you have need to also display data in that language
properly. Sorting based on English alphabet does not work correctly in
that case cause letters missing from english alphabet (and there are MANY
of those letters that exist in swedish, danish, german, not to mention
estonian, russian etc. alphabets) are displayed in completely wrong
places.
Again - most django sites will be probably using single language. But
there are many cases that will benefit from this feature. Like e-commerce
modules, that translate product names - they can benefit from it. Any site
that allows language based naming or translating of their objects will
benefit from this feature.
--
Ticket URL: <https://code.djangoproject.com/ticket/21181>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0
Comment:
I agree that some solution to this is a good idea. There are some tickets
that are close to this request, #14310 and #9682 come to mind. I didn't
find a perfect duplicate so accepting this.
Custom lookups could offer a neat solution to this. You should be able to
do: `qs.alias(name_fi=Collate('name', 'fi')).order_by('name_fi')`. Or, to
use collated filtering, use `qs.alias(name_fi=Collate('name',
'fi')).filter(name_fi__icontains='someval')`. See #16187 for details. The
patch in that ticket already offer all the machinery needed to do this,
but there is still a lot of work before merge is possible.
While waiting for custom lookups maybe there should be some other way to
achieve this.
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:1>
* stage: Unreviewed => Accepted
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:2>
Comment (by alan.kesselmann@…):
I think there is no duplicate to this, as this is relatively new thing.
#14310 is similar cause it requests per query collation overrides but the
#9682, which is marked as #14310's duplicate is another thing completely.
In my test (or investigation) case i created two postgresql databases. One
with default en_US.utf8 collation and one with POSIX collation. Since one
table can hold data which is in several languages you (in this case i)
have to be able to add collation as a part of query. Like this:
SELECT * FROM <tablename> ORDER BY <textcolum> COLLATE "<your_collation>",
<integercolumn>
or
SELECT <textcolumn> COLLATE "<your_collation>" FROM <tablename> ORDER BY
<textcolumn>
What this allows to achieve is that table of strings like:
id, name
1, älan
2, alan
3, õlan
will be sorted with query "SELECT * FROM table ORDEr BY name" to
2, alan
1, älan
3, õlan
with en_US.utf8 collation and to
2, alan
3, õlan
1, älan
with estonian alphabet based collation.
Your example, qs.alias(name_fi=Collate('name', 'fi')).order_by('name_fi')
is the best i think (for my use-case anyway), as it pictures the core
thing that i need - pair of column name ("name") and collation ("fi"). The
Collate object seems like a good idea that might allow some more complex
use cases too (http://www.postgresql.org/docs/9.1/static/collation.html
lists several).
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:3>
Comment (by akaariai):
I tried how hard this would be to implement using custom lookups. Turns
out it was somewhat straightforward, see:
https://github.com/akaariai/django/commit/76063881d79732cbb1942e9905fe4d51c0df09c1
The example works only on PostgreSQL 9.1 plus, and the test works only if
collations fi_FI, en_GB and POSIX are available (I believe the last one is
always there). Of course, adding in full multidb support would require
more work so that the SQL would actually work on all databases.
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:4>
Comment (by alan.kesselmann@…):
Hmm i can see you have done loads of work in your custom lookups branch.
Any idea which version those changes might end up in?
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:5>
Comment (by akaariai):
Now that expressions are in, a Collate expression would be a very welcome
addition to Django. It should work on all databases that support
collations, not just on PostgreSQL.
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:6>
* owner: nobody => Ian-Foote
* status: new => assigned
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:7>
* owner: Ian-Foote =>
* status: assigned => new
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:8>
* owner: (none) => Alan
* status: new => assigned
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:9>
Comment (by Simon Charette):
Now that `order_by` has proper expression support this is pretty
straightforward to implement using an `Expression` of `Func` subclass.
I guess this ticket could be repurposed to allow a `collation` argument to
be passed to `OrderBy` or introduce a backend agnostic `Collate`
expression.
`order_by(F('title').desc(collation='fr_FR'))` or
`order_by(Collate('title', 'fr_FR').desc())`
I think a `Collate` expression would make more sense since it could be
used in lookups as well `filter(title=Collate(Value(title), 'fr_FR'))`.
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:10>
Comment (by Tom Carrick):
I made a quick attempt at implementing `Collate()` as an expression. I've
hit a problem though, as it seems like SQLIte doesn't allow you to use
parameters with `COLLATE`:
{{{
In [34]: with connection.cursor() as cursor:
...: cursor.execute("select name from testapp_foo order by %s
collate %s", ["name", "nocase"])
...: row = cursor.fetchall()
...
OperationalError: near "?": syntax error
}}}
I also have problems on postgres (haven't tried with other backends):
{{{
In [16]: with connection.cursor() as cursor:
...: cursor.execute("select name from testapp_foo order by %s
collate %s", ["name", "en_US"])
...: row = cursor.fetchone()
...
ProgrammingError: syntax error at or near "'en_US'"
LINE 1: select name from testapp_foo order by 'name' collate 'en_US'
}}}
Not exactly sure what's going on here, but it looks like single quotes
don't do it here. It works if I remove the placeholders and set it with
`"en_US"` - with double quotes.
Maybe I'm misunderstanding something.
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:11>
* cc: Tom Carrick (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:12>
* cc: Alexandr Artemyev (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:13>
* cc: Simon Charette (added)
Comment:
Looks like collation names are identifiers on these backends and need to
be escaped using `connection.ops.quote_name`. Was MySQL also behaving this
way?
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:14>
* owner: Alan => Tom Carrick
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:15>
* has_patch: 0 => 1
Comment:
[https://github.com/django/django/pull/13284 PR]
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:16>
* stage: Accepted => Ready for checkin
Comment:
Went through a few rounds of reviews and patch now LGTM
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:17>
* needs_better_patch: 0 => 1
* stage: Ready for checkin => Accepted
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:18>
Comment (by Simon Charette):
Yes, `quote_name` cannot protect against SQL injections, it shouldn't be
an issue just like `Func(func)` also allows injections? As long as
`collation` is not under user control it should not be an issue.
Since collation names are identifiers and cannot be provided as string
literals (see comment:11) I don't see a way around that?
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:19>
Comment (by felixxm):
So maybe, we can add a warning in docs, e.g.
{{{
To protect against `SQL injection attacks
<https://en.wikipedia.org/wiki/SQL_injection>`_, ``collation``
shouldn't be
a user-controlled parameter because collation names are database
identifiers and cannot be properly escape.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:20>
* needs_better_patch: 1 => 0
* stage: Accepted => Ready for checkin
Comment:
I added warning to docs.
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:21>
* status: assigned => closed
* resolution: => fixed
Comment:
In [changeset:"63300f7e686c2c452763cb512df9abf7734fd588" 63300f7e]:
{{{
#!CommitTicketReference repository=""
revision="63300f7e686c2c452763cb512df9abf7734fd588"
Fixed #21181 -- Added Collate database function.
Thanks Simon Charette for reviews.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:22>
Comment (by Mariusz Felisiak <felisiak.mariusz@…>):
In [changeset:"278b6187d2a6711afb0b35dc5c3c09fe5be9adb3" 278b6187]:
{{{
#!CommitTicketReference repository=""
revision="278b6187d2a6711afb0b35dc5c3c09fe5be9adb3"
Refs #21181 -- Corrected DatabaseFeatures.test_collations for Swedish
collation.
Previously, backends used different keys "swedish-ci" or "swedish_ci".
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/21181#comment:23>