SQLite iexact problem with non-latin symbols


Eugene Mirotin

Dec 18, 2008, 4:53:15 PM
to django...@googlegroups.com
Hello.
Suppose I have a simple model:

from django.db import models

class Team(models.Model):
    name = models.CharField(max_length=200)

Then I create a team and try to look it up with an iexact match:
tt = Team(name='English')
tt.save()
# this returns the list containing my initial team object
Team.objects.filter(name__iexact=tt.name.lower())

When I do the same from the terminal and create the Team object by
passing unicode symbols (e.g. u'<Russian symbols>'), it works the
same way.
But when the object is created from the admin interface, the data does
not seem to be saved as unicode (at least it looks that way, because
the field is varchar).
And then name__iexact=name.lower() does not match.

The question is: is it possible to fix the issue so that
1) the objects can still be created through the admin interface with
minimal changes (maybe some additional arguments to the model fields?)
2) the case-insensitive lookups start working without any additional
magic (so that I do not have to think about unicode and non-unicode
data when filtering the DB)

Karen Tracey

Dec 18, 2008, 8:28:51 PM
to django...@googlegroups.com
On Thu, Dec 18, 2008 at 4:53 PM, Eugene Mirotin <emir...@gmail.com> wrote:

> Hello.
> Consider I have a simple model
>
> class Team(models.Model):
>    name = models.CharField(max_length=200)
>
> Then I create some team and try to look for it with iexact match:
> tt = Team(name='English')
> tt.save()
> Team.objects.filter(name__iexact=tt.name.lower()) # this returns the
> list containing my initial team object
>
> When I do the same from the terminal and create the Team object by
> passing the unicode symbols (e.g. u'<Russian symbols>'), it works the
> same.
> But when created from the admin interface, the data is not saved in
> unicode (at least it looks like so, cause the field is varchar).
> And then name__iexact=name.lower() does not work.

The data is saved using utf-8 encoding. The issue is that SQLite doesn't support case-insensitive matching on anything other than 7-bit ASCII chars. See:

http://www.sqlite.org/faq.html#q18
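
A minimal sketch of the behavior that FAQ entry describes, using Python's standard sqlite3 module (an assumption on my part; the setup in this thread is pysqlite under Django). As far as I know, Django's SQLite backend implements iexact with LIKE, so plain LIKE comparisons stand in for the lookup here. The helper name `like` and the sample strings are purely illustrative.

```python
import sqlite3

# In-memory database; no table is needed to evaluate a LIKE expression.
conn = sqlite3.connect(":memory:")


def like(value, pattern):
    """Return SQLite's result for `value LIKE pattern` (1 = match, 0 = no match)."""
    return conn.execute("SELECT ? LIKE ?", (value, pattern)).fetchone()[0]


# ASCII letters: LIKE ignores case.
ascii_match = like("English", "english")

# Non-ASCII letters: the result depends on whether this SQLite build
# includes the ICU extension, so no particular value is assumed here.
cyrillic_match = like("Команда", "команда")

print("ASCII match:", ascii_match)
print("Cyrillic match:", cyrillic_match)
```

On a SQLite build without the ICU extension, I would expect the first comparison to match and the second not to, which is the same symptom as the failing iexact lookup.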

The last time I answered this question I only found reference to this behavior as a "bug".  The answer I'm pointing to above, however, considers it a design decision and mentions that sqlite provides an extension to get case-insensitive Unicode comparisons working.  So, there may actually be a way to get this working under Django, but I'm not sure of that.

My guess is this extension would need to be included/supported by pysqlite (the Python interface to SQLite that Django uses).  If it is, perhaps there is just some switch Django could flip to tell pysqlite to use this ICU extension to get case-insensitive unicode matching working.  If pysqlite doesn't support this extension then I rather doubt this can be made to work under Django, but I'm basically just guessing here.

Switching to a different database, one that supports case-insensitive unicode matching out of the box, might be an easier answer for you.
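
If staying on SQLite matters, one possible workaround that avoids the ICU extension is to register a Python function on the connection and compare case-folded values. This is only a sketch under assumptions: it uses Python's standard sqlite3 module directly rather than Django's ORM, the function name `py_casefold`, the `team` table, and the helper `find_team` are hypothetical, and it does not change how Django's own iexact lookup behaves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical SQL function name. str.casefold() handles non-ASCII letters,
# unlike SQLite's built-in ASCII-only case folding.
conn.create_function(
    "py_casefold", 1, lambda s: s.casefold() if s is not None else None
)

conn.execute("CREATE TABLE team (name TEXT)")
conn.executemany(
    "INSERT INTO team (name) VALUES (?)",
    [("English",), ("Команда",)],
)


def find_team(name):
    """Case-insensitive exact match done by folding both sides in Python."""
    rows = conn.execute(
        "SELECT name FROM team WHERE py_casefold(name) = py_casefold(?)",
        (name,),
    ).fetchall()
    return [row[0] for row in rows]


print(find_team("команда"))
print(find_team("ENGLISH"))
```

The trade-off is that the function runs in Python for every row examined and an ordinary index on the column cannot be used for the comparison, so this is better suited to small development databases than to large tables.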

Karen

Eugene Mirotin

Dec 19, 2008, 4:28:24 AM
to Django users
Thank you for the detailed answer.

I'll investigate the question about pysqlite
(http://oss.itsystementwicklung.de/trac/pysqlite/changeset/361 looks
promising). Switching to another DB is OK too, though SQLite is much
more convenient for development and debugging, as I work on the project
from several different computers and do not yet have a hosted SQL server.

On Dec 19, 3:28 am, "Karen Tracey" <kmtra...@gmail.com> wrote: