[Django] #22377: SQL Logging throws an exception when fields have utf-8 characters

45 views
Skip to first unread message

Django

unread,
Apr 3, 2014, 5:04:05 AM4/3/14
to django-...@googlegroups.com
#22377: SQL Logging throws an exception when fields have utf-8 characters
----------------------------------------------+----------------------------
Reporter: rolanvc@… | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer (models, ORM) | Version: 1.6
Severity: Normal | Keywords: sql logging
Triage Stage: Unreviewed | for utf-8
Easy pickings: 0 | Has patch: 0
| UI/UX: 0
----------------------------------------------+----------------------------
In django/db/backends/util.py, in CursorDebugWrapper.execute, the line:

logger.debug('(%.3f) %s; args=%s' % (duration, sql, params),
extra={'duration': duration, 'sql': sql, 'params': params}
)

throws a "UnicodeDecodeError:ascii codec .... " when calling model.save()
when some fields have utf-8/non-ascii values. My only workaround is to
modify it to:

logger.debug('(%.3f) %s; args=%s' % (duration, sql.decode('utf-8'),
params),
extra={'duration': duration, 'sql': sql, 'params': params}
)

I'm hoping there's a better and more elegant way that I could use.

Thanks!
=)

--
Ticket URL: <https://code.djangoproject.com/ticket/22377>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Apr 3, 2014, 3:19:29 PM4/3/14
to django-...@googlegroups.com
#22377: SQL Logging throws an exception when fields have utf-8 characters
-------------------------------------+-------------------------------------

Reporter: rolanvc@… | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: 1.6
(models, ORM) | Resolution:
Severity: Normal | Triage Stage:
Keywords: sql logging for | Unreviewed
utf-8 | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by claudep):

* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0


Comment:

Could you please provide the full traceback? It would also be useful to
know the values of duration, sql and params variables just before the
crash.

--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:1>

Django

unread,
Apr 4, 2014, 2:13:09 PM4/4/14
to django-...@googlegroups.com
#22377: SQL Logging throws an exception when fields have utf-8 characters
-------------------------------------+-------------------------------------
Reporter: rolanvc@… | Owner: nobody
Type: Bug | Status: closed

Component: Database layer | Version: 1.6
(models, ORM) | Resolution: needsinfo

Severity: Normal | Triage Stage:
Keywords: sql logging for | Unreviewed
utf-8 | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by claudep):

* status: new => closed
* type: Uncategorized => Bug
* resolution: => needsinfo


--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:2>

Django

unread,
Sep 3, 2014, 9:17:09 AM9/3/14
to django-...@googlegroups.com
#22377: SQL Logging throws an exception when fields have utf-8 characters
-------------------------------------+-------------------------------------
Reporter: rolanvc@… | Owner: nobody
Type: Bug | Status: new

Component: Database layer | Version: 1.6
(models, ORM) | Resolution:
Severity: Normal | Triage Stage:
Keywords: sql logging for | Unreviewed
utf-8 | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by kahnert):

* status: closed => new
* resolution: needsinfo =>


Comment:

Here is a traceback:
{{{
Traceback (most recent call last):
File "./manage.py", line 10, in <module>
execute_from_command_line(sys.argv)
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/core/management/__init__.py", line 399, in
execute_from_command_line
utility.execute()
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/core/management/__init__.py", line 392, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/core/management/base.py", line 242, in run_from_argv
self.execute(*args, **options.__dict__)
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/core/management/base.py", line 285, in execute
output = self.handle(*args, **options)
File
"/home/kahnert/pycharm/schedules/schedules/data/management/commands/import_data.py",
line 35, in handle
self.parse_file(fname)
File
"/home/kahnert/pycharm/schedules/schedules/data/management/commands/import_data.py",
line 40, in parse_file
self.parse_entry(cols)
File
"/home/kahnert/pycharm/schedules/schedules/data/management/commands/import_data.py",
line 61, in parse_entry
row_header, created = OIRowHeader.objects.get_or_create(**header_vals)
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/db/models/manager.py", line 154, in get_or_create
return self.get_queryset().get_or_create(**kwargs)
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/db/models/query.py", line 376, in get_or_create
return self.get(**lookup), False
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/db/models/query.py", line 304, in get
num = len(clone)
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/db/models/query.py", line 77, in __len__
self._fetch_all()
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/db/models/query.py", line 857, in _fetch_all
self._result_cache = list(self.iterator())
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/db/models/query.py", line 220, in iterator
for row in compiler.results_iter():
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/db/models/sql/compiler.py", line 713, in results_iter
for rows in self.execute_sql(MULTI):
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/db/models/sql/compiler.py", line 786, in execute_sql
cursor.execute(sql, params)
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/db/backends/util.py", line 78, in execute


logger.debug('(%.3f) %s; args=%s' % (duration, sql, params),

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 450:
ordinal not in range(128)
}}}
I can't provide the data values, most likely some russian or chinese
string, but I think the bug is obvious if you look into
`django.db.backends.util.CursorDebugWrapper`:
- Python 2
- percent-formatting into bytestrings (`str`-Type)
-> UnicodeDecodeError for any non-ascii char

The solution should be as simple as: Use `six` string types

This is realy a hassle, since you have to work around '''de'''bugging Code
at the moment.

--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:3>

Django

unread,
Sep 3, 2014, 3:10:55 PM9/3/14
to django-...@googlegroups.com
#22377: SQL Logging throws an exception when fields have utf-8 characters
-------------------------------------+-------------------------------------
Reporter: rolanvc@… | Owner: nobody

Type: Bug | Status: new
Component: Database layer | Version: 1.6
(models, ORM) | Resolution:
Severity: Normal | Triage Stage:
Keywords: sql logging for | Unreviewed
utf-8 | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by claudep):

It's not about percent-formatting into bytestrings, as the file imports
`from __future__ import unicode_literals`. I suspect that content in
`header_vals` dictionary has some non-ASCII non-Unicode content, which you
should check and fix.

--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:4>

Django

unread,
Sep 5, 2014, 8:27:34 AM9/5/14
to django-...@googlegroups.com
#22377: SQL Logging throws an exception when fields have utf-8 characters
-------------------------------------+-------------------------------------
Reporter: rolanvc@… | Owner: nobody

Type: Bug | Status: new
Component: Database layer | Version: 1.6
(models, ORM) | Resolution:
Severity: Normal | Triage Stage:
Keywords: sql logging for | Unreviewed
utf-8 | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by kahnert):

It's absolutely sure that all content in `header_vals` is correctly
encoded, since it's read from utf-8 encoded CSV files. All content columns
are safe-decodeded to unicode after reading. If I turn the DebugCursor
off, all content gets into the DB and is correctly displayed when read
from the DB.

But you're right, I could not create a test case in which
{{{
from __future__ import unicode_literals

logging.debug('(%.3f) %s; args=%s' % (duration, sql, params),


extra={'duration': duration, 'sql': sql, 'params': params}
)
}}}

throws an error. Even if I put wrong encoded strings into the `params`
dict.

May it be possible that the actual DecodingError is produced by some
logging handler or formatter?
Does the used db connector make a difference?

--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:5>

Django

unread,
Sep 5, 2014, 12:47:52 PM9/5/14
to django-...@googlegroups.com
#22377: SQL Logging throws an exception when fields have utf-8 characters
-------------------------------------+-------------------------------------
Reporter: rolanvc@… | Owner: nobody

Type: Bug | Status: new
Component: Database layer | Version: 1.6
(models, ORM) | Resolution:
Severity: Normal | Triage Stage:
Keywords: sql logging for | Unreviewed
utf-8 | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by claudep):

If you can reproduce the traceback in comment:3, I'd suggest to catch the
UnicodeDecodeError in a try/except, then `pdb.set_trace()` in the except
code to look at various variables involved. This should give a hint about
the problem.

--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:6>

Django

unread,
Sep 9, 2014, 9:06:55 PM9/9/14
to django-...@googlegroups.com
#22377: SQL Logging throws an exception when fields have utf-8 characters
-------------------------------------+-------------------------------------
Reporter: rolanvc@… | Owner: nobody

Type: Bug | Status: new
Component: Database layer | Version: 1.6
(models, ORM) | Resolution:
Severity: Normal | Triage Stage:
Keywords: sql logging for | Unreviewed
utf-8 | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by brian):

I can verify this occurs with Django 1.7, and only with Python2. Python3
is fine.

{{{
import xyz.models as m
m.Class.objects.filter(attribute=u'’')
}}}

I put the following lines in above the line where it fails
(django/db/backends/util.py logger.debug line):

{{{
print type(sql)
test = u'(%.3f) %s; args=%s' % (duration, sql, params)
}}}

(Yes, I know the u is redundant, wanted to make sure)

The first line prints <type 'str'>, the second line causes the exception.

So the problem appears to be that we are basically doing the following:

{{{
In [5]: u"%s" % '’'
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call
last)
<ipython-input-5-87d36823dac9> in <module>()
----> 1 u"%s" % '’'

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0:
ordinal not in range(128)
}}}

Where we should be doing this:

{{{
In [4]: u"%s" % u'’'
Out[4]: u'\u2019'
}}}

According to ticket #18461, the result from last_executed_query should
always be unicode, however, according to my tests, that doesn't appear to
be happening.

--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:7>

Django

unread,
Sep 9, 2014, 9:24:21 PM9/9/14
to django-...@googlegroups.com
#22377: SQL Logging throws an exception when fields have utf-8 characters
-------------------------------------+-------------------------------------
Reporter: rolanvc@… | Owner: nobody

Type: Bug | Status: new
Component: Database layer | Version: 1.6
(models, ORM) | Resolution:
Severity: Normal | Triage Stage:
Keywords: sql logging for | Unreviewed
utf-8 | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by brian):

Oh crap.

`last_executed_query` is defined in the db layer, and I am using python-
mysql-connector, which defines it as:

{{{
def last_executed_query(self, cursor, sql, params):
return cursor.statement
}}}

So I guess this needs to be reported to Oracle?

--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:8>

Django

unread,
Sep 9, 2014, 9:34:40 PM9/9/14
to django-...@googlegroups.com
#22377: SQL Logging throws an exception when fields have utf-8 characters
-------------------------------------+-------------------------------------
Reporter: rolanvc@… | Owner: nobody

Type: Bug | Status: new
Component: Database layer | Version: 1.6
(models, ORM) | Resolution:
Severity: Normal | Triage Stage:
Keywords: sql logging for | Unreviewed
utf-8 | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by brian):

Reported it here: http://bugs.mysql.com/bug.php?id=73867

--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:9>

Django

unread,
Sep 9, 2014, 10:01:13 PM9/9/14
to django-...@googlegroups.com
#22377: SQL Logging throws an exception when fields have utf-8 characters
-------------------------------------+-------------------------------------
Reporter: rolanvc@… | Owner: nobody

Type: Bug | Status: new
Component: Database layer | Version: 1.6
(models, ORM) | Resolution:
Severity: Normal | Triage Stage:
Keywords: sql logging for | Unreviewed
utf-8 | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by brian):

Unfortunately nobody else here has reported the db driver being used. If
you encounter the problem and do not use python-mysql-connector, please
report the driver you use here.

I think it is rather unlikely that this this problem is reproducible with
any driver that comes supplied with Django.

--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:10>

Django

unread,
Oct 1, 2014, 11:04:01 AM10/1/14
to django-...@googlegroups.com
#22377: SQL Logging throws an exception when fields have utf-8 characters
-------------------------------------+-------------------------------------
Reporter: rolanvc@… | Owner: nobody
Type: Bug | Status: closed

Component: Database layer | Version: 1.6
(models, ORM) | Resolution: needsinfo

Severity: Normal | Triage Stage:
Keywords: sql logging for | Unreviewed
utf-8 | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by timgraham):

* status: new => closed

* resolution: => needsinfo


--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:11>

Django

unread,
Jan 25, 2016, 4:33:47 AM1/25/16
to django-...@googlegroups.com
#22377: SQL Logging throws an exception when fields have utf-8 characters
-------------------------------------+-------------------------------------
Reporter: rolanvc@… | Owner: nobody

Type: Bug | Status: closed
Component: Database layer | Version: 1.6
(models, ORM) |
Severity: Normal | Resolution: needsinfo
Keywords: sql logging for | Triage Stage:
utf-8 | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by olivergraeser):

I do see the same problem, and I use django.db.backends.mysql (which is a
Django-supplied driver to my very limited understanding). I can provide
additional information but would need some guidance.

--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:12>

Django

unread,
Jan 25, 2016, 4:34:57 AM1/25/16
to django-...@googlegroups.com
#22377: SQL Logging throws an exception when fields have utf-8 characters
-------------------------------------+-------------------------------------
Reporter: rolanvc@… | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.9

(models, ORM) |
Severity: Normal | Resolution:
Keywords: sql logging for | Triage Stage:
utf-8 | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by olivergraeser):

* status: closed => new

* version: 1.6 => 1.9
* resolution: needsinfo =>


--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:13>

Django

unread,
Jan 25, 2016, 6:55:57 AM1/25/16
to django-...@googlegroups.com
#22377: SQL Logging throws an exception when fields have utf-8 characters
-------------------------------------+-------------------------------------
Reporter: rolanvc@… | Owner: nobody

Type: Bug | Status: new
Component: Database layer | Version: 1.9
(models, ORM) |
Severity: Normal | Resolution:
Keywords: sql logging for | Triage Stage:
utf-8 | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by timgraham):

Can you provide a sample project or a test for Django's test suite that
reproduces the issue?

--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:14>

Django

unread,
Jan 26, 2016, 8:57:02 AM1/26/16
to django-...@googlegroups.com
#22377: SQL Logging throws an exception when fields have utf-8 characters
-------------------------------------+-------------------------------------
Reporter: rolanvc@… | Owner: nobody
Type: Bug | Status: closed

Component: Database layer | Version: 1.9
(models, ORM) |
Severity: Normal | Resolution: needsinfo

Keywords: sql logging for | Triage Stage:
utf-8 | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by timgraham):

* status: new => closed
* resolution: => needsinfo


Old description:

> In django/db/backends/util.py, in CursorDebugWrapper.execute, the line:
>
> logger.debug('(%.3f) %s; args=%s' % (duration, sql, params),
> extra={'duration': duration, 'sql': sql, 'params':
> params}
> )
>
> throws a "UnicodeDecodeError:ascii codec .... " when calling model.save()
> when some fields have utf-8/non-ascii values. My only workaround is to
> modify it to:
>
> logger.debug('(%.3f) %s; args=%s' % (duration, sql.decode('utf-8'),
> params),
> extra={'duration': duration, 'sql': sql, 'params':
> params}
> )
>
> I'm hoping there's a better and more elegant way that I could use.
>
> Thanks!
> =)

New description:

In django/db/backends/util.py, in CursorDebugWrapper.execute, the line:
{{{
logger.debug('(%.3f) %s; args=%s' % (duration, sql, params),
extra={'duration': duration, 'sql': sql, 'params': params}
)
}}}
throws a "UnicodeDecodeError:ascii codec .... " when calling model.save()
when some fields have utf-8/non-ascii values. My only workaround is to
modify it to:
{{{
logger.debug('(%.3f) %s; args=%s' % (duration, sql.decode('utf-8'),
params),
extra={'duration': duration, 'sql': sql, 'params': params}
)
}}}
I'm hoping there's a better and more elegant way that I could use.

Thanks!
=)

--

--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:15>

Django

unread,
Apr 15, 2016, 11:26:40 AM4/15/16
to django-...@googlegroups.com
#22377: SQL Logging throws an exception when fields have utf-8 characters
-------------------------------------+-------------------------------------
Reporter: rolanvc@… | Owner: nobody

Type: Bug | Status: closed
Component: Database layer | Version: 1.9
(models, ORM) |
Severity: Normal | Resolution: needsinfo
Keywords: sql logging for | Triage Stage:
utf-8 | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by marcinn):

I can confirm similar issue when using raw queries directly via
`connection.cursor().execute()`, when some parameters passed as `dict`
contains unicode data. It happens for `Django 1.8.6`, `Python 2.7.11`,
`psycopg2` driver and only when `DEBUG=True` is set. Currently I've
applied workaround by using native cursor
(`connection.cursor().cursor.execute()`.

Exception is raised when DebugCursor is logging a sql
(`logger.debug('(%.3f) [...]'`, at line 88 in `django.db.backends.utils`)

--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:16>

Django

unread,
Apr 15, 2016, 11:38:50 AM4/15/16
to django-...@googlegroups.com
#22377: SQL Logging throws an exception when fields have utf-8 characters
-------------------------------------+-------------------------------------
Reporter: rolanvc@… | Owner: nobody

Type: Bug | Status: closed
Component: Database layer | Version: 1.9
(models, ORM) |
Severity: Normal | Resolution: needsinfo
Keywords: sql logging for | Triage Stage:
utf-8 | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by marcinn):

Also there is another issue related to this logger - when raw query has
incorrect syntax, Django still tries to log the query (in the `finally`
block). Then original exception (a sql error) is hidden, because of
raising encoding/decoding error of the log message. Probably original
error message lands somewhere in errors wrapper. I've spent hours on
debugging unicode/deocde error instead of fixing query syntax. But when
syntax was fixed, the error with logging still exists.

--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:17>

Django

unread,
Apr 15, 2016, 11:59:27 AM4/15/16
to django-...@googlegroups.com
#22377: SQL Logging throws an exception when fields have utf-8 characters
-------------------------------------+-------------------------------------
Reporter: rolanvc@… | Owner: nobody

Type: Bug | Status: closed
Component: Database layer | Version: 1.9
(models, ORM) |
Severity: Normal | Resolution: needsinfo
Keywords: sql logging for | Triage Stage:
utf-8 | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by claudep):

Would it be possible to give us more details about the error, the
query/parameters and the traceback? We have to be able to reproduce it in
order to fix it.

--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:18>

Reply all
Reply to author
Forward
0 new messages