logger.debug('(%.3f) %s; args=%s' % (duration, sql, params),
extra={'duration': duration, 'sql': sql, 'params': params}
)
throws a "UnicodeDecodeError:ascii codec .... " when calling model.save()
when some fields have utf-8/non-ascii values. My only workaround is to
modify it to:
logger.debug('(%.3f) %s; args=%s' % (duration, sql.decode('utf-8'),
params),
extra={'duration': duration, 'sql': sql, 'params': params}
)
I'm hoping there's a better and more elegant way that I could use.
Thanks!
=)
--
Ticket URL: <https://code.djangoproject.com/ticket/22377>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0
Comment:
Could you please provide the full traceback? It would also be useful to
know the values of duration, sql and params variables just before the
crash.
--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:1>
* status: new => closed
* type: Uncategorized => Bug
* resolution: => needsinfo
--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:2>
* status: closed => new
* resolution: needsinfo =>
Comment:
Here is a traceback:
{{{
Traceback (most recent call last):
File "./manage.py", line 10, in <module>
execute_from_command_line(sys.argv)
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/core/management/__init__.py", line 399, in
execute_from_command_line
utility.execute()
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/core/management/__init__.py", line 392, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/core/management/base.py", line 242, in run_from_argv
self.execute(*args, **options.__dict__)
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/core/management/base.py", line 285, in execute
output = self.handle(*args, **options)
File
"/home/kahnert/pycharm/schedules/schedules/data/management/commands/import_data.py",
line 35, in handle
self.parse_file(fname)
File
"/home/kahnert/pycharm/schedules/schedules/data/management/commands/import_data.py",
line 40, in parse_file
self.parse_entry(cols)
File
"/home/kahnert/pycharm/schedules/schedules/data/management/commands/import_data.py",
line 61, in parse_entry
row_header, created = OIRowHeader.objects.get_or_create(**header_vals)
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/db/models/manager.py", line 154, in get_or_create
return self.get_queryset().get_or_create(**kwargs)
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/db/models/query.py", line 376, in get_or_create
return self.get(**lookup), False
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/db/models/query.py", line 304, in get
num = len(clone)
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/db/models/query.py", line 77, in __len__
self._fetch_all()
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/db/models/query.py", line 857, in _fetch_all
self._result_cache = list(self.iterator())
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/db/models/query.py", line 220, in iterator
for row in compiler.results_iter():
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/db/models/sql/compiler.py", line 713, in results_iter
for rows in self.execute_sql(MULTI):
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/db/models/sql/compiler.py", line 786, in execute_sql
cursor.execute(sql, params)
File "/mnt/schedenv/local/lib/python2.7/site-
packages/django/db/backends/util.py", line 78, in execute
logger.debug('(%.3f) %s; args=%s' % (duration, sql, params),
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 450:
ordinal not in range(128)
}}}
I can't provide the data values, most likely some russian or chinese
string, but I think the bug is obvious if you look into
`django.db.backends.util.CursorDebugWrapper`:
- Python 2
- percent-formatting into bytestrings (`str`-Type)
-> UnicodeDecodeError for any non-ascii char
The solution should be as simple as: Use `six` string types
This is realy a hassle, since you have to work around '''de'''bugging Code
at the moment.
--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:3>
Comment (by claudep):
It's not about percent-formatting into bytestrings, as the file imports
`from __future__ import unicode_literals`. I suspect that content in
`header_vals` dictionary has some non-ASCII non-Unicode content, which you
should check and fix.
--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:4>
Comment (by kahnert):
It's absolutely sure that all content in `header_vals` is correctly
encoded, since it's read from utf-8 encoded CSV files. All content columns
are safe-decodeded to unicode after reading. If I turn the DebugCursor
off, all content gets into the DB and is correctly displayed when read
from the DB.
But you're right, I could not create a test case in which
{{{
from __future__ import unicode_literals
logging.debug('(%.3f) %s; args=%s' % (duration, sql, params),
extra={'duration': duration, 'sql': sql, 'params': params}
)
}}}
throws an error. Even if I put wrong encoded strings into the `params`
dict.
May it be possible that the actual DecodingError is produced by some
logging handler or formatter?
Does the used db connector make a difference?
--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:5>
Comment (by claudep):
If you can reproduce the traceback in comment:3, I'd suggest to catch the
UnicodeDecodeError in a try/except, then `pdb.set_trace()` in the except
code to look at various variables involved. This should give a hint about
the problem.
--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:6>
Comment (by brian):
I can verify this occurs with Django 1.7, and only with Python2. Python3
is fine.
{{{
import xyz.models as m
m.Class.objects.filter(attribute=u'’')
}}}
I put the following lines in above the line where it fails
(django/db/backends/util.py logger.debug line):
{{{
print type(sql)
test = u'(%.3f) %s; args=%s' % (duration, sql, params)
}}}
(Yes, I know the u is redundant, wanted to make sure)
The first line prints <type 'str'>, the second line causes the exception.
So the problem appears to be that we are basically doing the following:
{{{
In [5]: u"%s" % '’'
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call
last)
<ipython-input-5-87d36823dac9> in <module>()
----> 1 u"%s" % '’'
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0:
ordinal not in range(128)
}}}
Where we should be doing this:
{{{
In [4]: u"%s" % u'’'
Out[4]: u'\u2019'
}}}
According to ticket #18461, the result from last_executed_query should
always be unicode, however, according to my tests, that doesn't appear to
be happening.
--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:7>
Comment (by brian):
Oh crap.
`last_executed_query` is defined in the db layer, and I am using python-
mysql-connector, which defines it as:
{{{
def last_executed_query(self, cursor, sql, params):
return cursor.statement
}}}
So I guess this needs to be reported to Oracle?
--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:8>
Comment (by brian):
Reported it here: http://bugs.mysql.com/bug.php?id=73867
--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:9>
Comment (by brian):
Unfortunately nobody else here has reported the db driver being used. If
you encounter the problem and do not use python-mysql-connector, please
report the driver you use here.
I think it is rather unlikely that this this problem is reproducible with
any driver that comes supplied with Django.
--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:10>
* status: new => closed
* resolution: => needsinfo
--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:11>
Comment (by olivergraeser):
I do see the same problem, and I use django.db.backends.mysql (which is a
Django-supplied driver to my very limited understanding). I can provide
additional information but would need some guidance.
--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:12>
* status: closed => new
* version: 1.6 => 1.9
* resolution: needsinfo =>
--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:13>
Comment (by timgraham):
Can you provide a sample project or a test for Django's test suite that
reproduces the issue?
--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:14>
* status: new => closed
* resolution: => needsinfo
Old description:
> In django/db/backends/util.py, in CursorDebugWrapper.execute, the line:
>
> logger.debug('(%.3f) %s; args=%s' % (duration, sql, params),
> extra={'duration': duration, 'sql': sql, 'params':
> params}
> )
>
> throws a "UnicodeDecodeError:ascii codec .... " when calling model.save()
> when some fields have utf-8/non-ascii values. My only workaround is to
> modify it to:
>
> logger.debug('(%.3f) %s; args=%s' % (duration, sql.decode('utf-8'),
> params),
> extra={'duration': duration, 'sql': sql, 'params':
> params}
> )
>
> I'm hoping there's a better and more elegant way that I could use.
>
> Thanks!
> =)
New description:
In django/db/backends/util.py, in CursorDebugWrapper.execute, the line:
{{{
logger.debug('(%.3f) %s; args=%s' % (duration, sql, params),
extra={'duration': duration, 'sql': sql, 'params': params}
)
}}}
throws a "UnicodeDecodeError:ascii codec .... " when calling model.save()
when some fields have utf-8/non-ascii values. My only workaround is to
modify it to:
{{{
logger.debug('(%.3f) %s; args=%s' % (duration, sql.decode('utf-8'),
params),
extra={'duration': duration, 'sql': sql, 'params': params}
)
}}}
I'm hoping there's a better and more elegant way that I could use.
Thanks!
=)
--
--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:15>
Comment (by marcinn):
I can confirm similar issue when using raw queries directly via
`connection.cursor().execute()`, when some parameters passed as `dict`
contains unicode data. It happens for `Django 1.8.6`, `Python 2.7.11`,
`psycopg2` driver and only when `DEBUG=True` is set. Currently I've
applied workaround by using native cursor
(`connection.cursor().cursor.execute()`.
Exception is raised when DebugCursor is logging a sql
(`logger.debug('(%.3f) [...]'`, at line 88 in `django.db.backends.utils`)
--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:16>
Comment (by marcinn):
Also there is another issue related to this logger - when raw query has
incorrect syntax, Django still tries to log the query (in the `finally`
block). Then original exception (a sql error) is hidden, because of
raising encoding/decoding error of the log message. Probably original
error message lands somewhere in errors wrapper. I've spent hours on
debugging unicode/deocde error instead of fixing query syntax. But when
syntax was fixed, the error with logging still exists.
--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:17>
Comment (by claudep):
Would it be possible to give us more details about the error, the
query/parameters and the traceback? We have to be able to reproduce it in
order to fix it.
--
Ticket URL: <https://code.djangoproject.com/ticket/22377#comment:18>