{{{
from django.forms import Form
from django.forms.fields import DecimalField

class MyForm(Form):
    field = DecimalField()

# On Python 2, '1\xac23' is a byte string containing the invalid UTF-8 byte 0xac.
data = {'field': '1\xac23'}
form = MyForm(data)
# Raises a DjangoUnicodeDecodeError instead of returning False and
# having a validation error on the 'field'.
form.is_valid()
}}}
I noticed this on Django 1.5, but it looks like it also reproduces on Django
1.6.
Upon investigation, it looks like smart_str was used in previous versions.
Nowadays, it looks like smart_text can throw a DjangoUnicodeDecodeError in
these cases. I believe this exception should be caught in the to_python
code and go through the same codepath as when an exception is raised in
the "Decimal(value)" code block.
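A minimal sketch of the proposed behaviour, written against the standard library only so that it runs without Django. `ValidationError` here is a stand-in class and `to_python` a free function, both assumptions made for illustration; Django's actual `DecimalField.to_python` differs. It relies on `DjangoUnicodeDecodeError` being a subclass of `UnicodeDecodeError`, so a single `except` clause can route decode failures and `Decimal(value)` failures to the same validation error.

```python
from decimal import Decimal, DecimalException


class ValidationError(Exception):
    """Stand-in for django.core.exceptions.ValidationError (illustration only)."""


def to_python(value, encoding="utf-8"):
    """Convert str or bytes input to Decimal; bad input raises ValidationError."""
    if value is None or len(value) == 0:
        return None
    try:
        if isinstance(value, bytes):
            # Decoding can fail on invalid byte sequences such as b'1\xac23'.
            value = value.decode(encoding)
        return Decimal(value.strip())
    except (UnicodeDecodeError, DecimalException):
        # Both failure modes take the same path: a validation error.
        raise ValidationError("Enter a number.")
```

With this shape, `to_python(b'1\xac23')` raises `ValidationError` rather than letting the decode error escape to the caller of `is_valid()`.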
--
Ticket URL: <https://code.djangoproject.com/ticket/21729>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* status: new => closed
* needs_better_patch: => 0
* resolution: => needsinfo
* needs_tests: => 0
* needs_docs: => 0
Comment:
I think that with current Django code, a field should never receive such
byte streams. Could you provide us with a plausible use case where such
invalid data can reach form data?
--
Ticket URL: <https://code.djangoproject.com/ticket/21729#comment:1>
Comment (by thnee):
> I think that with current Django code, a field should never receive such
> byte streams.
Not quite sure what this quote is referring to, possibly how requests are
processed and handled before going into a view?
The Form class makes no rules about where the data must come from.
I am hitting this issue when building an API using a Form's `CharField`
that takes the user's data from a CSV file.
They sent me a string that ends in a partial UTF-8 character (only the
first byte, and not the second), and the form raises
`DjangoUnicodeDecodeError` on `is_valid`.
Pretty much exactly what the original example demonstrates.
I argue that there is a precedent for catching this exception (possibly in the
`Field` class), since the job of a `Form` is to take any user input data
and produce a list of errors. And when the user did send invalid data, the
Form crashed instead of producing an error.
Here is (the relevant part of) a stack trace:
{{{
  File "/home/my_user/my_project/apps/my_app/parsers.py", line 222, in parse_feed
    if not form.is_valid():
  File "/usr/local/lib/python2.6/dist-packages/django/forms/forms.py", line 129, in is_valid
    return self.is_bound and not bool(self.errors)
  File "/usr/local/lib/python2.6/dist-packages/django/forms/forms.py", line 121, in errors
    self.full_clean()
  File "/usr/local/lib/python2.6/dist-packages/django/forms/forms.py", line 273, in full_clean
    self._clean_fields()
  File "/usr/local/lib/python2.6/dist-packages/django/forms/forms.py", line 288, in _clean_fields
    value = field.clean(value)
  File "/usr/local/lib/python2.6/dist-packages/django/forms/fields.py", line 148, in clean
    value = self.to_python(value)
  File "/usr/local/lib/python2.6/dist-packages/django/forms/fields.py", line 208, in to_python
    return smart_text(value)
  File "/usr/local/lib/python2.6/dist-packages/django/utils/encoding.py", line 73, in smart_text
    return force_text(s, encoding, strings_only, errors)
  File "/usr/local/lib/python2.6/dist-packages/django/utils/encoding.py", line 119, in force_text
    raise DjangoUnicodeDecodeError(s, *e.args)
django.utils.encoding.DjangoUnicodeDecodeError: 'utf8' codec can't decode byte 0xc3 in position 29: unexpected end of data. You passed in 'The Chesterfield brand Stor h\xc3' (<type 'str'>)
}}}
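The decode failure at the bottom of this trace can be reproduced without Django. The sketch below is Python 3 and standard library only; `decode_or_error` is a hypothetical helper, not a Django API, showing how a caller reading CSV bytes could turn the failure into an error message before the value reaches a form.

```python
def decode_or_error(raw, encoding="utf-8"):
    """Return (text, None) on success, or (None, message) if raw cannot be decoded."""
    try:
        return raw.decode(encoding), None
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first offending byte.
        message = "invalid %s input at byte %d: %s" % (encoding, exc.start, exc.reason)
        return None, message


# The value from the trace: it ends with 0xc3, the first byte of a
# two-byte UTF-8 sequence whose second byte is missing.
text, error = decode_or_error(b"The Chesterfield brand Stor h\xc3")
print(text, error)
```

This only works around the problem on the caller's side; it does not change the argument that the form itself should report such input as a validation error.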
--
Ticket URL: <https://code.djangoproject.com/ticket/21729#comment:2>
* status: closed => new
* resolution: needsinfo =>
--
Ticket URL: <https://code.djangoproject.com/ticket/21729#comment:3>
* version: 1.5 => master
* type: Bug => Cleanup/optimization
* stage: Unreviewed => Accepted
Comment:
Thanks for your input. I think that your use case makes sense. Note that
`CharField` is also subject to this issue, and other fields should be
investigated.
--
Ticket URL: <https://code.djangoproject.com/ticket/21729#comment:4>
* status: new => closed
* resolution: => worksforme
Comment:
Can't reproduce. This is probably a Python 2 problem, and Python 2 is
unsupported on current master, so I'm closing this as wontfix.
Added a test in https://github.com/django/django/pull/8314
--
Ticket URL: <https://code.djangoproject.com/ticket/21729#comment:5>
Comment (by Tim Graham):
I think the way to reproduce this is with an input like `b'1\xac23'`
(a bytestring rather than a unicode string). However, I'm not sure such an
input is feasible on Python 3. For example, similar tests have been removed in
75f0070a54925bc8d10b1f5a2b6a50fe3a1f7f50.
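A small standard-library sketch of why the failure is hard to trigger on Python 3, under the assumption that the field converts its input with `str(value)` before calling `Decimal` (the helper `parse_decimal` is hypothetical and only mimics that shape). On Python 3, `str()` of a bytes object returns its repr rather than decoding it, so no decode error can occur and the value simply fails to parse as a number.

```python
from decimal import Decimal, InvalidOperation


def parse_decimal(value):
    """Return a Decimal, or None when the input is not a valid number."""
    try:
        # str() never decodes bytes on Python 3; it yields text like "b'1\\xac23'".
        return Decimal(str(value).strip())
    except InvalidOperation:
        return None


for candidate in (b"1\xac23", "1\xac23", "1.23"):
    print(repr(candidate), parse_decimal(candidate))
```

Both the bytestring and the unicode string containing `\xac` end up on the ordinary "not a number" path here, which is consistent with the report that the original crash could not be reproduced on master.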
--
Ticket URL: <https://code.djangoproject.com/ticket/21729#comment:6>