[Django] #21729: DecimalField.to_python() fails on values with invalid unicode start byte


Django

Jan 3, 2014, 2:13:34 PM
to django-...@googlegroups.com
#21729: DecimalField.to_python() fails on values with invalid unicode start byte
-----------------------------------+--------------------
Reporter: brett_energysavvy | Owner: nobody
Type: Bug | Status: new
Component: Forms | Version: 1.5
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
-----------------------------------+--------------------
Consider the following example:


{{{
from django.forms import Form
from django.forms.fields import DecimalField

class MyForm(Form):
    field = DecimalField()

data = {'field': '1\xac23'}
form = MyForm(data)
# This raises a DjangoUnicodeDecodeError instead of returning False
# with a validation error on 'field'.
form.is_valid()
}}}

I noticed this on Django 1.5, but it looks like it also reproduces on Django
1.6.

Upon investigation, it looks like smart_str was used in previous versions.
Nowadays, it looks like smart_text can throw a DjangoUnicodeDecodeError in
these cases. I believe this exception should be caught in the to_python
code and go through the same codepath as when an exception is raised in
the "Decimal(value)" code block.
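
A minimal sketch of that idea, written in plain Python 3 rather than against
Django itself. `ValidationError` here is a hypothetical stand-in for Django's
exception, and `to_python` is an illustrative free function, not the actual
`DecimalField.to_python()`. It assumes UTF-8 input and shows a decode failure
and a `Decimal()` failure taking the same error path:

```python
from decimal import Decimal, InvalidOperation


class ValidationError(Exception):
    """Hypothetical stand-in for django.core.exceptions.ValidationError."""


def to_python(value, encoding="utf-8"):
    """Convert raw input to a Decimal, reporting bad input as ValidationError."""
    try:
        # Undecodable bytes fail here, before Decimal() is ever called.
        if isinstance(value, bytes):
            value = value.decode(encoding)
        return Decimal(str(value).strip())
    except (UnicodeDecodeError, InvalidOperation):
        # Both failure modes share one validation error path.
        raise ValidationError("Enter a number.")
```

With this shape, `to_python(b"1\xac23")` and `to_python("abc")` both raise the
stand-in `ValidationError`, while `to_python("1.5")` returns `Decimal("1.5")`.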

--
Ticket URL: <https://code.djangoproject.com/ticket/21729>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Jan 11, 2014, 10:58:27 AM
to django-...@googlegroups.com
#21729: DecimalField.to_python() fails on values with invalid unicode start byte
-----------------------------------+--------------------------------------
Reporter: brett_energysavvy | Owner: nobody
Type: Bug | Status: closed
Component: Forms | Version: 1.5
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-----------------------------------+--------------------------------------
Changes (by claudep):

* status: new => closed
* needs_better_patch: => 0
* resolution: => needsinfo
* needs_tests: => 0
* needs_docs: => 0


Comment:

I think that with current Django code, a field should never receive such
byte streams. Could you provide us with a plausible use case where such
invalid data can reach form data?

--
Ticket URL: <https://code.djangoproject.com/ticket/21729#comment:1>

Django

Feb 9, 2015, 7:23:22 AM
to django-...@googlegroups.com
#21729: DecimalField.to_python() fails on values with invalid unicode start byte
-----------------------------------+--------------------------------------
Reporter: brett_energysavvy | Owner: nobody
Type: Bug | Status: closed
Component: Forms | Version: 1.5
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-----------------------------------+--------------------------------------

Comment (by thnee):

> I think that with current Django code, a field should never receive such
> byte streams.

Not quite sure what this quote is referring to, possibly how requests are
processed and handled before going into a view?
The Form class makes no rules about where the data must come from.

I am hitting this issue when building an API using a Form's `CharField`
that takes the user's data from a CSV file.
They sent me a string that ends in a partial UTF-8 character (only the
first byte, and not the second), and the form raises
`DjangoUnicodeDecodeError` on `is_valid`.
Pretty much exactly what the original example demonstrates.
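
To illustrate the failure mode described above, here is a short sketch in
Python 3. The sample value is assumed for illustration and is not the data
from the CSV file. Cutting a two-byte UTF-8 character after its first byte
leaves a lone `0xc3`, which the decoder rejects:

```python
# Assumed sample text ending in a two-byte UTF-8 character (U+00E9).
text = "h\u00e9"
encoded = text.encode("utf-8")   # two bytes for the last character: 0xc3 0xa9
truncated = encoded[:-1]         # drop the second byte, as in a cut-off file

error = None
try:
    truncated.decode("utf-8")
except UnicodeDecodeError as exc:
    # The decoder sees a lead byte with no continuation byte after it.
    error = exc
```

Here `error.reason` is the same "unexpected end of data" message that appears
in the stack trace in this comment.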

I argue that there is a precedent for catching this exception (possibly in
the `Field` class), since the job of a `Form` is to take any user input data
and produce a list of errors. Here, when the user did send invalid data, the
`Form` crashed instead of producing an error.

Here is (the relevant part of) a stack trace:

{{{
  File "/home/my_user/my_project/apps/my_app/parsers.py", line 222, in parse_feed
    if not form.is_valid():
  File "/usr/local/lib/python2.6/dist-packages/django/forms/forms.py", line 129, in is_valid
    return self.is_bound and not bool(self.errors)
  File "/usr/local/lib/python2.6/dist-packages/django/forms/forms.py", line 121, in errors
    self.full_clean()
  File "/usr/local/lib/python2.6/dist-packages/django/forms/forms.py", line 273, in full_clean
    self._clean_fields()
  File "/usr/local/lib/python2.6/dist-packages/django/forms/forms.py", line 288, in _clean_fields
    value = field.clean(value)
  File "/usr/local/lib/python2.6/dist-packages/django/forms/fields.py", line 148, in clean
    value = self.to_python(value)
  File "/usr/local/lib/python2.6/dist-packages/django/forms/fields.py", line 208, in to_python
    return smart_text(value)
  File "/usr/local/lib/python2.6/dist-packages/django/utils/encoding.py", line 73, in smart_text
    return force_text(s, encoding, strings_only, errors)
  File "/usr/local/lib/python2.6/dist-packages/django/utils/encoding.py", line 119, in force_text
    raise DjangoUnicodeDecodeError(s, *e.args)
django.utils.encoding.DjangoUnicodeDecodeError: 'utf8' codec can't decode byte 0xc3 in position 29: unexpected end of data. You passed in 'The Chesterfield brand Stor h\xc3' (<type 'str'>)
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/21729#comment:2>

Django

Feb 9, 2015, 7:48:52 AM
to django-...@googlegroups.com
#21729: DecimalField.to_python() fails on values with invalid unicode start byte
-----------------------------------+--------------------------------------

Reporter: brett_energysavvy | Owner: nobody
Type: Bug | Status: new
Component: Forms | Version: 1.5
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-----------------------------------+--------------------------------------
Changes (by arthurk):

* status: closed => new
* resolution: needsinfo =>


--
Ticket URL: <https://code.djangoproject.com/ticket/21729#comment:3>

Django

Feb 9, 2015, 12:39:58 PM
to django-...@googlegroups.com
#21729: DecimalField.to_python() fails on values with invalid unicode start byte
--------------------------------------+------------------------------------
Reporter: brett_energysavvy | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: Forms | Version: master
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by claudep):

* version: 1.5 => master
* type: Bug => Cleanup/optimization
* stage: Unreviewed => Accepted


Comment:

Thanks for your input. I think that your use case makes sense. Note that
`CharField` is also subject to this issue, and other fields should be
investigated.

--
Ticket URL: <https://code.djangoproject.com/ticket/21729#comment:4>

Django

Apr 6, 2017, 11:58:51 AM
to django-...@googlegroups.com
#21729: DecimalField.to_python() fails on values with invalid unicode start byte
-------------------------------------+-------------------------------------
Reporter: brett_energysavvy | Owner: nobody
Type: Cleanup/optimization | Status: closed
Component: Forms | Version: master
Severity: Normal | Resolution: worksforme
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Raphael Michel):

* status: new => closed

* resolution: => worksforme


Comment:

Can't reproduce. This is probably a Python 2 problem, and Python 2 is
unsupported on current master, so I'm closing this as worksforme.

Added a test in https://github.com/django/django/pull/8314

--
Ticket URL: <https://code.djangoproject.com/ticket/21729#comment:5>

Django

Apr 6, 2017, 12:56:21 PM
to django-...@googlegroups.com
#21729: DecimalField.to_python() fails on values with invalid unicode start byte
-------------------------------------+-------------------------------------
Reporter: brett_energysavvy | Owner: nobody
Type: Cleanup/optimization | Status: closed
Component: Forms | Version: master
Severity: Normal | Resolution: worksforme
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tim Graham):

I think the way to reproduce this is with an input like `b'1\xac23'` (a
bytestring rather than a unicode string). However, I'm not sure such an
input is feasible on Python 3. For example, similar tests have been removed
in 75f0070a54925bc8d10b1f5a2b6a50fe3a1f7f50.
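
The distinction between the two inputs can be sketched in plain Python 3,
with no Django involved (variable names are illustrative only). As a `str`,
`'1\xac23'` is already valid text containing U+00AC, so there is nothing left
to decode. Only the `bytes` form can trigger the decode error named in the
ticket title:

```python
text = "1\xac23"   # str: \xac is the code point U+00AC, a valid character
raw = b"1\xac23"   # bytes: 0xac is a UTF-8 continuation byte with no lead byte

# The str form has four characters and needs no decoding step.
second_char = text[1]

reason = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # 0xac cannot begin a UTF-8 sequence, hence "invalid start byte".
    reason = exc.reason
```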

--
Ticket URL: <https://code.djangoproject.com/ticket/21729#comment:6>
