[Django] #22305: MaxLengthValidator doesn't take database encoding into account


Django

Mar 21, 2014, 11:36:41 AM
to django-...@googlegroups.com
#22305: MaxLengthValidator doesn't take database encoding into account
----------------------------+--------------------
Reporter: joeri | Owner: nobody
Type: Bug | Status: new
Component: Forms | Version: 1.6
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
----------------------------+--------------------
Interesting issue we came across today. Consider the following:

{{{
>>> from django.db import models
>>> from django.forms.models import modelform_factory

>>> class Pizza(models.Model):
...     name = models.CharField(max_length=10)  # Short pizza names ftw.

>>> form = modelform_factory(Pizza)

>>> pizza_name = u'mozzarélla' # Note the special character.
>>> f = form(data={'name': pizza_name})
>>> f.is_valid()
True
}}}

However, when the form is saved to the database, you will see:
{{{DataError: value too long for type character varying(10)}}}.

Why? Because the form's {{{MaxLengthValidator}}} sees:
{{{
>>> len(pizza_name) # Woop, it fits!
10
}}}

And the database sees (assuming it uses UTF-8 encoding) this:
{{{
>>> len(pizza_name.encode('utf-8')) # Oops, does not fit!
11
}}}

A solution would be to replace {{{len(x)}}} in the
{{{MaxLengthValidator}}} with {{{len(x.encode('utf-8'))}}}. This, however,
does not take the actual database encoding into account...
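
To illustrate the mismatch described above, here is a minimal sketch in plain Python (no Django required). The helper name `fits_in_bytes` and its `encoding` parameter are hypothetical and are not part of Django's `MaxLengthValidator`; the sketch only shows that the result of a byte-based check depends on which encoding is assumed.

```python
# -*- coding: utf-8 -*-
def fits_in_bytes(value, max_length, encoding="utf-8"):
    """Return True if `value` takes at most `max_length` bytes once encoded.

    Hypothetical helper for illustration only; Django's validator
    compares len(value), i.e. the number of characters.
    """
    return len(value.encode(encoding)) <= max_length


pizza_name = u"mozzar\u00e9lla"  # same string as u'mozzarélla'

# Character count, which is what the form validation looks at.
char_count = len(pizza_name)

# Byte-based checks under two different assumed encodings.
utf8_fits = fits_in_bytes(pizza_name, 10)                        # 'é' takes 2 bytes
latin1_fits = fits_in_bytes(pizza_name, 10, encoding="latin-1")  # 'é' takes 1 byte

print(char_count, utf8_fits, latin1_fits)
```

The same ten-character string passes or fails depending on the encoding passed in, which is why hard-coding `'utf-8'` in the validator would not be a general fix.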

--
Ticket URL: <https://code.djangoproject.com/ticket/22305>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Mar 21, 2014, 3:25:32 PM
to django-...@googlegroups.com
#22305: MaxLengthValidator doesn't take database encoding into account
------------------------+--------------------------------------
Reporter: joeri | Owner: nobody
Type: Bug | Status: closed
Component: Forms | Version: 1.6
Severity: Normal | Resolution: worksforme
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
------------------------+--------------------------------------
Changes (by bmispelon):

* status: new => closed
* needs_better_patch: => 0
* resolution: => worksforme
* needs_tests: => 0
* needs_docs: => 0


Comment:

Hi,

Your database should validate the length of the **text**, not of the
corresponding bytes.

I tried your code and it works fine for me (on sqlite and postgres) so
there must be something else at play.

Could you show the code you're using to trigger the error and the
corresponding traceback (and reopen this ticket when you do so)?

Thanks.

--
Ticket URL: <https://code.djangoproject.com/ticket/22305#comment:1>

Django

Apr 4, 2014, 11:42:17 AM
to django-...@googlegroups.com
#22305: MaxLengthValidator doesn't take database encoding into account
------------------------+--------------------------------------

Reporter: joeri | Owner: nobody
Type: Bug | Status: new
Component: Forms | Version: 1.6
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
------------------------+--------------------------------------
Changes (by joeri):

* cc: joeri (added)
* status: closed => new
* resolution: worksforme =>


Comment:

Hi,

I was a bit incomplete in my ticket. We are using PostgreSQL 9.1 and we
traced the problem to the database charset, which was set to `SQL_ASCII`
(for some reason `template0` charset was set to this and used to create
the db). You would still expect everything to work on the db end, or that
your `ModelForm` catches the error.

Edge case, so maybe a note in the docs would suffice.

Here is the unit test that I ran against a PostgreSQL db with `SQL_ASCII`
encoding:

{{{
# -*- coding: utf-8 -*-
from django.test import TestCase


from django.db import models
from django.forms.models import modelform_factory


class Pizza(models.Model):
    name = models.CharField(max_length=10)  # Short pizza names ftw.


class MyTests(TestCase):
    def test_form_to_db(self):
        form = modelform_factory(Pizza)
        pizza_name = u'mozzarélla'

        f = form(data={'name': pizza_name})

        self.assertTrue(f.is_valid())
        f.save()  # Gives an error
}}}

And here's the result of the test:

{{{
$ python src/manage.py test
Creating test database for alias 'default'...
E
======================================================================
ERROR: test_form_to_db (tests.MyTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/joeri/playground/django-1.6/ticket22305/tests.py", line 18, in test_form
    f.save()
  File "/home/joeri/playground/django-1.6/env/local/lib/python2.7/site-packages/django/forms/models.py", line 446, in save
    construct=False)
  File "/home/joeri/playground/django-1.6/env/local/lib/python2.7/site-packages/django/forms/models.py", line 99, in save_instance
    instance.save()
  File "/home/joeri/playground/django-1.6/env/local/lib/python2.7/site-packages/django/db/models/base.py", line 545, in save
    force_update=force_update, update_fields=update_fields)
  File "/home/joeri/playground/django-1.6/env/local/lib/python2.7/site-packages/django/db/models/base.py", line 573, in save_base
    updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
  File "/home/joeri/playground/django-1.6/env/local/lib/python2.7/site-packages/django/db/models/base.py", line 654, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/home/joeri/playground/django-1.6/env/local/lib/python2.7/site-packages/django/db/models/base.py", line 687, in _do_insert
    using=using, raw=raw)
  File "/home/joeri/playground/django-1.6/env/local/lib/python2.7/site-packages/django/db/models/manager.py", line 232, in _insert
    return insert_query(self.model, objs, fields, **kwargs)
  File "/home/joeri/playground/django-1.6/env/local/lib/python2.7/site-packages/django/db/models/query.py", line 1511, in insert_query
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/home/joeri/playground/django-1.6/env/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 899, in execute_sql
    cursor.execute(sql, params)
  File "/home/joeri/playground/django-1.6/env/local/lib/python2.7/site-packages/django/db/backends/util.py", line 53, in execute
    return self.cursor.execute(sql, params)
  File "/home/joeri/playground/django-1.6/env/local/lib/python2.7/site-packages/django/db/utils.py", line 99, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/home/joeri/playground/django-1.6/env/local/lib/python2.7/site-packages/django/db/backends/util.py", line 53, in execute
    return self.cursor.execute(sql, params)
DataError: value too long for type character varying(10)
}}}

Reopening the ticket.

--
Ticket URL: <https://code.djangoproject.com/ticket/22305#comment:2>

Django

Apr 8, 2014, 5:11:02 AM
to django-...@googlegroups.com
#22305: MaxLengthValidator doesn't take database encoding into account
--------------------------------------+------------------------------------
Reporter: joeri | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: Documentation | Version: 1.6
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by mjtamlyn):

* component: Forms => Documentation
* type: Bug => Cleanup/optimization
* stage: Unreviewed => Accepted


Comment:

Marking this as a documentation issue. Django assumes UTF8 everywhere and
takes great pains to ensure you can assume this. All we need to do is add
a small note that your database should be configured to use UTF8 as well.

Despite anything else, if a user enters `mozzarélla` and gets an error
saying they must only type 10 characters that is horrible UX.
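
To check whether a PostgreSQL database is configured as described above, one option is to ask the server for its encoding. The sketch below assumes a DB-API cursor connected to PostgreSQL (for instance the one returned by Django's `connection.cursor()`); the function names `get_server_encoding` and `is_utf8` are hypothetical and not part of Django.

```python
def get_server_encoding(cursor):
    """Return the encoding name reported by the PostgreSQL server.

    `cursor` is assumed to be a DB-API cursor on a PostgreSQL connection.
    """
    cursor.execute("SHOW server_encoding")
    return cursor.fetchone()[0]


def is_utf8(encoding_name):
    """Return True for common spellings of UTF-8 ('UTF8', 'utf-8', 'UTF_8')."""
    normalized = encoding_name.replace("-", "").replace("_", "").upper()
    return normalized == "UTF8"


# Possible usage from a Django shell (requires a configured project):
#   from django.db import connection
#   is_utf8(get_server_encoding(connection.cursor()))
```

A database created with the `SQL_ASCII` charset, as in the report above, would not pass this check.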

--
Ticket URL: <https://code.djangoproject.com/ticket/22305#comment:3>

Django

May 8, 2014, 8:47:29 PM
to django-...@googlegroups.com
#22305: MaxLengthValidator doesn't take database encoding into account
--------------------------------------+------------------------------------
Reporter: joeri | Owner: nobody

Type: Cleanup/optimization | Status: new
Component: Documentation | Version: 1.6
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------

Comment (by shai):

I just closed ticket 17202 as a duplicate of this bug. That ticket (in a
[comment:2:ticket:17202 comment]) complains about the wrong length being
reported in introspection when a MySQL table is defined to use latin1.

--
Ticket URL: <https://code.djangoproject.com/ticket/22305#comment:4>

Django

Jul 26, 2014, 7:39:05 AM
to django-...@googlegroups.com
#22305: MaxLengthValidator doesn't take database encoding into account
--------------------------------------+------------------------------------
Reporter: joeri | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: Documentation | Version: master

Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by nip3o):

* has_patch: 0 => 1
* version: 1.6 => master


Comment:

https://github.com/django/django/pull/2970

--
Ticket URL: <https://code.djangoproject.com/ticket/22305#comment:5>

Django

Jul 26, 2014, 9:40:37 AM
to django-...@googlegroups.com
#22305: MaxLengthValidator doesn't take database encoding into account
--------------------------------------+------------------------------------
Reporter: joeri | Owner: nobody
Type: Cleanup/optimization | Status: closed
Component: Documentation | Version: master
Severity: Normal | Resolution: fixed

Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by Tim Graham <timograham@…>):

* status: new => closed

* resolution: => fixed


Comment:

In [changeset:"08b85de9b7a8940702dba9348b642538da888c6c"]:
{{{
#!CommitTicketReference repository=""
revision="08b85de9b7a8940702dba9348b642538da888c6c"
Fixed #22305 -- Added note to docs about UTF8 database requirement.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/22305#comment:6>

Django

Jul 26, 2014, 9:41:03 AM
to django-...@googlegroups.com
#22305: MaxLengthValidator doesn't take database encoding into account
--------------------------------------+------------------------------------
Reporter: joeri | Owner: nobody

Type: Cleanup/optimization | Status: closed
Component: Documentation | Version: master
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------

Comment (by Tim Graham <timograham@…>):

In [changeset:"1c714c18d22a38ad2dca696ca8bc4201cc3da131"]:
{{{
#!CommitTicketReference repository=""
revision="1c714c18d22a38ad2dca696ca8bc4201cc3da131"
[1.6.x] Fixed #22305 -- Added note to docs about UTF8 database
requirement.

Backport of 08b85de9b7 from master
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/22305#comment:7>

Django

Jul 26, 2014, 9:41:04 AM
to django-...@googlegroups.com
#22305: MaxLengthValidator doesn't take database encoding into account
--------------------------------------+------------------------------------
Reporter: joeri | Owner: nobody

Type: Cleanup/optimization | Status: closed
Component: Documentation | Version: master
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------

Comment (by Tim Graham <timograham@…>):

In [changeset:"ddf2b7d96b1048b1210d2315d92275496352ccaf"]:
{{{
#!CommitTicketReference repository=""
revision="ddf2b7d96b1048b1210d2315d92275496352ccaf"
[1.7.x] Fixed #22305 -- Added note to docs about UTF8 database
requirement.

Backport of 08b85de9b7 from master
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/22305#comment:8>
