If a `modelformset_factory()` is used to create a form set for a model
with a unique field, the user can enter two strings differing only by
case. The `BaseModelFormSet.validate_unique()` does not consider MySQL's
collation rules and considers these two strings different. Upon insertion
in the database, there is a constraint violation. The following ''new''
unit test demonstrates how this could happen. Code:
<https://github.com/jdufresne/django/tree/mysql-formset-duplicates>.
Not sure the correct approach to fix this as other database backends will
happily accept strings differing only by case. So the formset needs to
have some sense of the database backend or whether or not unique fields
should be case sensitive.
{{{
diff --git a/tests/forms_tests/models.py b/tests/forms_tests/models.py
index 82fa005..45d9680 100644
--- a/tests/forms_tests/models.py
+++ b/tests/forms_tests/models.py
@@ -110,3 +110,7 @@ class Cheese(models.Model):
class Article(models.Model):
content = models.TextField()
+
+
+class Tag(models.Model):
+ name = models.CharField(max_length=100, unique=True)
diff --git a/tests/forms_tests/tests/test_formsets.py
b/tests/forms_tests/tests/test_formsets.py
index 94e2704..2e77afd 100644
--- a/tests/forms_tests/tests/test_formsets.py
+++ b/tests/forms_tests/tests/test_formsets.py
@@ -6,8 +6,10 @@ import datetime
from django.forms import (CharField, DateField, FileField, Form,
IntegerField,
SplitDateTimeField, ValidationError, formsets)
from django.forms.formsets import BaseFormSet, formset_factory
+from django.forms.models import modelformset_factory
from django.forms.utils import ErrorList
from django.test import TestCase
+from ..models import Tag
class Choice(Form):
@@ -1213,3 +1215,22 @@ class TestEmptyFormSet(TestCase):
class FileForm(Form):
file = FileField()
self.assertTrue(formset_factory(FileForm,
extra=0)().is_multipart())
+
+
+TagFormSet = modelformset_factory(Tag, fields=['name'])
+
+
+class ModelFormSetTestCase(TestCase):
+ def test_duplicates_mixed_case(self):
+ formset = TagFormSet({
+ 'form-TOTAL_FORMS': 2,
+ 'form-INITIAL_FORMS': 0,
+ 'form-0-name': 'TAG',
+ 'form-1-name': 'tag',
+ })
+ self.assertTrue(formset.is_valid())
+ formset.save()
+ self.assertQuerysetEqual(
+ Tag.objects.all(),
+ ['TAG'],
+ lambda o: o.name)
}}}
When running this test with a MySQL database the test fails with:
{{{
======================================================================
ERROR: test_duplicates_mixed_case
(forms_tests.tests.test_formsets.ModelFormSetTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/jon/devel/django/tests/forms_tests/tests/test_formsets.py",
line 1232, in test_duplicates_mixed_case
formset.save()
File "/home/jon/devel/django/django/forms/models.py", line 638, in save
return self.save_existing_objects(commit) +
self.save_new_objects(commit)
File "/home/jon/devel/django/django/forms/models.py", line 769, in
save_new_objects
self.new_objects.append(self.save_new(form, commit=commit))
File "/home/jon/devel/django/django/forms/models.py", line 621, in
save_new
return form.save(commit=commit)
File "/home/jon/devel/django/django/forms/models.py", line 461, in save
construct=False)
File "/home/jon/devel/django/django/forms/models.py", line 103, in
save_instance
instance.save()
File "/home/jon/devel/django/django/db/models/base.py", line 694, in
save
force_update=force_update, update_fields=update_fields)
File "/home/jon/devel/django/django/db/models/base.py", line 722, in
save_base
updated = self._save_table(raw, cls, force_insert, force_update,
using, update_fields)
File "/home/jon/devel/django/django/db/models/base.py", line 803, in
_save_table
result = self._do_insert(cls._base_manager, using, fields, update_pk,
raw)
File "/home/jon/devel/django/django/db/models/base.py", line 842, in
_do_insert
using=using, raw=raw)
File "/home/jon/devel/django/django/db/models/manager.py", line 86, in
manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/home/jon/devel/django/django/db/models/query.py", line 952, in
_insert
return query.get_compiler(using=using).execute_sql(return_id)
File "/home/jon/devel/django/django/db/models/sql/compiler.py", line
930, in execute_sql
cursor.execute(sql, params)
File "/home/jon/devel/django/django/db/backends/utils.py", line 65, in
execute
return self.cursor.execute(sql, params)
File "/home/jon/devel/django/django/db/utils.py", line 95, in __exit__
six.reraise(dj_exc_type, dj_exc_value, traceback)
File "/home/jon/devel/django/django/db/backends/utils.py", line 65, in
execute
return self.cursor.execute(sql, params)
File "/home/jon/devel/django/django/db/backends/mysql/base.py", line
126, in execute
return self.cursor.execute(query, args)
File "/usr/lib64/python2.7/site-packages/MySQLdb/cursors.py", line 174,
in execute
self.errorhandler(self, exc, value)
File "/usr/lib64/python2.7/site-packages/MySQLdb/connections.py", line
36, in defaulterrorhandler
raise errorclass, errorvalue
IntegrityError: (1062, "Duplicate entry 'tag' for key 'name'")
----------------------------------------------------------------------
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/23964>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0
Comment:
I am able to workaround this in my project by changing the line in
BaseModelFormSet.validate_unique() from:
{{{
row_data = tuple([form.cleaned_data[field] for field in unique_check if
field in form.cleaned_data])
}}}
To (adding `.lower()`):
{{{
row_data = tuple([form.cleaned_data[field].lower() for field in
unique_check if field in form.cleaned_data])
}}}
I feel this is a hack and wrongly assumes that the database is case-
insensitive. This is OK for my project, but not as a proper fix.
I'd appreciate some guidance on how this should be fixed in the general
case to handle case-insensitive databases such as MySQL as well as case-
sensitive databases such as PostgreSQL.
--
Ticket URL: <https://code.djangoproject.com/ticket/23964#comment:1>
Comment (by timgraham):
Do you really want that collation? Have you read the
[https://docs.djangoproject.com/en/dev/ref/databases/#collation-settings
Django docs] on the topic? I am not sure a specific mention of this
problem is needed, but if you care to draft something, I'll happily review
it.
It seems like a lot of complexity for Django to try to inspect MySQL's
collation and alter its behavior based on that, and I'm not sure it's
worth it.
--
Ticket URL: <https://code.djangoproject.com/ticket/23964#comment:2>
* cc: jon.dufresne@… (added)
Comment:
> Do you really want that collation? Have you read the Django docs on the
topic?
Thank you for pointing me to that documentation. I have read through it.
By "that collation" do you mean the ''default'' of `utf8_general_ci`?
To be honest, I find it a bit strange that Django is saying it can't work
properly with MySQL--a supported database bacekend--in the default MySQL
configuration. It seems to be suggesting I change my database collation to
`utf8_bin` (which I can do if I must) but changing to that collation
creates different set of issues (bytestrings vs Unicode strings). So I
guess I have to choose which issues I can tolerate most.
> I am not sure a specific mention of this problem is needed, but if you
care to draft something, I'll happily review it.
Sure, I'll take a stab at it.
> It seems like a lot of complexity for Django to try to inspect MySQL's
collation and alter its behavior based on that, and I'm not sure it's
worth it.
I agree we want to avoid inspecting MySQL's collation.
This is why I was wondering if there was any guidance to fix this in a
general way that didn't look like a complete hack around MySQL specific
behavior. Basically, I'm asking for some ideas. Once those ideas are
presented, I don't mind doing the work to get them in place.
If it isn't worth it, that means this default database backend will
knowingly break when used with useful Django features. Those useful
features: model formsets and unique constraints.
--
Ticket URL: <https://code.djangoproject.com/ticket/23964#comment:3>
Comment (by Tim Graham <timograham@…>):
In [changeset:"4d27d72d149b714431b77f2f15bad1591a9602b7"]:
{{{
#!CommitTicketReference repository=""
revision="4d27d72d149b714431b77f2f15bad1591a9602b7"
Refs #23964 -- Added warning about case-insensitive, unique fields used
with formsets
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/23964#comment:4>
Comment (by Tim Graham <timograham@…>):
In [changeset:"4862eb6e5b6dae63e5c156fa1095c5b87503620b"]:
{{{
#!CommitTicketReference repository=""
revision="4862eb6e5b6dae63e5c156fa1095c5b87503620b"
[1.7.x] Refs #23964 -- Added warning about case-insensitive, unique fields
used with formsets
Backport of 4d27d72d149b714431b77f2f15bad1591a9602b7 from master
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/23964#comment:5>
Comment (by Tim Graham <timograham@…>):
In [changeset:"30a12d6ca66ae2558740ef05f4a0b44566cfefa3"]:
{{{
#!CommitTicketReference repository=""
revision="30a12d6ca66ae2558740ef05f4a0b44566cfefa3"
[1.6.x] Refs #23964 -- Added warning about case-insensitive, unique fields
used with formsets
Backport of 4d27d72d149b714431b77f2f15bad1591a9602b7 from master
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/23964#comment:6>
* stage: Unreviewed => Accepted
Comment:
Thanks for the documentation patch. I don't see an easy way to solve the
problem generally. An option on the formset won't work if multiple
databases are in use and third-party apps cannot know what database they
are running on. A model method like `get_normalized_unique_value()` could
inspect the database indicated by the write router and handle
database/table specific behavior. The formset could call that then.
Perhaps try raising the idea on the DevelopersMailingList to get other
opinions. I'll mark the ticket accepted for now, and we can close it later
if we decide it isn't worthwhile.
--
Ticket URL: <https://code.djangoproject.com/ticket/23964#comment:7>
Comment (by Mariusz Felisiak):
As far as I'm aware adding support for `Meta.constraints` validation in
`BaseModelFormSet.validate_unique()` should solve this and related issues
(#28405). It should be feasible to pass all values from the formset via a
constraint definition and check for duplicated rows on the database level.
--
Ticket URL: <https://code.djangoproject.com/ticket/23964#comment:9>
* cc: Hannes Ljungberg (added)
Comment:
#34296 was a duplicate for `UniqueConstraint` that only use `F()`
expressions.
--
Ticket URL: <https://code.djangoproject.com/ticket/23964#comment:10>
* cc: jon.dufresne@… (removed)
--
Ticket URL: <https://code.djangoproject.com/ticket/23964#comment:11>