[Django] #22458: MySQL notes recommend legacy utf8_general_ci unicode collation

Django

Apr 16, 2014, 10:52:12 AM
to django-...@googlegroups.com
#22458: MySQL notes recommend legacy utf8_general_ci unicode collation
--------------------------------------+------------------------
Reporter: tobami@… | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: Documentation | Version: 1.7-beta-1
Severity: Normal | Keywords: unicode
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 1 | UI/UX: 0
--------------------------------------+------------------------
The documentation section "MySQL notes" recommends the obsolete
utf8_general_ci collation settings:
"By default, with a UTF-8 database, MySQL will use the utf8_general_ci
collation." [0]
and
"... you should still use utf8_general_ci (the default) collation for the
django.contrib.sessions.models.Session table"

While it may still be the default depending on your MySQL version, MySQL
itself recommends utf8_unicode_ci instead of utf8_general_ci, as the latter
can be incorrect for some characters and languages and its performance
benefits are no longer relevant. From the MySQL docs themselves:
"utf8_general_ci is a legacy collation that does not support expansions,
contractions, or ignorable characters." [1]
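
To make "expansions" concrete: the MySQL manual's own example is that
utf8_general_ci compares "ß" as equal to "s", while utf8_unicode_ci compares
it as equal to "ss". The sketch below is only a toy model of that one
difference. The mapping tables and function names are made up for
illustration; MySQL's real collations use full weight tables, not a
one-entry dict.

```python
# Toy model of a collation "expansion". This is NOT MySQL's algorithm;
# the tables and helper names below are hypothetical, for illustration only.

# utf8_general_ci style: every character maps to exactly one weight.
GENERAL_CI = {"ß": "s"}

# utf8_unicode_ci style: one character may expand to several weights.
UNICODE_CI = {"ß": "ss"}


def sort_key(text, table):
    """Build a case-insensitive comparison key using the given table."""
    return "".join(table.get(ch, ch) for ch in text.lower())


def equal_under(a, b, table):
    """Return True if a and b compare equal under the toy collation."""
    return sort_key(a, table) == sort_key(b, table)
```

Under this model, "Straße" and "Strasse" compare equal with the
UNICODE_CI table but not with GENERAL_CI, where "Straße" instead matches
"Strase".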

Using utf8_general_ci can cause text issues that are difficult to debug.
IMO Django should update its MySQL collation recommendation to
utf8_unicode_ci.

[0] https://docs.djangoproject.com/en/dev/ref/databases/#collation-settings
[1] http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-sets.html

--
Ticket URL: <https://code.djangoproject.com/ticket/22458>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Apr 16, 2014, 11:09:53 AM
to django-...@googlegroups.com
#22458: MySQL notes recommend legacy utf8_general_ci unicode collation
-------------------------------------+-------------------------------------
Reporter: tobami@… | Owner: nobody
Type: | Status: new
Cleanup/optimization | Version:
Component: Documentation | 1.7-beta-1
Severity: Normal | Resolution:
Keywords: unicode | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 1 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by aaugustin):

* needs_better_patch: => 0
* stage: Unreviewed => Accepted
* needs_tests: => 0
* needs_docs: => 0


--
Ticket URL: <https://code.djangoproject.com/ticket/22458#comment:1>

Django

Apr 18, 2014, 2:19:03 AM
to django-...@googlegroups.com
#22458: MySQL notes recommend legacy utf8_general_ci unicode collation
-------------------------------------+-------------------------------------
Reporter: tobami@… | Owner: mardini
Type: | Status: assigned
Cleanup/optimization | Version:
Component: Documentation | 1.7-beta-1
Severity: Normal | Resolution:
Keywords: unicode | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 1 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by mardini):

* status: new => assigned
* owner: nobody => mardini


--
Ticket URL: <https://code.djangoproject.com/ticket/22458#comment:2>

Django

Apr 18, 2014, 12:45:17 PM
to django-...@googlegroups.com
#22458: MySQL notes recommend legacy utf8_general_ci unicode collation
-------------------------------------+-------------------------------------
Reporter: tobami@… | Owner: mardini
Type: | Status: assigned
Cleanup/optimization | Version:
Component: Documentation | 1.7-beta-1
Severity: Normal | Resolution:
Keywords: unicode | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 1 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by mardini):

PR: https://github.com/django/django/pull/2587

The MySQL documentation doesn't recommend utf8_unicode_ci in all cases. It
states that "comparisons for the utf8_general_ci collation are faster, but
slightly less correct, than comparisons for utf8_unicode_ci", and "If this
is acceptable for your application, you should use utf8_general_ci because
it is faster. If this is not acceptable (for example, if you require
German dictionary order), use utf8_unicode_ci because it is more
accurate." I added a note and a link that explain both cases and the
recommended usage for each collation. Thanks.
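
For anyone who decides on one collation or the other for an existing
table, MySQL can convert it with ALTER TABLE ... CONVERT TO CHARACTER SET
... COLLATE .... A minimal sketch of building that statement follows. The
helper name, its defaults, and the table name in the usage note are
assumptions for illustration, not part of Django or of the patch.

```python
import re

# Hypothetical helper, not a Django or MySQL API. It only builds the SQL
# string; running it against a database is left to the caller.
_IDENTIFIER = re.compile(r"^[A-Za-z0-9_]+$")


def convert_table_sql(table, collation="utf8_unicode_ci", charset="utf8"):
    """Return an ALTER TABLE statement that converts a table's collation."""
    # Identifiers cannot be passed as query parameters, so validate them
    # strictly before interpolating them into the statement.
    for name in (table, collation, charset):
        if not _IDENTIFIER.match(name):
            raise ValueError("unsafe identifier: %r" % (name,))
    return "ALTER TABLE `%s` CONVERT TO CHARACTER SET %s COLLATE %s" % (
        table, charset, collation)
```

For a hypothetical table, convert_table_sql("myapp_article") yields a
statement targeting utf8_unicode_ci; pass collation="utf8_general_ci" to go
the other way if the faster, less exact comparison is acceptable.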

--
Ticket URL: <https://code.djangoproject.com/ticket/22458#comment:3>

Django

Apr 18, 2014, 3:10:47 PM
to django-...@googlegroups.com
#22458: MySQL notes recommend legacy utf8_general_ci unicode collation
-------------------------------------+-------------------------------------
Reporter: tobami@… | Owner: mardini
Type: | Status: closed
Cleanup/optimization | Version:
Component: Documentation | 1.7-beta-1
Severity: Normal | Resolution: fixed
Keywords: unicode | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 1 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Tim Graham <timograham@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"11ac50b18e578498c1d95e0a75921b5864387d46"]:
{{{
#!CommitTicketReference repository=""
revision="11ac50b18e578498c1d95e0a75921b5864387d46"
Fixed #22458 -- Added a note about MySQL utf8_unicode_ci collation

Thanks tobami at gmail.com for the report.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/22458#comment:4>

Django

Apr 18, 2014, 3:11:34 PM
to django-...@googlegroups.com
#22458: MySQL notes recommend legacy utf8_general_ci unicode collation
-------------------------------------+-------------------------------------
Reporter: tobami@… | Owner: mardini
Type: | Status: closed
Cleanup/optimization | Version:
Component: Documentation | 1.7-beta-1
Severity: Normal | Resolution: fixed
Keywords: unicode | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 1 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tim Graham <timograham@…>):

In [changeset:"b6863879e1cf20acdecb3606da8fe66b486836cf"]:
{{{
#!CommitTicketReference repository=""
revision="b6863879e1cf20acdecb3606da8fe66b486836cf"
[1.6.x] Fixed #22458 -- Added a note about MySQL utf8_unicode_ci collation

Thanks tobami at gmail.com for the report.

Backport of 11ac50b18e from master
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/22458#comment:5>

Django

Apr 18, 2014, 3:11:34 PM
to django-...@googlegroups.com
#22458: MySQL notes recommend legacy utf8_general_ci unicode collation
-------------------------------------+-------------------------------------
Reporter: tobami@… | Owner: mardini
Type: | Status: closed
Cleanup/optimization | Version:
Component: Documentation | 1.7-beta-1
Severity: Normal | Resolution: fixed
Keywords: unicode | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 1 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tim Graham <timograham@…>):

In [changeset:"b1e7dd445bb64c27df8e2b6902a76a67c79332ab"]:
{{{
#!CommitTicketReference repository=""
revision="b1e7dd445bb64c27df8e2b6902a76a67c79332ab"
[1.7.x] Fixed #22458 -- Added a note about MySQL utf8_unicode_ci collation

Thanks tobami at gmail.com for the report.

Backport of 11ac50b18e from master
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/22458#comment:6>
