[Django] #21725: Javascript translations fail with non-BMP characters


Django

Jan 2, 2014, 2:07:29 PM
to django-...@googlegroups.com
#21725: Javascript translations fail with non-BMP characters
--------------------------------------+--------------------
Reporter: nedbatchelder | Owner: nobody
Type: Uncategorized | Status: new
Component: Internationalization | Version: 1.6
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------------+--------------------
If a translated string includes a non-BMP character (above 0xFFFF), then
javascript_catalog in views/i18n.py fails:

{{{
Traceback (most recent call last):
  File "/home/ned/.virtualenvs/edx-platform/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 111, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/home/ned/.virtualenvs/edx-platform/local/lib/python2.7/site-packages/django/views/i18n.py", line 264, in javascript_catalog
    csrc.append("catalog['%s'] = '%s';\n" % (javascript_quote(k), javascript_quote(v)))
  File "/home/ned/.virtualenvs/edx-platform/local/lib/python2.7/site-packages/django/utils/functional.py", line 176, in wrapper
    return func(*args, **kwargs)
  File "/home/ned/.virtualenvs/edx-platform/local/lib/python2.7/site-packages/django/utils/text.py", line 305, in javascript_quote
    return str(ustring_re.sub(fix, s))
UnicodeEncodeError: 'ascii' codec can't encode character u'\U0001d543' in position 31: ordinal not in range(128)
}}}

(This is running 1.4.8, but the javascript_quote code hasn't changed since
then.)

It should be possible to fix javascript_quote to turn the character into a
surrogate pair.
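
For illustration, the surrogate-pair conversion suggested above could look
roughly like the following. This is only a sketch in Python 3, not the actual
javascript_quote code; the helper name `js_unicode_escape` is made up for the
example, and it only handles the Unicode escaping, not quotes or control
characters.

```python
def js_unicode_escape(text):
    """Escape non-ASCII characters as JavaScript \\uXXXX sequences.

    Hypothetical helper, not part of Django. Characters above U+FFFF are
    split into a UTF-16 surrogate pair, because a single \\uXXXX escape
    can only carry four hex digits.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            # Non-BMP: subtract 0x10000, then split the remaining 20 bits
            # into a high (top 10 bits) and low (bottom 10 bits) surrogate.
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)
            low = 0xDC00 + (cp & 0x3FF)
            out.append("\\u%04x\\u%04x" % (high, low))
        elif cp > 0x7F:
            # Non-ASCII BMP character: one four-digit escape is enough.
            out.append("\\u%04x" % cp)
        else:
            out.append(ch)
    return "".join(out)


print(js_unicode_escape("Wel\U0001d543come"))
```

For U+1D543, the character in the traceback, this gives the pair
`\ud835\udd43`.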

--
Ticket URL: <https://code.djangoproject.com/ticket/21725>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Jan 3, 2014, 6:42:02 AM
to django-...@googlegroups.com
#21725: Javascript translations fail with non-BMP characters
--------------------------------------+------------------------------------
Reporter: nedbatchelder | Owner: nobody
Type: Bug | Status: new
Component: Internationalization | Version: 1.6
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by claudep):

* needs_better_patch: => 0
* needs_docs: => 0
* type: Uncategorized => Bug
* needs_tests: => 0
* stage: Unreviewed => Accepted


--
Ticket URL: <https://code.djangoproject.com/ticket/21725#comment:1>

Django

Feb 15, 2014, 1:29:50 PM
to django-...@googlegroups.com
#21725: Javascript translations fail with non-BMP characters
-------------------------------------+-------------------------------------
Reporter: nedbatchelder | Owner: MattBlack
Type: Bug | Status: assigned
Component: | Version: 1.6
Internationalization | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by MattBlack):

* status: new => assigned
* owner: nobody => MattBlack


--
Ticket URL: <https://code.djangoproject.com/ticket/21725#comment:2>

Django

Feb 15, 2014, 1:39:55 PM
to django-...@googlegroups.com
#21725: Javascript translations fail with non-BMP characters
-------------------------------------+-------------------------------------
Reporter: nedbatchelder | Owner: MattBlack
Type: Bug | Status: closed
Component: | Version: 1.6
Internationalization | Resolution: fixed
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by Baptiste Mispelon <bmispelon@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"1c1dffca757b0b6acaf99d893d68847250ab4146"]:
{{{
#!CommitTicketReference repository=""
revision="1c1dffca757b0b6acaf99d893d68847250ab4146"
Fixed #21725 -- Fixed JavaScript quoting encoding.

Thanks to nedbatchelder for the report.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/21725#comment:3>

Django

Feb 15, 2014, 2:46:03 PM
to django-...@googlegroups.com
#21725: Javascript translations fail with non-BMP characters
-------------------------------------+-------------------------------------
Reporter: nedbatchelder | Owner: MattBlack
Type: Bug | Status: closed
Component: | Version: 1.6
Internationalization | Resolution: fixed
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by nedbatchelder):

Is this really a full fix? Why does this function replace BMP characters
with four-digit \uXXXX escapes, but allow non-BMP characters through
unchanged? I would have thought that Javascript would need surrogate
pairs.

--
Ticket URL: <https://code.djangoproject.com/ticket/21725#comment:4>

Django

Feb 16, 2014, 11:02:23 AM
to django-...@googlegroups.com
#21725: Javascript translations fail with non-BMP characters
-------------------------------------+-------------------------------------
Reporter: nedbatchelder | Owner: MattBlack
Type: Bug | Status: new
Component: | Version: 1.6
Internationalization | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by Honza_Kral):

* status: closed => new
* resolution: fixed =>


Comment:

I have been getting consistent failures since this was merged, on Python 2
(not Python 3) on a Mac:
{{{
======================================================================
FAIL: test_javascript_quote_unicode (utils_tests.test_text.TestUtilsText)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/honza/work/django/tests/utils_tests/test_text.py", line 162, in test_javascript_quote_unicode
    self.assertEqual(text.javascript_quote(input), output)
AssertionError: u"<script>alert(\\'Hello \\\\xff.\\n Wel\\ud835\\udd43come\\there\\r\\');<\\/scr [truncated]... != u"<script>alert(\\'Hello \\\\xff.\\n Wel\U0001d543come\\there\\r\\');<\\/script> [truncated]...
- <script>alert(\'Hello \\xff.\n
Wel\ud835\udd43come\there\r\');<\/script>?
^^^^^^^^^^^^
+ <script>alert(\'Hello \\xff.\n
Wel\ud835\udd43come\there\r\');<\/script>?
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/21725#comment:5>

Django

Feb 16, 2014, 7:23:24 PM
to django-...@googlegroups.com
#21725: Javascript translations fail with non-BMP characters
-------------------------------------+-------------------------------------
Reporter: nedbatchelder | Owner: MattBlack
Type: Bug | Status: new
Component: | Version: 1.6
Internationalization | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by bmispelon):

I did some digging and as it turns out, `javascript_quote` is not used
anymore when doing javascript translation since
a506b6981bc48caec30bca3de94d2ac3e6fc1660.
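
The thread does not say what the view uses in place of `javascript_quote`.
Purely as an assumption for illustration, serializing the catalog with the
standard library's `json.dumps` is one way to sidestep the problem: with the
default `ensure_ascii=True` it writes non-BMP characters as surrogate-pair
escapes, which are valid in a JavaScript string literal. The catalog entry
below is hypothetical.

```python
import json

# Hypothetical catalog entry containing a non-BMP character (U+1D543).
catalog = {"Welcome": "Wel\U0001d543come"}

# ensure_ascii=True is the default; every non-ASCII character is emitted
# as a \uXXXX escape, and non-BMP characters as two such escapes.
js_source = "var catalog = %s;" % json.dumps(catalog, sort_keys=True)
print(js_source)
```

The resulting source is pure ASCII, so no implicit encoding step can fail on
it.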

In fact, this function is undocumented, barely tested, and was only used
internally for the `javascript_catalog` view.

On top of that, it was also completely broken on Python 2 if you ever
passed it non-ASCII input.
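
The breakage comes from the `str(...)` call shown in the original traceback:
on Python 2, calling `str()` on a unicode object implicitly encodes it as
ASCII. A rough Python 3 analogue of that step, with a made-up function name,
might look like this:

```python
def py2_style_str(value):
    """Mimic Python 2's str() on a unicode object: an implicit ASCII encode.

    Hypothetical helper for illustration only.
    """
    return value.encode("ascii")


try:
    py2_style_str("Hello \xff")
    failed = False
except UnicodeEncodeError as exc:
    # Any character outside the ASCII range triggers the same error class
    # that appears in the ticket's traceback.
    failed = True
    print(exc)
```

Pure ASCII input passes through unchanged, which may be why the problem went
unnoticed for so long.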

I think we should just delete it altogether.

As for the test failure, the tests pass on my machine with both Python 2
and 3, and our CI server is also green. The failure might be related to a
wide versus narrow build of Python.
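
As a hedged aside on the wide/narrow distinction: a narrow build stores
non-BMP characters as two UTF-16 code units, so the same string literal has a
different length and matches regular expressions differently. A small check,
using only `sys.maxunicode` (the function name is an assumption for this
sketch):

```python
import sys


def is_narrow_build():
    """Return True on a narrow (UCS-2) Python build.

    Narrow builds report sys.maxunicode == 0xFFFF; wide builds, and every
    Python from 3.3 onward, report 0x10FFFF.
    """
    return sys.maxunicode == 0xFFFF


# On a narrow build the character below is a surrogate pair of length 2;
# on a wide build it is a single character of length 1.
print(is_narrow_build(), len("\U0001d543"))
```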

--
Ticket URL: <https://code.djangoproject.com/ticket/21725#comment:6>

Django

Feb 21, 2014, 4:02:09 PM
to django-...@googlegroups.com
#21725: Javascript translations fail with non-BMP characters
-------------------------------------+-------------------------------------
Reporter: nedbatchelder | Owner: MattBlack
Type: Bug | Status: new
Component: | Version: 1.6
Internationalization | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by bmispelon):

* has_patch: 0 => 1


Comment:

Here's a pull request that adds a regression test to the
`javascript_catalog` suite that makes sure that non-BMP characters in
translation files are correctly handled.

It also deprecates `javascript_quote` altogether (it was undocumented and
not used anymore after a506b6981bc48caec30bca3de94d2ac3e6fc1660).

Finally, it also skips the reported failing test on narrow Python builds.
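
Skipping a test on narrow builds can be done with `unittest.skipIf`. The
sketch below is not the code from the pull request; the class, test name, and
skip message are invented for illustration.

```python
import sys
import unittest

# True only on narrow (UCS-2) builds of Python.
IS_NARROW_BUILD = sys.maxunicode == 0xFFFF


class NonBMPTests(unittest.TestCase):
    @unittest.skipIf(
        IS_NARROW_BUILD,
        "non-BMP characters are stored as surrogate pairs on narrow builds",
    )
    def test_non_bmp_is_single_character(self):
        # Only meaningful on wide builds, where U+1D543 is one character.
        self.assertEqual(len("\U0001d543"), 1)
```

The test is reported as skipped rather than failed on narrow builds, which
keeps the suite green without hiding the reason.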

--
Ticket URL: <https://code.djangoproject.com/ticket/21725#comment:7>

Django

Feb 21, 2014, 5:32:31 PM
to django-...@googlegroups.com
#21725: Javascript translations fail with non-BMP characters
-------------------------------------+-------------------------------------
Reporter: nedbatchelder | Owner: MattBlack
Type: Bug | Status: new
Component: | Version: 1.6
Internationalization | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by claudep):

https://github.com/django/django/pull/2339 I guess :-)

--
Ticket URL: <https://code.djangoproject.com/ticket/21725#comment:8>

Django

Feb 22, 2014, 5:49:14 AM
to django-...@googlegroups.com
#21725: Javascript translations fail with non-BMP characters
-------------------------------------+-------------------------------------
Reporter: nedbatchelder | Owner: MattBlack
Type: Bug | Status: new
Component: | Version: 1.6
Internationalization | Resolution:
Severity: Normal | Triage Stage: Ready for
Keywords: nlsprint14 | checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by erikr):

* cc: eromijn@… (added)
* keywords: => nlsprint14
* stage: Accepted => Ready for checkin


Comment:

PR 2339 looks fine to me, and tests run (although I tested on a non-wide
python).

--
Ticket URL: <https://code.djangoproject.com/ticket/21725#comment:9>

Django

Feb 22, 2014, 7:52:08 AM
to django-...@googlegroups.com
#21725: Javascript translations fail with non-BMP characters
-------------------------------------+-------------------------------------
Reporter: nedbatchelder | Owner: MattBlack
Type: Bug | Status: new
Component: | Version: 1.6
Internationalization | Resolution:
Severity: Normal | Triage Stage: Ready for
Keywords: nlsprint14 | checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Baptiste Mispelon <bmispelon@…>):

In [changeset:"926e18d7d126fcf7f4b2d25ce4155423ac6e2f90"]:
{{{
#!CommitTicketReference repository=""
revision="926e18d7d126fcf7f4b2d25ce4155423ac6e2f90"
Deprecated django.utils.text.javascript_quote.

Refs #21725.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/21725#comment:10>

Django

Feb 22, 2014, 7:53:08 AM
to django-...@googlegroups.com
#21725: Javascript translations fail with non-BMP characters
-------------------------------------+-------------------------------------
Reporter: nedbatchelder | Owner: MattBlack
Type: Bug | Status: closed
Component: | Version: 1.6
Internationalization | Resolution: fixed
Severity: Normal | Triage Stage: Ready for
Keywords: nlsprint14 | checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by bmispelon):

* status: new => closed
* resolution: => fixed


Comment:

Thanks for the review.

--
Ticket URL: <https://code.djangoproject.com/ticket/21725#comment:11>

Django

Mar 5, 2014, 3:04:19 AM
to django-...@googlegroups.com
#21725: Javascript translations fail with non-BMP characters
-------------------------------------+-------------------------------------
Reporter: nedbatchelder | Owner: MattBlack
Type: Bug | Status: closed
Component: | Version: 1.6
Internationalization | Resolution: fixed
Severity: Normal | Triage Stage: Ready for
Keywords: nlsprint14 | checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Claude Paroz <claude@…>):

In [changeset:"ac699cdc174a825e6b78c6f3c6e967bc961413c8"]:
{{{
#!CommitTicketReference repository=""
revision="ac699cdc174a825e6b78c6f3c6e967bc961413c8"
Really hidden warnings in javascript_quote tests

Refs #21725.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/21725#comment:12>

Django

Jan 17, 2015, 12:43:07 PM
to django-...@googlegroups.com
#21725: Javascript translations fail with non-BMP characters
-------------------------------------+-------------------------------------
Reporter: nedbatchelder | Owner: MattBlack
Type: Bug | Status: closed
Component: | Version: 1.6
Internationalization |
Severity: Normal | Resolution: fixed
Keywords: nlsprint14 | Triage Stage: Ready for checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tim Graham <timograham@…>):

In [changeset:"df3f3bbe2927b9bad80088c6adbf5e8c5ba778c9"]:
{{{
#!CommitTicketReference repository=""
revision="df3f3bbe2927b9bad80088c6adbf5e8c5ba778c9"
Removed utils.text.javascript_quote() per deprecation timeline; refs
#21725.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/21725#comment:13>
