[Django] #23271: Makemessages can corrupt existing .po files on Windows


Django

Aug 11, 2014, 11:04:23 AM
to django-...@googlegroups.com
#23271: Makemessages can corrupt existing .po files on Windows
-------------------------------------+-------------------------------------
Reporter: danielmenzel | Owner: nobody
Type: Bug | Status: new
Component: | Version: 1.7-rc-2
Internationalization | Keywords: makemessages utf8
Severity: Normal | unicode
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
== Description ==

Seen on German Windows 7 SP1 with 64-bit Python 3.4.1 and gettext 0.18.1.

When you have '''an existing .po file''' with translations, e.g.
{{{
msgid ""
msgstr ""
"Project-Id-Version: \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2014-03-03 10:44+0100\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: \n"
"Language-Team: \n"
"Language: de\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=2; plural=(n != 1);\n"

msgid "Size"
msgstr "Größe"
}}}
and then run
{{{
manage.py makemessages --no_location --no_wrap -l de
}}}
to update the .po file, you get '''a corrupted .po file''':
{{{
msgid ""
msgstr ""
"Project-Id-Version: \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2014-03-03 10:44+0100\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: \n"
"Language-Team: \n"
"Language: de\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=2; plural=(n != 1);\n"

msgid "Size"
msgstr "GrÃ¶ÃŸe"
}}}

== Investigation ==

Setting environment variables like LANG, LANGUAGE, LC_ALL, LC_MESSAGES,
etc. has no effect on the outcome. Also calling chcp 65001 on the Windows
console does not fix the problem.

However, the described behavior appears to have been introduced as a
side effect of this commit
[https://github.com/django/django/commit/dbb48d2bb99a5f660cf2d85f137b8d87fc12d99f]:
All file accesses in ''makemessages.py'' were changed to explicitly use
utf-8, but the stdout of the gettext binaries (like ''msgmerge'' etc.) is
still interpreted with the Windows encoding cp1252, because ''popen_wrapper''
sets ''universal_newlines=True'', which in turn uses the encoding returned
by ''locale.getpreferredencoding()''.
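
The mismatch described above can be reproduced without gettext. The following is a minimal sketch, assuming the child process writes UTF-8 bytes and the parent decodes them as cp1252; the helper name `simulate_misdecoded_pipe` is hypothetical and not part of Django.

```python
def simulate_misdecoded_pipe(text, real="utf-8", assumed="cp1252"):
    """Encode text as the child process would, then decode as the parent does."""
    raw = text.encode(real)      # bytes msgmerge would write to stdout
    return raw.decode(assumed)   # what the parent sees with the wrong codec


corrupted = simulate_misdecoded_pipe('msgstr "Größe"')
# Each non-ASCII character has become two characters:
# corrupted == 'msgstr "GrÃ¶ÃŸe"'
```

Writing `corrupted` back to disk as UTF-8 is what makes the damage permanent, since the file is then valid UTF-8 that spells out the wrong characters.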

As a test I patched ''popen_wrapper'' to interpret output of external
processes with utf-8 encoding instead of
''locale.getpreferredencoding()'':
{{{
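# Imports assumed for Django 1.7 (not part of the original snippet)
import locale
import os
import sys
from subprocess import PIPE, Popen

import django.core.management.base
from django.utils import six
from django.utils.encoding import DEFAULT_LOCALE_ENCODING, force_text


def popen_wrapper_utf8(args,
                       os_err_exc_type=django.core.management.base.CommandError):
    """Monkey-patch for django.core.management.utils.popen_wrapper"""
    try:
        p = Popen(args, shell=False, stdout=PIPE, stderr=PIPE,
                  close_fds=os.name != 'nt', universal_newlines=True)
    except OSError as e:
        six.reraise(os_err_exc_type,
                    os_err_exc_type('Error executing %s: %s' %
                                    (args[0], e.strerror)),
                    sys.exc_info()[2])
    output, errors = p.communicate()

    # Additional utf-8 decoding: undo the locale-based decoding done by
    # universal_newlines=True, then decode the raw bytes as UTF-8.
    output = output.encode(locale.getpreferredencoding(False)).decode('utf-8')

    return (
        output,
        force_text(errors, DEFAULT_LOCALE_ENCODING, strings_only=True),
        p.returncode
    )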
}}}
This has the desired effect and prevents the corruption of the .po files.
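
The encode/decode round trip in the patch can be tried in isolation. This is a sketch under stated assumptions: it hard-codes cp1252 instead of calling `locale.getpreferredencoding(False)` so that it behaves the same on any machine, and the helper name `redecode` is hypothetical. It also shows one limitation: the round trip only helps if the wrong decode succeeded in the first place, and Python's cp1252 codec leaves a few bytes (0x8D among them) undefined.

```python
def redecode(text, assumed="cp1252", real="utf-8"):
    """Undo a wrong decode: recover the raw bytes, then decode them properly."""
    return text.encode(assumed).decode(real)


mojibake = "Größe".encode("utf-8").decode("cp1252")
restored = redecode(mojibake)  # back to the original text

# Limitation: "Í" is the bytes C3 8D in UTF-8, and 0x8D is undefined in
# Python's cp1252 codec, so the wrong decode fails before any repair can run.
try:
    "Í".encode("utf-8").decode("cp1252")
    undecodable = False
except UnicodeDecodeError:
    undecodable = True
```

Reading the child's stdout as bytes and decoding it once with the right codec avoids depending on the round trip being lossless.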

I also tried changing the value returned by
''locale.getpreferredencoding()'' to "utf-8", but that seems impossible on
Windows, as Python uses the win32 API ''GetACP()'', which according to MSDN
[http://msdn.microsoft.com/en-us/library/windows/desktop/dd318070(v=vs.85).aspx]
only returns ANSI codepages and thus will never return "utf-8".

--
Ticket URL: <https://code.djangoproject.com/ticket/23271>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Aug 11, 2014, 1:00:23 PM
to django-...@googlegroups.com
#23271: Makemessages can corrupt existing .po files on Windows
-------------------------------------+-------------------------------------
Reporter: danielmenzel | Owner: nobody
Type: Bug | Status: new
Component: | Version: 1.7-rc-2
Internationalization | Resolution:
Severity: Release blocker | Triage Stage: Accepted
Keywords: makemessages utf8 | Needs documentation: 0
unicode | Patch needs improvement: 0
Has patch: 0 | UI/UX: 0
Needs tests: 0 |
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by claudep):

* needs_better_patch: => 0
* needs_docs: => 0
* severity: Normal => Release blocker
* needs_tests: => 0
* stage: Unreviewed => Accepted


Comment:

I'm responsible for the regression, but I'm afraid I will not be able to debug
this issue as I don't have access to Windows machines.

If no one else can fix it, I can revert the patch in the 1.7 branch so that
this no longer blocks the 1.7 release.

--
Ticket URL: <https://code.djangoproject.com/ticket/23271#comment:1>

Django

Aug 12, 2014, 3:53:29 PM
to django-...@googlegroups.com
#23271: Makemessages can corrupt existing .po files on Windows
-------------------------------------+-------------------------------------
Reporter: danielmenzel | Owner: nobody
Type: Bug | Status: new
Component: | Version: 1.7-rc-2
Internationalization | Resolution:
Severity: Release blocker | Triage Stage: Accepted
Keywords: makemessages utf8 | Needs documentation: 0
unicode | Patch needs improvement: 0
Has patch: 0 | UI/UX: 0
Needs tests: 0 |
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by andrewgodwin):

I am also without a dev environment. If we have to remove the patch from
1.7, and we think we can do it cleanly, then it's an option.

--
Ticket URL: <https://code.djangoproject.com/ticket/23271#comment:2>

Django

Aug 12, 2014, 5:05:51 PM
to django-...@googlegroups.com
#23271: Makemessages can corrupt existing .po files on Windows
-------------------------------------+-------------------------------------
Reporter: danielmenzel | Owner: nobody
Type: Bug | Status: new
Component: | Version: master
Internationalization | Resolution:
Severity: Release blocker | Triage Stage: Accepted
Keywords: makemessages utf8 | Needs documentation: 0
unicode | Patch needs improvement: 0
Has patch: 0 | UI/UX: 0
Needs tests: 0 |
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by claudep):

* version: 1.7-rc-2 => master


Comment:

[cdfefbec721b59695e28] has been reverted on the 1.7.x branch. I've applied
[67870137b9e1f1] instead to fix #22686, but I'd like to keep the more
extensive fix on master. The release blocking flag is now only applicable
to master.

--
Ticket URL: <https://code.djangoproject.com/ticket/23271#comment:3>

Django

Dec 20, 2014, 1:26:53 PM
to django-...@googlegroups.com
#23271: Makemessages can corrupt existing .po files on Windows
-------------------------------------+-------------------------------------
Reporter: danielmenzel | Owner: nobody
Type: Bug | Status: new
Component: | Version: master
Internationalization |
Severity: Release blocker | Resolution:
Keywords: makemessages utf8 | Triage Stage: Accepted
unicode |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by timgraham):

Ramiro said he would have some time to look at this before 1.8 alpha.

--
Ticket URL: <https://code.djangoproject.com/ticket/23271#comment:4>

Django

Dec 29, 2014, 8:39:20 AM
to django-...@googlegroups.com
#23271: Makemessages can corrupt existing .po files on Windows
-------------------------------------+-------------------------------------
Reporter: danielmenzel | Owner: nobody
Type: Bug | Status: new
Component: | Version: master
Internationalization |
Severity: Release blocker | Resolution:
Keywords: makemessages utf8 | Triage Stage: Ready for
unicode | checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by timgraham):

* has_patch: 0 => 1
* stage: Accepted => Ready for checkin


--
Ticket URL: <https://code.djangoproject.com/ticket/23271#comment:5>

Django

Dec 29, 2014, 11:05:02 AM
to django-...@googlegroups.com
#23271: Makemessages can corrupt existing .po files on Windows
-------------------------------------+-------------------------------------
Reporter: danielmenzel | Owner: nobody
Type: Bug | Status: closed
Component: | Version: master
Internationalization |
Severity: Release blocker | Resolution: fixed
Keywords: makemessages utf8 | Triage Stage: Ready for
unicode | checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Ramiro Morales <ramiro@…>):

* status: new => closed
* resolution: => fixed


Comment:

In [changeset:"6fb9dee470d57882e378247fd2706d5f9867b5f9"]:
{{{
#!CommitTicketReference repository=""
revision="6fb9dee470d57882e378247fd2706d5f9867b5f9"
Fixed #23271 -- Don't corrupt PO files on Windows when updating them.

Make sure PO catalog text fetched from gettext programs via standard
output isn't corrupted by mismatch between assumed (UTF-8) and real
(CP1252) encodings. This can cause mojibake to be written when creating
or updating PO files.

Also fixes #23311.

Thanks to contributor with Trac nick 'danielmenzel' for the report,
excellent research and fix.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/23271#comment:6>

Django

Dec 29, 2014, 11:24:53 AM
to django-...@googlegroups.com
#23271: Makemessages can corrupt existing .po files on Windows
-------------------------------------+-------------------------------------
Reporter: danielmenzel | Owner: nobody
Type: Bug | Status: new
Component: | Version: master
Internationalization |
Severity: Release blocker | Resolution:
Keywords: makemessages utf8 | Triage Stage: Accepted
unicode |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by timgraham):

* status: closed => new
* has_patch: 1 => 0
* resolution: fixed =>
* stage: Ready for checkin => Accepted


Comment:

Python 3 looks good, but now I have test failures on Python 2 and Windows.
Can you reproduce?

--
Ticket URL: <https://code.djangoproject.com/ticket/23271#comment:7>

Django

Dec 29, 2014, 7:05:59 PM
to django-...@googlegroups.com
#23271: Makemessages can corrupt existing .po files on Windows
-------------------------------------+-------------------------------------
Reporter: danielmenzel | Owner: nobody
Type: Bug | Status: new
Component: | Version: master
Internationalization |
Severity: Release blocker | Resolution:
Keywords: makemessages utf8 | Triage Stage: Accepted
unicode |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Ramiro Morales <cramm0@…>):

In [changeset:"002a8ffe478b7520a64c9176f9f640633f643b9c"]:
{{{
#!CommitTicketReference repository=""
revision="002a8ffe478b7520a64c9176f9f640633f643b9c"
Fixed breakage by 6fb9dee4 under Python2+Windows.

Refs #23271
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/23271#comment:8>

Django

Dec 29, 2014, 7:09:32 PM
to django-...@googlegroups.com
#23271: Makemessages can corrupt existing .po files on Windows
-------------------------------------+-------------------------------------
Reporter: danielmenzel | Owner: nobody
Type: Bug | Status: closed
Component: | Version: master
Internationalization |
Severity: Release blocker | Resolution: fixed
Keywords: makemessages utf8 | Triage Stage: Accepted
unicode |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by ramiro):

* status: new => closed
* resolution: => fixed


--
Ticket URL: <https://code.djangoproject.com/ticket/23271#comment:9>

Django

Apr 19, 2015, 12:51:25 PM
to django-...@googlegroups.com
#23271: Makemessages can corrupt existing .po files on Windows
-------------------------------------+-------------------------------------
Reporter: danielmenzel | Owner: nobody
Type: Bug | Status: new
Component: | Version: master
Internationalization |
Severity: Release blocker | Resolution:
Keywords: makemessages utf8 | Triage Stage: Accepted
unicode |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by daphshez):

* status: closed => new
* resolution: fixed =>


Comment:

The regression test fails on my machine (Windows 7 64-bit, Django
development version head), where

{{{
locale.getpreferredencoding()=='cp1255'
}}}


{{{
======================================================================
ERROR: test_po_file_encoding_when_updating (i18n.test_extraction.BasicExtractorTests)
Update of PO file doesn't corrupt it with non-UTF-8 encoding on Python3+Windows (#23271)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\data\dev\django\django\tests\i18n\test_extraction.py", line 407, in test_po_file_encoding_when_updating
    management.call_command('makemessages', locale=['pt_BR'], verbosity=0)
  File "C:\data\dev\django\django\django\core\management\__init__.py", line 118, in call_command
    return command.execute(*args, **defaults)
  File "C:\data\dev\django\django\django\core\management\base.py", line 398, in execute
    output = self.handle(*args, **options)
  File "C:\data\dev\django\django\django\core\management\commands\makemessages.py", line 322, in handle
    self.write_po_file(potfile, locale)
  File "C:\data\dev\django\django\django\core\management\commands\makemessages.py", line 455, in write_po_file
    msgs, errors, status = gettext_popen_wrapper(args)
  File "C:\data\dev\django\django\django\core\management\commands\makemessages.py", line 39, in gettext_popen_wrapper
    stdout, stderr, status_code = popen_wrapper(args, os_err_exc_type=os_err_exc_type)
  File "C:\data\dev\django\django\django\core\management\utils.py", line 27, in popen_wrapper
    output, errors = p.communicate()
  File "C:\Python34\lib\subprocess.py", line 959, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "C:\Python34\lib\subprocess.py", line 1234, in _communicate
    stdout = stdout[0]
IndexError: list index out of range
}}}

This also seems to be the source of #21928, which is still unresolved for
some people.

The problem is that Popen still tries to decode the msgmerge output stream
using DEFAULT_LOCALE_ENCODING, when the data is actually in UTF-8.

A way around it is to call Popen with universal_newlines=False, and then
take care of line endings inside gettext_popen_wrapper. I'll post a patch
soon.
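
The workaround described above can be sketched as follows. This is not the patch that was eventually committed: `run_and_decode_utf8` is a hypothetical helper, and a child Python process stands in for msgmerge so the example runs without gettext installed. The assumption is that the external program always writes UTF-8.

```python
import subprocess
import sys


def run_and_decode_utf8(args):
    """Run a command, capture stdout as raw bytes, and decode it as UTF-8.

    Hypothetical helper, not Django's gettext_popen_wrapper.
    """
    proc = subprocess.Popen(
        args, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
        universal_newlines=False,  # keep bytes; no locale-based decoding
    )
    raw_out, raw_err = proc.communicate()
    # Decode explicitly, then normalize Windows line endings by hand,
    # since universal_newlines=False no longer does that for us.
    text = raw_out.decode("utf-8").replace("\r\n", "\n")
    return text, raw_err, proc.returncode


# Stand-in for msgmerge: a child Python process that writes UTF-8 bytes
# with a CRLF line ending, regardless of the parent's locale.
child_code = (
    "import sys; "
    "sys.stdout.buffer.write('msgstr \"Gr\\xf6\\xdfe\"\\r\\n'.encode('utf-8'))"
)
text, errors, status = run_and_decode_utf8([sys.executable, "-c", child_code])
```

Because the bytes never pass through the locale codec, the result should not depend on whether the preferred encoding is cp1252, cp1255, or something else.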

(BTW, the current solution is for Windows only. Linux users have to set
undocumented environment variables. I think we should strive to solve this
for Linux users as well.)

--
Ticket URL: <https://code.djangoproject.com/ticket/23271#comment:10>

Django

Apr 20, 2015, 9:45:42 AM
to django-...@googlegroups.com
#23271: Makemessages can corrupt existing .po files on Windows
-------------------------------------+-------------------------------------
Reporter: danielmenzel | Owner: nobody
Type: Bug | Status: new
Component: | Version: master
Internationalization |
Severity: Release blocker | Resolution:
Keywords: makemessages utf8 | Triage Stage: Accepted
unicode |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by daphshez):

Well, I patched the code and sent a pull request at

[https://github.com/django/django/pull/4532]

This is my first attempt to contribute code and I seem to have messed up
the process. I am sorry about that. I hope my fix will be considered
anyway...

--
Ticket URL: <https://code.djangoproject.com/ticket/23271#comment:11>

Django

May 1, 2015, 10:30:16 AM
to django-...@googlegroups.com
#23271: Makemessages can corrupt existing .po files on Windows
-------------------------------------+-------------------------------------
Reporter: danielmenzel | Owner: nobody
Type: Bug | Status: closed
Component: | Version: master
Internationalization |
Severity: Release blocker | Resolution: fixed
Keywords: makemessages utf8 | Triage Stage: Accepted
unicode |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Tim Graham <timograham@…>):

* status: new => closed
* resolution: => fixed


Comment:

In [changeset:"57202a112a966593857725071ecd652a87c157fb" 57202a11]:
{{{
#!CommitTicketReference repository=""
revision="57202a112a966593857725071ecd652a87c157fb"
Fixed #23271 -- Fixed makemessages crash/test failure for some locales.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/23271#comment:12>

Django

May 1, 2015, 10:39:22 AM
to django-...@googlegroups.com
#23271: Makemessages can corrupt existing .po files on Windows
-------------------------------------+-------------------------------------
Reporter: danielmenzel | Owner: nobody
Type: Bug | Status: closed
Component: | Version: master
Internationalization |
Severity: Release blocker | Resolution: fixed
Keywords: makemessages utf8 | Triage Stage: Accepted
unicode |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tim Graham <timograham@…>):

In [changeset:"c45fd57f682499af0e72f163b87b261dfaf0fbbc" c45fd57f]:
{{{
#!CommitTicketReference repository=""
revision="c45fd57f682499af0e72f163b87b261dfaf0fbbc"
[1.8.x] Fixed #23271 -- Fixed makemessages crash/test failure for some
locales.

Backport of 57202a112a966593857725071ecd652a87c157fb from master
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/23271#comment:13>
