Py3k changes

Strontium

Mar 24, 2011, 10:59:34 AM

to python...@googlegroups.com

Hi All,

In Felix's experimental Babel Py3 repo I just committed a py3 support
change.

It runs 2to3 when setup.py is run, which converts almost everything;
certainly all of the syntactic stuff seems to be converted OK. What
remains are changes in functionality, or 2.x features missing from 3.x.

A few manual fixes identified by Felix are added, but to contain them
and minimise in-tree changes, I added a py2compat.py module which at the
moment holds DictMixin and a replacement for the 2.x builtin cmp, both
of which are missing from py3.
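
For readers who have not met the missing builtin, a cmp replacement can be very small. This is a minimal sketch using the common idiom, and it is an assumption, not necessarily the code committed to py2compat.py:

```python
def cmp(a, b):
    """Return -1, 0 or 1, like the Python 2 builtin cmp()."""
    # Booleans subtract as integers: True - False == 1, False - True == -1.
    return (a > b) - (a < b)
```

It only relies on the `<` and `>` operators, so it behaves the same on Python 2 and Python 3 for any pair of mutually orderable values.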

The idea was to keep the code as similar as possible, and to contain
these larger chunks together, to aid readability.

It all builds fine under 2.6, 2.7 and 3.2. It's late here and I haven't
run any tests (I'm not really sure how to run the tests, actually). But I
wanted to commit this so others could look at it and comment on whether this
is the right approach or not.

This is more of a structure patch than a "this will make it work for
sure on py3" patch.

Strontium

Jeroen Ruigrok van der Werven

Mar 24, 2011, 11:46:39 AM

to python...@googlegroups.com

-On [20110324 16:10], Strontium (strn...@gmail.com) wrote:
>A few manual fixes identified by felix are added, but to contain them
>and minimise in-tree changes, i added a py2compt.py module which at the
>moment holds dictmixin and a replacement for the 2.x builtin cmp, both
>of which are missing from py3.

I haven't looked at the code yet, but have you guys added a local fixers
file for this? Distribute has a wonderful mechanism for this stuff.

>It all builds fine under 2.6, 2.7 and 3.2. Its late here and I haven't
>run any tests (im not really sure how to run the tests actually). But I
>wanted to commit this so others could look at it and comment on if this
>is the right approach or not.

Have you tested scripts/import_cldr.py? That was the problem I was trying to
locally solve, since it imports from the babel source repository directly.

--
Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
イェルーンラウフロックヴァンデルウェルヴェン
http://www.in-nomine.org/ | GPG: 2EAC625B
The great man is he who does not lose his childlike heart...

Strontium

Mar 25, 2011, 2:27:03 AM

to python...@googlegroups.com

On 03/24/2011 11:46 PM, Jeroen Ruigrok van der Werven wrote:
>
> Have you tested scripts/import_cldr.py? That was the problem I was trying to
> locally solve, since it imports from the babel source repository directly.
>

I've just been playing with this. It's a hairy problem, and short of
copying the Babel directory and running a 2to3 pass over it, I don't see
an elegant answer.

Now I'm a complete Babel noob, so bear with me. It seems that
import_cldr is only used by people who check out the source from svn, correct?

Well, it seems we have two choices:
1. Spend a lot of time writing some custom "make a temporary py3 version
of import_cldr and its dependencies" code, OR
2. For the time being, specify that import_cldr and the other scripts are
Python 2 scripts, so if you are using svn and need to run them you need
Python 2.x. To my mind that doesn't seem like much of an impost for
someone using the svn version.

From my messing around, this would not preclude one from then building
Babel using Python 3, as the scripts' only purpose is to massage a bunch of
data for Babel's later use.

Strontium

Felix Schwarz

Mar 26, 2011, 3:09:58 PM

to python...@googlegroups.com

Am 24.03.2011 16:46, schrieb Jeroen Ruigrok van der Werven:
>> It all builds fine under 2.6, 2.7 and 3.2. Its late here and I haven't
>> run any tests (im not really sure how to run the tests actually).

python setup.py test

> Have you tested scripts/import_cldr.py? That was the problem I was trying to
> locally solve, since it imports from the babel source repository directly.

What's the exact problem? Is it that distutils/distribute does not apply
2to3 on that file as it is not copied in the egg?

fs

Felix Schwarz

Mar 26, 2011, 3:23:11 PM

to python...@googlegroups.com

Hi,

Am 24.03.2011 15:59, schrieb Strontium:
> It all builds fine under 2.6, 2.7 and 3.2. Its late here and I haven't
> run any tests (im not really sure how to run the tests actually). But I
> wanted to commit this so others could look at it and comment on if this
> is the right approach or not.

Thank you very much for your commits. I really like most of these
changes :-)

For the cmp patch I'd like to propose that we add new-style comparison
methods like __eq__ which fall back to __cmp__. That way we don't have
to ship an implementation of cmp. Also (AFAIK) the __cmp__ protocol is
not used anymore in Python 3 so we need to add these methods anyway.
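
The proposal above could look roughly like this. It is a sketch only: `CmpFallbackMixin` and `Version` are hypothetical names invented for illustration, not classes from Babel.

```python
class CmpFallbackMixin(object):
    """Hypothetical mixin: derive rich comparisons from an existing __cmp__."""

    def __eq__(self, other):
        return self.__cmp__(other) == 0

    def __ne__(self, other):
        return self.__cmp__(other) != 0

    def __lt__(self, other):
        return self.__cmp__(other) < 0

    def __le__(self, other):
        return self.__cmp__(other) <= 0

    def __gt__(self, other):
        return self.__cmp__(other) > 0

    def __ge__(self, other):
        return self.__cmp__(other) >= 0


class Version(CmpFallbackMixin):
    """Toy class with a Python 2 style __cmp__ (illustration only)."""

    def __init__(self, number):
        self.number = number

    def __cmp__(self, other):
        # Same result as the old builtin cmp(), without calling it.
        return (self.number > other.number) - (self.number < other.number)
```

One thing to keep in mind: on Python 3 a class that defines `__eq__` without also defining `__hash__` becomes unhashable, so classes used as dict keys would need a `__hash__` as well.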

fs

Strontium

Mar 26, 2011, 11:04:14 PM

to python...@googlegroups.com

> What's the exact problem? Is it that distutils/distribute does not apply
> 2to3 on that file as it is not copied in the egg?
>

The scripts are not part of what Distribute deals with as they aren't
installed, yes, so they are not 2to3'd by default. Further, to do a
proper install with distribute, one needs the CLDR data processed and
installed into the source tree, which is what the scripts do. Running
2to3 on the scripts also doesn't work, because they back link into the
main source which isn't 2to3'd at this point.

It's a catch-22.

I started to look at the possibility of another script that 2to3'd the
scripts and then the parts of the main source that needed it (copying it
to a temp directory), but import_cldr expects to be executing from where
it is, because it tries to insert the processed CLDR data back into the
main tree using file system locations relative to the scripts
directory. It's quite a curly problem, and given it only affects
developers using SVN, I figured time was better spent actually getting
Babel to work under Python 3.

I think a more elegant solution would be
for setup to auto-process the CLDR data if it isn't present, and 2to3
that source (somehow), but I haven't the foggiest how one would achieve
that. I also considered promoting the import_cldr functionality to be
part of pybabel, but the problem with that is that to properly install
pybabel you need the processed CLDR data: again a catch-22.

Strontium

Mar 26, 2011, 11:14:34 PM

to python...@googlegroups.com

On 03/27/2011 03:23 AM, Felix Schwarz wrote:
>
> Thank you very much for your commits. I really like most of these
> changes :-)

Thanks.

> For the cmp patch I'd like to propose that we add new-style comparison
> methods like __eq__ which fall back to __cmp__. That way we don't have
> to ship an implementation of cmp. Also (AFAIK) the __cmp__ protocol is
> not used anymore in Python 3 so we need to add these methods anyway.

Sounds OK to me, as its use is limited. I also have an uncommitted patch
in my tree that gets rid of the copied DictMixin code. So getting rid
of cmp would get rid of the py2compat.py file I added, which I'm not too
thrilled about. A better solution was to do this:

    try:
        from UserDict import DictMixin
    except ImportError:
        from collections import UserDict as DictMixin

and then

    class LocaleDataDict(DictMixin, dict):
        """Dictionary wrapper that automatically resolves aliases to the actual
        values.
        """

        def __init__(self, data, base=None):
            dict.__init__(self, data)
            if sys.version_info >= (3, 0):
                DictMixin.__init__(self, data)

I'm wrangling with the test suite at the moment, and when I have some
confidence this is actually working properly I'll submit it.

Regarding the test suite, the docstring tests break going from py2 to py3,
and it's not easy or straightforward to fix them. Armin, on the Jinja2
port, said "There is a doctest converter in 2to3, but it does not give
you much. Error messages changed, reprs changed which it cannot properly
pick up, nested tracebacks cause a lot of grief and they are hard to debug."

I'd agree with that. I have started following his advice, namely
disabling docstring tests for Py3 (but leaving them for py2), and adding
actual unittest test cases to replace them (for both py2 and py3). I
spent most of the night wrangling with DictMixin, and seem to have that
sorted, so I should be able to get a few test cases knocked over tonight,
because that was causing heaps of problems. Once I have some confidence that
what I've got is good, I'll commit it so you can critique the
direction. At the moment I've got lots of debug code in there to test
the test suite, and we don't need that, so I can't commit right now.
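
The "doctests on py2 only, unittests everywhere" arrangement can be sketched as follows. The function `double` and the test case are made-up stand-ins, and this is not how Babel's actual test suite is wired up:

```python
import doctest
import sys
import unittest


def double(x):
    """Return x doubled.

    >>> double(2)
    4
    """
    return 2 * x


class DoubleTestCase(unittest.TestCase):
    """Plain unit test replacing the doctest; runs on Python 2 and 3."""

    def test_double(self):
        self.assertEqual(double(2), 4)


def suite():
    tests = unittest.TestSuite()
    # Doctests stay enabled on Python 2 only.
    if sys.version_info < (3, 0):
        tests.addTest(doctest.DocTestSuite(sys.modules[__name__]))
    # The unittest replacement runs on every supported version.
    tests.addTest(unittest.TestLoader().loadTestsFromTestCase(DoubleTestCase))
    return tests
```

Keeping the version check inside `suite()` means the docstrings themselves stay in the source as documentation; they are simply not executed on Python 3.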

Strontium

Felix Schwarz

Mar 27, 2011, 4:40:03 AM

to python...@googlegroups.com

I see two different things here:
1. To create an egg, it's sufficient if only "run-time" files are
processed by 2to3. Therefore we can ignore import_cldr and tests.
2. In a Python 3-only environment I think it's ok to run 2to3 manually
(maybe through a custom script/distutils command we ship) on all
files.

There's also the reasoning that users don't expect
python setup.py …
to change the actual source code.

fs

Felix Schwarz

Mar 27, 2011, 4:46:23 AM

to python...@googlegroups.com

Am 27.03.2011 05:14, schrieb Strontium:
> I also have an uncommitted patch
> on my tree that gets rid of the copied DictMixin code. So getting rid
> of cmp would get rid of the py2compat.py file i added, which i'm not too
> thrilled about. A Better solution was to :
>
> try:
> from UserDict import DictMixin
> except ImportError:
> from collections import UserDict as DictMixin

If this works, I'd be really glad. I remember that it didn't when I
ran the tests, but that might also have been because of a separate problem.

> Regarding the testsuite, docstring tests are broken from py2 to py3.
> And its not easy or straight forward to fix them. Armin on the Jinja2
> port said "There is a doctest converter in 2to3, but it does not give
> you much. Error messages changed, reprs changed which it cannot properly
> pick up, nested tracebacks cause a lot of grief and they are hard to
> debug."

The problem with the doctest converter is that it can't change any
"output", it only changes code. Therefore all u'' in doctests cause
problems in Python 3.

General (community) advice seems to be to get rid of doctests. I think
the only legitimate usage of doctests is to verify that example code in
documentation still works.
Testing functionality should be done in a proper unittest as a general rule.

Therefore I'm okay with disabling doctests for Python3 if we mention
this limitation in the docs. We can fix that later, the main value is in
a working Python 3 version of Babel.

fs

Strontium

Mar 28, 2011, 1:38:10 AM

to python...@googlegroups.com

On 03/27/2011 04:46 PM, Felix Schwarz wrote:
>
> Am 27.03.2011 05:14, schrieb Strontium:
>> A Better solution was to :
>>
>> try:
>> from UserDict import DictMixin
>> except ImportError:
>> from collections import UserDict as DictMixin
> If this works, I'd really glad. I just remembered that it didn't when I
> ran the tests but that might have been also because of a separate problem.
>

It really seems to work. Will know for sure when I get through all the
unit tests.

I have committed a change that passes all unittests for core.py and adds
a bunch of new unit tests replacing the doctests that are now disabled
for Py3.

core.py now passes all tests under Py2.6, 2.7 and 3.2. There's a bunch
of other tests that pass, but also some really hideous breakages, so
unless I've worked through a module fully and added back the tests that
were once doctests, I'm not declaring those other modules as passing.

Basically my strategy is to take it one module at a time: run the tests
under 2.6, add unittests to replace the doctests in the current module, make
sure it passes all tests under the three Pythons I'm testing with, then move on to
the next one. The next module I am going to tackle is dates.py.

I had a weird problem with repr(), which I am using in the tests that
replace the doctests, on 2.6 and 2.7.
The line is:

    self.assertEqual(repr(Locale('en', 'US').currency_formats[None]),
                     '<NumberPattern %s\'\xa4#,##0.00\'>' % py2u)

py2u is just 'u' on Python 2 and '' on Python 3, to fix up the strings.

On Py2 the repr generates "<NumberPattern u'\\xa4#,##0.00'>", which
seems wrong, and the assertEqual fails on Py2.6 and 2.7. However, it
passes OK on Py3.

I worked around it with a lesser test for Py2:

    self.assertEqual(Locale('en', 'US').currency_formats[None].pattern,
                     u'\xa4#,##0.00')

which works, but I am at a loss to explain why the first line doesn't
work on Py2, so if anyone can educate me on where I am going wrong, I'd
appreciate it.
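
A likely explanation (an assumption, not something confirmed in this thread): on Python 2, repr() of a unicode string escapes non-ASCII characters, so the output contains a literal backslash followed by "xa4", whereas the '\xa4' in the expected string is a single character. The sketch below reproduces the effect on Python 3, using ascii() as a stand-in for the Python 2 behaviour:

```python
pattern = u'\xa4#,##0.00'

# Python 3 repr() keeps the printable currency sign as a single character.
py3_style = repr(pattern)

# ascii() escapes non-ASCII characters, much as Python 2's repr() did for
# unicode objects (Python 2 also added a leading u prefix).
py2_style = ascii(pattern)

# The escaped form contains a backslash plus "xa4", not the character itself.
has_literal_backslash = '\\xa4' in py2_style
has_currency_sign = '\xa4' in py2_style
```

If that is the cause, the expected string on Python 2 would need a doubled backslash (or a raw string) before "xa4", while the Python 3 expectation keeps the real character.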

Strontium

Mar 31, 2011, 10:15:27 AM

to python...@googlegroups.com

I just committed a change to the Python 3 port that introduces
unittests to replace the disabled doctests for messages.catalog.py.
All these tests pass on Py2.6, 2.7 and 3.2.

util.odict did not run under py3.2, but py3.1+ has a native ordered dict
which, according to the PEP that defined it, gained some inspiration from
Babel's odict. So I changed util.py to subclass collections.OrderedDict
for odict on Python 3.1 and up. It seems to work fine. I subclassed
it in case some small tweaks to the API were needed to make it
fully compatible with the py2 odict implementation; so far that hasn't
proved necessary. If by the end of the port it turns out that
collections.OrderedDict is a complete replacement for odict, then I will
change the imports for odict to be conditional on the Python version and
just use "from collections import OrderedDict as odict", as it's more
straightforward. Unless, of course, people would prefer it kept the way
I've done it at the moment.
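
As a rough sketch of the subclassing approach described above (an approximation, not the code that was committed to util.py):

```python
import sys

if sys.version_info >= (3, 1):
    from collections import OrderedDict

    class odict(OrderedDict):
        """Ordered dict; subclassing leaves room for API tweaks later."""
else:
    # Placeholder for this sketch: the existing pure-Python odict class
    # would stay in this branch.
    raise NotImplementedError('keep the existing pure-Python odict here')
```

The simpler alternative mentioned above would replace the class statement with `from collections import OrderedDict as odict`, at the cost of losing the hook for compatibility tweaks.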

I don't know if odict works on Python 3.0, but I remember reading that
Python 3.0 is considered broken by everyone anyway, so I don't know if
that's an issue or not.

So far the port is progressing smoothly. I am expecting the test cases
to be complete by the end of next week, unless something really horrible
crops up or real life gets in the way.

Strontium

Felix Schwarz

Mar 31, 2011, 5:01:30 PM

to python...@googlegroups.com

Hi,

thanks for your work. IMHO we should not bother supporting Python 3.0. I
don't think there are a lot of users for py3k anyway, so let's not
complicate our code here.

fs

Strontium

Apr 1, 2011, 12:40:27 AM

to python...@googlegroups.com

I agree. For completeness: if the code I'm putting a conditional path in
for "should" work on any Py3, I make the test >= Py3.0.
In this specific case I know it won't work, because Py3.0 does not have the
library, which is why I made the test >= Py3.1.

Strontium

Apr 4, 2011, 4:44:59 AM

to python...@googlegroups.com

Just committed changes to tests/extract.py and extract.py for Python 3
compatibility.
I reintroduced the removed doctests for extract.py as unittests in
tests/extract.py.
All tests pass under Python 3.2, 2.6 and 2.7 for extract.py.
I had to put some py3 version checks in extract.py because it does a
couple of string.decode(encoding) calls which, in addition to being
invalid on Python 3, seem to be totally redundant there.
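
One alternative to explicit version checks around .decode() is to test the type of the value instead. The helper below is hypothetical (`to_text` is not a Babel function) and only illustrates the idea:

```python
def to_text(value, encoding='utf-8'):
    """Return value as text, decoding only when it is a byte string."""
    if isinstance(value, bytes):
        # On Python 2, bytes is an alias for str, so plain strings are
        # decoded here; on Python 3 only real bytes objects are.
        return value.decode(encoding)
    # Already text (unicode on Python 2, str on Python 3): nothing to do.
    return value
```

On Python 3 the call is a no-op for strings that are already text, which matches the observation that the decode calls are redundant there.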

Strontium

Apr 5, 2011, 12:23:25 PM

to python...@googlegroups.com

Ok,

So far, most porting changes have been mechanical in nature and pretty
straightforward once I wrap my head around them. I am on the hardest
bit now: PO files.

MO files are easy, because they have to be binary, and I seem to have
that working for py3k.

PO files look like text files, but to py3.x they act something like
binary files because, in theory, their encoding can change mid-stream.
It seems to me you could start reading as (say)
utf-8, then read the MIME header and change encodings once the
"Content-Type:" header has been processed. But I don't see how
Babel handles that, if it does.

Now, if PO files are text files opened with a particular encoding,
then things are easy. But at the moment no PO files are opened with any
particular encoding, so what I am asking is: should I make the PO file
handling binary-like and massage each and every line's encoding manually?
Or do we say that for Py3 the fileobj must be opened with the particular
encoding required, and then I just add a check to the beginning of
read_po and write_po for py3 to ensure the file object's mode is correct AND
the encoding matches catalog.charset?

My preference is the latter (require po file objects to be opened with
the correct encoding), because it is easier. For utf-8 encoded po files it
all SEEMS to be working; at the moment it's breaking on the
iso-8859-1 encoded po files in the test suite.

Some advice in this regard would be appreciated, before I blunder off
and make a huge mistake.

Strontium

Felix Schwarz

Apr 5, 2011, 1:02:58 PM

to python...@googlegroups.com

Hi,

Am 05.04.2011 18:23, schrieb Strontium:
> PO Files, look like text files, but to py3.x they act something like
> binary files, because in theory, their encoding can change mid stream.
> (it seems to me) Because it seems you could start reading as (say)
> utf-8 and then read the mime header and change encodings after the
> header has been read and "Content-Type:" processed. But I dont see how
> Babel handles that, if it does.

It does. See http://babel.edgewall.org/ticket/255

> Now, if PO Files are text files and opened with a particular encoding,
> then things are easy, but at the moment, no PO Files are opened with any
> particular encoding, so what I am asking is, should I make the PO File
> handling Binary like and massage each and every lines encoding manually,
> or do we say for Py3, the fileobj must be opened with the particular
> encoding required and then I just add some check to the beginning of
> read_po and write_po for py3 to ensure the file obj mode is correct AND
> the encoding matches catalog.charset.
>
> My preference is the latter (require po file objects to be opened with
> the correct encoding), because it easier.

I think that most po files will be in UTF-8 now, but in order to
support po files fully, I think we should go the "right" (aka hard) way.
IMHO the limitation mentioned in #255 can stay until we fix it, but we
should support ISO-8859-1 po files. I thought I handled that in my old
patches already.

However if you like you can also do it the easy way first and let
someone else (e.g. me) fix the thing for real later.

Btw: I'm currently working on getting a Python 3 bitten build slave
running for another project of mine. When this is done, I'll continue
working on the Python3 version of Babel :-)

fs

Wichert Akkerman

Apr 5, 2011, 4:25:59 PM

to python...@googlegroups.com

On 2011-4-5 18:23, Strontium wrote:
> Ok,
>
> So far, most porting changes have been mechanical in nature and pretty
> straight forward once i wrap my head around them. I am on the hardest
> bit now. PO Files.
>
> MO Files are easy, because they have to be Binary, and I seem to have
> that working for py3k.
>
> PO Files, look like text files, but to py3.x they act something like
> binary files, because in theory, their encoding can change mid stream.

On a bit of a tangent: I have had several problems with Babel breaking
PO-files during processing, either due to bugs in wrapping or escaping.
polib (http://pypi.python.org/pypi/polib) gave me much better results
and appears to see significant uptake, so I am wondering if Babel should
start using polib instead of having its own po/mo implementation.

Wichert.

--
Wichert Akkerman <wic...@wiggy.net> It is simple to make things.
http://www.wiggy.net/ It is hard to make things simple.

Strontium

Apr 5, 2011, 9:14:26 PM

to python...@googlegroups.com

On 04/06/2011 01:02 AM, Felix Schwarz wrote:
> Hi,
>
> Am 05.04.2011 18:23, schrieb Strontium:
> It does. See http://babel.edgewall.org/ticket/255

Ok, I know it breaks API compatibility, but is it really necessary to
open the po file outside of write_po and read_po and pass the file
object in? If the file name was passed in instead of a file object,
then read_po and write_po could do this:
1. open po file in binary mode. (so that if the character set is not
legal utf-8 there are no errors).
2. look for a "Content-Type:" header. And record encoding, or default to
utf-8 if missing.
3. open po file in text mode, with correct encoding.
4. process as normal, as no encoding/decoding are necessary as it will
be handled by python.
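
The four steps above can be sketched like this. The function names and the regular expression are assumptions made for illustration, not Babel's API:

```python
import io
import re

# Matches the charset in a PO header line such as
# "Content-Type: text/plain; charset=iso-8859-1\n"
CHARSET_RE = re.compile(br'charset=([\w.:-]+)')


def detect_po_charset(filename, default='utf-8'):
    """Steps 1 and 2: scan the raw bytes for a charset declaration."""
    with io.open(filename, 'rb') as fileobj:
        for line in fileobj:
            if b'Content-Type:' in line:
                match = CHARSET_RE.search(line)
                if match:
                    return match.group(1).decode('ascii')
    return default


def open_po_as_text(filename):
    """Step 3: reopen in text mode with the detected encoding."""
    return io.open(filename, 'r', encoding=detect_po_charset(filename))
```

Step 4 is then ordinary text processing on the object returned by `open_po_as_text`. The sketch does not handle template files whose header still contains the placeholder value CHARSET; a real implementation would need to fall back to a default in that case.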

write_po and read_po are only used in anger by frontend.py and they all
are variants of:
    outfile = open(self.output_file, 'w')
    try:
        write_po(outfile, catalog)
    finally:
        outfile.close()

when with my proposal above you would just have:
    write_po(self.output_file, catalog)
and
    read_po(infile, locale)

The biggest issue for the Babel codebase with doing this is the tests,
as they use StringIO. But I would rather remove StringIO from the
tests for write_po and read_po, and have these functions work well on
py2.x and 3.x, than support the use of StringIO, as I struggle to see a
purpose for that outside the tests.

As for polib, that might be a better way to go eventually, but for this
effort (making Babel work on py3) polib itself does not support py3, so
it would have to be ported first. :(
polib works just like I propose here: it takes file names and handles
files internally.

> While I think that most po files will be in UTF-8 now but in order to
> support po files fully, I think we should go the "right" (aka hard) way.
> IMHO the limitation mentioned in #255 can stay until we fix it but we
> should support ISO-8859-1 po files. I thought I handled that in my old
> patches already.

If I'm going to mess with read_po I will try and fix this as well.

> However if you like you can also do it the easy way first and let
> someone else (e.g. me) fix the thing for real later.

not a fan of doing things twice :)

> Btw: I'm currently working on getting a Python 3 bitten build slave
> running for another project of mine. When this is done, I'll continue
> working on the Python3 version of Babel :-)

cool.

strontium

Jeroen Ruigrok van der Werven

Apr 6, 2011, 1:35:06 AM

to python...@googlegroups.com

-On [20110405 22:26], Wichert Akkerman (wic...@wiggy.net) wrote:
>On a bit of a tangent: I have had several problems with Babel breaking
>PO-files during processing, either due to bugs in wrapping or escaping.
>polib (http://pypi.python.org/pypi/polib) gave me much better results
>and appears to see significant uptake, so I am wondering if Babel should
>start using polib instead of having its own po/mo implementation.

The problem lies in the fact that you get yet another dependency for Babel.
On the other hand, holding onto Not Invented Here (NIH) is also not
productive.

Need to think some more on this.

--
Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
イェルーンラウフロックヴァンデルウェルヴェン
http://www.in-nomine.org/ | GPG: 2EAC625B

Atone me to my throes curtail...

Philip Jenvey

Apr 6, 2011, 1:37:12 AM

to python...@googlegroups.com

On Apr 5, 2011, at 10:35 PM, Jeroen Ruigrok van der Werven wrote:

> -On [20110405 22:26], Wichert Akkerman (wic...@wiggy.net) wrote:
>> On a bit of a tangent: I have had several problems with Babel breaking
>> PO-files during processing, either due to bugs in wrapping or escaping.
>> polib (http://pypi.python.org/pypi/polib) gave me much better results
>> and appears to see significant uptake, so I am wondering if Babel should
>> start using polib instead of having its own po/mo implementation.
>
> The problem lies in the fact that you get yet another dependency for Babel.
> On the other hand, holding onto Not Invented Here (NIH) is also not
> productive.
>
> Need to think some more on this.

Does it support Python 3?

--
Philip Jenvey

Felix Schwarz

Apr 6, 2011, 2:48:36 AM

to python...@googlegroups.com

Am 06.04.2011 07:37, schrieb Philip Jenvey:
> Does it support Python 3?

No, I don't think so.
Also the docs say: "polib requires python 2.5 or higher."

To me at least Python 2.4 support is really important for the next 2-3
years (while RHEL 5 is still actively maintained). But I guess given
enough interest we could fix that.

fs

Disclaimer: I never worked with polib, just looked at the source for 10
minutes.

Felix Schwarz

Apr 6, 2011, 2:52:40 AM

to python...@googlegroups.com

Am 06.04.2011 07:35, schrieb Jeroen Ruigrok van der Werven:
> The problem lies in the fact that you get yet another dependency for Babel.
> On the other hand, holding onto Not Invented Here (NIH) is also not
> productive.
>
> Need to think some more on this.

To me dependencies are not such a bad thing as long as the project
understands the difference between development and distribution:
- For developers all kinds of dependencies are fine. Even quite
complicated dependencies are ok.
- As a user I want to have as few dependencies as possible because I
don't know how to install them, things might go wrong etc.

Therefore you could include a dependency in your distributed files while
falling back to a system wide lib if it is not included. This also helps
linux distributions with their "no bundling" policy.
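
The "bundled copy first, system-wide library as fallback" idea could be expressed with a small helper. This is a sketch only: `import_first_available` is a hypothetical function and `babel._bundled.polib` is a made-up module path.

```python
import importlib


def import_first_available(*module_names):
    """Import and return the first module in module_names that exists."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Not installed or not bundled: try the next candidate.
            continue
    raise ImportError('none of %r could be imported' % (module_names,))


# Hypothetical usage: prefer a copy shipped in the release tarball, then
# fall back to a system-wide install.
# polib = import_first_available('babel._bundled.polib', 'polib')
```

A distribution that forbids bundling can then drop the bundled copy from its package without patching any import statements.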

It's evil though to have your dependencies included in your source (like
Django, twill, …). Bundling dependencies is purely for distributions.

Don't know what to do about polib though. ;-)

fs

Wichert Akkerman

Apr 6, 2011, 3:06:31 AM

to python...@googlegroups.com

The reason I suggested polib is that the Babel po-file implementation
has too many bugs which kept breaking our po-files, and Babel
development was effectively stalled. I was about to start writing my own
thing when I ran into polib, which appears to be actively maintained and
did not suffer from any of the problems I have run into with Babel, so
switching was a simple choice for me.

I am not saying polib is ideal; its documentation certainly leaves a lot
to be desired. But my gut feeling is that an effort to improve polib is
more worthwhile than maintaining a separate po-implementation in Babel.

Jeroen Ruigrok van der Werven

Apr 6, 2011, 3:31:18 AM

to python...@googlegroups.com

-On [20110406 08:55], Felix Schwarz (felix....@oss.schwarz.eu) wrote:
> - As a user I want to have as few dependencies as possible because I
> don't know how to install them, things might go wrong etc.

I am more thinking of use cases like the Trac project. There was already
some grumbling about pytz, so imagine if we add polib.

I guess asking David if it's ok to drop polib.py into Babel's source also
kind of defeats the whole purpose.

--
Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
イェルーンラウフロックヴァンデルウェルヴェン
http://www.in-nomine.org/ | GPG: 2EAC625B

I believe because it is impossible...

Wichert Akkerman

Apr 6, 2011, 3:54:52 AM

to python...@googlegroups.com

On 4/6/11 09:31 , Jeroen Ruigrok van der Werven wrote:
> -On [20110406 08:55], Felix Schwarz (felix....@oss.schwarz.eu) wrote:
>> - As a user I want to have as few dependencies as possible because I
>> don't know how to install them, things might go wrong etc.
>
> I am more thinking of use cases like the Trac project. There was already
> some grumbling about pytz, so imagine if we add polib.

Why are they so afraid of dependencies? Most of the applications I
install have dozens of external dependencies, and using standard tools
such as virtualenv or zc.buildout it really is trivial to manage them,
so I do not quite understand their worry. Especially if there is a
choice between having to do more work yourself when you already don't
have enough manpower versus getting things for free.

Wichert.

Felix Schwarz

Apr 6, 2011, 4:11:36 AM

to python...@googlegroups.com

Am 06.04.2011 09:54, schrieb Wichert Akkerman:
> Why are they so afraid of dependencies?

I think one of the reasons why Django is so successful is because they
don't have dependencies. It certainly contributes to Trac's popularity as
well.

When I worked for a company developing a popular Trac plugin, we had a
lot of issues with users not being able to install all dependencies.
These guys were not software developers or linux admins, many of them
use Windows and all that crap.

> Most of the applications I
> install have dozens of external dependencies and using standard tools
> such as virtualenv of zc.buildout is really is trivial to manage them,

I strongly disagree here. It might be easy to set up (though not trivial: think
of pypi being down, incompatible requirements because one of your
dependencies was updated etc.) but IMHO it's a PITA to maintain:
- easy_install, pip etc really suck compared to yum/aptitude
- no mirror network with auto-failover for pypi
- hard to get latest (security) updates for packages installed in
virtualenv
- no stable set of versions where you can get security fixes and
simple bug fixes but no API/ABI breakage for a certain time frame
(think CentOS/RHEL - 7+ years)

So virtualenv is nice for single deployments but if you want to maintain
stuff on a bigger scale (e.g. dozens of virtualenvs, lots of servers,
unskilled users) with minimum admin resources, it is definitely not easy
to manage.

fs

Wichert Akkerman

Apr 6, 2011, 4:18:24 AM

to python...@googlegroups.com

On 4/6/11 10:11 , Felix Schwarz wrote:
> Am 06.04.2011 09:54, schrieb Wichert Akkerman:
>> Why are they so afraid of dependencies?
>
> I think one of the reasons why Django is so successful is because they
> don't have dependencies. It certainly contributes to Tracs popularity as
> well.

Interestingly enough the tide is turning there, and Django is now
starting to be broken up in smaller pieces.

> I strongly disagree here. It might be easy to setup (not trivial, think
> of pypi being down, incompatible requirements because one of your
> dependencies was updated etc) but IMHO it's a PITA to maintain:
> - easy_install, pip etc really suck compared to yum/aptitude
> - no mirror network with auto-failover for pypi
> - hard to get latest (security) updates for packages installed in
> virtualenv
> - no stable set of versions where you can get security fixes and
> simple bug fixes but no API/ABI breakage for a certain time frame
> (think CentOS/RHEL - 7+ years)

All of these are just as problematic for a single package as for a
dozen. If you need to deploy to multiple machines tools like pip bundles
or zc.buildout allow you to completely lock down all versions and give
you an easily reproducible deployment.

Wichert.

David Fraser

Apr 6, 2011, 9:28:47 AM

to python...@googlegroups.com

I can see plenty of purpose for using StringIO for po files, and have done so fairly often: you may be passing around an in-memory copy of a PO file and want to change it; you may be grabbing it off the web; etc.

Just thought I'd mention it :)

Cheers
David

Wichert Akkerman

Apr 6, 2011, 9:41:51 AM

to python...@googlegroups.com

FWIW polib uses a slightly odd pattern for this: you pass a string to
its pofile method, and if os.path.exists(input) returns True it assumes
it to be a filename; otherwise it assumes it is the raw content of a
PO or MO file.
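
The pattern described above, reduced to a sketch. `read_po_source` is a hypothetical helper written for illustration and is not polib's code:

```python
import io
import os


def read_po_source(source, encoding='utf-8'):
    """Return PO text from source, a file name or the content itself."""
    if os.path.exists(source):
        # An existing path: treat the argument as a file name.
        with io.open(source, 'r', encoding=encoding) as fileobj:
            return fileobj.read()
    # Anything else is assumed to be the PO content already.
    return source
```

The oddness is visible in the fallthrough: a mistyped file name is silently treated as (invalid) PO content instead of raising an error about a missing file.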

Wichert.

Strontium

Apr 6, 2011, 11:29:43 AM

to python...@googlegroups.com

Ok, point taken. But is there a point in a StringBuffer in an encoding
other than unicode?

Felix Schwarz

Apr 6, 2011, 12:37:53 PM

to python...@googlegroups.com

Am 06.04.2011 17:29, schrieb Strontium:
> Ok, point taken, But is there a point in a StringBuffer in an encoding
> other than unicode?

I think this is a classic case of the Python 2 unicode problem (no real
separation between binary and ASCII string data). In Python 3 all of these
methods should take BytesIO streams.

fs

Strontium

Apr 6, 2011, 7:16:53 PM

to python...@googlegroups.com

I agree. But if that is the case, then all file objects should be
binary streams, not text streams, because again Python 2 doesn't make
much of a distinction between binary files and text files, whereas
Python 3 does.

Felix Schwarz

Apr 7, 2011, 3:42:07 PM

to python...@googlegroups.com

Not sure if I understand you completely but yes, I think as far as Babel
is concerned, we should treat all po files as binary until we parsed the
encoding. In Python 2 we only have StringIO for that but otherwise it
really should be BytesIO.

fs

Wichert Akkerman

Apr 8, 2011, 1:18:20 AM

to python...@googlegroups.com

On 2011-4-7 21:42, Felix Schwarz wrote:
> Not sure if I understand you completely but yes, I think as far as Babel
> is concerned, we should treat all po files as binary until we parsed the
> encoding. In Python 2 we only have StringIO for that but otherwise it
> really should be BytesIO.

Python 2.6 and later do have BytesIO.
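
For illustration, a small sketch of the same bytes-first handling written so that it runs unchanged on 2.6+ and 3.x. The sample data and the iso-8859-1 encoding are made up for the example.

```python
import io

# A byte string as it might arrive from disk or the network.
raw = b'msgid "cafe"\nmsgstr "caf\xe9"\n'

# io.BytesIO exists in Python 2.6+ and in Python 3.
buf = io.BytesIO(raw)
data = buf.read()

# The payload stays bytes until an encoding is chosen explicitly.
text = data.decode("iso-8859-1")
```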

Strontium

Apr 12, 2011, 11:58:46 AM

to python...@googlegroups.com

Well I just committed a big patch to the py3 tree. It now runs ALL
tests successfully for Py2.6, 2.7 and 3.2

I apologise for the size of the patch, but when I got to frontend.py
everything unravelled as it was all interconnected, and I've only just
gotten it all back together by fixing pofile and mofile handling.

I need to go through and make sure all doc tests are ported to unit
tests, but at this stage it looks like it should all work, at least as
far as the unit tests are concerned.

I made a BIG upgrade to the functionality of pofile reading.

AND I closed http://babel.edgewall.org/ticket/255

PO Files for Python 2.x can be read from:
Files. Alternatively, a file name passed in will automatically be opened
and read, or if it's not a file name but a string containing a PO file's
contents, that too will be read and processed.
It will automatically read the content-type encoding from the file and use
that, falling back to the encoding set in the catalog if none is declared.

PO Files for Python 3.x can be read from:
Text Files/TextIO - limited to the encoding the text file was opened with
(usually utf-8). This is a Py3 limitation, as I can't find any way to
re-open a file with a different encoding. Well, actually, for text files I
could do something like:

    filename = getattr(fileobj, 'name', '')
    fileobj = open(filename, 'rb')

but that seems a little perverse. If the consensus is that this would be a
good thing, I can implement it easily enough. Then text files would behave
just like binary files and only TextIO would be limited to utf-8.

Binary Files/BytesIO/binary string - detects the encoding and uses that,
or defaults to the catalog encoding if none is found.
Text String - if it's a file name, that file is automatically opened and
read like a binary file. Otherwise the text string is processed like a
text file.
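
The Py3 reading rules above can be sketched roughly as follows. This is an illustrative outline, not the committed code: the names po_text and _decode, the regular expression, and the utf-8 default are all assumptions.

```python
import os
import re

# Finds the charset declared in a PO header's Content-Type line.
CHARSET_RE = re.compile(br"charset=([\w.-]+)")


def _decode(data, default_encoding):
    """Decode bytes using the declared charset, else the default."""
    match = CHARSET_RE.search(data)
    encoding = match.group(1).decode("ascii") if match else default_encoding
    return data.decode(encoding)


def po_text(source, default_encoding="utf-8"):
    """Return PO content as text for any supported source (hypothetical)."""
    if isinstance(source, bytes):
        # Binary string: detect the encoding from the content.
        return _decode(source, default_encoding)
    if isinstance(source, str):
        if os.path.exists(source):
            # Text string naming a file: read it like a binary file.
            with open(source, "rb") as handle:
                return _decode(handle.read(), default_encoding)
        # Any other text string is treated as already-decoded content.
        return source
    data = source.read()
    if isinstance(data, bytes):
        # Binary file or BytesIO.
        return _decode(data, default_encoding)
    # Text file or StringIO: stuck with the encoding it was opened in.
    return data
```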

Writing is a little more restricted.
PO File writing for Py2.x is essentially unchanged.
For Py3.x:
Text Files/TextIO are forced to use the encoding specified when the file
was opened (usually utf-8).
Binary Files/BytesIO are encoded in the encoding set in the catalog.
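
As a rough sketch of that Py3 writing rule (the function name write_po_text and its catalog_encoding parameter are hypothetical, chosen only for the example):

```python
import io


def write_po_text(fileobj, text, catalog_encoding="utf-8"):
    """Write PO text to a text or binary stream (hypothetical helper)."""
    if isinstance(fileobj, io.TextIOBase):
        # Text stream: its own encoding wins, so write the text as-is.
        fileobj.write(text)
    else:
        # Binary stream: encode with the catalog's encoding.
        fileobj.write(text.encode(catalog_encoding))
```

With this split, only binary streams honour the catalog's encoding; a text stream silently keeps whatever encoding the caller opened it with.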

Again I could possibly use the get file name and re-open trick above to
coerce the text file into binary mode, but that seems even more
perverse, given the file is already open for write and we don't really
know what the caller will do with the original file handle after the call.

For MO files, on Py3 the fileobj must be a binary file. I could make
read_mo more forgiving using the re-open trick above, but I don't think
it's worth it, and it seems very hacky.

So at the moment, I am very confident this is a working Babel for Py3,
and it is exactly the same source tree for both versions. I added a
couple of custom fixers to fix up some declarations. It's possible that
some of my "if py 3 do this, else do that" paths could be replaced with
custom fixers, but I'm hardly an expert at writing fixers (which seems
like a black art) and I'm not sure it would be worth it. I implemented
the ones I've added because there were a few of them and the manual
change was hugely messy.

So at this stage, I'd love some constructive criticism, because I think
the work is close to done.

Strontium
