Genshi Ported to Python 3


Simon Cross

Sep 5, 2010, 9:58:03 AM
to gen...@googlegroups.com
Hi Jeroen (and others)

It looks like the sprint was fairly successful and we have a patch
that updates Genshi to be Python 3 compatible (without losing Python 2
compatibility). The patch is available at either of:

* bitbucket -- http://bitbucket.org/hodgestar/genshi-py3k
* CTPUG wiki --
http://ctpug.org.za/attachment/wiki/Meeting20100904/genshi-py3k.diff

Bitbucket has the full commit history. The sprint occurred in two
stages. First we worked on simply porting to Python 3 without
maintaining compatibility with Python 2. This work was done in the
2to3 bitbucket branch and took about seven hours. In the second stage
we created a single code-base that supports Python 2 and Python 3 (via
2to3 and some helpers). This was done in the default branch and took
about eight hours.

We've tested the result under Python 2.4, 2.5, 2.6, 3.1 and 3.2 and on
Linux (Ubuntu Lucid and Maverick) and Mac OS X.


API Change
^^^^^^^^^^^

We made one important API change. Previously the default encoding used
by Genshi in a number of functions was UTF-8. This is a little awkward
in Python 3 since strings are unicode (and it's generally bytes that
have to be explicitly asked for). We opted to change the default
encoding to None (i.e. use a unicode string). Although this breaks
compatibility, we think it eases the migration towards Python 3 (and
it made porting easier).
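
To illustrate the convention described above, here is a minimal sketch. It is not Genshi's code: the function name serialize is hypothetical and only mimics an API whose encoding parameter defaults to None.

```python
def serialize(chunks, encoding=None):
    """Join text chunks; return bytes only when an encoding is requested.

    Hypothetical stand-in for illustration, not a Genshi function.
    """
    text = u''.join(chunks)
    if encoding is None:
        return text                  # unicode string (str on Python 3)
    return text.encode(encoding)     # byte string in the requested encoding


as_text = serialize([u'<p>', u'caf\xe9', u'</p>'])
as_bytes = serialize([u'<p>', u'caf\xe9', u'</p>'], encoding='utf-8')
```

Under this convention, callers that relied on the old UTF-8 default would presumably need to pass encoding='utf-8' explicitly.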


Usage
^^^^^^

The patch should apply cleanly against trunk. After applying one can
build, test and install under Python 2 using:

$ python setup.py build
$ python setup.py test
$ python setup.py install

and under Python 3 using:

$ python3 setup.py build
$ python3 setup.py test
$ python3 setup.py install

Running setup.py under Python 3 requires Distribute (which runs under
Python 3 and provides hooks for running 2to3).
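
As a hedged sketch of what "hooks for running 2to3" can look like in a setup.py: Distribute (and later setuptools, until support was removed in setuptools 58) accepted the keywords use_2to3, convert_2to3_doctests and use_2to3_fixers. The values below are assumptions for illustration; the post does not show the ones the patch actually uses, and the helper name py3_setup_options is made up.

```python
import sys


def py3_setup_options(version_info=sys.version_info):
    """Return extra setup() keyword arguments for the given interpreter."""
    extra = {}
    if version_info >= (3,):
        extra['use_2to3'] = True               # run 2to3 over the sources at build time
        extra['convert_2to3_doctests'] = []    # text files whose doctests need converting
        extra['use_2to3_fixers'] = ['fixes']   # package with custom fixers (name assumed)
    return extra


# In a real setup.py this would feed into: setup(name=..., **py3_setup_options())
options = py3_setup_options()
```

Keeping the options in a function that is only consulted under Python 3 leaves the Python 2 build untouched.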


Patch Tricks
^^^^^^^^^^^^

- "if not isinstance(s, unicode):" is converted by 2to3 while "if
isinstance(s, str)" is not.
- "u'foo'.encode('utf-8')" creates a byte string in UTF-8 both before
and after 2to3 while "'foo'" does not.
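
The reason behind the first trick is that 2to3 renames unicode to str but leaves str alone, so a test written against str means "bytes" on Python 2 and "text" on Python 3. A small sketch of both tricks follows; the helper to_text is hypothetical, the try/except shim only exists so the snippet runs unconverted on Python 3, and the u'' literals need Python 3.3 or later when run without 2to3.

```python
try:
    unicode                      # Python 2: the built-in text type
except NameError:
    unicode = str                # Python 3 stand-in; 2to3 rewrites the name instead


def to_text(s, encoding='utf-8'):
    """Decode byte strings and pass text through unchanged."""
    if not isinstance(s, unicode):   # correct both before and after 2to3
        s = s.decode(encoding)
    return s


data = u'foo'.encode('utf-8')    # a byte string on Python 2 and Python 3
text = to_text(data)
```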


Patch Details
^^^^^^^^^^^^^

.hgignore
README
- Hg ignore file and note about the patch. No need to commit.

setup.py
MANIFEST.in
- changes to make setup.py importable by both Python 2 and Python 3
(without 2to3).
- include tests in build so that "python3 setup.py test" works.
- Distribute runs the tests on the build rather than on the source
since the source is not modified by 2to3.
- add Distribute 2to3 options when run with Python 3.

doc/common/doctools.py
- change to an svn:external to make prints python3 compatible.
- doctools is imported by setup.py so this needs to be importable by
python3 without running 2to3 on it.

examples_to_py3k.sh
- script to run 2to3 on the python files in the examples folder.

fixes/fix_unicode_in_strings.py
- We wrote a custom 2to3 fixer to fix unicode output strings inside
doctests and other string constants.

genshi/_speedups.c
- ported using #ifdefs for Python 3.

genshi/compat.py
- new home for cross-python-version compatibility functions

genshi/core.py
genshi/tests/core.py
- change default encoding from UTF-8 to None (i.e. unicode strings)
- stringrepr is no longer used inside __repr__ on Python 3, where repr()
already returns a string (i.e. unicode)

genshi/input.py
genshi/tests/input.py
- change default encoding for HTML() to None.
- track changes to expat parser in Python 3 (mostly it accepts bytes
instead of strings).

genshi/output.py
genshi/tests/output.py
- change default encoding for encode() to None.

genshi/util.py
- move cross-Python-version compatibility fixes into genshi.compat

genshi/filters/html.py
- minor changes to track encoding=None API change

genshi/filters/tests/html.py
- renamed to test_html.py. In Python 3 there is a top-level module
named html and having this overridden when
running "python3 genshi/filters/tests/__init__.py" or similar was
incredibly frustrating.
- left genshi/filters/html.py with its own name since
genshi/filters/ doesn't typically end up in sys.path. :)

genshi/filters/i18n.py
genshi/filters/tests/i18n.py
- ugettext and friends are gone in Python 3 (and only gettext and
friends exist and they now handle unicode).
- Some \ line continuations inside doctests confused 2to3 so we removed them.
- Testing picked up a problem (already present in trunk) where
Translator.__call__ could end up defining gettext
as an endlessly recursive function. Noted with a TODO.
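
One way to cope with the missing ugettext, sketched under assumptions: the helper name get_ugettext is hypothetical and this is not necessarily how the patch handles it. It falls back to gettext when ugettext does not exist.

```python
import gettext


def get_ugettext(translations):
    """Return the unicode-returning gettext function of a translations object."""
    # Python 2 translation objects have ugettext(); Python 3 only has
    # gettext(), which already returns unicode.
    return getattr(translations, 'ugettext', translations.gettext)


translate = get_ugettext(gettext.NullTranslations())
message = translate(u'Hello')
```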

genshi/filters/transform.py
genshi/filters/tests/transform.py
- minor changes to track encoding=None API change

genshi/template/astutil.py
- AST for raise has changed in Python 3.
- Python 3 adds AST nodes for individual arguments and Bytes.

genshi/template/base.py
- distinguish between bytes and unicode in Python 3 compatible way.

genshi/template/directives.py
genshi/template/tests/directives.py
- slightly odd syntax changes to make the 2to3 .next() fixer pick
up *stream.next().
- minor test fix for change in behaviour of division (/) in Python 3.
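
For readers unfamiliar with the two behaviour changes mentioned for directives.py, a short illustration. The patch itself keeps the .next() method calls and lets 2to3 rewrite them, since the built-in next() only exists from Python 2.6 onwards; the stream contents here are made up.

```python
# A stand-in event stream; real Genshi events carry more data.
stream = iter([(u'START', u'p'), (u'END', u'p')])

# Python 3 spelling; Python 2 code writes stream.next(), which 2to3 converts.
first = next(stream)

true_quotient = 7 / 2     # 3.5 on Python 3 (3 on Python 2 without __future__)
floor_quotient = 7 // 2   # 3 on every version
```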

genshi/template/eval.py
genshi/template/tests/eval.py
- use genshi.compat functions for dealing with code objects.
- replace doctests that rely on exception names with uglier but
more compatible try:.. except:.. doctests
- handle filename preferences of Python 2 and 3 (2 prefers bytes, 3
prefers unicode).
- ifilter is gone from itertools in Python 3 so use repeat for tests instead.

A more minor change is that the ASTTransformer in Genshi coerces byte
strings to unicode (in genshi/template/eval.py -- visit_Str). I'm not
sure I fully understand the rationale for this in Python 2 but in
Python 3 it's definitely wrong. We've retained the old behaviour in
Python 2 and left bytes as they are in Python 3.

genshi/template/loader.py
genshi/template/tests/loader.py
- add 'b' to file modes to ensure it's loaded as bytes in Python 3.
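
A minimal sketch of why the 'b' matters: opened in binary mode, a file yields bytes on both Python versions, and decoding becomes an explicit step. The function name load_template_source is hypothetical, and how Genshi's loader passes the bytes on is not shown in this post.

```python
import io
import os
import tempfile


def load_template_source(path, encoding='utf-8'):
    """Read a template file as bytes, then decode it explicitly."""
    with io.open(path, 'rb') as fileobj:   # 'b' gives bytes on Python 2 and 3
        raw = fileobj.read()
    return raw.decode(encoding)


# Small demonstration using a temporary file.
handle, path = tempfile.mkstemp(suffix='.html')
os.close(handle)
with io.open(path, 'wb') as fileobj:
    fileobj.write(u'<p>caf\xe9</p>'.encode('utf-8'))
source = load_template_source(path)
os.remove(path)
```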

genshi/template/tests/markup.py
- use BytesIO compatibility function in test

genshi/template/plugin.py
genshi/template/tests/plugin.py
- track encoding=None change.

genshi/template/text.py
- use not isinstance(s, unicode) instead of isinstance(s, str)


Schiavo
Simon

Jeroen Ruigrok van der Werven

Sep 6, 2010, 5:13:29 AM
to gen...@googlegroups.com
-On [20100905 15:58], Simon Cross (hodg...@gmail.com) wrote:
>Hi Jeroen (and others)

>
>The patch should apply cleanly against trunk. After applying one can
>build, test and install under Python 2 using:

In the meantime I have found out that I don't have commit privileges to
Genshi. I thought I had them (at some point), but it turns out I do not
(anymore).

You will need to ask Pedro to apply any of the patches.

--
Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
イェルーン ラウフロック ヴァン デル ウェルヴェン
http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B
In every colour there's the Light...

Joshua Rowley

Sep 6, 2010, 5:13:32 AM
to gen...@googlegroups.com
Brilliant - nice work guys. I'm going to give it a whirl asap.

Will you be adding a release on the Genshi homepage and on pypi's py3k module list?

Thanks

Josh



Simon Cross

Sep 6, 2010, 5:30:42 AM
to gen...@googlegroups.com
On Mon, Sep 6, 2010 at 11:13 AM, Joshua Rowley <jos...@vannelluna.com> wrote:
> Brilliant - nice work guys. I'm going to give it a whirl asap.

Woot. Let us know how it goes. :)

> Will you be adding a release on the Genshi homepage and on pypi's py3k
> module list?

I think the first step is to get the patch into trunk but I've added
the "Programming Language :: Python :: 3" trove classifier to setup.py
in the bitbucket branch.

Schiavo
Simon

Christian Boos

Sep 6, 2010, 5:41:09 AM
to gen...@googlegroups.com
On 9/6/2010 11:13 AM, Jeroen Ruigrok van der Werven wrote:
> -On [20100905 15:58], Simon Cross (hodg...@gmail.com) wrote:
>> Hi Jeroen (and others)
>>
>> The patch should apply cleanly against trunk. After applying one can
>> build, test and install under Python 2 using:
> In the meantime I have found out that I don't have commit privileges to
> Genshi. I thought I had them (at some point), but it turns out I do not
> (anymore).

You still have them, but they were only given on trunk.
I've added you to sandboxers (hope Christopher doesn't mind), so that
you could integrate that branch somewhere below /branches/experimental
if you want, which I believe is the appropriate next step before Chris
integrates it in trunk if he'd like to.

-- Christian

Simon Cross

Sep 6, 2010, 5:49:00 AM
to gen...@googlegroups.com
On Mon, Sep 6, 2010 at 11:41 AM, Christian Boos <cb...@neuf.fr> wrote:
> You still have them, but they were only given on trunk.
> I've added you to sandboxers (hope Christopher doesn't mind), so that
> you could integrate that branch somewhere below /branches/experimental
> if you want, which I believe is the appropriate next step before Chris
> integrates it in trunk if he'd like to.

Cool. Thanks!

Would it be easier if one of us (i.e. the patch creators) got commit
access to the experimental branch? Then we could apply any fixes or
changes requested there ourselves without continually having to
involve someone else? I already have commit access to the Bitten
repository (username: hodgestar).

Schiavo
Simon

Christian Boos

Sep 6, 2010, 6:17:06 AM
to gen...@googlegroups.com

Sure! You're also a sandboxer now.

Note also that as a co-administrator on edgewall.org, I'm only trying to
facilitate collaboration on the projects there, but I don't want to step
over the actual owners and maintainers of the projects.

So be sure to have a chat with Christopher at some point, to see if he's
OK with the general idea of having such an experimental branch later to
be included in trunk. I think that having the code "at hand" in the
Genshi repository, in easily digestible chunks may only make this
easier, so you could take this opportunity to fold some of the original
changes in logical pieces, without the trials and errors, for easier review.

-- Christian

Simon Cross

Sep 6, 2010, 6:26:52 AM
to gen...@googlegroups.com
On Mon, Sep 6, 2010 at 12:17 PM, Christian Boos <cb...@neuf.fr> wrote:
> Sure! You're also a sandboxer now.

Woot.

> Note also that as a co-administrator on edgewall.org, I'm only trying to
> facilitate collaboration on the projects there, but I don't want to step
> over the actual owners and maintainers of the projects.

Understood.

> So be sure to have a chat with Christopher at some point, to see if he's OK
> with the general idea of having such an experimental branch later to be
> included in trunk. I think that having the code "at hand" in the Genshi
> repository, in easily digestible chunks may only make this easier, so you
> could take this opportunity to fold some of the original changes in logical
> pieces, without the trials and errors, for easier review.

I've been kind of hoping Christopher would see these emails and chime in. :)

Re-organizing the original work into themed commits sounds sensible
but it would probably be best to get a basic acceptance of the
approach (use of 2to3, API changes, etc) before putting the effort in
(I guess it'd need a few hours from me at some point).

Schiavo
Simon

Simon Cross

Sep 6, 2010, 6:32:49 AM
to gen...@googlegroups.com
On Mon, Sep 6, 2010 at 12:26 PM, Simon Cross <hodg...@gmail.com> wrote:
> I've been kind of hoping Christopher would see these emails and chime in. :)

Hmm. Nickserv on Freenode IRC says:

-NickServ- Information on cmlenz (account cmlenz):
-NickServ- Registered : Aug 30 12:41:20 2004 (6 years, 1 week, 0 days, 21:48:14 ago)
-NickServ- Last addr : ~cmlenz@2002:543a:3f0c:0:223:6cff:fe99:3440
-NickServ- Last seen : Apr 23 23:11:02 2010 (19 weeks, 2 days, 11:18:32 ago)

So Christopher hasn't been seen on IRC since the end of April.

Schiavo
Simon

Christopher Lenz

Sep 6, 2010, 9:08:41 AM
to gen...@googlegroups.com
On 05.09.2010, at 15:58, Simon Cross wrote:
> It looks like the sprint was fairly successful and we have a patch
> that updates Genshi to be Python 3 compatible (without losing Python 2
> compatibility). The patch is available at either of:
>
> * bitbucket -- http://bitbucket.org/hodgestar/genshi-py3k
> * CTPUG wiki --
> http://ctpug.org.za/attachment/wiki/Meeting20100904/genshi-py3k.diff
>
> Bitbucket has the full commit history. The sprint occurred in two
> stages. First we worked on simply porting to Python 3 without
> maintaining compatibility with Python 2. This work was done in the
> 2to3 bitbucket branch and took about seven hours. In the second stage
> we created a single code-base that supports Python 2 and Python 3 (via
> 2to3 and some helpers). This was done in the default branch and took
> about eight hours.

Awesome work! I'm currently unhealthily swamped with work (and have been for way too long), so I've had no time for Genshi. This is just to let you know I'm still alive (kind of) and this work you've put in here is much appreciated. I'd be happy to give out more commit bits.

Cheers,
--
Christopher Lenz
cml...@gmail.com
http://www.cmlenz.net/

Simon Cross

Sep 6, 2010, 9:55:49 AM
to gen...@googlegroups.com
On Mon, Sep 6, 2010 at 3:08 PM, Christopher Lenz <cml...@gmail.com> wrote:
> Awesome work! I'm currently unhealthily swamped with work (and have been for way
> too long), so I've had no time for Genshi. This is just to let you know I'm still alive (kind
> of) and this work you've put in here is much appreciated. I'd be happy to give out
> more commit bits.

Thanks! And glad to hear you're still alive. :D

Since you're busy could you perhaps appoint / bless someone to look at
the patch in your stead?

I'm happy to be given commit access and take on the responsibility of
getting something committed to trunk if you'd like that (although I
really think it would be better if someone with more Genshi
development experience did it).

At the moment I envisage the process as:

0) Get general okay of approach taken in patch from eventual committer
(and make changes as needed).
1) Try out patch with Trac under Python 2 (and fix any obvious problems).
2) Speak to Trac developers and get their buy-in on the patch (and
make any changes needed).
3) Re-organize changesets and commit to experimental branch.
4) Poll user community for feedback for a few (say four) weeks.
5) Commit to trunk.
6) Fix any issues that arise from users who didn't participate in step 4. :)

I know from my own Genshi use that I pulled in Genshi trunk by default
for a while so we probably don't want to surprise people by landing
straight to trunk in step 3 (although it would short circuit the
process a bit).

Schiavo
Simon

Simon Cross

Oct 2, 2010, 3:30:25 AM
to gen...@googlegroups.com
On Mon, Sep 6, 2010 at 3:55 PM, Simon Cross <hodg...@gmail.com> wrote:
> 0) Get general okay of approach taken in patch from eventual committer
> (and make changes as needed).

I'm taking this as read for the moment and moving on. :)

> 1) Try out patch with Trac under Python 2 (and fix any obvious problems).

With the attached two-line patch to Trac (which I believe is backwards
compatible with Genshi 0.6), Trac works with the genshi-py3k hg
branch.

> 2) Speak to Trac developers and get their buy-in on the patch (and
> make any changes needed).

This is next on the agenda. :)

Schiavo
Simon

explicit-utf8-template-encoding.diff

Simon Cross

Oct 2, 2010, 3:35:16 AM
to gen...@googlegroups.com
On Sat, Oct 2, 2010 at 9:30 AM, Simon Cross <hodg...@gmail.com> wrote:
>> 1) Try out patch with Trac under Python 2 (and fix any obvious problems).
>
> With the attached two-line patch to Trac (which I believe is backwards
> compatible with Genshi 0.6), Trac works with the genshi-py3k hg
> branch.

I should clarify that "works" means "works under Python 2.x". Porting
Trac to Python 3 is left as an exercise for the reader. :)

Schiavo
Simon

Christian Boos

Oct 2, 2010, 3:44:40 AM
to gen...@googlegroups.com

No worries, the patch looks fine and will soon be in trunk ;-)

Now Babel is next on the list, for the people wanting to get Trac3k ;-)

-- Christian

Simon Cross

Oct 2, 2010, 3:57:28 AM
to gen...@googlegroups.com
On Sat, Oct 2, 2010 at 9:44 AM, Christian Boos <cb...@neuf.fr> wrote:
> No worries, the patch looks fine and will soon be in trunk ;-)

Woot. Thanks!

> Now Babel is next on the list, for the people wanting to get Trac3k ;-)

My l10n foo is lacking but David Fraser on the other hand has had lots
of experience with Pootle. Anyway, one step at a time for now (for me
anyway). :)

Schiavo
Simon

Christian Boos

Oct 2, 2010, 7:08:51 AM
to gen...@googlegroups.com
On 10/2/2010 9:57 AM, Simon Cross wrote:
> On Sat, Oct 2, 2010 at 9:44 AM, Christian Boos<cb...@neuf.fr> wrote:
>> No worries, the patch looks fine and will soon be in trunk ;-)
> Woot. Thanks!

I've committed it, and indeed this seems to work well. Congrats!

However, there are a couple issues related to the tests, e.g.

ERROR: Regression test related to #5795
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Workspace\src\trac\repos\trunk\trac\mimeview\tests\patch.py", line 38, in setUp
    self.patch_html = Stream(list(HTMLParser(patch_html)))
  File "build\bdist.win32\egg\genshi\core.py", line 273, in _ensure
    event = stream.next()
  File "build\bdist.win32\egg\genshi\input.py", line 443, in _coalesce
    for kind, data, pos in chain(stream, [(None, None, None)]):
  File "build\bdist.win32\egg\genshi\input.py", line 335, in _generate
    raise UnicodeError("source returned bytes, but no encoding specified")
UnicodeError: source returned bytes, but no encoding specified


And a couple more. Some seem to be specific to the tests themselves, but
a couple of failures in the notification tests may actually correspond
to real problems (in total, failures=1, errors=24 with 'make unit-test',
and no problems at all with 'make functional-test').

I briefly had a look but didn't find obvious fixes, so you may want to
help us there as well ;-)

-- Christian

Simon Cross

Oct 2, 2010, 2:45:12 PM
to gen...@googlegroups.com
On Sat, Oct 2, 2010 at 1:08 PM, Christian Boos <cb...@neuf.fr> wrote:
> However, there are a couple issues related to the tests, e.g.

Patch for tests and issues revealed by tests attached.

trac/web/main.py:
- Explicitly specify encoding when sending project index.

trac/notification.py:
- Explicitly specify encoding when sending email notifications.

trac/mimeview/api.py:
- Cosmetic change that makes it explicit that stream is expected to
be a list of unicode but not a list of bytes.

trac/mimeview/tests/*
- Explicitly specify encoding when constructing streams to test against.
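
The failure mode behind these fixes can be sketched with a stand-in function. This is not Genshi's parser: ensure_text is a hypothetical name and only the error message is taken from the traceback quoted earlier in the thread. The idea is that bytes are accepted only when the caller says how to decode them.

```python
def ensure_text(source, encoding=None):
    """Return source as text, refusing to guess how bytes are encoded."""
    if isinstance(source, bytes):
        if encoding is None:
            raise UnicodeError("source returned bytes, but no encoding specified")
        return source.decode(encoding)
    return source


# Passing the encoding explicitly is the shape of the fix listed above.
html = ensure_text(b'<p>caf\xc3\xa9</p>', encoding='utf-8')

# Without an encoding the same input is rejected.
try:
    ensure_text(b'<p>caf\xc3\xa9</p>')
    failed = False
except UnicodeError:
    failed = True
```

On Python 2, bytes is an alias for str, so plain string literals take the same path; that is presumably why existing tests built from such literals started failing.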

Schiavo
Simon

genshi-py3k-test-patches.diff

Christian Boos

Oct 5, 2010, 1:40:14 PM
to gen...@googlegroups.com

Great! Patch is in with a little bonus ;-)

http://trac.edgewall.org/changeset/10176

Hm, just saw that I could have said in the commit message that this was
a follow-up to r10168. Now that we have the hg and git mirrors I can
no longer fix the svn:log property after the fact... too bad.

-- Christian

Simon Cross

Oct 24, 2010, 7:02:35 PM
to gen...@googlegroups.com
On Mon, Sep 6, 2010 at 3:55 PM, Simon Cross <hodg...@gmail.com> wrote:
> 3) Re-organize changesets and commit to experimental branch.

This evening I created an svn branch for the Python 3 work at:

http://svn.edgewall.org/repos/genshi/branches/experimental/py3k
http://genshi.edgewall.org/browser/branches/experimental/py3k

I opted not to attempt to break the patch up into pieces that
individually pass all their tests but instead broke the patch up
thematically into seven commits based on the area of Genshi involved.
I don't think it's particularly useful to think of this patch as a
series of steps -- there's really only one step (adding support for
Python 3). I hope the thematic breakdown should allow interested
people to follow what was happening to each part of Genshi without
having to juggle lots of small changesets or searching through a
single giant changeset.

The commits can be viewed at:

http://genshi.edgewall.org/timeline?from=10%2F25%2F10&daysback=1&changeset=on

I still need to get the small tweak to doc/common/doctools.py in to
the Edgewall repositories but after that I'll move on to step 4.

Schiavo
Simon

Simon Cross

Oct 24, 2010, 7:10:17 PM
to gen...@googlegroups.com
Hi Christian

On Tue, Oct 5, 2010 at 7:40 PM, Christian Boos <cb...@neuf.fr> wrote:
> Great! Patch is in with a little bonus ;-)

Thanks! And thanks for the mention in the thanks file. :)

Could I ask one more commit favour? :) The attached patch upgrades
doctools.py from
https://svn.edgewall.org/repos/edgewall/tools/doc/doctools.py to work
under both Python 2 and Python 3. Genshi imports it from its setup.py
file so I can't rely on 2to3 to fix it. Could you commit the patch for
me if you think it's okay?

Schiavo
Simon

doctools-py3k.diff

Simon Cross

Nov 5, 2010, 1:11:23 PM
to gen...@googlegroups.com
On Mon, Oct 25, 2010 at 1:10 AM, Simon Cross <hodg...@gmail.com> wrote:
> Could I ask one more commit favour? :)  The attached patch upgrades
> doctools.py from
> https://svn.edgewall.org/repos/edgewall/tools/doc/doctools.py to work
> under both Python 2 and Python 3. Genshi imports it from its setup.py
> file so I can't rely on 2to3 to fix it. Could you commit the patch for
> me if you think it's okay?

The other Simon (osimons) gave me commit access to edgewall/tools and
I've committed the change. Genshi Py3k branch is now all set for
hordes of users to descend on it. :)

Schiavo
Simon
