Hello NLTK developers,
The idea is to support Python 2.6, 2.7 and 3.2 from a single codebase. A bit of background: some major libraries (django, webob, pyramid) were ported using this approach; I have ported several small Python libraries this year the same way and found it works well. The NLTK port is not ready yet and doesn't run under Python 3 at the moment, but the Python 2 tests have the same failures as in the master branch, so I hope I didn't break anything for Python 2. The plan is to update the existing 2.x code to use more recent idioms and then add Python 3 support. Python 2.5 support is already dropped in this branch to make the transition easier and the code better. The first several commits are "cowboy commits": I was fixing random things in random modules, sorry about that. Later commits take a better approach: each one fixes a particular aspect of the NLTK code, e.g. the 'reduce' builtin that was removed in py3k, or the changed iteration protocol.
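To give a rough idea of what the two fixes mentioned above look like in a single codebase, here is a minimal sketch. It is not taken from NLTK: the Countdown class is a made-up example, and aliasing 'next' to '__next__' is just one common way to satisfy both iteration protocols, not necessarily the one used in the branch.

```python
from __future__ import print_function

# 'reduce' is a builtin in Python 2 but lives only in functools in Python 3.
# functools.reduce exists since Python 2.6, so this import works on both.
from functools import reduce


class Countdown(object):
    """Hypothetical iterator (not NLTK code) usable from Python 2 and 3."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        # Python 3 looks up __next__ when advancing an iterator.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    # Python 2 looks up next() instead, so alias it to the same method.
    next = __next__


total = reduce(lambda a, b: a + b, Countdown(4))
print(total)               # 4 + 3 + 2 + 1
print(list(Countdown(3)))
```

The same file runs unchanged under both interpreters, which is the whole point of the single-codebase approach.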
I'm using the excellent python-modernize utility (https://github.com/mitsuhiko/python-modernize) to find and fix some 2+3 issues, running its fixers one by one. It doesn't make code 2+3 compatible automatically, doesn't find all incompatibilities and is not always correct, but it is still a very useful helper for automating tedious search/replace tasks.
I don't see how the porting work can be parallelized right now. Please wait a couple of days :) But once the library is updated to the point where the py32 tests run, a lot of help will be needed to make sure NLTK actually works under both Python 3 and Python 2. Code review, extra test coverage, fixing failing tests and checking whether the updated NLTK works with your codebase are all very welcome!