py3k raise syntax not backward compatible with_traceback

AnilG

Jul 9, 2011, 11:04:07 AM
to gevent: coroutine-based Python network library
I'm trying to build gevent in python 3.1.

I get invalid syntax on line 381 of pywsgi.py for:
raise exc_info[0], exc_info[1], exc_info[2]
but 2to3 wants to replace it with:
raise exc_info[0](exc_info[1]).with_traceback(exc_info[2])
which is invalid in python 2.

I think this works in python 2:
raise exc_info[0](exc_info[1])
but python 2 doesn't have an alternative way to write the traceback.

In order to maintain code that is both forward and backward
compatible;
does this require raising exceptions through a function call,
where different implementations are kept in DIFFERENT SOURCE FILES,
one for py2 and one for py3k,
and one is deleted or selected for the build by version?

Or does the code remain in py2 compatible syntax,
and for py3k builds part of the build process is to run 2to3 first?
Perhaps a call to 2to3 can be included in setup.py?

Has anyone thought about this and is there a known or preferred
solution?

Anil

Damien Churchill

Jul 9, 2011, 12:58:25 PM
to gev...@googlegroups.com

I have done some work towards this in the hg tip. My solution was to use exec to define a global raise_ function that is compatible with both versions.
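
A minimal sketch of what such a raise_ helper could look like, assuming the signature shown here (the actual function in the branch is not quoted in this thread). The Python 2 form is kept inside a string so that Python 3 never has to parse the three-argument raise:

```python
import sys

if sys.version_info[0] >= 3:
    def raise_(exc_type, exc_value=None, tb=None):
        """Re-raise an exception, keeping the original traceback."""
        if exc_value is None:
            exc_value = exc_type()
        raise exc_value.with_traceback(tb)
else:
    # The three-argument raise is a SyntaxError under Python 3, so it is
    # hidden in a string that only Python 2 ever compiles.
    exec("def raise_(exc_type, exc_value=None, tb=None):\n"
         "    raise exc_type, exc_value, tb\n")
```

With this in place, the line from pywsgi.py could be written as raise_(exc_info[0], exc_info[1], exc_info[2]) on either version, at the cost of one extra function call.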

AnilG

Jul 10, 2011, 3:52:57 AM
to gevent: coroutine-based Python network library
> > I get invalid syntax on line 381 of pywsgi.py for:
> > raise exc_info[0], exc_info[1], exc_info[2]
> > but 2to3 wants to replace it with:
> > raise exc_info[0](exc_info[1]).with_traceback(exc_info[2])
> > which is invalid in python 2.

> I have done some work towards this in the hg tip, my solution was to use
> exec to define a raise_ global function that is compatible on both versions.

Thanks, Damien, I believe I have the most recent tip from
ssh://h...@bitbucket.org/denis/gevent,
but I don't see a def raise_ anywhere? You didn't push yet?

I was thinking a careful call of 2to3 in setup.py with the right
fixers could provide
predictable code without requiring an additional function call,
but it's not going to be performance critical code anyway.

I've noticed some improvements in the tip in the last few months.
Are you working towards py3k gevent too?
Can I help with the work?
Is Denis deciding or guiding the implementation decisions?

Where are you on the other py3k issues?
Just from 2to3 on pywsgi.py I'm seeing:

-from urllib import unquote
+from urllib.parse import unquote

- def next(self):
+ def __next__(self):

- raise exc_info[0], exc_info[1], exc_info[2]
+ raise exc_info[0](exc_info[1]).with_traceback(exc_info[2])
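
For the next / __next__ rename, a hand-written dual-version alternative is to define the Python 3 name and alias the Python 2 name to it. This is only a sketch with a hypothetical Countdown class, not code from pywsgi.py, and how it interacts with the 2to3 next fixer is not checked here:

```python
class Countdown(object):
    """Iterator that counts down from start to 1 on Python 2 and 3."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        # Python 3 iterator protocol.
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

    # Python 2 looks for next() instead of __next__().
    next = __next__
```

Iterating over Countdown(3) yields 3, 2, 1 on both versions without any source translation.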

Damien Churchill

Jul 10, 2011, 4:48:28 AM
to gev...@googlegroups.com

Try out my branch, I haven't done much work lately but I got quite a way with support. I had a compat module that had a big bunch of global fixes.

Have a look at https://bitbucket.org/damoxc/gevent-py3/overview, I need to sync it up to the latest code in Denis' tree but I got quite a way with making it compatible. I'm thinking that maybe a fresh solution using 2to3 might result in cleaner code though. The major issue I had is byte strings, a lot of the network related code only accepts byte strings. All the import issues can be resolved by try/except ImportError...
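
The try/except ImportError approach mentioned above can be sketched for the unquote import that 2to3 flagged in pywsgi.py. This is an illustration, not the code from the compat module:

```python
try:
    # Python 3 location.
    from urllib.parse import unquote
except ImportError:
    # Python 2 location.
    from urllib import unquote
```

The rest of the module can then call unquote() without caring which version it runs on.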

If you're willing to help then give it a look. Some of my work has been merged upstream. I'll resync my branch when I get home later today.

AnilG

Jul 10, 2011, 8:22:02 AM
to gevent: coroutine-based Python network library
> Try out my branch, I haven't done much work lately but I got quite a way
> with support. I had a compat module that had a big bunch of global fixes.
> Have a look at https://bitbucket.org/damoxc/gevent-py3/overview, I need to

I've forked gevent-py3 to bitbucket and made a local clone, and had a
glance at compat.py.
I like how you're patching __builtins__ based on per feature
detection.
It looks clean and reliable.
I guess compat.py is run once at import?

> I'm thinking that maybe a fresh solution using 2to3
> might result in cleaner code though.

I guess 2to3 is an alternative to what you're doing. You either change
the actual code or provide support for the old code. My intuition
tells me changing the code (2to3) will be more powerful. I guess when
you start creating functions with exec you are in a way doing what
2to3 will do, changing the code, because you can't provide support for
everything.
Is there a case for using 2to3 and compat.py, or is it an either/or?
Note that 2to3 supports custom fixers.

> The major issue I had is byte strings,
> a lot of the network related code only accepts byte strings.

Yeah I don't understand this at all yet, and I noticed jjonte (last
year) didn't like it. I probably need to read up on unicode support in
python 2/3.

> If you're willing to help then give it a look. Some of my work has been
> merged upstream. I'll resync my branch when I get home later today.

It seems you've done more work and are more current than I am. I want
to be a help rather than a burden.
I'll get into internals if I have to, to solve problems, but my main
goal is to get gevent working in py3k.
Are you able to direct me at areas of work so I can be productive
sooner?

What are the options for merge back into denis/gevent?
I was thinking that many small steps with acceptance back into the
original repo at each point would be best.
But that probably pre-supposes an agreed good design solution proof of
concept.
How are you feeling about compat.py / 2to3? Looks like you've done
some good work, shame to throw away.
But maybe that work was what it took to build a vision of the ideal
solution?
Are you in the mood for a re-write or is the solution almost there?

AnilG

Jul 31, 2011, 1:29:54 AM
to gevent: coroutine-based Python network library
Hi Damien, you said:

> Try out my branch, I haven't done much work lately but I got quite a way
> with support. I had a compat module that had a big bunch of global fixes.
>
> I'm thinking that maybe a fresh solution using 2to3
> might result in cleaner code though.
>
> The major issue I had is byte strings,
> a lot of the network related code only accepts byte strings.

I think 2to3 looks like a better solution.

Most of the work has been done.
It doesn't need dummy functions and additional code.
The actual installed code is changed.
Distribute (setuptools) integrates 2to3 into the build.
Python 2 code stays untouched while Py3k code is developed.

http://python3porting.com/toc.html
http://packages.python.org/distribute/python3.html
http://docs.python.org/library/2to3.html
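
As a rough sketch of the Distribute integration described in the links above: Distribute accepted a use_2to3 keyword in setup(), plus use_2to3_fixers for packages of custom fixers. The helper name py3k_setup_options and the fixer package name in the comment are hypothetical. Note that later setuptools releases (58 and up) removed use_2to3, so this only applies to the tooling of the time:

```python
import sys


def py3k_setup_options(fixer_packages=()):
    """Build the extra setup() keywords that enable 2to3 on Python 3 only."""
    options = {}
    if sys.version_info[0] >= 3:
        # Ask Distribute to run 2to3 over the sources at build time.
        options["use_2to3"] = True
        if fixer_packages:
            # Packages containing custom fixers, e.g. one for sys.exc_clear().
            options["use_2to3_fixers"] = list(fixer_packages)
    return options

# In setup.py the result would be passed along with the usual arguments:
#     setup(name="gevent", **py3k_setup_options(["gevent_fixers"]))
```

On Python 2 the helper returns an empty dict, so the Python 2 build is left untouched.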

There is a compat layer already built if you want one:
http://pypi.python.org/pypi/six

I've changed setup.py to call 2to3 for Py3k.
I've also built a fixer to remove calls to sys.exc_clear().
That fixer is not actually there yet in lib2to3.

**I'll push this to Denis if you approve?**
**Is simply removing sys.exc_clear() calls sufficient?**

I've taken your fix for missing _fileobject in Py3k socket,
and have applied it to a newer socket.py.
I'm now testing, but py-sqlite3 is not available in Py3k,
and I can't get greentests to work,
so I'm doing it manually.

**I'm thinking maybe all strings should be bytes inside gevent?**
**Could alias bytes to str in Python 2 and change all instances?**

Damien Churchill

Aug 1, 2011, 4:21:04 PM
to gev...@googlegroups.com
On 31 July 2011 06:29, AnilG <anils...@gmail.com> wrote:
> Hi Damien, you said:
>
>> Try out my branch, I haven't done much work lately but I got quite a way
>> with support. I had a compat module that had a big bunch of global fixes.
>>
>> I'm thinking that maybe a fresh solution using 2to3
>> might result in cleaner code though.
>>
>> The major issue I had is byte strings,
>> a lot of the network related code only accepts byte strings.
>
> I think 2to3 looks like a better solution.
>

I agree with the 2to3 solution. I was mostly just playing around to
see what was involved, to get a better grip on it, and to find what
gotchas there were for the future.

>
> **I'm thinking maybe all strings should be bytes inside gevent?**
> **Could alias bytes to str in Python 2 and change all instances?**
>

That's already the case in Python 2.6+. The problem is that the bytes()
constructor in Python 3 expects an encoding argument when given a string,
which str() in Python 2 doesn't, which is a nuisance. Would we be able to
add a custom fixer to 2to3 that converts all normal strings to byte
strings, and any unicode strings to strings?
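
To make the nuisance concrete, here is a small sketch. The helper names are hypothetical and the ASCII default is an assumption. On Python 2.6+ bytes is just another name for str, so bytes('foo') works there, while Python 3 rejects it without an encoding:

```python
def bytes_call_needs_encoding():
    """Return True if bytes('foo') without an encoding is rejected."""
    try:
        bytes("foo")
    except TypeError:
        # Python 3: "string argument without an encoding".
        return True
    # Python 2.6+: bytes is str, so the call simply copies the string.
    return False


def to_bytes(text, encoding="ascii"):
    """Return text as a byte string on both Python 2 and Python 3."""
    if isinstance(text, bytes):
        # Already bytes (this includes every plain Python 2 str).
        return text
    return text.encode(encoding)
```

A b'foo' literal avoids the call altogether and is accepted by Python 2.6+ and Python 3 alike, which is why the discussion turns to literals rather than conversions.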

AnilG

Aug 3, 2011, 9:03:23 AM
to gevent: coroutine-based Python network library
> I agree with the 2to3 solution, I was mostly just playing around to
> see what was involved to get a better grips and what gotchas there
> were for the future.

That's great Damien, I'll push my mods to setup.py
and my fixer for sys.exc_clear to Denis.
Then we'll actually have Py3k builds, even though they won't work yet.

Can you please confirm just dropping calls to sys.exc_clear() is ok?
I don't understand why they were needed in Python 2,
nor why the call isn't available in Py3k.

> > **I'm thinking maybe all strings should be bytes inside gevent?**
> > **Could alias bytes to str in Python 2 and change all instances?**
>
> That's already the case in Python 2.6+. The problem is the bytes
> method in Python3 expects an encoding parameter which the str() method
> in Python2 doesn't which is nuisance. Would we be able to add a custom
> fixer in 2to3 that converts all normal strings to byte strings, and
> any unicode strings to strings?

I think I need to look at the full range of usage to understand this
better.
I thought that if we treated incoming as *already* bytes no encoding
was needed.
I'm referring to the bytes type, not a function.
When I say alias bytes to str what I mean is the reverse of the
default.
In normal 2to3 usage all Python 2 (bytes) str are treated as unicode
str in Py3k.
I want to treat Python 2 str as Py3k bytes type.
I think an encoding is only needed when swapping between bytes/str.
I'm pretty unsure of all this right now so I'll start looking.

I currently seem to have a working Py3k core and hub and, with the code
from your repo, a socket.py as well. I think I need to have a working
queue.py next.

I'm still floundering with the complexity of the gevent build.
I'd appreciate it if you can nominate a 'bottom up' order for building
the source files.
I'm testing manually so I need to convert python source files one at a
time,
and I can't add multiple dependencies at one time, if you see what I
mean.
I think I can add queue to core hub and socket without unsatisfied
dependencies,
can you give me the next few source files to address?
Like I said, I don't understand the build. I'm not up to speed on this
advanced package management.

Damien Churchill

Aug 4, 2011, 5:48:58 AM
to gev...@googlegroups.com
On 3 August 2011 14:03, AnilG <anils...@gmail.com> wrote:
>> I agree with the 2to3 solution, I was mostly just playing around to
>> see what was involved to get a better grips and what gotchas there
>> were for the future.
>
> That's great Damien, I'll push my mods to setup.py
> and my fixer for sys.exc_clear to Denis.
> Then we'll actually have Py3k builds, even though they won't work yet.
>
> Can you please confirm just dropping calls to sys.exc_clear() is ok?
> I don't understand why they were needed in Python 2,
> nor why the call isn't available in Py3k.
>

Yes it's fine, in Python 3 leaving the except: block clears the exception, which does the same thing as sys.exc_clear().
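
A small sketch of that behaviour, plus a hypothetical exc_clear shim for code that is not run through a fixer. The function and shim names are illustrative and not from gevent:

```python
import sys


def exc_state_inside_and_after():
    """Report the active exception type inside and after an except block."""
    try:
        raise ValueError("boom")
    except ValueError:
        inside = sys.exc_info()[0]
    # On Python 3 the exception state is reset once the block is left.
    after = sys.exc_info()[0]
    return inside, after


# Hypothetical dual-version shim: the real call on Python 2, a no-op on Python 3.
exc_clear = getattr(sys, "exc_clear", lambda: None)
```

Called from ordinary code on Python 3, the function returns (ValueError, None). On Python 2 the second value would still be ValueError, which is the situation sys.exc_clear() was used for.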

>> > **I'm thinking maybe all strings should be bytes inside gevent?**
>> > **Could alias bytes to str in Python 2 and change all instances?**
>>
>> That's already the case in Python 2.6+. The problem is the bytes
>> method in Python3 expects an encoding parameter which the str() method
>> in Python2 doesn't which is nuisance. Would we be able to add a custom
>> fixer in 2to3 that converts all normal strings to byte strings, and
>> any unicode strings to strings?
>
> I think I need to look at the full range of usage to understand this
> better.
> I thought that if we treated incoming as *already* bytes no encoding
> was needed.
> I'm referring to the bytes type, not a function.
> When I say alias bytes to str what I mean is the reverse of the
> default.
> In normal 2to3 usage all Python 2 (bytes) str are treated as unicode
> str in Py3k.
> I want to treat Python 2 str as Py3k bytes type.
> I think an encoding is only needed when swapping between bytes/str.
> I'm pretty unsure of all this right now so I'll start looking.
>

Yes but when strings are defined in Python3 they are str by default,
not bytes. So we'll need all strings in the gevent source converted to
b'foo' instead of 'foo', or 'foo' instead of u'foo'

>
> I've currently seem to have a working Py3k core, hub and with the code
> from
> your repo, a socket.py as well. I think I need to have a working
> queue.py next.
>

queue.py might be a tricky one if I remember correctly; certainly one
module was, as you can't do comparisons between objects of different
types in Python 3, e.g. 3 < '3' doesn't work.
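
The comparison problem can be sketched as follows. Whether gevent's queue.py is actually affected is not established in this thread. The StablePriorityQueue class is hypothetical and uses the common heapq pattern of a counter as tie-breaker, so that the queued items themselves are never compared:

```python
import heapq
import itertools


def mixed_comparison_fails():
    """Return True if ordering an int against a str raises TypeError."""
    try:
        3 < "3"
    except TypeError:
        return True
    return False


class StablePriorityQueue(object):
    """Priority queue whose items never get compared with each other."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def put(self, priority, item):
        # The unique counter settles ties before the item is looked at.
        heapq.heappush(self._heap, (priority, next(self._counter), item))

    def get(self):
        return heapq.heappop(self._heap)[2]
```

Priorities still have to be mutually comparable; only the payload is free to mix types.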

> I'm still floundering with the complexity of the gevent build.
> I'd appreciate it if you can nominate a 'bottom up' order for building
> the source files.
> I'm testing manually so I need to convert python source files one at a
> time,
> and I can't add multiple dependencies at one time, if you see what I
> mean.
> I think I can add queue to core hub and socket without unsatisfied
> dependencies,
> can you give me a the next few source files to address?
> Like I said, I don't understand the build. I'm not up to speed on this
> advanced package management.

I think pywsgi can be done without too much difficulty.
event.py looks like a candidate as well.
after that baseserver.py, followed by server.py

AnilG

Aug 7, 2011, 9:17:21 AM
to gevent: coroutine-based Python network library
> > Can you please confirm just dropping calls to sys.exc_clear() is ok?
> > I don't understand why they were needed in Python 2,
> > nor why the call isn't available in Py3k.
> Yes it's fine, in Python 3 the except: does the same thing as sys.exc_clear().
Thanks, that's a relief.

> > I thought that if we treated incoming as *already* bytes no encoding
> > was needed. I'm referring to the bytes type, not a function.
> > When I say alias bytes to str what I mean is the reverse of the
> > default.
> > In normal 2to3 usage all Python 2 (bytes) str are treated as unicode
> > str in Py3k.
> > I want to treat Python 2 str as Py3k bytes type.
> > I think an encoding is only needed when swapping between bytes/str.
> Yes but when strings are defined in Python3 they are str by default,
> not bytes. So we'll need all strings in the gevent source converted to
> b'foo' instead of 'foo', or 'foo' instead of u'foo'
I guess I was hoping there weren't too many instances of this.
If a custom fixer is possible it's not going to be easy. There's
little documentation.
But if it's what's gotta happen then it's gotta happen, right?

> queue.py might be a tricky one if I remember correctly, certainly one
> module was as you can't do comparisons between objects of different
> types in Python 3  e.g. 3 < '3' doesn't work.
> > can you give me a the next few source files to address?
> I think pywsgi can be done without too much difficultly.
> event.py looks like a candidate as well.
> after that baseserver.py, followed by server.py

Thanks. I may have spoken too soon about queue.py.

AnilG

Aug 9, 2011, 10:31:03 AM
to gevent: coroutine-based Python network library
> >> I agree with the 2to3 solution,
> > That's great Damien, I'll push my mods to setup.py
> > and my fixer for sys.exc_clear to Denis.
> > Then we'll actually have Py3k builds, even though they won't work yet.

Hi Damien, I forked my anil_g/gevent from denis/gevent again.

I forked the current tip, 42292da5c42a ("server: rename kill() method to
close()"), to anil_g/gevent
then added 4 specific Py3k updates that should simultaneously support
Py3k and Python 2.
I included your socket.py changes in commit ed94139522bd.

I have a successful 3.1.1 build for this repo now,
but I am unable to test 2.x builds due to an issue with cython 0.15
reporting as 0.14.1.

In theory these commits should be good.
A clone of anil_g/gevent should be able to prove it builds in Python 2
anywhere the current denis/gevent tip builds.

Commits are:
1. Add 2to3 call in setup.py to enable Py3k builds.
2. Provide custom fixer in util. Needs to be manually installed to
work.
3. Provided your socket.py changes to simultaneously support Py3k.
4. Updated util/cython_ifdef.py usage to simultaneously support Py3k /
Python 2.

Damien Churchill

Aug 9, 2011, 11:10:29 AM
to gev...@googlegroups.com
On 9 August 2011 15:31, AnilG <anils...@gmail.com> wrote:
>> >> I agree with the 2to3 solution,
>> > That's great Damien, I'll push my mods to setup.py
>> > and my fixer for sys.exc_clear to Denis.
>> > Then we'll actually have Py3k builds, even though they won't work yet.
>
> Hi Damien, I forked my anil_g/gevent from denis/gevent again.
>
> I forked current tip 42292da5c42a server: rename kill() method to
> close() to anil_g/gevent
> then added 4 specific Py3k updates that should simultaneously support
> Py3k and Python 2.
> I included your socket.py changes in commit ed94139522bd.
>
> I have a successful 3.1.1 build for this repo now,
> but I am unable to test 2.x builds due to cython 0.15 reporting as
> 0.14.1 issue.
>

Excellent, I'll try and get around to testing this out on Arch using
3.2.x later on, cython seems to be okay for 2.x on Arch too, cython
--version says 0.15.

> In theory these commits should be good.
> A clone of anil_g/gevent should be able to prove it builds in Python 2
> anywhere the current denis/gevent tip builds.
>
> Commits are:
> 1. Add 2to3 call in setup.py to enable Py3k builds.
> 2. Provide custom fixer in util. Needs to be manually installed to
> work.

I'll have a look at adding a fix_strings custom fixer, we'll have to
disable the unicode fixer and then the replacement fixer will do both
jobs

> 3. Provided your socket.py changes to simultaneously support Py3k.
> 4. Updated util/cython_ifdef.py usage to simultaneously support Py3k /
> Python 2.
>

Could also be achieved by using a setuptools build task and running it
through 2to3 manually, might be a bit cleaner?

AnilG

Aug 10, 2011, 7:38:10 AM
to gevent: coroutine-based Python network library
> > I have a successful 3.1.1 build for this repo now,
> > but I am unable to test 2.x builds due to cython 0.15 reporting as
> > 0.14.1 issue.
> Excellent, I'll try and get around to testing this out on Arch using
> 3.2.x later on, cython seems to be okay for 2.x on Arch too, cython
> --version says 0.15.

You may have problems on Python 3.2.
I moved to 3.2 but ended up going back to Python 3.1.4.
I think greenlet or something doesn't work on 3.2 (does on 3.1).
But I'm unsure why there is a greenlet.py.
I thought gevent based on external greenlet module.

> I'll have a look at adding a fix_strings custom fixer, we'll have to
> disable the unicode fixer and then the replacement fixer will do both

Sounds like that could do an awesome amount of work if it works.
I've not really looked at the string usage in gevent yet.
I thought gevent is like a 'carrier' and never needs to actually
refer to the string content.
E.g. It should never need to know the encoding.

> > 3. Provided your socket.py changes to simultaneously support Py3k.
> > 4. Updated util/cython_ifdef.py usage to simultaneously support Py3k /
> > Python 2.
> Could also be achieved by using a setuptools build task and running it
> through 2to3 manually, might be a bit cleaner?

cython_ifdef.py needs to be dual version because it's part of the
build.
Unless do extra work in setup to run it through 2to3 before calling,
but I think it's dual compatible now anyway.
I just found it easier that way.
Since it's part of the build process it'll never be in production
code.

socket.py is put through 2to3 during build, as will every .py.
Some things are just not handled by 2to3 fixers yet though.
I haven't got a handle on custom fixers for advanced usage (yet?).
You already had changes to socket.py and I wanted to test quickly.
Maybe in some cases it may just be easier, quicker and maintainable
to do *some* dual version coding.

We can improve the build as we go along too.
We can dual version code if no fixers initially and then add fixers
later.

Actually need to be careful with this though.
Need to make sure applying 2to3 on dual version coded file doesn't
break.
For instance, if there is a fixer for any particular syntax change,
then probably need to use fixer for all instances of that particular
change.
If we add a new fixer may need to update existing dual coded files to
use it.

Damien Churchill

Aug 11, 2011, 5:45:09 AM
to gev...@googlegroups.com
On 10 August 2011 12:38, AnilG <anils...@gmail.com> wrote:
>> > I have a successful 3.1.1 build for this repo now,
>> > but I am unable to test 2.x builds due to cython 0.15 reporting as
>> > 0.14.1 issue.
>> Excellent, I'll try and get around to testing this out on Arch using
>> 3.2.x later on, cython seems to be okay for 2.x on Arch too, cython
>> --version says 0.15.
>
> You may have problems on Python 3.2.
> I moved to 3.2 but ended up going back to Python 3.1.4.
> I think greenlet or something doesn't work on 3.2 (does on 3.1).
> But I'm unsure why there is a greenlet.py.
> I thought gevent based on external greenlet module.
>

It is, it's not called greenlet though if memory serves,
py.magic.greenlet or something along those lines. The gevent greenlet
module just contains wrapper classes around the greenlet class it
imports from the external module. And you are correct, the last
release of the external module doesn't work with Python 3.2, however
hg tip does :-)

>> I'll have a look at adding a fix_strings custom fixer, we'll have to
>> disable the unicode fixer and then the replacement fixer will do both
>
> Sounds like that could do an awesome amount of work if it works.
> I've not really looked at the string usage in gevent yet.
> I thought gevent is like a 'carrier' and never needs to actually
> refer to the string content.
> E.g. It should never need to know the encoding.
>

It doesn't, and it shouldn't, which is why mostly all strings want to
be converted to byte strings, b'foo' instead of 'foo'.

>> > 3. Provided your socket.py changes to simultaneously support Py3k.
>> > 4. Updated util/cython_ifdef.py usage to simultaneously support Py3k /
>> > Python 2.
>> Could also be achieved by using a setuptools build task and running it
>> through 2to3 manually, might be a bit cleaner?
>
> cython_ifdef.py needs to be dual version because it's part of the
> build.
> Unless do extra work in setup to run it through 2to3 before calling,
> but I think it's dual compatible now anyway.
> I just found it easier that way.
> Since it's part of the build process it'll never be in production
> code.
>
> socket.py is put through 2to3 during build, as will every .py.
> Some things are just not handled by 2to3 fixers yet though.
> I haven't got a handle on custom fixers for advanced usage (yet?).
> You already had changes to socket.py and I wanted to test quickly.
> Maybe in some cases it may just be easier, quicker and maintainable
> to do *some* dual version coding.
>

Oh totally, there's a lot of things that should support dual version,
in my eyes fixers really should only be used when there is a backwards
incompatible change that needs to be made (such as b'foo').

AnilG

Aug 14, 2011, 7:26:32 AM
to gevent: coroutine-based Python network library
> >> > I have a successful 3.1.1 build for this repo now,
> >> > but I am unable to test 2.x builds due to cython 0.15 reporting as
> >> > 0.14.1 issue.
> >> Excellent, I'll try and get around to testing this out on Arch using
> >> 3.2.x later on, cython seems to be okay for 2.x on Arch too, cython
> >> --version says 0.15.

Sorry. A build problem on my machine. I've got consistent cython 0.15
now.

> > You may have problems on Python 3.2.
> > I moved to 3.2 but ended up going back to Python 3.1.4.
> > I think greenlet or something doesn't work on 3.2 (does on 3.1).
> > But I'm unsure why there is a greenlet.py.
> > I thought gevent based on external greenlet module.
>
> It is, it's not called greenlet though if memory serves,
> py.magic.greenlet or something along those lines. The gevent greenlet
> module just contains wrapper classes around the greenlet class it
> imports from the external module. And you are correct, the last
> release of the external module doesn't work with Python 3.2, however
> hg tip does :-)

Ok! Thanks.

> > Maybe in some cases it may just be easier, quicker and maintainable
> > to do *some* dual version coding.
> Oh totally, there's a lot of things that should support dual version,
> in my eyes fixers really should only be used when there is a backwards
> incompatible change that needs to be made (such as b'foo').

After resolving my cython version problem and getting clean gevent
build on Python 2.7 and Python 3.1, I'm still happy my commits / pull
request is a good next step towards gevent Py3k.

AnilG

Aug 14, 2011, 7:36:37 AM
to gevent: coroutine-based Python network library
> >> I'll have a look at adding a fix_strings custom fixer, we'll have to
> >> disable the unicode fixer and then the replacement fixer will do both

> It doesn't, and it shouldn't, which is why mostly all strings want to
> be converted to byte strings, b'foo' instead of 'foo'.

I took a look at string usage in socket.py and it changed my
impression.

Just looking at socket.py, most if not all strings need to be Py3k
unicode strings.
Actual data for carrying over the socket is never inspected and comes
in as bytes anyway.

It seems to me the following kinds of strings need to be normal Py3k
unicode strings:

* doc strings
* error messages and exception values e.g. raise error(EBADF, 'Bad
file descriptor')
* method names e.g. __implements__ and __imports__
* comparisons to unicode from other modules e.g. sys.platform ==
'win32'
* attribute names e.g. getattr(switch, '__self__', None)
* __repr__ and __str__ values

In socket.py I can't see any strings that might need to be byte
strings except perhaps in bind_and_listen and getfqdn,
but even here the standard socket module uses Py3k unicode strings for
hostnames.
b'' is tolerated (presumably tested for False) but b'hostname' is not.

  File "/usr/local/lib/python3.1/socket.py", line 292, in getfqdn
    hostname, aliases, ipaddrs = gethostbyaddr(name)
TypeError: must be string, not bytes

In gevent socket.py sending bytes works,
and sending a unicode string throws TypeError as it should:

Traceback (most recent call last):
  File "tmc.py", line 17, in <module>
    s.send(data) # send test string
  File "/usr/local/lib/python3.1/site-packages/gevent-1.0dev-py3.1-freebsd-8.2-RELEASE-i386.egg/gevent/socket.py", line 542, in send
    return sock.send(data, flags)
TypeError: must be bytes or buffer, not str
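
The same split can be reproduced with a short sketch. It uses the standard library socket module rather than gevent's, and the send_text helper and its UTF-8 default are hypothetical. The point is only that text has to be encoded by the caller before it reaches send():

```python
import socket


def send_text(sock, text, encoding="utf-8"):
    """Encode a unicode string and send the resulting bytes."""
    return sock.send(text.encode(encoding))


def demo_send():
    """Show that str is rejected and bytes are accepted by send()."""
    left, right = socket.socketpair()
    try:
        try:
            left.send("hello")  # unicode str on Python 3
            rejected = False
        except TypeError:
            rejected = True
        sent = send_text(left, "hello")
        received = right.recv(16)
        return rejected, sent, received
    finally:
        left.close()
        right.close()
```

This matches the idea of gevent as a carrier: the socket layer never needs to know the encoding, because the choice is made before the data is handed over.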

But ares doesn't seem to be typing itself for Py3k yet.

>>> from gevent import socket
>>> socket.getfqdn()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.1/site-packages/gevent-1.0dev-py3.1-freebsd-8.2-RELEASE-i386.egg/gevent/socket.py", line 775, in getfqdn
    hostname, aliases, ipaddrs = gethostbyaddr(name)
  File "/usr/local/lib/python3.1/site-packages/gevent-1.0dev-py3.1-freebsd-8.2-RELEASE-i386.egg/gevent/socket.py", line 755, in gethostbyaddr
    return get_hub().resolver.gethostbyaddr(ip_address)
  File "/usr/local/lib/python3.1/site-packages/gevent-1.0dev-py3.1-freebsd-8.2-RELEASE-i386.egg/gevent/resolver_ares.py", line 172, in gethostbyaddr
    return self._gethostbyaddr(ip_address)
  File "/usr/local/lib/python3.1/site-packages/gevent-1.0dev-py3.1-freebsd-8.2-RELEASE-i386.egg/gevent/resolver_ares.py", line 153, in _gethostbyaddr
    self.ares.gethostbyaddr(waiter, ip_address)
  File "ares.pyx", line 386, in gevent.ares.channel.gethostbyaddr (gevent/gevent.ares.c:5271)
TypeError: expected bytes, str found

My conclusions are:

1. We're probably never going to be able to use a b'' string fixer.
2. We may never need a b'' string fixer, maybe it's not that bad.
3. String handling in each python file is going to be different.
4. I feel I need to look at ares.pyx next for string / bytes handling.

What do you think?

Damien Churchill

Aug 14, 2011, 4:37:36 PM
to gev...@googlegroups.com

I could be selective, I think pywsgi needed a lot of byte strings if I
recall correctly.

> 2. We may never need a b'' string fixer, maybe it's not that bad.

I've written one already in case we do ;-)

> 3. String handling in each python file is going to be different.

Agreed.

> 4. I feel I need to look at ares.pyx next for string / bytes handling.
> What do you think?

I've also figured out how to integrate our own fixers into the build
process [1]. Getting the tests running properly will be another
important step I think.

[1] http://dpaste.com/594285/

Damien Churchill

Aug 15, 2011, 2:57:16 PM
to gev...@googlegroups.com

I've just committed and pushed a bunch of changes to my fork [1] of
your repo if you'd like to take a look. Just a few compatibility fixes
and integrating the custom fixer into the setup process. echoserver.py
works when run through 2to3 now with those fixes.

[1] https://bitbucket.org/damoxc/gevent

AnilG

Aug 19, 2011, 10:29:02 AM
to gevent: coroutine-based Python network library
> I've also figured out how integrate our own fixers into the build
> process [1]. Getting the tests running properly will be another
> important step I think.
> [1] http://dpaste.com/594285/

That's great, Damien.
You seem to have done it with much less code than I thought it would
be.

AnilG

Aug 19, 2011, 10:39:39 AM
to gevent: coroutine-based Python network library
> I've just committed and pushed a bunch of changes to my fork [1] of
> your repo if you'd like to take a look. Just a few compatibility fixes
> and integrating the custom fixer into the setup process. echoserver.py
> works when run through 2to3 now with those fixes.
> [1] https://bitbucket.org/damoxc/gevent

This is great, Damien!
I pulled your changes and ran my meagre tests ok.
I then pulled Denis' latest changes as well, merged,
and pushed back (with 1 minor change) to anil_g/gevent.
(I hope that's correct hg practice).

If you pull my changes both repos will have the same
Py3k changesets and will be up to date
with working denis/gevent Python 2 tip.

We are now progressing on a dual build of gevent!

Damien Churchill

Aug 19, 2011, 11:16:36 AM
to gev...@googlegroups.com

A small step in the right direction ;-) I was contemplating starting
to write some tests that could be run via nose, instead of the custom
way gevent does it at the moment. But py3 support takes priority I
think.

AnilG

Aug 21, 2011, 4:12:17 AM
to gevent: coroutine-based Python network library
> I was contemplating starting to write some tests that could be
> run via nose, instead of the custom way gevent does it at the
> moment. But py3 support takes priority I think.

Yes, working tests would be good. I couldn't get pysqlite3 working in
Py3k, though. I've been working on ares resolver.

This is a long one (tell me if too long). I'd like to check this with
you. I'm in unfamiliar territory.

My main question is: do we need to move encoding / decoding inside
ares.pyx? This would require a change in the function prototypes. If
so I can throw away everything below.

My second question is: What is the encoding?! We need a guaranteed
encoding. Is it 'utf-8' or something unusual like 'idna'? Maybe we can
set it to anything we want?

Currently ares works in bytes internally, so passing it a unicode str
from Py3k breaks:

$ python3.1
>>> from gevent import socket
>>> socket.gethostbyaddr('127.0.0.1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.1/site-packages/gevent-1.0dev-py3.1-freebsd-8.2-RELEASE-i386.egg/gevent/socket.py", line 755, in gethostbyaddr
    return get_hub().resolver.gethostbyaddr(ip_address)
  File "/usr/local/lib/python3.1/site-packages/gevent-1.0dev-py3.1-freebsd-8.2-RELEASE-i386.egg/gevent/resolver_ares.py", line 175, in gethostbyaddr
    return self._gethostbyaddr(ip_address)
  File "/usr/local/lib/python3.1/site-packages/gevent-1.0dev-py3.1-freebsd-8.2-RELEASE-i386.egg/gevent/resolver_ares.py", line 156, in _gethostbyaddr
    self.ares.gethostbyaddr(waiter, ip_address)
  File "ares.pyx", line 386, in gevent.ares.channel.gethostbyaddr (gevent/gevent.ares.c:5271)
TypeError: expected bytes, str found
>>>

Refer: http://docs.cython.org/src/tutorial/strings.html

This is the function signature at ares.pyx line 386:
def gethostbyaddr(self, object callback, char* addr):

I changed it to the following and encoded the incoming str inside
gethostbyaddr, but I got some strange behaviour.
def gethostbyaddr(self, object callback, addr):

Otherwise, the Py3k unicode strings passed to gethostbyaddr and
gethostbyname need to be encoded to bytes, to match the char*
parameter type, at every entry point into ares.pyx (for Py3k only).

py_byte_string = py_unicode_string.encode('utf-8')
cdef char* c_string = py_byte_string

You need to keep a reference to the Python byte string so that the
pointer passed in to Cython stays valid. I used:

if not hasattr(hostname, 'decode'):  # Only True for Py3k unicode strings.
    shostname = hostname  # Attempt to persist the Py3k unicode string.
    hostname = shostname.encode('idna')  # And take a pointer to the byte string instead.
ares.gethostbyname(waiter, hostname, family)
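
The same idea can be pulled into one small helper so that every call site encodes the same way. This is only a minimal sketch: the name to_bytes is hypothetical (it is not part of gevent), and 'idna' is just one of the candidate encodings raised above, not a confirmed choice.

```python
import sys

# On Python 3 text is `str`; on Python 2 it is `unicode`.
# The conditional expression keeps the Python 2 name from being
# evaluated on Python 3, where it does not exist.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def to_bytes(hostname, encoding='idna'):
    """Return hostname as a byte string (hypothetical helper, not gevent API).

    Text is encoded; anything that is already bytes is passed through.
    """
    if isinstance(hostname, text_type):
        return hostname.encode(encoding)
    return hostname


# The caller keeps the encoded object in a local variable, so the
# buffer stays referenced at least until the call returns.
encoded = to_bytes('127.0.0.1')
```

Holding the result in a local only keeps the buffer alive for the duration of the call. Whether that is enough for an asynchronous lookup depends on whether the C side copies the name, which is not established in this thread.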

I tried doing this inside ares.pyx first but it didn't work out. I
tried to fix it in resolver_ares.py instead, assuming all calls come
through resolver_ares.py, and assuming all such calls in
resolver_ares.py are prefixed ares.gethostby????.

I wonder if trouble may happen because the call is made but not yet
completed when the waiter is created. Could the Python string get
garbage collected while the pointer is still retained for the waiter,
destroying the memory that C is relying on?

I got some partial success using the resolver_ares.py fixes (note
again the returns are bytes):

$ python3.1
>>> import gevent
>>> hub = gevent.resolver_ares.get_hub()
>>> resolver = hub.resolver
>>> resolver.gethostbyaddr('127.0.0.1')
(b'localhost', [b'localhost.local'], [b'127.0.0.1'])
>>> resolver.gethostbyname('engine')
b'10.0.1.100'
>>>

$ python3.1
>>> from gevent import socket
>>> socket.gethostbyaddr('127.0.0.1')
(b'localhost', [b'localhost.local'], [b'127.0.0.1'])
>>>

But I still got:

$ python3.1
>>> socket.getfqdn('127.0.0.1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.1/site-packages/gevent-1.0dev-py3.1-freebsd-8.2-RELEASE-i386.egg/gevent/socket.py", line 781, in getfqdn
    if '.' in name:
TypeError: Type str doesn't support the buffer API

Note that the return values are also byte strings, while string
literals in the code (such as the '.' above) are interpreted as
unicode under Py3k.

$ python3.1
>>> s='127.0.0.1'
>>> '.' in s
True
>>> b=s.encode('idna')
>>> b
b'127.0.0.1'
>>> '.' in b
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Type str doesn't support the buffer API
>>> b'.' in b
True

But then I also got this:

$ python3.1
>>> from gevent import socket
>>> socket.getfqdn('engine')
Traceback (most recent call last):
  File "/usr/local/lib/python3.1/site-packages/gevent-1.0dev-py3.1-freebsd-8.2-RELEASE-i386.egg/gevent/resolver_ares.py", line 167, in _gethostbyaddr
    return waiter.get()
  File "/usr/local/lib/python3.1/site-packages/gevent-1.0dev-py3.1-freebsd-8.2-RELEASE-i386.egg/gevent/hub.py", line 472, in get
    getcurrent().throw(*self._exception)
ValueError: illegal IP address string: b'engine'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.1/site-packages/gevent-1.0dev-py3.1-freebsd-8.2-RELEASE-i386.egg/gevent/socket.py", line 775, in getfqdn
    hostname, aliases, ipaddrs = gethostbyaddr(name)
  File "/usr/local/lib/python3.1/site-packages/gevent-1.0dev-py3.1-freebsd-8.2-RELEASE-i386.egg/gevent/socket.py", line 755, in gethostbyaddr
    return get_hub().resolver.gethostbyaddr(ip_address)
  File "/usr/local/lib/python3.1/site-packages/gevent-1.0dev-py3.1-freebsd-8.2-RELEASE-i386.egg/gevent/resolver_ares.py", line 189, in gethostbyaddr
    return self._gethostbyaddr(ip_address)
  File "/usr/local/lib/python3.1/site-packages/gevent-1.0dev-py3.1-freebsd-8.2-RELEASE-i386.egg/gevent/resolver_ares.py", line 175, in _gethostbyaddr
    if not str(ex).startswith(b'illegal IP'):
TypeError: 'tuple' object is not callable

After some debugging I'm pretty sure 'str' is a tuple, instead of a
callable. Somebody's trashed the 'str' identifier?! I couldn't locate
__builtins__ at that point in the execution either, so I couldn't call
__builtins__.str(ex) instead.
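
For reference, here is a small standalone sketch of how a rebound str name produces exactly this TypeError, and how the original type can still be reached by importing the builtins module explicitly. The function describe and the tuple assigned to str are made up for illustration; this does not show where the name actually gets rebound in resolver_ares.py.

```python
try:
    import builtins  # Python 3
except ImportError:
    import __builtin__ as builtins  # Python 2


def describe(ex):
    # Simulate an accidental rebinding of the name `str` in this scope.
    str = ('shadowed',)
    try:
        return str(ex)  # fails: the local name is a tuple, not the type
    except TypeError as err:
        # The real type is still reachable through the builtins module.
        return builtins.str(ex), builtins.str(err)


message, error = describe(ValueError('illegal IP address string'))
```

Importing the module by name does not rely on __builtins__ being visible in the current scope.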

I gave up at this point. I'd like to know if we should be doing this
inside ares.pyx before I do more of it. As I said, it's unfamiliar
territory, so I'd appreciate any feedback you can give.