The future of greenlet


Damjan

Aug 21, 2011, 2:44:39 PM
to gevent: coroutine-based Python network library
What do you guys use for greenlet today?

I've recently encountered a crash when using gevent-zeromq, which was
solved when I installed greenlet from the https://bitbucket.org/ambroff/greenlet
repository. But I also see there's another fork at https://bitbucket.org/snaury/greenlet

So my question is: which one do you mostly use, especially in
production, and what does the future hold for greenlet?

--
damjan

Scott Reynolds

Aug 22, 2011, 12:17:16 PM
to gev...@googlegroups.com
love to hear what bug you encountered with gevent-zeromq

Ralf Schmitt

Aug 22, 2011, 5:06:22 PM
to gev...@googlegroups.com
Damjan <gda...@gmail.com> writes:

I'm using plain 0.3.1 with some patches from snaury. On a 64 bit windows
I use the packages from Christoph Gohlke: http://www.lfd.uci.edu/~gohlke/pythonlibs/
It's a bit sad that the current greenlet maintainer doesn't release an
upgraded package.

--
Cheers
Ralf

Alexey Borzenkov

Aug 22, 2011, 5:23:26 PM
to gev...@googlegroups.com
Well, of course I'm using my own fork, that's why I'm working on it after all. :)

I'm using greenlet and gevent at Kaspersky Lab to run thousands of simulated network-bound micro-processes and other network-related activities (e.g. concurrent dns resolution of thousands of random possibly-malware domains, to catch and block them close to their registration time). Given that I run so many greenlets (each micro-process manages its own group of greenlets, so it can gracefully stop when it's told to), I need them to be rock solid, and I really don't like it when something crashes or hangs.

Recently I made a number of commits that should greatly improve greenlet stability (either theoretical or depending on particular compiler switches) on amd64 and i386 (and even SEH on windows that should help with Visual C++ exceptions), so I welcome you all to try it and find if there's anything wrong.
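
For readers unfamiliar with the pattern, here is a minimal sketch of bounded, cooperative DNS resolution with gevent (the hostnames, pool size, and function names are placeholders of mine, not anything from the setup described above):

from gevent import socket   # cooperative DNS lookups
from gevent.pool import Pool

def resolve(name):
    # Resolve one hostname, returning (name, address) or (name, None) on failure.
    try:
        return name, socket.gethostbyname(name)
    except socket.gaierror:
        return name, None

# Cap the number of in-flight lookups; 500 is an arbitrary example value.
pool = Pool(500)
hostnames = ['example.com', 'example.org', 'example.net']   # placeholder list
for name, addr in pool.imap_unordered(resolve, hostnames):
    print('%s -> %s' % (name, addr))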

Alexey Borzenkov

Aug 22, 2011, 5:39:51 PM
to gev...@googlegroups.com
On Tuesday, August 23, 2011 1:06:22 AM UTC+4, Ralf Schmitt wrote:

> I'm using plain 0.3.1 with some patches from snaury. On a 64 bit windows

Do you use 32-bit python or do you use masm code from Stackless on 64-bit python? Also, you might want to see my recent patches too, just for one this:

https://bitbucket.org/snaury/greenlet/changeset/faaf9508c2f3

Fixes a very interesting side effect that happened on i386 (since I'm mostly using ubuntu amd64 I didn't seem to stumble at it for a while).

Also, what do you people think of 64-bit Python support on Windows? Is it needed/wanted? Would it be ok if it required installing nasm? Or would it be ok if it included a pre-compiled .obj file? I know that Stackless has some 64-bit support, but it requires masm and it's a show stopper for me, since Visual Studio Express doesn't include it.
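
For what it's worth, shipping a pre-compiled .obj need not complicate the build much. A rough distutils sketch (the file names and layout here are my own invention, not greenlet's actual setup.py):

# Sketch: link a pre-assembled switching routine on 64-bit Windows.
import sys
from distutils.core import setup, Extension

extra_objects = []
if sys.platform == 'win32' and sys.maxsize > 2 ** 32:
    # Hypothetical pre-built object file shipped alongside the sources.
    extra_objects.append('platform/switch_amd64_masm.obj')

setup(
    name='greenlet',
    ext_modules=[Extension('greenlet', ['greenlet.c'],
                           extra_objects=extra_objects)],
)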

Ralf Schmitt

Aug 22, 2011, 6:40:07 PM
to gev...@googlegroups.com
Alexey Borzenkov <sna...@gmail.com> writes:

> On Tuesday, August 23, 2011 1:06:22 AM UTC+4, Ralf Schmitt wrote:
>>
>> I'm using plain 0.3.1 with some patches from snaury. On a 64 bit windows
>>
> Do you use 32-bit python or do you use masm code from Stackless on 64-bit
> python? Also, you might want to see my recent patches too, just for one
> this:

I'm using 64-bit Python. I think Christoph Gohlke ported the Stackless
code.

>
> https://bitbucket.org/snaury/greenlet/changeset/faaf9508c2f3
>
> Fixes a very interesting side effect that happened on i386 (since I'm mostly
> using ubuntu amd64 I didn't seem to stumble at it for a while).
>
> Also, what do you people think of 64-bit Python support on Windows? Is it
> needed/wanted? Would it be ok if it required installing nasm? Or would it be
> ok if it included a pre-compiled .obj file? I know that Stackless has some
> 64-bit support, but it requires masm and it's a show stopper for me, since
> Visual Studio Express doesn't include it.
>

I think most people would expect some binary installer for Windows :)
I think shipping with a .obj file is fine, as long as you ship the
source and instructions on how to build it.

--
Cheers
Ralf

Damjan

Aug 23, 2011, 10:14:01 AM
to gevent: coroutine-based Python network library
> love to hear what bug you encountered with gevent-zeromq

I'm on 32-bit Linux using Python 2.7.1. The problem was an immediate
crash of the Python interpreter when trying a simple PUB/SUB demo.

The crash went away when I installed greenlet from the hg repo, so I
don't think the problem is with gevent-zeromq.


--
damjan

Denis Bilenko

Aug 31, 2011, 2:18:29 AM
to gev...@googlegroups.com
On Tue, Aug 23, 2011 at 4:06 AM, Ralf Schmitt <ra...@systemexit.de> wrote:

> I'm using plain 0.3.1 with some patches from snaury. On a 64 bit windows

> I use the packages from Christoph Gohlke: http://www.lfd.uci.edu/~gohlke/pythonlibs/
> It's a bit sad that the current greenlet maintainer doesn't release an
> upgraded package.

I've contacted Kyle and he says he currently does not have much time
to work on greenlet.

Does anyone want to take over?

The job is to take the best current version, test it, prepare the release,
and upload it to PyPI.
One can use this mailing list as the maintainer's email.


On Tue, Aug 23, 2011 at 4:23 AM, Alexey Borzenkov <sna...@gmail.com> wrote:

> Well, of course I'm using my own fork, that's why I'm working on it after
> all. :)
> I'm using greenlet and gevent at Kaspersky Lab to run thousands of simulated
> network-bound micro-processes and other network-related activities (e.g.
> concurrent dns resolution of thousands of random possibly-malware domains,
> to catch and block them close to their registration time).

That sounds cool. Maybe you can write a success story for
http://gevent.org/success.html ?

We need more cool stories of well-known companies using gevent to
convince the skeptical users.

I still meet a lot of people who are afraid either of greenlet stack
switching or monkey patching.

Positive examples could go a long way towards reducing these fears.

> Recently I made a number of commits that should greatly improve greenlet
> stability (either theoretical or depending on particular compiler switches)
> on amd64 and i386 (and even SEH on windows that should help with Visual C++
> exceptions), so I welcome you all to try it and find if there's anything
> wrong.

Do you think it's good enough to wrap it up as a release? Can you do
it? You're the best person to do it, since you've produced most of the
patches since the last release.

I can help testing on the platforms I have access to.

Regarding the subject - "The future of greenlet" - I'd say it's rather
bright, now that PyPy has greenlet functionality working with the JIT.

Alexey Borzenkov

Aug 31, 2011, 3:39:14 AM
to gev...@googlegroups.com
On Wed, Aug 31, 2011 at 10:18 AM, Denis Bilenko <denis....@gmail.com> wrote:
> I've contacted Kyle and he says he currently does not have much time
> to work on greenlet.
>
> Does anyone want to take over?
>
> The job is to take the best current version, test it, prepare the release,
> and upload it to PyPI.
> One can use this mailing list as the maintainer's email.

Unfortunately, that's not all that's needed. Greenlet needs its
licensing situation clarified, because it can't be distributed
under the MIT license, at the very least not all of it. :-/ I traced
greenlet back to when it was forked by Bob Ippolito and
contacted Armin Rigo; here's what he said in a conversation:

[quote]
Me:
> The question is which parts have
> been MIT licensed, and which by mistake. Given that your current
> greenlet code is Python licensed, does it mean this was the intention
> from the very start and py's MIT license didn't apply to greenlet?
Armin Rigo:
> No. Let's say it this way. Any part of the code that I am the author
> for is distributed
> under either the Python license or the MIT license, as the recipient
> wishes. If it helps I can write down the previous sentence in the
> codespeak repository. This was also true at the time ambroff made his
> fork, so he can rightfully claim that most of the code is under the
> MIT license.
[/quote]

I still need to contact Christian Tismer and understand whether Armin
Rigo and Christian Tismer were the only authors of that initial
greenlet module (which I traced back to user/arigo/greenlet repo at
codespeak, back in April 2004 with the message "Yes I know, this
version segfaults happily. I still put it here for reference."), or
whether I need to ask anyone else. As soon as I have it all clarified
the plan is to license greenlet in two parts: greenlet itself is MIT
licensed, everything taken from Stackless is Python licensed and will
move into a separate folder.

Until then it's impossible to release new versions of greenlet; it has
been violating copyright enough as it is.

> That sounds cool. Maybe you can write a success story for
> http://gevent.org/success.html ?
>
> We need more cool stories of well-known companies using gevent to
> convince the skeptical users.

Hmm, I don't think I can really talk about details here, sorry. It's
highly experimental R&D stuff and even I myself don't consider it a
success story yet, especially since I might very well be the only one
who is crazy enough to use Python+greenlet+gevent and a lot of
"unstable crazy stuff" in my day-to-day work instead of, ehrm,
ubiquitous C#. :)

But I think I can say that for me greenlet and gevent are a lifesaver
(even though I use very little of gevent, because early in development
I used eventlet and got used to its APIs for classes like Event, Queue,
etc., and later basically had to make same-API wrappers around gevent;
I'm not even using gevent's Greenlet class, instead having my own
paradigms, e.g. multiple greenlets grouped into a Task and managed as
a whole, essentially micro-processes). They easily allow me to build
extremely parallel (where the network is concerned) and logically
clean solutions to some problems.

Btw, one funny thing I constantly have to deal with, is that while my
solutions (thanks to greenlet/gevent) scale like crazy, I often hit
brick walls because other services just cannot keep up! :) The big one
for example is PyMongo (which if you're not careful, will open way too
many connections to the database), it seems things usually just aren't
designed for that kind of parallelism. For example, in case of
MongoDB, if you monkey patch threading and try using PyMongo naively,
then when 3000 cooperative tasks try to open connections to it and do
simultaneous reads/writes, it just feels like it's dying (MongoDB
becomes unresponsive, and basically nothing gets read or written).
However, if I make a proxy class that routes requests via 10
connections, then suddenly everything is fast, etc. These "resource
usage limiters" are very annoying to write, and in some cases are even
hard to get right the first time.
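
A naive illustration of such a limiter (not the proxy class described above; the class, method, and parameter names are invented) is to hand out a fixed set of connections through a gevent queue:

import pymongo
from gevent.queue import Queue

class ConnectionLimiter(object):
    # Route all MongoDB work through a fixed number of connections.
    def __init__(self, uri, size=10):
        self._pool = Queue()
        for _ in range(size):
            # pymongo.Connection on 2011-era PyMongo; MongoClient in newer releases.
            self._pool.put(pymongo.Connection(uri))

    def run(self, func):
        conn = self._pool.get()   # blocks cooperatively until a connection is free
        try:
            return func(conn)
        finally:
            self._pool.put(conn)

# Usage (placeholder URI and collection):
# limiter = ConnectionLimiter('mongodb://localhost', size=10)
# limiter.run(lambda conn: conn['mydb']['mycoll'].insert({'x': 1}))

A limiter this naive has real failure modes (a killed greenlet can poison a connection mid-request, as discussed further down the thread), so treat it only as an illustration of where the bottleneck sits.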

> I still meet a lot of people who are afraid either of greenlet stack
> switching or monkey patching.

One guy who used async libraries in C++, when he heard about greenlet
and monkey patching, basically said "sounds hacky". It's even hard to
argue, because it IS a hack, however beautiful it is. But after using
many of these async frameworks, even Twisted (which makes it a bit
easier with @defer.inlineCallbacks), I shudder at the thought of ever
using them again for anything even remotely complex.

It's also sad that many people don't really understand the beauty of
greenlet: its use of a single stack and what that achieves, as opposed
to many separate stacks like, e.g., Perl's Coro.

> Do you think it's good enough to wrap it up as a release? Can you do
> it? You're the best person to do it, since you've produced most of the
> patches since the last release.

Kyle offered to make me the maintainer, though I haven't replied on the
subject yet. Basically, I'm thinking about it, but I won't accept before
I'm sure I can resolve the license situation.

> Regarding the subject - "The future of greenlet" - I'd say it's rather
> bright, now that PyPy has greenlet functionality working with the JIT.

Well, it would be cool when PyPy starts to compile with --stackless by
default. :)

Richard Tew

Aug 31, 2011, 4:44:19 AM
to gev...@googlegroups.com
On Wed, Aug 31, 2011 at 3:39 PM, Alexey Borzenkov <sna...@gmail.com> wrote:
> I still need to contact Christian Tismer and understand whether Armin
> Rigo and Christian Tismer were the only authors of that initial
> greenlet module (which I traced back to user/arigo/greenlet repo at
> codespeak, back in April 2004 with the message "Yes I know, this
> version segfaults happily. I still put it here for reference."), or
> whether I need to ask anyone else. As soon as I have it all clarified
> the plan is to license greenlet in two parts: greenlet itself is MIT
> licensed, everything taken from Stackless is Python licensed and will
> move into a separate folder.

There are other authors, but how many and who they are is the question.
This can be seen by simply looking in the directory with the switching
routines and reading the files.

Here's what I would do if I were trying to get permission to relicense to
MIT for the Stackless files:

1. Go over the SVN repository and view the changes made to the
Stackless directory of every branch from when they were branched
from their base mainline Python branch to when greenlet was
released.

http://svn.python.org/view/stackless/branches/*
http://svn.python.org/view/stackless/Python-2.3.3/
http://svn.python.org/view/stackless/Python-2.4.2/
http://svn.python.org/view/stackless/Python-2.4.3/

2. Contact Christian Tismer and get access to Stackless CVS
repository. Do the same.

3. Contact everyone who made a change if you can find them.

In some cases, the change may have been checked in without credit
in the change description. But there may have been a post to the
mailing list mentioning who gave it.

Frankly it sounds like too much bother. Going the other way
and relicensing to the Python license is probably more feasible.

Cheers,
Richard.

Vasile Ermicioi

Aug 31, 2011, 5:27:36 AM
to gev...@googlegroups.com
> Well, it would be cool when PyPy starts to compile with --stackless by
> default. :)

PyPy will have a greenlet module by default (as far as I know it is
already in trunk), written entirely in Python.

Alexey Borzenkov

Aug 31, 2011, 9:26:45 AM
to gev...@googlegroups.com
On Wed, Aug 31, 2011 at 12:44 PM, Richard Tew <richar...@gmail.com> wrote:
> On Wed, Aug 31, 2011 at 3:39 PM, Alexey Borzenkov <sna...@gmail.com> wrote:
>> I still need to contact Christian Tismer and understand whether Armin
>> Rigo and Christian Tismer were the only authors of that initial
>> greenlet module (which I traced back to user/arigo/greenlet repo at
>> codespeak, back in April 2004 with the message "Yes I know, this
>> version segfaults happily. I still put it here for reference."), or
>> whether I need to ask anyone else. As soon as I have it all clarified
>> the plan is to license greenlet in two parts: greenlet itself is MIT
>> licensed, everything taken from Stackless is Python licensed and will
>> move into a separate folder.
> There are other authors, but how many and who they are is the question.
> This can be seen by simply looking in the directory with the switching
> routines and reading the files.

And that's what I was saying. The switching routines should be under the
Python license, especially since they were repeatedly taken from
Stackless since the 2004 fork. It would also be simpler if they were
under the Python license, since Stackless and greenlet would be able to
share this code without further licensing worries.

> Here's what I would do if I were trying to get permission to relicense to
> MIT for the Stackless files:

Whichever way you look at it, pulling off relicensing is too much work;
besides, I'm not a lawyer and have no clue how to do it right. There
have been many contributions to greenlet under the MIT license since
its fork from py, and many contributions to Stackless before and after
the 2004 fork, from which greenlet appears to have updated its
switching code a number of times.

Since relicensing greenlet code under the Python license and Stackless
code under MIT is too much work, I'd like to leave them under their
respective licenses and forget about the issue. The only question is
whether py's MIT license applied to the greenlet code from which
ambroff's repo was ultimately forked.

> In some cases, the change may have been checked in without credit
> in the change description.  But there may have been a post to the
> mailing list mentioning who gave it.
>
> Frankly it sounds like too much bother. Going the other way
> and relicensing to the Python license is probably more feasible.

Exactly. I can't be sure who contributed what to greenlet either, so
that way isn't easy either.

Antonin AMAND

Aug 31, 2011, 2:29:14 PM
to gev...@googlegroups.com
On Wed, Aug 31, 2011 at 9:39 AM, Alexey Borzenkov <sna...@gmail.com> wrote:
> Btw, one funny thing I constantly have to deal with, is that while my
> solutions (thanks to greenlet/gevent) scale like crazy, I often hit
> brick walls because other services just cannot keep up! :) The big one
> for example is PyMongo (which if you're not careful, will open way too
> many connections to the database), it seems things usually just aren't
> designed for that kind of parallelism. For example, in case of
> MongoDB, if you monkey patch threading and try using PyMongo naively,
> then when 3000 cooperative tasks try to open connections to it and do
> simultaneous reads/writes, it just feels like it's dying (MongoDB
> becomes unresponsive, and basically nothing gets read or written).
> However, if I make a proxy class that routes requests via 10
> connections, then suddenly everything is fast, etc. These "resource
> usage limiters" are very annoying to write, and in some cases are even
> hard to get right the first time.

I implemented a mongo Pool class to work with gevent and pymongo >= 2.0;
it's around 2x faster than with monkey patching and seems to
respect the number of connections you specify. It's still experimental,
though.

https://gist.github.com/1184264

I'm not sure it will help you if you don't use the Greenlet class from
gevent, but maybe you can adjust it to your needs.

Antonin

Alexey Borzenkov

Aug 31, 2011, 3:31:47 PM
to gev...@googlegroups.com

Well, of course I wrote my own mongo pool too. :) And it worked quite
well, but here's the problem: you need to explicitly call end_request
all the time. I did that for a while, but when you have 3000 long-lived
greenlets talking to the database, some of which need their requests to
happen in a certain order, and you only have a small number of sockets
while some greenlets perform several long-running requests before
calling end_request, all the other greenlets have to wait on the lock
and the queue of free sockets.

Your implementation also has some serious problems that my pool
implementation also had:

- You cannot really link a greenlet and return sockets automatically
like that, because you cannot be sure the socket is in a consistent
state. For example, a greenlet might be killed (or time out, or
whatever) in the middle of a sock.send() inside PyMongo. In that case
you return a socket that already has some data written to it. If you
reuse this socket it will often lead to an AssertionError in PyMongo
on safe operations, because the ids of operations no longer match.
This can lead to a chain reaction where most of your greenlets fail
due to one "poisoned" socket in the pool.
- Your pool doesn't handle PyMongo's willingness to call .disconnect()
on you for any reason whatsoever, even a minor one. PyMongo does this
by allocating a *new* pool object, scrapping the old one. Now all of
your greenlets that had been waiting on that self._queue.get() are
deadlocked, because nobody will ever put more sockets there.
- Using a lock around connect is evil: those 10 greenlets could have
been connecting to MongoDB concurrently, but in your case they have to
wait for each other's turn.

Basically, I was fed up with the way PyMongo works, and instead
created a proxy class with a with_database method that queues
requests to workers; the workers allocate new PyMongo connections on
demand, discard them on connection errors, etc., and call the target
methods. The problem of operation ordering is solved by a pair of
start_request/end_request methods, which just set a greenlet's
affinity to a particular worker. It goes something like this:

dbproxy = DatabaseProxy("mongodb://whatever.host", "mydatabase")
# the following calls will be executed randomly on any available connections
dbproxy.with_database(lambda database: database['collection'].insert(...))
dbproxy.with_database(lambda database: database['collection'].insert(...))
# the following calls are guaranteed to execute over the same connection
dbproxy.start_request()
dbproxy.with_database(lambda database: database['collection'].insert(...))
dbproxy.with_database(lambda database: database['collection'].find(...))
dbproxy.end_request()

This clarified a lot of things logically. For example, between
start_request and end_request only use of the same connection is
guaranteed, not that there will be no other database commands in
between. Requests with and without affinity don't have priority over
one another.

It worked surprisingly well. For example, when my host process starts,
the internal micro-processes do a lot of database writes during
startup, signaling that they are now running and reporting other
information. Previously, when using the custom pool, they started
pretty slowly, something like 10-15 micro-processes a second (and
there are 3000 of them, so starting up the host process was taking
something like 5 minutes). With the DatabaseProxy class,
micro-processes report their running state almost instantaneously,
because instead of waiting for other greenlets to finally quit playing
and free a pool socket, requests queue up and run in parallel.
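
A very rough sketch of how such a worker-based proxy could be structured (this is a guess at the shape based on the description above, not Alexey's actual code; the names, worker count, and round-robin choice are all assumptions):

import pymongo
import gevent
from gevent import getcurrent
from gevent.queue import Queue
from gevent.event import AsyncResult

class DatabaseProxy(object):
    # Route database calls through worker greenlets, one connection per worker.
    def __init__(self, uri, dbname, workers=10):
        self._uri, self._dbname = uri, dbname
        self._queues = [Queue() for _ in range(workers)]
        self._affinity = {}   # calling greenlet -> pinned worker index
        self._next = 0
        for q in self._queues:
            gevent.spawn(self._worker, q)

    def _worker(self, queue):
        conn = None
        while True:
            func, result = queue.get()
            try:
                if conn is None:
                    # Connect lazily; reconnect on the next call after an error.
                    conn = pymongo.Connection(self._uri)   # MongoClient in newer PyMongo
                result.set(func(conn[self._dbname]))
            except Exception as exc:
                conn = None   # discard a possibly poisoned connection
                result.set_exception(exc)

    def _pick_worker(self):
        idx = self._affinity.get(getcurrent())
        if idx is None:
            # No affinity: spread calls over the workers round-robin.
            idx, self._next = self._next, (self._next + 1) % len(self._queues)
        return idx

    def with_database(self, func):
        result = AsyncResult()
        self._queues[self._pick_worker()].put((func, result))
        return result.get()   # wait cooperatively for the worker to finish

    def start_request(self):
        # Pin the calling greenlet to one worker so its calls share a connection.
        self._affinity[getcurrent()] = self._next
        self._next = (self._next + 1) % len(self._queues)

    def end_request(self):
        self._affinity.pop(getcurrent(), None)

In this sketch, calls pinned between start_request and end_request share one worker's queue, which already preserves their order; everything else is spread over the remaining workers.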

Yichuan Wang

Aug 31, 2011, 5:20:35 PM
to gev...@googlegroups.com
First, let me say I think gevent is great, and I think gevent with libev will be even better. Thank you.

However, as someone who is exploring all the concurrency options in Python, I feel better documentation, tutorials, and examples are much more important than success stories. As a programmer, the 'Facebook uses it' kind of story does not help much; we are not Facebook, and Facebook's technologies are not necessarily superior.

The most popular use of gevent is probably web apps, and compared with other libraries/frameworks/servers like Tornado or uWSGI, gevent's documentation and tutorials are just, well, bad.

I really like the idea of self-scheduled coroutines plus the power of epoll/kqueue, but to get the benefit one has to learn and understand the framework. If everyone needs to understand the underlying internals, the value of a library is greatly reduced.

David Robinson

Sep 12, 2011, 7:32:41 PM
to gev...@googlegroups.com

Sounds interesting. I've been thinking of using PyMongo with gevent but
haven't dived in yet due to these kinds of issues.

Is the proxy implementation something you can share (I had a quick
look on GitHub but couldn't see it)?

Has anyone tried the mongo async python driver w/ gevent? Curious
whether it works with gevent and solves the problems mentioned
above...

https://github.com/fiorix/mongo-async-python-driver

--
Dave

lasizoillo

Sep 13, 2011, 6:06:02 AM
to gev...@googlegroups.com
2011/9/13 David Robinson <zxvd...@gmail.com>:

>
> Has anyone tried the mongo async python driver w/ gevent? Curious
> whether it works with gevent and solves the problems mentioned
> above...
>
> https://github.com/fiorix/mongo-async-python-driver
>

Only if you use Twisted instead of gevent; it doesn't solve anything
for gevent.

Try these other solutions:
http://pastebin.com/z9nFYGa9
http://code.activestate.com/recipes/577490-mongodb-pool-for-gevent-and-pymongo-packages/

or search for "mongodb gevent" on Google.
