I'm looking at integrating gevent into an existing application which has a fair amount of 3rd-party dependencies. I'm going through each of these libraries one by one to scan for obvious incompatibilities and other risky points (e.g. one reads from the Windows registry, another tries to use the native APIs to read from serial ports, some more include custom Python C extensions etc.).
Does gevent or anyone in the community maintain some sort of list of libraries that have known gevent-compatibility issues so that I can spot them more easily and look for replacements?
Maybe here would be a good place to start listing them. If we get enough, I might be able to set up a page (on the gevent docs, maybe?). More generally, is there a list of criteria somewhere that explains what to watch out for? For example, direct socket access in a PYD file is rather obvious, but anything that performs a lengthy CPU-bound operation can be problematic too.
For the moment, I have (not necessarily tested to demonstrate incompatibilities):
> More generally, is there a list of criteria somewhere that explains what
> to watch out for? For example, direct socket access in a PYD file is
> rather obvious, but anything that performs a lengthy CPU-bound operation
> can be problematic.
> For the moment, I have (not necessarily tested to demonstrate
> incompatibilities):
- pyzmq: use gevent_zmq until the current master of pyzmq is released with the new pyzmq.green
- python-memcached (I have switched to pylibmc, which still blocks AFAIK but is fast enough ;) -- a good candidate is ultramemcached, but I have not tried it
- I *think* py2app has issues.
> The 'ultramysql' driver works well with gevent but doesn't implement the > Python DBI.
Thanks, I hadn't seen this. However, just like ultramemcached, it performs I/O in C++-land, so I have trouble figuring out why you recommend it for use with gevent. Plus, we're already considering pymysql as an alternative to MySQLdb.
> Workaround: use Linux. :-P
Already applied this workaround, ran into bug "Customer won't switch to Linux" :-)
> 5. Basically anything that accesses the file system. NFS requests may
> block indefinitely. OS threads aren't bothered by this; gevent is. :-(
Do you mean "anything that accesses the file system while bypassing gevent" or is this a problem even after monkey patching?
> - python-memcached (I have switched to pylibmc which still blocks afaik
> but is fast enough ;) -- a good candidate is ultramemcached but I have not
> tried it
That's odd. Of all memcached clients for Python, this seems to be the only one that's written entirely in Python. Pylibmc wraps libmemcached and ultramemcached performs I/O in C++-land. What problems did you encounter with python-memcached in a gevent application?
On Mon, Jul 23, 2012 at 4:42 PM, andre.l.caron <andre.l.ca...@gmail.com> wrote:
> > 5. Basically anything that accesses the file system. NFS requests may
> > block indefinitely. OS threads aren't bothered by this; gevent is. :-(
> Do you mean "anything that accesses the file system while bypassing gevent"
> or is this a problem even after monkey patching?
Umm, gevent does NOT monkeypatch stuff like file.open or file.read or
file.flush or os.link or os.unlink or os.path.abspath or … all of which
access the file system, which might block arbitrarily long.
The best workaround for this that I've found (so far) is to use RPyC,
which has a legacy mode that basically exports a complete remote
namespace. Create a socketpair, fork, run a server in the child and
use its file() and os.* instead of Python's standard file class
and os module. It's reasonably transparent; the only problem is that
you need to modify any library code yourself – you can't monkeypatch
file().
On Monday, July 23, 2012 3:51:06 PM UTC-4, smurf wrote:
> Umm, gevent does NOT monkeypatch stuff like file.open or file.read or
> file.flush or os.link or os.unlink or os.path.abspath or … all of which
> access the file system, which might block arbitrarily long.
> The best workaround for this that I've found (so far) is to use RPyC,
> which has a legacy mode that basically exports a complete remote
> namespace. Create a socketpair, fork, run a server in the child and
> use its file() and os.* instead of Python's standard file class
> and os module. It's reasonably transparent; the only problem is that
> you need to modify any library code yourself – you can't monkeypatch
> file().
Wouldn't it be easier to write a gevent.fs package that uses AsyncResult <http://www.gevent.org/gevent.event.html#gevent.event.AsyncResult> and runs the FS operations in a real background thread (pool)? Note that I'm not suggesting that the thread pool be exposed in gevent's public API; I can certainly foresee the abuse it would get. However, with a little bit of magic, gevent could offer in-process FS operations that don't block all greenlets. (This is getting off topic, though. If the idea is interesting, we'd be better off continuing it in another thread.)
Anyways, it might not be a deal killer (we don't do much file I/O), but it's certainly good to know that all regular FS operations block all greenlets, regardless of monkey patching.
> Wouldn't it be easier to write a gevent.fs package that uses
> AsyncResult and run the FS operations in a real background
> thread (pool)?
The point is that third-party libraries use these calls, so you need to
monkeypatch __builtins__ with something that behaves like the original,
but still delegates the real work to some other thread behind the scenes.
Yes, a threaded solution might be simpler. OTOH, I don't want gevent to
block because some 3rd-party library didn't correctly release the GIL
across a call to C, or does some lengthy calculation, or other stuff along
these lines which you usually don't notice…
On Monday, July 23, 2012 7:42:35 AM UTC-7, andre.l.caron wrote:
>> The 'ultramysql' driver works well with gevent but doesn't implement the
>> Python DBI.
> Thanks, I hadn't seen this. However, just like the ultramemcached, this
> performs I/O in C++-land. I have trouble figuring out why you recommend
> this for use in gevent. Plus, we're already considering pymysql as an
> alternative to mysqldb.
I experienced sporadic pymysql failures with the combination of pymysql and DBUtils.PooledDB on MacOS and Linux when using gevent socket monkey-patch in an app that had two greenlets making SQL queries. The failures would manifest themselves as None being returned (instead of a list) from cursor.fetchall(). The problem went away after I turned off gevent monkey-patching of socket, and with it went the ability to execute concurrent SQL queries.
On Wednesday, July 25, 2012 3:33:31 AM UTC-4, vitaly wrote:
> I experienced sporadic pymysql failures with the combination of pymysql
> and DBUtils.PooledDB on MacOS and Linux when using gevent socket
> monkey-patch in an app that had two greenlets making SQL queries. The
> failures would manifest themselves as None being returned (instead of a
> list) from cursor.fetchall(). The problem went away after I turned off
> gevent monkey-patching of socket, and with it went the ability to execute
> concurrent SQL queries.
Have you come up with a theory on why this failure occurred? I can easily see two greenlets making concurrent queries interleaving reads and writes on the same socket, since they share a database connection object. For example, one greenlet sends a query, then another greenlet sends a query, and then the results are picked up out of order (the second greenlet is scheduled first and gets the first query's result). Remember that any monkey-patched socket I/O operation can result in a context switch.
Does the problem go away if you use a DB connection per greenlet? I don't recommend this for production, but if it does solve your problem, perhaps it would be a good idea to create a small pool of DB-bound greenlets that perform SQL queries on greenlet-local connections. In this scheme, regular greenlets make SQL queries through one of the greenlets in the pool while blocking on an AsyncResult or something.
I'm particularly interested in this use case because I'm still wrapping my head around concurrency issues in gevent-based applications. It's not like they disappear just because scheduling is cooperative instead of preemptive. Sharing resources across greenlet boundaries still requires synchronization, albeit in a different way (context switching occurs at predictable locations).
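The interleaving I have in mind can be simulated without gevent at all. The sketch below is only a toy model, not a reproduction of the pymysql/PooledDB failure: plain generators stand in for greenlets, a bare yield stands in for the context switch a monkey-patched socket call may trigger, and SharedConnection and worker are made-up names.

```python
from collections import deque


class SharedConnection:
    """Toy stand-in for one database socket: replies come back in FIFO order."""

    def __init__(self):
        self._pending = deque()

    def send(self, query):
        self._pending.append("result of " + query)

    def recv(self):
        return self._pending.popleft()


def worker(name, conn, log):
    conn.send("query from " + name)
    yield  # stands in for a cooperative context switch during socket I/O
    log[name] = conn.recv()


conn = SharedConnection()
log = {}
a = worker("A", conn, log)
b = worker("B", conn, log)

next(a)  # A sends its query, then yields
next(b)  # B sends its query, then yields

# B happens to be resumed first and picks up the reply meant for A.
next(b, None)
next(a, None)

print(log)
```

Each worker ends up holding the other one's result, which is the kind of mix-up a shared connection object allows once a switch can happen between the send and the receive.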
On Wed, Jul 25, 2012 at 7:56 PM, andre.l.caron <andre.l.ca...@gmail.com> wrote:
> On Wednesday, July 25, 2012 3:33:31 AM UTC-4, vitaly wrote:
> I'm particularly interested in this use case because I'm still wrapping my
> head around concurrency issues in gevent-based applications. It's not like
> they disappear just because scheduling is cooperative instead of preemptive.
> Sharing resources across greenlet boundaries still requires synchronization,
> albeit in a different way (context switching occurs at predictable
> locations).
You can expect to encounter all the same problems with greenlets as
you would with threads, such as deadlocks and race conditions.
This is just a natural consequence of context switches, even
cooperative ones. In my gevent-based modules, I code my greenlets as
I would code threads, but use gevent primitives for synchronization
(e.g., a gevent semaphore versus a POSIX mutex); this way, when code
changes are made by me or others that make additional calls to
functions that may result in a context switch in places where they
didn't happen before, the code still continues to work properly.
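As a rough illustration of that style (a sketch, not my actual code): the wrapper below serializes each send/receive pair on a shared connection with a semaphore. SerializedConnection and EchoConnection are made-up names. The import falls back to threading.BoundedSemaphore only so that the snippet runs where gevent is not installed; in a gevent app the gevent.lock version is the one that matters, because it blocks only the calling greenlet.

```python
try:
    # Cooperative primitive: waiting on it lets other greenlets run.
    from gevent.lock import BoundedSemaphore
except ImportError:
    # Fallback (assumption for this sketch) so it runs without gevent.
    from threading import BoundedSemaphore


class EchoConnection:
    """Toy connection: recv() returns a reply for the last query sent."""

    def __init__(self):
        self._last = None

    def send(self, sql):
        self._last = sql

    def recv(self):
        return "result of " + self._last


class SerializedConnection:
    """Allow only one query at a time on the wrapped connection."""

    def __init__(self, conn):
        self._conn = conn
        self._lock = BoundedSemaphore(1)

    def query(self, sql):
        # Holding the lock across send + recv keeps another greenlet from
        # slipping its own query in between, even if recv() context-switches.
        with self._lock:
            self._conn.send(sql)
            return self._conn.recv()


db = SerializedConnection(EchoConnection())
first = db.query("SELECT 1")
second = db.query("SELECT 2")
```

The lock costs little when there is no contention, and it keeps working if someone later adds a call inside query() that introduces a new switch point.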
In the failure case that I described, I used the open-source
DBUtils.PooledDB to provide/manage a mysql database connection pool,
so that I wouldn't have to implement the same thing myself. It's a
pretty nifty utility that can automatically reconnect broken
connections, does connection pings, integrates well with SQL
transaction support, etc. I am told that it works pretty well in true
multi-threaded apps.
I spent some time trying to debug the problem, but not enough to nail
it down. However, it made me realize that monkey.patch_all(),
monkey.patch_socket(), etc. can cause undesirable side effects or
failures in other unsuspecting 3rd-party packages that the same app
also needs to use. My theory is that somewhere in DBUtils.PooledDB or
pymysql (my intuition leans towards PooledDB) there is code that
doesn't expect a context switch to take place within a single thread,
and so updates some shared data structures and performs blocking
operations (which may cause a gevent context switch when
monkey-patched) in an order that may break the integrity of those
data structures under concurrency. I tried a couple of work-arounds
in my investigation:
1. Create a new db connection for each SQL transaction (not using
   PooledDB): this made the failure go away and preserved concurrency,
   but had an undesirable impact on performance.
2. Don't use gevent monkey-patching of the socket module: this made
   the sporadic failure go away, but also disabled concurrency during
   the execution of SQL.
I settled on work-around #2 as that was the practical solution for my
app at the time.
On Jul 25, 8:59 am, Denis Bilenko <denis.bile...@gmail.com> wrote:
> On Wed, Jul 25, 2012 at 7:56 PM, andre.l.caron <andre.l.ca...@gmail.com> wrote:
> > On Wednesday, July 25, 2012 3:33:31 AM UTC-4, vitaly wrote:
> > I'm particularly interested in this use case because I'm still wrapping my
> > head around concurrency issues in gevent-based applications. It's not like
> > they disappear just because scheduling is cooperative instead of preemptive.
> > Sharing resources across greenlet boundaries still requires synchronization,
> > albeit in a different way (context switching occurs at predictable
> > locations).
Thank you for the link. My gevent-based app shared a common DAO
module with non-gevent-aware apps. That shared module used
DBUtils.PooledDB + pymysql under the covers, and it was not practical
to replace DBUtils.PooledDB with a gevent-friendly version at the time.
On Thursday, July 26, 2012 4:28:50 AM UTC-4, Denis Bilenko wrote:
> On Mon, Jul 23, 2012 at 4:25 PM, andre.l.caron <andre.l.ca...@gmail.com> wrote:
> > dateutil (reads from the Windows registry)
> Does reading from the local Windows registry make it incompatible? This
> is not even network communication.
I'm not sure, that's mostly why I started this thread in the first place. I came up with a list of suspect packages from all our dependencies. AFAIK, reading from the Windows registry may hit the disk, so I think it's *risky* to do so anywhere but at startup.
On Mon, Jul 23, 2012 at 9:51 PM, Matthias Urlichs <matth...@urlichs.de> wrote:
> Hi,
> andre.l.caron:
>> > 5. Basically anything that accesses the file system. NFS requests may
>> > block indefinitely. OS threads aren't bothered by this; gevent is. :-(
>> Do you mean "anything that accesses the file system while bypassing gevent"
>> or is this a problem even after monkey patching?
> Umm, gevent does NOT monkeypatch stuff like file.open or file.read or
> file.flush or os.link or os.unlink or os.path.abspath or … all of which
> access the file system, which might block arbitrarily long.
> The best workaround for this that I've found (so far) is to use RPyC,
> which has a legacy mode that basically exports a complete remote
> namespace. Create a socketpair, fork, run a server in the child and
> use its file() and os.* instead of Python's standard file class
> and os module. It's reasonably transparent; the only problem is that
> you need to modify any library code yourself – you can't monkeypatch
> file().
> --
> -- Matthias Urlichs
Another possibility would be to bind libeio [1]. It would allow
non-blocking I/O on the file system, just as libev does for sockets.
It is also from the same author/group.
Or maybe use libuv [2], the library used by Node.js, which wraps
libev and libeio and also uses libares.
On Thu, Jul 26, 2012 at 3:02 PM, Denis Bilenko <denis.bile...@gmail.com> wrote:
> On Thu, Jul 26, 2012 at 4:25 PM, Benoit Chesneau <bchesn...@gmail.com> wrote:
>> Other possibility would be binding libeio [1]. It would allows to have
>> non blocking io on the fs just like libev with the sockets. Also this
>> is the same author/group.
> How is it better than using gevent.threadpool?
For what? It may be better to share the same event loop instead
of having a thread per file I/O operation, IMO.
On Thu, Jul 26, 2012 at 6:31 PM, Benoit Chesneau <bchesn...@gmail.com> wrote:
> On Thu, Jul 26, 2012 at 3:02 PM, Denis Bilenko <denis.bile...@gmail.com> wrote:
>> On Thu, Jul 26, 2012 at 4:25 PM, Benoit Chesneau <bchesn...@gmail.com> wrote:
>>> Other possibility would be binding libeio [1]. It would allows to have
>>> non blocking io on the fs just like libev with the sockets. Also this
>>> is the same author/group.
>> How is it better than using gevent.threadpool?
> for what? it may be better to share the same loop of events instead
> of having a thread / file io operations imo
libeio uses a thread pool for most operations, doesn't it?
You already can make any file operation non-blocking:
>>> threadpool = gevent.get_hub().threadpool  # assuming gevent 1.0's hub-owned pool
>>> threadpool.apply(os.unlink, (filename, )) # won't block the event loop
So I don't see a reason for wrapping async unlink from libeio, which
would also run unlink() in a [private] thread pool. It could be that a
thread pool implemented in C has less overhead. However, making
wrappers for all those eio functions to be compatible with the Python
os module does not seem worth the trouble.
For those who want async OS operations in gevent, it's now really easy
to make them yourself:
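One possible shape for such a wrapper, as a sketch under stated assumptions: make_nonblocking and stdlib_apply are hypothetical names, and the demo substitutes a concurrent.futures pool so that it runs without gevent installed. Under gevent 1.0 one would presumably pass gevent.get_hub().threadpool.apply as the apply argument instead, which takes (func, args, kwds) in the same order.

```python
import functools
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def make_nonblocking(func, apply):
    """Return a wrapper that runs func via apply(func, args, kwargs)."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return apply(func, args, kwargs)

    return wrapper


# Stand-in for a gevent thread pool (assumption: not gevent's own API).
_executor = ThreadPoolExecutor(max_workers=2)


def stdlib_apply(func, args=(), kwargs=None):
    """Run func in a worker thread and wait for the result."""
    return _executor.submit(func, *args, **(kwargs or {})).result()


async_stat = make_nonblocking(os.stat, stdlib_apply)
async_unlink = make_nonblocking(os.unlink, stdlib_apply)

# Demo: stat and then remove an empty temporary file through the wrappers.
fd, path = tempfile.mkstemp()
os.close(fd)
size = async_stat(path).st_size
async_unlink(path)
removed = not os.path.exists(path)
```

Wrapping only the handful of os functions an application really calls keeps this small, compared with mirroring the whole os module.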