Re: Getting coverage.py to work with gevent

Ned Batchelder

Nov 18, 2012, 11:02:01 PM
to gev...@googlegroups.com
Oh, and one other thing: does anyone else have experience using the patch in newbrough's repo, and can say whether it worked or didn't?

Thanks,

--Ned.

On Saturday, November 17, 2012 8:04:08 AM UTC-5, Ned Batchelder wrote:
Hi all, I'm the author of coverage.py, and every few months, someone asks about getting it to work with gevent.  I know next to nothing about gevent, but would like to get them working together if I can.  That means I need help from someone here.  Here is the coverage.py ticket about it:  https://bitbucket.org/ned/coveragepy/issue/149/coverage-gevent-looks-broken  and here is a repo with a fix for coverage.py:  https://github.com/newbrough/coverage

But the fix in that repo sacrifices other behavior, so I would like to understand the issues better so that I can make the best fix possible.  There are a few specific points I could use help with:
  1. I'd like to understand why my Python trace function fails while newbrough's succeeds.  What assumption in my code is broken when running under gevent?  I maintain my own stack data structure that should correspond to the Python call stack, and that clearly isn't right under gevent, but can someone explain concisely where that goes wrong?
  2. I need help constructing test cases that properly demonstrate the problem.  This thread (https://groups.google.com/forum/?fromgroups=#!searchin/gevent/coverage.py/gevent/Y9xEE0QLqG0/_wuTLiUAjp0J) has demonstration code, but I'd like to incorporate something into my automated test suite.
  3. An earlier thread in this group (titled "gevent fix for code coverage") indicates that fixes are needed in both coverage.py and gevent itself to make everything work properly.  Is this true?  If so, it will complicate my efforts to test my fix, especially in an automated way.
BTW: if this mailing list isn't a good place to hash this out, I'm also on freenode IRC as nedbat, often in #python.

Thanks,

--Ned.

ggw

Nov 19, 2012, 12:54:59 PM
to gev...@googlegroups.com
Howdy Ned,

I can say that we are still successfully using this patch on a pretty complex project.  It does require a patch to gevent too (gevent doesn't consider system trace functions at all).

I'm not enough of a gevent expert to go into detail, but gevent causes the Python call stack to show a mix of frames from unrelated threads.  A call stack may look like this:

mainFunction()  --> looks like this is one thread
subFunction()
time.sleep()   ---> until you reach a blocking call
gevent.hub()
doSomething()  ---> and then suddenly you are in a different thread completely
nowMore()

The coverage trace function maintains its own internal stack which does not expect doSomething() [from thread#2] to appear in the same stack as subFunction() [from thread#1].  It prints an error message and stops collecting coverage stats.
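
The mismatch described above can be modelled without gevent at all. The following is a toy sketch, not coverage.py's actual tracer: the ShadowStack class and the event list are hypothetical, and the events only imitate what a trace function might be handed when a switch happens without any call/return events being reported for it.

```python
class ShadowStack:
    """Toy model of a tracer that keeps its own copy of the call stack."""

    def __init__(self):
        self.frames = []   # names of functions the tracer believes are active
        self.errors = []   # (expected_top, actual_name) for each mismatch

    def trace(self, event, name):
        if event == "call":
            self.frames.append(name)
        elif event == "return":
            # A return is consistent only if it matches the innermost call.
            if self.frames and self.frames[-1] == name:
                self.frames.pop()
            else:
                expected = self.frames[-1] if self.frames else None
                self.errors.append((expected, name))


# Hypothetical event stream, using the function names from the post.
# The switch between sleep() and nowMore() produces no event of its own.
events = [
    ("call", "mainFunction"),
    ("call", "subFunction"),
    ("call", "sleep"),
    # ... switch to the other thread happens here, unseen by the tracer ...
    ("return", "nowMore"),
    ("return", "doSomething"),
]

stack = ShadowStack()
for event, name in events:
    stack.trace(event, name)

# The tracer still believes sleep() is on top, so both returns are mismatches.
print(stack.errors)
```

A real tracer works with frame objects rather than names, but the failure has the same shape: a return arrives for a frame the tracer never saw being entered on its current stack.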

The gevent patch is a little more general-purpose than the coverage patch.  It could be argued it would be useful even if you weren't using coverage.  It adds system trace functions to gevent threads.  But until that project decides to add this feature, it remains my patched version.

In fairness, I confess we are only using the patched gevent during the unit testing.  During normal operations, we use the unpatched version.  There is no known issue with the patched version but I lean toward keeping things as vanilla as possible and easy to support.

Jonathan

Denis Bilenko

Nov 21, 2012, 3:38:44 PM
to gev...@googlegroups.com
Hi Ned,

I don't know much about coverage, but I'd start with figuring out if
it works with basic greenlet.

On Sat, Nov 17, 2012 at 2:04 PM, Ned Batchelder <n...@nedbatchelder.com> wrote:
> I'd like to understand why my Python trace function fails while newbrough's
> succeeds. What assumption in my code is broken when running under gevent?
> I maintain my own stack data structure that should correspond to the Python
> call stack, and that clearly isn't right under gevent, but can someone
> explain concisely where that goes wrong?

Well, the whole point of greenlet is to switch call stack:

If you have a function f() that creates a greenlet from function g()
and switches to it, and g() then calls function e(), the call stack
looks like this:

f() -> g() -> e()

However, when a switch occurs, the g() -> e() part gets cut out of the
stack and saved on the heap, so you're left with

f()

or, more likely, you switch into some other greenlet that also had a
call stack (previously saved):

f() -> bb() -> cc()

Basically, the call stack no longer changes incrementally; between two
trace events it can change much more drastically.

If you saved the data structures you maintain in a greenlet-local object
(gevent.local.local), maybe that would help? (You can use thread-local
and under monkey patching they'll be greenlet-local).
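
A minimal sketch of that idea, using only the standard library: the shadow stack lives in a threading.local() object, so each thread gets its own. The helper names (current_stack, worker) are made up for illustration, and whether this becomes greenlet-local depends on gevent's monkey patching being applied before threading is imported, as suggested above.

```python
import threading

# Assumption: if gevent's monkey patching has been applied first, this
# thread-local object behaves as a greenlet-local one instead.
_state = threading.local()


def current_stack():
    """Return the shadow stack that belongs to the running thread."""
    stack = getattr(_state, "stack", None)
    if stack is None:
        stack = _state.stack = []
    return stack


def worker(name, results):
    # Each thread pushes onto its own stack and never sees the other's.
    stack = current_stack()
    stack.append(name)
    results[name] = list(stack)


results = {}
threads = [threading.Thread(target=worker, args=(n, results)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```

With plain threads each worker sees only its own entry; the hope is that the same lookup, done inside a trace function, would keep each greenlet's frames apart.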

Also, recent greenlet versions have a tracing function that allows you
to hook into switching; this could be useful.
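
A rough sketch of how such a hook might be used to swap shadow stacks on every switch. It assumes greenlet.settrace(callback) is available, with the callback receiving an event name ("switch" or "throw") and an (origin, target) pair; PerGreenletStacks and install are hypothetical names, and the demonstration at the bottom uses stand-in objects so it runs even without greenlet installed.

```python
try:
    import greenlet
except ImportError:  # the sketch still runs without greenlet
    greenlet = None


class PerGreenletStacks:
    """Keep one shadow stack per greenlet and swap them on each switch."""

    def __init__(self):
        # Maps greenlet -> saved stack. Strong references are kept for
        # simplicity; a real tracer would need to drop dead greenlets.
        self.stacks = {}
        self.current = []

    def on_switch(self, event, args):
        if event in ("switch", "throw"):
            origin, target = args
            self.stacks[origin] = self.current          # park the old stack
            self.current = self.stacks.setdefault(target, [])


def install(tracker):
    """Register the tracker with greenlet if the hook exists."""
    if greenlet is not None and hasattr(greenlet, "settrace"):
        greenlet.settrace(tracker.on_switch)
        return True
    return False


# Demonstration with stand-in objects instead of real greenlets.
tracker = PerGreenletStacks()
main, other = object(), object()
tracker.current.append("f")
tracker.on_switch("switch", (main, other))   # f's stack is parked
tracker.current.append("bb")
tracker.on_switch("switch", (other, main))   # back to main's stack
print(tracker.current)
```

The design choice here is to swap the whole stack rather than try to repair it after the fact, which matches the observation that a switch replaces the call stack wholesale.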

BTW, this list is a perfect place to discuss this, even if it's
greenlet-related.

Peter Portante

Mar 17, 2013, 6:45:31 PM
to gev...@googlegroups.com
An initial version that works for gevent and eventlet (hacky, proof of concept, etc.) can be found here: https://github.com/portante/coverage. It is based on coverage.py 3.6 as downloaded from PyPI, and Ned will likely rework this entirely to be properly supported in coverage.py at some point in the future. Feedback welcome.

Example usage:

$ coverage run --include=c.py --timid c.py
$ coverage report -m
Name    Stmts   Miss  Cover   Missing
-------------------------------------
c          30      6    80%   22-24, 31-33
$ # Must use --timid to get the PyTracer, as this is not currently supported with the cTracer
$ coverage run --include=c.py --timid --concurrency=gevent c.py
$ coverage report -m
Name    Stmts   Miss  Cover   Missing
-------------------------------------
c          30      0   100%   
$ cat c.py
from gevent import monkey
monkey.patch_thread()
import threading
import gevent.queue as Queue

class Producer(threading.Thread):
    def __init__(self, q):
        threading.Thread.__init__(self)
        self.q = q

    def run(self):
        for i in range(10):
            self.q.put(i)
        self.q.put(None)

class Consumer(threading.Thread):
    def __init__(self, q):
        threading.Thread.__init__(self)
        self.q = q

    def run(self):
        while True:
            i = self.q.get()
            if i is None:
                return
            print(i)  # parenthesized form works on both Python 2 and 3

def main():
    q = Queue.Queue()
    p = Producer(q)
    c = Consumer(q)
    c.start()
    p.start()
    p.join()
    c.join()

if __name__ == "__main__":
    main()