Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
Message from discussion Timeouts in tests using Tornado + Motor
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Ben Darnell  
View profile  
 More options Oct 5 2012, 11:47 pm
From: Ben Darnell <b...@bendarnell.com>
Date: Fri, 5 Oct 2012 20:47:03 -0700
Local: Fri, Oct 5 2012 11:47 pm
Subject: Re: [tornado] Timeouts in tests using Tornado + Motor
Oh, I see.  What's happening is that each time you create a debug-mode
application, it calls autoreload.start(), which schedules a callback
every 500ms on the IOLoop to check the modification times of all the
source files.  As you run more tests, those accumulate, and eventually
the server can hardly do anything else.  Autoreload should definitely
check to see if it's being started on the same IOLoop more than once.
(and maybe it should warn if it's being started on two different
IOLoops in the same process, since that's not really useful).

-Ben

On Fri, Oct 5, 2012 at 4:44 PM, Daniel Kador <d...@keen.io> wrote:
> I appear to have found a fix.  I was previously returning an Application
> instance in self.get_app() with the debug kwarg set to True (tests were
> inheriting the default value for running in dev mode) and now everything is
> quite speedy.  More importantly, the tests actually all run!  I'm satisfied,
> but if Ben or Jesse wants me to dig in more to figure out what, exactly,
> debug mode is doing, I'm happy to help out.

> One thing I noticed: I was able to reproduce the problem in a small test
> class with ~100 tests.  The ONLY way I could repro, though, was if I had
> debug mode on AND I returned the singleton IOLoop from IOLoop.instance() in
> self.get_new_ioloop() AND each test made a simple HTTP request AND my
> application used my request handlers (if I tried with a barebones app it was
> all fine).

> So it appears as if there's some strange interaction between my handlers and
> debug mode?

> On Friday, October 5, 2012 3:21:19 PM UTC-7, Daniel Kador wrote:

>> The doc additions looks great, thanks for handling it so quickly.

>> db.serverStatus().cursors reports nothing interesting at all.  The cursors
>> are definitely being closed (which is good).

>> The CPU is being used by Python.  Mongo hardly has a blip.

>> I ran lsof -p PID | wc -l, where PID was the PID of my Python process.
>> The count stays steady around 55.  Most of the results are related to the
>> .so files loaded by Python.  There are about 10 related to sockets, which
>> seems pretty reasonable to me.

>> Mongo stays pretty consistent around 113.  Again, that doesn't sound
>> crazy.

>> It definitely *feels* like the Python layer is leaking something,
>> somewhere.  Is there an easy way to see what's registered on the ioloop?  Or
>> just simply log what the ioloop is currently working on? I wonder if my code
>> is somehow not terminating in some cases?  Or taking much longer than I
>> think?

>> On Friday, October 5, 2012 3:09:38 PM UTC-7, A. Jesse Jiryu Davis wrote:

>>> I've updated the docs about MotorCursor:

>>> http://emptysquare.net/motor/pymongo/api/motor/motor_cursor.html

>>> Let me know what you find out regarding your tests, Dan, or any other way
>>> I can help. I want anyone who uses Motor to succeed. =)

>>> On Friday, October 5, 2012 2:53:00 PM UTC-7, A. Jesse Jiryu Davis wrote:

>>>> No, I'll add this to the docs -- a MotorCursor's __del__ calls the
>>>> PyMongo Cursor's close(), which checks if the cursor needs to be closed on
>>>> the server and if so sends the closeCursors command. So if you've fully
>>>> iterated a cursor (with to_list, or next_object until it returns None) then
>>>> the server has already closed the cursor, otherwise the client cursor closes
>>>> itself on the server when it's dereferenced.

>>>> Motor's own test suite checks for leaked cursors after every test:

>>>> https://github.com/ajdavis/mongo-python-driver/blob/motor/test/motor/...

>>>> So you could do something similar in your tests, or manually run
>>>> db.serverStatus().cursors in the Mongo shell while the tests are going. If
>>>> it rises a lot, then we have a problem.

>>>> Is it Python or MongoDB that mostly uses CPU? I'm not surprised your CPU
>>>> is pegged, nothing *else* would be the bottleneck in your unittests.

>>>> Can you run lsof|grep PID|wc -l, where PID is the PID of your Python
>>>> process, and make sure that process itself isn't opening more than a dozen
>>>> or so connections? What about with mongod's PID?

>>>> On Friday, October 5, 2012 2:11:19 PM UTC-7, Daniel Kador wrote:

>>>>> Quick update - I believe I've ruled out file descriptors as the source
>>>>> of my problem.  I wrote a small script that just repeatedly runs "lsof | wc
>>>>> -l" (which, as I understand it, should give me the number of open file
>>>>> descriptors on my system) and stores the results in a texfile.  Before I
>>>>> launched any tests, my open file descriptor count hovered around 7,300.  As
>>>>> I ran tests, the number fluctuated a bit, but never went above 7,405 and
>>>>> never went below 7,221.  So it feels like I'm not blowing out the descriptor
>>>>> count anywhere.

>>>>> I also verified that the tearDown methods in AsyncTestCase and
>>>>> AsyncHTTPTestCase are being called on every test.

>>>>> Some more details (not sure if they're useful or relevant, but I'm
>>>>> hoping they are): when I run tests my CPU gets pegged.  One of my cores goes
>>>>> to about 90% and hovers there for the whole run.  Also, I'm overriding
>>>>> get_new_ioloop() to return the singleton instance.

>>>>> Jesse, question for you: do I need to explicitly close cursors opened
>>>>> by Motor?  I'm pretty sure I don't need to do anything with the Connection
>>>>> object, but I couldn't find anything that told me my responsibility re: the
>>>>> cursor lifecycle.

>>>>> Will keep digging in.

>>>>> Thanks again!

>>>>> -Dan

>>>>> On Friday, October 5, 2012 11:38:59 AM UTC-7, Daniel Kador wrote:

>>>>>> Awesome, I was thinking along the same lines.  I've seen this problem
>>>>>> on both my local development platform (Mac) and my CI box (Ubuntu).  I've
>>>>>> looked into the file descriptor problem before and didn't come up with
>>>>>> anything definitive, but I'll dig in more.  Both Jesse's and Ben's
>>>>>> suggestions sound like good places to start.

>>>>>> I'm using MotorReplicaSetConnection.  I don't create a new one per
>>>>>> test, I have a single module-level connection reference.  I've tried
>>>>>> explicitly setting up the connection and then tearing it down on every test,
>>>>>> which didn't make a noticeable difference (except to significantly slow down
>>>>>> my tests :P).

>>>>>> I'll dig in more and report back. I don't /believe/ I'm forgetting to
>>>>>> call super.tearDown(), but I'll double-check.  If it's that simple I'll be
>>>>>> relieved (and slightly embarrassed).

>>>>>> Thanks so much!

>>>>>> -Dan

>>>>>> On Thursday, October 4, 2012 11:23:11 PM UTC-7, A. Jesse Jiryu Davis
>>>>>> wrote:

>>>>>>> Agreed. Are you using MotorConnection or MotorReplicaSetConnection?
>>>>>>> Do you create a new connection per test? Does it help to call close() on
>>>>>>> that connection in tearDown() rather than simply relying on it to be
>>>>>>> garbage-collected when it's dereferenced?

>>>>>>> On Thursday, October 4, 2012 8:53:31 PM UTC-7, Ben Darnell wrote:

>>>>>>>> This sounds like it might be a file descriptor leak.  Your tests
>>>>>>>> might
>>>>>>>> be opening sockets and not closing them, which would cause problems
>>>>>>>> once you hit the file descriptor limit, which is relatively low by
>>>>>>>> default on many platforms.  Try increasing the limit by running
>>>>>>>> "uname
>>>>>>>> -n 10000" before running your tests (if you're on linux and the OS
>>>>>>>> won't let you, try increasing the settings in
>>>>>>>> /etc/security/limits.conf).  If that works, then you're running out
>>>>>>>> of
>>>>>>>> file descriptors.  Once you've confirmed the problem, a better
>>>>>>>> long-term solution is to ensure that you're closing all the file
>>>>>>>> descriptors you're opening.  This includes sockets and regular
>>>>>>>> files,
>>>>>>>> as well as other file descriptors created by IOLoop and
>>>>>>>> AsyncHTTPClient (could you be forgetting to call super.tearDown() in
>>>>>>>> your tests?).

>>>>>>>> -Ben

>>>>>>>> On Thu, Oct 4, 2012 at 5:50 PM, Daniel Kador <d...@keen.io> wrote:
>>>>>>>> > Hi!

>>>>>>>> > I'm running into (what I perceive to be) a strange problem with my
>>>>>>>> > tests.
>>>>>>>> > I've got an application written on top of Tornado that
>>>>>>>> > communicates with
>>>>>>>> > Mongo through the Motor library.  It's humming away nicely in
>>>>>>>> > production
>>>>>>>> > (daemonized using supervisord).  The only problems (that I can
>>>>>>>> > detect) are
>>>>>>>> > with my tests.  I'm very happy to acknowledge that this could be a
>>>>>>>> > problem
>>>>>>>> > in my application layer, or in the Motor layer, or maybe even in
>>>>>>>> > MongoDB
>>>>>>>> > itself - mostly I'm looking for some guidance as I'm not really
>>>>>>>> > sure where
>>>>>>>> > to start digging in.

>>>>>>>> > But let me step back and describe what I've noticed in more
>>>>>>>> > detail.

>>>>>>>> > I have approximately ~150 tests that extend the
>>>>>>>> > tornado.testing.AsyncHTTPTestCase class, spread across a number of
>>>>>>>> > my own
>>>>>>>> > classes.  I have ~50 that are simple unit tests.  They both
>>>>>>>> > communicate to
>>>>>>>> > Mongo through Motor.  The unit tests are fast and never fail (I
>>>>>>>> > mention this
>>>>>>>> > only as a data point that may show Motor by itself may not be the
>>>>>>>> > culprit).
>>>>>>>> > The AsyncHTTPTestCase tests all pass when their various test
>>>>>>>> > classes are run
>>>>>>>> > individually.

>>>>>>>> > But when I run them all together, after about ~65 tests running
>>>>>>>> > and passing,
>>>>>>>> > I start getting timeouts in various spots.  Sometimes it's when
>>>>>>>> > interacting
>>>>>>>> > directly with Motor.  Sometimes it's when making an async HTTP
>>>>>>>> > request to
>>>>>>>> > one of the APIs I've created (which, under the covers, is talking
>>>>>>>> > to Mongo
>>>>>>>> > through Motor).  But it's always when calling (from a test)
>>>>>>>> > self.wait().

>>>>>>>> > It's getting to be a serious problem, as now I can't actually run
>>>>>>>> > all my
>>>>>>>> > tests locally (at one time).  Obviously not a recipe for
>>>>>>>> > test-driven
>>>>>>>> > development success.

>>>>>>>> > I'm relatively sure I've done something stupid, but I haven't
>>>>>>>> > figured out
>>>>>>>> > where yet.  Would really appreciate some help here.

>>>>>>>> > Thanks!

>>>>>>>> > -Dan


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.