Oh, I see. What's happening is that each time you create a debug-mode
source files. As you run more tests, those accumulate, and eventually
the server can hardly do anything else. Autoreload should definitely
On Fri, Oct 5, 2012 at 4:44 PM, Daniel Kador <d
...@keen.io> wrote:
> I appear to have found a fix. I was previously returning an Application
> instance in self.get_app() with the debug kwarg set to True (tests were
> inheriting the default value for running in dev mode) and now everything is
> quite speedy. More importantly, the tests actually all run! I'm satisfied,
> but if Ben or Jesse wants me to dig in more to figure out what, exactly,
> debug mode is doing, I'm happy to help out.
> One thing I noticed: I was able to reproduce the problem in a small test
> class with ~100 tests. The ONLY way I could repro, though, was if I had
> debug mode on AND I returned the singleton IOLoop from IOLoop.instance() in
> self.get_new_ioloop() AND each test made a simple HTTP request AND my
> application used my request handlers (if I tried with a barebones app it was
> all fine).
> So it appears as if there's some strange interaction between my handlers and
> debug mode?
> On Friday, October 5, 2012 3:21:19 PM UTC-7, Daniel Kador wrote:
>> The doc additions looks great, thanks for handling it so quickly.
>> db.serverStatus().cursors reports nothing interesting at all. The cursors
>> are definitely being closed (which is good).
>> The CPU is being used by Python. Mongo hardly has a blip.
>> I ran lsof -p PID | wc -l, where PID was the PID of my Python process.
>> The count stays steady around 55. Most of the results are related to the
>> .so files loaded by Python. There are about 10 related to sockets, which
>> seems pretty reasonable to me.
>> Mongo stays pretty consistent around 113. Again, that doesn't sound
>> crazy.
>> It definitely *feels* like the Python layer is leaking something,
>> somewhere. Is there an easy way to see what's registered on the ioloop? Or
>> just simply log what the ioloop is currently working on? I wonder if my code
>> is somehow not terminating in some cases? Or taking much longer than I
>> think?
>> On Friday, October 5, 2012 3:09:38 PM UTC-7, A. Jesse Jiryu Davis wrote:
>>> I've updated the docs about MotorCursor:
>>> http://emptysquare.net/motor/pymongo/api/motor/motor_cursor.html
>>> Let me know what you find out regarding your tests, Dan, or any other way
>>> I can help. I want anyone who uses Motor to succeed. =)
>>> On Friday, October 5, 2012 2:53:00 PM UTC-7, A. Jesse Jiryu Davis wrote:
>>>> No, I'll add this to the docs -- a MotorCursor's __del__ calls the
>>>> PyMongo Cursor's close(), which checks if the cursor needs to be closed on
>>>> the server and if so sends the closeCursors command. So if you've fully
>>>> iterated a cursor (with to_list, or next_object until it returns None) then
>>>> the server has already closed the cursor, otherwise the client cursor closes
>>>> itself on the server when it's dereferenced.
>>>> Motor's own test suite checks for leaked cursors after every test:
>>>> https://github.com/ajdavis/mongo-python-driver/blob/motor/test/motor/...
>>>> So you could do something similar in your tests, or manually run
>>>> db.serverStatus().cursors in the Mongo shell while the tests are going. If
>>>> it rises a lot, then we have a problem.
>>>> Is it Python or MongoDB that mostly uses CPU? I'm not surprised your CPU
>>>> is pegged, nothing *else* would be the bottleneck in your unittests.
>>>> Can you run lsof|grep PID|wc -l, where PID is the PID of your Python
>>>> process, and make sure that process itself isn't opening more than a dozen
>>>> or so connections? What about with mongod's PID?
>>>> On Friday, October 5, 2012 2:11:19 PM UTC-7, Daniel Kador wrote:
>>>>> Quick update - I believe I've ruled out file descriptors as the source
>>>>> of my problem. I wrote a small script that just repeatedly runs "lsof | wc
>>>>> -l" (which, as I understand it, should give me the number of open file
>>>>> descriptors on my system) and stores the results in a texfile. Before I
>>>>> launched any tests, my open file descriptor count hovered around 7,300. As
>>>>> I ran tests, the number fluctuated a bit, but never went above 7,405 and
>>>>> never went below 7,221. So it feels like I'm not blowing out the descriptor
>>>>> count anywhere.
>>>>> I also verified that the tearDown methods in AsyncTestCase and
>>>>> AsyncHTTPTestCase are being called on every test.
>>>>> Some more details (not sure if they're useful or relevant, but I'm
>>>>> hoping they are): when I run tests my CPU gets pegged. One of my cores goes
>>>>> to about 90% and hovers there for the whole run. Also, I'm overriding
>>>>> get_new_ioloop() to return the singleton instance.
>>>>> Jesse, question for you: do I need to explicitly close cursors opened
>>>>> by Motor? I'm pretty sure I don't need to do anything with the Connection
>>>>> object, but I couldn't find anything that told me my responsibility re: the
>>>>> cursor lifecycle.
>>>>> Will keep digging in.
>>>>> Thanks again!
>>>>> -Dan
>>>>> On Friday, October 5, 2012 11:38:59 AM UTC-7, Daniel Kador wrote:
>>>>>> Awesome, I was thinking along the same lines. I've seen this problem
>>>>>> on both my local development platform (Mac) and my CI box (Ubuntu). I've
>>>>>> looked into the file descriptor problem before and didn't come up with
>>>>>> anything definitive, but I'll dig in more. Both Jesse's and Ben's
>>>>>> suggestions sound like good places to start.
>>>>>> I'm using MotorReplicaSetConnection. I don't create a new one per
>>>>>> test, I have a single module-level connection reference. I've tried
>>>>>> explicitly setting up the connection and then tearing it down on every test,
>>>>>> which didn't make a noticeable difference (except to significantly slow down
>>>>>> my tests :P).
>>>>>> I'll dig in more and report back. I don't /believe/ I'm forgetting to
>>>>>> call super.tearDown(), but I'll double-check. If it's that simple I'll be
>>>>>> relieved (and slightly embarrassed).
>>>>>> Thanks so much!
>>>>>> -Dan
>>>>>> On Thursday, October 4, 2012 11:23:11 PM UTC-7, A. Jesse Jiryu Davis
>>>>>> wrote:
>>>>>>> Agreed. Are you using MotorConnection or MotorReplicaSetConnection?
>>>>>>> Do you create a new connection per test? Does it help to call close() on
>>>>>>> that connection in tearDown() rather than simply relying on it to be
>>>>>>> garbage-collected when it's dereferenced?
>>>>>>> On Thursday, October 4, 2012 8:53:31 PM UTC-7, Ben Darnell wrote:
>>>>>>>> This sounds like it might be a file descriptor leak. Your tests
>>>>>>>> might
>>>>>>>> be opening sockets and not closing them, which would cause problems
>>>>>>>> once you hit the file descriptor limit, which is relatively low by
>>>>>>>> default on many platforms. Try increasing the limit by running
>>>>>>>> "uname
>>>>>>>> -n 10000" before running your tests (if you're on linux and the OS
>>>>>>>> won't let you, try increasing the settings in
>>>>>>>> /etc/security/limits.conf). If that works, then you're running out
>>>>>>>> of
>>>>>>>> file descriptors. Once you've confirmed the problem, a better
>>>>>>>> long-term solution is to ensure that you're closing all the file
>>>>>>>> descriptors you're opening. This includes sockets and regular
>>>>>>>> files,
>>>>>>>> as well as other file descriptors created by IOLoop and
>>>>>>>> AsyncHTTPClient (could you be forgetting to call super.tearDown() in
>>>>>>>> your tests?).
>>>>>>>> -Ben
>>>>>>>> On Thu, Oct 4, 2012 at 5:50 PM, Daniel Kador <d...@keen.io> wrote:
>>>>>>>> > Hi!
>>>>>>>> > I'm running into (what I perceive to be) a strange problem with my
>>>>>>>> > tests.
>>>>>>>> > I've got an application written on top of Tornado that
>>>>>>>> > communicates with
>>>>>>>> > Mongo through the Motor library. It's humming away nicely in
>>>>>>>> > production
>>>>>>>> > (daemonized using supervisord). The only problems (that I can
>>>>>>>> > detect) are
>>>>>>>> > with my tests. I'm very happy to acknowledge that this could be a
>>>>>>>> > problem
>>>>>>>> > in my application layer, or in the Motor layer, or maybe even in
>>>>>>>> > MongoDB
>>>>>>>> > itself - mostly I'm looking for some guidance as I'm not really
>>>>>>>> > sure where
>>>>>>>> > to start digging in.
>>>>>>>> > But let me step back and describe what I've noticed in more
>>>>>>>> > detail.
>>>>>>>> > I have approximately ~150 tests that extend the
>>>>>>>> > tornado.testing.AsyncHTTPTestCase class, spread across a number of
>>>>>>>> > my own
>>>>>>>> > classes. I have ~50 that are simple unit tests. They both
>>>>>>>> > communicate to
>>>>>>>> > Mongo through Motor. The unit tests are fast and never fail (I
>>>>>>>> > mention this
>>>>>>>> > only as a data point that may show Motor by itself may not be the
>>>>>>>> > culprit).
>>>>>>>> > The AsyncHTTPTestCase tests all pass when their various test
>>>>>>>> > classes are run
>>>>>>>> > individually.
>>>>>>>> > But when I run them all together, after about ~65 tests running
>>>>>>>> > and passing,
>>>>>>>> > I start getting timeouts in various spots. Sometimes it's when
>>>>>>>> > interacting
>>>>>>>> > directly with Motor. Sometimes it's when making an async HTTP
>>>>>>>> > request to
>>>>>>>> > one of the APIs I've created (which, under the covers, is talking
>>>>>>>> > to Mongo
>>>>>>>> > through Motor). But it's always when calling (from a test)
>>>>>>>> > self.wait().
>>>>>>>> > It's getting to be a serious problem, as now I can't actually run
>>>>>>>> > all my
>>>>>>>> > tests locally (at one time). Obviously not a recipe for
>>>>>>>> > test-driven
>>>>>>>> > development success.
>>>>>>>> > I'm relatively sure I've done something stupid, but I haven't
>>>>>>>> > figured out
>>>>>>>> > where yet. Would really appreciate some help here.
>>>>>>>> > Thanks!
>>>>>>>> > -Dan