Cursor Timeouts

Brian McManus

Jan 26, 2012, 9:44:22 AM
to Mongoid
Hi all. I've got a scenario where I need to loop through all the records
in a collection and make updates to them. While iterating over the
collection, I eventually get the following error:

Query response returned CURSOR_NOT_FOUND. Either an invalid cursor was
specified, or the cursor may have timed out on the server.

I searched here and found another thread but it didn't really seem to
have an answer.

I thought I had read somewhere (from Durran, I believe) that Mongoid
automatically handles refreshing the cursor while iterating over a
collection, so worrying about timeouts shouldn't be necessary.
Watching the job run, I do occasionally see log entries from Mongoid
saying it is doing just that. In fact, the line right before the
timeout in the logs shows Mongoid attempting exactly this refresh:

MONGODB [DEBUG] cursor.refresh() for cursor 716743837813245228

I'm somewhat at a loss at this point. Any ideas?

Additional information: the job is already backgrounded and run by
Resque. I'd strongly prefer not to create a job for each doc in the
collection (I don't think that would really help anyway, since I'd
still have to iterate through them all just to enqueue the jobs). Each
iteration through the loop takes only a couple of ms, so it isn't
getting stuck or blocking long enough to cause a timeout on its own.

Daniel Doubrovkine

Feb 7, 2012, 10:55:00 PM
to mon...@googlegroups.com
I think the problem is that Mongoid doesn't get a chance to refresh the cursor: you fetch a large batch of documents to process, and by the time you come back to ask for more, the cursor has already timed out on the server.

We hit the same problem all the time, so we added a little each_by extension (below). If anyone has a better suggestion, I would love to hear it. Maybe something that could be added to Mongoid?
module Mongoid
  class Criteria
    # Iterates the criteria in batches of +by+ documents. Each batch is a
    # fresh limit/skip query, so no single server-side cursor has to stay
    # alive for the whole run.
    def each_by(by, &block)
      idx = 0
      total = 0
      set_limit = options[:limit]
      while ((results = clone.limit(by).skip(idx)) && results.any?)
        results.each do |result|
          # Honor a limit already set on the criteria, if any.
          return self if set_limit && total >= set_limit
          total += 1
          yield result
        end
        idx += by
      end
      self
    end
  end
end
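
Usage looks like this (Account standing in for whatever model you're iterating); each batch is a brand-new limit/skip query, so no single server cursor has to survive the whole run:

Account.all.each_by(1000) do |account|
  account.processed_at = Time.now
  account.save
end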

Durran Jordan

Feb 9, 2012, 4:34:46 AM
to mon...@googlegroups.com
The cursor refresh is not handled by Mongoid, but by the driver itself. There are a couple of things that may be happening here, which I'll lay out, but I can't be sure which is the problem you are having...

1. Are you using replica sets and reading from a secondary? Cursors exist on a per-server basis, and when the driver asks to get more documents it needs to hit the same secondary node it hit before. The 1.4.x drivers had problems with this, since they were asking for cursors on different nodes than the original query, so I'd make sure you're on 1.5.x.

2. Cursors on the server time out after 10 minutes if not exhausted completely. Is this process taking over 10? For batch jobs of this size, I'd recommend dropping down to the driver directly if possible, passing timeout: false to find, and doing your magic inside that block (rough sketch after this list).

3. With Mongoid 3 and Moped you won't see these types of issues; Moped handles these cases under the covers, so the retry, refresh, and timeout options won't be necessary. But I'd go with option 2 until then.
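
Option 2 would look roughly like this; just a sketch, assuming Mongoid 2.x (where Mongoid.master returns the driver's Mongo::DB) and placeholder collection/field names:

# Go straight to the 1.5.x driver; :timeout => false requires the
# block form of find, and the driver closes the cursor for you.
accounts = Mongoid.master.collection('accounts')
accounts.find({}, :timeout => false) do |cursor|
  cursor.each do |doc|
    # doc is a raw BSON hash here, not a Mongoid document
    accounts.update({ '_id' => doc['_id'] },
                    { '$set' => { 'processed_at' => Time.now.utc } })
  end
end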

2012/1/26 Brian McManus <bdm...@gmail.com>

fbjork

Feb 25, 2012, 6:48:40 PM
to Mongoid
I have the same issue when doing batch queries. I read from the
primary, and the queries don't take longer than 10 minutes to complete.
This is with the latest Mongoid and mongo driver 1.5.1. What could be
going on here?

I have a huge collection with a sparse index that I query at set
intervals. At random, the queries fail with the CURSOR_NOT_FOUND
exception. The collection changes frequently, which causes documents
to grow; could that be causing the cursor to fail?
