Hi
I'm using the Java driver (v2.6.3) to issue a query over a collection containing 101 documents. The query should return all 101 documents. I have set the batch size to 100 and iterated over the result set. I expect to get back 101 documents (with the driver transparently fetching the second batch of 1 document during my iteration), but I only get 100. In fact, the number of results returned is always equal to batch_size, e.g. batch_size = 10: documents returned = 10; batch_size = 101: documents returned = 101; etc.
I've used mongosniff to observe the wire communications, and it looks like the cursor id is always returned as 0, even though theoretically there should be "more" results on the server: 1 batch of 100 documents and 1 batch of 1 document:
query: { query: { $or: [ { body.events.headers.sequence: { $gte: 0, $lte: 4611686018427387903 } }, { _id.sequence: { $gte: 0, $lte: 4611686018427387903 } } ], headers.timestamp.as_float: { $gte: 0.0, $lte: 4.611686018427388e+18 }, _id.event_source_id: "503218fe-c5f3-4770-8b3d-eb0423a12e07" }, orderby: { _id.sequence: 1 } } ntoreturn: 100 ntoskip: 0
reply n:100 cursorId: 0
{ _id: { event_source_id: "503218fe-c5f3-4770-8b3d-eb0423a12e07", sequence: 0 }, headers: { id: "f7e9a738-43ac-4798-aa8f-ee3531e293c2", type: "rerum", version: 1, origin: { headers: { id: "24a54f19-64b2-4095-bd3b-d3d429fff0a7", type: "book_tee", version: 1 }, body: { course_id: "5554f87c-55d1-48c2-8991-139bc10c36d9", player_id: "ad12d075-c498-4bce-acb1-19e0381a7708", time: 1338380284.964 } }, dispatched: false, duration: null, timestamp: { as_float: 1338380042.966, as_rfc3339: "2012-05-30T12:14:02.966000+00:00" } }, body: { commands: [ { headers: { id: "775f26b3-42c8-4530-ba72-28db6a27256e", type: "check_for_slow_play", version: 1 }, body: { course_id: "5554f87c-55d1-48c2-8991-139bc10c36d9", player_id: "ad12d075-c498-4bce-acb1-19e0381a7708", time: 1338380284.964 } } ], events: [ { headers: { id: "d4e00e60-de46-4025-86bf-3f367b43a996", type: "tee_booked", version: 1, sequence: 0 }, body: { course_id: "5554f87c-55d1-48c2-8991-139bc10c36d9", player_id: "ad12d075-c498-4bce-acb1-19e0381a7708", time: 1338380284.964 } } ] } }
reply n:1 cursorId: 0
{ updatedExisting: true, n: 1, connectionId: 1185, waited: 30, wtime: 0, err: null, ok: 1.0 }
What is interesting is that it looks like the server is returning two batches (if I'm reading it right). There's a reply n:100 and a reply n:1 immediately after it. But since the Response class in the Java driver uses the value of the cursorId to determine whether or not there are more results, it seemingly ignores the second reply.
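To make the cursor id concern concrete, here is a minimal sketch in plain Java. It is not the driver's code and does not talk to a real server: FakeServer, Reply, drain and the cursor id value 42 are all made up for illustration. It only assumes that a client treats cursorId == 0 as "nothing more to fetch", which is how I read the Response class.

```java
import java.util.ArrayList;
import java.util.List;

public class CursorSketch {

    /** One reply: the documents in this batch plus the cursor id for follow-up requests. */
    static final class Reply {
        final List<Integer> docs;
        final long cursorId; // 0 means "no open cursor left on the server"

        Reply(List<Integer> docs, long cursorId) {
            this.docs = docs;
            this.cursorId = cursorId;
        }
    }

    /** Hypothetical stand-in for the server; documents are just the integers 0..total-1. */
    static final class FakeServer {
        private final int total;
        private final boolean keepCursorOpen;
        private int position = 0;

        FakeServer(int total, boolean keepCursorOpen) {
            this.total = total;
            this.keepCursorOpen = keepCursorOpen;
        }

        Reply nextBatch(int batchSize) {
            List<Integer> docs = new ArrayList<Integer>();
            while (position < total && docs.size() < batchSize) {
                docs.add(position++);
            }
            boolean more = position < total;
            // 42 is an arbitrary non-zero id used only in this sketch.
            long cursorId = (more && keepCursorOpen) ? 42L : 0L;
            return new Reply(docs, cursorId);
        }
    }

    /** Iterates the way a cursorId-driven client would and returns how many documents it saw. */
    public static int drain(int total, int batchSize, boolean keepCursorOpen) {
        FakeServer server = new FakeServer(total, keepCursorOpen);
        Reply reply = server.nextBatch(batchSize); // initial query
        int seen = reply.docs.size();
        while (reply.cursorId != 0) {              // follow-up "get more" requests
            reply = server.nextBatch(batchSize);
            seen += reply.docs.size();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(drain(101, 100, true));  // cursor kept open: prints 101
        System.out.println(drain(101, 100, false)); // cursorId 0 on first reply: prints 100
    }
}
```

With keepCursorOpen set to false the count always equals the batch size (10 gives 10, 101 gives 101), which matches what I am seeing. So the open question is why the first reply comes back with cursorId 0.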
More confusingly, there seem to be plenty of people out there in mongo land who are happily batch_sizing away without any apparent problems. Am I missing a crucial piece of information? The server is at v2.0.4, installed via Homebrew on OS X.
Cheers!
Lee