java driver query with batch size never returns > batch_size results


Lee Henson

May 30, 2012, 8:21:22 AM
to mongod...@googlegroups.com
Hi

I'm using the Java driver (v2.6.3) to issue a query over a collection containing 101 documents. The query should return all 101 documents. I have set the batch size to 100 and have iterated over the result set. I expect to get back 101 documents (with the driver transparently fetching the second batch of 1 document during my iteration), but I only get 100. In fact, the number of results returned is always equal to batch_size, e.g. batch_size = 10 : documents returned = 10, batch_size = 101 : documents returned = 101, etc.

I've used mongosniff to observe the wire communications, and it looks like the cursor id is always returned as 0, even though theoretically there should be "more" results on the server: 1 batch of 100 documents and 1 batch of 1 document:

127.0.0.1:57768  -->> 127.0.0.1:27017 euston-event-store-specs.commits  373 bytes  id:1b1 433

query: { query: { $or: [ { body.events.headers.sequence: { $gte: 0, $lte: 4611686018427387903 } }, { _id.sequence: { $gte: 0, $lte: 4611686018427387903 } } ], headers.timestamp.as_float: { $gte: 0.0, $lte: 4.611686018427388e+18 }, _id.event_source_id: "503218fe-c5f3-4770-8b3d-eb0423a12e07" }, orderby: { _id.sequence: 1 } }  ntoreturn: 100 ntoskip: 0

127.0.0.1:27017  <<--  127.0.0.1:57768   106145 bytes  id:caf7  51959 - 433

reply n:100 cursorId: 0
{ _id: { event_source_id: "503218fe-c5f3-4770-8b3d-eb0423a12e07", sequence: 0 }, headers: { id: "f7e9a738-43ac-4798-aa8f-ee3531e293c2", type: "rerum", version: 1, origin: { headers: { id: "24a54f19-64b2-4095-bd3b-d3d429fff0a7", type: "book_tee", version: 1 }, body: { course_id: "5554f87c-55d1-48c2-8991-139bc10c36d9", player_id: "ad12d075-c498-4bce-acb1-19e0381a7708", time: 1338380284.964 } }, dispatched: false, duration: null, timestamp: { as_float: 1338380042.966, as_rfc3339: "2012-05-30T12:14:02.966000+00:00" } }, body: { commands: [ { headers: { id: "775f26b3-42c8-4530-ba72-28db6a27256e", type: "check_for_slow_play", version: 1 }, body: { course_id: "5554f87c-55d1-48c2-8991-139bc10c36d9", player_id: "ad12d075-c498-4bce-acb1-19e0381a7708", time: 1338380284.964 } } ], events: [ { headers: { id: "d4e00e60-de46-4025-86bf-3f367b43a996", type: "tee_booked", version: 1, sequence: 0 }, body: { course_id: "5554f87c-55d1-48c2-8991-139bc10c36d9", player_id: "ad12d075-c498-4bce-acb1-19e0381a7708", time: 1338380284.964 } } ] } }

127.0.0.1:27017  <<--  127.0.0.1:57759   124 bytes  id:caf8 51960 - 430

reply n:1 cursorId: 0
{ updatedExisting: true, n: 1, connectionId: 1185, waited: 30, wtime: 0, err: null, ok: 1.0 }

What is interesting is that it looks like the server is returning two batches (if I'm reading it right). There's a reply n:100 and a reply n:1 immediately after it. But since the Response class in the Java driver uses the value of the cursorId to determine whether or not there are more results, it seemingly ignores the second reply.
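For readers following along, the client-side behaviour described here can be sketched roughly as below. This is a minimal simulation in plain JavaScript, not the Java driver's actual code: `makeFakeServer`, `query`, `getMore` and `fetchAll` are hypothetical names, and the fake server only models the convention that a reply with cursorId 0 means there is nothing more to fetch.

```javascript
// Hypothetical in-memory stand-in for a server: hands out documents in
// batches and reports cursorId 0 once nothing is left to fetch.
function makeFakeServer(docs) {
  const cursors = new Map();
  let nextId = 1;

  // Build one reply: a batch plus a non-zero cursorId only if documents remain.
  function reply(remaining, batchSize) {
    const batch = remaining.slice(0, batchSize);
    const rest = remaining.slice(batchSize);
    let cursorId = 0;
    if (rest.length > 0) {
      cursorId = nextId++;
      cursors.set(cursorId, rest);
    }
    return { cursorId, docs: batch };
  }

  return {
    query(batchSize) {
      return reply(docs, batchSize);
    },
    getMore(cursorId, batchSize) {
      const remaining = cursors.get(cursorId) || [];
      cursors.delete(cursorId);
      return reply(remaining, batchSize);
    },
  };
}

// Client loop: keep issuing getMore until a reply carries cursorId 0.
function fetchAll(server, batchSize) {
  const results = [];
  let response = server.query(batchSize);
  results.push(...response.docs);
  while (response.cursorId !== 0) {
    response = server.getMore(response.cursorId, batchSize);
    results.push(...response.docs);
  }
  return results;
}

const docs = Array.from({ length: 101 }, (_, i) => ({ seq: i }));
const all = fetchAll(makeFakeServer(docs), 100);
console.log(all.length); // 101: one batch of 100, then one batch of 1
```

If the first reply came back with cursorId 0, as in the trace above, this loop would stop after the first 100 documents, which matches the symptom being reported.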

More confusingly, there seem to be plenty of people out there in mongo land who are happily batch_sizing away without any apparent problems. Am I missing a crucial piece of information? The server is at v2.0.4 installed via Homebrew on OSX.

Cheers!
Lee

Scott Hernandez

May 30, 2012, 9:03:52 AM
to mongod...@googlegroups.com
Can you please post your java code? It sounds like there is something
wrong with how you are setting the batch/limit.

Please post to gist/pastie/etc.
> --
> You received this message because you are subscribed to the Google
> Groups "mongodb-user" group.
> To post to this group, send email to mongod...@googlegroups.com
> To unsubscribe from this group, send email to
> mongodb-user...@googlegroups.com
> See also the IRC channel -- freenode.net#mongodb

Lee Henson

May 30, 2012, 9:26:42 AM
to mongod...@googlegroups.com
I was afraid someone would ask that. I'm actually doing all this via jmongo, a ruby wrapper around the java driver for use with jruby. So there's an extra layer of abstraction in play here. I've done my best to pull together the appropriate bits of code:

Scott Hernandez

May 30, 2012, 9:44:00 AM
to mongod...@googlegroups.com
Okay, can you post the output of DBCursor.toString() just before you
get results? That should include all the options and will shed some
light on it.

Is it possible you are passing in a negative batchSize?
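Some background on why a negative batchSize would matter: in the legacy OP_QUERY message, a negative numberToReturn asks the server to return at most that many documents and then close the cursor, so no getMore can follow. A numberToReturn of 1 is treated the same way, and 0 lets the server pick its default. A rough sketch of that rule, where `describeNumberToReturn` and its field names are hypothetical and exist only for illustration:

```javascript
// Hypothetical helper: interprets the numberToReturn field of a legacy
// OP_QUERY message according to the rule described above.
function describeNumberToReturn(n) {
  if (n === 0) {
    // 0 means "use the server's default size for the first batch".
    return { maxDocsInReply: null, closesCursor: false };
  }
  if (n < 0 || n === 1) {
    // Negative (or exactly 1): return up to |n| documents, then close the cursor.
    return { maxDocsInReply: Math.abs(n), closesCursor: true };
  }
  // Positive: a batch size; the cursor may stay open for getMore.
  return { maxDocsInReply: n, closesCursor: false };
}

console.log(describeNumberToReturn(100));  // cursor may stay open
console.log(describeNumberToReturn(-100)); // cursor closed after the first reply
```

The mongosniff trace earlier in the thread shows ntoreturn: 100, a positive value, so this rule alone would not account for a cursorId of 0.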

Lee Henson

May 30, 2012, 9:58:17 AM
to mongod...@googlegroups.com
After obtaining the jmongo cursor, but before it is iterated over:

Cursor id=0, ns=euston-event-store-specs.commits, query={ \"$or\" : [ { \"body.events.headers.sequence\" : { \"$gte\" : 0 , \"$lte\" : 4611686018427387903}} , { \"_id.sequence\" : { \"$gte\" : 0 , \"$lte\" : 4611686018427387903}}] , \"headers.timestamp.as_float\" : { \"$gte\" : 0.0 , \"$lte\" : 4.6116860184273879E18} , \"_id.event_source_id\" : \"23a929ff-88d0-42d0-8110-ee6017ddb098\"}, numIterated=0, batchSize=100



Scott Hernandez

May 30, 2012, 10:50:07 AM
to mongod...@googlegroups.com
This looks fine. And once you start iterating, you see that the
cursor doesn't have a cursorId?

Lee Henson

May 30, 2012, 10:54:59 AM
to mongod...@googlegroups.com
That's correct. I've printed out the cursorid from inside the Response object and I can see that it matches the mongosniff output above, i.e. that it's always 0. 



Lee Henson

Jun 5, 2012, 11:22:28 AM
to mongod...@googlegroups.com
Scott, some more info on this one:

It would appear that the critical factor is the orderby option. If I don't specify an orderby, I get a cursor id > 0 and the getmore request is sent as expected. If I *do* specify an orderby, the cursorid is always 0 and I never get more results than the batch size. I've tested this with the latest 2.8.0 rc1 java driver vs mongodb 2.0.5 on osx installed via homebrew.

I have a test case running in the jmongo repo on github:


I'm running JRuby 1.6.7.2 under 1.9 mode, and executing the test via:

jruby -I"lib" -w -I"/Users/leemhenson/.rvm/gems/jruby-1.6.7.2@global/gems/rake-0.9.2.2/lib" "/Users/leemhenson/.rvm/gems/jruby-1.6.7.2@global/gems/rake-0.9.2.2/lib/rake/rake_test_loader.rb" "test/cursor_batch_size_test.rb"

The test produces the following mongosniff output for a query *without* an orderby:

192.168.0.2:54044  -->> 192.168.0.2:27017 ruby-test-db.test  51 bytes  id:69 105
query: {}  ntoreturn: 100 ntoskip: 0
192.168.0.2:27017  <<--  192.168.0.2:54044   13937 bytes  id:2645 9797 - 105
reply n:100 cursorId: 4471632141715486575
{ _id: ObjectId('4fce1fab9772d3f8b02bece9'), A1338908587459: 1338908587.46, A133890858746: 1338908587.461, A1338908587461: 1338908587.462, A1338908587462: 1338908587.462, xyz: 1, abc: 539 }
192.168.0.2:54044  -->> 192.168.0.2:27017 ruby-test-db.test  50 bytes  id:6a 106
getMore nToReturn: 100 cursorId: 4471632141715486575
192.168.0.2:27017  <<--  192.168.0.2:54044   228 bytes  id:2646 9798 - 106
reply n:1 cursorId: 0
{ _id: ObjectId('4fce1fab9772d3f8b02bed24'), A1338908587774: 1338908587.775, A1338908587783: 1338908587.784, A1338908587787: 1338908587.79, A1338908587795: 1338908587.796, A1338908587796: 1338908587.796, A1338908587797: 1338908587.798, xyz: 1, abc: 502 }

And the test produces the following mongosniff output for a query *with* an orderby:

192.168.0.2:54038  -->> 192.168.0.2:27017 ruby-test-db.test  90 bytes  id:69 105
query: { query: {}, orderby: { abc: 1 } }  ntoreturn: 100 ntoskip: 0
192.168.0.2:27017  <<--  192.168.0.2:54038   13968 bytes  id:2642 9794 - 105
reply n:100 cursorId: 0
{ _id: ObjectId('4fce1f88977245776b248887'), A1338908552634: 1338908552.635, A1338908552635: 1338908552.636, A1338908552636: 1338908552.637, A1338908552637: 1338908552.637, xyz: 1, abc: 9 }
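To compare traces like the two above without reading them by eye, the reply lines can be picked apart with a small script. This is only a sketch: `parseReply` and `isSuspicious` are hypothetical helpers, and the regular expression assumes the reply line format exactly as mongosniff printed it in this thread.

```javascript
// Extract n and cursorId from a mongosniff "reply" line as printed above.
// cursorId stays a string: values like 4471632141715486575 are larger than
// Number.MAX_SAFE_INTEGER and would lose precision as a JavaScript number.
function parseReply(line) {
  const match = /^reply n:(\d+) cursorId: (\d+)$/.exec(line.trim());
  if (!match) return null;
  return { n: Number(match[1]), cursorId: match[2] };
}

// True when a reply filled the requested batch yet left no cursor open,
// which is the pattern seen in the orderby trace.
function isSuspicious(reply, ntoreturn) {
  return reply.n === ntoreturn && reply.cursorId === "0";
}

const withoutOrderby = parseReply("reply n:100 cursorId: 4471632141715486575");
const withOrderby = parseReply("reply n:100 cursorId: 0");

console.log(isSuspicious(withoutOrderby, 100)); // false
console.log(isSuspicious(withOrderby, 100));    // true
```

A full batch with cursorId 0 is also what a collection with exactly 100 matching documents would produce, so the check is a hint rather than proof; here the collection is known to hold 101.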

Is there any further information I can get for you?

Scott Hernandez

Jun 5, 2012, 11:31:04 AM
to mongod...@googlegroups.com
Can you test this using the mongo javascript shell also?

Lee Henson

Jun 5, 2012, 12:13:44 PM
to mongod...@googlegroups.com
Yes, and it looks like the shell behaves the same way:

MongoDB shell version: 2.0.5
connecting to: test
> use ruby-test-db
switched to db ruby-test-db
> db.test.count()
101

With no orderby:

> var cursor = db.test.find().batchSize(100);
> var count = 0;
> while(cursor.hasNext()) { cursor.next(); count++; }
100
> count
101
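A side note on reading this transcript: the 100 echoed after the while loop is not a document count. The shell prints the completion value of the last statement it ran, and `count++` evaluates to the value before the increment, so the echo is always one less than the final count. A sketch in plain JavaScript that reproduces this with `eval` (no MongoDB involved; the loop bound of 101 merely stands in for the cursor):

```javascript
// eval returns the completion value of the last statement it executed,
// much like an interactive shell echoing a result.
const loop = "var count = 0; var i = 0; while (i < 101) { i++; count++; }";

const echoed = eval(loop);                // value of the final count++ expression
const finalCount = eval(loop + " count"); // value of count after the loop

console.log(echoed);     // 100
console.log(finalCount); // 101
```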

Mongosniff:

127.0.0.1:54977  -->> 127.0.0.1:27017 ruby-test-db.test  51 bytes  id:a 10
query: {}  ntoreturn: 100 ntoskip: 0
127.0.0.1:27017  <<--  127.0.0.1:54977   13869 bytes  id:2667 9831 - 10
reply n:100 cursorId: 8754037985523367853
{ _id: ObjectId('4fce22c3977225f8cb3b5ad1'), A1338909379446: 1338909379.447, A1338909379447: 1338909379.448, A1338909379448: 1338909379.449, A1338909379449: 1338909379.449, xyz: 1, abc: 330 }
127.0.0.1:54977  -->> 127.0.0.1:27017 ruby-test-db.test  50 bytes  id:b 11
getMore nToReturn: 100 cursorId: 8754037985523367853
127.0.0.1:27017  <<--  127.0.0.1:54977   227 bytes  id:2668 9832 - 11
reply n:1 cursorId: 0
{ _id: ObjectId('4fce22c3977225f8cb3b5aea'), A1338909379578: 1338909379.584, A1338909379584: 1338909379.585, A1338909379585: 1338909379.588, A1338909379588: 1338909379.589, A1338909379589: 1338909379.59, A133890937959: 1338909379.59, xyz: 1, abc: 30 }

With an orderby:

> var cursor = db.test.find().sort({ abc: 1 }).batchSize(100);
> var count = 0;
> while(cursor.hasNext()) { cursor.next(); count++; }
99
> count
100

Mongosniff:

query: { query: {}, orderby: { abc: 1.0 } }  ntoreturn: 100 ntoskip: 0
127.0.0.1:27017  <<--  127.0.0.1:54977   13916 bytes  id:2678 9848 - 27
reply n:100 cursorId: 0
{ _id: ObjectId('4fce22c3977225f8cb3b5acf'), A1338909379437: 1338909379.438, A1338909379438: 1338909379.439, A1338909379439: 1338909379.44, A133890937944: 1338909379.44, xyz: 1, abc: 2 }


Lee Henson

Jun 6, 2012, 8:28:35 AM
to mongod...@googlegroups.com
Hi Scott

Do you want me to add an issue to the mongo jira for this? 

Scott Hernandez

Jun 6, 2012, 8:34:17 AM
to mongod...@googlegroups.com
Yes, please.

Lee Henson

Jun 6, 2012, 9:01:54 AM
to mongod...@googlegroups.com
Done:

