This is very hot: I ran into the same issue using a similar query, but
the no-timeout option didn't help. Only batchSize and the set of
returned fields had an impact on whether the iterator could go through
all documents or not.
I was able to copy the whole resultset into a brand new collection, so
I could play with it. I could reproduce the inconsistent query results
on this collection too! Then I dumped the collection and restored it
into a brand new sharded system where I could reproduce the same
phenomenon!
Then I executed:
db.shop199850.validate({full:true})
But everything seemed OK. Here you can check the output:
http://pastie.org/3196491
db.repairDatabase() didn't help either:
{
    "raw" : {
        "s1/localhost:20017" : {
            "ok" : 1
        },
        "s2/localhost:20018" : {
            "ok" : 1
        },
        "s3/localhost:20019" : {
            "ok" : 1
        }
    },
    "ok" : 1
}
Let me show you the console output of the queries.
1) two returned fields (_id, bokey) with batchSize 10000: OK
2) two returned fields (_id, sku) with batchSize 10000: OK
3) three returned fields (_id, bokey, sku) with batchSize 10000: NOT OK
4) three returned fields (_id, bokey, sku) with batchSize 1000: OK
mongos> var cursor = db.shop199850.find({shopId: 199850},{_id:1,bokey:1}).batchSize(10000);
mongos> var count=0;while(cursor.hasNext()){obj=cursor.next(); count++;}
89605
mongos> var cursor = db.shop199850.find({shopId: 199850},{_id:1,sku:1}).batchSize(10000);
mongos> var count=0;while(cursor.hasNext()){obj=cursor.next(); count++;}
89605
mongos> var cursor = db.shop199850.find({shopId: 199850},{_id:1,bokey:1,sku:1}).batchSize(10000);
mongos> var count=0;while(cursor.hasNext()){obj=cursor.next(); count++;}
25305
mongos> var cursor = db.shop199850.find({shopId: 199850},{_id:1,bokey:1,sku:1}).batchSize(1000);
mongos> var count=0;while(cursor.hasNext()){obj=cursor.next(); count++;}
89605
mongos>
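In case it helps anyone trying to reproduce this, the four manual runs
could be scripted roughly as follows. This is only a sketch: the helper
names countByIteration and runMatrix are made up for illustration, and
it assumes that db.shop199850.count(query) reports the true total,
which I have not verified independently on the sharded setup.

```javascript
// Count documents by iterating the cursor, exactly like the manual test.
// `coll` is any object whose find(query, fields) returns a cursor with
// batchSize(), hasNext() and next(), e.g. db.shop199850 in the mongo shell.
function countByIteration(coll, query, fields, batchSize) {
    var cursor = coll.find(query, fields).batchSize(batchSize);
    var count = 0;
    while (cursor.hasNext()) {
        cursor.next();
        count++;
    }
    return count;
}

// Try every projection / batchSize combination and flag those whose
// iterated count differs from the expected total.
function runMatrix(coll, query, projections, batchSizes, expected) {
    var results = [];
    for (var i = 0; i < projections.length; i++) {
        for (var j = 0; j < batchSizes.length; j++) {
            var n = countByIteration(coll, query, projections[i], batchSizes[j]);
            results.push({
                fields: projections[i],
                batchSize: batchSizes[j],
                iterated: n,
                ok: n === expected
            });
        }
    }
    return results;
}

// This part only runs inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
    var query = {shopId: 199850};
    // Assumption: count() is trusted as the reference total here.
    var expected = db.shop199850.count(query);
    var report = runMatrix(
        db.shop199850,
        query,
        [{_id: 1, bokey: 1}, {_id: 1, sku: 1}, {_id: 1, bokey: 1, sku: 1}],
        [10000, 1000],
        expected
    );
    for (var k = 0; k < report.length; k++) {
        printjson(report[k]);
    }
}
```

Any entry with ok set to false marks a combination where the iterator
stopped early, as in case 3 above.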
I also restored the same collection into an unsharded system and could
NOT reproduce the error, at least not with batchSize 10000 and the 3
returned fields above.
My sharded system consists of 3 shards, 1 config server, and 1
router. Each shard has only 1 node. I'm running MongoDB v2.0.1 on
64-bit Linux.
The collection is only 16 MB (compressed), so it would be easy to send
it to you; hopefully you could then reproduce and track down the issue.
Of course, our data are confidential. How can I send it to you
privately?