Cursor Bug in MongoDB?

181 views
Skip to first unread message

Nabor

unread,
Oct 11, 2013, 4:28:01 AM10/11/13
to mongod...@googlegroups.com
May be I found a cursor bug in MongoDB.

I use the newest java-7-openjdk-amd64 on a Ubuntu 13.04 machine.
MongoDB has version 2.4.6
In addition I use spring-data-mongodb 1.3.1 with mongo-java-driver 2.10.1

My Query is { "EMPID" : { "$ne" : "" }}.
There are 1929 records in my Database.

The code that handles this query is

        final MongoConverter converter = mongoTemplate.getConverter();
        final DBCollection coll = mongoTemplate.getCollection(typeRepository.getCollectionName(type));
        final DBCursor cursor = coll.find(parseQuery(handler.query()), prepareFields(handler.fields()));
        try {
            handler.start();
            handler.setCount(cursor.count());
            cursor.skip(handler.skip());
            cursor.limit(handler.limit());
            if (handler.sort() != null) {
                cursor.sort(buildSort(handler.sort()));
            }
            int count = 0;
            while (cursor.hasNext()) {
                final DBObject obj = cursor.next();
                if (!handler.handle(converter.read(GenericObject.class, obj))) {
                    log.info("stop further processing of generic objects cause handler returns false");
                    break;
                }
                count++;
                log.debug("Wrote {} of {} records to handler.", count, cursor.count());
            }
            log.debug("Explain MongoDB query: {}", cursor.explain());
        } catch (final MongoException e) {
            throw new ServiceException(400, e.getMessage());
        } finally {
            cursor.close();
            handler.stop();
        }

When the bug occurs, the cursor has 11 Entries, but writes 25 to the handler.

2013-10-11 10:07:46,065 DEBUG d.p.f.s.mongodb.ObjectRepository - Wrote 25 of 11 records to handler.
2013-10-11 10:07:46,103 DEBUG d.p.f.s.mongodb.ObjectRepository - Explain MongoDB query: { "cursor" : "BtreeCursor EMPID_1 multi" , "isMultiKey" : false , "n" : 25 , "nscannedObjects" : 11 , "nscanned" : 12 , "nscannedObjectsAllPlans" : 49 , "nscannedAllPlans" : 51 , "scanAndOrder" : true , "indexOnly" : false , "nYields" : 0 , "nChunkSkips" : 0 , "millis" : 1 , "indexBounds" : { "EMPID" : [ [ { "$minElement" : 1} , ""] , [ "" , { "$maxElement" : 1}]]} , "allPlans" : [ { "cursor" : "BtreeCursor EMPID_1 multi" , "n" : 11 , "nscannedObjects" : 11 , "nscanned" : 12 , "indexBounds" : { "EMPID" : [ [ { "$minElement" : 1} , ""] , [ "" , { "$maxElement" : 1}]]}} , { "cursor" : "BtreeCursor EMPID_-1 multi" , "n" : 12 , "nscannedObjects" : 12 , "nscanned" : 13 , "indexBounds" : { "EMPID" : [ [ { "$maxElement" : 1} , ""] , [ "" , { "$minElement" : 1}]]}} , { "cursor" : "BtreeCursor EMPID_1_LASTNAME_1_FIRSTNAME_1_modified_1 multi" , "n" : 13 , "nscannedObjects" : 13 , "nscanned" : 13 , "indexBounds" : { "EMPID" : [ [ { "$minElement" : 1} , ""] , [ "" , { "$maxElement" : 1}]] , "LASTNAME" : [ [ { "$minElement" : 1} , { "$maxElement" : 1}]] , "FIRSTNAME" : [ [ { "$minElement" : 1} , { "$maxElement" : 1}]] , "modified" : [ [ { "$minElement" : 1} , { "$maxElement" : 1}]]}} , { "cursor" : "BasicCursor" , "n" : 13 , "nscannedObjects" : 13 , "nscanned" : 13 , "indexBounds" : { }}] , "oldPlan" : { "cursor" : "BtreeCursor EMPID_1 multi" , "indexBounds" : { "EMPID" : [ [ { "$minElement" : 1} , ""] , [ "" , { "$maxElement" : 1}]]}} , "server" : "vlmucext07:27017"}
2013-10-11 10:07:46,103 WARN d.p.feeder.web.ObjectController - Something goes really wrong: Count: 11, Entries: 25
    Retry!

So the cursor delivers more entries, then itself counts...
It fills to the limit of 25 that we use for pagination.

My Workaround is, to check for this problem, and send the query a second time immediately.
Now the Last log is:

2013-10-11 10:07:46,538 DEBUG d.p.f.s.mongodb.ObjectRepository - Wrote 11 of 11 records to handler.
2013-10-11 10:07:46,577 DEBUG d.p.f.s.mongodb.ObjectRepository - Explain MongoDB query: { "cursor" : "BtreeCursor EMPID_1 multi" , "isMultiKey" : false , "n" : 25 , "nscannedObjects" : 11 , "nscanned" : 12 , "nscannedObjectsAllPlans" : 49 , "nscannedAllPlans" : 51 , "scanAndOrder" : true , "indexOnly" : false , "nYields" : 0 , "nChunkSkips" : 0 , "millis" : 1 , "indexBounds" : { "EMPID" : [ [ { "$minElement" : 1} , ""] , [ "" , { "$maxElement" : 1}]]} , "allPlans" : [ { "cursor" : "BtreeCursor EMPID_1 multi" , "n" : 11 , "nscannedObjects" : 11 , "nscanned" : 12 , "indexBounds" : { "EMPID" : [ [ { "$minElement" : 1} , ""] , [ "" , { "$maxElement" : 1}]]}} , { "cursor" : "BtreeCursor EMPID_-1 multi" , "n" : 12 , "nscannedObjects" : 12 , "nscanned" : 13 , "indexBounds" : { "EMPID" : [ [ { "$maxElement" : 1} , ""] , [ "" , { "$minElement" : 1}]]}} , { "cursor" : "BtreeCursor EMPID_1_LASTNAME_1_FIRSTNAME_1_modified_1 multi" , "n" : 13 , "nscannedObjects" : 13 , "nscanned" : 13 , "indexBounds" : { "EMPID" : [ [ { "$minElement" : 1} , ""] , [ "" , { "$maxElement" : 1}]] , "LASTNAME" : [ [ { "$minElement" : 1} , { "$maxElement" : 1}]] , "FIRSTNAME" : [ [ { "$minElement" : 1} , { "$maxElement" : 1}]] , "modified" : [ [ { "$minElement" : 1} , { "$maxElement" : 1}]]}} , { "cursor" : "BasicCursor" , "n" : 13 , "nscannedObjects" : 13 , "nscanned" : 13 , "indexBounds" : { }}] , "oldPlan" : { "cursor" : "BtreeCursor EMPID_1 multi" , "indexBounds" : { "EMPID" : [ [ { "$minElement" : 1} , ""] , [ "" , { "$maxElement" : 1}]]}} , "server" : "vlmucext07:27017"}

As far as I see, the explains are equal.
The Problem happens from time to time, but I can reproduce it after MongoDB restart and Jetty Restart.

Is there a bug or is it my fault?

Regards

Asya Kamsky

unread,
Oct 11, 2013, 9:50:38 PM10/11/13
to mongodb-user
Could you run the same query with explain() and also with
explain(true) from the mongo shell and post the output please?

Asya
> --
> --
> You received this message because you are subscribed to the Google
> Groups "mongodb-user" group.
> To post to this group, send email to mongod...@googlegroups.com
> To unsubscribe from this group, send email to
> mongodb-user...@googlegroups.com
> See also the IRC channel -- freenode.net#mongodb
>
> ---
> You received this message because you are subscribed to the Google Groups
> "mongodb-user" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to mongodb-user...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.

Asya Kamsky

unread,
Oct 11, 2013, 9:53:49 PM10/11/13
to mongodb-user
By the way, you seem to have multiple unnecessary indexes in your
database. Both EMPID:-1 and EMPID:1 indexes are unnecessary since
EMPID is the prefix on the compound index on EMPID, LASTNAME,
FIRSTNAME and that can be effectively used for any query or sort on
EMPID.

Asya

Nabor

unread,
Oct 12, 2013, 5:31:38 AM10/12/13
to mongod...@googlegroups.com
Thank you for this hint, but currently we don't care about this unecessary indexes ;)

Rob Moore

unread,
Oct 14, 2013, 11:03:45 AM10/14/13
to mongod...@googlegroups.com

Nabor -

I am not sure what is causing the cursor to return more documents but I would start by not mixing the count()/explain() and hasNext()/next() calls on the cursor.  The driver's cursor implementation has to track if it is in "construction" mode or "fetch" mode and I fear mixing count()/explain() (construction) and hasNext()/next() (fetch) is breaking your usage.  Can you try something like the following and see if it helps?

        final MongoConverter converter = mongoTemplate.getConverter();
        final DBCollection coll = mongoTemplate.getCollection(typeRepository.getCollectionName(type));
        final DBCursor cursor = coll.find(parseQuery(handler.query()), prepareFields(handler.fields()));
        try {
            handler.start();

            int total = cursor.count();

            handler.setCount(total);

            cursor.skip(handler.skip());
            cursor.limit(handler.limit());
            if (handler.sort() != null) {
                cursor.sort(buildSort(handler.sort()));
            }
            log.debug("Explain MongoDB query: {}", cursor.explain());
            int count = 0;
            while (cursor.hasNext()) {
                final DBObject obj = cursor.next();
                if (!handler.handle(converter.read(GenericObject.class, obj))) {
                    log.info("stop further processing of generic objects cause handler returns false");
                    break;
                }
                count++;
                log.debug("Wrote {} of {} records to handler.", count, total);

Nabor

unread,
Oct 14, 2013, 4:51:22 PM10/14/13
to mongod...@googlegroups.com
Thank you. It is an interesting idea.
I hadn't thought in that way. For me count() was only a function that delivers something and not something that causes a big functionality in the background.
I'll try that.
In addition I'll try the statement and the explain within the mongo console as you suggested but so fare I hadn't time doing it.

Nabor

unread,
Oct 17, 2013, 5:54:31 AM10/17/13
to mongod...@googlegroups.com
I found something, but not sure why this happens.
As long as using the query without sort, everything works fine:

db.GO_SF_USER.find({ "EMPID" : { "$ne" : ""}},{ "type" : 1 , "modified" : 1 , "EMPID" : 1 , "LASTNAME" : 1 , "FIRSTNAME" : 1}).explain()
{

    "cursor" : "BtreeCursor EMPID_1 multi",
    "isMultiKey" : false,
    "n" : 11,
    "nscannedObjects" : 11,
    "nscanned" : 11,
    "nscannedObjectsAllPlans" : 11,
    "nscannedAllPlans" : 11,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 0,

    "indexBounds" : {
        "EMPID" : [
            [
                {
                    "$minElement" : 1
                },
                ""
            ],
            [
                "",
                {
                    "$maxElement" : 1
                }
            ]
        ]
    },
    "server" : "vlmucext07:27017"
}

now with sort:

db.GO_SF_USER.find({ "EMPID" : { "$ne" : ""}},{ "type" : 1 , "modified" : 1 , "EMPID" : 1 , "LASTNAME" : 1 , "FIRSTNAME" : 1}).sort({ "EMPID" : 1 , "LASTNAME" : 1 , "FIRSTNAME" : 1 , "modified" : 1}).explain()
{

    "cursor" : "BtreeCursor EMPID_1_LASTNAME_1_FIRSTNAME_1_modified_1 multi",
    "isMultiKey" : false,
    "n" : 1930,
    "nscannedObjects" : 1930,
    "nscanned" : 1930,
    "nscannedObjectsAllPlans" : 1930,
    "nscannedAllPlans" : 1930,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 9,

    "indexBounds" : {
        "EMPID" : [
            [
                {
                    "$minElement" : 1
                },
                ""
            ],
            [
                "",
                {
                    "$maxElement" : 1
                }
            ]
        ],
        "LASTNAME" : [
            [
                {
                    "$minElement" : 1
                },
                {
                    "$maxElement" : 1
                }
            ]
        ],
        "FIRSTNAME" : [
            [
                {
                    "$minElement" : 1
                },
                {
                    "$maxElement" : 1
                }
            ]
        ],
        "modified" : [
            [
                {
                    "$minElement" : 1
                },
                {
                    "$maxElement" : 1
                }
            ]
        ]
    },
    "server" : "vlmucext07:27017"
}

Without sort and explain the first example delivers my 11 entries.
With sort and without explain, the second delivers all records

Any Ideas?

Nabor

unread,
Oct 17, 2013, 6:07:04 AM10/17/13
to mongod...@googlegroups.com
The Query is insufficient, but I think, it should not behave this way.
If we ask for { EMPID : { $ne : "" }} without sort, it uses the sparse index, that only contains records with an EMPID.
It we ask the same query with sort, it uses another index, that contains all records, most of them without an EMPID.
With the query { EMPID : { $exists : true, $ne : "" }} and sort, everything works also with the bigger index.
Now the question is, can one expect that $ne implicitly means $exists?

Regards

Asya Kamsky

unread,
Oct 18, 2013, 4:57:11 AM10/18/13
to mongodb-user
There is a note in the docs that specifically warns that using sparse
indexes changes the semantics of certain queries.

http://docs.mongodb.org/manual/core/index-sparse/

You should be careful using sparse indexes, especially when you have
these somewhat ambiguous queries conditions like EMPID:{$ne:""} -
after all, if EMPID does not exist, it's definitely not equal to "".

Asya

Nabor

unread,
Oct 18, 2013, 9:01:58 AM10/18/13
to mongod...@googlegroups.com
I agree, that the query was ambigous and changed it.
Now after this experience, I know what to take care for.

Thank you for your support.
Reply all
Reply to author
Forward
0 new messages