Strange index performance inconsitency

20 views
Skip to first unread message

Pieter Jordaan

unread,
May 8, 2012, 5:42:37 AM5/8/12
to mongod...@googlegroups.com

Why is the first slower than the second. Both use the same index. These differences are consistent

db.tester.find({i: 9, j: {$lt: 10}}).sort({k: -1}).explain()
{
        "cursor" : "BtreeCursor i_1_j_1_k_-1",
        "isMultiKey" : false,
        "n" : 10000,
        "nscannedObjects" : 40000,
        "nscanned" : 40000,
        "scanAndOrder" : true,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 138,
        "indexBounds" : {
                "i" : [
                        [
                                9,
                                9
                        ]
                ],
                "j" : [
                        [
                                -1.7976931348623157e+308,
                                10
                        ]
                ],
                "k" : [
                        [
                                {
                                        "$maxElement" : 1
                                },
                                {
                                        "$minElement" : 1
                                }
                        ]
                ]
        },
        "server" : "Climax-Laptop:27017"
}
> db.tester.find({i: 9, j: {$lt: 10}}).hint("i_1_j_1_k_-1").sort({k: -1}).explain()
{
        "cursor" : "BtreeCursor i_1_j_1_k_-1",
        "isMultiKey" : false,
        "n" : 10000,
        "nscannedObjects" : 10000,
        "nscanned" : 10000,
        "scanAndOrder" : true,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 67,
        "indexBounds" : {
                "i" : [
                        [
                                9,
                                9
                        ]
                ],
                "j" : [
                        [
                                -1.7976931348623157e+308,
                                10
                        ]
                ],
                "k" : [
                        [
                                {
                                        "$maxElement" : 1
                                },
                                {
                                        "$minElement" : 1
                                }
                        ]
                ]
        },
        "server" : "Climax-Laptop:27017"
}

The index is created with: 
db.tester.ensureIndex({i: 1, j: 1, k: -1})

another index: db.tester.ensureIndex({i: 1, j: 1})

and another: db.tester.ensureIndex({k: -1}).


After some preliminary testing I've found the following:

  • After dropping the last two indexes, the performance of both queries match. Why did this happen? Does sort use a different index than the query?
  • After adding the index on i and j, no performance inconsistencies occurred. But after adding the index on k: -1 the performance inconsistency appeared again. This makes me think that the sort uses a different index. How can I test this?

Scott Hernandez

unread,
May 8, 2012, 9:28:38 AM5/8/12
to mongod...@googlegroups.com

I answered the thread, please respond there.

--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To view this discussion on the web visit https://groups.google.com/d/msg/mongodb-user/-/ihTt8Ap6ya4J.
To post to this group, send email to mongod...@googlegroups.com.
To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.

Pieter Jordaan

unread,
May 8, 2012, 10:02:37 AM5/8/12
to mongod...@googlegroups.com
I am replying here as Google Groups says that the previous topic doesn't exist anymore

No the data did not change whatsoever. The queries were executed chronologically.

Data was inserted with:

for(i = 0; i < 100; ++i) {
... for(j = 0; j < 100; ++j) {
... for(k = 0; k < 1000; ++k) {
... db.tester.insert({i:i,j:j,k:k});
... }
... }
... }

Probably not the fastest way to insert test data I know...

After creating the index {k:-1, i: 1, j: 1} I found:


db.tester.find({i: 9, j: {$lt: 10}}).sort({k: -1}).explain()
{
        "cursor" : "BtreeCursor k_-1_i_1_j_1",
        "isMultiKey" : false,
        "n" : 10000,
        "nscannedObjects" : 10244,
        "nscanned" : 12243,
        "scanAndOrder" : false,

        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 53,
        "indexBounds" : {

                "k" : [
                        [
                                {
                                        "$maxElement" : 1
                                },
                                {
                                        "$minElement" : 1
                                }
                        ]
                ],
                "i" : [
                        [
                                9,
                                9
                        ]
                ],
                "j" : [
                        [
                                -1.7976931348623157e+308,
                                10
                        ]
                ]
        },
        "server" : "Climax-Laptop:27017"
}

and with an explicit hint:


> db.tester.find({i: 9, j: {$lt: 10}}).hint("k_-1_i_1_j_1").sort({k: -1}).explain()
{
        "cursor" : "BtreeCursor k_-1_i_1_j_1",

        "isMultiKey" : false,
        "n" : 10000,
        "nscannedObjects" : 10000,
        "nscanned" : 12000,
        "scanAndOrder" : false,

        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 45,
        "indexBounds" : {

                "k" : [
                        [
                                {
                                        "$maxElement" : 1
                                },
                                {
                                        "$minElement" : 1
                                }
                        ]
                ],
                "i" : [
                        [
                                9,
                                9
                        ]
                ],
                "j" : [
                        [
                                -1.7976931348623157e+308,
                                10
                        ]
                ]
        },
        "server" : "Climax-Laptop:27017"
}


The time in milliseconds are on average the same between the two, but the number of objects scanned differs. Why is this?

I was under the impression that the sort key should be the last key in the index.

On Tuesday, 8 May 2012 13:34:07 UTC+2, Scott Hernandez wrote:
Did the data change in the collection between the first and the second?

The first is slower because it required looking through 4 times as
many documents to return the results: see the nscanned values from
each explain.

The index was not used for the sort as noted by the scanAndOrder:true
in the explain output. For this query you may want to create and test
this index: {k:-1, i: 1, j: 1}

> After some preliminary testing I've found the following:
>
>  After dropping the last two indexes, the performance of both queries match.
> Why did this happen? Does sort use a different index than the query?
> After adding the index on i and j, no performance inconsistencies occurred.
> But after adding the index on k: -1 the performance inconsistency appeared
> again. This makes me think that the sort uses a different index. How can I
> test this?
>

> --
> You received this message because you are subscribed to the Google Groups
> "mongodb-user" group.
> To view this discussion on the web visit

> https://groups.google.com/d/msg/mongodb-user/-/ZqjOWEYlMhIJ.


> To post to this group, send email to mongod...@googlegroups.com.
> To unsubscribe from this group, send email to

> mongodb-user+unsubscribe@googlegroups.com.

Scott Hernandez

unread,
May 8, 2012, 10:08:05 AM5/8/12
to mongod...@googlegroups.com
You are doing a range query and a sort on different fields and that
affects that general wisdom of sort field last.

On Tue, May 8, 2012 at 7:02 AM, Pieter Jordaan

>> > mongodb-user...@googlegroups.com.


>> > For more options, visit this group at
>> > http://groups.google.com/group/mongodb-user?hl=en.
>>
>>

> --
> You received this message because you are subscribed to the Google Groups
> "mongodb-user" group.
> To view this discussion on the web visit

> https://groups.google.com/d/msg/mongodb-user/-/vSqT15CpW3kJ.


>
> To post to this group, send email to mongod...@googlegroups.com.
> To unsubscribe from this group, send email to

> mongodb-user...@googlegroups.com.

Pieter Willem Jordaan

unread,
May 8, 2012, 10:12:13 AM5/8/12
to mongod...@googlegroups.com
Thanks I now understand why the index should contain the sort field first then, but why do the nscanned and nscannedObjects differ when explicitly hinting the index to use? Even though that index was chosen by the query optimizer?

Scott Hernandez

unread,
May 8, 2012, 10:17:15 AM5/8/12
to mongod...@googlegroups.com
There is a certain amount of variability in how the index is traversed
and that is what you are seeing. On average it will be much more
efficient since it doesn't need to sort in memory.

Pieter Willem Jordaan

unread,
May 8, 2012, 10:20:15 AM5/8/12
to mongod...@googlegroups.com
So it is better then, in general, to trust the optimizer's choice of index rather than explicitly hinting (unless of course the first field of an index isn't used)

Scott Hernandez

unread,
May 8, 2012, 10:55:06 AM5/8/12
to mongod...@googlegroups.com
There is nothing wrong with the hint, but if there is no ambiguity for
the query wrt the indexes you don't need a hint.

On Tue, May 8, 2012 at 7:20 AM, Pieter Willem Jordaan

aaron

unread,
May 8, 2012, 2:21:09 PM5/8/12
to mongodb-user
Hi Pieter,

As of mongo 2.1.1, the results of a query may be interleaved from
multiple query plans that are attempted when the query begins
running. (In prior versions of mongo, different plans might be
attempted simultaneously but all query results would come from a
single plan.)

Accompanying this change, the explain output was modified so that
nscanned is now reported as the sum of the nscanned values of all
candidate query plans rather than the nscanned value for the "winning"
plan. The individual nscanned values for each of the attempted plans
are now reported as well, if you run explain() with the true option
( eg. db.c.find().explain( true ) ).

The reason you now see a discrepancy between the nscanned values for
an unhinted explain vs a hinted explain is that in the unhinted case
the reported nscanned includes the nscanned values of additional
candidate plans that were attempted. In the hinted case, there are no
additional candidate plans to be reported in this manner.

On May 8, 7:12 am, Pieter Willem Jordaan <pieterwjordaa...@gmail.com>
wrote:
> Thanks I now understand why the index should contain the sort field first
> then, but why do the nscanned and nscannedObjects differ when explicitly
> hinting the index to use? Even though that index was chosen by the query
> optimizer?
>

Pieter Willem Jordaan

unread,
May 8, 2012, 2:24:40 PM5/8/12
to mongod...@googlegroups.com
Thank you very much for the clear answer. :)
Reply all
Reply to author
Forward
0 new messages