Hi
With this setup, I seams that I really don’t use the indexes at their fullest, only the once on f is used, and I take a very long time to execute the query. I’ve read that cardinal & range are a problem with combine with a sort.
What should I change, or rather which index will be best.
If you require sorting, it is advisable to have the sort criteria in your index, since MongoDB has a limit of 32 MB for an in-memory sort stage of a query (see Use Indexes to Sort Query Results). Exceeding this limit will make the query fail to produce any result.
The best tool to determine index effectiveness is using query explain()
output. In short, you want to avoid stages named “SORT_KEY_GENERATOR” (which means an in-memory sort), and “COLLSCAN” (which means a collection scan). Another good indicator for index effectiveness is the numbers nReturned
, totalKeysExamined
, and totalDocsExamined
in the explain('executionStats')
output. It’s best if the nReturned
number corresponds closely to the totalKeysExamined
andtotalDocsExamined
numbers.
For more details regarding this subject, please see: http://stackoverflow.com/a/36420196/5619724, and also my post illustrating compound index: https://groups.google.com/d/msg/mongodb-user/LhcipsTCpqw/lP-gyBz8BgAJ. Another good source to understand indexing is Optimizing MongoDB Compound Indexes. Please note that although the article was written in 2012 and uses outdated terms, the content is still relevant.
With your example query, it’s difficult to tell what index would be best without knowing your use case in detail, the cardinality of the fields, other queries you perform, and other things specific to your deployment. However, I did some cursory observation using some random data:
> db.test.findOne()
{
"_id": ObjectId("58228641e5a536965c7f5d01"),
"a": "Alpha",
"b": true,
"c": "Alpha",
"d": false,
"e": ISODate("2016-03-23T23:45:45Z"),
"f": 395366
}
and found:
{a: 1, b: 1, c: 1, d: 1, e:1 }
contains the “SORT_KEY_GENERATOR” stage, due to the field f
not included in the index.{f: 1}
is only used for sorting and did not help with the rest of the query.{a: 1, b: 1, c: 1, d: 1, f: 1, e: 1}
works best due to the lack of “SORT_KEY_GENERATOR” and “COLLSCAN” stages.Please note that the observation I did was using some random data. It’s best if you can examine the explain()
result yourself to determine the best index for your use case.
Best regards,
Kevin