in mongodb v2+, multikey not working properly bounds

231 views
Skip to first unread message

Jack Xu

unread,
Aug 20, 2012, 2:55:25 PM8/20/12
to mongod...@googlegroups.com
in mongod 2.0+, when I run the following query in a single element multikey indexed document, it works fine and leverages multikey index: 

db.objects.drop();
db.objects.save({attrib : [ {d : 3}]})
db.objects.save({attrib : [ {d : 5} ]})
db.objects.save({attrib : [ {d : 7}]})
db.objects.ensureIndex({attrib : 1})
db.objects.find({attrib: {$gt: {d: 4}, $lt: {d: 6}}}).explain();
 
{
"cursor" : "BtreeCursor attrib_1",
"isMultiKey" : false,
"n" : 1,
"nscannedObjects" : 1,
"nscanned" : 1,
"nscannedObjectsAllPlans" : 1,
"nscannedAllPlans" : 1,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
"attrib" : [
[
{
"d" : 4
},
{
"d" : 6
}
]
]
},
"server" : "lucid64:27017"
}

However, as soon as there are multiple elements in the array, one of the bounds is simply ignored! 

db.objects.drop();
db.objects.save({attrib : [ {c : 'test'} , {d : 3} ]})
db.objects.save({attrib : [ {c : 'test1'}, {d : 5}]})
db.objects.save({attrib : [ {c : 'test3'}, {d : 7} ]})
db.objects.ensureIndex({attrib : 1})
db.objects.find({attrib: {$gt: {d: 4}, $lt: {d: 6}}}).explain();

{
"cursor" : "BtreeCursor attrib_1",
"isMultiKey" : true,
"n" : 2,
"nscannedObjects" : 5,
"nscanned" : 5,
"nscannedObjectsAllPlans" : 5,
"nscannedAllPlans" : 5,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
"attrib" : [
[
{
"d" : 4
},
{
"$maxElement" : 1
}
]
]
},
"server" : "lucid64:27017"
}

I do understand this issue has been reported in https://jira.mongodb.org/browse/SERVER-4155, but the answer is hard for me to understand.

also, btw, both scenarios work perfectly in 1.8.6.

please help.

Jenna

unread,
Aug 22, 2012, 11:33:52 AM8/22/12
to mongodb-user
Hello Jack,

As Aaron recently pointed out in the server ticket you reference
above, you will be able to use $elemMatch to restrict the index bounds
for range queries using multi-key indexes: https://jira.mongodb.org/browse/SERVER-4180.

On Aug 20, 2:55 pm, Jack Xu <jack...@gmail.com> wrote:
> in mongod 2.0+, when I run the following query in a single element multikey
> indexed document, it works fine and leverages multikey index:
>
> *db.objects.drop();
>
> > **db.objects.save({attrib : [ {d : 3}]})
> > **db.objects.save({attrib : [ {d : 5} ]})
> > **db.objects.save({attrib : [ {d : 7}]})
> > **db.objects.ensureIndex({attrib : 1})
> > **db.objects.find({attrib: {$gt: {d: 4}, $lt: {d: 6}}}).explain();*
>
> {
> "cursor" : "BtreeCursor attrib_1",
> "isMultiKey" : false,
> "n" : 1,
> "nscannedObjects" : 1,
> "nscanned" : 1,
> "nscannedObjectsAllPlans" : 1,
> "nscannedAllPlans" : 1,
> "scanAndOrder" : false,
> "indexOnly" : false,
> "nYields" : 0,
> "nChunkSkips" : 0,
> "millis" : 0,
> "indexBounds" : {
> "attrib" : [
> [
> {
> "d" : 4},
>
> {
> "d" : 6}
>
> ]
> ]},
>
> "server" : "lucid64:27017"
>
> }
>
> *However, as soon as there are multiple elements in the array, one of the
> bounds is simply ignored! *
>
> *db.objects.drop();
>
> > **db.objects.save({attrib : [ {c : 'test'} , {d : 3} ]})
> > **db.objects.save({attrib : [ {c : 'test1'}, {d : 5}]})
> > **db.objects.save({attrib : [ {c : 'test3'}, {d : 7} ]})
> > **db.objects.ensureIndex({attrib : 1})
> > **db.objects.find({attrib: {$gt: {d: 4}, $lt: {d: 6}}}).explain();*
>
> *
> *
> *
> {
> "cursor" : "BtreeCursor attrib_1",
> "isMultiKey" : true,
> "n" : 2,
> "nscannedObjects" : 5,
> "nscanned" : 5,
> "nscannedObjectsAllPlans" : 5,
> "nscannedAllPlans" : 5,
> "scanAndOrder" : false,
> "indexOnly" : false,
> "nYields" : 0,
> "nChunkSkips" : 0,
> "millis" : 0,
> "indexBounds" : {
> "attrib" : [
> [
> {
> "d" : 4},
>
> {
> "$maxElement" : 1}
>
> ]
> ]},
>
> "server" : "lucid64:27017"
>
> }
>
> I do understand this issue has been reported inhttps://jira.mongodb.org/browse/SERVER-4155, but the answer is hard for me
> to understand.
>
> also, btw, both scenarios work perfectly in 1.8.6.
>
> please help.
>
> *

Jack Xu

unread,
Aug 22, 2012, 2:02:47 PM8/22/12
to mongod...@googlegroups.com
thanks, Jenna, pointing this out and it worked! 

however, when I include multiple conditions in my examples (see below), the results seem to be correct, but the explain() output makes me wonder if the index is being leveraged for the second criteria. 

see below. any thoughts? 

db.objects.drop();
db.objects.save({attrib : [ {d : 3}]})
db.objects.save({attrib : [ {d : 5} ]})
db.objects.save({attrib : [ {d : 7}]})
db.objects.ensureIndex({attrib : 1})

db.objects.find({$and: [{attrib: { $elemMatch: {$gt: {d: 2}, $lt: {d: 6}}}}, {attrib: {c: 'test1'}} ] }).explain();
{
"cursor" : "BtreeCursor attrib_1",
"isMultiKey" : true,
"n" : 1,
"nscannedObjects" : 2,
"nscanned" : 2,
"nscannedObjectsAllPlans" : 2,
"nscannedAllPlans" : 2,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
"attrib" : [ //where is c : 'test1' condition?
[
{
"d" : 2

Jack Xu

unread,
Aug 22, 2012, 2:04:46 PM8/22/12
to mongod...@googlegroups.com
I am using 2.2.0 rc, btw

Jenna

unread,
Aug 23, 2012, 11:20:34 AM8/23/12
to mongodb-user
Hello Jack,
For the same reason that you cannot use two indexes to meet a single
query (with the exception of $or queries), you cannot use two
discontiguous ranges of a multi-key index to lookup documents. The
bounds reported in explain() are not unexpected.

On Aug 22, 2:04 pm, Jack Xu <jack...@gmail.com> wrote:
> I am using 2.2.0 rc, btw
>
>
>
>
>
>
>
> On Wednesday, 22 August 2012 14:02:47 UTC-4, Jack Xu wrote:
>
> > thanks, Jenna, pointing this out and it worked!
>
> > however, when I include multiple conditions in my examples (see below),
> > the results seem to be correct, but the explain() output makes me wonder if
> > the index is being leveraged for the second criteria.
>
> > see below. any thoughts?
>
> > db.objects.drop();
> >> db.objects.save({attrib : [ {d : 3}]})
> >> db.objects.save({attrib : [ {d : 5} ]})
> >> db.objects.save({attrib : [ {d : 7}]})
> >> db.objects.ensureIndex({attrib : 1})
>
> > db.objects.find({$and: [{attrib: { $elemMatch: {$gt: {d: 2}, $lt: {d:
> >> 6}}}},* {attrib: {c: 'test1'}*} ] }).explain();
>
> > {
> > "cursor" : "BtreeCursor attrib_1",
> > "isMultiKey" : true,
> > "n" : 1,
> > "nscannedObjects" : 2,
> > "nscanned" : 2,
> > "nscannedObjectsAllPlans" : 2,
> > "nscannedAllPlans" : 2,
> > "scanAndOrder" : false,
> > "indexOnly" : false,
> > "nYields" : 0,
> > "nChunkSkips" : 0,
> > "millis" : 0,
> > "indexBounds" : {
> > *"attrib" : [ //where is c : 'test1' condition?*
> > * [*
> > * {*
> > * "d" : 2*
> > * },*
> > * {*
> > * "d" : 6*
> > * }*
> > ]
> > ]

Jack Xu

unread,
Aug 23, 2012, 1:25:01 PM8/23/12
to mongod...@googlegroups.com
Hi Jenna, 

interesting... 

are there any workarounds? for example, could I created a compound index using the same field, in which case, it is 

db.objects.ensureIndex({attrib : 1, attrib : 1}) 

it didn't work when I tried, but just wanted to throw it out there. theoretically, it should be that different from a compound index of 2 fields.

any suggestions are highly appreciated.

Jenna

unread,
Aug 23, 2012, 5:19:16 PM8/23/12
to mongodb-user
Hello Jack,

Your idea about using a compound index on attrib is sound, but
unfortunately, the current implementation prevents compound indexes
from being useful for querying arrays in this manner.  Here is a
related server ticket:
https://jira.mongodb.org/browse/SERVER-3104

Just as a side note (which is solely for your edification since it
won't solve the problem at hand)- the command to create a compound
index on fields d and c in attrib would be:

db.objects.ensureIndex({"attrib.d" : 1, "attrib.c": 1})

You could use a compound index for the query above if you structured
your data as embedded documents rather than as arrays:
{_id: 1, a: {d: 1, c: "test1"}}

Restructuring your data in this manner may have other implications.
 If you can provide additional information about your data and
queries, we may be able to help find alternate solutions.

Jack Xu

unread,
Aug 24, 2012, 10:56:06 AM8/24/12
to mongod...@googlegroups.com
hi Jenna, 

thanks for your considerations. my particular use case is that we have multiple attributes of a document, and I would like to filter, sort and group based on these attributes. there are so many attributes that need to be used on real time that creating compound indexes on either one of them and their combinations are not realistic.

I have considered the following approaches:

{_id: 1, a: {d: 1, c: "test1"}} 
{_id: 1, a: [{k:'d',v:1}, {k:'c', v: "test1"} ] } 

but none of them addresses the need.

https://jira.mongodb.org/browse/SERVER-3104 indicates that it will be fixed in 2.3.x version. and we are currently on 2.2. how far out are we looking at? 

thanks again,
Jack

Jenna

unread,
Aug 24, 2012, 12:06:21 PM8/24/12
to mongodb-user
Unfortunately, I cannot give you an expected date. After we release a
stable version of 2.2 (which will happen very soon), 2.3 becomes the
current master development branch. For 2.x, odd values of x indicate
development releases and even values are used for stable, production-
ready versions. If you're running in a dev environment, you can watch
the JIRA ticket and download the nightly build if the issue is
resolved. For production, however, you should wait until MongoDB
version 2.4.

On Aug 24, 10:56 am, Jack Xu <jack...@gmail.com> wrote:
> hi Jenna,
>
> thanks for your considerations. my particular use case is that we have
> multiple attributes of a document, and I would like to filter, sort and
> group based on these attributes. there are so many attributes that need to
> be used on real time that creating compound indexes on either one of them
> and their combinations are not realistic.
>
> I have considered the following approaches:
>
> {_id: 1, a: {d: 1, c: "test1"}}
> {_id: 1, a: [{k:'d',v:1}, {k:'c', v: "test1"} ] }
>
> but none of them addresses the need.
>
> https://jira.mongodb.org/browse/SERVER-3104indicates that it will be fixed

Jack Xu

unread,
Aug 24, 2012, 1:01:15 PM8/24/12
to mongod...@googlegroups.com
okay, Jenna. I will wait then. in the meantime, as you mentioned before. any alternative approach that I can use for my use case in mongodb? 

Jenna

unread,
Aug 25, 2012, 2:56:31 PM8/25/12
to mongodb-user
Sure thing. Could you provide additional information? A few sample
documents to illustrate how fields vary from one document to the next,
as well as the exact queries that you need to run (and presumably use
indexes) will help us try to find workarounds.
> > >https://jira.mongodb.org/browse/SERVER-3104indicatesthat it will be

Jenna

unread,
Aug 29, 2012, 5:54:50 PM8/29/12
to mongodb-user
Multi-key indexes may be appropriate for your use case. Are the
fields in "attrs" consistent across your documents? Can you give more
information/ specifics about your queries?

http://www.mongodb.org/display/DOCS/Multikeys

On Aug 28, 3:49 pm, damirv <da...@damirvandic.nl> wrote:
> Are there any solutions to this problem? I have the general case of an
> 'e-commerce' catalog where I store around 80 properties of products, on
> which I need to be able to query (including $gt, $lt, etc.). The properties
> can also have a list as the value.
>
> An example (shortned):
> { "_id" : ObjectId( "503d1e320930d052269b6ea2" ),
>  ....
>   "attrs" : { "gprs" : true,
>     "tekstvoorspelling" : [],
>     "wifi" : true,
>     "intern_geheugen" : 16384,
>     "netwerkfrequenties" : [
>       850,
>       900,
>       1800,
>       1900,
>       2100 ],
>     "maximaal_uitbreidbaar_opslag" : 64,
>     "processorsnelheid" : 1400,
>     "type_telefoon" : [
>       "smartphone" ],
>     "brand_name" : "samsung",
>     "bluetooth" : true,
>
> } }
>
> What is the current recommended approach (using MongoDB 2+) to index such a
> collection?
> > > > >https://jira.mongodb.org/browse/SERVER-3104indicatesthatit will be
> > > > > > > > > >> > {...
>
> read more »

damirv

unread,
Aug 29, 2012, 7:52:07 PM8/29/12
to mongod...@googlegroups.com
Hi Jenna,

Thank you for your reply. What do you mean exactly by 'consistent'? The value type for all fields in attrs is the same for all documents, but some documents can miss some attributes (i.e., missing information), although I could add them and set the value to null/empty list.

The problem with multi-keys is that it only works with exact matches, right?... consider the example below:
db.articles.drop()

db.articles.ensureIndex( { tags : 1 } )
db.articles.insert( { name: "Doc A", author: "Author A", tags: [{v1:'a'}, {v2:3}, {v3:30}]})
db.articles.insert( { name: "Doc B", author: "Author B", tags: [{v1:'b'}, {v2:700}, {v3:300}]})
db.articles.insert( { name: "Doc C", author: "Author C", tags: [{v1:'c'}, {v2:7000}, {v3:3000}]})

# uses index
db.articles.find( { tags: {v1:'c'} } ).explain()
# does not use index
db.articles.find( { tags: {$elemMatch:{v2:{$gt:1000}}}} ).explain()

My queries will be on these many attributes (with no predefined attribute combination).

To summarize my problem (and translate this to the above example): how can I search effciently on fields v1, v2 and v3 (with $gt, $lt, etc operators)? Should I use another data representation? Maybe with two fields 'name' and 'value'? (but then we have the bug in MongoDB 2+)

Thanks again.

Jenna

unread,
Aug 30, 2012, 1:29:08 PM8/30/12
to mongodb-user
Hi there,

Your observation about multi-key indexes is correct- since the index
entry is created on the entire embedded document, e.g. {v2:100}, you
can't use the index to search for values or ranges with the syntax:
db.articles.find( { tags: {$elemMatch:{v2:{$gt:1000}}}} )

However, it is possible to use $gt and $lt to compare the binary
representation of documents (which includes the field name as well as
the value). To illustrate this point, consider the following query
that sorts "test":

db.run.find().sort({test:1})
{ "_id" : 6, "test" : { "v" : 1 } }
{ "_id" : 1, "test" : { "v1" : 1 } }
{ "_id" : 2, "test" : { "v1" : 2 } }
{ "_id" : 4, "test" : { "v1" : 2, "v2" : 0 } }
{ "_id" : 3, "test" : { "v1" : 2, "v2" : 1 } }
{ "_id" : 5, "test" : { "v3" : 1 } }

(notice the results are sorted lexicographically)

Following this logic, it's possible to use $gt and $lt to query for
ranges of the tags array and still use the multi-key index:
db.articles.find({tags:{$elemMatch:{$gt:{v2:4}, $lt:{v2:100}}}})

Keep in mind, however, that
{v3:1} > {v2:1}
so if you're only interested in documents with v2 > than some value,
you should make sure to include an upper bound, i.e. $lt: {v2:100000}
> ...
>
> read more »

damirv

unread,
Aug 30, 2012, 2:09:44 PM8/30/12
to mongod...@googlegroups.com
Hi Jenny,

I didn't know that! Actually this is a feasible solution for me. Many thanks!
Reply all
Reply to author
Forward
0 new messages