$elemMatch won't use my index on array

1,423 views
Skip to first unread message

smitchell

unread,
Sep 21, 2011, 7:24:13 PM9/21/11
to mongodb-user
Hi guys, I'm hoping one of you can point out what I'm doing wrong
here. I am testing out a collection that will allow users to define
custom fields on their documents. I have tucked these custom name/
value pairs into an array called "fields". Here are a couple of
example documents to give you a feeling for the structure:

{ "_id" : ObjectId("4d51fb56330a000000000cb2"),
"created_ts" : "2011-01-01",
"email" : "sc...@test.com",
"fields" : [ { "state" : "GA" },
{ "age" : 30 },
{ "sports" : [ "football", "soccer" ] }
],
"status" : 1
}
{ "_id" : ObjectId("4d51fca0722a000000007f5d"),
"email" : "te...@test.com",
"status" : 1,
"created_ts" : "2011-01-01",
"fields" : [ { "state" : "FL" },
{ "age" : 40 },
{ "sports" : [ "soccer" ] }
]
}

I have created an index on the "fields" array and with the following
syntax it works great:

db.contacts.find( { fields: { age : 30} } )

However, when I switch the query to use the $elemMatch operator so
that I can do more interesting expressions like greater than and less
than Mongo no longer uses the expression. Here's an example:

db.contacts.find( { fields: { $elemMatch : { age : { $gt :
30 } } } } ).explain()

"cursor" : "BasicCursor",
"nscanned" : 5,
"nscannedObjects" : 5,
"n" : 3,
"millis" : 0,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds" : {

}


I've also tested $elemMatch with an equality expression and it still
doesn't use my index. Is this intentional or a bug?

Scott Hernandez

unread,
Sep 21, 2011, 8:31:39 PM9/21/11
to mongod...@googlegroups.com
$elematch is for multiple values in an array element. You have no such
structure here. There is no need to use it.

Please read the docs about indexing a document:
http://www.mongodb.org/display/DOCS/Indexes#Indexes-DocumentsasKeys

This behavior is not a bug and expected.

> --
> You received this message because you are subscribed to the Google Groups "mongodb-user" group.
> To post to this group, send email to mongod...@googlegroups.com.
> To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.
>
>

smitchell

unread,
Sep 22, 2011, 8:25:01 AM9/22/11
to mongodb-user
Thanks Scott, I'm trying the factories example you linked to. However,
I can't get $gt expressions to work with the index in this case
either. I loaded the following documents (I added a size attribute to
have a simple numeric to greater than/less than comparisons on):

{ "_id" : ObjectId("4e7a8d36030b43de10c38ac2"), "name" : "abc",
"metro" : { "city" : "Atlanta", "state" : "GA", "size" : 10 } }
{ "_id" : ObjectId("4e7a8d42030b43de10c38ac3"), "name" : "xyz",
"metro" : { "city" : "New York", "state" : "NY", "size" : 5 } }

I then attempted:

db.factories.find( { metro: { size: { $gt : 7 } } } )

That does use the index, but does not return any values.

And:

db.factories.find( { metro: { $gte : { size : 7 } } } )

Which also uses the index, but returns both document (instead of just
the one with size:10)

And finally:

db.factories.find( { "metro.size" : { $gt : 5 } } )

which returns the correct document, but does not use the index.

So, can you point me in the right direction of how I can get either
array values, or embedded documents to work with an index when
performing operations other than equality?

Thanks,
Scott

On Sep 21, 8:31 pm, Scott Hernandez <scotthernan...@gmail.com> wrote:
> $elematch is for multiple values in an array element. You have no such
> structure here. There is no need to use it.
>
> Please read the docs about indexing a document:http://www.mongodb.org/display/DOCS/Indexes#Indexes-DocumentsasKeys
>
> This behavior is not a bug and expected.
>
>
>
>
>
>
>
> On Wed, Sep 21, 2011 at 7:24 PM, smitchell <smitch...@silverpop.com> wrote:
> > Hi guys, I'm hoping one of you can point out what I'm doing wrong
> > here. I am testing out a collection that will allow users to define
> > custom fields on their documents. I have tucked these custom name/
> > value pairs into an array called "fields". Here are a couple of
> > example documents to give you a feeling for the structure:
>
> > { "_id" : ObjectId("4d51fb56330a000000000cb2"),
> >  "created_ts" : "2011-01-01",
> >  "email" : "sc...@test.com",
> >  "fields" : [ { "state" : "GA" },
> >               { "age" : 30 },
> >               { "sports" : [ "football", "soccer" ] }
> >             ],
> >  "status" : 1
> > }
> > { "_id" : ObjectId("4d51fca0722a000000007f5d"),
> >  "email" : "t...@test.com",

Nat

unread,
Sep 22, 2011, 8:30:32 AM9/22/11
to mongod...@googlegroups.com
Try { "metro.size" : { $gt : 7 } }

Scott Hernandez

unread,
Sep 22, 2011, 8:32:51 AM9/22/11
to mongod...@googlegroups.com
On Thu, Sep 22, 2011 at 8:25 AM, smitchell <smit...@silverpop.com> wrote:
> Thanks Scott, I'm trying the factories example you linked to. However,
> I can't get $gt expressions to work with the index in this case
> either. I loaded the following documents (I added a size attribute to
> have a simple numeric to greater than/less than comparisons on):
>
>  { "_id" : ObjectId("4e7a8d36030b43de10c38ac2"), "name" : "abc",
> "metro" : { "city" : "Atlanta", "state" : "GA", "size" : 10 } }
>  { "_id" : ObjectId("4e7a8d42030b43de10c38ac3"), "name" : "xyz",
> "metro" : { "city" : "New York", "state" : "NY", "size" : 5 } }
>
> I then attempted:
>
>  db.factories.find( { metro: { size: { $gt : 7 } } } )
>
> That does use the index, but does not return any values.

This is not really valid or what you think it is. It will find a metro
with the exact match of this documents "{ size: { $gt : 7 } }"

> And:
>
>  db.factories.find( { metro: { $gte : { size : 7 } } } )
>
> Which also uses the index, but returns both document (instead of just
> the one with size:10)

This is not what you think it is either. It will find documents which
are $gte "{size:7}", but the docs you have are of a completely
different form.

> And finally:
>
>  db.factories.find( { "metro.size" : { $gt : 5 } } )
>
> which returns the correct document, but does not use the index.

You really want to index "metro.size" and not metro if this is the
query you want to use.


> So, can you point me in the right direction of how I can get either
> array values, or embedded documents to work with an index when
> performing operations other than equality?

The key fact which I think is confusing you is that created an index
of the document does not index the fields inside. It indexes the whole
document as one piece.

smitchell

unread,
Sep 23, 2011, 12:55:55 PM9/23/11
to mongodb-user
Thanks again for the feedback Scott. In the case I am trying to solve
I don't think it will be reasonable for the to index the specific
fields ("metro.size" in the example above) as these are user defined
fields. I was quite happy with the array approach when I found it
would honor the index in the equality case ( { fields: { age :
30} } ). I just was hoping it would work with other operators as well.
That may not be a deal breaker for us as our users to do a lot of
equality checks (in the user defined queries we let them create
against this data) and currently we don't have those indexed in our
Oracle based solution.

So, having said all that, do you have any recommendations about how
you would attack such a problem in Mongo? Do you think the array of
name/value pairs makes sense in this case, or is there a different
solution that we just haven't hit upon in our reading?

On Sep 22, 8:32 am, Scott Hernandez <scotthernan...@gmail.com> wrote:

Scott Hernandez

unread,
Sep 23, 2011, 1:06:26 PM9/23/11
to mongod...@googlegroups.com

The array of single key/value pairs allows you to store and index arbitrary data. Does that struct work for you?

smitchell

unread,
Sep 23, 2011, 3:30:02 PM9/23/11
to mongodb-user
Well, like I said earlier it works fine for selecting documents that
have a matching name/value pair in the fields array ( ex. {size: 5} ).
Just curious if anyone has done something similar and what document
structure they found most effective.

On Sep 23, 1:06 pm, Scott Hernandez <scotthernan...@gmail.com> wrote:
> The array of single key/value pairs allows you to store and index arbitrary
> data. Does that struct work for you?

Scott Hernandez

unread,
Sep 23, 2011, 4:40:23 PM9/23/11
to mongod...@googlegroups.com
Maybe I wasn't clear, you can do range queries on those as well. The
syntax is just little different since you have to compare the whole,
or part (the beginning) of the document.

http://www.mongodb.org/display/DOCS/Indexes#Indexes-DocumentsasKeys

Paul Harvey

unread,
Oct 6, 2011, 11:09:57 PM10/6/11
to mongodb-user
On Sep 24, 7:40 am, Scott Hernandez <scotthernan...@gmail.com> wrote:
> Maybe I wasn't clear, you can do range queries on those as well. The
> syntax is just little different since you have to compare the whole,
> or part (the beginning) of the document.
>
> http://www.mongodb.org/display/DOCS/Indexes#Indexes-DocumentsasKeys

I really can't get this to work. All of my experiments so far have
failed to construct an inequality query using the "multikey" approach
which uses the index.

Our documents normally look something like this:
{
"FIELD" : {
"UsersFieldName" : { value : 1, ... }
},
...
}

I dearly want to avoid a separate index on every
FIELD.UsersFieldNameName.value, FIELD.UsersFieldName.type, etc.

So I've tried duplicating these per-"UsersFieldName" key/value pairs
into an indexed array, so it now looks like:
{
"FIELD" : {
"UsersFieldName" : { value : 1, ... }
}
"_ARRAY:FIELD" : [
{ "UsersFieldName:value" : 1 }, ...
]
}

In a collection with 12056 total documents, where _ARRAY:FIELD may
contain { UsersFieldName:value : 0 } or { UsersFieldName:value : 1},
this query seems to use the index and gives a correct answer ("n" :
8721):

> db.current.find({'_ARRAY:FIELD' : { $lt : {'UsersFieldName:value' : 1}}}).explain()
{
"cursor" : "BtreeCursor _ARRAY:FIELD_1",
"nscanned" : 8737,
"nscannedObjects" : 8737,
"n" : 8721,
"millis" : 33,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" : true,
"indexOnly" : false,
"indexBounds" : {
"_ARRAY:FIELD" : [
[
{
"$minElement" : 1
},
{
"UsersFieldName:value" : 1
}
]
]
}
}

But the following is wrong ("n" should be 0), and the index doesn't
seem to help:
> db.current.find({'_ARRAY:FIELD' : { $gt : {'UsersFieldName:value' : 1}}}).explain()
{
"cursor" : "BtreeCursor _ARRAY:FIELD_1",
"nscanned" : 384230,
"nscannedObjects" : 384230,
"n" : 12040,
"millis" : 1138,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" : true,
"indexOnly" : false,
"indexBounds" : {
"_ARRAY:FIELD" : [
[
{
"UsersFieldName:value" : 1
},
{
"$maxElement" : 1
}
]
]
}
}

With this next form, I can always get a correct answer (I.E. n = 0 for
value > 1), but "nscanned" is always 12056:

db.current.find({'_ARRAY:FIELD.UsersFieldName:value' : { $gt :
1 }}).explain()

It seems that only an equality query actually uses the index:

db.current.find({'_ARRAY:FIELD' : {'UsersFieldName:value' :
1} }).explain()
{ ... "nscanned" : 3315, "n" : 3315 ... }

- Paul

Paul Harvey

unread,
Oct 7, 2011, 12:43:38 AM10/7/11
to mongodb-user
Okay, so I found this message: http://groups.google.com/group/mongodb-user/msg/c85f9093efb87cb4

PLEASE fix the multikeys doc to mention the requirement to list both
ends of a range :-)

I'm still confused why the performance seems to be nowhere near a
match/lookup type query, though (~30ms vs 1582ms):

> db.current.find({'_ARRAY:FIELD' : { $gt : {'UsersFieldName:value' : 0}, $lt : {'UsersFieldName:value' : 1}}}).explain()
{
"cursor" : "BtreeCursor _ARRAY:FIELD_1",
"nscanned" : 387545,
"nscannedObjects" : 387545,
"n" : 8721,
"millis" : 1582,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" : true,
"indexOnly" : false,
"indexBounds" : {
"_ARRAY:FIELD" : [
[
{
"UsersFieldName:value" : 0
},
{
"$maxElement" : 1
}
]
]
}
}
Reply all
Reply to author
Forward
0 new messages