index usage question

24 views
Skip to first unread message

Dominik Gehl

unread,
Oct 27, 2011, 10:16:01 AM10/27/11
to mongodb-user
Hi,

I have the following index defined on a collection:
{
"v" : 0,
"key" : {
"aH.s" : 1,
"aH.m" : 1,
"aH.u" : 1
},
"unique" : true,
"background" : true,
"name" : "aH_s_1_aH_m_1_aH_u_1"
}


Now, I do the following query:

db.myCollection.find({'aH.s':'x', 'aH.m':'y', 'aH.u' : z}).explain()
{
"cursor" : "BtreeCursor aH_s_1_aH_m_1_aH_u_1",
"nscanned" : 220027,
"nscannedObjects" : 220027,
"n" : 1,
"millis" : 3016,
"nYields" : 55,
"nChunkSkips" : 0,
"isMultiKey" : true,
"indexOnly" : false,
"indexBounds" : {
"aH.s" : [
[
"x",
"x"
]
],
"aH.m" : [
[
{
"$minElement" : 1
},
{
"$maxElement" : 1
}
]
],
"aH.u" : [
[
{
"$minElement" : 1
},
{
"$maxElement" : 1
}
]
]
}
}

Since it's a 'unique' index, I'm surprised that there's a scan over
the whole collection which is executed (n=220027). The indexBounds
also don't seem to be properly used ... Any reason why ?

Thanks,
Dominik

Dominik Gehl

unread,
Oct 27, 2011, 10:28:34 AM10/27/11
to mongodb-user
I just did the same query on a different system running 1.8.2 ... the
indexBounds there are correct ! So is this a (known) bug in 2.0.1 ?

Thanks,
Dominik

Nat

unread,
Oct 27, 2011, 10:29:26 AM10/27/11
to mongod...@googlegroups.com
It seems like your index is a multikey index. Do you have certain field as array? Can you list each field's cardinality as well?

Dominik Gehl

unread,
Oct 27, 2011, 10:36:39 AM10/27/11
to mongodb-user
Fields should only contain text and numbers. Is there an explicit way
of listing a field's cardinality ?

Dominik

Nat

unread,
Oct 27, 2011, 10:39:59 AM10/27/11
to mongod...@googlegroups.com

Dominik Gehl

unread,
Oct 27, 2011, 11:24:49 AM10/27/11
to mongodb-user
Ok, checked it out. aH is either a hash (with keys s, m und u) or an
array of hashs with keys s, m und u

So indexes on this kind of structure don't "work" ?

Dominik

Nat

unread,
Oct 27, 2011, 11:33:01 AM10/27/11
to mongod...@googlegroups.com
I think the index still works but the unique bit is a bit questionable for array of hash. What is the logic there?

Dominik Gehl

unread,
Oct 27, 2011, 11:57:34 AM10/27/11
to mongodb-user
The combination of 's,m and u' must be unique among the whole
collection, but each document can possibly have several 's,m und u'
combinations. One possible example usage is files and folders. A given
file (identified by it's md5 for example) has at least one location on
disk (a given filename, directory), but can possibly have several
locations (copies). The other way around, a given location is still
unique among the complete list of files on disk.

I did several further tests:
- setting unique or not on the index does not affect query execution
- converting the 'can be array or hash' into 'is always an array' does
not affect query execution
- As long as all documents in the collection have at maximum one entry
(immediate hash or array with one hash) for aH, the multikey index is
completely used (aH.s, aH.m and aH.u are used). As soon as one
document in the collection has more than one entry for aH, the
multikey index uses only the first part (aH.s) and does a scan among
the returned results.

Is this the documented/supposed behavior ?

Dominik

Nat

unread,
Oct 27, 2011, 12:25:50 PM10/27/11
to mongod...@googlegroups.com
Yes... because if you have


{ aH : [{ s:1, m:1, u:1},{s:2,m:2,u:2}] }

db.<tablename>.find({ "aH.s":2, "aH.m":1, "aH.u":1 }) will find a match as well. If you want it to match exactly, you should use $elemMatch instead.



Dominik Gehl

unread,
Oct 27, 2011, 12:18:52 PM10/27/11
to mongodb-user
Another interesting piece of information. If I manually create a
concatenation of aH.s, aH.m and aH.u, name it aH.searchKey and index
aH.searchKey, the index seems to be used correctly, both when aH is a
hash and when aH is an array ...

Would be unfortunate though to have to manually create string
concatenations instead of a multikey index.

Is this the only way to solve this ?

Dominik

Dominik Gehl

unread,
Oct 27, 2011, 12:31:50 PM10/27/11
to mongod...@googlegroups.com
I agree and already use the $elemMatch, but only the first part of the multikey index is used, i.e. aH.s …

Dominik

--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To view this discussion on the web visit https://groups.google.com/d/msg/mongodb-user/-/BwuFYJicsfcJ.
To post to this group, send email to mongod...@googlegroups.com.
To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.


Nat

unread,
Oct 27, 2011, 12:43:10 PM10/27/11
to mongod...@googlegroups.com
Well you can simply index on aH itself... then you can do

find({ aH : { s : blah , m : blah, u : blah }) 

but it only works with exact match

Bernie Hackett

unread,
Oct 27, 2011, 1:28:45 PM10/27/11
to mongodb-user
How did you create this index? Was the index created on a version
before 2.0?

I can't reproduce this on 1.8.2 or 2.0.1:

PRIMARY> db.foo.insert({aH: {s: 'x', m: 'y', u: 'z'}})
PRIMARY> db.foo.findOne()
{
"_id" : ObjectId("4ea98f82396ab659c735ead6"),
"aH" : {
"s" : "x",
"m" : "y",
"u" : "z"
}
}
PRIMARY> db.foo.ensureIndex({'aH.s': 1, 'aH.m': 1, 'aH.u': 1})
PRIMARY> db.foo.getIndexes()
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"ns" : "barg.foo",
"name" : "_id_"
},
{
"v" : 1,
"key" : {
"aH.s" : 1,
"aH.m" : 1,
"aH.u" : 1
},
"ns" : "barg.foo",
"name" : "aH.s_1_aH.m_1_aH.u_1"
}
]
PRIMARY> db.foo.find({'aH.s': 'x', 'aH.m': 'y', 'aH.u':
'z'}).explain()
{
"cursor" : "BtreeCursor aH.s_1_aH.m_1_aH.u_1",
"nscanned" : 1,
"nscannedObjects" : 1,
"n" : 1,
"millis" : 0,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds" : {
"aH.s" : [
[
"x",
"x"
]
],
"aH.m" : [
[
"y",
"y"
]
],
"aH.u" : [
[
"z",
"z"
]
]

Dominik Gehl

unread,
Oct 27, 2011, 1:48:55 PM10/27/11
to mongod...@googlegroups.com
Hi Bernie,

the issue arises when you use arrays:

1. Only one array element: index is "correctly" used


> db.foo.insert({aH: [ {s: 'x', m: 'y', u: 'z'} ] })

> db.foo.ensureIndex({'aH.s': 1, 'aH.m': 1, 'aH.u': 1})

> db.foo.find({'aH': { $elemMatch : {'s': 'x', 'm': 'y', 'u': 'z'}}}).explain()


{
"cursor" : "BtreeCursor aH.s_1_aH.m_1_aH.u_1",
"nscanned" : 1,
"nscannedObjects" : 1,
"n" : 1,
"millis" : 0,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds" : {
"aH.s" : [
[
"x",
"x"
]
],
"aH.m" : [
[
"y",
"y"
]
],
"aH.u" : [
[
"z",
"z"
]
]
}
}

2. One single element in the collection has two array elements. Only the first part of the index is used
> db.foo.insert({aH: [ {s: 'a1', m: 'b1', u: 'c1'}, {s: 'a2', m:'b2', u:'c2'}] })
> db.foo.find({'aH': { $elemMatch : {'s': 'x', 'm': 'y', 'u': 'z'}}}).explain()


{
"cursor" : "BtreeCursor aH.s_1_aH.m_1_aH.u_1",
"nscanned" : 1,
"nscannedObjects" : 1,
"n" : 1,
"millis" : 0,
"nYields" : 0,
"nChunkSkips" : 0,

> --
> You received this message because you are subscribed to the Google Groups "mongodb-user" group.

> To post to this group, send email to mongod...@googlegroups.com.
> To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.
>

--
Dominik Gehl
Lead Developer
http://context.io

@DominikGehl

Bernie Hackett

unread,
Oct 27, 2011, 2:26:11 PM10/27/11
to mongodb-user
This seems wrong to me. I opened https://jira.mongodb.org/browse/SERVER-4164.

> > For more options, visit this group athttp://groups.google.com/group/mongodb-user?hl=en.

aaron

unread,
Oct 28, 2011, 9:11:38 PM10/28/11
to mongodb-user
Hi Dominik,

Please see the updated comments in SERVER-4164.

Thanks,
Aaron

On Oct 27, 11:26 am, Bernie Hackett <ber...@10gen.com> wrote:
> This seems wrong to me. I openedhttps://jira.mongodb.org/browse/SERVER-4164.

Dominik Gehl

unread,
Oct 28, 2011, 9:15:26 PM10/28/11
to mongod...@googlegroups.com
Hi Aaron,

thanks for the info. I'm going around the issue for now by creating a new field in each document, composed of the concatenation of aH.s, aH.m and aH.u and then indexing on that field.

Dominik

Dominik Gehl

unread,
Oct 31, 2011, 4:18:50 PM10/31/11
to mongod...@googlegroups.com
I'm still somehow suspecting that the degraded performance I'm experiencing since last week might be due to the update to 2.0.1. Does anyone know if going back from 2.0.1 to 2.0.0 causes any issues ?

Dominik

--

Bernie Hackett

unread,
Oct 31, 2011, 5:05:54 PM10/31/11
to mongodb-user
There are no backward breaking changes. 2.0.1 is just a patch level on
top of 2.0.0 to fix bugs found in the initial release. It has no new
features.
Reply all
Reply to author
Forward
0 new messages