How to improve poor indexed count() performance?


SCO

Apr 29, 2011, 4:22:18 PM
to mongodb-user
Hi,

I'm currently running a MongoDB 1.8.1 server on a 16 GB RAM, quad-core,
64-bit Ubuntu box.
My main test collection currently has over 10 million documents and
two indexes, each on an integer field (user and type):

> db.measure.findOne();
{
    "_id" : ObjectId("4dba913e9d1cdd3b00000000"),
    "user" : 1356,
    "type" : 123,
    "measures" : [
        {
            "value" : -2500,
            "type" : 14
        },
        {
            "value" : -2900,
            "type" : 15
        }
    ]
}

The distribution is such that the selectivity of the "user" index is
excellent (e.g. user 1356 has around 1000 documents referenced among
9 million, and roughly the same holds for other users).
However, the selectivity of the "type" index is poor (99% of the
documents have the same type [for now! the data will change, so an
index is needed]).

The following query is almost instant:
db.measure.find({user:78}).count();

db.measure.find().count() is almost instant too.

However, the following takes approx. 9 seconds:
db.measure.find({type:123}).count();

This looks extremely slow and disappointing.

For comparison, the same query on the same data in PostgreSQL, with
the same indexes, takes 2 seconds.

Any ideas on how I can improve the performance of this MongoDB query?

Thank you !

SCO

chris anderton

Apr 29, 2011, 4:49:02 PM
to mongodb-user
What indexes do you have defined? What's the output of:

db.measure.getIndexes();

SCO

Apr 29, 2011, 4:52:49 PM
to mongodb-user
Hi Chris,

Yes, sorry, here they are:

> db.measure.getIndexes();
[
    {
        "name" : "_id_",
        "ns" : "main.measure",
        "key" : {
            "_id" : 1
        },
        "v" : 0
    },
    {
        "_id" : ObjectId("4dbadfd266acd8c868b247a7"),
        "ns" : "main.measure",
        "key" : {
            "user" : 1
        },
        "name" : "user_1",
        "v" : 0
    },
    {
        "_id" : ObjectId("4dbae45866acd8c868b247a8"),
        "ns" : "main.measure",
        "key" : {
            "type" : 1
        },
        "name" : "type_1",
        "v" : 0
    }
]

stats() shows the following:
> db.measure.stats();
{
    "ns" : "main.measure",
    "count" : 8829447,
    "size" : 3338775560,
    "avgObjSize" : 378.1409594508014,
    "storageSize" : 4157424384,
    "numExtents" : 31,
    "nindexes" : 3,
    "lastExtentSize" : 697643776,
    "paddingFactor" : 1,
    "flags" : 1,
    "totalIndexSize" : 1177570112,
    "indexSizes" : {
        "_id_" : 660882368,
        "user_1" : 258343872,
        "type_1" : 258343872
    },
    "ok" : 1
}

Thanks for your help !



Antoine Girbal

Apr 29, 2011, 5:59:30 PM
to mongod...@googlegroups.com
Questions:
- can you give the output of db.measure.find({type:123}).explain()?
- what is the result of the count?
- does the time taken vary based on the result of the count (e.g. for
different types)?

SCO

Apr 30, 2011, 4:40:23 AM
to mongodb-user
Antoine,

Yes, the time for find() varies with the type, so I suppose it does
for count() too:

> db.measure.find({type:2}).explain()
{
    "cursor" : "BtreeCursor type_1",
    "nscanned" : 1305,
    "nscannedObjects" : 1305,
    "n" : 1305,
    "millis" : 1,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : false,
    "indexOnly" : false,
    "indexBounds" : {
        "type" : [
            [
                2,
                2
            ]
        ]
    }
}
> db.measure.find({type:4}).explain()
{
    "cursor" : "BtreeCursor type_1",
    "nscanned" : 33290,
    "nscannedObjects" : 33290,
    "n" : 33290,
    "millis" : 38,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : false,
    "indexOnly" : false,
    "indexBounds" : {
        "type" : [
            [
                4,
                4
            ]
        ]
    }
}

Around 99% of the collection has type 123:

> db.measure.find({type:123}).explain()
{
    "cursor" : "BtreeCursor type_1",
    "nscanned" : 8623280,
    "nscannedObjects" : 8623280,
    "n" : 8623280,
    "millis" : 9628,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : false,
    "indexOnly" : false,
    "indexBounds" : {
        "type" : [
            [
                1,
                1
            ]
        ]
    }
}

count() yields the following:
> db.measure.find({type:1}).count();
8623280

Current server load and I/O are very low:

avg-cpu: %user %nice %system %iowait %steal %idle
0,07 0,00 0,05 0,02 0,00 99,86

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 0,59 1,60 97,53 422492 25712088

Hope this helps !

Thank you for your time.

Antoine Girbal

May 1, 2011, 1:13:46 PM
to mongod...@googlegroups.com
OK, just to confirm: there is some inconsistency in your output.
You query type 123 but it is looking for type 1.
I'm assuming it's just a typo?

So what explain() shows is that mongo must iterate through the index to
count the 8.6M entries.
If the index is fully loaded in RAM this can be fast, but if you see
disk activity during the count, that would explain why it's slow.
Suggestions:
- if you run the count twice in a row and it gets much faster, that
means it depends on the index being in RAM. In that case, if you use
this index often, the operation will be faster (or get more RAM).
- instead of calling count, keep an estimated total count in a
separate collection. When you add one item of type 123, you $inc that
counter. Every few hours, run a count() and fix the counter if there is
any difference.
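A minimal mongo shell sketch of that second suggestion (the `counters` collection, the `measure_type_123` key, and the `n` field are just made-up names for illustration, not anything built in):

```javascript
// Requires a running mongod; all names below are illustrative.

// On every insert of a type-123 measure, bump the side counter.
// The third argument (upsert=true) creates the counter doc on first use.
db.measure.insert({user: 1356, type: 123, measures: []});
db.counters.update({_id: "measure_type_123"}, {$inc: {n: 1}}, true);

// Reading the (approximate) count is now a single-document lookup:
db.counters.findOne({_id: "measure_type_123"}).n;

// Every few hours, reconcile against the real count:
var real = db.measure.find({type: 123}).count();
db.counters.update({_id: "measure_type_123"}, {$set: {n: real}});
```

The $inc is atomic per document, so concurrent inserts from several clients can all bump the same counter safely; the price is that the stored value can lag slightly behind reality between reconciliations.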

AG

Sékine Coulibaly

May 1, 2011, 3:26:08 PM
to mongod...@googlegroups.com
Antoine,

Sorry, just a typo due to housecleaning ;)

The point is that mongod is nearly the only process running on that box; top shows the following:

top - 20:51:52 up 4 days, 11:26,  4 users,  load average: 0.00, 0.06, 0.07
Tasks: 190 total,   1 running, 189 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.1%us,  0.0%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  16457792k total,  9398092k used,  7059700k free,   112152k buffers
Swap:  1044216k total,        0k used,  1044216k free,  8565624k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1928 utilisateur   20   0 6479m 4.5g 4.4g S    0 28.8  35:50.17 mongod
15838 utilisateur2 20   0 70608 1608  884 S    0  0.0   0:00.01 sshd
16040 utilisateur   20   0 19220 1468 1064 R    0  0.0   0:00.02 top
    1 root      20   0 23680 1864 1272 S    0  0.0   0:01.50 init
    2 root      20   0     0    0    0 S    0  0.0   0:00.00 kthreadd
    3 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/0
    4 root      20   0     0    0    0 S    0  0.0   0:00.18 ksoftirqd/0
    5 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/0
    6 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/1
    7 root      20   0     0    0    0 S    0  0.0   0:00.46 ksoftirqd/1
    8 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/1
    9 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/2
   10 root      20   0     0    0    0 S    0  0.0   0:00.08 ksoftirqd/2
   11 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/2
   12 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/3
   13 root      20   0     0    0    0 S    0  0.0   0:00.12 ksoftirqd/3
   14 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/3

As I understand it, mongod is using more than 4 GB at that point, so the indexes must be in RAM.


iostat -x -d 1 reports the following for the 9 seconds spanned by the count() execution:

Linux 2.6.32-22-server (serveur)     01/05/2011      _x86_64_        (8 CPU)

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0,01     8,04    0,01    0,53     1,10    67,55   126,25     0,02   44,89   3,77   0,20

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00   0,00   0,00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00   0,00   0,00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0,00     1,00    0,00    3,00     0,00    32,00    10,67     0,03   10,00  10,00   3,00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00   0,00   0,00

(all remaining 1-second intervals show zeros as well)

In this case, how can we make sure the index is used correctly? I'm afraid the cost is linear: spending 10 seconds to count 10 million rows, on a lightly loaded mongod with everything in memory, hints at something really bad.
I can run sample stress tests if needed.

Regarding your suggestion, I'm quite puzzled; let me explain why. My hand-crafted counter in a separate collection may be inconsistent with the value returned by count() (the one I would run every few hours, as suggested): between the moment count() returns and the moment db.xxx.counters({name:'measures_count'}) returns, enough time may elapse for an additional document to be inserted, so neither count() nor my counter may be accurate. Does this make sense?

To me, having the mongo core maintain a per-collection count itself would be much nicer and more accurate.

Running it twice in a row has no impact, probably because everything is cached.

Thanks for your time Antoine. Any other ideas or investigation results appreciated !

Sekine



Andrew Armstrong

May 1, 2011, 8:47:43 PM
to mongodb-user
As far as I am aware, a count() operation (except for
collection.count(), which is instant because the total document count
for the collection is already known) is pretty much a regular find(),
except that a count is kept instead of documents being returned.

There are no optimizations right now that allow the count to be
retrieved from the index alone - mongo uses the index, but has to peek
at each index record and increment the count, instead of instantly
knowing how many matching records the index contains.

There are a few Jira tickets about this performance problem like
https://jira.mongodb.org/browse/SERVER-1752 which you may want to vote
for (sign up to Jira to do this).

Based on personal experience (with MySQL, though), summarising
thousands of rows quickly - even with indexes - starts to break down
once you have lots of data to analyse.

You should consider creating 'summary' records that aggregate your
data so that you only need to count 10 records to find out the total
instead of say 100,000.

For example, a statistics system I wrote combines 30 minutes' worth
of 5-minute statistic samples into one 30-minute sample, then combines
4 hours of 30-minute samples into one 4-hour sample, ... up to a
day's worth.

Then if I want the totals or average for a week's worth of data, I
just read 7 rows in MySQL instead of 2,016 (288 five-minute samples
per day x 7). In my case the data stays consistent because previously
written records never change (I don't need to go back and recalculate
yesterday's summaries, for example - that's not going to change).

This allowed me to store hundreds of millions of records in MySQL with
no performance problems at all when querying my data set (for known
ranges).

Perhaps you can consider looking into doing something similar for your
system? I am not sure how easily mongo could support more efficient
count operations via an index (maybe it can, which would be
fantastic).

Regards,
Andrew

Sekine Coulibaly

May 2, 2011, 2:55:25 AM
to mongod...@googlegroups.com
Hi Andrew,
I'm glad to see this issue has been sitting in JIRA for months without even being planned.
Let me notify 10gen that this issue is a "no go" for big-data-centred applications. Such an issue would be acceptable for 100k-row collections, but not for a product handling tens of millions of rows. Once again, Postgres is 4 times quicker, on a somewhat loaded machine.

Such a basic thing should not be overlooked and should at least be planned.

I understand the workaround you suggest, but I'm concerned by consistency (what if your box crashes during statistics transfer from one period to another?).

Furthermore, seeing that kind of issue put aside does not make me feel confident.

I'll definitely vote for this Jira issue. To me, it's a must-have "feature".

Thanks again !

Antoine Girbal

May 2, 2011, 1:57:24 PM
to mongodb-user
Hi Sekine,
we are indeed planning to improve the performance of count(); right now
the reason it is slow is that the comparison between values is
suboptimal, even on simple integers.
You can vote up the Jira issue.
If you store the count in a side collection and keep it mostly in
sync, this can give you a quick estimate of the count.
That way your count will be very scalable, even with billions of rows.
The downside is that the value is only eventually consistent and may be
slightly inaccurate at times.
AG
