How to improve poor indexed count() performance?


SCO

Apr 29, 2011, 4:22:18 PM
to mongodb-user
Hi,

I'm currently running a MongoDB 1.8.1 server on a 16 GB RAM, quad-core,
64-bit Ubuntu box.
My main test collection currently has over 10 million documents and
two indexes, each on an integer field (user and type):

> db.measure.findOne();
{
    "_id" : ObjectId("4dba913e9d1cdd3b00000000"),
    "user" : 1356,
    "type" : 123,
    "measures" : [
        {
            "value" : -2500,
            "type" : 14
        },
        {
            "value" : -2900,
            "type" : 15
        }
    ]
}

The distribution is such that the selectivity of the "user" index is
excellent (e.g. user 1356 has around 1000 documents referenced among
9 million, and roughly the same holds for other users).
However, the selectivity of the "type" index is poor (99% of the
documents have the same type [for now! the data will change, so an
index is needed]).

The following query is almost instant:
db.measure.find({user:78}).count();

db.measure.find().count() is almost instant too.

However, the following takes approx. 9 seconds:
db.measure.find({type:123}).count();

This looks extremely slow and disappointing.

For comparison, the same query on the same data in PostgreSQL, with
the same indexes, takes 2 seconds.

Any ideas on how I can improve the performance of this MongoDB query?

Thank you !

SCO

chris anderton

Apr 29, 2011, 4:49:02 PM
to mongodb-user
What indexes do you have defined? What's the output of:

db.measure.getIndexes();

SCO

Apr 29, 2011, 4:52:49 PM
to mongodb-user
Hi Chris,

Yes, sorry, here they are:

> db.measure.getIndexes();
[
    {
        "name" : "_id_",
        "ns" : "main.measure",
        "key" : {
            "_id" : 1
        },
        "v" : 0
    },
    {
        "_id" : ObjectId("4dbadfd266acd8c868b247a7"),
        "ns" : "main.measure",
        "key" : {
            "user" : 1
        },
        "name" : "user_1",
        "v" : 0
    },
    {
        "_id" : ObjectId("4dbae45866acd8c868b247a8"),
        "ns" : "main.measure",
        "key" : {
            "type" : 1
        },
        "name" : "type_1",
        "v" : 0
    }
]

stats() shows the following:
> db.measure.stats();
{
    "ns" : "main.measure",
    "count" : 8829447,
    "size" : 3338775560,
    "avgObjSize" : 378.1409594508014,
    "storageSize" : 4157424384,
    "numExtents" : 31,
    "nindexes" : 3,
    "lastExtentSize" : 697643776,
    "paddingFactor" : 1,
    "flags" : 1,
    "totalIndexSize" : 1177570112,
    "indexSizes" : {
        "_id_" : 660882368,
        "user_1" : 258343872,
        "type_1" : 258343872
    },
    "ok" : 1
}

Thanks for your help !



Antoine Girbal

Apr 29, 2011, 5:59:30 PM
to mongod...@googlegroups.com
Questions:
- can you give the output of db.measure.find({type:123}).explain()?
- what is the result of the count?
- does the time taken vary based on the result of the count (e.g. for
different types)?

SCO

Apr 30, 2011, 4:40:23 AM
to mongodb-user
Antoine,

Yes, the time for find() varies with the type, so I suppose it does
for count() too:

> db.measure.find({type:2}).explain()
{
    "cursor" : "BtreeCursor type_1",
    "nscanned" : 1305,
    "nscannedObjects" : 1305,
    "n" : 1305,
    "millis" : 1,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : false,
    "indexOnly" : false,
    "indexBounds" : {
        "type" : [
            [
                2,
                2
            ]
        ]
    }
}
> db.measure.find({type:4}).explain()
{
    "cursor" : "BtreeCursor type_1",
    "nscanned" : 33290,
    "nscannedObjects" : 33290,
    "n" : 33290,
    "millis" : 38,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : false,
    "indexOnly" : false,
    "indexBounds" : {
        "type" : [
            [
                4,
                4
            ]
        ]
    }
}

Around 99% of the collection has type 123:

> db.measure.find({type:123}).explain()
{
    "cursor" : "BtreeCursor type_1",
    "nscanned" : 8623280,
    "nscannedObjects" : 8623280,
    "n" : 8623280,
    "millis" : 9628,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : false,
    "indexOnly" : false,
    "indexBounds" : {
        "type" : [
            [
                1,
                1
            ]
        ]
    }
}

count() yields the following:
> db.measure.find({type:1}).count();
8623280

Current server load and I/O are very low:

avg-cpu: %user %nice %system %iowait %steal %idle
0,07 0,00 0,05 0,02 0,00 99,86

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 0,59 1,60 97,53 422492 25712088

Hope this helps !

Thank you for your time.

Antoine Girbal

May 1, 2011, 1:13:46 PM
to mongod...@googlegroups.com
OK, just to confirm: there is some inconsistency in your output.
You query type 123 but it is looking for type 1.
I'm assuming it's just a typo?

So what explain() shows is that mongo must iterate through the index to
count the 8.6M entries.
If the index is fully loaded in RAM this can be fast, but if you see
disk activity during the count, that would explain why it's slow.
Suggestions:
- if you run the count twice in a row and it gets much faster, that
means it depends on the index being in RAM. In that case, if you use
this index often, the operation will be faster (or get more RAM).
- instead of calling count, keep an estimated total count in a
separate collection. When you add one item of type 123, you $inc that
counter. Every few hours, run a count() and fix the counter if there is
any difference.
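A minimal mongo shell sketch of that second suggestion (the `counters` collection, the `measure_type_123` key, and the `n` field are just made-up names for illustration, not anything built in):

```javascript
// Requires a running mongod; all names below are illustrative.

// On every insert of a type-123 measure, bump the side counter.
// The third argument (upsert=true) creates the counter doc on first use.
db.measure.insert({user: 1356, type: 123, measures: []});
db.counters.update({_id: "measure_type_123"}, {$inc: {n: 1}}, true);

// Reading the (approximate) count is now a single-document lookup:
db.counters.findOne({_id: "measure_type_123"}).n;

// Every few hours, reconcile against the real count:
var real = db.measure.find({type: 123}).count();
db.counters.update({_id: "measure_type_123"}, {$set: {n: real}});
```

The $inc is atomic per document, so concurrent inserts from several clients can all bump the same counter safely; the price is that the stored value can lag slightly behind reality between reconciliations.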

AG

Sékine Coulibaly

May 1, 2011, 3:26:08 PM
to mongod...@googlegroups.com
Antoine,

Sorry, just a typo due to housecleaning ;)

The point is that mongod is nearly the only process running on that box; top shows the following:

top - 20:51:52 up 4 days, 11:26,  4 users,  load average: 0.00, 0.06, 0.07
Tasks: 190 total,   1 running, 189 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.1%us,  0.0%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  16457792k total,  9398092k used,  7059700k free,   112152k buffers
Swap:  1044216k total,        0k used,  1044216k free,  8565624k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1928 utilisateur   20   0 6479m 4.5g 4.4g S    0 28.8  35:50.17 mongod
15838 utilisateur2 20   0 70608 1608  884 S    0  0.0   0:00.01 sshd
16040 utilisateur   20   0 19220 1468 1064 R    0  0.0   0:00.02 top
    1 root      20   0 23680 1864 1272 S    0  0.0   0:01.50 init
    2 root      20   0     0    0    0 S    0  0.0   0:00.00 kthreadd
    3 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/0
    4 root      20   0     0    0    0 S    0  0.0   0:00.18 ksoftirqd/0
    5 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/0
    6 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/1
    7 root      20   0     0    0    0 S    0  0.0   0:00.46 ksoftirqd/1
    8 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/1
    9 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/2
   10 root      20   0     0    0    0 S    0  0.0   0:00.08 ksoftirqd/2
   11 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/2
   12 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/3
   13 root      20   0     0    0    0 S    0  0.0   0:00.12 ksoftirqd/3
   14 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/3

As I understand it, mongod is using more than 4 GB at that point, so the indexes must be in RAM.


iostat -x -d 1 reports the following for the 9 seconds spanned by the count() execution:

Linux 2.6.32-22-server (serveur)     01/05/2011      _x86_64_        (8 CPU)

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0,01     8,04    0,01    0,53     1,10    67,55   126,25     0,02   44,89   3,77   0,20

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00   0,00   0,00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00   0,00   0,00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0,00     1,00    0,00    3,00     0,00    32,00    10,67     0,03   10,00  10,00   3,00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00   0,00   0,00

(all remaining 1-second intervals show zeros as well)

In this case, how can we make sure the index is used correctly? I'm afraid the cost is linear: spending 10 seconds to count 10 million rows, on a lightly loaded mongod with everything in memory, hints at something really bad.
I can run sample stress tests if needed.

Regarding your suggestion, I'm quite puzzled; let me explain why. My hand-crafted counter in a separate collection may be inconsistent with the value returned by count() (the one I would run every few hours, as suggested): between the moment count() returns and the moment db.xxx.counters({name:'measures_count'}) returns, enough time may elapse for an additional document to be inserted, so neither count() nor my counter may be accurate. Does this make sense?

To me, having the mongo core maintain a per-collection count itself would be much nicer and more accurate.

Running it twice in a row has no impact, probably because everything is cached.

Thanks for your time Antoine. Any other ideas or investigation results appreciated !

Sekine



Andrew Armstrong

May 1, 2011, 8:47:43 PM
to mongodb-user
As far as I am aware, a count() operation (except for
collection.count(), which is instant because the total document count
for the collection is already known) is pretty much a regular find(),
except that a count is kept instead of documents being returned.

There are no optimizations right now that allow the count to be
retrieved from the index alone - mongo uses the index, but has to peek
at each index record and increment the count, instead of instantly
knowing how many matching records the index contains.

There are a few Jira tickets about this performance problem like
https://jira.mongodb.org/browse/SERVER-1752 which you may want to vote
for (sign up to Jira to do this).

Based on personal experience (with MySQL, though), summarising
thousands of rows quickly - even with indexes - starts to break down
once you have lots of data to analyse.

You should consider creating 'summary' records that aggregate your
data so that you only need to count 10 records to find out the total
instead of say 100,000.

For example, a statistics system I wrote combines 30 minutes' worth
of 5-minute statistic samples into one 30-minute sample, then combines
4 hours of 30-minute samples into one 4-hour sample, ... up to a
day's worth.

Then if I want the totals or average for a week's worth of data, I
just read 7 rows in MySQL instead of 2,016 (288 five-minute samples
per day x 7). In my case the data stays consistent because previously
written records never change (I don't need to go back and recalculate
yesterday's summaries, for example - that's not going to change).

This allowed me to store hundreds of millions of records in MySQL with
no performance problems at all when querying my data set (for known
ranges).

Perhaps you can consider looking into doing something similar for your
system? I am not sure how easily mongo could support more efficient
count operations via an index (maybe it can, which would be
fantastic).

Regards,
Andrew

Sekine Coulibaly

May 2, 2011, 2:55:25 AM
to mongod...@googlegroups.com
Hi Andrew,
I'm glad to see this issue has been sitting in JIRA for months without even being planned.
Let me notify 10gen that this issue is a "no go" for big-data-centred applications. Such an issue would be acceptable for 100k-row collections, but not for a product handling tens of millions of rows. Once again, Postgres is 4 times quicker, on a somewhat loaded machine.

Such a basic thing should not be overlooked and should at least be planned.

I understand the workaround you suggest, but I'm concerned by consistency (what if your box crashes during statistics transfer from one period to another?).

Furthermore, seeing that kind of issue put aside does not make me feel confident.

I'll definitely vote for this Jira issue. To me, it's a must-have "feature".

Thanks again !

Antoine Girbal

May 2, 2011, 1:57:24 PM
to mongodb-user
Hi Sekine,
we are indeed planning to improve the performance of count(); right now
the reason it is slow is that the comparison between values is
suboptimal, even on simple integers.
You can vote up the Jira issue.
If you store the count in a side collection and keep it mostly in
sync, this can give you a quick estimate of the count.
That way your count will be very scalable, even with billions of rows.
The downside is that the value is only eventually consistent and may be
slightly inaccurate at times.
AG
