  PID USER         PR  NI   VIRT   RES   SHR S %CPU %MEM   TIME+ COMMAND
16040 utilisateur  20   0  19220  1468  1064 R    0  0.0 0:00.02 top
    1 root         20   0  23680  1864  1272 S    0  0.0 0:01.50 init
    2 root         20   0      0     0     0 S    0  0.0 0:00.00 kthreadd
    3 root         RT   0      0     0     0 S    0  0.0 0:00.00 migration/0
    4 root         20   0      0     0     0 S    0  0.0 0:00.18 ksoftirqd/0
    5 root         RT   0      0     0     0 S    0  0.0 0:00.00 watchdog/0
    6 root         RT   0      0     0     0 S    0  0.0 0:00.00 migration/1
    7 root         20   0      0     0     0 S    0  0.0 0:00.46 ksoftirqd/1
    8 root         RT   0      0     0     0 S    0  0.0 0:00.00 watchdog/1
    9 root         RT   0      0     0     0 S    0  0.0 0:00.00 migration/2
   10 root         20   0      0     0     0 S    0  0.0 0:00.08 ksoftirqd/2
   11 root         RT   0      0     0     0 S    0  0.0 0:00.00 watchdog/2
   12 root         RT   0      0     0     0 S    0  0.0 0:00.00 migration/3
   13 root         20   0      0     0     0 S    0  0.0 0:00.12 ksoftirqd/3
   14 root         RT   0      0     0     0 S    0  0.0 0:00.00 watchdog/3
From my understanding, the mongod server was using more than 4 GB at that time, so the indexes must have been in RAM.
iostat -x -d 1 reports the following for the 9 seconds spanning the count() execution:
Linux 2.6.32-22-server (serveur) 01/05/2011 _x86_64_ (8 CPU)
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.01 8.04 0.01 0.53 1.10 67.55 126.25 0.02 44.89 3.77 0.20
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 1.00 0.00 3.00 0.00 32.00 10.67 0.03 10.00 10.00 3.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
In this case, how can we make sure the index is actually used? I'm afraid the cost is linear: taking 10 seconds to count 10 million documents on a lightly loaded mongod, with everything in memory, suggests something really bad is hiding underneath.
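One way to check whether the index is used is to run the equivalent query through explain() in the mongo shell. This is only a sketch: the collection name (measures) and indexed field (ts) are hypothetical stand-ins, and the query literal is illustrative.

```javascript
// Hypothetical collection/field names. If the index is used, "cursor"
// should report a BtreeCursor on the indexed field rather than a
// BasicCursor, and "nscanned" shows how many entries were examined.
db.measures.find({ts: {$gt: new Date(2011, 0, 1)}}).explain()

// hint() forces a specific index, so timings with and without it
// can be compared:
db.measures.find({ts: {$gt: new Date(2011, 0, 1)}}).hint({ts: 1}).explain()
```

If explain() already reports a BtreeCursor and nscanned close to the matched count, the index is being used and the time is spent walking its entries.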
I can run sample stress tests if needed.
Regarding your suggestion, I'm quite puzzled; let me explain why. I may get an inconsistency between my hand-crafted counter stored in a separate collection and the value returned by count() (the reconciliation I should do every few hours, as suggested): the time elapsed between when count() returns and when db.xxx.counters({name:'measures_count'}) returns might be long enough for an additional document to be inserted in the meanwhile, so neither count() nor my counter might be accurate. Does this make sense?
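For what it's worth, the hand-crafted counter can at least track inserts exactly if every insert is paired with an atomic $inc in the shell or driver. A sketch, with hypothetical collection and field names (measures, counters, n):

```javascript
// Hypothetical names. Each insert is paired with an atomic $inc on the
// counter document (upsert creates it on first use). The counter then
// tracks inserts one-for-one; the residual inaccuracy is only the short
// window between the two operations, which is exactly the race described
// above and which a periodic count() reconciliation can bound.
db.measures.insert({ts: new Date(), value: 42});
db.counters.update({name: 'measures_count'}, {$inc: {n: 1}}, true);

// Reading the counter is a single cheap document lookup:
db.counters.findOne({name: 'measures_count'}).n
```

This doesn't remove the race, but it shrinks it from "time since the last reconciliation" to "time between two consecutive write operations".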
To me, having the mongo core itself maintain a per-collection count would be much nicer and more accurate.
Running it twice in a row has no impact, probably because everything is cached.
Thanks for your time, Antoine. Any other ideas or investigation results appreciated!