MongoDB replica set CPU issue

Torgeir

May 21, 2012, 7:06:58 AM

to mongodb-user

Hi,

I have a question about my Mongo replica set. One secondary node
consumes more CPU (8-9%) than the other secondary (1-2%). It is a
3-node set running v2.0.2. The nodes have identical setups, and I have
2 application nodes talking to the MongoDB nodes. How do I figure out
why this is happening?

Thanks,
Torgeir

Timothy Hawkins

May 21, 2012, 7:41:18 AM

to mongod...@googlegroups.com, mongodb-user

Check db.serverStatus() on each node to see whether the number of queries and getmores (the opcounters section) is evenly distributed.
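
A minimal sketch of that comparison, assuming you take two snapshots of the opcounters section on each node a known number of seconds apart. The snapshot values, node names and the 0.5 ops/s tolerance are made up for illustration; with pymongo the snapshots could come from client.admin.command("serverStatus")["opcounters"].

```python
# Sketch: compare per-second opcounter rates between replica set members.
# Each snapshot mimics the "opcounters" sub-document of serverStatus.
# All numbers and node names below are hypothetical.

def opcounter_rates(before, after, seconds):
    """Per-second rate for each counter between two snapshots."""
    return {name: (after[name] - before[name]) / seconds for name in after}


def uneven_counters(rates_by_node, tolerance=0.5):
    """Return counters whose rate differs between nodes by more than `tolerance` ops/s."""
    names = set()
    for rates in rates_by_node.values():
        names.update(rates)
    uneven = {}
    for name in sorted(names):
        values = {node: rates.get(name, 0.0) for node, rates in rates_by_node.items()}
        if max(values.values()) - min(values.values()) > tolerance:
            uneven[name] = values
    return uneven


# Two snapshots per node, taken 10 seconds apart (hypothetical values).
node2_rates = opcounter_rates(
    {"query": 1000, "getmore": 200, "command": 500},
    {"query": 1045, "getmore": 206, "command": 528},
    seconds=10,
)
node3_rates = opcounter_rates(
    {"query": 2000, "getmore": 0, "command": 700},
    {"query": 2049, "getmore": 0, "command": 727},
    seconds=10,
)

uneven = uneven_counters({"node2": node2_rates, "node3": node3_rates})
for name, values in uneven.items():
    print(name, values)
```

With these made-up numbers only the getmore counter is reported, because the query and command rates are within the tolerance of each other.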

gregor

May 21, 2012, 7:49:20 AM

to mongod...@googlegroups.com

You can use mongostat

http://docs.mongodb.org/manual/reference/mongostat/

and also mongotop

http://docs.mongodb.org/manual/reference/mongotop/

Torgeir Lund

May 21, 2012, 8:59:21 AM

to mongod...@googlegroups.com

Thanks, but I have already done that, and the two nodes look almost identical. Here's a sample:

Node2, with higher CPU (8-9%):

insert query update delete getmore command flushes mapped vsize    res faults locked % idx miss %     qr|qw   ar|aw netIn netOut conn       set repl       time
    *0      6     *1     *0       1     1|0       0 5.16g 11.2g   802m      0        0          0       0|0     1|0   499b    28k    29 mydb SEC   11:17:18
    *0      6     *0     *0       0     5|0       0 5.16g 11.2g   802m      0        0          0       0|0     1|0   852b    28k    29 mydb SEC   11:17:19
    *0      6     *0     *0       1     3|0       0 5.16g 11.2g   802m      0        0          0       0|0     1|0   615b    28k    29 mydb SEC   11:17:20
    *0      3     *0     *0       0     3|0       0 5.16g 11.2g   802m      0        0          0       0|0     1|0   541b    15k    29 mydb SEC   11:17:21
    *0      4     *1     *0       1     1|0       0 5.16g 11.2g   802m      0        0          0       0|0     1|0   369b    19k    29 mydb SEC   11:17:22
    *0      4     *1     *0       1     3|0       0 5.16g 11.2g   802m      0        0          0       0|0     1|0   653b    19k    29 mydb SEC   11:17:23
    *0      4     *0     *0       0     3|0       0 5.16g 11.2g   802m      0        0          0       0|0     1|0   438b    20k    29 mydb SEC   11:17:24
    *0      3     *0     *0       1     5|0       0 5.16g 11.2g   802m      0        0          0       0|0     1|0   704b    15k    29 mydb SEC   11:17:25
    *0      5     *0     *0       0     1|0       0 5.16g 11.2g   802m      0        0          0       0|0     1|0   387b    23k    29 mydb SEC   11:17:26
    *0      4     *0     *0       1     3|0       0 5.16g 11.2g   802m      0        0          0       0|0     1|0   653b    20k    29 mydb SEC   11:17:27

Node3, with lower CPU (1-2%):

insert query update delete getmore command flushes mapped vsize    res faults locked % idx miss %     qr|qw   ar|aw netIn netOut conn       set repl       time
    *0      5     *0     *0       0     4|0       0 5.16g 11.2g   818m      0        0          0       0|0     0|0   639b    24k    28 mydb SEC   11:21:29
    *0      4     *0     *0       0     2|0       0 5.16g 11.2g   818m      0        0          0       0|0     0|0   464b    19k    28 mydb SEC   11:21:30
    *0      5     *0     *0       0     4|0       0 5.16g 11.2g   818m      0        0          0       0|0     0|0   645b    24k    28 mydb SEC   11:21:31
    *0      7     *0     *0       0     2|0       0 5.16g 11.2g   818m      0        0          0       0|0     0|0   659b    33k    28 mydb SEC   11:21:32
    *0      3     *3     *0       0     2|0       0 5.16g 11.2g   818m      0        0          0       0|0     0|0   399b    14k    28 mydb SEC   11:21:33
    *0      5     *0     *0       0     4|0       0 5.16g 11.2g   818m      0        0          0       0|0     0|0   645b    24k    28 mydb SEC   11:21:34
    *0      7     *0     *0       0     2|0       0 5.16g 11.2g   818m      0        0          0       0|0     0|0   659b    32k    28 mydb SEC   11:21:35
    *0      6     *1     *0       0     4|0       0 5.16g 11.2g   817m      0      0.1          0       0|0     0|0   710b    29k    28 mydb SEC   11:21:36
    *0      3     *0     *0       0     2|0       0 5.16g 11.2g   817m      0        0          0       0|0     0|0   399b    14k    28 mydb SEC   11:21:37
    *0      5     *1     *0       0     2|0       0 5.16g 11.2g   817m      0        0          0       0|0     0|0   525b    23k    28 mydb SEC   11:21:38
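
To compare the two samples column by column, the operation counts can be totalled per node. The following is a rough sketch, not an official tool: the rows are copied from the samples above but trimmed to the first five columns (insert, query, update, delete, getmore), and the function name is made up for illustration.

```python
# Sketch: total the operation columns of the mongostat samples per node.
# Rows are trimmed to the first five columns of the output shown above.
COLUMNS = ["insert", "query", "update", "delete", "getmore"]

NODE2_SAMPLE = """\
insert query update delete getmore
*0 6 *1 *0 1
*0 6 *0 *0 0
*0 6 *0 *0 1
*0 3 *0 *0 0
*0 4 *1 *0 1
*0 4 *1 *0 1
*0 4 *0 *0 0
*0 3 *0 *0 1
*0 5 *0 *0 0
*0 4 *0 *0 1
"""

NODE3_SAMPLE = """\
insert query update delete getmore
*0 5 *0 *0 0
*0 4 *0 *0 0
*0 5 *0 *0 0
*0 7 *0 *0 0
*0 3 *3 *0 0
*0 5 *0 *0 0
*0 7 *0 *0 0
*0 6 *1 *0 0
*0 3 *0 *0 0
*0 5 *1 *0 0
"""


def column_totals(sample):
    """Sum each operation column over all data rows of a mongostat sample."""
    totals = dict.fromkeys(COLUMNS, 0)
    for line in sample.splitlines():
        fields = line.split()
        if len(fields) < len(COLUMNS) or fields[0] == "insert":
            continue  # skip blank lines and the header row
        for name, value in zip(COLUMNS, fields):
            # A leading "*" marks a replicated operation; strip it before parsing.
            totals[name] += int(value.lstrip("*"))
    return totals


node2_totals = column_totals(NODE2_SAMPLE)
node3_totals = column_totals(NODE3_SAMPLE)
for name in COLUMNS:
    print(f"{name:8} node2={node2_totals[name]:3} node3={node3_totals[name]:3}")
```

Over these ten seconds the query and update totals are close (45 vs 50 queries, 3 vs 5 updates), while getmore is 6 on Node2 and 0 on Node3.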

Torgeir

gregor

May 22, 2012, 2:52:49 AM

to mongod...@googlegroups.com, tor...@objectplanet.com

It looks like the one with the higher CPU is having getMore called by the secondary - its oplog is being tailed. This would give you higher CPU usage. Try calling rs.stepDown() on the primary to force a failover and see if the situation is reversed.

Torgeir Lund

May 22, 2012, 4:19:45 AM

to mongod...@googlegroups.com

Thanks, but I can't risk disrupting our service right now. Can you explain why one node is not tailing the oplog? Shouldn't all secondaries do that? I've checked both secondaries, and both seem to have all recent data. Everything seems to be working OK except the CPU issue.

I guess I don't really understand the getMore / oplog issue.

Torgeir.

gregor

May 22, 2012, 4:30:07 AM

to mongod...@googlegroups.com, tor...@objectplanet.com

Sorry, I meant that the machine showing getMore operations is the primary, and it is being queried by the secondaries. You can get a good view of what your system is doing (and any issues that are cropping up) using MMS. Have you considered installing that?

https://mms.10gen.com/help/install.html
But this level of CPU usage is nothing to be concerned about.

Torgeir Lund

May 22, 2012, 7:34:07 AM

to mongod...@googlegroups.com

Thank you, I will look into MMS.

I will continue to monitor our mongo instances, and report back if I find anything interesting.

gregor

May 22, 2012, 9:18:46 AM

to mongod...@googlegroups.com, tor...@objectplanet.com

OK :)
