Slow operations on a WiredTiger secondary (3.2.6)

Ofer Schreiber

Aug 3, 2016, 2:44:28 PM
to mongodb-user
So we're in the middle of transitioning from MMAP to WiredTiger, and we're encountering some really bad performance issues on the WiredTiger secondaries.

Some details:
We have a single replica set. The current primary is running 3.2.6 with MMAP; the secondary in question is running the same version with the WiredTiger storage engine.
Both servers have 60GB of RAM (this means the WiredTiger cache size is 35GB), 8 CPUs, and an 800GB SSD disk (Amazon i2.2xlarge instance).
Our data is ~80GB (on WiredTiger, including the indexes). There are 10 DBs, of which 3 are heavily used. The biggest DB has ~30K collections, ~60M objects, and 40GB in total.
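
For reference, the 35GB figure matches the MongoDB 3.2 default, which is the larger of 60% of RAM minus 1GB, or 1GB. A minimal sketch of that arithmetic in plain JavaScript (the function name is made up for illustration, and it assumes cacheSizeGB has not been set explicitly in the config):

```javascript
// Default WiredTiger cache size under the MongoDB 3.2 rule:
// the larger of (60% of RAM - 1GB) and 1GB.
// Hypothetical helper name; assumes no explicit cacheSizeGB override.
function defaultWiredTigerCacheGB(ramGB) {
  return Math.max(0.6 * ramGB - 1, 1);
}

// 60GB of RAM, as on the servers described above.
console.log(defaultWiredTigerCacheGB(60));
```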

After we moved the secondary to WiredTiger, I started seeing many slow queries in the log, mainly due to the time it takes to acquire locks:
2016-08-03T14:58:46.111+0000 I COMMAND  [conn23] command local.oplog.rs command: collStats { collstats: "oplog.rs" } keyUpdates:0 writeConflicts:0 numYields:0 reslen:4786 locks:{ Global: { acquireCount: { r: 2 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 252271 } }, Database: { acquireCount: { r: 1 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 85 } }, oplog: { acquireCount: { r: 1 } } } protocol:op_query 253ms

2016-08-03T14:58:46.187+0000 I COMMAND  [conn66] command admin.$cmd command: listDatabases { listDatabases: 1.0 } keyUpdates:$
 writeConflicts:0 numYields:0 reslen:721 locks:{ Global: { acquireCount: { r: 22 }, acquireWaitCount: { r: 6 }, timeAcquiringMicros: { r: 8576 } }, Database: { acquireCount: { r: 11 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 66 } } } protocol:op_command 1830ms
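
To check how much of each slow operation is actually lock wait, one option is to sum the timeAcquiringMicros values in a log line and compare that with the total duration at the end of the line. The following is only a rough sketch in plain JavaScript (run under Node.js); the function names are made up, and the regular expressions assume the 3.2-style log layout shown above:

```javascript
// Sum every value inside "timeAcquiringMicros: { ... }" sections of one log line.
// Assumes the MongoDB 3.2 slow-operation log layout shown in this thread.
function lockWaitMicros(logLine) {
  let total = 0;
  const section = /timeAcquiringMicros: \{([^}]*)\}/g;
  let m;
  while ((m = section.exec(logLine)) !== null) {
    // m[1] looks like " r: 252271 " or " r: 10, w: 20 "
    for (const pair of m[1].split(",")) {
      const value = Number(pair.split(":")[1]);
      if (!Number.isNaN(value)) total += value;
    }
  }
  return total;
}

// Extract the trailing total duration ("253ms") in milliseconds, or null if absent.
function totalMillis(logLine) {
  const m = /(\d+)ms\s*$/.exec(logLine);
  return m ? Number(m[1]) : null;
}

// The collStats line from the log excerpt above.
const line = '2016-08-03T14:58:46.111+0000 I COMMAND  [conn23] command local.oplog.rs command: collStats { collstats: "oplog.rs" } keyUpdates:0 writeConflicts:0 numYields:0 reslen:4786 locks:{ Global: { acquireCount: { r: 2 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 252271 } }, Database: { acquireCount: { r: 1 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 85 } }, oplog: { acquireCount: { r: 1 } } } protocol:op_query 253ms';

const waitedMs = lockWaitMicros(line) / 1000;
console.log(`${waitedMs.toFixed(1)}ms of ${totalMillis(line)}ms spent waiting for locks`);
```

For the collStats line, nearly the whole 253ms is time spent acquiring the Global lock. For the listDatabases line, the reported values (8576 + 66 microseconds) add up to under 9ms of the 1830ms total, so lock acquisition may not be the whole story for that command.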

For some reason, the show dbs and db.stats commands take ~1/2 seconds to return, even though the server doesn't seem to be heavily used at all...
The CPU load is < 1, and I couldn't see anything meaningful in the output of mongostat:
root@slow-mongo-server:/db# mongostat 5
insert query update delete getmore command % dirty % used flushes vsize   res qr|qw ar|aw netIn netOut conn   set repl                 time
   *27    *0   *103     *0       0     3|0     1.1   26.7       0 12.9G 12.3G   0|0   0|0  457b     5k   14 api_a  SEC 2016-08-03T15:08:10Z
   *25    *0   *135     *0       0     3|0     0.8   26.7       1 12.9G 12.3G   0|0   0|0  349b     5k   14 api_a  SEC 2016-08-03T15:08:15Z
   *21    *0   *110     *0       0     6|0     0.3   26.7       0 12.9G 12.3G   0|0   0|0  647b    15k   14 api_a  SEC 2016-08-03T15:08:20Z
   *30    *0   *123     *0       0     3|0     0.4   26.7       0 12.9G 12.3G   0|0   0|0  349b     5k   14 api_a  SEC 2016-08-03T15:08:25Z
   *17    *0   *100     *0       0     3|0     0.5   26.7       0 12.9G 12.3G   0|0   0|0  402b     5k   14 api_a  SEC 2016-08-03T15:08:30Z

Any idea what triggers the performance issues, or how I can debug this further?

Ofer Schreiber

Aug 7, 2016, 2:57:43 AM
to mongodb-user
Bump....
Any ideas how I can debug this?

Amar

Aug 16, 2016, 9:56:26 PM
to mongodb-user

Hi Ofer,

Apart from the queries reported in the log file, are you experiencing any performance issues on the replica set? Also, are the log snippets and the mongostat output from the secondary you upgraded to WiredTiger?

The main role of a secondary in a replica set is to provide high availability by performing the operations of the primary node as quickly as possible. This is to ensure that if the primary goes offline, a secondary can instantly replace the primary. Therefore, for a secondary node, the most important metric is how far behind the primary it is in terms of writes.

You can check this with rs.printSlaveReplicationInfo(), which shows how far each secondary is behind the primary.
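
If you would rather compute the lag yourself, for example to feed it into monitoring, here is a rough sketch of the same idea in plain JavaScript (run under Node.js). It assumes a document shaped like the output of rs.status(), with name, stateStr and optimeDate on each member. The function name, host names and timestamps are made up for illustration:

```javascript
// Compute how far each secondary is behind the primary, in seconds,
// from a document shaped like the output of rs.status().
function replicationLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) return null; // no primary visible, so lag cannot be computed
  return status.members
    .filter((m) => m.stateStr === "SECONDARY")
    .map((m) => ({
      name: m.name,
      // Subtracting two Date objects yields milliseconds.
      lagSeconds: (primary.optimeDate - m.optimeDate) / 1000,
    }));
}

// Hypothetical sample; in the mongo shell you would pass rs.status() instead.
const sample = {
  set: "api_a",
  members: [
    { name: "host-a:27017", stateStr: "PRIMARY", optimeDate: new Date("2016-08-03T15:08:30Z") },
    { name: "host-b:27017", stateStr: "SECONDARY", optimeDate: new Date("2016-08-03T15:08:25Z") },
  ],
};

console.log(replicationLagSeconds(sample));
```

A lag that stays near zero would suggest the secondary is keeping up with the primary's writes even if individual diagnostic commands are slow.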

Regards,

Amar


Ofer Cohen

Aug 17, 2016, 4:52:23 PM
to mongodb-user
You should also monitor the MongoDB data partition with the iostat tool.

Thanks