So I switched my production environment to WiredTiger again, with no success.
I did see lower disk usage right after changing the storage engine, but after a few hours the read throughput became huge, the CPU I/O wait percentage skyrocketed, and I switched back to MMAPv1.
I also see that the "writers tickets" metric drops to zero right after changing the storage engine on the primary shard. All the other shards have 100+ tickets available.
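For anyone who wants to check the same ticket counter directly rather than through a monitoring chart, here is a minimal sketch. It assumes the input is a `serverStatus` document from a node running WiredTiger, where the counters sit under `wiredTiger.concurrentTransactions`; with pymongo such a document could be fetched with `client.admin.command("serverStatus")`. The helper name `available_write_tickets` and the sample values are made up for illustration and are not taken from my cluster.

```python
def available_write_tickets(server_status):
    """Return the number of free WiredTiger write tickets, or None if absent.

    `server_status` is assumed to be a dict shaped like serverStatus output.
    """
    wired_tiger = server_status.get("wiredTiger", {})
    tickets = wired_tiger.get("concurrentTransactions", {})
    return tickets.get("write", {}).get("available")


# Illustrative values only: a node whose write tickets are all in use.
sample_status = {
    "wiredTiger": {
        "concurrentTransactions": {
            "write": {"out": 128, "available": 0, "totalTickets": 128},
            "read": {"out": 3, "available": 125, "totalTickets": 128},
        }
    }
}

print(available_write_tickets(sample_status))
```

A node still on MMAPv1 has no `wiredTiger` section, so the helper returns `None` there instead of raising.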
2015-11-30T23:05:43.948+0000 I WRITE [conn1181] update mydb.users query: { _id: ObjectId('517551f17ae3cf912500xxxx') } update: { $inc: { account.field: -yy } } nscanned:1 nscannedObjects:1 nMatched:1 nModified:1 keyUpdates:0 writeConflicts:2 numYields:1 locks:{ Global: { acquireCount: { r: 3, w: 3 } }, Database: { acquireCount: { w: 3 } }, Collection: { acquireCount: { w: 2 } }, oplog: { acquireCount: { w: 1 } } } 7458ms
2015-11-30T23:05:43.948+0000 I COMMAND [conn1181] command mydb.$cmd command: update { update: "users", updates: [ { q: { _id: ObjectId('517551f17ae3cf912500xxxx') }, u: { $inc: { account.field: -yy } }, multi: false, upsert: false } ], writeConcern: { w: 1 }, ordered: true, metadata: { shardName: "rs2", shardVersion: [ Timestamp 0|0, ObjectId('000000000000000000000000') ], session: 0 } } ntoreturn:1 keyUpdates:0 writeConflicts:0 numYields:0 reslen:155 locks:{ Global: { acquireCount: { r: 3, w: 3 } }, Database: { acquireCount: { w: 3 } }, Collection: { acquireCount: { w: 2 } }, oplog: { acquireCount: { w: 1 } } } 7458ms
2015-11-30T23:05:43.953+0000 W - [conn1281] DBException thrown :: caused by :: 112 WriteConflict
2015-11-30T23:05:43.957+0000 I - [conn1142]
2015-11-30T23:05:43.957+0000 W - [conn1312] DBException thrown :: caused by :: 112 WriteConflict
2015-11-30T23:05:43.962+0000 W - [conn1358] DBException thrown :: caused by :: 112 WriteConflict
2015-11-30T23:05:43.968+0000 I - [conn1313]
2015-11-30T23:05:43.972+0000 W - [conn1301] DBException thrown :: caused by :: 112 WriteConflict
2015-11-30T23:05:43.978+0000 W - [conn1058] DBException thrown :: caused by :: 112 WriteConflict
2015-11-30T23:05:43.984+0000 W - [conn1484] DBException thrown :: caused by :: 112 WriteConflict
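To get a feel for how widespread the conflicts are, a log excerpt like the one above could be tallied with a short script. This is only a rough sketch: the two regular expressions are written against the line format shown in this post, and the function name `summarize` is hypothetical. It counts `WriteConflict` warnings per connection and collects the `writeConflicts` count and duration from slow-operation lines.

```python
import re
from collections import Counter

# Matches warning lines such as:
#   ... W - [conn1281] DBException thrown :: caused by :: 112 WriteConflict
CONFLICT_RE = re.compile(
    r"\[(conn\d+)\] DBException thrown :: caused by :: 112 WriteConflict"
)

# Matches slow-operation lines that report writeConflicts:N and end in "<N>ms".
SLOW_OP_RE = re.compile(r"\[(conn\d+)\] .*writeConflicts:(\d+).* (\d+)ms$")


def summarize(lines):
    """Return (conflict warnings per connection, list of slow operations).

    Each slow operation is a tuple (connection, writeConflicts, duration_ms).
    """
    conflicts = Counter()
    slow_ops = []
    for line in lines:
        line = line.rstrip()
        match = CONFLICT_RE.search(line)
        if match:
            conflicts[match.group(1)] += 1
            continue
        match = SLOW_OP_RE.search(line)
        if match:
            conn, count, millis = match.groups()
            slow_ops.append((conn, int(count), int(millis)))
    return conflicts, slow_ops


# Two warning lines copied from the excerpt above.
demo_lines = [
    "2015-11-30T23:05:43.953+0000 W - [conn1281] DBException thrown :: caused by :: 112 WriteConflict",
    "2015-11-30T23:05:43.957+0000 W - [conn1312] DBException thrown :: caused by :: 112 WriteConflict",
]
demo_conflicts, demo_slow_ops = summarize(demo_lines)
print(demo_conflicts, demo_slow_ops)
```

Passing an open log file to `summarize` works the same way, since it only iterates over lines.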
In this "users" collection, a few users are more popular, so their documents are accessed much more often than the others. But if MMAPv1 only has collection-level locking, why is document-level locking slower?
Thank you :D