Very poor performance after starting to use WiredTiger


leo.a...@gmail.com

25 Nov 2015, 15:28:27
to mongodb-user
Hello,

I have a sharded cluster with 3 replica sets (shardrs1, rs2, shardrs3) and I'm having performance issues after changing the storage engine to WiredTiger.

Some data from the primary member of the replica set for the primary shard (it's an m2.4xlarge instance):

1) Before changes:

EBS Data:
Disk: 150GB with 4500 PIOPS, File System: ext4

 
New Relic Servers Data:

Disk I/O - Utilization: ~1%, with 30-min peaks of ~4%
Disk I/O - Rate: ~0.8 MB/s, with 30-min peaks of ~15 MB/s
Disk I/O - Operations per second: ~55, with 30-min peaks of ~300

Network Usage - Bandwidth: 146 MB/s
Network Usage - Packets per second: 4.49

Process - RAM: 19 GB
Process - CPU Usage: 8.6%

MMS Data:

Opcounter:
GetMore 683
Update 861
Query 895
Command 589

Configuration File:
dbpath=/var/lib/mongodb
logpath=/var/log/mongodb/mongodb.log
logappend=true
port = 27018
replSet = rs2
keyFile = /etc/secret
rest = true 

2) After changes:

EBS Data:

Disk: 150GB with 3000 PIOPS, File System: xfs

New Relic Servers Data (avg over 5 min):
Disk I/O - Utilization: 6.5%
Disk I/O - Rate: 21.8 MB/s
Disk I/O - Operations per second: 518

Network Usage - Bandwidth: 42.5 MB/s
Network Usage - Packets per second: 1.52

Process - RAM: 27 GB
Process - CPU Usage: 28.8%

Configuration File:
systemLog:
   verbosity: 0
   quiet: false
   traceAllExceptions: true
   path: "/var/log/mongodb/mongodb.log"
   logAppend: true
   logRotate: "rename"
   destination: "file"
   component:
      accessControl:
         verbosity: 5
net:
   port: ----
   wireObjectCheck: true
security:
   keyFile: ----
   authorization: "enabled"
storage:
   dbPath: "/var/lib/mongodb-wt/"
   repairPath: /var/lib/mongodb-wt/repair/
   engine: "wiredTiger"
   wiredTiger:
      engineConfig:
         journalCompressor: "none"
      collectionConfig:
         blockCompressor: "none"
replication:
   replSetName:  ----
sharding:
   clusterRole: "shardsvr"


MMS Data:

Opcounter:
GetMore 223.21
Update 195
Query 247
Command 246
 

(I replaced a few values with "----" for security reasons.)

My application slowed down the moment the changes were finished and this replica member was re-elected as primary due to its priority.

As the EBS volume was new, my first assumption was that it needed a "pre-warm", but the EBS docs now say that "New EBS volumes receive their maximum performance the moment that they are available and do not require initialization (formerly known as pre-warming)".

My other replica sets, which run on smaller instances (m1.xlarge), are running with even higher disk usage:

ubuntu@ip-10-238-134-53:~$ iostat 10
Linux 3.13.0-36-generic (ip-10-238-134-53) 11/25/2015 _x86_64_ (2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.19    0.01    3.18    7.69    0.13   84.81

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
xvdap1            0.23         3.46         1.03   31611457    9431676
xvdb              0.00         0.00         0.00       1573          4
xvdf              0.59         0.01         5.11     105421   46760928
xvdg            289.99      4236.77      1670.93 38753180425 15283732632
xvdh             17.80       553.89       170.50 5066314541 1559582544

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          13.41    0.00   29.63   55.27    0.10    1.59

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
xvdap1            0.10         0.80         0.00          8          0
xvdb              0.00         0.00         0.00          0          0
xvdf              1.00         0.00         8.80          0         88
xvdg            254.10      3336.80      1645.60      33368      16456
xvdh           1836.40     55466.80     18480.00     554668     184800

WiredTiger is running on /dev/xvdh.


Can you help me? I really don't know why this is happening...

Thanks in advance

Rhys Campbell

26 Nov 2015, 10:21:44
to mongodb-user

leo.a...@gmail.com

27 Nov 2015, 11:53:11
to mongodb-user
Hello,

No, the disk was formatted with XFS from the start:

2) After changes:
EBS Data:
Disk: 150GB with 3000 PIOPS, File System: xfs

Just checking:

$ sudo file -sL /dev/xvdh
/dev/xvdh: SGI XFS filesystem data (blksz 4096, inosz 256, v2 dirs)

So I think this is not the problem. Any other ideas?

Asya Kamsky

27 Nov 2015, 18:46:42
to mongodb-user
I'm not exactly clear on what the "after" operation stats are but I'm really surprised by this:

   wiredTiger:
      engineConfig:
         journalCompressor: "none"
      collectionConfig:
         blockCompressor: "none"

Why did you turn off compression?  That requires a lot more I/O to go to disk and, more importantly, means far less of your data files will fit into the file system cache if your entire dataset does not fit in RAM.
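
(For reference, snappy is the WiredTiger default for both of these settings; a minimal sketch of the same config block with compression re-enabled, mirroring the format quoted above, would be:)

   wiredTiger:
      engineConfig:
         journalCompressor: "snappy"   # the default; "zlib" and "none" are the alternatives
      collectionConfig:
         blockCompressor: "snappy"     # the default; "zlib" and "none" are the alternatives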

Asya


--
Asya Kamsky
Lead Product Manager
MongoDB
Download MongoDB - mongodb.org/downloads
Free MongoDB Monitoring - cloud.mongodb.com
Free Online Education - university.mongodb.com
Get Involved - mongodb.org/community
We're Hiring! - https://www.mongodb.com/careers

leo.a...@gmail.com

30 Nov 2015, 13:55:50
to mongodb-user
Hello Asya,

The "after" stats are the stats collected after the change to WiredTiger.

As MMAPv1 doesn't use compression and CPU usage grew after activating WiredTiger, I deactivated compression so as not to increase CPU usage even further.
In the MMS dashboards, I don't see many page faults, so I believe my whole data set fits in RAM.
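
(One way to sanity-check this on the WiredTiger node is to compare the WiredTiger cache usage against its configured maximum; a diagnostic sketch in the mongo shell, using the standard serverStatus output:)

// Compare current WiredTiger cache usage with the configured maximum.
var cache = db.serverStatus().wiredTiger.cache;
print("maximum bytes configured:     " + cache["maximum bytes configured"]);
print("bytes currently in the cache: " + cache["bytes currently in the cache"]);
print("pages read into cache:        " + cache["pages read into cache"]);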

Even so, I'll run a few tests to check this possibility and come back here with the results.

Thank you



leo.a...@gmail.com

1 Dec 2015, 11:39:15
to mongodb-user
Hi,

So I switched my production environment to WiredTiger again, with no success.

Actually, I did see lower disk usage right after changing the storage engine, but after a few hours the read throughput became huge, the CPU I/O wait percentage skyrocketed, and I switched back to MMAPv1.

I also see that the "write tickets" metric drops to zero right after changing the storage engine on the primary shard. All the other shards have 100+ tickets available.
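
(For reference, those ticket counts come from serverStatus; a quick sketch of how to read them in the mongo shell on the affected primary:)

// WiredTiger concurrent-transaction tickets; "available" near 0 for "write"
// means every write slot is in use.
var t = db.serverStatus().wiredTiger.concurrentTransactions;
printjson(t.write);   // e.g. { out: ..., available: ..., totalTickets: 128 }
printjson(t.read);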

Digging into the logs, I see that it's throwing "write conflict" errors like these:

2015-11-30T23:05:43.948+0000 I WRITE    [conn1181] update mydb.users query: { _id: ObjectId('517551f17ae3cf912500xxxx') } update: { $inc: { account.field: -yy } } nscanned:1 nscannedObjects:1 nMatched:1 nModified:1 keyUpdates:0 writeConflicts:2 numYields:1 locks:{ Global: { acquireCount: { r: 3, w: 3 } }, Database: { acquireCount: { w: 3 } }, Collection: { acquireCount: { w: 2 } }, oplog: { acquireCount: { w: 1 } } } 7458ms
2015-11-30T23:05:43.948+0000 I COMMAND  [conn1181] command mydb.$cmd command: update { update: "users", updates: [ { q: { _id: ObjectId('517551f17ae3cf912500xxxx') }, u: { $inc: { account.field: -yy } }, multi: false, upsert: false } ], writeConcern: { w: 1 }, ordered: true, metadata: { shardName: "rs2", shardVersion: [ Timestamp 0|0, ObjectId('000000000000000000000000') ], session: 0 } } ntoreturn:1 keyUpdates:0 writeConflicts:0 numYields:0 reslen:155 locks:{ Global: { acquireCount: { r: 3, w: 3 } }, Database: { acquireCount: { w: 3 } }, Collection: { acquireCount: { w: 2 } }, oplog: { acquireCount: { w: 1 } } } 7458ms
2015-11-30T23:05:43.953+0000 W -        [conn1281] DBException thrown :: caused by :: 112 WriteConflict
2015-11-30T23:05:43.957+0000 I -        [conn1142] 
2015-11-30T23:05:43.957+0000 W -        [conn1312] DBException thrown :: caused by :: 112 WriteConflict
2015-11-30T23:05:43.962+0000 W -        [conn1358] DBException thrown :: caused by :: 112 WriteConflict
2015-11-30T23:05:43.968+0000 I -        [conn1313] 
2015-11-30T23:05:43.972+0000 W -        [conn1301] DBException thrown :: caused by :: 112 WriteConflict
2015-11-30T23:05:43.978+0000 W -        [conn1058] DBException thrown :: caused by :: 112 WriteConflict
2015-11-30T23:05:43.984+0000 W -        [conn1484] DBException thrown :: caused by :: 112 WriteConflict

In this "users" collection, a few users are much more popular and, therefore, their documents are accessed far more often than others. But if MMAPv1 only has collection-level locking, why is document-level locking slower?
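
(To illustrate the pattern, here is a hypothetical sketch of the kind of update many connections issue concurrently against the same popular document; the _id and field name are placeholders, not real values from the cluster. Under WiredTiger, an update that loses the race gets an internal WriteConflict and is retried, while under MMAPv1 the writers simply queue behind the collection lock.)

// Many application connections update the same hot document at once.
var hotId = ObjectId();   // placeholder for a popular user's _id
db.users.update(
    { _id: hotId },
    { $inc: { "account.field": -1 } }   // mirrors the redacted $inc in the log above
);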

Thank you :D


Asya Kamsky

1 Dec 2015, 16:41:10
to mongod...@googlegroups.com
What is the exact version here?  The WriteConflict exception is an internal thing that should *not* be thrown back to the application.  You may be encountering a bug that we would want to get fixed ASAP.  If you are not on the latest 3.0.7, please upgrade before any further testing!
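
(For anyone following along, a quick shell sketch to confirm the server version and the active storage engine on each mongod:)

// In the mongo shell connected to each mongod:
db.version()                            // e.g. "3.0.7"
db.serverStatus().storageEngine.name    // "wiredTiger" or "mmapv1"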

leo.a...@gmail.com

2 Dec 2015, 11:27:55
to mongodb-user
Hi,

My version is 3.0.7.

How should I proceed?

Thanks in advance
