I apologize for the late reply.
@Rob
I am using the recently updated YCSB from
https://github.com/brianfrankcooper/YCSB. For the insert test, I am using batchsize=100 for now. I don't want to use the async client yet, as I want to measure the latency of the operations accurately.
Since I was not able to get good throughput with a 10 node setup, I decided to do what Asya described in the blog post -
https://www.mongodb.com/blog/post/performance-testing-mongodb-30-part-1-throughput-improvements-measured-ycsb. I tried to replicate the numbers mentioned in the blog using a 2 node setup. The nodes' hardware configuration is still the same. I have 1 mongod + mongos + 2 YCSB clients running on one node, and the other node is assigned to the config servers.
With that setup, I ran the following tests:
1) Load 30 million records (1 field with 100 bytes - as mentioned in the blog post)
2) Read uniform distribution
3) Read zipfian distribution
4) Range scan zipfian distribution
5) Mixed workload with 95% read zipfian / 5% update
6) Mixed workload with 50% read zipfian / 50% update
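For reference, the run-phase tests above can be written down as YCSB CoreWorkload properties. This is only a sketch of how such property files could be generated, not the exact files I used: the property names are the standard CoreWorkload ones, the workload class name assumes the com.yahoo.ycsb package naming, and the TESTS dictionary and render() helper are made up for illustration. operationcount is left out on purpose.

```python
# Sketch: build YCSB CoreWorkload properties for the run-phase tests.
# Assumption: package name com.yahoo.ycsb (check it against your YCSB checkout).
BASE = {
    "workload": "com.yahoo.ycsb.workloads.CoreWorkload",
    "recordcount": 30_000_000,  # 30 million records
    "fieldcount": 1,            # 1 field ...
    "fieldlength": 100,         # ... of 100 bytes
}

# Hypothetical names; proportions are (read, update, scan).
TESTS = {
    "read-uniform": ("uniform", 1.0, 0.0, 0.0),
    "read-zipfian": ("zipfian", 1.0, 0.0, 0.0),
    "scan-zipfian": ("zipfian", 0.0, 0.0, 1.0),
    "mixed-95-5": ("zipfian", 0.95, 0.05, 0.0),
    "mixed-50-50": ("zipfian", 0.5, 0.5, 0.0),
}


def properties_for(test_name):
    """Return the full property dict for one test."""
    distribution, read, update, scan = TESTS[test_name]
    props = dict(BASE)
    props.update({
        "requestdistribution": distribution,
        # Set every proportion explicitly so the CoreWorkload defaults
        # do not leak into the mix.
        "readproportion": read,
        "updateproportion": update,
        "scanproportion": scan,
        "insertproportion": 0.0,
    })
    return props


def render(props):
    """Render a property dict in key=value form, one property per line."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


if __name__ == "__main__":
    for name in TESTS:
        print(f"# {name}")
        print(render(properties_for(name)))
        print()
```

The load phase only needs the BASE properties, since it inserts recordcount records.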
During the load phase, I saw mongod using between 22 and 25 cores (top showed 2200-2500% for mongod). The insert workload was CPU bound. During the read and mixed workloads, mongod was using between 9 and 11 cores. During the read and scan tests, I used iostat to make sure the disk was not being hit. I also experimented with the number of threads per YCSB client and with more or fewer clients, and found the sweet spot where I was getting good throughput without affecting latency.
Here are the numbers:
Test | # of threads/client | Row Size | Cluster Throughput (ops/sec) | Throughput (ops/sec/client) | Write: Avg Lat (us) | Write: Min Lat (us) | Write: Max Lat (us) | Write: 95th % Lat (ms) | Write: 99th % Lat (ms) | Read: Avg Lat (us) | Read: Min Lat (us) | Read: Max Lat (us) | Read: 95th % Lat (ms) | Read: 99th % Lat (ms) |
INSERT | 30 | 100B(1x100) | 126,430 | 63,215 | 348 | 0 | 133,534,630 | 0 | 5 | #N/A | #N/A | #N/A | #N/A | #N/A |
READ-UNIFORM | 50 | 100B(1x100) | 76,310 | 38,155 | #N/A | #N/A | #N/A | #N/A | #N/A | 1,304 | 158 | 192,358 | 2 | 3 |
READ-ZIPFIAN | 50 | 100B(1x100) | 76,512 | 38,256 | #N/A | #N/A | #N/A | #N/A | #N/A | 1,301 | 155 | 210,224 | 2 | 3 |
SCAN | 25 | 100B(1x100) | 12,487 | 6,244 | #N/A | #N/A | #N/A | #N/A | #N/A | 3,999 | 247 | 11,028,849 | 6 | 8 |
MIXED 95/5 | 50 | 100B(1x100) | 72,207 | 36,103 | 2,564 | 227 | 3,951,334 | 3 | 26 | 1,314 | 146 | 624,576 | 2 | 3 |
MIXED 50/50 | 50 | 100B(1x100) | 62,417 | 31,209 | 2,148 | 210 | 1,129,917 | 4 | 11 | 1,037 | 152 | 683,493 | 2 | 3 |
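As a sanity check on the table, the per-client throughput should roughly follow Little's law for a closed-loop client: throughput is about threads divided by average latency. The sketch below applies that to the rows above, weighting the read and write latencies by the workload mix. It assumes each YCSB thread issues one operation at a time with no think time. The INSERT row is left out because with batchsize=100 I am not sure how the per-operation latency is reported.

```python
# Figures copied from the results table; latencies are in microseconds.
# Each row: (test, threads per client, measured ops/sec per client,
#            [(share of operations, average latency in us), ...])
ROWS = [
    ("READ-UNIFORM", 50, 38155, [(1.00, 1304)]),
    ("READ-ZIPFIAN", 50, 38256, [(1.00, 1301)]),
    ("SCAN", 25, 6244, [(1.00, 3999)]),
    ("MIXED 95/5", 50, 36103, [(0.95, 1314), (0.05, 2564)]),
    ("MIXED 50/50", 50, 31209, [(0.50, 1037), (0.50, 2148)]),
]


def expected_throughput(threads, mix):
    """Little's law for a closed loop: ops/sec = threads / mean latency (s)."""
    mean_latency_us = sum(share * latency for share, latency in mix)
    return threads / (mean_latency_us / 1_000_000)


def check(rows=ROWS):
    """Return {test: (predicted, measured, relative difference)}."""
    results = {}
    for name, threads, measured, mix in rows:
        predicted = expected_throughput(threads, mix)
        results[name] = (predicted, measured, (predicted - measured) / measured)
    return results


if __name__ == "__main__":
    for name, (predicted, measured, diff) in check().items():
        print(f"{name:13s} predicted {predicted:8.0f}  "
              f"measured {measured:6d}  diff {diff:+.1%}")
```

If the predicted and measured values are close, the clients were kept busy for the whole run and the throughput is bounded by latency rather than by the thread count being too low to matter.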
During the scan test, the mongod process died twice, so the scan numbers can be ignored for now. The error I saw in the mongod log was:
2015-07-01T17:32:41.506-0700 E STORAGE [conn1] WiredTiger (12) [1435797161:506959][20793:0x7f03e896a700], connection.open_session: only configured to support 20010 sessions (including 10 internal): Cannot allocate memory
2015-07-01T17:32:41.507-0700 I - [conn1] Invariant failure: ret resulted in status UnknownError 12: Cannot allocate memory at src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp 49
2015-07-01T17:32:41.507-0700 E STORAGE [conn88] WiredTiger (12) [1435797161:507173][20793:0x7f03e1302700], connection.open_session: only configured to support 20010 sessions (including 10 internal): Cannot allocate memory
2015-07-01T17:32:41.507-0700 I - [conn88] Invariant failure: ret resulted in status UnknownError 12: Cannot allocate memory at src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp 49
2015-07-01T17:32:41.524-0700 I CONTROL [conn48]
0xf77369 0xf140f1 0xef99fa 0xd8eae0 0xd8efb6 0xd8a0ce 0xd8a115 0xd78165 0xa83e58 0xa078e2 0xa0801d 0xa1e595 0x9fbb7d 0xbd5714 0xbd5ac4 0xba3a54 0xab4e90 0x7fb81d 0xf2883b 0x7f03f89559d1 0x7f03f74a786d
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"B77369"},{"b":"400000","o":"B140F1"},{"b":"400000","o":"AF99FA"},{"b":"400000","o":"98EAE0"},{"b":"400000","o":"98EFB6"},{"b":"400000","o":"98A0CE"},{"b":"400000","o":"98A115"},{"b":"400000","o":"978165"},{"b":"400000","o":"683E58"},{"b":"400000","o":"6078E2"},{"b":"400000","o":"60801D"},{"b":"400000","o":"61E595"},{"b":"400000","o":"5FBB7D"},{"b":"400000","o":"7D5714"},{"b":"400000","o":"7D5AC4"},{"b":"400000","o":"7A3A54"},{"b":"400000","o":"6B4E90"},{"b":"400000","o":"3FB81D"},{"b":"400000","o":"B2883B"},{"b":"7F03F894E000","o":"79D1"},{"b":"7F03F73BF000","o":"E886D"}],"processInfo":{ "mongodbVersion" : "3.0.4", "gitVersion" : "0481c958daeb2969800511e7475dc66986fa9ed5", "uname" : { "sysname" : "Linux", "release" : "2.6.32-431.29.2.el6.x86_64", "version" : "#1 SMP Tue Sep 9 21:36:05 UTC 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "7702B4740E91E1BA6F701DB2A42553968AC09562" }, { "b" : "7FFF97C38000", "elfType" : 3, "buildId" : "5474F0D8DAF3D6177E2C4B06F3892745CB43B4D5" }, { "b" : "7F03F894E000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "211321F78CA244BE2B2B1B8584B460F9933BA76B" }, { "b" : "7F03F86E2000", "path" : "/usr/lib64/libssl.so.10", "elfType" : 3, "buildId" : "40BEA6554E64FC0C3D5C7D0CD91362730515102F" }, { "b" : "7F03F82FF000", "path" : "/usr/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "FC4EFD7502ACB3B9D213D28272D15A165857AD5A" }, { "b" : "7F03F80F7000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "B26528BF6C0636AC1CAE5AC50BDBC07E60851DF4" }, { "b" : "7F03F7EF3000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "AFC7448F2F2F6ED4E5BC82B1BD8A7320B84A9D48" }, { "b" : "7F03F7BED000", "path" : "/usr/lib64/libstdc++.so.6", "elfType" : 3, "buildId" : "F07F2E7CF4BFB393CC9BBE8CDC6463652E14DB07" }, { "b" : "7F03F7969000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "98B028A725D6E93253F25DF00B794DFAA66A3145" }, { "b" : 
"7F03F7753000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "246C3BAB0AB093AFD59D34C8CBF29E786DE4BE97" }, { "b" : "7F03F73BF000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "59640F8CD5A70CF0391A7C64275D84336935E6AA" }, { "b" : "7F03F8B6B000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "57BF668F99B7F5917B8D55FBB645173C9A644575" }, { "b" : "7F03F717B000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "54BA6B78A9220344E77463947215E42F0EABCC62" }, { "b" : "7F03F6E95000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "6797403AA5F8FAD8ADFF683478B45F528CE4FB0E" }, { "b" : "7F03F6C91000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "8CE28F280150E62296240E70ECAC64E4A57AB826" }, { "b" : "7F03F6A65000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "05733977F4E41652B86070B27A0CFC2C1EA7719D" }, { "b" : "7F03F684F000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "5FA8E5038EC04A774AF72A9BB62DC86E1049C4D6" }, { "b" : "7F03F6644000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "E3FA235F3BA3F776A01A18ECA737C9890F445923" }, { "b" : "7F03F6441000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "AF374BAFB7F5B139A0B431D3F06D82014AFF3251" }, { "b" : "7F03F6227000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "A91A53E16DEABDFE05F28F7D04DAB5FFAA013767" }, { "b" : "7F03F6008000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "E6798A06BEE17CF102BBA44FD512FF8B805CEAF1" } ] }}
mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf77369]
mongod(_ZN5mongo10logContextEPKc+0xE1) [0xf140f1]
mongod(_ZN5mongo17invariantOKFailedEPKcRKNS_6StatusES1_j+0xDA) [0xef99fa]
mongod(_ZN5mongo17WiredTigerSessionC1EP15__wt_connectionii+0xA0) [0xd8eae0]
mongod(_ZN5mongo22WiredTigerSessionCache10getSessionEv+0x4C6) [0xd8efb6]
mongod(_ZN5mongo22WiredTigerRecoveryUnit10getSessionEPNS_16OperationContextE+0x3E) [0xd8a0ce]
mongod(_ZN5mongo16WiredTigerCursorC1ERKSsmbPNS_16OperationContextE+0x35) [0xd8a115]
mongod(_ZNK5mongo21WiredTigerIndexUnique9newCursorEPNS_16OperationContextEi+0x55) [0xd78165]
mongod(_ZNK5mongo22BtreeBasedAccessMethod9newCursorEPNS_16OperationContextERKNS_13CursorOptionsEPPNS_11IndexCursorE+0x28) [0xa83e58]
mongod(_ZN5mongo9IndexScan13initIndexScanEv+0x62) [0xa078e2]
mongod(_ZN5mongo9IndexScan4workEPm+0x4D) [0xa0801d]
mongod(_ZN5mongo16ShardFilterStage4workEPm+0x55) [0xa1e595]
mongod(_ZN5mongo10FetchStage4workEPm+0xCD) [0x9fbb7d]
mongod(_ZN5mongo12PlanExecutor18getNextSnapshottedEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0xA4) [0xbd5714]
mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x34) [0xbd5ac4]
mongod(_ZN5mongo8runQueryEPNS_16OperationContextERNS_7MessageERNS_12QueryMessageERKNS_15NamespaceStringERNS_5CurOpES3_+0xA74) [0xba3a54]
mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0xB10) [0xab4e90]
mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0xDD) [0x7fb81d]
mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x34B) [0xf2883b]
libpthread.so.0(+0x79D1) [0x7f03f89559d1]
libc.so.6(clone+0x6D) [0x7f03f74a786d]
----- END BACKTRACE -----
2015-07-01T17:32:41.524-0700 I - [conn48]
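To see how many connections ran into the session limit before the crash, the mongod log can be scanned for the open_session error. This is a minimal sketch that assumes the log lines are unwrapped, as mongod writes them. The function name and the sample lines are only for illustration, and the regular expressions are based on the two error lines above rather than on any documented log format.

```python
import re
from collections import Counter

# Patterns taken from the error lines in this post (an assumption, not a spec).
SESSION_LIMIT_RE = re.compile(r"only configured to support (\d+) sessions")
CONNECTION_RE = re.compile(r"\[(conn\d+)\]")

SAMPLE = [
    "2015-07-01T17:32:41.506-0700 E STORAGE [conn1] WiredTiger (12) "
    "[1435797161:506959][20793:0x7f03e896a700], connection.open_session: "
    "only configured to support 20010 sessions (including 10 internal): "
    "Cannot allocate memory",
    "2015-07-01T17:32:41.507-0700 E STORAGE [conn88] WiredTiger (12) "
    "[1435797161:507173][20793:0x7f03e1302700], connection.open_session: "
    "only configured to support 20010 sessions (including 10 internal): "
    "Cannot allocate memory",
]


def summarize_session_errors(lines):
    """Return (session limit, Counter of errors per connection).

    The limit is None when no matching line is found.
    """
    limit = None
    per_connection = Counter()
    for line in lines:
        match = SESSION_LIMIT_RE.search(line)
        if match is None:
            continue
        limit = int(match.group(1))
        conn = CONNECTION_RE.search(line)
        per_connection[conn.group(1) if conn else "unknown"] += 1
    return limit, per_connection


if __name__ == "__main__":
    limit, per_connection = summarize_session_errors(SAMPLE)
    print("session limit:", limit)
    for conn, count in per_connection.most_common():
        print(conn, count)
```

To run it against a real log, pass an open file object instead of SAMPLE, since the function only iterates over lines.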
Thanks,
Milind