MongoDb 2.0.4 performance question

Eugene Bolotin

unread,

Apr 23, 2012, 6:39:16 AM4/23/12

to mongod...@googlegroups.com

Hello!

I have two services running on single MongoDb. One service - service1 has large database and makes load of 250 reads and 250 writes per second (64 threads):
> db.stats()
{
        "db" : "rep",
        "collections" : 10,
        "objects" : 41312419,
        "avgObjSize" : 384.2041617558149,
        "dataSize" : 15872403312,
        "storageSize" : 22404643072,
        "numExtents" : 173,
        "indexes" : 8,
        "indexSize" : 2240907136,
        "fileSize" : 34276900864,
        "nsSizeMB" : 16,
        "ok" : 1
}
Records of database used by first service are never deleted. All collections have only _id index.

Second service - service2 has small database and makes load of 200 reads and 200 writes per second:
> db.stats()
{
        "db" : "cle",
        "collections" : 182,
        "objects" : 1166724,
        "avgObjSize" : 36.41681151669118,
        "dataSize" : 42488368,
        "storageSize" : 135518208,
        "numExtents" : 582,
        "indexes" : 180,
        "indexSize" : 53748688,
        "fileSize" : 2080374784,
        "nsSizeMB" : 16,
        "ok" : 1
}
Collections of database used by second service are named by template <name>-<date>. Each collection of that database lives 2 days and then deleted. All collections have only _id index.

output of ps command:
root     17958 1.8 71.6 75749984 11796248 ?   Sl   Apr04 502:06 /opt/mongodb/bin/mongod --fork --dbpath /opt/mongodb/data --logpath /opt/mongodb/mongodb.log

mongodb.log for today:
Mon Apr 23 00:26:53 [conn342] update rep.COL1 query: { _id: ObjectId('00000000bd7e7c32c452e23d') } update: { $inc: {}, $set: { value: BinData } } idhack:1 fastmodinsert:1 112ms
Mon Apr 23 01:10:53 [conn342] update rep.COL2 query: { _id: ObjectId('00000000e6c2c037fd5edcf5') } update: { $inc: {}, $set: { value: BinData } } idhack:1 161ms
Mon Apr 23 03:12:53 [conn342] update rep.COL3 query: { _id: ObjectId('00000000274c464bb45b8ff6') } update: { $inc: {}, $set: { value: BinData } } idhack:1 fastmodinsert:1 107ms
Mon Apr 23 04:25:27 [conn342] update rep.COL4 query: { _id: ObjectId('00000000a89c8355fdcb9a5d') } update: { $inc: {}, $set: { value: BinData } } idhack:1 fastmodinsert:1 113ms
Mon Apr 23 05:49:53 [conn342] update rep.COL4 query: { _id: ObjectId('000000003e343bb27f7e9251') } update: { $inc: {}, $set: { value: BinData } } idhack:1 fastmodinsert:1 145ms
Mon Apr 23 09:26:56 [conn342] update rep.COL3 query: { _id: ObjectId('000000004a69d31e08bcd739') } update: { $inc: {}, $set: { value: BinData } } idhack:1 fastmodinsert:1 132ms
Mon Apr 23 09:32:03 [conn342] update rep.COL3 query: { _id: ObjectId('000000002c3b86882c0b3d82') } update: { $inc: {}, $set: { value: BinData } } idhack:1 fastmodinsert:1 102ms
Mon Apr 23 09:55:58 [conn342] update rep.COL3 query: { _id: ObjectId('000000001b5d75b6d066c5f6') } update: { $inc: {}, $set: { value: BinData } } idhack:1 fastmodinsert:1 126ms
Mon Apr 23 10:03:38 [conn342] update rep.COL5 query: { _id: ObjectId('00000000e3aa7531772e1b68') } update: { $inc: {}, $set: { value: BinData } } idhack:1 fastmodinsert:1 116ms
Mon Apr 23 11:11:01 [conn342] update rep.COL1 query: { _id: ObjectId('00000000fe5137b0e49aac02') } update: { $inc: {}, $set: { value: BinData } } idhack:1 moved:1 129ms
Mon Apr 23 11:13:03 [conn342] update rep.COL3 query: { _id: ObjectId('0000000044468b84bfa8ea32') } update: { $inc: {}, $set: { value: BinData } } idhack:1 moved:1 151ms
Mon Apr 23 11:21:03 [conn342] update rep.COL5 query: { _id: ObjectId('0000000031008cc640ceb4eb') } update: { $inc: {}, $set: { value: BinData } } idhack:1 moved:1 130ms
Mon Apr 23 11:30:35 [conn342] update rep.COL2 query: { _id: ObjectId('000000005d9579736a24351d') } update: { $inc: {}, $set: { value: BinData } } idhack:1 fastmodinsert:1 154ms
Mon Apr 23 12:18:05 [conn344] update rep.COL3 query: { _id: ObjectId('00000000dfa91e975532115b') } update: { $inc: {}, $set: { value: BinData } } idhack:1 moved:1 104ms
Mon Apr 23 12:27:00 [conn342] update rep.COL2 query: { _id: ObjectId('000000009d38cc0d306de80b') } update: { $inc: {}, $set: { value: BinData } } idhack:1 127ms
Mon Apr 23 12:58:03 [conn342] update rep.COL5 query: { _id: ObjectId('00000000ec354eccca778c07') } update: { $inc: {}, $set: { value: BinData } } idhack:1 moved:1 114ms
Mon Apr 23 13:14:39 [conn342] update rep.COL2 query: { _id: ObjectId('00000000b0d5a2cafc9c1ce0') } update: { $inc: {}, $set: { value: BinData } } idhack:1 moved:1 123ms
Mon Apr 23 13:16:57 [conn342] update rep.COL5 query: { _id: ObjectId('00000000b0a4350169fc56a9') } update: { $inc: {}, $set: { value: BinData } } idhack:1 fastmodinsert:1 125ms
Mon Apr 23 13:22:05 [conn342] update rep.COL5 query: { _id: ObjectId('00000000b99f64016f29de76') } update: { $inc: {}, $set: { value: BinData } } idhack:1 fastmodinsert:1 103ms
Mon Apr 23 13:28:02 [conn343] update rep.COL5 query: { _id: ObjectId('00000000c10001c66ce7f36f') } update: { $inc: {}, $set: { value: BinData } } idhack:1 moved:1 110ms
Mon Apr 23 14:10:02 [conn342] update rep.COL3 query: { _id: ObjectId('00000000f583c9db64e30f39') } update: { $inc: {}, $set: { value: BinData } } idhack:1 moved:1 103ms
Mon Apr 23 14:15:58 [conn342] update rep.COL3 query: { _id: ObjectId('000000002b783bfc4b327eed') } update: { $inc: {}, $set: { value: BinData } } idhack:1 101ms
Mon Apr 23 14:18:34 [conn342] update rep.COL5 query: { _id: ObjectId('00000000a4923985fa00537e') } update: { $inc: {}, $set: { value: BinData } } idhack:1 moved:1 111ms

Output of mongostat:
insert query update delete getmore command flushes mapped vsize    res faults locked % idx miss %     qr|qw   ar|aw netIn netOut conn       time
     0    456    230      0       0       1       0 35.9g 72.2g 11.2g      8      1.1          0       0|1     1|1   129k   171k    24   14:27:49
     0    372    187      0       0       1       0 35.9g 72.2g 11.2g     11        1          0       0|0     0|0   119k   171k    24   14:27:50
     0    456    229      0       0       1       0 35.9g 72.2g 11.2g      6        1          0       0|0     0|0   129k   173k    24   14:27:51
     0    504    255      0       0       1       0 35.9g 72.2g 11.2g      9      0.8          0       0|0     0|0   139k   180k    24   14:27:52
     0    368    185      0       0       1       0 35.9g 72.2g 11.3g      3      0.5          0       0|0     0|0   109k   151k    24   14:27:53
     0    440    223      0       0       1       0 35.9g 72.2g 11.2g      5      1.1          0       0|0     0|0   125k   165k    24   14:27:54
     0    580    291      0       0       1       0 35.9g 72.2g 11.2g     12      3.1          0       0|0     0|0   151k   189k    24   14:27:55
     0    378    189      0       0       1       0 35.9g 72.2g 11.3g     11      0.5          0       0|0     0|0   114k   161k    24   14:27:56
     0    463    235      0       0       1       0 35.9g 72.2g 11.3g      4      0.7          0       0|0     0|0   126k   162k    24   14:27:57
     0    464    220      0       0       1       0 35.9g 72.2g 11.3g      6      0.6          0       0|0     0|0   132k   178k    24   14:27:58
insert query update delete getmore command flushes mapped vsize    res faults locked % idx miss %     qr|qw   ar|aw netIn netOut conn       time
     0    406    220      0       0       1       0 35.9g 72.2g 11.3g      5      0.6          0       0|0     0|0   116k   152k    24   14:27:59
     0    388    194      0       0       1       0 35.9g 72.2g 11.3g      7      0.5          0       0|0     0|0   115k   161k    24   14:28:00
     0    382    189      0       0       1       0 35.9g 72.2g 11.3g      7      0.6          0       9|1     9|1   112k   155k    24   14:28:01
     0    453    225      0       0       1       0 35.9g 72.2g 11.2g     10      1.5          0       7|2     8|2   124k   165k    24   14:28:02
     0    438    229      0       0       1       0 35.9g 72.2g 11.3g      8      0.6          0       0|0     0|0   132k   175k    24   14:28:03
     0    464    234      0       0       1       0 35.9g 72.2g 11.2g      5      1.1          0       0|0     0|0   134k   182k    24   14:28:04
     0    387    194      0       0       1       0 35.9g 72.2g 11.2g      8      0.5          0       1|1     2|1   113k   158k    24   14:28:05
     0    489    246      0       0       1       0 35.9g 72.2g 11.3g      8      0.7          0       0|0     0|0   130k   167k    24   14:28:06
     0    446    227      0       0       1       1 35.9g 72.2g 11.2g      5      1.2          0       0|0     0|0   119k   152k    24   14:28:07
     0    442    223      0       0       1       0 35.9g 72.2g 11.2g      4      0.7          0       0|0     0|0   128k   174k    24   14:28:08

Queries to first service (with large database) have a timeout of 250 ms. I have a metric - error counter per 5 minutes. And now it has average value of 400 (1.3 error/second).
But I don't see so many long queries in mongodb.log and don't see so many page faults in mongostat.

How can I investigate it?
Maybe my problem is concerned with write-lock?
Does write-lock yield when I get page-fault on read?

Thanks in advance.

Adam C

unread,

Apr 23, 2012, 7:34:51 AM4/23/12

to mongod...@googlegroups.com

Where are you measuring the 250ms timeout?

If it is on the application side, do you think you might be waiting for threads/connections to come available? You don't mention the language/driver used, but there is usually a multiplier in play for the number of threads that can be blocking/waiting (anything above that limit will then throw an exception).

Assuming you are not using findAndModify (which does not yield), the long running ops and page faulting ops should be yielding - if that is happening though you will see those return more slowly as a result of the yield.

The lock percentage looks like it's not an issue judging by what you pasted from mongostat.

The slow operations from the log are all updates and it would seem they are triggering moves (outgrowing their original space) - those are relatively expensive, so you might want to look at padding those docs to prevent the moves (http://www.mongodb.org/display/DOCS/Padding+Factor#PaddingFactor-ManualPadding)

Finally, how does your disk IO look (iostat -xm 2)? if the disk was getting backed up it might also explain some of your issues.

Adam

Eugene Bolotin

unread,

Apr 23, 2012, 9:14:30 AM4/23/12

to mongod...@googlegroups.com

Adam, thanks for your reply.

I use mongo-c-driver, one of the latest version. But my services are written in C++. I use pool of 10 connections to mongo.
I measure findOne and update request times. I saw that if one request (read or write) takes more than 100 ms all simultaneous read/write requests in that time take similar time (>100 ms ~10 requests).

iostat -xm 2:
md2               0,00     0,00   13,75   81,88     0,16     0,32    10,37     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   37,50    9,50     0,31     0,08    17,02     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   24,00 357,50     0,30     1,43     9,27     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   23,00 1238,00     0,23     4,87     8,27     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   24,50 246,50     0,19     1,00     8,99     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   29,50 505,50     0,26     2,01     8,67     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   44,50 630,00     0,41     2,50     8,84     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   35,50   71,50     0,46     0,31    14,77     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   32,00    9,50     0,31     0,07    18,80     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   31,00   20,00     0,38     0,13    20,55     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   30,50    9,50     0,25     0,09    17,60     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   34,00    9,50     0,22     0,07    14,07     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   23,00   17,50     0,27     0,11    19,06     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   32,00    9,50     0,28     0,08    18,02     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   35,00   16,50     0,28     0,11    15,61     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   38,50    9,50     0,42     0,07    21,17     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   28,00    9,50     0,31     0,07    20,80     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   43,00   17,50     0,37     0,12    16,53     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   39,50    9,50     0,32     0,07    16,49     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   31,00 584,50     0,25     2,32     8,55     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   42,00   13,00     0,43     0,08    19,27     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   39,00    9,50     0,46     0,07    22,43     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   25,50 1150,50     0,22     4,50     8,22     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   45,00 568,50     0,49     2,29     9,29     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   27,00 395,50     0,29     1,57     8,99     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   66,00   36,00     0,52     0,18    14,08     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   42,50    9,50     0,44     0,08    20,54     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   21,50   10,00     0,20     0,08    18,29     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   39,00   17,00     0,38     0,12    17,93     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   33,50    9,50     0,29     0,08    17,77     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   33,00   17,50     0,24     0,12    14,42     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   32,50    9,00     0,38     0,07    22,46     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   35,50 231,00     0,30     0,95     9,56     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   41,50 1048,00     0,42     4,13     8,54     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   28,00 241,50     0,19     0,97     8,76     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   21,00 518,00     0,19     2,07     8,59     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   24,00 573,50     0,29     2,28     8,82     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   40,00   13,50     0,31     0,10    15,63     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   19,00    9,50     0,21     0,09    21,19     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   18,00   23,00     0,13     0,12    12,78     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   51,00    9,50     0,56     0,10    22,28     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   23,50    9,50     0,23     0,08    18,91     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   47,00   17,00     0,42     0,11    16,75     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   47,50    9,00     0,42     0,08    18,12     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   35,00   17,50     0,34     0,11    17,22     0,00    0,00    0,00    0,00   0,00   0,00

md2 (type ext3, software RAID 2 disks)

Adam C

unread,

Apr 23, 2012, 9:55:47 AM4/23/12

to mongod...@googlegroups.com

As a general recommendation I would suggest switching to ext4 (or XFS) rather than ext3 (ext3 is inefficient when allocating new data files), but that is unlikely to be the source of your issues, and the iostat looks clean.

Your mongostat output shows brief queuing (qr|qw = queued reads/writes), so when that ticks up you will see a longer time to return. The operation occurring at that time is taking a while and, given that it is blocking reads, is likely a long running write (given your logging output I think the updates are the cause) - multiple reads can occur simultaneously but are blocked by writes.

As I mentioned, updates which grow a document significantly and trigger a move are relatively costly (and slow) writes. If you can identify those operations and either manually pad the document to prevent it from growing or alter the queries to prevent the growth itself I think you will see better performance. You can also gauge how many of these are occurring by looking at the "paddingfactor" field in the collection stats (db.coll.stats()) - the closer this value is to 2 the more moves are being triggered (a value of 1 means no moves).

Adam

Eugene Bolotin

unread,

Apr 23, 2012, 10:22:41 AM4/23/12

to mongod...@googlegroups.com

Adam, I grep mongodb.log in scale of whole day. And I get lot more slow queries than updates. For example at last 15 minutes:
Mon Apr 23 17:52:00 [conn342] query rep.COL1 query: { _id: ObjectId('000000001c075f75267635c5') } ntoreturn:1 idhack:1 reslen:573 107ms
Mon Apr 23 17:52:56 [conn342] query rep.COL1 query: { _id: ObjectId('00000000c081ea20620e3f0b') } ntoreturn:1 idhack:1 reslen:573 133ms
Mon Apr 23 17:53:54 [conn342] query rep.COL3 query: { _id: ObjectId('0000000017995bfd03e39405') } ntoreturn:1 idhack:1 reslen:212 129ms
Mon Apr 23 17:53:57 [conn342] query rep.COL6 query: { _id: ObjectId('000000000aff2d83030b57be') } ntoreturn:1 idhack:1 reslen:180 120ms
Mon Apr 23 17:54:04 [conn342] query rep.COL1 query: { _id: ObjectId('000000006477beb87f3ef95c') } ntoreturn:1 idhack:1 reslen:541 308ms
Mon Apr 23 17:54:56 [conn342] query rep.COL4 query: { _id: ObjectId('0000000031a87172c8070c01') } ntoreturn:1 idhack:1 reslen:212 122ms
Mon Apr 23 17:54:56 [conn342] query rep.COL6 query: { _id: ObjectId('000000007c3a32295cceb602') } ntoreturn:1 idhack:1 reslen:212 153ms
Mon Apr 23 17:55:31 [conn342] query rep.COL1 query: { _id: ObjectId('00000000580997eb011a40be') } ntoreturn:1 idhack:1 reslen:573 147ms
Mon Apr 23 17:55:58 [conn344] query rep.COL1 query: { _id: ObjectId('00000000498cbd1aa0895854') } ntoreturn:1 idhack:1 reslen:541 142ms
Mon Apr 23 17:56:56 [conn342] query rep.COL1 query: { _id: ObjectId('000000002d303e80f2d30731') } ntoreturn:1 idhack:1 reslen:573 131ms
Mon Apr 23 17:57:31 [conn342] query rep.COL5 query: { _id: ObjectId('00000000f5538c272c7bb038') } ntoreturn:1 idhack:1 reslen:180 130ms
Mon Apr 23 17:57:59 [conn342] query rep.COL3 query: { _id: ObjectId('00000000ab426aa49954f73d') } ntoreturn:1 idhack:1 reslen:180 130ms
Mon Apr 23 17:58:01 [conn342] query rep.COL1 query: { _id: ObjectId('00000000a90f36d519c93812') } ntoreturn:1 idhack:1 reslen:573 125ms
Mon Apr 23 17:58:55 [conn342] query rep.COL1 query: { _id: ObjectId('00000000b2daef3c3d8869a2') } ntoreturn:1 idhack:1 reslen:573 111ms
Mon Apr 23 17:59:01 [conn342] query rep.COL1 query: { _id: ObjectId('00000000eb8214af01837d90') } ntoreturn:1 idhack:1 reslen:541 131ms
Mon Apr 23 17:59:31 [conn342] query rep.COL1 query: { _id: ObjectId('00000000e28b6715bbd7af84') } ntoreturn:1 idhack:1 reslen:541 172ms
Mon Apr 23 17:59:58 [conn342] query rep.COL1 query: { _id: ObjectId('0000000029d153c9964c5707') } ntoreturn:1 idhack:1 reslen:573 105ms
Mon Apr 23 17:59:59 [conn342] query rep.COL1 query: { _id: ObjectId('00000000d43d57b438a8eb46') } ntoreturn:1 idhack:1 reslen:541 115ms
Mon Apr 23 18:00:01 [conn342] query rep.COL1 query: { _id: ObjectId('00000000364ed4a459509c9a') } ntoreturn:1 idhack:1 reslen:537 110ms
Mon Apr 23 18:00:31 [conn342] query rep.COL1 query: { _id: ObjectId('000000005470edaa6beac4a7') } ntoreturn:1 idhack:1 reslen:573 110ms
Mon Apr 23 18:01:31 [conn342] query rep.COL1 query: { _id: ObjectId('000000005beac4f96db5acd5') } ntoreturn:1 idhack:1 reslen:573 129ms
Mon Apr 23 18:01:57 [conn342] query rep.COL3 query: { _id: ObjectId('0000000013193c8704281871') } ntoreturn:1 idhack:1 reslen:212 176ms
Mon Apr 23 18:01:58 [conn342] query rep.COL3 query: { _id: ObjectId('0000000067997169732e48ad') } ntoreturn:1 idhack:1 reslen:212 119ms
Mon Apr 23 18:02:31 [conn342] query rep.COL1 query: { _id: ObjectId('00000000b249e7a0bc2fa006') } ntoreturn:1 idhack:1 reslen:573 103ms
Mon Apr 23 18:02:57 [conn345] query rep.COL2 query: { _id: ObjectId('0000000013a45621305efd81') } ntoreturn:1 idhack:1 reslen:180 102ms
Mon Apr 23 18:03:31 [conn342] query rep.COL3 query: { _id: ObjectId('0000000042b79df9ae9db0aa') } ntoreturn:1 idhack:1 reslen:180 129ms
Mon Apr 23 18:04:00 [conn342] query rep.COL1 query: { _id: ObjectId('000000006e2e7a0fd7b7a7df') } ntoreturn:1 idhack:1 reslen:541 112ms
Mon Apr 23 18:04:05 [conn342] query rep.COL1 query: { _id: ObjectId('00000000ca8515792db26700') } ntoreturn:1 idhack:1 reslen:541 444ms
Mon Apr 23 18:04:58 [conn342] query rep.COL3 query: { _id: ObjectId('000000008bb753d2d459bf11') } ntoreturn:1 idhack:1 reslen:212 139ms
Mon Apr 23 18:05:34 [conn342] query rep.COL6 query: { _id: ObjectId('0000000020974ec901e68bed') } ntoreturn:1 idhack:1 reslen:180 117ms
Mon Apr 23 18:05:56 [conn342] query rep.COL1 query: { _id: ObjectId('000000000c69048cf9cb8305') } ntoreturn:1 idhack:1 reslen:573 108ms
Mon Apr 23 18:05:56 [conn342] query rep.COL1 query: { _id: ObjectId('00000000ef051c0bb8c15868') } ntoreturn:1 idhack:1 reslen:541 116ms
Mon Apr 23 18:06:31 [conn342] query rep.COL4 query: { _id: ObjectId('00000000d653c735bcb2dbbc') } ntoreturn:1 idhack:1 reslen:180 110ms
Mon Apr 23 18:06:54 [conn342] query rep.COL3 query: { _id: ObjectId('0000000007bd2d693e1b3258') } ntoreturn:1 idhack:1 reslen:180 103ms
Mon Apr 23 18:06:58 [conn342] query rep.COL1 query: { _id: ObjectId('000000007bfa3216e128954e') } ntoreturn:1 idhack:1 reslen:541 104ms
Mon Apr 23 18:06:58 [conn342] query rep.COL1 query: { _id: ObjectId('00000000e2f2f135abcf95a3') } ntoreturn:1 idhack:1 reslen:573 118ms
Mon Apr 23 18:07:31 [conn342] query rep.COL4 query: { _id: ObjectId('000000008783f480297bcfb4') } ntoreturn:1 idhack:1 reslen:212 105ms
Mon Apr 23 18:08:31 [conn342] query rep.COL1 query: { _id: ObjectId('0000000008bd5bc37e44d103') } ntoreturn:1 idhack:1 reslen:573 125ms
Mon Apr 23 18:08:56 [conn342] query rep.COL3 query: { _id: ObjectId('000000008cfb361ebd60d729') } ntoreturn:1 idhack:1 reslen:212 120ms
Mon Apr 23 18:09:31 [conn343] update rep.COL1 query: { _id: ObjectId('000000000855f59be5d03f75') } update: { $inc: {}, $set: { value: BinData } } idhack:1 moved:1 200ms

Do read operations also take write-lock?

Adam C

unread,

Apr 23, 2012, 10:46:59 AM4/23/12

to mongod...@googlegroups.com

No, read operations take a read lock and many can run concurrently, the time measured to return here does not include the time taken to acquire a lock, only to run the query:

http://www.mongodb.org/display/DOCS/How+does+concurrency+work#Howdoesconcurrencywork-mongod

Can you post the output of db.COL1.stats()?

I picked COL1 since it shows up the most frequently below.

Adam.

Message has been deleted

Eugene Bolotin

unread,

Apr 23, 2012, 11:15:22 AM4/23/12

to mongod...@googlegroups.com

> db.COL1.stats()
{
        "ns" : "rep.COL1",
        "count" : 13359659,
        "size" : 9713845976,
        "avgObjSize" : 727.1028381787289,
        "storageSize" : 14007103232,
        "numExtents" : 36,
        "nindexes" : 1,
        "lastExtentSize" : 2006649856,
        "paddingFactor" : 1.4799999985114942,
        "flags" : 1,
        "totalIndexSize" : 720323520,
        "indexSizes" : {
                "_id_" : 720323520
        },
        "ok" : 1

Adam C

unread,

Apr 23, 2012, 11:41:46 AM4/23/12

to mongod...@googlegroups.com

Quick question for you - are you zeroing out timestamps in all ObjectIDs (first 4 bytes) or is it only 0 timestamp documents that are showing up as slow queries?

There are definitely a significant number of moves happening here - do you think you can reduce that behavior (growing the documents) or is it just a feature of how you use the data? The fact that you just have an _id index means that it is not as costly as it could be (all indexes for a document must be updated when it is moved) but it is not going to help performance.

Adam

Eugene Bolotin

unread,

Apr 23, 2012, 12:18:54 PM4/23/12

to mongod...@googlegroups.com

I'm zeroing timestamp manually. I just convert ui64 to ObjectId string.
Growing of objects is a feature of my service. It happens not often - one time per four months. But elements are updated when accessing.

Adam C

unread,

Apr 23, 2012, 12:53:34 PM4/23/12

to mongod...@googlegroups.com

Do the incidents of slowness correspond to (or increase during) the growth periods or are things pretty consistent all round?

If the moves are periodic as you suggest then you could schedule a compact on a collection that has had a significant number of moves once they are done (or a repair, if space is a concern). That would defragment the collection somewhat. However, there is a big caveat with doing so - compaction and repairing remove all padding, so the impact of the next round of moves may be more severe.

If you were going to test this, I would recommend a compaction of a single collection for testing/comparison purposes before attempting it on a widespread basis.

Alternatively, if this is write lock contention and not an issue with moves, then the upcoming 2.1/2.2 concurrency improvements will at least isolate them to a specific database.

Adam.

Eugene Bolotin

unread,

Apr 24, 2012, 7:22:26 AM4/24/12

to mongod...@googlegroups.com

They are consistent. When service1 was alone, there was maximum of 10,000 errors per day. But when second service started using same mongo instance - error count rose to 30,000 per day. There was no records moving.
And the second service has a small database. I thought that its queries don't lead to page faults.

Adam C

unread,

Apr 24, 2012, 8:29:13 AM4/24/12

to mongod...@googlegroups.com

I just re-read this thread and I'd like to go bat to the fact that your timeout on the application side for this is 250ms, but you are not seeing that in the mongo logs (though there are ops hitting the default slowms of 100ms). Given that, and the fact that you are seeing a far higher incidence of the 250ms timer being hit than the logged queries, it would suggest that the root casue of the issue lies somewhere other than the response times of the queries themselves.

You mentioned that you have 64 threads but a connection pool of just 10 - have you considered (or tried) increasing the pool?

Is it possible that your threads are blocking/waiting for a connection to be free and that is the cause of your timer being triggered?

Adam

Message has been deleted

Eugene Bolotin

unread,

Apr 24, 2012, 9:32:27 AM4/24/12

to mongod...@googlegroups.com

Yes, I tried different pool size (from 1 to 100) and I got cascades of read/write timeouts there. Strange is that I get groups of read with same timeout.
Maybe lot of write queries with timeout < 100 ms acquire the lock?

I wrote small python script (pymongo) that synchronously read (only reads) records from database. And create big (50 millions) and small (100 thousand) database similar to production databases.

2 instances of script read from BIG database.
And one instance reads from SMALL database.

Total load is 800 rps. Big database script eventualy gets timeouts > 100 ms. And I get timeouts of script working with small database. Isn't that strange? Maybe that's the reason why I'm getting groups of simultaneous read timeouts?

Adam C

unread,

Apr 24, 2012, 9:42:24 AM4/24/12

to mongod...@googlegroups.com

If both are in the same instance it sounds like after a certain point you are faulting to disk on those reads, i.e. your working set outgrows the available physical RAM - how does your RAM utilisation and page fault count look when you start to hit the 100ms+ responses on the reads?

Your read volumes (and indeed the writes) don't seem excessive and the stats you have posted would suggest that IO is not the issue, unless the stats were from a time when the tests were not running. I have seen loads far in excess of this without causing similar issues, so we must be missing something here. Are these hosts in MMS (if so I would just need the group name to take a look).

Adam

Eugene Bolotin

unread,

Apr 26, 2012, 10:44:28 AM4/26/12

to mongod...@googlegroups.com

I get following mongostat without second script that works with small DB:

insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn time

     0    893      0      0       0       1       0 26.1g 52.5g   552m   1155        0          0       0|0     1|0    69k   421k     4   17:09:15
     0    844      0      0       0       1       0 26.1g 52.5g   552m   1096        0          0       0|0     2|0    65k   398k     4   17:09:16
     0    940      0      0       0       1       0 26.1g 52.5g   553m   1180        0          0       0|0     1|0    73k   443k     4   17:09:17
     0    901      0      0       0       1       0 26.1g 52.5g   553m   1180        0          0       0|0     2|0    70k   424k     4   17:09:18
     0    876      0      0       0       1       0 26.1g 52.5g   554m   1099        0          0       0|0     2|0    68k   413k     4   17:09:19

I saw that second script is getting errors only few seconds after start (maybe while data is loading into memory):
#BET 2012-04-26 17:08:05.212873         11862 340 ms
#BET 2012-04-26 17:08:05.452267         28997 229 ms

So there is no problem when working with small and large databse simultaneous.

The way I'm working with data is following:
- read by _id
- process
- write _id

So data should be in memory when write operation starts. It means that write operation should be fast. And mongodb.log shows it. There are many query timeout records and few update timeout records (100 slow reads/1 slow update).
Read operation doesn't use write-lock. Now there are no slow moving operations in log. Now I'm doing 5th review of that code.

Adam C

unread,

Apr 26, 2012, 11:07:56 AM4/26/12

to mongod...@googlegroups.com

I see a lot of faults being reported, but your res memory is staying pretty constant at ~552MB - how much memory is there on this host?

If you are faulting that much I would expect your res memory to be growing, though it might take longer than the snapshot provided here. It also seems a lot lower than the last time you posted mongostat (then it was >11GB) - is this the same instance, was it restarted?

Adam

Eugene Bolotin

unread,

Apr 27, 2012, 5:19:07 AM4/27/12

to mongod...@googlegroups.com

No. That is test host - I'm running two test scripts there trying to reproduce the problem.

I also got core file for situation when one read takes a long time (it aborts when query_time > 100ms. That's in service that works with big database):
(gdb) info threads
19 Thread 0x7fffee042700 (LWP 4639) 0x00007ffff6ad9d6b in _int_free (av=0x7fffe0000020, p=0x7fffe008d9c0) at malloc.c:4823
18 Thread 0x7fffee843700 (LWP 4638) 0x00007ffff6e050dc in __libc_recv (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized out>, flags=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34
17 Thread 0x7fffef044700 (LWP 4637) 0x00007ffff6e04fdd in accept () at ../sysdeps/unix/syscall-template.S:82
16 Thread 0x7fffef845700 (LWP 4636) 0x00007ffff6e050dc in __libc_recv (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized out>, flags=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34
15 Thread 0x7ffff0046700 (LWP 4635) 0x00007ffff6e050dc in __libc_recv (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized out>, flags=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34
14 Thread 0x7ffff0847700 (LWP 4634) 0x00007ffff6e04fdd in accept () at ../sysdeps/unix/syscall-template.S:82
13 Thread 0x7ffff1048700 (LWP 4633) 0x00007ffff6e050dc in __libc_recv (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized out>, flags=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34
* 12 Thread 0x7ffff1849700 (LWP 4632) __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:97
11 Thread 0x7ffff204a700 (LWP 4631) 0x00007ffff6e050dc in __libc_recv (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized out>, flags=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34
10 Thread 0x7ffff284b700 (LWP 4630) 0x00007ffff6e050dc in __libc_recv (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized out>, flags=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34
9 Thread 0x7ffff304c700 (LWP 4629) 0x00007ffff6e050dc in __libc_recv (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized out>, flags=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34
8 Thread 0x7ffff384d700 (LWP 4628) 0x00007ffff6a94d05 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
7 Thread 0x7ffff404e700 (LWP 4627) 0x00007ffff6e050dc in __libc_recv (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized out>, flags=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34
6 Thread 0x7ffff484f700 (LWP 4626) 0x00007ffff6e04fdd in accept () at ../sysdeps/unix/syscall-template.S:82
5 Thread 0x7ffff5050700 (LWP 4625) 0x00007ffff6e04fdd in accept () at ../sysdeps/unix/syscall-template.S:82
4 Thread 0x7ffff5851700 (LWP 4624) 0x00007ffff6e04fdd in accept () at ../sysdeps/unix/syscall-template.S:82
3 Thread 0x7ffff6052700 (LWP 4623) 0x00007ffff6b0f13d in nanosleep () at ../sysdeps/unix/syscall-template.S:82
2 Thread 0x7ffff6a60700 (LWP 4622) 0x00007ffff6b0f13d in nanosleep () at ../sysdeps/unix/syscall-template.S:82
1 Thread 0x7ffff7fcd740 (LWP 4555) 0x00007ffff6b0f13d in nanosleep () at ../sysdeps/unix/syscall-template.S:82

There are 8 threads on __libc_recv, 1 thread (#8) that raised signal and 1 thread _int_free. Other threads wait on accept or on nanosleep (service threads).

I got backtraces for threads that are in __libc_recv - all backtraces are similar:

#0 0x00007ffff6e050dc in __libc_recv (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized out>, flags=<value optimized out>) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34
#1 0x0000000000453f0f in mongo_read_socket (conn=0x71af10, buf=0x7ffff404ca40, len=16) at /home/user/mongoservice/env_default.c:57
#2 0x0000000000450bca in mongo_read_response (conn=0x71af10, reply=0x7fffe008a6f0) at /home/user/mongoservice/mongo.c:164
#3 0x00000000004522cb in mongo_cursor_op_query (cursor=0x7fffe008a6f0) at /home/user/mongoservice/mongo.c:750
#4 0x00000000004526f9 in mongo_find (conn=0x71af10, ns=0x7fffe0092cf8 "rep.COL1", query=0x7ffff404cd70, fields=0x7ffff404ccc0, limit=1, skip=0, options=4)
    at /home/user/mongoservice/mongo.c:838
#5 0x0000000000449a91 in nosql::MongoClientRaw::FindOne (this=0x71af10, db=..., collection=..., id=..., hash=..., err=@0x7ffff404d44f)
    at /home/user/mongoservice/mongoclient.cpp:147

_int_free thread:
#0 0x00007ffff6ad9d6b in _int_free (av=0x7fffe0000020, p=0x7fffe008d9c0) at malloc.c:4823
#1 0x00007ffff6ade3e3 in __libc_free (mem=<value optimized out>) at malloc.c:3738
#2 0x00007ffff754c019 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x000000000043574d in TMongoDb<TSenderData >::MongoRead (this=0x6fffa8, db=0x49d1d8 "rep", collection=0x7fffe00997b8 "COL1", index=40621381955, data=...,
    err=@0x7fffee04144f) at /home/user/mongoservice/mongostorage.h:86

I also have timestamp (is millis) variable in MongoRead function:
#7 start = 1335456660339
#8 start = 1335456660338, query_time = 337 (thread raises signal)
#9 start = 1335456660341
#10 start = 1335456660340
#11 start = 1335456660340
#13 start = 1335456660336
#15 start = 1335456660340
#16 start = 1335456660337
#18 start = 1335456660339
#19 start = 1335456660338
Start times are similar. That means that all queries started at same time. And all of them are read queries and took time > 300 ms.

1335456660338 ms = 2012-04-26 20:11:00

And mongodb.log:
hu Apr 26 20:10:28 [initandlisten] connection accepted from 127.0.0.1:57832 #405
Thu Apr 26 20:10:28 [initandlisten] connection accepted from 127.0.0.1:57833 #406
Thu Apr 26 20:10:28 [initandlisten] connection accepted from 127.0.0.1:57834 #407
Thu Apr 26 20:10:28 [initandlisten] connection accepted from 127.0.0.1:57835 #408
Thu Apr 26 20:11:00 [conn345] query rep.COL1 query: { _id: ObjectId('000000000000000001027908') } ntoreturn:1 idhack:1 reslen:454 338ms

Maybe all of queries caused page faults but why there is only one slow query record in mongodb.log?

Adam C

unread,

Apr 27, 2012, 6:54:45 AM4/27/12

to mongod...@googlegroups.com

The traces seem consistent with the logging, one of the threads exited with an abort because it was slow, and one query was logged as slow in MongoDB. I don't see where there's any other evidence of slowness here.

As I mentioned previously, you are seeing quite a few page faults (in the last mongostat at least), if 9 of your queries hit memory, and one faults to disk, then that would be the likely reason for a slow return.

You could try re-running that slow query afterward, if we are right, a subsequent second find using the same _id would then return quickly because it would have been paged into memory. If your whole data set needs to be in memory then you might run a large find, essentially a table scan, to get it into memory and warm up the cache to avoid the slow responses. As long as the data set fits, and there is no memory pressure that should remove the issues you are seeing. What you get otherwise is essentially a lazy loading mechanism, page faults when you look for data not in memory, and a certain level of slower responses.

There are plans to implement a "touch" command to make this process less manual and allow admins to load required data more easily in advance, but for now the more manual approach is your best bet.

Adam

Reply all

Reply to author

Forward