Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
MongoDb 2.0.4 performance question
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  18 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Eugene Bolotin  
View profile  
 More options Apr 23 2012, 6:39 am
From: Eugene Bolotin <eugenebolo...@gmail.com>
Date: Mon, 23 Apr 2012 03:39:16 -0700 (PDT)
Local: Mon, Apr 23 2012 6:39 am
Subject: MongoDb 2.0.4 performance question

Hello!

I have two services running on single MongoDb. One service - service1 has
large database and makes load of 250 reads and 250 writes per second (64
threads):

> db.stats()

{
        "db" : "rep",
        "collections" : 10,
        "objects" : 41312419,
        "avgObjSize" : 384.2041617558149,
        "dataSize" : 15872403312,
        "storageSize" : 22404643072,
        "numExtents" : 173,
        "indexes" : 8,
        "indexSize" : 2240907136,
        "fileSize" : 34276900864,
        "nsSizeMB" : 16,
        "ok" : 1
}

Records of database used by first service are never deleted. All
collections have only _id index.

Second service - service2 has small database and makes load of 200 reads
and 200 writes per second:

> db.stats()

{
        "db" : "cle",
        "collections" : 182,
        "objects" : 1166724,
        "avgObjSize" : 36.41681151669118,
        "dataSize" : 42488368,
        "storageSize" : 135518208,
        "numExtents" : 582,
        "indexes" : 180,
        "indexSize" : 53748688,
        "fileSize" : 2080374784,
        "nsSizeMB" : 16,
        "ok" : 1
}

Collections of database used by second service are named by template
<name>-<date>. Each collection of that database lives 2 days and then
deleted. All collections have only _id index.

output of ps command:
root     17958  1.8 71.6 75749984 11796248 ?   Sl   Apr04 502:06
/opt/mongodb/bin/mongod --fork --dbpath /opt/mongodb/data --logpath
/opt/mongodb/mongodb.log

mongodb.log for today:
Mon Apr 23 00:26:53 [conn342] update rep.COL1 query: { _id:
ObjectId('00000000bd7e7c32c452e23d') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 fastmodinsert:1 112ms
Mon Apr 23 01:10:53 [conn342] update rep.COL2 query: { _id:
ObjectId('00000000e6c2c037fd5edcf5') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 161ms
Mon Apr 23 03:12:53 [conn342] update rep.COL3 query: { _id:
ObjectId('00000000274c464bb45b8ff6') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 fastmodinsert:1 107ms
Mon Apr 23 04:25:27 [conn342] update rep.COL4 query: { _id:
ObjectId('00000000a89c8355fdcb9a5d') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 fastmodinsert:1 113ms
Mon Apr 23 05:49:53 [conn342] update rep.COL4 query: { _id:
ObjectId('000000003e343bb27f7e9251') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 fastmodinsert:1 145ms
Mon Apr 23 09:26:56 [conn342] update rep.COL3 query: { _id:
ObjectId('000000004a69d31e08bcd739') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 fastmodinsert:1 132ms
Mon Apr 23 09:32:03 [conn342] update rep.COL3 query: { _id:
ObjectId('000000002c3b86882c0b3d82') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 fastmodinsert:1 102ms
Mon Apr 23 09:55:58 [conn342] update rep.COL3 query: { _id:
ObjectId('000000001b5d75b6d066c5f6') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 fastmodinsert:1 126ms
Mon Apr 23 10:03:38 [conn342] update rep.COL5 query: { _id:
ObjectId('00000000e3aa7531772e1b68') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 fastmodinsert:1 116ms
Mon Apr 23 11:11:01 [conn342] update rep.COL1 query: { _id:
ObjectId('00000000fe5137b0e49aac02') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 moved:1 129ms
Mon Apr 23 11:13:03 [conn342] update rep.COL3 query: { _id:
ObjectId('0000000044468b84bfa8ea32') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 moved:1 151ms
Mon Apr 23 11:21:03 [conn342] update rep.COL5 query: { _id:
ObjectId('0000000031008cc640ceb4eb') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 moved:1 130ms
Mon Apr 23 11:30:35 [conn342] update rep.COL2 query: { _id:
ObjectId('000000005d9579736a24351d') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 fastmodinsert:1 154ms
Mon Apr 23 12:18:05 [conn344] update rep.COL3 query: { _id:
ObjectId('00000000dfa91e975532115b') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 moved:1 104ms
Mon Apr 23 12:27:00 [conn342] update rep.COL2 query: { _id:
ObjectId('000000009d38cc0d306de80b') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 127ms
Mon Apr 23 12:58:03 [conn342] update rep.COL5 query: { _id:
ObjectId('00000000ec354eccca778c07') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 moved:1 114ms
Mon Apr 23 13:14:39 [conn342] update rep.COL2 query: { _id:
ObjectId('00000000b0d5a2cafc9c1ce0') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 moved:1 123ms
Mon Apr 23 13:16:57 [conn342] update rep.COL5 query: { _id:
ObjectId('00000000b0a4350169fc56a9') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 fastmodinsert:1 125ms
Mon Apr 23 13:22:05 [conn342] update rep.COL5 query: { _id:
ObjectId('00000000b99f64016f29de76') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 fastmodinsert:1 103ms
Mon Apr 23 13:28:02 [conn343] update rep.COL5 query: { _id:
ObjectId('00000000c10001c66ce7f36f') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 moved:1 110ms
Mon Apr 23 14:10:02 [conn342] update rep.COL3 query: { _id:
ObjectId('00000000f583c9db64e30f39') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 moved:1 103ms
Mon Apr 23 14:15:58 [conn342] update rep.COL3 query: { _id:
ObjectId('000000002b783bfc4b327eed') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 101ms
Mon Apr 23 14:18:34 [conn342] update rep.COL5 query: { _id:
ObjectId('00000000a4923985fa00537e') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 moved:1 111ms

Output of mongostat:
insert  query update delete getmore command flushes mapped  vsize    res
faults locked % idx miss %     qr|qw   ar|aw  netIn netOut  conn       time
     0    456    230      0       0       1       0  35.9g  72.2g  
11.2g      8      1.1          0       0|1     1|1   129k   171k    24  
14:27:49
     0    372    187      0       0       1       0  35.9g  72.2g  
11.2g     11        1          0       0|0     0|0   119k   171k    24  
14:27:50
     0    456    229      0       0       1       0  35.9g  72.2g  
11.2g      6        1          0       0|0     0|0   129k   173k    24  
14:27:51
     0    504    255      0       0       1       0  35.9g  72.2g  
11.2g      9      0.8          0       0|0     0|0   139k   180k    24  
14:27:52
     0    368    185      0       0       1       0  35.9g  72.2g  
11.3g      3      0.5          0       0|0     0|0   109k   151k    24  
14:27:53
     0    440    223      0       0       1       0  35.9g  72.2g  
11.2g      5      1.1          0       0|0     0|0   125k   165k    24  
14:27:54
     0    580    291      0       0       1       0  35.9g  72.2g  
11.2g     12      3.1          0       0|0     0|0   151k   189k    24  
14:27:55
     0    378    189      0       0       1       0  35.9g  72.2g  
11.3g     11      0.5          0       0|0     0|0   114k   161k    24  
14:27:56
     0    463    235      0       0       1       0  35.9g  72.2g  
11.3g      4      0.7          0       0|0     0|0   126k   162k    24  
14:27:57
     0    464    220      0       0       1       0  35.9g  72.2g  
11.3g      6      0.6          0       0|0     0|0   132k   178k    24  
14:27:58
insert  query update delete getmore command flushes mapped  vsize    res
faults locked % idx miss %     qr|qw   ar|aw  netIn netOut  conn       time
     0    406    220      0       0       1       0  35.9g  72.2g  
11.3g      5      0.6          0       0|0     0|0   116k   152k    24  
14:27:59
     0    388    194      0       0       1       0  35.9g  72.2g  
11.3g      7      0.5          0       0|0     0|0   115k   161k    24  
14:28:00
     0    382    189      0       0       1       0  35.9g  72.2g  
11.3g      7      0.6          0       9|1     9|1   112k   155k    24  
14:28:01
     0    453    225      0       0       1       0  35.9g  72.2g  
11.2g     10      1.5          0       7|2     8|2   124k   165k    24  
14:28:02
     0    438    229      0       0       1       0  35.9g  72.2g  
11.3g      8      0.6          0       0|0     0|0   132k   175k    24  
14:28:03
     0    464    234      0       0       1       0  35.9g  72.2g  
11.2g      5      1.1          0       0|0     0|0   134k   182k    24  
14:28:04
     0    387    194      0       0       1       0  35.9g  72.2g  
11.2g      8      0.5          0       1|1     2|1   113k   158k    24  
14:28:05
     0    489    246      0       0       1       0  35.9g  72.2g  
11.3g      8      0.7          0       0|0     0|0   130k   167k    24  
14:28:06
     0    446    227      0       0       1       1  35.9g  72.2g  
11.2g      5      1.2          0       0|0     0|0   119k   152k    24  
14:28:07
     0    442    223      0       0       1       0  35.9g  72.2g  
11.2g      4      0.7          0       0|0     0|0   128k   174k    24  
14:28:08

Queries to first service (with large database) have a timeout of 250 ms. I
have a metric - error counter per 5 minutes. And now it has average value
of 400 (1.3 error/second).
But I don't see so many long queries in mongodb.log and don't see so many
page faults in mongostat.

How can I investigate it?
Maybe my problem is concerned with write-lock?
Does write-lock yield when I get page-fault on read?

Thanks in advance.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Adam C  
View profile  
 More options Apr 23 2012, 7:34 am
From: Adam C <ad...@10gen.com>
Date: Mon, 23 Apr 2012 04:34:51 -0700 (PDT)
Local: Mon, Apr 23 2012 7:34 am
Subject: Re: MongoDb 2.0.4 performance question

Where are you measuring the 250ms timeout?

If it is on the application side, do you think you might be waiting for
threads/connections to come available?  You don't mention the
language/driver used, but there is usually a multiplier in play for the
number of threads that can be blocking/waiting (anything above that limit
will then throw an exception).

Assuming you are not using findAndModify (which does not yield), the long
running ops and page faulting ops should be yielding - if that is happening
though you will see those return more slowly as a result of the yield.

The lock percentage looks like it's not an issue judging by what you pasted
from mongostat.

The slow operations from the log are all updates and it would seem they are
triggering moves (outgrowing their original space) - those are relatively
expensive, so you might want to look at padding those docs to prevent the
moves
(http://www.mongodb.org/display/DOCS/Padding+Factor#PaddingFactor-Manu...)

Finally, how does your disk IO look (iostat -xm 2)? if the disk was getting
backed up it might also explain some of your issues.

Adam

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Eugene Bolotin  
View profile  
 More options Apr 23 2012, 9:14 am
From: Eugene Bolotin <eugenebolo...@gmail.com>
Date: Mon, 23 Apr 2012 06:14:30 -0700 (PDT)
Local: Mon, Apr 23 2012 9:14 am
Subject: Re: MongoDb 2.0.4 performance question

Adam, thanks for your reply.

I use mongo-c-driver, one of the latest version. But my services are
written in C++. I use pool of 10 connections to mongo.
I measure findOne and update request times. I saw that if one request (read
or write) takes more than 100 ms all simultaneous read/write requests in
that time take similar time (>100 ms ~10 requests).

iostat -xm 2:
md2               0,00     0,00   13,75   81,88     0,16     0,32    
10,37     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   37,50    9,50     0,31     0,08    
17,02     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   24,00  357,50     0,30     1,43    
9,27     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   23,00 1238,00     0,23     4,87    
8,27     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   24,50  246,50     0,19     1,00    
8,99     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   29,50  505,50     0,26     2,01    
8,67     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   44,50  630,00     0,41     2,50    
8,84     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   35,50   71,50     0,46     0,31    
14,77     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   32,00    9,50     0,31     0,07    
18,80     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   31,00   20,00     0,38     0,13    
20,55     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   30,50    9,50     0,25     0,09    
17,60     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   34,00    9,50     0,22     0,07    
14,07     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   23,00   17,50     0,27     0,11    
19,06     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   32,00    9,50     0,28     0,08    
18,02     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   35,00   16,50     0,28     0,11    
15,61     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   38,50    9,50     0,42     0,07    
21,17     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   28,00    9,50     0,31     0,07    
20,80     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   43,00   17,50     0,37     0,12    
16,53     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   39,50    9,50     0,32     0,07    
16,49     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   31,00  584,50     0,25     2,32    
8,55     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   42,00   13,00     0,43     0,08    
19,27     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   39,00    9,50     0,46     0,07    
22,43     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   25,50 1150,50     0,22     4,50    
8,22     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   45,00  568,50     0,49     2,29    
9,29     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   27,00  395,50     0,29     1,57    
8,99     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   66,00   36,00     0,52     0,18    
14,08     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   42,50    9,50     0,44     0,08    
20,54     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   21,50   10,00     0,20     0,08    
18,29     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   39,00   17,00     0,38     0,12    
17,93     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   33,50    9,50     0,29     0,08    
17,77     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   33,00   17,50     0,24     0,12    
14,42     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   32,50    9,00     0,38     0,07    
22,46     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   35,50  231,00     0,30     0,95    
9,56     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   41,50 1048,00     0,42     4,13    
8,54     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   28,00  241,50     0,19     0,97    
8,76     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   21,00  518,00     0,19     2,07    
8,59     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   24,00  573,50     0,29     2,28    
8,82     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   40,00   13,50     0,31     0,10    
15,63     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   19,00    9,50     0,21     0,09    
21,19     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   18,00   23,00     0,13     0,12    
12,78     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   51,00    9,50     0,56     0,10    
22,28     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   23,50    9,50     0,23     0,08    
18,91     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   47,00   17,00     0,42     0,11    
16,75     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   47,50    9,00     0,42     0,08    
18,12     0,00    0,00    0,00    0,00   0,00   0,00
md2               0,00     0,00   35,00   17,50     0,34     0,11    
17,22     0,00    0,00    0,00    0,00   0,00   0,00

md2 (type ext3, software RAID 2 disks)

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Adam C  
View profile  
 More options Apr 23 2012, 9:55 am
From: Adam C <ad...@10gen.com>
Date: Mon, 23 Apr 2012 06:55:47 -0700 (PDT)
Local: Mon, Apr 23 2012 9:55 am
Subject: Re: MongoDb 2.0.4 performance question

As a general recommendation I would suggest switching to ext4 (or XFS)
rather than ext3 (ext3 is inefficient when allocating new data files), but
that is unlikely to be the source of your issues, and the iostat looks
clean.

Your mongostat output shows brief queuing (qr|qw = queued reads/writes), so
when that ticks up you will see a longer time to return.  The operation
occurring at that time is taking a while and, given that it is blocking
reads, is likely a long running write (given your logging output I think
the updates are the cause) - multiple reads can occur simultaneously but
are blocked by writes.

As I mentioned, updates which grow a document significantly and trigger a
move are relatively costly (and slow) writes.  If you can identify those
operations and either manually pad the document to prevent it from growing
or alter the queries to prevent the growth itself I think you will see
better performance.  You can also gauge how many of these are occurring by
looking at the "paddingfactor" field in the collection stats
(db.coll.stats()) - the closer this value is to 2 the more moves are being
triggered (a value of 1 means no moves).

Adam

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Eugene Bolotin  
View profile  
 More options Apr 23 2012, 10:22 am
From: Eugene Bolotin <eugenebolo...@gmail.com>
Date: Mon, 23 Apr 2012 07:22:41 -0700 (PDT)
Local: Mon, Apr 23 2012 10:22 am
Subject: Re: MongoDb 2.0.4 performance question

Adam, I grep mongodb.log in scale of whole day. And I get lot more slow
queries than updates. For example at last 15 minutes:
Mon Apr 23 17:52:00 [conn342] query rep.COL1 query: { _id:
ObjectId('000000001c075f75267635c5') } ntoreturn:1 idhack:1 reslen:573 107ms
Mon Apr 23 17:52:56 [conn342] query rep.COL1 query: { _id:
ObjectId('00000000c081ea20620e3f0b') } ntoreturn:1 idhack:1 reslen:573 133ms
Mon Apr 23 17:53:54 [conn342] query rep.COL3 query: { _id:
ObjectId('0000000017995bfd03e39405') } ntoreturn:1 idhack:1 reslen:212 129ms
Mon Apr 23 17:53:57 [conn342] query rep.COL6 query: { _id:
ObjectId('000000000aff2d83030b57be') } ntoreturn:1 idhack:1 reslen:180 120ms
Mon Apr 23 17:54:04 [conn342] query rep.COL1 query: { _id:
ObjectId('000000006477beb87f3ef95c') } ntoreturn:1 idhack:1 reslen:541 308ms
Mon Apr 23 17:54:56 [conn342] query rep.COL4 query: { _id:
ObjectId('0000000031a87172c8070c01') } ntoreturn:1 idhack:1 reslen:212 122ms
Mon Apr 23 17:54:56 [conn342] query rep.COL6 query: { _id:
ObjectId('000000007c3a32295cceb602') } ntoreturn:1 idhack:1 reslen:212 153ms
Mon Apr 23 17:55:31 [conn342] query rep.COL1 query: { _id:
ObjectId('00000000580997eb011a40be') } ntoreturn:1 idhack:1 reslen:573 147ms
Mon Apr 23 17:55:58 [conn344] query rep.COL1 query: { _id:
ObjectId('00000000498cbd1aa0895854') } ntoreturn:1 idhack:1 reslen:541 142ms
Mon Apr 23 17:56:56 [conn342] query rep.COL1 query: { _id:
ObjectId('000000002d303e80f2d30731') } ntoreturn:1 idhack:1 reslen:573 131ms
Mon Apr 23 17:57:31 [conn342] query rep.COL5 query: { _id:
ObjectId('00000000f5538c272c7bb038') } ntoreturn:1 idhack:1 reslen:180 130ms
Mon Apr 23 17:57:59 [conn342] query rep.COL3 query: { _id:
ObjectId('00000000ab426aa49954f73d') } ntoreturn:1 idhack:1 reslen:180 130ms
Mon Apr 23 17:58:01 [conn342] query rep.COL1 query: { _id:
ObjectId('00000000a90f36d519c93812') } ntoreturn:1 idhack:1 reslen:573 125ms
Mon Apr 23 17:58:55 [conn342] query rep.COL1 query: { _id:
ObjectId('00000000b2daef3c3d8869a2') } ntoreturn:1 idhack:1 reslen:573 111ms
Mon Apr 23 17:59:01 [conn342] query rep.COL1 query: { _id:
ObjectId('00000000eb8214af01837d90') } ntoreturn:1 idhack:1 reslen:541 131ms
Mon Apr 23 17:59:31 [conn342] query rep.COL1 query: { _id:
ObjectId('00000000e28b6715bbd7af84') } ntoreturn:1 idhack:1 reslen:541 172ms
Mon Apr 23 17:59:58 [conn342] query rep.COL1 query: { _id:
ObjectId('0000000029d153c9964c5707') } ntoreturn:1 idhack:1 reslen:573 105ms
Mon Apr 23 17:59:59 [conn342] query rep.COL1 query: { _id:
ObjectId('00000000d43d57b438a8eb46') } ntoreturn:1 idhack:1 reslen:541 115ms
Mon Apr 23 18:00:01 [conn342] query rep.COL1 query: { _id:
ObjectId('00000000364ed4a459509c9a') } ntoreturn:1 idhack:1 reslen:537 110ms
Mon Apr 23 18:00:31 [conn342] query rep.COL1 query: { _id:
ObjectId('000000005470edaa6beac4a7') } ntoreturn:1 idhack:1 reslen:573 110ms
Mon Apr 23 18:01:31 [conn342] query rep.COL1 query: { _id:
ObjectId('000000005beac4f96db5acd5') } ntoreturn:1 idhack:1 reslen:573 129ms
Mon Apr 23 18:01:57 [conn342] query rep.COL3 query: { _id:
ObjectId('0000000013193c8704281871') } ntoreturn:1 idhack:1 reslen:212 176ms
Mon Apr 23 18:01:58 [conn342] query rep.COL3 query: { _id:
ObjectId('0000000067997169732e48ad') } ntoreturn:1 idhack:1 reslen:212 119ms
Mon Apr 23 18:02:31 [conn342] query rep.COL1 query: { _id:
ObjectId('00000000b249e7a0bc2fa006') } ntoreturn:1 idhack:1 reslen:573 103ms
Mon Apr 23 18:02:57 [conn345] query rep.COL2 query: { _id:
ObjectId('0000000013a45621305efd81') } ntoreturn:1 idhack:1 reslen:180 102ms
Mon Apr 23 18:03:31 [conn342] query rep.COL3 query: { _id:
ObjectId('0000000042b79df9ae9db0aa') } ntoreturn:1 idhack:1 reslen:180 129ms
Mon Apr 23 18:04:00 [conn342] query rep.COL1 query: { _id:
ObjectId('000000006e2e7a0fd7b7a7df') } ntoreturn:1 idhack:1 reslen:541 112ms
Mon Apr 23 18:04:05 [conn342] query rep.COL1 query: { _id:
ObjectId('00000000ca8515792db26700') } ntoreturn:1 idhack:1 reslen:541 444ms
Mon Apr 23 18:04:58 [conn342] query rep.COL3 query: { _id:
ObjectId('000000008bb753d2d459bf11') } ntoreturn:1 idhack:1 reslen:212 139ms
Mon Apr 23 18:05:34 [conn342] query rep.COL6 query: { _id:
ObjectId('0000000020974ec901e68bed') } ntoreturn:1 idhack:1 reslen:180 117ms
Mon Apr 23 18:05:56 [conn342] query rep.COL1 query: { _id:
ObjectId('000000000c69048cf9cb8305') } ntoreturn:1 idhack:1 reslen:573 108ms
Mon Apr 23 18:05:56 [conn342] query rep.COL1 query: { _id:
ObjectId('00000000ef051c0bb8c15868') } ntoreturn:1 idhack:1 reslen:541 116ms
Mon Apr 23 18:06:31 [conn342] query rep.COL4 query: { _id:
ObjectId('00000000d653c735bcb2dbbc') } ntoreturn:1 idhack:1 reslen:180 110ms
Mon Apr 23 18:06:54 [conn342] query rep.COL3 query: { _id:
ObjectId('0000000007bd2d693e1b3258') } ntoreturn:1 idhack:1 reslen:180 103ms
Mon Apr 23 18:06:58 [conn342] query rep.COL1 query: { _id:
ObjectId('000000007bfa3216e128954e') } ntoreturn:1 idhack:1 reslen:541 104ms
Mon Apr 23 18:06:58 [conn342] query rep.COL1 query: { _id:
ObjectId('00000000e2f2f135abcf95a3') } ntoreturn:1 idhack:1 reslen:573 118ms
Mon Apr 23 18:07:31 [conn342] query rep.COL4 query: { _id:
ObjectId('000000008783f480297bcfb4') } ntoreturn:1 idhack:1 reslen:212 105ms
Mon Apr 23 18:08:31 [conn342] query rep.COL1 query: { _id:
ObjectId('0000000008bd5bc37e44d103') } ntoreturn:1 idhack:1 reslen:573 125ms
Mon Apr 23 18:08:56 [conn342] query rep.COL3 query: { _id:
ObjectId('000000008cfb361ebd60d729') } ntoreturn:1 idhack:1 reslen:212 120ms
Mon Apr 23 18:09:31 [conn343] update rep.COL1 query: { _id:
ObjectId('000000000855f59be5d03f75') } update: { $inc: {}, $set: { value:
BinData } } idhack:1 moved:1 200ms

Do read operations also take write-lock?

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Adam C  
View profile  
 More options Apr 23 2012, 10:46 am
From: Adam C <ad...@10gen.com>
Date: Mon, 23 Apr 2012 07:46:59 -0700 (PDT)
Local: Mon, Apr 23 2012 10:46 am
Subject: Re: MongoDb 2.0.4 performance question

No, read operations take a read lock and many can run concurrently, the
time measured to return here does not include the time taken to acquire a
lock, only to run the query:

http://www.mongodb.org/display/DOCS/How+does+concurrency+work#Howdoes...

Can you post the output of db.COL1.stats()?

I picked COL1 since it shows up the most frequently below.

Adam.

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Eugene Bolotin  
View profile  
 More options Apr 23 2012, 11:15 am
From: Eugene Bolotin <eugenebolo...@gmail.com>
Date: Mon, 23 Apr 2012 08:15:22 -0700 (PDT)
Local: Mon, Apr 23 2012 11:15 am
Subject: Re: MongoDb 2.0.4 performance question

> db.COL1.stats()

{
        "ns" : "rep.COL1",
        "count" : 13359659,
        "size" : 9713845976,
        "avgObjSize" : 727.1028381787289,
        "storageSize" : 14007103232,
        "numExtents" : 36,
        "nindexes" : 1,
        "lastExtentSize" : 2006649856,
        "paddingFactor" : 1.4799999985114942,
        "flags" : 1,
        "totalIndexSize" : 720323520,
        "indexSizes" : {
                "_id_" : 720323520
        },
        "ok" : 1

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Adam C  
View profile  
 More options Apr 23 2012, 11:41 am
From: Adam C <ad...@10gen.com>
Date: Mon, 23 Apr 2012 08:41:46 -0700 (PDT)
Local: Mon, Apr 23 2012 11:41 am
Subject: Re: MongoDb 2.0.4 performance question

Quick question for you - are you zeroing out timestamps in all ObjectIDs
(first 4 bytes) or is it only 0 timestamp documents that are showing up as
slow queries?

There are definitely a significant number of moves happening here - do you
think you can reduce that behavior (growing the documents) or is it just a
feature of how you use the data?  The fact that you just have an _id index
means that it is not as costly as it could be (all indexes for a document
must be updated when it is moved) but it is not going to help performance.  

Adam

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Eugene Bolotin  
View profile  
 More options Apr 23 2012, 12:18 pm
From: Eugene Bolotin <eugenebolo...@gmail.com>
Date: Mon, 23 Apr 2012 09:18:54 -0700 (PDT)
Local: Mon, Apr 23 2012 12:18 pm
Subject: Re: MongoDb 2.0.4 performance question

I'm zeroing timestamp  manually. I just convert ui64 to ObjectId string.
Growing of objects is a feature of my service. It happens not often - one
time per four months. But elements are updated when accessing.

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Adam C  
View profile  
 More options Apr 23 2012, 12:53 pm
From: Adam C <ad...@10gen.com>
Date: Mon, 23 Apr 2012 09:53:34 -0700 (PDT)
Local: Mon, Apr 23 2012 12:53 pm
Subject: Re: MongoDb 2.0.4 performance question

Do the incidents of slowness correspond to (or increase during) the growth
periods or are things pretty consistent all round?

If the moves are periodic as you suggest then you could schedule a compact
on a collection that has had a significant number of moves once they are
done (or a repair, if space is a concern).  That would defragment the
collection somewhat.  However, there is a big caveat with doing so -
compaction and repairing remove all padding, so the impact of the next
round of moves may be more severe.

If you were going to test this, I would recommend a compaction of a single
collection for testing/comparison purposes before attempting it on a
widespread basis.

Alternatively, if this is write lock contention and not an issue with
moves, then the upcoming 2.1/2.2 concurrency improvements will at least
isolate them to a specific database.  

Adam.

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Eugene Bolotin  
View profile  
 More options Apr 24 2012, 7:22 am
From: Eugene Bolotin <eugenebolo...@gmail.com>
Date: Tue, 24 Apr 2012 04:22:26 -0700 (PDT)
Local: Tues, Apr 24 2012 7:22 am
Subject: Re: MongoDb 2.0.4 performance question

They are consistent. When service1 was alone, there was maximum of 10,000
errors per day. But when second service started using same mongo instance -
error count rose to 30,000 per day. There was no records moving.
And the second service has a small database. I thought that its queries
don't lead to page faults.

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Adam C  
View profile  
 More options Apr 24 2012, 8:29 am
From: Adam C <ad...@10gen.com>
Date: Tue, 24 Apr 2012 05:29:13 -0700 (PDT)
Local: Tues, Apr 24 2012 8:29 am
Subject: Re: MongoDb 2.0.4 performance question

I just re-read this thread and I'd like to go bat to the fact that your
timeout on the application side for this is 250ms, but you are not seeing
that in the mongo logs (though there are ops hitting the default slowms of
100ms).  Given that, and the fact that you are seeing a far higher
incidence of the 250ms timer being hit than the logged queries, it would
suggest that the root casue of the issue lies somewhere other than the
response times of the queries themselves.

You mentioned that you have 64 threads but a connection pool of just 10 -
have you considered (or tried) increasing the pool?
Is it possible that your threads are blocking/waiting for a connection to
be free and that is the cause of your timer being triggered?

Adam

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Eugene Bolotin  
View profile  
 More options Apr 24 2012, 9:32 am
From: Eugene Bolotin <eugenebolo...@gmail.com>
Date: Tue, 24 Apr 2012 06:32:27 -0700 (PDT)
Local: Tues, Apr 24 2012 9:32 am
Subject: Re: MongoDb 2.0.4 performance question

Yes, I tried different pool size (from 1 to 100) and I got cascades of
read/write timeouts there. Strange is that I get groups of read with same
timeout.
Maybe lot of write queries with timeout < 100 ms acquire the lock?

I wrote small python script (pymongo) that synchronously read (only reads)
records from database. And create big (50 millions) and small (100
thousand) database similar to production databases.

2 instances of script read from BIG database.
And one instance reads from SMALL database.

Total load is 800 rps. Big database script eventualy gets timeouts > 100
ms. And I get timeouts of script working with small database. Isn't that
strange? Maybe that's the reason why I'm getting groups of simultaneous
read timeouts?

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Adam C  
View profile  
 More options Apr 24 2012, 9:42 am
From: Adam C <ad...@10gen.com>
Date: Tue, 24 Apr 2012 06:42:24 -0700 (PDT)
Local: Tues, Apr 24 2012 9:42 am
Subject: Re: MongoDb 2.0.4 performance question

If both are in the same instance it sounds like after a certain point you
are faulting to disk on those reads, i.e. your working set outgrows the
available physical RAM - how does your RAM utilisation and page fault count
look when you start to hit the 100ms+ responses on the reads?

Your read volumes (and indeed the writes) don't seem excessive and the
stats you have posted would suggest that IO is not the issue, unless the
stats were from a time when the tests were not running.  I have seen loads
far in excess of this without causing similar issues, so we must be missing
something here.  Are these hosts in MMS (if so I would just need the group
name to take a look).

Adam

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Eugene Bolotin  
View profile  
 More options Apr 26 2012, 10:44 am
From: Eugene Bolotin <eugenebolo...@gmail.com>
Date: Thu, 26 Apr 2012 07:44:28 -0700 (PDT)
Local: Thurs, Apr 26 2012 10:44 am
Subject: Re: MongoDb 2.0.4 performance question

I get following mongostat without second script that works with small DB:

insert  query update delete getmore command flushes mapped  vsize    res
faults locked % idx miss %     qr|qw   ar|aw  netIn netOut  conn       time
     0    893      0      0       0       1       0  26.1g  52.5g   552m  
1155        0          0       0|0     1|0    69k   421k     4   17:09:15
     0    844      0      0       0       1       0  26.1g  52.5g   552m  
1096        0          0       0|0     2|0    65k   398k     4   17:09:16
     0    940      0      0       0       1       0  26.1g  52.5g   553m  
1180        0          0       0|0     1|0    73k   443k     4   17:09:17
     0    901      0      0       0       1       0  26.1g  52.5g   553m  
1180        0          0       0|0     2|0    70k   424k     4   17:09:18
     0    876      0      0       0       1       0  26.1g  52.5g   554m  
1099        0          0       0|0     2|0    68k   413k     4   17:09:19

I saw that second script is getting errors only few seconds after start
(maybe while data is loading into memory):
#BET 2012-04-26 17:08:05.212873         11862 340 ms
#BET 2012-04-26 17:08:05.452267         28997 229 ms

So there is no problem when working with small and large databse
simultaneous.

The way I'm working with data is following:
- read by _id
- process
- write _id

So data should be in memory when write operation starts. It means that
write operation should be fast. And mongodb.log shows it. There are many
query timeout records and few update timeout records (100 slow reads/1 slow
update).
Read operation doesn't use write-lock. Now there are no slow moving
operations in log. Now I'm doing 5th review of that code.

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Adam C  
View profile  
 More options Apr 26 2012, 11:07 am
From: Adam C <ad...@10gen.com>
Date: Thu, 26 Apr 2012 08:07:56 -0700 (PDT)
Local: Thurs, Apr 26 2012 11:07 am
Subject: Re: MongoDb 2.0.4 performance question

I see a lot of faults being reported, but your res memory is staying pretty
constant at ~552MB - how much memory is there on this host?

If you are faulting that much I would expect your res memory to be growing,
though it might take longer than the snapshot provided here.  It also seems
a lot lower than the last time you posted mongostat (then it was >11GB) -
is this the same instance, was it restarted?

Adam

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Eugene Bolotin  
View profile  
 More options Apr 27 2012, 5:19 am
From: Eugene Bolotin <eugenebolo...@gmail.com>
Date: Fri, 27 Apr 2012 02:19:07 -0700 (PDT)
Local: Fri, Apr 27 2012 5:19 am
Subject: Re: MongoDb 2.0.4 performance question

No. That is test host - I'm running two test scripts there trying to
reproduce the problem.

I also got core file for situation when one read takes a long time (it
aborts when query_time > 100ms. That's in service that works with big
database):
(gdb) info threads
  19 Thread 0x7fffee042700 (LWP 4639)  0x00007ffff6ad9d6b in _int_free
(av=0x7fffe0000020, p=0x7fffe008d9c0) at malloc.c:4823
  18 Thread 0x7fffee843700 (LWP 4638)  0x00007ffff6e050dc in __libc_recv
(fd=<value optimized out>, buf=<value optimized out>, n=<value optimized
out>, flags=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34
  17 Thread 0x7fffef044700 (LWP 4637)  0x00007ffff6e04fdd in accept () at
../sysdeps/unix/syscall-template.S:82
  16 Thread 0x7fffef845700 (LWP 4636)  0x00007ffff6e050dc in __libc_recv
(fd=<value optimized out>, buf=<value optimized out>, n=<value optimized
out>, flags=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34
  15 Thread 0x7ffff0046700 (LWP 4635)  0x00007ffff6e050dc in __libc_recv
(fd=<value optimized out>, buf=<value optimized out>, n=<value optimized
out>, flags=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34
  14 Thread 0x7ffff0847700 (LWP 4634)  0x00007ffff6e04fdd in accept () at
../sysdeps/unix/syscall-template.S:82
  13 Thread 0x7ffff1048700 (LWP 4633)  0x00007ffff6e050dc in __libc_recv
(fd=<value optimized out>, buf=<value optimized out>, n=<value optimized
out>, flags=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34
* 12 Thread 0x7ffff1849700 (LWP 4632)  __lll_lock_wait_private () at
../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:97
  11 Thread 0x7ffff204a700 (LWP 4631)  0x00007ffff6e050dc in __libc_recv
(fd=<value optimized out>, buf=<value optimized out>, n=<value optimized
out>, flags=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34
  10 Thread 0x7ffff284b700 (LWP 4630)  0x00007ffff6e050dc in __libc_recv
(fd=<value optimized out>, buf=<value optimized out>, n=<value optimized
out>, flags=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34
  9 Thread 0x7ffff304c700 (LWP 4629)  0x00007ffff6e050dc in __libc_recv
(fd=<value optimized out>, buf=<value optimized out>, n=<value optimized
out>, flags=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34
  8 Thread 0x7ffff384d700 (LWP 4628)  0x00007ffff6a94d05 in raise (sig=6)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
  7 Thread 0x7ffff404e700 (LWP 4627)  0x00007ffff6e050dc in __libc_recv
(fd=<value optimized out>, buf=<value optimized out>, n=<value optimized
out>, flags=<value optimized out>)
    at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34
  6 Thread 0x7ffff484f700 (LWP 4626)  0x00007ffff6e04fdd in accept () at
../sysdeps/unix/syscall-template.S:82
  5 Thread 0x7ffff5050700 (LWP 4625)  0x00007ffff6e04fdd in accept () at
../sysdeps/unix/syscall-template.S:82
  4 Thread 0x7ffff5851700 (LWP 4624)  0x00007ffff6e04fdd in accept () at
../sysdeps/unix/syscall-template.S:82
  3 Thread 0x7ffff6052700 (LWP 4623)  0x00007ffff6b0f13d in nanosleep () at
../sysdeps/unix/syscall-template.S:82
  2 Thread 0x7ffff6a60700 (LWP 4622)  0x00007ffff6b0f13d in nanosleep () at
../sysdeps/unix/syscall-template.S:82
  1 Thread 0x7ffff7fcd740 (LWP 4555)  0x00007ffff6b0f13d in nanosleep () at
../sysdeps/unix/syscall-template.S:82

There are 8 threads on __libc_recv, 1 thread (#8) that raised signal and 1
thread _int_free. Other threads wait on accept or on nanosleep (service
threads).

I got backtraces for threads that are in __libc_recv - all backtraces are
similar:

#0  0x00007ffff6e050dc in __libc_recv (fd=<value optimized out>, buf=<value
optimized out>, n=<value optimized out>, flags=<value optimized out>) at
../sysdeps/unix/sysv/linux/x86_64/recv.c:34
#1  0x0000000000453f0f in mongo_read_socket (conn=0x71af10,
buf=0x7ffff404ca40, len=16) at /home/user/mongoservice/env_default.c:57
#2  0x0000000000450bca in mongo_read_response (conn=0x71af10,
reply=0x7fffe008a6f0) at /home/user/mongoservice/mongo.c:164
#3  0x00000000004522cb in mongo_cursor_op_query (cursor=0x7fffe008a6f0) at
/home/user/mongoservice/mongo.c:750
#4  0x00000000004526f9 in mongo_find (conn=0x71af10, ns=0x7fffe0092cf8
"rep.COL1", query=0x7ffff404cd70, fields=0x7ffff404ccc0, limit=1, skip=0,
options=4)
    at /home/user/mongoservice/mongo.c:838
#5  0x0000000000449a91 in nosql::MongoClientRaw::FindOne (this=0x71af10,
db=..., collection=..., id=..., hash=..., err=@0x7ffff404d44f)
    at /home/user/mongoservice/mongoclient.cpp:147

_int_free thread:
#0  0x00007ffff6ad9d6b in _int_free (av=0x7fffe0000020, p=0x7fffe008d9c0)
at malloc.c:4823
#1  0x00007ffff6ade3e3 in __libc_free (mem=<value optimized out>) at
malloc.c:3738
#2  0x00007ffff754c019 in std::basic_string<char, std::char_traits<char>,
std::allocator<char> >::~basic_string() () from
/usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x000000000043574d in TMongoDb<TSenderData >::MongoRead (this=0x6fffa8,
db=0x49d1d8 "rep", collection=0x7fffe00997b8 "COL1", index=40621381955,
data=...,
    err=@0x7fffee04144f) at /home/user/mongoservice/mongostorage.h:86

I also have timestamp (is millis) variable in MongoRead function:
#7  start = 1335456660339
#8  start = 1335456660338, query_time = 337  (thread raises signal)
#9  start = 1335456660341
#10 start = 1335456660340
#11 start = 1335456660340
#13 start = 1335456660336
#15 start = 1335456660340
#16 start = 1335456660337
#18 start = 1335456660339
#19 start = 1335456660338
Start times are similar. That means that all queries started at same time.
And all of them are read queries and took time > 300 ms.

1335456660338 ms = 2012-04-26 20:11:00

And mongodb.log:
hu Apr 26 20:10:28 [initandlisten] connection accepted from 127.0.0.1:57832
#405
Thu Apr 26 20:10:28 [initandlisten] connection accepted from
127.0.0.1:57833 #406
Thu Apr 26 20:10:28 [initandlisten] connection accepted from
127.0.0.1:57834 #407
Thu Apr 26 20:10:28 [initandlisten] connection accepted from
127.0.0.1:57835 #408
Thu Apr 26 20:11:00 [conn345] query rep.COL1 query: { _id:
ObjectId('000000000000000001027908') } ntoreturn:1 idhack:1 reslen:454 338ms

Maybe all of queries caused page faults but why there is only one slow
query record in mongodb.log?

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Adam C  
View profile  
 More options Apr 27 2012, 6:54 am
From: Adam C <ad...@10gen.com>
Date: Fri, 27 Apr 2012 03:54:45 -0700 (PDT)
Local: Fri, Apr 27 2012 6:54 am
Subject: Re: MongoDb 2.0.4 performance question

The traces seem consistent with the logging, one of the threads exited with
an abort because it was slow, and one query was logged as slow in MongoDB.
 I don't see where there's any other evidence of slowness here.

As I mentioned previously, you are seeing quite a few page faults (in the
last mongostat at least), if 9 of your queries hit memory, and one faults
to disk, then that would be the likely reason for a slow return.

You could try re-running that slow query afterward, if we are right, a
subsequent second find using the same _id would then return quickly because
it would have been paged into memory.  If your whole data set needs to be
in memory then you might run a large find, essentially a table scan, to get
it into memory and warm up the cache to avoid the slow responses.  As long
as the data set fits, and there is no memory pressure that should remove
the issues you are seeing.  What you get otherwise is essentially a lazy
loading mechanism, page faults when you look for data not in memory, and a
certain level of slower responses.

There are plans to implement a "touch" command to make this process less
manual and allow admins to load required data more easily in advance, but
for now the more manual approach is your best bet.

Adam

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »