I have two services running on single MongoDb. One service - service1 has large database and makes load of 250 reads and 250 writes per second (64 threads):
Collections of database used by second service are named by template <name>-<date>. Each collection of that database lives 2 days and then deleted. All collections have only _id index.
Queries to first service (with large database) have a timeout of 250 ms. I have a metric - error counter per 5 minutes. And now it has average value of 400 (1.3 error/second). But I don't see so many long queries in mongodb.log and don't see so many page faults in mongostat.
How can I investigate it? Maybe my problem is concerned with write-lock? Does write-lock yield when I get page-fault on read?
If it is on the application side, do you think you might be waiting for threads/connections to come available? You don't mention the language/driver used, but there is usually a multiplier in play for the number of threads that can be blocking/waiting (anything above that limit will then throw an exception).
Assuming you are not using findAndModify (which does not yield), the long running ops and page faulting ops should be yielding - if that is happening though you will see those return more slowly as a result of the yield.
The lock percentage looks like it's not an issue judging by what you pasted from mongostat.
The slow operations from the log are all updates and it would seem they are triggering moves (outgrowing their original space) - those are relatively expensive, so you might want to look at padding those docs to prevent the moves (http://www.mongodb.org/display/DOCS/Padding+Factor#PaddingFactor-Manu...)
Finally, how does your disk IO look (iostat -xm 2)? if the disk was getting backed up it might also explain some of your issues.
On Monday, April 23, 2012 11:39:16 AM UTC+1, Eugene Bolotin wrote:
> Hello!
> I have two services running on single MongoDb. One service - service1 has > large database and makes load of 250 reads and 250 writes per second (64 > threads): > > db.stats() > { > "db" : "rep", > "collections" : 10, > "objects" : 41312419, > "avgObjSize" : 384.2041617558149, > "dataSize" : 15872403312, > "storageSize" : 22404643072, > "numExtents" : 173, > "indexes" : 8, > "indexSize" : 2240907136, > "fileSize" : 34276900864, > "nsSizeMB" : 16, > "ok" : 1 > } > Records of database used by first service are never deleted. All > collections have only _id index.
> Second service - service2 has small database and makes load of 200 reads > and 200 writes per second: > > db.stats() > { > "db" : "cle", > "collections" : 182, > "objects" : 1166724, > "avgObjSize" : 36.41681151669118, > "dataSize" : 42488368, > "storageSize" : 135518208, > "numExtents" : 582, > "indexes" : 180, > "indexSize" : 53748688, > "fileSize" : 2080374784, > "nsSizeMB" : 16, > "ok" : 1 > } > Collections of database used by second service are named by template > <name>-<date>. Each collection of that database lives 2 days and then > deleted. All collections have only _id index.
I use mongo-c-driver, one of the latest version. But my services are written in C++. I use pool of 10 connections to mongo. I measure findOne and update request times. I saw that if one request (read or write) takes more than 100 ms all simultaneous read/write requests in that time take similar time (>100 ms ~10 requests).
On Monday, April 23, 2012 3:34:51 PM UTC+4, Adam C wrote:
> Where are you measuring the 250ms timeout?
> If it is on the application side, do you think you might be waiting for > threads/connections to come available? You don't mention the > language/driver used, but there is usually a multiplier in play for the > number of threads that can be blocking/waiting (anything above that limit > will then throw an exception).
> Assuming you are not using findAndModify (which does not yield), the long > running ops and page faulting ops should be yielding - if that is happening > though you will see those return more slowly as a result of the yield.
> The lock percentage looks like it's not an issue judging by what you > pasted from mongostat.
> The slow operations from the log are all updates and it would seem they > are triggering moves (outgrowing their original space) - those are > relatively expensive, so you might want to look at padding those docs to > prevent the moves ( > http://www.mongodb.org/display/DOCS/Padding+Factor#PaddingFactor-Manu... > )
> Finally, how does your disk IO look (iostat -xm 2)? if the disk was > getting backed up it might also explain some of your issues.
> Adam
> On Monday, April 23, 2012 11:39:16 AM UTC+1, Eugene Bolotin wrote:
>> Hello!
>> I have two services running on single MongoDb. One service - service1 has >> large database and makes load of 250 reads and 250 writes per second (64 >> threads): >> > db.stats() >> { >> "db" : "rep", >> "collections" : 10, >> "objects" : 41312419, >> "avgObjSize" : 384.2041617558149, >> "dataSize" : 15872403312, >> "storageSize" : 22404643072, >> "numExtents" : 173, >> "indexes" : 8, >> "indexSize" : 2240907136, >> "fileSize" : 34276900864, >> "nsSizeMB" : 16, >> "ok" : 1 >> } >> Records of database used by first service are never deleted. All >> collections have only _id index.
>> Second service - service2 has small database and makes load of 200 reads >> and 200 writes per second: >> > db.stats() >> { >> "db" : "cle", >> "collections" : 182, >> "objects" : 1166724, >> "avgObjSize" : 36.41681151669118, >> "dataSize" : 42488368, >> "storageSize" : 135518208, >> "numExtents" : 582, >> "indexes" : 180, >> "indexSize" : 53748688, >> "fileSize" : 2080374784, >> "nsSizeMB" : 16, >> "ok" : 1 >> } >> Collections of database used by second service are named by template >> <name>-<date>. Each collection of that database lives 2 days and then >> deleted. All collections have only _id index.
As a general recommendation I would suggest switching to ext4 (or XFS) rather than ext3 (ext3 is inefficient when allocating new data files), but that is unlikely to be the source of your issues, and the iostat looks clean.
Your mongostat output shows brief queuing (qr|qw = queued reads/writes), so when that ticks up you will see a longer time to return. The operation occurring at that time is taking a while and, given that it is blocking reads, is likely a long running write (given your logging output I think the updates are the cause) - multiple reads can occur simultaneously but are blocked by writes.
As I mentioned, updates which grow a document significantly and trigger a move are relatively costly (and slow) writes. If you can identify those operations and either manually pad the document to prevent it from growing or alter the queries to prevent the growth itself I think you will see better performance. You can also gauge how many of these are occurring by looking at the "paddingfactor" field in the collection stats (db.coll.stats()) - the closer this value is to 2 the more moves are being triggered (a value of 1 means no moves).
On Monday, April 23, 2012 2:14:30 PM UTC+1, Eugene Bolotin wrote:
> Adam, thanks for your reply.
> I use mongo-c-driver, one of the latest version. But my services are > written in C++. I use pool of 10 connections to mongo. > I measure findOne and update request times. I saw that if one request > (read or write) takes more than 100 ms all simultaneous read/write requests > in that time take similar time (>100 ms ~10 requests).
> On Monday, April 23, 2012 3:34:51 PM UTC+4, Adam C wrote:
>> Where are you measuring the 250ms timeout?
>> If it is on the application side, do you think you might be waiting for >> threads/connections to come available? You don't mention the >> language/driver used, but there is usually a multiplier in play for the >> number of threads that can be blocking/waiting (anything above that limit >> will then throw an exception).
>> Assuming you are not using findAndModify (which does not yield), the long >> running ops and page faulting ops should be yielding - if that is happening >> though you will see those return more slowly as a result of the yield.
>> The lock percentage looks like it's not an issue judging by what you >> pasted from mongostat.
>> The slow operations from the log are all updates and it would seem they >> are triggering moves (outgrowing their original space) - those are >> relatively expensive, so you might want to look at padding those docs to >> prevent the moves ( >> http://www.mongodb.org/display/DOCS/Padding+Factor#PaddingFactor-Manu... >> )
>> Finally, how does your disk IO look (iostat -xm 2)? if the disk was >> getting backed up it might also explain some of your issues.
>> Adam
>> On Monday, April 23, 2012 11:39:16 AM UTC+1, Eugene Bolotin wrote:
>>> Hello!
>>> I have two services running on single MongoDb. One service - service1 >>> has large database and makes load of 250 reads and 250 writes per second >>> (64 threads): >>> > db.stats() >>> { >>> "db" : "rep", >>> "collections" : 10, >>> "objects" : 41312419, >>> "avgObjSize" : 384.2041617558149, >>> "dataSize" : 15872403312, >>> "storageSize" : 22404643072, >>> "numExtents" : 173, >>> "indexes" : 8, >>> "indexSize" : 2240907136, >>> "fileSize" : 34276900864, >>> "nsSizeMB" : 16, >>> "ok" : 1 >>> } >>> Records of database used by first service are never deleted. All >>> collections have only _id index.
>>> Second service - service2 has small database and makes load of 200 reads >>> and 200 writes per second: >>> > db.stats() >>> { >>> "db" : "cle", >>> "collections" : 182, >>> "objects" : 1166724, >>> "avgObjSize" : 36.41681151669118, >>> "dataSize" : 42488368, >>> "storageSize" : 135518208, >>> "numExtents" : 582, >>> "indexes" : 180, >>> "indexSize" : 53748688, >>> "fileSize" : 2080374784, >>> "nsSizeMB" : 16, >>> "ok" : 1
On Monday, April 23, 2012 5:55:47 PM UTC+4, Adam C wrote:
> As a general recommendation I would suggest switching to ext4 (or XFS) > rather than ext3 (ext3 is inefficient when allocating new data files), but > that is unlikely to be the source of your issues, and the iostat looks > clean.
> Your mongostat output shows brief queuing (qr|qw = queued reads/writes), > so when that ticks up you will see a longer time to return. The operation > occurring at that time is taking a while and, given that it is blocking > reads, is likely a long running write (given your logging output I think > the updates are the cause) - multiple reads can occur simultaneously but > are blocked by writes.
> As I mentioned, updates which grow a document significantly and trigger a > move are relatively costly (and slow) writes. If you can identify those > operations and either manually pad the document to prevent it from growing > or alter the queries to prevent the growth itself I think you will see > better performance. You can also gauge how many of these are occurring by > looking at the "paddingfactor" field in the collection stats > (db.coll.stats()) - the closer this value is to 2 the more moves are being > triggered (a value of 1 means no moves).
> Adam
> On Monday, April 23, 2012 2:14:30 PM UTC+1, Eugene Bolotin wrote:
>> Adam, thanks for your reply.
>> I use mongo-c-driver, one of the latest version. But my services are >> written in C++. I use pool of 10 connections to mongo. >> I measure findOne and update request times. I saw that if one request >> (read or write) takes more than 100 ms all simultaneous read/write requests >> in that time take similar time (>100 ms ~10 requests).
No, read operations take a read lock and many can run concurrently, the time measured to return here does not include the time taken to acquire a lock, only to run the query:
> On Monday, April 23, 2012 5:55:47 PM UTC+4, Adam C wrote:
>> As a general recommendation I would suggest switching to ext4 (or XFS) >> rather than ext3 (ext3 is inefficient when allocating new data files), but >> that is unlikely to be the source of your issues, and the iostat looks >> clean.
>> Your mongostat output shows brief queuing (qr|qw = queued reads/writes), >> so when that ticks up you will see a longer time to return. The operation >> occurring at that time is taking a while and, given that it is blocking >> reads, is likely a long running write (given your logging output I think >> the updates are the cause) - multiple reads can occur simultaneously but >> are blocked by writes.
>> As I mentioned, updates which grow a document significantly and trigger a >> move are relatively costly (and slow) writes. If you can identify those >> operations and either manually pad the document to prevent it from growing >> or alter the queries to prevent the growth itself I think you will see >> better performance. You can also gauge how many of these are occurring by >> looking at the "paddingfactor" field in the collection stats >> (db.coll.stats()) - the closer this value is to 2 the more moves are being >> triggered (a value of 1 means no moves).
>> Adam
>> On Monday, April 23, 2012 2:14:30 PM UTC+1, Eugene Bolotin wrote:
>>> Adam, thanks for your reply.
>>> I use mongo-c-driver, one of the latest version. But my services are >>> written in C++. I use pool of 10 connections to mongo. >>> I measure findOne and update request times. I saw that if one request >>> (read or write) takes more than 100 ms all simultaneous read/write requests >>> in that time take similar time (>100 ms ~10 requests).
} On Monday, April 23, 2012 6:46:59 PM UTC+4, Adam C wrote:
> No, read operations take a read lock and many can run concurrently, the > time measured to return here does not include the time taken to acquire a > lock, only to run the query:
>> On Monday, April 23, 2012 5:55:47 PM UTC+4, Adam C wrote:
>>> As a general recommendation I would suggest switching to ext4 (or XFS) >>> rather than ext3 (ext3 is inefficient when allocating new data files), but >>> that is unlikely to be the source of your issues, and the iostat looks >>> clean.
>>> Your mongostat output shows brief queuing (qr|qw = queued reads/writes), >>> so when that ticks up you will see a longer time to return. The operation >>> occurring at that time is taking a while and, given that it is blocking >>> reads, is likely a long running write (given your logging output I think >>> the updates are the cause) - multiple reads can occur simultaneously but >>> are blocked by writes.
>>> As I mentioned, updates which grow a document significantly and trigger >>> a move are relatively costly (and slow) writes. If you can identify those >>> operations and either manually pad the document to prevent it from growing >>> or alter the queries to prevent the growth itself I think you will see >>> better performance. You can also gauge how many of these are occurring by >>> looking at the "paddingfactor" field in the collection stats >>> (db.coll.stats()) - the closer this value is to 2 the more moves are being >>> triggered (a value of 1 means no moves).
>>> Adam
>>> On Monday, April 23, 2012 2:14:30 PM UTC+1, Eugene Bolotin wrote:
>>>> Adam, thanks for your reply.
>>>> I use mongo-c-driver, one of the latest version. But my services are >>>> written in C++. I use pool of 10 connections to mongo. >>>> I measure findOne and update request times. I saw that if one request >>>> (read or write) takes more than 100 ms all simultaneous read/write requests >>>> in that time take similar time (>100 ms ~10 requests).
Quick question for you - are you zeroing out timestamps in all ObjectIDs (first 4 bytes) or is it only 0 timestamp documents that are showing up as slow queries?
There are definitely a significant number of moves happening here - do you think you can reduce that behavior (growing the documents) or is it just a feature of how you use the data? The fact that you just have an _id index means that it is not as costly as it could be (all indexes for a document must be updated when it is moved) but it is not going to help performance.
> On Monday, April 23, 2012 6:46:59 PM UTC+4, Adam C wrote:
>> No, read operations take a read lock and many can run concurrently, the >> time measured to return here does not include the time taken to acquire a >> lock, only to run the query:
>>> On Monday, April 23, 2012 5:55:47 PM UTC+4, Adam C wrote:
>>>> As a general recommendation I would suggest switching to ext4 (or XFS) >>>> rather than ext3 (ext3 is inefficient when allocating new data files), but >>>> that is unlikely to be the source of your issues, and the iostat looks >>>> clean.
>>>> Your mongostat output shows brief queuing (qr|qw = queued >>>> reads/writes), so when that ticks up you will see a longer time to return. >>>> The operation occurring at that time is taking a while and, given that it >>>> is blocking reads, is likely a long running write (given your logging >>>> output I think the updates are the cause) - multiple reads can occur >>>> simultaneously but are blocked by writes.
>>>> As I mentioned, updates which grow a document significantly and trigger >>>> a move are relatively costly (and slow) writes. If you can identify those >>>> operations and either manually pad the document to prevent it from growing >>>> or alter the queries to prevent the growth itself I think you will see >>>> better performance. You can also gauge how many of these are occurring by >>>> looking at the "paddingfactor" field in the collection stats >>>> (db.coll.stats()) - the closer this value is to 2 the more moves are being >>>> triggered (a value of 1 means no moves).
>>>> Adam
>>>> On Monday, April 23, 2012 2:14:30 PM UTC+1, Eugene Bolotin wrote:
>>>>> Adam, thanks for your reply.
>>>>> I use mongo-c-driver, one of the latest version. But my services are >>>>> written in C++. I use pool of 10 connections to mongo. >>>>> I measure findOne and update request times. I saw that if one request >>>>> (read or write) takes more than 100 ms all simultaneous read/write requests >>>>> in that time take similar time (>100 ms ~10 requests).
I'm zeroing timestamp manually. I just convert ui64 to ObjectId string. Growing of objects is a feature of my service. It happens not often - one time per four months. But elements are updated when accessing.
On Monday, April 23, 2012 7:41:46 PM UTC+4, Adam C wrote:
> Quick question for you - are you zeroing out timestamps in all ObjectIDs > (first 4 bytes) or is it only 0 timestamp documents that are showing up as > slow queries?
> There are definitely a significant number of moves happening here - do you > think you can reduce that behavior (growing the documents) or is it just a > feature of how you use the data? The fact that you just have an _id index > means that it is not as costly as it could be (all indexes for a document > must be updated when it is moved) but it is not going to help performance.
> Adam
> On Monday, April 23, 2012 4:15:22 PM UTC+1, Eugene Bolotin wrote:
>> On Monday, April 23, 2012 6:46:59 PM UTC+4, Adam C wrote:
>>> No, read operations take a read lock and many can run concurrently, the >>> time measured to return here does not include the time taken to acquire a >>> lock, only to run the query:
>>>> On Monday, April 23, 2012 5:55:47 PM UTC+4, Adam C wrote:
>>>>> As a general recommendation I would suggest switching to ext4 (or XFS) >>>>> rather than ext3 (ext3 is inefficient when allocating new data files), but >>>>> that is unlikely to be the source of your issues, and the iostat looks >>>>> clean.
>>>>> Your mongostat output shows brief queuing (qr|qw = queued >>>>> reads/writes), so when that ticks up you will see a longer time to return. >>>>> The operation occurring at that time is taking a while and, given that it >>>>> is blocking reads, is likely a long running write (given your logging >>>>> output I think the updates are the cause) - multiple reads can occur >>>>> simultaneously but are blocked by writes.
>>>>> As I mentioned, updates which grow a document significantly and >>>>> trigger a move are relatively costly (and slow) writes. If you can >>>>> identify those operations and either manually pad the document to prevent >>>>> it from growing or alter the queries to prevent the growth itself I think >>>>> you will see better performance. You can also gauge how many of these are >>>>> occurring by looking at the "paddingfactor" field in the collection stats >>>>> (db.coll.stats()) - the closer this value is to 2 the more moves are being >>>>> triggered (a value of 1 means no moves).
>>>>> Adam
>>>>> On Monday, April 23, 2012 2:14:30 PM UTC+1, Eugene Bolotin wrote:
>>>>>> Adam, thanks for your reply.
>>>>>> I use mongo-c-driver, one of the latest version. But my services are >>>>>> written in C++. I use pool of 10 connections to mongo. >>>>>> I measure findOne and update request times. I saw that if one request >>>>>> (read or write) takes more than 100 ms all simultaneous read/write requests >>>>>> in that time take similar time (>100 ms ~10 requests).
Do the incidents of slowness correspond to (or increase during) the growth periods or are things pretty consistent all round?
If the moves are periodic as you suggest then you could schedule a compact on a collection that has had a significant number of moves once they are done (or a repair, if space is a concern). That would defragment the collection somewhat. However, there is a big caveat with doing so - compaction and repairing remove all padding, so the impact of the next round of moves may be more severe.
If you were going to test this, I would recommend a compaction of a single collection for testing/comparison purposes before attempting it on a widespread basis.
Alternatively, if this is write lock contention and not an issue with moves, then the upcoming 2.1/2.2 concurrency improvements will at least isolate them to a specific database.
On Monday, April 23, 2012 5:18:54 PM UTC+1, Eugene Bolotin wrote:
> I'm zeroing timestamp manually. I just convert ui64 to ObjectId string. > Growing of objects is a feature of my service. It happens not often - one > time per four months. But elements are updated when accessing.
> On Monday, April 23, 2012 7:41:46 PM UTC+4, Adam C wrote:
>> Quick question for you - are you zeroing out timestamps in all ObjectIDs >> (first 4 bytes) or is it only 0 timestamp documents that are showing up as >> slow queries?
>> There are definitely a significant number of moves happening here - do >> you think you can reduce that behavior (growing the documents) or is it >> just a feature of how you use the data? The fact that you just have an _id >> index means that it is not as costly as it could be (all indexes for a >> document must be updated when it is moved) but it is not going to help >> performance.
>> Adam
>> On Monday, April 23, 2012 4:15:22 PM UTC+1, Eugene Bolotin wrote:
>>> On Monday, April 23, 2012 6:46:59 PM UTC+4, Adam C wrote:
>>>> No, read operations take a read lock and many can run concurrently, the >>>> time measured to return here does not include the time taken to acquire a >>>> lock, only to run the query:
>>>>> On Monday, April 23, 2012 5:55:47 PM UTC+4, Adam C wrote:
>>>>>> As a general recommendation I would suggest switching to ext4 (or >>>>>> XFS) rather than ext3 (ext3 is inefficient when allocating new data files), >>>>>> but that is unlikely to be the source of your issues, and the iostat looks >>>>>> clean.
>>>>>> Your mongostat output shows brief queuing (qr|qw = queued >>>>>> reads/writes), so when that ticks up you will see a longer time to return. >>>>>> The operation occurring at that time is taking a while and, given that it >>>>>> is blocking reads, is likely a long running write (given your logging >>>>>> output I think the updates are the cause) - multiple reads can occur >>>>>> simultaneously but are blocked by writes.
They are consistent. When service1 was alone, there was maximum of 10,000 errors per day. But when second service started using same mongo instance - error count rose to 30,000 per day. There was no records moving. And the second service has a small database. I thought that its queries don't lead to page faults.
On Monday, April 23, 2012 8:53:34 PM UTC+4, Adam C wrote:
> Do the incidents of slowness correspond to (or increase during) the growth > periods or are things pretty consistent all round?
> If the moves are periodic as you suggest then you could schedule a compact > on a collection that has had a significant number of moves once they are > done (or a repair, if space is a concern). That would defragment the > collection somewhat. However, there is a big caveat with doing so - > compaction and repairing remove all padding, so the impact of the next > round of moves may be more severe.
> If you were going to test this, I would recommend a compaction of a single > collection for testing/comparison purposes before attempting it on a > widespread basis.
> Alternatively, if this is write lock contention and not an issue with > moves, then the upcoming 2.1/2.2 concurrency improvements will at least > isolate them to a specific database.
> Adam.
> On Monday, April 23, 2012 5:18:54 PM UTC+1, Eugene Bolotin wrote:
>> I'm zeroing timestamp manually. I just convert ui64 to ObjectId string. >> Growing of objects is a feature of my service. It happens not often - one >> time per four months. But elements are updated when accessing.
>> On Monday, April 23, 2012 7:41:46 PM UTC+4, Adam C wrote:
>>> Quick question for you - are you zeroing out timestamps in all ObjectIDs >>> (first 4 bytes) or is it only 0 timestamp documents that are showing up as >>> slow queries?
>>> There are definitely a significant number of moves happening here - do >>> you think you can reduce that behavior (growing the documents) or is it >>> just a feature of how you use the data? The fact that you just have an _id >>> index means that it is not as costly as it could be (all indexes for a >>> document must be updated when it is moved) but it is not going to help >>> performance.
>>> Adam
>>> On Monday, April 23, 2012 4:15:22 PM UTC+1, Eugene Bolotin wrote:
>>>> On Monday, April 23, 2012 6:46:59 PM UTC+4, Adam C wrote:
>>>>> No, read operations take a read lock and many can run concurrently, >>>>> the time measured to return here does not include the time taken to acquire >>>>> a lock, only to run the query:
I just re-read this thread and I'd like to go bat to the fact that your timeout on the application side for this is 250ms, but you are not seeing that in the mongo logs (though there are ops hitting the default slowms of 100ms). Given that, and the fact that you are seeing a far higher incidence of the 250ms timer being hit than the logged queries, it would suggest that the root casue of the issue lies somewhere other than the response times of the queries themselves.
You mentioned that you have 64 threads but a connection pool of just 10 - have you considered (or tried) increasing the pool? Is it possible that your threads are blocking/waiting for a connection to be free and that is the cause of your timer being triggered?
On Tuesday, April 24, 2012 12:22:26 PM UTC+1, Eugene Bolotin wrote:
> They are consistent. When service1 was alone, there was maximum of 10,000 > errors per day. But when second service started using same mongo instance - > error count rose to 30,000 per day. There was no records moving. > And the second service has a small database. I thought that its queries > don't lead to page faults.
> On Monday, April 23, 2012 8:53:34 PM UTC+4, Adam C wrote:
>> Do the incidents of slowness correspond to (or increase during) the >> growth periods or are things pretty consistent all round?
>> If the moves are periodic as you suggest then you could schedule a >> compact on a collection that has had a significant number of moves once >> they are done (or a repair, if space is a concern). That would defragment >> the collection somewhat. However, there is a big caveat with doing so - >> compaction and repairing remove all padding, so the impact of the next >> round of moves may be more severe.
>> If you were going to test this, I would recommend a compaction of a >> single collection for testing/comparison purposes before attempting it on a >> widespread basis.
>> Alternatively, if this is write lock contention and not an issue with >> moves, then the upcoming 2.1/2.2 concurrency improvements will at least >> isolate them to a specific database.
>> Adam.
>> On Monday, April 23, 2012 5:18:54 PM UTC+1, Eugene Bolotin wrote:
>>> I'm zeroing timestamp manually. I just convert ui64 to ObjectId string. >>> Growing of objects is a feature of my service. It happens not often - >>> one time per four months. But elements are updated when accessing.
>>> On Monday, April 23, 2012 7:41:46 PM UTC+4, Adam C wrote:
>>>> Quick question for you - are you zeroing out timestamps in all >>>> ObjectIDs (first 4 bytes) or is it only 0 timestamp documents that are >>>> showing up as slow queries?
>>>> There are definitely a significant number of moves happening here - do >>>> you think you can reduce that behavior (growing the documents) or is it >>>> just a feature of how you use the data? The fact that you just have an _id >>>> index means that it is not as costly as it could be (all indexes for a >>>> document must be updated when it is moved) but it is not going to help >>>> performance.
>>>> Adam
>>>> On Monday, April 23, 2012 4:15:22 PM UTC+1, Eugene Bolotin wrote:
>>>>> On Monday, April 23, 2012 6:46:59 PM UTC+4, Adam C wrote:
>>>>>> No, read operations take a read lock and many can run concurrently, >>>>>> the time measured to return here does not include the time taken to acquire >>>>>> a lock, only to run the query:
Yes, I tried different pool size (from 1 to 100) and I got cascades of read/write timeouts there. Strange is that I get groups of read with same timeout. Maybe lot of write queries with timeout < 100 ms acquire the lock?
I wrote small python script (pymongo) that synchronously read (only reads) records from database. And create big (50 millions) and small (100 thousand) database similar to production databases.
2 instances of script read from BIG database. And one instance reads from SMALL database.
Total load is 800 rps. Big database script eventualy gets timeouts > 100 ms. And I get timeouts of script working with small database. Isn't that strange? Maybe that's the reason why I'm getting groups of simultaneous read timeouts?
On Tuesday, April 24, 2012 4:29:13 PM UTC+4, Adam C wrote:
> I just re-read this thread and I'd like to go bat to the fact that your > timeout on the application side for this is 250ms, but you are not seeing > that in the mongo logs (though there are ops hitting the default slowms of > 100ms). Given that, and the fact that you are seeing a far higher > incidence of the 250ms timer being hit than the logged queries, it would > suggest that the root casue of the issue lies somewhere other than the > response times of the queries themselves.
> You mentioned that you have 64 threads but a connection pool of just 10 - > have you considered (or tried) increasing the pool? > Is it possible that your threads are blocking/waiting for a connection to > be free and that is the cause of your timer being triggered?
> Adam
> On Tuesday, April 24, 2012 12:22:26 PM UTC+1, Eugene Bolotin wrote:
>> They are consistent. When service1 was alone, there was maximum of 10,000 >> errors per day. But when second service started using same mongo instance - >> error count rose to 30,000 per day. There was no records moving. >> And the second service has a small database. I thought that its queries >> don't lead to page faults.
>> On Monday, April 23, 2012 8:53:34 PM UTC+4, Adam C wrote:
>>> Do the incidents of slowness correspond to (or increase during) the >>> growth periods or are things pretty consistent all round?
>>> If the moves are periodic as you suggest then you could schedule a >>> compact on a collection that has had a significant number of moves once >>> they are done (or a repair, if space is a concern). That would defragment >>> the collection somewhat. However, there is a big caveat with doing so - >>> compaction and repairing remove all padding, so the impact of the next >>> round of moves may be more severe.
>>> If you were going to test this, I would recommend a compaction of a >>> single collection for testing/comparison purposes before attempting it on a >>> widespread basis.
>>> Alternatively, if this is write lock contention and not an issue with >>> moves, then the upcoming 2.1/2.2 concurrency improvements will at least >>> isolate them to a specific database.
>>> Adam.
>>> On Monday, April 23, 2012 5:18:54 PM UTC+1, Eugene Bolotin wrote:
>>>> I'm zeroing timestamp manually. I just convert ui64 to ObjectId string. >>>> Growing of objects is a feature of my service. It happens not often - >>>> one time per four months. But elements are updated when accessing.
>>>> On Monday, April 23, 2012 7:41:46 PM UTC+4, Adam C wrote:
>>>>> Quick question for you - are you zeroing out timestamps in all >>>>> ObjectIDs (first 4 bytes) or is it only 0 timestamp documents that are >>>>> showing up as slow queries?
>>>>> There are definitely a significant number of moves happening here - do >>>>> you think you can reduce that behavior (growing the documents) or is it >>>>> just a feature of how you use the data? The fact that you just have an _id >>>>> index means that it is not as costly as it could be (all indexes for a >>>>> document must be updated when it is moved) but it is not going to help >>>>> performance.
>>>>> Adam
>>>>> On Monday, April 23, 2012 4:15:22 PM UTC+1, Eugene Bolotin wrote:
>>>>>> On Monday, April 23, 2012 6:46:59 PM UTC+4, Adam C wrote:
>>>>>>> No, read operations take a read lock and many can run concurrently, >>>>>>> the time measured to return here does not include the time taken to acquire >>>>>>> a lock, only to run the query:
If both are in the same instance it sounds like after a certain point you are faulting to disk on those reads, i.e. your working set outgrows the available physical RAM - how does your RAM utilisation and page fault count look when you start to hit the 100ms+ responses on the reads?
Your read volumes (and indeed the writes) don't seem excessive and the stats you have posted would suggest that IO is not the issue, unless the stats were from a time when the tests were not running. I have seen loads far in excess of this without causing similar issues, so we must be missing something here. Are these hosts in MMS (if so I would just need the group name to take a look).
On Tuesday, April 24, 2012 2:32:27 PM UTC+1, Eugene Bolotin wrote:
> Yes, I tried different pool size (from 1 to 100) and I got cascades of > read/write timeouts there. Strange is that I get groups of read with same > timeout. > Maybe lot of write queries with timeout < 100 ms acquire the lock?
> I wrote small python script (pymongo) that synchronously read (only reads) > records from database. And create big (50 millions) and small (100 > thousand) database similar to production databases.
> 2 instances of script read from BIG database. > And one instance reads from SMALL database.
> Total load is 800 rps. Big database script eventualy gets timeouts > 100 > ms. And I get timeouts of script working with small database. Isn't that > strange? Maybe that's the reason why I'm getting groups of simultaneous > read timeouts?
> On Tuesday, April 24, 2012 4:29:13 PM UTC+4, Adam C wrote:
>> I just re-read this thread and I'd like to go bat to the fact that your >> timeout on the application side for this is 250ms, but you are not seeing >> that in the mongo logs (though there are ops hitting the default slowms of >> 100ms). Given that, and the fact that you are seeing a far higher >> incidence of the 250ms timer being hit than the logged queries, it would >> suggest that the root casue of the issue lies somewhere other than the >> response times of the queries themselves.
>> You mentioned that you have 64 threads but a connection pool of just 10 - >> have you considered (or tried) increasing the pool? >> Is it possible that your threads are blocking/waiting for a connection to >> be free and that is the cause of your timer being triggered?
>> Adam
>> On Tuesday, April 24, 2012 12:22:26 PM UTC+1, Eugene Bolotin wrote:
>>> They are consistent. When service1 was alone, there was maximum of >>> 10,000 errors per day. But when second service started using same mongo >>> instance - error count rose to 30,000 per day. There was no records moving. >>> And the second service has a small database. I thought that its queries >>> don't lead to page faults.
>>> On Monday, April 23, 2012 8:53:34 PM UTC+4, Adam C wrote:
>>>> Do the incidents of slowness correspond to (or increase during) the >>>> growth periods or are things pretty consistent all round?
>>>> If the moves are periodic as you suggest then you could schedule a >>>> compact on a collection that has had a significant number of moves once >>>> they are done (or a repair, if space is a concern). That would defragment >>>> the collection somewhat. However, there is a big caveat with doing so - >>>> compaction and repairing remove all padding, so the impact of the next >>>> round of moves may be more severe.
>>>> If you were going to test this, I would recommend a compaction of a >>>> single collection for testing/comparison purposes before attempting it on a >>>> widespread basis.
>>>> Alternatively, if this is write lock contention and not an issue with >>>> moves, then the upcoming 2.1/2.2 concurrency improvements will at least >>>> isolate them to a specific database.
>>>> Adam.
>>>> On Monday, April 23, 2012 5:18:54 PM UTC+1, Eugene Bolotin wrote:
>>>>> I'm zeroing timestamp manually. I just convert ui64 to ObjectId >>>>> string. >>>>> Growing of objects is a feature of my service. It happens not often - >>>>> one time per four months. But elements are updated when accessing.
>>>>> On Monday, April 23, 2012 7:41:46 PM UTC+4, Adam C wrote:
>>>>>> Quick question for you - are you zeroing out timestamps in all >>>>>> ObjectIDs (first 4 bytes) or is it only 0 timestamp documents that are >>>>>> showing up as slow queries?
>>>>>> There are definitely a significant number of moves happening here - >>>>>> do you think you can reduce that behavior (growing the documents) or is it >>>>>> just a feature of how you use the data? The fact that you just have an _id >>>>>> index means that it is not as costly as it could be (all indexes for a >>>>>> document must be updated when it is moved) but it is not going to help >>>>>> performance.
>>>>>> Adam
>>>>>> On Monday, April 23, 2012 4:15:22 PM UTC+1, Eugene Bolotin wrote:
>>>>>>> On Monday, April 23, 2012 6:46:59 PM UTC+4, Adam C wrote:
>>>>>>>> No, read operations take a read lock and many can run concurrently, >>>>>>>> the time measured to return here does not include the time taken to acquire >>>>>>>> a lock, only to run the query:
I saw that second script is getting errors only few seconds after start (maybe while data is loading into memory): #BET 2012-04-26 17:08:05.212873 11862 340 ms #BET 2012-04-26 17:08:05.452267 28997 229 ms
So there is no problem when working with small and large databse simultaneous.
The way I'm working with data is following: - read by _id - process - write _id
So data should be in memory when write operation starts. It means that write operation should be fast. And mongodb.log shows it. There are many query timeout records and few update timeout records (100 slow reads/1 slow update). Read operation doesn't use write-lock. Now there are no slow moving operations in log. Now I'm doing 5th review of that code.
On Tuesday, April 24, 2012 5:42:24 PM UTC+4, Adam C wrote:
> If both are in the same instance it sounds like after a certain point you > are faulting to disk on those reads, i.e. your working set outgrows the > available physical RAM - how does your RAM utilisation and page fault count > look when you start to hit the 100ms+ responses on the reads?
> Your read volumes (and indeed the writes) don't seem excessive and the > stats you have posted would suggest that IO is not the issue, unless the > stats were from a time when the tests were not running. I have seen loads > far in excess of this without causing similar issues, so we must be missing > something here. Are these hosts in MMS (if so I would just need the group > name to take a look).
> Adam
> On Tuesday, April 24, 2012 2:32:27 PM UTC+1, Eugene Bolotin wrote:
>> Yes, I tried different pool size (from 1 to 100) and I got cascades of >> read/write timeouts there. Strange is that I get groups of read with same >> timeout. >> Maybe lot of write queries with timeout < 100 ms acquire the lock?
>> I wrote small python script (pymongo) that synchronously read (only >> reads) records from database. And create big (50 millions) and small (100 >> thousand) database similar to production databases.
>> 2 instances of script read from BIG database. >> And one instance reads from SMALL database.
>> Total load is 800 rps. Big database script eventualy gets timeouts > 100 >> ms. And I get timeouts of script working with small database. Isn't that >> strange? Maybe that's the reason why I'm getting groups of simultaneous >> read timeouts?
>> On Tuesday, April 24, 2012 4:29:13 PM UTC+4, Adam C wrote:
>>> I just re-read this thread and I'd like to go bat to the fact that your >>> timeout on the application side for this is 250ms, but you are not seeing >>> that in the mongo logs (though there are ops hitting the default slowms of >>> 100ms). Given that, and the fact that you are seeing a far higher >>> incidence of the 250ms timer being hit than the logged queries, it would >>> suggest that the root casue of the issue lies somewhere other than the >>> response times of the queries themselves.
>>> You mentioned that you have 64 threads but a connection pool of just 10 >>> - have you considered (or tried) increasing the pool? >>> Is it possible that your threads are blocking/waiting for a connection >>> to be free and that is the cause of your timer being triggered?
>>> Adam
>>> On Tuesday, April 24, 2012 12:22:26 PM UTC+1, Eugene Bolotin wrote:
>>>> They are consistent. When service1 was alone, there was maximum of >>>> 10,000 errors per day. But when second service started using same mongo >>>> instance - error count rose to 30,000 per day. There was no records moving. >>>> And the second service has a small database. I thought that its queries >>>> don't lead to page faults.
>>>> On Monday, April 23, 2012 8:53:34 PM UTC+4, Adam C wrote:
>>>>> Do the incidents of slowness correspond to (or increase during) the >>>>> growth periods or are things pretty consistent all round?
>>>>> If the moves are periodic as you suggest then you could schedule a >>>>> compact on a collection that has had a significant number of moves once >>>>> they are done (or a repair, if space is a concern). That would defragment >>>>> the collection somewhat. However, there is a big caveat with doing so - >>>>> compaction and repairing remove all padding, so the impact of the next >>>>> round of moves may be more severe.
>>>>> If you were going to test this, I would recommend a compaction of a >>>>> single collection for testing/comparison purposes before attempting it on a >>>>> widespread basis.
>>>>> Alternatively, if this is write lock contention and not an issue with >>>>> moves, then the upcoming 2.1/2.2 concurrency improvements will at least >>>>> isolate them to a specific database.
>>>>> Adam.
>>>>> On Monday, April 23, 2012 5:18:54 PM UTC+1, Eugene Bolotin wrote:
>>>>>> I'm zeroing timestamp manually. I just convert ui64 to ObjectId >>>>>> string. >>>>>> Growing of objects is a feature of my service. It happens not often - >>>>>> one time per four months. But elements are updated when accessing.
>>>>>> On Monday, April 23, 2012 7:41:46 PM UTC+4, Adam C wrote:
>>>>>>> Quick question for you - are you zeroing out timestamps in all >>>>>>> ObjectIDs (first 4 bytes) or is it only 0 timestamp documents that are >>>>>>> showing up as slow queries?
>>>>>>> There are definitely a significant number of moves happening here - >>>>>>> do you think you can reduce that behavior (growing the documents) or is it >>>>>>> just a feature of how you use the data? The fact that you just have an _id >>>>>>> index means that it is not as costly as it could be (all indexes for a >>>>>>> document must be updated when it is moved) but it is not going to help >>>>>>> performance.
>>>>>>> Adam
>>>>>>> On Monday, April 23, 2012 4:15:22 PM UTC+1, Eugene Bolotin wrote:
>>>>>>>> On Monday, April 23, 2012 6:46:59 PM UTC+4, Adam C wrote:
>>>>>>>>> No, read operations take a read lock and many can run >>>>>>>>> concurrently, the time measured to return here does not include the time >>>>>>>>> taken to acquire a lock, only to run the query:
I see a lot of faults being reported, but your res memory is staying pretty constant at ~552MB - how much memory is there on this host?
If you are faulting that much I would expect your res memory to be growing, though it might take longer than the snapshot provided here. It also seems a lot lower than the last time you posted mongostat (then it was >11GB) - is this the same instance, was it restarted?
> I saw that second script is getting errors only few seconds after start > (maybe while data is loading into memory): > #BET 2012-04-26 17:08:05.212873 11862 340 ms > #BET 2012-04-26 17:08:05.452267 28997 229 ms
> So there is no problem when working with small and large databse > simultaneous.
> The way I'm working with data is following: > - read by _id > - process > - write _id
> So data should be in memory when write operation starts. It means that > write operation should be fast. And mongodb.log shows it. There are many > query timeout records and few update timeout records (100 slow reads/1 slow > update). > Read operation doesn't use write-lock. Now there are no slow moving > operations in log. Now I'm doing 5th review of that code.
> On Tuesday, April 24, 2012 5:42:24 PM UTC+4, Adam C wrote:
>> If both are in the same instance it sounds like after a certain point you >> are faulting to disk on those reads, i.e. your working set outgrows the >> available physical RAM - how does your RAM utilisation and page fault count >> look when you start to hit the 100ms+ responses on the reads?
>> Your read volumes (and indeed the writes) don't seem excessive and the >> stats you have posted would suggest that IO is not the issue, unless the >> stats were from a time when the tests were not running. I have seen loads >> far in excess of this without causing similar issues, so we must be missing >> something here. Are these hosts in MMS (if so I would just need the group >> name to take a look).
>> Adam
>> On Tuesday, April 24, 2012 2:32:27 PM UTC+1, Eugene Bolotin wrote:
>>> Yes, I tried different pool size (from 1 to 100) and I got cascades of >>> read/write timeouts there. Strange is that I get groups of read with same >>> timeout. >>> Maybe lot of write queries with timeout < 100 ms acquire the lock?
>>> I wrote small python script (pymongo) that synchronously read (only >>> reads) records from database. And create big (50 millions) and small (100 >>> thousand) database similar to production databases.
>>> 2 instances of script read from BIG database. >>> And one instance reads from SMALL database.
>>> Total load is 800 rps. Big database script eventualy gets timeouts > 100 >>> ms. And I get timeouts of script working with small database. Isn't that >>> strange? Maybe that's the reason why I'm getting groups of simultaneous >>> read timeouts?
>>> On Tuesday, April 24, 2012 4:29:13 PM UTC+4, Adam C wrote:
>>>> I just re-read this thread and I'd like to go bat to the fact that your >>>> timeout on the application side for this is 250ms, but you are not seeing >>>> that in the mongo logs (though there are ops hitting the default slowms of >>>> 100ms). Given that, and the fact that you are seeing a far higher >>>> incidence of the 250ms timer being hit than the logged queries, it would >>>> suggest that the root casue of the issue lies somewhere other than the >>>> response times of the queries themselves.
>>>> You mentioned that you have 64 threads but a connection pool of just 10 >>>> - have you considered (or tried) increasing the pool? >>>> Is it possible that your threads are blocking/waiting for a connection >>>> to be free and that is the cause of your timer being triggered?
>>>> Adam
>>>> On Tuesday, April 24, 2012 12:22:26 PM UTC+1, Eugene Bolotin wrote:
>>>>> They are consistent. When service1 was alone, there was maximum of >>>>> 10,000 errors per day. But when second service started using same mongo >>>>> instance - error count rose to 30,000 per day. There was no records moving. >>>>> And the second service has a small database. I thought that its >>>>> queries don't lead to page faults.
>>>>> On Monday, April 23, 2012 8:53:34 PM UTC+4, Adam C wrote:
>>>>>> Do the incidents of slowness correspond to (or increase during) the >>>>>> growth periods or are things pretty consistent all round?
>>>>>> If the moves are periodic as you suggest then you could schedule a >>>>>> compact on a collection that has had a significant number of moves once >>>>>> they are done (or a repair, if space is a concern). That would defragment >>>>>> the collection somewhat. However, there is a big caveat with doing so - >>>>>> compaction and repairing remove all padding, so the impact of the next >>>>>> round of moves may be more severe.
>>>>>> If you were going to test this, I would recommend a compaction of a >>>>>> single collection for testing/comparison purposes before attempting it on a >>>>>> widespread basis.
>>>>>> Alternatively, if this is write lock contention and not an issue with >>>>>> moves, then the upcoming 2.1/2.2 concurrency improvements will at least >>>>>> isolate them to a specific database.
>>>>>> Adam.
>>>>>> On Monday, April 23, 2012 5:18:54 PM UTC+1, Eugene Bolotin wrote:
>>>>>>> I'm zeroing timestamp manually. I just convert ui64 to ObjectId >>>>>>> string. >>>>>>> Growing of objects is a feature of my service. It happens not often >>>>>>> - one time per four months. But elements are updated when accessing.
>>>>>>> On Monday, April 23, 2012 7:41:46 PM UTC+4, Adam C wrote:
>>>>>>>> Quick question for you - are you zeroing out timestamps in all >>>>>>>> ObjectIDs (first 4 bytes) or is it only 0 timestamp documents that are >>>>>>>> showing up as slow queries?
>>>>>>>> There are definitely a significant number of moves happening here - >>>>>>>> do you think you can reduce that behavior (growing the documents) or is it >>>>>>>> just a feature of how you use the data? The fact that you just have an _id >>>>>>>> index means that it is not as costly as it could be (all indexes for a >>>>>>>> document must be updated when it is moved) but it is not going to help >>>>>>>> performance.
>>>>>>>> Adam
>>>>>>>> On Monday, April 23, 2012 4:15:22 PM UTC+1, Eugene Bolotin wrote:
>>>>>>>>> On Monday, April 23, 2012 6:46:59 PM UTC+4, Adam C wrote:
>>>>>>>>>> No, read operations take a read lock and many can run >>>>>>>>>> concurrently, the time measured to return here does not include the time >>>>>>>>>> taken to acquire a lock, only to run the query:
No. That is test host - I'm running two test scripts there trying to reproduce the problem.
I also got core file for situation when one read takes a long time (it aborts when query_time > 100ms. That's in service that works with big database): (gdb) info threads 19 Thread 0x7fffee042700 (LWP 4639) 0x00007ffff6ad9d6b in _int_free (av=0x7fffe0000020, p=0x7fffe008d9c0) at malloc.c:4823 18 Thread 0x7fffee843700 (LWP 4638) 0x00007ffff6e050dc in __libc_recv (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized out>, flags=<value optimized out>) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34 17 Thread 0x7fffef044700 (LWP 4637) 0x00007ffff6e04fdd in accept () at ../sysdeps/unix/syscall-template.S:82 16 Thread 0x7fffef845700 (LWP 4636) 0x00007ffff6e050dc in __libc_recv (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized out>, flags=<value optimized out>) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34 15 Thread 0x7ffff0046700 (LWP 4635) 0x00007ffff6e050dc in __libc_recv (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized out>, flags=<value optimized out>) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34 14 Thread 0x7ffff0847700 (LWP 4634) 0x00007ffff6e04fdd in accept () at ../sysdeps/unix/syscall-template.S:82 13 Thread 0x7ffff1048700 (LWP 4633) 0x00007ffff6e050dc in __libc_recv (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized out>, flags=<value optimized out>) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34 * 12 Thread 0x7ffff1849700 (LWP 4632) __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:97 11 Thread 0x7ffff204a700 (LWP 4631) 0x00007ffff6e050dc in __libc_recv (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized out>, flags=<value optimized out>) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34 10 Thread 0x7ffff284b700 (LWP 4630) 0x00007ffff6e050dc in __libc_recv (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized out>, flags=<value optimized out>) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34 9 Thread 0x7ffff304c700 (LWP 4629) 0x00007ffff6e050dc in __libc_recv (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized out>, flags=<value optimized out>) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34 8 Thread 0x7ffff384d700 (LWP 4628) 0x00007ffff6a94d05 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 7 Thread 0x7ffff404e700 (LWP 4627) 0x00007ffff6e050dc in __libc_recv (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized out>, flags=<value optimized out>) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34 6 Thread 0x7ffff484f700 (LWP 4626) 0x00007ffff6e04fdd in accept () at ../sysdeps/unix/syscall-template.S:82 5 Thread 0x7ffff5050700 (LWP 4625) 0x00007ffff6e04fdd in accept () at ../sysdeps/unix/syscall-template.S:82 4 Thread 0x7ffff5851700 (LWP 4624) 0x00007ffff6e04fdd in accept () at ../sysdeps/unix/syscall-template.S:82 3 Thread 0x7ffff6052700 (LWP 4623) 0x00007ffff6b0f13d in nanosleep () at ../sysdeps/unix/syscall-template.S:82 2 Thread 0x7ffff6a60700 (LWP 4622) 0x00007ffff6b0f13d in nanosleep () at ../sysdeps/unix/syscall-template.S:82 1 Thread 0x7ffff7fcd740 (LWP 4555) 0x00007ffff6b0f13d in nanosleep () at ../sysdeps/unix/syscall-template.S:82
There are 8 threads on __libc_recv, 1 thread (#8) that raised signal and 1 thread _int_free. Other threads wait on accept or on nanosleep (service threads).
I got backtraces for threads that are in __libc_recv - all backtraces are similar:
#0 0x00007ffff6e050dc in __libc_recv (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized out>, flags=<value optimized out>) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34 #1 0x0000000000453f0f in mongo_read_socket (conn=0x71af10, buf=0x7ffff404ca40, len=16) at /home/user/mongoservice/env_default.c:57 #2 0x0000000000450bca in mongo_read_response (conn=0x71af10, reply=0x7fffe008a6f0) at /home/user/mongoservice/mongo.c:164 #3 0x00000000004522cb in mongo_cursor_op_query (cursor=0x7fffe008a6f0) at /home/user/mongoservice/mongo.c:750 #4 0x00000000004526f9 in mongo_find (conn=0x71af10, ns=0x7fffe0092cf8 "rep.COL1", query=0x7ffff404cd70, fields=0x7ffff404ccc0, limit=1, skip=0, options=4) at /home/user/mongoservice/mongo.c:838 #5 0x0000000000449a91 in nosql::MongoClientRaw::FindOne (this=0x71af10, db=..., collection=..., id=..., hash=..., err=@0x7ffff404d44f) at /home/user/mongoservice/mongoclient.cpp:147
_int_free thread: #0 0x00007ffff6ad9d6b in _int_free (av=0x7fffe0000020, p=0x7fffe008d9c0) at malloc.c:4823 #1 0x00007ffff6ade3e3 in __libc_free (mem=<value optimized out>) at malloc.c:3738 #2 0x00007ffff754c019 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6 #3 0x000000000043574d in TMongoDb<TSenderData >::MongoRead (this=0x6fffa8, db=0x49d1d8 "rep", collection=0x7fffe00997b8 "COL1", index=40621381955, data=..., err=@0x7fffee04144f) at /home/user/mongoservice/mongostorage.h:86
I also have timestamp (is millis) variable in MongoRead function: #7 start = 1335456660339 #8 start = 1335456660338, query_time = 337 (thread raises signal) #9 start = 1335456660341 #10 start = 1335456660340 #11 start = 1335456660340 #13 start = 1335456660336 #15 start = 1335456660340 #16 start = 1335456660337 #18 start = 1335456660339 #19 start = 1335456660338 Start times are similar. That means that all queries started at same time. And all of them are read queries and took time > 300 ms.
1335456660338 ms = 2012-04-26 20:11:00
And mongodb.log: hu Apr 26 20:10:28 [initandlisten] connection accepted from 127.0.0.1:57832 #405 Thu Apr 26 20:10:28 [initandlisten] connection accepted from 127.0.0.1:57833 #406 Thu Apr 26 20:10:28 [initandlisten] connection accepted from 127.0.0.1:57834 #407 Thu Apr 26 20:10:28 [initandlisten] connection accepted from 127.0.0.1:57835 #408 Thu Apr 26 20:11:00 [conn345] query rep.COL1 query: { _id: ObjectId('000000000000000001027908') } ntoreturn:1 idhack:1 reslen:454 338ms
Maybe all of queries caused page faults but why there is only one slow query record in mongodb.log?
On Thursday, April 26, 2012 7:07:56 PM UTC+4, Adam C wrote:
> I see a lot of faults being reported, but your res memory is staying > pretty constant at ~552MB - how much memory is there on this host?
> If you are faulting that much I would expect your res memory to be > growing, though it might take longer than the snapshot provided here. It > also seems a lot lower than the last time you posted mongostat (then it was > >11GB) - is this the same instance, was it restarted?
> Adam
> On Thursday, April 26, 2012 3:44:28 PM UTC+1, Eugene Bolotin wrote:
>> I get following mongostat without second script that works with small DB:
>> I saw that second script is getting errors only few seconds after start >> (maybe while data is loading into memory): >> #BET 2012-04-26 17:08:05.212873 11862 340 ms >> #BET 2012-04-26 17:08:05.452267 28997 229 ms
>> So there is no problem when working with small and large databse >> simultaneous.
>> The way I'm working with data is following: >> - read by _id >> - process >> - write _id
>> So data should be in memory when write operation starts. It means that >> write operation should be fast. And mongodb.log shows it. There are many >> query timeout records and few update timeout records (100 slow reads/1 slow >> update). >> Read operation doesn't use write-lock. Now there are no slow moving >> operations in log. Now I'm doing 5th review of that code.
>> On Tuesday, April 24, 2012 5:42:24 PM UTC+4, Adam C wrote:
>>> If both are in the same instance it sounds like after a certain point >>> you are faulting to disk on those reads, i.e. your working set outgrows the >>> available physical RAM - how does your RAM utilisation and page fault count >>> look when you start to hit the 100ms+ responses on the reads?
>>> Your read volumes (and indeed the writes) don't seem excessive and the >>> stats you have posted would suggest that IO is not the issue, unless the >>> stats were from a time when the tests were not running. I have seen loads >>> far in excess of this without causing similar issues, so we must be missing >>> something here. Are these hosts in MMS (if so I would just need the group >>> name to take a look).
>>> Adam
>>> On Tuesday, April 24, 2012 2:32:27 PM UTC+1, Eugene Bolotin wrote:
>>>> Yes, I tried different pool size (from 1 to 100) and I got cascades of >>>> read/write timeouts there. Strange is that I get groups of read with same >>>> timeout. >>>> Maybe lot of write queries with timeout < 100 ms acquire the lock?
>>>> I wrote small python script (pymongo) that synchronously read (only >>>> reads) records from
The traces seem consistent with the logging, one of the threads exited with an abort because it was slow, and one query was logged as slow in MongoDB. I don't see where there's any other evidence of slowness here.
As I mentioned previously, you are seeing quite a few page faults (in the last mongostat at least), if 9 of your queries hit memory, and one faults to disk, then that would be the likely reason for a slow return.
You could try re-running that slow query afterward, if we are right, a subsequent second find using the same _id would then return quickly because it would have been paged into memory. If your whole data set needs to be in memory then you might run a large find, essentially a table scan, to get it into memory and warm up the cache to avoid the slow responses. As long as the data set fits, and there is no memory pressure that should remove the issues you are seeing. What you get otherwise is essentially a lazy loading mechanism, page faults when you look for data not in memory, and a certain level of slower responses.
There are plans to implement a "touch" command to make this process less manual and allow admins to load required data more easily in advance, but for now the more manual approach is your best bet.
On Friday, April 27, 2012 10:19:07 AM UTC+1, Eugene Bolotin wrote:
> No. That is test host - I'm running two test scripts there trying to > reproduce the problem.
> I also got core file for situation when one read takes a long time (it > aborts when query_time > 100ms. That's in service that works with big > database): > (gdb) info threads > 19 Thread 0x7fffee042700 (LWP 4639) 0x00007ffff6ad9d6b in _int_free > (av=0x7fffe0000020, p=0x7fffe008d9c0) at malloc.c:4823 > 18 Thread 0x7fffee843700 (LWP 4638) 0x00007ffff6e050dc in __libc_recv > (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized > out>, flags=<value optimized out>) > at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34 > 17 Thread 0x7fffef044700 (LWP 4637) 0x00007ffff6e04fdd in accept () at > ../sysdeps/unix/syscall-template.S:82 > 16 Thread 0x7fffef845700 (LWP 4636) 0x00007ffff6e050dc in __libc_recv > (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized > out>, flags=<value optimized out>) > at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34 > 15 Thread 0x7ffff0046700 (LWP 4635) 0x00007ffff6e050dc in __libc_recv > (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized > out>, flags=<value optimized out>) > at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34 > 14 Thread 0x7ffff0847700 (LWP 4634) 0x00007ffff6e04fdd in accept () at > ../sysdeps/unix/syscall-template.S:82 > 13 Thread 0x7ffff1048700 (LWP 4633) 0x00007ffff6e050dc in __libc_recv > (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized > out>, flags=<value optimized out>) > at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34 > * 12 Thread 0x7ffff1849700 (LWP 4632) __lll_lock_wait_private () at > ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:97 > 11 Thread 0x7ffff204a700 (LWP 4631) 0x00007ffff6e050dc in __libc_recv > (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized > out>, flags=<value optimized out>) > at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34 > 10 Thread 0x7ffff284b700 (LWP 4630) 0x00007ffff6e050dc in __libc_recv > (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized > out>, flags=<value optimized out>) > at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34 > 9 Thread 0x7ffff304c700 (LWP 4629) 0x00007ffff6e050dc in __libc_recv > (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized > out>, flags=<value optimized out>) > at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34 > 8 Thread 0x7ffff384d700 (LWP 4628) 0x00007ffff6a94d05 in raise (sig=6) > at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 > 7 Thread 0x7ffff404e700 (LWP 4627) 0x00007ffff6e050dc in __libc_recv > (fd=<value optimized out>, buf=<value optimized out>, n=<value optimized > out>, flags=<value optimized out>) > at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34 > 6 Thread 0x7ffff484f700 (LWP 4626) 0x00007ffff6e04fdd in accept () at > ../sysdeps/unix/syscall-template.S:82 > 5 Thread 0x7ffff5050700 (LWP 4625) 0x00007ffff6e04fdd in accept () at > ../sysdeps/unix/syscall-template.S:82 > 4 Thread 0x7ffff5851700 (LWP 4624) 0x00007ffff6e04fdd in accept () at > ../sysdeps/unix/syscall-template.S:82 > 3 Thread 0x7ffff6052700 (LWP 4623) 0x00007ffff6b0f13d in nanosleep () > at ../sysdeps/unix/syscall-template.S:82 > 2 Thread 0x7ffff6a60700 (LWP 4622) 0x00007ffff6b0f13d in nanosleep () > at ../sysdeps/unix/syscall-template.S:82 > 1 Thread 0x7ffff7fcd740 (LWP 4555) 0x00007ffff6b0f13d in nanosleep () > at ../sysdeps/unix/syscall-template.S:82
> There are 8 threads on __libc_recv, 1 thread (#8) that raised signal and 1 > thread _int_free. Other threads wait on accept or on nanosleep (service > threads).
> I got backtraces for threads that are in __libc_recv - all backtraces are > similar:
> #0 0x00007ffff6e050dc in __libc_recv (fd=<value optimized out>, > buf=<value optimized out>, n=<value optimized out>, flags=<value optimized > out>) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34 > #1 0x0000000000453f0f in mongo_read_socket (conn=0x71af10, > buf=0x7ffff404ca40, len=16) at /home/user/mongoservice/env_default.c:57 > #2 0x0000000000450bca in mongo_read_response (conn=0x71af10, > reply=0x7fffe008a6f0) at /home/user/mongoservice/mongo.c:164 > #3 0x00000000004522cb in mongo_cursor_op_query (cursor=0x7fffe008a6f0) at > /home/user/mongoservice/mongo.c:750 > #4 0x00000000004526f9 in mongo_find (conn=0x71af10, ns=0x7fffe0092cf8 > "rep.COL1", query=0x7ffff404cd70, fields=0x7ffff404ccc0, limit=1, skip=0, > options=4) > at /home/user/mongoservice/mongo.c:838 > #5 0x0000000000449a91 in nosql::MongoClientRaw::FindOne (this=0x71af10, > db=..., collection=..., id=..., hash=..., err=@0x7ffff404d44f) > at /home/user/mongoservice/mongoclient.cpp:147
> _int_free thread: > #0 0x00007ffff6ad9d6b in _int_free (av=0x7fffe0000020, p=0x7fffe008d9c0) > at malloc.c:4823 > #1 0x00007ffff6ade3e3 in __libc_free (mem=<value optimized out>) at > malloc.c:3738 > #2 0x00007ffff754c019 in std::basic_string<char, std::char_traits<char>, > std::allocator<char> >::~basic_string() () from > /usr/lib/x86_64-linux-gnu/libstdc++.so.6 > #3 0x000000000043574d in TMongoDb<TSenderData >::MongoRead > (this=0x6fffa8, db=0x49d1d8 "rep", collection=0x7fffe00997b8 "COL1", > index=40621381955, data=..., > err=@0x7fffee04144f) at /home/user/mongoservice/mongostorage.h:86
> I also have timestamp (is millis) variable in MongoRead function: > #7 start = 1335456660339 > #8 start = 1335456660338, query_time = 337 (thread raises signal) > #9 start = 1335456660341 > #10 start = 1335456660340 > #11 start = 1335456660340 > #13 start = 1335456660336 > #15 start = 1335456660340 > #16 start = 1335456660337 > #18 start = 1335456660339 > #19 start = 1335456660338 > Start times are similar. That means that all queries started at same time. > And all of them are read queries and took time > 300 ms.
> Maybe all of queries caused page faults but why there is only one slow > query record in mongodb.log?
> On Thursday, April 26, 2012 7:07:56 PM UTC+4, Adam C wrote:
>> I see a lot of faults being reported, but your res memory is staying >> pretty constant at ~552MB - how much memory is there on this host?
>> If you are faulting that much I would expect your res memory to be >> growing, though it might take longer than the snapshot provided here. It >> also seems a lot lower than the last time you posted mongostat (then it was >> >11GB) - is this the same instance, was it restarted?
>> Adam
>> On Thursday, April 26, 2012 3:44:28 PM UTC+1, Eugene Bolotin wrote:
>>> I get following mongostat without second script that works with small DB:
>>> I saw that second script is getting errors only few seconds after start >>> (maybe while data is loading into memory): >>> #BET 2012-04-26 17:08:05.212873 11862 340 ms >>> #BET 2012-04-26 17:08:05.452267 28997 229 ms
>>> So there is no problem when working with small and large databse >>> simultaneous.
>>> The way I'm working with data is following: >>> - read by _id >>> -