MongoDB 3.2.10: Indexed queries taking too long to respond

1,106 views
Skip to first unread message

Astro

unread,
Nov 24, 2016, 5:05:26 AM11/24/16
to mongodb-user



We have been seeing this issue with one of the database(size 2.286GB) with wiredTiger storage engine.

Setup:
Instance type: r3.2xlarge (4 CPU cores, 32GB RAM)
Total number of DBs: 4
Size of largest DB: 2.286GB
Total collections in this DB: 10079
Total indexes in this DB: 89098
Total documents: 674155

Slow calls description:
Collection: holdings
Total docs: 721
Sample query: 

db.holdings.find({ "user_key" : "eafjhjafaefaefe", "deleted_on" : null })

Time taken for this query: 9191 millis

Below is the sample slow call captured in profiler:
{
 
"op" : "query",
 
"ns" : "userdatabase.holdings",
 
"query" : {
 
"find" : "holdings",
 
"filter" : {
 
"user_key" : "eafjhjafaefaefe",
 
"deleted_on" : null
 
},
 
"projection" : {
 
 
},
 
"limit" : 1,
 
"batchSize" : 1,
 
"singleBatch" : true
 
},
 
"keysExamined" : 1,
 
"docsExamined" : 1,
 
"cursorExhausted" : true,
 
"keyUpdates" : 0,
 
"writeConflicts" : 0,
 
"numYield" : 0,
 
"locks" : {
 
"Global" : {
 
"acquireCount" : {
 
"r" : NumberLong(2)
 
},
 
"acquireWaitCount" : {
 
"r" : NumberLong(1)
 
},
 
"timeAcquiringMicros" : {
 
"r" : NumberLong(9189404)
 
}
 
},
 
"Database" : {
 
"acquireCount" : {
 
"r" : NumberLong(1)
 
}
 
},
 
"Collection" : {
 
"acquireCount" : {
 
"r" : NumberLong(1)
 
}
 
}
 
},
 
"nreturned" : 1,
 
"responseLength" : 661,
 
"protocol" : "op_query",
 
"millis" : 9191,
 
"execStats" : {
 
"stage" : "CACHED_PLAN",
 
"nReturned" : 1,
 
"executionTimeMillisEstimate" : 0,
 
"works" : 1,
 
"advanced" : 1,
 
"needTime" : 0,
 
"needYield" : 0,
 
"saveState" : 0,
 
"restoreState" : 0,
 
"isEOF" : 1,
 
"invalidates" : 0,
 
"inputStage" : {
 
"stage" : "LIMIT",
 
"nReturned" : 1,
 
"executionTimeMillisEstimate" : 0,
 
"works" : 1,
 
"advanced" : 1,
 
"needTime" : 0,
 
"needYield" : 0,
 
"saveState" : 0,
 
"restoreState" : 0,
 
"isEOF" : 1,
 
"invalidates" : 0,
 
"limitAmount" : 1,
 
"inputStage" : {
 
"stage" : "FETCH",
 
"filter" : {
 
"deleted_at" : {
 
"$eq" : null
 
}
 
},
 
"nReturned" : 1,
 
"executionTimeMillisEstimate" : 0,
 
"works" : 1,
 
"advanced" : 1,
 
"needTime" : 0,
 
"needYield" : 0,
 
"saveState" : 0,
 
"restoreState" : 0,
 
"isEOF" : 0,
 
"invalidates" : 0,
 
"docsExamined" : 1,
 
"alreadyHasObj" : 0,
 
"inputStage" : {
 
"stage" : "IXSCAN",
 
"nReturned" : 1,
 
"executionTimeMillisEstimate" : 0,
 
"works" : 1,
 
"advanced" : 1,
 
"needTime" : 0,
 
"needYield" : 0,
 
"saveState" : 0,
 
"restoreState" : 0,
 
"isEOF" : 0,
 
"invalidates" : 0,
 
"keyPattern" : {
 
"user_key" : 1,
 
"deleted_on" : 1
 
},
 
"indexName" : "user_key_deleted_on",
 
"isMultiKey" : false,
 
"isUnique" : false,
 
"isSparse" : false,
 
"isPartial" : false,
 
"indexVersion" : 1,
 
"direction" : "forward",
 
"indexBounds" : {
 
"user_key" : [
 
"[\"eafjhjafaefaefe\", \"eafjhjafaefaefe\"]"
 
],
 
"deleted_on" : [
 
"[null, null]"
 
]
 
},
 
"keysExamined" : 1,
 
"dupsTested" : 0,
 
"dupsDropped" : 0,
 
"seenInvalidated" : 0
 
}
 
}
 
}
 
},
 
"ts" : ISODate("2016-11-24T13:07:40.552Z"),
 
"client" : "172.18.20.200",
 
"allUsers" : [
 
{
 
"user" : "abc",
 
"db" : "userdatabase"
 
}
 
],
 
"user" : "abc@userdatabase"
}



When the same query executed inside mongo shell, returns following in explain:

               
"executionTimeMillis" : 0,
 
"totalKeysExamined" : 1,
 
"totalDocsExamined" : 1
 

The memory and CPU usage found normal on this DB. Also, the average query/sec < 50.

Any help on this? 

Rhys Campbell

unread,
Nov 24, 2016, 10:27:14 AM11/24/16
to mongodb-user
It's spending time waiting to acquire locks...

"timeAcquiringMicros" : {
 
"r" : NumberLong(9189404)
 
}


Take a look at the globalLock section of serverStatus output. How big is the wirderTiger cache size? Default is 60%? How much data do you have in total? Can the working set fit comfortably in RAM? HOw many collections do you have in total? you have over 10K in one db, have you set ulimits?

Astro

unread,
Nov 24, 2016, 11:59:57 AM11/24/16
to mongodb-user
Thanks Rhys,

Update: The instance size is r3.xlarge( 4 CPU cores, 32GB RAM)

1. The wiredTiger cache size is set to default

2. Total data size is 4GB including 3 other databases on same instance.

3. With this memory size working set is expected to fit in memory

4. The total number of collections is 10079

5. Yes. The ulimits have been changed to meet large number of open files i.e 600,000

Hope this makes it clear

Kevin Adistambha

unread,
Dec 1, 2016, 11:42:42 PM12/1/16
to mongodb-user

Hi

I think Rhys may be correct. The query seems to stall waiting for something. Can you provide more information:

  • What is your MongoDB and your O/S version?
  • What is your deployment topology (standalone, replica set, or sharded cluster)?
  • Has the server been setup using the recommended settings in the Production Notes?
  • Does this always happen with the same query, or do other queries exhibit the same stall as well?
  • Is there anything in the mongod log during this stall period?
  • You mentioned “Total indexes in this DB: 89098”. Is this the size of the index in Bytes, or the number of indexes? How did you get this number?
  • Are there any other process running on the server, e.g. backup process, other mongod processes, etc.
  • Could you post the output of:
    • db.currentOp() during the stall, if possible
    • db.stats()
    • db.collection.stats() for affected collections
    • db.holdings.stats()
    • The execution stats output of the query, e.g. db.holdings.explain('executionStats').find({user_key:..., deleted_on: ...})

Please note that you can use mtools (particularly mplotqueries and mloginfo) to help diagnose slow query events using the mongod logs.

Best regards,
Kevin

Astro

unread,
Dec 5, 2016, 1:32:35 AM12/5/16
to mongodb-user


Hi Kevin, 

   Thanks for the update.
 

Below are the more specific details:

  • What is your MongoDB and your O/S version?
           MongoDB version is 3.2.10 and OS is Ubuntu 14.04 
  • What is your deployment topology (standalone, replica set, or sharded cluster)?
            It is replica set with three members (Primary, Secondary and Secondary)
  • Has the server been setup using the recommended settings in the Production Notes?
           Yes, we have been following this document. Some of the relevant checks have been taken care of. 
  • Does this always happen with the same query, or do other queries exhibit the same stall as well?
          This happens intermittently. Some of the other queries on one or two other collections exhibit the same behavior.(Indexed but slow)
  • Is there anything in the mongod log during this stall period?
          No errors found related to this in logs. 
  • You mentioned “Total indexes in this DB: 89098”. Is this the size of the index in Bytes, or the number of indexes? How did you get this number?
          Yes, that's the total numbers of indexes for this database. Obtained using db.stats()
  • Are there any other process running on the server, e.g. backup process, other mongod processes, etc.
          No, this host isn't configured for backup. 
  • Could you post the output of:
    • db.currentOp() during the stall, if possible
          Isn't captured.
    • db.stats()
         {
"db" : "userdatabase",
"collections" : 11767,
"objects" : 967227,
"avgObjSize" : 3109.8919633136793,
"dataSize" : 3007971474,
"storageSize" : 1379016704,
"numExtents" : 0,
"indexes" : 106844,
"indexSize" : 2024378368,
"ok" : 1
}          
    • db.holdings.stats()
        {
"ns" : "userdatabase.holdings",
"count" : 835,
"size" : 458705,
"avgObjSize" : 549,
"storageSize" : 397312,
"capped" : false,
     }
    • The execution stats output of the query, e.g. db.holdings.explain('executionStats').find({user_key:..., deleted_on: ...})
     { "executionStats" : {
"executionSuccess" : true,
"nReturned" : 1,
"executionTimeMillis" : 0,
"totalKeysExamined" : 1,
"totalDocsExamined" : 1,
"executionStages" : {
"stage" : "FETCH",
"filter" : {
"deleted_on" : {
"$eq" : null
}
},
"nReturned" : 1,
"executionTimeMillisEstimate" : 0,
"works" : 3,
"advanced" : 1,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"docsExamined" : 1,
"alreadyHasObj" : 0,
"inputStage" : {
"stage" : "IXSCAN",
"nReturned" : 1,
"executionTimeMillisEstimate" : 0,
"works" : 2,
"advanced" : 1,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"keyPattern" : {
"user_key" : 1,
"deleted_on" : 1
},
"indexName" : "user_key_deleted_on",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"user_key" : [
"[\"eafjhjafaefaefe\", \"eafjhjafaefaefe\"]"
],
"deleted_on" : [
"[null, null]"
]
},
"keysExamined" : 1,
"dupsTested" : 0,
"dupsDropped" : 0,
"seenInvalidated" : 0
}
}
},
"serverInfo" : {
"host" : "172.18.20.200",
"port" : 27017,
"version" : "3.2.10",
"gitVersion" : "79d9b3ab5ce20f51c272b4411202710a082d0317"
},
"ok" : 1
}      

Astro

unread,
Dec 9, 2016, 5:20:34 AM12/9/16
to mongodb-user
Hi! Kevin,

Any suggestions on this so that we can debug it further?


Thanks!

Rhys Campbell

unread,
Dec 9, 2016, 9:24:25 AM12/9/16
to mongodb-user
Have a look at the stats from...

db.serverStatus()["wiredTiger"]["cache"];

Are "modified pages evicted" & "unmodified pages evicted"  very high?

Astro

unread,
Dec 9, 2016, 9:50:33 AM12/9/16
to mongodb-user

Are "modified pages evicted" & "unmodified pages evicted"  very high?

     "modified pages evicted" : 347203,
     "unmodified pages evicted" : 11459833 

Rodrigo Nascimento

unread,
Dec 9, 2016, 12:42:33 PM12/9/16
to mongodb-user
"  This happens intermittently. Some of the other queries on one or two other collections exhibit the same behavior.(Indexed but slow)"

if it is intermittently the queries aren't the root cause, they are suffering the consequences of something else.
What else is going on in this server when the queries are running slow? 

All the best,

Logwriter

Rodrigo Nascimento

unread,
Dec 9, 2016, 12:53:54 PM12/9/16
to mongodb-user
You have a Global locking here:

"Global" : {
 
"acquireCount" : {
 
"r" : NumberLong(2)
 
},
 
"acquireWaitCount" : {
 
"r" : NumberLong(1)
 
},
 
"timeAcquiringMicros" : {
 
"r" : NumberLong(9189404)
 
}

It took 9,189ms to be acquired. And you also have a collection lock with a similar time as the global locking

 "Collection" : {
 
"acquireCount" : {
 
"r" : NumberLong(1)
 
}
 
}
 
},
 
"nreturned" : 1,
 
"responseLength" : 661,
 
"protocol" : "op_query",
 
"millis" : 9191,

I'm pretty sure those events are related to each other. So, what is happening in your environment that would trigger a global locking?

Kevin Adistambha

unread,
Dec 12, 2016, 11:06:05 PM12/12/16
to mongodb-user

Hi

Since the issue is intermittent, it is difficult to pin down exactly the cause. Can you determine a pattern in the slow queries, e.g. time of day, number of queries, output of db.currentOp() during period of slowness, etc. However, from the stats output you provided:

{
    "db" : "userdatabase",
    "collections" : 11767,
    "objects" : 967227,
    "avgObjSize" : 3109.8919633136793,
    "dataSize" : 3007971474,
    "storageSize" : 1379016704,
    "numExtents" : 0,
    "indexes" : 106844,
    "indexSize" : 2024378368,
    "ok" : 1
}

What is your use case in this deployment, i.e. why do you have such a large amount of collections and indexes? If you intermittently encounter some unexplainable slow queries, it may be that you are either hitting some kind of hardware limitation, you have a resource contention in the machine, there are other operations being done in the database, or other as yet unknown reason. Have you tried upgrading the instance type, or consider sharding? Also, please note that the latest version of MongoDB in the 3.2 series is 3.2.11.

Best regards,
Kevin

Astro

unread,
Jan 2, 2017, 12:18:02 AM1/2/17
to mongodb-user
The pattern isn't always recurring. However, the the time for some bunch of slow queries is same.

What are the possible reasons for such large global meAcquiringMicros? 

As a side note, the show dbs command takes 2-3 seconds to list the databases.

Do you think the large number of indexes and collections adding to the slowness?

Thanks!

 

Astro

unread,
Jan 17, 2017, 8:43:54 AM1/17/17
to mongodb-user
 We've noted that this is happening when there is some load on the servers.

But, the only thing we've found in profiler is high timeAcquiringMicros in Global locks.

Given the environment, what may be the possible cause of the Global lock?

Any advice will be helpful.

Thanks.

kaushik subramanian

unread,
Sep 29, 2017, 3:15:34 AM9/29/17
to mongodb-user
Hi Rhys,
Out of curiosity what could the theory be if pages evicted and unmodified pages evicted are very high? FYI, i am facing the same problem and those values are pretty high for me.

Monika Shah

unread,
Oct 2, 2017, 2:20:13 PM10/2/17
to mongodb-user
May I know which database and queryset you are using?
I also would like to know which profiler you have used.

Kevin Adistambha

unread,
Oct 5, 2017, 8:58:59 PM10/5/17
to mongodb-user

Hi Kaushik

Out of curiosity what could the theory be if pages evicted and unmodified pages evicted are very high?

In short, there are two “pages evicted” metrics: “modified pages evicted” which comes from updates (insert/update) and “unmodified pages evicted” which comes from reads (find/aggregate, etc.). The numbers for both metrics could approximately show you the reads/writes workload that you have on your server. The cause of high values in both metrics could include (but not limited to):

  • Lack of RAM for the workload
  • WiredTiger cache is set too small
  • Change in workload (i.e. an app querying parts of the dataset that was dormant before, e.g. analytics workload)
  • Collections scans (non-indexed queries) or suboptimal index usage

Of course, the exact cause is very specific to your workload & deployment.

Best regards,
Kevin

Reply all
Reply to author
Forward
0 new messages