How to analyze query performance for queries with dynamic parameters?


Dennis Kashkin

Mar 22, 2016, 11:05:20 AM
to mongodb-user
I help manage a fairly large and busy MongoDB 3.0.8 database. I recently had to troubleshoot a sudden increase in inbound network traffic on the primary (from 2 Mbps to 50 Mbps). I was able to find the one query that caused this issue, but the troubleshooting process with the native MongoDB tools was very slow and painful for a few reasons:
  1. The database log file gets thousands of records per second, and all long queries are truncated, so it is very hard to visually identify the longest queries and estimate their frequency.
  2. The profiler allows sorting queries by responseLength, but not by the query length (db.system.profile.find().sort({responseLength: -1})).
  3. The way MongoDB logs queries, all dynamic parameters such as ID values are embedded into the query text, which makes every query look unique even if the application sends the same query "template" with different parameter values. I wish there were a way to separate the dynamic data from the query structure, so that multiple queries with identical structure but different parameter values could be automatically recognized as executions of the same query.
  4. I wish there were a way to assign a unique tag to each query type in the application and then find that tag in the log and profiler. That way, even if the query structure is dynamic, all queries with the same value of the "query type tag" could be analyzed collectively. In the SQL world this is done by injecting comments into queries, like /* REPORT_SALES_DAILY */, which makes it easy to filter the log by query type.
So, I wanted to ask if anybody knows of a way to efficiently identify similar queries with the native tools or with some third-party tool? All I want is a list of queries where each line looks like this:

QueryType=db.sales.find({InsertDate:{$gt:(DYNAMIC)}, StoreID:{$in:[(DYNAMIC)]}})    Frequency=500/hour,  AvgTime=20ms,  AvgRequest=1.2KB,  AvgResponse=100KB

Thanks!
- Dennis

Chris De Bruyne

Mar 22, 2016, 4:17:35 PM
to mongodb-user
Hello Dennis,

As far as I know there is no way to get the full text of longer queries into your log files. The limit is 10K per line, which is huge, let's be honest.

The answers to all your other problems are given by this great guy, Thomas Rückstieß, and his mtools: https://github.com/rueckstiess/mtools

It's part command line / part GUI, and if you mix and match the tools you can get exactly what you describe.


Have fun exploring
Chris

Stephen Steneker

Apr 28, 2016, 9:07:14 PM
to mongodb-user

On Wednesday, 23 March 2016 07:17:35 UTC+11, Chris De Bruyne wrote:

As far as I know there is no way to get the longer queries in your log files. The limit is 10K per line, which is huge, let’s be honest.

Hi Chris, Dennis,

A belated answer for the archives :).

There isn’t a setting to increase the log line limit (nor would one be particularly useful); however, there are several approaches that can help you understand changes in query traffic:

  • Use the $comment meta-operator to tag your queries.

  • As suggested by Chris, mtools is helpful for log analysis. Aside from mplotqueries, there is also mloginfo --queries which tries to aggregate queries by shape.

  • Use a comprehensive monitoring solution (e.g. MongoDB Cloud Manager) to capture a historical baseline of activity. The size of queries is one possible cause of increased network traffic, but there are many other likely candidates: more operations overall, queries returning large result sets (backup? initial sync?), changes in cluster topology (perhaps chained replication if you have a distributed deployment), etc.
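To make the "query shape" idea concrete: the kind of grouping that mloginfo --queries performs can be sketched in a few lines of Python. This is only a minimal illustration of the technique, not mtools' actual implementation; the sample filters, durations, and the (DYNAMIC) placeholder name are invented for the example:

```python
import re
from collections import defaultdict

def shape(query):
    """Collapse a query's literal values into a placeholder so that
    structurally identical queries map to the same 'template'."""
    s = re.sub(r'"[^"]*"', '(DYNAMIC)', query)           # string literals
    s = re.sub(r'\b\d+(?:\.\d+)?\b', '(DYNAMIC)', s)     # numeric literals
    # Treat $in lists of any length as a single placeholder.
    s = re.sub(r'\(DYNAMIC\)(?:,\s*\(DYNAMIC\))+', '(DYNAMIC)', s)
    return s

# Hypothetical (query filter, duration in ms) pairs, as they might be
# pulled from the profiler or the mongod log.
entries = [
    ('{ InsertDate: { $gt: 1458000000 }, StoreID: { $in: [ 17, 42 ] } }', 18),
    ('{ InsertDate: { $gt: 1458090000 }, StoreID: { $in: [ 99 ] } }', 22),
    ('{ _id: "56f1a2b3c4d5e6f7a8b9c0d1" }', 5),
]

stats = defaultdict(lambda: {'count': 0, 'total_ms': 0})
for query, ms in entries:
    s = stats[shape(query)]
    s['count'] += 1
    s['total_ms'] += ms

for template, s in stats.items():
    print(f"{template}  Frequency={s['count']}  "
          f"AvgTime={s['total_ms'] / s['count']:.0f}ms")
```

The first two entries differ only in their literal values (and in the length of the $in list), so they collapse to one template with Frequency=2, which is exactly the per-shape view Dennis described.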

For some helpful background on investigating performance issues, I would recommend watching Asya’s presentation on Diagnostics and Debugging.

Regards,
Stephen
