Get realtime mongodb latency stats histogram


Ashu Pachauri

May 9, 2018, 4:40:09 AM
to mongodb-user
MongoDB provides a way to get operation latency stats per collection and for the whole database as part of the $collStats aggregation pipeline stage. However, these are lifetime stats, which means one slow query will distort my 99th or 99.9th percentile calculation forever, until the server restarts. That makes them a fairly useless metric for monitoring purposes. Is there a way to get realtime stats that give latency buckets only between checkpoints? Alternatively, is there a way for me to reset these stats without having to restart the MongoDB server?

Thanks,
Ashu

Wan Bachtiar

May 27, 2018, 10:20:55 PM
to mongodb-user

Is there a way to get realtime stats that give latency buckets only between checkpoints?

Hi Ashu,

You can’t query a latency range directly from the $collStats histogram output.
The histograms are cumulative, bucketed by operation latency.


For example:

  { micros: NumberLong(1), count: NumberLong(10) },
  { micros: NumberLong(2), count: NumberLong(1) },

Indicates:

10 operations taking 1 microsecond or less,
1 operation in the range (1, 2] microseconds.
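To make the bucket semantics concrete, here is a small sketch of how a percentile could be estimated from a histogram shaped like the example above. The `approxPercentile` helper is hypothetical, not part of MongoDB; real histograms have many more buckets with roughly exponentially growing boundaries.

```javascript
// Sample buckets mirroring the example above: each entry's count is the
// number of operations that fell into that bucket's latency range.
const histogram = [
  { micros: 1, count: 10 }, // operations taking 1 microsecond or less
  { micros: 2, count: 1 },  // operations in the range (1, 2] microseconds
];

// Returns the upper bound (in micros) of the bucket that contains the
// requested percentile, i.e. a conservative percentile estimate.
function approxPercentile(buckets, p) {
  const total = buckets.reduce((sum, b) => sum + b.count, 0);
  const target = p * total;
  let cumulative = 0;
  for (const b of buckets) {
    cumulative += b.count;
    if (cumulative >= target) return b.micros;
  }
  return buckets.length ? buckets[buckets.length - 1].micros : 0;
}
```

With the sample data, the 50th percentile lands in the first bucket (1 micro) and the 99th percentile in the second (2 micros), which also illustrates Ashu's point: a single slow operation in the top bucket pins the high percentiles there for the lifetime of the stats.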

Having said that, $collStats is part of the aggregation pipeline, so depending on your use case you can modify the output by adding extra pipeline stages, e.g. removing the highest read value from the histogram array.
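As a rough sketch of that idea, the pipeline below pairs the $collStats stage with an $addFields stage that drops the last (highest) bucket from the reads histogram. The field paths assume the histograms: true output shape; treat this as a starting point rather than a tested recipe.

```javascript
// $collStats followed by a stage that slices off the final reads bucket.
const pipeline = [
  { $collStats: { latencyStats: { histograms: true } } },
  {
    $addFields: {
      "latencyStats.reads.histogram": {
        $slice: [
          "$latencyStats.reads.histogram",
          { $subtract: [{ $size: "$latencyStats.reads.histogram" }, 1] },
        ],
      },
    },
  },
];
// In the mongo shell: db.myCollection.aggregate(pipeline)
```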

Alternatively, if you’re interested in building a time-based (checkpointed) graph, you could poll the server with serverStatus, gathering latency information from the opLatencies section to build up your own time-series data.
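The polling approach can be sketched as follows: take two serverStatus snapshots some interval apart and diff the cumulative counters to get per-interval figures. The sample objects below are stand-ins for the shape of db.serverStatus().opLatencies.reads, where latency is cumulative total latency in microseconds and ops is a cumulative operation count; the helper and the numbers are illustrative.

```javascript
// Diff two cumulative opLatencies samples into per-interval stats.
function intervalStats(prev, curr) {
  const ops = curr.ops - prev.ops;          // operations during the interval
  const latency = curr.latency - prev.latency; // micros spent during the interval
  return {
    ops,
    avgLatencyMicros: ops > 0 ? latency / ops : 0,
  };
}

// Hypothetical snapshots taken one polling interval apart:
const t0 = { latency: 1000, ops: 10 };
const t1 = { latency: 4000, ops: 20 };

const stats = intervalStats(t0, t1); // 10 ops averaging 300 micros each
```

Because the deltas only cover one interval, a single slow query stops distorting the numbers once its interval has passed, which addresses the "lifetime stats" concern from the original question.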

Regards,
Wan.

Ashu Pachauri

May 29, 2018, 2:58:21 AM
to mongodb-user
Thanks, Wan, for the reply. Running a new aggregation to pull metrics sounds like a hack. Ideally, MongoDB should maintain latency histograms in memory with a fixed checkpointing interval and expose them through some sort of API.

I'll look into what the ideal behavior should be and create an upstream feature request and see where it goes.

Thanks,
Ashu