95th percentile on interface rate


harri...@gmail.com

Jan 15, 2016, 12:15:25 PM
to Prometheus Developers
Hello.

I'm new to Prometheus and have been evaluating it for a few weeks, and we are already getting some great results. It's a really good tool. I'm impressed, thank you!

Hopefully this isn't a stupid question, but here goes: how can I calculate the 95th percentile of a single gauge or counter metric, such as response times or interface rates, over a period of time?

Reading through the docs, the histogram_quantile function seems to require a histogram metric type. Is it possible to calculate a percentile from a rate derived from a counter, e.g. rate(ifHCInOctets{ifDescr="Ethernet3/27",instance="switch",job="core_interfaces"}[10m]) over, say, a one-week period?

The work required of Prometheus seems fairly simple (unless I'm misunderstanding something): it just needs to sort the data for the specified time period, discard the top 5%, and return the largest value from the remaining 95%. I guess I'm just approaching this all wrong.
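
The calculation described above is usually called the nearest-rank percentile. A minimal sketch in Python, assuming the rates have already been exported as a plain list of evenly spaced samples; the function name and the sample data are hypothetical and not part of Prometheus:

```python
import math


def percentile_nearest_rank(values, pct):
    """Return the pct-th percentile of values using the nearest-rank method."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    # Smallest rank such that at least pct percent of samples are at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical interface rates, one per sampling interval: 1.0 .. 100.0
rates = [float(i) for i in range(1, 101)]
p95 = percentile_nearest_rank(rates, 95)
print(p95)
```

With 100 samples the top five are discarded and the 95th value in sorted order is returned. This treats every sample as equally weighted, which only holds if the samples are evenly spaced in time.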

Anyway. Thanks for any assistance you can offer.

Brian Brazil

Jan 15, 2016, 1:02:34 PM
to harri...@gmail.com, Prometheus Developers
On 15 January 2016 at 17:15, <harri...@gmail.com> wrote:
Hello.

I'm new to Prometheus and have been evaluating it for a few weeks, and we are already getting some great results. It's a really good tool. I'm impressed, thank you!

Hopefully this isn't a stupid question, but here goes: how can I calculate the 95th percentile of a single gauge or counter metric, such as response times or interface rates, over a period of time?

Reading through the docs, the histogram_quantile function seems to require a histogram metric type. Is it possible to calculate a percentile from a rate derived from a counter, e.g. rate(ifHCInOctets{ifDescr="Ethernet3/27",instance="switch",job="core_interfaces"}[10m]) over, say, a one-week period?

This is not currently supported for something like bandwidth usage, but may be added in the future. 

For event-based metrics such as response times, a Histogram metric combined with histogram_quantile will do what you need.
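
To illustrate how a quantile can be estimated from a histogram, here is a simplified Python sketch that assumes cumulative bucket counts and linear interpolation inside the bucket containing the target rank. The function name, the bucket layout, and the numbers are hypothetical; this is not Prometheus's actual implementation.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets is a list of (upper_bound, cumulative_count) pairs sorted by
    upper_bound, with the last upper bound being +infinity.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total <= 0:
        raise ValueError("histogram has no observations")
    rank = q * total
    prev_upper, prev_count = 0.0, 0.0
    for upper, count in buckets:
        if count >= rank:
            if math.isinf(upper):
                # No finite upper bound to interpolate towards.
                return prev_upper
            # Assume observations are spread evenly within the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_upper + (upper - prev_upper) * fraction
        prev_upper, prev_count = upper, count
    raise ValueError("no bucket reached the target rank")


# Hypothetical response-time histogram, upper bounds in seconds.
buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
p95 = estimate_quantile(0.95, buckets)
print(p95)
```

The accuracy of such an estimate depends on how closely the bucket boundaries bracket the quantile of interest.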
 
The work required of Prometheus seems fairly simple (unless I'm misunderstanding something): it just needs to sort the data for the specified time period, discard the top 5%, and return the largest value from the remaining 95%. I guess I'm just approaching this all wrong.

It's a tad more complicated than that, as the samples may be unevenly spaced and need weighting. There's also the question of what size time buckets you want (the 95th percentile over 1m buckets is not the same as over 1h buckets).
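
The bucket-size point can be seen with a small Python sketch on made-up data: a link that idles at 10 Mbit/s and bursts to 1000 Mbit/s for the first six minutes of every hour. All names and figures are hypothetical, and the samples are assumed to be evenly spaced, so the weighting issue is not covered here.

```python
import math
from statistics import mean


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile over equally weighted samples."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def rebucket(samples, size):
    """Average consecutive groups of `size` samples into coarser buckets."""
    return [mean(samples[i:i + size]) for i in range(0, len(samples), size)]


# 24 hours of per-minute rates in Mbit/s (hypothetical data).
per_minute = [1000.0 if minute % 60 < 6 else 10.0 for minute in range(24 * 60)]
per_hour = rebucket(per_minute, 60)

p95_1m = percentile_nearest_rank(per_minute, 95)
p95_1h = percentile_nearest_rank(per_hour, 95)
print(p95_1m, p95_1h)
```

With 1m buckets the bursts occupy 10% of the samples, so the 95th percentile lands on the burst rate. With 1h buckets every bucket averages the burst away, and the 95th percentile is much lower.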

Brian
 



Kirk Harris

Jan 15, 2016, 1:15:24 PM
to Brian Brazil, Prometheus Developers

Thanks for such a fast response!
