Prometheus query with conditional operation

Debashish Ghosh

Feb 27, 2020, 3:05:31 PM
to Prometheus Users
Hi,
    I have a tricky problem: I am trying to get the percentage of all the messages flowing through our system that take more than 1 second, over a span of 30 days.

I have a volumeCounter that gives the total volume of messages, so I believe the total number of messages will be increase(volumeCounter[30d]).

I have another counter, latencyCounter, that adds up the latency of each message. So to get the latency per message I can use increase(latencyCounter[30d])/increase(volumeCounter[30d]).

Now in the time series generated there are some data points where the value is >1 second. I want to get the percentage of those out of the overall number of data points.

So for example, if in the last 30 days I sent 10000 messages, of which 1000 took more than 1 second, the result of the query at that point should be 10.

Is there a way of achieving this in Prometheus?

Thanks
Debashish

Brian Candler

Feb 27, 2020, 3:40:07 PM
to Prometheus Users
It sounds like you need a histogram:
https://prometheus.io/docs/concepts/metric_types/#histogram

What this means is you generate separate buckets for different latencies (maybe: <0.1s, <0.2s, <0.5s, <1s, <2s, other) and increment the counts for each message.  Then answering questions like "out of all the events, what proportion took more than 1 second?" is straightforward.
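
To illustrate the arithmetic, here is a minimal sketch in plain Python (no Prometheus client involved) of how cumulative buckets answer the "more than 1 second" question. The bucket bounds and the helper names observe and percent_slower_than are made up for the example. With a real histogram, assuming a hypothetical metric named message_latency_seconds with a bucket boundary at 1 second, the equivalent query would be roughly 100 * (1 - sum(increase(message_latency_seconds_bucket{le="1"}[30d])) / sum(increase(message_latency_seconds_count[30d]))); the exact le label value (for example "1" or "1.0") depends on the client library.

```python
from bisect import bisect_left

# Hypothetical bucket upper bounds in seconds, mirroring Prometheus "le" labels.
# The last bound (+Inf) counts every observation.
BOUNDS = [0.1, 0.2, 0.5, 1.0, 2.0, float("inf")]


def observe(counts, latency_seconds):
    """Increment every cumulative bucket whose upper bound is >= the latency."""
    first = bisect_left(BOUNDS, latency_seconds)
    for i in range(first, len(BOUNDS)):
        counts[i] += 1


def percent_slower_than(counts, threshold):
    """Percentage of observations above `threshold`, which must be a bucket bound."""
    total = counts[-1]  # the +Inf bucket holds the total count
    if total == 0:
        return 0.0
    within = counts[BOUNDS.index(threshold)]
    return 100.0 * (total - within) / total


# The example from the question: 10000 messages, 1000 of them slower than 1 second.
counts = [0] * len(BOUNDS)
for _ in range(9000):
    observe(counts, 0.3)
for _ in range(1000):
    observe(counts, 1.5)

print(percent_slower_than(counts, 1.0))  # prints 10.0
```

The threshold has to coincide with a bucket boundary for the answer to be exact, so it is worth choosing the buckets with the 1 second cutoff in mind.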