Maximum and Minimum Request Duration on Prometheus Classic Histograms

tejaswini vadlamudi

Jun 18, 2025, 9:23:54 AM
to Prometheus Users

Hi,

I’m using Prometheus to monitor request durations via a histogram metric, e.g., http_request_duration_seconds_bucket. I would like to query:

  • The minimum time taken by a request
  • The maximum time taken by a request

…over a given time range (say, the last 1h or 24h).

I understand that histogram buckets give cumulative counts of requests below certain durations, but I’m not sure how to extract the actual min or max values of request durations during a time window.

Is this possible directly via PromQL? Or is there a recommended workaround (e.g., recording rules, external processing, or using histogram_quantile() in a specific way)?

Thanks in advance for any guidance!

Br,
Teja

tejaswini vadlamudi

Jun 18, 2025, 1:17:15 PM
to Prometheus Users
Including an answer from Gen-AI:

| Description                         | PromQL Query                                                                                                     | Notes                                                                                           |
|-------------------------------------|------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------|
| Minimum request duration (1m)       | histogram_quantile(0, sum by (le) (rate(http_request_duration_seconds_bucket[1m])))                             | Fast but may be noisy or return NaN if low traffic. Good for near-real-time.                   |
| Maximum request duration (1m)       | histogram_quantile(1, sum by (le) (rate(http_request_duration_seconds_bucket[1m])))                             | Same as above, for longest duration estimate.                                                   |
| Minimum request duration (5m)       | histogram_quantile(0, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))                             | More stable, smoother estimate over a slightly longer window.                                   |
| Maximum request duration (5m)       | histogram_quantile(1, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))                             | Recommended when traffic is bursty or histogram series are sparse.                             |

Could you confirm whether the above answer is reliable?

Brian Candler

Jun 19, 2025, 12:38:59 PM
to Prometheus Users
In general, I don't think you can get an accurate answer to that question from a histogram.

You can work out which *bucket* the lowest and highest request durations sat in, which means you could give the lower and upper bounds of the minimum, and the lower and upper bounds of the maximum. Just compare the bucket counters at the start and end of the time range, and find the lowest boundary (le) which has changed, and the highest boundary which has changed. But this still doesn't tell you what the *actual* value was.  
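
For example (just a sketch, reusing the metric name from your question and an arbitrary 1h window), you can read those bounds off a single query:

sum by (le) (increase(http_request_duration_seconds_bucket[1h]))

The lowest "le" with a non-zero increase is an upper bound on the minimum (the boundary just below it is a lower bound), and the lowest "le" whose increase already equals that of the "+Inf" bucket is an upper bound on the maximum.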

I don't think there's any point in trying to make an estimate of the actual value; these values are, by definition, outliers, so even if your data points fitted a nice distribution, these ones would be at the ends of the curve and subject to high error.

Your LLM answer is essentially what it says in the documentation for histogram_quantile:

You can use histogram_quantile(0, v instant-vector) to get the estimated minimum value stored in a histogram.

You can use histogram_quantile(1, v instant-vector) to get the estimated maximum value stored in a histogram.

I thought it was worth testing. Here is a metric from my home prometheus server, running 2.53.4:

go_gc_pauses_seconds_bucket
=>
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="6.399999999999999e-08"} 0
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="6.399999999999999e-07"} 0
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="7.167999999999999e-06"} 12193
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="8.191999999999999e-05"} 15369
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="0.0009175039999999999"} 27038
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="0.010485759999999998"} 27085
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="0.11744051199999998"} 27086
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="+Inf"} 27086

go_gc_pauses_seconds_bucket - go_gc_pauses_seconds_bucket offset 10m
=>
{instance="localhost:9090", job="prometheus", le="6.399999999999999e-08"} 0
{instance="localhost:9090", job="prometheus", le="6.399999999999999e-07"} 0
{instance="localhost:9090", job="prometheus", le="7.167999999999999e-06"} 5
{instance="localhost:9090", job="prometheus", le="8.191999999999999e-05"} 5
{instance="localhost:9090", job="prometheus", le="0.0009175039999999999"} 10
{instance="localhost:9090", job="prometheus", le="0.010485759999999998"} 10
{instance="localhost:9090", job="prometheus", le="0.11744051199999998"} 10
{instance="localhost:9090", job="prometheus", le="+Inf"} 10

rate(go_gc_pauses_seconds_bucket[10m])
=>
{instance="localhost:9090", job="prometheus", le="6.399999999999999e-08"} 0
{instance="localhost:9090", job="prometheus", le="6.399999999999999e-07"} 0
{instance="localhost:9090", job="prometheus", le="7.167999999999999e-06"} 0.007407407407407408
{instance="localhost:9090", job="prometheus", le="8.191999999999999e-05"} 0.007407407407407408
{instance="localhost:9090", job="prometheus", le="0.0009175039999999999"} 0.014814814814814815
{instance="localhost:9090", job="prometheus", le="0.010485759999999998"} 0.014814814814814815
{instance="localhost:9090", job="prometheus", le="0.11744051199999998"} 0.014814814814814815
{instance="localhost:9090", job="prometheus", le="+Inf"} 0.014814814814814815

Those exponential bucket boundaries in scientific notation aren't very readable, but you can see that:
* the lowest response time must have been somewhere between 6.399999999999999e-07 and 7.167999999999999e-06
* the highest response time must have been somewhere between 8.191999999999999e-05 and 0.0009175039999999999
 
Here are the answers from the formula the LLM suggested:

histogram_quantile(0, rate(go_gc_pauses_seconds_bucket[10m]))
=>
{instance="localhost:9090", job="prometheus"} NaN

histogram_quantile(1, rate(go_gc_pauses_seconds_bucket[10m]))
=>
{instance="localhost:9090", job="prometheus"} 0.0009175039999999999

An answer of "NaN" for the lower boundary is not useful at all (possibly this is a bug?), but I found I could get a value by specifying a very low, but non-zero, quantile:

histogram_quantile(0.000000001, rate(go_gc_pauses_seconds_bucket[10m]))
=>
{instance="localhost:9090", job="prometheus"} 6.40000013056e-07

Those values *do* sit between the boundaries given:

>>> 6.399999999999999e-07 < 6.40000013056e-07 <= 7.167999999999999e-06
True
>>> 8.191999999999999e-05 < 0.0009175039999999999 <= 0.0009175039999999999
True

In fact, the "minimum" answer is very close to the lower edge of the relevant bucket, and the "maximum" is the upper edge of the relevant bucket.

Therefore, these are not the *actual* minimum and maximum request times. In effect, they are saying "the minimum request time was more than 6.399999999999999e-07, and the maximum request time was no more than 0.0009175039999999999".  But that's as good as you can get with a histogram.

tejaswini vadlamudi

Jun 23, 2025, 3:15:42 AM
to Prometheus Users
Thanks Brian for the clear heads-up and explanation!

It looks to me like there is no way to obtain exact maximum and minimum duration values from Prometheus histograms :-(

However, for exploratory data analysis on the application software, we need summary statistics such as minimum and maximum values. Legacy monitoring systems have always provided this, so there is an expectation that newer technology covers the same use case for backward compatibility.

Please share what can be done in this regard to obtain this information.

I'm thinking out loud, please correct/add wherever possible:

1. Does changing from Prometheus to OTEL instrumentation provide this feature (exact max and min duration time)?
2. Can metrics derived from distributed traces (instrumented with OTEL/Jaeger) be used to obtain minimum and maximum request durations?
3. Is it possible to obtain the max and min duration with Prometheus through any workaround?
      a. For Classic Histograms?
      b. For Native Histograms?
4. Would a new PR/contribution to Prometheus to offer this support be an option?

Thanks,
Teja

Brian Candler

Jun 23, 2025, 7:42:35 AM
to Prometheus Users
Remember that histograms don't store values. All they do is increment a counter by 1; the value is only used to select which bucket to increment.  This means that the amount of storage used by a histogram is very small - a fixed number of buckets with one counter each. It doesn't matter if you are processing 1 sample per second or 10,000 samples per second.

If you wanted to retrieve the *exact* lowest or highest value, over *any* arbitrary time period that you query, you would have to store every single value in a database. Prometheus is not an event logging system, and it will never work this way. A columnar datastore like Clickhouse can do that quite well, but if the number of samples is large, you will still have a very large storage requirement.

More realistically, you could find the minimum or maximum value seen over a fixed time period (say one minute), and at the end of that minute, export the min/max value seen. That's cheap and quick. Indeed, you could do it over a relatively short time period (e.g. 1 second), and use Prometheus' min_over_time/max_over_time functions if you want to query a longer period, i.e. to find the min of the mins, or the max of the maxes. You need to make sure that every distinct min/max value ends up in the database though; either use remote_write to push them, or scrape your exporter at least twice as fast as the min/max values are changing.
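
As a sketch of what the query side could then look like (the gauge names here are made up; they assume your exporter publishes the per-minute minimum and maximum as gauges):

min_over_time(request_duration_min_seconds[24h])
max_over_time(request_duration_max_seconds[24h])

That gives the min of the per-minute mins and the max of the per-minute maxes over the last 24 hours.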

In my experience, people are often not so interested in the single minimum or maximum value, but in the quantiles, such as the 1st percentile ("the fastest 1% of queries were answered in less than X seconds") or the 99th percentile ("the slowest 1% of queries were answered in more than Y seconds"). Prometheus can help you using a data type called a "summary":

A summary can give you very good estimates of the percentiles over a sliding time window (of a size you have to choose in advance), and uses a relatively small amount of storage like a histogram. It is better than a histogram in the case where you don't know in advance what the highest and lowest values are likely to be (i.e. you don't need to pre-allocate your bucket boundaries correctly).
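
For example, assuming a summary named http_request_duration_seconds instrumented with 0.01 and 0.99 objectives, the quantiles are exported as ordinary series and can be selected directly:

http_request_duration_seconds{quantile="0.01"}
http_request_duration_seconds{quantile="0.99"}

One caveat: unlike histogram buckets, these pre-computed quantiles cannot be meaningfully aggregated across instances.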

tejaswini vadlamudi

Jun 23, 2025, 9:31:34 AM
to Prometheus Users
Thanks Brian. I don't want to go towards Summaries, but with histograms, mainly Native Histograms, is there a possibility to get Max and Min values for a period of time?

With OTEL-based metrics instrumentation, it is possible to record max and min values. See https://opentelemetry.io/docs/specs/otel/metrics/data-model/#histogram

Histograms consist of the following:

  • An Aggregation Temporality of delta or cumulative.
  • A set of data points, each containing:
    • An independent set of Attribute name-value pairs.
    • A time window (of (start, end]) time for which the Histogram was bundled.
      • The time interval is inclusive of the end time.
      • Time values are specified as nanoseconds since the UNIX Epoch (00:00:00 UTC on 1 January 1970).
    • A count (count) of the total population of points in the histogram.
    • A sum (sum) of all the values in the histogram.
    • (optional) The min (min) of all values in the histogram.
    • (optional) The max (max) of all values in the histogram.

Br,
Teja

tejaswini vadlamudi

Jun 23, 2025, 10:40:33 AM
to Prometheus Users
If I instrument a native histogram using the Prometheus client libraries, I see the output below for a GET on /metrics:

# HELP test_http_request_duration_seconds HTTP latency distribution for /users (random delay, occasional error)
# TYPE test_http_request_duration_seconds histogram
test_http_request_duration_seconds_bucket{endpoint="/users",method="GET",status_code="200",le="+Inf"} 13
test_http_request_duration_seconds_sum{endpoint="/users",method="GET",status_code="200"} 0.5860522520000001
test_http_request_duration_seconds_count{endpoint="/users",method="GET",status_code="200"} 13
test_http_request_duration_seconds_bucket{endpoint="/users",method="GET",status_code="500",le="+Inf"} 4
test_http_request_duration_seconds_sum{endpoint="/users",method="GET",status_code="500"} 0.20080665900000003
test_http_request_duration_seconds_count{endpoint="/users",method="GET",status_code="500"} 4

If I query Prometheus, the result is:

{
  "status": "success",
  "data": {
    "result": [
      {
        "metric": {
          "__name__": "test_http_request_duration_seconds",
          "endpoint": "/users",
          "method": "GET",
          "status_code": "200"
        },
        "histogram": [
          1750686407.931,
          {
            "count": "7",
            "sum": "0.35453753899999996",
            "buckets": [
              [0, "0.005065779510355506", "0.005524271728019902", "1"],
              [0, "0.014328188175072986", "0.015625", "1"],
              [0, "0.03716272234383503", "0.04052623608284405", "1"],
              [0, "0.0625", "0.0681567332915786", "2"],
              [0, "0.0810524721656881", "0.08838834764831843", "2"]
            ]
          }
        ]
      },
      {
        "metric": {
          "status_code": "500"
        },
        "histogram": [
          1750686407.931,
          {
            "count": "2",
            "sum": "0.127420411",
            "buckets": [
              [0, "0.057312752700291944", "0.0625", "1"],
              [0, "0.0681567332915786", "0.07432544468767006", "1"]
            ]
          }
        ]
      }
    ]
  }
}

I think histogram_quantile(0, rate(http_request_duration_seconds[1m])) will give me the min value. Is that correct?

Q2. But with OTEL, even the min & max values are encoded. I don't understand how to get such support in the Prometheus native format.

Brian Candler

Jun 23, 2025, 5:13:56 PM
to Prometheus Users
On Monday, 23 June 2025 at 14:31:34 UTC+1 tejaswini vadlamudi wrote:
but with histograms, mainly with Native Histograms, is there a possibility to get Max and Min values for a period of time?

No.  If that's not clear from my explanations so far, then I have not done a very good job.
 

With OTEL-based metrics instrumentation, it is possible to record max and min values. See https://opentelemetry.io/docs/specs/otel/metrics/data-model/#histogram


Please read the link I posted before, especially the last comment:

The OTEL histogram's way of reporting max/min values is not compatible with Prometheus. Prometheus histograms are cumulative: that is, they never reset. They are just counters which keep incrementing forever. If they had a min/max then it would be the min/max over all time, which is not useful.

You can create a separate timeseries for min/max, but it would not be part of the histogram. It would either have to be min/max over successive timeslots (e.g. 1 minute intervals), or min/max over a sliding window (which is expensive, as you'd have to buffer all the samples for that time window).

> I think  histogram_quantile(0, rate(http_request_duration_seconds[1m])) will give me min value. Is it correct?

No, it's not. Again, it seems I have not been explaining this well.

Let's say your histogram buckets are:

0-10ms (le="0.01")
10-20ms (le="0.02")
20-30ms (le="0.03")
30-40ms (le="0.04")
40-50ms (le="0.05")
50ms+ (le="+Inf")

A sample comes in with request time of 12.3ms, and this is the fastest over the time period of interest.

All this does is add 1 to the 10-20ms bucket (counter) in the histogram. The actual value of 12.3ms is LOST.  The only thing you will know from looking at the histogram is that the 10-20ms bucket has incremented, but the 0-10ms bucket has not, and therefore the minimum value is somewhere between 10ms and 20ms.  If you use the histogram_quantile(0...) formula you showed, the answer will be "0.01", which I also explained previously with a worked example. This tells you "the minimum value is at least 10ms" - and implicitly not more than 20ms, as that's where the next bucket starts.

I think I had better hand over to someone else now.

Brian Candler

Jun 23, 2025, 5:25:38 PM
to Prometheus Users
See also https://prometheus.io/docs/specs/native_histograms/#opentelemetry-interoperability

I note that native histograms have exponentially sized buckets. In principle, if you set the growth factor appropriately (e.g. a factor of 1.1, i.e. each bucket is 10% wider than the previous one) then the answers you get for min and max should be within 10% of the actual answer. But you won't get the *exact* minimum or maximum, which is what you were asking for.
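
As a sketch, reusing the test_http_request_duration_seconds native histogram from earlier in this thread, the same queries apply (including the tiny-but-non-zero quantile workaround for the minimum); the answers are still bucket boundaries, just from much narrower buckets:

histogram_quantile(1, rate(test_http_request_duration_seconds[10m]))
histogram_quantile(0.000000001, rate(test_http_request_duration_seconds[10m]))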

tejaswini vadlamudi

Jun 25, 2025, 3:55:54 AM
to Prometheus Users
Hi Brian,

Can you please correct the following summary of my understanding?

1. With Prometheus as the metrics storage and PromQL provider, the exact Max and Min values of a histogram can't be fetched as of now. This is not possible with Prometheus Classic Histograms, Prometheus Native Histograms, or OTEL-instrumented metrics collected via the OTEL Collector into Prometheus (as the metrics store/backend and PromQL provider).
2. Especially with Classic Histograms, Min/Max are not provided in a standard way. If these values are needed, separate gauges are required to capture them. Even such an implementation is not ideal when the requests involve high cardinality and many dimensions.
3. Event-based logs and traces may give such exact information compared to Prometheus Histograms.
4. With cumulative histograms, it is possible to get Max & Min values for a specific duration.
5. With delta histograms, it would be possible to obtain these values. But these are not supported by Prometheus and OTEL today.
 
Thanks,
Teja

Brian Candler

Jun 26, 2025, 3:01:08 AM
to Prometheus Users
On Wednesday, 25 June 2025 at 08:55:54 UTC+1 tejaswini vadlamudi wrote:
4. With cumulative histograms, it is possible to get Max & Min values for a specific duration.

 With cumulative histograms, it is *not* possible to get Max & Min values for a specific time period.