In general, I don't think you can get an accurate answer to that question from a histogram.
You can work out which *bucket* the lowest and highest request durations fell into, which gives you lower and upper bounds for the minimum, and lower and upper bounds for the maximum. Compare the bucket counters at the start and end of the time range: the minimum is in the bucket with the lowest boundary (le) whose counter has increased, and, because the counters are cumulative, the maximum is in the bucket with the lowest boundary whose increase matches that of the +Inf bucket. But this still doesn't tell you what the *actual* value was.
I don't think there's any point in trying to make an estimate of the actual value; these values are, by definition, outliers, so even if your data points fitted a nice distribution, these ones would be at the ends of the curve and subject to high error.
Your LLM answer is essentially what it says in the documentation for histogram_quantile:
You can use histogram_quantile(0, v instant-vector) to get the estimated minimum value stored in a histogram.
You can use histogram_quantile(1, v instant-vector) to get the estimated maximum value stored in a histogram.
I thought it was worth testing. Here is a metric from my home Prometheus server, running 2.53.4:
go_gc_pauses_seconds_bucket
=>
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="6.399999999999999e-08"} 0
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="6.399999999999999e-07"} 0
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="7.167999999999999e-06"} 12193
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="8.191999999999999e-05"} 15369
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="0.0009175039999999999"} 27038
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="0.010485759999999998"} 27085
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="0.11744051199999998"} 27086
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="+Inf"} 27086
go_gc_pauses_seconds_bucket - go_gc_pauses_seconds_bucket offset 10m
=>
{instance="localhost:9090", job="prometheus", le="6.399999999999999e-08"} 0
{instance="localhost:9090", job="prometheus", le="6.399999999999999e-07"} 0
{instance="localhost:9090", job="prometheus", le="7.167999999999999e-06"} 5
{instance="localhost:9090", job="prometheus", le="8.191999999999999e-05"} 5
{instance="localhost:9090", job="prometheus", le="0.0009175039999999999"} 10
{instance="localhost:9090", job="prometheus", le="0.010485759999999998"} 10
{instance="localhost:9090", job="prometheus", le="0.11744051199999998"} 10
{instance="localhost:9090", job="prometheus", le="+Inf"} 10
rate(go_gc_pauses_seconds_bucket[10m])
=>
{instance="localhost:9090", job="prometheus", le="6.399999999999999e-08"} 0
{instance="localhost:9090", job="prometheus", le="6.399999999999999e-07"} 0
{instance="localhost:9090", job="prometheus", le="7.167999999999999e-06"} 0.007407407407407408
{instance="localhost:9090", job="prometheus", le="8.191999999999999e-05"} 0.007407407407407408
{instance="localhost:9090", job="prometheus", le="0.0009175039999999999"} 0.014814814814814815
{instance="localhost:9090", job="prometheus", le="0.010485759999999998"} 0.014814814814814815
{instance="localhost:9090", job="prometheus", le="0.11744051199999998"} 0.014814814814814815
{instance="localhost:9090", job="prometheus", le="+Inf"} 0.014814814814814815
Those exponential bucket boundaries in scientific notation aren't very readable, but you can see that:
* the lowest value (here a GC pause, standing in for your response time) must have been somewhere between 6.399999999999999e-07 and 7.167999999999999e-06
* the highest value must have been somewhere between 8.191999999999999e-05 and 0.0009175039999999999
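If you wanted to do that comparison programmatically rather than by eye, here is a rough Python sketch. The `deltas` list is copied from the 10-minute subtraction above; the function name `min_max_bounds` is just something I made up for illustration, and using minus infinity as the lower edge of the first bucket is my own assumption (for durations you could equally use 0).

```python
import math

# Increase of each cumulative bucket over the window, as (le, increase) pairs,
# copied from the "offset 10m" subtraction above.
deltas = [
    (6.399999999999999e-08, 0),
    (6.399999999999999e-07, 0),
    (7.167999999999999e-06, 5),
    (8.191999999999999e-05, 5),
    (0.0009175039999999999, 10),
    (0.010485759999999998, 10),
    (0.11744051199999998, 10),
    (math.inf, 10),
]

def min_max_bounds(deltas):
    """Return ((min_lo, min_hi), (max_lo, max_hi)), or None if nothing was observed.

    Each pair means lo < value <= hi. Hypothetical helper, not a Prometheus API.
    """
    deltas = sorted(deltas)
    total = deltas[-1][1]  # the +Inf bucket counts every observation
    if total <= 0:
        return None
    uppers = [le for le, _ in deltas]
    # Assumption: the first bucket has no known lower edge.
    lowers = [-math.inf] + uppers[:-1]
    # Minimum: first bucket whose cumulative increase is non-zero.
    i_min = next(i for i, (_, d) in enumerate(deltas) if d > 0)
    # Maximum: first bucket whose cumulative increase already equals the total.
    i_max = next(i for i, (_, d) in enumerate(deltas) if d >= total)
    return (lowers[i_min], uppers[i_min]), (lowers[i_max], uppers[i_max])

min_bounds, max_bounds = min_max_bounds(deltas)
print("min is in", min_bounds)
print("max is in", max_bounds)
```

With the numbers above, this should pick out the same two buckets as the bullet points.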
Here are the answers from the formula the LLM suggested:
histogram_quantile(0, rate(go_gc_pauses_seconds_bucket[10m]))
=>
{instance="localhost:9090", job="prometheus"} NaN
histogram_quantile(1, rate(go_gc_pauses_seconds_bucket[10m]))
=>
{instance="localhost:9090", job="prometheus"} 0.0009175039999999999
The "NaN" result for the minimum is not useful at all (possibly this is a bug?), but I found I could get a value by specifying a very low, but non-zero, quantile:
histogram_quantile(0.000000001, rate(go_gc_pauses_seconds_bucket[10m]))
=>
{instance="localhost:9090", job="prometheus"} 6.40000013056e-07
Those values *do* sit between the boundaries given:
>>> 6.399999999999999e-07 < 6.40000013056e-07 <= 7.167999999999999e-06
True
>>> 8.191999999999999e-05 < 0.0009175039999999999 <= 0.0009175039999999999
True
In fact, the "minimum" answer is very close to the lower edge of the relevant bucket, and the "maximum" is the upper edge of the relevant bucket.
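That pattern is what you'd expect from linear interpolation within a bucket, which is what the histogram_quantile documentation says it does. Here is a simplified Python sketch of that idea, fed with the rate() output above. To be clear, this is my own approximation, not the Prometheus source: the function name `interpolated_quantile` is made up, and treating the first bucket as starting at 0 and returning NaN for an empty bucket are assumptions on my part.

```python
import math

# (le, rate) pairs copied from the rate(...[10m]) output above.
rates = [
    (6.399999999999999e-08, 0.0),
    (6.399999999999999e-07, 0.0),
    (7.167999999999999e-06, 0.007407407407407408),
    (8.191999999999999e-05, 0.007407407407407408),
    (0.0009175039999999999, 0.014814814814814815),
    (0.010485759999999998, 0.014814814814814815),
    (0.11744051199999998, 0.014814814814814815),
    (math.inf, 0.014814814814814815),
]

def interpolated_quantile(q, buckets):
    """Estimate quantile q from cumulative buckets sorted by le, ending in +Inf.

    Hypothetical helper: a sketch of per-bucket linear interpolation only.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # First finite bucket whose cumulative value reaches the rank.
    b = next((i for i, (_, c) in enumerate(buckets[:-1]) if c >= rank),
             len(buckets) - 1)
    if b == len(buckets) - 1:
        # Rank falls in the +Inf bucket: report the highest finite boundary.
        return buckets[-2][0]
    # Assumption: the first bucket starts at 0.
    start, prev = (0.0, 0.0) if b == 0 else buckets[b - 1]
    end = buckets[b][0]
    count = buckets[b][1] - prev  # observations in this bucket alone
    rank -= prev
    if count == 0:
        return math.nan  # 0/0: nothing in this bucket to interpolate over
    return start + (end - start) * (rank / count)

for q in (0, 0.000000001, 1):
    print(q, interpolated_quantile(q, rates))
```

In this sketch, quantile 0 lands in the first (empty) bucket and has nothing to interpolate over, so it comes out as NaN; the tiny non-zero quantile lands a hair above the lower edge of the first non-empty bucket; and quantile 1 lands on the upper edge of the last bucket that saw any observations. That may be why Prometheus returns NaN for quantile 0 here, but I haven't checked the source to confirm it.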
Therefore, these are not the *actual* minimum and maximum request times. In effect, they are saying "the minimum request time was more than 6.399999999999999e-07, and the maximum request time was no more than 0.0009175039999999999". But that's as good as you can get with a histogram.