Obviously any value which is less than 0.005 is not equal to 1, so this will always return 1 or nothing.
It sounds like what you're trying to do here is:
quantile_over_time(0.90, replication_read_duration_seconds{job="heartbeat-read"}[5m]) < bool .005
which will return 0 or 1.
But I don't think this will solve your problem very well:
Is replication_read_duration_seconds a gauge? How often does it change?
If you want to report that 999 in 1000 requests were below 5ms, then you need at least 1000 samples, and if that's over a 5 minute period (300 seconds) you must be scraping more than 3 times per second. That's not really how Prometheus is supposed to be used.
It sounds like what you really want is to collect these events in a histogram, then report on the histogram. But that means changing how you collect the data in the first place.
As a simple way to think about a histogram, imagine you have two counters:
- A counts the total events
- B counts only the events with latency < 5ms
If you take the increase of B over 5 minutes, divided by the increase in A over 5 minutes, that gives you the fraction you're looking for.
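
With a real histogram, A and B correspond to the _count series and one of the _bucket series. The query below is only a sketch: it assumes you have replaced the gauge with a histogram of the same name, and that you configured a bucket boundary at exactly 0.005 seconds. Neither of those exists in your current setup.

```
# Fraction of events over the last 5 minutes with latency <= 5ms.
# Assumption: replication_read_duration_seconds is now a histogram
# with a bucket boundary at 0.005 (hypothetical, not your current gauge).
  sum(rate(replication_read_duration_seconds_bucket{job="heartbeat-read", le="0.005"}[5m]))
/
  sum(rate(replication_read_duration_seconds_count{job="heartbeat-read"}[5m]))
```

Note that histogram buckets count events less than or equal to the boundary, not strictly less than. Using rate() instead of increase() makes no difference to the ratio. If you want a 0 or 1 result, you can append something like >= bool 0.999 to the whole expression.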