On 11.07.23 15:23, 'Braden Schaeffer' via Prometheus Developers wrote:
> They could live for 5s or 1 hour.
The whole idea of a Prometheus counter doesn't really make sense for a
job that lives for just 5s, if you are scraping every 15s or every
minute or so.
And a job that lives for 1 hour should be scraped directly.
So in the first case, using a counter doesn't make sense, and in the
second case using the Pushgateway doesn't make sense.
> Does it really matter what you send to pushgateway? It supports
> counters so why not push them?
We could be stricter and just reject counters being pushed to the
Pushgateway, but that would be a breaking change. Historically, the
metric type information in Prometheus was (and to a good part still
is) some kind of "weak typing", so no hard restrictions were imposed
(you can apply `rate` to a gauge or `delta` to a counter without
Prometheus complaining about it).
Also, it feels natural to count "records backed up by the daily
database back up job" in a counter and push it to the
Pushgateway. However, when it arrives on your Prometheus server, it
doesn't really behave as a counter. Summing those values up across
instances is really painful with PromQL, and the reason for that is
that we are essentially handling events here, for which Prometheus as
a whole wasn't really designed.
If you really have to use Prometheus for that case, the "least bad"
solutions I know of is statsd with the statsd-exporter (
https://github.com/prometheus/statsd_exporter ) or the
prom-aggregation-gateway
(
https://github.com/zapier/prom-aggregation-gateway ).
A TTL doesn't really address the fundamental problem. It might enable
a very brittle solution that is worse than the solution that are
already available.