Pushgateway: allow for optional delete of metric upon scraping


dmitry b

Oct 31, 2020, 1:01:48 AM
to Prometheus Users

I have read the discussion in #19 and many others on the topic of TTL in Pushgateway.
This proposal is slightly different: [optionally] remove the metric once it has been read (probably by Prometheus).
My use case, and seemingly many other people's, calls for it.
As it is, I have to jump through hoops to work around the issue of a metric still showing as current in Prometheus long after it was pushed to PG.

Copying the response from beorn7 on GitHub below, preceded by my response to it:
  1. I am NOT doing event processing.
  2. We manage a large Hadoop cluster (6000 nodes) and Hadoop services are being monitored by PUSHing metrics to a different system.
  3. We want to send these metrics to Prometheus
  4. These metrics are NOT in Prometheus format, so they cannot be scraped, AND we are not set up for such scraping anyway; we are set up for PUSH.
  5. I would rather not enter a discussion about merits of Push vs Pull and how a system could be re-architected from scratch.
  6. Pushgateway satisfies the use case, however a metric needs to be scraped once and deleted, hence the proposal.
  7. Understood the point about HA, that's why I propose OPTIONAL removal - let the user decide. 
beorn7 commented 16 hours ago

It's a fundamental property of Prometheus metrics endpoints that a scrape doesn't change the state of the metrics.

For example, in a Prometheus HA setup, two identically configured Prometheus servers scrape the same metrics endpoint. The one server in such an HA pair would trigger the removal of the metric in your proposal, and the other server would then not see it anymore.

If you need a "delete after scraping once" behavior, that's a strong hint that you are actually doing event processing, for which Prometheus is not the right choice.

If you have more questions about your use case and how to implement (or perhaps better not to implement) it with Prometheus, I recommend the prometheus-users mailing list.

Brian Candler

Oct 31, 2020, 5:18:48 AM
to Prometheus Users
On Saturday, 31 October 2020 05:01:48 UTC, dmitry b wrote:
This proposal is slightly different: [optionally] remove the metric when it's been read (probably by Prometheus)

I am sure that won't happen, as it completely breaks the Prometheus data collection model. For example: an HA Prometheus setup has two Prometheus servers scraping the same endpoint. A development environment running on a laptop may scrape the same endpoints as your production environment. One system's scrape should not interfere with another system's scrape.
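
To make the HA point concrete, here is a minimal sketch using a toy in-memory endpoint. The class names and the metric name job_x_exit_code are made up for illustration; this is not Pushgateway code, only a model of the two behaviours being compared.

```python
class ReadOnlyEndpoint:
    """Toy endpoint where a scrape never changes state (the Prometheus model)."""

    def __init__(self):
        self.metrics = {}

    def push(self, name, value):
        self.metrics[name] = value

    def scrape(self):
        # Return a copy so callers cannot mutate the stored state.
        return dict(self.metrics)


class DeleteOnScrapeEndpoint(ReadOnlyEndpoint):
    """Hypothetical behaviour from the proposal: a scrape empties the store."""

    def scrape(self):
        snapshot = dict(self.metrics)
        self.metrics.clear()
        return snapshot


def scrape_with_two_servers(endpoint):
    """Push one value, then let two independent scrapers read it in turn."""
    endpoint.push("job_x_exit_code", 0)
    first = endpoint.scrape()
    second = endpoint.scrape()
    return first, second
```

With ReadOnlyEndpoint both scrapers get the same sample. With DeleteOnScrapeEndpoint the second scraper gets an empty result, so which server holds the data depends only on scrape timing.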
 
My use case and seemingly many other people's call for it.
As it is, I have to jump through hoops to work around the issue of a metric still showing as current in Prometheus long after it was pushed to PG.

The metric either carries true information, or it does not.

The metric might say: "the last execution of job X had status code Y".  That metric carries a true piece of information.  If you don't re-run job X, then it remains true indefinitely.

If you are worried that job X last ran a long time ago, then you expose a new metric which carries the *time* at which job X last ran.  That's also true indefinitely, until the next time that job X runs.
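
As a rough sketch of that idea, a one-shot job could push both its exit code and the time it finished. The metric names, the job name and the gateway address below are assumptions for illustration, and the job name is assumed to contain no slashes or other characters that would need special encoding in the URL. The body uses the Prometheus text exposition format and is sent with PUT to the Pushgateway's /metrics/job/<job> path, using only the Python standard library.

```python
import time
import urllib.request


def build_push_body(exit_code, finished_at):
    """Render two gauges in the Prometheus text exposition format."""
    return (
        "# TYPE job_last_exit_code gauge\n"
        f"job_last_exit_code {exit_code}\n"
        "# TYPE job_last_run_timestamp_seconds gauge\n"
        f"job_last_run_timestamp_seconds {finished_at}\n"
    )


def build_push_url(gateway, job):
    """Build the grouping-key URL for a job (plain job names only)."""
    return f"{gateway.rstrip('/')}/metrics/job/{job}"


def push_result(gateway, job, exit_code):
    """PUT the result to the gateway, replacing the job's previous group."""
    body = build_push_body(exit_code, int(time.time())).encode("utf-8")
    request = urllib.request.Request(
        build_push_url(gateway, job), data=body, method="PUT"
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

A call such as push_result("http://localhost:9091", "job_x", 0) at the end of the job would then be enough. On the Prometheus side, an expression along the lines of time() - job_last_run_timestamp_seconds > 3600 could flag a job that has not run for an hour, without anything ever being deleted from the gateway.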

In short: it sounds like you are misusing Pushgateway (and Prometheus itself). Prometheus is not an event-logging system; something like Loki would be more suitable for that. Prometheus carries time series, that is, numeric data which evolves over time, where each data point relates to the point before it.

Pushgateway is *only* for cases where:
- you run a particular job repeatedly, AND
- the job is a "one-shot", i.e. it runs then terminates, AND
- the job needs somewhere to stash its result so it can be scraped after it has terminated.

That's it.  There's no other use case for pushgateway.  I think it has a bad name; it should be called "cached_result_exporter" or something like that.

Bjoern Rabenstein

Nov 3, 2020, 5:33:57 PM
to dmitry b, Prometheus Users
On 30.10.20 22:01, dmitry b wrote:
> 2. We manage a large Hadoop cluster (6000 nodes) and Hadoop services are being
> monitored by PUSHing metrics to a different system.
> 3. We want to send these metrics to Prometheus
> 4. The format of these metrics is Not in Prometheus format, so cannot be
> scraped, AND we are not setup for such scraping anyway, we are setup for
> PUSH.

I'm not familiar with Hadoop, but a websearch for
"monitoring hadoop with prometheus"
yields a bunch of interesting results.

The Pushgateway is definitely not the right choice to convert from a
push-based metrics setup to Prometheus.
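
For illustration only, here is a minimal sketch of what a dedicated push-to-pull bridge could look like instead. It is not an existing tool; the class name, the max_age_seconds parameter and the sample metric name in the usage note are all hypothetical. Pushed samples are kept in memory and dropped from the output once they are older than a configured age. The expiry depends on time alone, never on who scraped, so any number of Prometheus servers see the same data.

```python
import time


class PushBridge:
    """Hypothetical bridge: accepts pushed samples, exposes them for scraping."""

    def __init__(self, max_age_seconds, clock=time.time):
        self.max_age_seconds = max_age_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self.samples = {}   # metric name -> (value, time received)

    def receive(self, name, value):
        """Store the latest pushed value together with its arrival time."""
        self.samples[name] = (float(value), self.clock())

    def render(self):
        """Return fresh samples as text exposition lines; does not modify state."""
        now = self.clock()
        lines = []
        for name in sorted(self.samples):
            value, received = self.samples[name]
            if now - received <= self.max_age_seconds:
                lines.append(f"{name} {value}")
        if not lines:
            return ""
        return "\n".join(lines) + "\n"
```

After bridge = PushBridge(max_age_seconds=60) and bridge.receive("hadoop_heap_used_bytes", 5), repeated calls to bridge.render() return the same line until the sample is more than 60 seconds old, after which it is no longer exposed.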

--
Björn Rabenstein
[PGP-ID] 0x851C3DA17D748D03
[email] bjo...@rabenste.in