metrics TTL for pushgateway

syl...@abstraction.fr

Oct 17, 2018, 7:21:10 AM
to Prometheus Developers
Hi,

There have been a few GitHub issues opened about the need for a metrics TTL for the Pushgateway.

So far those requests have been turned down, and Björn asked that a thread be created here instead of commenting on closed GH issues about that matter.

As I myself encountered the need for it in my organisation and did not find any thread about it, here it is :)

My use case is quite simple. We have hosts we monitor shipped to our customers all over the world and they only communicate with our infrastructure via AMQP messages.

We use this AMQP channel to funnel metrics from the hosts' node_exporter to a µservice in our infrastructure which has access to the Pushgateway. This µservice listens to an AMQP queue and pushes the metrics delivered in the messages into the Pushgateway.

But sometimes those hosts crash, and instead of having no data, we see flat lines in our graphs stuck at the metrics' values from the last push.

So as far as I am concerned a TTL for metrics in the pushgateway would really be useful.

Regards.

Björn Rabenstein

Oct 18, 2018, 6:36:35 AM
to syl...@abstraction.fr, prometheus-developers
On Wed, 17 Oct 2018 at 13:21, <syl...@abstraction.fr> wrote:
>
> So far those requests have been turned down, and Björn asked that a thread be created here
> instead of commenting on closed GH issues about that matter.

Thanks for that. While many people were quite intensely demanding a
TTL on GitHub, nobody has bothered so far to use the channels that the
Prometheus developers would like to use for this discussion.

(As a reminder: It's fine to discuss things in an issue of the repo
that are really local to that repo. However, the question which use
cases should and should not be supported by the Pushgateway affects
the Prometheus ecosystem as a whole, and in particular, the decision
to not have a TTL was made back then with many stakeholders on board,
so it should not be reverted by an isolated discussion in a GH issue.
Admittedly, the discussion back then is not well documented, as things
were way more local back then with a lot of tribal knowledge. I hope
this thread will help to clarify things in a public forum, which is
searchable and linkable.)

> My use case is quite simple. We have hosts we monitor shipped to our customers all over the world
> and they only communicate with our infrastructure via AMQP messages.
>
> We use this AMQP channel to funnel metrics from the hosts' node_exporter to a µservice in our
> infrastructure which has access to the Pushgateway. This µservice listens to an AMQP queue
> and pushes the metrics delivered in the messages into the Pushgateway.

This use case might be simple, but it is also simply a use case
Prometheus wasn't built for. In particular, we actively discourage
users from using the Pushgateway to turn Prometheus into a push-based
monitoring system. You are of course free to create tooling to use
Prometheus in ways it is not intended to be used in, but you also have
to understand that the developers of the core components don't want to
add features that go against the fundamental concepts of Prometheus
(plus take the burden of maintaining said features and deal with the
(from their perspective predictable) issues those features will
cause).

Just a few of the issues that come to my mind (I'm sure people like
Brian Brazil will be able to come up with many more):

- What's your HA concept? What happens if your one Pushgateway
crashes? What happens if your AMQP listener fails to push to the
Pushgateway?

- What happens if you have inconsistent metrics between your nodes?
Easily possible as you won't upgrade them all in one go. The
Pushgateway cannot expose the merged state of inconsistent metrics as
that would be an invalid exposition. Prometheus, however, can deal
with inconsistent metrics coming from different targets.

> But sometimes those hosts crash, and instead of having no data, we see flat lines in our graphs stuck at the metrics' values from the last push.

If the metrics just disappear after a while, how do you detect a
crashed host then? How do you distinguish a host that has been
deprovisioned deliberately from a host that has crashed/doesn't come
up anymore from a host that is fine but fails to send metrics?

--
Björn Rabenstein, Engineer
http://soundcloud.com/brabenstein

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany
Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B

tgr...@gmail.com

Oct 26, 2018, 6:06:40 PM
to Prometheus Developers
Hi Björn,

We'd also like to see TTL-based time series expiry in the Pushgateway, so I wanted to explain our use case.

We use the pull model for as much as we can due to many of the reasons you've identified, but a non-trivial chunk of our business logic is executed in short-lived, ephemeral pipeline jobs. These execute in various contexts (job-scoped Hadoop clusters, Airflow operators, Kubernetes jobs) that do not have the proper conditions to create a pull model directly. The execution is usually too short-lived or too difficult to discover since the jobs are running within a third-party scheduling framework.

A concrete example is an Airflow operator. While Airflow itself stores some job-level data that a proper pull-based exporter can surface, the business logic code is by definition short-running and has no native place to store metrics for the exporter. That means any instrumentation of the code internals *must* be persisted outside the process and outside the scheduler. Airflow simply has no way to store granular detail like this on the job's behalf. Hadoop and Kubernetes jobs have the same issue. Something external must store the data.

A push model - even with the trade-offs it makes on reliability, consistency, etc. - would help us put actual instrumentation on these types of processes. The Pushgateway, as it stands, is a reasonable match and means we don't need to create too much net-new functionality. However, we are continuously updating our set of jobs, their attributes, and internal functionality... so at some point we're going to have stale metrics being presented in the Pushgateway. They could be stale due to internal changes or simply because a job no longer runs. This leads to needless storage and processing overhead that we'd like to avoid, and arbitrarily stale metrics can lead to confusing data in dashboards and reporting.

An expiration time that we could set to throw away metrics that haven't been updated in some time would help here.

Two questions:

1) Would you be willing to accept a PR for this type of functionality?

2) If not, would you accept a PR that puts this functionality into a sidecar-type process that uses the DELETE API? We've discussed writing this ourselves and it sounds like others would benefit.

Obviously this hinges on the use case making sense. Happy to answer questions.

Thanks,

Travis

Björn Rabenstein

Oct 27, 2018, 3:33:45 PM
to tgr...@gmail.com, prometheus...@googlegroups.com
On Sat, 27 Oct 2018 at 00:06, <tgr...@gmail.com> wrote:
>
> A concrete example is an Airflow operator. While Airflow
> itself stores some job-level data that a proper pull-based
> exporter can surface, the business logic code is by definition
> short-running and has no native place to store metrics for the
> exporter. That means any instrumentation of the code internals
> *must* be persisted outside the process and outside the
> scheduler. Airflow simply has no way to store granular detail
> like this on the job's behalf. Hadoop and Kubernetes jobs have
> the same issue. Something external must store the data.

I'd say if each instance of these short-lived jobs is a
single-shot “unique” task, i.e. “at” style rather than “cron”
style, with a certain amount of information to be persisted
outside as the task reaches the end of its short lifetime, as I
understand you, then this really looks like event-logging. I
don't have all the information for a detailed analysis, but in
general Prometheus is utterly inappropriate for
event-logging. The Prometheus community has always advocated for
the use of event-logging and metrics-based monitoring in
parallel, complementing each other (plus not to forget
distributed tracing as the third pillar of
observability). Prometheus is a good choice for metrics. There
are a number of solutions available for event-logging. Prometheus
has no ambitions to “land-grab” into territories of other tools.

If users try to shoehorn the Pushgateway into a push-based
event-logging system, they usually push every event with a
different grouping key. Since the Pushgateway accumulates the
events forever, and the Prometheus server scrapes them all each
time, naturally a need for a TTL arises. The fallacy here is not
asking for the TTL, the fallacy is the (ab-)use of the
Pushgateway as an event-logging system. If you don't do the
latter, you don't need the former.

But perhaps what you try to push is actually legitimately
metrics-based. I see two categories for that.

The first category is where the instances of a job are all
doing the same thing sequentially, pushing into the same group,
with the proverbial example of a daily backup job. For
illustration check out the documentation of the Go client:

- A simple example:
https://godoc.org/github.com/prometheus/client_golang/prometheus/push#example-Pusher-Push

- A complex example:
https://godoc.org/github.com/prometheus/client_golang/prometheus/push#example-Pusher-Add
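
To make the pattern concrete, here is a minimal sketch along the
lines of the simple example above, using client_golang's push
package. The Pushgateway URL, job name, and metric name are just
placeholders, not anything from a real setup:

package main

import (
    "log"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/push"
)

func main() {
    // Gauge recording when the backup last completed successfully.
    completionTime := prometheus.NewGauge(prometheus.GaugeOpts{
        Name: "db_backup_last_completion_timestamp_seconds",
        Help: "The timestamp of the last successful DB backup.",
    })

    // ... run the actual backup here ...

    completionTime.SetToCurrentTime()

    // Push replaces all metrics of the group identified by the job
    // name (plus any grouping labels) with the metrics pushed here.
    if err := push.New("http://pushgateway:9091", "db_backup").
        Collector(completionTime).
        Push(); err != nil {
        log.Fatal("Could not push completion time to Pushgateway: ", err)
    }
}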

In this case, the proposed benefit of a TTL on the Pushgateway is
that no additional DELETE step is required after a job is
deliberately removed. The pushed metric group will disappear
automatically after the TTL has passed.

That's a trap, however. With the TTL, you get the worst of both
worlds. Let's assume it's indeed a daily job. You alert if 25
hours pass without the job completing successfully. Your TTL,
obviously, needs to be significantly larger than 24 hours as you
only push once daily anyway. If the backup job fails, you want to
see when the last backup job succeeded for a couple of days, at
least, let's say for longer than a weekend, hence about three
days. Let's look at the two cases:

1. A job is deliberately removed, no backup is running
anymore. Your alert will fire 25h after the last backup has
run. It will continue to fire for ~2d until the TTL kicks in and
removes the metric. This is bad because an alert has fired for
no reason.

2. A job fails to run. As intended, after 25h the alert
fires. Let's assume it happens Friday evening. The engineer
on-call decides to work on it during work hours on Monday
because the backup is not that critical right now to justify a
weekend work session. However, on Monday morning the TTL has
kicked in and the alert ceased to fire. Let's say the Friday
on-call engineer forgot about it, or a different engineer is
on-call now. They see the alert has resolved and move on to
work on other things.

Conclusion: You get noisy alerts in the one case. You miss a real
outage in the other case.

Solution: Do _not_ implement a TTL. Instead, make metrics
deletion (with a simple DELETE call of the RESTful API) a part of
decommissioning a job.
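
As a sketch of how small that DELETE step can be (assuming the same
hypothetical Pushgateway URL and job name as in the push sketch
above; the equivalent is a one-line `curl -X DELETE` against the
same path):

package main

import (
    "log"

    "github.com/prometheus/client_golang/prometheus/push"
)

func main() {
    // Delete removes all metrics of the group identified by the job
    // name (plus any grouping labels added via Grouping) from the
    // Pushgateway. Run this as part of decommissioning the job.
    if err := push.New("http://pushgateway:9091", "db_backup").Delete(); err != nil {
        log.Fatal("Could not delete metrics from Pushgateway: ", err)
    }
}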

The second possibility is if a larger workload is distributed
over many smaller short-lived tasks. This use-case becomes
increasingly popular with the serverless paradigm, and perhaps
that's similar to what you accomplish with your Airflow
setup. There are several interesting metrics I can see here, most
obviously the number of work units processed in aggregate by all
the short-lived tasks. The truth is that there is currently no
good way of doing this within the core Prometheus ecosystem. If
users try to shoehorn the Pushgateway into it, they usually push
to different groupings again (as above) and then try to aggregate
the metrics on the Prometheus server after scraping, e.g. a sum
over all the pushed counts of work units processed would give you
an idea of the total amount of work done. I see how a TTL would
seemingly come in handy, but it should be easy to see how brittle
this setup is. Again, the right course of action is to use the
right tools instead of asking for a feature that would make the
wrong tool slightly easier to shoehorn into place. What you need
in this case is a way to do distributed counting. This is the
case explained in the first paragraph of
the non-goals, including possible tools to accomplish what you need:
https://github.com/prometheus/pushgateway#non-goals

I'm fairly certain that the Prometheus community will have to
think more about the “distributed counter” use-case, as it will
become more common (see above, serverless etc.). Perhaps, a
solution will happen within the Pushgateway (but I personally
don't think so, as it cannot provide a good HA-story). More
likely it will be solved very differently. But the solution is
definitely not a TTL for metrics pushed to the Pushgateway.


Having said all that, if your use-case doesn't fit any of the
categories above, please explain how it is different.


> 1) Would you be willing to accept a PR for this type of functionality?

To avoid misunderstandings: The problem here is not that a TTL
would be difficult to implement (which it is not). It would not
even be difficult to maintain (which often is a reason to be
cautious about adding other features of limited use). The concern
here is that adding such a feature is considered positively
harmful by those that have created the Pushgateway and
Prometheus as a whole.

There needs to be a consensus within the community that this
feature is useful rather than harmful before even thinking about
an implementation.

> 2) If not, would you accept a PR that puts this functionality
> into a sidecar-type process that uses the DELETE API? We've
> discussed writing this ourselves and it sounds like others
> would benefit.

If the feature is harmful within the PGW, it is also harmful as a
sidecar. Of course, nobody will stop you from implementing
it. You could even fork the PGW and add such a feature. You might
even prove us wrong in that way. Personally, I'm happy to stand
corrected. I see forking as fair game in the open-source
world. However, you cannot expect the developers to recommend or
support a feature that they believe to be harmful.

Travis Groth

Oct 29, 2018, 12:17:31 PM
to Björn Rabenstein, prometheus...@googlegroups.com
We're definitely _not_ trying to implement logging. 100% on board with the reasoning there.

Our use case sits between the distributed counting and backup semantics you've described.

Re: Backup use case.

As you point out, critical metrics that are pageable become problematic with a TTL. The overall job status and the execution of them are more appropriately captured with a traditional exporter that introspects the job engine (airflow, hadoop coordinator, whatever). It has the best information about if a job is scheduled to run and if it did run and the status thereof. It would also remove the operational burden of retired/changed jobs.

However, not all schedulers are going to provide that type of information easily, which is where you see the desire to use PGW from inside the job logic. TTL makes this much less operationally complex (read: error prone) and reduces the barrier to entry by not requiring that you create scheduler instrumentation. No, this is not ideal, but I don't think it is harmful to allow this with the caveats that are clear: No HA and be aware of the behavior of removed job metrics. This is better than nothing.

Re: distributed counting. This is close to where we are anticipating the PGW being useful. Internal metrics about the components of the job. The distributed case is possible as well as non-distributed (the 'serverless' type of execution environment).


> I'm fairly certain that the Prometheus community will have to
> think more about the “distributed counter” use-case, as it will
> become more common (see above, serverless etc.). Perhaps, a
> solution will happen within the Pushgateway (but I personally
> don't think so, as it cannot provide a good HA-story). More
> likely it will be solved very differently.


Without a solution, people will look to something that provides the semantics they require. "Push" (or at least client-initiated metrics) for serverless job metrics is going to be required in many cases. There are only a handful of solutions that support a push model for Prometheus, and PGW is one of the most obvious (and well documented...). A TTL would make PGW a reasonable fit until there is a better solution. Imperfect solutions are better than none and could be used to build up toward an ideal pattern.

> But the solution is definitely
> not a TTL for metrics pushed to the Pushgateway.


However, if you're going to simply assert that TTL will not be used to support this use case, then I suppose that ends my portion of this thread. It would serve our use case, but it appears to be off the table. We will look for alternative ways of accomplishing what we need.

A question - Is there any other discussion around how to support the mechanics of distributed/serverless jobs in prometheus? I understand that there isn't an actual solution right now but would be interested in any prior work or planning that wasn't directly tied to a PGW TTL.

Thanks,

Travis

Björn Rabenstein

Oct 29, 2018, 12:21:55 PM
to tgr...@gmail.com, prometheus...@googlegroups.com
On Mon, 29 Oct 2018 at 17:17, Travis Groth <tgr...@gmail.com> wrote:
>
> A question - Is there any other discussion around how to support the mechanics of distributed/serverless jobs in prometheus? I understand that there isn't an actual solution right now but would be interested in any prior work or planning that wasn't directly tied to a PGW TTL.

This is documented right in the README.md of the Pushgateway repo. Quoting:
"""
If you need distributed counting, you could either use the actual
statsd in combination with the [Prometheus statsd
exporter](https://github.com/prometheus/statsd_exporter), or have a
look at [Weavework's aggregation
gateway](https://github.com/weaveworks/prom-aggregation-gateway).
"""

Björn Rabenstein

Oct 29, 2018, 12:37:14 PM
to tgr...@gmail.com, prometheus...@googlegroups.com
On Mon, 29 Oct 2018 at 17:17, Travis Groth <tgr...@gmail.com> wrote:
>
> Re: Backup use case.
>
> As you point out, critical metrics that are pageable become
> problematic with a TTL. The overall job status and the
> execution of them are more appropriately captured with a
> traditional exporter that introspects the job engine (airflow,
> hadoop coordinator, whatever). It has the best information
> about if a job is scheduled to run and if it did run and the
> status thereof. It would also remove the operational burden of
> retired/changed jobs.
>
> However, not all schedulers are going to provide that type of
> information easily, which is where you see the desire to use
> PGW from inside the job logic. TTL makes this much less
> operationally complex (read: error prone) and reduces the
> barrier to entry by not requiring that you create scheduler
> instrumentation. No, this is not ideal, but I don't think it
> is harmful to allow this with the caveats that are clear: No HA
> and be aware of the behavior of removed job metrics. This is
> better than nothing.

While scheduler instrumentation is nice, the one functionality I was
referring to is not related to it. What I would recommend is that
whatever procedure is taken to create a job and then eventually remove
a job from the scheduler also has the responsibility to remove metrics
from the Pushgateway. That's a single DELETE call, if need be easily
implementable via a `curl` command, and should be easily automatable.
Thus, it is not error-prone. It _is_, however, error-prone to have a
TTL on the Pushgateway the moment alerting is in the game. Providing
features that are harmful once alerting is in the game, for a
monitoring system whose core competence is alerting, on the assumption
that enough users will use it without alerting, doesn't sound like a
good idea.
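
As a sketch of what that automation could look like (host, job, and
instance names are placeholders, and the helper is hypothetical;
grouping labels are only needed if the jobs push with them):

package main

import (
    "log"

    "github.com/prometheus/client_golang/prometheus/push"
)

// deleteJobMetrics is a hypothetical helper, called from the same
// code path that removes the job from the scheduler, so no TTL is
// needed to clean up afterwards.
func deleteJobMetrics(job, instance string) error {
    return push.New("http://pushgateway:9091", job).
        Grouping("instance", instance).
        Delete()
}

func main() {
    if err := deleteJobMetrics("nightly_report", "worker-1"); err != nil {
        log.Fatal("could not delete Pushgateway metrics: ", err)
    }
}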

There are people in the community who don't want to provide features
that are misleading for naive users even if there is some use for
informed users. I'm actually not one of them. I prefer catering for
our informed users and simply warn the naive users in the
documentation. However, the TTL-for-Pushgateway case looks different
to me. So far, I'm still convinced that an informed user simply won't
have the use case.

Travis Groth

Oct 29, 2018, 12:50:09 PM
to Björn Rabenstein, prometheus...@googlegroups.com
You had said, 'I'm fairly certain that the Prometheus community will have to think more about the “distributed counter” use-case, as it will become more common (see above, serverless etc.)'. I was looking for more info around that process/conversation. I also believe this is going to become more common as an operational pattern, and a strong distributed/serverless story would be valuable. I take it there isn't much beyond those projects thus far. Understood.

Thanks,

Travis

Björn Rabenstein

Oct 29, 2018, 1:09:42 PM
to tgr...@gmail.com, prometheus...@googlegroups.com
On Mon, 29 Oct 2018 at 17:50, Travis Groth <tgr...@gmail.com> wrote:
>
> You had said, 'I'm fairly certain that the Prometheus community will have to think more about the “distributed counter” use-case, as it will become more common (see above, serverless etc.)'. I was looking for more info around that process/conversation. I also believe this is going to become more common as an operational pattern, and a strong distributed/serverless story would be valuable. I take it there isn't much beyond those projects thus far.

At least not that I am aware of. Most of the conversation I took part
in happened at meetups or conferences. Right now, I'm just following
what other people experiment with. Luckily, it is something that
doesn't need much coordination across the Prometheus stack, so anybody
who needs it or is interested in it can work on it. Whatever people are
working on, it's certainly worth sharing and discussing, whether here
on the mailing list, on IRC, or anywhere else. I would just like to
take that out of the context of a feature discussion for the
Pushgateway.

Sylvain Rabot

Oct 29, 2018, 3:22:59 PM
to bjo...@soundcloud.com, prometheus...@googlegroups.com
On Thu, 18 Oct 2018 at 12:36, Björn Rabenstein <bjo...@soundcloud.com> wrote:
>> My use case is quite simple. We have hosts we monitor shipped to our customers all over the world
>> and they only communicate with our infrastructure via AMQP messages.
>>
>> We use this AMQP channel to funnel metrics from the hosts' node_exporter to a µservice in our
>> infrastructure which has access to the Pushgateway. This µservice listens to an AMQP queue
>> and pushes the metrics delivered in the messages into the Pushgateway.

> This use case might be simple, but it is also simply a use case
> Prometheus wasn't built for. In particular, we actively discourage
> users from using the Pushgateway to turn Prometheus into a push-based
> monitoring system. You are of course free to create tooling to use
> Prometheus in ways it is not intended to be used in, but you also have
> to understand that the developers of the core components don't want to
> add features that go against the fundamental concepts of Prometheus
> (plus take the burden of maintaining said features and deal with the
> (from their perspective predictable) issues those features will
> cause).

I might be wrong, but isn't the main purpose of Prometheus to be a metric storage backend first? If I am wrong, you should still consider that it's what it will be for most users.

My company has chosen Prometheus as its metrics storage backend because of its deep integration with Kubernetes, and we want to make the most of it, but we do not want to multiply instances nor maintain other types of metrics backends for specific use cases.
 
> Just a few of the issues that come to my mind (I'm sure people like
> Brian Brazil will be able to come up with many more):
>
> - What's your HA concept? What happens if your one Pushgateway
> crashes? What happens if your AMQP listener fails to push to the
> Pushgateway?

I believe that is hardly relevant here. Prometheus itself is not highly available and a HA setup is worth as much as its weakest link. So even if I had a HA AMQP listener + HA PG I couldn't have HA as I'm running only one Prometheus.


> - What happens if you have inconsistent metrics between your nodes?
> Easily possible as you won't upgrade them all in one go. The
> Pushgateway cannot expose the merged state of inconsistent metrics as
> that would be an invalid exposition. Prometheus, however, can deal
> with inconsistent metrics coming from different targets.

That's indeed a problem. Can I dare say that's a good reason to have a push endpoint in Prometheus itself?

 
>> But sometimes those hosts crash, and instead of having no data, we see flat lines in our graphs stuck at the metrics' values from the last push.

> If the metrics just disappear after a while, how do you detect a
> crashed host then? How do you distinguish a host that has been
> deprovisioned deliberately from a host that has crashed/doesn't come
> up anymore from a host that is fine but fails to send metrics?

I think it is also irrelevant because, with either no metrics or bogus metrics, in both cases I couldn't detect a crashed host. At least with no metrics I'm not polluting my vectors with false values.

--
Sylvain Rabot <syl...@abstraction.fr>

Björn Rabenstein

Oct 29, 2018, 3:50:57 PM
to syl...@abstraction.fr, prometheus...@googlegroups.com
On Mon, 29 Oct 2018 at 20:22, Sylvain Rabot <syl...@abstraction.fr> wrote:
>
> I might be wrong, but isn't the main purpose of Prometheus to be a metric storage
> backend first? If I am wrong, you should still consider that it's what it will be for most users.

As I have often said (most prominently in the keynote of Percona
Live): Prometheus is not a TSDB; it's a metrics-based monitoring and
alerting system that happens to contain an embedded, special-purpose
TSDB to fulfill its task.

The Prometheus TSDB is neither durable nor long-term. The community
has always been very explicit about that.

> I believe that is hardly relevant here. Prometheus itself is not highly available and a HA setup is worth as much as its weakest link.

The simple but effective HA concept of Prometheus (as an alerting
system, _not_ as a TSDB) has been a core concept from the beginning.

> Can I dare say that's a good reason to have a push endpoint in Prometheus itself?

I don't think there will be much enthusiasm for that among the core
developers. In any case, that's a discussion separate from a feature
request for the Pushgateway. (And remember that the Pushgateway is
explicitly not meant to turn Prometheus into a push-based monitoring
system.)

>> > But sometimes those hosts crash, and instead of having no data, we see flat lines in our graphs stuck at the metrics' values from the last push.
>>
>> If the metrics just disappear after a while, how do you detect a
>> crashed host then? How do you distinguish a host that has been
>> deprovisioned deliberately from a host that has crashed/doesn't come
>> up anymore from a host that is fine but fails to send metrics?
>
> I think it is also irrelevant because, with either no metrics or bogus metrics, in both
> cases I couldn't detect a crashed host. At least with no metrics I'm not
> polluting my vectors with false values.

The idea is that you can alert on the last time a metric was pushed.

If a host is removed in a planned fashion, the same process that
removes the host will also remove metrics from the Pushgateway.
If a host crashes, the last-pushed metrics will remain.

Obviously, this isn't ideal at all, but Prometheus wasn't designed for
that use case in the first place.
But adding a metrics TTL makes an inelegant work-around even worse.

Julius Volz

Oct 30, 2018, 11:56:19 AM
to syl...@abstraction.fr, Björn Rabenstein, prometheus...@googlegroups.com
On Mon, Oct 29, 2018 at 8:22 PM Sylvain Rabot <syl...@abstraction.fr> wrote:
> On Thu, 18 Oct 2018 at 12:36, Björn Rabenstein <bjo...@soundcloud.com> wrote:
>> - What's your HA concept? What happens if your one Pushgateway
>> crashes? What happens if your AMQP listener fails to push to the
>> Pushgateway?
>
> I believe that is hardly relevant here. Prometheus itself is not highly available and a HA setup is worth as much as its weakest link. So even if I had a HA AMQP listener + HA PG I couldn't have HA as I'm running only one Prometheus.

As Björn mentioned, you can make Prometheus highly available for the purpose of alerting (by running >1 identical Prometheus servers). But yes, Prometheus does not cover the data durability / no-gaps-ever use case.

>> - What happens if you have inconsistent metrics between your nodes?
>> Easily possible as you won't upgrade them all in one go. The
>> Pushgateway cannot expose the merged state of inconsistent metrics as
>> that would be an invalid exposition. Prometheus, however, can deal
>> with inconsistent metrics coming from different targets.
>
> That's indeed a problem. Can I dare say that's a good reason to have a push endpoint in Prometheus itself?

To elaborate a bit more here, the reason that Prometheus does not have a push endpoint is that Prometheus is not a TSDB, it's a monitoring system that happens to contain a TSDB for a special purpose. In the context of Prometheus, a lot of things depend on Prometheus being in charge of pulling, labeling, and timestamping incoming samples:

- recording and alerting rules are only executed at the "now" timestamp, so scraped data needs to arrive in lockstep for results to make sense
- service discovery integration allows Prometheus to determine what *should* be out there, and only need identity configuration on one side of the chain (the monitoring system's side)
- metadata (target labels) can be automatically attached to a target's metrics by the monitoring system in this way, since it knows who it pulled from; this is not only good for informational purposes, but also ensures that targets cannot trample on each other's series (unless you set the "honor_labels: true" scrape option).
- overload situations can be mitigated at a central point (the monitoring system)

Sometimes it would be nice to be able to push (and the code change for that would be easy), but again the worry is about the majority of users ending up shooting themselves in the foot because they won't be aware of the consequences within the context of Prometheus...

Julius Volz

Oct 30, 2018, 11:58:07 AM
to Björn Rabenstein, tgr...@gmail.com, prometheus...@googlegroups.com
On Mon, Oct 29, 2018 at 5:22 PM 'Björn Rabenstein' via Prometheus Developers <prometheus...@googlegroups.com> wrote:
> On Mon, 29 Oct 2018 at 17:17, Travis Groth <tgr...@gmail.com> wrote:
>>
>> A question - Is there any other discussion around how to support the mechanics of distributed/serverless jobs in prometheus?  I understand that there isn't an actual solution right now but would be interested in any prior work or planning that wasn't directly tied to a PGW TTL.

> This is documented right in the README.md of the Pushgateway repo. Quoting:
> """
> If you need distributed counting, you could either use the actual
> statsd in combination with the [Prometheus statsd
> exporter](https://github.com/prometheus/statsd_exporter), or have a
> look at [Weavework's aggregation
> gateway](https://github.com/weaveworks/prom-aggregation-gateway).
> """

Also, if you run your own serverless / FaaS framework, you can add metrics to the framework itself, like OpenFaaS does: https://github.com/openfaas/faas (it has Prometheus metrics in the gateway that calls the functions). 

roidel...@gmail.com

Nov 15, 2018, 10:58:51 AM
to Prometheus Developers
I would like to add that some of the use cases discussed here can be solved by using the statsd_exporter (e.g. functions can push to statsd buckets).
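
For illustration, a minimal sketch of such a push, assuming a statsd_exporter listening on its default StatsD UDP port 9125 (host and metric names are placeholders): a function emits a counter increment over the plain StatsD line protocol, and the exporter aggregates the increments and exposes them for Prometheus to scrape, so nothing needs to expire later.

package main

import (
    "fmt"
    "log"
    "net"
)

func main() {
    // The statsd_exporter listens for StatsD lines (UDP 9125 by default)
    // and re-exposes them as Prometheus metrics on its /metrics endpoint.
    conn, err := net.Dial("udp", "statsd-exporter:9125")
    if err != nil {
        log.Fatal("could not reach statsd_exporter: ", err)
    }
    defer conn.Close()

    // Plain StatsD line protocol: increment a counter by 1.
    fmt.Fprint(conn, "jobs_processed_total:1|c\n")
}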

谢磊

Aug 20, 2019, 10:53:37 PM
to Prometheus Developers
+1, 

I also think a TTL for metrics in the pushgateway would really be useful. 

Regards. 

On Wednesday, October 17, 2018 at 7:21:10 PM UTC+8, syl...@abstraction.fr wrote:

Chris Scott-Thomas

Nov 7, 2019, 2:43:36 AM
to Prometheus Developers
For us, we're not using Pushgateway for short lived jobs, but for getting metrics from multiple instances that are obscured by the platform they are deployed to.

Unfortunately we can't query each individual instance with Prometheus as they all basically sit behind a load balancer, meaning every scrape would hit only one of X instances each attempt, which isn't good enough.  We also can't deploy Prometheus in this platform, nor can we install exporters inside it as we don't own it.  Using the Pushgateway to have each instance communicate its metrics out is the only way forward.

We're planning to move off this platform eventually, and all our problems will go away.  But in the meantime we do suffer from stale metrics existing on the Pushgateway, once an instance is removed for example.

If there is a better way to do this, we're not aware of it as yet but we are restricted to only being able to instrument the apps/instances with a metrics endpoint.  This does mean that we would benefit from being able to expire stale metrics in some way.  I suppose we have the benefit of this only being a medium-term issue, once we move from the old platform to the new, but this may be something that is a longer-lasting issue for others?

Bjoern Rabenstein

Nov 11, 2019, 5:39:32 AM
to Chris Scott-Thomas, Prometheus Developers
On 06.11.19 23:43, 'Chris Scott-Thomas' via Prometheus Developers wrote:
>
> Unfortunately we can't query each individual instance with Prometheus as they
> all basically sit behind a load balancer, meaning every scrape would hit only one
> of X instances each attempt, which isn't good enough.  We also can't deploy
> Prometheus in this platform, nor can we install exporters inside it as we don't
> own it.  Using the Pushgateway to have each instance communicate its metrics
> out is the only way forward.

I don't think it's a way forward. The Pushgateway is not meant for
that use case. I mean, you can shoehorn the Pushgateway into this use
case, but I hope you'll understand that I don't want to support
features that are only there to enable a use case the PGW is not meant
for.

The setup has so many problems that removing metrics from old
instances will be one of your smaller problems.

If you really want to do it, I would have an informed procedure to
remove metrics explicitly when an instance gets removed. But even
then, I would not recommend this setup.

--
Björn Rabenstein
[PGP-ID] 0x851C3DA17D748D03
[email] bjo...@rabenste.in

Chris Scott-Thomas

Nov 12, 2019, 2:07:14 AM
to Prometheus Developers
I understand that.  I just figured our use case may not be so unique.  I'm not sure of a better way of doing it.  Currently we have a custom sidecar on the pushgateway that checks for metrics older than 30 seconds, and clears them down.  When you have no option but to push metrics, then scrape, what is the better solution?  Perhaps this is just being missed? Pushgateway does kind of allude to this function in its name.

Julien Pivotto

Nov 12, 2019, 5:40:16 AM
to Chris Scott-Thomas, Prometheus Developers
On 06 Nov 23:43, 'Chris Scott-Thomas' via Prometheus Developers wrote:
> If there is a better way to do this, we're not aware of it as yet but we
> are restricted to only being able to instrument the apps/instances with a
> metrics endpoint.

You could look into: https://github.com/RobustPerception/PushProx

Regards,

--
(o- Julien Pivotto
//\ Open-Source Consultant
V_/_ Inuits - https://www.inuits.eu

Bjoern Rabenstein

Nov 12, 2019, 8:10:02 AM
to Chris Scott-Thomas, Prometheus Developers
On 11.11.19 23:07, 'Chris Scott-Thomas' via Prometheus Developers wrote:
> When you have no option but to push metrics, then scrape, what is the
> better solution?

It really depends on details. But if you have a whole complex system
that doesn't really fit the Prometheus approach, and you cannot really
change it to fit better, a possible answer is to not use Prometheus at all.

If a statsd approach works well for you, you might consider using
statsd instrumentation and then use the statsd_exporter to lift things
over into the Prometheus world.

But really, no silver-bullet recommendation is possible here.

> Perhaps this is just being missed? Pushgateway does kind of
> allude to this function in its name.

Yes, in hindsight the naming was a mistake.

jorin

Mar 29, 2020, 10:00:58 AM
to Prometheus Developers
 Hey there,

since this discussion seems rather persistent and there still seems to be a need for expiration of jobs in the pushgateway,
I decided to also leave a note here to share our current solution to the problem which might be of use to others:

https://github.com/jorinvo/prometheus-pushgateway-cleaner

This solution is only one possibility and it might be similar to other custom solutions, but it is open source.
The README tries to explain the reasoning behind the project and tries to direct people to go with the option that is right for them.
Feel free to leave feedback on GitHub.

I hope this is helpful and contributes to slowing down these arguments that have been going on for years now :)

Regards,

Jorin

John Yu

Apr 19, 2024, 5:18:16 AM
to Prometheus Developers
I'm thinking: why can't we deploy an additional Prometheus as an agent to receive data, and then remote-write to the core Prometheus after receiving the data?
Although I know that this breaks away from the pull model to a certain extent, it is undeniable that we do have push scenarios in metrics monitoring, and having such an agent seems to be a better solution for them. At least in my opinion, wouldn't that be better than adding a TTL to the PGW?

Bjoern Rabenstein

Apr 19, 2024, 10:07:06 AM
to John Yu, Prometheus Developers
On 19.04.24 01:33, John Yu wrote:
> I'm thinking: why can't we deploy an additional Prometheus as an agent to
> receive data, and then remote-write to the core Prometheus after receiving
> the data? Although I know that this breaks away from the pull model to a
> certain extent, it is undeniable that we do have push scenarios in metrics
> monitoring, and having such an agent seems to be a better solution for
> them. At least in my opinion, wouldn't that be better than adding a TTL to
> the PGW?

Remote-writing into a vanilla Prometheus server has its own set of
problems, but it's safe to say that it's less of an abuse than using
the Pushgateway to turn Prometheus into a push-based metrics
collection system.