Guaranteeing that the last metric of a pod is scraped

Rafael Paulovic

Nov 12, 2020, 8:47:18 AM
to Prometheus Users
Hi all, I have an architectural question.

I am using Prometheus as follows:

In a K8s cluster, I run pods in which multiple processes share a single container. These processes send metrics over a websocket connection to a central process, which exposes a /metrics endpoint so that Prometheus can scrape the metrics of all processes from one place.

When the processes finish, they may send some special metrics just before exiting.

But when all of these processes finish, the pod also goes to Completed, which closes the HTTP server and leaves Prometheus unable to scrape.

What is the best way to guarantee that the latest metrics sent are scraped?

Would I need to wait until Prometheus has scraped the last metric before the pod goes to Completed?

I was planning to do that, but then I saw that the scrape interval in the cluster is too long (1 minute, plus potentially the time to discover the job).
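For reference, the "wait for the last scrape" idea could look roughly like the sketch below. It is only an illustration using the Python standard library, not what we actually run: the names MetricsState, serve and shutdown_after_final_scrape are made up, and it assumes that any GET on /metrics after the final update counts as a Prometheus scrape. The timeout would have to cover at least one scrape interval, which is exactly the cost mentioned above.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class MetricsState:
    """Holds the exposition text and records when the final version was scraped."""

    def __init__(self):
        self._lock = threading.Lock()
        self._body = b""
        self._final = False
        self.final_scraped = threading.Event()

    def update(self, text, final=False):
        # Called by the central process whenever aggregated metrics change.
        with self._lock:
            self._body = text.encode("utf-8")
            self._final = final

    def render(self):
        # Return the body and the "final" flag together, atomically.
        with self._lock:
            return self._body, self._final


def make_handler(state):
    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/metrics":
                self.send_error(404)
                return
            body, final = state.render()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
            if final:
                # Signal only after the final body has been written out.
                state.final_scraped.set()

        def log_message(self, *args):
            pass  # keep the sketch quiet

    return Handler


def serve(state, host="127.0.0.1", port=0):
    # Assumption: in a real pod the host would be "0.0.0.0" and the port fixed.
    server = ThreadingHTTPServer((host, port), make_handler(state))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def shutdown_after_final_scrape(server, state, timeout):
    """Block until the final metrics were scraped or `timeout` seconds passed."""
    scraped = state.final_scraped.wait(timeout)
    server.shutdown()
    server.server_close()
    return scraped
```

The central process would call update(..., final=True) once the last worker has reported, then shutdown_after_final_scrape(server, state, timeout) before exiting. A return value of False means the timeout expired and the last metrics were probably lost.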

Does anyone have another, more feasible idea? Am I missing something?

Thanks,
Best regards,
Rafael.

Laurent Dumont

Nov 12, 2020, 9:45:07 AM
to Rafael Paulovic, Prometheus Users
I'm not sure I fully understand the flow of metrics, but you can use a Pushgateway as a "central" scrape target. In your case, since your application pods seem to be short-lived, a per-pod scrape architecture might not be a good fit. You can use the prometheus_client library to push application metrics to the Pushgateway pod, which is then scraped by Prometheus.

The Pushgateway will always be online and available as a target to Prometheus.
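As a rough sketch of what a push looks like on the wire: the Pushgateway accepts the text exposition format via PUT or POST on /metrics/job/<job>, optionally followed by /<label>/<value> pairs that form the grouping key. The example below uses only the Python standard library so the HTTP call is visible; with prometheus_client the equivalent is push_to_gateway(gateway, job, registry). The gateway URL, job name, labels and the helper names build_push_request and push are made up for illustration, and label values are assumed not to contain "/".

```python
import urllib.parse
import urllib.request


def build_push_request(gateway_url, job, grouping, lines, method="PUT"):
    """Build (but do not send) a push request for the given metric lines.

    PUT replaces every metric in the group; POST replaces only metrics
    with the same names.
    """
    path = "/metrics/job/" + urllib.parse.quote(job, safe="")
    for name, value in grouping.items():
        # Assumption: values are simple strings without "/" characters.
        path += "/" + urllib.parse.quote(name, safe="")
        path += "/" + urllib.parse.quote(value, safe="")
    # The exposition format requires a trailing newline.
    body = ("\n".join(lines) + "\n").encode("utf-8")
    return urllib.request.Request(
        gateway_url.rstrip("/") + path,
        data=body,
        method=method,
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


def push(request, timeout=5.0):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status
```

A process would call something like push(build_push_request("http://pushgateway.example:9091", "batch", {"pod": "worker-1"}, ["jobs_done_total 3"])) right before exiting, so the last values survive the pod.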

Rafael Paulovic

Nov 12, 2020, 10:19:06 AM
to Prometheus Users
Hi,

We used Prometheus Pushgateway in the past, but for our scenario it presents a security flaw.

The pods running these processes may be executing arbitrary customer code.
If we allow the pod to communicate with the Pushgateway, the customer's code could technically also reach the Pushgateway service and overwrite or create arbitrary metrics.
In the end we replaced the Pushgateway with normal scraping because of that.

It could be that we should simply have implemented some security measure for this case and kept using the Pushgateway.

Laurent Dumont

Nov 12, 2020, 11:02:12 AM
to Rafael Paulovic, Prometheus Users
You can add some HTTP auth in front of the PG (not directly through PG, but with nginx/Apache). But yes, as long as the pod needs to access PG directly, I guess it means that the customer's code running inside the pod will be able to talk to PG.
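On the client side, a push through such a proxy only needs an Authorization header. A minimal sketch with the Python standard library, assuming the proxy is set up for HTTP basic auth; the helper name with_basic_auth and the credentials are hypothetical. Note that whatever secret the pod uses is readable by code running in the same pod, so this alone does not fix the problem described above.

```python
import base64
import urllib.request


def with_basic_auth(request, username, password):
    """Attach an HTTP basic auth header to an existing urllib request."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    request.add_header("Authorization", "Basic " + token)
    return request
```

The returned request can then be sent with urllib.request.urlopen as usual.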
