We're looking into replacing parts of our infrastructure with serverless functions. Obviously we can't expose a scrape endpoint for short-lived functions like these, since an invocation may finish before Prometheus ever pulls from it. I was wondering whether Pushgateway is an appropriate choice for capturing their metrics.
From the Prometheus docs and Robust Perception blog posts, I know that Pushgateway is generally considered an antipattern, and that the only valid use case is "collecting metrics from service-level batch jobs."
For a serverless architecture, would you recommend Pushgateway or a different solution altogether?
One solution we considered, since we're mostly an AWS shop at the moment, was to have the functions push metrics to CloudWatch and then have Prometheus scrape CloudWatch. But the problems with that are:
1) it's not vendor-agnostic
2) it incurs extra cost
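
For concreteness, here is roughly what I imagine the Pushgateway option looking like on the function side. This is only a minimal sketch using the Python standard library: the gateway address `pushgateway.example:9091`, the job name `resize_image`, the metric names, and the helper names `build_push_request` / `push_metrics` are all made up for illustration. It assumes the gateway accepts the plain text exposition format on `/metrics/job/<job>`.

```python
import urllib.parse
import urllib.request


def build_push_request(gateway_url, job, metrics):
    """Build an HTTP PUT carrying `metrics` in the Prometheus text format.

    `metrics` is a plain dict of metric name -> numeric value (hypothetical
    shape, chosen to keep the sketch small; no labels, HELP or TYPE lines).
    """
    # The job name is part of the URL path and acts as the grouping key.
    path = "/metrics/job/" + urllib.parse.quote(job, safe="")
    # One "name value" sample per line; the body must end with a newline.
    body = "".join(f"{name} {value}\n" for name, value in sorted(metrics.items()))
    return urllib.request.Request(
        gateway_url.rstrip("/") + path,
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


def push_metrics(gateway_url, job, metrics, timeout=2.0):
    """Send the metrics just before the function returns; returns the HTTP status."""
    request = build_push_request(gateway_url, job, metrics)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Building the request does not touch the network, so it is safe to inspect.
example = build_push_request(
    "http://pushgateway.example:9091",  # hypothetical gateway address
    "resize_image",                     # hypothetical job name
    {"invocations_total": 1, "invocation_duration_seconds": 0.42},
)
```

Each function would call something like `push_metrics(...)` at the end of its handler. One thing that worries me with this approach: as far as I understand, the gateway just keeps the last pushed value per grouping key, so a counter pushed as `1` on every invocation would be overwritten rather than summed.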
Thanks!
Dave