Requirements / Best Practices to use Prometheus Metrics for Serverless environments

Bartłomiej Płotka

Jun 15, 2021, 2:59:52 PM

to Prometheus Developers

Hi All,

Prometheus has seen the fashion shifting from on-premise to clouds, monoliths to microservices, virtual machines to containers etc. Prometheus has proven to be successful for users in all those scenarios. Let's now talk about FaaS/Serverless. (Let's leave other buzzwords - blockchain/AI for later 🙈).

I would love to start a discussion around the usage of Prometheus Metrics on Serverless environments. I wonder if, from the Prometheus dev point of view, we can implement/integrate anything better, document or explain more etc. (:

In this thread, I am specifically looking for:

* Existing best practices for using Prometheus for gathering metrics from Serverless/FaaS platforms and functions

* Specific gaps and limitations users might have in these scenarios

* Existing success stories?

* Ideas for improvements.

Action Item: Feel free to respond if you have any input on those!

Past discussions:

* Fair suggestion to use cloud exporters for FaaS cases

* Suggestion to use event aggregation proxy

* Pushgateway improvements for serverless cases

My thoughts:

IMO the FaaS function should be like a function in any other full-fledged application/pod. You programmatically increment a common metric for your aggregated view (e.g. the overall number of errors).

Trying to switch to a push model for this case sounds like an unnecessary complication because, in the end, those functions run in a common, longer-living context (e.g. the FaaS runtime). This runtime should provide programmatic APIs for custom metrics, just as in a normal app where your function has local variables (e.g. *prometheus.CounterVec) to use.

In fact, this is what AWS Lambda allows and there are exporters to get that data into Prometheus.

We see users attempting to switch to the push model. I just wonder if for FaaS functions this really makes sense.

If you init the TCP connection and use remote write, OM push, the Pushgateway API, or OTel/OpenCensus to push metrics, you take an enormous latency hit to spin up a new TCP connection just for that. This might already be too slow for FaaS. If you do this asynchronously on the FaaS platform, you need to care about discovery, backoffs, a persistent buffer, auth, and all the pains of the push model, plus some aggregation proxy like Pushgateway/Aggregation gateway or the OTel collector to get this data to Prometheus (BTW this is what Knative is recommending). Equally, one could just expose those metrics on a /metrics endpoint and drop all of this complexity (or run an exporter if the FaaS is in the cloud, like Lambda/Google Run).
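As a rough illustration of the pull-based alternative described above, here is a minimal sketch of a function incrementing a shared counter that lives in the longer-living runtime, which can then be rendered for a /metrics endpoint. The `CounterVec` class, the metric name, and the `handle` function are hypothetical stand-ins written for this example, not the API of any real client library or FaaS runtime.

```python
import threading
from collections import defaultdict


class CounterVec:
    """Tiny stand-in for a labelled counter (hypothetical, not a real client library)."""

    def __init__(self, name, help_text, label_names):
        self.name = name
        self.help_text = help_text
        self.label_names = tuple(label_names)
        self._values = defaultdict(float)
        self._lock = threading.Lock()

    def inc(self, *label_values, amount=1.0):
        if len(label_values) != len(self.label_names):
            raise ValueError("label count mismatch")
        with self._lock:
            self._values[label_values] += amount

    def render(self):
        """Render in the Prometheus text exposition format (no label escaping here)."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        with self._lock:
            for label_values, value in sorted(self._values.items()):
                labels = ",".join(
                    f'{k}="{v}"' for k, v in zip(self.label_names, label_values)
                )
                lines.append(f"{self.name}{{{labels}}} {value:g}")
        return "\n".join(lines) + "\n"


# Lives in the runtime process, so it survives across function invocations.
ERRORS = CounterVec("function_errors_total", "Errors per function.", ["function"])


def handle(event):
    """Hypothetical function body: it only increments the shared counter on failure."""
    if event.get("fail"):
        ERRORS.inc("resize_image")


for event in [{"fail": True}, {}, {"fail": True}]:
    handle(event)

print(ERRORS.render())
```

The function itself never opens a connection; whatever serves /metrics in the runtime would simply return `ERRORS.render()` on each scrape.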

I think the main problem appears if those FaaS runtimes are short-living workloads that automatically spin up only to run some functions (batch jobs). In some way, this is then a problem of short-living jobs and the design of those workloads.

For those short-living jobs, we again see users try to use the push model. I think there is room to either streamline those initiatives OR propose an alternative. A quick idea, yolo... why not kill the job after the first successful scrape (detecting usage on the /metrics path)?

Kind Regards,

Bartek Płotka (@bwplotka)

Bjoern Rabenstein

Jun 18, 2021, 6:16:58 PM

to Bartłomiej Płotka, Prometheus Developers

On 15.06.21 20:59, Bartłomiej Płotka wrote:
>
> Let's now talk about FaaS/Serverless.

Excellent! That's my 2nd favorite topic after histograms. (And while I
provably talked about histograms as my favorite topic since early
2015, I have only started to talk about FaaS/Serverless as an
important gap to fill in the Prometheus story since 2018.)

I think "true FaaS" means that the function calls are
lightweight. The additional overhead of sending anything over the
network defeats that purpose. So similar to what has been said
before, and what Bartek has already nicely worked out, I think the
metrics have to be managed by the FaaS runtime, in the same path as
billing is managed.

And that's, of course, what cloud providers are doing, and it's also a
formidable way of locking their customers into their own metrics and
monitoring system.

And that's in turn precisely where I think Prometheus can use its
weight. Prometheus has already proven that cloud providers can
essentially not get away with ignoring it, and even halfhearted
integrations won't be enough. With more or less native Prometheus
support by cloud providers, it might actually just require a small
step to come to some convention how to collect and present FaaS
metrics in a "Promethean" way. If all cloud providers do it the same
way, the lock-in is gone.

I think it would be very valuable to study what OpenFaaS has already
done: https://docs.openfaas.com/architecture/metrics/

In the simplest case, we could just say: Please, dear cloud providers,
please expose exactly the same metrics for general benefit. If there
is anything to improve with the OpenFaaS approach, I'm sure they will
be delighted to get help. (Spontaneously, I'm missing a way to define
custom metrics, e.g. how many records a function call has processed.)

> * Suggestion to use event aggregation proxy

> <https://github.com/weaveworks/prom-aggregation-gateway>
> * Pushgateway improvements
> <https://groups.google.com/g/prometheus-users/c/sm5qOrsVY80/m/nSfbzHd9AgAJ> for
> serverless cases

Despite all of what I said above, I think there _are_ quite a few users
of FaaS who have fairly heavy-weight function calls. For them, pushing
counter increments etc. via the network might actually be more
convenient than funneling metrics through the FaaS runtime. This is
then just another use-case of the "distributed counter" idea, which
the Pushgateway quite prominently is not catering for. As discussed
in the thread linked above and at countless other places, I strongly
recommend not shoehorning the Pushgateway into this use-case, but
creating a separate project for it, which would be designed from the
beginning for this use-case. Perhaps
weaveworks/prom-aggregation-gateway is just that. I haven't studied it
in detail yet. In a way, we need "statsd done right". Again, I would
suggest looking at what others have already done. For example, there are
tons of statsd users out there. What have they done in the last years
to overcome the known shortcomings? Perhaps statsd instrumentation and
the Prometheus statsd exporter just need a bit of development in that
way to make it a viable solution.

> I think the main problem appears if those FaaS runtimes are short-living
> workloads that automatically spin up only to run some functions (batch
> jobs). In some way, this is then a problem of short-living jobs and the
> design of those workloads.
>
> For those short-living jobs, we again see users try to use the push model.
> I think there is room to either streamline those initiatives OR propose
> an alternative. A quick idea, yolo... why not kill the job after the
> first successful scrape (detecting usage on the /metrics path)?

Ugh, that doesn't sound right. I think this problem should be solved
within the FaaS runtime in the way they prefer. Cloud providers need
billing in any case (they want to make money after all), so they have
already solved reliable metrics collection for that. They just need to
hook in a simple exporter to present Prometheus metrics. See how
OpenFaaS has done it. Knative seems to have gone down the OTel path,
but that could be seen as an implementation detail. If they in the end
expose a /metrics endpoint with the desired metrics for Prometheus to
scrape, all is good. It's just a terribly overengineered exporter,
effectively. (o;

--
Björn Rabenstein
[PGP-ID] 0x851C3DA17D748D03
[email] bjo...@rabenste.in

Tobias Schmidt

Jun 22, 2021, 5:26:14 AM

to Bjoern Rabenstein, Bartłomiej Płotka, Prometheus Developers

Thanks for bringing up this topic, Bartek, and for your great insights, Björn!

I think it's a great idea to open the discussion with the big cloud providers about an open runtime integration for metrics. Maybe they're more open about this than I expect. My fear is that this won't really lead to any substantial improvement, as the vendor lock-in seems to be quite desired judging from my personal experience with cloud providers.

> * Suggestion to use event aggregation proxy
> <https://github.com/weaveworks/prom-aggregation-gateway>
> * Pushgateway improvements
> <https://groups.google.com/g/prometheus-users/c/sm5qOrsVY80/m/nSfbzHd9AgAJ> for
> serverless cases

> Despite all of what I said above, I think there _are_ quite a few users
> of FaaS who have fairly heavy-weight function calls. For them, pushing
> counter increments etc. via the network might actually be more
> convenient than funneling metrics through the FaaS runtime. This is
> then just another use-case of the "distributed counter" idea, which
> the Pushgateway quite prominently is not catering for. As discussed
> in the thread linked above and at countless other places, I strongly
> recommend not shoehorning the Pushgateway into this use-case, but
> creating a separate project for it, which would be designed from the
> beginning for this use-case. Perhaps
> weaveworks/prom-aggregation-gateway is just that. I haven't studied it
> in detail yet. In a way, we need "statsd done right". Again, I would
> suggest looking at what others have already done. For example, there are
> tons of statsd users out there. What have they done in the last years
> to overcome the known shortcomings? Perhaps statsd instrumentation and
> the Prometheus statsd exporter just need a bit of development in that
> way to make it a viable solution.

First of all, I wonder if there is really any difference in terms of heavy-weight/light-weight classification of serverless / FaaS in contrast to traditional deployment styles. Personally the reason I chose a serverless runtime (GCP Cloud Run) for my application layer is just in order to focus on business feature development. The runtime manages container lifecycles and we're only paying for the time containers serve traffic. I could deploy the exact same Docker container outside a serverless environment as well.
My needs are still the same though: I want to instrument the various aspects of the service and its many endpoints, both with common request related metrics as well as custom metrics. The problem I face is the fundamental mismatch of Prometheus' pull architecture and the serverless runtime which doesn't even allow me to see individual container instances.

The StatsD / push-over-network approach has some serious latency impact as you both highlighted already. Additionally, it requires the deployment of a service with an external TCP API which would need to be protected from public access as well (might be easy depending on the serverless runtime provider).

Last night I was wondering if there are any other common interfaces available in serverless environments and noticed that all products by AWS (Lambda) and GCP (Functions, Run) at least provide the option to handle log streams, sometimes even log files on disk. I'm currently thinking about experimenting with an approach where containers log metrics to stdout / some file, get picked up by the serverless runtime and written to some log stream. Another service "loggateway" (or otherwise named) would then stream the logs, aggregate them and either expose them on the common /metrics endpoint or push them with remote write right away to a Prometheus instance hosted somewhere (like Grafana Cloud).
My hope is that the latency impact of logging a dozen metrics per request should be negligible, especially compared to TCP pushing. There are a lot of open questions about the log format, how to handle metric metadata (without logging it all the time), and HA deployment of the log aggregation service. Furthermore, this approach requires some support by the client libraries (I think only the Ruby client supports custom data stores).

Besides the implementation details, one major downside would be the pollution of the common log stream if the runtime provider doesn't support separate log streams (AWS Lambda only supports stdout/stderr I think). Anything else I'm missing which would make this idea infeasible?
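To make the "loggateway" idea a little more concrete, here is a minimal sketch of the aggregation step only. The log format is entirely an assumption made for this example (a `METRIC ` prefix followed by a JSON object with `name`, `labels`, and `inc` fields); the thread leaves the format as an open question, and reading from a cloud provider's log stream is not shown.

```python
import json
from collections import defaultdict

PREFIX = "METRIC "  # assumed marker separating metric lines from ordinary log lines


def aggregate(log_lines):
    """Sum counter increments found in log lines, keyed by (name, sorted labels)."""
    totals = defaultdict(float)
    for line in log_lines:
        if not line.startswith(PREFIX):
            continue  # ordinary application log line, ignore
        try:
            record = json.loads(line[len(PREFIX):])
            labels = tuple(sorted(record.get("labels", {}).items()))
            totals[(record["name"], labels)] += float(record["inc"])
        except (ValueError, KeyError, TypeError, AttributeError):
            continue  # malformed metric line, skip it rather than fail the stream
    return dict(totals)


def render(totals):
    """Render aggregated counters as Prometheus text exposition lines."""
    out = []
    for (name, labels), value in sorted(totals.items()):
        label_str = ",".join(f'{k}="{v}"' for k, v in labels)
        out.append(f"{name}{{{label_str}}} {value:g}")
    return "\n".join(out) + "\n"


sample_log = [
    "request started",
    'METRIC {"name": "http_requests_total", "labels": {"code": "200"}, "inc": 1}',
    'METRIC {"name": "http_requests_total", "labels": {"code": "200"}, "inc": 1}',
    "METRIC not-json",
]
print(render(aggregate(sample_log)))
```

The function side then only needs to print one line per increment, which is where the hoped-for latency advantage over a TCP push would come from.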


--
You received this message because you are subscribed to the Google Groups "Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-devel...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-developers/20210618221656.GS3670%40jahnn.

Richard Hartmann

Jun 22, 2021, 6:09:56 AM

to Bartłomiej Płotka, Prometheus Developers

This has come up in the context of OM, OTel, and TAG Observability. My
own thinking largely mirrors beorn's & grobie's: In a perfect world
the orchestration layer has all the information and interfaces
required and billing knows about the required datapaths. NB:
Monitoring usually has higher speed and lower reliability requirements
than billing. Still, for doability, lock-in, convenience, and velocity
reasons, it's enticing to bypass the ideal solution and do something
that works-ish now. If someone incurs ~100% overhead for monitoring
lightweight functions but gets their job done, they are still
getting their job done and can optimize later if they so choose.

Pushing might appear hamfisted here, and arguably is, but it's largely
under the control of the dev; as such, they can do it with less
coordination. This might get us near to using the Prometheus Agent as
a Collector to reduce latency and blast radius. Far from ideal, but...

An in-between would be what grobie said: To speak in Prometheus terms,
the orchestrator is node_exporter, the serverless functions write out
something which the textfile collector can ingest.
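As a sketch of the "write out something the textfile collector can ingest" idea: the node_exporter textfile collector reads `*.prom` files from a directory, and the usual advice is to write such files atomically so that a scrape never sees a half-written file. The helper name, the file name, and the metric below are hypothetical; how a serverless function would get a writable, shared directory is an open question that this example does not address.

```python
import os
import tempfile


def write_prom_file(directory, filename, lines):
    """Write metric lines to `filename` atomically: temp file first, then rename."""
    # The temp file does not end in .prom, so a collector globbing *.prom skips it.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            f.write("\n".join(lines) + "\n")
        # os.replace is an atomic rename when source and target share a filesystem.
        os.replace(tmp_path, os.path.join(directory, filename))
    finally:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)  # clean up only if the rename did not happen


# Demo in a throwaway directory standing in for the collector's textfile directory.
with tempfile.TemporaryDirectory() as demo_dir:
    write_prom_file(
        demo_dir,
        "resize_image.prom",
        [
            "# TYPE function_invocations_total counter",
            'function_invocations_total{function="resize_image"} 42',
        ],
    )
    demo_files = os.listdir(demo_dir)
    with open(os.path.join(demo_dir, "resize_image.prom")) as f:
        demo_content = f.read()

print(demo_content)
```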

OpenMetrics deliberately supports push, but this approach creates
issues with `up` and staleness handling. OTel is currently facing
similar issues, maybe there's room for cooperation. Also see
https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#supporting-target-metadata-in-both-push-based-and-pull-based-systems
and https://docs.google.com/document/d/1hn-u6WKLHxIsqYT1_u6eh94lyQeXrFaAouMshJcQFXs/edit#heading=h.e4p9f543e7i2

I strongly believe that we should be particular about the wire format;
in a future in which orchestrators have a collector component, it
would be nice to be able to simply expose the metrics for pulling or
use PRW code and wire format.

Best,
Richard

Tobias Schmidt

Jun 22, 2021, 10:32:28 AM

to Richard Hartmann, Bartłomiej Płotka, Prometheus Developers

On Tue, Jun 22, 2021 at 12:09 PM Richard Hartmann <richih.ma...@gmail.com> wrote:

> This has come up in the context of OM, OTel, and TAG Observability. My
> own thinking largely mirrors beorn's & grobie's: In a perfect world
> the orchestration layer has all the information and interfaces
> required and billing knows about the required datapaths. NB:
> Monitoring usually has higher speed and lower reliability requirements
> than billing. Still, for doability, lock-in, convenience, and velocity
> reasons, it's enticing to bypass the ideal solution and do something
> that works-ish now. If someone incurs ~100% overhead for monitoring
> lightweight functions but gets their job done, they are still
> getting their job done and can optimize later if they so choose.
>
> Pushing might appear hamfisted here, and arguably is, but it's largely
> under the control of the dev; as such, they can do it with less
> coordination. This might get us near to using the Prometheus Agent as
> a Collector to reduce latency and blast radius. Far from ideal, but...
>
> An in-between would be what grobie said: To speak in Prometheus terms,
> the orchestrator is node_exporter, the serverless functions write out
> something which the textfile collector can ingest.

There is not much overlap between the node_exporter and the functionality needed here. It would need something which can read common log streams from major cloud providers / serverless runtimes, aggregate the logs, and then expose them. Only the last part is somewhat available in the node_exporter and the rest doesn't really make sense there. Google's mtail would be a bit closer conceptually, but as we have full control over the clients and wire format there is no need for a full-fledged log parsing engine, and the cloud provider log reading part is still missing.


Bjoern Rabenstein

Jun 24, 2021, 5:09:11 PM

to Tobias Schmidt, Bartłomiej Płotka, Prometheus Developers

On 22.06.21 11:26, Tobias Schmidt wrote:
>
> Last night I was wondering if there are any other common interfaces
> available in serverless environments and noticed that all products by AWS
> (Lambda) and GCP (Functions, Run) at least provide the option to handle log
> streams, sometimes even log files on disk. I'm currently thinking about
> experimenting with an approach where containers log metrics to stdout /
> some file, get picked up by the serverless runtime and written to some log
> stream. Another service "loggateway" (or otherwise named) would then stream
> the logs, aggregate them and either expose them on the common /metrics
> endpoint or push them with remote write right away to a Prometheus instance
> hosted somewhere (like Grafana Cloud).

Perhaps I'm missing something, but isn't that
https://github.com/google/mtail ?

Rob Skillington

Jun 25, 2021, 12:11:28 AM

to Bjoern Rabenstein, Bartłomiej Płotka, Prometheus Developers, Tobias Schmidt

With respect to OpenMetrics push, we had something very similar at $prevco: it pushed a payload that looked very similar to the OpenMetrics protobuf payload (but was a Thrift snapshot of an aggregated set of in-process metrics) and was used by short-running tasks (Jenkins, Flink jobs, etc.).

I definitely agree it's not ideal; ideally the platform provider can supply a collection point. (There is a Jenkins plug-in that can do this, but custom metrics are very hard, nigh impossible, to make work with it. And this is a non-cloud-provider environment where it is actually possible to make this work; just no one has made it seamless.)

I agree with Richi that something that could push to a Prometheus Agent-like target that supports OpenMetrics push could be a good middle ground, with the right support / guidelines:

- A way to specify multiple Prometheus Agent targets and quickly fail over from one to another if one is not responding within $X ms (you could imagine a 5ms budget for each with a max of 3 tried, introducing at worst 15ms overhead when all are down in 3 local availability zones, but in general this is a disaster case)

- Deduplication ability so that a retried push is not double counted. This might mean timestamping the metrics… (so if written twice, only the first record is kept, etc.)
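The two requirements above can be sketched roughly as follows. Everything here is hypothetical: the `send` callable stands in for a real network push, the target names are made up, and deduplication is done with a per-push ID rather than the timestamping mentioned above, simply because that is the easiest variant to show in a few lines.

```python
def push_with_failover(payload, targets, send, budget_ms=5, max_attempts=3):
    """Try up to `max_attempts` targets in order, giving each `budget_ms`.

    `send(target, payload, timeout_ms)` is a caller-supplied function that
    returns True on success. Returns the target that accepted the push, or None.
    """
    for target in targets[:max_attempts]:
        try:
            if send(target, payload, budget_ms):
                return target
        except (TimeoutError, ConnectionError):
            pass  # this target is down or too slow, move on to the next one
    return None


class DedupReceiver:
    """Receiver that ignores a push it has already seen (keeps the first record)."""

    def __init__(self):
        self.seen = set()  # unbounded here; a real receiver would need to expire IDs
        self.total = 0.0

    def receive(self, push_id, increment):
        if push_id in self.seen:
            return False  # retried push, do not double count
        self.seen.add(push_id)
        self.total += increment
        return True


def fake_send(target, payload, timeout_ms):
    """Stand-in transport: the first agent is down, the second one answers."""
    if target == "agent-a":
        raise TimeoutError("no response within budget")
    return True


chosen = push_with_failover({"jobs_total": 1}, ["agent-a", "agent-b", "agent-c"], fake_send)
worst_case_ms = 5 * 3  # every target times out: 3 attempts at 5 ms each

receiver = DedupReceiver()
receiver.receive("push-1", 3)
receiver.receive("push-1", 3)  # retry of the same push is ignored
print(chosen, worst_case_ms, receiver.total)
```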

I think that, similar to the Pushgateway, it should generally be a last-resort option with clear limitations, so that pull remains the clear choice for anything but these environments.

Is there any interest discussing this on a call some time?

Rob


Bartłomiej Płotka

Nov 16, 2021, 2:54:49 AM

to Rob Skillington, Bjoern Rabenstein, Prometheus Developers, Tobias Schmidt

Hi All,

I would love to resurrect this thread. I think we are missing a good Pushgateway-like product that would ideally live in Prometheus (repo/binary, or something we can recommend) and convert events to metrics in a cheap way, because that is what we are dealing with when we talk about short-living containers and serverless functions. What's the latest, Rob? I would be interested in a call about this if that is still on the table. (:

I think we have some new options on the table, like supporting OTel metrics as such a potential high-cardinality event push, given there are more and more clients for that API. Potentially the OTel collector can work as such a "push gateway" proxy, but at this point it's extremely generic, so we might want to consider something more focused/efficient/easier to maintain. Let's see (: The other problem is that OTel metrics is yet another protocol. Users might want to use the Pushgateway API, remote write, or logs/traces, as per @Tobias Schmidt's idea:

> Another service "loggateway" (or otherwise named) would then stream the logs, aggregate them and either expose them on the common /metrics endpoint or push them with remote write right away to a Prometheus instance hosted somewhere (like Grafana Cloud).

Kind Regards,

Bartek Płotka (@bwplotka)

Rob Skillington

Nov 27, 2021, 6:41:22 AM

to Bartłomiej Płotka, Bjoern Rabenstein, Prometheus Developers, Tobias Schmidt

FWIW we have been experimenting with users pushing OpenMetrics protobuf payloads quite successfully, but only sophisticated exporters that can guarantee no collisions of time series, generate their own monotonic counters, etc. are using this at this time.

If you're looking for a solution that also involves aggregation support, M3 Coordinator (either standalone or combined with M3 Aggregator) supports Remote Write as a backend (and is thus compatible with Thanos, Cortex and of course Prometheus itself too due to the PRW receiver).

M3 Coordinator, however, does not have any nice support for publishing to it from a serverless environment (since the primary protocol it supports is Prometheus Remote Write, which has no metrics clients, etc., I would assume).

Rob

Rob Skillington

Nov 27, 2021, 6:50:18 AM

to Rob Skillington, Bartłomiej Płotka, Bjoern Rabenstein, Prometheus Developers, Tobias Schmidt

Here’s the documentation for using M3 Coordinator (with or without M3 Aggregator) with a backend that has a Prometheus Remote Write receiver:

https://m3db.io/docs/how_to/any_remote_storage/

Would be more than happy to do a call some time on this topic. The more we’ve looked at this, the more it looks like primarily a client library issue, well before you consider the backend/receiver aspect. There are options out there for the latter, and its problems are fairly mechanical to overcome, whereas the client library concerns involve a lot of ergonomic and practical issues, especially in a serverless environment where you may need to wait for publishing before finishing your request. Perhaps the ideal is an async process: publish a message to a local serverless message queue like SQS, and have a reader consume that and use another client library to push the data out. It would be more type safe and probably less lossy than writing logs and then reading and publishing them, but it would need good client library support for both the serverless producers and the readers/pushers.

Rob


Matthias Rampke

Nov 27, 2021, 6:22:53 PM

to Rob Skillington, Rob Skillington, Bartłomiej Płotka, Bjoern Rabenstein, Prometheus Developers, Tobias Schmidt

What properties would an ideal OpenMetrics push receiver have? In particular, I am wondering:

- What tradeoff would it make when metric ingestion is slower than metric production? Backpressure or drop data?

- What are the semantics of pushing a counter?

- Where would the data move from there, and how?

- How many of these receivers would you typically run? How much coordination is necessary between them?

From observing the use of the statsd exporter, I see a few cases where it covers ground that is not very compatible with the in-process aggregation implied by the pull model. It has the downside of mapping through a different metrics model, and its tradeoffs are informed by the ones statsd made 10+ years ago. I wonder what it would look like, remade in 2022 starting from OpenMetrics.

/MR


Colin Douch

Nov 27, 2021, 6:42:38 PM

to Matthias Rampke, Rob Skillington, Rob Skillington, Bartłomiej Płotka, Bjoern Rabenstein, Prometheus Developers, Tobias Schmidt

Just to throw my 2c in, we've been battling with this problem at (company) as we move more services to a serverless model for our customer-facing things, chiefly the issue of metrics aggregation for services that can't easily track their own state across multiple requests. For us, there are just too many metric semantics for different aggregations to express in Prometheus types, so we have resorted to hacks such as https://github.com/sinkingpoint/gravel-gateway to be able to express these. The wider variety of OpenMetrics types solves most of these issues, but that requires push gateway support as above, and a non-zero effort from clients to migrate to OpenMetrics client libs (if those even exist for their languages of choice).

_We_ answer the questions above in the following way:

> What tradeoff would it make when metric ingestion is slower than metric production? Backpressure or drop data?

Just drop it, with metrics to indicate that data was dropped

> What are the semantics of pushing a counter?

Aggregation by summing by default with different options available, configurable by the client

> Where would the data move from there, and how?

Exposed as per the push gateway as a regular Prometheus scrape

> How many of these receivers would you typically run? How much coordination is necessary between them?

This gets complicated. In our setup we have a daemonset in k8s and an ingress that does consistent hashing on the service name so that any given service is routed to two different instances
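For readers unfamiliar with the routing idea, here is a small sketch of mapping a service name to two instances in a stable way. It uses rendezvous (highest-random-weight) hashing, which is one member of the consistent-hashing family; this is a choice made for the example and not necessarily what the ingress described above does. The instance names are made up.

```python
import hashlib


def pick_instances(service, instances, replicas=2):
    """Return the `replicas` instances with the highest hash score for `service`."""

    def score(instance):
        digest = hashlib.sha256(f"{service}|{instance}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(instances, key=score, reverse=True)[:replicas]


gateways = ["gw-0", "gw-1", "gw-2", "gw-3"]
chosen_pair = pick_instances("checkout", gateways)
print(chosen_pair)
```

A useful property of this scheme is that removing an instance that was not chosen for a service leaves that service's pair unchanged, so most services keep pushing to the same gateways when the set of instances changes.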

Having run this setup in production for about a year and a half now, it works for us in practice, although it's definitely not ideal. We'd welcome some sort of official OpenMetrics solution.
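The answers above (drop on overload and count the drops, sum pushed counters by default, expose the result for a regular scrape) can be illustrated with a toy gateway. This is a sketch under assumptions, not how gravel-gateway or any other project is implemented: the class, the `mode` names, and the `gateway_dropped_pushes_total` metric are all invented for this example.

```python
from collections import deque


class AggregatingGateway:
    """Toy push receiver: bounded queue, summing by default, scrape-style output."""

    MODES = ("sum", "replace")  # hypothetical client-selectable aggregations

    def __init__(self, max_queue=1000):
        self.queue = deque()
        self.max_queue = max_queue
        self.dropped = 0
        self.values = {}

    def push(self, name, value, mode="sum"):
        if mode not in self.MODES:
            raise ValueError(f"unknown aggregation mode: {mode}")
        if len(self.queue) >= self.max_queue:
            self.dropped += 1  # ingestion is behind: drop, but record that we did
            return False
        self.queue.append((name, value, mode))
        return True

    def drain(self):
        """Apply queued pushes to the aggregated state."""
        while self.queue:
            name, value, mode = self.queue.popleft()
            if mode == "sum":
                self.values[name] = self.values.get(name, 0.0) + value
            else:  # "replace": last write wins
                self.values[name] = value

    def scrape(self):
        """Return the aggregated state as simple exposition-style lines."""
        self.drain()
        lines = [f"{name} {value:g}" for name, value in sorted(self.values.items())]
        lines.append(f"gateway_dropped_pushes_total {self.dropped}")
        return "\n".join(lines) + "\n"


gw = AggregatingGateway(max_queue=2)
accepted = [gw.push("jobs_total", 1), gw.push("jobs_total", 2), gw.push("jobs_total", 5)]
scraped = gw.scrape()
print(scraped)
```

With a queue limit of 2, the third push is dropped, so the scrape shows the sum of the first two pushes plus a drop count of 1.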

- Colin


Stavros Kontopoulos

Jan 19, 2022, 8:55:57 AM

to Prometheus Developers

Hi all!

I hope it is not too late to join the discussion. I would like to revive it, as I find it really useful for Knative and any serverless framework. As a Knative contributor who also works on the monitoring side of the project, here is my POV:

a) OpenFaaS as an example (mentioned earlier above) might not be the best to consider, as it seems that it only provides metrics at the ingress side (Gateway), similar to what you get from a service mesh like Istio when you monitor its ingress. I don't see any option to collect user metrics, at least out of the box. Another serverless system, Dapr, has a sidecar for tracing that, among other things, pushes traces to the OTEL collector (https://docs.dapr.io/operations/monitoring/tracing/open-telemetry-collector). Although Dapr still uses a pull model for metrics, this highlights the path they are taking. Knative, btw, supports different exporters, so it can use either a pull model or a push model. It is not restricted to OpenTelemetry at all.

b) What is the targeted latency for serverless? In cloud environments it is possible to get invocation latency down to milliseconds (https://aws.amazon.com/blogs/compute/creating-low-latency-high-volume-apis-with-provisioned-concurrency) for simple funcs and also minimize cold start issues. As a rule, any solution that ships metrics should take far less time than the func run and should not add considerable resource overhead. Also, depending on the cost model, users should not pay for that overhead, and you need to at least be able to distinguish it somehow. Regarding latency, some apps can tolerate seconds or even minutes of latency, so it depends on how people want to ship metrics given their scenario. Btw, as background info, Knative cold start time is a few seconds (https://groups.google.com/g/knative-users/c/vqkP95ibq60).

c) There is a question whether the serverless runtime should provide metrics forwarding/collection. I would say it is possible for at least the end-to-end traffic metrics. This is for metrics related to requests entering the system, e.g. at ingress, where each request usually corresponds to a function invocation (Knative has this 1-1 mapping). Ingress seems the right point for robustness reasons. For example, a request may fail at different stages, and this is also true for Knative, where different components may be on the request path. For any other metric, including user metrics, I would say that a different, localized approach for gathering metrics seems preferable. Separation of concerns is one reason behind this, as we don't want centralized components to become a metric sink like a collector while also doing other stuff like scaling apps etc.

Looking at a possible generic solution, I would guess this to be based on a local agent. Afaik a local TCP connection is at that ms scale, including time for sending a few KB of metrics data. Of course this is not the only option: metrics could be written to some local file whose contents are then streamed (the log solution mentioned above). Ideally, an architecture that ships metrics locally to some agent on a node would roughly satisfy the reqs (which should be captured in detail, btw). That agent would then be able to push metrics to a metrics collector, either via remote write if it is Prometheus-based, or via some other way if it is an OTEL node agent (https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/design.md#running-as-an-agent), etc. This is already done elsewhere, for example at AWS (https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-open-telemetry.html).

Best,
Stavros

Bartłomiej Płotka

Apr 5, 2022, 6:26:55 AM

to Prometheus Developers

Thanks a lot for the feedback so far!

It's not a forgotten topic. We are actively gathering feedback from different projects/teams, and input from the Knative project is really valuable. There will also be two talks about monitoring short-living jobs at the next KubeCon EU:

* Operating Prometheus in a Serverless World - Colin Douch, Cloudflare

* Fleeting Metrics: Monitoring Short-lived or Serverless Jobs with Prometheus - Bartłomiej Płotka & Saswata Mukherjee, Red Hat

We are working with Saswata and Colin on making sure we don't miss any requirements, so we can explain the current situation and propose a way forward.

FYI: We are meeting tomorrow with the OpenFaaS community to learn from them too: https://twitter.com/openfaas/status/1511266154005807107 if you want to join! 🤗