Patterns for grouping and relabeling in an exporter

Peter Bourgon

Oct 13, 2020, 9:32:32 PM
to promethe...@googlegroups.com
The fastly-exporter [0] converts statistics scraped from the Fastly
real-time stats API to metrics that can be scraped by Prometheus. When
configured with many services that receive a lot of global traffic,
the metric cardinality can grow quite large. And I'll soon add a new
data source that will potentially ~double metric cardinality again.

The exporter can already be configured to export only e.g. specific
metrics. But users are suggesting it would be good to be able to
manipulate the exported metrics in a more powerful way. For example,
they'd like a way to define regions as sets-of-datacenters (a label
applied to all metrics) and then "collapse" (merge) the metrics so
that the datacenter labels are erased and the region label applied.
Or, to define feature groups as sets-of-metric-names, and then turn
entire feature groups on or off. Importantly, this should all be done
within the exporter itself, not via Prometheus relabeling rules,
because we're explicitly trying to reduce load on the Prometheus
servers scraping the exporter.
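
To make the "collapse" idea concrete, here is a minimal sketch of merging per-datacenter series into per-region series before exposition. Everything in it is an assumption for illustration: the `REGIONS` mapping, the datacenter codes, the metric name, and the `(metric_name, service, datacenter)` key shape are hypothetical and not taken from fastly-exporter. Summing is only meaningful for additive series such as counters.

```python
from collections import defaultdict

# Hypothetical region definitions: region name -> set of datacenter codes.
REGIONS = {
    "europe": {"AMS", "FRA", "LHR"},
    "north_america": {"JFK", "SJC"},
}


def collapse_by_region(samples, regions, fallback="other"):
    """Sum per-datacenter samples into per-region samples.

    samples maps (metric_name, service, datacenter) -> value.
    Returns a dict mapping (metric_name, service, region) -> summed value.
    Datacenters that belong to no region are grouped under `fallback`.
    """
    # Invert the region definitions into a datacenter -> region lookup.
    dc_to_region = {dc: region for region, dcs in regions.items() for dc in dcs}
    merged = defaultdict(float)
    for (name, service, datacenter), value in samples.items():
        region = dc_to_region.get(datacenter, fallback)
        # The datacenter label is dropped; the region label replaces it.
        merged[(name, service, region)] += value
    return dict(merged)


# Hypothetical input: four datacenter-level series for one service.
samples = {
    ("requests_total", "svc1", "AMS"): 10.0,
    ("requests_total", "svc1", "FRA"): 5.0,
    ("requests_total", "svc1", "JFK"): 7.0,
    ("requests_total", "svc1", "SYD"): 1.0,
}
collapsed = collapse_by_region(samples, REGIONS)
```

In this sketch four datacenter-level series become three region-level series, so the cardinality reduction depends on how many datacenters each region absorbs.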

I can come up with a little DSL or config format that could get the
job done. But is there some prior art here that's been successful?

[0] https://github.com/peterbourgon/fastly-exporter

Brian Brazil

Oct 14, 2020, 4:12:24 AM
to Peter Bourgon, Prometheus Users
I'm not aware of anything generic in this space.

Disabling specific expensive collectors via flags isn't unusual; both the mysqld and node exporters support it, for example.
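
A rough sketch of what flag-based switching could look like if applied to groups of metric names rather than collectors. The `--disable-group` flag, the group names, and the metric names are all hypothetical; they are not flags of the mysqld, node, or fastly exporters.

```python
import argparse

# Hypothetical feature groups: group name -> metric names it contains.
FEATURE_GROUPS = {
    "core": {"requests_total", "errors_total"},
    "tls": {"tls_total", "tls_v12_total"},
    "images": {"imgopto_total"},
}


def parse_args(argv):
    """Parse a repeatable, hypothetical --disable-group flag."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--disable-group",
        action="append",
        default=[],
        choices=sorted(FEATURE_GROUPS),
        help="feature group to turn off (may be given more than once)",
    )
    return parser.parse_args(argv)


def enabled_metrics(disabled_groups):
    """Return the metric names of every group that was not disabled."""
    enabled = set()
    for group, names in FEATURE_GROUPS.items():
        if group not in disabled_groups:
            enabled |= names
    return enabled


args = parse_args(["--disable-group", "tls", "--disable-group", "images"])
metrics = enabled_metrics(set(args.disable_group))
```

Because the groups are fixed in the exporter and only toggled by flags, users have no new language to learn, which is the trade-off being suggested here.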

There are also a handful of things out there where you exclude labels that some metrics would usually have, e.g. removing the datacenter label. Dynamically adding in a region label is a bit odd though, as new instrumentation labels are a breaking change for downstream. If that's a useful label, it should be there in the first place, not only something that appears when you're trying to reduce cardinality. I'd suggest sticking with simple flags over creating yet another language that users have to learn.

Brian
 


Ben Kochie

Oct 14, 2020, 4:36:32 AM
to Peter Bourgon, Prometheus Users
Another way to help with scrapes would be to provide URL params for subsets.

Something like `/metrics?region=FOO` would allow the scrape to be broken up into several smaller scrapes.
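
A minimal sketch of the handler-side logic for such a parameter, assuming samples are held as `(metric_name, labels, value)` tuples. The function names and the sample data are hypothetical; only the `/metrics?region=FOO` shape comes from the suggestion above.

```python
from urllib.parse import parse_qs, urlparse


def filter_for_request(path, samples):
    """Keep only samples whose region label matches ?region=... in path.

    samples is a list of (metric_name, labels_dict, value) tuples.
    A request without a region parameter gets every sample.
    """
    query = parse_qs(urlparse(path).query)
    wanted = set(query.get("region", []))
    if not wanted:
        return list(samples)
    return [s for s in samples if s[1].get("region") in wanted]


def render(samples):
    """Render samples as simplified text exposition lines (no HELP/TYPE)."""
    lines = []
    for name, labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical data: one series per region.
all_samples = [
    ("requests_total", {"region": "FOO"}, 3),
    ("requests_total", {"region": "BAR"}, 4),
]
body = render(filter_for_request("/metrics?region=FOO", all_samples))
```

On the Prometheus side, each subset would then be its own scrape job that passes the query parameter through the `params` field of its scrape config.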

Peter Bourgon

Oct 18, 2020, 8:20:09 PM
to Brian Brazil, Prometheus Users
On Wed, Oct 14, 2020 at 10:12 AM Brian Brazil
<brian....@robustperception.io> wrote:
> I'm not aware of anything generic in this space.

As I thought. Thanks for the confirmation.

> Disabling specific expensive collectors via flags isn't unusual, both the mysqld and node exporters have it for example.

In my case, it's not that any given collector or set of metrics is in
itself expensive, but that all of the metrics and common label
dimensions are duplicated per-service, and power users can have many,
many services. I already have affordances to select metric names and
services in different ways, including "sharding" services over
different exporter instances. But even that is not enough for some
users.
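
For readers unfamiliar with the idea, sharding services over exporter instances can be sketched as a stable hash of the service ID taken modulo the number of instances. This is an illustrative assumption about how such sharding might work, not a description of fastly-exporter's implementation; the function names and service IDs are hypothetical.

```python
import hashlib


def shard_of(service_id, shard_count):
    """Map a service ID to a shard index in [0, shard_count).

    Uses SHA-256 rather than the built-in hash() so that the result is
    identical across processes and restarts.
    """
    digest = hashlib.sha256(service_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def services_for_shard(service_ids, shard_index, shard_count):
    """Return the services that one exporter instance is responsible for."""
    return [s for s in service_ids if shard_of(s, shard_count) == shard_index]


# Hypothetical service IDs split over three exporter instances.
services = ["svc-a", "svc-b", "svc-c", "svc-d"]
assignment = {i: services_for_shard(services, i, 3) for i in range(3)}
```

Each instance exports the full metric and label set for its own services only, so sharding spreads the series over more scrape targets without reducing the total number of series, which is consistent with it not being enough for some users.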

> There are also a handful of things out there where you exclude labels that some metrics would usually have, e.g. removing the datacenter label. Dynamically adding in a region label is a bit odd though, as new instrumentation labels are a breaking change for downstream. If that's a useful label, it should be there in the first place, not only something that appears when you're trying to reduce cardinality. I'd suggest sticking with simple flags over creating yet another language that users have to learn.

To be clear, the notion of regions was just a speculative example of
how users might want to define and re-group metrics; it's not a
well-defined domain concept.