Federation aggregating all metrics

anja...@gmail.com

Mar 28, 2018, 11:30:57 AM

to Prometheus Users

I have multiple datacenters, and I plan to place a single Prometheus in each one, scraping data from a few thousand apps in that datacenter. I also want to set up federation to provide a global view that aggregates all the app instances of each datacenter.

The problem I’ve run into is that federation appears to expect a rule explicitly defined for every single metric name. This is highly impractical for my use case, as app teams may introduce new metric names at any time via a Prometheus library, and these teams expect to be able to see their data aggregated globally. It appears to mean I would need to add a new aggregation rule on the datacenter-level Prometheus servers every time an app team introduces a new metric name.

Thus, I’m hoping there’s a way I can configure this setup to aggregate every potential metric name with just a single rule or a handful of rules. Ideally, I’d love to do something like take the sum of all counter metrics and the avg of all gauge metrics, but I’m fairly certain Prometheus doesn’t store metric types and thus this isn’t possible.

The next best solution, I think, may be to have a single rule on each DC-level Prometheus that applies to absolutely every single metric and produces a SUM-aggregation metric, and then a second rule on each DC-level Prometheus that applies to absolutely every single metric and produces an AVG-aggregation metric. These aggregation metrics would then of course be collected by the federation. Is this possible? How would I configure this in the datacenter Prometheus rule file?
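As an illustration only (not something taken from the Prometheus docs), the fallback of one rule pair per metric name could at least be generated by a script rather than written by hand. A minimal Python sketch, assuming the metric names are already available as a list, that the recorded series follow a hypothetical `datacenter:<metric>:<op>` naming pattern, and that `instance` is the label to aggregate away:

```python
from typing import Iterable


def build_rule_file(metric_names: Iterable[str], level: str = "datacenter") -> str:
    """Render rule-file text with one sum rule and one avg rule per metric name.

    The `level` prefix and the `instance` label are assumptions made for
    illustration; adjust them to the labels actually present on the series.
    """
    lines = ["groups:", f"  - name: {level}_aggregations", "    rules:"]
    # De-duplicate and sort so the generated file is stable between runs.
    for name in sorted(set(metric_names)):
        for op in ("sum", "avg"):
            lines.append(f"      - record: {level}:{name}:{op}")
            lines.append(f"        expr: {op} without (instance) ({name})")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical metric names, used only to show the shape of the output.
    print(build_rule_file(["http_requests_total", "queue_depth"]))
```

This still needs one rule pair per metric name; it only automates writing them, so the list of names would have to be refreshed whenever a team adds a metric.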

Or is there a fundamentally different approach I should take to give app teams a global view of their data? I don’t think simply using multiple datasources in Grafana is a valid solution, because you can’t perform global aggregations across multiple datacenters (e.g. a team wouldn’t be able to see the total number of requests they’ve had globally).

Ben Kochie

Mar 28, 2018, 2:58:52 PM

to anja...@gmail.com, Prometheus Users

Federation is designed for cluster-level aggregations into a global view, not as a way to centralize all metrics[0].
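For context, federation works by the global server scraping the `/federate` endpoint of each datacenter server, passing one or more `match[]` selectors that pick which series to pull. A small Python sketch of building such a URL, assuming a hypothetical host `dc1-prometheus.example:9090` and pre-aggregated series whose names carry a hypothetical `datacenter:` prefix:

```python
from typing import Sequence
from urllib.parse import parse_qs, urlencode, urlsplit


def federate_url(base_url: str, selectors: Sequence[str]) -> str:
    """Build a /federate URL with one match[] parameter per series selector."""
    # urlencode percent-encodes the brackets, braces and quotes in the selectors.
    query = urlencode([("match[]", selector) for selector in selectors])
    return f"{base_url.rstrip('/')}/federate?{query}"


if __name__ == "__main__":
    # Hypothetical host and selector, for illustration only.
    url = federate_url(
        "http://dc1-prometheus.example:9090",
        ['{__name__=~"datacenter:.*"}'],
    )
    print(url)
    # Decode the query string again to show the selectors that will be sent.
    print(parse_qs(urlsplit(url).query)["match[]"])
```

Selecting only the aggregated series keeps the global server small, which is the usage federation is aimed at.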

There are two other options for centralizing all metrics.

Thanos: https://github.com/improbable-eng/thanos

Cortex: https://github.com/weaveworks/cortex

They both take slightly different approaches, but in the end you can get a global query view of all data. In the case of Cortex, there are companies providing this as a service, so you don't need to operate it yourself.

https://kausal.co/

https://www.weave.works/product/cloud/

[0]: https://www.robustperception.io/federation-what-is-it-good-for/
