Prometheus federation and Grafana

dc3o

Feb 12, 2021, 8:32:31 AM

to Prometheus Users

I currently manage a few independent Prometheus instances located in different datacenters in different geographic locations. All alerting is handled by a local (per-datacenter) Alertmanager, and each datacenter has its own Grafana. I'm rethinking this setup and would like to move all the dashboards (from the different datacenters) to a single Grafana instance. As I see it, there are two options:

1. Use federation: provision a single Prometheus that replicates the data from the remote Prometheus servers, and set this single Prometheus as the only Grafana datasource.
2. Create a datasource for every remote Prometheus server.

...or is moving Prometheus to a remote storage solution (Cortex, Thanos, ...) also an option?
Which one do you recommend?

TY

Julius Volz

Feb 12, 2021, 8:56:22 AM

to dc3o, Prometheus Users

On Fri, Feb 12, 2021 at 2:32 PM dc3o <deln...@gmail.com> wrote:

I currently manage a few independent Prometheus instances located in different datacenters in different geographic locations. All alerting is handled by a local (per-datacenter) Alertmanager, and each datacenter has its own Grafana. I'm rethinking this setup and would like to move all the dashboards (from the different datacenters) to a single Grafana instance. As I see it, there are two options:
Use federation: provision a single Prometheus that replicates the data from the remote Prometheus servers, and set this single Prometheus as the only Grafana datasource.

Federation is not recommended for replicating an entire Prometheus server into another one; it is not optimized for that use and does not work well for it. With federation you would usually pull only some aggregated data into a global Prometheus server, not all the instance-level detail.
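As a rough illustration of what "only some aggregated data" can look like, here is a minimal Python sketch that builds a request URL for a source server's `/federate` endpoint, restricted by `match[]` selectors. The host name and the `job:` / `dc:` recording-rule prefixes are assumptions made for the example, not details from this thread.

```python
from urllib.parse import urlencode


def federate_url(base_url, selectors):
    """Build a /federate URL that only pulls series matching the selectors."""
    # Each selector becomes its own match[] query parameter.
    query = urlencode([("match[]", s) for s in selectors])
    return f"{base_url.rstrip('/')}/federate?{query}"


# Hypothetical source server and selectors: only pre-aggregated recording
# rules (assumed to be named "job:..." and "dc:...") are requested, rather
# than every instance-level series.
url = federate_url(
    "http://prometheus-dc1.example:9090",
    ['{__name__=~"job:.*"}', '{__name__=~"dc:.*"}'],
)
print(url)
```

In a real setup the global Prometheus would scrape this endpoint through a scrape job rather than a script; the sketch only shows how the selection is expressed.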

Create a datasource for every remote Prometheus server.

This is the usual way, yes.
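For the one-datasource-per-server approach, a minimal sketch of how the datasource definitions could be generated is shown below. It builds JSON payloads in the shape accepted by Grafana's `POST /api/datasources` HTTP API. The datacenter names and URLs are placeholders, and the script deliberately stops short of sending anything.

```python
import json

# Hypothetical datacenters and Prometheus URLs; replace with real ones.
PROMETHEUS_SERVERS = {
    "dc-east": "http://prometheus.dc-east.example:9090",
    "dc-west": "http://prometheus.dc-west.example:9090",
}


def datasource_payloads(servers):
    """Return one Grafana datasource definition per Prometheus server."""
    return [
        {
            "name": f"Prometheus {dc}",
            "type": "prometheus",
            "url": url,
            # "proxy" means the Grafana backend performs the queries.
            "access": "proxy",
        }
        for dc, url in sorted(servers.items())
    ]


payloads = datasource_payloads(PROMETHEUS_SERVERS)
for payload in payloads:
    print(json.dumps(payload))
```

A dashboard variable of type "data source" can then let a single dashboard switch between datacenters.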

...or is moving Prometheus to a remote storage solution (Cortex, Thanos, ...) also an option?

That's fine as well. Especially with Thanos, you could just use the Sidecar and Querier components to build a global query layer without having to build up a separate storage system (object store uploads & read integration through the Store gateway are optional).

I think either Thanos or just using multiple datasources is fine. If you really care about running a single PromQL query on data from multiple servers at once (e.g. to aggregate over multiple geolocations), then you'd need Thanos.
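To make the difference concrete: with separate datasources, any aggregation across datacenters has to happen outside PromQL. The sketch below sums instant-query results from several servers on the client side, assuming each response has the JSON shape returned by the Prometheus `/api/v1/query` endpoint. The sample responses and label values are made up. A global query layer such as the Thanos Querier lets a single PromQL expression do this instead.

```python
from collections import defaultdict


def sum_across_servers(responses, by):
    """Sum instant-vector samples from several servers, grouped by labels."""
    totals = defaultdict(float)
    for resp in responses:
        if resp.get("status") != "success":
            raise ValueError("query did not succeed")
        data = resp["data"]
        if data["resultType"] != "vector":
            raise ValueError("expected an instant vector")
        for sample in data["result"]:
            # Group key: the values of the requested labels, in order.
            key = tuple(sample["metric"].get(label, "") for label in by)
            # Each value is a [timestamp, "string value"] pair.
            totals[key] += float(sample["value"][1])
    return dict(totals)


# Made-up responses standing in for two datacenters.
dc1 = {"status": "success", "data": {"resultType": "vector", "result": [
    {"metric": {"job": "api"}, "value": [1613138000, "1.5"]},
    {"metric": {"job": "db"}, "value": [1613138000, "3.0"]},
]}}
dc2 = {"status": "success", "data": {"resultType": "vector", "result": [
    {"metric": {"job": "api"}, "value": [1613138000, "2.5"]},
]}}

totals = sum_across_servers([dc1, dc2], by=["job"])
print(totals)
```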

Regards,

Julius

--

Julius Volz

PromLabs - promlabs.com

dc3o

Feb 12, 2021, 9:52:07 AM

to Prometheus Users

Regarding Alertmanager, which would you recommend:

1. a single Alertmanager per datacenter
or
2. a single Alertmanager cluster (all existing Alertmanagers in the different datacenters joined into one cluster)?
