--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To post to this group, send email to promethe...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/8b0045c6-1b88-477b-983a-82bea28d61e4%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Hi Julius,
Thanks for the quick response. Seems like we definitely misunderstood the use of federation.
We have two data centers each with their own set of Prometheus servers. We use service discovery to have each Prometheus scrape only the instances in its respective data center.
Our application servers (which generate these large amounts of metrics) are split roughly 50/50 between the two data centers. We wanted to use federation so that a global Prometheus could scrape the Prometheus servers in each data center. That would let us build dashboards and alerts that reflect all of our application servers.
The alternative we tried earlier, which worked, was a global Prometheus that scraped all the application servers across all data centers directly (we scaled up the instance type whenever we needed more capacity). Our worry was that if one data center went down, the availability or performance of the global Prometheus would be affected (which might be an incorrect assumption).
Are there any drawbacks you can think of with our previous approach?
Hi Brian,
We have a similar setup, with one Prometheus scraper per data center. A global federate server scrapes these Prometheus servers in the individual data centers. For some reason, when we run a PromQL query, the federate nodes show a different count than the individual scraper nodes. Any thoughts?
Thanks Brian. We ran the following query against all 4 federate nodes, with the results listed below it:
topk(5, sum(sum_over_time(publish[30m])) by (root_topic))
Node 1: 530
Node 2: 560
Node 3: 500
Node 4: 510
When we run the same query against the scraper nodes:
Scraper 1: 2340
Scraper 2: 2125
Not sure why there is such a large difference between the federate nodes and the scraper nodes.
Thanks,
Govind
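One thing that may be worth checking when comparing these numbers: sum_over_time adds up every sample that falls inside the 30m window, so its result depends on how many samples each server has stored, not only on the metric's value. The sketch below is a simplified model, not Prometheus itself, and the 15s and 60s scrape intervals and the constant sample value of 1 are purely illustrative assumptions, not settings taken from this thread. It only shows that the same series, sampled less often, yields a proportionally smaller sum.

```python
# Simplified model of sum_over_time over a 30-minute window.
# The scrape intervals (15s and 60s) and the constant value are
# hypothetical; they are not taken from the setup described above.

def scrape(value, interval_s, duration_s):
    """Return (timestamp, value) samples collected every interval_s seconds."""
    return [(t, value) for t in range(interval_s, duration_s + 1, interval_s)]

def sum_over_time(samples, window_start, window_end):
    """Sum the values of all samples with window_start < t <= window_end."""
    return sum(v for t, v in samples if window_start < t <= window_end)

WINDOW_S = 30 * 60  # the [30m] range used in the query

# Same underlying series (constant value 1), stored at two sampling rates.
scraper_samples = scrape(1, 15, WINDOW_S)   # hypothetical direct scrape interval
federate_samples = scrape(1, 60, WINDOW_S)  # hypothetical federation scrape interval

scraper_total = sum_over_time(scraper_samples, 0, WINDOW_S)
federate_total = sum_over_time(federate_samples, 0, WINDOW_S)

print(scraper_total, federate_total, scraper_total / federate_total)
```

If the federate and scraper nodes do turn out to use different scrape intervals, comparing avg_over_time or count_over_time for the same selector on both sides would help confirm whether sample count alone explains the gap.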