Hello,

We're using Prometheus to monitor multiple datacentres, and federating key service-level metrics up to global Prometheus instances.

In each datacentre, we have Prometheus installed on 1-3 machines, depending on the size of the datacentre. This is partly for consistency (these machines are configured alike, so it makes sense for them all to run Prometheus) and partly for high availability: if we lose one machine, we don't lose monitoring for that datacentre.

The difficulty comes when federating metrics. All the instances are federating up, so we see each federated metric duplicated 1-3 times. The size is not an issue (we are federating only select metrics), but it makes it clumsy to query them at the top-level instance, since you have to pick which federated instance you're querying against.

So far my thoughts were to either:

a) use a local forward HTTP proxy, running on the same machine as the top-level Prometheus, that round-robins between the HA Promethei in the datacentres and fails over if one of them dies
b) use relabelling somehow to drop the redundant metrics (using something along the lines of hashmod: https://prometheus.io/docs/operating/configuration/#relabel_config)

Does anyone have experience of this setup, or suggestions on best practices for dealing with duplicate federated metrics?

Thanks,
Matt
--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscribe@googlegroups.com.
To post to this group, send email to prometheus-users@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/CAH6-%3DC%2BtmkhOrHGGjT7s%2Bh5ywrfa_xPMGxae5zs_ajoRmjkc9g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
On 6 December 2016 at 22:34, Matt Bostock <ma...@mattbostock.com> wrote:

> a) use a local forward HTTP proxy, running on the same machine as the top-level Prometheus, that round-robins between the HA Promethei in the datacentres and fails over if one of them dies

That may cause artifacts.

Generally I'd suggest either scraping just one of the per-DC Prometheus servers, or scraping all of them. A key point is that all the Prometheus servers should have distinct external_labels to avoid clashes, and then do a min/max/mean in your queries.

> b) use relabelling somehow to drop the redundant metrics (using something along the lines of hashmod: https://prometheus.io/docs/operating/configuration/#relabel_config)

That won't help. All relabelling is stateless.

Brian
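
As a rough illustration of the "distinct external_labels, then min/max/mean" suggestion, the sketch below mimics in plain Python what an aggregation that ignores the per-server label does at query time. It is not Prometheus code. The label name `replica`, the metric name `job:requests:rate5m`, and the sample values are all made up for the example; in PromQL the corresponding query would look something like `max without (replica) (job:requests:rate5m)`, assuming each per-DC server sets a distinct `replica` value in its external_labels.

```python
from collections import defaultdict

# Hypothetical label that each per-DC Prometheus sets via external_labels.
REPLICA_LABEL = "replica"


def aggregate_without_replica(samples, agg=max):
    """Collapse series that differ only in REPLICA_LABEL into one value.

    `samples` is an iterable of (labels_dict, value) pairs taken at one
    instant. `agg` is any function over a list of floats (min, max, ...).
    """
    groups = defaultdict(list)
    for labels, value in samples:
        # Group by every label except the one identifying the replica.
        key = frozenset((k, v) for k, v in labels.items() if k != REPLICA_LABEL)
        groups[key].append(value)
    return {key: agg(values) for key, values in groups.items()}


# Made-up federated samples: dc1 has two replicas, dc2 has one.
samples = [
    ({"__name__": "job:requests:rate5m", "dc": "dc1", "replica": "a"}, 120.0),
    ({"__name__": "job:requests:rate5m", "dc": "dc1", "replica": "b"}, 118.0),
    ({"__name__": "job:requests:rate5m", "dc": "dc2", "replica": "a"}, 75.0),
]

deduped = aggregate_without_replica(samples)
for key, value in sorted(deduped.items(), key=lambda kv: sorted(kv[0])):
    print(dict(sorted(key)), value)
```

Because the grouping only looks at the samples that are present, a datacentre still yields a value when one of its replicas is missing, which is the property the stateless relabelling approach cannot provide.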