Prometheus High Availability across different Availability Zones on AWS EKS


kanaka raju

Jan 3, 2024, 3:43:07 PM
to Prometheus Users

Hello Team,

I'm fairly new to the Prometheus architecture. I'm looking for a model where I run 3 separate Prometheus deployments spread across 3 different AZs, with Thanos or Cortex as the backend that these Prometheus instances push metrics to. The goal is to reduce our inter-AZ traffic cost, which is currently very high.

I'd like to know whether this architecture is feasible, and whether there is any documentation that covers it.



Thanks and Regards.

Bryan Boreham

Jan 4, 2024, 7:51:35 AM
to Prometheus Users
Yes, it is feasible.

One Prometheus in each AZ, scraping targets in that AZ, will minimize inter-AZ traffic, but give you a single point of failure in each AZ.
Two Prometheus instances in each AZ, each scraping the targets in that AZ, will also minimize inter-AZ traffic, and be resilient to the failure of one random* Prometheus. Cortex, Thanos, etc., can dedupe the samples from the pair.
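
As a rough illustration of what "dedupe" means here: each Prometheus in a pair carries a distinguishing label, and the backend treats series that differ only in that label as the same series. The sketch below is plain Python, not Thanos or Cortex code. The label name `replica` and the merge rule (keep the first sample seen per timestamp) are assumptions for illustration; real replicas scrape at slightly different times, so the real systems handle this differently.

```python
from collections import defaultdict

# Assumed name of the label that distinguishes the two Prometheus in a pair.
REPLICA_LABEL = "replica"

def dedupe(samples):
    """Merge samples from replicas.

    Each sample is (labels_dict, timestamp, value). Series that differ
    only in REPLICA_LABEL are treated as one series.
    """
    merged = defaultdict(dict)
    for labels, ts, value in samples:
        # Series identity ignores the replica label.
        key = tuple(sorted((k, v) for k, v in labels.items() if k != REPLICA_LABEL))
        # Keep the first value seen for a timestamp; later duplicates are dropped.
        merged[key].setdefault(ts, value)
    return {key: sorted(points.items()) for key, points in merged.items()}

samples = [
    ({"__name__": "up", "zone": "az-a", "replica": "0"}, 1000, 1.0),
    ({"__name__": "up", "zone": "az-a", "replica": "1"}, 1000, 1.0),
    # Replica 0 has no sample here, e.g. it was down for this scrape.
    ({"__name__": "up", "zone": "az-a", "replica": "1"}, 1015, 1.0),
]
result = dedupe(samples)
```

The merged series has no gap at timestamp 1015 even though one replica missed that scrape, which is the resilience the pair is meant to provide.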

"Scraping targets in that AZ" requires that metadata exists to identify those targets. For instance, in Kubernetes there should be a label giving the AZ.
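
A minimal Python sketch of that selection step: given discovered targets and their labels, keep only those whose zone label matches the zone this Prometheus runs in. `topology.kubernetes.io/zone` is the well-known Kubernetes label for the zone (set on nodes); the target dictionaries, addresses, and zone names are made up for illustration. In a real setup this filtering would normally be expressed as a `keep` relabel rule in the Prometheus scrape configuration rather than in code.

```python
# Well-known Kubernetes zone label; everything else below is hypothetical data.
ZONE_LABEL = "topology.kubernetes.io/zone"

def targets_in_zone(targets, local_zone):
    """Return only the targets labelled with local_zone.

    Targets without a zone label are dropped, since there is no
    metadata to say which AZ they live in.
    """
    return [t for t in targets if t["labels"].get(ZONE_LABEL) == local_zone]

discovered = [
    {"address": "10.0.1.12:9100", "labels": {ZONE_LABEL: "us-east-1a"}},
    {"address": "10.0.2.34:9100", "labels": {ZONE_LABEL: "us-east-1b"}},
    {"address": "10.0.1.56:9100", "labels": {}},  # no zone metadata
]
local = targets_in_zone(discovered, "us-east-1a")
```

Note that the unlabelled target is not scraped by anyone in this sketch, which is why the zone metadata has to be present on every target.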

If you had something else in mind, please clarify.

Sorry, I am not aware of any documents on this subject, but "prometheus scraping az" as a search term turns up lots of people talking about it.

* Some failures, such as scraping too much data and running out of memory, will happen at the same time in both instances.

Bryan
