Suggested hardware for production installation


Jan Algermissen

Sep 4, 2016, 5:28:52 AM
to Prometheus Developers
Hi,

what hardware/VM specifications would you recommend for a production installation of Prometheus (roughly thousands of services and hundreds of database and similar nodes)?

I would guess the focus should be on RAM and network throughput, the more of both the better. Is that the right way to approach the sizing, or do I need to consider other factors?

Jan

Ben Kochie

Sep 4, 2016, 5:50:57 AM
to Jan Algermissen, Prometheus Developers
That's a good question. It mostly comes down to how many individual metrics you have and how many samples per second you plan to ingest. The number of targets matters less: scrapes are cheap (a simple HTTP GET), but ingesting the samples takes some work.

RAM is a big factor:
* It limits how much data you can crunch with queries
* It limits how much data can be buffered before writing to the disk storage

Network throughput is not a huge issue. A single server with millions of time series and 100k samples/second only needs a few megabits/second.
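As a rough sanity check on the network claim, the sketch below works backwards from the stated 100k samples/second to the wire bytes per sample that a given bandwidth would imply. The 2, 3, and 5 Mbit/s values are only assumed readings of "a few megabits/second", and the function name is made up for illustration.

```python
def implied_bytes_per_sample(mbit_per_s: float, samples_per_s: float) -> float:
    """Wire bytes per ingested sample implied by a given scrape bandwidth."""
    bytes_per_s = mbit_per_s * 1_000_000 / 8  # decimal megabits -> bytes
    return bytes_per_s / samples_per_s


SAMPLES_PER_S = 100_000  # figure quoted in the message

# Assumed interpretations of "a few megabits/second"; not measured values.
for mbit in (2, 3, 5):
    per_sample = implied_bytes_per_sample(mbit, SAMPLES_PER_S)
    print(f"{mbit} Mbit/s -> {per_sample:.2f} bytes/sample on the wire")
```

Under these assumptions the budget comes out to only a handful of bytes per sample, which is why even a modest network link is unlikely to be the bottleneck.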

CPU is important; a large server can easily use many cores.

For example, a Prometheus server configured to monitor just node_exporter metrics:
* ~1700 nodes
* ~1400 metrics/node
* ~2.3M in-memory series
* ~78k samples/second

This server uses about 45GB of RAM and typically about 5 CPUs.

It also needs about 5GB/day of storage space (SSD in this case) with varbit encoding.

We could probably get away with a lot less RAM, but the extra memory allows for very large historical queries.
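For readers who want to relate these figures to their own setup, here is a back-of-envelope sketch that derives a few ratios from the numbers quoted above. It assumes decimal gigabytes (10^9 bytes) and that every in-memory series is sampled once per scrape interval; the variable and function names are illustrative only. The ratios describe this one server and should not be read as general sizing rules.

```python
# Figures quoted in the message (all approximate).
NODES = 1_700
METRICS_PER_NODE = 1_400
IN_MEMORY_SERIES = 2_300_000
SAMPLES_PER_S = 78_000
RAM_GB = 45
DISK_GB_PER_DAY = 5

SECONDS_PER_DAY = 86_400
BYTES_PER_GB = 1e9  # assumption: decimal gigabytes


def derived_ratios() -> dict:
    """Derive per-series and per-sample ratios from the quoted figures."""
    samples_per_day = SAMPLES_PER_S * SECONDS_PER_DAY
    return {
        # nodes x metrics/node, to compare against the quoted ~2.3M series
        "expected_series": NODES * METRICS_PER_NODE,
        # assumes each series yields one sample per scrape interval
        "scrape_interval_s": IN_MEMORY_SERIES / SAMPLES_PER_S,
        "ram_kb_per_series": RAM_GB * BYTES_PER_GB / IN_MEMORY_SERIES / 1e3,
        "disk_bytes_per_sample": DISK_GB_PER_DAY * BYTES_PER_GB / samples_per_day,
    }


for name, value in derived_ratios().items():
    print(f"{name}: {value:,.2f}")
```

The implied scrape interval comes out close to 30 seconds, and the series count from nodes times metrics per node (2.38M) is consistent with the quoted ~2.3M in-memory series.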

--
You received this message because you are subscribed to the Google Groups "Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-developers+unsub...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

ivaquero...@gmail.com

Apr 30, 2019, 5:25:40 AM
to Prometheus Developers
Thanks for your answer, Ben.

Do you know what the requirements are for the Alertmanager component, given these assumptions for node_exporter metrics?

Ben Kochie

May 1, 2019, 2:47:20 AM
to ivaquero...@gmail.com, Prometheus Developers
The alertmanager is a very small service; it only requires about 50MiB of memory and very little CPU. It scales with how many alerts you send, but it would take a huge number of alerts to cause any load.

Ben Kochie

May 1, 2019, 2:49:06 AM
to ivaquero...@gmail.com, Prometheus Developers
Note: this thread contains obsolete scaling information. The numbers here are from Prometheus 1.x. Prometheus 2.x has a completely different memory, CPU, and I/O profile.

mrserg...@gmail.com

May 27, 2019, 4:48:46 AM
to Prometheus Developers
Hello, Ben!
Could you please post the scaling information for Prometheus 2.x here? What is the memory, CPU, and I/O profile for Prometheus 2.x?
Your previous posts were very helpful to me.
Thanks in advance!
