limit on time series per metric


Johny

Feb 17, 2021, 1:48:49 PM
to Prometheus Users
Is there a limit on the number of time series per metric (cardinality)? My metric has a cardinality of 100,000, with a tag for each process in my infrastructure. I was wondering whether this causes performance or other issues. I couldn't find official guidance on this.

Stuart Clark

Feb 17, 2021, 2:18:06 PM
to Johny, Prometheus Users

The limit is down to the hardware that Prometheus is running on. The
more time series (the total number of different label combinations in
use for every metric) the more memory you would need. A cardinality of
100k for a single metric is pretty large. With only a few such metrics
you'd quickly be in the millions of time series, which would have pretty
substantial infrastructure requirements.
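
To make that arithmetic concrete, here is a rough sketch in Python. It treats the series count of a metric as the product of the number of distinct values per label, which is an upper bound since not every combination necessarily occurs. The metric names, label names, and counts are made up for illustration; only the 100k per-process figure comes from the question.

```python
from math import prod


def series_count(label_cardinalities):
    """Upper bound on series for one metric: product of distinct values per label."""
    return prod(label_cardinalities.values())


# Hypothetical metrics; names and label counts are illustrative assumptions.
metrics = {
    "process_cpu_seconds_total": {"process": 100_000},
    "process_open_fds": {"process": 100_000},
    "http_requests_total": {"process": 100_000, "status": 5, "method": 4},
}

per_metric = {name: series_count(labels) for name, labels in metrics.items()}
total_series = sum(per_metric.values())

for name, count in per_metric.items():
    print(f"{name}: {count:,}")
print(f"total: {total_series:,}")
```

Note how adding two small labels to a metric that already has a per-process label multiplies its series count rather than adding to it.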

For larger Prometheus setups you would generally try to avoid having a
single large central server. Instead you would look to have a Prometheus
for every failure domain (e.g. different datacenters or AWS regions) as
well as different services/applications/areas (whatever makes sense
based on organisation or technical structures).

You can then use tools such as federation or a remote read/write system
(such as Thanos) to construct global views and alerts if needed.
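
As a small illustration of the federation option: a global Prometheus pulls selected series from another server's `/federate` endpoint, passing one `match[]` parameter per series selector. The Python sketch below only builds such a URL; the hostname and selectors are hypothetical, and in practice you would probably federate aggregated series rather than the raw high-cardinality ones.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def federate_url(base_url, selectors):
    """Build a /federate URL with one match[] parameter per series selector."""
    query = urlencode([("match[]", s) for s in selectors])
    return f"{base_url.rstrip('/')}/federate?{query}"


# Hypothetical regional server and selectors, for illustration only.
selectors = ['{job="node"}', '{__name__=~"job:.*"}']
url = federate_url("http://prometheus-eu-west.example.internal:9090", selectors)
print(url)

# Decode the query string again to check that the selectors round-trip.
selectors_back = parse_qs(urlsplit(url).query)["match[]"]
print(selectors_back)
```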

--
Stuart Clark

Johny

Feb 17, 2021, 3:20:45 PM
to Prometheus Users
Thanks. This is helpful.

From a performance standpoint, is there a difference between having 1 metric with 100x cardinality and 10 metrics with 10x cardinality each?

Stuart Clark

Feb 17, 2021, 4:00:59 PM
to Johny, Prometheus Users
To some degree that depends on how you run queries.

In general you'd probably expect to have queries which interrogate the
different labels of a single metric (e.g. via sum aggregation) more
often than queries which touch many different metrics. If so, that would
suggest that you'd end up dealing with more time series (loading into
memory, etc.) in the 1 metric with 100 series case than the 10 metrics
with 10 series each. The more time series you are touching for a query,
the more data you have to load into memory, so the larger the resource
requirements.
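
A toy model of that comparison, as a sketch in Python: both layouts store 100 series in total, but a query that selects a single metric name touches all 100 in one layout and only 10 in the other. The metric and label names are hypothetical, and the function just counts matching series; it does not model how Prometheus actually loads data.

```python
def series_touched(series, metric_name):
    """Count the series a selector on a single metric name would have to load."""
    return sum(1 for s in series if s["__name__"] == metric_name)


# Layout A: one metric with 100 series (hypothetical names).
layout_a = [{"__name__": "requests_total", "process": f"p{i}"} for i in range(100)]

# Layout B: ten metrics with 10 series each.
layout_b = [
    {"__name__": f"requests_{m}_total", "process": f"p{i}"}
    for m in range(10)
    for i in range(10)
]

touched_a = series_touched(layout_a, "requests_total")
touched_b = series_touched(layout_b, "requests_0_total")

print(f"layout A: {len(layout_a)} stored, {touched_a} touched")
print(f"layout B: {len(layout_b)} stored, {touched_b} touched")
```

If your queries usually span all ten metrics anyway, the difference disappears, which is why the answer depends on the query pattern.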

--
Stuart Clark

Johny

Feb 17, 2021, 5:33:26 PM
to Prometheus Users
Thanks. 