Metrics relabeling not working as expected


Mike W

Jan 17, 2025, 11:49:49 AM
to Prometheus Users
We may be running into a bug with metric relabeling, but I wanted to post here first in case anyone can see what my team and I are doing wrong.

Background: We have a metric named "order_total" with a very high-cardinality label, "store_number", which is slowing down our queries and causing massive resource usage. We need to drop that label during or before ingestion.

We have tried two approaches, with several variations of each: (1) the labeldrop action, and (2) relabeling store_number to a single value for all series ("0" in our case), just to remove the high-cardinality aspect.
Everything I'm seeing online and in the Prometheus documentation says that adding the following to the scrape config target/job should just work.

1: (green line in screenshot)
metric_relabel_configs:
- action: labeldrop
  regex: store_number

2: (blue line in screenshot {store_number="0"})
metric_relabel_configs:
- source_labels: [store_number]
  separator: ; 
  regex: (.*) 
  target_label: store_number 
  replacement: "0"

But as you can see, the numbers are nowhere near what we expect. This is during a simulated load test that sends orders at a constant rate, which is why the counts increase steadily. I expected both the blue and green lines to be far higher than the lines for the individual store_number values. If this isn't a bug, does anyone have a clue what we are doing wrong? Any replies are greatly appreciated.

Screenshot 2025-01-16 at 8.43.55 PM.png

Mike W

Jan 17, 2025, 11:50:54 AM
to Prometheus Users
My hypothesis is that this is a bug in Prometheus: it changes the label for only a single "store_number" series and drops the rest.

Bjoern Rabenstein

Jan 21, 2025, 10:12:50 AM
to Mike W, Prometheus Users
On 17.01.25 08:49, Mike W wrote:
>
> Background: We have a metric named "order_total" that has a very high
> cardinality label "store_number" slowing down our queries and causing
> massive resource usage that we need to drop during/before ingestion.

In general, you cannot reduce the cardinality by just removing the
label or by setting it to the same value on every series. The many
series will not disappear that way. They will just collide, so that
their values are all written into the same series in the TSDB, creating
a nonsensical series, possibly with many duplicate-timestamp errors, too.

What you are looking for is a feature to actually aggregate a set of
high cardinality series into fewer series, but vanilla Prometheus does
not have such a feature (yet). It is offered by some of the cloud
vendors ingesting Prometheus metrics.
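
The difference between colliding and aggregating can be sketched in a few lines of Python. This is a simplified model, not Prometheus code: the store numbers and values are made up, and the model assumes that the first sample for a given label set in a scrape is kept while later ones are rejected as duplicates. Real Prometheus behaviour may differ in detail.

```python
# Simplified model of dropping a distinguishing label. NOT Prometheus code;
# the store numbers and sample values below are hypothetical.

def drop_label(labels, name):
    """Return a copy of the label set without the given label."""
    return {k: v for k, v in labels.items() if k != name}

def ingest_with_labeldrop(samples, name):
    """Model one scrape: series that end up with the same labels collide.

    Assumption: the first sample per label set is kept and later ones
    are counted as rejected duplicates.
    """
    stored = {}
    collisions = 0
    for labels, value in samples:
        key = frozenset(drop_label(labels, name).items())
        if key in stored:
            collisions += 1
        else:
            stored[key] = value
    return stored, collisions

def aggregate_without(samples, name):
    """Sum all series that share the remaining labels (a true aggregation)."""
    totals = {}
    for labels, value in samples:
        key = frozenset(drop_label(labels, name).items())
        totals[key] = totals.get(key, 0) + value
    return totals

# Hypothetical scrape: three stores reporting order_total.
scrape = [
    ({"__name__": "order_total", "store_number": "101"}, 40),
    ({"__name__": "order_total", "store_number": "102"}, 25),
    ({"__name__": "order_total", "store_number": "103"}, 35),
]

stored, collisions = ingest_with_labeldrop(scrape, "store_number")
totals = aggregate_without(scrape, "store_number")
key = frozenset({"__name__": "order_total"}.items())

print(stored[key], collisions)  # 40 2
print(totals[key])              # 100
```

In this model the relabeled series carries the value of a single store rather than the sum over all stores, which would explain lines that sit far below the expected total.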

For ideas and work in Prometheus itself, see the following:
- https://github.com/prometheus/prometheus/issues/394
- https://github.com/prometheus/prometheus/pull/10529
- https://docs.google.com/document/d/11LC3wJcVk00l8w5P3oLQ-m3Y37iom6INAMEu2ZAGIIE/edit?tab=t.0#bookmark=id.hs3vj63cj7uk


--
Björn Rabenstein
[PGP-ID] 0x851C3DA17D748D03
[email] bjo...@rabenste.in

Mike W

Jan 21, 2025, 1:09:00 PM
to Prometheus Users
That makes a lot of sense now that you describe it that way. I was thinking about implementing a solution for this that involves recording rules like your links suggest, but was concerned we'd run into resource consumption issues when making these constant calculations. We were able to get the underlying metric fixed before I was able to try. 

It might be useful for others if the documentation noted that metric relabeling does not solve high-cardinality issues, because the resulting series collide. It could be a good fit here: https://training.promlabs.com/training/relabeling/introduction-to-relabeling/relabeling-overview/

Thank you very much for your answer.

Bryan Boreham

Feb 8, 2025, 2:14:14 PM
to Prometheus Users
Prometheus documentation does include this phrase:

Care must be taken with labeldrop and labelkeep to ensure that metrics are still uniquely labeled once the labels are removed.

Also note that PromLabs is an independent company, albeit run by a Prometheus founder.

Bryan
