Drop metrics based on a different metric


rksakt...@gmail.com

Dec 1, 2023, 4:33:22 AM
to Prometheus Users
Hi All,

We want to drop a set of metrics where the pod is created by CronJobs and Jobs (short-lived pods).

kube_pod_info has a label created_by_kind with the value "<none>". I have another set of metrics such as kube_pod_status_phase, kube_pod_status_reason, etc.

Now I want to drop all three metrics, but the kube_pod_status_phase and kube_pod_status_reason metrics don't have the created_by_kind label.

The common labels between all three time series are uid and pod.

Is it possible to drop a set of metrics based on the output of another metric in the metric_relabeling phase, or using a recording rule?

Looking for some assistance on how we can achieve this

Thank You
Regards
Sakthi




Brian Candler

Dec 1, 2023, 4:47:17 AM
to Prometheus Users
> Is it possible to drop a set of metrics based on the output of another metric  in the metric_relabeling phase

No.

> or using any recording rule ?

Yes. Write a PromQL query which filters the metrics in the way that you want, and when it's working (in the web UI), put it in a recording rule. You haven't given exact examples of any of the metrics in question (with complete label sets), but a starting point might look like this:

    kube_pod_status_phase unless on (uid, pod) kube_pod_info{created_by_kind="<none>"}

That will drop any entries in the kube_pod_status_phase vector which match on both the uid and pod labels with the expression on the RHS. Documentation for logical/set binary operators is here: https://prometheus.io/docs/prometheus/latest/querying/operators/#logical-set-binary-operators
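Once the query returns what you expect, a minimal rule-file sketch of that approach could look like this (the group name and the record name kube_pod_status_phase:longlived are purely illustrative; adjust the expr to your real label sets):

    groups:
      - name: drop-shortlived-pods                  # illustrative group name
        rules:
          - record: kube_pod_status_phase:longlived # illustrative record name
            expr: |
              kube_pod_status_phase
                unless on (uid, pod)
              kube_pod_info{created_by_kind="<none>"}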

rksakt...@gmail.com

Dec 3, 2023, 11:19:06 PM
to Prometheus Users
Hi, thanks for your reply!

The PromQL is working just fine and returning rows as per our expectation. We have multiple clusters, and we created a recording rule using the PromQL below.

    {job="cluster-1"} unless on (uid, pod) kube_pod_info{job="cluster-1",created_by_kind="<none>"}

With the recording rule, we created a new static label called highcardinality="true", but this creates new time series. When doing remote write to our long-term storage, we drop the time series which have highcardinality="true", but the original metric doesn't have this label, so it's still getting into our remote write system.

So the recording rule was effectively duplicating the metrics collected. Is there a way to drop all the metrics returned by the PromQL you shared, as we don't want those time series to end up in our long-term TSDB?

    Local Prometheus --> pulls metrics from --> kube-state-metrics StatefulSet
        |
        +--> remote write (drop metrics that have highcardinality="true") --> long-term TSDB (Cortex)

We are thinking of adding a new label highcardinality="false" as part of the metric_relabeling section, and updating it to "true" using recording rules and the label_replace function, instead of the static label implementation. Is this the right way to do it, or are there better options you can suggest?

Thank You

Brian Candler

Dec 4, 2023, 3:51:57 AM
to Prometheus Users
> With the recording rule, we created a new static label called highcardinality="true", but this creates new time series. When doing remote write to our long-term storage, we drop the time series which have highcardinality="true", but the original metric doesn't have this label, so it's still getting into our remote write system.

Why don't you configure your remote_write so that it only sends metrics with highcardinality="true"? Use write_relabel_configs with "keep" or "drop" rules.
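For example (a sketch only; the Cortex URL is illustrative, and the label name is the one from your recording rule):

    remote_write:
      - url: https://cortex.example.com/api/v1/push    # illustrative endpoint
        write_relabel_configs:
          # keep only the series produced by the recording rule
          - source_labels: [highcardinality]
            regex: "true"
            action: keep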

> We are thinking of adding a new label highcardinality="false" as part of the metric_relabeling section and updating the label to true using recording rules

Again, not sure exactly why you'd want to do that. Changing a label from one value to another also creates a new timeseries, because the bundle of labels is what defines a timeseries, so it's not really any different.  But your recording rules *are* generating a new timeseries anyway.

I'm also not sure why you are saying that the recorded metrics have a "high" cardinality when compared to the original. Otherwise, you seem to have more or less the right ideas:

1. If you want to add a label like highcardinality="X" to your original source metrics, you can do this at scrape time, either using target relabelling (if it applies to all metrics from a given target) or metric relabelling (if it only applies to specific metrics)
2. You can set or override a label like highcardinality="Y" in your recording rules. You don't need label_replace() to do that; the recording rule itself has a "labels" block.
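A sketch of both options, with illustrative job and record names (the scrape_configs fragment lives in prometheus.yml, the rule group in a separate rule file):

    # prometheus.yml fragment: add the label at scrape time (option 1)
    scrape_configs:
      - job_name: kube-state-metrics              # illustrative job name
        metric_relabel_configs:
          - source_labels: [__name__]
            regex: kube_pod_(info|status_phase|status_reason)
            target_label: highcardinality
            replacement: "false"

    # rule file: set/override the label via the rule's "labels" block (option 2)
    groups:
      - name: pod-filter
        rules:
          - record: kube_pod_status_phase:longlived   # illustrative record name
            expr: kube_pod_status_phase unless on (uid, pod) kube_pod_info{created_by_kind="<none>"}
            labels:
              highcardinality: "true"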

BTW, it's standard practice for recording rules to generate metrics with a different name. If you did this, you could match on the name pattern for remote storage. This is a case where label_replace may do the job; I'm not sure if it's allowed to change __name__ with that, but it's worth a try. There are some hints on how to name metrics in recording rules at https://prometheus.io/docs/practices/rules/#recording-rules
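If you go that route, the same write_relabel_configs mechanism can key on the metric name instead of an extra label; a sketch, assuming the raw series keep their kube_pod_* names and the recorded series are named differently:

    remote_write:
      - url: https://cortex.example.com/api/v1/push    # illustrative endpoint
        write_relabel_configs:
          # drop the raw series; the differently-named recorded series pass through
          - source_labels: [__name__]
            regex: kube_pod_(info|status_phase|status_reason)
            action: drop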

OTOH, I can see why you don't want to change the metric names: you're not really rolling up metrics into a summary, you're just dropping a subset of metrics that are not of interest.
