Aggregation Metrics - Found duplicate series for the match group (How to delete a label before joining metrics?)


BDT

Mar 10, 2020, 6:33:33 PM
to Prometheus Users
Hi everyone,

Today I have a problem with a rule expression: I am trying to join metrics together to get the name of the swarm node into the result.
To do this, I left-join my metrics on the node id and pull in node_name. It works fine in the Prometheus console, but when I deploy my rules I get this message:

level=warn ts=2020-03-10T20:27:12.111Z caller=manager.go:525 component="rule manager" group="Container alert" msg="Evaluating rule failed"

rule="alert: task_high_memory_usage_1g

expr: sum by(container_label_com_docker_swarm_task_name, container_label_com_docker_swarm_node_id) (container_memory_rss{container_label_com_docker_swarm_task_name=~".+"}) * on(container_label_com_docker_swarm_node_id)  group_left(node_name) node_meta > 1e+06

err="found duplicate series for the match group {container_label_com_docker_swarm_node_id=\"jilcpclonjg7jj1chh29b19pl\"} on the right hand-side of the operation:

[{__name__=\"node_meta\", container_label_com_docker_swarm_node_id=\"jilcpclonjg7jj1chh29b19pl\", instance=\"10.0.9.10:9100\", job=\"node-exporter\", node_id=\"jilcpclonjg7jj1chh29b19pl\", node_name=\"***\"},

{__name__=\"node_meta\", container_label_com_docker_swarm_node_id=\"jilcpclonjg7jj1chh29b19pl\", instance=\"10.0.8.20:9100\", job=\"node-exporter\", node_id=\"jilcpclonjg7jj1chh29b19pl\", node_name=\"***\"}];

many-to-many matching not allowed: matching labels must be unique on one side"

The instance label has two different values. My first metric is collected by cAdvisor and the other is generated by node_exporter, so they come from two different instances.

node_meta{container_label_com_docker_swarm_node_id="jilcpclonjg7jj1chh29b19pl", instance="10.0.9.10:9100", job="node-exporter", node_id="jilcpclonjg7jj1chh29b19pl", node_name="***"}

This is an example of container_memory_rss:

container_memory_rss{container_label_caddy_version="1.0.3",container_label_com_docker_stack_namespace="supervision",container_label_com_docker_swarm_node_id="jilcpclonjg7jj1chh29b19pl",container_label_com_docker_swarm_service_id="5t7qdfsjdirecjvw0nlpzlw0w",container_label_com_docker_swarm_service_name="supervision_dockerd-exporter",container_label_com_docker_swarm_task_id="n8ji42apbj0t2bctr6ilgcvt2",container_label_com_docker_swarm_task_name="supervision_dockerd-exporter.jilcpclonjg7jj1chh29b19pl.n8ji42apbj0t2bctr6ilgcvt2",id="/docker/312f83a134e144e3342d6229ec1a239e0539b77ec79f3a7e493f04fc9e24edf1",image="registry.test.toto.fr/dockerd-exporter:1.0.3@sha256:31f18339414875c404647acac452883e2a7d6fdf89b8d8c8ae5e61140c392b80",instance="10.0.9.20:8080",job="cadvisor",name="supervision_dockerd-exporter.jilcpclonjg7jj1chh29b19pl.n8ji42apbj0t2bctr6ilgcvt2"}

The alert is sent anyway, but I don't like having errors in the app ^^
Maybe there is a way to drop the instance label before doing the left join; I don't need this label in my expression.
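
A quick way to see which node ids currently have more than one node_meta series is to count them per matching label. This is only a sketch for the Prometheus console; it assumes node_meta carries the container_label_com_docker_swarm_node_id label, as in the sample above:

```
# Node ids for which node_meta has more than one series.
# The right-hand ("one") side of the join must be unique per node id.
count by (container_label_com_docker_swarm_node_id) (node_meta) > 1
```

An empty result would mean the right-hand side is unique per node id at that moment.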

Thanks for your time !

Best regards

BDT

Mar 13, 2020, 8:56:54 AM
to Prometheus Users
Hi everyone,

I have found the ignoring keyword:

The ignoring keyword can also be used as an inverse of that to specify which labels should be ignored when trying to match.


But I'm not sure it is applied before the series are joined. Do you have any idea about this? I still have the problem:


Error executing query: found duplicate series for the match group {container_label_com_docker_swarm_node_id="echkja2e9osl9gxzbg7xuc6fq"} on the right hand-side of the operation: [{__name__="node_meta", container_label_com_docker_swarm_node_id="echkja2e9osl9gxzbg7xuc6fq", instance="10.0.43.52:9100", job="node-exporter", node_id="echkja2e9osl9gxzbg7xuc6fq", node_name="test-int-swarmnode-002"}, {__name__="node_meta", container_label_com_docker_swarm_node_id="echkja2e9osl9gxzbg7xuc6fq", instance="10.0.43.31:9100", job="node-exporter-host", node_id="echkja2e9osl9gxzbg7xuc6fq", node_name="test-int-swarmnode-002"}];many-to-many matching not allowed: matching labels must be unique on one side
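
As far as I understand the matching rules, ignoring (like on) only selects which labels form the matching key between the two sides; it does not remove labels from either operand before the join, and the two modifiers cannot be combined on one operator. A sketch of what the ignoring form would look like, not a recommended query:

```
# Matching key = every label except instance and job, compared on both sides.
# The two node_meta series differ only in instance/job, so they still share
# one key and the duplicate-series error would be expected to remain.
container_memory_rss
  * ignoring(instance, job) group_left(node_name)
node_meta
```

Dropping a label from one operand needs an aggregation (by or without) on that operand instead.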

Christian Hoffmann

Mar 14, 2020, 7:46:36 AM
to BDT, Prometheus Users
Hi,


On 3/10/20 11:33 PM, BDT wrote:
> Today I have a problem about my rules expression because I try to join
> metrics together to get the name of the swam node in it.

To simplify: This is about "joining" some real-world metric with a meta
metric to get additional labels and the problem being that the meta
metric exists twice with different instance labels, right?

E.g.

container_memory_rss * on(container_label_com_docker_swarm_node_id)
group_left(node_name) node_meta

with

container_memory_rss{container_label_com_docker_swarm_node_id="foo"} 1234
node_meta{container_label_com_docker_swarm_node_id="foo",instance="a"} 1
node_meta{container_label_com_docker_swarm_node_id="foo",instance="b"} 1


The simplest way might be to aggregate away the instance label from
node_meta, which you don't seem to require:

container_memory_rss * on(container_label_com_docker_swarm_node_id)
group_left(node_name) sum without(instance) (node_meta)
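
One caveat with sum, as far as I can tell: if both node_meta series have the value 1, their sum is 2, so the product would be doubled (1234 * 2 = 2468 with the sample values above). A hedged variant uses max and keeps only the labels the join needs, which also collapses series that differ in job:

```
# max keeps the meta value at 1, so the left-hand value passes through unchanged.
container_memory_rss
  * on(container_label_com_docker_swarm_node_id) group_left(node_name)
max by (container_label_com_docker_swarm_node_id, node_name) (node_meta)
```

This assumes node_name is identical on every duplicate, as in the error output quoted earlier in the thread; if it differed, the right-hand side would still contain duplicates.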



Kind regards,
Christian

BDT

Mar 19, 2020, 4:57:00 AM
to Prometheus Users
Hi,

Thanks for your answer Christian.

Your idea was good. I tried removing all unused labels and finally succeeded in getting my metrics joined correctly.

I used without because I saw it in an answer by Brian Brazil, and it did the job.


Final aggregation
sum (container_memory_rss{container_label_com_docker_swarm_task_name=~".+"}) without (instance, job) * on(container_label_com_docker_swarm_node_id) group_left(node_name)  node_meta{job="node-exporter"}
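
Note that this relies on the job filter leaving exactly one node_meta series per node id; in the first error both duplicates had job="node-exporter". If that ever stops holding, the right-hand side can be deduplicated inside the original alert expression. A sketch only, reusing the threshold from the rule in the first message and assuming node_name is the same across duplicates:

```
# Left side: memory per task and node. Right side: one node_meta series per node id.
sum by (container_label_com_docker_swarm_task_name, container_label_com_docker_swarm_node_id) (
  container_memory_rss{container_label_com_docker_swarm_task_name=~".+"}
)
* on(container_label_com_docker_swarm_node_id) group_left(node_name)
max by (container_label_com_docker_swarm_node_id, node_name) (node_meta)
> 1e+06
```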

Thank you, this resolved it.

Have a good day