Error executing query: many-to-many matching not allowed.


Raphael Konno

Nov 4, 2021, 9:34:42 AM
to Prometheus Users
Hi Gentlemen, I recently set up Prometheus for my Proxmox cluster. However, I noticed that one of my rules, which alerts when a VM shuts down, is failing with the following message:

"Error executing query: many-to-many matching not allowed: matching labels must be unique on one side"

However, it was working until a few days ago, and I couldn't identify what changed to cause this. The query I'm using is as follows:

(pve_up == 0) * on(id) group_left(name) pve_guest_info

Has anyone dealt with this specific error and can you recommend something?

Thanks so far!

Brian Candler

Nov 4, 2021, 11:02:22 AM
to Prometheus Users
To debug it, simply run the two halves separately:

(a) pve_up == 0
(b) pve_guest_info

Match up the timeseries from (a) and (b) that have identical "id" values.  I think you'll find at least one case where a particular value of "id" links to more than one (a) and more than one (b).

group_left allows multiple series from (a) to share a match, but each of them must map to exactly one series from (b).  So another query to try is:

count by (id) (pve_guest_info) > 1

and see if you can find multiple instances of pve_guest_info for the same "id".
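
To see the offending series themselves rather than just the counts, something along these lines should work. This is only a sketch; it uses nothing beyond the metric and label names from your query:

```promql
# All pve_guest_info series whose "id" occurs more than once.
# "and" is a set operator, so repeated ids on the left-hand side are allowed here.
pve_guest_info and on(id) (count by (id) (pve_guest_info) > 1)
```

Comparing the labels of the series returned for the same id should show which label, if any, tells them apart.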

If you do, then the solution depends on the meaning of the metric.  Is "id" only unique when in combination with some other label?  Then join on both labels at once:

(pve_up == 0) * on(id, foo) group_left(name) pve_guest_info

where "foo" is this other label.  (Clearly it has to exist on both the pve_up and pve_guest_info metrics, though.)
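
Before changing the alert, you can check that the pair really is unique on the pve_guest_info side. A sketch, with "foo" still standing in for whatever the extra label turns out to be:

```promql
# Should return no results if (id, foo) identifies exactly one pve_guest_info series.
count by (id, foo) (pve_guest_info) > 1
```

If this still returns something, the join can still fail with the same error.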

Raphael Konno

Nov 4, 2021, 3:00:50 PM
to Prometheus Users
Brian, first of all, thank you very much for the clarifications; the explanation was great.

As for the meaning of my metric: it tells me when any virtual machine in my cluster goes offline. However, I have more than one cluster in my environment, which produces repeated VM IDs.

When I run my "pve_up" query, it returns:

pve_up{id="qemu/100",instance="0.0.0.0:9221",job="PVE-Cluster"}

Here, the ID refers to the machine created in the cluster; with more than one cluster, there are therefore two identical IDs.

Using your query "count by (id) (pve_guest_info) > 1" I found which IDs are repeated, but I still don't know how to get around it.

When I run my "pve_guest_info" query, it returns:

pve_guest_info{id="qemu/100",instance="0.0.0.0:9221",job="PVE-Cluster",name="lab01",node="MyNode",type="qemu"}

So I need to look up the guest information through an ID that is repeated. I have no idea how to solve this, since the IDs are repeated and I need them as my reference. Or is there another way?

Thank you very much.

Brian Candler

Nov 5, 2021, 4:04:36 AM
to Prometheus Users
On Thursday, 4 November 2021 at 19:00:50 UTC vrar...@gmail.com wrote:
When I run my "pve_up" query, it returns:

pve_up{id="qemu/100",instance="0.0.0.0:9221",job="PVE-Cluster"}

Here, the ID refers to the machine created in the cluster; with more than one cluster, there are therefore two identical IDs.


I don't see how you can get instance="0.0.0.0:9221".  How does a scrape to 0.0.0.0 work?  Or have you replaced the real IP with 0.0.0.0 when posting?

You need some way to identify the clusters in the metrics, and I would have expected this to be via the instance label: e.g. guest "qemu/100" on two different clusters might appear as

    pve_up{id="qemu/100",instance="192.0.2.3:9221",job="PVE-Cluster"}
    pve_up{id="qemu/100",instance="172.16.4.5:9221",job="PVE-Cluster"}

If that's the case, the pair (id,instance) is your unique instance identifier that you can join on.

If not, then you need to find some other way to do this.  A good way is to use relabelling to set the instance label to something meaningful:

    pve_up{id="qemu/100",instance="pve_cluster1",job="PVE-Cluster"}

Or you could add another label at scrape time, e.g. via your targets file:

    pve_up{id="qemu/100",instance="0.0.0.0:9221",job="PVE-Cluster",cluster="cluster1"}

Or you could scrape them using two different jobs (this is the least scalable as you need to replicate your scrape config for every cluster):

    pve_up{id="qemu/100",instance="0.0.0.0:9221",job="PVE-Cluster1"}

Then your join would be on (id,instance) or (id,cluster) or (id,job) respectively.
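
As a sketch, the three variants would look like this, assuming the labels are set up as in the examples above. The "cluster" label and the per-cluster job names are illustrative and only exist if you add them in your scrape config:

```promql
# Variant 1: the instance label differs per cluster
(pve_up == 0) * on(id, instance) group_left(name) pve_guest_info

# Variant 2: an extra "cluster" label was added at scrape time
(pve_up == 0) * on(id, cluster) group_left(name) pve_guest_info

# Variant 3: each cluster is scraped by its own job
(pve_up == 0) * on(id, job) group_left(name) pve_guest_info
```

These are alternatives, not one query; use whichever matches how you labelled the targets.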

Raphael Konno

Nov 5, 2021, 6:45:25 AM
to Brian Candler, Prometheus Users
Indeed, I replaced the real IP addresses when posting here.

Perfect, the two methods you gave me can both get me where I need to be. Thank you for your support and attention to my question. I managed to solve my problem!
