Dividing vectors with different labels when needed (example in description)


Erick

Oct 2, 2020, 2:14:49 PM
to Prometheus Users
Reposting from the IRC chat:

I have been running into an issue with dividing:

vector with one label
by
vector with different labels

I understand that Prometheus divides vectors via label matching. However, sometimes dividing by vectors with non-matching labels "makes sense".

Take my particular use case:

I am trying to integrate two exporters, process-exporter and node-exporter, correlating process memory metrics with node memory metrics.

To calculate memory use, I divide a process' memory use by the total memory on the machine

Node-exporter exposes this as node_memory_MemTotal_bytes

Now this metric will never change; it is basically a "scalar".

What is the proper way to divide the first vector by the second vector?

The only real workaround I can get to work is to put a dummy label on both metrics via label_replace and divide them that way. But that's an extra calculation every time I want to do this, and when integrating two exporters together it will happen a lot.

For example,

- record: procex_memory_megabytes_by_process
  expr: label_replace(procex_memory_megabytes_by_process, "dummy_label", "dummy_val", "", "")
- record: node_memory_MemTotal_bytes
  expr: label_replace(node_memory_MemTotal_bytes, "dummy_label", "dummy_val", "", "")
# final metric
- record: memory_usage_percentage
  expr: procex_memory_megabytes_by_process / ignoring(groupname) group_left node_memory_MemTotal_bytes

I don't see anything in the docs mentioning this and have searched the issues but haven't found much.

Any help is greatly appreciated!

Brian Candler

Oct 3, 2020, 5:02:06 AM
to Prometheus Users
On Friday, 2 October 2020 19:14:49 UTC+1, Erick wrote:

I understand that Prometheus divides vectors via label matching. However, sometimes dividing by vectors with non-matching labels "makes sense".


It's possible to convert a vector to a scalar.  But fundamentally you still need to identify which value of the other metric you want to divide by - since in general one metric has multiple timeseries, each identified by a different set of labels.
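For example, if your selection of the total-memory metric returns exactly one time series, you can collapse it with scalar(). A sketch using the metric names from your post (note that scalar() returns NaN unless the inner vector contains exactly one element):

procex_memory_megabytes_by_process / scalar(node_memory_MemTotal_bytes)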
 

Take my particular use case:

I am trying to integrate two exporters, process-exporter and node-exporter, correlating process memory metrics with node memory metrics.

To calculate memory use, I divide a process' memory use by the total memory on the machine

Node-exporter exposes this as node_memory_MemTotal_bytes

Now this metric will never change; it is basically a "scalar".

What is the proper way to divide the first vector by the second vector?


First, find some label or combination of labels which matches the process memory to the associated node memory - in this case it might just be the "instance" label.

Then do something like: 

foo / on (instance) group_left bar

This assumes that for any particular value of "instance" label, there could be many values of the "foo" metric but only one value of the "bar" metric.
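For instance, with hypothetical series like

foo{instance="host1", groupname="nginx"}   512
foo{instance="host1", groupname="redis"}   256
bar{instance="host1"}                      16384

the division matches each "foo" series to the single "bar" series with the same instance, and the result keeps the labels of the left-hand side.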
 
If you show some actual examples of the metrics you want to divide, complete with labels, then more specific advice is possible.  Here is a fairly silly example (ratio of filesystem size to node RAM size!)

node_filesystem_avail_bytes / on(instance) group_left node_memory_MemTotal_bytes

You can also copy labels from the right-hand metric into the result.  This is a more realistic example:

node_filesystem_avail_bytes * on(instance) group_left(domainname,machine,nodename) node_uname_info

The value of the RHS is always 1, so the multiply doesn't do anything, but it picks up those three extra labels from the RHS and applies them to the result.
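To illustrate with made-up values, a join like that turns

node_filesystem_avail_bytes{instance="host1", mountpoint="/"}  1.2e+10
node_uname_info{instance="host1", domainname="(none)", machine="x86_64", nodename="host1"}  1

into

{instance="host1", mountpoint="/", domainname="(none)", machine="x86_64", nodename="host1"}  1.2e+10

(the metric name is dropped, as with any arithmetic operator).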

I don't see anything in the docs mentioning this and have searched the issues but haven't found much.


Brian Candler

Oct 3, 2020, 5:13:17 AM
to Prometheus Users
One other thing: to do this readily, you need to use "meaningful instance labels":

That is, rather than

node_info{instance="foo:9100"}
process_info{instance="foo:9256"}

you need to record

node_info{instance="foo"}
process_info{instance="foo"}

This is easy to arrange using relabelling at scrape time.  Set the instance label to what you want, and then set __address__ to the address:port.  Example:

  - job_name: node
    scrape_interval: 1m
    scrape_timeout: 50s
    file_sd_configs:
      - files:
        - /etc/prometheus/targets.d/node_targets.yml  # don't include the port number in here
    metrics_path: /metrics
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
      - source_labels: [__address__]
        target_label: __address__
        replacement: '${1}:9100'
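The targets file then just lists bare host names (an illustrative sketch; use whatever names you want to appear in the instance label):

- targets:
  - host1
  - host2

With the default relabel regex (.*), ${1} is the whole original __address__, so the scrape address becomes host1:9100 while the instance label stays "host1".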

erick dagenais

Oct 3, 2020, 5:18:08 PM
to Prometheus Users

Hey Brian, I appreciate the answer.

**Copying over the messages from our direct exchange**

‘First, find some label or combination of labels which matches the process memory to the associated node memory - in this case it might just be the "instance" label.’


This is the exact problem I have, though. Since I'm mixing metrics from different exporters (in this case node-exporter and process-exporter), they have different output metrics, none of which share labels across both. Specifically, process-exporter doesn't have a job or instance label.

This is where the dummy_label workaround came from. Since the two metrics have no label intersection, I created a fake one; in your terms, a fake "instance" label named "dummy_label".

The creation of the dummy label is exactly what I’m trying to avoid here.



On 03/10/2020 10:34, erick dagenais wrote:
> Specifically, process-exporter doesn’t have a job or instance label.

It *must* have both a job and an instance label, since Prometheus adds these labels itself.

https://prometheus.io/docs/concepts/jobs_instances/#automatically-generated-labels-and-time-series

**New Message**

Hey Brian,

You're right, I should have looked into these labels more; it looks like I was summing without these labels, which got rid of them.

The relabel config solution in this post is one solution I was looking for. The other one would be to make node-exporter and process-exporter be scraped under the same job, so that they have the same job label.
I'll have to look into which solution is preferred, if either.

I guess the real question I should have asked first is why they don't have any labels in common in the first place.

Thanks Brian for the help!




Brian Candler

Oct 4, 2020, 3:56:08 AM
to Prometheus Users
On Saturday, 3 October 2020 22:18:08 UTC+1, Erick wrote:
You're right, I should have looked into these labels more; it looks like I was summing without these labels, which got rid of them.

The relabel config solution in this post is one solution I was looking for. The other one would be to make node-exporter and process-exporter be scraped under the same job, so that they have the same job label.
I'll have to look into which solution is preferred, if either.


You should have separate jobs, especially if you're manipulating the instance label.  This is in case there are two metrics which happen to have the same name: if they are in separate jobs you can guarantee they will be in separate timeseries.

blah{instance="foo",job="node"}
blah{instance="foo",job="process"}

It also allows for better concurrency in collection.
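As a sketch (file path is illustrative; ports assumed to be the defaults, 9100 for node-exporter and 9256 for process-exporter), both jobs can share the same target list and rewrite the instance label the same way:

  - job_name: node
    file_sd_configs:
      - files:
        - /etc/prometheus/targets.d/node_targets.yml
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
      - source_labels: [__address__]
        target_label: __address__
        replacement: '${1}:9100'

  - job_name: process
    file_sd_configs:
      - files:
        - /etc/prometheus/targets.d/node_targets.yml
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
      - source_labels: [__address__]
        target_label: __address__
        replacement: '${1}:9256'

That way both sets of metrics end up with the same bare-hostname instance label, and you can join them with on(instance).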

I guess the real question I should have asked first is why they don't have any labels in common in the first place.

Perhaps it would have been better if the port number had been a separate attribute (e.g. metrics_port, like metrics_path), so that __address__ only contained the address, not the address+port.  But I think it's unlikely that will change.

Regards,

Brian. 