Node exporter file_sd scraping old dns names


Alexandre Neves Ferreira

May 5, 2020, 10:34:01 AM
to Prometheus Users
Hi, 
I'm using a custom script to discover nodes. It generates a targets file, which file_sd uses to configure the Node Exporter job.
Suppose that on a given day the file looked like this (the IP addresses are only for illustration):

-targets
    -instance1.app1.prod:9100  (10.1.1.50)
    -instance2.app1.prod:9100  (10.1.1.51)
 

After some time, instance2 was destroyed, but its record is still in the targets file. A new app was created, and the IP that instance2.app1 had was assigned to an instance of the other app:

-targets
    -instance1.app1.prod:9100  (10.1.1.50)
    -instance2.app1.prod:9100  (10.1.1.51)
    -instance1.app2prod:9100  (10.1.1.51)

This means that the query node_uname_info{job="node", nodename="instance1.app2.prod"} returns two series (unnecessary labels omitted):

node_uname_info{instance="instance2.app1.prod:9100",job="node", nodename="instance1.app2.prod"}
node_uname_info{instance="instance1.app2prod:9100",job="node", nodename="instance1.app2.prod"}

And that is OK; it is expected.

The problem started when I recreated instance2.app1.prod with a different IP (let's say 10.1.1.71). The metrics in Prometheus were not updated to use the new IP for the hostname instance2.app1.prod.
It is important to say that on the VM where Prometheus is running, if I ping instance2.app1.prod I get the correct IP, 10.1.1.71, but in Prometheus every metric with instance="instance1.app2prod:9100" is still returning the values of the node that has its old IP.

I've already restarted Prometheus, but the problem persists.
What do I need to do to make Prometheus scrape the correct IP for this node?

Thanks!



Brian Candler

May 5, 2020, 11:02:52 AM
to Prometheus Users
You are probably getting into issues with DNS caching - if you're dynamically creating and destroying things this may result in problems if you keep the same name but with a different IP address.  Setting a low DNS TTL (say 300 seconds = 5 minutes) may be sufficient if you don't mind things being out of whack for a while.

However what you can also do is to put the name in the instance label, but give prometheus the IP address to scrape.  You can do that by putting both the name *and* the IP address in the targets file, and separate them using relabelling rules.

  - job_name: node
    scrape_interval: 1m
    file_sd_configs:
      - files:
        - /etc/prometheus/targets.d/node_targets.yml
    metrics_path: /metrics
    relabel_configs:
      # When __address__ consists of just a name or IP address,
      # copy it to the "instance" label.  Doing this explicitly
      # keeps the port number out of the instance label.
      - source_labels: [__address__]
        regex: '([^/]+)'
        target_label: instance

      # When __address__ is of the form "name/address", extract
      # name to "instance" label and address to "__address__"
      - source_labels: [__address__]
        regex: '(.+)/(.+)'
        target_label: instance
        replacement: '${1}'
      - source_labels: [__address__]
        regex: '(.+)/(.+)'
        target_label: __address__
        replacement: '${2}'

      # Append port number to __address__ so that scrape gets
      # sent to the right port
      - source_labels: [__address__]
        target_label: __address__
        replacement: '${1}:9100'

Your file_sd file then looks something like this:

- targets:
    - instance1.app1.prod/10.1.1.50
    - instance2.app1.prod/10.1.1.51

This of course assumes you have the correct data to hand to build the targets files, i.e. what the names and addresses of your instances are.
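
As a rough sketch of how the discovery script could emit that format: the snippet below renders a targets file with one "name/address" entry per host. The `inventory` dict and the `render_targets` helper are hypothetical names for illustration; in practice the names and IPs would come from whatever source your script already queries.

```python
def render_targets(inventory):
    """Render a file_sd YAML document with one "name/address" entry per host.

    `inventory` maps hostname -> IP address (hypothetical input shape).
    """
    lines = ["- targets:"]
    for name, address in sorted(inventory.items()):
        lines.append(f"    - {name}/{address}")
    return "\n".join(lines) + "\n"


# Hypothetical inventory, using the example names from this thread.
inventory = {
    "instance1.app1.prod": "10.1.1.50",
    "instance2.app1.prod": "10.1.1.71",
}

rendered = render_targets(inventory)
print(rendered)
```

The rendered text would then be written to the path listed under file_sd_configs (here /etc/prometheus/targets.d/node_targets.yml). Because the IP comes straight from the inventory, a recreated instance gets its new address as soon as the file is regenerated, without depending on DNS.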

The above has intentionally removed the :9100 from the instance label too.  This makes it much easier to join things, e.g. if you want to join a different metric (collected on a different port) to node_uname_info, you can use the common label {instance="instance2.app1.prod"}
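
To see what the four rules do to each target, here is a small Python model of them. It is only an approximation of the default `replace` action (fully anchored regex, label left unchanged when the regex does not match), not Prometheus's own code; the `RULES` table and `relabel` function are names made up for this sketch.

```python
import re

# Each rule reads __address__ and writes to a target label:
# (regex, target_label, replacement). "\g<1>" plays the role of "${1}".
RULES = [
    (r"([^/]+)", "instance", r"\g<1>"),        # bare name or IP
    (r"(.+)/(.+)", "instance", r"\g<1>"),      # "name/address": name part
    (r"(.+)/(.+)", "__address__", r"\g<2>"),   # "name/address": address part
    (r"(.*)", "__address__", r"\g<1>:9100"),   # append the port
]


def relabel(target):
    """Apply the rules in order to a single file_sd target string."""
    labels = {"__address__": target}
    for regex, target_label, replacement in RULES:
        # Relabel regexes are anchored at both ends, so use fullmatch.
        match = re.fullmatch(regex, labels["__address__"])
        if match:
            labels[target_label] = match.expand(replacement)
    return labels


for target in ["instance1.app1.prod", "instance2.app1.prod/10.1.1.71"]:
    print(target, "->", relabel(target))
```

For "instance2.app1.prod/10.1.1.71" this yields instance="instance2.app1.prod" and __address__="10.1.1.71:9100", so the scrape goes to the IP while the series keep the stable name. A bare "instance1.app1.prod" keeps the name in both labels, with the port added only to __address__.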

Alexandre Neves Ferreira

May 5, 2020, 12:56:19 PM
to Prometheus Users
This is awesome! 
I will refactor to use the relabeling solution, thank you very much! 