DNS SD - How to properly format the SRV records to identify scrape paths for each node (target)


Danny Kulchinsky

May 23, 2017, 7:45:17 PM5/23/17
to Prometheus Users
Dear Forum,

We are working on setting up Prometheus to use DNS SD (via SRV records). We have several data centers and environments; most of our services are deployed in K8s, but we still have several roles deployed on stand-alone servers (databases, cache nodes, etc.).

I'm trying to figure out the best approach to defining the SRV records so that targets are discovered efficiently. My doubt is how to dynamically discover which endpoints (ports and paths) each node should be scraped on: most nodes will have node-exporter and cadvisor, but others may also have mysql-exporter, haproxy-exporter, etc.

I guess I could have multiple records per target, each with a unique port number indicating where the exporter listens (e.g. 9100 for node-exporter). However, how would I specify the path?

Another alternative we considered was to define an SRV record (and a corresponding scrape job) per exporter type and include the relevant node targets in that record.


Perhaps I'm missing something and the approach should be quite different. I'd love to hear ideas, suggestions, or even examples of how best to achieve this.


Regards,
Danny

Ben Kochie

May 24, 2017, 1:24:57 AM5/24/17
to Danny Kulchinsky, Prometheus Users
The typical configuration is to have an SRV record per instance of a target. These should be broken up, via your naming scheme, by job:

- job_name: mysql
  dns_sd_configs:
    - names:
      - metrics.mysql.example.com    # placeholder SRV record name
- job_name: haproxy
  dns_sd_configs:
    - names:
      - metrics.haproxy.example.com  # placeholder SRV record name

You can also use relabel_configs and the __meta_dns_name label within those jobs to extract other labels, such as cluster names or datacenter names, depending on how you build your DNS trees. (The __meta_* labels are only available during target relabeling, before the scrape, so this must be done in relabel_configs rather than metric_relabel_configs.)
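As a hypothetical sketch, assuming SRV record names shaped like metrics.&lt;job&gt;.&lt;env&gt;.&lt;dc&gt;.example.com, the env and dc components could be pulled out like this:

```yaml
# Hypothetical relabeling sketch: extract <env> and <dc> from the
# queried SRV record name, e.g. metrics.mysql.prod.us-east.example.com
relabel_configs:
  - source_labels: [__meta_dns_name]
    regex: 'metrics\.[^.]+\.([^.]+)\.([^.]+)\.example\.com'
    target_label: env
    replacement: '${1}'
  - source_labels: [__meta_dns_name]
    regex: 'metrics\.[^.]+\.([^.]+)\.([^.]+)\.example\.com'
    target_label: dc
    replacement: '${2}'
```

The domain and label layout here are assumptions for illustration; the regex would need to match your actual naming scheme.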

If you could share details of your current DNS setup, I could give more specific advice.

You don't typically need to adjust the path, as it's always /metrics on standard exporters. And even if you do have a non-standard exporter, the path should be identical for each instance within a job.
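In the rare case where a job does need a non-standard path, it can be set once per job with metrics_path (hypothetical name and path for illustration):

```yaml
- job_name: custom-exporter
  metrics_path: /custom/metrics   # assumed non-standard path
  dns_sd_configs:
    - names:
      - metrics.custom.example.com  # placeholder SRV record name
```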

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscribe@googlegroups.com.
To post to this group, send email to prometheus-users@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/aa57cdcf-00c3-403a-a150-033d426579b9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Danny Kulchinsky

May 24, 2017, 4:25:11 PM5/24/17
to Prometheus Users, dann...@gmail.com
Thanks Ben, this seems to align with what I was thinking too.

I don't have anything actually set up yet (just trying to figure out what's needed), but in general I'm looking at something like this.

For all 'haproxy' targets, define an SRV record "metrics.haproxy.<env>.<dc>.domain.com" containing the following entries:

metrics.haproxy.<env>.<dc>.domain.com.   86400 IN    SRV 10       60     9100 haproxy01.<env>.<dc>.domain.com.
metrics.haproxy.<env>.<dc>.domain.com.   86400 IN    SRV 10       60     8080 haproxy01.<env>.<dc>.domain.com.
metrics.haproxy.<env>.<dc>.domain.com.   86400 IN    SRV 10       60     9100 haproxy02.<env>.<dc>.domain.com.
metrics.haproxy.<env>.<dc>.domain.com.   86400 IN    SRV 10       60     8080 haproxy02.<env>.<dc>.domain.com.

Each haproxy server runs node-exporter (port 9100) and cadvisor (port 8080).

In Prometheus, as you suggested, the job would follow:

- job_name: haproxy
  dns_sd_configs:
    - names:
      - metrics.haproxy.<env>.<dc>.domain.com


Should I be able to extract the port number from the retrieved target list?


Danny



Ben Kochie

May 24, 2017, 6:19:58 PM5/24/17
to Danny Kulchinsky, Prometheus Users

The way "job" is intended to be used in Prometheus is to relate to the software being monitored, not the overall service. (more below)

So in this case you would define something like this:

metrics.node.<env>.<dc>.domain.com 9100 haproxy01...
metrics.cadvisor.<env>.<dc>.domain.com 8080 haproxy01...
metrics.haproxy.<env>.<dc>.domain.com 9101 haproxy01... (haproxy_exporter)
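Spelled out as full zone-file entries (hypothetical names for env "prod" and dc "dc1"; the TTL, priority, and weight values are illustrative, since Prometheus ignores priority and weight):

```
metrics.node.prod.dc1.example.com.     300 IN SRV 10 60 9100 haproxy01.prod.dc1.example.com.
metrics.cadvisor.prod.dc1.example.com. 300 IN SRV 10 60 8080 haproxy01.prod.dc1.example.com.
metrics.haproxy.prod.dc1.example.com.  300 IN SRV 10 60 9101 haproxy01.prod.dc1.example.com.
```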

Then you would have a job for each one

- job_name: node
  dns_sd_configs:
    - names:
      - metrics.node.<env>.<dc>.domain.com
- job_name: cadvisor
  dns_sd_configs:
    - names:
      - metrics.cadvisor.<env>.<dc>.domain.com

- job_name: haproxy
  dns_sd_configs:
    - names:
      - metrics.haproxy.<env>.<dc>.domain.com
      - metrics.haproxy.<env2>.<dc>.domain.com

You would also have a relabel rule in each job to map <env> and <dc> into separate labels.
It may seem like a lot of config, but we designed the config file with config-management templating in mind, so it's easy to spit out from machine-generated attribute trees.
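That templating might look something like this (a hypothetical Jinja-style sketch; the exporter list and domain are assumptions):

```yaml
# Hypothetical config-management template: one scrape job per exporter
{% for job in ["node", "cadvisor", "haproxy"] %}
- job_name: {{ job }}
  dns_sd_configs:
    - names:
      - metrics.{{ job }}.prod.dc1.example.com
{% endfor %}
```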

So as you can see, job doesn't distinguish a service grouping. You may want to add another DNS level like <service> to your tree if that's how you logically separate nodes into different services. The reason we do this is so you can aggregate your queries across all variations of a single piece of software: no matter how many different haproxy clusters you have, you can look them all up with job="haproxy".
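For example, a single query can then cover every haproxy instance across all environments and datacenters at once (a hypothetical PromQL sketch):

```promql
# Matches any haproxy exporter that is down, in any env or dc,
# because they all share job="haproxy"
up{job="haproxy"} == 0
```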

Or if you have the concept of a machine owner tag, you might want to do something like

metrics.<job>.<env>.<service>.<owner>.<dc>.domain.com

But job should always correspond to the software being monitored.


Danny Kulchinsky

May 24, 2017, 7:26:54 PM5/24/17
to Prometheus Users, dann...@gmail.com
Dear Ben!

Thank you so much for the detailed response, I see now exactly what I was missing.

This will surely help put us on the right track.

Best Regards,
Danny