Multi-target exporter with Netbox

Elliott Balsley

Dec 13, 2023, 7:27:08 PM
to Prometheus Users
I've just started looking at Netbox as a way to store "inventory" for Prometheus using this plugin: https://github.com/FlxPeters/netbox-plugin-prometheus-sd

The documentation is a bit light and I'm wondering if anyone can share a good example of how to use the multi-target exporter pattern.  My initial thought is to add custom fields in netbox like this:
prom_job: (multi-selection with choices like blackbox, snmp, json)
prom_blackbox_module: (selection with choices like ping, http, tcp3389, etc.)
prom_snmp_module: (selection with choices for all my snmp modules)
prom_snmp_auth: (selection with choices like public_v2, public_v3, etc. to be used by snmp).

Then in Prometheus, I would have just one blackbox job which chooses the right module based on that field, and whose service discovery URL includes a Netbox filter on prom_job so that it only returns targets which have blackbox selected.

Similarly, I would have one snmp job, etc.  Does this sound like the best approach?

Brian Candler

Dec 14, 2023, 4:02:38 AM
to Prometheus Users
Right now I use Netbox to control monitoring with node_exporter and snmp_exporter.

I have three custom fields on Device and VirtualMachine, all of which are optional:

* "monitoring" is a selection field with values "node", "snmp" and "icmp"
* "snmp_auth" is a selection field with the names of the configured auths
* "snmp_module" is a multi-select field with the names of the snmp modules I use. (snmp_exporter now supports selection of multiple modules in a single scrape, but sadly combined with how netbox-plugin-prometheus-sd exposes lists, that makes the rewriting config messy)

Prometheus configuration:

  - job_name: exporter
    scrape_interval: 1m
    metrics_path: /metrics
    static_configs:
      - targets:
          - localhost:9115  # blackbox_exporter
          - localhost:9116  # snmp_exporter

  - job_name: snmp
    scrape_interval: 15s
    http_sd_configs:
      # https://github.com/netbox-community/netbox/issues/11538#issuecomment-1635839720
      - url: https://netbox.example.net/api/plugins/prometheus-sd/devices/?cf_monitoring=snmp&status=active
        refresh_interval: 10m
        authorization:
          type: Token
          credentials_file: /etc/prometheus/netbox.token
      - url: https://netbox.example.net/api/plugins/prometheus-sd/virtual-machines/?cf_monitoring=snmp&status=active
        refresh_interval: 10m
        authorization:
          type: Token
          credentials_file: /etc/prometheus/netbox.token
    metrics_path: /snmp
    relabel_configs:
      # Labels which control scraping
      - source_labels: [__address__]
        target_label: instance
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__meta_netbox_primary_ip]
        regex: '(.+)'
        target_label: __param_target
      - source_labels: [__meta_netbox_custom_field_snmp_module]
        target_label: __param_module
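      # Ugh: multiselect is of form ['foo', 'bar'] and we need foo,bar. There is no gsub,
      # so strip the brackets first, then join one pair of values per rule.
      # The three identical rules below therefore handle up to four modules.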
      - source_labels: [__param_module]
        regex: "\\['(.*)'\\]"
        target_label: __param_module
      - source_labels: [__param_module]
        regex: "(.*)', *'(.*)"
        replacement: "$1,$2"
        target_label: __param_module
      - source_labels: [__param_module]
        regex: "(.*)', *'(.*)"
        replacement: "$1,$2"
        target_label: __param_module
      - source_labels: [__param_module]
        regex: "(.*)', *'(.*)"
        replacement: "$1,$2"
        target_label: __param_module
      - source_labels: [__meta_netbox_custom_field_snmp_auth]
        target_label: __param_auth
      - target_label: __address__
        replacement: 127.0.0.1:9116  # SNMP exporter
      # Optional extra metadata labels
      - source_labels: [__param_module]
        target_label: module
      - source_labels: [__meta_netbox_cluster_slug]
        target_label: cluster
      - source_labels: [__meta_netbox_device_type_slug]
        target_label: device_type
      - source_labels: [__meta_netbox_model]
        target_label: netbox_model
      - source_labels: [__meta_netbox_platform_slug]
        target_label: platform
      - source_labels: [__meta_netbox_role_slug]
        target_label: role
      - source_labels: [__meta_netbox_site_slug]
        target_label: site
      - source_labels: [__meta_netbox_tag_slugs]
        target_label: tags
      - source_labels: [__meta_netbox_tenant_slug]
        target_label: tenant

  - job_name: icmp
    scrape_interval: 1m
    http_sd_configs:
      - url: https://netbox.example.net/api/plugins/prometheus-sd/devices/?cf_monitoring=icmp&status=active
        refresh_interval: 10m
        authorization:
          type: Token
          credentials_file: /etc/prometheus/netbox.token
      - url: https://netbox.example.net/api/plugins/prometheus-sd/virtual-machines/?cf_monitoring=icmp&status=active
        refresh_interval: 10m
        authorization:
          type: Token
          credentials_file: /etc/prometheus/netbox.token
    metrics_path: /probe
    params:
      module: [icmp]
    relabel_configs:
      # Labels which control scraping
      - source_labels: [__address__]
        target_label: instance
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__meta_netbox_primary_ip]
        regex: '(.+)'
        target_label: __param_target
      - target_label: __address__
        replacement: 127.0.0.1:9115  # Blackbox exporter
      # Optional extra metadata labels
      - source_labels: [__meta_netbox_cluster_slug]
        target_label: cluster
      - source_labels: [__meta_netbox_device_type_slug]
        target_label: device_type
      - source_labels: [__meta_netbox_model]
        target_label: netbox_model
      - source_labels: [__meta_netbox_platform_slug]
        target_label: platform
      - source_labels: [__meta_netbox_role_slug]
        target_label: role
      - source_labels: [__meta_netbox_site_slug]
        target_label: site
      - source_labels: [__meta_netbox_tag_slugs]
        target_label: tags
      - source_labels: [__meta_netbox_tenant_slug]
        target_label: tenant

  - job_name: node
    scrape_interval: 1m
    scrape_timeout: 50s
    http_sd_configs:
      - url: https://netbox.example.net/api/plugins/prometheus-sd/devices/?cf_monitoring=node&status=active
        refresh_interval: 10m
        authorization:
          type: Token
          credentials_file: /etc/prometheus/netbox.token
      - url: https://netbox.example.net/api/plugins/prometheus-sd/virtual-machines/?cf_monitoring=node&status=active
        refresh_interval: 10m
        authorization:
          type: Token
          credentials_file: /etc/prometheus/netbox.token
    metrics_path: /metrics
    relabel_configs:
      # Labels which control scraping
      - source_labels: [__address__]
        target_label: instance
      - source_labels: [__meta_netbox_primary_ip4]
        regex: '(.+)'
        target_label: __address__
      - source_labels: [__meta_netbox_primary_ip6]
        regex: '(.+)'
        target_label: __address__
        replacement: '[${1}]'
      - source_labels: [__address__]
        target_label: __address__
        replacement: '${1}:9100'
      # Optional extra metadata labels
      - source_labels: [__meta_netbox_cluster_slug]
        target_label: cluster
      - source_labels: [__meta_netbox_device_type_slug]
        target_label: device_type
      - source_labels: [__meta_netbox_model]
        target_label: netbox_model
      - source_labels: [__meta_netbox_platform_slug]
        target_label: platform
      - source_labels: [__meta_netbox_role_slug]
        target_label: role
      - source_labels: [__meta_netbox_site_slug]
        target_label: site
      - source_labels: [__meta_netbox_tag_slugs]
        target_label: tags
      - source_labels: [__meta_netbox_tenant_slug]
        target_label: tenant

  # Monitoring of netbox itself
  - job_name: netbox
    scrape_interval: 2m
    scheme: https
    authorization:
      type: Token
      credentials_file: /etc/prometheus/netbox.token
    static_configs:
      - targets: ['netbox.example.net:443']
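
The bracket-stripping and joining rules in the snmp job can be checked offline. The following is a minimal Python sketch, not part of the original config: it assumes the multi-select label arrives as a string like ['foo', 'bar'] (as the config comment says), imitates Prometheus's fully anchored regex matching with re.fullmatch, and uses made-up helper names (relabel, multiselect_to_csv) and example module names.

```python
import re


def relabel(value, regex, replacement="$1"):
    """Mimic one Prometheus `replace` step whose source and target are the same label."""
    match = re.fullmatch(regex, value)  # Prometheus anchors relabel regexes at both ends
    if match is None:
        return value  # no match: the label is left unchanged
    # Translate $1 / ${1} references into Python's \g<1> template syntax
    template = re.sub(r"\$\{?(\d+)\}?", r"\\g<\1>", replacement)
    return match.expand(template)


def multiselect_to_csv(raw, passes=3):
    """Apply the bracket-stripping rule, then `passes` copies of the joining rule."""
    value = relabel(raw, r"\['(.*)'\]")
    for _ in range(passes):
        value = relabel(value, r"(.*)', *'(.*)", "$1,$2")
    return value


# Hypothetical module names, for illustration only
print(multiselect_to_csv("['if_mib']"))
print(multiselect_to_csv("['if_mib', 'ip_mib', 'hrSystem', 'ucd_la_table']"))
print(multiselect_to_csv("['a', 'b', 'c', 'd', 'e']"))
```

Each joining rule removes only the last remaining quote-comma-quote separator, because the first group is greedy. With three joining rules, a fifth selected module would leave one separator unjoined, so the number of repeated rules needs to be at least the maximum number of selected modules minus one.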


I added the "icmp" monitoring option only for completeness and haven't really tested it yet.
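
On the node job above, the order of the relabel rules matters: instance is copied from the discovered address first, then __address__ is overwritten by the primary IPv4 address if one is set, then by the bracketed primary IPv6 address if one is set, and finally the port is appended. A rough Python sketch of that sequence follows. It assumes the plugin returns the device name as __address__ and bare IP addresses without a prefix length; the helper names and sample values are made up for illustration.

```python
import re


def relabel(labels, source, target, regex="(.*)", replacement="$1"):
    """Mimic one Prometheus `replace` step with a single source label."""
    match = re.fullmatch(regex, labels.get(source, ""))  # regexes are fully anchored
    if match is None:
        return  # no match: the target label is left unchanged
    # Translate $1 / ${1} references into Python's \g<1> template syntax
    template = re.sub(r"\$\{?(\d+)\}?", r"\\g<\1>", replacement)
    labels[target] = match.expand(template)


def node_target(name, ip4="", ip6=""):
    """Return (instance, __address__) after the node job's scrape-control rules."""
    labels = {
        "__address__": name,
        "__meta_netbox_primary_ip4": ip4,
        "__meta_netbox_primary_ip6": ip6,
    }
    relabel(labels, "__address__", "instance")
    relabel(labels, "__meta_netbox_primary_ip4", "__address__", "(.+)")
    relabel(labels, "__meta_netbox_primary_ip6", "__address__", "(.+)", "[${1}]")
    relabel(labels, "__address__", "__address__", replacement="${1}:9100")
    return labels["instance"], labels["__address__"]


# Documentation-range addresses and a made-up device name
print(node_target("sw1", ip4="192.0.2.10"))
print(node_target("sw1", ip4="192.0.2.10", ip6="2001:db8::1"))
print(node_target("sw1"))
```

In this sketch a device with both address families is scraped over IPv6, and a device with no primary IP falls back to its name plus the port, so the name would need to resolve in DNS.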

I'm still on static files for other blackbox_exporter tests though. Although Netbox does have a "service" model, there are things I want to monitor which are not attached to devices or VMs. I'm currently thinking about using config contexts and/or attaching blackbox monitoring directly to an IP address object rather than a device or VM. Discussion: https://github.com/netbox-community/netbox/discussions/14261

(That shows a slightly older Prometheus config where I was using tags to enable the various monitoring types; a custom field is actually simpler)

Elliott Balsley

Dec 14, 2023, 8:08:42 PM
to Prometheus Users
Thanks Brian, this is helpful.
It seems like the status=active filter would be a nice way to take devices out of monitoring during long-term outages.

I'm curious what naming convention you use for devices in Netbox. Since device names are required to be unique within a site, I'm thinking about using a short "computer-friendly" convention and then putting a "human-readable" functional name in the "description" field. This functional name will be used to label Grafana panels, so it only needs to be unique within its local context (room, rack, workflow, etc., depending on the situation). A name that is unique site-wide would be excessively long and hard to read there.

My problem is, the "description" field in Netbox is not exposed by the Prometheus SD plugin.  So I'm wondering how other people approach this?

Brian Candler

Dec 15, 2023, 3:48:35 AM
to Prometheus Users
Personally I use a short device name (without domain) and generally it's unique; I use that as the "instance" label, and it's what people expect to see in graphs and menus anyway, i.e. what they know the device as colloquially.

I also add a label for netbox model (device or vm). But to avoid issues with naming conflicts in the metrics themselves, I probably ought to get around to adding the device_id or vm_id as an additional label.

You could raise a FR to netbox-plugin-prometheus-sd to include the description as a label. It would be a pretty simple change.