snmp_exporter and job showing down


dari...@gmail.com

Aug 13, 2020, 9:07:51 AM
to Prometheus Users
Hello,

I run snmp_exporter against various network devices, and it works fine most of the time. However, I sometimes notice false alerts, and when I look at the job there are gaps in the data. snmp_exporter doesn't register these as 0; the samples are simply missing. Looking at the snmp_exporter logs, I noticed messages like this one:

2020-08-13_11:06:01.15743 level=info ts=2020-08-13T11:06:01.157Z caller=collector.go:224 module=protocols target=tor1.mydomain msg="Error scraping target" err="error connecting to target tor1.mydomain: error establishing connection to host: dial udp: lookup tor1.mydomain on 10.1.1.5:53: dial udp 10.1.1.5:53: connect: resource temporarily unavailable"

This seems like a DNS issue? I just want to confirm that this is in fact the problem, since searching online for this error didn't turn up much. Also, if I have 3 DNS servers in my resolv.conf, why wouldn't it try each one before failing?

Thanks
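
One way to check whether name resolution on the exporter host fails intermittently is to resolve the target in a loop and record the failures. The following is a minimal sketch, not part of snmp_exporter: `probe_dns` and its parameters are hypothetical names, and port 161 is assumed only because it is the standard SNMP port. Note that Python resolves through the system's `getaddrinfo`, while a Go binary may use its own resolver, so a clean run here does not rule out a problem on the exporter side.

```python
import socket
import time


def probe_dns(host, attempts=20, delay=0.5, resolve=socket.getaddrinfo):
    """Resolve `host` repeatedly; return a list of (attempt, error) failures.

    `resolve` is injectable so the loop can be exercised without real DNS.
    """
    failures = []
    for attempt in range(1, attempts + 1):
        try:
            # Ask for a UDP address on the SNMP port (assumed: 161).
            resolve(host, 161, type=socket.SOCK_DGRAM)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            failures.append((attempt, str(exc)))
        if delay and attempt < attempts:
            time.sleep(delay)
    return failures
```

Running `probe_dns("tor1.mydomain")` on the exporter host around the time of a gap, and comparing the failed attempts with the exporter log timestamps, should show whether the two line up.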

Ben Kochie

unread,
Aug 13, 2020, 10:28:11 AM8/13/20
to dari...@gmail.com, Prometheus Users
Yes, that looks like a DNS issue to me.

What is the prometheus.yml configuration for this?

dari...@gmail.com

Aug 13, 2020, 10:36:10 AM
to Prometheus Users
boilerplate..

- job_name: 'protocols'
  file_sd_configs:
    - files:
        - protocols_targets.json
  metrics_path: /snmp
  params:
    module: [protocols]
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: 127.0.0.1:30964 # The SNMP exporter's real hostname:port.
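
For reference, the relabeling above moves the device name into the `target` URL parameter and points the scrape at the exporter itself, so Prometheus never resolves `tor1.mydomain`; the exporter does, which matches the log line. A rough sketch of the request that results (`scrape_url` is a hypothetical helper for illustration, not a Prometheus API):

```python
from urllib.parse import urlencode


def scrape_url(target, module="protocols", exporter="127.0.0.1:30964", path="/snmp"):
    """Build the URL Prometheus would request after the relabel rules apply."""
    # __param_target and params.module both end up in the query string.
    query = urlencode({"module": module, "target": target})
    # __address__ was replaced with the exporter's own host:port.
    return f"http://{exporter}{path}?{query}"
```

Fetching that URL by hand (for example with curl) while the job shows gaps is a quick way to see the exporter's error without waiting for the next scrape.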

Ben Kochie

unread,
Aug 13, 2020, 11:18:49 AM8/13/20
to dari...@gmail.com, Prometheus Users
It looks like the error is coming from the SNMP implementation, when trying to use a standard net Dialer.


If my understanding of the code is correct, this should be using a normal Golang DNS resolver.


But the root error "connect: resource temporarily unavailable" is a bit weird to me.

Ben Kochie

Aug 13, 2020, 11:20:15 AM
to dari...@gmail.com, Prometheus Users
One thought, maybe there's an open file descriptor limit on the snmp_exporter that's being hit?
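
A quick way to test that theory on Linux is to compare the number of descriptors the process has open with its "Max open files" limit. This is a minimal sketch that assumes a Linux `/proc` filesystem and that you already know the snmp_exporter PID; `parse_max_open_files` and `fd_usage` are hypothetical helper names, and reading another user's `/proc/<pid>/fd` typically needs root.

```python
import os
import re


def parse_max_open_files(limits_text):
    """Return (soft, hard) from /proc/<pid>/limits text; None means unlimited."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns are separated by runs of spaces: name, soft, hard, units.
            soft, hard = re.split(r"\s{2,}", line.strip())[1:3]
            return tuple(int(v) if v.isdigit() else None for v in (soft, hard))
    return None


def fd_usage(pid):
    """Return (open_fd_count, (soft, hard)) for a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as fh:
        limits = parse_max_open_files(fh.read())
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    return open_fds, limits
```

If the open count sits close to the soft limit when the gaps appear, the limit is a plausible cause; if it stays far below, the problem is probably elsewhere.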