Unable to scrape SNMP targets using snmp_exporter


Kodo65

Nov 26, 2019, 3:23:24 AM
to Prometheus Users
I'm unable to get the snmp_exporter to scrape any device over SNMP. I've created the following snmp.yml file (stripped for brevity):

    if_mib:
      version: 2
      auth:
        community: public
      timeout: 120s
      walk:
      - 1.3.6.1.2.1.2
      get:
      - 1.3.6.1.2.1.1.3.0
      - 1.3.6.1.2.1.31.1.1.1.6.40
      metrics:
      - name: sysUpTime
        oid: 1.3.6.1.2.1.1.3
        type: gauge
        help: The time (in hundredths of a second) since the network management portion
          of the system was last re-initialized. - 1.3.6.1.2.1.1.3
      - name: ifNumber
        oid: 1.3.6.1.2.1.2.1
        type: gauge
        help: The number of network interfaces (regardless of their current state) present
          on this system. - 1.3.6.1.2.1.2.1


If I issue a simple snmpwalk:

    snmpwalk -v2c -c public x.y.z.t 1.3.6.1.2.1.1.3.0

or

    snmpbulkwalk -v2c -c public -Cr25 -On x.y.z.t 1.3.6.1.2.1.1.3.0

I get an *instant* (< 0.1 s) response. However, if I issue the following command towards the snmp_exporter (port 9116):

    curl 'my_machine:9116/snmp?module=if_mib&target=x.y.z.t'

it hangs, and after 120 s I get a 'Request timeout (120s)' error.

I'm running the following components:
  • Ubuntu 18.04 "Bionic"
  • go version 1.13.4
  • snmp_exporter version 0.15.0
  • prometheus version 2.14.0

What am I doing wrong here?

Brian Brazil

Nov 26, 2019, 3:26:05 AM
to Kodo65, Prometheus Users
We'd be doing an snmpget on that OID. Is that working?

Brian
 


--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/16d145d4-6385-4201-8392-ec19dc41d7ac%40googlegroups.com.



Kodo65

Nov 26, 2019, 3:48:08 AM
to Prometheus Users
Hi Brian!

Thanks for a very prompt reply :)

Yes, issuing the following:

    snmpget -v2c -c public -On 172.30.10.251 1.3.6.1.2.1.1.3.0

and

    snmpget -v2c -c public -On 172.30.10.251 1.3.6.1.2.1.31.1.1.1.6.40

is indeed *instantaneous*, and each returns one value.


Kodo65

Nov 26, 2019, 4:32:29 AM
to Prometheus Users
Just to clarify my response: the snmp_exporter still fails... :(

Brian Brazil

Nov 26, 2019, 6:16:54 AM
to Kodo65, Prometheus Users

That's a bit odd then. If you turn on debug logging on the snmp exporter what is it showing?

Brian
 

Kodo65

Nov 26, 2019, 6:19:40 AM
to Prometheus Users
Hi Brian!

Sample output rows below:

DEBU[4400] Scraping target '172.30.14.251' with module 'if_mib'  source="main.go:87"
DEBU[4400] Getting 2 OIDs from target "172.30.14.251"    source="collector.go:126"
DEBU[4430] Scraping target '172.30.10.251' with module 'if_mib'  source="main.go:87"
DEBU[4430] Getting 2 OIDs from target "172.30.10.251"    source="collector.go:126"
INFO[4460] Error scraping target 172.30.14.251: error getting target 172.30.14.251: Request timeout (after 3 retries)  source="collector.go:211"
DEBU[4460] Scrape of target '172.30.14.251' with module 'if_mib' took 180.015173 seconds  source="main.go:98"
INFO[4490] Error scraping target 172.30.10.251: error getting target 172.30.10.251: Request timeout (after 3 retries)  source="collector.go:211"
DEBU[4490] Scrape of target '172.30.10.251' with module 'if_mib' took 180.004774 seconds  source="main.go:98"




Brian Brazil

Nov 26, 2019, 6:26:27 AM
to Kodo65, Prometheus Users

That's not what our logs look like, that's an old binary. Can you try with the latest release?

Brian
 

Kodo65

Nov 26, 2019, 6:29:12 AM
to Prometheus Users
It's version 0.15.0 of the snmp_exporter when I issue the snmp_exporter --version command.

Is that an old version?




Brian Brazil

Nov 26, 2019, 6:39:41 AM
to Kodo65, Prometheus Users

Ah no, that's the latest; I just haven't released a version since the logging change went in. Features that actually require a release aren't very common.

It's the get that's failing anyway; it hasn't even reached the walk stage. Can you have a peek with tcpdump?

Brian
 

Kodo65

Nov 26, 2019, 6:51:33 AM
to Prometheus Users
Hi Brian!

I just did that, and I think I know what's wrong here. I'm running this in a VirtualBox VM with two interfaces, enp0s3 and enp0s8. enp0s8 is a bridged network adapter connecting to the switches over an openvpn/tap interface. When I start the snmp_exporter and invoke tcpdump, I can see UDP traffic on the enp0s3 interface; however, the switches are reached over the bridged interface (enp0s8)... Hmmm, how can we solve this?
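(Editor's note: a quick way to check which source address, and thus which interface, the kernel picks for a given target is the connect()/getsockname() trick on a UDP socket; connect() on a UDP socket sends no packets, it only runs route selection. A minimal Python 3 sketch; the loopback target below is used only so the example runs anywhere, the switch addresses from this thread would be the real input:)

```python
import socket

def source_address_for(target_ip: str, port: int = 161) -> str:
    """Return the local source address the kernel picks to reach target_ip.

    connect() on a UDP socket transmits nothing; it only performs route
    selection, so getsockname() reveals which interface would be used.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect((target_ip, port))
        return s.getsockname()[0]
    finally:
        s.close()

# With a switch address such as 172.30.10.251 this would print either the
# enp0s3 or the enp0s8 address, showing which route the kernel chose.
print(source_address_for("127.0.0.1"))
```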


Brian Brazil

Nov 26, 2019, 7:09:47 AM
to Kodo65, Prometheus Users

It sounds like your routing tables need adjusting.

Brian
 

Kodo65

Nov 26, 2019, 8:09:31 AM
to Prometheus Users
Hmmm, maybe. However, it seems odd that I'm still able to scrape physical hosts residing on the 172.30.15.0 net using the node_exporter from the same Prometheus server instance. The fact that I'm also able to perform snmpget/snmpwalk operations towards the switches I'm currently unable to scrape with the snmp_exporter leaves me feeling a little lost here...


Brian Brazil

Nov 26, 2019, 8:57:14 AM
to Kodo65, Prometheus Users

It sounds like you might have something beyond normal routing going on (such as source routing or iptables), and/or netsnmp is doing something fancy rather than merely using the default source address.

Have you tried the snmp commands from the exact same machine/user as the snmp_exporter is running on?

Brian
 

Kodo65

Nov 26, 2019, 9:01:00 AM
to Prometheus Users
Hi Brian! Yes, this is the case. Everything Prometheus/Grafana-related runs in the same VM. Everything else is working "as advertised"; it's only the snmp_exporter that fails...


Brian Brazil

Nov 26, 2019, 9:05:46 AM
to Kodo65, Prometheus Users

I'd break out strace and tcpdump at this stage then.

Brian
 

Kodo65

Nov 26, 2019, 9:44:47 AM
to Prometheus Users
    sudo strace -v -e trace=%network -f -p 5614

This resulted in the following (5614 is the pid of the snmp_exporter process):

[pid  5625] accept4(3, 0xc0002459e0, [112], SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
[pid  5620] socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 6
[pid  5620] setsockopt(6, SOL_SOCKET, SO_BROADCAST, [1], 4) = 0
[pid  5620] connect(6, {sa_family=AF_INET, sin_port=htons(161), sin_addr=inet_addr("172.30.10.251")}, 16) = 0
[pid  5620] getsockname(6, {sa_family=AF_INET, sin_port=htons(55947), sin_addr=inet_addr("10.0.2.15")}, [112->16]) = 0
[pid  5620] getpeername(6, {sa_family=AF_INET, sin_port=htons(161), sin_addr=inet_addr("172.30.10.251")}, [112->16]) = 0
[pid  5619] accept4(3, {sa_family=AF_INET6, sin6_port=htons(42850), inet_pton(AF_INET6, "::ffff:127.0.0.1", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=0}, [112->28], SOCK_CLOEXEC|SOCK_NONBLOCK) = 8
[pid  5619] getsockname(8, {sa_family=AF_INET6, sin6_port=htons(9116), inet_pton(AF_INET6, "::ffff:127.0.1.1", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=0}, [112->28]) = 0
[pid  5619] setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4) = 0
[pid  5619] setsockopt(8, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
[pid  5619] setsockopt(8, SOL_TCP, TCP_KEEPINTVL, [180], 4) = 0
[pid  5619] setsockopt(8, SOL_TCP, TCP_KEEPIDLE, [180], 4) = 0
[pid  5619] accept4(3, 0xc0002459e0, [112], SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
[pid  5619] socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 10
[pid  5619] setsockopt(10, SOL_SOCKET, SO_BROADCAST, [1], 4) = 0
[pid  5619] connect(10, {sa_family=AF_INET, sin_port=htons(161), sin_addr=inet_addr("172.30.10.251")}, 16) = 0
[pid  5619] getsockname(10, {sa_family=AF_INET, sin_port=htons(60610), sin_addr=inet_addr("10.0.2.15")}, [112->16]) = 0
[pid  5619] getpeername(10, {sa_family=AF_INET, sin_port=htons(161), sin_addr=inet_addr("172.30.10.251")}, [112->16]) = 0
[pid  5617] socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 11
[pid  5617] setsockopt(11, SOL_SOCKET, SO_BROADCAST, [1], 4) = 0
[pid  5617] connect(11, {sa_family=AF_INET, sin_port=htons(161), sin_addr=inet_addr("172.30.14.251")}, 16) = 0
[pid  5617] getsockname(11, {sa_family=AF_INET, sin_port=htons(40901), sin_addr=inet_addr("10.0.2.15")}, [112->16]) = 0
[pid  5617] getpeername(11, {sa_family=AF_INET, sin_port=htons(161), sin_addr=inet_addr("172.30.14.251")}, [112->16]) = 0
[pid  5619] socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 12
[pid  5619] setsockopt(12, SOL_SOCKET, SO_BROADCAST, [1], 4) = 0
[pid  5619] connect(12, {sa_family=AF_INET, sin_port=htons(161), sin_addr=inet_addr("172.30.10.251")}, 16) = 0
[pid  5619] getsockname(12, {sa_family=AF_INET, sin_port=htons(50897), sin_addr=inet_addr("10.0.2.15")}, [112->16]) = 0
[pid  5619] getpeername(12, {sa_family=AF_INET, sin_port=htons(161), sin_addr=inet_addr("172.30.10.251")}, [112->16]) = 0



Kodo65

Nov 27, 2019, 7:20:39 AM
to Prometheus Users
Hi again Brian!

Sorry for bothering you again with this :( I get the feeling I'm missing something VERY obvious here. What's very interesting, though, is that when I use the snmp_exporter UI with the values 172.30.10.251/if_mib, tcpdump (command below) shows an INSTANT response, BUT the UI still times out! The UI obviously sends the correct parameters and GETS a response according to tcpdump, yet the snmp_exporter times out. Please have a look below...

    tcpdump -i any -nn port snmp

(The following lines are the result from the UI submit)
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
13:02:53.735061 IP 10.0.2.15.51148 > 172.30.10.251.161:  GetRequest(45)  .1.3.6.1.2.1.1.3.0 .1.3.6.1.2.1.31.1.1.1.6.40
13:02:53.740890 IP 10.0.2.2.161 > 10.0.2.15.51148:  GetResponse(54)  .1.3.6.1.2.1.1.3.0=174994816 .1.3.6.1.2.1.31.1.1.1.6.40=319439595564

which is EXACTLY (well, almost, as the values have changed slightly of course) what I get when issuing the commands myself:

(My issued commands)

    snmpget -v2c -c public 172.30.10.251 1.3.6.1.2.1.1.3.0
    iso.3.6.1.2.1.1.3.0 = Timeticks: (175003362) 20 days, 6:07:13.62

    snmpget -v2c -c public 172.30.10.251 .1.3.6.1.2.1.31.1.1.1.6.40
    iso.3.6.1.2.1.31.1.1.1.6.40 = Counter64: 319439648306

I'm not a network guy by far, so making sense of this is hard, to say the least :) Do you have any ideas?


Brian Brazil

Nov 27, 2019, 7:35:46 AM
to Kodo65, Prometheus Users

The IP addresses are different (the request goes to 172.30.10.251.161, but the response comes back from 10.0.2.2.161); this smells like broken NAT.

Brian
 

Kodo65

Dec 2, 2019, 3:35:00 AM
to Prometheus Users
Hi Brian,

I hear what you're saying and you're probably right. However, I'm puzzled by the fact that the "ordinary" SNMP client tools such as snmpget and snmpwalk work without a glitch, with sub-second responses... Could it be that I'm missing some intricate Go dependency here?


Brian Brazil

Dec 2, 2019, 3:49:16 AM
to Kodo65, Prometheus Users

I'd guess that netsnmp is failing to filter responses to ensure they're coming from the address it sent to, or is binding the socket a bit differently somehow. I'd consider the snmp exporter's behaviour to be the correct one here.

Brian
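(Editor's note: the filtering described here can be demonstrated locally. The strace output earlier in the thread shows the exporter calling connect() on its UDP socket; a connected UDP socket only delivers datagrams whose source address and port match the connected peer, so a reply whose source was rewritten by NAT to 10.0.2.2 is silently dropped by the kernel. A small Python 3 sketch on loopback, using different source ports to stand in for the NAT rewrite:)

```python
import socket

def connected_udp_demo() -> tuple:
    """Show that a connected UDP socket drops datagrams from the wrong peer."""
    # The "exporter" side: connect() pins the expected peer address:port.
    exporter = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    exporter.bind(("127.0.0.1", 0))
    exporter.settimeout(0.2)

    # The "agent" the exporter actually queried.
    agent = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    agent.bind(("127.0.0.1", 0))

    # Same host, different port: stands in for a NAT box rewriting the
    # reply's source (10.0.2.2 instead of 172.30.10.251 in the thread).
    imposter = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    imposter.bind(("127.0.0.1", 0))

    exporter.connect(agent.getsockname())
    exp_addr = exporter.getsockname()

    imposter.sendto(b"reply", exp_addr)
    try:
        exporter.recv(64)
        imposter_dropped = False
    except socket.timeout:
        imposter_dropped = True   # kernel filtered the mismatched source

    agent.sendto(b"reply", exp_addr)
    data = exporter.recv(64)      # matching source is delivered

    for s in (exporter, agent, imposter):
        s.close()
    return imposter_dropped, data

print(connected_udp_demo())
```

Tools that use unconnected sockets and recvfrom() (as snmpget appears to, given it worked here) accept the rewritten reply, which is why the command-line tools succeeded while the exporter timed out.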
 

Ben Kochie

Dec 2, 2019, 4:04:57 AM
to Brian Brazil, Kodo65, Prometheus Users
Another problem with netsnmp: it ignores ICMP destination-unreachable messages. The snmp_exporter respects these messages, which is really handy for detecting when SNMP targets are blocked or down.
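(Editor's note: that difference is easy to see locally too. On Linux, a UDP datagram sent to a closed port triggers an ICMP port-unreachable, and a *connected* socket surfaces it as ECONNREFUSED on the next send/recv, while an unconnected one never sees it and just times out. A Python 3 sketch probing a loopback port chosen to have no listener:)

```python
import socket

def probe_closed_udp_port(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if an ICMP port-unreachable was surfaced for host:port.

    This only works with a connected UDP socket: the kernel maps the ICMP
    error onto the socket as ECONNREFUSED. An unconnected sendto()-style
    socket never sees the error and simply times out.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.settimeout(timeout)
    s.connect((host, port))
    try:
        s.send(b"probe")
        s.recv(64)
        return False            # unexpected: something answered
    except ConnectionRefusedError:
        return True             # ICMP port unreachable delivered
    except socket.timeout:
        return False            # no ICMP came back (filtered or dropped)
    finally:
        s.close()

# Grab an ephemeral UDP port with no listener by binding and releasing it.
tmp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
tmp.bind(("127.0.0.1", 0))
unused_port = tmp.getsockname()[1]
tmp.close()
print(probe_closed_udp_port("127.0.0.1", unused_port))
```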

Kodo65

Dec 2, 2019, 4:14:58 AM
to Prometheus Users
OK, hmmm. Have you heard of similar problems from other users running the snmp_exporter in a VirtualBox 6 VM (Ubuntu 18.04 Bionic) with NAT networking?
