Cannot collect metrics from Cisco device with snmp_exporter


Hieu Nguyen

Dec 18, 2019, 4:01:22 AM
to Prometheus Users

Dear Admin,

I have a problem collecting metrics from Cisco devices.

An error has occurred during metrics gathering:
error collecting metric Desc{fqName: "snmp_error", help: "Error scraping target", constLabels: {}, variableLabels: []}: error walking target 10.10.10.101: Request timeout (after 3 retries)

However, snmpwalk against the same device works, and it is very fast.

Please help me fix this problem.

80737461_1454726091351675_4893132631174021120_o.jpg
79696789_1454724314685186_8093648599801397248_n.jpg

Brian Candler

Dec 18, 2019, 5:55:40 AM
to Prometheus Users
"Timeout" simply means it failed.  If you communicate with a device but give the wrong community string, or you send from an IP address which is blocked by the device's ACL, you will get no response.

Note that your snmpbulkwalk uses a different community string than the default of "public".  I suspect the problem is that you have not set the community properly in snmp.yml.
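
A toy illustration of why a wrong community string shows up as a timeout rather than as an explicit error. This is not real SNMP: the "agent" below is a hypothetical local UDP server that answers only when the datagram equals the community it expects and silently drops everything else, which is the behaviour described above. All names and the 0.5 s timeout are assumptions for the sketch.

```python
import socket
import threading


def toy_agent(sock, community, stop):
    """Answer b"ok" to datagrams equal to `community`; silently drop the rest."""
    sock.settimeout(0.1)
    while not stop.is_set():
        try:
            data, addr = sock.recvfrom(4096)
        except socket.timeout:
            continue
        if data == community:
            sock.sendto(b"ok", addr)
        # Wrong community: no reply at all, so the client can only time out.


def query(addr, community, timeout=0.5):
    """Send one datagram and return the reply, or None if nothing came back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout)
        client.sendto(community, addr)
        try:
            return client.recvfrom(4096)[0]
        except socket.timeout:
            return None


def run_demo():
    """Query the toy agent once with the right and once with the wrong community."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # any free local port
    stop = threading.Event()
    worker = threading.Thread(target=toy_agent, args=(server, b"xxxpublic", stop))
    worker.start()
    try:
        addr = server.getsockname()
        good = query(addr, b"xxxpublic")
        bad = query(addr, b"xxxpuclic")
    finally:
        stop.set()
        worker.join()
        server.close()
    return good, bad
```

Calling `run_demo()` returns the reply for the matching community and `None` for the mismatched one: from the client's side, a one-letter typo is indistinguishable from an unreachable device.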

Therefore:

1. Show us your snmp.yml

2. Use curl to scrape snmp_exporter directly, and show us the exact command you're using: e.g.


(we don't know what you're using for "FOO")

Once you have curl scraping directly working, then you can update your prometheus configuration to match.
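
For reference, a minimal Python sketch of the same direct scrape. It assumes snmp_exporter is listening on its default port 9116 on localhost (an assumption; adjust to your setup) and uses the target address and the module name "cisco" that appear elsewhere in this thread. The function names are illustrative only.

```python
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen


def build_scrape_url(exporter, target, module):
    """Build the snmp_exporter URL for one target and one snmp.yml module."""
    return f"http://{exporter}/snmp?" + urlencode({"target": target, "module": module})


def scrape(exporter, target, module, timeout=60):
    """Fetch the scrape and return (HTTP status, body text)."""
    url = build_scrape_url(exporter, target, module)
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except HTTPError as err:
        # Error responses still carry a body; return it so the message is visible.
        return err.code, err.read().decode("utf-8", "replace")
```

For example, `scrape("localhost:9116", "10.10.10.101", "cisco")` requests `http://localhost:9116/snmp?target=10.10.10.101&module=cisco`, which is the same URL you would pass to curl.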

Daniel Swarbrick

Dec 18, 2019, 8:23:25 AM
to Prometheus Users
If I remember correctly, snmpwalk uses a lower max-repetitions than snmp_exporter by default. I experienced bogus "request timeouts" when walking certain large Juniper devices, where it would work flawlessly with snmpwalk (or snmpbulkwalk), but fail with snmp_exporter.

Try setting max_repetitions to 20 or less in your exporter config.

Ben Kochie

Dec 18, 2019, 9:01:22 AM
to Daniel Swarbrick, Prometheus Users
The default max_repetitions is 25, which matches the screenshot posted.

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/7aafd2e3-ffa2-4ed0-8fef-9bd8d633e4b1%40googlegroups.com.

Brian Candler

Dec 18, 2019, 9:05:38 AM
to Prometheus Users
That's helpful, but the OP has already ruled this out: they provided -Cr25 to snmpbulkwalk (increasing the max_repetitions from 10 to 25), and 25 is also the default from snmp_exporter.
Message has been deleted
Message has been deleted

Hieu Nguyen

Dec 18, 2019, 11:01:23 PM
to Brian Candler, Prometheus Users
Hi Brian Candler.

1. Here is my snmp.yml config. 
cisco:
  walk:
  - 1.3.6.1.2.1.2.2.1.1
  metrics:
  - name: ifIndex
    oid: 1.3.6.1.2.1.2.2.1.1
    type: gauge
    help: A unique value, greater than zero, for each interface - 1.3.6.1.2.1.2.2.1.1
    indexes:
    - labelname: ifIndex
      type: gauge
  version: 2
  max_repetitions: 25
  retries: 3
  timeout: 10s
  auth:
    community: xxxpuclic


2. I tested with this command: snmpwalk -v2c -c xxxpublic 10.10.10.101 1.3.6.1.2.1.2.2.1.1


Screenshot_1.png


Screenshot_2.png


Thanks for your support.



Hieu Nguyen

Dec 18, 2019, 11:46:34 PM
to Prometheus Users
I tried changing max_repetitions and the request timeout, but it still does not work.

Brian Candler

Dec 19, 2019, 2:10:12 AM
to Prometheus Users
In snmp.yml you are using community "xxxpuclic", but at the snmpwalk command line you're using "xxxpublic".  Both of these are different to what you posted before (but deleted).
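
A one-letter mismatch like this is easy to miss by eye, so a mechanical comparison can help. The sketch below is only an illustration: the helper names are made up, the regex takes the first `community:` line it finds instead of parsing YAML properly, and the command parser only understands the `-c` option of the net-snmp tools.

```python
import re
import shlex


def community_from_snmp_yml(text):
    """Return the first `community:` value found in snmp.yml text, or None."""
    match = re.search(r"^\s*community:\s*(\S+)\s*$", text, re.MULTILINE)
    return match.group(1).strip("'\"") if match else None


def community_from_command(command):
    """Return the value of -c from a net-snmp style command line, or None."""
    args = shlex.split(command)
    for i, arg in enumerate(args):
        if arg == "-c" and i + 1 < len(args):
            return args[i + 1]
        if arg.startswith("-c") and len(arg) > 2:  # attached form: -cNAME
            return arg[2:]
    return None


def communities_match(snmp_yml_text, command):
    """True only if both sources yield the same, non-empty community string."""
    from_yml = community_from_snmp_yml(snmp_yml_text)
    from_cmd = community_from_command(command)
    return from_yml is not None and from_yml == from_cmd
```

With the values posted in this thread, `communities_match("  auth:\n    community: xxxpuclic\n", "snmpwalk -v2c -c xxxpublic 10.10.10.101 1.3.6.1.2.1.2.2.1.1")` is `False`.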

Anyway, you can most likely debug it yourself using tcpdump: something like

tcpdump -i eth0 -nn -s0 -v udp port 161
(replace "eth0" with your interface name)

Run the scrape using curl; look at the packet it sends.  You should see it retry several times.

Then run the snmpbulkwalk that works.  Compare the first packet of that with the packets generated by the scrape.  Something will be different: e.g. the source IP address, the community string, the SNMP version, the OID.
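
If comparing the tcpdump output by eye is not enough, the start of an SNMPv1/v2c message is simple enough to decode by hand: a BER SEQUENCE containing an INTEGER version, an OCTET STRING community, and then the PDU. The sketch below reads only those three fields from the UDP payload of a captured packet. It is not a full SNMP parser, it does not handle SNMPv3, the function names are illustrative, and extracting the payload bytes from the capture is left out. The source IP address is not part of the payload and has to be compared in the tcpdump output itself.

```python
# Human-readable names for the fields we decode (SNMPv1/v2c only).
VERSIONS = {0: "v1", 1: "v2c"}
PDU_TYPES = {
    0xA0: "GetRequest",
    0xA1: "GetNextRequest",
    0xA2: "Response",
    0xA3: "SetRequest",
    0xA5: "GetBulkRequest",
}


def read_tlv(data, pos):
    """Return (tag, value, next_pos) for the BER element starting at pos."""
    tag = data[pos]
    length = data[pos + 1]
    pos += 2
    if length & 0x80:  # long form: low 7 bits give the number of length bytes
        count = length & 0x7F
        length = int.from_bytes(data[pos:pos + count], "big")
        pos += count
    return tag, data[pos:pos + length], pos + length


def snmp_header(payload):
    """Decode version, community and PDU type from an SNMPv1/v2c UDP payload."""
    tag, body, _ = read_tlv(payload, 0)
    if tag != 0x30:
        raise ValueError("not a BER SEQUENCE")
    tag, version, pos = read_tlv(body, 0)
    if tag != 0x02:
        raise ValueError("expected INTEGER version")
    tag, community, pos = read_tlv(body, pos)
    if tag != 0x04:
        raise ValueError("expected OCTET STRING community (SNMPv3 is not handled)")
    number = int.from_bytes(version, "big")
    return {
        "version": VERSIONS.get(number, str(number)),
        "community": community.decode("ascii", "replace"),
        "pdu": PDU_TYPES.get(body[pos], hex(body[pos])),
    }


def header_diff(payload_a, payload_b):
    """Return {field: (value_a, value_b)} for every header field that differs."""
    a, b = snmp_header(payload_a), snmp_header(payload_b)
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}
```

Calling `header_diff(exporter_payload, snmpbulkwalk_payload)` on the first packet from each tool returns an empty dict when version, community and PDU type all agree, and otherwise names the field that differs.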

Hieu Nguyen

Dec 19, 2019, 4:07:00 AM
to Prometheus Users
I am sure I am using the same community string for snmp.yml and snmpwalk. I wrote "xxx" because I want to hide the real community value. Sorry that I did not present the information clearly.

When I run the scrape using curl and check with tcpdump, I only see...

Screenshot_4.png


It does not work. :(

Hieu Nguyen

Dec 19, 2019, 4:22:42 AM
to Prometheus Users
But when I checked with snmpwalk, I saw packets being sent to the Prometheus server.

Screenshot_5.png

Brian Candler

Dec 19, 2019, 6:40:02 AM
to Prometheus Users
Once again, you are using a different community string in both those screenshots ("ldgpublic" versus "ldgpuclic")

Also, try to capture the *very first* packet that is sent by snmpbulkwalk, and compare it with the packets being sent by snmp_exporter.

Hieu Nguyen

Dec 19, 2019, 8:14:15 PM
to Prometheus Users

Now it works.

Thanks for your support.