snmp_exporter inaccurate metrics for high bandwidth network interfaces


Brian Lee

Aug 5, 2021, 5:07:40 PM
to Prometheus Users
Hi!

I have several new devices that all share the same snmp_exporter problem: invalid metrics on high bandwidth network interfaces.  They're all the same make/model (Checkpoint MHO170). And they all push 1-3Gbps on a daily basis, depending on the time of day. The network interfaces with the high bandwidth have wildly inaccurate metrics according to snmp_exporter. Any advice on how to troubleshoot this?

For example, here's irate ifOutOctets *8 (preferring bits over bytes) on an interface over the last day:

snmp_exporter-eth2-09x.jpg

I happen to have access to a 3rd party monitoring tool, and here's the same interface over the same period (blue line is egress):

3rd-party-eth2-09.jpg

$ curl <snmp_exporter> | grep eth2-09
ifAdminStatus{ifDescr="eth2-09",ifName="eth2-09"} 1
ifDescr{ifDescr="eth2-09",ifName="eth2-09"} 1
ifInDiscards{ifDescr="eth2-09",ifName="eth2-09"} 36
ifInErrors{ifDescr="eth2-09",ifName="eth2-09"} 0
ifInNUcastPkts{ifDescr="eth2-09",ifName="eth2-09"} 0
ifInOctets{ifDescr="eth2-09",ifName="eth2-09"} 1.535609325e+09
ifInUcastPkts{ifDescr="eth2-09",ifName="eth2-09"} 2.6149782e+07
ifInUnknownProtos{ifDescr="eth2-09",ifName="eth2-09"} 0
ifIndex{ifDescr="eth2-09",ifName="eth2-09"} 103
ifLastChange{ifDescr="eth2-09",ifName="eth2-09"} 0
ifMtu{ifDescr="eth2-09",ifName="eth2-09"} 1500
ifOperStatus{ifDescr="eth2-09",ifName="eth2-09"} 1
ifOutDiscards{ifDescr="eth2-09",ifName="eth2-09"} 0
ifOutErrors{ifDescr="eth2-09",ifName="eth2-09"} 0
ifOutNUcastPkts{ifDescr="eth2-09",ifName="eth2-09"} 0
ifOutOctets{ifDescr="eth2-09",ifName="eth2-09"} 1.738399962e+09
ifOutQLen{ifDescr="eth2-09",ifName="eth2-09"} 0
ifOutUcastPkts{ifDescr="eth2-09",ifName="eth2-09"} 3.217078663e+09
ifPhysAddress{ifDescr="eth2-09",ifName="eth2-09",ifPhysAddress="00:1C:7F:<redacted>"} 1
ifSpecific{ifDescr="eth2-09",ifName="eth2-09",ifSpecific="0.0"} 1
ifSpeed{ifDescr="eth2-09",ifName="eth2-09"} 4.294967295e+09
ifType_info{ifDescr="eth2-09",ifName="eth2-09",ifType="ethernetCsmacd"} 1

Brian Candler

Aug 6, 2021, 3:00:12 AM
to Prometheus Users
Yes, that's expected.  ifInOctets / ifOutOctets are 32-bit counters; a link running at 1Gbps (125,000,000 bytes per second) will wrap a 32-bit counter every 34 seconds.
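
To make the wrap arithmetic concrete, here is a minimal Python sketch. It assumes a perfectly sustained line rate, and the helper name seconds_to_wrap is made up for illustration. It simply divides the 32-bit counter range (2^32 octets) by the byte rate.

```python
# Time for an SNMP Counter32 octet counter to wrap at a sustained line rate.
# Assumption: traffic is perfectly steady, which real links are not.
COUNTER32_MODULUS = 2 ** 32  # a Counter32 rolls over to 0 after 2^32 - 1


def seconds_to_wrap(bits_per_second: float, modulus: int = COUNTER32_MODULUS) -> float:
    """Seconds until an octet counter with the given modulus rolls over."""
    bytes_per_second = bits_per_second / 8
    return modulus / bytes_per_second


if __name__ == "__main__":
    for gbps in (1, 2, 3):
        print(f"{gbps} Gbps: wraps every {seconds_to_wrap(gbps * 1e9):.1f} s")
```

At 1 Gbps this gives roughly 34 seconds, and at the 3 Gbps peak mentioned earlier in the thread the counter would wrap in under 12 seconds.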

You need to query ifHCInOctets / ifHCOutOctets instead, which are 64-bit counters.
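
As a rough illustration of why the 32-bit counters produce wrong graphs, the Python sketch below compares what a poller would compute from a 32-bit and a 64-bit counter over one scrape interval. The 60-second interval, the function names, and the reset rule (a lower reading is treated as a restart from zero) are assumptions for illustration, not a copy of Prometheus internals.

```python
# Compare the rate derived from a wrapping 32-bit counter with a 64-bit one.
# Assumptions: steady 1 Gbps traffic, a 60 s scrape interval, and a simplified
# reset rule where a lower reading is treated as a restart from zero.
MOD32 = 2 ** 32
MOD64 = 2 ** 64


def sample(true_bytes: int, modulus: int) -> int:
    """Value the device reports for a counter of the given width."""
    return true_bytes % modulus


def reset_aware_delta(prev: int, cur: int) -> int:
    """Increase between two readings, treating a drop as a reset to zero."""
    return cur - prev if cur >= prev else cur


def observed_bps(rate_bytes_per_s: int, interval_s: int, modulus: int,
                 start_bytes: int = 0) -> float:
    """Bits per second a poller would compute from two consecutive samples."""
    prev = sample(start_bytes, modulus)
    cur = sample(start_bytes + rate_bytes_per_s * interval_s, modulus)
    return reset_aware_delta(prev, cur) * 8 / interval_s


if __name__ == "__main__":
    true_rate = 125_000_000  # bytes per second, i.e. 1 Gbps
    interval = 60            # seconds between scrapes (assumed)
    print("32-bit counter:", observed_bps(true_rate, interval, MOD32), "bps")
    print("64-bit counter:", observed_bps(true_rate, interval, MOD64), "bps")
```

Under these assumptions the 32-bit result comes out around 427 Mbps for a link that is really carrying 1 Gbps, because more than one full wrap is lost between samples. With a 60-second interval a 32-bit octet counter can never show more than about 573 Mbps (2^32 bytes per interval), whatever the real traffic is.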

Those counters almost certainly exist on your device if the "3rd party monitoring tool" gives correct answers while polling at the same interval as your Prometheus server.

Brian Lee

Aug 6, 2021, 2:48:10 PM
to Prometheus Users
Thank you! Adding the named "HC" counter OIDs to my generator.yml and graphing on those instead fixed this issue:

walk: [ interfaces, ifHCInOctets, ifHCOutOctets ]