sFlow counters vs SNMP counters

Munroe Sollog

Nov 15, 2021, 7:02:00 PM
to sFlow-RT
I am evaluating replacing our traditional SNMP-based interface metrics collection with an sflow-based solution.  I'm using a Cisco 3k as a testing switch.  For testing, I just enabled sflow collection on a single port-channel like so:

sflow sampling-rate : 4096
sflow max-sampled-size : 128
sflow counter-poll-interval : 20
sflow max-datagram-size : 1400
sflow collector-ip : , vrf : default
sflow collector-port : 6343
sflow agent-ip :
sflow data-source interface port-channel1

When comparing the metrics information to our existing SNMP information, I get wildly different numbers.  To make things easier, I pulled both the SNMP counters and the sflow metrics simultaneously and received this:

IF-MIB::ifInOctets.369098752 = Counter32: 2465535836
IF-MIB::ifOutOctets.369098752 = Counter32: 3786122594

sflow_ifoutoctets{agent="",datasource="369098752",ifindex="369098752",ifname="port-channel1",ifspeed="20G"} 7039.657893397139
sflow_ifinoctets{agent="",datasource="369098752",ifindex="369098752",ifname="port-channel1",ifspeed="20G"} 130994.45094944764

I pulled some of the superfluous info out, but as you can see not only are the numbers not even close, but sflow is showing a decimal point?  On a hunch I also collected 4-5 datapoints and converted the raw SNMP counters to rates to see if sflow is doing the math for me, but no joy.

Any advice would be helpful.

Peter Phaal

Nov 15, 2021, 7:21:52 PM
to sFlow-RT
The sFlow-RT sflow_ifoutoctets{} value reported by the prometheus exporter is a rate (in octets/second). The rate is computed between successive counter samples (which you have configured to be exported every 20 seconds). For example, your first entry, sflow_ifoutoctets{agent="",datasource="369098752",ifindex="369098752",ifname="port-channel1",ifspeed="20G"} 7039.657893397139, reports just over 7k octets/second (56k bits/second) averaged over the 20 second interval between the previous two counter samples received from that interface.
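To make the arithmetic concrete, here is a minimal Python sketch of turning two raw counter readings into a rate. It is only an illustration of the calculation described above, not sFlow-RT's actual code. The function names are made up for this example, `prev` reuses the ifOutOctets reading quoted earlier in the thread, and `curr` is a hypothetical second reading assumed to be taken 20 seconds later.

```python
def counter_rate(prev, curr, interval_s, counter_bits=32):
    """Average rate (units/second) between two readings of a wrapping counter."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    modulus = 1 << counter_bits
    # Modular subtraction copes with a single counter wrap between readings.
    delta = (curr - prev) % modulus
    return delta / interval_s


def octets_to_bits(octets_per_second):
    """Convert an octets/second rate to bits/second."""
    return octets_per_second * 8


# prev: ifOutOctets value quoted in the thread; curr: hypothetical later reading.
rate = counter_rate(prev=3786122594, curr=3786263394, interval_s=20)
print(rate)                               # octets/second over the 20 s interval
print(octets_to_bits(7039.657893397139))  # the exported sFlow-RT value in bits/second
```

Note that Counter32 values wrap at 2^32, so the modular subtraction only gives the right answer if the counter wrapped at most once between readings. On fast interfaces the 64-bit IF-MIB counters (ifHCInOctets, ifHCOutOctets) are probably the safer thing to poll when doing this conversion by hand.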

Since it appears you are using Prometheus, you might be interested in the following Grafana dashboard to trend the interface counters:

In addition to counter-based metrics, you can also use the prometheus application to export flow data:

You won't get exact correspondence between SNMP polling and sFlow since the times that the counters are reported by sFlow and the times they are retrieved by SNMP won't be exactly synchronized.
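A small simulation may help show why unsynchronized polling gives different numbers even when both methods read the same underlying counter. This is a sketch under made-up assumptions: the traffic pattern (a 1000 octets/second baseline with a 5 second burst) and the helper `interval_rates` are hypothetical and are not taken from the switch in this thread.

```python
from itertools import accumulate


def interval_rates(per_second_octets, interval_s, offset_s=0):
    """Average rates (octets/second) seen by a poller that reads a cumulative
    counter every interval_s seconds, with its first read at offset_s."""
    # counter[t] is the total number of octets counted before second t.
    counter = [0] + list(accumulate(per_second_octets))
    readings = [counter[t] for t in range(offset_s, len(counter), interval_s)]
    return [(b - a) / interval_s for a, b in zip(readings, readings[1:])]


# Hypothetical traffic: 60 seconds at 1000 octets/s with a burst at t = 18..22.
traffic = [1000] * 60
for t in range(18, 23):
    traffic[t] = 50000

aligned = interval_rates(traffic, 20, offset_s=0)   # e.g. one collector
shifted = interval_rates(traffic, 20, offset_s=10)  # same counter, read 10 s later

print(aligned)
print(shifted)
```

In this example the aligned poller splits the burst across two intervals, while the shifted poller sees all of it in one interval, so the per-interval averages differ even though both are derived from the same counter.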

A good way to verify the accuracy of the counters and packet samples is to use the sflow-test application to compare the results you get using each method of measurement:

Munroe Sollog

Nov 15, 2021, 9:10:42 PM
to sFlow-RT
I would settle for similar orders of magnitude.  "Exact" is a long way off.  

  out: 6.9Mbps, 303Kbps, 242Kbps, 263Kbps
  in:  1.5Mbps, 1.2Mbps, 1.2Mbps, 1.8Mbps
  out: 60kbps, 1.3Mbps, 56kbps, 45kbps, 18kbps
  in: 6Mbps, 1.1Mbps, 921Kbps, 879Kbps, 2.9Mbps

I'll play with the sflow-test suite to try to figure out why the numbers are so different.  I'm actually using InfluxDB 2, not Grafana, but the prometheus API for sFlow is the only option to pull the data out.

Peter Phaal

Nov 15, 2021, 9:46:59 PM
to sFlow-RT
Actually, the numbers you gave are in the same ballpark; sometimes sFlow is higher and sometimes SNMP is higher. The variability increases the shorter you make the polling interval (for sFlow and SNMP). Are you polling with SNMP every 20 seconds as well?
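The effect of the polling interval on variability can be sketched with a few lines of Python. The traffic pattern here is invented for illustration (a 1 second burst every 10 seconds on top of a 1000 octets/second baseline), and `window_averages` is a hypothetical helper, so treat the numbers as a demonstration of the principle rather than a model of your link.

```python
def window_averages(per_second_octets, interval_s):
    """Average rate (octets/second) over consecutive, non-overlapping windows."""
    last_start = len(per_second_octets) - interval_s
    return [
        sum(per_second_octets[i:i + interval_s]) / interval_s
        for i in range(0, last_start + 1, interval_s)
    ]


# Hypothetical bursty traffic: 60 seconds, one 41000-octet burst every 10 s.
traffic = ([1000] * 9 + [41000]) * 6

for interval in (5, 20, 60):
    averages = window_averages(traffic, interval)
    # Spread between the lowest and highest interval average.
    print(interval, min(averages), max(averages))
```

With this pattern the 5 second windows swing between a quiet average and a much higher one, while the 20 and 60 second windows all report the same average, which is the smoothing you would expect from a longer interval.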

The following article provides a few examples with InfluxDB 2.0:
