How to graph SNMP tables?


Elliott Balsley

Aug 4, 2023, 9:42:33 PM
to Prometheus Users
I've just started using Grafana and Prometheus with SNMP Exporter.  My Adder AIM server provides SNMP data in a table format, which is very hard to comprehend in Grafana.  Is there some recommended way to handle this type of data?
This table has 15 columns and 65 rows.  Each row represents a "sub-device", or an Adder KVM endpoint.  Attached is a screenshot from the open-source SnmpB app showing it nicely.

The data comes into Prometheus looking like this (just a few lines example), with each metric on a separate line:
# HELP deviceEth1Status Status of eth1 interface - 1.3.6.1.4.1.25119.1.1.1.30 (EnumAsStateSet)
# TYPE deviceEth1Status gauge
deviceEth1Status{deviceEth1Status="absent",deviceIndex="1"} 0
deviceEth1Status{deviceEth1Status="absent",deviceIndex="101"} 0
deviceEth1Status{deviceEth1Status="absent",deviceIndex="1101"} 0
deviceEth1Status{deviceEth1Status="absent",deviceIndex="1201"} 0
deviceEth1Status{deviceEth1Status="absent",deviceIndex="1301"} 0
deviceEth1Status{deviceEth1Status="absent",deviceIndex="1401"} 0
deviceEth1Status{deviceEth1Status="absent",deviceIndex="1501"} 0

It also includes dozens of extra "rows" for devices which don't actually exist.
Ideally, I'd like to create some kind of time series graph for each metric/column, each of which has lines for each device/row.
Screenshot 2023-08-04 at 6.36.59 PM.png

Ben Kochie

Aug 5, 2023, 3:13:01 AM
to Elliott Balsley, Prometheus Users
What you need is to take the device MIB and use the SNMP exporter's generator to translate the table into metrics. Unfortunately, Adder Technology doesn't seem to have their MIB publicly available, so I can't see what it specifies in the table.

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/fc2aceb2-6b45-4c8f-9297-eee882bdf8acn%40googlegroups.com.

Brian Candler

Aug 5, 2023, 5:45:48 AM
to Prometheus Users
> This table has 15 columns and 65 rows.

That sounds similar to ifTable/ifXTable in IF-MIB.

> Each row represents a "sub-device"

That sounds similar to "interfaces" on a network device. Is there some column which can be used to identify the sub-device, similar to ifName/ifDescr/ifAlias?

> Ideally, I'd like to create some kind of time series graph for each metric/column, each of which has lines for each device/row.

Like plotting graphs for ifHCInOctets, ifHCOutOctets, ifErrors etc.  See if you can model on that.

As for enums, there's some info here:

You'd use EnumAsStateSet only if you want a single SNMP data point to be exploded into N different timeseries, one for each possible value. e.g. suppose you have a single enumerated value for the status of a UPS, it could expand to

apcups_status{status="online"} 0
apcups_status{status="onbatt"} 1
apcups_status{status="trim"} 0
apcups_status{status="boost"} 0
apcups_status{status="overload"} 0
...etc

EnumAsInfo will apply a single label which changes (therefore, if it ever changes, the timeseries will change; only use this for labels which are essentially static).  Alternatively, you could represent enums as a plain integer stored as a numeric metric, and then in Grafana you map each integer value to a different label and/or colour.
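
To make the three enum representations concrete, here is a minimal sketch that renders one raw SNMP enum value each way. It is only an illustration of the output shapes, not exporter code: the integer codes in STATUS_ENUM are assumed, and the function names are hypothetical.

```python
# Hypothetical enum based on the UPS example above; the integer codes are assumed.
STATUS_ENUM = {1: "online", 2: "onbatt", 3: "trim", 4: "boost", 5: "overload"}


def as_state_set(metric, label, enum, raw):
    """EnumAsStateSet shape: one series per possible state, 1 for the current one."""
    return [f'{metric}{{{label}="{name}"}} {1 if code == raw else 0}'
            for code, name in enum.items()]


def as_info(metric, label, enum, raw):
    """EnumAsInfo shape: a single series whose label carries the current state."""
    return [f'{metric}_info{{{label}="{enum[raw]}"}} 1']


def as_gauge(metric, raw):
    """Plain integer shape: the raw enum code is the sample value."""
    return [f"{metric} {raw}"]


state_set_lines = as_state_set("apcups_status", "status", STATUS_ENUM, 2)
info_lines = as_info("apcups_status", "status", STATUS_ENUM, 2)
gauge_lines = as_gauge("apcups_status", 2)

for line in state_set_lines + info_lines + gauge_lines:
    print(line)
```

With the state set, a change of status flips values on existing series. With the info form, a change of status ends one series and starts another, which is why it suits values that rarely change.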

Elliott Balsley

Aug 5, 2023, 7:33:25 PM
to Brian Candler, Prometheus Users
Yes I do think this sounds similar to network switch interfaces, where each interface has a set of properties.

I've attached the MIB file here in case that's helpful, and the generator file I'm currently using.
There is a deviceIndex column but I'm not sure how it's supposed to work.  As an example, for one particular device with index 4201, the data looks like this.  Do I need to configure lookups, in order to merge all these data points into a single "device"?

deviceEth1Status_info{deviceEth1Status="online",deviceIndex="4201"} 1
deviceEth2Status_info{deviceEth2Status="unconfigured",deviceIndex="4201"} 1
deviceFirmware{deviceFirmware="5.5.0",deviceIndex="4201"} 1
deviceIP1{deviceIP1="10.49.55.136",deviceIndex="4201"} 1
deviceIP2{deviceIP2="",deviceIndex="4201"} 1
deviceIdentifier{deviceIdentifier="000f58fffe079bdd",deviceIndex="4201"} 1
deviceIndex{deviceIndex="4201"} 4201
deviceLock_info{deviceIndex="4201",deviceLock="none"} 1
deviceMAC1{deviceIndex="4201",deviceMAC1="00:0f:58:07:9b:df"} 1
deviceMAC2{deviceIndex="4201",deviceMAC2="00:0f:58:07:9b:e0"} 1
deviceName{deviceIndex="4201",deviceName="ENG 4021T"} 1
deviceSerialNum{deviceIndex="4201",deviceSerialNum="2004A0241038"} 1
deviceStatus_info{deviceIndex="4201",deviceStatus="online"} 1
deviceType{deviceIndex="4201",deviceType="tx4"} 1

AIM-MIB.txt
adder-generator.txt

Ben Kochie

Aug 6, 2023, 3:40:20 AM
to Elliott Balsley, Brian Candler, Prometheus Users
The `deviceIndex` already indicates that it is one "device". Each metric is a different property of that device. Prometheus metrics are single float64 values, so a table is mapped onto metric names (think columns), and each unique value of the index label (deviceIndex in this case) is a different row in the table.
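
As a rough illustration of that mapping, the sketch below turns one table row into per-column series. The cell values are copied from the sample output earlier in the thread. The function name and the rule that string cells become a label with value 1 are simplifying assumptions for the example, not the exporter's actual code.

```python
# One table row per deviceIndex; cell values copied from the sample output in this thread.
table = {
    "4201": {
        "deviceFirmware": "5.5.0",
        "deviceName": "ENG 4021T",
        "deviceType": "tx4",
    },
}


def table_to_series(table, index_label="deviceIndex"):
    """Emit one series per (column, row): the column becomes the metric name,
    the row index becomes a label, and a string cell becomes a label with value 1."""
    lines = []
    for index, row in table.items():
        for column, cell in row.items():
            labels = {index_label: index, column: cell}
            rendered = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{column}{{{rendered}}} 1")
    return lines


series_lines = table_to_series(table)
for line in series_lines:
    print(line)
```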

You can use lookups to map the deviceName onto the metrics to make them easier to select in Grafana via variables.

Then, when you want to look up a specific property of the device, you would query it like this:

deviceEth1Status_info{deviceName="ENG 4021T"}
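
To show what such a lookup does conceptually, here is a small sketch that copies the deviceName label onto every series sharing the same deviceIndex and then selects by name. The sample values come from earlier in the thread. The helpers apply_lookup and select are hypothetical names for illustration only; in the real setup this is configured through a lookups entry in generator.yml rather than written by hand.

```python
# Series as (metric name, labels, value); values copied from the sample output above.
series = [
    ("deviceName", {"deviceIndex": "4201", "deviceName": "ENG 4021T"}, 1),
    ("deviceEth1Status_info", {"deviceIndex": "4201", "deviceEth1Status": "online"}, 1),
    ("deviceStatus_info", {"deviceIndex": "4201", "deviceStatus": "online"}, 1),
]


def apply_lookup(series, index_label, lookup_metric):
    """Attach the lookup column's value as a label on every series with the same index."""
    names = {labels[index_label]: labels[lookup_metric]
             for metric, labels, _ in series if metric == lookup_metric}
    result = []
    for metric, labels, value in series:
        merged = dict(labels)
        if labels.get(index_label) in names:
            merged[lookup_metric] = names[labels[index_label]]
        result.append((metric, merged, value))
    return result


def select(series, metric, **matchers):
    """Rough stand-in for a selector such as metric{label="value"}."""
    return [s for s in series
            if s[0] == metric and all(s[1].get(k) == v for k, v in matchers.items())]


joined = apply_lookup(series, "deviceIndex", "deviceName")
matches = select(joined, "deviceEth1Status_info", deviceName="ENG 4021T")
print(matches)
```

Because the lookup applies to every series in the module, deviceStatus_info also gains the deviceName label here, which is the "global" behaviour described next.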

I think what you're asking for is you want to consolidate all of the information data (IPs, MACs, SerialNum) onto a single info metric. Unfortunately this is not easy with the current SNMP exporter generator. What you want is to have a whole bunch of lookups that map all of the various properties. But these lookups are "global" within each module. So you will end up with a bunch of noise on all metrics in the table. There's no filter to say "Only apply this lookup onto this metric".

Something like this:

deviceName_info{deviceIndex="4201",deviceIP1="10.49.55.136",deviceIP2="",deviceMAC1="00:0f:58:07:9b:df",deviceMAC2="00:0f:58:07:9b:e0",deviceName="ENG 4021T",deviceSerialNum="2004A0241038"}

This is technically possible in the exporter/snmp.yml config, but the generator doesn't have a syntax to produce it. This is something we could support, but it would require some code changes to add an "only apply this lookup to this metric" feature.

One workaround would be to have separate modules. One to gather the state information and one to gather the info.


Elliott Balsley

Aug 6, 2023, 2:50:49 PM
to Ben Kochie, Brian Candler, Prometheus Users
Thanks Ben.  I think it will take me some time to wrap my head around this information.  It's my first time working with SNMP, and I'm trying to learn several tools at once including Prometheus, Grafana, and Loki.
Maybe a time-series database is not the best way to store the "info" parts of this, since things like IP, MAC address, and deviceType should never change.  It's more like a long term inventory.  Some of the other items like deviceFirmware and deviceStatus are useful to track over time and be able to trigger alerts.

Some of the other OIDs like serverMemoryUsage should be tracked as numbers, but they are returned as strings.  Can you advise how to convert this properly?  For example snmpwalk shows serverMemoryUsage like this:

SNMPv2-SMI::enterprises.25119.1.3.2.0 = STRING: "63%"


So I tried using regex to remove the percent symbol
serverMemoryUsage:
  regex_extracts:
    '':
      - regex: '(.*)%'
        value: '$1'

But I don't get any value this way and it logs this error:
msg="No match found for regexp" metric=serverMemoryUsage value=0x363425 regex=^(?:^(?:(.*)%)$)$


Brian Candler

Aug 7, 2023, 2:43:20 AM
to Prometheus Users
> value=0x363425

That looks to me like it's being interpreted as octets instead of DisplayString. Perhaps you have to add "type: DisplayString" next to regex_extracts - I see you have overrides for other fields already in your generator.yml
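
A quick way to check this reading is to decode the logged hex digits as ASCII and try the same pattern against both forms. This is only a sketch in Python. It assumes the exporter matched the regex against the hex rendering, which is what the log line suggests, and fullmatch is used to mimic the anchored pattern shown in the log.

```python
import re

logged_value = "0x363425"  # copied from the error message above

# Decode the hex digits as ASCII to recover the text the device sent.
decoded = bytes.fromhex(logged_value[2:]).decode("ascii")

# The logged regex is anchored at both ends, so fullmatch is the closest equivalent.
pattern = re.compile(r"(.*)%")
hex_match = pattern.fullmatch(logged_value)   # no trailing '%' in the hex rendering
text_match = pattern.fullmatch(decoded)       # matches once the value is treated as text

print(decoded)
print(hex_match)
print(text_match.group(1))
```

The bytes 0x36 0x34 0x25 are the characters "6", "4" and "%", so the device did send a percentage string. The pattern fails only because it was applied to the hex form.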

> It also includes dozens of extra "rows" for devices which don't actually exist.

A "filter" function was recently added, but I never used it myself - it's mentioned in the 0.22.0 release notes. It may do what you need.

Ben Kochie

Aug 7, 2023, 3:05:09 AM
to Elliott Balsley, Brian Candler, Prometheus Users
Yes, like anything, some vendors do a bad job of implementing things. SNMP is a common one that vendors implement because they're told they need to for "standards reasons", but they don't put in the effort to understand the subtleties. I know SNMP pretty well, but I consider it a method of last resort.

IMO, if any device/server has a JSON or similar API I would write a custom exporter for it rather than use SNMP.

It's pretty normal to store the info stuff in Prometheus. Because the data is static (a series of 1 values), the data compresses extremely well.

And yes, having firmware versions in the TSDB is very useful to do reporting and sometimes alerts on deviations over a fleet of instances. PromQL makes it reasonably easy to report on deviations like this.
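
As a sketch of the kind of deviation report meant here, the snippet below groups devices by their firmware label and lists the ones that differ from the most common version. In PromQL the grouping step would be roughly count by (deviceFirmware) (deviceFirmware). Only the 4201 / 5.5.0 pair comes from this thread; the other rows and the function name are made up for illustration.

```python
from collections import Counter

# deviceIndex -> deviceFirmware label value.
# Only "4201": "5.5.0" is from the thread; the other entries are hypothetical.
firmware = {"4201": "5.5.0", "4301": "5.5.0", "4401": "5.4.2"}


def firmware_outliers(firmware):
    """Return the most common version and the devices running anything else."""
    counts = Counter(firmware.values())
    expected, _ = counts.most_common(1)[0]
    outliers = sorted(index for index, version in firmware.items() if version != expected)
    return expected, outliers


expected_version, outlier_devices = firmware_outliers(firmware)
print(expected_version, outlier_devices)
```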