Correlation between SNMP scrape time and massive rate() output for ifHCInOctets

Nick Carlton

Mar 15, 2024, 6:41:52 PM
to Prometheus Users
Hello Everyone,

I have just seen something weird in my environment: interface bandwidth on a gigabit switch appeared to reach about 1 Tbps on some of the interfaces...

Here is the query I'm using:

rate(ifHCInOctets{ifHCInOctetsIntfName=~".*.\\/.*.",instance="<device-name>"}[2m]) * 8

I've never had a problem with this query before. Here is an image of the graph showing the massive increase in bandwidth and then the decrease back to normal:

[Screenshot 2024-03-15 222353.png: graph showing interface bandwidth spiking to roughly 1 Tbps, then dropping back to normal]

Doing some more investigation into what could have happened, I can see that the 'snmp_scrape_duration_seconds' metric increases to around 20s at the same time. So the Cisco switch is taking 20 seconds to respond to the SNMP request.

[Screenshot 2024-03-15 222244.png: snmp_scrape_duration_seconds rising to around 20s at the same time]

I'm a bit confused as to how this could cause the rate query to give completely false data. Could the delay have caused Prometheus to think there was more bandwidth on the interface? The switch certainly cannot do the speeds the graph is claiming!

I'm on v0.25.0 of the SNMP exporter, and scrapes normally sit at around 2s. I'm not blaming the exporter for the high response times; that's probably the switch. I'm just wondering if the high response time could somehow cause the rate query to give incorrect data. The fact that the graph went back to normal after the high response times makes me think it wasn't the switch reporting bad data.

Has anyone seen this before, and is there any way to mitigate it? Happy to provide more info if required :)

Thanks
Nick

Nick Carlton

Mar 15, 2024, 6:43:19 PM
to Prometheus Users
To clarify, my scrapes for this data run every 1m and have a timeout of 50s.
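
For reference, a minimal sketch of what that scrape configuration might look like - the job name, module name, and exporter address are placeholders rather than the exact values from the real config:

```yaml
scrape_configs:
  - job_name: snmp                        # hypothetical job name
    scrape_interval: 1m
    scrape_timeout: 50s                   # must stay at or below the interval
    metrics_path: /snmp
    params:
      module: [if_mib]                    # assumed module name
    static_configs:
      - targets: ['<device-name>']        # the switch being polled
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 127.0.0.1:9116       # snmp_exporter, assumed to run locally
```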

Alexander Wilke

Mar 15, 2024, 7:00:57 PM
to Prometheus Users
Hello,

1.) Is the timeout of 50s the same in the Prometheus scrape_config and in the snmp.yml file?
2.) Is ifHCInOctetsIntfName really the label that holds the interface name?
3.) The =~".*.\\/.*." regex may match many interfaces - maybe some internal loopback that counts traffic twice? It may also match port-channels (Po), their VLAN subinterfaces (Po.xy), and physical interfaces!?

I am not sure, but the screenshots show "stacked lines" - is it possible that in the first screenshot the throughput of all the interfaces was stacked?

Nick Carlton

Mar 15, 2024, 7:24:44 PM
to Alexander Wilke, Prometheus Users
Thanks Alexander,

I'm not aware that you can set a timeout in the snmp.yml, or at least I'm not familiar with it?

The interface name matcher is a regex that matches only interface names with a '/' in them. On Cisco that will only match physical interfaces like Gig1/0/1 and not things like Loopback0 or Port-Channel1, so only physical interfaces are included.

I had to use the stacked-lines visualisation because, with plain lines, for some reason all of the lines were in really light colours and you could barely see them on the graph. I can get another screenshot if need be. The value on the left shows the bytes per second the interfaces appeared to reach.

Thanks
Nick


Ben Kochie

Mar 16, 2024, 1:31:17 AM
to Nick Carlton, Prometheus Users
This is very likely a problem with counter resets or some other kind of duplicate data.

The best way to figure this out is to perform the query, but without the `rate()` function.

This can be done via the Prometheus UI (harder to do in Grafana) in the "Table" view.


The result is a list of the raw samples that are needed for debugging.
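
For example, a plain range selector like the one below (reusing the labels from the original query; the 10-minute window is arbitrary) returns the raw counter samples with their timestamps, which makes duplicate, out-of-order, or backwards-going values easy to spot:

```promql
ifHCInOctets{ifHCInOctetsIntfName=~".*.\\/.*.",instance="<device-name>"}[10m]
```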


Alexander Wilke

Mar 16, 2024, 4:08:44 AM
to Prometheus Users
Check the snmp.yml file format example.

Timeout, retries, max_repetitions.

I use max_repetitions of 50 or 100 with Cisco, retries of 0, and a timeout 1s or 500ms below the Prometheus timeout.
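
For anyone else hunting for these settings: in snmp.yml they sit at the module level. A minimal sketch, with an illustrative module name and OID, and values along the lines Alexander suggests:

```yaml
modules:
  if_mib:                     # hypothetical module name
    walk:
      - 1.3.6.1.2.1.31.1.1    # ifXTable, which contains ifHCInOctets
    max_repetitions: 50       # larger GETBULK batches, fewer round trips
    retries: 0
    timeout: 5s               # keep this below the Prometheus scrape_timeout
```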

Alexander Wilke

Mar 16, 2024, 4:10:18 AM
to Prometheus Users

Nick Carlton

Mar 16, 2024, 4:39:50 AM
to Alexander Wilke, Prometheus Users
Thanks both,

I must be honest, I never managed to get the generator to work with MIB dependencies, so I have written my snmp.yml manually with the various lookups etc., and I have never seen these values documented.

Is there a best-practice guide for these values when you are hitting particular issues, or for using them to speed up SNMP scrapes? I can't seem to find any solid documentation.

Ben - I'll try to get that data, but this is a managed Prometheus, so I don't have access to the main Prometheus UI, just a built-in version, though it should give me the same data. It's possible there is duplicate data here, because there are two Prometheus boxes polling these switches for the same metrics and sending duplicate data over remote write to the managed endpoint, where the other end supposedly deduplicates the metrics. Is there any way to defend against this on the side I can control?

Thanks
Nick


Ben Kochie

Mar 16, 2024, 5:38:09 AM
to Nick Carlton, Alexander Wilke, Prometheus Users
You can also execute the query via the Prometheus-compatible API.


The same can be done via the Grafana datasource API endpoint.
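
For example, if the managed service exposes the standard /api/v1/query endpoint, an instant query for the raw samples of the last 10 minutes could look like this (the base URL and device name are placeholders):

```sh
curl -G 'https://<your-managed-prometheus>/api/v1/query' \
  --data-urlencode 'query=ifHCInOctets{ifHCInOctetsIntfName=~".*.\\/.*.",instance="<device-name>"}[10m]'
```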

> managed endpoint and then the other end supposedly deduplicates the metrics

This is 99% likely the problem. The remote storage is deduplicating, but it's flip-flopping between your two Prometheus instances' data. Each Prometheus consistently pseudo-randomizes the exact millisecond of its scrape time to avoid load spikes on the targets. Since each Prometheus instance scrapes at slightly different times, if the remote TSDB inserts a sample that is slightly older, a "newer" sample may actually carry slightly lower values from the device. This tricks Prometheus into thinking there was a counter reset, so it assumes the full counter's value of data arrived between the two scrapes.
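
To make that concrete, here is a rough illustration with entirely made-up numbers - three samples, one minute apart, as the remote store might end up storing them after flip-flopping between the two scrapers:

```
t=0s     ifHCInOctets = 1,000,000,000,000   (instance B)
t=60s    ifHCInOctets = 1,000,000,050,000   (instance A)
t=120s   ifHCInOctets = 1,000,000,020,000   (instance B again, scraped slightly earlier than A)
```

The drop at t=120s looks like a counter reset, so rate() assumes the counter restarted from zero and counts roughly the full 10^12 octets as new traffic: 8 x 10^12 bits / 120 s is about 67 Gbps out of nowhere, and far more for a counter that has been running longer.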

There are a few options:
* Use only one Prometheus server for SNMP targets to avoid the deduplication happening on your remote write storage.
* Set up a caching HTTP reverse proxy between your Prometheus instances and the snmp_exporter, with a cache TTL that matches your scrape interval (see the sketch after this list).
* Wait for / contribute to SNMP walk caching in the snmp_exporter.
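
For the caching-proxy option, a minimal sketch of what this could look like with nginx - the port, paths, and TTL are assumptions, not something tested against this setup:

```nginx
# Cache snmp_exporter responses for just under the 1m scrape interval so that
# both Prometheus instances receive identical counter values.
proxy_cache_path /var/cache/nginx/snmp levels=1:2 keys_zone=snmp:10m max_size=100m;

server {
    listen 9117;                          # hypothetical port both Prometheus instances scrape
    location /snmp {
        proxy_pass http://127.0.0.1:9116; # the real snmp_exporter
        proxy_cache snmp;
        proxy_cache_key "$request_uri";   # keyed on the target= and module= parameters
        proxy_cache_valid 200 55s;        # TTL just below the scrape interval
        proxy_ignore_headers Cache-Control Expires;
    }
}
```

For this to help, both Prometheus instances need to scrape the same proxy (or a shared cache), which is the clustering question that comes up further down the thread.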

I would love to add a full SNMP walk cache to the snmp_exporter. I would like to support memcached/redis as well for clustering persistence. But since my $dayjob has no SNMP, it's hard for me to prioritize work on it.

Nick Carlton

Mar 16, 2024, 7:16:12 AM
to Ben Kochie, Alexander Wilke, Prometheus Users
Thanks Ben, that makes sense. I suppose that was exacerbated by the longer scrape times at the time.

On the second option of using a caching HTTP proxy: I'm running each Prometheus and the SNMP exporter on the same box, so there is a separate exporter instance per Prometheus instance. While it's a great idea, it would only cache the local SNMP exporter's results. I've tried to make this setup as resilient as possible without something like k8s. When SNMP walk caching arrives, I think I'll have the same issue for the same reason?

I think what I might have to do is pull the interface bandwidth counters out of the main SNMP module and scrape them from only one of the instances, so there is no risk of duplicate data hitting the remote write, and do the same for anything else that I query using "rate".
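
A hedged sketch of what that split could look like - the job and module names are placeholders, and only one of the two prometheus.yml files would contain this job:

```yaml
scrape_configs:
  # Present on ONE instance only; the other instance omits this job, so the
  # rate()-sensitive counters are never written to the remote store twice.
  - job_name: snmp_if_counters            # hypothetical job name
    metrics_path: /snmp
    params:
      module: [if_counters]               # hypothetical module holding ifHCInOctets etc.
    static_configs:
      - targets: ['<device-name>']
    # plus the usual snmp_exporter relabel_configs (target -> __param_target, instance, __address__)
```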

Though I would love to contribute, I'm not fluent enough in Go to offer any meaningful assistance :).

Thanks
Nick

Ben Kochie

Mar 16, 2024, 8:07:33 AM
to Nick Carlton, Alexander Wilke, Prometheus Users
On Sat, Mar 16, 2024 at 12:16 PM Nick Carlton <nick.ca...@gmail.com> wrote:
> Thanks Ben, that makes sense. I suppose that was exacerbated by the longer scrape times at the time.

That shouldn't make much of a difference, unless the remote storage does something funny with the data. Prometheus tags timestamps at the start of the scrape for consistency. The scrape duration does not affect the timestamp that the data represents. The idea is that it is up to the target to lock any mutexes to provide a consistent data snapshot; the time it takes to ship that data over the wire is unimportant.
 

> On the second option of using a caching HTTP proxy: I'm running each Prometheus and the SNMP exporter on the same box, so there is a separate exporter instance per Prometheus instance. While it's a great idea, it would only cache the local SNMP exporter's results. I've tried to make this setup as resilient as possible without something like k8s. When SNMP walk caching arrives, I think I'll have the same issue for the same reason?

My idea is to use an external cache like memcached or redis. Something that can share clustered caches between multiple instances of the exporter for reliability.

I guess your best option there is to set up a separate node with the caching proxy and the exporter, then point both Prometheus instances at that one node.

One other idea: there are caching proxy options that can use Redis. I have no idea if this is still viable / high quality, but you can try it:

 

> I think what I might have to do is pull the interface bandwidth counters out of the main SNMP module and scrape them from only one of the instances, so there is no risk of duplicate data hitting the remote write, and do the same for anything else that I query using "rate".

That depends a bit on how the remote write service handles deduplication. IIRC, services based on Cortex and Mimir deduplicate the whole connection. So they assume that two instances connected are "identical". Again, this is very much up to your remote storage implementation to figure out.