Unexpected scraping from Huawei Devices via snmp exporter

Umut Cokbilir

Jun 9, 2020, 3:09:11 AM
to Prometheus Users
Hi All,
As far as we understand, a scrape sometimes returns a counter value lower than the previous one, so the subinterface counters appear to reset unexpectedly.
Normally, the metric below should stay under 300 Mb.
There are different types of devices in our environment, and only some of them have this issue.
I couldn't find the root cause of this problem.
Do you have any suggestions?

Prometheus_problem.PNG


Brian Candler

Jun 9, 2020, 8:45:43 AM
to Prometheus Users
What is the metric "acc:ifHCOutOctetsNNI:rate"?

Does it come from a recording rule?  If so, could you show the recording rule?

Prometheus doesn't lie.  So very simply, you need to check where the metric comes from, and if it's calculated from other metric(s), look at the underlying data and the calculation.

Umut Cokbilir

Jun 9, 2020, 2:19:38 PM
to Prometheus Users


On Tuesday, June 9, 2020 at 15:45:43 UTC+3, Brian Candler wrote:
What is the metric "acc:ifHCOutOctetsNNI:rate"?

Does it come from a recording rule?  If so, could you show the recording rule?

Prometheus doesn't lie.  So very simply, you need to check where the metric comes from, and if it's calculated from other metric(s), look at the underlying data and the calculation.


 Hi Brian

The details of "acc:ifHCOutOctetsNNI:rate" are below.
It comes from this recording rule:
((rate(ifHCOutOctets{ifAlias=~".*NNI.*"}[10m]) * 8) * on(instance) group_left(sysName) sysName{sysName = "103(M)-PTN2128242_RAMI"})

I have also attached a screenshot.
Capture.PNG

Brian Candler

Jun 9, 2020, 4:32:18 PM
to Prometheus Users
OK, so now using PromQL, run the query:

rate(ifHCOutOctets{ifAlias=~".*NNI.*",instance="10.85.0.82"}[10m])

If you see the same peaks, then it's junk coming out of your device - nothing that Prometheus can do about that, talk to your vendor.  If you don't, then investigate further what's going on with your recording rules.
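
As a rough illustration of why a single lower sample can produce such a peak: rate() treats any decrease in a counter as a reset and compensates by adding the last value seen before the drop. The Python below is only a simplified sketch of that idea, not the actual Prometheus code (the real rate() also extrapolates to the window boundaries and divides by the window length). The function name and the sample values are made up for the example.

```python
def increase_with_resets(samples):
    """Simplified sketch of counter-reset handling, without extrapolation.

    `samples` is a list of raw counter values in scrape order.
    """
    if len(samples) < 2:
        return 0.0
    correction = 0.0
    prev = samples[0]
    for cur in samples[1:]:
        if cur < prev:
            # A lower value is assumed to be a reset to zero, so the whole
            # previous value is added back as "lost" increase.
            correction += prev
        prev = cur
    return samples[-1] - samples[0] + correction


# Hypothetical octet counters: one healthy series, one with a single low sample.
healthy = [1_000_000, 1_000_500, 1_001_000, 1_001_500]
glitchy = [1_000_000, 1_000_500, 1_000_400, 1_001_500]

print(increase_with_resets(healthy))  # real traffic only
print(increase_with_resets(glitchy))  # inflated by the assumed "reset"
```

With a 64-bit octet counter that already holds a very large value, one sample that is even slightly lower than its predecessor adds that entire value to the computed increase, which would show up as a peak far above the real traffic.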

Umut Cokbilir

Jun 10, 2020, 2:31:22 AM
to Prometheus Users
As you can see in the attachment, the peaks appear again. Only one port shows this fault, and only in the outbound direction, which is strange.

On Tuesday, June 9, 2020 at 23:32:18 UTC+3, Brian Candler wrote:
Capture.PNG

Brian Candler

Jun 10, 2020, 3:08:13 AM
to Prometheus Users
That's the thing about metrics: they tell you stuff.  This metric seems to be telling you that your vendor's MIB is broken. Take the evidence to your vendor.

I'd suggest showing them the raw metrics, without rate() or any other processing, so that they can see *exactly* what the device is returning.

If you call the API with a range vector query, you'll see the exact samples collected with their timestamps over that period:

curl -g 'localhost:9090/api/v1/query?query=ifHCOutOctets{ifAlias=~".*NNI.*",instance="10.85.0.82"}[10m]' | python3 -mjson.tool
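
Once you have that JSON, something like the following could help pinpoint the offending samples. This is only a sketch: it assumes the standard response shape of the query API for a range selector (a "matrix" result whose "values" are [timestamp, "value"] pairs), and the function name and sample data are invented for illustration.

```python
import json


def find_decreases(response):
    """Return (labels, previous_sample, current_sample) for every pair of
    consecutive samples in which the counter value went down."""
    drops = []
    for series in response["data"]["result"]:
        values = series["values"]  # list of [unix_timestamp, "value as string"]
        for (t0, v0), (t1, v1) in zip(values, values[1:]):
            if float(v1) < float(v0):
                drops.append((series["metric"], (t0, float(v0)), (t1, float(v1))))
    return drops


# Made-up response for illustration; real output would come from the curl command.
sample_response = json.loads("""
{
  "status": "success",
  "data": {
    "resultType": "matrix",
    "result": [
      {
        "metric": {"__name__": "ifHCOutOctets", "instance": "10.85.0.82",
                   "ifAlias": "example-NNI"},
        "values": [[1591772400, "1000000"], [1591772460, "1000500"],
                   [1591772520, "1000400"], [1591772580, "1001500"]]
      }
    ]
  }
}
""")

for labels, prev, cur in find_decreases(sample_response):
    print(labels["instance"], labels["ifAlias"], prev, "->", cur)
```

Any pair it reports is a point where the device returned a smaller counter value than on the previous scrape, which is the evidence worth showing to the vendor.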

Umut Cokbilir

Dec 29, 2020, 9:45:58 AM
to Prometheus Users

Hello Brian,

We tested the above problem with Huawei R&D, but we could not find any problem during the test. You can find the verification report and the results from the other monitoring tools in the attached files.
The other tools show the expected result, but Prometheus does not.
I also ran the range vector query and have shared the output with them and with you.
Please advise on how to solve this problem. It occurs on only one device type, the PTN 6900-2-M8. The other types (some ATN-series devices and other PTNs) are fine.
Thanks.

On Wednesday, June 10, 2020 at 10:08:13 UTC+3, Brian Candler wrote:
Capture_Grafana.JPG
Capture_Prometheus_[Rate].JPG
accnwmonap_16-56.json
Verification Report of Interface utilization cannot be correctly get by ....docx
accnwmonap_16-47.json
Capture_Cacti.JPG
Capture_u2000.JPG