[ANN] - Network Exporter

Sebastian YEPES

May 27, 2020, 4:32:06 PM

to Prometheus Users

The Network Exporter is a mix of the blackbox-exporter and smokeping, with more specific features and metrics and, most notably, the addition of an MTR (traceroute) module.

Source: https://github.com/syepes/network_exporter
Doc PR: https://github.com/prometheus/docs/pull/1647

Have fun and enjoy your metrics!

Ben Kochie

May 27, 2020, 4:54:49 PM

to Sebastian YEPES, Prometheus Users

This seems to do a bunch of pre-calculation for things like min/max/avg that are better done in PromQL. The resulting data is both extraneous and unhelpful.

For example, take ping_rtt_seconds{type=worst}: worst when? What timeframe is it calculated over? Doing this with `max_over_time()` in PromQL allows the user to decide what timeframe to look at.
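
As a rough illustration of the "user decides the timeframe" point, here is a small Python stand-in for what `max_over_time()` does over stored samples. It is a simplified sketch, not PromQL's exact window semantics, and the timestamps and RTT values are made up for the example.

```python
def max_over_time(samples, now, window):
    """Return the max value among (timestamp, value) samples with
    now - window < timestamp <= now, or None if the window is empty."""
    in_window = [v for t, v in samples if now - window < t <= now]
    return max(in_window) if in_window else None

# Hypothetical stored samples: (timestamp in seconds, RTT in seconds).
samples = [(0, 0.012), (30, 0.200), (60, 0.011), (90, 0.015), (120, 0.010)]

# The same stored data answers "worst over 1 minute" and "worst over 2 minutes".
last_minute = max_over_time(samples, now=120, window=60)
last_two_minutes = max_over_time(samples, now=120, window=120)
print(last_minute, last_two_minutes)
```

The slow 0.200 s sample only shows up in the longer window, so the answer to "worst when?" is chosen at query time rather than fixed by the exporter.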

Also, this is bad practice for labeling, because a single metric name ends up holding different metrics with different meanings.

For example: sum without (type) (ping_rtt_seconds)

The results of this calculation are nonsensical.
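
To make this concrete, here is a minimal Python sketch (not PromQL, and not taken from the exporter) that mimics what `sum without (type)` does to series sharing one metric name. The `target` label, the `best` and `mean` label values, and the numbers are assumptions for illustration; only `worst` is mentioned above.

```python
from collections import defaultdict

# Hypothetical samples: (labels, value in seconds). Made up for illustration.
samples = [
    ({"target": "example.org", "type": "best"}, 0.010),
    ({"target": "example.org", "type": "worst"}, 0.050),
    ({"target": "example.org", "type": "mean"}, 0.020),
]

def sum_without(samples, label):
    """Mimic `sum without (label)`: drop the label, then sum all series
    that end up with identical remaining labels."""
    totals = defaultdict(float)
    for labels, value in samples:
        key = tuple(sorted((k, v) for k, v in labels.items() if k != label))
        totals[key] += value
    return dict(totals)

result = sum_without(samples, "type")
print(result)
```

The single resulting series is roughly 0.08 s for the target: best plus worst plus mean, which is not a round-trip time of any kind.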

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/a47d97d6-94a4-4a6e-a9cc-60dcf21c8533%40googlegroups.com.

Béla Törös

May 28, 2020, 5:30:31 AM

to Ben Kochie, Sebastian YEPES, Prometheus Users

> This seems to do a bunch of pre-calculation for things like min/max/avg that are better done in PromQL.

I think the main point is that you want higher-resolution data, but
don't want to store individual results, as they provide little value on
their own. When we monitor certain things, we want the results of
something like ping -c 30 every 30 seconds, but we definitely don't want
to store the result of each individual echo response. The problem with
doing this in PromQL is that the resolution would need to be extremely
high to allow a meaningful comparison.
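
A small Python sketch of the resolution problem, under made-up assumptions (one probe per second, a 30-second scrape interval, and a gauge that only exposes the latest RTT):

```python
# Hypothetical RTTs (seconds), one probe per second for 60 seconds,
# with a single slow reply at t=17.
rtts = [0.010] * 60
rtts[17] = 0.250

SCRAPE_INTERVAL = 30  # seconds; assumed value

# A gauge that only exposes the latest RTT is sampled once per scrape,
# so only the values seen at t=29 and t=59 are ever stored.
scraped = [rtts[t] for t in range(SCRAPE_INTERVAL - 1, len(rtts), SCRAPE_INTERVAL)]

true_worst = max(rtts)
observed_worst = max(scraped)  # the most any query over stored samples can see
print(true_worst, observed_worst)
```

The slow reply never reaches storage, so no query over the scraped samples can recover it unless the scrape interval approaches the probe interval.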

> ping_rtt_seconds{type=worst}
since last scrape, I would assume.

> Doing this with `max_over_time()` in PromQL allows the user to decide what timeframe to look at.

And by "decide" you mean any timeframe longer than the scrape interval, right?

We are definitely giving this a spin. Thanks for the contribution,
this looks like a promising way to solve a few issues for us :)

--
B

Sebastian YEPES

May 28, 2020, 1:51:06 PM

to Prometheus Users

> This seems to do a bunch of pre-calculation for things like min/max/avg that are better done in PromQL

Béla Törös, you got the idea: this works just like the regular ping command.

The pre-calculations are done in the exporter because that is how ICMP statistics actually work: they are computed over the user-defined packet count. The same goes for MTR.

> Also, this is bad practice for labeling, because a single metric name ends up holding different metrics with different meanings.

I have never been a great fan of this best practice. I like to keep all related metrics under a single metric name; if I need to do a specific operation, I can just filter on the needed type.

> ping_rtt_seconds{type=worst}, since last scrape, I would assume.

Almost all stat types are based on the user-defined packet count, with the exception of type=last, which is the actual last ping value.
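
A minimal Python sketch of this kind of per-batch summary, in the spirit of what the ping command prints. This is not the exporter's code: the function name, the `best` and `mean` keys, and the RTT values are assumptions for illustration; only `worst` and `last` are named in this thread.

```python
import statistics

def summarize_batch(rtts):
    """Reduce one batch of RTT samples (seconds) to ping-style summary stats."""
    if not rtts:
        raise ValueError("need at least one RTT sample")
    return {
        "best": min(rtts),
        "worst": max(rtts),
        "mean": statistics.fmean(rtts),
        "last": rtts[-1],  # not an aggregate: simply the most recent reply
    }

# Hypothetical batch of 30 RTTs with a single slow reply in the middle.
batch = [0.010] * 14 + [0.250] + [0.010] * 15
summary = summarize_batch(batch)
print(summary)
```

Every value except `last` depends on how many packets make up the batch, which is why the packet count defines the timeframe of the statistics.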

> We are definitely giving this a spin. Thanks for the contribution,

No problem, I hope it's also useful for others.
It really starts getting interesting once you deploy several stations (network_exporter instances) across your network to get the overall picture.

Sneak preview: in the next version I'll be adding support for enriching the IPs with details from the MaxMind IP database.

Regards and thanks for the feedback,
Seb

Ben Kochie

May 29, 2020, 8:13:42 AM

to Béla Törös, Sebastian YEPES, Prometheus Users

On Thu, May 28, 2020 at 11:30 AM Béla Törös <kale...@gmail.com> wrote:

>> This seems to do a bunch of pre-calculation for things like min/max/avg that are better done in PromQL.

> I think the main point is that you want higher-resolution data, but
> don't want to store individual results, as they provide little value on
> their own. When we monitor certain things, we want the results of
> something like ping -c 30 every 30 seconds, but we definitely don't want
> to store the result of each individual echo response. The problem with
> doing this in PromQL is that the resolution would need to be extremely
> high to allow a meaningful comparison.

This is solved with things like histograms.

For example, this is how I implemented it in my smokeping_prober.

https://github.com/SuperQ/smokeping_prober
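
A minimal sketch of the histogram idea in plain Python rather than a Prometheus client library. The bucket bounds, class name, and RTT values are assumptions for illustration and are not taken from smokeping_prober.

```python
import math

# Assumed upper bounds in seconds; a final +Inf bucket catches everything.
BUCKETS = [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, math.inf]

class RttHistogram:
    """Cumulative histogram in the style of a Prometheus histogram:
    counters only ever increase, so nothing is reset on scrape."""

    def __init__(self, bounds=BUCKETS):
        self.bounds = list(bounds)
        self.bucket_counts = [0] * len(self.bounds)  # cumulative "le" buckets
        self.total = 0.0  # sum of observed values
        self.count = 0    # number of observations

    def observe(self, rtt):
        # Increment every bucket whose upper bound is >= the observed value.
        for i, bound in enumerate(self.bounds):
            if rtt <= bound:
                self.bucket_counts[i] += 1
        self.total += rtt
        self.count += 1

hist = RttHistogram()
for rtt in [0.010] * 29 + [0.250]:
    hist.observe(rtt)

# The single slow reply stays visible in the bucket counts.
print(dict(zip(hist.bounds, hist.bucket_counts)))
```

Every probe is recorded at full resolution, yet the number of stored series is fixed by the bucket count. Because the counters are cumulative, the query side can take rates over any window and estimate quantiles with `histogram_quantile()`.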

>> ping_rtt_seconds{type=worst}
> since last scrape, I would assume.

That's something you can't assume. Also, it's an invalid way to handle it if you have HA Prometheus servers.
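
The HA concern can be sketched in Python as follows, assuming (hypothetically) a gauge that resets its "worst" value on every scrape. The class and variable names are invented for illustration, and this is not a claim about how network_exporter behaves.

```python
class ResetOnScrapeGauge:
    """Hypothetical gauge that tracks the worst RTT since the last scrape."""

    def __init__(self):
        self.worst = 0.0

    def observe(self, rtt):
        self.worst = max(self.worst, rtt)

    def scrape(self):
        # Return the current worst value and reset state for the next scraper.
        value, self.worst = self.worst, 0.0
        return value

gauge = ResetOnScrapeGauge()
gauge.observe(0.010)
gauge.observe(0.250)  # the slow reply

seen_by_a = gauge.scrape()  # first HA replica scrapes and resets the state
gauge.observe(0.010)
seen_by_b = gauge.scrape()  # second replica scrapes shortly afterwards

print(seen_by_a, seen_by_b)
```

Two servers scraping the same target independently would each consume the other's "since last scrape" state, so they would store different histories for the same metric.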
