Prometheus meets Nagios/Icinga

martialblog

Sep 20, 2023, 6:51:35 AM

to Prometheus Users

Hi,

I just wanted to spread the word that my colleagues and I have released a small tool that helps integrate Prometheus into monitoring tools like Nagios/Icinga.

It's a Nagios-style monitoring plugin that talks to the Prometheus API and transforms the response into OK, WARNING, CRITICAL semantics. It is all packed into a single Go binary and released under the GPL-2.0 license.

https://github.com/NETWAYS/check_prometheus

Current features are:

health: Checks the health or readiness status of the Prometheus server
alert: Checks the status of one or more Prometheus alerts
query: Checks the status of a PromQL query

Feedback is most welcome!

Regards
Markus

Brian Candler

Sep 20, 2023, 7:43:13 AM

to Prometheus Users

Cool. How does this compare with https://github.com/claranet/nagitheus ?

martialblog

Sep 20, 2023, 8:40:49 AM

to Prometheus Users

From what I can tell, nagitheus and https://github.com/prometheus/nagios_plugins can only be used for PromQL checks.

We wanted a tool that can also do other things, like a simple health check or alert checks.

I also hope that we can extend the CLI in the future if other features are required, hence the subcommand pattern.

Brian Candler

Sep 20, 2023, 1:04:43 PM

to Prometheus Users

Thanks.

It wasn't clear to me how the -c (critical) and -w (warning) thresholds work. I had to dig through the source and found my way to a dependency: https://github.com/NETWAYS/go-check#thresholds

There, the README shows an example "~:3" but not what it actually means. In the source (which presumably ends up in godoc) I found:

// Defining a threshold for any numeric value
//
// Format: [@]start:end
//
// Threshold   Generate an alert if x...
// 10          < 0 or > 10, (outside the range of {0 .. 10})
// 10:         < 10, (outside {10 .. ∞})
// ~:10        > 10, (outside the range of {-∞ .. 10})
// 10:20       < 10 or > 20, (outside the range of {10 .. 20})
// @10:20      ≥ 10 and ≤ 20, (inside the range of {10 .. 20})
//
// Reference: https://www.monitoring-plugins.org/doc/guidelines.html#THRESHOLDFORMAT
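
To make the table above concrete, here is a minimal sketch of how such a range could be parsed and evaluated. It is my own illustration of the `[@]start:end` format and not the go-check implementation; the `Range` type and the `ParseRange` and `Alerts` names are assumptions chosen for this example.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// Range is a hypothetical type for this sketch (not the go-check API).
// Inside is true when the spec started with "@", which inverts the logic.
type Range struct {
	Start, End float64
	Inside     bool
}

// ParseRange parses a threshold in the form [@]start:end.
func ParseRange(spec string) (Range, error) {
	r := Range{Start: 0, End: math.Inf(1)}
	if strings.HasPrefix(spec, "@") {
		r.Inside = true
		spec = spec[1:]
	}
	if spec == "" {
		return r, fmt.Errorf("empty threshold")
	}

	startStr, endStr, hasColon := strings.Cut(spec, ":")
	if !hasColon {
		// A bare number N is shorthand for 0:N.
		startStr, endStr = "", startStr
	}

	var err error
	switch startStr {
	case "":
		// Start omitted: keep the default of 0.
	case "~":
		r.Start = math.Inf(-1)
	default:
		if r.Start, err = strconv.ParseFloat(startStr, 64); err != nil {
			return r, fmt.Errorf("invalid start %q: %w", startStr, err)
		}
	}
	if endStr != "" {
		// End omitted means +infinity, which is already the default.
		if r.End, err = strconv.ParseFloat(endStr, 64); err != nil {
			return r, fmt.Errorf("invalid end %q: %w", endStr, err)
		}
	}
	if r.Start > r.End {
		return r, fmt.Errorf("start %v is greater than end %v", r.Start, r.End)
	}
	return r, nil
}

// Alerts reports whether x should raise an alert for this range:
// outside {Start .. End} by default, inside it when "@" was given.
func (r Range) Alerts(x float64) bool {
	inside := x >= r.Start && x <= r.End
	if r.Inside {
		return inside
	}
	return !inside
}

func main() {
	// Evaluate the value 15 against every threshold from the table.
	for _, spec := range []string{"10", "10:", "~:10", "10:20", "@10:20"} {
		r, err := ParseRange(spec)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-7s x=15 alerts: %v\n", spec, r.Alerts(15))
	}
}
```

So a flag like -w "~:3" would, under these rules, warn as soon as the value exceeds 3.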

So my main feedback is that a direct documentation link from check_prometheus to THRESHOLDFORMAT would be very helpful :-)

(I guess this is standard for Nagios, though. I know check_snmp works this way.)

Conall O'Brien

Sep 20, 2023, 2:04:41 PM

to Brian Candler, Prometheus Users

On Wed, 20 Sept 2023 at 14:04, Brian Candler <b.ca...@pobox.com> wrote:

> Thanks.
>
> It wasn't clear to me how the -c (critical) and -w (warning) thresholds work. I had to dig through source and I found my way to a dependency: https://github.com/NETWAYS/go-check#thresholds
>
> There, the README shows an example "~:3" but not what it actually means. In the source (which presumably ends up in godoc) I found:

It's been a long time since I used Nagios, but Nagios warning vs. critical thresholds are akin to using a Prometheus metric in multiple alerts, with a different alerting threshold and severity label for each alert definition.
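
As a rough illustration of that two-level idea, the sketch below checks one value against a warning and a critical bound, with the more severe state taking precedence. The `status` helper and the example bounds (80 and 95) are assumptions made up for this sketch; it uses plain upper bounds rather than the full range syntax.

```go
package main

import "fmt"

// status is a hypothetical helper: it maps a value to a Nagios-style
// state using simple upper bounds. The critical bound is checked first
// so that the more severe state wins when both bounds are exceeded.
func status(value, warn, crit float64) string {
	switch {
	case value > crit:
		return "CRITICAL"
	case value > warn:
		return "WARNING"
	default:
		return "OK"
	}
}

func main() {
	// Example bounds: warn above 80, critical above 95.
	for _, v := range []float64{50, 85, 97} {
		fmt.Printf("%.0f -> %s\n", v, status(v, 80, 95))
	}
}
```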

--

Conall O'Brien

martialblog

Sep 21, 2023, 6:56:58 AM

to Prometheus Users

See, now that is something one would fail to mention, because when you work with Nagios/Icinga all day it's essential. In German we call it "betriebsblind" (blind to what has become routine in your own work).