temporarily stopping alerts (from script)


Yosef Yudilevich

Aug 13, 2016, 2:46:29 PM
to Prometheus Developers
Hi,
I have a situation where a script stops a monitored service,
then does something (an upgrade, for example) for about 10 minutes,
and then starts the service again.

With our previous monitoring system we were touching a file in /tmp,
and when the monitoring system saw this file it put the machine into maintenance mode.

What is the approach with Prometheus?

Thanks

Fabian Reinartz

Aug 13, 2016, 4:28:08 PM
to Yosef Yudilevich, Prometheus Developers

The correct pattern for Prometheus would be for whatever system is doing the maintenance to create a silence that mutes the alerts in the Alertmanager.

This is currently possible, but the silence API is not yet documented.
Documenting it is something we are working towards for the near future.


--
You received this message because you are subscribed to the Google Groups "Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-devel...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Nicholas Capo

Aug 13, 2016, 5:50:03 PM
to Fabian Reinartz, Yosef Yudilevich, Prometheus Developers
We use Consul for service discovery (and therefore scrape target discovery), so for us putting the host in maintenance mode stops scrapes.

Nicholas
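
For readers who do run Consul, here is a minimal sketch of how a maintenance script might toggle node maintenance mode through the local agent's HTTP API (`PUT /v1/agent/maintenance`). The agent address and the helper names are assumptions for illustration, and whether scrapes actually stop depends on how the Consul service discovery is set up on the Prometheus side.

```python
import urllib.parse
import urllib.request


def maintenance_request(enable, reason="", agent="http://127.0.0.1:8500"):
    """Build the PUT request that toggles maintenance mode on a Consul agent.

    `agent` is an assumed default address; adjust it for your environment.
    """
    query = {"enable": "true" if enable else "false"}
    if reason:
        query["reason"] = reason
    url = agent.rstrip("/") + "/v1/agent/maintenance?" + urllib.parse.urlencode(query)
    return urllib.request.Request(url, method="PUT")


def set_maintenance(enable, reason=""):
    """Send the request to the agent and return the HTTP status code."""
    with urllib.request.urlopen(maintenance_request(enable, reason), timeout=10) as resp:
        return resp.status
```

A script would call `set_maintenance(True, "upgrade")` before stopping the service and `set_maintenance(False)` once it is running again.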

Yosef Yudilevich

Aug 14, 2016, 3:44:43 AM
to Prometheus Developers, fab.re...@gmail.com, yosef.yu...@gmail.com
Consul is a very good idea, but it is not implemented here yet...
Fabian, can you please explain how to create a silence via the API, since it exists anyway?
It is a bit hard for me to dig through the code, as I am not a developer.

Thanks

Or at least where to look :)


Javier Linares

Aug 14, 2016, 2:37:03 PM
to Yosef Yudilevich, Prometheus Developers
Hi Yosef,

On 13 August 2016 at 20:46, Yosef Yudilevich <yosef.yu...@gmail.com> wrote:
> With our previous monitoring system we were touching a file in /tmp,
> and when the monitoring system saw this file it put the machine into
> maintenance mode.

If you cannot implement service discovery right now, in the short term
I suppose you could export a metric from this machine that indicates
maintenance mode, and then add a condition to the alert so that it only
fires when the machine is not in maintenance mode.
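
As a rough sketch of the metric idea, the script below writes a gauge in the Prometheus text format to a file, much like touching a file in /tmp. It assumes something on the host exposes that file to Prometheus, for example the node_exporter textfile collector, which reads `*.prom` files from a configured directory. The metric name `maintenance_mode` and the function names are made up for illustration.

```python
import os
import tempfile


def render_maintenance_metric(in_maintenance):
    """Return a gauge in the Prometheus text exposition format."""
    value = 1 if in_maintenance else 0
    return (
        "# HELP maintenance_mode 1 while the host is under planned maintenance.\n"
        "# TYPE maintenance_mode gauge\n"
        f"maintenance_mode {value}\n"
    )


def write_metric_file(path, in_maintenance):
    """Write the metric atomically so a reader never sees a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(render_maintenance_metric(in_maintenance))
    # mkstemp creates the file as 0600; make it readable by the exporter.
    os.chmod(tmp_path, 0o644)
    os.replace(tmp_path, path)
```

An alert expression could then exclude hosts in maintenance with a clause such as `unless on(instance) maintenance_mode == 1`, provided both series carry a matching label. The `instance` label often differs between exporters on one host, so this may need relabeling.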

Regards,

--
Javier Linares

Yosef Yudilevich

Aug 14, 2016, 3:21:40 PM
to Prometheus Developers
Hmm,
it's just that you said the API code is already there...
but OK :)

Julius Volz

Aug 14, 2016, 3:26:09 PM
to Yosef Yudilevich, Prometheus Developers
If you want to know how the Alertmanager silencing API works: the Alertmanager's own UI uses it.

So in your browser, you can open the network diagnostics tab, record the requests the UI makes when you add, edit, or delete silences, and inspect what they look like.

For example, to list silences:

- GET /api/v1/silences

To add a new one:

- POST /api/v1/silences
 - JSON request body: see your browser's inspector

To remove a silence:

- DELETE /api/v1/silence/<id>
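
A minimal sketch of how a maintenance script might use these endpoints, using only the Python standard library. The request body fields (`matchers`, `startsAt`, `endsAt`, `createdBy`, `comment`) are what the v1 API is understood to accept, but treat them as an assumption and compare against what your browser's inspector shows. Matching on the `instance` label, the base URL, and the function names are illustrative choices. Later Alertmanager releases moved to a v2 API, so the paths may differ on newer versions.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timedelta, timezone


def build_silence(instance, minutes, author, comment, now=None):
    """Build a silence body that mutes alerts for one `instance` label value.

    `now` must be timezone-aware if given; it defaults to the current UTC time.
    """
    start = now or datetime.now(timezone.utc)
    end = start + timedelta(minutes=minutes)
    return {
        # Assumed body shape; verify against the requests the UI sends.
        "matchers": [{"name": "instance", "value": instance, "isRegex": False}],
        "startsAt": start.isoformat(),
        "endsAt": end.isoformat(),
        "createdBy": author,
        "comment": comment,
    }


def create_silence(base_url, silence):
    """POST the silence and return the decoded JSON response."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/api/v1/silences",
        data=json.dumps(silence).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        # The response is expected to carry the new silence's ID;
        # inspect it to find the exact field name for your version.
        return json.load(resp)


def delete_silence(base_url, silence_id):
    """DELETE a silence by ID and return the HTTP status code."""
    url = base_url.rstrip("/") + "/api/v1/silence/" + urllib.parse.quote(silence_id)
    req = urllib.request.Request(url, method="DELETE")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

A script would call `create_silence` with a 10 to 15 minute window before stopping the service. Because the silence expires on its own at `endsAt`, calling `delete_silence` afterwards is optional and only shortens the window.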


Yosef Yudilevich

Aug 14, 2016, 3:30:08 PM
to Julius Volz, Prometheus Developers
Wow, beautiful.
Thanks,
I will implement it tomorrow.


petel...@googlemail.com

Mar 5, 2018, 5:41:16 PM
to Prometheus Developers
Don’t suppose you have a working script for this?

Cheers
Pete