Any metric/alert when Consul DNS disconnects from Prometheus?

dineshnithy...@gmail.com

Aug 27, 2020, 2:40:25 AM
to Prometheus Users
Hi Team,

We predominantly use multiple consul_sd_configs with a service regex to scrape metrics from thousands of target nodes.

Issues:

Sometimes we observe that Consul DNS malfunctions, which causes Prometheus to lose connectivity to the Consul server. While this lasts, all metric ingestion is dropped and the setup ends up in an unhealthy state.

  • How do we mitigate this, and is there a metric we can use to quantify and alert on a disconnect during Consul service discovery?

Any pointers or help would be highly appreciated.

Regards,
Dinesh

Julien Pivotto

Aug 27, 2020, 4:33:27 AM
to dineshnithy...@gmail.com, Prometheus Users
On 26 Aug 23:40, dineshnithy...@gmail.com wrote:
> Hi Team,
>
> we predominantly use multiple consul_sd_configs and have a service regex to
> scrape metrics across 1000's of target nodes
>
> Issues:
>
> Sometimes we observe that consul dns malfunctions and due to which
> prometheus drops connectivity to consul server and while this happens all
> the metric ingestion were getting dropped and getting into a non-healthy
> state
>
>
> - How do we mitigate this and whether do we have any metric to quantify
> and alert when there is a disconnect during consul service discovery ?


You can use the prometheus_sd_consul_rpc_failures_total metric.
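As a rough illustration, an alert on that counter could look something like the sketch below. It assumes Prometheus scrapes its own /metrics endpoint so the counter is available; the group name, alert name, rate window, `for` duration, and severity label are all placeholders picked for the example, not recommended values.

```yaml
groups:
  - name: consul-sd            # hypothetical group name
    rules:
      - alert: ConsulSDRPCFailures   # hypothetical alert name
        # The counter increases whenever a Consul RPC call made by
        # service discovery fails; a non-zero rate means failures are ongoing.
        expr: rate(prometheus_sd_consul_rpc_failures_total[5m]) > 0
        for: 10m               # assumed duration, tune to your tolerance
        labels:
          severity: warning    # assumed label
        annotations:
          summary: "Consul service discovery RPC calls are failing on {{ $labels.instance }}"
```

The rate() over a window is used rather than the raw counter value so the alert resolves once failures stop.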

>
>
> Any sort of pointers/help would be highly appreciated here.
>
> Regards,
> Dinesh


--
Julien Pivotto
@roidelapluie