On 26 Aug 23:40, dineshnithy...@gmail.com wrote:
> Hi Team,
>
> We predominantly use multiple consul_sd_configs with a service regex to
> scrape metrics across thousands of target nodes.
>
> Issues:
>
> Sometimes Consul DNS malfunctions, which causes Prometheus to lose
> connectivity to the Consul server. While this happens, all metric
> ingestion is dropped and the setup ends up in an unhealthy state.
>
>
> - How do we mitigate this, and is there a metric we can use to quantify
> and alert on a disconnect during Consul service discovery?
You can alert on the prometheus_sd_consul_rpc_failures_total metric, a
counter of failed Consul RPC calls made by service discovery.
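
As a rough sketch, an alerting rule on that counter could look like the
following. It assumes Prometheus scrapes its own /metrics endpoint, so that
the counter is available with an instance label. The group name, alert name,
5m window, "for" duration, and severity label are illustrative choices, not
recommended values.

```yaml
groups:
  - name: consul-sd            # hypothetical group name
    rules:
      - alert: ConsulSDRPCFailures   # hypothetical alert name
        # Fires when the failure counter has been increasing,
        # i.e. Consul RPC calls from service discovery are failing.
        expr: rate(prometheus_sd_consul_rpc_failures_total[5m]) > 0
        for: 5m                # assumed threshold, tune to your setup
        labels:
          severity: warning    # assumed label, match your routing
        annotations:
          summary: "Prometheus cannot reach Consul for service discovery"
          description: "Consul SD RPC calls from {{ $labels.instance }} have been failing for 5 minutes."
```

The "for" clause is there to avoid paging on a single transient failure; a
shorter or longer duration may suit your environment better.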
>
>
> Any sort of pointers/help would be highly appreciated here.
>
> Regards,
> Dinesh
>
--
Julien Pivotto
@roidelapluie