Prometheus in a restarting loop

GI D

Apr 4, 2022, 5:33:09 AM

to Prometheus Users

I am trying to install the Prometheus Stack Helm chart v33 on a Kubernetes 1.20 cluster. On startup, the "main" pod, which among other things manages the TSDB, goes through replaying its WAL segments. This frequently takes a very long time; the process gets interrupted, the container is restarted, and the cycle repeats indefinitely.

In the events list for the pod I see this:

Normal Killing pod/prometheus-prometheus-kube-prometheus-prometheus-0 Container prometheus failed startup probe, will be restarted

Now, the deployment YAML for this pod shows a startup probe configured for 60 checks at 15-second intervals. This implies a total of 15 minutes (60 × 15 s = 900 s) before the container is declared failed and restarted.
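
For reference, the probe section I am describing looks roughly like the sketch below. Only the two numbers (60 and 15) come from what I actually see; the field names are the standard Kubernetes probe fields, and the handler (path and port) is an assumption that may differ in your generated manifest.

```yaml
# Sketch of the startup probe on the "prometheus" container (not copied verbatim).
startupProbe:
  httpGet:
    path: /-/ready        # assumed: Prometheus readiness endpoint
    port: web             # assumed port name
  periodSeconds: 15       # one check every 15 seconds
  failureThreshold: 60    # 60 failed checks allowed: 60 * 15 s = 900 s = 15 min
```

Raising either `failureThreshold` or `periodSeconds` would lengthen that 15-minute window, which is what I am hoping to do.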

Is there a way for me to change the startup probe settings so as to allow more than 15 minutes? This would very likely eliminate the restarts. However, I can't see where in the chart's values file I could enter such probe settings.