We are facing an issue with our PromQL queries.
We are using the queries below to get the total number of pods, the number of running pods, and the number of failed pods.
Total number of pods:
count(max without(instance,kubernetes_io_hostname,job)(sum(kube_pod_status_phase{namespace=~"$ns",phase!~"Succeeded"}) by (phase,pod,namespace) > 0)) or vector(0)
Running pods:
count(max without(instance,kubernetes_io_hostname,job)(sum(kube_pod_status_phase{namespace=~"$ns",phase="Running"}) by (phase,pod,namespace) > 0)) or vector(0)
Failed pods:
count(max without(instance,kubernetes_io_hostname,job)(sum(kube_pod_status_phase{namespace=~"$ns",phase="Failed"}) by (phase,pod,namespace) > 0)) or vector(0)
However, there is an issue with these queries: even if some containers in a pod are down, the pod is not counted as failed.
Example:
platform-app-86b88c875c-ps99x 0/2 Running 24h
Here the pod status is Running, but none of its containers are up (0/2 ready).
Still, it is not counted as a failed pod.
How can I modify the above queries to count the total number of pods, the running pods, and the failed pods based on container status?
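To make the intent concrete, here is a rough sketch of the kind of query we have in mind. It is not a confirmed solution. It assumes kube-state-metrics is being scraped and exposes kube_pod_container_status_ready (1 when a container is ready, 0 otherwise) with namespace and pod labels; the grouping labels and the handling of Succeeded pods are assumptions that may need adjusting.

```promql
# Sketch: pods that have at least one container not ready.
# Pods in the Succeeded phase are excluded, since the containers of
# completed pods also report as not ready.
count(
  (min by (namespace, pod) (kube_pod_container_status_ready{namespace=~"$ns"}) == 0)
  unless on (namespace, pod)
  (max by (namespace, pod) (kube_pod_status_phase{namespace=~"$ns", phase="Succeeded"}) == 1)
) or vector(0)

# Sketch: pods in the Running phase whose containers are all ready.
count(
  (min by (namespace, pod) (kube_pod_container_status_ready{namespace=~"$ns"}) == 1)
  and on (namespace, pod)
  (max by (namespace, pod) (kube_pod_status_phase{namespace=~"$ns", phase="Running"}) == 1)
) or vector(0)
```

With queries along these lines, a pod such as platform-app-86b88c875c-ps99x (0/2 ready, phase Running) should fall into the first count rather than the second.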
Thank you,
Akash