Prometheus unable to find API servers


Junaid Subhani

May 26, 2017, 5:52:55 PM
to Prometheus Users
I have run into an issue that appeared out of nowhere. It looks like Prometheus (running in a K8s cluster) is unable to connect to the API server.

My ConfigMap is:

# cat prometheus-configmap-1.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus
data:
  prometheus.yml: |-
    global:
      scrape_interval: 5s
    scrape_configs:
    - job_name: 'kubernetes_apiserver'
      tls_config:
        insecure_skip_verify: true
      kubernetes_sd_configs:
      - api_servers:
        - http://172.29.219.65:8080
        role: apiserver
      relabel_configs:
      - source_labels: [__meta_kubernetes_role]
        action: keep
        regex: (?:apiserver)

    ###################### Kubernetes Pods ##########################
    - job_name: 'haproxy'
      static_configs:
        - targets:
          - 172.29.219.19:9100

    - job_name: 'docker_containers'
      metrics_path: '/metrics'
      tls_config:
        insecure_skip_verify: true
      static_configs:
        - targets:
          - 172.29.219.96:4194
          - 172.29.219.66:4194
          - 172.29.219.95:4194
          - 172.29.219.97:4194

    - job_name: 'kubernetes_pods'
      tls_config:
        insecure_skip_verify: true
      kubernetes_sd_configs:
      - api_servers:
        - http://172.29.219.65:8080
        role: pod
      relabel_configs:
        - source_labels: [__meta_kubernetes_pod_name]
          action: replace
          target_label: pod_name
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)

What I am seeing on the dashboard is:

# kubectl logs prometheus-2340598711-zuhe0
time="2017-05-26T21:45:25Z" level=info msg="Starting prometheus (version=1.1.1, branch=release-1.0, revision=ab312a075f810e2ed124783c46d68674af071293)" source="main.go:73"
time="2017-05-26T21:45:25Z" level=info msg="Build context (go=go1.6.3, user=root@8ab14ddb4898, date=20160907-09:37:01)" source="main.go:74"
time="2017-05-26T21:45:25Z" level=info msg="Loading configuration file /etc/prometheus/prometheus.yml" source="main.go:221"
time="2017-05-26T21:45:25Z" level=info msg="Loading series map and head chunks..." source="storage.go:358"
time="2017-05-26T21:45:25Z" level=info msg="0 series loaded." source="storage.go:363"
time="2017-05-26T21:45:25Z" level=info msg="Starting target manager..." source="targetmanager.go:76"
time="2017-05-26T21:45:25Z" level=warning msg="No AlertManagers configured, not dispatching any alerts" source="notifier.go:176"
time="2017-05-26T21:45:25Z" level=info msg="Listening on :9090" source="web.go:233"
time="2017-05-26T21:45:35Z" level=error msg="Cannot initialize pods collection: unable to list Kubernetes pods: unable to query any API servers: Get http://172.29.219.65:8080/api/v1/pods: dial tcp 172.29.219.65:8080: i/o timeout" source="pod.go:46"

But the node is definitely accessible and responding. 

# curl http://172.29.219.65:8080
{
  "paths": [
    "/api",
    "/api/v1",
    "/apis",
    "/apis/apps",
    "/apis/apps/v1alpha1",
    "/apis/authentication.k8s.io",
    "/apis/authentication.k8s.io/v1beta1",
    "/apis/authorization.k8s.io",
    "/apis/authorization.k8s.io/v1beta1",
    "/apis/autoscaling",
    "/apis/autoscaling/v1",
    "/apis/batch",
    "/apis/batch/v1",
    "/apis/batch/v2alpha1",
    "/apis/certificates.k8s.io",
    "/apis/certificates.k8s.io/v1alpha1",
    "/apis/extensions",
    "/apis/extensions/v1beta1",
    "/apis/policy",
    "/apis/policy/v1alpha1",
    "/apis/rbac.authorization.k8s.io",
    "/apis/rbac.authorization.k8s.io/v1alpha1",
    "/apis/storage.k8s.io",
    "/apis/storage.k8s.io/v1beta1",
    "/healthz",
    "/healthz/ping",
    "/logs",
    "/metrics",
    "/swaggerapi/",
    "/ui/",
    "/version"
  ]
}

Any idea what might have gone wrong?

Junaid Subhani

May 27, 2017, 9:01:07 PM
to Prometheus Users
I just saw that this happens only when the Prometheus pod is deployed on particular minions. I have 4 minions. Prometheus works fine if deployed on Minion 2, but it does not work (it cannot reach the API server) if deployed on Minion 1, 3, or 4. This is very strange behavior. I am trying to figure it out.

Matthias Rampke

May 28, 2017, 1:28:56 PM
to Junaid Subhani, Prometheus Users

Is that curl from the node that Prometheus is running on?
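
To compare reachability from the node with reachability from inside the Prometheus pod, the same TCP dial can be attempted in both places. The following is a minimal sketch in Python, assuming a Python interpreter is available in each environment (the Prometheus container image may not ship one). `can_dial` is a hypothetical helper written for illustration, not part of Prometheus or Kubernetes.

```python
import socket


def can_dial(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This mirrors the "dial tcp ... i/o timeout" step in the Prometheus log:
    it only checks that a TCP connection can be opened, not that the
    HTTP API behind it answers correctly.
    """
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If `can_dial("172.29.219.65", 8080)` returns `True` on the node but `False` inside the pod running on that node, the problem probably lies in pod networking on that minion rather than in the Prometheus configuration.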



Junaid Subhani

May 30, 2017, 8:42:32 AM
to Prometheus Users, ijunaid...@gmail.com
Yes, it is.