Configuring Prometheus to scrape cAdvisor for monitoring Kubernetes cluster data


pbasan...@gmail.com

Dec 10, 2017, 11:26:23 PM
to Prometheus Users
Hi,
I am using Prometheus (2.0) to monitor a Kubernetes cluster (1.7.4). Eventually I want to use the dashboard https://grafana.com/dashboards/315.

I deployed Prometheus with the command

kubectl create -f https://raw.githubusercontent.com/coreos/blog-examples/master/monitoring-kubernetes-with-prometheus/prometheus.yml
and then
kubectl get pods -l app=prometheus -o name | sed 's/^.*\///' | xargs -I{} kubectl port-forward {} 9090:9090

and installed cAdvisor:

sudo docker run \
  --volume=/:/rootfs:ro \
  --volume=/var/run:/var/run:rw \
  --volume=/sys:/sys:ro \
  --volume=/var/lib/docker/:/var/lib/docker:ro \
  --volume=/dev/disk/:/dev/disk:ro \
  --publish=8080:8080 \
  --detach=true \
  --name=cadvisor \
  google/cadvisor:latest
  

After that I am not able to see the container data at the Prometheus URL (localhost:9090); only system data is available.
It looks like the configuration connecting cAdvisor to Prometheus is missing. Can someone please help?

Tom Wilkie

Dec 11, 2017, 10:47:21 AM
to pbasan...@gmail.com, Prometheus Users
In a Kubernetes cluster, cAdvisor runs as part of the kubelet.  As of k8s 1.7, the cAdvisor port on the kubelet changed, so you'll need a scrape config like this:

- job_name: 'kube-system/cadvisor'
  kubernetes_sd_configs:
    - role: node

  relabel_configs:
  - source_labels: [__address__]
    regex: (.+):([0-9]+)
    target_label: __address__
    replacement: $1:4194

  metric_relabel_configs:
  # Drop container_* metrics with no image.
  - source_labels: [__name__, image]
    regex: container_([a-z_]+);
    action: drop

I include a specific job name (kube-system/cadvisor) as I like my job names to include namespace (to prevent aggregating similar jobs across namespaces, which represent dev/test envs for me).  I then use a relabel rule to add the cAdvisor port to the address, and a metric relabel rule to drop any metrics with names that match container_* and an empty image after the scrape, as I find cAdvisor exports way more metrics than I need.
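
To make the two regexes above concrete, here is a minimal Python sketch of what they do. It is only an illustration, not Prometheus code: the function names and the sample addresses are hypothetical, and it assumes (as Prometheus does) that relabel regexes are fully anchored and that multiple source label values are joined with ";".

```python
import re

# Prometheus anchors relabel regexes at both ends, so fullmatch() is used here.
ADDRESS_RE = re.compile(r"(.+):([0-9]+)")
DROP_RE = re.compile(r"container_([a-z_]+);")


def rewrite_address(address: str, port: int = 4194) -> str:
    """Mimic the relabel rule: keep the host, replace the port with the cAdvisor port."""
    match = ADDRESS_RE.fullmatch(address)
    if match is None:
        return address  # no match: the label is left unchanged
    return f"{match.group(1)}:{port}"


def is_dropped(metric_name: str, image: str) -> bool:
    """Mimic the metric relabel rule on the joined value '<__name__>;<image>'."""
    return DROP_RE.fullmatch(f"{metric_name};{image}") is not None


if __name__ == "__main__":
    # Hypothetical node address as discovered by the node role.
    print(rewrite_address("10.0.0.5:10250"))
    # A container_* series with an empty image label is dropped...
    print(is_dropped("container_cpu_usage_seconds_total", ""))
    # ...while the same series with an image label is kept.
    print(is_dropped("container_cpu_usage_seconds_total", "nginx:1.13"))
```

The trailing ";" in the drop regex is what restricts the match to series whose image label is empty: with a non-empty image, the joined value has extra text after the ";" and the anchored regex no longer matches.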

HTH

Tom




pbasan...@gmail.com

Dec 11, 2017, 12:39:02 PM
to Prometheus Users
Hi Tom,
Thanks for the help.
Can you please send the complete prometheus.yml file? Meanwhile, I am getting this error:

E1211 09:34:30.246263  122535 portforward.go:331] an error occurred forwarding 9090 -> 9090: error forwarding port 9090 to pod 236155853ed949162807a9c46bec8305207d5bdfb9cb72070773970430eba069, uid : exit status 1: 2017/12/11 09:34:30 socat[122625] E connect(5, AF=2 127.0.0.1:9090, 16): Connection refused

Tom Wilkie

Dec 11, 2017, 12:50:31 PM
to pbasan...@gmail.com, Prometheus Users
Attached; it's very opinionated, mind you.



prometheus.yml

pbasan...@gmail.com

Dec 12, 2017, 1:53:54 AM
to Prometheus Users
I tried the scrape config and got the error below:

E1211 22:06:55.551998   58588 portforward.go:331] an error occurred forwarding 9090 -> 9090: 
error forwarding port 9090 to pod 813e633639c6e25dedf380af29f60552a42752db306924563a941df9ffc6aa5c, 
uid : exit status 1: 2017/12/11 22:06:55 socat[58814] E connect(5, AF=2 127.0.0.1:9090, 16): Connection refused
Regards,
Basanta


Tom Wilkie

Dec 12, 2017, 6:17:28 AM
to pbasan...@gmail.com, Prometheus Users
What are the logs from the pod?

pbasan...@gmail.com

Dec 12, 2017, 7:13:09 AM
to Prometheus Users
Here are the error details:
Dec 11 22:07:53 slc11yke kubelet: E1211 22:07:53.249539   92242 pod_workers.go:182] 
Error syncing pod 6d6f399f-df02-11e7-9864-fa163efa261a 
("prometheus-432098442-1p74x_default(6d6f399f-df02-11e7-9864-fa163efa261a)"), 
skipping: failed to "StartContainer" for "prometheus" with CrashLoopBackOff: 
"Back-off 1m20s restarting failed container=prometheus 
pod=prometheus-432098442-1p74x_default(6d6f399f-df02-11e7-9864-fa163efa261a)"

pbasan...@gmail.com

Dec 13, 2017, 5:29:20 AM
to Prometheus Users

pbasan...@gmail.com

Dec 13, 2017, 5:34:10 AM
to Prometheus Users
Hi Tom,

Here is the command I am running:
kubectl get pods -l app=prometheus -o name |         sed 's/^.*\///' |         xargs -I{} kubectl port-forward {} 9090:9090

and the error is:
E1211 09:02:29.749606  118344 portforward.go:331] an error occurred forwarding 9090 -> 9090: error forwarding port 9090 to pod 0b7ca924dc9dec5f4d267eb1708e8c59c6c8d889b58a01158147340f3a18b928, uid : exit status 1: 2017/12/11 09:02:29 socat[118414] E connect(5, AF=2 127.0.0.1:9090, 16): Connection refused


Pod error:
Dec 11 22:07:53 slc11yke kubelet: E1211 22:07:53.249539   92242 pod_workers.go:182] 
Error syncing pod 6d6f399f-df02-11e7-9864-fa163efa261a 
("prometheus-432098442-1p74x_default(6d6f399f-df02-11e7-9864-fa163efa261a)"), 
skipping: failed to "StartContainer" for "prometheus" with CrashLoopBackOff: 
"Back-off 1m20s restarting failed container=prometheus 
pod=prometheus-432098442-1p74x_default(6d6f399f-df02-11e7-9864-fa163efa261a)"

Tom Wilkie

Dec 13, 2017, 6:35:07 AM
to pbasan...@gmail.com, Prometheus Users
Hi Basanta - I need the output of kubectl logs -l app=prometheus to help you further.

Thanks

Tom

pbasan...@gmail.com

Dec 13, 2017, 8:53:49 AM
to Prometheus Users
Here are the details:
bash-4.2#  kubectl logs -l app=prometheus
time="2017-12-13T13:51:09Z" level=info msg="Starting prometheus (version=1.0.1, branch=master, revision=be40190)" source="main.go:73"
time="2017-12-13T13:51:09Z" level=info msg="Build context (go=go1.6.2, user=root@e881b289ce76, date=20160722-19:54:46)" source="main.go:74"
time="2017-12-13T13:51:09Z" level=info msg="Loading configuration file /etc/prometheus/prometheus.yml" source="main.go:206"
time="2017-12-13T13:51:09Z" level=error msg="Couldn't load configuration (-config.file=/etc/prometheus/prometheus.yml): Kubernetes SD configuration requires at least one Kubernetes API server" source="main.go:218"
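
One detail worth noting in the log above: the pod reports version=1.0.1, while the first post in this thread mentions Prometheus 2.0. The following is a rough Python sketch for pulling the running version out of the startup line and comparing it with the expected release. The function name is hypothetical, and the idea that the version mismatch is related to the configuration error is an assumption, not something confirmed in this thread.

```python
import re

# Matches the startup line that Prometheus prints, as seen in the log above.
VERSION_RE = re.compile(r"Starting prometheus \(version=([0-9]+(?:\.[0-9]+)*)")

# Startup line copied from the pod log in this thread.
LOG = (
    'time="2017-12-13T13:51:09Z" level=info msg="Starting prometheus '
    '(version=1.0.1, branch=master, revision=be40190)" source="main.go:73"'
)


def running_version(log_text: str):
    """Return the Prometheus version found in the log as a tuple of ints, or None."""
    match = VERSION_RE.search(log_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


if __name__ == "__main__":
    version = running_version(LOG)
    # Assumption: the scrape config in this thread was written for a 2.x server.
    if version is not None and version < (2, 0):
        print("Pod is running", ".".join(map(str, version)), "not 2.x")
```

If the check flags an older release, the image tag in the Deployment manifest is the first thing to look at, since that tag decides which Prometheus version actually runs in the pod.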

