Prometheus not getting metrics from cadvisor


Max Furman

Sep 1, 2020, 1:51:29 AM9/1/20
to Prometheus Users
Heyo,

I've deployed a prometheus, grafana, kube-state-metrics, alertmanager, etc. setup using kubernetes in GKE v1.16.x. I've used https://github.com/do-community/doks-monitoring as a jumping off point for the yaml files.

I've been trying to debug a situation for a few days now and would be very grateful for some help. My prometheus nodes are not getting metrics from cadvisor.

* All the services and pods in the deployments are running - prometheus, kube-state-metrics, node-exporter - with no errors.
* The cadvisor targets in the prometheus UI appear as "up".
* Prometheus is able to collect other metrics from the cluster, but no pod/container level usage metrics.
* I can see cadvisor metrics when I query `kubectl get --raw "/api/v1/nodes/<your_node>/proxy/metrics/cadvisor"`, but when I look in prometheus for `container_cpu_usage` or `container_memory_usage`, there is no data.
* My cadvisor scrape job config in prometheus

```
    - job_name: kubernetes-cadvisor
      honor_timestamps: true
      scrape_interval: 15s
      scrape_timeout: 10s
      metrics_path: /metrics/cadvisor
      scheme: https
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
```
cribbed from the prometheus/docs/examples.
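While debugging I've also been asking Prometheus itself what it resolved for this job - address, metrics path, last scrape error - via its HTTP API. A rough sketch of that check (the `monitoring` namespace and `prometheus` service name are from my setup, and it assumes a local port-forward and `jq`):

```
# Sketch: ask Prometheus what it actually resolved for the cadvisor job.
# Namespace and service name are from my setup; adjust as needed.
kubectl -n monitoring port-forward svc/prometheus 9090:9090 &
curl -s http://localhost:9090/api/v1/targets \
  | jq '.data.activeTargets[]
        | select(.labels.job == "kubernetes-cadvisor")
        | {scrapeUrl, health, lastError}'
```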

It's odd to me that the path I got from the examples is /metrics/cadvisor, but the path I queried using `kubectl get` was /proxy/metrics/cadvisor.

I've tried a whole bunch of different variations on paths and scrape configs, but no luck. Given that I can query the metrics using `kubectl get` (so they definitely exist), it seems to me the issue is Prometheus communicating with the cadvisor target.
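For completeness, the manual check I've been running to mimic what Prometheus does is roughly this (a sketch - the namespace, pod name and NODE_IP are placeholders from my setup, and it assumes the container has a shell and curl):

```
# Sketch: call the kubelet's cadvisor endpoint the way Prometheus would,
# i.e. HTTPS on the kubelet port using the pod's service-account token.
# Namespace, pod name and NODE_IP are placeholders.
kubectl -n monitoring exec prometheus-0 -- sh -c '
  TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
  curl -sk -H "Authorization: Bearer $TOKEN" \
    "https://NODE_IP:10250/metrics/cadvisor" | head -n 5
'
```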

If anyone has experience getting this configured I'd sure appreciate some help debugging.

Cheers,
max

Brian Candler

Sep 1, 2020, 5:17:55 AM9/1/20
to Prometheus Users
On Tuesday, 1 September 2020 06:51:29 UTC+1, Max Furman wrote:
It's odd to me that the path I got from the examples is /metrics/cadvisor, but the path I queried using `kubectl get` was /proxy/metrics/cadvisor.


This is not well documented, but I think it's the difference between scraping the kubelet directly versus scraping through the k8s API.

I did attempt to reverse-engineer some of this here:
but I wouldn't say I really got to the bottom of it.
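As far as I can tell, the two routes to the same cadvisor data look roughly like this (hedging, since I haven't verified every detail):

```
# Directly against the kubelet on each node (what `role: node` targets by default):
#   https://<node-address>:10250/metrics/cadvisor
# Through the API server's node proxy (what `kubectl get --raw` goes through):
#   https://<apiserver>/api/v1/nodes/<node-name>/proxy/metrics/cadvisor

# e.g. the proxy route, from a workstation with cluster credentials:
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/metrics/cadvisor" | head
```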

Max Furman

Sep 1, 2020, 11:18:14 AM9/1/20
to Prometheus Users
Funnily enough I was exploring that post yesterday - it came up based on some of the keywords I was searching for. Definitely useful to see someone breaking down the various possible data sources. I missed the part right at the bottom (the UPDATE) - it's helpful to see that. <host>:10255/metrics/cadvisor is the url my prometheus config is currently scraping. Still no luck there, but at least nice to be able to confirm that I'm not scraping the completely wrong place.
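I suppose the next thing to check is whether that port is even open, something along these lines (a sketch - NODE_IP is a placeholder, and it needs to run from somewhere that can reach the node):

```
# Sketch: is the kubelet read-only port (10255) reachable and serving cadvisor metrics?
# NODE_IP is a placeholder; run from somewhere with network access to the node.
curl -sf --max-time 5 "http://NODE_IP:10255/metrics/cadvisor" -o /dev/null \
  && echo "10255 reachable" || echo "10255 not reachable"
```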

Brian Candler

Sep 1, 2020, 1:38:07 PM9/1/20
to Prometheus Users
That :10255 is from a test server running the microk8s snap:

```
root@nuc1:~# netstat -natp | grep kubelet
tcp        0      0 127.0.0.1:10248         0.0.0.0:*               LISTEN      4129/kubelet
tcp        0      0 127.0.0.1:37888         127.0.0.1:16443         ESTABLISHED 4129/kubelet
tcp6       0      0 :::10255                :::*                    LISTEN      4129/kubelet
tcp6       0      0 :::10250                :::*                    LISTEN      4129/kubelet
root@nuc1:~#
```

Looking at a production k8s 1.16 cluster (installed using kubectl) I don't see this port open.  And I'm afraid I don't know where it's configured, even in microk8s.

I do note that the kubelet command line includes "--insecure-port=0", but I don't think that's it; even with a reboot it comes up on the same ports.
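My best guess - not verified on microk8s - is that 10255 is the kubelet's read-only port, normally set by --read-only-port on the command line or readOnlyPort in the kubelet config file. Something along these lines should show where it's coming from (paths are the usual defaults and may differ on microk8s):

```
# Sketch: look for the read-only-port setting on the kubelet command line
# or in its config file. Paths are the common defaults; microk8s may differ.
ps aux | grep '[k]ubelet' | tr ' ' '\n' | grep -E 'read-only-port|^--config'
grep -i readOnlyPort /var/lib/kubelet/config.yaml 2>/dev/null
```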

Max Furman

Sep 1, 2020, 2:41:19 PM9/1/20
to Prometheus Users
Interesting! This comment helped me a lot. Working from the new assumption that the address was wrong, I searched for where cadvisor metrics can actually be scraped from in GKE, and found a blog post with an example kubernetes-cadvisor config that worked for me.


config pasted below for posterity:
```
- job_name: 'kubernetes-cadvisor'
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - target_label: __address__
    replacement: kubernetes.default.svc.cluster.local:443
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
```

Now I see the metrics appearing in prometheus.
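For anyone finding this later, the quick way I verified the data was really flowing was to query the HTTP API directly (a sketch - it assumes the server is reachable on localhost:9090 and that the namespace/service names from my setup apply):

```
# Sketch: confirm container-level metrics now exist in Prometheus.
# Assumes the server is reachable at localhost:9090, e.g. via
# `kubectl -n monitoring port-forward svc/prometheus 9090:9090`.
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=sum(rate(container_cpu_usage_seconds_total[5m])) by (pod)' \
  | head -c 400
```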

Thanks again for the hint!

Now on to trying to get the grafana dashboards working. No rest for the wicked.