Useful labels missing from prometheus node_exporter in 1.7.x GKE


ericuldall

Oct 20, 2017, 2:14:48 PM
to Prometheus Users
I've been seeking an answer to this question for some time now, and you can see the work in the following places:


To reiterate my question here: After upgrading to GKE 1.7.x I found that many helpful labels were missing from the node data.

Previously my output from prom looked like this:

node_cpu{beta_kubernetes_io_arch="amd64",beta_kubernetes_io_instance_type="n1-standard-2",beta_kubernetes_io_os="linux",cloud_google_com_gke_nodepool="default-pool",cpu="cpu1",failure_domain_beta_kubernetes_io_region="europe-west1",failure_domain_beta_kubernetes_io_zone="europe-west1-b",instance="my-cluster-default-pool-3cbb3136-22jh",job="kubernetes-node-exporter",kubernetes_io_hostname="my-cluster-default-pool-3cbb3136-22jh",mode="system"}

Now it looks like this:


node_cpu{app="node-exporter",cpu="cpu1",instance="10.128.0.25:9100",job="kubernetes-service-endpoints",kubernetes_name="node-exporter",mode="system",name="node-exporter"}

As you can see, I'm now missing a lot of really useful information: arch, instance_type, os, nodepool, region, zone, instance (the real instance name, not just the local IP) and hostname (previously the same as instance).
Does anyone have a clue why this information is missing after our upgrade?


Here's my prom config:

prometheus.yml: |-
    global:
      scrape_interval: 15s
    scrape_configs:
    - job_name: 'kubernetes-nodes'

      scheme: https

      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

      kubernetes_sd_configs:
      - role: node

      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics
    - job_name: 'kubernetes-cadvisor'
      scheme: https

      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

      kubernetes_sd_configs:
      - role: node

      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}:4194/proxy/metrics
    - job_name: 'kubernetes-apiservers'
      kubernetes_sd_configs:
      - role: endpoints
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
    - job_name: 'kubernetes-service-endpoints'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: (.+)(?::\d+);(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_service_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name
    - job_name: 'kubernetes-services'
      scheme: https
      metrics_path: /probe
      params:
        module: [http_2xx]
      kubernetes_sd_configs:
      - role: service
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
        action: keep
        regex: true
      - source_labels: [__address__]
        target_label: __param_target
      - target_label: __address__
        replacement: blackbox
      - source_labels: [__param_target]
        target_label: instance
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_service_namespace]
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        target_label: kubernetes_name
    - job_name: 'kubernetes-pods'
      scheme: https
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: (.+):(?:\d+);(\d+)
        replacement: ${1}:${2}
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_pod_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name

st...@eonbit.org

Nov 26, 2017, 4:12:52 PM
to Prometheus Users
Hi ericuldall,

I had the same problem. I was using a prometheus.yml basically identical to yours. I set up a daemonset for node exporter and a service for it, then added the prometheus.io/scrape=true annotation to that service so it would be scraped. With that setup I'd only get the IP of the node into Prometheus, and no node name.

The solution for me was to:

a) Scrape the pod instead of the service, i.e. remove the prometheus.io/scrape=true annotation from the service and add it to the daemonset instead (which will put it on the pods).

b) Add this to relabel_configs for job_name kubernetes-pods in prometheus.yml:

        - source_labels: [__meta_kubernetes_pod_node_name]
          target_label: node

That gave me at least the node name. Not sure if some more relabel configs could give me instance type etc. too. Haven't got that far yet :)
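
As a rough sketch of step (a), the annotation would sit on the DaemonSet's pod template, something like the fragment below. The `app: node-exporter` label and port 9100 are assumptions taken from the labels and address in the output earlier in the thread, not a confirmed manifest, and the rest of the DaemonSet spec is left out.

```yaml
# Fragment of a node-exporter DaemonSet (illustrative only).
# The annotations go on the pod template, so every pod created by the
# DaemonSet carries them and is kept by the 'kubernetes-pods' job.
spec:
  template:
    metadata:
      labels:
        app: node-exporter            # assumed label, as seen in the scraped output
      annotations:
        prometheus.io/scrape: "true"  # matched by the 'keep' relabel rule
        prometheus.io/port: "9100"    # assumed node_exporter port; rewrites __address__
```

Annotation values have to be quoted strings, otherwise YAML parses them as a boolean and an integer and the manifest is rejected.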

 - Stian

Tom Wilkie

Dec 11, 2017, 11:25:56 AM
to st...@eonbit.org, Prometheus Users
Hi ericuldall + stian; this information can be scraped out of kube-state-metrics with the `kube_node_labels` timeseries.


Thanks

Tom


eugene.po...@gmail.com

Dec 20, 2017, 3:46:51 PM
to Prometheus Users
How can this be combined with node-exporter metrics? kube_node_labels doesn't have any labels in common with node_memory_MemFree, for example.

Tom Wilkie

Dec 21, 2017, 8:44:53 AM
to eugene.po...@gmail.com, Prometheus Users

eugene.po...@gmail.com

Dec 21, 2017, 9:35:04 AM
to Prometheus Users
Thanks a lot for the answer, but this example points to "container_cpu_usage_seconds_total", which already has all the node labels, so there is no need to combine anything.

The problem is with node-exporter metrics for example "node_memory_MemFree":
https://github.com/kausalco/public/blob/6f886b635bfed6de1ed0d4a96b25638c8401756e/klumps/recording_rules.jsonnet#L127

I just want to be able to calculate memory usage (or other metrics) for nodes with specific labels (kube_node_labels) and it seems completely impossible.

Tom Wilkie

Dec 21, 2017, 10:35:36 AM
to eugene.po...@gmail.com, Prometheus Users
node_memory_MemFree{job="default/node-exporter"} 
  * on (namespace, instance) group_left(node) 
label_replace(kube_pod_info, "instance", "$1", "pod", "(.*)") 
  * on (node) group_left(label_beta_kubernetes_io_os) 
kube_node_labels
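
If a join like the one above is needed regularly, it could be stored as a recording rule. The following is only a sketch in the Prometheus 2.x rule-file format, assuming the same job and label names as the query above. The group name and the recorded metric name are made up for illustration, and `label_cloud_google_com_gke_nodepool` assumes kube-state-metrics exposes the GKE node pool label under its usual `label_` prefix.

```yaml
# Hypothetical rule file, e.g. referenced from prometheus.yml via rule_files.
groups:
- name: node-labels-example            # illustrative group name
  rules:
  # Free memory summed per node pool. The first join attaches the node
  # name via kube_pod_info, the second pulls in the node pool label.
  - record: nodepool:node_memory_MemFree:sum   # illustrative metric name
    expr: |
      sum by (label_cloud_google_com_gke_nodepool) (
        node_memory_MemFree{job="default/node-exporter"}
          * on (namespace, instance) group_left(node)
        label_replace(kube_pod_info, "instance", "$1", "pod", "(.*)")
          * on (node) group_left(label_cloud_google_com_gke_nodepool)
        kube_node_labels
      )
```

Both kube_pod_info and kube_node_labels have a constant value of 1, so the multiplications only copy labels across and leave the memory values unchanged.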
