Missing some metrics in Prometheus


yuzh...@megvii.com

Dec 19, 2017, 4:07:03 AM
to Prometheus Users

What did you do?
I installed Prometheus and Grafana in k8s, and I want graphs monitoring the CPU and memory status of both the nodes and the containers.
What did you expect to see?
Container graphs with data, i.e. container_* metrics present in Prometheus.
What did you see instead? Under which circumstances?
The container graphs are blank, with no data, while the node graphs are fine.
Prometheus also has no metrics of the form container_xxxx_xxxx.

So I visited nodeIP:4194/metrics directly, and I got this:

# HELP cadvisor_version_info A metric with a constant '1' value labeled by kernel version, OS version, docker version, cadvisor version & cadvisor revision.

# TYPE cadvisor_version_info gauge

cadvisor_version_info{cadvisorRevision="",cadvisorVersion="",dockerVersion="17.09.0-ce",kernelVersion="4.4.0-62-generic",osVersion="Ubuntu 16.04.3 LTS"} 1

# HELP container_cpu_load_average_10s Value of container cpu load average over the last 10 seconds.

# TYPE container_cpu_load_average_10s gauge

container_cpu_load_average_10s{container_name="",id="/",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/docker",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/docker/11a1cb87706e9b4f16ce5884ee7b0da46a298d4f85c0068a129056b3aef91efd",image="calico/node:v2.6.2",name="calico-node",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/init.scope",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/kube-proxy",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/kubepods",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/kubepods/besteffort",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/kubepods/besteffort/poda6606d8d-e0b4-11e7-9793-00163e0c0192",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/kubepods/besteffort/podd3f3ab95-e098-11e7-9793-00163e0c0192",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/kubepods/besteffort/podfa4dda06-e491-11e7-9793-00163e0c0192",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/kubepods/besteffort/podfaa5ae07-e491-11e7-9793-00163e0c0192",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/kubepods/burstable",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/accounts-daemon.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/aegis.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/agentwatch.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/apparmor.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/atd.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/calico-node.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/cloud-config.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/cloud-final.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/cloud-init-local.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/cloud-init.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/console-setup.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/cron.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/dbus.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/docker.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/grub-common.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/if...@eth0.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/irqbalance.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/keyboard-setup.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/kmod-static-nodes.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/kube-proxy.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/kubelet.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/networking.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/nscd.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/ntp.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/ondemand.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/rc-local.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/resolvconf.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/rsyslog.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/run-r0ee7a8f62da743e4a5e8639aaebe4e0b.scope",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/run-r4ddcb99c044d4cb1a59f6fb993db2aaa.scope",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/run-rac2bfac20382421ab42bddae02068252.scope",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/run-rb13e8936a2584deea5c32f81058dd692.scope",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/setvtrgb.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/ssh.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/sysstat.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/system-getty.slice",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/systemd-journal-flush.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/systemd-journald.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/systemd-logind.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/systemd-modules-load.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/systemd-random-seed.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/systemd-remount-fs.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/systemd-sysctl.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/systemd-tmpfiles-setup-dev.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/systemd-tmpfiles-setup.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/systemd-udev-trigger.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/systemd-udevd.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/systemd-update-utmp.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/systemd-user-sessions.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/system.slice/uuidd.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/user.slice",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/user.slice/user-0.slice",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/user.slice/user-0.slice/session-1203.scope",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="",id="/user.slice/user-0.slice/us...@0.service",image="",name="",namespace="",pod_name=""} 0
container_cpu_load_average_10s{container_name="POD",id="/kubepods/besteffort/poda6606d8d-e0b4-11e7-9793-00163e0c0192/f15d7337adfd997e365ade1d9606122ff026f3fdf320e4080fff4f12384d75bb",image="mirrorgooglecontainers/pause-amd64:3.0",name="k8s_POD_mc-fs-crypted-5fbb99dbd6-sc9tz_staging_a6606d8d-e0b4-11e7-9793-00163e0c0192_1",namespace="staging",pod_name="mc-fs-crypted-5fbb99dbd6-sc9tz"} 0
container_cpu_load_average_10s{container_name="POD",id="/kubepods/besteffort/podd3f3ab95-e098-11e7-9793-00163e0c0192/3820e6b18214e821cc458f1dfa52879bbc77ef2cb86b8ca53374dfc7af59456b",image="mirrorgooglecontainers/pause-amd64:3.0",name="k8s_POD_heapster-6956dd7956-52l7k_kube-system_d3f3ab95-e098-11e7-9793-00163e0c0192_1",namespace="kube-system",pod_name="heapster-6956dd7956-52l7k"} 0
container_cpu_load_average_10s{container_name="POD",id="/kubepods/besteffort/podfa4dda06-e491-11e7-9793-00163e0c0192/43efaf47d46104b21eafffe10607a58916f6099d571f0cf04c6a81e7ed3a551b",image="mirrorgooglecontainers/pause-amd64:3.0",name="k8s_POD_prometheus-node-exporter-fn9cd_monitoring_fa4dda06-e491-11e7-9793-00163e0c0192_0",namespace="monitoring",pod_name="prometheus-node-exporter-fn9cd"} 0
container_cpu_load_average_10s{container_name="POD",id="/kubepods/besteffort/podfaa5ae07-e491-11e7-9793-00163e0c0192/c4494774f5b53cb63c669166b99808ec5ab2c2e5bb1b9a4772120c7ecea49d5f",image="mirrorgooglecontainers/pause-amd64:3.0",name="k8s_POD_node-directory-size-metrics-rgf46_monitoring_faa5ae07-e491-11e7-9793-00163e0c0192_0",namespace="monitoring",pod_name="node-directory-size-metrics-rgf46"} 0
container_cpu_load_average_10s{container_name="caddy",id="/kubepods/besteffort/podfaa5ae07-e491-11e7-9793-00163e0c0192/949887dfb82288f9b334908c4d75ddba7cb74cb4366c0797c99d95829244d441",image="dockermuenster/caddy@sha256:34899a9cf74fd3943b128e2ce65e08068ec7dcbf3631c5f08c64b483fbf8171b",name="k8s_caddy_node-directory-size-metrics-rgf46_monitoring_faa5ae07-e491-11e7-9793-00163e0c0192_0",namespace="monitoring",pod_name="node-directory-size-metrics-rgf46"} 0
container_cpu_load_average_10s{container_name="crypted",id="/kubepods/besteffort/poda6606d8d-e0b4-11e7-9793-00163e0c0192/0e87abb82923fb6fcb20c16341e190839377a0af307b92047de47e6ce8fd8991",image="harbor.csg-bjv.megvii-inc.com/megvii/fs-crypted@sha256:8579027bed75f35a2f57246f46754003f17d6bc3cc3940852a2f634b4db52de2",name="k8s_crypted_mc-fs-crypted-5fbb99dbd6-sc9tz_staging_a6606d8d-e0b4-11e7-9793-00163e0c0192_1",namespace="staging",pod_name="mc-fs-crypted-5fbb99dbd6-sc9tz"} 0
container_cpu_load_average_10s{container_name="heapster",id="/kubepods/besteffort/podd3f3ab95-e098-11e7-9793-00163e0c0192/634ea11c5ed62c171798e803f74e76f8d503f6cd77d503876709e58db579f015",image="mirrorgooglecontainers/heapster-amd64@sha256:3dff9b2425a196aa51df0cebde0f8b427388425ba84568721acf416fa003cd5c",name="k8s_heapster_heapster-6956dd7956-52l7k_kube-system_d3f3ab95-e098-11e7-9793-00163e0c0192_1",namespace="kube-system",pod_name="heapster-6956dd7956-52l7k"} 0
container_cpu_load_average_10s{container_name="prometheus-node-exporter",id="/kubepods/besteffort/podfa4dda06-e491-11e7-9793-00163e0c0192/79751fa851833cc01f162f11ae6aaec039136068b10c66ddf90ac7a30be9318c",image="prom/node-

Environment
k8s: 1.8.4
grafana: 4.2
prometheus: 1.8.2

  • System information:

    Linux 4.4.0-62-generic x86_64

  • Prometheus version:

    1.8.2

  • Prometheus configuration file:

apiVersion: v1
data:
  prometheus.yaml: |
    # A scrape configuration for running Prometheus on a Kubernetes cluster.
    # This uses separate scrape configs for cluster components (i.e. API server, node)
    # and services to allow each to use different authentication configs.
    #
    # Kubernetes labels will be added as Prometheus labels on metrics via the
    # `labelmap` relabeling action.
    #
    # If you are using Kubernetes 1.7.2 or earlier, please take note of the comments
    # for the kubernetes-cadvisor job; you will need to edit or remove this job.
    global:
      scrape_interval: 10s
      scrape_timeout: 10s
      evaluation_interval: 10s
    rule_files:
      - "/etc/prometheus-rules/*.rules"
    scrape_configs:
    # Scrape config for API servers.
    #
    # Kubernetes exposes API servers as endpoints to the default/kubernetes
    # service so this uses `endpoints` role and uses relabelling to only keep
    # the endpoints associated with the default/kubernetes service using the
    # default named port `https`. This works for single API server deployments as
    # well as HA API server deployments.
    - job_name: 'kubernetes-apiservers'

      kubernetes_sd_configs:
      - role: endpoints

      # Default to scraping over https. If required, just disable this or change to
      # `http`.
      scheme: https

      # This TLS & bearer token file config is used to connect to the actual scrape
      # endpoints for cluster components. This is separate to discovery auth
      # configuration because discovery & scraping are two separate concerns in
      # Prometheus. The discovery auth config is automatic if Prometheus runs inside
      # the cluster. Otherwise, more config options have to be provided within the
      # <kubernetes_sd_config>.
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        # If your node certificates are self-signed or use a different CA to the
        # master CA, then disable certificate verification below. Note that
        # certificate verification is an integral part of a secure infrastructure
        # so this should only be disabled in a controlled environment. You can
        # disable certificate verification by uncommenting the line below.
        #
        # insecure_skip_verify: true
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

      # Keep only the default/kubernetes service endpoints for the https port. This
      # will add targets for each API server which Kubernetes adds an endpoint to
      # the default/kubernetes service.
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https

    # Scrape config for nodes (kubelet).
    #
    # Rather than connecting directly to the node, the scrape is proxied though the
    # Kubernetes apiserver.  This means it will work if Prometheus is running out of
    # cluster, or can't connect to nodes for some other reason (e.g. because of
    # firewalling).
    - job_name: 'kubernetes-nodes'

      # Default to scraping over https. If required, just disable this or change to
      # `http`.
      scheme: https

      # This TLS & bearer token file config is used to connect to the actual scrape
      # endpoints for cluster components. This is separate to discovery auth
      # configuration because discovery & scraping are two separate concerns in
      # Prometheus. The discovery auth config is automatic if Prometheus runs inside
      # the cluster. Otherwise, more config options have to be provided within the
      # <kubernetes_sd_config>.
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

      kubernetes_sd_configs:
      - role: node

      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics

    # Scrape config for Kubelet cAdvisor.
    #
    # This is required for Kubernetes 1.7.3 and later, where cAdvisor metrics
    # (those whose names begin with 'container_') have been removed from the
    # Kubelet metrics endpoint.  This job scrapes the cAdvisor endpoint to
    # retrieve those metrics.
    #
    # In Kubernetes 1.7.0-1.7.2, these metrics are only exposed on the cAdvisor
    # HTTP endpoint; use "replacement: /api/v1/nodes/${1}:4194/proxy/metrics"
    # in that case (and ensure cAdvisor's HTTP server hasn't been disabled with
    # the --cadvisor-port=0 Kubelet flag).
    #
    # This job is not necessary and should be removed in Kubernetes 1.6 and
    # earlier versions, or it will cause the metrics to be scraped twice.
    - job_name: 'kubernetes-cadvisor'

      # Default to scraping over https. If required, just disable this or change to
      # `http`.
      scheme: https

      # This TLS & bearer token file config is used to connect to the actual scrape
      # endpoints for cluster components. This is separate to discovery auth
      # configuration because discovery & scraping are two separate concerns in
      # Prometheus. The discovery auth config is automatic if Prometheus runs inside
      # the cluster. Otherwise, more config options have to be provided within the
      # <kubernetes_sd_config>.
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

      kubernetes_sd_configs:
      - role: node

      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
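      # With the rules above, a node named e.g. "node-1" (hypothetical) ends up
      # being scraped through the apiserver proxy at:
      # https://kubernetes.default.svc:443/api/v1/nodes/node-1/proxy/metrics/cadvisor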

    # Scrape config for service endpoints.
    #
    # The relabeling allows the actual service scrape endpoint to be configured
    # via the following annotations:
    #
    # * `prometheus.io/scrape`: Only scrape services that have a value of `true`
    # * `prometheus.io/scheme`: If the metrics endpoint is secured then you will need
    # to set this to `https` & most likely set the `tls_config` of the scrape config.
    # * `prometheus.io/path`: If the metrics path is not `/metrics` override this.
    # * `prometheus.io/port`: If the metrics are exposed on a different port to the
    # service then set this appropriately.
    - job_name: 'kubernetes-service-endpoints'

      kubernetes_sd_configs:
      - role: endpoints

      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name
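    # For example, a service (hypothetical one shown here) opts in to scraping
    # by carrying annotations like:
    #
    #   metadata:
    #     annotations:
    #       prometheus.io/scrape: "true"
    #       prometheus.io/port: "9102"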

    # Example scrape config for probing services via the Blackbox Exporter.
    #
    # The relabeling allows the actual service scrape endpoint to be configured
    # via the following annotations:
    #
    # * `prometheus.io/probe`: Only probe services that have a value of `true`
    - job_name: 'kubernetes-services'

      metrics_path: /probe
      params:
        module: [http_2xx]

      kubernetes_sd_configs:
      - role: service

      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
        action: keep
        regex: true
      - source_labels: [__address__]
        target_label: __param_target
      - target_label: __address__
        replacement: blackbox-exporter.example.com:9115
      - source_labels: [__param_target]
        target_label: instance
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        target_label: kubernetes_name

    # Example scrape config for probing ingresses via the Blackbox Exporter.
    #
    # The relabeling allows the actual ingress scrape endpoint to be configured
    # via the following annotations:
    #
    # * `prometheus.io/probe`: Only probe services that have a value of `true`
    - job_name: 'kubernetes-ingresses'

      metrics_path: /probe
      params:
        module: [http_2xx]

      kubernetes_sd_configs:
        - role: ingress

      relabel_configs:
        - source_labels: [__meta_kubernetes_ingress_annotation_prometheus_io_probe]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_ingress_scheme,__address__,__meta_kubernetes_ingress_path]
          regex: (.+);(.+);(.+)
          replacement: ${1}://${2}${3}
          target_label: __param_target
        - target_label: __address__
          replacement: blackbox-exporter.example.com:9115
        - source_labels: [__param_target]
          target_label: instance
        - action: labelmap
          regex: __meta_kubernetes_ingress_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_ingress_name]
          target_label: kubernetes_name

    # Example scrape config for pods
    #
    # The relabeling allows the actual pod scrape endpoint to be configured via the
    # following annotations:
    #
    # * `prometheus.io/scrape`: Only scrape pods that have a value of `true`
    # * `prometheus.io/path`: If the metrics path is not `/metrics` override this.
    # * `prometheus.io/port`: Scrape the pod on the indicated port instead of the
    # pod's declared ports (default is a port-free target if none are declared).
    - job_name: 'kubernetes-pods'

      kubernetes_sd_configs:
      - role: pod

      relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name
kind: ConfigMap
metadata:
  creationTimestamp: null
  name: prometheus-core
  namespace: monitoring


Waiting for help. Thanks!

wer...@beroux.com

Jan 10, 2018, 4:40:25 AM
to Prometheus Users
I'm having the same issue, even though the Kubernetes dashboard can show CPU usage, for example.

container_cpu_load_average_10s{container_name="",id="/user.slice/user-0.slice/user@0.service",image="",name="",namespace="",pod_name=""} 0

yuzh...@megvii.com

Jan 10, 2018, 5:03:35 AM
to Prometheus Users
I have solved this issue by updating the Prometheus ConfigMap, so I think my config had some mistake in it.

On Wednesday, January 10, 2018 at 5:40:25 PM UTC+8, wer...@beroux.com wrote:

wer...@beroux.com

Jan 10, 2018, 5:09:46 AM
to Prometheus Users
I tried 3 versions:

- job_name: 'kubernetes-cadvisor'

  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true

  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

  kubernetes_sd_configs:
  - role: node

  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - target_label: __address__
    replacement: kubernetes.default.svc:443
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/${1}:4194/proxy/metrics
    #replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
    #replacement: /api/v1/proxy/nodes/${1}:4194/metrics
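
Whichever variant is used, it's worth first confirming that the scrapes themselves succeed (before looking at metric values), e.g. with:

    up{job="kubernetes-cadvisor"}

This should return 1 for every node; failing targets also show up under Status -> Targets in the Prometheus UI.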

yuzh...@megvii.com

Jan 10, 2018, 5:22:33 AM
to Prometheus Users
My k8s version is 1.8.4, and this is my whole ConfigMap; I hope it's useful for you.

# A scrape configuration for running Prometheus on a Kubernetes cluster.
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

      kubernetes_sd_configs:
      - role: node

      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

      kubernetes_sd_configs:
      - role: node

      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor

On Wednesday, January 10, 2018 at 6:09:46 PM UTC+8, wer...@beroux.com wrote:

wer...@beroux.com

Jan 10, 2018, 7:05:49 AM
to Prometheus Users
Thanks... sadly I tried it and even upgraded to 1.8.5-gke.0, but still no luck. container_cpu_load_average_10s remains zero.

Could you confirm that you see container_cpu_load_average_10s != 0 from the job "kubernetes-cadvisor"? The query would be:

container_cpu_load_average_10s{job="kubernetes-cadvisor"} != 0


PS: What use did you find for the other jobs like "kubernetes-nodes" and "kubernetes-apiservers", or do you just have them because you can?

yuzh...@megvii.com

Jan 11, 2018, 5:57:31 AM
to Prometheus Users

Before, my problem was that I couldn't get metrics like container_xxxx_xxx in the Prometheus web UI.

And this is the result image: [screenshot not included in this text archive]



On Wednesday, January 10, 2018 at 8:05:49 PM UTC+8, wer...@beroux.com wrote:

wer...@beroux.com

Jan 11, 2018, 7:04:49 AM
to Prometheus Users
Scroll to the right.

wer...@beroux.com

Jan 15, 2018, 3:06:02 AM
to Prometheus Users
OK, I found a solution, running the query:

rate(container_cpu_usage_seconds_total{id="/"}[5m])

container_cpu_load_average_10s is still zero, and container_cpu_usage_seconds_total on its own seems constant, but the query above works.
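
(That behaviour makes sense: container_cpu_usage_seconds_total is a counter, so it looks almost constant until wrapped in rate(). A per-pod variant of the same query would be something like

    sum by (namespace, pod_name) (rate(container_cpu_usage_seconds_total{image!=""}[5m]))

using the label names from the cAdvisor output earlier in the thread.)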