How to scrape all K8s etcd metrics?

Lijing Zhang

Nov 13, 2018, 11:03:32 PM
to Prometheus Users
Hi, 

We are using Prometheus v2.3.1 deployed in Kubernetes and now want to collect metrics from the Kubernetes etcd. Currently we can only get the following ones:
  • etcd_helper_cache_entry_count
  • etcd_helper_cache_hit_count
  • etcd_helper_cache_miss_count
  • etcd_object_counts
  • etcd_request_cache_add_latencies_summary
  • etcd_request_cache_add_latencies_summary_count
  • etcd_request_cache_add_latencies_summary_sum
  • etcd_request_cache_get_latencies_summary
  • etcd_request_cache_get_latencies_summary_count
  • etcd_request_cache_get_latencies_summary_sum

But we cannot get these:
  • etcd_disk_backend_commit_duration_seconds_bucket
  • etcd_disk_backend_snapshot_duration_seconds_bucket
  • etcd_disk_wal_fsync_duration_seconds_bucket
  • etcd_http_received_total
  • etcd_http_failed_total

Could you please advise why only some of the etcd metrics can be scraped?

The scrape_configs section of our prometheus.yml is attached:
scrape_configs:
- job_name: prometheus
  static_configs:
  - targets:
    - localhost:9090
- bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  job_name: kubernetes-apiservers
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - action: keep
    regex: default;kubernetes;https
    source_labels:
    - __meta_kubernetes_namespace
    - __meta_kubernetes_service_name
    - __meta_kubernetes_endpoint_port_name
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
- bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  job_name: kubernetes-nodes
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - replacement: kubernetes.default.svc:443
    target_label: __address__
  - regex: (.+)
    replacement: /api/v1/nodes/${1}/proxy/metrics
    source_labels:
    - __meta_kubernetes_node_name
    target_label: __metrics_path__
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
- bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  job_name: kubernetes-nodes-cadvisor
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - replacement: kubernetes.default.svc:443
    target_label: __address__
  - regex: (.+)
    replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
    source_labels:
    - __meta_kubernetes_node_name
    target_label: __metrics_path__
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
- job_name: kubernetes-service-endpoints
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - action: keep
    regex: true
    source_labels:
    - __meta_kubernetes_service_annotation_prometheus_io_scrape
  - action: replace
    regex: (https?)
    source_labels:
    - __meta_kubernetes_service_annotation_prometheus_io_scheme
    target_label: __scheme__
  - action: replace
    regex: (.+)
    source_labels:
    - __meta_kubernetes_service_annotation_prometheus_io_path
    target_label: __metrics_path__
  - action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    source_labels:
    - __address__
    - __meta_kubernetes_service_annotation_prometheus_io_port
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - action: replace
    source_labels:
    - __meta_kubernetes_namespace
    target_label: kubernetes_namespace
  - action: replace
    source_labels:
    - __meta_kubernetes_service_name
    target_label: kubernetes_name
- honor_labels: true
  job_name: prometheus-pushgateway
  kubernetes_sd_configs:
  - role: service
  relabel_configs:
  - action: keep
    regex: pushgateway
    source_labels:
    - __meta_kubernetes_service_annotation_prometheus_io_probe
- job_name: kubernetes-services
  kubernetes_sd_configs:
  - role: service
  metrics_path: /probe
  params:
    module:
    - http_2xx
  relabel_configs:
  - action: keep
    regex: true
    source_labels:
    - __meta_kubernetes_service_annotation_prometheus_io_probe
  - source_labels:
    - __address__
    target_label: __param_target
  - replacement: blackbox
    target_label: __address__
  - source_labels:
    - __param_target
    target_label: instance
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: kubernetes_namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: kubernetes_name
- job_name: kubernetes-pods
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - action: keep
    regex: true
    source_labels:
    - __meta_kubernetes_pod_annotation_prometheus_io_scrape
  - action: replace
    regex: (.+)
    source_labels:
    - __meta_kubernetes_pod_annotation_prometheus_io_path
    target_label: __metrics_path__
  - action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    source_labels:
    - __address__
    - __meta_kubernetes_pod_annotation_prometheus_io_port
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - action: replace
    source_labels:
    - __meta_kubernetes_namespace
    target_label: kubernetes_namespace
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_name
    target_label: kubernetes_pod_name

Thanks in advance.

Lijing Zhang

Nov 14, 2018, 6:14:04 AM
to Prometheus Users
After adding this job I can scrape all metrics from ETCD
- job_name: etcd
  scrape_interval: 1m
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: https
  static_configs:
  - targets:
    - <ip_addr>:2379
  tls_config:
    ca_file: /etc/etcd/ssl/ca.pem
    cert_file: /etc/etcd/ssl/etcd-client.pem
    key_file: /etc/etcd/ssl/etcd-client-key.pem
    insecure_skip_verify: true

However, this requires writing the IP address manually. How can I discover the etcd IP address automatically?

On Wednesday, November 14, 2018 at 12:03:32 PM UTC+8, Lijing Zhang wrote:

Tristan Colgate

Nov 14, 2018, 8:02:07 AM
to Lijing Zhang, Prometheus Users
On our kops clusters I use a `role: node` scrape, match `__meta_kubernetes_node_label_kubernetes_io_role` against `master`, and then target the etcd ports directly (using an `__address__` relabel rule). I repeat this rule for the etcd-events instance.


- job_name: 'kubernetes_nodes_masters_etcd'
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - action: keep
    source_labels: [__meta_kubernetes_node_label_kubernetes_io_role]
    regex: master
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - source_labels: [__address__]
    target_label: __address__
    regex: "([^:]+):.+"
    replacement: "${1}:4001"
  - target_label: job
    replacement: "etcd"
- job_name: 'kubernetes_nodes_masters_etcd-events'
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - action: keep
    source_labels: [__meta_kubernetes_node_label_kubernetes_io_role]
    regex: master
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - source_labels: [__address__]
    target_label: __address__
    regex: "([^:]+):.+"
    replacement: "${1}:4002"
  - target_label: job
    replacement: "etcd-events"
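
To make the `__address__` rule above concrete, here is a small annotated sketch of what the rewrite does to one target. The node IP `10.0.0.5` is a made-up example, and the incoming port is assumed to be the kubelet port (typically 10250) that `role: node` discovery puts into `__address__`.

```yaml
# Sketch only: the same rewrite rule as above, annotated with a hypothetical target.
# Assumed discovered value (hypothetical node IP, kubelet port):
#   __address__ = "10.0.0.5:10250"
relabel_configs:
- source_labels: [__address__]
  target_label: __address__
  regex: "([^:]+):.+"        # group 1 captures the host part: "10.0.0.5"
  replacement: "${1}:4001"   # resulting __address__: "10.0.0.5:4001"
```

Because `[^:]+` stops at the first colon, this pattern assumes IPv4 addresses or hostnames; a bare IPv6 node address would likely need a different regex.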



Lijing Zhang

Nov 14, 2018, 8:06:10 AM
to Prometheus Users
Sincere thanks, Tristan!

But what about the TLS certs? It seems "kubernetes_sd_config" and "etcd" use different cert files...

On Wednesday, November 14, 2018 at 9:02:07 PM UTC+8, Tristan Colgate wrote:

Tristan Colgate

Nov 14, 2018, 8:12:00 AM
to Lijing Zhang, Prometheus Users
Getting the etcd certs might be problematic (in the example I provided, it's just plain HTTP at the moment). You can either skip verification (and hope you don't need a valid client cert), or actually get etcd's CA certs into your Prometheus container and explicitly set the tls_config. If you need a client cert, you'll have to manually issue an additional cert from the CA. Things will get a little more fragile (you'll need to manage that cert).
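
As a rough sketch of the second option, the node-based discovery can be combined with an explicit scrape-level tls_config. This is untested and rests on several assumptions: the job name is hypothetical, the cert paths and port 2379 are borrowed from the static etcd job earlier in the thread, those files are assumed to be mounted into the Prometheus container (for example from a Secret), and the master node label is assumed to match the kops-style label used above.

```yaml
# Sketch only: node discovery plus etcd client-certificate TLS.
- job_name: 'kubernetes_nodes_masters_etcd_tls'   # hypothetical job name
  scheme: https
  tls_config:
    # Paths copied from the static etcd job earlier in the thread; assumed to be
    # mounted into the Prometheus container.
    ca_file: /etc/etcd/ssl/ca.pem
    cert_file: /etc/etcd/ssl/etcd-client.pem
    key_file: /etc/etcd/ssl/etcd-client-key.pem
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  # Keep only master nodes (label name assumed to match your cluster).
  - action: keep
    source_labels: [__meta_kubernetes_node_label_kubernetes_io_role]
    regex: master
  # Point the scrape at the etcd client port instead of the kubelet port.
  - source_labels: [__address__]
    target_label: __address__
    regex: "([^:]+):.+"
    replacement: "${1}:2379"
  - target_label: job
    replacement: "etcd"
```

The job-level tls_config here only applies to the scrape itself, not to the Kubernetes API calls made for discovery. If the etcd server certificate does not list the node address it is scraped on, verification will fail, and the fallback is insecure_skip_verify: true as in the static job.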

Lijing Zhang

Nov 14, 2018, 8:29:38 AM
to Prometheus Users
I checked again. Curiously, etcd's ca.pem and the ca.crt used by the kubernetes_sd_config jobs are identical (same content).
But etcd additionally uses a client certificate and key (cert_file: /etc/etcd/ssl/etcd-client.pem and key_file: /etc/etcd/ssl/etcd-client-key.pem).

So I can give it a try...

On Wednesday, November 14, 2018 at 9:12:00 PM UTC+8, Tristan Colgate wrote: