We have 4 Prometheus agents, each running the scrape config below. The `regex` in the `keep` rule after `hashmod` differs per agent (`^0$`, `^1$`, `^2$`, and `^3$`):
```yaml
- job_name: 'kube-pods'
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    - source_labels: [__address__]
      modulus: 4
      target_label: __tmp_hash
      action: hashmod
    - source_labels: [__tmp_hash]
      regex: ^0$
      action: keep
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
      action: replace
      target_label: __metrics_path__
      regex: (.+)
    - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
      action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2
      target_label: __address__
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(.+)
    - source_labels: [__meta_kubernetes_namespace]
      action: replace
      target_label: kubernetes_namespace
    - source_labels: [__meta_kubernetes_pod_name]
      action: replace
      target_label: kubernetes_pod_name
```
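For reference, my understanding from the Prometheus relabeling source is that `hashmod` takes the MD5 of the joined source label values and uses the last 8 bytes (big-endian) modulo `modulus`. A small Python sketch to recompute the shard for a given `__address__` value (the addresses below are made up):

```python
import hashlib
import struct

def hashmod(value: str, modulus: int) -> int:
    """Recompute Prometheus's hashmod: last 8 bytes of MD5, big-endian, mod modulus."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return struct.unpack(">Q", digest[8:])[0] % modulus

# Hypothetical pod addresses as they look at service-discovery time
# (pod IP plus declared container port, before any relabeling runs).
for addr in ["10.244.1.17:8080", "10.244.1.17:9090", "10.244.2.5:8080"]:
    print(addr, "-> shard", hashmod(addr, 4))
```

Note the hash is computed on `__address__` as it exists at that point in the relabel chain, i.e. before the later rule rewrites the port from the `prometheus.io/port` annotation.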
Our expectation was that the same metric would never be scraped by two agents. That is not what happens: we see the same metrics being scraped by more than one agent.
I verified this using the query:

```
count(count({job="kube-pods"}) by (prometheus_agent,kubernetes_pod_name,name,instance)) by (kubernetes_pod_name,name,instance) > 1
```

There are also some metrics with a count of exactly 1, which means the issue is not consistent across all metrics. I'm not sure whether it's something to do with the pods, their config, or something else.
Interestingly, we don't see such duplicates for the scrape job below:
```yaml
- job_name: 'hlo-pods'
  kubernetes_sd_configs:
    - role: pod
      namespaces:
        names:
          - hlo
  relabel_configs:
    - source_labels: [__address__]
      modulus: 4
      target_label: __tmp_hash
      action: hashmod
    - source_labels: [__tmp_hash]
      regex: ^2$
      action: keep
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(.+)
    - source_labels: [__meta_kubernetes_namespace]
      action: replace
      target_label: kubernetes_namespace
    - source_labels: [__meta_kubernetes_pod_name]
      action: replace
      target_label: kubernetes_pod_name
    - source_labels: [__meta_kubernetes_pod_container_name]
      action: replace
      target_label: kubernetes_container_name
```
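To convince myself that the `hashmod`/`keep` pair on its own partitions targets cleanly, I sketched the per-agent keep decision in Python (the `hashmod` function is reimplemented from my reading of the Prometheus relabeling code; the target addresses are hypothetical):

```python
import hashlib
import struct

def hashmod(value: str, modulus: int) -> int:
    # Last 8 bytes of MD5, big-endian, modulo the shard count (as in Prometheus relabel).
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return struct.unpack(">Q", digest[8:])[0] % modulus

def agents_keeping(address: str, num_agents: int = 4) -> list[int]:
    # Agent i keeps the target when its __tmp_hash matches the regex ^i$.
    return [i for i in range(num_agents) if hashmod(address, num_agents) == i]

# Every distinct __address__ string should be kept by exactly one agent.
targets = [f"10.244.{i}.{j}:8080" for i in range(3) for j in range(1, 6)]
for addr in targets:
    assert len(agents_keeping(addr)) == 1
```

So for any single `__address__` string the sharding is disjoint; if duplicates appear, the same pod must somehow be reaching the hash step under more than one address.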
Any clue as to what I should check, or what the possible cause could be?