Blackbox - target Down


sally zhang

Mar 11, 2020, 11:00:13 AM
to Prometheus Users
Hi Experts, 

I configured blackbox-exporter to monitor Kubernetes services, but I couldn't get it to work.
Prometheus shows the targets as down with the error "connection reset by peer".
Here are my settings:

blackbox config:
    modules:
      http_2xx:
        prober: http
        http:
          method: GET
          preferred_ip_protocol: "ip4"
          valid_status_codes: [200]
      http_post_2xx:
        prober: http
        http:
          method: POST
      http_kubernetes_service:
        prober: http
        timeout: 5s
        http:
          headers:
            Accept: "*/*"
            Accept-Language: "en-US"
          tls_config:
            ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
          preferred_ip_protocol: "ip4"

Prometheus additional-scrape-configs

- job_name: blackbox-exporter-kubernetes-services
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  metrics_path: /probe
  params:
    module: [http_2xx]
  kubernetes_sd_configs:
  - role: service
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probed]
    action: keep
    regex: true
  - source_labels: [__address__]
    target_label: __param_target
  - target_label: __address__
    replacement: monitoring-blackbox-exporter.kyma-system.svc.cluster.local:9115
  - source_labels: [__param_target]
    target_label: instance
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    target_label: kubernetes_name

Error in Prometheus targets: [screenshot not preserved]

I have tried several suggestions but could not solve this. If anyone could help, I would appreciate it!

Thanks!

Abu Belal

Aug 7, 2020, 10:52:34 AM
to Prometheus Users
Hi,

Did you ever get to the bottom of this? I have a similar problem.

We run Prometheus in Kubernetes, and from other pods in the same cluster I can curl the blackbox exporter without problems.

```
- job_name: prometheus-blackbox-exporter-lon-internal
  honor_timestamps: true
  params:
    module:
    - http_2xx
  scrape_interval: 20s
  scrape_timeout: 10s
  metrics_path: /probe
  scheme: https
  static_configs:
  - targets:
    - https://www.google.com
  tls_config:
    insecure_skip_verify: true
  relabel_configs:
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    target_label: __param_target
    replacement: $1
    action: replace
  - source_labels: [__param_target]
    separator: ;
    regex: (.*)
    target_label: instance
    replacement: $1
    action: replace
  - separator: ;
    regex: (.*)
    target_label: __address__
    replacement: promblackbox-lon.xxx.internal:443
    action: replace
```

What I discovered is that if I try wget from the pod where Prometheus is running, I get this error:
```
/prometheus $ wget "https://promblackbox-lon.xxx.internal.live:443/probe?module=http_2xx&target=https%3A%2F%2Fwww.google.com"
Connecting to promblackbox-lon.sea.live:443 (10.53.10.244:443)
wget: note: TLS certificate validation not implemented
wget: short read, have only 0: Connection reset by peer
wget: error getting response: No such file or directory
```

The same command from another pod (Ubuntu) works fine.

So for whatever reason, some TLS issue in the Prometheus pod seems to be causing this. Does anyone have any ideas?

Abu Belal

Aug 10, 2020, 11:21:53 AM
to Prometheus Users
Is anyone able to help? I'm struggling to get this working :(

Christian Hoffmann

Aug 10, 2020, 4:37:20 PM
to Abu Belal, Prometheus Users
On 8/7/20 4:52 PM, Abu Belal wrote:
> What I discovered is if I try wget from the pod where Prometheus is
> running I get this error
> ```
> /prometheus $ wget
> "https://promblackbox-lon.xxx.internal.live:443/probe?module=http_2xx&target=https%3A%2F%2Fwww.google.com"
> Connecting to promblackbox-lon.sea.live:443 (10.53.10.244:443)
> wget: note: TLS certificate validation not implemented
> wget: short read, have only 0: Connection reset by peer
> wget: error getting response: No such file or directory
> ```
>
> Same command from another pod (ubuntu) works fine

Hrm, had never seen this, but a quick Google search turns up this issue:

https://github.com/docker-library/busybox/issues/80

And as I think the Prometheus docker images are based on busybox, this
might explain the wget problem.

I don't think a missing openssl implementation would cause issues for
blackbox_exporter, as it uses Go's http/tls stack, as far as I
understand. However, it might still rely on some default certificates.

I suggest trying to get more blackbox_exporter logs and maybe trying to
place a (relevant) ca bundle in the proper paths.

This article may also help:
https://www.robustperception.io/debugging-blackbox-exporter-failures
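
One way to act on the suggestion about getting more logs: blackbox_exporter returns the probe's own log output when `debug=true` is added to the `/probe` URL. Below is a minimal sketch that only builds such a URL. The helper name `probe_url` is hypothetical, and the exporter address is an assumption copied from the relabel config in the first post; substitute your own.

```python
from urllib.parse import urlencode


def probe_url(exporter: str, module: str, target: str, debug: bool = True) -> str:
    """Build a blackbox_exporter /probe URL with a percent-encoded target."""
    params = {"module": module, "target": target}
    if debug:
        # debug=true asks the exporter to include the probe's log lines
        # in the response alongside the metrics.
        params["debug"] = "true"
    return f"{exporter.rstrip('/')}/probe?{urlencode(params)}"


# Assumed exporter address (from the first post's relabel config).
EXPORTER = "http://monitoring-blackbox-exporter.kyma-system.svc.cluster.local:9115"

print(probe_url(EXPORTER, "http_2xx", "https://www.google.com"))
```

Fetching the printed URL from a pod that can reach the exporter (for example with curl) should show which step of the probe fails, which may help separate a problem on the exporter side from one on the Prometheus side.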

Kind regards,
Christian

Abu Belal

Aug 11, 2020, 3:31:49 AM
to Prometheus Users
Hi Christian,

Thank you for your response :)

I was thinking of mounting the underlying node's certs (managed Azure Kubernetes) into the Prometheus pod. Do you think that could cause problems?

Abu Belal

Aug 11, 2020, 3:55:34 AM
to Prometheus Users
I tried mounting the local certs, and it made no difference to wget; the problem still persists with Prometheus as well.

I looked through the article on troubleshooting the blackbox exporter. However, the issue appears to be with Prometheus (or its busybox container), as I can successfully connect to the blackbox exporter from other pods within the same Kubernetes cluster.

This is the error as seen from the /targets endpoint.

This seems to be a common issue, yet I have not found a solution so far.