We want to health-check our servers with the Blackbox exporter's ICMP probe. The Blackbox exporter host and the Prometheus server both have two interfaces, on these subnets:
172.16.76.0/22
172.20.0.0/22
The problem is that the Blackbox exporter can ping servers whose network interface is on
172.16.76.0/22, but it cannot ping servers whose network interface is on
172.20.0.0/22, even though I can ping them manually from both the Blackbox exporter host and the Prometheus server:
PING 172.20.3.29 (172.20.3.29) 56(84) bytes of data.
64 bytes from 172.20.3.29: icmp_seq=1 ttl=63 time=0.187 ms
64 bytes from 172.20.3.29: icmp_seq=2 ttl=63 time=0.294 ms
64 bytes from 172.20.3.29: icmp_seq=3 ttl=63 time=0.218 ms
64 bytes from 172.20.3.29: icmp_seq=4 ttl=63 time=0.211 ms
^C
--- 172.20.3.29 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3002ms
rtt min/avg/max/mdev = 0.187/0.227/0.294/0.042 ms
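For comparison with the probe, it may help to know which source address the kernel picks on its own for this destination, since a plain ping lets the kernel choose. Below is a minimal sketch, assuming Python 3 is available on the exporter host; the function name kernel_source_ip is made up for illustration and is not part of any tool mentioned here.

```python
import socket


def kernel_source_ip(target: str) -> str:
    """Return the local IPv4 address the kernel would use to reach target."""
    # Connecting a UDP socket sends no packets. It only makes the kernel
    # do a route lookup and bind a matching local address to the socket.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect((target, 9))  # port 9 is arbitrary, nothing is sent
        return s.getsockname()[0]


# Example call on the exporter host (result depends on its routing table):
#   kernel_source_ip("172.20.3.29")
```

If the address returned for 172.20.3.29 differs from the source_ip_address configured in the module, the manual ping and the probe are not sending from the same address.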
Here is the additional information:
Kernel on the Blackbox exporter host and Prometheus server:
4.4.0-176-generic #206-Ubuntu SMP Fri Feb 28 05:02:04 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Blackbox exporter version:
blackbox_exporter, version 0.16.0 (branch: HEAD, revision: 991f898)
build user: root@64f600555645
build date: 20191111-16:27:24
go version: go1.13.4
blackbox.yml module config:
server_health_check:
  prober: icmp
  timeout: 15s
  icmp:
    preferred_ip_protocol: "ip4"
    source_ip_address: "172.16.76.147"
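To make the addressing explicit, here is a small sketch that checks which of the two subnets the configured source address and a failing target belong to. It uses only the standard ipaddress module; the helper names subnet_of and same_subnet are hypothetical. Whether this mismatch is actually the cause is something I have not confirmed.

```python
import ipaddress

# The two subnets both hosts are attached to (from the description above).
SUBNETS = [
    ipaddress.ip_network("172.16.76.0/22"),
    ipaddress.ip_network("172.20.0.0/22"),
]


def subnet_of(ip: str):
    """Return the known subnet containing ip, or None if it is in neither."""
    addr = ipaddress.ip_address(ip)
    for net in SUBNETS:
        if addr in net:
            return net
    return None


def same_subnet(a: str, b: str) -> bool:
    """True only if both addresses fall inside the same known subnet."""
    net_a, net_b = subnet_of(a), subnet_of(b)
    return net_a is not None and net_a == net_b


print(subnet_of("172.16.76.147"))  # configured source_ip_address
print(subnet_of("172.20.3.29"))    # target that fails to answer the probe
print(same_subnet("172.16.76.147", "172.20.3.29"))
```

The configured source address sits in 172.16.76.0/22 while the failing target sits in 172.20.0.0/22, so the probe sends from an address on the other subnet.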
prometheus.yml scrape config:
- job_name: blackbox_healthcheck
  scrape_interval: 30s
  scrape_timeout: 15s
  metrics_path: /probe
  params:
    module: [server_health_check]
  file_sd_configs:
    - files:
        - 'file_sd/opennebula_vms.yml'
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: plugins-01.tool.x.y.z:9115
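As I understand the relabeling, each discovered target ends up in the target query parameter and the scrape itself goes to the exporter. A rough sketch of the resulting URL, assuming the default http scheme and a target listed as a bare IP in the file_sd file; probe_url is a hypothetical helper, not a Prometheus API.

```python
from urllib.parse import urlencode


def probe_url(exporter: str, module: str, target: str, debug: bool = False) -> str:
    """Build the URL Prometheus effectively scrapes after relabeling."""
    params = {"module": module, "target": target}
    if debug:
        # Extra parameter for a manual request, not set by the scrape config.
        params["debug"] = "true"
    return f"http://{exporter}/probe?{urlencode(params)}"


print(probe_url("plugins-01.tool.x.y.z:9115", "server_health_check", "172.20.3.29"))
```

The same URL can be requested by hand from the Prometheus server to reproduce a single probe outside the scrape loop.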
Output after adding &debug=true to the probe URL:
Logs for the probe:
ts=2020-05-05T16:07:39.322818925Z caller=main.go:304 module=server_health_check target=172.20.3.29 level=info msg="Beginning probe" probe=icmp timeout_seconds=15
ts=2020-05-05T16:07:39.322941208Z caller=icmp.go:82 module=server_health_check target=172.20.3.29 level=info msg="Resolving target address" ip_protocol=ip4
ts=2020-05-05T16:07:39.322962788Z caller=icmp.go:82 module=server_health_check target=172.20.3.29 level=info msg="Resolved target address" ip=172.20.3.29
ts=2020-05-05T16:07:39.322975038Z caller=main.go:119 module=server_health_check target=172.20.3.29 level=info msg="Using source address" srcIP=172.16.76.147
ts=2020-05-05T16:07:39.322990936Z caller=main.go:119 module=server_health_check target=172.20.3.29 level=info msg="Creating socket"
ts=2020-05-05T16:07:39.323043107Z caller=main.go:119 module=server_health_check target=172.20.3.29 level=info msg="Creating ICMP packet" seq=33370 id=29272
ts=2020-05-05T16:07:39.323060805Z caller=main.go:119 module=server_health_check target=172.20.3.29 level=info msg="Writing out packet"
ts=2020-05-05T16:07:39.323151009Z caller=main.go:119 module=server_health_check target=172.20.3.29 level=info msg="Waiting for reply packets"
ts=2020-05-05T16:07:54.322978818Z caller=main.go:119 module=server_health_check target=172.20.3.29 level=warn msg="Timeout reading from socket" err="read ip4 172.16.76.147: i/o timeout"
ts=2020-05-05T16:07:54.323073248Z caller=main.go:304 module=server_health_check target=172.20.3.29 level=error msg="Probe failed" duration_seconds=15.000189909
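To confirm that the probe waits for the whole timeout rather than failing early, the gap between the "Waiting for reply packets" and "Timeout reading from socket" lines can be computed from the timestamps. A minimal sketch, assuming the log format shown above; the regular expression and function names are my own.

```python
import re
from datetime import datetime, timezone

# Captures the ts= value and the quoted msg= value of one log line.
LINE_RE = re.compile(r'ts=(\S+) .*? msg="([^"]*)"')


def ts_to_ns(ts: str) -> int:
    """Convert a timestamp like 2020-05-05T16:07:39.323151009Z to epoch ns."""
    whole, _, frac = ts.rstrip("Z").partition(".")
    dt = datetime.strptime(whole, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    # Pad the fraction to 9 digits in case trailing zeros were trimmed.
    return int(dt.timestamp()) * 1_000_000_000 + int(frac.ljust(9, "0"))


def wait_seconds(lines) -> float:
    """Seconds between starting to wait for a reply and the read timeout."""
    stamps = {}
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            stamps[m.group(2)] = ts_to_ns(m.group(1))
    delta = stamps["Timeout reading from socket"] - stamps["Waiting for reply packets"]
    return delta / 1e9


LOG = [
    'ts=2020-05-05T16:07:39.323151009Z caller=main.go:119 module=server_health_check target=172.20.3.29 level=info msg="Waiting for reply packets"',
    'ts=2020-05-05T16:07:54.322978818Z caller=main.go:119 module=server_health_check target=172.20.3.29 level=warn msg="Timeout reading from socket" err="read ip4 172.16.76.147: i/o timeout"',
]

print(wait_seconds(LOG))
```

For the two lines above the gap is just under the configured 15s timeout, which suggests the packet is written successfully and no reply ever reaches the socket bound to 172.16.76.147.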
What I get when I ping an instance on the
172.16.76.0/22 subnet: