Hello everyone,
What did you do?
Tried to run a Prometheus server on an Ubuntu server and scrape metrics from node_exporter instances running on a few private-IP servers (see the Environment section).
Used a basic config which scrapes these private-IP servers, via their private IP addresses, every 60 seconds.
Noticed that these very basic targets were showing as DOWN in the Prometheus dashboard targets view, with the error: "context deadline exceeded"
What did you expect to see?
Both target groups reporting as UP and scrapes completing quickly
What did you see instead? Under which circumstances?
DOWN "context deadline exceeded"
Sometimes, it will become a down state. a few hours later, It will become the up state , but I'm not doing anything..
It looks like that the scrape request timed out. But when I set scrape_timeout: 120s
, still appear "context, deadline exceeded" error
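For reference, how long each scrape actually takes can be checked in the Prometheus expression browser with a query along these lines (the job name is the one from the config below):

scrape_duration_seconds{job="Video_node_exporter"}

If those values approach the configured scrape_timeout, the scrape itself is slow rather than the target being unreachable.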
Environment
System information:
Prometheus version:
prometheus, version 1.7.1 (branch: master, revision: 3afb3fffa3a29c3de865e1172fb740442e9d0133)
build user: root@0aa1b7fc430d
build date: 20170612-11:44:05
go version: go1.8.3
global:
  scrape_interval: 30s
  evaluation_interval: 30s
  external_labels:
    monitor: 'Video-monitor'

rule_files:
  - "rules/alert.rules"

scrape_configs:
  - job_name: 'Video_prometheus'
    static_configs:
      - targets: ['192.168.1.13:9090']
        labels:
          alias: video-prometheus

  - job_name: 'Video_node_exporter'
    scrape_interval: 120s
    scrape_timeout: 120s
    static_configs:
      - targets: ['192.168.1.13:9100']
        labels:
          alias: video-prometheus
      - targets: ['192.168.1.2:9100']
        labels:
          alias: video-master
      - targets: ['192.168.1.3:9100']
        labels:
          alias: video-node01
      - targets: ['192.168.1.4:9100']
        labels:
          alias: video-node02
      - targets: ['192.168.1.5:9100']
        labels:
          alias: video-node03
      - targets: ['192.168.1.6:9100']
        labels:
          alias: video-node04
      - targets: ['192.168.1.7:9100']
        labels:
          alias: video-node05
      - targets: ['192.168.1.8:9100']
        labels:
          alias: video-node06
… … … …
time="2017-10-25T12:39:05+08:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
time="2017-10-25T12:39:24+08:00" level=info msg="Done checkpointing in-memory metrics and chunks in 19.442712172s." source="persistence.go:665"
time="2017-10-25T12:40:32+08:00" level=warning msg="Storage has entered rushed mode." chunksToPersist=51609 memoryChunks=401057 source="storage.go:1867" urgencyScore=1
time="2017-10-25T12:40:32+08:00" level=error msg="Storage needs throttling. Scrapes and rule evaluations will be skipped." chunksToPersist=51457 memoryChunks=385939 source="storage.go:1007" urgencyScore=1
time="2017-10-25T12:40:48+08:00" level=info msg="Storage has left rushed mode." chunksToPersist=46416 memoryChunks=387607 source="storage.go:1857" urgencyScore=0.678
time="2017-10-25T12:40:48+08:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
time="2017-10-25T12:41:15+08:00" level=warning msg="Storage has entered rushed mode." chunksToPersist=47004 memoryChunks=390279 source="storage.go:1867" urgencyScore=0.824
time="2017-10-25T12:41:33+08:00" level=info msg="Done checkpointing in-memory metrics and chunks in 45.583003977s." source="persistence.go:665"
time="2017-10-25T12:41:35+08:00" level=info msg="Storage does not need throttling anymore." chunksToPersist=47012 memoryChunks=390413 source="storage.go:1015" urgencyScore=0.824
time="2017-10-25T12:41:50+08:00" level=info msg="Storage has left rushed mode." chunksToPersist=45292 memoryChunks=391119 source="storage.go:1857" urgencyScore=0.643
time="2017-10-25T12:42:19+08:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
time="2017-10-25T12:42:45+08:00" level=info msg="Done checkpointing in-memory metrics and chunks in 26.396031303s." source="persistence.go:665"
time="2017-10-25T12:45:19+08:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
time="2017-10-25T12:45:54+08:00" level=info msg="Done checkpointing in-memory metrics and chunks in 34.336368215s." source="persistence.go:665"
time="2017-10-25T12:48:23+08:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
time="2017-10-25T12:48:47+08:00" level=info msg="Done checkpointing in-memory metrics and chunks in 24.752371907s." source="persistence.go:665"
time="2017-10-25T12:52:05+08:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
time="2017-10-25T12:52:26+08:00" level=info msg="Done checkpointing in-memory metrics and chunks in 21.347779175s." source="persistence.go:665"
time="2017-10-25T12:55:19+08:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
time="2017-10-25T12:55:47+08:00" level=info msg="Done checkpointing in-memory metrics and chunks in 27.398922414s." source="persistence.go:665"
time="2017-10-25T12:59:01+08:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
time="2017-10-25T12:59:19+08:00" level=info msg="Done checkpointing in-memory metrics and chunks in 17.44142578s." source="persistence.go:665"
time="2017-10-25T13:02:00+08:00" level=warning msg="Storage has entered rushed mode." chunksToPersist=42452 memoryChunks=382769 source="storage.go:1867" urgencyScore=0.894
time="2017-10-25T13:02:01+08:00" level=info msg="Storage has left rushed mode." chunksToPersist=41925 memoryChunks=382769 source="storage.go:1857" urgencyScore=0.685
time="2017-10-25T13:02:01+08:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
time="2017-10-25T13:02:30+08:00" level=info msg="Done checkpointing in-memory metrics and chunks in 29.087830198s." source="persistence.go:665"
time="2017-10-25T13:02:37+08:00" level=warning msg="Storage has entered rushed mode." chunksToPersist=41838 memoryChunks=383493 source="storage.go:1867" urgencyScore=0.844
time="2017-10-25T13:02:40+08:00" level=info msg="Storage has left rushed mode." chunksToPersist=40930 memoryChunks=383563 source="storage.go:1857" urgencyScore=0.641
time="2017-10-25T13:03:09+08:00" level=info msg="Completed maintenance sweep through 56789 archived fingerprints in 6h31m42.655936538s." source="storage.go:1442"
time="2017-10-25T13:03:32+08:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
time="2017-10-25T13:04:00+08:00" level=info msg="Done checkpointing in-memory metrics and chunks in 28.79803733s." source="persistence.go:665"
time="2017-10-25T13:06:53+08:00" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
node_exporter, version 0.14.0 (branch: master, revision: 840ba5dcc71a084a3bc63cb6063003c1f94435a6)
build user: root@bb6d0678e7f3
build date: 20170321-12:12:54
go version: go1.7.5
netstat -anpt |grep 9100
tcp 0 0 192.168.1.13:54006 192.168.1.23:9100 FIN_WAIT2 -
tcp 0 0 192.168.1.13:55320 192.168.1.30:9100 FIN_WAIT2 -
tcp 0 0 192.168.1.13:34320 192.168.1.26:9100 FIN_WAIT2 -
tcp 0 0 192.168.1.13:59492 192.168.1.11:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:56348 192.168.1.4:9100 FIN_WAIT2 -
tcp 0 0 192.168.1.13:59468 192.168.1.12:9100 FIN_WAIT2 -
tcp 0 0 192.168.1.13:41684 192.168.1.31:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:34608 192.168.1.26:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:59756 192.168.1.12:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:43668 192.168.1.34:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:45682 192.168.1.32:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:60682 192.168.1.6:9100 FIN_WAIT2 -
tcp 0 0 192.168.1.13:36934 192.168.1.27:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:60298 192.168.1.20:9100 FIN_WAIT2 -
tcp 0 0 192.168.1.13:55608 192.168.1.30:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:37170 192.168.1.10:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:45394 192.168.1.32:9100 FIN_WAIT2 -
tcp 0 0 192.168.1.13:41396 192.168.1.31:9100 FIN_WAIT2 -
tcp 0 0 192.168.1.13:54294 192.168.1.23:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:42906 192.168.1.22:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:34140 192.168.1.5:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:58214 192.168.1.35:9100 FIN_WAIT2 -
tcp 0 0 192.168.1.13:36440 192.168.1.8:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:60586 192.168.1.20:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:41794 192.168.1.21:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:58502 192.168.1.35:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:56636 192.168.1.4:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:59204 192.168.1.11:9100 FIN_WAIT2 -
tcp 0 0 192.168.1.13:60970 192.168.1.6:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:51494 192.168.1.3:9100 FIN_WAIT2 -
tcp 0 0 192.168.1.13:51994 192.168.1.24:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:55176 192.168.1.7:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:59764 192.168.1.29:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:44736 192.168.1.28:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:51782 192.168.1.3:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:46882 192.168.1.9:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:48168 192.168.1.25:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:42618 192.168.1.22:9100 FIN_WAIT2 -
tcp 0 0 192.168.1.13:48366 192.168.1.33:9100 ESTABLISHED 73915/prometheus
tcp 0 0 192.168.1.13:41506 192.168.1.21:9100 FIN_WAIT2 -
tcp 0 0 192.168.1.13:36646 192.168.1.27:9100 FIN_WAIT2 -
tcp 0 0 192.168.1.13:51706 192.168.1.24:9100 FIN_WAIT2 -
tcp6 0 0 :::9100 :::* LISTEN 2018/node_exporter
netstat -anpt |grep 9100 |grep 192.168.1.10
tcp 0 0 192.168.1.13:37458 192.168.1.10:9100 FIN_WAIT2 -
tcp 0 0 192.168.1.13:37746 192.168.1.10:9100 ESTABLISHED 73915/prometheus
tcpdump -i ens224 |grep 192.168.1.13.37746
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens224, link-type EN10MB (Ethernet), capture size 262144 bytes
18:04:10.891665 IP 192.168.1.13.37746 > localhost.example.9100: Flags [F.], seq 1723464181, ack 3655520680, win 229, options [nop,nop,TS val 806110754 ecr 1021397731], length 0
18:04:10.931419 IP localhost.example.9100 > 192.168.1.13.37746: Flags [.], ack 1, win 235, options [nop,nop,TS val 1021427741 ecr 806110754], length 0
18:04:42.805920 IP localhost.example.9100 > 192.168.1.13.37746: Flags [P.], seq 1:4097, ack 1, win 235, options [nop,nop,TS val 1021435709 ecr 806110754], length 4096
18:04:42.805945 IP localhost.example.9100 > 192.168.1.13.37746: Flags [.], seq 4097:11337, ack 1, win 235, options [nop,nop,TS val 1021435709 ecr 806110754], length 7240
18:04:42.805951 IP localhost.example.9100 > 192.168.1.13.37746: Flags [.], seq 11337:14233, ack 1, win 235, options [nop,nop,TS val 1021435709 ecr 806110754], length 2896
18:04:42.806158 IP 192.168.1.13.37746 > localhost.example.9100: Flags [R], seq 1723464182, win 0, length 0
18:04:42.806177 IP 192.168.1.13.37746 > localhost.example.9100: Flags [R], seq 1723464182, win 0, length 0
18:04:42.806185 IP 192.168.1.13.37746 > localhost.example.9100: Flags [R], seq 1723464182, win 0, length 0
18:04:42.806188 IP 192.168.1.13.37746 > localhost.example.9100: Flags [R], seq 1723464182, win 0, length 0
18:04:42.806190 IP 192.168.1.13.37746 > localhost.example.9100: Flags [R], seq 1723464182, win 0, length 0
18:04:42.806192 IP 192.168.1.13.37746 > localhost.example.9100: Flags [R], seq 1723464182, win 0, length 0
netstat -anpt |grep 9100
tcp6 0 0 :::9100 :::* LISTEN 2018/node_exporter
tcp6 0 0 192.168.1.13:9100 192.168.1.13:47460 TIME_WAIT -
A few questions:
* What version of node_exporter?
* Is this VM/cloud or bare metal?
* What flags are you passing to the node_exporter?
On Wed, Oct 25, 2017 at 11:51 AM, Meng Zhao <zhaome...@gmail.com> wrote:
./node_exporter -log.level debug -web.listen-address ":9101"
root@video-node08:/opt/node_exporter-0.14.0.linux-amd64# ./node_exporter -log.level debug -web.listen-address ":9101"
INFO[0000] Starting node_exporter (version=0.14.0, branch=master, revision=840ba5dcc71a084a3bc63cb6063003c1f94435a6) source="node_exporter.go:140"
INFO[0000] Build context (go=go1.7.5, user=root@bb6d0678e7f3, date=20170321-12:12:54) source="node_exporter.go:141"
INFO[0000] No directory specified, see --collector.textfile.directory source="textfile.go:57"
INFO[0000] Enabled collectors: source="node_exporter.go:160"
INFO[0000] - textfile source="node_exporter.go:162"
INFO[0000] - time source="node_exporter.go:162"
INFO[0000] - zfs source="node_exporter.go:162"
INFO[0000] - filesystem source="node_exporter.go:162"
INFO[0000] - hwmon source="node_exporter.go:162"
INFO[0000] - infiniband source="node_exporter.go:162"
INFO[0000] - sockstat source="node_exporter.go:162"
INFO[0000] - diskstats source="node_exporter.go:162"
INFO[0000] - edac source="node_exporter.go:162"
INFO[0000] - mdadm source="node_exporter.go:162"
INFO[0000] - meminfo source="node_exporter.go:162"
INFO[0000] - uname source="node_exporter.go:162"
INFO[0000] - wifi source="node_exporter.go:162"
INFO[0000] - entropy source="node_exporter.go:162"
INFO[0000] - filefd source="node_exporter.go:162"
INFO[0000] - loadavg source="node_exporter.go:162"
INFO[0000] - netstat source="node_exporter.go:162"
INFO[0000] - conntrack source="node_exporter.go:162"
INFO[0000] - netdev source="node_exporter.go:162"
INFO[0000] - stat source="node_exporter.go:162"
INFO[0000] - vmstat source="node_exporter.go:162"
INFO[0000] Listening on :9101 source="node_exporter.go:186"
DEBU[0005] OK: mdadm collector succeeded after 0.000052s. source="node_exporter.go:98"
DEBU[0005] OK: sockstat collector succeeded after 0.000415s. source="node_exporter.go:98"
DEBU[0005] Unable to detect InfiniBand devices source="infiniband_linux.go:110"
DEBU[0005] OK: infiniband collector succeeded after 0.000092s. source="node_exporter.go:98"
DEBU[0005] Return time: 1509334424.000000 source="time.go:47"
DEBU[0005] OK: time collector succeeded after 0.000026s. source="node_exporter.go:98"
DEBU[0005] Cannot open "/proc/spl/kstat/zfs/arcstats" for reading. Is the kernel module loaded? source="zfs_linux.go:32"
DEBU[0005] ZFS / ZFS statistics are not available source="zfs.go:62"
DEBU[0005] OK: textfile collector succeeded after 0.000001s. source="node_exporter.go:98"
DEBU[0005] OK: zfs collector succeeded after 0.000056s. source="node_exporter.go:98"
DEBU[0005] OK: vmstat collector succeeded after 0.000347s. source="node_exporter.go:98"
DEBU[0005] OK: netstat collector succeeded after 0.001169s. source="node_exporter.go:98"
DEBU[0005] OK: wifi collector succeeded after 0.000100s. source="node_exporter.go:98"
DEBU[0005] OK: uname collector succeeded after 0.001157s. source="node_exporter.go:98"
DEBU[0005] Ignoring device: sda2 source="diskstats_linux.go:175"
DEBU[0005] Ignoring device: sda5 source="diskstats_linux.go:175"
DEBU[0005] Ignoring device: loop0 source="diskstats_linux.go:175"
DEBU[0005] Ignoring device: loop2 source="diskstats_linux.go:175"
DEBU[0005] Ignoring device: sda1 source="diskstats_linux.go:175"
DEBU[0005] Ignoring device: loop5 source="diskstats_linux.go:175"
DEBU[0005] Ignoring device: loop7 source="diskstats_linux.go:175"
DEBU[0005] Ignoring device: loop3 source="diskstats_linux.go:175"
DEBU[0005] Ignoring device: loop1 source="diskstats_linux.go:175"
DEBU[0005] Ignoring device: loop4 source="diskstats_linux.go:175"
DEBU[0005] Ignoring device: loop6 source="diskstats_linux.go:175"
DEBU[0005] OK: diskstats collector succeeded after 0.000961s. source="node_exporter.go:98"
DEBU[0005] OK: entropy collector succeeded after 0.000031s. source="node_exporter.go:98"
DEBU[0005] OK: filefd collector succeeded after 0.000036s. source="node_exporter.go:98"
DEBU[0005] OK: netdev collector succeeded after 0.002577s. source="node_exporter.go:98"
DEBU[0005] return load 0: 8.060000 source="loadavg.go:51"
DEBU[0005] return load 1: 5.540000 source="loadavg.go:51"
DEBU[0005] return load 2: 4.250000 source="loadavg.go:51"
DEBU[0005] OK: loadavg collector succeeded after 0.002898s. source="node_exporter.go:98"
DEBU[0005] Set node_mem: map[string]float64{"PageTables":9.580544e+06, "WritebackTmp":0, "VmallocTotal":3.5184372087808e+13, "Buffers":4.12995584e+08, "Cached":1.2282290176e+10, "Inactive_file":1.1768369152e+10, "SwapTotal":2.143285248e+09, "Writeback":0, "VmallocUsed":0, "HugePages_Surp":0, "Hugepagesize":2.097152e+06, "HugePages_Free":0, "HugePages_Rsvd":0, "SwapFree":2.094804992e+09, "SReclaimable":3.0253056e+08, "KernelStack":1.06496e+07, "NFS_Unstable":1.7522688e+07, "CmaTotal":0, "Active":1.733500928e+09, "Inactive_anon":1.73965312e+08, "Dirty":7.098368e+06, "Shmem":1.272832e+08, "HugePages_Total":0, "HardwareCorrupted":0, "AnonHugePages":2.81018368e+08, "CmaFree":0, "MemFree":2.412945408e+09, "Inactive":1.1942334464e+10, "Active_file":7.97052928e+08, "Bounce":0, "VmallocChunk":0, "DirectMap2M":1.0542383104e+10, "Mlocked":3.74784e+06, "AnonPages":9.84121344e+08, "Mapped":1.67022592e+08, "SUnreclaim":2.27487744e+08, "CommitLimit":1.0547777536e+10, "Active_anon":9.36448e+08, "Slab":5.30018304e+08, "Committed_AS":3.882319872e+09, "MemTotal":1.6808988672e+10, "DirectMap1G":8.589934592e+09, "MemAvailable":1.4934536192e+10, "SwapCached":290816, "Unevictable":3.74784e+06, "DirectMap4k":1.94838528e+08} source="meminfo.go:48"
DEBU[0005] OK: meminfo collector succeeded after 0.003253s. source="node_exporter.go:98"
DEBU[0005] OK: edac collector succeeded after 0.000015s. source="node_exporter.go:98"
DEBU[0005] OK: conntrack collector succeeded after 0.003276s. source="node_exporter.go:98"
DEBU[0005] OK: stat collector succeeded after 0.001069s. source="node_exporter.go:98"
DEBU[0005] OK: hwmon collector succeeded after 0.008732s. source="node_exporter.go:98"
DEBU[0005] Ignoring mount point: /sys source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /proc source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /dev source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /dev/pts source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /sys/kernel/security source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /dev/shm source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /sys/fs/cgroup source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /sys/fs/cgroup/systemd source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /sys/fs/pstore source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /sys/fs/cgroup/net_cls,net_prio source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /sys/fs/cgroup/hugetlb source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /sys/fs/cgroup/blkio source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /sys/fs/cgroup/devices source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /sys/fs/cgroup/perf_event source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /sys/fs/cgroup/freezer source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /sys/fs/cgroup/cpu,cpuacct source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /sys/fs/cgroup/memory source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /sys/fs/cgroup/cpuset source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /sys/fs/cgroup/pids source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /proc/sys/fs/binfmt_misc source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /dev/mqueue source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /dev/hugepages source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /sys/kernel/debug source="filesystem_linux.go:42"
DEBU[0005] Ignoring mount point: /sys/fs/fuse/connections source="filesystem_linux.go:42"
DEBU[0008] OK: filesystem collector succeeded after 3.365349s. source="node_exporter.go:98"
DEBU[0602] OK: mdadm collector succeeded after 0.000281s. source="node_exporter.go:98"
DEBU[0602] OK: textfile collector succeeded after 0.000001s. source="node_exporter.go:98"
DEBU[0602] return load 0: 4.750000 source="loadavg.go:51"
DEBU[0602] return load 1: 7.070000 source="loadavg.go:51"
DEBU[0602] return load 2: 6.270000 source="loadavg.go:51"
DEBU[0602] Unable to detect InfiniBand devices source="infiniband_linux.go:110"
DEBU[0602] OK: loadavg collector succeeded after 0.000275s. source="node_exporter.go:98"
DEBU[0602] OK: vmstat collector succeeded after 0.000520s. source="node_exporter.go:98"
DEBU[0602] OK: infiniband collector succeeded after 0.000136s. source="node_exporter.go:98"
DEBU[0602] Return time: 1509335020.000000 source="time.go:47"
DEBU[0602] OK: time collector succeeded after 0.000050s. source="node_exporter.go:98"
DEBU[0602] OK: conntrack collector succeeded after 0.000054s. source="node_exporter.go:98"
DEBU[0602] OK: sockstat collector succeeded after 0.000225s. source="node_exporter.go:98"
DEBU[0602] OK: edac collector succeeded after 0.000028s. source="node_exporter.go:98"
DEBU[0602] OK: uname collector succeeded after 0.000027s. source="node_exporter.go:98"
DEBU[0602] Set node_mem: map[string]float64{"Mlocked":3.74784e+06, "Committed_AS":2.74610176e+09, "VmallocChunk":0, "HardwareCorrupted":0, "HugePages_Free":0, "Hugepagesize":2.097152e+06, "DirectMap4k":1.94838528e+08, "Cached":1.348175872e+10, "Active_file":1.000648704e+09, "Shmem":1.27266816e+08, "AnonHugePages":1.71966464e+08, "CmaFree":0, "MemTotal":1.6808988672e+10, "MemAvailable":1.5493750784e+10, "Buffers":4.13782016e+08, "Inactive":1.2938891264e+10, "SwapTotal":2.143285248e+09, "Dirty":57344, "Writeback":1.4680064e+07, "HugePages_Surp":0, "DirectMap1G":8.589934592e+09, "SwapCached":290816, "Inactive_file":1.2765110272e+10, "Unevictable":3.74784e+06, "NFS_Unstable":1.9922944e+07, "WritebackTmp":0, "CmaTotal":0, "HugePages_Rsvd":0, "Active":1.38620928e+09, "SUnreclaim":2.2284288e+08, "VmallocTotal":3.5184372087808e+13, "Active_anon":3.85560576e+08, "Mapped":1.57380608e+08, "Slab":5.23497472e+08, "SReclaimable":3.00654592e+08, "KernelStack":9.568256e+06, "Bounce":0, "AnonPages":4.3307008e+08, "CommitLimit":1.0547777536e+10, "VmallocUsed":0, "HugePages_Total":0, "MemFree":1.773699072e+09, "Inactive_anon":1.73780992e+08, "SwapFree":2.094804992e+09, "PageTables":7.319552e+06, "DirectMap2M":1.0542383104e+10} source="meminfo.go:48"
DEBU[0602] OK: wifi collector succeeded after 0.000089s. source="node_exporter.go:98"
DEBU[0602] OK: netdev collector succeeded after 0.000431s. source="node_exporter.go:98"
DEBU[0602] OK: meminfo collector succeeded after 0.000685s. source="node_exporter.go:98"
DEBU[0602] OK: entropy collector succeeded after 0.000047s. source="node_exporter.go:98"
DEBU[0602] OK: filefd collector succeeded after 0.000059s. source="node_exporter.go:98"
DEBU[0602] OK: stat collector succeeded after 0.000352s. source="node_exporter.go:98"
DEBU[0602] Cannot open "/proc/spl/kstat/zfs/fm" for reading. Is the kernel module loaded? source="zfs_linux.go:32"
DEBU[0602] ZFS / ZFS statistics are not available source="zfs.go:62"
DEBU[0602] OK: zfs collector succeeded after 0.000421s. source="node_exporter.go:98"
DEBU[0602] OK: netstat collector succeeded after 0.001010s. source="node_exporter.go:98"
DEBU[0602] Ignoring device: loop2 source="diskstats_linux.go:175"
DEBU[0602] Ignoring device: sda1 source="diskstats_linux.go:175"
DEBU[0602] Ignoring device: loop7 source="diskstats_linux.go:175"
DEBU[0602] Ignoring device: sda2 source="diskstats_linux.go:175"
DEBU[0602] Ignoring device: loop1 source="diskstats_linux.go:175"
DEBU[0602] Ignoring device: loop4 source="diskstats_linux.go:175"
DEBU[0602] Ignoring device: loop6 source="diskstats_linux.go:175"
DEBU[0602] Ignoring device: loop0 source="diskstats_linux.go:175"
DEBU[0602] Ignoring device: loop3 source="diskstats_linux.go:175"
DEBU[0602] Ignoring device: loop5 source="diskstats_linux.go:175"
DEBU[0602] Ignoring device: sda5 source="diskstats_linux.go:175"
DEBU[0602] OK: diskstats collector succeeded after 0.003107s. source="node_exporter.go:98"
DEBU[0602] OK: hwmon collector succeeded after 0.004411s. source="node_exporter.go:98"
DEBU[0602] Ignoring mount point: /sys source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /proc source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /dev source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /dev/pts source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /sys/kernel/security source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /dev/shm source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /sys/fs/cgroup source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /sys/fs/cgroup/systemd source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /sys/fs/pstore source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /sys/fs/cgroup/net_cls,net_prio source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /sys/fs/cgroup/hugetlb source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /sys/fs/cgroup/blkio source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /sys/fs/cgroup/devices source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /sys/fs/cgroup/perf_event source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /sys/fs/cgroup/freezer source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /sys/fs/cgroup/cpu,cpuacct source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /sys/fs/cgroup/memory source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /sys/fs/cgroup/cpuset source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /sys/fs/cgroup/pids source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /proc/sys/fs/binfmt_misc source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /dev/mqueue source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /dev/hugepages source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /sys/kernel/debug source="filesystem_linux.go:42"
DEBU[0602] Ignoring mount point: /sys/fs/fuse/connections source="filesystem_linux.go:42"
DEBU[0696] Statfs on /var/lib/kubelet/pods/8edacf3f-bd22-11e7-b192-005056b10f29/volumes/kubernetes.io~secret/default-token-dhk6c returned no such file or directory source="filesystem_linux.go:55"
DEBU[0696] OK: filesystem collector succeeded after 94.682083s. source="node_exporter.go:98"
It sounds like one of the collectors is hanging. One way to debug this is to start up a node_exporter on a separate port with debug logs enabled, and use curl from the Prometheus server to try to reproduce the problem.
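For example, roughly (the second listen port and the target IP below are placeholders to adapt; the flags match node_exporter 0.14.0 as shown above):

# on the scraped node: run a second exporter with debug logging on another port
./node_exporter -log.level debug -web.listen-address ":9101"

# on the Prometheus server: time a full scrape of that instance
time curl -s -o /dev/null http://192.168.1.2:9101/metrics

If the curl regularly takes longer than the configured scrape_timeout, the exporter's debug log should show which collector is the slow one.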