I am facing two issues with Prometheus: a very large WAL and "too many open files" errors.
1) Our WAL has grown to segment files numbered around "00024713*", and just restarting Prometheus takes a very long time, nearly 3 hours.
2) Too many open files: we have set the open-file limit to 1 million, but the error still occurs and I do not understand why.
How can we solve these issues and recover our data?
Please help with this.
root@ fs/inotify # systemctl status prometheus -l
* prometheus.service - Prometheus
Loaded: loaded (/etc/systemd/system/prometheus.service; disabled; vendor preset: disabled)
Active: active (running) since Mon 2020-12-28 12:04:54 CET; 1h 13min ago
Main PID: 25196 (prometheus)
Tasks: 146
CGroup: /system.slice/prometheus.service
`-25196 /usr/local/bin/prometheus --config.file /root/Prometheus/prometheus-2.19.2.linux-amd64/prometheus.yml --storage.tsdb.path /root/Prometheus/prometheus-2.19.2.linux-amd64 --web.console.templates /root/Prometheus/prometheus-2.19.2.linux-amd64/consoles --web.console.libraries=/root/Prometheus/prometheus-2.19.2.linux-amd64/console_libraries --storage.tsdb.retention.time=1y --storage.tsdb.wal-compression --web.enable-lifecycle --web.enable-admin-api
Dec 28 13:17:42 prometheus[25196]: level=error ts=2020-12-28T12:17:42.257Z caller=db.go:675 component=tsdb msg="compaction failed" err="persist head block: open chunk writer: open /root/Prometheus/prometheus-2.19.2.linux-amd64/01ETMMGRVCKGH1R01AAFXTAFZJ.tmp/chunks: too many open files"
Dec 28 13:17:58 prometheus[25196]: level=error ts=2020-12-28T12:17:58.262Z caller=compact.go:538 component=tsdb msg="removed tmp folder after failed compaction" err="open /root/Prometheus/prometheus-2.19.2.linux-amd64: too many open files"
Dec 28 13:17:58 prometheus[25196]: level=error ts=2020-12-28T12:17:58.262Z caller=db.go:675 component=tsdb msg="compaction failed" err="persist head block: open chunk writer: open /root/Prometheus/prometheus-2.19.2.linux-amd64/01ETMMH8FHNMHCE1Q5BRQXCBQC.tmp/chunks: too many open files"
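One thing that may be worth checking is whether the 1 million limit actually applies to the running process. For a service started by systemd, the limit normally comes from the unit file rather than from a shell `ulimit` or /etc/security/limits.conf, so the process can still be running with a much lower value. The following is only a minimal sketch, assuming a Linux host with /proc mounted and enough privileges to read the process's entries; the helper names `parse_max_open_files` and `report` are made up for illustration.

```python
import os
import re


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns are separated by runs of spaces: name, soft, hard, units.
            fields = re.split(r"\s{2,}", line.strip())
            return fields[1], fields[2]
    raise ValueError("no 'Max open files' row found")


def report(pid):
    """Print the effective open-file limit and the current descriptor count."""
    with open(f"/proc/{pid}/limits") as fh:
        soft, hard = parse_max_open_files(fh.read())
    # Each entry in /proc/<pid>/fd is one open file descriptor.
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    print(f"pid={pid} soft={soft} hard={hard} open_fds={open_fds}")
```

For the process shown in the status output above, this would be called as `report(25196)`. If the soft limit it prints is far below 1 million, the setting probably lives somewhere systemd does not read; in that case a `LimitNOFILE=` line in the `[Service]` section of prometheus.service, followed by `systemctl daemon-reload` and a restart, is the usual place to set it.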