Too many open files and Prometheus restart is extremely slow


Saipradeep Bojja

Dec 28, 2020, 7:24:38 AM
to Prometheus Users
Hi,
 
I am facing issues with the WAL and with "too many open files":
1) Our WAL has reached about "00024713*", and it takes nearly 3 hours just to restart Prometheus.

2) Too many open files: we have set the limit to 1 million, but the issue still exists and I don't understand why.

Also, the number of temporary files keeps increasing.
How can we solve this issue and recover our data?

Please help with this.


root@    fs/inotify # systemctl status prometheus -l
* prometheus.service - Prometheus
   Loaded: loaded (/etc/systemd/system/prometheus.service; disabled; vendor preset: disabled)
   Active: active (running) since Mon 2020-12-28 12:04:54 CET; 1h 13min ago
 Main PID: 25196 (prometheus)
    Tasks: 146
   CGroup: /system.slice/prometheus.service
           `-25196 /usr/local/bin/prometheus --config.file /root/Prometheus/prometheus-2.19.2.linux-amd64/prometheus.yml --storage.tsdb.path /root/Prometheus/prometheus-2.19.2.linux-amd64 --web.console.templates /root/Prometheus/prometheus-2.19.2.linux-amd64/consoles --web.console.libraries=/root/Prometheus/prometheus-2.19.2.linux-amd64/console_libraries --storage.tsdb.retention.time=1y --storage.tsdb.wal-compression --web.enable-lifecycle --web.enable-admin-api


Dec 28 13:17:42  prometheus[25196]: level=error ts=2020-12-28T12:17:42.257Z caller=db.go:675 component=tsdb msg="compaction failed" err="persist head block: open chunk writer: open /root/Prometheus/prometheus-2.19.2.linux-amd64/01ETMMGRVCKGH1R01AAFXTAFZJ.tmp/chunks: too many open files"
Dec 28 13:17:58  prometheus[25196]: level=error ts=2020-12-28T12:17:58.262Z caller=compact.go:538 component=tsdb msg="removed tmp folder after failed compaction" err="open /root/Prometheus/prometheus-2.19.2.linux-amd64: too many open files"
Dec 28 13:17:58  prometheus[25196]: level=error ts=2020-12-28T12:17:58.262Z caller=db.go:675 component=tsdb msg="compaction failed" err="persist head block: open chunk writer: open /root/Prometheus/prometheus-2.19.2.linux-amd64/01ETMMH8FHNMHCE1Q5BRQXCBQC.tmp/chunks: too many open files"

Matthias Rampke

Jan 2, 2021, 5:10:30 PM
to Saipradeep Bojja, Prometheus Users
1. It is likely that writing out blocks did not work because of issue 2, so after you resolve that, the long WAL replay should only happen one more time.

2. How did you raise the limit? I often find that systemd has applied a setting that I did not expect. In the systemd unit, what is LimitNOFILE set to? Prometheus' own metrics (:9090/metrics) also include the process_max_fds metric, which shows the actual limit. Does this match your expectations?
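
The check described in point 2 can be scripted. What follows is a minimal sketch, not a definitive tool: the function names `parse_fd_metrics` and `fetch_fd_metrics` are made up for illustration, and the URL assumes Prometheus is listening on localhost:9090. It reads the /metrics page and pulls out process_max_fds (the limit the process actually has) and process_open_fds (the descriptors currently in use), so the two can be compared against the limit you believe you configured.

```python
import re
import urllib.request

# Matches unlabelled sample lines such as "process_max_fds 1024".
# "# HELP" and "# TYPE" comment lines start with "#" and are skipped.
_SAMPLE = re.compile(r"^(process_(?:max|open)_fds)\s+(\S+)")


def parse_fd_metrics(text):
    """Extract the two file-descriptor gauges from metrics exposition text."""
    values = {}
    for line in text.splitlines():
        match = _SAMPLE.match(line.strip())
        if match:
            values[match.group(1)] = float(match.group(2))
    return values


def fetch_fd_metrics(url="http://localhost:9090/metrics"):
    """Fetch the metrics page (URL is an assumption) and parse it."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return parse_fd_metrics(resp.read().decode("utf-8"))
```

Calling `fetch_fd_metrics()` on the Prometheus host returns a dict with both values. If process_max_fds is far below the 1 million you set, the limit was probably not applied to the service itself, for example because the systemd unit overrides it.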

/MR

