Hi all -
I'm trying to use two file-based service discovery (file_sd) configs to monitor my application. One targets the application's endpoint; the other targets a node_exporter endpoint.
My application runs in a Docker container on Mesos/Marathon. It has multiple instances, whose IPs I can get via an API call (so I can monitor the individual containers). I have a Python script that makes the API call and writes the IPs and ports to two different JSON files (one for the app, one for node_exporter).
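For context, here is a minimal sketch of the file-writing half of such a script. It is not the actual script: the function name `write_targets`, the example addresses, and the label values are hypothetical, and the Marathon API call is left out entirely. The only thing taken as given is the file_sd JSON layout, a list of objects with `targets` and optional `labels`. The file is written to a temporary name and then renamed, so Prometheus should never read a half-written file.

```python
import json
import os
import tempfile


def write_targets(path, addresses, labels=None):
    """Write a Prometheus file_sd target file atomically.

    addresses: iterable of "host:port" strings (hypothetical input; the
    real script would build these from the Marathon API response).
    """
    groups = [{"targets": sorted(addresses), "labels": labels or {}}]
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)

    # Write to a temp file in the same directory, then rename over the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(groups, handle, indent=2)
        # mkstemp creates the file as 0600; relax it so another user can read it.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    # Placeholder addresses, not real instances.
    write_targets("targets/apps.json", ["10.0.0.5:31001", "10.0.0.6:31002"],
                  labels={"source": "marathon"})
    write_targets("targets/nodes.json", ["10.0.0.5:9100", "10.0.0.6:9100"])
```

The rename is done within one directory so that it stays on the same filesystem, which is what makes `os.replace` atomic.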
On startup, I get a "too many open files" error. Unlike most of the other cases I've seen, my file descriptor limits appear to be plenty high (the soft and hard limits are both 1048576).
This is the entire log on startup (I do have --log.level=debug set, but no debug messages show up):
level=info ts=2018-04-05T17:04:58.26146891Z caller=main.go:220 msg="Starting Prometheus" version="(version=2.2.1, branch=HEAD, revision=bc6058c81272a8d938c05e75607371284236aadc)"
level=info ts=2018-04-05T17:04:58.261570154Z caller=main.go:221 build_context="(go=go1.10, user=root@149e5b3f0829, date=20180314-14:15:45)"
level=info ts=2018-04-05T17:04:58.261600951Z caller=main.go:222 host_details="(Linux 3.10.0-693.11.6.el7.x86_64 #1 SMP Thu Dec 28 14:23:39 EST 2017 x86_64 9793c3ecf950 (none))"
level=info ts=2018-04-05T17:04:58.261627063Z caller=main.go:223 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2018-04-05T17:04:58.35498776Z caller=main.go:504 msg="Starting TSDB ..."
level=info ts=2018-04-05T17:04:58.355060632Z caller=web.go:382 component=web msg="Start listening for connections" address=:9090
level=info ts=2018-04-05T17:04:58.454805243Z caller=main.go:514 msg="TSDB started"
level=info ts=2018-04-05T17:04:58.454869605Z caller=main.go:588 msg="Loading configuration file" filename=/app/prometheus.yml
level=info ts=2018-04-05T17:04:58.455488363Z caller=main.go:491 msg="Server is ready to receive web requests."
level=error ts=2018-04-05T17:04:58.455502888Z caller=file.go:230 component="discovery manager scrape" discovery=file msg="Error adding file watcher" err="too many open files"
level=error ts=2018-04-05T17:04:58.455525562Z caller=file.go:230 component="discovery manager scrape" discovery=file msg="Error adding file watcher" err="too many open files"
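The fd_limits line in the log reports the per-process descriptor limit (RLIMIT_NOFILE). On Linux, file watchers are backed by inotify, which has its own per-user limits under /proc/sys/fs/inotify, and creating an inotify instance can also fail with "too many open files" when the instance limit is reached. Whether that is what is happening here is an assumption; the log does not confirm it. A minimal sketch for printing both sets of limits from inside the container (the function names `read_int` and `watcher_limits` are just for illustration):

```python
import resource


def read_int(path):
    """Return the integer stored in a /proc file, or None if unavailable."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def watcher_limits():
    """Collect the descriptor limit and the inotify limits for comparison."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return {
        "nofile_soft": soft,
        "nofile_hard": hard,
        # Per-user inotify limits; None if not on Linux or not readable.
        "inotify_max_user_instances": read_int(
            "/proc/sys/fs/inotify/max_user_instances"),
        "inotify_max_user_watches": read_int(
            "/proc/sys/fs/inotify/max_user_watches"),
    }


if __name__ == "__main__":
    for name, value in watcher_limits().items():
        print(f"{name}: {value}")
```

The inotify limits are counted per user, not per process, so other processes running under the same user ID on the host can contribute to the total.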
prometheus.yml, app name redacted to <myapp>:
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    monitor: '<myapp>-monitor'
rule_files:
scrape_configs:
  - job_name: '<myapp>-apps'
    scrape_interval: 15s
    file_sd_configs:
      - files:
          - 'targets/apps.json'
        refresh_interval: 2m
  - job_name: '<myapp>-nodes'
    scrape_interval: 15s
    file_sd_configs:
      - files:
          - 'targets/nodes.json'
        refresh_interval: 2m
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets:
          - 'localhost:9090'
          - 'localhost:9100'