########################################################################################
- Sysctl configuration:
########################################################################################
# sysctl -a | egrep 'net.ipv4.tcp_tw_recycle|net.ipv4.tcp_tw_reuse|net.ipv4.ip_local_port_range'
error: permission denied on key 'vm.compact_memory'
error: permission denied on key 'net.ipv4.route.flush'
net.ipv4.tcp_tw_recycle = 0
net.ipv4.ip_local_port_range = 2000 65000
net.ipv4.tcp_tw_reuse = 0
error: permission denied on key 'net.ipv6.route.flush'
########################################################################################
- Quantity of log files:
########################################################################################
# find /var/log -type f | grep -v gz | egrep -v '\.pos$' | wc -l
42
########################################################################################
- Number of log files in use at any given moment:
########################################################################################
# lsof | grep '/var/log' | wc -l
22
########################################################################################
- Processes running:
########################################################################################
nginx
nrsysmond
sensu-client
uwsgi
fluentd
########################################################################################
- Fluentd management configuration
########################################################################################
# cat /etc/init/fluentd.conf
start on runlevel [2345]
stop on starting rc RUNLEVEL=[016]
setuid root
setgid root
respawn
respawn limit 10 5
console log
exec fluentd -c /etc/fluent.conf --suppress-repeated-stacktrace
########################################################################################
- upstart log rotation file:
########################################################################################
# cat /etc/logrotate.d/upstart
/var/log/upstart/*.log {
daily
missingok
rotate 7
compress
notifempty
nocreate
}
########################################################################################
OBS: As you can see:
1) I didn't use "*" in my fluentd.conf; I used "*.log" instead;
2) The logrotate conf file was not modified (which is why I had not posted it);
3) I only changed the max-file-descriptors setting and did not apply the network settings, because the doc says they are only needed when the environment consists of many fluentd instances, and I'm not sure how many instances the doc is talking about;
4) There are not many log files, and even fewer are in use during this test;
5) Not many programs are running on the server;
6) The python/uwsgi service is managed by upstart, so its log file goes to /var/log/upstart;
7) Fluentd is managed by upstart;
8) Fluentd is already using the --suppress-repeated-stacktrace option;
9) The upstart log rotation file compresses the rotated files immediately;
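Item 3 mentions the max-file-descriptors setting. As a rough way to confirm the limit a running process actually got, the sketch below parses the text of /proc/<pid>/limits on Linux. The function name and the sample values are hypothetical and are not taken from this server.

```python
def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    A value of None means the kernel reported 'unlimited'.
    """
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            # The row looks like: "Max open files   <soft>   <hard>   files"
            soft, hard = line[len(label):].split()[:2]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' row found")


# Illustrative input only; these numbers are NOT from the server in this report.
SAMPLE_LIMITS = (
    "Limit                     Soft Limit           Hard Limit           Units     \n"
    "Max cpu time              unlimited            unlimited            seconds   \n"
    "Max open files            1024                 4096                 files     \n"
)

print(parse_max_open_files(SAMPLE_LIMITS))
```

On the real host the input would come from something like open("/proc/56390/limits").read(), using the PID that "service fluentd start" prints.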
Objective
=======
- See whether log rotation can trigger the error even when my server is configured correctly, and whether I still receive the "Too many open files" error;
Procedure
=========
1) Stop fluentd service:
# service fluentd stop
fluentd stop/waiting
#
2) Confirm fluentd is not running:
# ps -ef f | egrep [f]luentd
#
3) Remove the current fluentd.log file and its compressed rotations:
# rm -f /var/log/upstart/fluentd*.log /var/log/upstart/fluentd.log.?.gz
#
4) Confirm all logfiles were deleted:
# ls /var/log/upstart/*fluentd*
ls: cannot access /var/log/upstart/*fluentd*: No such file or directory
#
- Now we have a clean environment to see the fluentd logfile messages;
5) Start fluentd service:
# service fluentd start
fluentd start/running, process 56390
#
6) Confirm fluentd is running:
# ps -ef f | egrep [f]luentd
root 56390 1 2 09:54 ? Ssl 0:00 /usr/bin/ruby1.9.1 /usr/local/bin/fluentd -c /etc/fluent.conf --suppress-repeated-stacktrace
root 56392 56390 7 09:54 ? Sl 0:00 \_ /usr/bin/ruby1.9.1 /usr/local/bin/fluentd -c /etc/fluent.conf --suppress-repeated-stacktrace
#
7) Confirm the fluentd log file was created:
# ls -l /var/log/upstart/*fluentd*
-rw-r----- 1 root root 2992 Feb 15 09:54 /var/log/upstart/fluentd.log
root@api1n:/#
8) Inspect the contents of the log file:
################################################################
# cat /var/log/upstart/fluentd.log
2015-02-15 09:54:47 +0000 [info]: reading config file path="/etc/fluent.conf"
2015-02-15 09:54:47 +0000 [info]: starting fluentd-0.12.4
2015-02-15 09:54:47 +0000 [info]: gem 'fluent-mixin-config-placeholders' version '0.3.0'
2015-02-15 09:54:47 +0000 [info]: gem 'fluent-plugin-s3' version '0.5.1'
2015-02-15 09:54:47 +0000 [info]: gem 'fluentd' version '0.12.4'
2015-02-15 09:54:47 +0000 [info]: using configuration file: <ROOT>
<source>
type tail
path /var/log/nginx/*access.log
pos_file /var/log/fluentd/nginx.access.log.pos
tag nginx.access
format nginx
</source>
<source>
type tail
path /var/log/nginx/*error.log
pos_file /var/log/fluentd/nginx.error.log.pos
tag nginx.error
format /^(?<time>[^ ]+ [^ ]+) \[(?<log_level>.*)\] (?<pid>\d*).(?<tid>[^:]*): (?<message>.*)$/
</source>
<source>
type tail
path /var/log/upstart/*.log
pos_file /var/log/fluentd/application.log.log.pos
tag application.log
format none
</source>
<source>
type tail
path /var/log/upstart/celery/*.log
pos_file /var/log/fluentd/celery.application.log.log.pos
tag celery.application.log
format none
</source>
<source>
type tail
path /var/log/apps/*.log
pos_file /var/log/fluentd/apps.log.log.pos
tag apps.log
format none
</source>
<source>
type tail
path /var/log/apps/celery/*.log
pos_file /var/log/fluentd/celery.apps.log.log.pos
tag celery.apps.log
format none
</source>
<match **>
type s3
path <REMOVED>
buffer_path <REMOVED>
aws_key_id <REMOVED>
aws_sec_key <REMOVED>
s3_bucket <REMOVED>
s3_region <REMOVED>
time_slice_format %Y%m%d%H%M
</match>
</ROOT>
2015-02-15 09:54:47 +0000 [info]: adding match pattern="**" type="s3"
2015-02-15 09:54:47 +0000 [info]: adding source type="tail"
2015-02-15 09:54:47 +0000 [info]: adding source type="tail"
2015-02-15 09:54:47 +0000 [info]: adding source type="tail"
2015-02-15 09:54:47 +0000 [info]: adding source type="tail"
2015-02-15 09:54:47 +0000 [info]: adding source type="tail"
2015-02-15 09:54:47 +0000 [info]: adding source type="tail"
2015-02-15 09:54:48 +0000 [info]: following tail of /var/log/nginx/access.log
2015-02-15 09:54:48 +0000 [info]: following tail of /var/log/nginx/stub-status_access.log
2015-02-15 09:54:48 +0000 [info]: following tail of /var/log/nginx/error.log
2015-02-15 09:54:48 +0000 [info]: following tail of /var/log/upstart/fluentd.log
2015-02-15 09:54:48 +0000 [info]: following tail of /var/log/upstart/<APPLICATION_LOG_NAME_REMOVED>
#
################################################################
- I removed sensitive information from the file above
9) Force the rotation of the upstart log files:
# logrotate -f /etc/logrotate.d/upstart
#
10) Check the rotated log file for new entries:
# zcat /var/log/upstart/fluentd.log.1.gz | tail -2
2015-02-15 09:54:48 +0000 [info]: following tail of /var/log/upstart/<APPLICATION_LOG_NAME_REMOVED>
2015-02-15 10:00:58 +0000 [info]: detected rotation of /var/log/upstart/fluentd.log; waiting 5 seconds
#
11) Check the newly created log file:
# cat /var/log/upstart/fluentd.log
2015-02-15 10:00:58 +0000 [info]: detected rotation of /var/log/upstart/<APPLICATION_LOG_NAME_REMOVED>.log; waiting 5 seconds
2015-02-15 10:00:58 +0000 [info]: following tail of /var/log/upstart/fluentd.log
#
12) Check whether any deleted log file is still held open:
# lsof | grep delete
init 1 root 8w REG 202,1 64 271599 /var/log/upstart/dbus.log.1 (deleted)
init 1 root 12w REG 202,1 95 271607 /var/log/upstart/acpid.log.1 (deleted)
init 1 root 13w REG 202,1 840 269572 /var/log/upstart/<APPLICATION_LOG_NAME_REMOVED>.log.1 (deleted)
#
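The lsof check in step 12 can be summarized per process. The sketch below extracts the command, PID and path of every entry marked "(deleted)". It assumes the default lsof column layout seen above and paths without spaces; the function name is hypothetical.

```python
def deleted_open_files(lsof_output):
    """Return (command, pid, path) for every lsof row flagged '(deleted)'."""
    held = []
    for line in lsof_output.splitlines():
        fields = line.split()
        # Rows of interest end with the literal marker "(deleted)";
        # the path is the field right before it.
        if len(fields) < 3 or fields[-1] != "(deleted)":
            continue
        held.append((fields[0], int(fields[1]), fields[-2]))
    return held


# Two of the rows from the transcript above.
SAMPLE_LSOF = (
    "init 1 root 8w REG 202,1 64 271599 /var/log/upstart/dbus.log.1 (deleted)\n"
    "init 1 root 12w REG 202,1 95 271607 /var/log/upstart/acpid.log.1 (deleted)\n"
)

for command, pid, path in deleted_open_files(SAMPLE_LSOF):
    print(command, pid, path)
```

In the output of step 12 every holder is init (PID 1), not fluentd.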
13) Follow the fluentd log after the rotation:
# tail -f fluentd.log
2015-02-15 10:24:50 +0000 [warn]: pattern not match: "10.0.0.5 - - [15/Feb/2015:10:24:50 +0000] \"GET / HTTP/1.0\" 302 239 \"-\" \"check_http/v1.4.15 (nagios-plugins 1.4.15)\" \"201.20.34.11\""
2015-02-15 10:25:48 +0000 [info]: following tail of /var/log/upstart/fluentd.log
2015-02-15 10:32:48 +0000 [error]: bignum too big to convert into `long'
2015-02-15 10:32:48 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:375:in `seek'
2015-02-15 10:32:48 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:375:in `on_rotate'
2015-02-15 10:32:48 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:564:in `call'
2015-02-15 10:32:48 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:564:in `on_notify'
2015-02-15 10:32:48 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:340:in `on_notify'
2015-02-15 10:32:48 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:323:in `attach'
2015-02-15 10:32:48 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:132:in `setup_watcher'
2015-02-15 10:32:48 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:150:in `block in start_watchers'
2015-02-15 10:32:48 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:137:in `each'
2015-02-15 10:32:48 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:137:in `start_watchers'
2015-02-15 10:32:48 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:127:in `refresh_watchers'
2015-02-15 10:32:48 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:427:in `call'
2015-02-15 10:32:48 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:427:in `on_timer'
2015-02-15 10:32:48 +0000 [error]: /var/lib/gems/1.9.1/gems/cool.io-1.3.0/lib/cool.io/loop.rb:88:in `run_once'
2015-02-15 10:32:48 +0000 [error]: /var/lib/gems/1.9.1/gems/cool.io-1.3.0/lib/cool.io/loop.rb:88:in `run'
2015-02-15 10:32:48 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:211:in `run'
2015-02-15 10:32:49 +0000 [error]: bignum too big to convert into `long'
2015-02-15 10:32:49 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:375:in `seek'
2015-02-15 10:32:49 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:375:in `on_rotate'
2015-02-15 10:32:49 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:564:in `call'
2015-02-15 10:32:49 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:564:in `on_notify'
2015-02-15 10:32:49 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:340:in `on_notify'
2015-02-15 10:32:49 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:427:in `call'
2015-02-15 10:32:49 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:427:in `on_timer'
2015-02-15 10:32:49 +0000 [error]: /var/lib/gems/1.9.1/gems/cool.io-1.3.0/lib/cool.io/loop.rb:88:in `run_once'
2015-02-15 10:32:49 +0000 [error]: /var/lib/gems/1.9.1/gems/cool.io-1.3.0/lib/cool.io/loop.rb:88:in `run'
2015-02-15 10:32:49 +0000 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.4/lib/fluent/plugin/in_tail.rb:211:in `run'
2015-02-15 10:32:50 +0000 [error]: bignum too big to convert into `long'
2015-02-15 10:32:50 +0000 [error]: suppressed same stacktrace
2015-02-15 10:32:51 +0000 [error]: bignum too big to convert into `long'
2015-02-15 10:32:51 +0000 [error]: suppressed same stacktrace
2015-02-15 10:32:52 +0000 [error]: bignum too big to convert into `long'
2015-02-15 10:32:52 +0000 [error]: suppressed same stacktrace
2015-02-15 10:32:53 +0000 [error]: bignum too big to convert into `long'
2015-02-15 10:32:53 +0000 [error]: suppressed same stacktrace
2015-02-15 10:32:54 +0000 [error]: bignum too big to convert into `long'
2015-02-15 10:32:54 +0000 [error]: suppressed same stacktrace
2015-02-15 10:32:55 +0000 [error]: bignum too big to convert into `long'
2015-02-15 10:32:55 +0000 [error]: suppressed same stacktrace
2015-02-15 10:32:56 +0000 [error]: bignum too big to convert into `long'
2015-02-15 10:32:56 +0000 [error]: suppressed same stacktrace
2015-02-15 10:32:57 +0000 [error]: bignum too big to convert into `long'
2015-02-15 10:32:57 +0000 [error]: suppressed same stacktrace
2015-02-15 10:32:58 +0000 [error]: bignum too big to convert into `long'
2015-02-15 10:32:58 +0000 [error]: suppressed same stacktrace
2015-02-15 10:32:59 +0000 [error]: bignum too big to convert into `long'
2015-02-15 10:32:59 +0000 [error]: suppressed same stacktrace
2015-02-15 10:33:00 +0000 [error]: bignum too big to convert into `long'
2015-02-15 10:33:00 +0000 [error]: suppressed same stacktrace
2015-02-15 10:33:01 +0000 [error]: bignum too big to convert into `long'
2015-02-15 10:33:01 +0000 [error]: suppressed same stacktrace
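The stack trace ends in `seek` called from `on_rotate`, so one possible reading is that an offset larger than a signed 64-bit integer was passed to `seek`. The sketch below scans the text of a pos_file for such offsets. It assumes the pos_file is tab-separated as path, hexadecimal offset, hexadecimal inode; that layout is an assumption about fluentd internals and is not shown anywhere in this report. The function name and sample rows are hypothetical.

```python
INT64_MAX = 2**63 - 1  # largest value that fits a signed 64-bit integer


def oversized_offsets(pos_text):
    """Return (path, offset) for pos_file rows whose offset exceeds INT64_MAX."""
    bad = []
    for line in pos_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # not in the assumed path/offset/inode layout
        path, offset_hex, _inode_hex = parts
        try:
            offset = int(offset_hex, 16)
        except ValueError:
            continue  # offset column is not hexadecimal
        if offset > INT64_MAX:
            bad.append((path, offset))
    return bad


# Made-up rows for illustration; NOT copied from the real pos_file.
SAMPLE_POS = (
    "/var/log/upstart/fluentd.log\tffffffffffffffff\t00000000\n"
    "/var/log/upstart/dbus.log\t0000000000000040\t000424ef\n"
)

print(oversized_offsets(SAMPLE_POS))
```

Running it against the contents of /var/log/fluentd/application.log.log.pos would show whether any stored offset is out of range.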
1) I think I made a mistake in the fluentd.conf file: I'm tailing every *.log file in the /var/log/upstart/ directory, and fluentd's own log file lives there because upstart writes it there.
2) Could this error still happen if I exclude fluentd.log from the fluentd configuration?
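To make the concern in 1) concrete, the sketch below shows which files a pattern such as /var/log/upstart/*.log selects, and what remains once fluentd.log is excluded. It only mimics shell-style matching within a single directory; it is not how fluentd itself expands paths, and the function name and file list are hypothetical.

```python
import fnmatch
import posixpath


def tailed_files(paths, pattern, exclude=()):
    """Return the paths matched by a one-directory glob, minus excluded names."""
    pattern_dir, pattern_name = posixpath.split(pattern)
    selected = []
    for path in paths:
        directory, name = posixpath.split(path)
        # "*" must not cross directory boundaries, so compare directories exactly.
        if directory != pattern_dir:
            continue
        if not fnmatch.fnmatchcase(name, pattern_name):
            continue
        if any(fnmatch.fnmatchcase(name, ex) for ex in exclude):
            continue
        selected.append(path)
    return selected


# Hypothetical directory listing, loosely based on the file names in this report.
FILES = [
    "/var/log/upstart/fluentd.log",
    "/var/log/upstart/dbus.log",
    "/var/log/upstart/dbus.log.1.gz",
    "/var/log/upstart/celery/worker.log",
]

print(tailed_files(FILES, "/var/log/upstart/*.log"))
print(tailed_files(FILES, "/var/log/upstart/*.log", exclude=("fluentd.log",)))
```

With the current pattern, fluentd.log is among the selected files, so fluentd reads the log it is writing to.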
...