I'm finding that the td-agent process is consuming a hefty chunk of CPU (10%). This undoubtedly has to do with the fact that there are ~2000 stdout/stderr files. I setup a small test environment on another server:
# cat /tmp/foo.conf
<source>
@type tail
path /tmp/foobar/*
refresh_interval 300
pos_file /var/log/td-agent/foo.pos
format none
tag foo
</source>
<match **>
@type stdout
</match>
#
I created 2000 tiny files in /tmp/foobar which never change. The same CPU consumption problem exists. A brief strace -c on the running process reveals a myriad of stat/open/etc calls:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
31.37 0.651782 1293 504 140 futex
31.26 0.649345 8433 77 poll
24.07 0.500080 20003 25 epoll_wait
3.14 0.065335 4 15337 fstat
3.11 0.064710 4 15337 fcntl
1.95 0.040458 5 7669 open
1.72 0.035830 5 7846 118 read
1.67 0.034787 5 7668 close
1.57 0.032657 4 7668 7668 ioctl
0.06 0.001257 7 187 clock_gettime
0.04 0.000774 10 80 write
0.02 0.000340 7 52 sched_yield
0.01 0.000193 97 2 1 restart_syscall
------ ----------- ----------- --------- --------- ----------------
100.00 2.077548 62452 7927 total
Perusing in_tail.rb leads to Cool.io::StatWatcher which seems to have inotify support. Running an strace against a new td-agent process shows that all 2000 files were successfully processed by inotify_add_watch():
23461 inotify_add_watch(9, "/tmp/foobar/1453443965", IN_MODIFY|IN_ATTRIB|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE|IN_DELETE_SELF|IN_MOVE_SELF|IN_DONT_FOLLOW|IN_MASK_ADD) = 2000
In spite of this, the td-agent process repeatedly opens the files in /tmp/foobar. Looking through the strace output for one of the foobar files (/tmp/foobar/1453443965):
# grep 1453443965 /tmp/strace
23461 read(7, "/tmp/foobar/1453443965\t000000000"..., 8192) = 57
23461 lstat("/tmp/foobar/1453443965", {st_mode=S_IFREG|0664, st_size=29, ...}) = 0
23461 inotify_add_watch(9, "/tmp/foobar/1453443965", IN_MODIFY|IN_ATTRIB|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE|IN_DELETE_SELF|IN_MOVE_SELF|IN_DONT_FOLLOW|IN_MASK_ADD) = 2000
23461 statfs("/tmp/foobar/1453443965", {f_type=0x58465342, f_bsize=4096, f_blocks=268431285, f_bfree=133606231, f_bavail=133606231, f_files=1073735360, f_ffree=1071502703, f_fsid={51713, 0}, f_namelen=255, f_frsize=4096}) = 0
23461 open("/tmp/foobar/1453443965", O_RDONLY|O_CLOEXEC) = 2009
23488 open("/tmp/foobar/1453443965", O_RDONLY|O_CLOEXEC) = 2011
23488 open("/tmp/foobar/1453443965", O_RDONLY|O_CLOEXEC) = 2011
23488 open("/tmp/foobar/1453443965", O_RDONLY|O_CLOEXEC) = 2011
23488 open("/tmp/foobar/1453443965", O_RDONLY|O_CLOEXEC) = 2011
23488 open("/tmp/foobar/1453443965", O_RDONLY|O_CLOEXEC) = 2011
23488 open("/tmp/foobar/1453443965", O_RDONLY|O_CLOEXEC) = 2011
23488 open("/tmp/foobar/1453443965", O_RDONLY|O_CLOEXEC) = 2011
23488 open("/tmp/foobar/1453443965", O_RDONLY|O_CLOEXEC) = 2011
23488 open("/tmp/foobar/1453443965", O_RDONLY|O_CLOEXEC) = 2011
#
If inotify is being used, and /tmp/foobar/1453443965 never changes, why does td-agent keep re-opening the file!?
David