in_tail performance (high CPU)

1,265 views
Skip to first unread message

David Wood

unread,
Jan 22, 2016, 3:06:04 AM1/22/16
to Fluentd Google Group
Hello,

I'm attempting to implement the mesos agent node configuration using fluentd (td-agent rpms) instead of logstash; specifically:
    
    "paths": [
        "-",
        "/var/lib/mesos/slave/slaves/*/frameworks/*/executors/*/runs/latest/stdout",
        "/var/lib/mesos/slave/slaves/*/frameworks/*/executors/*/runs/latest/stderr"
    ]

I'm finding that the td-agent process is consuming a hefty chunk of CPU (10%).  This undoubtedly has to do with the fact that there are ~2000 stdout/stderr files.  I setup a small test environment on another server:

    # cat /tmp/foo.conf
    <source>
      @type tail
      path /tmp/foobar/*
      refresh_interval 300
      pos_file /var/log/td-agent/foo.pos
      format none
      tag foo
    </source>
    <match **>
      @type stdout
    </match>
    #

I created 2000 tiny files in /tmp/foobar which never change.  The same CPU consumption problem exists.  A brief strace -c on the running process reveals a myriad of stat/open/etc calls:

    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     31.37    0.651782        1293       504       140 futex
     31.26    0.649345        8433        77           poll
     24.07    0.500080       20003        25           epoll_wait
      3.14    0.065335           4     15337           fstat
      3.11    0.064710           4     15337           fcntl
      1.95    0.040458           5      7669           open
      1.72    0.035830           5      7846       118 read
      1.67    0.034787           5      7668           close
      1.57    0.032657           4      7668      7668 ioctl
      0.06    0.001257           7       187           clock_gettime
      0.04    0.000774          10        80           write
      0.02    0.000340           7        52           sched_yield
      0.01    0.000193          97         2         1 restart_syscall
    ------ ----------- ----------- --------- --------- ----------------
    100.00    2.077548                 62452      7927 total

Perusing in_tail.rb leads to Cool.io::StatWatcher which seems to have inotify support.  Running an strace against a new td-agent process shows that all 2000 files were successfully processed by inotify_add_watch():

    23461 inotify_add_watch(9, "/tmp/foobar/1453443965", IN_MODIFY|IN_ATTRIB|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE|IN_DELETE_SELF|IN_MOVE_SELF|IN_DONT_FOLLOW|IN_MASK_ADD) = 2000

In spite of this, the td-agent process repeatedly opens the files in /tmp/foobar.  Looking through the strace output for one of the foobar files (/tmp/foobar/1453443965):

    # grep 1453443965 /tmp/strace
    23461 read(7, "/tmp/foobar/1453443965\t000000000"..., 8192) = 57
    23461 lstat("/tmp/foobar/1453443965", {st_mode=S_IFREG|0664, st_size=29, ...}) = 0
    23461 inotify_add_watch(9, "/tmp/foobar/1453443965", IN_MODIFY|IN_ATTRIB|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE|IN_DELETE_SELF|IN_MOVE_SELF|IN_DONT_FOLLOW|IN_MASK_ADD) = 2000
    23461 statfs("/tmp/foobar/1453443965", {f_type=0x58465342, f_bsize=4096, f_blocks=268431285, f_bfree=133606231, f_bavail=133606231, f_files=1073735360, f_ffree=1071502703, f_fsid={51713, 0}, f_namelen=255, f_frsize=4096}) = 0
    23461 open("/tmp/foobar/1453443965", O_RDONLY|O_CLOEXEC) = 2009
    23488 open("/tmp/foobar/1453443965", O_RDONLY|O_CLOEXEC) = 2011
    23488 open("/tmp/foobar/1453443965", O_RDONLY|O_CLOEXEC) = 2011
    23488 open("/tmp/foobar/1453443965", O_RDONLY|O_CLOEXEC) = 2011
    23488 open("/tmp/foobar/1453443965", O_RDONLY|O_CLOEXEC) = 2011
    23488 open("/tmp/foobar/1453443965", O_RDONLY|O_CLOEXEC) = 2011
    23488 open("/tmp/foobar/1453443965", O_RDONLY|O_CLOEXEC) = 2011
    23488 open("/tmp/foobar/1453443965", O_RDONLY|O_CLOEXEC) = 2011
    23488 open("/tmp/foobar/1453443965", O_RDONLY|O_CLOEXEC) = 2011
    23488 open("/tmp/foobar/1453443965", O_RDONLY|O_CLOEXEC) = 2011
    #

If inotify is being used, and /tmp/foobar/1453443965 never changes, why does td-agent keep re-opening the file!?

David

Mr. Fiber

unread,
Jan 22, 2016, 6:35:24 AM1/22/16
to Fluentd Google Group
> If inotify is being used, and /tmp/foobar/1453443965 never changes, why does td-agent keep re-opening the file!?

Currently, in_tail shares rotation check and updating IO handler code across timer and stat watchers.
This is why in_tail calls open repeatedly.

For reducing open call, changing this mechanizm is needed.


Masahiro

--
You received this message because you are subscribed to the Google Groups "Fluentd Google Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fluentd+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Lance N.

unread,
Jan 22, 2016, 9:19:21 PM1/22/16
to Fluentd Google Group
Log rotation can change the file. in_tail assumes that the file it tails could be changed by a log rotator. It caches the i-node number in the .pos file for exactly this purpose.

It should be possible to add a flag saying "assume files are permanent, you don't have to poll them".
Message has been deleted
Message has been deleted

David Wood

unread,
Jan 28, 2016, 8:33:49 PM1/28/16
to Fluentd Google Group
Hi Masahiro,

Thank you for the pointer!  I had noticed the timer for refresh_watchers but I didn't notice the timer for on_notify:


The line you referenced is indeed responsible for the open()s ... and the high CPU load.

> across timer and stat watchers

I guess I don't understand why this is necessary?  Both log file updates and log file rotation should be detected by inotify.  To prove this I conducted a simple test:

@@ -305 +305 @@
-        @timer_trigger = TimerWatcher.new(1, true, log, &method(:on_notify))
+        @timer_trigger = TimerWatcher.new(86400, true, log, &method(:on_notify))
(obviously this means that the timer won't fire during the test)

Firstly, the CPU load dropped to zero, which is great!  Next, testing a log file update:

...
2016-01-28 23:54:55 +0000 [info]: following tail of /tmp/foobar/1453443965
2016-01-28 23:55:33 +0000 foo: {"message":"try"}

Next, testing log rotation with logrotate:

2016-01-29 00:01:06 +0000 [info]: detected rotation of /tmp/foobar/1453437009; waiting 5 seconds
2016-01-29 00:01:06 +0000 [info]: following tail of /tmp/foobar/1453437009
2016-01-29 00:01:06 +0000 fluent.info: {"message":"detected rotation of /tmp/foobar/1453437009; waiting 5 seconds"}
2016-01-29 00:01:06 +0000 fluent.info: {"message":"following tail of /tmp/foobar/1453437009"}

So, it seems that inotify works in both cases.  On systems without inotify ... the Coolio::StatWatcher comments imply that a timer would be used anyway:

/**
 *  call-seq:
 *    Coolio::StatWatcher.initialize(path, interval = 0) -> Coolio::StatWatcher
 *
 * Create a new Coolio::StatWatcher for the given path.  This will monitor the
 * given path for changes at the filesystem level.  The interval argument
 * specified how often in seconds the path should be polled for changes.
 * Setting interval to zero uses an "automatic" value (typically around 5
 * seconds) which optimizes performance.  Otherwise, values less than
 * 0.1 are not particularly meaningful.  Where available (at present, on Linux)
 * high performance file monitoring interfaces will be used instead of polling.
 */

If there is in fact some other good/valid reason for keeping the timer ... would you instead consider a PR which removes the hard coded value of 1 second?  Perhaps re-using refresh_interval (or rotate_wait)?  Or perhaps a new parameter?

Lastly, what would you think of the following optimization?  It saved ~3% CPU using my original test case.  If you agree I'll submit a PR for it.  David

--- ./in_tail.rb.fcs    2016-01-28 23:04:00.514350429 +0000
+++ ./in_tail.rb        2016-01-28 23:13:15.555784679 +0000
@@ -580,8 +580,7 @@

         def on_notify
           begin
-            io = FileWrapper.open(@path)
-            stat = io.stat
+            stat = FileWrapper.stat(@path)
             inode = stat.ino
             fsize = stat.size
           rescue Errno::ENOENT
@@ -593,13 +592,11 @@
           begin
             if @inode != inode || fsize < @fsize
               # rotated or truncated
+              io = FileWrapper.open(@path)
               @on_rotate.call(io)
-              io = nil
             end
             @inode = inode
             @fsize = fsize
-          ensure
-            io.close if io
           end

         rescue

Mr. Fiber

unread,
Jan 29, 2016, 9:34:03 AM1/29/16
to Fluentd Google Group
I will check the corner case of using only statwatcher.

Lastly, what would you think of the following optimization?

It seems good. Could you send a PR?


--

David Wood

unread,
Jan 31, 2016, 1:13:27 AM1/31/16
to Fluentd Google Group
> I will check the corner case of using only statwatcher.

Great - thanks!!

> It seems good. Could you send a PR?

Mr. Fiber

unread,
Feb 3, 2016, 3:23:01 AM2/3/16
to Fluentd Google Group
I ask about TimerWatcher issue to first author and he says "StatWatcher doesn't work on some environment, e.g. Mac OS X".
So we can't remove TimerWatcher by default.

--

David Wood

unread,
Feb 17, 2016, 1:15:36 AM2/17/16
to Fluentd Google Group
Hi Masahiro,

Sorry for the delay.  I wanted to complete https://github.com/fluent/fluentd/pull/788 before continuing here.

> StatWatcher doesn't work on some environment, e.g. Mac OS X

Hmm ... odd.  I've again looked through the source code for cool.io ... and I remain convinced that a timer would be used on systems without inotify (EV_USE_INOTIFY=0).  In particular (from ev.c):

#if EV_USE_INOTIFY
  infy_init (EV_A);

  if (fs_fd >= 0)
    infy_add (EV_A_ w);
  else
#endif
    {
      ev_timer_again (EV_A_ &w->timer);
      ev_unref (EV_A);
    }

  ev_start (EV_A_ (W)w, 1);

  EV_FREQUENT_CHECK;
}

To prove it works ... I conducted some tests on the oldest MacOS system I could find:

bash-3.2# sw_vers -productVersion
10.9.5
bash-3.2#

I made the same change as before which prevents the in_tail timer from firing for 24 hours:

@@ -320 +320 @@
-        @timer_trigger = TimerWatcher.new(1, true, log, &method(:on_notify))
+        @timer_trigger = TimerWatcher.new(86400, true, log, &method(:on_notify))

First, testing a log file update:

2016-02-17 16:42:15 +1100 [info]: plugin/in_tail.rb:492:initialize: following tail of /tmp/foobar/file4
2016-02-17 16:46:16 +1100 foo: {"message":"try"}

This works as expected.  Next, testing log rotation via newsyslog:

2016-02-17 16:50:36 +1100 [info]: plugin/in_tail.rb:492:initialize: following tail of /tmp/foobar/file4
2016-02-17 16:50:36 +1100 foo: {"message":"Feb 17 16:50:35 w8109195 newsyslog[31338]: logfile turned over due to -F request using <default> rule"}
2016-02-17 16:50:36 +1100 foo: {"message":"Feb 17 16:50:35 w8109195 newsyslog[31338]: logfile turned over due to -F request using <default> rule"}
2016-02-17 16:50:36 +1100 fluent.info: {"message":"detected rotation of /tmp/foobar/file4; waiting 5 seconds"}
2016-02-17 16:50:36 +1100 fluent.info: {"message":"following tail of /tmp/foobar/file4"}

This works too!  I'm not meaning to belabor the point; rather, this seemingly unnecessary timer is causing us a fair amount of grief.  I conducted some experiments on a system with 2000 log files which were not being updated:
  • fluentd in_tail: 12% CPU (removing the 1 second timer drops the CPU to 0%)
  • logstash forwarder: 3% CPU
  • filebeat: 1.5% CPU
  • logstash: 6% CPU (although this can be easily reduced to 1% with a stat_interval of 10 seconds.
In all of our other use cases we've always found fluentd to be superior.  It'd be a real shame to have an all-fluentd setup *except* for this one scenario :-/.

After reading this, if you're still reluctant to remove the timer, would you instead consider a PR which removes the hard coded value of 1 second?  Perhaps re-using refresh_interval (or rotate_wait)?  Or perhaps a new parameter?

David



Mr. Fiber

unread,
Feb 17, 2016, 6:12:27 AM2/17/16
to Fluentd Google Group
Fluentd focuses on lower resource and high performance so
consuming high CPU should be resolved.

I think there are two approaches.
- Add disable_watch_timer like parameter to disable TimerWatcher instantiation
- Add watch_timer_interval like parameter to modify TimerWatcher interval

Which is the best?
I think former is more effective.



Mr. Fiber

unread,
Feb 17, 2016, 6:19:29 AM2/17/16
to Fluentd Google Group
For keeping compatibility, default disable_watch_timer is false or default watch_timer_interval is 1s.

Mr. Fiber

unread,
Feb 17, 2016, 6:44:42 AM2/17/16
to Fluentd Google Group
> StatWatcher doesn't work on some environment, e.g. Mac OS X

in_tail is oldie plugin so old Cool.io(libev) and Mac OS X combination may had a problem in the past.


David Wood

unread,
Feb 19, 2016, 12:41:48 AM2/19/16
to Fluentd Google Group
> old Cool.io(libev) and Mac OS X combination may had a problem in the past

Ah, that would explain it!

It's too bad I don't have a more diverse environment; otherwise, I'd offer to test it out on more platforms.  At this point, we're mostly a CentOS 7 shop, so it's the only platform where I could conduct real world testing.

> Which is the best?  I think former is more effective.

I agree.  A new disable_watch_timer parameter would make it easier for others to help test too.  After some time/testing/feedback, you could eventually consider making it the default; however, as you suggested the logical first step is to maintain the current behaviour.

I'll work on a PR and let you know.  Thank you for your time!

David

Mr. Fiber

unread,
Feb 19, 2016, 2:30:57 PM2/19/16
to Fluentd Google Group
I'll work on a PR and let you know.  Thank you for your time!

Wait your patch :)

David Wood

unread,
Feb 24, 2016, 3:00:26 AM2/24/16
to Fluentd Google Group

Lance N.

unread,
Feb 24, 2016, 10:20:05 PM2/24/16
to Fluentd Google Group
Thanks, David. We recently had to dial down our poll times because of the startling amount of CPU used by Fluentd on our web site. I look forward to deploying this!

David Wood

unread,
Mar 3, 2016, 6:38:07 PM3/3/16
to Fluentd Google Group
Hi Lance,

The fluentd pull request has been merged.  It has been implemented as "enable_watch_timer" as documented here:


I'd appreciate any feedback you might have ...

David

Lance N.

unread,
Mar 3, 2016, 7:56:13 PM3/3/16
to Fluentd Google Group
Thanks! I use the td-agent binary here, so it will be awhile before I can check it out.

Lance

David Wood

unread,
Mar 4, 2016, 12:04:42 AM3/4/16
to Fluentd Google Group
> I use the td-agent binary here

Actually, we do too, on our production systems.  I've attached patch files for td-agent v2.3.1.  FWIW, this is the excerpt from our td-agent Ansible role:

- name: td-agent install
  yum:
    name: '{{ item }}'
  with_items:
    - td-agent-2.3.1

- name: in_tail patches for better performance
  patch:
    basedir: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.20
    src: '{{ item }}'
  with_items:
    - avoid-unnecessary-opens.patch
    - enable_watch_timer-option.patch

David
avoid-unnecessary-opens.patch
enable_watch_timer-option.patch
Reply all
Reply to author
Forward
0 new messages