How does the tail plugin work?


hengyunabc

Apr 18, 2012, 1:09:23 PM
to Fluentd Google Group
Hi, I am trying to use fluentd to collect nginx logs, but I don't know
how the tail plugin works.
I know little about Ruby.
I tried reading in_tail.rb and found that there is a "pos_file"
config_param. How does it work?

I have another question.
The log files are rotated, for example:

test.log
test.log.1
test.log.2

When the rotation condition is met, the log files roll:
test.log.2 -> test.log.3
test.log.1 -> test.log.2
test.log -> test.log.1
and then a new test.log is created.

If we use the tail plugin to watch test.log, will data be lost? For
example, the app writes to test.log, and before the tail plugin has
collected those lines, the app renames test.log to test.log.1.




西田圭介(BizMobile)

Apr 19, 2012, 8:23:33 PM
to flu...@googlegroups.com
Hi hengyunabc,

I don't know much about the implementation, but here are some notes
from the recent releases:

| Released Fluentd v0.10.12. This is a minor bug fix release for everyone.
|
| * I re-designed in_tail plugin to follow all files including symbolic links correctly.
|   New implementation considers log rotation. it remains to read the rotated old file for
|   several seconds since applications may take time to switch the log file.
|   You can control the wait time using 'rotate_wait 5s' parameter.

| Released Fluentd v0.10.13. This is a bug fix release for all user of v0.10.12
|
| * Rewrote in_tail [issue #40]
|   On v0.10.12, the tail plugin has a problem that consumes all resources of one CPU core.
|   Now it's fixed and the new design is slightly efficient than the old design.
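
For reference, here is a minimal configuration sketch that applies the 'rotate_wait' parameter from the release notes to an nginx access log. The paths, the tag, and the choice of 'format apache' are assumptions for illustration and do not come from this thread, so adjust them to your own setup.

```
<source>
  type tail
  # Assumed log location; replace with your nginx access log path.
  path /var/log/nginx/access.log
  tag nginx.access
  # Assumes the log lines follow the Apache combined layout.
  format apache
  # Assumed location for the file that records the read position.
  pos_file /var/log/fluentd/nginx-access.pos
  # Keep reading the rotated file for 5 seconds after rotation.
  rotate_wait 5s
</source>
```
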

Hope this helps.

Cheers,
Keisuke



hengyunabc

Apr 20, 2012, 3:54:39 AM
to Fluentd Google Group
Thank you very much.
I had read the changelog too.

I tried reading in_tail.rb, and I think I now know how the tail plugin
works.
The tail plugin saves the last read position and the inode in the pos_file.
Each pos_file entry has three fields: file name, position, and inode.

The TimerWatcher tries to read the log file every second. If the inode
or the size has changed, it rotates to the new log file.
It then reads the log file (non-blocking), parses the lines, and sends
them to the Engine.
Finally, it saves the new position to the pos_file.
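
The polling behaviour described above could be sketched roughly as follows. This is not the code from in_tail.rb. The names poll_once and state are hypothetical, the position is kept in memory instead of a pos_file, and treating "inode changed or file got smaller" as the rotation signal is my assumption.

```ruby
require "tempfile"

# Hypothetical helper, not Fluentd's implementation: one polling step.
# state is a Hash with :pos (byte offset) and :inode of the file last read.
# Returns the complete lines that appeared since the previous call.
def poll_once(path, state)
  stat = File.stat(path)
  # Assumption: a different inode or a smaller file means the log was
  # rotated or truncated, so reading restarts from the beginning.
  if state[:inode] != stat.ino || stat.size < state[:pos]
    state[:inode] = stat.ino
    state[:pos] = 0
  end

  lines = []
  File.open(path, "rb") do |f|
    f.seek(state[:pos])
    data = f.read || ""
    # Consume only up to the last newline; a partial line waits for the next poll.
    last_nl = data.rindex("\n")
    if last_nl
      lines = data[0..last_nl].split("\n")
      state[:pos] += last_nl + 1
    end
  end
  lines
end

# Usage: write some lines, poll, append more, poll again.
log = Tempfile.new("test.log")
log.write("first\nsecond\npart")
log.flush
state = { pos: 0, inode: nil }
p poll_once(log.path, state)   # the two complete lines
log.write("ial\n")
log.flush
p poll_once(log.path, state)   # the line that was completed later
```

A real plugin would also write state[:pos] and state[:inode] back to the pos_file after each poll, so that a restart can resume where it stopped.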

I found something that may be wrong.
Use the tail plugin to watch "log.1", with pos_file set to log.pos:
<source>
type tail
path log.1
tag debug.test
pos_file log.pos
</source>

# cat log.pos
# log.1 0000000000000140 002c1727

This shows the inode recorded for log.1.
Then:

# rm log.1 && cat sample.log >> log.1
# ls -i log.1
# 2889504 log.1 (2889504 == 0x2c1720 != 0x2c1727)
The inode of log.1 has changed, but in log.pos the inode field still
holds the old value!
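
To make the mismatch easier to check, here is a small sketch that parses the log.pos entry shown above and compares the recorded inode with the current one. The helper names are hypothetical, and I assume the three fields are separated by whitespace and that position and inode are hexadecimal, as the sample suggests.

```ruby
# Hypothetical helpers for illustration; not taken from in_tail.rb.
# Assumes whitespace-separated fields: path, position (hex), inode (hex).
# A path containing spaces would not parse correctly with this sketch.
PosEntry = Struct.new(:path, :pos, :inode)

def parse_pos_entry(line)
  path, pos_hex, inode_hex = line.strip.split(/\s+/, 3)
  PosEntry.new(path, pos_hex.to_i(16), inode_hex.to_i(16))
end

# True when the file on disk no longer has the inode recorded in the entry,
# for example after it was deleted and recreated.
def inode_changed?(entry, current_inode)
  entry.inode != current_inode
end

entry = parse_pos_entry("log.1 0000000000000140 002c1727")
puts entry.pos                       # 320
puts entry.inode                     # 2889511
puts inode_changed?(entry, 2889504)  # true
```

In practice the current inode would come from File.stat("log.1").ino rather than a literal number.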




hengyunabc

Apr 21, 2012, 3:42:16 AM
to Fluentd Google Group
gist: https://gist.github.com/2435280
I wrote a tail input plugin called SplitTail. It can collect the
content that already exists in the log file before fluentd starts, and
it writes the position to the pos_file for every line received.
It can also split a special field.
Sample log file:
# 21.18.104.8 www.sample.com 12.12.12.13 [18/Apr/2012:04:01:21 +0800]
"GET /file/file.do?id=321 HTTP/1.1" 200 72040 "-" "-"
"buildtime=2012_02_14_16_25;version=01.03.3752;totalTimeMin=0;"

# For example, with a regex we get:
# {
# ip : 21.18.104.8
# host : www.sample.com
# message : buildtime=2012_02_14_16_25;version=01.03.3752;totalTimeMin=0;
# }

With SplitTail this becomes:
# {
# ip : 21.18.104.8
# host : www.sample.com
# message : buildtime=2012_02_14_16_25;version=01.03.3752;totalTimeMin=0;
# buildtime : 2012_02_14_16_25
# version : 01.03.3752
# totalTimeMin : 0
# }
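
The field splitting can be illustrated with a short sketch. This is not the code from the gist. split_field and its separator arguments are hypothetical names, and I assume the field always has the "key=value;" layout shown in the sample.

```ruby
# Hypothetical helper, not the code from the gist: split a "k=v;k=v;" field
# into extra keys on the record while keeping the original field intact.
def split_field(record, field, pair_sep = ";", kv_sep = "=")
  result = record.dup
  record.fetch(field, "").split(pair_sep).each do |pair|
    key, value = pair.split(kv_sep, 2)
    # Skip fragments that are not a complete key=value pair.
    next if key.nil? || key.empty? || value.nil?
    result[key] = value
  end
  result
end

record = {
  "ip" => "21.18.104.8",
  "host" => "www.sample.com",
  "message" => "buildtime=2012_02_14_16_25;version=01.03.3752;totalTimeMin=0;"
}
p split_field(record, "message")
```

The values stay strings here (totalTimeMin is "0", not 0). Converting types would be a separate step.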