Tailing a large file


Paul McCann

Mar 23, 2012, 4:29:32 AM
to Fluentd Google Group
Hello; first, thank you for a lovely piece of software.

I'm trying to use fluent to gather some rather large logs (1M+ lines)
from multiple servers in one place. I'm using the tail input plugin
for the client fluent server and sending to the hub fluent server with
the forward output plugin. Some messages go through fine, but after a
short period (less than a minute) the client fluentd logs
"detached forwarding server" with a phi value, and then produces several
errors like this:

failed to flush the buffer, retrying. error="no nodes are available"
instance=47083520

So, my questions are:

1. Where can I find a simple explanation of phi values? I think I'm
using Google incorrectly, as I haven't been able to turn much up, and
I'd not heard of them before. Even just "higher phi is slower" would
be a good start.

2. Is there a way to tell fluent's tail plugin to start at the end of
the file rather than reading from the beginning? I don't mind if it
only begins following the file from when I start it up rather than
reading all prior messages.

3. What can I do to prevent disconnections? I set retry_timeout
quite small (even <10s), but I never saw it reconnect to the forward
host, or even attempt to.

Sorry for all the questions, and thanks for any help you can give me. -
POLM

Sadayuki Furuhashi

Mar 23, 2012, 7:13:51 PM
to flu...@googlegroups.com
1. Roughly speaking, you can think of one phi unit as about one second. Since the default 'phi_threshold' is 8,
Fluentd detaches the next server when heartbeat packets from that server have stopped for about 8 seconds.

More precisely, the phi value represents the suspected probability that the next server has failed. It is calculated by an algorithm called the
'phi accrual failure detector', which is also used in Cassandra. It detects failure of the next server quickly on a stable network,
and more slowly on an unstable one.
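For reference, the detachment sensitivity can be tuned on the forwarding side. A minimal sketch of a forward match block, assuming v0.10-era syntax (the tag pattern and hub hostname are hypothetical):

```
<match app.**>
  type forward
  heartbeat_interval 1s   # how often UDP heartbeats are sent (default)
  phi_threshold 8         # default; detach the server once phi exceeds this
  <server>
    host hub.example.com  # hypothetical hub address
    port 24224
  </server>
</match>
```

Raising phi_threshold makes detachment slower but more tolerant of jittery networks.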

2. The tail plugin should read logs from the end of the file (at least in the latest version). Which version of Fluentd are you using?
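A minimal in_tail source for this kind of setup might look like the following sketch (the path, pos_file location, and tag are made up for illustration):

```
<source>
  type tail
  path /var/log/app/large.log
  pos_file /var/tmp/fluentd-large.log.pos  # remembers the read position across restarts
  tag app.large
  format none                              # forward raw lines without parsing
</source>
```

With a pos_file set, a restarted fluentd resumes from where the previous run left off instead of re-reading the file.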

3. The forward plugin uses UDP heartbeats for failure detection, so make sure UDP packets are not filtered by a firewall. The default port number is 24224.

Thanks,
Sadayuki Furuhashi

Date: Friday, March 23, 2012, 1:29, From: Paul McCann:

Paul McCann

Mar 25, 2012, 9:27:41 PM
to Fluentd Google Group
1. Thank you for the explanation of phi values; that makes a lot of
sense (and made it easier to find the paper).

2. I'm using 0.10.15. Given the rate at which the log is written,
it's possible I mistook a very high growth rate for reading from the
head of the file; let me check that. Just to be clear: it will read
from the end of the file even without a pos_file setting?

3. That could be the cause of this; I'll check that.

Thanks for the complete answer! -POLM

Satoshi Yamada

Nov 7, 2012, 5:44:12 AM
to flu...@googlegroups.com
Hi, I'm Satoshi.
Let me ask a question about sending large logs with tail.

I have logs to transfer, which ranges from several KB up to several MB (I know it's odd log).
So I tested fluentd sending logs that are about 10MB per line.

When the number of lines is small, say 10 or so, fluentd transfers all the logs completely,
but it takes hours to finish.

When the number of lines is larger, say 10,000+, no logs are transferred; fluentd just shows the message
below several hours after I run the test:
fluent/buffer.rb:184:block in emit: Size of the emitted data exceeds buffer_chunk_limit.
fluent/buffer.rb:185:block in emit: This may occur problems in the output plugins at this server.
fluent/buffer.rb:186:block in emit: To avoid problems, set a smaller number to the buffer_chunk_limit
fluent/buffer.rb:187:block in emit: in the forward output at the log forwarding server.

I have tried different values for buffer_chunk_limit (256m, 12m), but neither has worked so far.
fluentd uses one full CPU the whole time during the test, and memory usage keeps increasing
while the test runs. I use Ruby 1.9.3 and fluentd 0.10.25.
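For reference, the limit the error message refers to is set in the forwarding side's match block. A hypothetical sketch (the tag pattern, hub hostname, and values are illustrative, not a recommendation):

```
<match app.**>
  type forward
  buffer_chunk_limit 64m   # must be larger than the biggest single record emitted
  buffer_queue_limit 128   # cap the number of queued chunks so memory stays bounded
  <server>
    host hub.example.com
    port 24224
  </server>
</match>
```

Note that buffer_chunk_limit bounds a chunk, but a single record larger than the chunk limit can never fit, which appears to be what the error above is reporting.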

Can I have some advice on configuration? Or is there a recommended per-line size limit for running fluentd?

Thanks in advance,
satoshi

Satoshi Yamada

Nov 7, 2012, 9:08:10 AM
to flu...@googlegroups.com
Let me correct the sentence.

-I have logs to transfer, which ranges from several KB up to several MB (I know it's odd log).
+I have logs to transfer, which ranges from several KB up to several MB per line (I know it's odd log).

thanks,
satoshi