Fluentd file buffer no space left on device


Anirudh Venkatesh

Apr 2, 2021, 5:17:53 PM
to Fluentd Google Group
Hello,

In a disk-full scenario with the file buffer, I am seeing a lot of these error messages -
"2021-04-02 20:12:44 +0000 [warn]: #0 emit transaction failed: error_class=Errno::ENOSPC error="No space left on device @ io_write - /path/to/fluentd/buffer/myhost/buffer.b5bf02f881edc66aa9281f29a8df7e47b.log" location="/opt/fluentd/gems/fluentd-1.12.2/lib/fluent/plugin/buffer/file_chunk.rb:62:in `write'" tag="sample_tag"

  2021-04-02 20:12:44 +0000 [warn]: #0 suppressed same stacktrace"

And these chunks seem to be discarded. Here is my buffer config -

<buffer>
  @type file
  path "/path/to/fluentd/buffer/myhost"
  flush_mode interval
  flush_interval 5s
  flush_thread_count 1
  flush_at_shutdown true
  retry_type exponential_backoff
  chunk_limit_size 8MB
  retry_timeout 1h
  overflow_action block
</buffer>


The overflow_action block should ensure that there is enough space before writing more data, correct? If that's the case, why am I losing these chunks? Also, is there a way to error out when a disk-full scenario occurs? Fluentd seems to keep discarding data instead of stopping/crashing when a disk-full event occurs.


Thanks.

Kentaro Hayashi

Apr 6, 2021, 2:26:55 AM
to Fluentd Google Group
Hi,


The explanation of overflow_action block in the documentation should help you understand the behavior.

The default behavior (throw_exception) may be what you want.
Or the out_copy plugin with ignore_if_prev_success may be better in such a situation. [1]
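For example, something like this (an untested sketch; the tag pattern, forward destination, and fallback path are placeholders, and ignore_if_prev_success is combined with ignore_error on the secondary store):

<match sample_tag>
  @type copy
  <store>
    # primary destination (placeholder endpoint)
    @type forward
    <server>
      host primary.example.com
      port 24224
    </server>
  </store>
  <store ignore_if_prev_success ignore_error>
    # secondary store; its errors are ignored when the primary store succeeds
    @type file
    path /path/to/fallback
  </store>
</match>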


Regards,

On Saturday, April 3, 2021 at 6:17:53 UTC+9, anib...@gmail.com wrote:

Anirudh Venkatesh

Apr 7, 2021, 10:01:29 AM
to Fluentd Google Group
Thanks Ken,

I tried both overflow_action options, the default (throw_exception) and block, and saw that chunks were still missed, although the default did raise an error on the Fluentd instance that is configured as the sender. I am also not sure how you are suggesting I use the out_copy plugin feature: enabling ignore_if_prev_success on a store would only prevent the event from going to that destination if a previous store was successful, right? In our case, the buffer space on the sender is full, and that is what causes the error.

I see an open GitHub issue - https://github.com/fluent/fluentd/issues/1698 - which is very similar to what I am experiencing. Any other ideas to work around this problem?

Thanks for your response. Any help is greatly appreciated.

Regards

Anirudh Venkatesh

Apr 14, 2021, 3:28:49 PM
to Fluentd Google Group
Any other ideas for handling this scenario? Is the open GitHub issue something that needs to be addressed, or will it be addressed in the future?

Thomas Müller

Apr 29, 2021, 1:58:35 PM
to Fluentd Google Group
anib...@gmail.com wrote on Friday, April 2, 2021 at 23:17:53 UTC+2:
[...]
The overflow_action block should ensure that there is enough space before writing more data, correct?


The overflow_action block will signal the input plugins to stop reading more data when the buffer is full. I don't know whether ENOSPC counts as a "buffer full" event that triggers the overflow_action, because the disk running out of space is not the same as the buffer reaching what it thinks its size is.
 

If that's the case, why am I losing these chunks?

Either a) the buffer isn't expecting the disk to fill up, or b) you actually exceeded retry_timeout 1h.

For:

a) limit your buffer size. The default total_limit_size for the file buffer is 64GB, which is probably more than the free space on your disk.

b) set retry_forever true so it will actually retry forever instead of giving up after the timeout (see the sketch below for both).
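Something like this, based on your buffer section (a rough sketch; the total_limit_size value is a placeholder, pick one that leaves free space on the disk holding the buffer):

<buffer>
  @type file
  path "/path/to/fluentd/buffer/myhost"
  flush_mode interval
  flush_interval 5s
  flush_thread_count 1
  flush_at_shutdown true
  retry_type exponential_backoff
  chunk_limit_size 8MB
  # a) cap the buffer well below the free space on the disk (placeholder value)
  total_limit_size 4GB
  # b) keep retrying instead of discarding chunks after retry_timeout
  retry_forever true
  overflow_action block
</buffer>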


But that's just guessing. Maybe you can share a more complete log excerpt than just one line?

- Thomas



