failed to write data into buffer by buffer overflow

surfsup.i...@gmail.com

Jan 20, 2019, 9:13:41 PM
to Fluentd Google Group
Hi,

I need help understanding why I'm getting so many buffer errors even with a minimal chunk_limit_size and queue_limit_length. I am able to query or delete the indices using the same host and port.

# kubectl logs fluentd_pod_name

Connection opened to Elasticsearch cluster => {:host=>"my_node_ip", :port=>30998, :scheme=>"http"}

failed to write data into buffer by buffer overflow action=:block 


<match **>
  @id elasticsearch
  @type elasticsearch
  @log_level info
  include_tag_key true
  type_name _doc
  host my_node_ip
  port 30998
  scheme http
  ssl_version TLSv1_2
  logstash_format true
  logstash_prefix logstash
  reconnect_on_error true
  <buffer>
    @type file
    path /var/log/fluentd-buffers/kubernetes.system.buffer
    flush_mode interval
    retry_type exponential_backoff
    flush_thread_count 2
    flush_interval 5s
    retry_forever
    retry_max_interval 30
    chunk_limit_size 1M
    queue_limit_length 8
    overflow_action block
  </buffer>
</match>


Thanks,

su 

Mr. Fiber

Jan 21, 2019, 3:31:29 AM
to Fluentd Google Group
> failed to write data into buffer by buffer overflow

This means your incoming traffic is larger than your buffer can hold.
Your buffer capacity is only 8MB (chunk_limit_size 1M × queue_limit_length 8),
so if incoming traffic exceeds 8MB before it is flushed, this error happens.
In addition, if the buffer flush takes a long time, the same error occurs
even when the traffic is smaller than 8MB.
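For reference, total buffer capacity is chunk_limit_size × queue_limit_length. A sketch of a larger buffer section (illustrative values only, not a tuned recommendation):

```
<buffer>
  @type file
  path /var/log/fluentd-buffers/kubernetes.system.buffer
  chunk_limit_size 8M     # larger chunks => fewer, bigger bulk requests
  queue_limit_length 32   # capacity = 8M x 32 = 256MB
  overflow_action block   # or drop_oldest_chunk / throw_exception
</buffer>
```

With overflow_action block, input is paused when the buffer is full, which backpressures log collection instead of raising the overflow error; whether that is acceptable depends on the input plugin.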


Masahiro
--
You received this message because you are subscribed to the Google Groups "Fluentd Google Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fluentd+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

surfsup.i...@gmail.com

Jan 21, 2019, 9:59:47 AM
to Fluentd Google Group
I upped chunk_limit_size to 256M and queue_limit_length to 256. The flush interval is 5s.

I still see some related errors:

2019-01-21 09:15:29 +0000 [warn]: [elasticsearch] bad chunk is moved to /tmp/fluentd-buffers/backup/worker0/elasticsearch/57fc4c71326aae28f21d7ca8e9869e5d.log
2019-01-21 09:16:33 +0000 [warn]: [elasticsearch] Could not push logs to Elasticsearch, resetting connection and trying again. read timeout reached
2019-01-21 09:16:35 +0000 [info]: [elasticsearch] Connection opened to Elasticsearch cluster => {:host=>"192.168.23.5", :port=>30998, :scheme=>"http"}
2019-01-21 09:17:28 +0000 [warn]: [elasticsearch] Could not push logs to Elasticsearch, resetting connection and trying again. read timeout reached
2019-01-21 09:17:29 +0000 [info]: [elasticsearch] Connection opened to Elasticsearch cluster => {:host=>"192.168.23.5", :port=>30998, :scheme=>"http"}
2019-01-21 09:17:41 +0000 [warn]: [elasticsearch] Could not push logs to Elasticsearch, resetting connection and trying again. read timeout reached
2019-01-21 09:17:45 +0000 [info]: [elasticsearch] Connection opened to Elasticsearch cluster => {:host=>"192.168.23.5", :port=>30998, :scheme=>"http"}
2019-01-21 09:18:49 +0000 [warn]: [elasticsearch] got unrecoverable error in primary and no secondary error_class=Fluent::Plugin::ElasticsearchOutput::ConnectionFailure error="Could not push logs to Elasticsearch after 2 retries. read timeout reached"

Mr. Fiber

Jan 21, 2019, 10:04:48 AM
to Fluentd Google Group
> 2019-01-21 09:16:33 +0000 [warn]: [elasticsearch] Could not push logs to Elasticsearch, resetting connection and trying again. read timeout reached

This is not a Fluentd core problem.
The error indicates a read timeout between the Elasticsearch client and Elasticsearch itself,
so your Elasticsearch cluster, the network, or something in between is the likely cause.
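If the cluster is slow rather than down, one mitigation (a sketch; in fluent-plugin-elasticsearch the client read timeout `request_timeout` defaults to 5s, which slow bulk ingestion can easily exceed) is to give the client more patience and let it refresh connections on failure:

```
# inside the existing <match **> elasticsearch section
request_timeout 30s      # client read timeout; default is 5s
reconnect_on_error true  # reopen the connection after an error
reload_on_failure true   # re-discover reachable nodes when a request fails
```

This only hides slowness on the Elasticsearch side; if writes keep timing out, the cluster itself needs attention.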

surfsup.i...@gmail.com

Jan 21, 2019, 10:11:31 AM
to Fluentd Google Group
Thanks for the response. I will investigate.

surfsup.i...@gmail.com

Jan 21, 2019, 1:19:30 PM
to Fluentd Google Group
Sorry, I forgot to mention that the logs are from the fluentd pod running in k8s. What I'm not clear on from the log is that the connection to the Elasticsearch cluster is open, yet pushing logs to the cluster fails. I ran a simple curl command from the fluentd container to query the Elasticsearch indices and was able to get all of them. If logs cannot be pushed to Elasticsearch but I can query the indices from the fluentd container, I am a bit confused!


On Monday, January 21, 2019 at 9:04:48 AM UTC-6, repeatedly wrote:


Mr. Fiber

Jan 22, 2019, 2:22:27 AM
to Fluentd Google Group
I'm not familiar with ES operations, but query and write are different operations.
You probably need to check the ES logs and investigate further why the read timeout happens.
For example, ingestion taking too long, an API/client version mismatch, or something similar.
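To separate the read and write paths, a few checks one might run from the fluentd container (host and port here match the config above; these are standard Elasticsearch HTTP APIs, and the thread pool is named `write` on ES 6.3+, `bulk` on older versions):

```
# Cluster health: red/yellow status or many pending tasks explains slow writes
curl -s "http://my_node_ip:30998/_cluster/health?pretty"

# Write thread pool: a growing 'queue' or nonzero 'rejected' means ES cannot keep up
curl -s "http://my_node_ip:30998/_cat/thread_pool/write?v&h=name,active,queue,rejected"

# Exercise the write path directly, not just the read path
curl -s -X POST "http://my_node_ip:30998/test-index/_doc" \
     -H 'Content-Type: application/json' -d '{"msg":"write probe"}'
```

If the write probe is slow or rejected while queries are fast, that points at ingestion capacity rather than connectivity.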

johnz...@gmail.com

Jan 24, 2019, 1:39:47 AM
to Fluentd Google Group

I have a similar issue; you can refer to:

https://groups.google.com/forum/#!topic/fluentd/u8PXVwKjIVU

I think increasing your Elasticsearch performance (more memory, faster I/O disks, more ES nodes) may be the most efficient way to resolve this issue.

Tuning the buffer in the fluentd config may help sometimes, but it does not address the root cause (ES is too slow).

Correct me if I am wrong, thanks.

Here is my conf, FYI:

```
<match **>
  @id elasticsearch
  @type elasticsearch
  @log_level info
  type_name fluentd
  include_tag_key true
  host elasticsearch-logging
  port 9200
  logstash_format true
  flush_interval 1s
  buffer_chunk_limit 1M
  buffer_queue_limit 512
  <buffer>
    @type file
    path /var/log/fluentd-buffers/kubernetes.system.buffer
    flush_mode interval
    retry_type exponential_backoff
    flush_thread_count 2
    flush_interval 2s
    retry_forever
    retry_max_interval 30
    chunk_limit_size 20M
    queue_limit_length 16
    overflow_action block
  </buffer>
</match>
```

