connect_write timeout reached


Joseph Presley

Jan 25, 2019, 12:46:54 PM1/25/19
to flu...@googlegroups.com
I saw these errors in the fluentd logs before fluentd stopped forwarding logs to Elasticsearch. I sanitized the logs and the fluentd conf. What causes this, and how can I fix it?

failed to flush the buffer. retry_time=16 next_retry_seconds=2019-01-24 19:13:56 +0000 chunk="5803240525c1ddbbb197dc0dd46ecbfa" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>\"<SANITIZED>\", :port=>9243, :scheme=>\"https\", :user=>\"fluentd\", :password=>\"obfuscated\"}): connect_write timeout reached"

My sanitized td-agent configuration is:

<source>
  @type forward
  port 24224
</source>

<match fluent.**>
  @type elasticsearch
  host <SANITIZED>
  port 9243
  scheme https
  ssl_version TLSv1_2
  user fluentd
  password <SANITIZED>
  logstash_format true
  logstash_prefix fluentd
  type_name _doc
  <buffer>
    @type file
    path /var/log/td-agent/fluent.*.buffer
    flush_thread_count 11
    flush_interval 3s
    chunk_limit_size 50m
    queue_limit_length 4096
  </buffer>
</match>

<match prod.ems.fix.**>
  @type elasticsearch
  host <SANITIZED>
  port 9243
  scheme https
  ssl_version TLSv1_2
  user fluentd
  password <SANITIZED>
  logstash_format true
  logstash_prefix onx-prod-ems-fix
  type_name _doc
  <buffer>
    @type file
    path /var/log/td-agent/buffer/prod.ems.fix.*.buffer
    flush_thread_count 11
    flush_interval 3s
    chunk_limit_size 50m
    queue_limit_length 4096
  </buffer>
</match>

<match prod.ems.app.**>
  @type elasticsearch
  host <SANITIZED>
  port 9243
  scheme https
  ssl_version TLSv1_2
  user fluentd
  password <SANITIZED>
  logstash_format true
  logstash_prefix onx-prod-ems-app
  type_name _doc
  <buffer>
    @type file
    path /var/log/td-agent/buffer/prod.ems.app.*.buffer
    flush_thread_count 11
    flush_interval 3s
    chunk_limit_size 50m
    queue_limit_length 4096
  </buffer>
</match>

<match production.poms.fix.**>
  @type elasticsearch
  host <SANITIZED>
  port 9243
  scheme https
  ssl_version TLSv1_2
  user fluentd
  password <SANITIZED>
  logstash_format true
  logstash_prefix onx-prod-poms
  type_name _doc
  <buffer>
    @type file
    path /var/log/td-agent/buffer/prod.poms.fix.*.buffer
    flush_thread_count 11
    flush_interval 3s
    chunk_limit_size 50m
    queue_limit_length 4096
  </buffer>
</match>

<match production.poms.nginx.**>
  @type elasticsearch
  host <SANITIZED>
  port 9243
  scheme https
  ssl_version TLSv1_2
  user fluentd
  password <SANITIZED>
  logstash_format true
  logstash_prefix onx-prod-poms
  type_name _doc
  <buffer>
    @type file
    path /var/log/td-agent/buffer/prod.poms.nginx.*.buffer
    flush_thread_count 11
    flush_interval 3s
    chunk_limit_size 50m
    queue_limit_length 4096
  </buffer>
</match>

<match production.poms.nodejs.**>
  @type elasticsearch
  host <SANITIZED>
  port 9243
  scheme https
  ssl_version TLSv1_2
  user fluentd
  password <SANITIZED>
  logstash_format true
  logstash_prefix onx-prod-poms
  type_name _doc
  <buffer>
    @type file
    path /var/log/td-agent/buffer/prod.poms.nodejs.*.buffer
    flush_thread_count 11
    flush_interval 3s
    chunk_limit_size 50m
    queue_limit_length 4096
  </buffer>
</match>

Mr. Fiber

Jan 25, 2019, 4:53:39 PM1/25/19
to Fluentd Google Group
Another user hit a similar problem.


Does your ES cluster have enough resources to handle that many requests?
Your flush_interval is short, so ES receives lots of requests from fluentd
while Kibana or other apps are sending queries at the same time.
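If the cluster is being overloaded, one option is to send fewer, larger bulk requests and give slow responses more headroom. A sketch for one of the match blocks is below; the 10s/30s values and thread count are illustrative assumptions, not values recommended anywhere in this thread (request_timeout and reconnect_on_error are fluent-plugin-elasticsearch options, retry_max_interval is a standard fluentd buffer retry parameter):

```
<match prod.ems.fix.**>
  @type elasticsearch
  # host/port/scheme/auth and logstash_* settings as in the original config
  request_timeout 30s        # tolerate slow ES responses before timing out (default is 5s)
  reconnect_on_error true    # re-open the connection after a failure
  <buffer>
    @type file
    path /var/log/td-agent/buffer/prod.ems.fix.*.buffer
    flush_interval 10s       # fewer bulk requests than 3s, at the cost of latency
    flush_thread_count 4     # fewer concurrent writers hammering ES
    chunk_limit_size 50m
    retry_max_interval 30s   # cap the exponential backoff between retries
  </buffer>
</match>
```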


Masahiro


Joseph Presley

Jan 25, 2019, 5:15:51 PM1/25/19
to flu...@googlegroups.com
Thank you for the link to the other user's problem. We have a requirement to keep log latency as low as possible, so I can raise the value slightly, but we do not want the default of 60s. Should I consider increasing chunk_limit_size as well?
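As a middle ground between the current 3s and the 60s default, a buffer section like the following trades a small amount of latency for fewer requests; the 5s interval and thread count are illustrative assumptions only. Note that raising chunk_limit_size means fewer but larger bulk requests, so each request takes ES longer to process, and the existing 50m is already fairly large:

```
<buffer>
  @type file
  path /var/log/td-agent/buffer/prod.ems.app.*.buffer
  flush_interval 5s        # modest latency increase over 3s, noticeably fewer requests
  flush_thread_count 4     # fewer simultaneous connections to ES
  chunk_limit_size 50m     # larger chunks = fewer, heavier bulk requests per flush
  queue_limit_length 4096
</buffer>
```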

Krzysztof Toruński

Jan 26, 2019, 2:28:21 AM1/26/19
to flu...@googlegroups.com
Hi,

We had a similar problem with ES on K8s.
The issue was in the K8s configuration for ES.
The YAML file did not specify resource requests and limits. Without them, ES was using only one CPU out of 8. After we added a request for 6 CPUs, ES started using 6 CPUs.
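For anyone hitting the same thing, the fix Krzysiek describes corresponds to a resources block in the Elasticsearch container spec. A sketch is below; the container name, image, and CPU/memory values are illustrative assumptions, not taken from his cluster:

```yaml
# Fragment of an Elasticsearch Pod/StatefulSet spec (illustrative values)
containers:
  - name: elasticsearch
    image: docker.elastic.co/elasticsearch/elasticsearch:6.5.4
    resources:
      requests:
        cpu: "6"         # without an explicit request, ES may end up using far fewer cores
        memory: 16Gi
      limits:
        cpu: "6"
        memory: 16Gi
```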

Krzysiek

Message written by Joseph Presley <jpres...@gmail.com> on 25.01.2019 at 23:15:
