connect_write timeout reached


Joseph Presley

Jan 25, 2019, 12:46:54 PM1/25/19
to flu...@googlegroups.com
I saw these errors in the fluentd logs before fluentd stopped forwarding logs to Elasticsearch. I sanitized the logs and the fluentd conf. What causes this, and how can I fix it?

failed to flush the buffer. retry_time=16 next_retry_seconds=2019-01-24 19:13:56 +0000 chunk="5803240525c1ddbbb197dc0dd46ecbfa" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>\"<SANITIZED>\", :port=>9243, :scheme=>\"https\", :user=>\"fluentd\", :password=>\"obfuscated\"}): connect_write timeout reached"

My sanitized td-agent configuration is:

<source>
  @type forward
  port 24224
</source>

<match fluent.**>
  @type elasticsearch
  host <SANITIZED>
  port 9243
  scheme https
  ssl_version TLSv1_2
  user fluentd
  password <SANITIZED>
  logstash_format true
  logstash_prefix fluentd
  type_name _doc
  <buffer>
    @type file
    path /var/log/td-agent/fluent.*.buffer
    flush_thread_count 11
    flush_interval 3s
    chunk_limit_size 50m
    queue_limit_length 4096
  </buffer>
</match>

<match prod.ems.fix.**>
  @type elasticsearch
  host <SANITIZED>
  port 9243
  scheme https
  ssl_version TLSv1_2
  user fluentd
  password <SANITIZED>
  logstash_format true
  logstash_prefix onx-prod-ems-fix
  type_name _doc
  <buffer>
    @type file
    path /var/log/td-agent/buffer/prod.ems.fix.*.buffer
    flush_thread_count 11
    flush_interval 3s
    chunk_limit_size 50m
    queue_limit_length 4096
  </buffer>
</match>

<match prod.ems.app.**>
  @type elasticsearch
  host <SANITIZED>
  port 9243
  scheme https
  ssl_version TLSv1_2
  user fluentd
  password <SANITIZED>
  logstash_format true
  logstash_prefix onx-prod-ems-app
  type_name _doc
  <buffer>
    @type file
    path /var/log/td-agent/buffer/prod.ems.app.*.buffer
    flush_thread_count 11
    flush_interval 3s
    chunk_limit_size 50m
    queue_limit_length 4096
  </buffer>
</match>

<match production.poms.fix.**>
  @type elasticsearch
  host <SANITIZED>
  port 9243
  scheme https
  ssl_version TLSv1_2
  user fluentd
  password <SANITIZED>
  logstash_format true
  logstash_prefix onx-prod-poms
  type_name _doc
  <buffer>
    @type file
    path /var/log/td-agent/buffer/prod.poms.fix.*.buffer
    flush_thread_count 11
    flush_interval 3s
    chunk_limit_size 50m
    queue_limit_length 4096
  </buffer>
</match>

<match production.poms.nginx.**>
  @type elasticsearch
  host <SANITIZED>
  port 9243
  scheme https
  ssl_version TLSv1_2
  user fluentd
  password <SANITIZED>
  logstash_format true
  logstash_prefix onx-prod-poms
  type_name _doc
  <buffer>
    @type file
    path /var/log/td-agent/buffer/prod.poms.nginx.*.buffer
    flush_thread_count 11
    flush_interval 3s
    chunk_limit_size 50m
    queue_limit_length 4096
  </buffer>
</match>

<match production.poms.nodejs.**>
  @type elasticsearch
  host <SANITIZED>
  port 9243
  scheme https
  ssl_version TLSv1_2
  user fluentd
  password <SANITIZED>
  logstash_format true
  logstash_prefix onx-prod-poms
  type_name _doc
  <buffer>
    @type file
    path /var/log/td-agent/buffer/prod.poms.nodejs.*.buffer
    flush_thread_count 11
    flush_interval 3s
    chunk_limit_size 50m
    queue_limit_length 4096
  </buffer>
</match>

Mr. Fiber

Jan 25, 2019, 4:53:39 PM1/25/19
to Fluentd Google Group
Another user hit a similar problem.


Does your ES cluster have enough resources to handle that many requests?
Your flush_interval is short, so ES receives lots of requests from fluentd
while Kibana or other apps are sending queries at the same time.
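If the cluster is being overloaded, one option is to send fewer, larger bulk requests and give slow responses more headroom. A sketch for one of the match blocks is below; the 10s/30s values and thread count are illustrative assumptions, not values recommended anywhere in this thread (request_timeout and reconnect_on_error are fluent-plugin-elasticsearch options, retry_max_interval is a standard fluentd buffer retry parameter):

```
<match prod.ems.fix.**>
  @type elasticsearch
  # host/port/scheme/auth and logstash_* settings as in the original config
  request_timeout 30s        # tolerate slow ES responses before timing out (default is 5s)
  reconnect_on_error true    # re-open the connection after a failure
  <buffer>
    @type file
    path /var/log/td-agent/buffer/prod.ems.fix.*.buffer
    flush_interval 10s       # fewer bulk requests than 3s, at the cost of latency
    flush_thread_count 4     # fewer concurrent writers hammering ES
    chunk_limit_size 50m
    retry_max_interval 30s   # cap the exponential backoff between retries
  </buffer>
</match>
```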


Masahiro


Joseph Presley

Jan 25, 2019, 5:15:51 PM1/25/19
to flu...@googlegroups.com
Thank you for the link to the other user's problem. We have a requirement to keep log latency as low as possible, so I can raise the value slightly, but we do not want the default of 60s. Should I consider increasing chunk_limit_size as well?
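As a middle ground between the current 3s and the 60s default, a buffer section like the following trades a small amount of latency for fewer requests; the 5s interval and thread count are illustrative assumptions only. Note that raising chunk_limit_size means fewer but larger bulk requests, so each request takes ES longer to process, and the existing 50m is already fairly large:

```
<buffer>
  @type file
  path /var/log/td-agent/buffer/prod.ems.app.*.buffer
  flush_interval 5s        # modest latency increase over 3s, noticeably fewer requests
  flush_thread_count 4     # fewer simultaneous connections to ES
  chunk_limit_size 50m     # larger chunks = fewer, heavier bulk requests per flush
  queue_limit_length 4096
</buffer>
```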

Krzysztof Toruński

Jan 26, 2019, 2:28:21 AM1/26/19
to flu...@googlegroups.com
Hi,

We had a similar problem with ES on K8s.
The issue was in the K8s configuration for ES.
The YAML file did not specify resource requests and limits. Without them, ES was using only one CPU out of 8. After we added a request for 6 CPUs, ES started using 6 CPUs.
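For anyone hitting the same thing, the fix Krzysiek describes corresponds to a resources block in the Elasticsearch container spec. A sketch is below; the container name, image, and CPU/memory values are illustrative assumptions, not taken from his cluster:

```yaml
# Fragment of an Elasticsearch Pod/StatefulSet spec (illustrative values)
containers:
  - name: elasticsearch
    image: docker.elastic.co/elasticsearch/elasticsearch:6.5.4
    resources:
      requests:
        cpu: "6"         # without an explicit request, ES may end up using far fewer cores
        memory: 16Gi
      limits:
        cpu: "6"
        memory: 16Gi
```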

Krzysiek

Message written by Joseph Presley <jpres...@gmail.com> on 25.01.2019 at 23:15:
