Questions about max_send_retries in the out_kafka plugin


JeremyWoo

Jul 5, 2017, 5:07:06 AM
to Fluentd Google Group
We use the out_kafka plugin to send logs to Kafka. Our configuration looks like this:

<match *.click.csv.info>
  @type kafka_buffered

  #public network traffic
  ssl_ca_cert /etc/td-agent/ssl/ca_cert.pem
  ssl_client_cert  /etc/td-agent/ssl/client_cert.pem
  ssl_client_cert_key   /etc/td-agent/ssl/client_cert_key.pem
  brokers 172.31.22.252:9093,172.31.29.31:9093,172.31.28.40:9093 and so on ....


  # buffer settings
  buffer_type file
  buffer_chunk_limit 1G
  buffer_path /var/log/td-agent/buffer/kafka_click_csv
  flush_interval 60s

  # topic settings
  default_topic test-kafka-log_txt-0

  # data type settings
  output_data_type single_value
  add_newline false
  #compression_codec gzip

  # producer settings
  max_send_retries 1
  required_acks 1
  kafka_agg_max_bytes 40960000
  num_threads 50
  slow_flush_log_threshold 60
  get_kafka_client_log true
</match>

We set the `max_send_retries` parameter to 1, which is the default, and we see some errors in the logs:

2017-07-05 16:41:29 +0800 [info]: Sending 10365 messages to 172.31.22.252:9093 (node_id=2)
2017-07-05 16:41:39 +0800 [error]: Timed out while writing request 2
2017-07-05 16:41:39 +0800 [error]: Could not connect to broker 172.31.22.252:9093 (node_id=2): Connection error: Connection timed out
2017-07-05 16:41:39 +0800 [info]: Sending 8504 messages to 172.31.24.152:9093 (node_id=8)
2017-07-05 16:41:49 +0800 [error]: Timed out while writing request 2
2017-07-05 16:41:49 +0800 [error]: Could not connect to broker 172.31.24.152:9093 (node_id=8): Connection error: Connection timed out
2017-07-05 16:41:49 +0800 [info]: Sending 8487 messages to 172.32.14.52:9093 (node_id=3)
2017-07-05 16:42:00 +0800 [error]: Timed out while writing request 2
2017-07-05 16:42:00 +0800 [error]: Could not connect to broker 172.32.14.52:9093 (node_id=3): Connection error: Connection timed out
2017-07-05 16:42:00 +0800 [info]: Sending 6898 messages to 172.32.18.152:9093 (node_id=6)
2017-07-05 16:42:10 +0800 [error]: Timed out while writing request 2
2017-07-05 16:42:10 +0800 [error]: Could not connect to broker 172.32.18.152:9093 (node_id=6): Connection error: Connection timed out
2017-07-05 16:42:10 +0800 [warn]: Failed to send all messages; attempting retry 1 of 1 after 1s
2017-07-05 16:42:11 +0800 [info]: Fetching cluster metadata from kafka://172.32.26.12:9093
2017-07-05 16:42:11 +0800 [info]: Discovered cluster metadata; nodes: 172.31.24.152:9093 (node_id=8), 172.31.22.252:9093 (node_id=2), 172.28.45.148:9093 (node_id=4), 172.32.30.12:9093 (node_id=7), 172.32.26.12:9093 (node_id=1), 172.32.14.52:9093 (node_id=3), 172.32.18.152:9093 (node_id=6)
2017-07-05 16:42:11 +0800 [info]: Sending 6958 messages to 172.32.26.12:9093 (node_id=1)
2017-07-05 16:42:11 +0800 [info]: Sending 7057 messages to 172.28.45.148:9093 (node_id=4)
2017-07-05 16:42:11 +0800 [info]: Sending 6763 messages to 172.32.30.12:9093 (node_id=7)
2017-07-05 16:42:11 +0800 [info]: Sending 10365 messages to 172.31.22.252:9093 (node_id=2)
2017-07-05 16:42:12 +0800 [info]: Sending 8504 messages to 172.31.24.152:9093 (node_id=8)
2017-07-05 16:42:12 +0800 [info]: Sending 8487 messages to 172.32.14.52:9093 (node_id=3)
2017-07-05 16:42:12 +0800 [info]: Sending 6898 messages to 172.32.18.152:9093 (node_id=6)

I'm confused about `max_send_retries`. If out_kafka fails to send logs with `Connection error: Connection timed out`, as in the logs above, it retries once. But if that retry also fails, will Fluentd drop these logs? Will they be lost?

Any answers will be appreciated
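
A note on the question above, hedged because it depends on the plugin and Fluentd versions in use: `max_send_retries` only bounds retries inside the Kafka producer during a single flush attempt. As far as I understand, once those are exhausted the plugin raises the error and the chunk stays in the Fluentd buffer, where the buffered-output retry settings apply. The sketch below assumes Fluentd v0.12-style parameter names (`retry_wait`, `max_retry_wait`, `retry_limit`, `disable_retry_limit`), which are not in the original configuration; the values are illustrative, not recommendations.

```
<match *.click.csv.info>
  @type kafka_buffered
  # ... existing brokers, SSL, topic, and buffer settings ...

  # producer-level retries within one flush attempt
  max_send_retries 1

  # buffer-level retries of the whole chunk
  # (assumed v0.12 parameter names; values are illustrative)
  retry_wait 1s
  max_retry_wait 300s
  retry_limit 17
  # disable_retry_limit true   # keep retrying instead of giving up on the chunk
</match>
```

Under these assumptions, a chunk would only be given up on after the buffer-level retries are exhausted, and a `<secondary>` output could be configured to catch such chunks instead of losing them.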

JeremyWoo

Jul 6, 2017, 6:46:04 AM
to Fluentd Google Group
Now we have found another error in the logs:

2017-07-06 17:43:00 +0800 [info]: Sending 3321 messages to 172.20.17.90:9093 (node_id=8)
2017-07-06 17:43:00 +0800 [error]: Could not connect to broker 172.20.17.90:9093 (node_id=8): Connection error: end of file reached
2017-07-06 17:43:00 +0800 [info]: Sending 3161 messages to 172.16.19.143:9093 (node_id=2)
2017-07-06 17:43:00 +0800 [error]: Could not connect to broker 172.16.19.143:9093 (node_id=2): Connection error: end of file reached
2017-07-06 17:43:00 +0800 [info]: Sending 3303 messages to 172.17.8.56:9093 (node_id=7)

Any ideas about this error?

On Wednesday, July 5, 2017 at 5:07:06 PM UTC+8, JeremyWoo wrote: