duplicate logs in elasticsearch


Fahimeh Ashrafy

Sep 11, 2016, 4:04:38 AM
to Fluentd Google Group
Hello all,
I use the EFK stack for logs. Sometimes I see duplicate logs in the Elasticsearch indexes when I query Elasticsearch, even though only one log entry exists in the log file.

My fluentd config:
<match **>
      @type elasticsearch
      num_threads 8
      buffer_type memory
      buffer_chunk_limit 500k
      buffer_queue_limit 1000
      flush_interval 60s
      disable_retry_limit false
      retry_limit 17
      retry_wait 1s
      host localhost
      port 9200
      logstash_format true
      logstash_prefix app_logs
      logstash_dateformat %Y.%m.%d
      time_key_format %Y-%m-%dT%H:%M:%S.%N%z
      index_name app_logs.${Time.at(time).getutc.strftime(@logstash_dateformat)}
      type_name app_logs
</match>

My problem is actually like the one in https://groups.google.com/forum/#!topic/fluentd/j09WOG1VaB4 (Kiyoto's answer #1). How can I fix it?

Thanks a lot

Curious

Sep 11, 2016, 6:54:34 AM
to Fluentd Google Group
Hi Fahimeh,

Do you see Fluentd retries in the fluentd logs? Or is it duplicating even without any Fluentd retries showing?

One way to solve the problem would be to generate an ID on the Fluentd side and use it as the Elasticsearch document _id. You could push with upsert mode, but even the default 'index' mode would then just create new versions of the same record, so when you look it up in Kibana you always see a single, latest version. See the sketch below.
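Something like the following might work (a rough, untested sketch; it assumes your fluent-plugin-elasticsearch version supports the id_key option, and _hash is just an example field name):

# generate a unique ID per record before it is buffered;
# a retried chunk re-sends the records with the same IDs
<filter **>
  @type record_transformer
  enable_ruby true
  <record>
    _hash ${require 'securerandom'; SecureRandom.uuid}
  </record>
</filter>

<match **>
  @type elasticsearch
  # use the _hash field as the Elasticsearch document _id, so a
  # retried record overwrites itself instead of being duplicated
  id_key _hash
  # keep the rest of your existing elasticsearch settings here
</match>

Because the ID is attached before buffering, a retry of the same chunk carries the same IDs, and Elasticsearch just bumps the document _version instead of indexing a second copy.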

Fahimeh Ashrafy

Sep 21, 2016, 3:34:07 AM
to Fluentd Google Group
Hello,

Thanks for the reply.
Yes, I can see Fluentd retries.
Could you please explain how to solve the problem?

Thanks a lot

Fahimeh Ashrafy

Sep 27, 2016, 9:10:44 AM
to Fluentd Google Group
No help?

Mr. Fiber

Sep 27, 2016, 9:31:48 AM
to Fluentd Google Group
You didn't paste logs, so we can't give a detailed reply.
Retries happen when fluentd receives an error from Elasticsearch.
A popular case is that your Elasticsearch lacks capacity, so
ES returns a temporary error and fluentd retries.
ES doesn't have transactions or a dedup feature;
this is an Elasticsearch tradeoff.
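
If the retries come from timeouts on an overloaded cluster, one mitigation is to give each bulk request more time and to re-send less data per flush. A rough sketch (request_timeout is a fluent-plugin-elasticsearch option; 30s and 256k are arbitrary values to tune for your setup):

<match **>
  @type elasticsearch
  # let slow bulk requests finish instead of timing out and
  # being retried (the plugin default is only a few seconds)
  request_timeout 30s
  # smaller chunks mean a failed flush re-sends less data
  buffer_chunk_limit 256k
  # keep the rest of your existing settings here
</match>

Combined with an id_key as suggested earlier, a retried chunk then overwrites the already-indexed records instead of duplicating them.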


Masahiro
