Hi,
I'm using the Elasticsearch plugin to write logs, as the sole output on a single machine (Fluentd runs on the same machine that produces the logs).
I got some 'error_class=Fluent::Plugin::ElasticsearchErrorHandler::ElasticsearchError error="400 - Rejected by Elasticsearch"' errors in the Fluentd log, and couldn't work out what was wrong with the formatting just by looking at the errors. So I figured that if I also wrote the logs to a file and checked which lines did not appear in Elastic, the problem might become more obvious.
I used the following output configuration:
<match REDACTED>
  @type copy
  <store>
    @type file
    path /var/log/fluent/REDACTED/REDACTED
    append true
    <format>
      localtime false
    </format>
    <buffer time>
      timekey_wait 3s
      timekey 60
      timekey_use_utc true
      path /var/log/fluent/REDACTED-file-buffer
    </buffer>
    <inject>
      time_format %Y%m%dT%H%M%S%z
      localtime false
    </inject>
  </store>
  <store>
    @type elasticsearch
    host REDACTED
    port 443
    scheme https
    reload_on_failure false
    reload_connections false
    <buffer tag, time>
      @type file
      path /var/log/fluent/REDACTED_buffer
      timekey 60
      flush_interval 60
      flush_mode interval
    </buffer>
    index_name fluentd.${tag}
  </store>
</match>
And this is where it gets weird:
The sample was about 1200 log lines. Each log line contains a unique ID, which is how I was able to identify each line individually.
* There were 8 log lines in Elastic that were not present in the file.
* There were 11 log lines in the file that were not present in Elastic.
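For anyone who wants to reproduce the comparison, here is a rough sketch of the kind of check described above. It assumes the unique IDs from each output have already been exported to two plain-text files with one ID per line; the file names and the helper names (load_ids, diff_ids, main) are hypothetical and not part of Fluentd or the Elasticsearch plugin.

```python
from typing import Iterable, Set, Tuple


def load_ids(lines: Iterable[str]) -> Set[str]:
    """Collect non-empty, whitespace-stripped IDs from an iterable of lines."""
    return {line.strip() for line in lines if line.strip()}


def diff_ids(file_ids: Set[str], es_ids: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (IDs only in the file output, IDs only in Elastic)."""
    return file_ids - es_ids, es_ids - file_ids


def main(file_path: str, es_path: str) -> None:
    """Print the IDs that are missing from either side.

    Both paths are assumed to point at text files with one ID per line.
    """
    with open(file_path) as f, open(es_path) as e:
        only_in_file, only_in_es = diff_ids(load_ids(f), load_ids(e))
    print(f"only in file ({len(only_in_file)}):", sorted(only_in_file))
    print(f"only in Elastic ({len(only_in_es)}):", sorted(only_in_es))
```

Calling main("file_ids.txt", "es_ids.txt") (hypothetical file names) lists the IDs unique to each side.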
After the experiment (which ran for about a minute), I searched the buffers and verified that they didn't contain these lines.
I have no indication why these lines are missing.
Note that I'm working with a slightly old version of Fluentd (v1.11.2) due to the use of OpenSearch, so I can't really upgrade at the moment.
Any tips for additional troubleshooting?
Thanks!
Gal