Tail Input Plugin - using * with read_from_head


AndrewR

Apr 21, 2019, 4:13:57 AM
to Fluentd Google Group
First, a big thank-you to those who write and maintain fluentd and td-agent. It's an incredibly useful and powerful application.

I'm using the tail input plugin to read log files from a directory. The files are not rotated. Instead, the application writes a new file each day, using the date as part of the filename, so I have to use * to read them, as there is a potentially infinite number of filenames. Old log files are deleted from the directory after 30 days.

I initially set up tail to read * with read_from_head false. However, I found that when a new log file was created, some of its initial lines were not captured by the tail plugin. I think that is because the plugin only notices the new log file shortly after it has been created, and then starts tailing from wherever the end of the file was when fluentd noticed it. So I think I should be able to solve this by changing read_from_head to true (only log files are in this directory). Is this approach reasonable (read_from_head true, path *.log)?
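
To make the reasoning above concrete, here is a simplified Python model of how a tailer might choose its starting offset for a newly discovered file. It is only a sketch, not fluentd's actual code: the function name `initial_offset` and its parameters are hypothetical, and it assumes that a position already saved in a pos_file takes priority over read_from_head.

```python
from typing import Optional


def initial_offset(file_size: int, saved_pos: Optional[int], read_from_head: bool) -> int:
    """Pick the byte offset at which to start reading a newly seen file.

    Hypothetical model: a saved position wins; otherwise start at the
    head (offset 0) or at the current end of the file.
    """
    if saved_pos is not None:
        return saved_pos
    return 0 if read_from_head else file_size


# A file that already holds 120 bytes by the time the tailer notices it.
print(initial_offset(120, None, read_from_head=False))  # 120: the first 120 bytes are skipped
print(initial_offset(120, None, read_from_head=True))   # 0: nothing is skipped
```

In this model, any bytes written between file creation and discovery are lost when read_from_head is false, which matches the missing initial lines described above.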

Best wishes,
Andrew

AndrewR

Apr 21, 2019, 4:41:04 AM
to Fluentd Google Group
I see that using * with tail may cause duplication, and in my test I did get a duplicate record. Is there a better way of doing this? I can't specify all file names individually (which I think is what the docs mean by "You should not use '*' with log rotation because it may cause the log duplication. In such case, you should separate in_tail plugin configuration") because there is no finite list of file names. 

Mr. Fiber

Apr 21, 2019, 10:58:39 PM
to Fluentd Google Group
> in my test I did get a duplicate record

This happens without rotation, right?
Could you write the steps to reproduce it here?


--
You received this message because you are subscribed to the Google Groups "Fluentd Google Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fluentd+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Andrew Roos

Apr 22, 2019, 3:37:01 AM
to flu...@googlegroups.com
Hi Mr Fiber

I checked this morning and the new log file was picked up perfectly, with no duplicates or omissions. I think the duplicates I experienced yesterday had a different cause. The system generating log entries runs on Windows, and I have found that the td-agent service does not always shut down properly, so on occasion I have had to kill it manually. However, after shutting it down yesterday I found there was still a ruby process running, so I think that when I killed the agent, one of the worker processes kept running, which would explain why I was receiving duplicates! I rebooted the computer and have not seen any further duplicates.

Thanks & best wishes,
Andrew


Michael Curiel

Feb 6, 2020, 9:14:03 AM
to Fluentd Google Group
I am also using tail with * for my path, as I am trying to pick up logs that are written to a directory. I have set read_from_head true, but it doesn't seem to pick up the contents of the file:
2020-02-06 08:05:49 -0600 [info]: #0 following tail of /tflu/asd.json

If I write another line to the file, it then picks up the previous line. Any suggestions?



Andrew Roos

Feb 6, 2020, 9:32:39 AM
to flu...@googlegroups.com
Hi Michael

My fluentd has been working correctly for some time, picking up JSON entries added to the log file (each line of the file is a separate valid JSON object; the file as a whole is not valid JSON). The relevant section of my conf file is:

<source>
  @type tail
  path C:/var/log/hp/*.log
  pos_file C:/var/log/hp/hp.pos
  tag hp02
  read_from_head true
  <parse>
    @type json
  </parse>
  emit_unmatched_lines true
</source>

This seems to correctly pick up a single line appended to the log file, and also handles log file rotation correctly (a new log file is written each day). 
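
As an illustration of the log format described above (one JSON object per line, with the file as a whole not being valid JSON), here is a small Python sketch. The helper name `parse_json_lines` and the sample records are made up for the example; they are not taken from the actual log files.

```python
import json
from typing import Any, Dict, List


def parse_json_lines(text: str) -> List[Dict[str, Any]]:
    """Parse text in which every non-empty line is its own JSON object."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Invented sample data: two records, each terminated by a newline.
sample = (
    '{"level": "info", "msg": "started"}\n'
    '{"level": "warn", "msg": "disk space low"}\n'
)

records = parse_json_lines(sample)
print(len(records), "records parsed line by line")

# Parsing the whole text as a single JSON document fails, because a
# second object follows the first one.
try:
    json.loads(sample)
except json.JSONDecodeError as exc:
    print("whole file is not valid JSON:", exc.msg)
```

This is why the json parser in the source above is applied per line rather than to the file as a whole.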

Hope this helps,
Andrew




MFC191919

Feb 6, 2020, 9:54:40 AM
to Fluentd Google Group
Thanks Andrew, I used your source to test with.  My file has these contents:
{"host":"192.168.0.1","size":777,"method":"PUT"}
{"host":"192.168.0.2","size":777,"method":"PUT"}
{"host":"192.168.0.3","size":777,"method":"PUT"}
{"host":"192.168.0.4","size":777,"method":"PUT"}
{"host":"192.168.0.5","size":777,"method":"PUT"}
{"host":"192.168.0.6","size":777,"method":"PUT"}
{"host":"192.168.0.7","size":777,"method":"PUT"}

Using your source, here is what tail picked up when I started td-agent:
2020-02-06 08:51:53 -0600 [info]: #0 following tail of C:/var/log/hp/asd.log
2020-02-06 08:51:53.161185000 -0600 hp02: {"host":"192.168.0.1","size":777,"method":"PUT"}
2020-02-06 08:51:53.161185000 -0600 hp02: {"host":"192.168.0.2","size":777,"method":"PUT"}
2020-02-06 08:51:53.161185000 -0600 hp02: {"host":"192.168.0.3","size":777,"method":"PUT"}
2020-02-06 08:51:53.161185000 -0600 hp02: {"host":"192.168.0.4","size":777,"method":"PUT"}
2020-02-06 08:51:53.161185000 -0600 hp02: {"host":"192.168.0.5","size":777,"method":"PUT"}
2020-02-06 08:51:53.161185000 -0600 hp02: {"host":"192.168.0.6","size":777,"method":"PUT"}

It didn't pick up the last line of JSON in my file. Do you ever see anything like this?

Andrew Roos

Feb 6, 2020, 10:17:24 AM
to flu...@googlegroups.com
I disabled fluentd, copied and pasted your example into a new file in my log directory, and restarted fluentd. Within a minute or so, the Mongo database where my log file entries end up contained all seven entries:

[Image: fluentd.png]
Note that when I created the log file, I was careful to add a single newline to the end of the last entry, so the cursor was on the next line (otherwise, fluentd doesn't know that the line is complete and won't pick it up). 
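
The newline behaviour described above can be sketched in Python roughly as follows. This is a minimal illustration of the general idea (a line-oriented reader only emits lines that end in a newline and holds back the rest), not fluentd's implementation; the function name `split_complete_lines` is hypothetical.

```python
from typing import List, Tuple


def split_complete_lines(buffer: bytes) -> Tuple[List[bytes], bytes]:
    """Split a buffer into newline-terminated lines and an unterminated remainder.

    The remainder is whatever follows the last newline. A tailer would
    keep it back until more data (including a newline) arrives.
    """
    *lines, remainder = buffer.split(b"\n")
    # Drop a trailing carriage return so Windows-style line endings also work.
    return [line.rstrip(b"\r") for line in lines], remainder


# The last record has no trailing newline, so it is held back.
data = b'{"host":"192.168.0.6"}\n{"host":"192.168.0.7"}'
lines, pending = split_complete_lines(data)
print(lines)    # [b'{"host":"192.168.0.6"}']
print(pending)  # b'{"host":"192.168.0.7"}'
```

Once a newline is appended after the last record, the remainder becomes empty and that record is emitted along with the others.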

Regards
Andrew



MFC191919

Feb 6, 2020, 11:28:09 AM
to Fluentd Google Group
OK, that makes sense. I have it working now. Thank you for the help!