fluentd format nginx


Manpreet Nehra

Mar 21, 2014, 3:31:25 AM
to flu...@googlegroups.com
I am trying to process nginx logs using the tail plugin to feed them into Elasticsearch with the following config:

<source>
    type tail
    path /var/log/nginx/access.log
    pos_file /var/log/td-agent/access.log.pos
    tag hostname.application
    format  nginx
</source>

<match **>
  type elasticsearch
  host xxx.xxx.xxx.xxx
  port 9200
  index_name hostname.colo
  logstash_format true
  logstash_prefix hostname.colo
  logstash_dateformat %Y.%m.%d
  utc_index true
  include_tag_key   true
  tag_key hostname.colo
</match>

When I start td-agent, all I see in the logs are "pattern not match" warnings like this:

2014-03-21 13:26:10 +0000 [warn]: pattern not match: "192.168.10.10 192.168.10.5 - [21/Mar/2014:13:26:10 +0000] - \"GET /website/subdir/index.thml HTTP/1.1\" 499 0 \"http://www.example.com/website/\" \"Mozilla/5.0 (Windows NT 5.1; rv:27.0) Gecko/20100101 Firefox/27.0\""

Kiyoto Tamura

Mar 21, 2014, 3:44:15 AM
to flu...@googlegroups.com
Hi Manpreet,

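Your nginx log has an extra "-" field after the timestamp, which the regexp behind Fluentd's built-in nginx format does not expect. Try the following setup, which replaces "format nginx" with a regexp that skips that field (the added "[^ ]*" after the timestamp):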


<source>
    type tail
    path /var/log/nginx/access.log
    pos_file /var/log/td-agent/access.log.pos
    tag hostname.application
    format /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] [^ ]* "(?<method>\S+)(?: +(?<path>[^\"]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$/
</source>

<match **>
  type elasticsearch
  host xxx.xxx.xxx.xxx
  port 9200
  index_name hostname.colo
  logstash_format true
  logstash_prefix hostname.colo
  logstash_dateformat %Y.%m.%d
  utc_index true
  include_tag_key   true
  tag_key hostname.colo
</match>
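
Since Fluentd format regexps are Ruby regexps, a pattern like the one above can be checked against a sample line in plain Ruby before restarting td-agent. The following is a minimal sketch under that assumption: the helper name parse and the constant names are made up for illustration, and WITHOUT_EXTRA_FIELD is simply the same pattern with the added "[^ ]*" token removed, not a copy of Fluentd's built-in nginx pattern.

```ruby
# Pattern copied from the format line above. Ruby supports the same
# (?<name>...) named groups that the format option relies on.
CUSTOM_FORMAT = /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] [^ ]* "(?<method>\S+)(?: +(?<path>[^\"]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$/

# Assumption for comparison only: the same pattern minus the " [^ ]*" token,
# i.e. a format that expects the request to follow the timestamp directly.
WITHOUT_EXTRA_FIELD = /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$/

# Log line from the "pattern not match" warning, with the \" escapes undone.
LINE = '192.168.10.10 192.168.10.5 - [21/Mar/2014:13:26:10 +0000] - "GET /website/subdir/index.thml HTTP/1.1" 499 0 "http://www.example.com/website/" "Mozilla/5.0 (Windows NT 5.1; rv:27.0) Gecko/20100101 Firefox/27.0"'

# Return the named captures as a Hash, or nil when the line does not match.
def parse(pattern, line)
  m = pattern.match(line)
  m && m.names.zip(m.captures).to_h
end

[['custom format', CUSTOM_FORMAT], ['without extra field', WITHOUT_EXTRA_FIELD]].each do |label, pattern|
  record = parse(pattern, LINE)
  if record
    puts "#{label}: matched"
    record.each { |name, value| puts "  #{name}: #{value}" }
  else
    puts "#{label}: pattern not match"
  end
end
```

Only the pattern with the extra "[^ ]*" token matches this line. The other one fails at the "-" between the timestamp and the request.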


Kiyoto



madz i

Apr 28, 2014, 8:57:13 AM
to flu...@googlegroups.com
I'm testing the following regexp to parse nginx access.log because the default nginx format does not work.

Line from log:
88.8.135.188 - admin [27/Apr/2014:12:52:15 +0000] "GET /manager/html HTTP/1.1" 404 3696 "-" "-" "-"

Regexp:
^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)" "(?<xxx>[^\"]*)")?$

Time format:
%d/%b/%Y:%H:%M:%S %z

I'm using Fluentular to test it.
The strange thing is that when I rename the <time> group to <x>, it works, so there seems to be some problem with time parsing.
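
To separate a regexp problem from a time-parsing problem, the two steps can be run independently in plain Ruby. This is a minimal sketch, assuming Ruby's Time.strptime is an acceptable stand-in for how the time format gets applied. The helper name parse_access_line and the constant names are hypothetical.

```ruby
require 'time'

# Regexp, sample line, and time format copied from the post above.
ACCESS_FORMAT = /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)" "(?<xxx>[^\"]*)")?$/
ACCESS_LINE = '88.8.135.188 - admin [27/Apr/2014:12:52:15 +0000] "GET /manager/html HTTP/1.1" 404 3696 "-" "-" "-"'
TIME_FORMAT = '%d/%b/%Y:%H:%M:%S %z'

# Hypothetical helper: step 1 matches the line, step 2 parses the captured
# <time> text. Returns [fields, time], or nil when the regexp does not match.
# Time.strptime raises ArgumentError if the text does not fit the format.
def parse_access_line(line, pattern, time_format)
  m = pattern.match(line)
  return nil unless m
  fields = m.names.zip(m.captures).to_h
  [fields, Time.strptime(fields['time'], time_format)]
end

result = parse_access_line(ACCESS_LINE, ACCESS_FORMAT, TIME_FORMAT)
if result
  fields, time = result
  puts fields
  puts time.utc
else
  puts 'pattern not match'
end
```

If both steps succeed here while a test tool still rejects the same input, the difference is more likely in the tool than in the regexp or the time format.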

madz i

Apr 28, 2014, 9:31:12 AM
to flu...@googlegroups.com
Actually, it works fine on the server, but not with Fluentular.

Masahiro Nakagawa

Apr 28, 2014, 12:14:40 PM
to flu...@googlegroups.com
I found a bug in Fluentular and just sent a patch.





Masahiro Nakagawa

Apr 28, 2014, 12:41:12 PM
to flu...@googlegroups.com
The PR above is already merged.
I tried your example on Fluentular and it worked.

Please check again :)

Intrinsic Innovation

May 1, 2017, 6:20:15 AM
to Fluentd Google Group
Hi,
Thank you, everyone above, for such a great discussion. It has helped me a lot with my nginx configuration. I am also getting pattern errors when the nginx logs are included.

This is the log file line:

2017/05/01 16:00:05 [error] 1008#0: *34363 connect() failed (111: Connection refused) while connecting to upstream, client: <client ip address>, server: mydomain.com, request: "GET /tester HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "<some ip address>"


This is the format field in the <source> section:

format  /^ (<time[^\]]) \[error\]  [^\]]: (<Message>(.*)),(<client>client: (.*)),(<server>server:(.*)),(<request>request:(.*)),(<upstream>upstream:(.*)),(<host>host:(.*))?$/

Can you tell me what is wrong with this? I keep getting errors.

Regards
Umashankar
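
A few things in the format above can be checked against Ruby regexp syntax, which Fluentd uses. A named group is written (?<name>...), so "(<time[^\]])" matches the literal text "<time" plus one character instead of capturing anything. "[^\]]" without a quantifier matches exactly one character. "^ " requires the line to start with a space. The log line separates fields with ", " (comma and space), which the pattern does not allow for. The following is a minimal sketch of one pattern that matches the sample line, not a confirmed fix from this thread. The group names log_level, pid, tid, and cid, the constant names, and the time format are assumptions chosen for illustration.

```ruby
require 'time'

# Sample error log line copied from the post above.
ERROR_LINE = '2017/05/01 16:00:05 [error] 1008#0: *34363 connect() failed (111: Connection refused) while connecting to upstream, client: <client ip address>, server: mydomain.com, request: "GET /tester HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "<some ip address>"'

# Hypothetical pattern: every capture uses the (?<name>...) syntax, and the
# ", key: " separators of the log line are spelled out literally.
ERROR_FORMAT = /^(?<time>[^ ]+ [^ ]+) \[(?<log_level>\w+)\] (?<pid>\d+)#(?<tid>\d+): \*(?<cid>\d+) (?<message>.*?), client: (?<client>[^,]*), server: (?<server>[^,]*), request: "(?<request>[^"]*)", upstream: "(?<upstream>[^"]*)", host: "(?<host>[^"]*)"$/

# Assumed strptime format for the "2017/05/01 16:00:05" timestamp.
ERROR_TIME_FORMAT = '%Y/%m/%d %H:%M:%S'

# Return the named captures as a Hash, or nil when the line does not match.
def parse_error_line(line)
  m = ERROR_FORMAT.match(line)
  m && m.names.zip(m.captures).to_h
end

record = parse_error_line(ERROR_LINE)
if record
  record.each { |name, value| puts "#{name}: #{value}" }
  puts Time.strptime(record['time'], ERROR_TIME_FORMAT)
else
  puts 'pattern not match'
end
```

This pattern only covers error lines that carry all of the client, server, request, upstream, and host fields. Lines with fewer fields would need the trailing parts made optional.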