How to parse a GELF dict forcing a timestamp addition


Chris Broll

Feb 14, 2018, 7:14:30 AM
to Fluentd Google Group

I am using logspout with a GELF plugin (vincit/logspout-gelf) to send GELF to fluentd (td-agent). Fluentd is configured with a GELF input plugin (MerlinDMC/fluent-plugin-input-gelf). The aim is to capture Docker container logs and send them to a single forwarder with one egress point from the EC2 host. The forwarder already hoovers up all the EC2 hosts' syslogs happily and adds the correct date/time to each event.

When I start fluentd with this config, my logs are forwarded to Graylog but the timestamp is wrong (it looks like epoch + uptime):


<source>
  type gelf
  protocol_type udp
  port 12202
  tag stuff
</source>


A logged event looks like this:

1970-01-01 01:33:38 +0100 stuff: {"version":"1.1","host":"server1","short_message":"2018-02-09T16:07:25.546Z [access-log] ::ffff:1.2.3.4 - \"GET /find HTTP/1.1\" 200 5567 \"-\" \"ELB-HealthChecker/2.0\"1234","level":3,"image_id":"sha256:12345","image_name":"hello-world","container_id":"12345","container_name":"hello_world-task","command":"node bin/hello_world.js"}
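A quick arithmetic check on the logged time above may help narrow this down. The sketch below converts 1970-01-01 01:33:38 +0100 to seconds since the epoch. The second half is only a hypothesis, not something confirmed in this thread: if the sender puts an ISO 8601 string in the GELF timestamp field and the input plugin coerces it leniently to a number (Ruby's String#to_f reads leading digits this way), the result would be the year rather than an epoch value. The helper `leading_number` is hypothetical and only mimics that kind of coercion.

```python
import re
from datetime import datetime, timedelta, timezone

# The event time exactly as fluentd printed it: 1970-01-01 01:33:38 +0100
logged = datetime(1970, 1, 1, 1, 33, 38, tzinfo=timezone(timedelta(hours=1)))
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
seconds_since_epoch = int((logged - epoch).total_seconds())


def leading_number(text):
    """Hypothetical helper: keep only the leading numeric part of a string.

    This imitates lenient string-to-float coercion; it is an assumption
    about what might happen upstream, not the plugin's actual code.
    """
    m = re.match(r"\s*[+-]?\d+(?:\.\d+)?", text)
    return float(m.group(0)) if m else 0.0


# An ISO 8601 string of the kind that shows up in the "timestamp" field
coerced = leading_number("2018-02-09T16:14:45.990867153Z")

print(seconds_since_epoch, coerced)
```

If the two numbers agree, the odd date is more likely a string timestamp being read as a number than epoch + uptime. Checking how the plugin reads the timestamp field would confirm or rule this out.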


Adding a little formatting adds the correct date but loses all the field indexing:

format /^(?<time>[^ ]* [^ ]*) (?<message>.*)$/


The output now has the correct date but the only field indexed is of course message:

2018-02-09 16:14:45 +0000 stuff: {"message":"::ffff:1.2.3.4 - \\\"GET /find HTTP/1.1\\\" 200 5567 \\\"-\\\" \\\"ELB-HealthChecker/2.0\\\"1234\",\"timestamp\":\"2018-02-09T16:14:45.990867153Z\",\"level\":3,\"image_id\":\"sha256:12345\",\"image_name\":\"hello-world\",\"container_id\":\"12345\",\"container_name\":\"hello_world-task\",\"command\":\"node bin/hello_world.js\"}"}


So I guess the timestamp handling in the plugin "MerlinDMC/fluent-plugin-input-gelf" is broken. How do I parse the event and still index the fields?
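Since a GELF payload is JSON, one option is to parse it as JSON and read the time from the front of short_message, rather than matching the whole payload with a single regex. The sketch below shows that idea outside fluentd, using the sample event from this post. The names `split_event` and `TS_PREFIX` are made up for illustration, and the sketch assumes every short_message starts with a UTC ISO 8601 timestamp as in the sample.

```python
import json
import re
from datetime import datetime, timezone

# Sample GELF payload copied from the logged event above (raw string keeps the \" escapes)
RAW = r'''{"version":"1.1","host":"server1","short_message":"2018-02-09T16:07:25.546Z [access-log] ::ffff:1.2.3.4 - \"GET /find HTTP/1.1\" 200 5567 \"-\" \"ELB-HealthChecker/2.0\"1234","level":3,"image_id":"sha256:12345","image_name":"hello-world","container_id":"12345","container_name":"hello_world-task","command":"node bin/hello_world.js"}'''

# Leading timestamp of short_message: date, time, optional fraction, literal Z
TS_PREFIX = re.compile(
    r"^(?P<base>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(?P<frac>\d+))?Z\s+(?P<rest>.*)$",
    re.S,
)


def split_event(raw):
    """Return (epoch_seconds or None, record) with every JSON field kept."""
    record = json.loads(raw)
    m = TS_PREFIX.match(record.get("short_message", ""))
    if m is None:
        return None, record  # no recognisable timestamp; leave the record untouched
    base = datetime.strptime(m.group("base"), "%Y-%m-%dT%H:%M:%S")
    base = base.replace(tzinfo=timezone.utc)
    frac = float("0." + m.group("frac")) if m.group("frac") else 0.0
    record["short_message"] = m.group("rest")
    return base.timestamp() + frac, record


event_time, record = split_event(RAW)
print(event_time, sorted(record))
```

All the original fields (host, level, image_name and so on) survive as separate keys, and the event time comes from the message itself instead of from the broken value.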


I started off trying this in Fluentular. Am I on the right track?


^*:\"(?<version>\d{1}.\d{1})\"\,\"host\"\:\"(?<host>[^ ]*)\"\,


Chris Broll

Feb 14, 2018, 9:00:23 AM
to Fluentd Google Group
So I went off and built a regex that would parse the string as it is displayed in the fluentd log:

format /^\"\{\"version\"\,\:\"(?<version>\d{1}.\d{1})\"\,\"host\"\:\"(?<host>[^ ]*)\"\,\"short_message\"\:\"(?<short_message>.*)\"\,\"level\"\:(?<level>\d{1})\,\"image_id\"\:\"(?<image_id>.*)\"\,\"image_name\"\:\"(?<image_name>.*)\"\,\"container_id\"\:\"(?<container_id>.*)\"\,\"container_name\"\:\"(?<container_name>.*)\"\,\"command\"\:\"(?<command>.*)\"\}\"/

time_format %d/%b/%Y:%H:%M:%S %z

This works very well in Fluentular and matches each field correctly, but when I put it into fluentd I get:

2018-02-14 13:37:12 +0000 [warn]: pattern not match: "{\"version\":\"1.1\",\"host\":\"server1\",\"short_message\":\"2018-02-09T16:07:25.546Z [access-log] ::ffff:1.2.3.4 - \\"GET /find HTTP/1.1\\" 200 5567 \\"-\\" \\"ELB-HealthChecker/2.0\\"1234\",\"level\":3,\"image_id\":\"sha256:12345\",\"image_name\":\"hello-world\",\"container_id\":\"12345\",\"container_name\":\"hello_world-task\",\"command\":\"node bin/hello_world.js\"}"

What gives? What backslash escaping hell have I fallen into?
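One possible explanation, offered as an assumption rather than something confirmed in this thread: the warning may print the payload in an escaped, quoted form (Ruby's String#inspect produces this style), so the outer quotes and the backslashes in front of the inner quotes would be display artefacts, not characters in the string the parser actually sees. A regex built against the displayed form would then fail on the real input. The sketch below reproduces the effect in Python, with `json.dumps` standing in for the escaped rendering and a shortened payload.

```python
import json
import re

# Shortened payload: this is what a parser would actually receive
raw = '{"version":"1.1","host":"server1","level":3}'

# Escaped rendering of the same string, similar in style to the warning line
shown = json.dumps(raw)

# Pattern written against the *displayed* text: expects a leading quote
anchored_on_display = re.compile(r'^"\{')

# Pattern written against the *raw* text: no outer quote, no backslashes
anchored_on_raw = re.compile(
    r'^\{"version":"(?P<version>\d\.\d)","host":"(?P<host>[^"]*)"'
)

print(shown)
print(bool(anchored_on_display.search(shown)))  # matches the displayed form
print(bool(anchored_on_display.search(raw)))    # fails on the real input
print(anchored_on_raw.search(raw).groupdict())
```

If this is what is happening, testing the regex in Fluentular against the unescaped JSON (no outer quotes, plain `"` characters) should show the same mismatch that fluentd reports.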