Format in Kafka Plugin

Carlos Martínez

Nov 22, 2016, 6:09:59 PM

to Fluentd Google Group

Hi,

Scenario

I have a Kafka instance with a topic that receives the Apache logs in their original format, not transformed.

On one side, these logs are read and stored in HDFS, to keep all the Apache logs for a while in their original format.

On the other side, I want to transform them to JSON and upload them to Solr to be indexed and queried there.
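
As a rough sketch of the transformation meant here (plain Ruby, not Fluentd code; the regular expression and the `apache_line_to_json` helper are assumptions for illustration and only cover the usual common/combined layout):

```ruby
require 'json'
require 'time'

# Assumed pattern for the Apache "common"/"combined" log layout.
# Adjust it to the LogFormat actually configured on the server.
APACHE_LINE = /\A(?<host>\S+) (?<ident>\S+) (?<user>\S+) \[(?<time>[^\]]+)\] "(?<method>\S+) (?<path>\S+) (?<protocol>[^"]*)" (?<code>\d{3}) (?<size>\d+|-)(?: "(?<referer>[^"]*)" "(?<agent>[^"]*)")?\z/

# Hypothetical helper: turns one raw Apache log line into a JSON string,
# or returns nil when the line does not match the assumed layout.
def apache_line_to_json(line)
  m = APACHE_LINE.match(line.chomp)
  return nil unless m

  record = m.names.zip(m.captures).to_h
  record['code'] = record['code'].to_i
  record['size'] = record['size'] == '-' ? nil : record['size'].to_i
  # Normalize the Apache timestamp to ISO 8601 in UTC.
  record['time'] = Time.strptime(record['time'], '%d/%b/%Y:%H:%M:%S %z').utc.iso8601
  JSON.generate(record)
end
```

For example, `apache_line_to_json('192.168.0.1 - - [22/Nov/2016:18:09:59 +0100] "GET /index.html HTTP/1.1" 200 1234 "-" "curl/7.50"')` returns a JSON document with fields such as `host`, `method`, `path`, `code` and `time`.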

Problem

I am using the Kafka plugin https://github.com/fluent/fluent-plugin-kafka to read the Apache logs from Kafka.

The problem is that the Kafka plugin seems to accept only the following formats:

format <input text type (text|json|ltsv|msgpack)> :default => json

I wanted to use a format "apache", as in_tail has.

But this is not an option.

I then read:

http://docs.fluentd.org/articles/parser-plugin-overview

To address such cases, for v0.10.46 and above, Fluentd has a pluggable system that enables the user to create their own parser formats.
How To Use

Write a custom format plugin. See here for more information.
From any input plugin that supports the “format” field, call the custom plugin by its name. Here is an example with in_tail.

I added the Apache parser to /etc/td-agent/plugin/:

ls /etc/td-agent/plugin/
parser_apache.rb

But the Kafka plugin is not picking it up.

Looking at the source code (fluent-plugin-kafka-0.3.1/lib/fluent/plugin/in_kafka.rb), it does not seem to load other formats, although the Fluentd documentation quoted above suggests it should.

Questions

Am I doing anything wrong or missing something? Can what I want to do be done easily?
On a personal level, I don't understand the Fluentd design. Why would each plugin (kafka, in_tail, etc.) need to create its own parser for the same type of data? Wouldn't it have made more sense to create a type "apache" that any plugin could read?

Thanks a lot!

Carlos

Mr. Fiber

Nov 25, 2016, 8:48:16 AM

to Fluentd Google Group

> Am I doing anything wrong or missing something? Can what I want to do be done easily?

https://github.com/fluent/fluent-plugin-kafka/blob/a413c39cad338da1a89a6b094870540b80551b0a/lib/fluent/plugin/in_kafka.rb#L101

https://github.com/fluent/fluent-plugin-kafka/blob/master/lib/fluent/plugin/in_kafka_group.rb#L87

fluent-plugin-kafka's input plugins don't use Fluentd's parser mechanism.

This is why you can't use other parsers.
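
To illustrate the difference, here is a simplified sketch. It is not the plugin's actual source: `PARSER_REGISTRY`, `register_parser` and `fixed_format_parser` are hypothetical names, and the `message` key and the error raised for unknown formats are choices of this sketch only.

```ruby
require 'json'

# Hypothetical stand-in for a pluggable parser registry:
# custom parsers register themselves under a name.
PARSER_REGISTRY = {}

def register_parser(name, &block)
  PARSER_REGISTRY[name] = block
end

# A custom "apache" parser is registered...
register_parser('apache') { |text| { 'raw_apache' => text } }

# ...but a fixed dispatch like this one only knows its built-in names
# and never consults the registry, so "apache" cannot be selected.
def fixed_format_parser(format)
  case format
  when 'json' then ->(value) { JSON.parse(value) }
  when 'text' then ->(value) { { 'message' => value } }
  else raise ArgumentError, "unsupported format: #{format}"
  end
end
```

With this sketch, `fixed_format_parser('text')` and `fixed_format_parser('json')` return a callable, while `fixed_format_parser('apache')` raises even though a parser with that name was registered.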

Masahiro
