FileSourceConnector to read multiple files (Please help!)


SMJ

Apr 5, 2017, 7:22:32 PM
to Confluent Platform
We have a scenario where a Kafka producer should read a set of incoming files and publish their contents to Kafka topics. I've read about the FileSourceConnector (http://docs.confluent.io/3.1.0/connect/connect-filestream/filestream_connector.html), but it reads only one file and sends new lines as they are appended to that file. File rotation is not handled. A few questions:
1) Is it better to implement our own producer code to meet this requirement, or can we extend the file connector class so that it reads new files and sends them to Kafka topics?
2) Is there any other source connector that can be used in this scenario?

In terms of performance and ease of development, which approach is better: developing our own producer code to read files and send them to Kafka, or extending the connector code and making changes to it?

Any kind of feedback will be greatly appreciated!
Thank you!

Ewen Cheslack-Postava

Apr 7, 2017, 1:57:11 AM
to Confluent Platform
The file connectors are really just meant to be a demo. For something that will pull a full directory of log files, you might try https://github.com/jcustenborder/kafka-connect-spooldir.

In general, if you're trying to get data to/from a source/sink and it sounds like something other people will want to do, Connect is the better approach since it provides a ton of built-in support and flexibility that you'd have to reproduce in your custom solution. In fact, even when you're building something customized to your own system, it still may be a better solution. Using the producer to do this directly mainly makes sense if you need fine-grained low-level control over the producer itself and you're limited to a single node so none of the scalability/fault tolerance aspects of Connect would help you.
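
To make the trade-off concrete, here is a rough sketch of the file-handling part of a single-node custom solution. It is only an illustration under stated assumptions, not how any connector is implemented: `drain_directory`, the `finished_dir` convention, the `*.log` pattern, and the `send(topic, value)` callback are all hypothetical names, and `send` stands in for whatever producer client call you would actually use.

```python
import shutil
from pathlib import Path
from typing import Callable


def drain_directory(
    input_dir: Path,
    finished_dir: Path,
    send: Callable[[str, str], None],  # hypothetical stand-in for a producer send call
    topic: str,
    pattern: str = "*.log",  # assumed file naming, adjust as needed
) -> int:
    """Send each line of every matching file, then move the file aside.

    Returns the number of lines handed to `send`.
    """
    finished_dir.mkdir(parents=True, exist_ok=True)
    sent = 0
    # Sort by name so files are processed in a stable order.
    for path in sorted(input_dir.glob(pattern)):
        if not path.is_file():
            continue
        with path.open("r", encoding="utf-8") as handle:
            for line in handle:
                send(topic, line.rstrip("\n"))
                sent += 1
        # Moving the file is the only record that it was processed,
        # so a later scan will not pick it up again.
        shutil.move(str(path), str(finished_dir / path.name))
    return sent
```

Calling this in a loop would cover the basic "pick up new files" case. It deliberately ignores files that are still being written, crashes halfway through a file, delivery failures, and running on more than one node, which is the kind of work you take on yourself when you go the custom route.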

-Ewen
