Hi,
In our use case we are currently using the Kafka JDBC connector and Kafka Streams to process the data.
The requirement is to store the processed data in Parquet format in HDFS. The data flows through the following sequence:
1. The JDBC connector talks to a MySQL database and reads the data.
2. The JDBC connector sends the data to the Kafka broker on a topic, say 'test-mysql-kafka'.
3. A Kafka Streams consumer reads the data from the topic 'test-mysql-kafka'.
4. The data we receive in Kafka Streams is in Avro format.
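For context, steps 3 and 4 above look roughly like this in our application (a minimal sketch; the application id, broker address, and schema registry URL are placeholders, and we assume Confluent's GenericAvroSerde for the Avro records):

```java
import java.util.Properties;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.KStream;
import io.confluent.kafka.streams.serdes.avro.GenericAvroSerde;

public class MysqlTopicConsumer {

    // Build the topology: consume Avro records from the topic
    // populated by the JDBC connector (steps 3 and 4 above).
    public static Topology buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<GenericRecord, GenericRecord> stream =
                builder.stream("test-mysql-kafka");
        // Placeholder for our processing logic; this is where we would
        // like to hand the records off for Parquet conversion and HDFS.
        stream.foreach((key, value) -> System.out.println(value));
        return builder.build();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder config values; schema.registry.url is needed by the Avro Serde.
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "test-mysql-stream");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, GenericAvroSerde.class);
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, GenericAvroSerde.class);
        props.put("schema.registry.url", "http://localhost:8081");

        // Requires a running broker and schema registry at the addresses above.
        new KafkaStreams(buildTopology(), props).start();
    }
}
```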
We want to convert this data to Parquet and store it in HDFS.
Does Kafka Streams, or any other available library, support both the data conversion and storing the result in HDFS?
Thanks,
OmkarSabane