Kafka real time ingestion not recognizing parquet file?


Didip Kerabat

Mar 23, 2021, 4:35:35 AM
to Druid User
I got this error:

java.lang.RuntimeException: file:/opt/apache-druid-0.20.1/var/ebs/middlemanager/task/index_kafka_mydata_55dbb2406a8d4eb_fmefgkmf/work/indexing-tmp/druid-input-entity1367321146152770123.tmp is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [-67, -17, -65, -67]

What I did:

1. Pushed one parquet file into kafka:

export KAFKA_OPTS="-Dfile.encoding=UTF-8"
kafka-console-producer.sh --broker-list kafka-0.kafka.default.svc.cluster.local:9092 --topic mydata < ./part-00000-ab6c8913-d5aa-41db-8ae6-b4acf0ad0926-c000.gz.parquet

2. Made sure my supervisor is running and reading the correct data.

3. And then I got the task error above. Is there something wrong with the way I push the parquet file into Kafka?
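
The byte values in the error message can be decoded to see what happened to the file. The sketch below is an illustration, not part of the original post: it converts the Java-style signed bytes to unsigned ones and then simulates a text-mode round trip. The helper names `to_unsigned` and `text_round_trip` and the four-byte `sample` are made up for the example. It assumes the console producer reads standard input as UTF-8 text, one message per line, which is consistent with the bytes Druid reports.

```python
def to_unsigned(signed_bytes):
    """Convert Java-style signed bytes (-128..127) into a Python bytes object."""
    return bytes(b & 0xFF for b in signed_bytes)


def text_round_trip(raw):
    """Simulate a tool that decodes binary input as UTF-8 text and re-encodes it.

    Byte sequences that are not valid UTF-8 become U+FFFD (the replacement
    character), which encodes back to the three bytes EF BF BD.
    """
    return raw.decode("utf-8", errors="replace").encode("utf-8")


# Values copied from the error message.
expected = to_unsigned([80, 65, 82, 49])
found = to_unsigned([-67, -17, -65, -67])

print(expected)      # b'PAR1', the Parquet magic number
print(found.hex())   # bdefbfbd

# The last three bytes found are exactly one UTF-8 replacement character.
print(found[1:] == "\ufffd".encode("utf-8"))  # True

# Hypothetical binary input: two bytes that can never appear in valid UTF-8.
sample = b"\x00\x01\xff\xfe"
mangled = text_round_trip(sample)
print(mangled.hex())           # 0001efbfbdefbfbd
print(mangled[-4:] == found)   # True
```

The tail Druid found is what a text round trip leaves behind, which suggests the Parquet bytes were altered (and split at newline bytes) on the way into Kafka rather than being misread by Druid.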

nilden tutalar

Mar 23, 2021, 9:20:53 AM
to druid...@googlegroups.com
Hi,

For consuming data from a Kafka topic, the commonly used formats are Avro and JSON. Parquet is convenient for sending data to S3 buckets, where it is written and read as a whole file. Kafka is different: whatever you produce is consumed message by message (by a Kafka consumer or, in the end, a database), so you should use a record-oriented format such as Avro or JSON.
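
As a rough sketch of the JSON route, assuming the rows have already been read out of the Parquet file into plain dictionaries: the `rows` list, its field names, and the `to_json_lines` helper are all hypothetical. The idea is to write one JSON object per line, since a line-oriented producer sends each line as a separate message.

```python
import json

# Hypothetical records standing in for rows read from the Parquet file.
rows = [
    {"timestamp": "2021-03-23T04:35:35Z", "user": "alice", "value": 1},
    {"timestamp": "2021-03-23T04:35:36Z", "user": "bob", "value": 2},
]


def to_json_lines(records):
    """Serialize records as newline-delimited JSON, one object per line.

    json.dumps escapes newlines inside string values, so every record
    stays on a single line.
    """
    return "".join(json.dumps(record) + "\n" for record in records)


payload = to_json_lines(rows)
print(payload, end="")
```

The resulting text could be saved to a file and redirected into the console producer in place of the Parquet file, with the supervisor configured for a JSON input format.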


Didip Kerabat

Mar 23, 2021, 9:44:37 PM
to Druid User
But it seems like Druid can read the Parquet file just fine; the Connect button seems to work as expected.

The logs looked fine as well, until the last second...
