Hi,
I loaded data from a few Kafka topics into MapRFS using the Camus ETL tool. It created directories like /tmp/topics, /tmp/exec, and /tmp/exec/history.
When I read a file such as /tmp/topics/test/hourly/2015/05/22/09/test.1.0.27.48.1432137600000.snappy generated by the Camus job, I get the following output:
$ hadoop fs -text /tmp/topics/test/hourly/2015/05/22/09/test.1.0.27.48.1432137600000.snappy
com.linkedin.camus.etl.kafka.common.KafkaMessage@w3f2e1584
com.linkedin.camus.etl.kafka.common.KafkaMessage@4b734e12
com.linkedin.camus.etl.kafka.common.KafkaMessage@3512048c
com.linkedin.camus.etl.kafka.common.KafkaMessage@31e89e8c
I am at a loss on how to query or read this data once it has been stored in HDFS.
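For context, this is roughly how I was planning to consume these files downstream, using the standard Hadoop FileSystem and CompressionCodecFactory APIs (just a sketch; ReadCamusOutput is a placeholder name and the file path is passed as an argument):

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

public class ReadCamusOutput {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path path = new Path(args[0]); // e.g. a .snappy file under /tmp/topics
        FileSystem fs = path.getFileSystem(conf);

        // Pick the codec from the file extension (.snappy), the same way 'hadoop fs -text' does
        CompressionCodec codec = new CompressionCodecFactory(conf).getCodec(path);
        InputStream in = fs.open(path);
        if (codec != null) {
            in = codec.createInputStream(in);
        }

        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, "UTF-8"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line); // I expected each line to be the original Kafka message payload
            }
        }
    }
}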
I wrote a basic StringMessageDecoder and StringRecordWriterProvider for this ETL job. These are the relevant properties:
etl.destination.path=/tmp/topics
etl.execution.base.path=/tmp/exec
etl.execution.history.path=/tmp/exec/history
camus.message.decoder.class=com.linkedin.camus.etl.kafka.coders.StringMessageDecoder
etl.record.writer.provider.class=com.linkedin.camus.etl.kafka.common.StringRecordWriterProvider
etl.output.codec=snappy
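In case it helps, a stripped-down version of what my decoder does is roughly this (a sketch against the MessageDecoder variant that receives the raw byte[] payload; newer Camus builds pass a Message object instead, and the details may differ from my actual class):

package com.linkedin.camus.etl.kafka.coders;

import java.nio.charset.StandardCharsets;
import java.util.Properties;

import com.linkedin.camus.coders.CamusWrapper;
import com.linkedin.camus.coders.MessageDecoder;

public class StringMessageDecoder extends MessageDecoder<byte[], String> {

    @Override
    public void init(Properties props, String topicName) {
        this.props = props;
        this.topicName = topicName; // nothing topic-specific to configure
    }

    @Override
    public CamusWrapper<String> decode(byte[] payload) {
        // Wrap the raw payload as a UTF-8 string so the writer emits plain text
        return new CamusWrapper<String>(new String(payload, StandardCharsets.UTF_8));
    }
}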
Any thoughts or guidance on how to move ahead once the data has been pushed into HDFS?
Regards,
Sagar