It appears that the Connect HDFS sink only supports Avro-encoded messages. Configuring the connector with the JSON converter classes and producing string representations of JSON, or even JSON objects serialized as byte arrays, always results in exceptions (out-of-memory errors, among others) when using the HDFS sink. Using the simple file sink connector works just fine. Looking at the HDFS sink implementation, the code seems to pass Avro objects around, so it appears to have no JSON support.
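For reference, a minimal sketch of the setup I'm attempting (standalone mode; the converter and connector class names are the real ones, while the topic, URL, and sizing values are just illustrative):

  # worker properties (illustrative)
  key.converter=org.apache.kafka.connect.json.JsonConverter
  value.converter=org.apache.kafka.connect.json.JsonConverter
  key.converter.schemas.enable=false
  value.converter.schemas.enable=false

  # HDFS sink connector properties (illustrative)
  name=hdfs-sink
  connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
  topics=json-events
  hdfs.url=hdfs://namenode:8020
  flush.size=1000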
The connector is not restricted to Avro data. It does support Avro *output*, which is the default for the format.class option. But for input, the connector itself does not manage reading the data from the topic; the Connect framework provides pluggable converters for that. However, because all the currently supported output formats require data with schemas, trying to use the JsonConverter with the HDFS connector will not currently work.

-Ewen
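P.S. A sketch of how the two halves are configured independently (the converter and format class names are the standard Confluent Platform ones; the registry URL is illustrative):

  # worker side: how records are read from Kafka (pluggable converter)
  value.converter=io.confluent.connect.avro.AvroConverter
  value.converter.schema.registry.url=http://localhost:8081

  # connector side: how records are written out (Avro is the format.class default)
  format.class=io.confluent.connect.hdfs.avro.AvroFormat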
Produce to and consume from Kafka in JSON; save to HDFS in Parquet or Avro. The HDFS sink connector seems to have a problem with JSON-formatted data in Kafka, even though the converters should technically handle this.
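For the Parquet half, my understanding is that the relevant knob is the connector's format.class option, along these lines (assuming the Confluent packaging of the connector; I realize Parquet also needs schemas, so this alone doesn't address the JSON input issue):

  format.class=io.confluent.connect.hdfs.parquet.ParquetFormat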
On Monday, March 7, 2016 at 6:12:17 AM UTC-8, Andrew Stevenson wrote:
Why do you want JSON in HDFS? It doesn't perform very well; Avro and Parquet, which are supported, are much better.
On Wednesday, 24 February 2016 19:28:33 UTC+1, Ahmad Alkilani wrote:
Thanks Ewen, this is precisely why I was asking; knowing that the Connect framework relies on converters to translate to/from the Kafka data format, the expectation was that it would work out of the box. I understand the requirement for a schema. Is JSON support (with whatever limitations might be imposed) by any chance in the works for an upcoming release?

Thanks
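For what it's worth, I understand the JsonConverter already has a schema-carrying mode: with schemas.enable=true each message is wrapped in a schema/payload envelope along these lines (the field names and values here are illustrative):

  {
    "schema": {
      "type": "struct",
      "name": "example.Record",
      "optional": false,
      "fields": [
        {"field": "id", "type": "int64", "optional": false},
        {"field": "name", "type": "string", "optional": true}
      ]
    },
    "payload": {"id": 42, "name": "foo"}
  }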