Kafka streams app without avro schema files


Mohammad Tariq

Mar 24, 2016, 10:05:08 PM
to confluent...@googlegroups.com
Hi fellow confluent users,

I was going through the Confluent Kafka Streams doc and tried the examples provided here. I was wondering why I need to provide the avsc files explicitly when writing a Kafka Streams app. Doesn't Kafka Streams provide integration with Schema Registry?

If I were to write a very simple app that just reads some Avro data from a given Kafka topic, does some aggregation, and stores the result in some sink (say MySQL, for my dashboard), what do I need to do?

Peter Davis

Mar 25, 2016, 12:44:44 PM
to Confluent Platform
Tariq, if you want to serialize and deserialize binary Avro data, then you must specify schemas somehow. Avro requires schemas.
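
To illustrate why a schema is unavoidable, here is a minimal sketch of Avro's binary encoding written by hand with only the JDK (in practice you would use the Avro library). The record layout, a string field `region` followed by a long field `views`, is a hypothetical example, and `AvroBinarySketch` is not part of any library. The encoded bytes carry only the field values in schema order, with no field names or types, so a reader without the schema cannot tell where one field ends or what it means.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class AvroBinarySketch {
    // Avro writes int/long values as zig-zag encoded variable-length integers.
    static void writeLong(ByteArrayOutputStream out, long value) {
        long zigzag = (value << 1) ^ (value >> 63);
        while ((zigzag & ~0x7FL) != 0) {
            out.write((int) ((zigzag & 0x7F) | 0x80)); // low 7 bits, continuation bit set
            zigzag >>>= 7;
        }
        out.write((int) zigzag);
    }

    // Avro writes a string as its UTF-8 length (a long) followed by the UTF-8 bytes.
    static void writeString(ByteArrayOutputStream out, String value) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // Hypothetical record {region: string, views: long}; fields are simply
    // concatenated in schema order, with no names or type tags.
    static byte[] encodeViewRegion(String region, long views) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, region);
        writeLong(out, views);
        return out.toByteArray();
    }

    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x ", b));
        }
        return hex.toString().trim();
    }

    public static void main(String[] args) {
        // Prints: 08 61 73 69 61 06
        System.out.println(toHex(encodeViewRegion("asia", 3L)));
    }
}
```

Those six bytes are the whole message: a length, the text "asia", and the number 3. Nothing in them says "region" or "views", which is why both the writer and the reader need the schema.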

If you just want to pass raw byte[]s around, you can configure ByteArray(De)Serializer or some other de/serializer of your choice, but unless you are just dumping data into a blob, you would not be able to decode the data to map fields into MySQL columns.

Alternatively, you could use a schemaless data format, like simple Strings or JSON. It's pretty straightforward to implement the Serializer/Deserializer interfaces so that they invoke Jackson's ObjectMapper, for example. (Kafka Connect's out-of-the-box JsonSerializer, unfortunately, doesn't work for this as it also requires its own schemas, even though JSON is schemaless. Go figure.)

-Peter

Guozhang Wang

Mar 25, 2016, 3:50:48 PM
to Confluent Platform
Hi Tariq,

You can use the KafkaAvroSerializer provided by Confluent Platform (here), which wraps a Schema Registry client that automatically registers schemas with, and retrieves them from, the Schema Registry.
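
As a rough sketch of what that looks like in client configuration, the snippet below builds plain `java.util.Properties` for a producer and a consumer. The serializer class names and the `schema.registry.url` key are the ones Confluent Platform uses; the helper class `AvroClientConfigSketch`, the group id, and the addresses `localhost:9092` and `http://localhost:8081` are assumptions for illustration only.

```java
import java.util.Properties;

final class AvroClientConfigSketch {
    // Producer settings: values are serialized with KafkaAvroSerializer, which
    // looks up or registers the record's schema at the given registry URL.
    static Properties producerProps(String bootstrapServers, String schemaRegistryUrl) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", schemaRegistryUrl);
        return props;
    }

    // Consumer settings: KafkaAvroDeserializer fetches the writer's schema
    // from the registry, so no local .avsc file is needed to read the data.
    static Properties consumerProps(String bootstrapServers, String schemaRegistryUrl, String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("group.id", groupId);
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "io.confluent.kafka.serializers.KafkaAvroDeserializer");
        props.put("schema.registry.url", schemaRegistryUrl);
        return props;
    }

    public static void main(String[] args) {
        // Assumed local addresses; replace with your own cluster and registry.
        System.out.println(producerProps("localhost:9092", "http://localhost:8081"));
        System.out.println(consumerProps("localhost:9092", "http://localhost:8081", "dashboard-reader"));
    }
}
```

These properties would be passed to a KafkaProducer or KafkaConsumer constructor; the Kafka and Confluent serializer jars must be on the classpath for the class names to resolve at runtime.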

The example code implements a GenericAvroSerializer that wraps this serializer. However, when you want to create an intermediate Avro object (the viewRegion record in this case) whose schema is not yet registered in the Schema Registry, you need to provide the schema on the fly, or you can pre-register the schema and use the registered schema to serialize the object. This is just like the case when you use a producer client to send an Avro record for the first time.

Guozhang

