JSON versus Avro


Kevin Henderson

Feb 5, 2017, 4:39:52 AM
to Confluent Platform
I am a physician who has learned about the architecture of data systems, but I am not a programmer by any means. We are building a Kafka - Spark - Cassandra platform, +/- Elasticsearch. I was wondering if I could get some insights about ingesting data into Kafka.

All data we receive and export will be in JSON format. My question is: should we ingest data into Kafka in JSON format, or should we use the JsonConverter to convert data into Avro and use Avro for data ingest into Kafka?

Also, it seems Avro has been optimized for Hadoop, and we have no plans to use Hadoop. So if the answer to the first question is yes, why would Avro not be a disadvantage in the architectural framework we have planned?

Thanks for your insight.

Kevin

Andrew Otto

Feb 6, 2017, 10:16:09 AM
to confluent...@googlegroups.com
Hi Kevin,

> My question is: should we ingest data into Kafka in JSON format, or should we use the JsonConverter to convert data into Avro and use Avro for data ingest into Kafka?

I think it depends on who your data is intended for. If it is meant for teams of engineers who plan to maintain this data in a backwards-compatible way over a long lifetime, then Avro is the better choice. If your data is short-lived, or you intend to provide the data for external use, or you want the data to be easily usable in many programming languages and readable by humans, then JSON might be easier.
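
To make the trade-off a bit more concrete, here is a rough Python sketch using only the standard library. It is not Avro and does not reproduce Avro's wire format: the struct module just stands in for the general idea of a schema-based binary encoding, where field names and types live in a separate schema instead of in every message. The record, field names, and helper functions are all made up for illustration.

```python
import json
import struct

# Hypothetical record, invented for illustration only.
record = {"event_id": 1234, "count": 72, "value": 36.6}

# JSON is self-describing: every message carries its own field names,
# so anyone can read it without extra information.
json_bytes = json.dumps(record).encode("utf-8")

# Schema-based binary: names and types are kept in a separate schema
# (here a field list plus a struct format string), so only values are sent.
# NOTE: this is NOT Avro's encoding, just a stand-in for the concept.
SCHEMA_FIELDS = ("event_id", "count", "value")
SCHEMA_FORMAT = "<iid"  # little-endian: int32, int32, float64


def encode(rec):
    """Pack the record's values in schema order, without field names."""
    return struct.pack(SCHEMA_FORMAT, *(rec[name] for name in SCHEMA_FIELDS))


def decode(payload):
    """Unpack a payload; only possible if the reader has the schema."""
    return dict(zip(SCHEMA_FIELDS, struct.unpack(SCHEMA_FORMAT, payload)))


binary_bytes = encode(record)

print(len(json_bytes), "bytes as JSON:", json_bytes.decode("utf-8"))
print(len(binary_bytes), "bytes as schema-based binary")
print("decoded with schema:", decode(binary_bytes))
```

The binary payload is smaller, but it is unreadable without the schema, which is roughly why a schema-based format suits teams that commit to maintaining schemas over time, while JSON stays convenient for humans and for ad hoc consumers.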

This blog post describes some pros of Avro (under “Use Avro as Your Data Format”) well.  I recently wrote an article that is more focused on Hadoop usage, but does describe why Wikimedia is using JSON instead of Avro.

Hope that helps a little bit! 





