avro vs protobuf IDL


Dennis

Mar 2, 2016, 3:08:12 AM
to grpc.io
Hi,

The gRPC docs tell me that it is serialization-format agnostic, but it appears that protobufs are the IDL of choice. I wanted to better understand why, and also to help build an argument for protobufs and gRPC. (I didn't notice any threads regarding Avro on this list; maybe this is the wrong place?)

I've compiled some features of Avro and protobufs that stand out. If there are others I am missing, or topics I am conflating, please do correct me, pedantically if you want.

* Avro has the following 'advantages' (?):

** A dynamic schema through negotiation, and in general the schema is more 'robust.' There is no need to declare IDs, use static data types, or do code generation, which helps with more generic processing of arbitrary data over the wire (maybe as part of a pub/sub flow).

** Hadoop support out of the box. This may be neither here nor there, but the pervasiveness of Hadoop might make protobufs themselves a non-starter.

** Language inheritance/nesting and polymorphism
*** How much impact might this have on performance at the end of the day? Does it nullify the advantages of Avro being a binary format? Is there any prior art here comparing the two? (Sorry for such a loaded item, Avro folks.)

* Protobufs, OTOH, seemingly have these 'advantages' over Avro:

** Human readable IDL (well more human readable)

** Cleaner looking documentation¹ 

** Cleaner looking implementation¹ 

** Diverse language options¹ 

** With gRPC you do not have to write your own web server, since it is generated for you.



¹ Probably because Avro is an Apache project and not a Google project?



Thank you for everyone's time,

Dennis 

GoldenBull Chen

Mar 2, 2016, 5:17:05 AM
to Dennis, grpc.io
Here are some clues about how gRPC deals with serialization/deserialization, based on my limited knowledge of gRPC/C#. The source code is in examples/csharp/helloworld/Greeter.

1. A protoc plugin (grpc_csharp_plugin.exe) reads the helloworld.proto file together with protoc.exe and generates HelloworldGrpc.cs.
2. In HelloworldGrpc.cs, only two lines are actually related to ser/deser:

    static readonly Marshaller<global::Helloworld.HelloRequest> __Marshaller_HelloRequest = Marshallers.Create((arg) => global::Google.Protobuf.MessageExtensions.ToByteArray(arg), global::Helloworld.HelloRequest.Parser.ParseFrom);
    static readonly Marshaller<global::Helloworld.HelloReply> __Marshaller_HelloReply = Marshallers.Create((arg) => global::Google.Protobuf.MessageExtensions.ToByteArray(arg), global::Helloworld.HelloReply.Parser.ParseFrom);

Afterwards, "__Marshaller_HelloRequest" and "__Marshaller_HelloReply" are used as delegates or factories to convert data between C# objects and byte[].

So, if you want to use Avro with gRPC, you should write your own plugin or some other kind of code generator to create a similar "HelloworldGrpc.cs" in which most of the code is the same but which uses Avro ser/deser functions.
Actually, you could write a simple "HelloworldGrpc.cs" by hand using a simple ser/deser protocol like JSON, or an even simpler plain-text protocol.
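
A minimal sketch of the marshaller idea described above, written in Java and using only the standard library. The `Marshaller` and `HelloRequest` types here are hypothetical stand-ins defined for illustration, not the real gRPC or protobuf classes; the point is only that a marshaller is a pair of functions between an object and a byte array, so the wire format behind it can be swapped.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

final class PlainTextMarshalling {

    // Hypothetical stand-in for a gRPC marshaller (NOT the real gRPC type):
    // one function turns an object into bytes, the other turns bytes back.
    static final class Marshaller<T> {
        final Function<T, byte[]> serializer;
        final Function<byte[], T> deserializer;

        Marshaller(Function<T, byte[]> serializer, Function<byte[], T> deserializer) {
            this.serializer = serializer;
            this.deserializer = deserializer;
        }

        byte[] serialize(T value) { return serializer.apply(value); }

        T deserialize(byte[] wire) { return deserializer.apply(wire); }
    }

    // Hypothetical hand-written message, in place of a class generated from a .proto file.
    static final class HelloRequest {
        final String name;

        HelloRequest(String name) { this.name = name; }
    }

    // Assumed plain-text wire format: the message is just the UTF-8 bytes of its name field.
    static final Marshaller<HelloRequest> HELLO_REQUEST = new Marshaller<HelloRequest>(
            req -> req.name.getBytes(StandardCharsets.UTF_8),
            wire -> new HelloRequest(new String(wire, StandardCharsets.UTF_8)));

    public static void main(String[] args) {
        byte[] wire = HELLO_REQUEST.serialize(new HelloRequest("world"));
        HelloRequest decoded = HELLO_REQUEST.deserialize(wire);
        System.out.println(decoded.name); // prints "world"
    }
}
```

Replacing the two lambdas with calls into an Avro or JSON library would change the wire format without touching the code that uses `HELLO_REQUEST`, which is roughly what a custom generator would have to emit.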

--
You received this message because you are subscribed to the Google Groups "grpc.io" group.
To unsubscribe from this group and stop receiving emails from it, send an email to grpc-io+u...@googlegroups.com.
To post to this group, send email to grp...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/grpc-io/f9b8686c-20fe-4204-aed2-aa375405b456%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Christian Rivasseau

Mar 2, 2016, 9:05:39 AM
to Dennis, grpc.io
We've been very happy using protobuf with Hadoop; I'm not sure why it would be a non-starter.

We save protobufs in Hadoop sequence files and HBase columns; new BytesWritable(message.toByteArray()) is just one line away. And if that's too much, you can write a MessageWritable (or perhaps use Twitter's ...).
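
As a rough illustration of what such a MessageWritable could look like, here is a sketch in plain Java that does not depend on Hadoop or protobuf. The class name `MessageEnvelope`, the `roundTrip` helper, and the length-prefixed layout are all assumptions made for this example. The `write` and `readFields` methods mirror the signatures of Hadoop's `Writable` interface, which a real version would implement, and the payload stands in for the bytes returned by `message.toByteArray()`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical envelope around already-serialized message bytes.
final class MessageEnvelope {
    private byte[] payload = new byte[0];

    MessageEnvelope() {}

    MessageEnvelope(byte[] payload) { this.payload = payload.clone(); }

    byte[] payload() { return payload.clone(); }

    // Same signature as Writable.write(DataOutput); the layout (4-byte length,
    // then the raw bytes) is an assumption of this sketch.
    void write(DataOutput out) throws IOException {
        out.writeInt(payload.length);
        out.write(payload);
    }

    // Same signature as Writable.readFields(DataInput); reads the layout written above.
    void readFields(DataInput in) throws IOException {
        int length = in.readInt();
        payload = new byte[length];
        in.readFully(payload);
    }

    // Helper for the example: writes the bytes into a buffer and reads them back.
    static byte[] roundTrip(byte[] messageBytes) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            new MessageEnvelope(messageBytes).write(new DataOutputStream(buffer));

            MessageEnvelope copy = new MessageEnvelope();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
            return copy.payload();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] restored = roundTrip(new byte[] {1, 2, 3});
        System.out.println(restored.length); // prints 3
    }
}
```

A version for real use would parse the payload back into a typed message in `readFields`, for example through the generated class's parser, rather than exposing raw bytes.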

Custom protoc plugins also make it very easy to write code generators for various Hadoop tasks. For instance, I wrote a plugin that generates Hive schemas and serializers for a proto message. Avro might have a head start here, but I suspect those kinds of utilities will soon start popping up for protobuf.

--
Christian Rivasseau
Co-founder and CTO @ Lefty
+33 6 67 35 26 74