Confluent wire format - is it just a magic byte and a schema ID?


Geir Magnusson

Apr 7, 2015, 8:51:55 AM
to confluent...@googlegroups.com
I'm looking at the Confluent Platform instead of vanilla Kafka because I do like the idea of rigid schema enforcement.

But I have a mix of technologies that would be both producers and consumers, and I'm a little shy about non-Java things having to POST stuff to the REST service.

I was reading the code, and as far as I can tell, the only transport-level difference is that messages are 

   <magic_byte 0x00><4 bytes of schema ID><regular avro bytes for object that conforms to schema>

Is that it?  I'd like my Go producers to be first-class citizens and be able to push compatible messages over the Kafka protocol...

tia

geir

Ewen Cheslack-Postava

Apr 7, 2015, 12:18:25 PM
to confluent...@googlegroups.com
That's correct. The only exception is for the primitive bytes schema (i.e. {"type": "bytes"}), which is written directly. In the normal Avro encoding it is prefixed by a length field, but that's unnecessary if you already know the size of the message, as we do with Kafka records.

-Ewen

--
You received this message because you are subscribed to the Google Groups "Confluent Platform" group.
To unsubscribe from this group and stop receiving emails from it, send an email to confluent-platf...@googlegroups.com.
To post to this group, send email to confluent...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/confluent-platform/46d1d74e-9f0c-4619-9906-00736592bc09%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Ewen Cheslack-Postava

Apr 7, 2015, 12:19:32 PM
to confluent...@googlegroups.com
And now I realize I should clarify: by "written directly", I mean that you still write the magic byte and 4-byte schema ID, but it is then followed by the byte[] without any special encoding.
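
For readers implementing this in Go, here is a minimal sketch of the difference between the bytes special case and the normal Avro encoding of a bytes value. The function names are made up for illustration, and the big-endian byte order of the schema ID is an assumption that the thread does not state. The length prefix is the zigzag varint that Avro uses for longs, which is also what Go's binary.PutVarint produces.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// header builds the 5-byte prefix: magic byte 0x00 plus a 4-byte schema ID.
// Assumption: the schema ID is big-endian (not stated in the thread).
func header(schemaID uint32) []byte {
	h := make([]byte, 5) // h[0] is already 0x00, the magic byte
	binary.BigEndian.PutUint32(h[1:], schemaID)
	return h
}

// frameRawBytes covers the {"type": "bytes"} exception: the header is
// followed by the raw bytes, with no Avro length prefix.
func frameRawBytes(schemaID uint32, data []byte) []byte {
	return append(header(schemaID), data...)
}

// avroBytes shows the normal Avro encoding of a bytes value for contrast:
// a zigzag varint length followed by the data.
func avroBytes(data []byte) []byte {
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutVarint(buf, int64(len(data)))
	return append(buf[:n], data...)
}

func main() {
	data := []byte("hi")
	fmt.Printf("% x\n", frameRawBytes(7, data)) // 00 00 00 00 07 68 69
	fmt.Printf("% x\n", avroBytes(data))        // 04 68 69
}
```

The only thing the exception drops is the leading length (0x04 in this example); the header is the same in both cases.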

--
Thanks,
Ewen

Jun Rao

Apr 7, 2015, 12:48:46 PM
to confluent...@googlegroups.com
Geir,

The format you described is correct.

<magic_byte 0x00><4 bytes of schema ID><regular avro bytes for object that conforms to schema>
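
As a rough illustration for a Go producer, the framing could be sketched as below. The name frame is hypothetical, the payload is assumed to be Avro-encoded already by whatever library you use, and the big-endian byte order of the schema ID is an assumption that this thread does not spell out.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// magicByte is the leading format marker described in the thread.
const magicByte byte = 0x00

// headerSize is one magic byte plus a 4-byte schema ID.
const headerSize = 5

// frame prepends the header to an already Avro-encoded payload.
// Assumption: the schema ID is written big-endian.
func frame(schemaID uint32, avroPayload []byte) []byte {
	out := make([]byte, headerSize, headerSize+len(avroPayload))
	out[0] = magicByte
	binary.BigEndian.PutUint32(out[1:headerSize], schemaID)
	return append(out, avroPayload...)
}

func main() {
	// 0x06 'f' 'o' 'o' is the Avro binary encoding of the string "foo".
	msg := frame(42, []byte{0x06, 'f', 'o', 'o'})
	fmt.Printf("% x\n", msg) // 00 00 00 00 2a 06 66 6f 6f
}
```

The resulting slice is what would go into the Kafka record value; registering the schema and obtaining its ID is outside this sketch.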

Ewen,

It seems that we don't explicitly encode the length of the Avro bytes since it can be derived by subtracting the fixed header size from the message size.
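
On the consumer side, that subtraction amounts to slicing off the first five bytes. A minimal Go sketch, assuming a hypothetical helper named unframe and a big-endian schema ID (the byte order is not stated in this thread):

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// headerSize is one magic byte plus a 4-byte schema ID.
const headerSize = 5

// unframe splits a Kafka record value into its schema ID and Avro payload.
// The payload length is implicit: len(msg) - headerSize.
// Assumption: the schema ID is big-endian.
func unframe(msg []byte) (uint32, []byte, error) {
	if len(msg) < headerSize {
		return 0, nil, errors.New("message shorter than the 5-byte header")
	}
	if msg[0] != 0x00 {
		return 0, nil, fmt.Errorf("unexpected magic byte 0x%02x", msg[0])
	}
	schemaID := binary.BigEndian.Uint32(msg[1:headerSize])
	return schemaID, msg[headerSize:], nil
}

func main() {
	msg := []byte{0x00, 0x00, 0x00, 0x00, 0x2a, 0x06, 'f', 'o', 'o'}
	id, payload, err := unframe(msg)
	if err != nil {
		panic(err)
	}
	fmt.Println(id, len(payload)) // 42 4
}
```

The returned payload shares memory with the input slice, so copy it if the record buffer is reused.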

Thanks,

Jun


Geir Magnusson

Apr 8, 2015, 8:18:03 AM
to confluent...@googlegroups.com
Thanks.  Quick question: why even have the exception?  The cost in bytes seems small compared with requiring every Confluent-compliant client implementation (say that fast 3 times...) to implement the exception.

James Cheng

Apr 8, 2015, 5:02:11 PM
to confluent...@googlegroups.com
Will this wire format be documented on the website, per https://github.com/confluentinc/schema-registry/issues/150?

Thanks,
-James

Jun Rao

Apr 8, 2015, 5:08:11 PM
to confluent...@googlegroups.com
Yes, we will do that as part of the next release.

Thanks,

Jun
