schema registry and Kafka REST API


Arun Sethia

Mar 31, 2017, 6:56:26 AM
to Confluent Platform
Hi,

I am new to Confluent and am trying to understand the difference between using the Schema Registry to create a schema for a subject and using the Kafka REST API to post a message to a topic. As I understand it, the Schema Registry API creates a schema for a subject. The Kafka REST Proxy API for posting a message to a topic also allows us to create a schema (key and value schema).

1. Can I create the schema definition (key and value subjects) using the Schema Registry and then use the same schema as part of a Kafka REST API call to post a message?
2. Which is the best way to create a schema for a Kafka message: via the Kafka REST API (as part of posting a message) or via the Schema Registry, and when should I use which? I am assuming a topic name is equivalent to a "subject" in the Schema Registry.

Thanks & Regards,
Arun

Ewen Cheslack-Postava

Apr 7, 2017, 2:20:03 AM
to Confluent Platform
On Fri, Mar 31, 2017 at 3:56 AM, Arun Sethia <sethi...@gmail.com> wrote:
Hi,

I am new to Confluent and am trying to understand the difference between using the Schema Registry to create a schema for a subject and using the Kafka REST API to post a message to a topic. As I understand it, the Schema Registry API creates a schema for a subject. The Kafka REST Proxy API for posting a message to a topic also allows us to create a schema (key and value schema).

That's correct. Schemas are registered with the schema registry on-demand when messages are serialized and produced to Kafka using the Avro serializer. The REST Proxy uses this serializer, so if you produce an Avro message to Kafka using the REST Proxy, it will make sure the schema is registered in the schema registry before writing the message to Kafka (or return an error if trying to register the schema results in an error, e.g. due to incompatibility).
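
To make the on-demand registration concrete, here is a minimal sketch of the kind of request a client might send to the REST Proxy, with the Avro schema embedded in the body. It assumes the proxy's v2 Avro content type and a proxy listening at http://localhost:8082; the topic name, the User schema, and the helper name are hypothetical. The request is only built here, not sent.

```python
import json
import urllib.request

# Assumed REST Proxy address; adjust for your deployment.
REST_PROXY_URL = "http://localhost:8082"


def build_produce_request(topic, value_schema, records):
    """Build (but do not send) a REST Proxy produce request for Avro values."""
    body = {
        # The Avro schema travels as a JSON-encoded string inside the body.
        "value_schema": json.dumps(value_schema),
        "records": [{"value": record} for record in records],
    }
    return urllib.request.Request(
        url=f"{REST_PROXY_URL}/topics/{topic}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/vnd.kafka.avro.v2+json"},
        method="POST",
    )


# Hypothetical schema and record, for illustration only.
user_schema = {
    "type": "record",
    "name": "User",
    "fields": [{"name": "name", "type": "string"}],
}
request = build_produce_request("topic_A", user_schema, [{"name": "alice"}])
print(request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would actually send it; at that point the proxy would either register the schema and write the record, or return an error if registration fails.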
 

1. Can I create the schema definition (key and value subjects) using the Schema Registry and then use the same schema as part of a Kafka REST API call to post a message?

Yes, the REST Proxy is just using the Schema Registry behind the scenes, so anything you register directly with the Schema Registry ahead of time will be used by the REST Proxy.
 
2. Which is the best way to create a schema for a Kafka message: via the Kafka REST API (as part of posting a message) or via the Schema Registry, and when should I use which? I am assuming a topic name is equivalent to a "subject" in the Schema Registry.

This usually depends a bit on your organization, but the easiest way is to just let the REST Proxy register the schema for you. The drawback is that this doesn't guarantee you'll be able to produce data since registering the schema could fail. So some people will proactively register schemas directly with the schema registry *before* deploying their app (e.g. as part of their continuous delivery steps before the app is rolled out).
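
A pre-deployment step like this could first ask the Schema Registry whether a new schema is compatible with the latest registered version. The sketch below assumes the registry's compatibility endpoint, a registry at http://localhost:8081, and a hypothetical subject and schema; it builds the request and interprets a response body, without sending anything.

```python
import json
import urllib.request

# Assumed Schema Registry address; adjust for your deployment.
SCHEMA_REGISTRY_URL = "http://localhost:8081"


def build_compatibility_request(subject, schema):
    """Build (but do not send) a compatibility check against the latest version."""
    body = {"schema": json.dumps(schema)}
    return urllib.request.Request(
        url=f"{SCHEMA_REGISTRY_URL}/compatibility/subjects/{subject}/versions/latest",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="POST",
    )


def is_compatible(response_body):
    """Interpret the JSON body returned by the compatibility endpoint."""
    return bool(json.loads(response_body).get("is_compatible", False))


# Hypothetical new version of a value schema, for illustration only.
new_schema = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "email", "type": ["null", "string"], "default": None},
    ],
}
check = build_compatibility_request("topic_A-value", new_schema)
print(check.full_url)
print(is_compatible('{"is_compatible": true}'))
```

A delivery pipeline could send the request with urllib.request.urlopen and stop the rollout when is_compatible returns False, so the failure shows up before the app starts producing.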
 
-Ewen



--
You received this message because you are subscribed to the Google Groups "Confluent Platform" group.
To unsubscribe from this group and stop receiving emails from it, send an email to confluent-platform+unsub...@googlegroups.com.
To post to this group, send email to confluent-platform@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/confluent-platform/8eb8f008-30ae-4885-b2f2-221e24af5a5d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Ali Abdel-Aziz Ali

Apr 18, 2017, 7:41:33 AM
to Confluent Platform
Hi,

I’m a newbie, so this might be a silly question.
What is the relation between Schema Registry subjects and Kafka topics? Is it something like this?
Kafka topic | Schema Registry subject
topic_A     | topic_A

In this case, what if I want to specify one schema for my topic's value and another schema for my topic's key?
Should I append a suffix to the subject to differentiate between key and value? Will it be automatically linked to my topic? Something like this:
Kafka topic | Schema Registry subject
            | Key         | Value
topic_A     | topic_A-key | topic_A-value

Does it mean I have to do two or three (in the case of key and value) steps?

  1. create kafka-topic "topic_A"
    > kafka-topics --create --zookeeper localhost:22181,localhost:32181,localhost:42181 --replication-factor 1 --partitions 1 --topic topic_A
  2. create schema-registry subject "topic_A"
    curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
    --data '{"schema": "{\"type\": \"string\", ... }"}' \
    http://localhost:8081/subjects/topic_A/versions

Thanks



Ewen Cheslack-Postava

Apr 18, 2017, 10:52:21 PM
to Confluent Platform
On Tue, Apr 18, 2017 at 4:41 AM, Ali Abdel-Aziz Ali <ali.abdel...@gmail.com> wrote:
Hi,

I’m a newbie, so this might be a silly question.
What is the relation between Schema Registry subjects and Kafka topics? Is it something like this?
Kafka topic | Schema Registry subject
topic_A     | topic_A

In this case, what if I want to specify one schema for my topic's value and another schema for my topic's key?
Should I append a suffix to the subject to differentiate between key and value? Will it be automatically linked to my topic? Something like this:
Kafka topic | Schema Registry subject
            | Key         | Value
topic_A     | topic_A-key | topic_A-value

So, first, this mapping where each Kafka topic has <topic>-key and <topic>-value is exactly what we use for the new Java producer & consumer and the non-Java clients where we've added schema registry support. This is done automatically with the serializers.

That said, note that the schema registry doesn't assume the data is in Kafka -- it's actually quite a bit more general than that. And that's why it uses the terminology "subject" instead of matching Kafka's terminology. As an example, you could potentially write data into a directory in HDFS or S3 and use the schema registry to track the format(s) of data under that directory.



Does it mean I have to do two or three (in the case of key and value) steps?

  1. create kafka-topic "topic_A"
    > kafka-topics --create --zookeeper localhost:22181,localhost:32181,localhost:42181 --replication-factor 1 --partitions 1 --topic topic_A
  2. create schema-registry subject "topic_A"
    curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
    --data '{"schema": "{\"type\": \"string\", ... }"}' \
    http://localhost:8081/subjects/topic_A/versions

If you are not using the Java, C/C++, or Python serializers Confluent provides, then you would:

1. Create the Kafka topic, "foo"
2. Register the schema for keys in foo, "foo-key"
3. Register the schema for values in foo, "foo-value"
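
Steps 2 and 3 above can be sketched as two registration requests, one per subject. This is only an illustration: it assumes a Schema Registry at http://localhost:8081, and the helper names and the Event value schema are hypothetical. The requests are built but not sent.

```python
import json
import urllib.request

# Assumed Schema Registry address; adjust for your deployment.
SCHEMA_REGISTRY_URL = "http://localhost:8081"


def subject_for(topic, is_key):
    """Derive the subject name using the <topic>-key / <topic>-value mapping."""
    return f"{topic}-key" if is_key else f"{topic}-value"


def build_registration_request(subject, schema):
    """Build (but do not send) a request registering a schema under a subject."""
    body = {"schema": json.dumps(schema)}
    return urllib.request.Request(
        url=f"{SCHEMA_REGISTRY_URL}/subjects/{subject}/versions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="POST",
    )


# Hypothetical schemas for the keys and values of topic "foo".
key_schema = {"type": "string"}
value_schema = {
    "type": "record",
    "name": "Event",
    "fields": [{"name": "id", "type": "long"}],
}

key_request = build_registration_request(subject_for("foo", is_key=True), key_schema)
value_request = build_registration_request(subject_for("foo", is_key=False), value_schema)
for req in (key_request, value_request):
    print(req.get_method(), req.full_url)
```

Each request can be sent with urllib.request.urlopen once the topic exists; creating the topic itself (step 1) is still done with the kafka-topics tool or auto topic creation.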

-Ewen
 

Ali Abdel-Aziz Ali

Apr 19, 2017, 7:03:47 AM
to Confluent Platform
Hi Ewen,

Thanks for your reply.
I understand that when using the new Java producer & consumer clients (e.g. kafka-clients-0.10.2.0.jar), each Kafka <topic> will have two Schema Registry subjects (<topic>-key and <topic>-value), and that "This is done automatically with the serializers".

Here I have some questions:
1st: How should the topic be created? Using the "kafka-topics --create --topic" utility will only create the topic (in case auto topic creation is disabled).

2nd: Which serializer should I use so that the Schema Registry subjects (<topic>-key and <topic>-value) are created automatically and the schema is registered under each of them? (Is there any other API I should use apart from KafkaProducer.send?)
Currently I'm using "org.apache.kafka.common.serialization.ByteArraySerializer" to write the Avro-serialized byte[] based on my schema.

3rd: I couldn't find any Schema Registry configuration parameters inside ProducerConfig/ConsumerConfig; how, in this case, will the serializer know which Schema Registry it should use to register the key/value schemas?
Moreover, I don't know which Kafka client API I should use to pass the key/value schema to be used for the subject key/value <-> schema registration.


I know it's a lot of questions, but if you could direct me to any resources where I could find the answers, I would be thankful.

Regards.


Ali Abdel-Aziz Ali

Apr 19, 2017, 8:18:13 AM
to Confluent Platform

Ewen Cheslack-Postava

Apr 20, 2017, 1:41:28 AM
to Confluent Platform
On Wed, Apr 19, 2017 at 4:03 AM, Ali Abdel-Aziz Ali <ali.abdel...@gmail.com> wrote:
Hi Ewen,

Thanks for your reply.
I understand that when using the new Java producer & consumer clients (e.g. kafka-clients-0.10.2.0.jar), each Kafka <topic> will have two Schema Registry subjects (<topic>-key and <topic>-value), and that "This is done automatically with the serializers".

Here I have some questions:
1st: How should the topic be created? Using the "kafka-topics --create --topic" utility will only create the topic (in case auto topic creation is disabled).

Yes, you can create these via the kafka-topics command or rely on auto topic creation if you have it enabled.
 

2nd: Which serializer should I use so that the Schema Registry subjects (<topic>-key and <topic>-value) are created automatically and the schema is registered under each of them? (Is there any other API I should use apart from KafkaProducer.send?)
Currently I'm using "org.apache.kafka.common.serialization.ByteArraySerializer" to write the Avro-serialized byte[] based on my schema.

The schema registry currently works only with Avro and you should use the Avro serializers. http://docs.confluent.io/current/app-development.html#native-clients-with-serializers gives some more details.
 

3rd: I couldn't find any Schema Registry configuration parameters inside ProducerConfig/ConsumerConfig; how, in this case, will the serializer know which Schema Registry it should use to register the key/value schemas?
Moreover, I don't know which Kafka client API I should use to pass the key/value schema to be used for the subject key/value <-> schema registration.

Configs are passed through directly. Avro + Schema Registry is just one choice for how to serialize data and Kafka makes that choice completely pluggable. That means you won't find configs for *any* serialization format in ProducerConfig/ConsumerConfig. You need to look at the specific Serializer/Deserializer to learn about configs for that specific format.
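
As an illustration of that pass-through, the sketch below lists settings a Java producer might be given to use Confluent's Avro serializer and renders them in .properties syntax. The broker and registry addresses are assumptions, and the to_properties helper is hypothetical; the point is that schema.registry.url sits alongside the producer settings even though it is read by the serializer rather than defined in ProducerConfig.

```python
# Settings a Java producer might be given to use Confluent's Avro serializer.
# Host names and ports are assumptions for a local setup.
producer_config = {
    "bootstrap.servers": "localhost:9092",
    "key.serializer": "io.confluent.kafka.serializers.KafkaAvroSerializer",
    "value.serializer": "io.confluent.kafka.serializers.KafkaAvroSerializer",
    # Not a ProducerConfig setting: the producer hands its configuration
    # to the serializer, which reads this entry itself.
    "schema.registry.url": "http://localhost:8081",
}


def to_properties(config):
    """Render a config mapping in Java .properties syntax (hypothetical helper)."""
    return "\n".join(f"{key}={value}" for key, value in config.items())


print(to_properties(producer_config))
```

With this serializer the schema is not passed through a separate client API: it comes from the Avro object given to KafkaProducer.send, and the serializer registers it under the topic's key or value subject.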

-Ewen
 


Arun Sethia

Apr 21, 2017, 11:47:37 AM
to Confluent Platform
Thanks a lot Ewen.

