Registering a union Avro type in Schema Registry


Joe

Jul 7, 2016, 10:58:55 AM
to Confluent Platform
Hi,

I'm trying to register a union Avro type with Schema Registry. The schema is a union of two Avro record types (rec1 and rec2), for example:

curl -X POST -i -H "Content-Type: application/vnd.schemaregistry.v1+json" --data '[{ "type": "record", "name": "rec1", "fields" : [ {"name": "age", "type": "long"} ] }, { "type": "record", "name": "rec2", "fields" : [ {"name": "name", "type": "string"} ] }]' http://localhost:8081/subjects/events-value/versions


This throws an exception:


[2016-07-07 17:50:54,070] ERROR Unhandled exception resulting in internal server error response (io.confluent.rest.exceptions.GenericExceptionMapper:37)

com.fasterxml.jackson.databind.JsonMappingException: Can not deserialize instance of io.confluent.kafka.schemaregistry.client.rest.entities.requests.RegisterSchemaRequest out of START_ARRAY token


Are union Avro types deliberately not supported? What schema can be used for a topic if I want to publish multiple different types to a single topic?



Joe

Jul 10, 2016, 9:10:50 AM
to Confluent Platform
Re-posting: does anyone know why first-class union types are not supported in Schema Registry, and whether there are plans to support them?

Ross Black

Jul 10, 2016, 11:44:06 PM
to Confluent Platform
Hi,

Schema Registry does support union types (at least in my testing).  I think the problem is just the format of the request you sent.

To add a schema to the registry, the request body is JSON of the form {"schema":"<your-schema-goes-here>"}.
Note that in the request the schema is just a string inside the JSON - it does not accept raw JSON, so it needs to be suitably encoded/escaped.
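If escaping by hand gets tedious, here is one way to generate the wrapped body from a plain schema string (a sketch using sed; a tool like jq can do the same):

```shell
# The union schema as ordinary, unescaped JSON
SCHEMA='[{"type":"record","name":"rec1","fields":[{"name":"age","type":"long"}]},{"type":"record","name":"rec2","fields":[{"name":"name","type":"string"}]}]'

# Escape the inner double quotes and wrap the schema as a JSON string value
BODY=$(printf '{"schema": "%s"}' "$(printf '%s' "$SCHEMA" | sed 's/"/\\"/g')")

echo "$BODY"
```

The resulting $BODY can then be passed straight to curl's --data.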

I got the following data to work ok:

  --data '{"schema": "[ { \"type\": \"record\", \"name\": \"rec1\", \"fields\": [ { \"name\": \"age\", \"type\": \"long\"} ]}, { \"type\": \"record\", \"name\": \"rec2\", \"fields\": [ { \"name\": \"name\", \"type\": \"string\"} ]} ]"}'

Ross

Joe

Jul 11, 2016, 10:49:34 AM
to Confluent Platform
Thank you Ross, a foolish mistake on my part.

Now, after posting union schemas, it seems the backward compatibility check is not performed per individual type within the union schema.

For example, I post a union schema of two records (rec1, rec2) and then post the same union schema, but with the first type (rec1) carrying an additional field (address). This should have failed, as rec1 is no longer backward compatible:

curl -X POST -i -H "Content-Type: application/vnd.schemaregistry.v1+json"  --data  \
 '{"schema": "[{\"type\":\"record\",\"name\":\"rec1\",\"fields\":[{\"name\":\"age\",\"type\":\"long\"}]},  {\"type\":\"record\",\"name\":\"rec2\",\"fields\":[{\"name\":\"username\",\"type\":\"string\"} ]}]"}' \
 http://localhost:8081/subjects/kafka-value/versions

curl -X POST -i -H "Content-Type: application/vnd.schemaregistry.v1+json"  --data  \
 '{"schema": "[{\"type\":\"record\",\"name\":\"rec1\",\"fields\":[{\"name\":\"age\",\"type\":\"long\"}, {\"name\":\"address\",\"type\":\"string\"}]},  {\"type\":\"record\",\"name\":\"rec2\",\"fields\":[{\"name\":\"username\",\"type\":\"string\"} ]}]"}' \
 http://localhost:8081/subjects/kafka-value/versions


For comparison, if I post the same schemas but not as a union type (just the rec1 schema on its own), I do get a backward compatibility error:

curl -X POST -i -H "Content-Type: application/vnd.schemaregistry.v1+json"  --data  \
 '{"schema": "{\"type\":\"record\",\"name\":\"rec1\",\"fields\":[{\"name\":\"age\",\"type\":\"long\"}]}"}' \
 http://localhost:8081/subjects/kafka-value/versions

curl -X POST -i -H "Content-Type: application/vnd.schemaregistry.v1+json"  --data  \
 '{"schema": "{\"type\":\"record\",\"name\":\"rec1\",\"fields\":[{\"name\":\"age\",\"type\":\"long\"}, {\"name\":\"address\",\"type\":\"string\"}]}"}' \
 http://localhost:8081/subjects/kafka-value/versions
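As an aside, rather than POSTing to /versions and relying on the rejection, the registry also exposes a compatibility endpoint that only tests a candidate schema against the latest registered version, without registering anything. A sketch, assuming the registry at localhost:8081 and the kafka-value subject from above:

```shell
# Candidate rec1 with the extra "address" field, escaped into a request body
NEW_REC1='{"type":"record","name":"rec1","fields":[{"name":"age","type":"long"},{"name":"address","type":"string"}]}'
BODY=$(printf '{"schema": "%s"}' "$(printf '%s' "$NEW_REC1" | sed 's/"/\\"/g')")
echo "$BODY"

# With a registry running, POST to the compatibility endpoint; the response
# reports {"is_compatible": true} or false, and nothing gets registered:
#   curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
#        --data "$BODY" \
#        http://localhost:8081/compatibility/subjects/kafka-value/versions/latest
```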

Ross Black

Jul 11, 2016, 8:12:54 PM
to Confluent Platform
Hi Joe,

Looking at the Schema Registry code, it uses Avro itself to perform the compatibility checks.

This is some code to perform the validation directly with Avro (adapted from the Schema Registry source):

    // Requires org.apache.avro:avro and Guava on the classpath
    final SchemaValidator backwardValidator =
        new SchemaValidatorBuilder().canReadStrategy().validateLatest();
    final Schema schema1 = new Schema.Parser().parse(v1);
    final Schema schema2 = new Schema.Parser().parse(v2);
    backwardValidator.validate(schema1, ImmutableList.of(schema2));

In the example you have given, validation of the union succeeds, although I would expect it to fail (I tried with Avro 1.8.1; Schema Registry uses Avro 1.7.7).
I am no expert on Avro compatibility rules, but this looks like a bug to me.

Perhaps post to the Avro community?

Ross

Ross Black

Jul 11, 2016, 8:27:26 PM
to confluent...@googlegroups.com
In my code example I had v1 and v2 in the wrong order. It should instead be:

backwardValidator.validate(schema2, ImmutableList.of(schema1));

to test that a reader using v2 can read data written with v1.



Joe

Jul 18, 2016, 3:55:49 AM
to Confluent Platform
Thanks for your help Ross,

Turns out to be a bug; a new issue was created for it:
