AvroTypeException(self.writer_schema, datum)


denizen...@gmail.com

Jul 17, 2018, 9:59:49 PM
to Confluent Platform

Hi everyone...

I am getting this error and would appreciate any advice on how to solve it. Running Confluent Enterprise 4.1.1 with the following:

::My schema (2col.avsc)::

{"namespace": "com.example.avro",
"type": "record",
"name": "sor1",
"fields": [
{"name": "SRCE_BOOK_BRNCH", "int": "int"},
{"name": "NUM_FCLTY", "type": ["string", "null"], "default": null} ] } 

::My CSV (2col.csv)::

3110,
3110,
3110,1.30E+12
...

::My Code::

from confluent_kafka import avro
from confluent_kafka.avro import AvroProducer
import csv

AvroProducerConf = {'bootstrap.servers': '10.97.176.201:9092,10.97.176.202:9092,10.97.176.203:9092',
                    'schema.registry.url': 'http://localhost:8081'}

value_schema = avro.load('2col.avsc')

avroProducer = AvroProducer(AvroProducerConf, default_value_schema=value_schema)

with open('2col.csv', newline='', encoding='utf-8') as f:
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        avroProducer.produce(topic='2col', value={"SRCE_BOOK_BRNCH": row[0], "NUM_FCLTY": row[1]})
        print(row)
        avroProducer.flush()

::My Error::
... File "/apps/home/kafka/.local/lib/python3.6/site-packages/avro/io.py", line 809, in write

raise AvroTypeException(self.writer_schema, datum)

avro.io.AvroTypeException: The datum {'SRCE_BOOK_BRNCH': '3110', 'NUM_FCLTY': ''} is not an example of the schema {...

Any ideas please?

Thank you in advance!

ravi



Sacha Barber

Jul 18, 2018, 1:41:16 AM
to Confluent Platform
This field doesn't look right

{"name": "SRCE_BOOK_BRNCH", "int": "int"},


Shouldn't that be

{"name": "SRCE_BOOK_BRNCH", "type": "int"},
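Putting that one-key fix together with the rest of the posted schema, the corrected 2col.avsc would presumably read (everything else unchanged from the original post):

```json
{"namespace": "com.example.avro",
 "type": "record",
 "name": "sor1",
 "fields": [
   {"name": "SRCE_BOOK_BRNCH", "type": "int"},
   {"name": "NUM_FCLTY", "type": ["string", "null"], "default": null}
 ]}
```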

denizen...@gmail.com

Jul 18, 2018, 9:26:00 AM
to Confluent Platform
Hey Sacha,

Thank you so much, good catch. Strangely, my AVSC does have "type" locally; it must have gotten clobbered somehow when I copied and pasted.

I further changed both attributes to string and the code successfully pushed messages into the topic. Is there a way to get better diagnostic info on the serialization process?
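One thing worth noting if you go back to typed fields: csv.reader yields every cell as a Python str, so with "type": "int" in the schema the values need converting before produce(). A minimal sketch (the helper name to_avro_record is mine, not from the thread):

```python
def to_avro_record(row):
    """Map one CSV row (all strings) onto the Avro types the schema expects."""
    return {
        # SRCE_BOOK_BRNCH is "int" in the corrected schema, so cast it
        "SRCE_BOOK_BRNCH": int(row[0]),
        # NUM_FCLTY is ["string", "null"]; map an empty CSV cell to None
        "NUM_FCLTY": row[1] if row[1] else None,
    }

print(to_avro_record(["3110", ""]))          # {'SRCE_BOOK_BRNCH': 3110, 'NUM_FCLTY': None}
print(to_avro_record(["3110", "1.30E+12"]))  # {'SRCE_BOOK_BRNCH': 3110, 'NUM_FCLTY': '1.30E+12'}
```

The produce call would then become avroProducer.produce(topic='2col', value=to_avro_record(row)).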

ravi

Sacha Barber

Jul 18, 2018, 9:54:27 AM
to Confluent Platform
One of the best ways is to use the Kafka Schema Registry and then run your producer/consumer against that. Then you get pretty good error messages.

The Schema Registry has a nice REST API too.
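For example, with the registry on localhost:8081 as in the original config, a couple of standard Schema Registry REST calls for inspecting what is registered (subject naming here assumes the usual topic-value convention the Avro serializer uses):

```shell
# List all registered subjects
curl http://localhost:8081/subjects

# Fetch the latest registered version of the value schema for topic "2col"
curl http://localhost:8081/subjects/2col-value/versions/latest
```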

denizen...@gmail.com

Jul 18, 2018, 10:48:59 AM
to Confluent Platform
Hello Sacha,

Thank you again. My setup is 3 ZKs, 3 brokers, and a 4th separate VM for the Schema Registry, Control Center, KSQL, etc. I can curl PUT the schema ahead of time or have the program push the schema with the data, and I also delete the schema before each run to eliminate schema clashes. In both cases I get the same error. Since posting this I have reduced from 2 columns of data to 1 column and still face the same issue/error, which leads me to believe there is something I am not understanding about how the schema is being stored and/or how the serialization is being performed.

ravi

Sacha Barber

Jul 18, 2018, 10:55:28 AM
to Confluent Platform
Sadly I'm just off on hols for 3 weeks now, but as it happens I just posted a series of blogs on Avro and the Schema Registry:

https://sachabarbs.wordpress.com

It's a 4-part series; it may help you.

Sorry but have to go on hols now

Good luck

denizen...@gmail.com

Jul 18, 2018, 8:50:17 PM
to Confluent Platform
Thank you, and have a wonderful holiday. Hopefully you come back with a part 5 in Python. LOL

Sacha Barber

Jul 19, 2018, 5:21:13 AM
to Confluent Platform
Ha ha, yeah, Python and Go are on my list.

denizen...@gmail.com

Jul 23, 2018, 9:38:32 PM
to Confluent Platform
Hi all,

Maybe I am not asking the question correctly. Can one use any Avro type besides string when producing to Kafka? I have been successful in defining a schema with all my elements as strings and sending them as strings. Thinking down the road, I want to use KSQL, so does it even make sense to define an Avro schema on the stream? Maybe a better approach is to have a raw topic that the data comes in on, and then use KSQL to cast the data into the correct types?
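The raw-topic-plus-cast approach described above might look roughly like this in KSQL (a sketch, not tested against 4.1.1; the topic and column names are from the thread, the stream names are mine):

```sql
-- Declare the raw topic as a stream of strings
CREATE STREAM raw_2col (SRCE_BOOK_BRNCH VARCHAR, NUM_FCLTY VARCHAR)
  WITH (KAFKA_TOPIC='2col', VALUE_FORMAT='DELIMITED');

-- Derive a typed, Avro-backed stream by casting
CREATE STREAM typed_2col WITH (VALUE_FORMAT='AVRO') AS
  SELECT CAST(SRCE_BOOK_BRNCH AS INTEGER) AS SRCE_BOOK_BRNCH, NUM_FCLTY
  FROM raw_2col;
```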

Thoughts?

ravi