XML Converter for Kafka Connect

Saravanan Tirugnanum

Jan 19, 2016, 11:51:53 AM
to Confluent Platform
Hi 

I see that JsonConverter ships with Kafka and is the default converter for Kafka Connect.
I am looking for a custom converter that produces XML instead of JSON. Any reference would help.

Regards
Saravanan

Ewen Cheslack-Postava

Jan 21, 2016, 5:02:03 PM
to Confluent Platform
Saravanan,

JSON was chosen for inclusion with Kafka because it requires minimal dependencies (both in terms of jars and services to support it). Confluent also provides an AvroConverter. I'm not aware of any XML implementation yet. If you're interested, we can offer guidance on building one. Converters are usually straightforward code; the effort goes into making sure every data type is handled correctly.

-Ewen

--
You received this message because you are subscribed to the Google Groups "Confluent Platform" group.
To unsubscribe from this group and stop receiving emails from it, send an email to confluent-platf...@googlegroups.com.
To post to this group, send email to confluent...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/confluent-platform/5a80727d-e4b1-4d0a-8dec-fe300337f350%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Thanks,
Ewen

Saravanan Tirugnanum

Jan 28, 2016, 2:33:41 PM
to Confluent Platform
Thank you, Ewen. I would appreciate guidance on overriding this in Kafka Connect. I believe we would also need to override the data package classes, such as Struct and Schema, since I would be generating XML based on a schema (XSD). Could you give some direction on the changes needed to build an XML converter?

Regards
Saravanan

Saravanan Tirugnanum

Jan 29, 2016, 10:29:57 AM
to Confluent Platform
Hi Ewen,

Any luck with this? Could you please provide some guidance?

Regards
Saravanan

Ewen Cheslack-Postava

Jan 29, 2016, 1:08:13 PM
to Confluent Platform
Saravanan,

You don't need to override classes like Struct and Schema. Those are the layer of abstraction that Connect provides so connectors don't need to use serialization-specific classes (for example, they don't need GenericRecord and Avro's Schema class for Avro support). To add XML support, all you should need to do is write an XmlConverter class that implements the Converter interface (see https://github.com/apache/kafka/blob/trunk/connect/api/src/main/java/org/apache/kafka/connect/storage/Converter.java).

The interface is small, but the two conversion methods (fromConnectData and toConnectData) each need to handle all possible schemas and data types. See the JsonConverter (https://github.com/apache/kafka/blob/trunk/connect/json/src/main/java/org/apache/kafka/connect/json/JsonConverter.java) and AvroConverter (https://github.com/confluentinc/schema-registry/blob/master/avro-converter/src/main/java/io/confluent/connect/avro/AvroConverter.java) as examples. If you already have an XML Serializer/Deserializer implementation, you can usually reuse that code and just implement the Connect Data API <-> XML API conversions.
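
To give a feel for the "handle all the types" part, here is a minimal sketch of the serializing direction using only JDK classes. It is not the Converter interface itself: a plain Map stands in for a Connect Struct and a List for an array, and the names XmlValueWriter, toXml, and the "item" element used for list entries are all hypothetical choices made for illustration. A real converter would walk the Connect Schema and value instead, and would likely report failures with Connect's DataException rather than IllegalStateException.

```java
import java.io.ByteArrayOutputStream;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper: serializes a tree of Maps, Lists and scalars to XML bytes.
final class XmlValueWriter {

    // Returns UTF-8 XML for the value, wrapped in an element named rootName.
    static byte[] toXml(String rootName, Object value) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            XMLStreamWriter writer =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
            writeElement(writer, rootName, value);
            writer.writeEndDocument(); // closes anything still open
            writer.flush();
            writer.close();
            return out.toByteArray();
        } catch (XMLStreamException e) {
            // Assumption: a real converter would throw Connect's DataException here.
            throw new IllegalStateException("Failed to serialize value as XML", e);
        }
    }

    // Recursively writes one element; each branch handles one kind of value.
    private static void writeElement(XMLStreamWriter w, String name, Object value)
            throws XMLStreamException {
        if (value == null) {
            // Null becomes an empty element.
            w.writeEmptyElement(name);
        } else if (value instanceof Map) {
            // Struct-like value: one child element per entry.
            // Assumption: keys are valid XML element names.
            w.writeStartElement(name);
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                writeElement(w, String.valueOf(entry.getKey()), entry.getValue());
            }
            w.writeEndElement();
        } else if (value instanceof List) {
            // Array-like value: each entry goes into an "item" child element.
            w.writeStartElement(name);
            for (Object item : (List<?>) value) {
                writeElement(w, "item", item);
            }
            w.writeEndElement();
        } else {
            // Scalar: written as text; the writer escapes markup characters.
            w.writeStartElement(name);
            w.writeCharacters(String.valueOf(value));
            w.writeEndElement();
        }
    }
}
```

Calling XmlValueWriter.toXml("record", someMap) produces a record element with one child element per map entry. The deserializing direction would need the mirror image of this walk, plus a decision about how schema information is carried.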

-Ewen

James Cheng

Jan 29, 2016, 3:47:18 PM
to Confluent Platform
After implementing and compiling those classes, here is how to use them:

1) Edit your Connect worker properties file and set key.converter and value.converter to the fully qualified names of your converter classes.
2) Add your classes (or jar) to the classpath when you run Connect.

Concretely, if your normal command line is
bin/connect-standalone.sh config/connect-standalone.properties connector1.properties [connector2.properties ...]

then do something like this:
1) Change key.converter and value.converter in config/connect-standalone.properties.
2) Run
export CLASSPATH=/path/to/your/jar:${CLASSPATH}
bin/connect-standalone.sh config/connect-standalone.properties connector1.properties [connector2.properties ...]

-James

Saravanan Tirugnanum

Jan 29, 2016, 3:57:16 PM
to Confluent Platform
Thank you Ewen and James for the support.

I have a schema (XSD) file to validate the generated XML, and I have generated annotated classes from that schema. Is there any way to use those schema classes inside our framework?

Regards
Saravanan

Ewen Cheslack-Postava

Jan 29, 2016, 4:49:01 PM
to Confluent Platform
Saravanan,

It sounds like you might be trying to handle just a single schema. Converters should be general: they should be able to convert any Connect Schema to an XSD, and the data for that schema to an XML document, regardless of the schema involved. You can of course implement one that works for only a single schema, but it won't be very usable unless connectors happen to use that specific schema.

With source connectors, almost none would work unless the schema they emit happens to match your existing XML schema. A single-schema converter can work with sink connectors, but only for that one type of data.
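
As a rough illustration of the general approach, the sketch below maps Connect's primitive types to XSD built-in types and renders one element declaration per field. The enum is a local stand-in that mirrors the primitive members of Connect's Schema.Type so the example runs without Kafka on the classpath. The particular XSD type chosen for each Connect type, and the names ConnectToXsdTypes, xsdTypeFor, and elementDeclaration, are assumptions made for illustration, not an established mapping.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper: maps Connect primitive types to XSD built-in types.
final class ConnectToXsdTypes {

    // Local stand-in mirroring the primitive members of Connect's Schema.Type.
    enum ConnectType { INT8, INT16, INT32, INT64, FLOAT32, FLOAT64, BOOLEAN, STRING, BYTES }

    private static final Map<ConnectType, String> XSD_TYPES = new EnumMap<>(ConnectType.class);

    static {
        // Assumed mapping, chosen so the value ranges line up.
        XSD_TYPES.put(ConnectType.INT8, "xs:byte");
        XSD_TYPES.put(ConnectType.INT16, "xs:short");
        XSD_TYPES.put(ConnectType.INT32, "xs:int");
        XSD_TYPES.put(ConnectType.INT64, "xs:long");
        XSD_TYPES.put(ConnectType.FLOAT32, "xs:float");
        XSD_TYPES.put(ConnectType.FLOAT64, "xs:double");
        XSD_TYPES.put(ConnectType.BOOLEAN, "xs:boolean");
        XSD_TYPES.put(ConnectType.STRING, "xs:string");
        XSD_TYPES.put(ConnectType.BYTES, "xs:base64Binary");
    }

    // Looks up the XSD built-in type name for a Connect primitive type.
    static String xsdTypeFor(ConnectType type) {
        return XSD_TYPES.get(type);
    }

    // Renders one xs:element declaration; optional fields get minOccurs="0".
    // Assumption: fieldName is already a valid XML name, so it is not escaped.
    static String elementDeclaration(String fieldName, ConnectType type, boolean optional) {
        String minOccurs = optional ? " minOccurs=\"0\"" : "";
        return "<xs:element name=\"" + fieldName + "\" type=\"" + xsdTypeFor(type) + "\""
            + minOccurs + "/>";
    }
}
```

A general converter would apply something like this to every field of whatever schema a connector supplies, and would also need rules for the ARRAY, MAP, and STRUCT types, which this sketch leaves out.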

-Ewen

Saravanan Tirugnanum

Feb 1, 2016, 9:51:07 AM
to Confluent Platform
Thanks for the input, Ewen. I asked because I wondered whether there was an easier way to implement this if the schema is not going to change. Anyway, I understand your point. I will start implementing this and share updates soon.

Regards
Saravanan

rahmath...@gmail.com

Nov 6, 2018, 5:20:28 AM
to Confluent Platform
Hi Ewen,

I have to ingest the attached XML into a Hive table, and I have already built the pipeline that transforms the data and loads it into Hive. What I need is to put a Kafka cluster between the data center (where the logs are generated) and that pipeline, and push the XML logs into the Kafka cluster. The logs then need to be consumed by a streaming job and written to an HDFS location.

Once the XML logs arrive in the HDFS location, the rest of the pipeline is fine.


log.xml

Jordan Moore

Nov 26, 2018, 6:21:15 PM
to Confluent Platform