Schema Field Types for JSON Converter?


goramedf...@gmail.com

Feb 12, 2018, 6:03:04 PM
to Confluent Platform
Hey all,

I've gotten Kafka Connect (3.3.1) processing some JSON messages and writing out to S3 nicely (hat-tip to some useful posts by rmoff having to do with the specific format needed for JSON messages). The issue is that the data being transmitted contains fields that are DECIMAL(18,10) and I can't tell if that's supported by the JSON converter. The following works:

{
  "type": "double",
  "optional": false,
  "field": "price"
}

However, there's a loss of fidelity here. The following doesn't work:

{
  "type": "decimal",
  "optional": false,
  "field": "price"
}

I get an error saying that's an invalid field type when I kick off my connector. I've looked through the code here and only see floats. I'm surely missing something, and would much appreciate help from anyone who knows whether my use case (decimal data types) is supported by Kafka Connect 3.3.1. My connector config is shown below.

{
  "name": "foo",
  "config": {
    "connector.class": "io.confluent.connect.s3.S3SinkConnector",
    "tasks.max": "1",
    "topics": "bar",
    "s3.region": "<SNIP>",
    "s3.bucket.name": "<SNIP>",
    "s3.part.size": "26214400",
    "flush.size": "3000000",
    "storage.class": "io.confluent.connect.s3.storage.S3Storage",
    "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
    "schema.generator.class": "io.confluent.connect.storage.hive.schema.DefaultSchemaGenerator",
    "partitioner.class": "io.confluent.connect.storage.partitioner.TimeBasedPartitioner",
    "partition.duration.ms": "3600000",
    "schema.compatibility": "NONE",
    "rotate.schedule.interval.ms": "900000",
    "path.format": "'year'=YYYY/'month'=MM/'day'=dd/'hour'=HH",
    "locale": "en_US",
    "timezone": "UTC",
    "name": "s3-sink"
  }
}
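For reference, the schemas-enabled JsonConverter expects each message to carry a schema/payload envelope. A minimal sketch of what a message for the working "double" schema above might look like (the struct name "record" and the price value are illustrative, not from the thread):

```python
import json

# Sketch of the envelope the schemas-enabled JsonConverter reads:
# a "schema" describing the struct, and a "payload" with the values.
message = {
    "schema": {
        "type": "struct",
        "fields": [
            {"type": "double", "optional": False, "field": "price"}
        ],
        "optional": False,
        "name": "record",  # illustrative struct name
    },
    "payload": {"price": 1.25},
}

print(json.dumps(message))
```

The converter uses the embedded schema to deserialize the payload, which is why the per-field type declarations matter here.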

goramedf...@gmail.com

Feb 17, 2018, 4:47:08 PM
to Confluent Platform
Anyone have any insights? How can I use a DECIMAL type with the JsonConverter? I know I could convert to INT in the application and send it over that way, but is this datatype supported with the JsonConverter and S3 as the sink?

Ewen Cheslack-Postava

Feb 21, 2018, 7:21:40 PM
to Confluent Platform
Yeah, this isn't documented all that well because most people using JSON use it without schemas. The way logical types work is that they use one of the primitive types as an encoding, but we pass along enough information to know that an extra step of converting to the logical type is needed when deserializing. In the case of decimals, the value is encoded using the primitive type bytes. The logical type is identified by the name of the class implementing it, in this case org.apache.kafka.connect.data.Decimal. This logical type also has a parameter, scale.

I believe this should work:

{
  "type": "bytes",
  "name": "org.apache.kafka.connect.data.Decimal",
  "parameters": { "scale" : <scale> },
  "optional": false,
  "field": "price"
}

-Ewen
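Concretely, the bytes for a Decimal field are the big-endian two's-complement representation of the unscaled value (value × 10^scale), which ends up base64-encoded in the JSON payload. A sketch of producing that string (the helper name is ours, not from Connect):

```python
import base64
from decimal import Decimal


def encode_connect_decimal(value: str, scale: int) -> str:
    """Base64-encode a decimal the way the Connect Decimal logical
    type's bytes payload is serialized: big-endian two's-complement
    bytes of the unscaled integer (value * 10**scale)."""
    unscaled = int(Decimal(value).scaleb(scale))  # shift by the scale
    # Enough bytes to hold the value plus a sign bit; may emit one
    # extra sign byte compared to Java's minimal BigInteger encoding,
    # which decoders still accept.
    length = max(1, (unscaled.bit_length() + 8) // 8)
    raw = unscaled.to_bytes(length, byteorder="big", signed=True)
    return base64.b64encode(raw).decode("ascii")
```

So a price of 1.25 with scale 10 is carried as the unscaled integer 12500000000, base64-encoded.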

--
You received this message because you are subscribed to the Google Groups "Confluent Platform" group.
To unsubscribe from this group and stop receiving emails from it, send an email to confluent-platform+unsub...@googlegroups.com.
To post to this group, send email to confluent-platform@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/confluent-platform/9b38802c-17f4-4da8-89da-b4997774e95c%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

goramedf...@gmail.com

Feb 21, 2018, 11:20:49 PM
to Confluent Platform
Thanks for clarifying that! 

So, a follow-up: should my JSON payload for this particular field be a string or a byte sequence? I assume the former, since the latter would need to be base64-encoded or similar to be valid JSON. So then, if I understand correctly, the JsonConverter will read the string representation of the number for the field (price in this case), then hand it off to the Decimal logical type, which converts it to a decimal representation. Am I on the right track here?