Adding binary property support


xiant...@gmail.com

Apr 15, 2014, 10:33:45 AM
to jsonschema...@googlegroups.com
Hello,

  I have some JSON documents that contain base64 encoded binary data.  I would like to generate byte[] types for these properties.  Before I add support for this, I thought I would ask for some input, to make sure my change has a chance of being pulled into the main project.

  The JSON Schema core and validation specifications do not seem to have any support for binary encoded data, but the JSON Hyper-Schema specification does define a "media" property.  I am thinking of adding limited support for this property, with a structure like:

"binaryProperty": {
  "type": "string",
  "media": {
    "binaryEncoding": "base64",
    "type": "application/octet-stream"
  }
}

  That will render into fields like:

byte[] binaryProperty;

public byte[] getBinaryProperty() {
  return this.binaryProperty;
}

public void setBinaryProperty( byte[] binaryProperty ) {
  this.binaryProperty = binaryProperty;
}

  Before I add this functionality, I have three questions:

1) Am I missing something like this that already exists?
2) Would the project accept this functionality?
3) Is there a preferred way to handle unknown media types?

-Christian Trimble

Joe Littlejohn

Apr 15, 2014, 11:52:37 AM
to jsonschema...@googlegroups.com
Christian,

This sounds like a nice addition to me. Although I expect most people don't have this requirement, I'd certainly be interested in merging this if you opened a PR. Obviously people will have to configure their data-binding library (e.g. Jackson or Gson) in a way that the byte[] property can be understood, but I don't see any reason why, if they explicitly craft a schema to define binary encoded properties, we shouldn't support them.

Based on draft04 specs I agree that 'media' is the best rule to check for. I think that the behaviour should be:

    "For any property of type "string", the presence of a "media" object having a "binaryEncoding" property would indicate that the Java type will be byte[]"

It seems to me that the *values* of "media.binaryEncoding" and "media.type" are irrelevant as far as the Java types are concerned. So in answer to your questions:

> 1) Am I missing something like this that already exists?

I don't know of anything. This looks sensible to me so feel free to continue as you've suggested.

> 2) Would the project accept this functionality?

Of course, if you find out that there are some nasty side-effects to this, or some practical compatibility problems with how jsonschema2pojo currently functions, then this might stop us merging your changes into the core. I don't see any right now and I'd hope we can include this.

> 3) Is there a preferred way to handle unknown media types?

Maybe I'm wrong but I don't think this impacts the Java types at all - what do you think? The spec explicitly states that this only applies to strings, and if we see a media property that has any value for binaryEncoding we should use a byte array as the field type.
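
The rule proposed in this message can be sketched in a few lines. This is only an illustration under stated assumptions, not jsonschema2pojo's actual implementation: the class `BinaryTypeRule` and the helper `mapsToByteArray` are hypothetical names, and the schema node is assumed to be already parsed into a plain `Map`.

```java
import java.util.Map;

class BinaryTypeRule {

    // Hypothetical helper (not part of jsonschema2pojo): returns true when a
    // schema node should be generated as byte[] under the rule discussed here.
    static boolean mapsToByteArray(Map<String, Object> schema) {
        // The rule only applies to properties of type "string".
        if (!"string".equals(schema.get("type"))) {
            return false;
        }
        Object media = schema.get("media");
        // Only the presence of media.binaryEncoding matters, not its value,
        // and media.type is ignored entirely.
        return media instanceof Map
                && ((Map<?, ?>) media).containsKey("binaryEncoding");
    }

    public static void main(String[] args) {
        Map<String, Object> binary = Map.of(
                "type", "string",
                "media", Map.of("binaryEncoding", "base64",
                                "type", "application/octet-stream"));
        Map<String, Object> plain = Map.of("type", "string");

        System.out.println(mapsToByteArray(binary)); // true
        System.out.println(mapsToByteArray(plain));  // false
    }
}
```

Under this sketch an unknown media type or an unusual binaryEncoding value still yields byte[], which matches the view that those values do not affect the Java type.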

Cheers





Andrew Todd

Apr 15, 2014, 11:53:13 AM
to jsonschema...@googlegroups.com
Not the maintainer, but:

On Tue, Apr 15, 2014 at 10:33 AM, <xiant...@gmail.com> wrote:
> 1) Am I missing something like this that already exists?

I think that the "format" property should work well for what you're looking for.

{
    "type": "string",
    "format": "base64",
    "pattern": "^[A-Za-z0-9+/=_-]*$"
}

Using "media" to describe the format of a string field does not make sense to me.
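
One caveat when writing a pattern like this: inside a character class, a hyphen placed between two characters denotes a range, so `=-_` covers every character from `=` to `_` (including `@` and `?`) and does not match a literal hyphen. Placing the hyphen last makes it literal. A minimal sketch comparing the two forms with `java.util.regex`, whose character-class rules agree with the JSON Schema regex dialect on this point; the class and constant names are illustrative:

```java
import java.util.regex.Pattern;

class Base64PatternCheck {

    // Hyphen between '=' and '_' is read as the range U+003D..U+005F.
    static final Pattern RANGE = Pattern.compile("^[A-Za-z0-9+/=-_]*$");

    // Hyphen at the end of the class is a literal hyphen.
    static final Pattern LITERAL = Pattern.compile("^[A-Za-z0-9+/=_-]*$");

    public static void main(String[] args) {
        System.out.println(RANGE.matcher("a@b").matches());        // true: '@' is inside the range
        System.out.println(RANGE.matcher("a-b").matches());        // false: '-' is not in the class
        System.out.println(LITERAL.matcher("a@b").matches());      // false
        System.out.println(LITERAL.matcher("a-b_c").matches());    // true
        System.out.println(LITERAL.matcher("aGVsbG8=").matches()); // true
    }
}
```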

Andrew Todd

Apr 15, 2014, 11:55:35 AM
to jsonschema...@googlegroups.com
Also, does Jackson handle serializing/deserializing Base64 strings to byte arrays, or is there some other magic there that's not shown in the code sample?

xiant...@gmail.com

Apr 15, 2014, 12:02:18 PM
to jsonschema...@googlegroups.com
Yes, it does.  This works with Jackson out of the box.
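
For readers unfamiliar with the conversion involved, here is a minimal sketch of the round trip between a byte[] field and the Base64 text that appears in the JSON document. It uses `java.util.Base64` directly rather than Jackson so that it stays self-contained; the class and method names are illustrative only.

```java
import java.util.Arrays;
import java.util.Base64;

class Base64RoundTrip {

    // Encodes raw bytes as the Base64 text that would appear in the JSON string.
    static String toJsonText(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Decodes the Base64 text from the JSON string back into raw bytes.
    static byte[] fromJsonText(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] raw = {1, 10, (byte) 255};
        String text = toJsonText(raw);
        System.out.println(text);                                   // AQr/
        System.out.println(Arrays.equals(raw, fromJsonText(text))); // true
    }
}
```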

xiant...@gmail.com

Apr 15, 2014, 12:13:26 PM
to jsonschema...@googlegroups.com, xiant...@gmail.com
Andrew,

  I see your point about format, but "base64" is not a format value defined in the current spec, whereas the media block does have a definition behind it.  I need compatibility with other libraries, so sticking to a published spec is critical.

Joe,

  I think we are on the same page with what should trigger this functionality.  I will open a pull request for my branch, just before I start work.  Any input on the change would be appreciated.

Thanks,
Christian

Joe Littlejohn

Apr 15, 2014, 12:32:46 PM
to jsonschema...@googlegroups.com

> Using "media" to describe the format of a string field does not make sense to me.


I tend to agree that this is a bit unintuitive, but we don't really have the luxury of rethinking where this should/could have been applied in the specification. It's clear that the 'media' rule is the chosen mechanism for declaring that properties contain binary encoded data so I think it makes sense for us to honour that decision.

Interestingly, if you check draft03 you'll see that even before the media rule was added the appropriate rule would have been "contentEncoding". So the decision to keep encoding and format distinct was made a long time ago.


Joe Littlejohn

Apr 15, 2014, 12:35:01 PM
to jsonschema...@googlegroups.com

> I think we are on the same page with what should trigger this functionality.  I will open a pull request for my branch, just before I start work.  Any input on the change would be appreciated.


Sounds good. If you need any advice on the implementation feel free to use this list, github, or email me directly.

Cheers

Andrew Todd

Apr 15, 2014, 3:28:19 PM
to jsonschema...@googlegroups.com
On Tue, Apr 15, 2014 at 12:32 PM, Joe Littlejohn <joelit...@gmail.com> wrote:

>> Using "media" to describe the format of a string field does not make sense to me.
>
> I tend to agree that this is a bit unintuitive, but we don't really have the luxury of rethinking where this should/could have been applied in the specification. It's clear that the 'media' rule is the chosen mechanism for declaring that properties contain binary encoded data so I think it makes sense for us to honour that decision.

I don't use Hyper-Schema, so that was a red flag for me. It's actually pretty bizarre that it is in hyper and not in validation. Oh well.

lespe...@gmail.com

Jan 7, 2019, 1:11:00 PM
to jsonschema2pojo-users
I am trying to do the same thing for attachments.

"Attachments": [
  {
    "Base64Data": "String content",
    "Data": {
      "type": "string",
      "format": "base64",
      "pattern": "^[A-Za-z0-9+/=_-]*$"
    },
    "FileExtension": "String content",
    "Label": "String content"
  }
]


I have also tried "Data": [1, 10, 255]


I want Data to be generated as a byte[] field, like this:

@JsonProperty("data")
private byte[] data = null;

Any suggestions on the best way to solve this with jsonschema2pojo?