allowing file/blobs in jsonschema


jack phelan

Aug 26, 2012, 4:31:41 PM
to json-...@googlegroups.com
Suppose I anticipate either a json error message, or a file blob example, is there any way to incorporate the blob expectation into the existing schema format?

Francis Galiegue

Aug 31, 2012, 5:29:46 AM
to json-...@googlegroups.com
Unlikely so. Correct me if I am mistaken, but these blobs, or
documents etc, don't seem to be JSON at all, are they?

--
Francis Galiegue, fgal...@gmail.com
"It seems obvious [...] that at least some 'business intelligence'
tools invest so much intelligence on the business side that they have
nothing left for generating SQL queries" (Stéphane Faroult, in "The
Art of SQL", ISBN 0-596-00894-5)

penduin

Sep 6, 2012, 1:13:48 PM
to json-...@googlegroups.com
I would think that a "blob" in any conventional sense would lead to invalid JSON, let alone anything validate-able against a schema...  But for cases where we need to embed data, perhaps some sort of recognition of base64-encoded data (or UU, or anything) might be a handy thing for json-schema to recognize.

What might work is a simple "string" type with a "format": "blob" which you define, though you'd have to add your own validation for your own format types.

Maybe we should suggest the inclusion of "format": "base64" (or something) in the spec?  Any blob can be encoded into a string which could be "verified" in as much as it is decode-able.  That doesn't help if you need to examine the contents of the blob, but it would tell you that the data _is_ one.
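To make the "verified in as much as it is decode-able" idea concrete, here is a minimal sketch in Python. It is not part of any spec or existing validator; the function name is_base64_blob is made up for illustration, and it assumes the standard Base64 alphabet with padding.

```python
import base64
import binascii


def is_base64_blob(value: str) -> bool:
    """Return True if value decodes cleanly as standard, padded Base64.

    Hypothetical helper: this only proves the string is decodable,
    it says nothing about what the decoded bytes contain.
    """
    try:
        # validate=True rejects characters outside the Base64 alphabet
        # instead of silently discarding them.
        base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True
```

A custom "format" handler could call something like this on the string value; examining the decoded contents would still be a separate step.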

Does that help at all, or have I completely misunderstood what you're after?

Francis Galiegue

Sep 6, 2012, 1:38:27 PM
to json-...@googlegroups.com
On Thu, Sep 6, 2012 at 7:13 PM, penduin <owen...@gmail.com> wrote:
> I would think that a "blob" in any conventional sense would lead to invalid
> JSON, let alone anything validate-able against a schema... But for cases
> where we need to embed data, perhaps some sort of recognition of
> base64-encoded data (or UU, or anything) might be a handy thing for
> json-schema to recognize.
>
> What might work is a simple "string" type with a "format": "blob" which you
> define, though you'd have to add your own validation for your own format
> types.
>
> Maybe we should suggest the inclusion of "format": "base64" (or something)
> in the spec? Any blob can be encoded into a string which could be
> "verified" in as much as it is decode-able. That doesn't help if you need
> to examine the contents of the blob, but it would tell you that the data
> _is_ one.
>
> Does that help at all, or have I completely misunderstood what you're after?
>

Eh, I don't know whether this is a coincidence, but...

https://github.com/fge/json-schema-formats/commit/0ad95c5de933acececed2d7074d62a45019055c3

Like you, I believe that what the linked page calls a blob is a Base64
representation of the content but I didn't bother (because that's the
word, really, when it comes to W3C specs) reading further to
understand the anatomy of such a blob.

The commit above is but a hypothetical format attribute (the spec
allows for adding attributes, I take advantage of that), but I
genuinely think this one can be made a full-blown part of the
specification.

Cheers,
--
Francis Galiegue, fgal...@gmail.com
JSON Schema: https://github.com/json-schema

penduin

Sep 6, 2012, 1:43:36 PM
to json-...@googlegroups.com
Hah! Coincidence indeed!  I think we should push for this being included in the spec, if such a suggestion isn't already underway.  You're clearly not the only one who needs to put blobs in JSON :^)

Eric Stob

Sep 6, 2012, 1:50:53 PM
to json-...@googlegroups.com, json-...@googlegroups.com
Base 64 is what you are looking for.
You can validate it with a regular expression.

Sent from my iPhone

Francis Galiegue

Sep 6, 2012, 2:38:31 PM
to json-...@googlegroups.com
On Thu, Sep 6, 2012 at 7:50 PM, Eric Stob <eric...@blekko.com> wrote:
> Base 64 is what you are looking for.
> You can validate it with a regular expression.
>

A regular expression is just a waste at this point. Well, OK, this
might work, but then you have to remember one fundamental (even though
only implicit) requirement of JSON Schema: _regexes must conform to
the ECMA 262 regex grammar_.

If someone comes up with a super intelligent regex to accurately
validate a Base64 encoded string, BUT this regex uses a positive
lookbehind, _this will not be a valid ECMA 262 regex_, since ECMA
262 regexes have no support for lookbehinds.

And in any case, a character matcher, in this case, will be faster anyway.
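For illustration, a character matcher along those lines could look like the sketch below. It is written in Python rather than taken from any validator's code, the names BASE64_ALPHABET and looks_like_base64 are invented here, and it only checks the shape of the string (alphabet, length, trailing padding), not the padding bits.

```python
import string

# Standard Base64 alphabet, without the "=" padding character.
BASE64_ALPHABET = frozenset(string.ascii_letters + string.digits + "+/")


def looks_like_base64(text: str) -> bool:
    """Check Base64 shape by scanning characters, with no regex involved."""
    # Padded Base64 always comes in groups of four characters.
    if len(text) % 4 != 0:
        return False
    body = text.rstrip("=")
    # At most two padding characters are allowed, and only at the end.
    if len(text) - len(body) > 2:
        return False
    # Any "=" left in the body is misplaced and fails the alphabet check.
    return all(ch in BASE64_ALPHABET for ch in body)
```

A single pass over the characters sidesteps the question of which regex dialect the validator has to support.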

DO NOT, EVER, CONSIDER REGULAR EXPRESSIONS AS AN ACCURATE VALIDATION PROCESS.

penduin

Sep 7, 2012, 12:26:42 PM
to json-...@googlegroups.com
Somehow, we all missed this:
{ "type": "string", "contentEncoding": "base64",
"mediaType": "image/png" }
...already in the spec, no need for a "base64" format or anything.
So, short and final answer to the blob question:
"contentEncoding" for how it's encoded (rfc2045#section-6.1)
"mediaType" for what kind of data it is
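As a rough sketch of how a consumer might act on those two keywords, here is a small Python example. It is an assumption-laden illustration, not something the spec or any library prescribes: decode_blob is a hypothetical helper, it handles only "base64", and the payload is just the 8-byte PNG signature rather than a real image.

```python
import base64

# The 8-byte signature that starts every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

schema = {
    "type": "string",
    "contentEncoding": "base64",
    "mediaType": "image/png",
}


def decode_blob(schema, instance):
    """Return (media type, decoded bytes) for a string instance.

    Hypothetical helper: reads the schema keywords as hints and does
    the decoding itself; only the base64 encoding is handled here.
    """
    if schema.get("contentEncoding") != "base64":
        raise ValueError("this sketch only handles base64")
    return schema.get("mediaType"), base64.b64decode(instance, validate=True)


# Stand-in for a blob received inside a JSON document.
payload = base64.b64encode(PNG_SIGNATURE).decode("ascii")
media_type, data = decode_blob(schema, payload)
```

Here the keywords tell the consumer how to interpret the string; the decoding and any check of the bytes are done by the consumer's own code.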


On Sunday, August 26, 2012 2:31:41 PM UTC-6, jack phelan wrote:

Francis Galiegue

Sep 7, 2012, 12:58:53 PM
to json-...@googlegroups.com
On Fri, Sep 7, 2012 at 6:26 PM, penduin <owen...@gmail.com> wrote:
> Somehow, we all missed this:
> {
> "type": "string",
> "contentEncoding": "base64",
> "mediaType": "image/png"
> }
> ...already in the spec, no need for a "base64" format or anything.
> So, short and final answer to the blob question:
> "contentEncoding" for how it's encoded (rfc2045#section-6.1)
> "mediaType" for what kind of data it is
>

That is section 6 of the spec, however, and the purpose is to inform
user agents, not validate...

penduin

Sep 7, 2012, 9:31:19 PM
to json-...@googlegroups.com
So it is!  Well, I feel less silly about missing it before, then; I haven't had the need to dig into hyper schema.