links

Andrei Neculau

Oct 15, 2012, 10:15:54 AM
to json-...@googlegroups.com
Are links supposed to cover the schema's hypermedia or the instance's hypermedia?

Geraint (David)

Oct 15, 2012, 10:59:54 AM
to json-...@googlegroups.com
They define links for the instance. From the v3 spec:

   The relation to the target SHOULD be
   interpreted as specifically from the instance object that the schema
   (or sub-schema) applies to

Andrei Neculau

Oct 15, 2012, 3:20:24 PM
to json-...@googlegroups.com
Cool. Thanks for confirming.
I read the specs as well, but then why make it impossible to define the instance as having a link with a specific rel, but with an unknown href?
The schema for links says that the link must have a rel and an href. https://github.com/json-schema/json-schema/blob/master/draft-04/links
That must hold true for the instance, but I argue against it holding true for the schema of the instance.

PS: I actually bumped into this question while trying out jsonary - it fails when I paste in a JSON-schema of mine (which has all the links with a rel, but with no href).

Geraint (David)

Oct 16, 2012, 3:45:09 AM
to json-...@googlegroups.com
I'm a little confused - if you have a link without an href, how can anyone use it?  Isn't that like saying:

"Here's a link to all the comments on this article."
"Great, thanks!  Hey, um - where can I actually find these comments?"
"It's a secret."

Andrei Neculau

Oct 16, 2012, 10:12:30 AM
to json-...@googlegroups.com
Let's see if I can confuse you more.

given a SCHEMA
{
  "type": "object",
  "links": [{
    "rel": "self",
    "href": ???
  }]
}

and an INSTANCE that follows SCHEMA
{
  "test": true
}

how does the SCHEMA know the URI of INSTANCE?


Still confused? Or am I totally in the wrong here?
As I see it, the schema declares that the instances it describes might have a link with a certain rel. Maybe a certain method, etc. But nothing more (not the URI). The URI, whether the link is shown or not, etc is decided at run-time. I may not have the right authorization, so then links are not shown to me, or the URI might go through a redirection service that lets me first log in and then follow the link, etc.

Geraint (David)

Oct 16, 2012, 11:11:36 AM
to json-...@googlegroups.com
"The URI, whether the link is shown or not, etc is decided at run-time"

How are they decided?  If they are being decided using magic code in the client, then what's the point in even including the rel and method?  The link definitions in the schemas are not useful.

However, the RESTful way to approach this situation is to include the link URIs in the item.  Say you have this schema:
{
  "type": "object",
  "links": [
    {"href": "{id}", "rel": "self"},
    {"href": "{author}", "rel": "author"},
    {"href": "{reply}", "rel": "action-reply", "method": "POST", "schema": {...}}
  ]
}

This instance:
{
  "test": true
}
would have the "self" link defined, but would not have enough information to automatically assemble the others.

This instance:
{
  "author": "/authors/andrei"
}
would have the "self" link and the "author" link.

Andrei Neculau

Oct 16, 2012, 2:03:15 PM
to json-...@googlegroups.com


On Tuesday, October 16, 2012 5:11:36 PM UTC+2, Geraint (David) wrote:
> "The URI, whether the link is shown or not, etc is decided at run-time"
>
> How are they decided?  If they are being decided using magic code in the client, then what's the point in even including the rel and method?  The link definitions in the schemas are not useful.


Think of it this way. You can define a property to be of type integer, but you do not assign its value in the schema, right?
That's my analogy to a link's rel and href.

The point of including the link's rel as part of the schema is that you define the instance to be linked with another instance via a relation.
 
> However, the RESTful way to approach this situation is to include the link URIs in the item.  Say you have this schema:
> {
>   "type": "object",
>   "links": [
>     {"href": "{id}", "rel": "self"},
>     {"href": "{author}", "rel": "author"},
>     {"href": "{reply}", "rel": "action-reply", "method": "POST", "schema": {...}}
>   ]
> }
>
> This instance:
> {
>   "test": true
> }
> would have the "self" link defined, but would not have enough information to automatically assemble the others.
>
> This instance:
> {
>   "author": "/authors/andrei"
> }
> would have the "self" link and the "author" link.


Perfect example. REST, hypermedia, opaque URIs..

So what you're saying is that if I do not have the URI's value as part of the instance, then the instance cannot have a link with that URI. Does that make sense to you? That's what I'm questioning.

The schema defines some constraints. Just that in this case the link's href has no constraint.
The schema will define that the instance can have links with rels among [x,y,z], but does not define how those URIs are constructed (by server/client depending how RESTful you are)



I'll try to clear up the mess by going the other way around.
Take my expectation of using "links" and have them defined by JSON-schema,
and compare with your POV.

{
 "type": "object",
 "properties": {
  "id": {
   "type": "integer"
  },
  "model": {
   "type": "string"
  }
 },
 "links": [{
  "rel": "owner"
 }]
}

{
 "type": "object",
 "properties": {
  "national_identification_number": {
   "type": "string"
  },
  "name": {
   "type": "string"
  }
 },
 "links": [{
  "rel": "owned-car"
 }]
}

Now
< 200
< Link: <http://andrei/persons/Andrei>; rel="owner"
{
 "id": 123,
 "brand": "VW"
}

< 200
< Link: <http://andrei/cars/123>; rel="owned-car"
{
 "national_identification_number": "1234567890",
 "name": "Andrei"
}

If I am to go with the way you described previously, then it means the car has to have "owner": "123456789" or "owner_uri": "http://andrei/persons/123456789" (which makes no sense to me) and then have a link defined to use that value.

Geraint (David)

Oct 16, 2012, 2:40:18 PM
to json-...@googlegroups.com
Ah-hah!  I understand what's going on here ... possibly. :p

The "links" keyword in the schema does not define a schema for the instance's links.  It defines the links themselves.  The "links" keyword is a way of defining links, not validating them.

If you're providing links using an HTTP Header, as in your example, then you do not need to specify the links using JSON Schema at all.  You have already defined the links - you don't need to define them twice.

This example, however, is one where the "links" keyword is really useful.  Say I have a list of comments:
[
    {
        "message": "Here's my comment.",
        "author": "exampleuser"
    },
    {
        "message": "You are completely wrong.",
        "author": "exampletroll"
    }
]
I want to have a link defined from each comment to the author of that comment.  However, I can't do that with Link HTTP headers - they apply to the instance as a whole, not individual parts of it.

However, if I say that the instance is following this schema:
{
    "title": "List of comments",
    "type": "array",
    "items": {
        "title": "Comment",
        "links": [
            {"href": "/users/{author}", "rel": "author"}
        ]
    }
}
The inner schema there (the one inside "items") defines a link.  But it defines it for that particular comment - the template is filled out using values from the comment object.

The first comment therefore has a link defined to "/users/exampleuser", and the second comment has a link defined to "/users/exampletroll".
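
The per-comment resolution described above can be sketched in Python. This is only an illustration under stated assumptions: it handles plain `{name}` placeholders rather than full URI Templates, it simply skips a link when the instance lacks a value for one of its variables (one possible policy, not something the spec mandates), and `resolve_links` is a hypothetical helper name, not part of Jsonary or any other library.

```python
import re
from urllib.parse import quote

# Plain "{name}" placeholders only; real URI Templates are richer than this.
PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def resolve_links(schema, instance):
    """Return the links that a (sub-)schema defines for one instance object."""
    resolved = []
    for link in schema.get("links", []):
        names = PLACEHOLDER.findall(link["href"])
        # Assumption: a link with a missing variable is skipped entirely.
        if any(name not in instance for name in names):
            continue
        # Fill each placeholder with the percent-encoded instance value.
        href = PLACEHOLDER.sub(
            lambda m: quote(str(instance[m.group(1)]), safe=""), link["href"]
        )
        resolved.append({"rel": link["rel"], "href": href})
    return resolved


comments = [
    {"message": "Here's my comment.", "author": "exampleuser"},
    {"message": "You are completely wrong.", "author": "exampletroll"},
]
list_schema = {
    "title": "List of comments",
    "type": "array",
    "items": {
        "title": "Comment",
        "links": [{"href": "/users/{author}", "rel": "author"}],
    },
}

# The "items" sub-schema applies to each element, so links resolve per comment.
author_links = [resolve_links(list_schema["items"], c) for c in comments]
```

Each entry of `author_links` belongs to one comment, which is exactly what a single Link HTTP header on the whole response cannot express.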

Andrei Neculau

Oct 16, 2012, 5:17:18 PM
to json-...@googlegroups.com
Yes =) _that_ is the right ah-hah indeed. At least you know where I'm coming from.

I also understand where you are coming from, though I personally don't see it as a big gain (the client constructing links based on a schema and its instance) -- and please accept my apologies for that; I know that my perspective is deeply skewed from designing a Level 3 REST API ;)

But beyond this,
I read https://github.com/json-schema/json-schema/blob/next/proposals/json-schema-hypermedia.txt carefully enough to say that the links are rather convoluted.
(and as a side note -- it references the "describedBy" relation, though IANA says it's "describedby". Not sure if rels are to be treated case-insensitively.)

#1
"Also, when links are
   used within a schema, the URI SHOULD be parametrized by the property
   values of the instance object, if property values exist for the
   corresponding variables in the template (otherwise they MAY be
   provided from alternate sources, like user input)."

Too much flexibility, and thus complexity.
This complexity though I will use for my own good - basically if I put href to "{serverProvidedURI}" it means that the server will provide this URI. It doesn't say much, it is not useful per your example, BUT it does serve for hinting at what links this instance might have.

#2 can you reference nested properties ?

#3
"The following relations are applicable for schemas (the schema as the
   "from" resource in the relation):
...........
   Links defined in the schema using these relation values SHOULD not
   require parameterization with data from the instance, as they
   describe a link for the schema, not the instances."

So based on the rel, one should know whether the replacements should come from the instance or from the schema?
Again, too much flexibility (exceptions to the rule of thumb).


Thanks a lot for getting back to me!

Geraint (David)

Oct 17, 2012, 3:55:23 AM
to json-...@googlegroups.com
"This complexity though I will use for my own good - basically if I put href to "{serverProvidedURI}" it means that the server will provide this URI. It doesn't say much, it is not useful per your example, BUT it does serve for hinting at what links this instance might have."

This is exactly what I think is great, and why I'm writing Jsonary.  If you have a nice REST API, then an intelligent human could look at the data and have a decent guess at how to interact with it.  A person can look at a data structure, figure out which bits are meant to be links/URLs, which bits represent ID tokens, yada-yada.

If you document that API using JSON Schema, then an intelligent client program can interact with that API having never seen it before.  For example, in this demo I have written schemas for (a small part of) the Facebook Graph API.  I coded the OAuth forwarding stuff by hand, but the rest of the interaction is based entirely on the schemas.

So although it's possible to use JSON Schema to document non-REST APIs, I think its greatest power is improving the "discoverability" wanted by Level 3 REST.

Geraint (David)

Oct 17, 2012, 4:11:49 AM
to json-...@googlegroups.com
As for your numbered points:

1) FYI - we are planning to replace this with a reference to RFC 6570 (URI Templates).  It will be just as flexible (in fact, more so), but we won't be rolling our own standard.

The main advantage I can see of "href templating" is that at the moment, to do higher levels of REST you have to embed quite a lot of URLs in each item, but they will mostly be the same.  JSON Schema lets you provide the same information, but much more compactly.

Conventional REST data might look like this:
{
    "id": 12345,
    "likesUri": "/12345/likes",
    "commentsUri": "/12345/comments",
    ...
}
But you've repeated "12345" three times - not only that, but the "/likes" and "/comments" suffixes are likely going to be the same for all other similar pieces of data.

Now, the rules of REST say that the client shouldn't be assembling links or creating URIs based on pre-knowledge of the URI structure.  The server has to be free to change the URI at any time.

But using this schema:
{
    "links": [
        {"href": "/{id}/likes", "rel": "likes"},
        {"href": "/{id}/comments", "rel": "comments"}
    ]
}
You can simply write:
{
    "id": 12345,
    ...
}

All the links are there - in fact, they are explained more explicitly than in the previous example (with rel values).  However, they are still assembled entirely using information that came from the server.  The server can change the URI structure, and simply change the schema to reflect this.  Schema-aware clients will simply pick up the change and not blink an eye.
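
A small Python sketch of that last point, assuming the client builds links only from whatever schema the server currently serves. Here `str.format` stands in for real URI Template expansion, `links_for` is a hypothetical helper, and `schema_v2` is an invented later revision used purely for illustration.

```python
def links_for(schema, instance):
    """Map each rel to its href, filling {name} placeholders from the instance."""
    # str.format is a stand-in for URI Template expansion in this sketch.
    return {link["rel"]: link["href"].format(**instance) for link in schema["links"]}


item = {"id": 12345}

schema_v1 = {"links": [
    {"href": "/{id}/likes", "rel": "likes"},
    {"href": "/{id}/comments", "rel": "comments"},
]}

# Hypothetical later revision: the server moves everything under /items/.
schema_v2 = {"links": [
    {"href": "/items/{id}/likes", "rel": "likes"},
    {"href": "/items/{id}/comments", "rel": "comments"},
]}

before = links_for(schema_v1, item)
after = links_for(schema_v2, item)
```

The client code and the instance data are identical in both cases; only the schema supplied by the server differs, so the client never relies on pre-knowledge of the URI structure.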

However, if you still want to include every URL explicitly, JSON Schema still lets you do that, and at the same time it can explicitly document which parts of your data are links, enhancing "discoverability".

Geraint (David)

Oct 17, 2012, 4:21:30 AM
to json-...@googlegroups.com
2) Sadly not.  I can't see a way to do it using RFC 6570 either.  It would be kinda nice, though...

3) No - all replacement values come from the instance.  No special behaviour is defined for these relation values.

The point of "create" and "instances" is for cases like this:

GET /schemas/articles
> 200 OK
> Link: <http://example.com/articles/>; REL=instances
{
  ... JSON Schema data ...
}

In that case, the schema document itself comes with an "instances" link.  The idea is that if you only have the schema, you can use it to find content following that schema.

I don't think I'll personally be using them very often, but they are just "rel" values - no special behaviour attached - so I don't think they add any complication.

Geraint (David)

Oct 17, 2012, 4:27:15 AM
to json-...@googlegroups.com
3) Oh wait - no, I see what you mean.

There is some special-casing here.  It advises not using any parametrisation  exactly because of this confusion.

I agree that it breaks the rules and adds complication - I don't like it, and it will probably disappear in my next proposal.  However, I think the "rel" values are still useful, as in my previous email.

Geraint (David)

Oct 17, 2012, 4:29:58 AM
to json-...@googlegroups.com
Worth mentioning, I suppose, that the parametrisation still comes from the instance - which is why there is advice not to use any, and just have plain URLs for these relations.

The special-case is that those "rel" values should be treated as relations upon the schema, not the instance.

Still icky, and likely to disappear in future unless I hear a torrent of support for the existing behaviour.

Andrei Neculau

Oct 17, 2012, 7:29:16 AM
to json-...@googlegroups.com
Ignoring the lets-give-the-client-hints-on-how-to-build-links,
+1 on all head-counts from me ;)

Philippe Marsteau

Feb 12, 2013, 3:10:41 PM
to json-...@googlegroups.com
I posted a similar post about that very same thing. 

The more I think about it, the more I believe that coupling the data representation with some external "out-of-band" information (e.g. a json-schema resource to be downloaded elsewhere by the client) sounds like it breaks the hypermedia and data self-descriptiveness constraints of REST.

In my own interpretation, the uniform contract of REST APIs is defined by 1) uniformly agreed MIME types, 2) uniformly agreed relation types and 3) uniformly agreed transport (HTTP methods/headers).
  • The specified MIME type (e.g. application/json) passed in the Accept/Content-Type HTTP headers is sufficient to process the data representation of the returned structure (which is described elsewhere, e.g. in an XSD or JSON schema). E.g. if the browser or user agent "knows" the "image/jpeg" Content-Type, it will be able to render it without external information. MIME types can be vendor-specific, but that restricts their interoperability with other user agents, which may ignore them.

  • The specified relation type (e.g. "parent" or "next" or "prev") passed using Link HTTP headers or retrieved from the HTTP body is sufficient to interpret how to interact with the related resource (which might be described elsewhere, e.g. in the IANA registry or a JSON schema). E.g. if the browser or user agent "knows" the "next" or "prev" link relation type, it will be able to navigate to the previous or next resource within the context of the current resource. Relation types can be vendor-specific too, but here again, that restricts their interoperability with other user agents, which will ignore them.

Relying on information out-of-band of the current resource's HTTP context to interpret the data received does not sound like REST to me. Whether the external information is a JSON schema or even a plain wiki documentation page, it breaks the self-descriptiveness of the resource. Unless you embed the schema within the JSON instance representation, I don't think we should call this RESTful.

My 0.02$

Francis Galiegue

Feb 12, 2013, 3:52:31 PM
to json-...@googlegroups.com
On Tue, Feb 12, 2013 at 9:10 PM, Philippe Marsteau <mars...@gmail.com> wrote:
>
> I posted a similar post about that very same thing.
>
> The more I think about it, the more I believe that coupling the data representation with some external "out-of-band" information (e.g. a json-schema resource to be downloaded elsewhere by the client) sounds like it breaks the hypermedia and data self-descriptiveness constraints of REST.
>

Sorry to chime in the discussion like this, but...

What do you call "out of band" representation here? The core
specification clearly defines mechanisms (for HTTP only) coupling
instance and schema (namely, the HTTP link header or the MIME type).
It does not define any other bindings, for any other protocol.

Are you saying that this binding should be formalized in the JSON
representation of the instance? Let's say, something like:

{
"instance": // something
"schema": { "schema": "here" }
}

IIUC, REST defines comprehensiveness at the URI level; after that,
what the dereferenced content is and how you should interpret it is
out of REST's scope altogether.

To be more general: what do you think should be done to make an
instance/schema relationship solid, and preferably, protocol
independent? Admittedly, the existing recommendations right now only
apply to HTTP.

--
Francis Galiegue, fgal...@gmail.com
Try out your JSON Schemas: http://json-schema-validator.herokuapp.com

Geraint (David)

Feb 14, 2013, 9:06:56 AM
to json-...@googlegroups.com
RE: self-description

The "self-description" desired with regards to REST interfaces is to do with being able to process messages without knowing the exact context of how they were requested.  It means that you have enough information to choose which parser/interface to use - but you typically still need an appropriate set of possible parsers included in the client.

This is equivalent to shipping the client with a bunch of schemas pre-loaded - at this point, the schema URIs are just identifiers telling you which interpreter setup to use.  The only difference is that when push comes to shove, the server cannot provide a new parser, but it can provide a new schema.

RE: links, actions and HATEOAS

The HATEOAS principle means that the data indicates which actions are available, but it does not require that the data includes a complete description of all relevant API endpoints with each message.  It is the interpreter's job to collect links, attaching URIs to link relations, and have special knowledge about how to use them (e.g. submission data).

Using JSON Hyper-Schema does not suddenly make your API RESTful if it wasn't before.  But a client for a RESTful API still has to be "specialised" - it requires innate knowledge about the formats, including how to identify links in the data, and exactly what parameters different links require.

Hyper-schemas are a way to describe the data such that a client requires far less coded-in specialisation to be able to interact with the API - and if the schemas are being fetched instead of bundled with the client, then this allows for a lot of flexibility in the clients, as some changes that would otherwise have broken clients can now be communicated by changing the schemas.
 

Philippe Marsteau

Feb 15, 2013, 8:15:48 AM
to json-...@googlegroups.com
A parser can be very generic (e.g. a browser can parse json or html data very generically), yet you could have a schema that describes, in much more detail, additional information about the format of the json/html/xml your provider sends.

The idea is to have a generic way to parse links from any document, and with that very generic parser, have the ability to convey navigation from the data being returned (without necessarily knowing the format of the data returned). Consider e.g. the HTTP Link headers received on an HTTP GET that returns HTML. Say you receive 2 links with "prev" and "next". You don't need the details of the data being returned (that could be described in your json schema, or could be plain binary data, e.g. an XLS spreadsheet or a JPG image) to generically offer "back" and "next" buttons to navigate the data (following the links). This is based on the uniform contract of HTTP and the fact that the link relations "next" and "prev" are IANA-standardized. I don't necessarily need knowledge of the MIME type (image/jpeg or whatever) to enable hyperlinking.
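
The payload-independent navigation described above can be sketched in Python. This is a deliberately simplified reading of the Link header syntax (it ignores parameters other than rel and breaks on commas inside parameter values), and `parse_link_header` is a hypothetical helper name, not a standard-library function.

```python
import re

# One link-value looks like: <http://example.com/items?page=3>; rel="next"
LINK = re.compile(r'<([^>]*)>([^,]*)')
REL = re.compile(r';\s*rel\s*=\s*"?([^";]+)"?', re.IGNORECASE)


def parse_link_header(value):
    """Return {rel: uri} for a Link header value (simplified sketch)."""
    links = {}
    for uri, params in LINK.findall(value):
        match = REL.search(params)
        if match:
            # A rel parameter may list several space-separated relation types.
            for rel in match.group(1).split():
                # Relation names are lowercased here as a simplification.
                links[rel.lower()] = uri
    return links


header = ('<http://example.com/items?page=1>; rel="prev", '
          '<http://example.com/items?page=3>; rel="next"')
nav = parse_link_header(header)
```

A generic client could enable its "back" and "next" buttons from `nav.get("prev")` and `nav.get("next")` without looking at the response body or its Content-Type at all.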

Schemas describe the payload, and may describe hypermedia links in more detail. Very useful as a means of validation and/or documentation. But not as a means of parsing the data. If this is required for parsing the data, then it should be part of the document itself.

With the example I provided above, you would not need any schema knowledge and yet your user-agent/browser/parser would be able to follow links, assuming it can handle HTTP Links.

I also disagree that it is the interpreter's job to "attach URIs to link relations". Any data point to back that affirmation? Availability of links and the actual URLs to navigate these links should be server logic, based on application internal state, and should not be the parser's/client's job to construct.

I find schemas (XSD or json-schema) a GREAT way to describe your API contract. JSON-schema also lets you document your actions and links (including non-standard ones). I agree the client will build logic against that API contract (and as such, will build its code based on a given version of the API contract / a given schema version).

The only thing I struggle with is expecting the client to download the current version of the schema, on every call, to "detect" changes the server may have decided to make, so it can dynamically construct URIs of the links/actions that fit the server implementation. I would actually go further and say the URI template should not even be part of the schema; it is an implementation detail that should be defined outside of the API contract (and therefore outside of the schema defining the API contract).

Geraint Luff

Feb 15, 2013, 11:40:48 AM
to json-...@googlegroups.com
On 15 February 2013 13:15, Philippe Marsteau <mars...@gmail.com> wrote:
> A parser can be very generic (e.g. a browser can parse json or html data very generically), yet you could have a schema that describes, in much more detail, additional information about the format of the json/html/xml your provider sends.
>
> The idea is to have a generic way to parse links from any document, and with that very generic parser, have the ability to convey navigation from the data being returned (without necessarily knowing the format of the data returned). Consider e.g. the HTTP Link headers received on an HTTP GET that returns HTML. Say you receive 2 links with "prev" and "next". You don't need the details of the data being returned (that could be described in your json schema, or could be plain binary data, e.g. an XLS spreadsheet or a JPG image) to generically offer "back" and "next" buttons to navigate the data (following the links). This is based on the uniform contract of HTTP and the fact that the link relations "next" and "prev" are IANA-standardized. I don't necessarily need knowledge of the MIME type (image/jpeg or whatever) to enable hyperlinking.
>
> Schemas describe the payload, and may describe hypermedia links in more detail. Very useful as a means of validation and/or documentation. But not as a means of parsing the data. If this is required for parsing the data, then it should be part of the document itself.
>
> With the example I provided above, you would not need any schema knowledge and yet your user-agent/browser/parser would be able to follow links, assuming it can handle HTTP Links.

Sure - using the HTTP Link header, you can communicate links completely independently of the payload type.
 
> I also disagree that it is the interpreter's job to "attach URIs to link relations". Any data point to back that affirmation? Availability of links and the actual URLs to navigate these links should be server logic, based on application internal state, and should not be the parser's/client's job to construct.

Sorry, I didn't mean construct - I meant identify.

When I view an HTML page, the server has helpfully put all the links in for me, but my client (the browser) uses its innate knowledge of HTML to identify them as links and display them to me as such.

That kind of innate knowledge is required for any client to use links, in any data format (unless, as you point out, all links are communicated in the HTTP headers - but that's not very common).  The problem currently is that there are a billion different custom JSON formats out there - so you cannot write a "client that understands JSON and identifies hyperlinks", because the hyperlinks are not presented in a single form.

The goal of hyper-schemas is a surrogate for that "innate knowledge" which is missing for these home-brew JSON formats.

> I find schemas (XSD or json-schema) a GREAT way to describe your API contract. JSON-schema also lets you document your actions and links (including non-standard ones). I agree the client will build logic against that API contract (and as such, will build its code based on a given version of the API contract / a given schema version).
>
> The only thing I struggle with is expecting the client to download the current version of the schema, on every call, to "detect" changes the server may have decided to make, so it can dynamically construct URIs of the links/actions that fit the server implementation. I would actually go further and say the URI template should not even be part of the schema; it is an implementation detail that should be defined outside of the API contract (and therefore outside of the schema defining the API contract).

It doesn't have to download the schema.  My browser does not fetch "http://www.w3.org/TR/html4/strict.dtd" every time it views a web-page, because it has all the knowledge of that DTD baked into it.  However, it still uses that URL (as part of the <DOCTYPE> stuff) to determine the specific details of parsing the page.

Similarly, a custom-written client for a particular API might not fetch "http://example.com/api/schemas/v01" all the time - it might have all the format knowledge baked in, and simply use that URI as an identifier to determine which variation of its interpreter to use.

The advantage comes when one day the API starts returning data referencing the schema "http://example.com/api/schemas/v02", which the client has never seen before.  A client relying completely on its own internal knowledge might break at that point.  But a client that understands JSON Schemas can be extremely flexible at that point - if it chooses to, it can download the new schema, and make a decent fist of using the new API, instead of just rolling over and dying.

I also envision this flexibility being used to make development easier - during dev/testing phases, you have an extremely generic client that uses JSON Hyper-Schema (using common libraries to do all the interpreting), as you keep chopping and changing the API.  Then as you head towards release, you just hard-code a few relevant schemas (so they don't have to be fetched) and go.
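
A rough Python sketch of that arrangement, in which the schema URI is first treated as an identifier for a bundled copy and is only dereferenced when the client has never seen it. The names `BUNDLED_SCHEMAS`, `fetch_schema` and `get_schema`, and the contents of the bundled schema, are assumptions made for illustration only.

```python
import json
from urllib.request import urlopen

# Schemas shipped with the client, keyed by the URI the server refers to them by.
# The schema body here is an invented placeholder.
BUNDLED_SCHEMAS = {
    "http://example.com/api/schemas/v01": {
        "links": [{"href": "/{id}/comments", "rel": "comments"}],
    },
}


def fetch_schema(uri):
    """Download and decode a schema document (requires network access)."""
    with urlopen(uri) as response:
        return json.load(response)


def get_schema(uri, cache=BUNDLED_SCHEMAS, fetch=fetch_schema):
    """Treat the URI as an identifier first; fetch only schemas never seen before."""
    if uri not in cache:
        # Unknown schema (e.g. a new API version): download once, then reuse it.
        cache[uri] = fetch(uri)
    return cache[uri]


# A known URI is answered from the bundled copy; no request is made.
schema = get_schema("http://example.com/api/schemas/v01")
```

During development the bundled table could be left empty so that every schema is fetched; hard-coding the relevant schemas for release then amounts to filling it in.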


Geraint Luff

Feb 15, 2013, 11:44:50 AM
to json-...@googlegroups.com
On 15 February 2013 16:40, Geraint Luff <gerai...@gmail.com> wrote:
On 15 February 2013 13:15, Philippe Marsteau <mars...@gmail.com> wrote:
A parser can be very generic (e.g. a browser can parse json or html data very generically), yet you could have a schema that describes in much details additional information about the format of json/html/xml your provider sends.

The idea is to have a generic way to parse links from any document, and with that very generic parser, have the ability to convey navigation from the data being returned (without knowing necessarily the format of the data returned). Consider e.g. the HTTP Link Headers received on a HTTP GET that returns HTML. Say you receive 2 links with "prev" and "next". You don't need the details of the data being returned (that could be described in your json schema, or could be plain binary data, eg. a XLS spreadsheet or an JPG image) to propose generically a "back" and "next" button to navigate the data (following the links). This is based on your uniform contract that HTTP and the fact link relations "next" and "prev" are IANA standardized. I don't necessarily needs the knowledge of the MIME type (application/jpg or whatever) to enable hyperlinking. 

Schemas describe the payload, and may describe hypermedia links in more details. Very useful as a mean of validation and/or documentation. But not as a mean of parsing the data. If this is required for parsing the data, then it should be part of the document itself. 

With the example I provided above, you woul not need any schema knowledge and yet your user-agent/browser/parser would be able to follow links, assuming it can handle HTTP Links. 

Sure - using the HTTP Link header, you can communicate links completely independently of the payload type.
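
As a rough illustration of payload-independent links, here is a minimal sketch of a client reading "prev"/"next" from a Link header value. The function names (`parseLinkHeader`, `findByRel`) and the example URIs are made up for this sketch, and the parsing is deliberately simplified: it assumes well-formed input with no commas or semicolons inside quoted parameter values, so it is not a complete Web Linking parser.

```javascript
// Simplified sketch: turns a Link header value into [{ href, rel }, ...].
// Assumption: no commas or semicolons appear inside quoted parameter values.
function parseLinkHeader(header) {
    const links = [];
    if (!header) {
        return links;
    }
    // Each entry looks like: <uri>; rel="next"; title="..."
    const pattern = /<([^>]*)>([^,]*)/g;
    let match;
    while ((match = pattern.exec(header)) !== null) {
        const relMatch = /;\s*rel\s*=\s*"?([^";]+)"?/.exec(match[2]);
        links.push({
            href: match[1],
            rel: relMatch ? relMatch[1].trim() : null
        });
    }
    return links;
}

// A generic client can pick navigation targets without looking at the payload.
function findByRel(links, rel) {
    return links.filter(function (link) {
        return link.rel === rel;
    });
}
```

A generic "next" button would then only need `findByRel(parseLinkHeader(value), "next")`, whatever the body's media type is.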
 
> I also disagree that it is the interpreter's job to "attach URIs to link relations". Is there any data point to back that assertion? The availability of links, and the actual URLs for navigating them, should be server logic based on the application's internal state; they should not be the parser's/client's job to construct.

Sorry, I didn't mean construct - I meant identify.

When I view an HTML page, the server has helpfully put all the links in for me, but my client (the browser) uses its innate knowledge of HTML to identify them as links and display them to me as such.

That kind of innate knowledge is required for any client to use links, in any data format (unless, as you point out, all links are communicated in the HTTP headers - but that's not very common).  The problem currently is that there are a billion different custom JSON formats out there - so you cannot write a "client that understands JSON and identifies hyperlinks", because the hyperlinks are not presented in a single form.

The goal of hyper-schemas is a surrogate for that "innate knowledge" which is missing for these home-brew JSON formats.

> I find schemas (XSD or JSON Schema) a GREAT way to describe your API contract. JSON Schema also allows you to document your actions and links (including non-standard ones). I agree the client will build logic against that API contract (and as such, will build its code based on a given version of the API contract, i.e. a given schema version).
>
> The only thing I struggle with is expecting the client to download the current version of the schema on every call, to "detect" changes the server may have decided to make, so that it can dynamically construct link/action URIs that fit the server implementation. I would actually go further and say the URI template should not even be part of the schema: it is an implementation detail that should be defined outside of the API contract (and therefore outside of the schema defining the API contract).

It doesn't have to download the schema.  My browser does not fetch "http://www.w3.org/TR/html4/strict.dtd" every time it views a web page, because it has all the knowledge of that DTD baked into it.  However, it still uses that URL (as part of the <!DOCTYPE> declaration) to determine the specific details of parsing the page.

Similarly, a custom-written client for a particular API might not fetch "http://example.com/api/schemas/v01" all the time - it might have all the format knowledge baked in, and simply use that URI as an identifier to determine which variation of its interpreter to use.

The advantage comes when one day the API starts returning data referencing the schema "http://example.com/api/schemas/v02", which the client has never seen before.  A client relying completely on its own internal knowledge might break at that point.  But a client that understands JSON Schemas can be extremely flexible at that point - if it chooses to, it can download the new schema, and make a decent fist of using the new API, instead of just rolling over and dying.
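
A minimal sketch of that behaviour, under stated assumptions: `createClient`, `fetchSchema` and `buildInterpreter` are hypothetical names, and the fetch is injected and synchronous only to keep the example short (a real client would fetch asynchronously).

```javascript
// Hypothetical client: the schema URI acts as an identifier for baked-in knowledge,
// and is only dereferenced when the client has never seen it before.
function createClient(bakedIn, fetchSchema, buildInterpreter) {
    // Copy the baked-in interpreters so later additions stay local to this client.
    const interpreters = new Map(Object.entries(bakedIn));
    return {
        interpreterFor: function (schemaUri) {
            if (!interpreters.has(schemaUri)) {
                // Unknown schema URI: fetch it once and derive a generic interpreter.
                const schema = fetchSchema(schemaUri);
                interpreters.set(schemaUri, buildInterpreter(schema));
            }
            return interpreters.get(schemaUri);
        }
    };
}
```

With "http://example.com/api/schemas/v01" baked in, responses referencing it never trigger a fetch; the first response referencing ".../v02" triggers exactly one.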

I also envision this flexibility being used to make development easier - during dev/testing phases, you have an extremely generic client that uses JSON Hyper-Schema (using common libraries to do all the interpreting), as you keep chopping and changing the API.  Then as you head towards release, you just hard-code a few relevant schemas (so they don't have to be fetched) and go.

So basically - if you want to hard-code your API knowledge into the client, then you can.  But ideally, JSON Hyper-Schemas should be able to describe the API flexibly enough that it is possible (not compulsory, but possible) to write a client that dynamically uses hyper-schemas in order to keep its behaviour up to date.

Philippe Marsteau

Feb 15, 2013, 5:41:32 PM
to json-...@googlegroups.com
I think we are talking past each other on 2 separate topics. You are basically explaining to me that clients need knowledge of MIME types to interpret data. That's all fine and nobody disputes that. My point is about the construction of URIs. Without breaking the data structure, and therefore without breaking the MIME-type interpreter (which can continue to interpret the received data), a server may choose to change its implementation and move a resource to another endpoint. In such a scenario you have 3 options:
  1. The client interpreter hard-codes URIs (bad, very bad) to navigate to a related resource: this application is now broken, since the server chose to host the related resource at another endpoint/path.
  2. The client has built logic to retrieve/interpret the URIs by downloading the JSON schema (that was somehow associated with the data payload) and reading from this schema the URI template associated with the related resource: this application is not broken and continues to work, since the server modified the URI template, which allows the client to "construct" the URI based on this template. Note that if you had not downloaded the schema (at regular intervals), your client would still use the old URI template (the one in place before the server endpoint/path change), and in that case the client would be broken.
  3. The client has built logic to retrieve/interpret the URIs directly from the data payload (or the HTTP headers): the application is not broken, since the received URI matches the new endpoint/path the server chose to return. The server implementation change (moving the server hosting the resource) is seamless to the client.
I hope this makes my point clearer.

Now if the server decides to change the MIME-type definition (sending different data altogether) and can no longer serve the representation it supported before that change, then yes, your client would be broken. But even building the client to then attempt to download the "new schema" and "dynamically interpret" the data received makes little sense to me. If you renamed a property, removed one or added one (or similarly, renamed a "rel", removed one or added one), your business case is most likely broken. Your client will likely have been coded to "do" something with the interpreted data (e.g. build a UI, etc.).

Maybe I missed your point.

Geraint Luff

Feb 16, 2013, 4:42:58 AM
to json-...@googlegroups.com
On 15 February 2013 22:41, Philippe Marsteau <mars...@gmail.com> wrote:
> I think we are talking past each other on 2 separate topics. You are basically explaining to me that clients need knowledge of MIME types to interpret data.

I'd like to move away from talking about "MIME types", because many APIs of this sort will not have defined their own MIME type.  They will be using "application/json", but they will be sending JSON of a certain shape - they will have defined their own constraints on top of JSON.

But the MIME type ("application/json") along with the URI of a schema describing the format, together make a "document type" that includes the API-specific JSON dialect.
 
> That's all fine and nobody disputes that. My point is about the construction of URIs. Without breaking the data structure, and therefore without breaking the MIME-type interpreter (which can continue to interpret the received data), a server may choose to change its implementation and move a resource to another endpoint. In such a scenario you have 3 options:
>   1. The client interpreter hard-codes URIs (bad, very bad) to navigate to a related resource: this application is now broken, since the server chose to host the related resource at another endpoint/path.
>   2. The client has built logic to retrieve/interpret the URIs by downloading the JSON schema (that was somehow associated with the data payload) and reading from this schema the URI template associated with the related resource: this application is not broken and continues to work, since the server modified the URI template, which allows the client to "construct" the URI based on this template. Note that if you had not downloaded the schema (at regular intervals), your client would still use the old URI template (the one in place before the server endpoint/path change), and in that case the client would be broken.
>   3. The client has built logic to retrieve/interpret the URIs directly from the data payload (or the HTTP headers): the application is not broken, since the received URI matches the new endpoint/path the server chose to return. The server implementation change (moving the server hosting the resource) is seamless to the client.
> I hope this makes my point clearer.

Hmm... I think the issue here is that 2 and 3 seem incredibly similar to me.

The majority of the time (for RESTful APIs) I would expect the LDO to look something like this:
{
    "href": "{+commentUri}",
    "rel": "create",
    "title": "Comment",
    ...
}
Now, that's option 2, because it uses a URI Template.  But all that URI Template actually says is "take the URI from the "commentUri" property".  It's equivalent to the following client code:
function getLinks(data) {
    if (data.commentUri != undefined) {
        return [{
            href: data.commentUri,
            rel: "create",
            title: "Create comment"
        }];
    }
    return [];
}
That code snippet (if I'm understanding correctly) would count under option 3, because the client is extracting URIs directly from the data payload.

So for the case of the server changing URIs: if the API is RESTful, then this requires no change in either the client code or the schema.  If the URI Template is a simple one (such as "{+someProperty}") then it does not need to be changed to accommodate the server moving things around - and a client using such a schema will not break.
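
To make the equivalence concrete, here is a hedged sketch of a schema-driven version of the same lookup. `getLinksFromSchema` is a hypothetical helper, and it only understands an "href" that is exactly one "{+property}" template; real URI Templates support far more, and the percent-encoding rules of "{+...}" expansion are ignored here.

```javascript
// Sketch: resolve LDOs whose "href" is a single "{+property}" template against the data.
function getLinksFromSchema(ldos, data) {
    const links = [];
    ldos.forEach(function (ldo) {
        const match = /^\{\+([A-Za-z0-9_]+)\}$/.exec(ldo.href);
        if (match === null) {
            return; // not a simple "{+property}" template: out of scope for this sketch
        }
        const value = data[match[1]];
        if (value === undefined || value === null) {
            return; // property absent, so the link does not apply to this instance
        }
        // Copy the LDO and swap the template for the URI found in the data.
        links.push(Object.assign({}, ldo, { href: String(value) }));
    });
    return links;
}
```

Because the URI itself comes from the instance, the server can move the resource and neither this function nor the LDO needs to change.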
 
> Now if the server decides to change the MIME-type definition (sending different data altogether) and can no longer serve the representation it supported before that change, then yes, your client would be broken. But even building the client to then attempt to download the "new schema" and "dynamically interpret" the data received makes little sense to me. If you renamed a property, removed one or added one (or similarly, renamed a "rel", removed one or added one), your business case is most likely broken. Your client will likely have been coded to "do" something with the interpreted data (e.g. build a UI, etc.).
>
> Maybe I missed your point.


I think there's a middle-ground.  For example, what if the change does not relate to the data being displayed?  What if the changes are in:
  • The name of the property holding the URIs for certain links (e.g. it used to be called "commentUri", but it's now called "createComment")
  • The nature of the data that should be submitted to the server (e.g. it now requires a new parameter "public" which defaults to the value true)
In both cases, the fixed client (case 3) will break.  But the schema-driven one (case 2) might choose to re-fetch the schema when it starts getting "400 Bad request" responses - at which point, it will be able to adapt to the changes in the API.

What I'm trying to say is that when the schema is fixed, it is no more or less flexible than a client with the same logic hard-coded.

I also would much prefer that my client did something slightly un-smooth than break completely.  So in the case of the new "public" parameter - my client could display the same interface as before, but under-the-hood start submitting "public":true.  Or alternatively, it might decide to display a checkbox somewhere in the interface.  Either option is not as good as a shiny new version of the client that displays a nice styled slider for "public" - but they're both a mile better than just becoming non-functional.

Philippe Marsteau

Feb 18, 2013, 8:45:28 AM
to json-...@googlegroups.com
On Saturday, February 16, 2013 4:42:58 AM UTC-5, Geraint (David) wrote:
> On 15 February 2013 22:41, Philippe Marsteau <mars...@gmail.com> wrote:
>> I think we are talking past each other on 2 separate topics. You are basically explaining to me that clients need knowledge of MIME types to interpret data.
>
> I'd like to move away from talking about "MIME types", because many APIs of this sort will not have defined their own MIME type.  They will be using "application/json", but they will be sending JSON of a certain shape - they will have defined their own constraints on top of JSON.
>
> But the MIME type ("application/json") along with the URI of a schema describing the format, together make a "document type" that includes the API-specific JSON dialect.

Citing Fielding:
"A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state, or in defining extended relation names and/or hypertext-enabled mark-up for existing standard media types. Any effort spent describing what methods to use on what URIs of interest should be entirely defined within the scope of the processing rules for a media type (and, in most cases, already defined by existing media types)."

Why reinvent the wheel? MIME types are designed to express the resource representation, right? And relation types express the hypermedia. JSON Schemas allow you to document/describe your MIME/relation types, not to replace them.

JSON Schemas (and hyper-schemas) allow documentation and enforcement of the metadata of your APIs, not the format of your representation. Your metadata is valid whether you represent data as JSON or as XML. If a resource exposes a boolean field, it will translate to a boolean value of an XML element or a boolean value of a JSON element. If your metadata documents a "parent" relation type, it doesn't matter whether it is represented as an Atom link in an XML representation or as a "links" section or property in a JSON representation. All that matters is that your API exposes a relation from one resource to another. When it comes to defining a language-specific representation (XML or JSON), defining a MIME type is key if all your consumers actually have to parse the data in a standardized way.

 
>> That's all fine and nobody disputes that. My point is about the construction of URIs. Without breaking the data structure, and therefore without breaking the MIME-type interpreter (which can continue to interpret the received data), a server may choose to change its implementation and move a resource to another endpoint. In such a scenario you have 3 options:
>>   1. The client interpreter hard-codes URIs (bad, very bad) to navigate to a related resource: this application is now broken, since the server chose to host the related resource at another endpoint/path.
>>   2. The client has built logic to retrieve/interpret the URIs by downloading the JSON schema (that was somehow associated with the data payload) and reading from this schema the URI template associated with the related resource: this application is not broken and continues to work, since the server modified the URI template, which allows the client to "construct" the URI based on this template. Note that if you had not downloaded the schema (at regular intervals), your client would still use the old URI template (the one in place before the server endpoint/path change), and in that case the client would be broken.
>>   3. The client has built logic to retrieve/interpret the URIs directly from the data payload (or the HTTP headers): the application is not broken, since the received URI matches the new endpoint/path the server chose to return. The server implementation change (moving the server hosting the resource) is seamless to the client.
>> I hope this makes my point clearer.

> Hmm... I think the issue here is that 2 and 3 seem incredibly similar to me.
>
> The majority of the time (for RESTful APIs) I would expect the LDO to look something like this:
> {
>     "href": "{+commentUri}",
>     "rel": "create",
>     "title": "Comment",
>     ...
> }
> Now, that's option 2, because it uses a URI Template.  But all that URI Template actually says is "take the URI from the "commentUri" property".  It's equivalent to the following client code:
> function getLinks(data) {
>     if (data.commentUri != undefined) {
>         return [{
>             href: data.commentUri,
>             rel: "create",
>             title: "Create comment"
>         }];
>     }
>     return [];
> }

That is not what I meant by option 2. Option 2 assumed the commentUri was *not* part of the data; if the URI is part of the data, and the URI template in the schema is nothing more than a single property ({+propertyURI}), then that is option 3.

I was referring to the more complex scenario, where your schema uses a URI template made out of multiple parts with some hard-coded separators, e.g. "/path/resourceName/{id}", that may change to "/anotherPath/anotherResourceName/{id}" in the future. With that URI template in mind, you will agree that sending only the {id} property in the data will not help. And a client assuming the previous URI template will break when the new URI template is introduced, unless it downloads the modified schema and reads the new URI template value dynamically.
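
A small sketch of that scenario, for illustration only. `expandTemplate` is a hypothetical helper that handles nothing but plain "{name}" variables, and `encodeURIComponent` only approximates the encoding rules of URI Templates.

```javascript
// Sketch: expand plain "{name}" variables, percent-encoding each value.
// Undefined variables expand to an empty string.
function expandTemplate(template, data) {
    return template.replace(/\{([A-Za-z0-9_]+)\}/g, function (whole, name) {
        const value = data[name];
        return value === undefined || value === null ? "" : encodeURIComponent(String(value));
    });
}

const articleData = { id: 15 };
// Template a stale client still holds:
const oldTemplate = "/path/resourceName/{id}";
// Template published in the updated schema:
const newTemplate = "/anotherPath/anotherResourceName/{id}";

const staleUri = expandTemplate(oldTemplate, articleData); // points at the old location
const freshUri = expandTemplate(newTemplate, articleData); // only reachable after a re-fetch
```

The data is identical in both cases; only the template differs, which is why the instance alone cannot tell the client that the resource has moved.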
 
> That code snippet (if I'm understanding correctly) would count under option 3, because the client is extracting URIs directly from the data payload.
>
> So for the case of the server changing URIs: if the API is RESTful, then this requires no change in either the client code or the schema.  If the URI Template is a simple one (such as "{+someProperty}") then it does not need to be changed to accommodate the server moving things around - and a client using such a schema will not break.

For this simple URI template yes. But then the only difference between your suggestion and mine is to use a property for the URI instead of an array property "links" that will include a key-based list of URIs (since a resource will likely link to more than one action/URI).

 
>> Now if the server decides to change the MIME-type definition (sending different data altogether) and can no longer serve the representation it supported before that change, then yes, your client would be broken. But even building the client to then attempt to download the "new schema" and "dynamically interpret" the data received makes little sense to me. If you renamed a property, removed one or added one (or similarly, renamed a "rel", removed one or added one), your business case is most likely broken. Your client will likely have been coded to "do" something with the interpreted data (e.g. build a UI, etc.).
>>
>> Maybe I missed your point.


> I think there's a middle-ground.  For example, what if the change does not relate to the data being displayed?  What if the changes are in:
>   • The name of the property holding the URIs for certain links (e.g. it used to be called "commentUri", but it's now called "createComment")
>   • The nature of the data that should be submitted to the server (e.g. it now requires a new parameter "public" which defaults to the value true)
> In both cases, the fixed client (case 3) will break.  But the schema-driven one (case 2) might choose to re-fetch the schema when it starts getting "400 Bad request" responses - at which point, it will be able to adapt to the changes in the API.
 
You are referring here to API changes. I was referring to *implementation* changes. My argument is that URIs should remain an implementation detail, not part of the API, and as such should not even be advertised in the JSON schema (no URI template at all).
 
> What I'm trying to say is that when the schema is fixed, it is no more or less flexible than a client with the same logic hard-coded.
>
> I also would much prefer that my client did something slightly un-smooth than break completely.  So in the case of the new "public" parameter - my client could display the same interface as before, but under-the-hood start submitting "public":true.  Or alternatively, it might decide to display a checkbox somewhere in the interface.  Either option is not as good as a shiny new version of the client that displays a nice styled slider for "public" - but they're both a mile better than just becoming non-functional.

Again, you are talking about API changes and how "dynamically" a client could adapt to them. I don't even buy that a client could ever "intelligently" handle an API change. In many cases, I would prefer *no* magic logic at all. As long as your API does not change (that is, the "old" version of your API remains supported), your client should not break. If you introduce a new API version, the client will ultimately have to learn the new API and adapt its logic accordingly.

Abhijit Tambe

Feb 25, 2013, 9:57:33 PM
to json-...@googlegroups.com
Do we have any further discussion on this topic? I think I agree that the LDOs should really be part of the instance (resource data) instead of the schema. That seems to be the most intuitive and efficient way to model hypermedia controls for RESTful APIs.

While the hyper schema spec is pretty well-defined, it might be worth considering declaring it as a hypermedia format that can be included with resource data instead of associating it with the instance schema.

-Abhijit

Francis Galiegue

Feb 26, 2013, 8:10:27 AM
to json-...@googlegroups.com
On Tue, Feb 26, 2013 at 3:57 AM, Abhijit Tambe <tambe....@gmail.com> wrote:
> Do we have any further discussion on this topic? I think I agree that the
> LDOs should really be part of the instance (resource data) instead of the
> schema. That seems to be the most intuitive and efficient way to model
> hypermedia controls for RESTful APIs.
>

If you use HTTP, there is the "describedBy" relation type for that:

http://tools.ietf.org/html/draft-zyp-json-schema-04#section-8.2

--
Francis Galiegue, fgal...@gmail.com
JSON Schema in Java: http://json-schema-validator.herokuapp.com

Francis Galiegue

Feb 26, 2013, 8:54:09 AM
to json-...@googlegroups.com
On Tue, Feb 26, 2013 at 2:10 PM, Francis Galiegue <fgal...@gmail.com> wrote:
[...]
>
> If you use HTTP, there is the "describedBy" relation type for that:
>
> http://tools.ietf.org/html/draft-zyp-json-schema-04#section-8.2
>

That said, it does not answer the real question. One solution could be:

{
"describedBy": { "schema": "here" },
"instance": // the JSON instance described
}

But this would mean implementations would have to _decorate_ their
data as such, for protocols which do not have the ability to "attach"
a schema to an instance. Is this reasonable? I wonder.

Abhijit Tambe

Feb 26, 2013, 9:19:46 AM
to json-...@googlegroups.com
Sorry, I'm not clear how 'describedBy' has anything to do with my question.

-Abhijit

Francis Galiegue

Feb 26, 2013, 9:25:18 AM
to json-...@googlegroups.com
On Tue, Feb 26, 2013 at 3:19 PM, Abhijit Tambe <tambe....@gmail.com> wrote:
> Sorry, I'm not clear how 'describedBy' has anything to do with my question.
>
> -Abhijit
>

You say that "LDOs should really be part of the instance (resource
data) instead of the schema".

I say that this need not be so when you can associate a JSON instance with
a JSON Schema having a "links" keyword with LDOs in them. And the URI
of a schema describing an instance can be the value of "describedBy".

Abhijit Tambe

Feb 26, 2013, 9:31:50 AM
to json-...@googlegroups.com
Links are really data, not metadata. The link structure is metadata and should be part of the schema definition (just like any other field). However, the data within the link should really be controlled by the server and might be different for each request, depending on the context of the request (e.g. depending on the access granted to the client, they may or may not be allowed to update/delete a resource).

Note that clients can work just fine even if the schema definition is updated (in a backwards-compatible way), as long as they've read the schema once. This is not true for links. Servers should have the ability to change link attributes and clients should be able to just adapt. This means having to retrieve the schema document for every response to retrieve links that really may or may not be valid for the client.

-Abhijit

Francis Galiegue

Feb 26, 2013, 9:38:44 AM
to json-...@googlegroups.com
On Tue, Feb 26, 2013 at 3:31 PM, Abhijit Tambe <tambe....@gmail.com> wrote:
> Links are really data, not metadata. The link structure is metadata and
> should be part of the schema definition (just like any other field).
> However, the data within the link should really be controlled by the server
> and might be different for each request, depending on the context of the
> request (e.g. depending on the access granted to the client, they may or may
> not be allowed to update/delete a resource).
>
> Note that clients can work just fine even if the schema definition is
> updated (in a backwards-compatible way), as long as they've read the schema
> once. This is not true for links. Servers should have the ability to change
> link attributes and clients should be able to just adapt. This means having
> to retrieve the schema document for every response to retrieve links that
> really may or may not be valid for the client.
>

OK, now I understand better. And indeed, a client implementation can
make the choice to keep the schema once it has fetched it, or it may
choose to dereference the URI each time.

Well, maybe not: if HTTP is used, there are expiration headers, but
still. So, how should this problem be tackled, then? There is the
fundamental issue (to my eyes at least) that you cannot exchange "raw
data" anymore.
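
One way to picture the "keep it until it expires" choice is the sketch below. Everything in it is an assumption made for illustration: `createSchemaCache` is a hypothetical helper, `fetchSchema` is injected and is assumed to return the schema together with a lifetime in milliseconds derived from the response's expiration headers, and the clock is injected so the behaviour is easy to check.

```javascript
// Hypothetical cache: keeps a fetched schema until its server-controlled lifetime runs out.
function createSchemaCache(fetchSchema, now) {
    const entries = new Map();
    return {
        get: function (uri) {
            const entry = entries.get(uri);
            if (entry !== undefined && now() < entry.expiresAt) {
                return entry.schema; // still fresh: no round trip to the server
            }
            // Assumed shape of the fetch result: { schema, maxAgeMs }.
            const fetched = fetchSchema(uri);
            entries.set(uri, {
                schema: fetched.schema,
                expiresAt: now() + fetched.maxAgeMs
            });
            return fetched.schema;
        }
    };
}
```

This keeps the freshness policy on the server side, but it does not address the per-request variability of links raised above.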

Abhijit Tambe

Feb 26, 2013, 10:14:39 AM
to json-...@googlegroups.com
My suggestion would be to move the 'links' parameter to the instance and have API implementers use JSON Schema to point to the hyper-schema.

For example, the schema defined at http://json-schema.org/latest/json-schema-hypermedia.html#anchor3 would change as follows (please ignore any syntax errors, this is only for demonstration):

{
    "title": "Written Article",
    "type": "object",
    "properties": {
        "id": {
            "title": "Article Identifier",
            "type": "number"
        },
        "title": {
            "title": "Article Title",
            "type": "string"
        },
        "authorId": {
            "type": "integer"
        },
        "imgData": {
            "title": "Article Illustration (small)",
            "type": "string",
            "media": {
                "binaryEncoding": "base64",
                "type": "image/png"
            }
        },
        "links": {
            "title": "Article Hyperlinks",
            "type": "array",
            "items": { "$ref": "http://json-schema.org/links" }
        }
    },
    "required" : ["id", "title", "authorId"]
}

And the data would change as follows:

{
    "id": 15,
    "title": "Example data",
    "authorId": 105,
    "imgData": "iVBORw...kJggg==",
    "links": [
        {
            "rel": "full",
            "href": "/articles/15"
        },
        {
            "rel": "author",
            "href": "/users?id=105"
        }
    ]
}
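
With links carried in the instance like this, a client-side lookup could be as small as the following sketch (`findLinks` is a hypothetical helper name, not part of any spec):

```javascript
// Sketch: read links straight from the instance's "links" array, filtered by relation.
function findLinks(instance, rel) {
    const links = Array.isArray(instance.links) ? instance.links : [];
    return links.filter(function (link) {
        return link.rel === rel;
    });
}
```

If the server omits a link (for example because the client is not allowed to follow it), `findLinks` simply returns an empty array for that relation.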

-Abhijit

Geraint Luff

Feb 26, 2013, 10:45:42 AM
to json-...@googlegroups.com
You're right - links are indeed data, not meta-data.

But JSON (Hyper-)Schema is not a data format - it is a schema format.  So surely, the format that the specification defines should not be a format that APIs have to follow.  If we mandate a particular link format, we are no longer defining a hyper-schema format - we are defining a hyper-data format.

What we want instead is a way to describe the link format that the data is using - and that's what this proposal is attempting.



Abhijit Tambe

Feb 26, 2013, 11:24:27 AM
to json-...@googlegroups.com
Sure. I'm 100% on-board with you on that. What I'm questioning is the current assumption that _link data_ should be part of the schema associated with the instance, instead of with the instance directly. If you tell me that this is not true, then I have no issues at all with the current approach.

If you look at the example in my previous post, that should make things clearer. The example still uses JSON hyper-schema as a schema, it just moves link data to the instance instead of the schema.

-Abhijit

Geraint Luff

Feb 26, 2013, 12:20:53 PM
to json-...@googlegroups.com
That currently (i.e. v4) is true.

I'm actually trying to change that, because I agree with you.  The way I'm trying to change that is to give the existing LDOs a bit more power - allowing templates for more properties than just "href".

The current use of "http://json-schema.org/links" is actually a little bit of a hack - the "href" property of LDOs should be a template, but that's just kind of ignored when using them in data.

What I'm proposing is: to describe in-data link formats, you put an LDO in the schema that parameterises all of its values from the data.  For example, this hypothetical "data link" schema would describe a link format suitable for use in data, which has "href"/"rel"/etc. properties:
{
    "type": "object",
    "properties": {
        "href": {...},
        ...
    },
    "links": [{
        "href": {"template": "{+href}"},
        "rel": {"template": "{+rel}"},
        ...
    }]
}
That schema (which I think should be available as http://json-schema.org/data-links, or something) would be a more appropriate schema to use in your example (instead of http://json-schema.org/links, which has the "href is a URI Template" problem).

But: that above schema is not the only possible one - people can organise their links differently, and we do not limit them.  They could even store the URIs as properties (like many existing RESTful APIs do - e.g. {"author": "http://..."}) - and the schema would be able to describe both cases.
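
A rough sketch of how a client might resolve such an LDO, assuming the proposed {"template": ...} form shown above (which is only a proposal in this thread). `resolveValue` and `resolveLdo` are hypothetical names, and only single "{+property}" templates are understood.

```javascript
// Resolve one LDO property: a literal value, or the proposed {"template": "{+name}"} form.
function resolveValue(value, data) {
    if (value !== null && typeof value === "object" && typeof value.template === "string") {
        const match = /^\{\+([A-Za-z0-9_]+)\}$/.exec(value.template);
        return match === null ? undefined : data[match[1]];
    }
    return value;
}

// Resolve every property of the LDO against the data it applies to.
function resolveLdo(ldo, data) {
    const link = {};
    Object.keys(ldo).forEach(function (key) {
        link[key] = resolveValue(ldo[key], data);
    });
    // A link without a resolvable href cannot be followed, so treat it as absent.
    return link.href === undefined ? null : link;
}
```

The same resolver covers both layouts mentioned above: a "data link" object supplying its own "href" and "rel", and a plain property such as "author" holding the URI, described by an LDO with a literal "rel" and a templated "href".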

Does that make sense?

Abhijit Tambe

Feb 26, 2013, 1:39:15 PM
to json-...@googlegroups.com
I think I'm on-board with that. Based on what you said, it seems that the schema will only contain templated properties that point to other properties in the actual instance. However, I'm still left with some concerns:

1. How does this work for nested data in the instance? For example, I would probably want to have a 'link' object in my instance that contains things other than the URI (e.g. HTTP method, media type, etc.). How does the template reference these fields?

2. In practice, how long do you expect clients to 'hold on' to the schema document before refreshing it? I'm assuming freshness information will be controlled by the server and the client just sticks to it?

3. While the approach you suggested works, it seems pretty roundabout to me. Clients are redirected from the instance to the schema to retrieve link templates, and then from the schema back to the instance so that they can resolve the templates. I appreciate the need for extensibility, but I'm wondering if that is how we want to proceed for hyperlinks. While JSON Schema is a pretty generic construct that applies to all JSON use cases, JSON Hyper-Schema seems to be something more specialized, and is primarily useful only for REST APIs. Why not just fix the LDO structure (as in the v3 spec), add support for both templated and real hrefs, and have that be used as the schema for links in the instance? Adding LDOs to the instance would be an additive change, so existing APIs (that already contain links) are not really broken. Their clients have 2 options for figuring out how they want to transition their state.

-Abhijit

Geraint Luff

Feb 26, 2013, 1:54:04 PM
to json-...@googlegroups.com
On 26 February 2013 18:39, Abhijit Tambe <tambe....@gmail.com> wrote:
I think I'm on-board with that. Based on what you said, it seems that the schema will only contain templated properties that point to other properties in the actual instance. However, I'm still left with some concerns:

1. How does this work for nested data in the instance? For example, I would probably want to have a 'link' object in my instance that contains things other than the URI (e.g. HTTP method, media type, etc.). How does the template reference these fields?

I was proposing enabling templating for all of those fields.  If we used URI Templates, it would be simple to define - but I'm open to other ideas.

In my previous example, "rel" was templated as well - you would just copy the same syntax and use it for "mediaType", etc.
 
2. In practice, how long do you expect clients to 'hold on' to the schema document before refreshing it? I'm assuming freshness information will be controlled by the server and the client just sticks to it?

3. While the approach you suggested works, it seems pretty roundabout to me. Clients are redirected from the instance to the schema to retrieve link templates, and then from the schema back to the instance so that they can resolve the templates. I appreciate the need for extensibility, but I'm wondering if that is how we want to proceed for hyperlinks. While JSON Schema is a pretty generic construct that applies to all JSON use cases, JSON Hyper-Schema seems to be something more specialized, and is primarily useful only for REST APIs. Why not just fix the LDO structure (as in the v3 spec), add support for both templated and real hrefs, and have that be used as the schema for links in the instance? Adding LDOs to the instance would be an additive change, so existing APIs (that already contain links) are not really broken. Their clients have 2 options for figuring out how they want to transition their state.

This concern about multiple requests is one I hear a lot.  The problem, though, is that it's comparing apples and oranges.

On the one hand, we have "baked-in" knowledge of a format (such as a particular link format).  On the other hand, we have a schema that defines the format.  But just because a schema has been defined does not mean that every client using the data has to fetch the schema and process it.

Say, hypothetically, you have written a client that uses a particular API, with hard-coded knowledge about the data formats and embedded links.  Then, after you have written and released your client, the API authors document their API using JSON (Hyper-)Schema.

At that point - you can totally ignore the schemas if you want.  I mean, you already know what's in the data, why would you need to go and fetch a document that tells you what you already know?  However, if a new client comes along that does not have that built-in knowledge of the API, then the schema is extremely useful to them.

Schemas should only add functionality to those clients that understand them - they should not be required for use of the API, if the client already has special knowledge.

Abhijit Tambe

Feb 26, 2013, 4:41:54 PM
to json-...@googlegroups.com


On Tuesday, February 26, 2013 1:54:04 PM UTC-5, Geraint (David) wrote:
On 26 February 2013 18:39, Abhijit Tambe <tambe....@gmail.com> wrote:
I think I'm on-board with that. Based on what you said, it seems that the schema will only contain templated properties that point to other properties in the actual instance. However, I'm still left with some concerns:

1. How does this work for nested data in the instance? For example, I would probably want to have a 'link' object in my instance that contains things other than the URI (e.g. HTTP method, media type, etc.). How does the template reference these fields?

I was proposing enabling templating for all of those fields.  If we used URI Templates, it would be simple to define - but I'm open to other ideas.

In my previous example, "rel" was templated as well - you would just copy the same syntax and use it for "mediaType", etc.

Right, I got that all fields are templated. My question is more about the template itself. In the example you gave, the template maps to simple 'href' and 'rel' properties in the instance. How do we get the template to map to properties that are nested deeper in the instance? I notice now that we can use fragment resolution for this, so my question is answered.

About using URI Templates, while that definitely seems like the right approach to me, I wonder if there are any restrictions in the spec concerning templates for non-URI data. For example, in this case, not all variables will map to URIs. Some will just map to strings (e.g. HTTP method, media type)
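To make the fragment-resolution idea concrete, here is a rough sketch of how a client might pull nested link fields out of an instance with a JSON Pointer (RFC 6901). The instance layout and the field names ("link", "method", "mediaType") are made up for illustration and are not taken from the spec. Note that the resolved values are plain strings, not URIs, so no URI encoding is applied.

```python
def resolve_pointer(document, pointer):
    """Resolve a JSON Pointer (RFC 6901), e.g. '/link/method', against parsed JSON."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("pointer must be empty or start with '/'")
    current = document
    for token in pointer[1:].split("/"):
        # Unescape in the order RFC 6901 requires: '~1' first, then '~0'.
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current


# Hypothetical instance with a nested 'link' object.
instance = {
    "title": "Example article",
    "link": {
        "href": "/articles/1/comments",
        "method": "GET",
        "mediaType": "application/json",
    },
}

method = resolve_pointer(instance, "/link/method")
media_type = resolve_pointer(instance, "/link/mediaType")
```

How a template in the schema would name such a pointer is still the open question; the sketch only covers the lookup side.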
 
 
2. In practice, how long do you expect clients to 'hold on' to the schema document before refreshing it? I'm assuming freshness information will be controlled by the server and the client just sticks to it?

3. While the approach you suggested works, it seems pretty roundabout to me. Clients are redirected from the instance to the schema to retrieve link templates, and then from the schema back to the instance so that they can resolve the templates. I appreciate the need for extensibility, but I'm wondering if that is how we want to proceed for hyperlinks. While JSON Schema is a pretty generic construct that applies to all JSON use cases, JSON Hyper-Schema seems to be something more specialized, and is primarily useful only for REST APIs. Why not just fix the LDO structure (as in the v3 spec), add support for both templated and real hrefs, and have that be used as the schema for links in the instance? Adding LDOs to the instance would be an additive change, so existing APIs (that already contain links) are not really broken. Their clients have 2 options for figuring out how they want to transition their state.

This concern about multiple requests is one I hear a lot.  The problem, though, is that it's comparing apples and oranges.

On the one hand, we have "baked-in" knowledge of a format (such as a particular link format).  On the other hand, we have a schema that defines the format.  But just because a schema has been defined does not mean that every client using the data has to fetch the schema and process it.

Say, hypothetically, you have written a client that uses a particular API, with hard-coded knowledge about the data formats and embedded links.  Then, after you have written and released your client, the API authors document their API using JSON (Hyper-)Schema.

At that point - you can totally ignore the schemas if you want.  I mean, you already know what's in the data, why would you need to go and fetch a document that tells you what you already know?  However, if a new client comes along that does not have that built-in knowledge of the API, then the schema is extremely useful to them.

Schemas should only add functionality to those clients that understand them - they should not be required for use of the API, if the client already has special knowledge.

Yes, I'm on the same page as you on that. My concerns about fetching the schema document for every instance (which I brought up in my previous post) were addressed by this templating approach. The concern I bring up in point (3) above has more to do with complexity than with performance. Clients who do want to integrate with the schema will need to implement cross-referencing from templates in the schema to fields in the instance and then substitute values to construct links. This seems complicated to me.

Now, it's possible you say that this concern is not a charter or a design goal for this spec, and you expect all clients to use robust, well-tested libraries for working with the schema instead of using the schema directly. If this is the case, then feel free to ignore my concerns.

I guess I'm still lost on why we would want to have this templating approach instead of just fixing the schema for LDOs and having LDOs be specified as part of instance data. The only benefit I see is that servers have the ability to define link data in any format they want, although I'm not able to think of any compelling reasons for why that's a good thing. Note also that servers can never change the link data format once they define it (because, as you said, clients don't _have_ to use the schema and can directly depend on the data format), in which case templating seems completely unnecessary to me.
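For reference, the cross-referencing step being discussed could look roughly like the sketch below. It assumes every string field of a link definition may contain {name} placeholders that refer to top-level instance properties, and it does plain text substitution only. It is not a full URI Template (RFC 6570) implementation: there is no percent-encoding and no operator support. The schema, the property names and the helper names are all hypothetical.

```python
import re

# Matches a simple {name} placeholder (assumed syntax, no RFC 6570 operators).
_VAR = re.compile(r"\{([^{}]+)\}")


def fill(template, instance):
    """Replace each {name} with the same-named top-level instance property."""
    return _VAR.sub(lambda m: str(instance[m.group(1)]), template)


def resolve_links(schema, instance):
    """Return the schema's link definitions with every string field filled in."""
    resolved = []
    for ldo in schema.get("links", []):
        resolved.append({
            key: fill(value, instance) if isinstance(value, str) else value
            for key, value in ldo.items()
        })
    return resolved


# Hypothetical schema whose link fields are all templated.
schema = {
    "links": [
        {"rel": "{linkRel}", "href": "{linkHref}", "method": "{linkMethod}"}
    ]
}

# Hypothetical instance carrying the actual values.
instance = {
    "linkRel": "comments",
    "linkHref": "/articles/1/comments",
    "linkMethod": "GET",
}

links = resolve_links(schema, instance)
```

Even in this stripped-down form the client has to hold both documents and walk one against the other, which is the complexity in question.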

Philippe Marsteau

Feb 28, 2013, 5:04:05 PM
to json-...@googlegroups.com
+1

I agree this templating approach sounds complicated. It might be powerful, but it is not truly intuitive for anybody consuming the data or reading the schema. Adding a dedicated data-href property (being the actual server-generated link) instead of the href (being the template URI that producers leverage to generate links) sounds far more intuitive. Then let people who build hypermedia APIs refer to the LDO structure wherever they see fit (as a bag of links, or as simple properties).

I agree the schema is only for those who understand/interpret it. In reality very few consumers will ever go fetch a schema somewhere to "adapt" or "interpret" dynamically to whatever format changes the server may have applied. This makes the integration too complicated.

If consumers agree to never build URIs but instead have the server supply them (looking up "rel" in a link array or a specific property in the payload), they don't need to have any schema/template information at hand. The link provided in the data may include not only the URI but also any meta-properties that consumers must know to follow the link, e.g. expected HTTP method, media type, encoding type, even properties needed as input or as output... basically everything an LDO already defines, just in the data payload.
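A minimal sketch of what that looks like on the consumer side, assuming the payload carries a "links" array of LDO-like objects. The payload shape, the property names and the find_link helper are illustrative only, not something the spec defines.

```python
def find_link(payload, rel):
    """Return the first link object in the payload with the given rel, or None."""
    for link in payload.get("links", []):
        if link.get("rel") == rel:
            return link
    return None


# Hypothetical payload with server-generated links embedded in the data.
payload = {
    "title": "Example article",
    "links": [
        {"rel": "self", "href": "/articles/1", "method": "GET"},
        {
            "rel": "comments",
            "href": "/articles/1/comments",
            "method": "GET",
            "mediaType": "application/json",
        },
    ],
}

comments = find_link(payload, "comments")
if comments is not None:
    # The consumer follows the href as given; it never builds the URI itself.
    print(comments["method"], comments["href"])
```

No schema fetch and no template expansion is involved; the trade-off is that the link format is fixed in the payload.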