An Even More Compact JSON-Based Hypermedia Representation


Dan Duvall

May 22, 2013, 6:36:23 PM
to api-...@googlegroups.com
Hey all,

I'm currently working on the design of an API for National Novel Writing Month (nanowrimo.org) and after a few weeks of research, we've decided to prototype a lightweight hypermedia API as a proof of concept using HAL as the media type. What appeals to us most about HAL is how it facilitates linking with very straightforward semantics. Its rate of adoption, evidenced by the number of client implementations and revisions to the draft specification, is also reassuring.

However, all this research and evaluation of hypermedia formats got me thinking about an even more compact representation that might achieve almost the same level of hypermedia support (H Factor in Collection+JSON terms) as HAL. Given that I'm brand new to this community and very new to hypermedia APIs, I wanted to get some general feedback on it that might help me better understand RESTful theory.

Essentially it boils down to:

 1. A removal of the envelope that separates both links and embedded resources from the subject resource
 2. A required self-referencing link, but as a resource property named "href"
 3. Embedded resources may be partial representations (HAL already allows for this)
 4. All href values are valid URI templates

The second part is important as it defines that only objects with self-referencing hrefs are resources, and that all resources inherently represent links to themselves. The third part is important in that it allows partial (or otherwise empty) embedded resources to serve as links. The last part removes the need for additional qualifiers of the href value (helping to condense the format without having additional reserved words).

Example:

A little background on our application: We have the resources participants and novels. A participant represents a user of the site who is taking part in a month-long writing event. A participant writes one novel during the course of the event. There are many events, so a participant relates to many novels, and each novel relates to one event.

Using HAL to represent the resources, a request for a participant might look like this:

GET /participants/1
200 OK

{
  "_links": {
    "self": { "href": "/participants/1" },
    "next": { "href": "/participants/2" },
    "novels": { "href": "/participants/1/novels" },
    "profile": { "href": "/participants/1/profile" }
  },
  "name": "Dan",
  "created_at": "2010-02-25T12:21:20",
  "time_zone": { "region": "US/Pacific", "offset": "-07:00" },
  "_embedded": {
    "novels": [
      {
        "_links": {
          "self": { "href": "/novels/123" },
          "event": { "href": "/events/1" }
        },
        "title": "A Tailored Pursuit"
      },
      {
        "_links": {
          "self": { "href": "/novels/124" },
          "event": { "href": "/events/2" }
        },
        "title": "Haphazard Names"
      }
    ]
  }
}



The same resource represented in the alternate format would look something like the following.

GET /participants/1
200 OK

{
  "href": "/participants/1",
  "name": "Dan",
  "created_at": "2010-02-25T12:21:20",
  "time_zone": { "region": "US/Pacific", "offset": "-07:00" },
  "novels": [
    { "href": "/novels/123", "title": "A Tailored Pursuit" },
    { "href": "/novels/124", "title": "Haphazard Names" }
  ],
  "profile": { "href": "/participants/1/profile" },
  "next": { "href": "/participants/2" }
}


As you can see, the formats represent essentially the same resources but the latter is much more compact with a lot less metastructure (no envelope). In fact, there's only one reserved word at the moment (href) and a lot of information can be inferred by it.
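The mapping between the two representations can be sketched in Python. This is only an illustration under stated assumptions: the function name `hal_to_compact` is hypothetical, every HAL link is assumed to be a single link object (not an array), and a link is dropped when an embedded resource or a property already uses the same name. Unlike the hand-written example above, this sketch keeps the "event" links on the embedded novels.

```python
def hal_to_compact(hal):
    """Convert a HAL-style dict to the compact href-based form (a sketch)."""
    compact = {}
    links = hal.get("_links", {})
    embedded = hal.get("_embedded", {})

    # The self link becomes a plain "href" property on the resource.
    if "self" in links:
        compact["href"] = links["self"]["href"]

    # Ordinary properties pass through untouched.
    for name, value in hal.items():
        if name not in ("_links", "_embedded"):
            compact[name] = value

    # Embedded resources become (possibly partial) nested resources.
    for rel, value in embedded.items():
        if isinstance(value, list):
            compact[rel] = [hal_to_compact(item) for item in value]
        else:
            compact[rel] = hal_to_compact(value)

    # Remaining links become minimal resources that hold only an href.
    # Assumption: a name already taken by a property or embed wins.
    for rel, link in links.items():
        if rel != "self" and rel not in compact:
            compact[rel] = {"href": link["href"]}

    return compact
```

The silent "name already taken" rule in the last loop is one place where the lack of an envelope shows up: HAL never has to make that choice.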

There are five resources that can be identified by their self-referential href properties: the participant, the two novels, the profile, and the next participant. As in HAL, embedded resources can be partial representations. Unlike HAL, however, in their most minimal state (having only an href property) embedded resources can represent links themselves, links to their full canonical representation.

Note that I've represented the participant's novels as a collection of resources without the collection being a resource itself. However, it could have just as easily been represented as the latter.

Link relations could be inferred by property names. Alternatively, one more (optional) reserved property name (rel) could be introduced to facilitate explicit relation names (or URLs).

{
  ...
  "profile": { "rel": "/relations/profile", "href": "/participants/1/profile" }
}



There are obviously some limitations to the format I've described, most notably that it lacks anything beyond the linking hypermedia factors, but I think that it does offer a very compact JSON-serialized hypermedia format.

Dan

Steve Klabnik

May 22, 2013, 10:28:51 PM
to api-...@googlegroups.com
You should cross-post this to hal-discuss.

Kevin Swiber

May 22, 2013, 11:01:18 PM
to api-...@googlegroups.com
Hey Dan,

Welcome!

The approach you documented is a lot more compact.

Most JSON-based hypermedia types (that are discussed in forums such as this, anyway) are aimed at being a "generic" media type in the sense that anyone can use the format for their specific purposes.  I'm going to move forward assuming your proposal reflects this, as well.

I enjoy the minimal boilerplate.  I think a lot of people lean this way.  There are some common constraints with this, however.

1. With the "rel": [ { "href": "…" } ] format, it's difficult to follow the Web Linking spec (RFC5988), which supports multiple relation values per link.
2. There are no controls for state transitions, so documentation would have to provide that to client developers.  Some see this as a bonus for M2M, though I do not.
3. Name collisions.  All responses need to be careful not to provide a property name that matches a link relation name.

With the format you outlined, in particular, I also see some parsing difficulties.  How does a parser reliably pull media type semantics out of this response without first understanding some application semantics?

A parser would require some upfront knowledge/configuration to figure out what's a link and what's not.  If the parsing rules are "any object in an array that contains an 'href' property is considered a link with the property name of its parent array being the link relation value," then so be it, but this could potentially bring some headaches.
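The parsing rule quoted above can be sketched in a few lines of Python. This is an assumption-laden illustration, not part of the proposal: the name `find_links` is hypothetical, the rule is widened to cover single objects as well as objects inside arrays, and the top-level resource is reported under the relation "self".

```python
def find_links(obj, rel=None):
    """Yield (relation, href) pairs for every object that carries an "href"."""
    if isinstance(obj, list):
        # Items of an array inherit the property name of the array.
        for item in obj:
            yield from find_links(item, rel)
    elif isinstance(obj, dict):
        if "href" in obj:
            # Assumption: the document root has no parent, so call it "self".
            yield (rel if rel is not None else "self", obj["href"])
        for name, value in obj.items():
            yield from find_links(value, name)
```

One headache is visible right away: any plain data object that happens to have a property named "href" is reported as a link, because nothing but the property name separates media type semantics from application semantics.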

I encourage you to move forward with it.  There's nothing wrong with listing constraints in your spec, either.  Sometimes the limiting of a feature set actually becomes a feature in itself.  :)

Cheers,

-- 
Kevin Swiber
@kevinswiber


Peter Williams

May 23, 2013, 11:16:39 AM
to api-...@googlegroups.com

I like the compactness and the intuitive organization of this style of links. I think this is how most people approach links in json at least initially.

On May 22, 2013 4:36 PM, "Dan Duvall" <d...@mutual.io> wrote:
>
> {
>   "href": "/participants/1",

This seems equivalent to the `self` rel. Is that the correct interpretation? If so why not use `self`?

>   "novels": [
>     { "href": "/novels/123", "title": "A Tailored Pursuit" },

Is this the title of the link or of the referenced novel? If it is the former, is there a way to expand links into an embedded representation of the referenced resource? If it is the latter, how do consumers distinguish between a link and an expanded reference?

Peter
Barelyenough.org

Mike Kelly

May 23, 2013, 1:17:53 PM
to api-...@googlegroups.com

HAL was created so that multiple variants (json and xml) could share the same abstract model (i.e. resource, links, embedded resources). So it's possible to come up with another media type based on json that is more compact and is still HAL.

hal+json is designed the way it is mostly to keep the complexity of the spec down and to draw simple lines between what bits of the message are links, embedded resources, and properties. There's nothing to say that design is "right".. perhaps a more compact design might be more appealing?

If you want, you are more than welcome to bring this up on hal-discuss and see where that goes.

Good luck!

Cheers,
M


Repenning, Jack

May 23, 2013, 1:32:20 PM
to api-...@googlegroups.com
On May 22, 2013, at 8:01 PM, Kevin Swiber <ksw...@gmail.com> wrote:

> A parser would require some upfront knowledge/configuration to figure out what's a link and what's not.

I'm also a little concerned about mixing data and metadata. IMO, an important goal is to make the API exchanges (particularly at development time) explain each other. That is, the data returned by, say, a read is a quick education on the data required in a create. In restructuring the links out to top level, you obscure this a bit. Does the presence of "href", "next", and "profile" in the reply to "GET /participants/1" mean I'm supposed to provide those values during a POST? I'm guessing probably not: those both sound generated rather than set. But other cases might be less obvious; the original's structure helps clarify.

Are you working in some context where compactness is so much more important than clarity?

--
Jack Repenning
Repenni...@gmail.com

Dan Duvall

May 23, 2013, 2:28:54 PM
to api-...@googlegroups.com
Thanks for the feedback, Kevin!

On Wednesday, May 22, 2013 8:01:18 PM UTC-7, Kevin Swiber wrote:
> Most JSON-based hypermedia types (that are discussed in forums such as this, anyway) are aimed at being a "generic" media type in the sense that anyone can use the format for their specific purposes.  I'm going to move forward assuming your proposal reflects this, as well.

Yes, that was my intent: a format that reflects our immediate requirements but is general enough to be useful for others.
 
> 1. With the "rel": [ { "href": "…" } ] format, it's difficult to follow the Web Linking spec (RFC5988), which supports multiple relation values per link.

I shared the same concern when I first discovered HAL (AFAICT it shares this limitation). For my own education, what's a common use case that would necessitate multiple rels?

Perhaps introducing the additional "rel" property as I mentioned, but as an array instead of a single string value, would satisfy this requirement.
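As a rough Python sketch of that idea (the helper name `relations` is hypothetical, and accepting both a string and an array for "rel" is an assumption made here for illustration), a client could normalize every linked resource to a list of relation names, falling back to the property name when no "rel" is given:

```python
def relations(name, resource):
    """Return the list of relation names for a linked resource."""
    rel = resource.get("rel")
    if rel is None:
        return [name]   # no explicit rel: infer it from the property name
    if isinstance(rel, str):
        return [rel]    # a single explicit relation
    return list(rel)    # an array of explicit relations
```

For `{ "rel": ["/relations/profile", "author"], "href": "/participants/1/profile" }` stored under the property "profile", this would return both explicit relations and ignore the property name.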
 
> 2. There are no controls for state transitions, so documentation would have to provide that to client developers.  Some see this as a bonus for M2M, though I do not.

I had planned on describing possible transitions via the Allow header in responses to both GET and OPTIONS requests. This implementation would limit the description of the controls for sure (definitely more M2M oriented). Is this what you mean by state-transition controls or is there more to it?
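A client relying on the Allow header might check it roughly like this. The sketch is in Python, the function names are hypothetical, and it only tells the client which HTTP methods are permitted, not what request body or semantics each one expects, which is the limitation mentioned above.

```python
def allowed_methods(allow_header):
    """Parse an HTTP Allow header value into a set of method names."""
    return {m.strip() for m in allow_header.split(",") if m.strip()}


def can(allow_header, method):
    """True if the given method appears in the Allow header value."""
    return method in allowed_methods(allow_header)
```

For example, `can("GET, HEAD, PUT", "DELETE")` is false, so a client would not offer a delete action for that resource.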
 
> 3. Name collisions.  All responses need to be careful not to provide a property name that matches a link relation name.

Very true. I'm not sure how best to address namespacing issues. Though introducing the "rel" attribute—or possibly enforcing it—would allow for the link/resource name to be separate from the relation name(s). 

> With the format you outlined, in particular, I also see some parsing difficulties.  How does a parser reliably pull media type semantics out of this response without first understanding some application semantics?
>
> A parser would require some upfront knowledge/configuration to figure out what's a link and what's not.  If the parsing rules are "any object in an array that contains an 'href' property is considered a link with the property name of its parent array being the link relation value," then so be it, but this could potentially bring some headaches.

The parsing/presentation method that I envisioned when coming up with this format was less focused on extraction of these semantics up front and more focused on discovery and delegation, the latter being more akin to how a user agent renders HTML or how XSLT might transform XML.

A very contrived M2P example (in some weird Ruby/Scala pseudocode):

def render_obj(name, obj) =
  if (obj.href?) render_resource(name, obj) else render_property(name, obj)

def render_resource(name, obj) =
  render_link(obj.href) { render_relation(obj.rel || [ name ], obj) }

A crawler example (M2M):

def traverse(obj) =
  for (value <- obj.properties)
    if (value.href?) follow(value) else index(value)

def follow(obj) =
  traverse(dereference(obj.href))
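For readers who want to run the crawler idea, here is one possible Python rendering. It is a sketch under assumptions that go beyond the pseudocode: `fetch` and `index` are hypothetical callables supplied by the caller (for instance an HTTP GET plus JSON decode, and a search indexer), arrays of resources are followed item by item, and a `seen` set is added so that cyclic links such as "next" chains do not loop forever.

```python
def traverse(obj, fetch, index, seen):
    """Index plain properties and follow every value that carries an href."""
    for name, value in obj.items():
        if name == "href":
            continue  # the resource's own identifier is not a property to index
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict) and "href" in item:
                follow(item, fetch, index, seen)
            else:
                index(name, item)


def follow(obj, fetch, index, seen):
    """Dereference a resource at most once, then traverse the full result."""
    href = obj["href"]
    if href in seen:
        return
    seen.add(href)
    traverse(fetch(href), fetch, index, seen)
```

Starting it is a matter of calling `follow({"href": "/participants/1"}, fetch, index, set())`. Note that partial embedded representations are ignored here in favor of the dereferenced resource, which is one design choice among several.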



> I encourage you to move forward with it.  There's nothing wrong with listing constraints in your spec, either.  Sometimes the limiting of a feature set actually becomes a feature in itself.  :)

Thanks again for your feedback. I really appreciate it.

Dan

Dan Duvall

May 23, 2013, 2:39:04 PM
to api-...@googlegroups.com
And HAL does the things you mention quite well, Mike. (Thanks for all of your hard work on it!)

I'm definitely not saying that my way is 'right' either. In my mind there is never one right way to do anything. Exploring this format is more of an exercise for me to better understand concepts surrounding hypermedia and the relative importance of each as it might relate to our API consumers, not to mention that I just find these sorts of experiments to be really fun.

Dan Duvall

May 23, 2013, 3:10:48 PM
to api-...@googlegroups.com
On Thursday, May 23, 2013 10:32:20 AM UTC-7, Jack Repenning wrote:
> I'm also a little concerned about mixing data and metadata. IMO, an important goal is to make the API exchanges (particularly at development time) explain each other. That is, the data returned by, say, a read is a quick education on the data required in a create. In restructuring the links out to top level, you obscure this a bit. Does the presence of "href", "next", and "profile" in the reply to "GET /participants/1" mean I'm supposed to provide those values during a POST? I'm guessing probably not: those both sound generated rather than set. But other cases might be less obvious; the original's structure helps clarify.

I'm not sure that the reserved words "href" and "rel" make for more cumbersome self-documentation than HAL's "_links" and "_embedded", but I definitely see your point when it comes to the relations. In my mind, since you can still distinguish relations/embeds/links (they are all one and the same) from mutable resource properties by interrogating for the href, constructing a create/update payload would be fairly straightforward.
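A minimal Python sketch of "interrogating for the href" when building a payload might look like the following. The function name is hypothetical, and the rule used (drop the resource's own href and any property whose value is, or contains, an object with an href) is an assumption about how a client would apply the idea.

```python
def writable_properties(resource):
    """Keep only properties that are not resources or links themselves."""
    def is_resource(value):
        if isinstance(value, list):
            return any(is_resource(v) for v in value)
        return isinstance(value, dict) and "href" in value

    return {
        name: value
        for name, value in resource.items()
        if name != "href" and not is_resource(value)
    }
```

Applied to the participant example, this leaves "name", "created_at", and "time_zone". It cannot tell that "created_at" is server-generated, so the filter separates links from properties but says nothing about which properties are actually mutable.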
 

> Are you working in some context where compactness is so much more important than clarity?

No, not really. This is more of an exercise and we may end up going with HAL for practical reasons (existing client libraries, etc.).

Regarding clarity, I find my format fairly clear but that may just be because it manifested so close to my own eyes. :) That said, it's important to hear how clear or unclear it seems to others.

Peter Williams

May 23, 2013, 3:22:48 PM
to api-...@googlegroups.com

On May 23, 2013 11:32 AM, "Repenning, Jack" <repenni...@gmail.com> wrote:
>
> On May 22, 2013, at 8:01 PM, Kevin Swiber <ksw...@gmail.com> wrote:
>
> > A parser would require some upfront knowledge/configuration
> to figure out what's a link and what's not.

How is knowing that objects which have an `href` member are links different from knowing that members of the `_links` object are links?

> I'm also a little concerned about mixing data and metadata.

How are you defining metadata? I don't think of the relationships between resources as meta. Ime, they are often the most important data.

> IMO, an important goal is to make the API exchanges (particularly at development
> time) explain each other. That is, the data returned by, say, a read is a quick education
> on the data required in a create. In restructuring the links out to top level, you
> obscure this a bit.

OTOH, if the reader is looking to understand both the structure and relationships pushing the links down a level may actually obscure things.

>Does the presence of "href", "next", and "profile" in the reply to "GET /participants/1"
> mean I'm supposed to provide those values during a POST? I'm guessing
> probably not: those both sound generated rather than set. But other cases might be
> less obvious; the original's structure helps clarify.

In some (hopefully an increasing number of) services, links are part of the client-specified data. For example, to transfer money between accounts, the client might create a transfer resource specifying the source and destination accounts using links to them in the POST. In this situation, not segregating links might be a bit less confusing. People generally don't seem to have serious issues figuring out which properties can and cannot be changed. E.g., if the response has a created-at timestamp, people don't get confused by the immutability of that property. I don't see why the fact that the data type of the property value is a link, rather than a string or number, would make that any more confusing.

Peter
Barelyenough.org

Mike Schinkel

May 23, 2013, 5:14:06 PM
to api-...@googlegroups.com
On May 22, 2013, at 6:36 PM, Dan Duvall <d...@mutual.io> wrote:
> I'm currently working on the design of an API for National Novel Writing Month (nanowrimo.org) and after a few weeks of research, we've decided to prototype a lightweight hypermedia API as a proof of concept using HAL as the media type. What appeals to us most about HAL is how it facilitates linking with very straightforward semantics, and the rate of adoption, evident by the number of client implementations and revisions to the draft specification, is reassuring.
>
> However, all this research and evaluation of hypermedia formats got me thinking about an even more compact representation that may be possible to achieve almost the same level of hypermedia (H Factor in Collection+JSON terms) as HAL. Given that I'm brand new to this community and very new to hypermedia APIs, I wanted to get some general feedback on it that might help me better understand RESTful theory in general.

Are you proposing the idea of a simpler *generic* JSON media type for APIs, or a media type tailored to your own specific API with its own specific media type designation?

-Mike

Austin Wright

May 25, 2013, 11:58:51 AM
to api-...@googlegroups.com
The latter suggestion would work if JSON Schema were used with hyper-schema annotations; the media type would then be, e.g.:

application/json;profile=http://example.com/v1/novel.json

<http://example.com/v1/novel.json> being the URI of a JSON Schema document that describes which properties are links (among other things... note that you need not serve the schema at that URL, like an XML DTD, but you should). Using this pattern is far more powerful, I believe, because you can indicate which format the request or response is in (and other properties like API version). Or use both the schema-URI and HAL patterns.
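On the consuming side, a client could pull the schema URI out of such a Content-Type value with something like the Python sketch below. The function name is hypothetical and the parsing is deliberately naive: it splits on ";" and on the first "=", so it is not a full media type parser and would mishandle a quoted parameter value that itself contains a semicolon.

```python
def parse_media_type(value):
    """Split a Content-Type value into (media type, parameter dict)."""
    media_type, _, rest = value.partition(";")
    params = {}
    for part in rest.split(";"):
        if "=" in part:
            key, _, val = part.partition("=")
            # Parameter names are case-insensitive; values may be quoted.
            params[key.strip().lower()] = val.strip().strip('"')
    return media_type.strip().lower(), params
```

A client could then use the "profile" parameter as an opaque identifier for the format and API version, whether or not the schema is actually served at that URL.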

I wouldn't recommend designing a new hypermedia format. If there's some element of complexity which you believe is undesirable, I've found it's likely there for good reason, including for consumers, if not producers.

Austin.

Steve Klabnik

May 25, 2013, 3:05:17 PM
to api-...@googlegroups.com
application/json does not have parameters:

> Required parameters: n/a
> Optional parameters: n/a
>
> http://www.ietf.org/rfc/rfc4627.txt

Therefore, including a profile as a parameter is non-compliant.

A Link header would be a way to include one and be compliant.

Austin Wright

May 26, 2013, 5:07:31 AM
to api-...@googlegroups.com
Other specifications may add parameters to the media type. In this case, this is done in the JSON Schema draft (and it is just a draft, hence why I point out it can be used strictly as an identifier).

A Link header could provide a location to download a schema, but it carries slightly different semantics: it implies the schema is supposed to be downloadable (not necessarily what one wants), and it applies to the document regardless of media type. That is odd considering the Link header for JSON Schema is JSON-specific (using the same Link header for an HTML document, in an HTML or XML API, wouldn't make any sense).

Steve Klabnik

May 26, 2013, 10:36:38 AM
to api-...@googlegroups.com
Ah ha! Thank you; I didn't know JSON Schema did that.

Dan Duvall

May 28, 2013, 3:56:07 PM
to api-...@googlegroups.com
On Saturday, May 25, 2013 8:58:51 AM UTC-7, Austin Wright wrote:
> I wouldn't recommend designing a new hypermedia format. If there's some element of complexity which you believe is undesirable, I've found it's likely there for good reason, including for consumers, if not producers.

Sound advice. I'm trying not to presume anything about the hypermedia formats that I come across without first reading their history and experimenting a bit, and the format I've described is more a product of the latter than an attempt to codify something completely new.

Dan Duvall

May 28, 2013, 4:43:40 PM
to api-...@googlegroups.com
After doing some more research I've actually found that my terse format can be expressed almost entirely in an existing—well, 'developing' might be a better word for it—media type: JSON-LD with Hydra Core Vocabulary.

JSON-LD provides the hierarchical resource structure and Hydra provides the hypermedia semantics. I'm liking this approach so far for the following reasons.

 1. Representation of resources and links is simple, almost identical to the format I was playing around with. Addition of an @id property to an object identifies it as a resource, and the value of it is the dereference-able URI.
 2. Semantic capabilities are layered and can be added to the @context only when necessary. In this case, the hypermedia semantics of link relations can be provided by referencing Hydra as a vocabulary.
 3. Much of the underlying complexity can be abstracted out by keeping @context descriptions separate from the resources themselves. In fact, one could go further and provide a plain "application/json" response that only references the json-ld semantics as a Link header (though allowing for this type of consumption might promote coupled-RPC over hypermedia behavior).
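To show how close the two shapes are, here is a small Python sketch that rewrites the terse format into a JSON-LD-style document by renaming "href" to "@id". The function name and the context URL in the usage note are hypothetical, and the sketch makes no claim about which Hydra terms the context would need to define. It only performs the structural rename described in point 1.

```python
def to_json_ld(resource, context=None):
    """Rename "href" to "@id" recursively; add "@context" at the top level."""
    def convert(value):
        if isinstance(value, list):
            return [convert(v) for v in value]
        if isinstance(value, dict):
            return {("@id" if k == "href" else k): convert(v) for k, v in value.items()}
        return value

    doc = convert(resource)
    if context is not None:
        # Assumption: the context is referenced by URL and kept out of the body.
        doc = {"@context": context, **doc}
    return doc
```

Calling `to_json_ld(participant, "/contexts/participant.jsonld")` would yield a document whose nested novels, profile, and next entries each carry an "@id". Property names such as "name" or "title" only gain meaning in JSON-LD once the referenced context maps them to vocabulary terms.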

I didn't see many discussions about JSON-LD in this list. Have any of you played around with or implemented it?

Dan Duvall

May 28, 2013, 4:56:24 PM
to api-...@googlegroups.com
Ironically, I left out links to what I'm referring to. :)
