Best practices on handling media-types in APIs


Felipe Sere

Dec 1, 2013, 3:54:12 PM
to api-...@googlegroups.com
Hi everyone,

After the amazing turnout on the "Selling the benefits of hypermedia" thread I wanted to ask a question I had on my mind while reading "RESTful Web APIs" during my last vacation:

Do you build APIs around a single media-type or does every message have its specific media-type?

In RESTful Web APIs Mike re-uses the Maze+XML example from Building Hypermedia APIs with HTML5 and Node.
And that got me thinking. The internal APIs I have helped build use individual media-types for different "things" in the APIs.
Though they all follow similar patterns for things such as links, the actual "Accept:" and "Content-Type:" headers are individualised.
On the other hand, in the Maze+XML example the entire Maze API is driven by a single media-type.
The client knows which state it is in by looking for some specific tags.

How have you guys gone about it? Have you used a media-type per "thing" (so as not to say "resource")? Or have you used one and made the actual distinction within the message? Have you used profiles? How did you communicate it? How has this affected your documentation, if at all?

I hope to spark a little discussion around this :-)

Cheers,
Felipe

Jørn Wildt

Dec 1, 2013, 6:48:02 PM
to api-...@googlegroups.com
Hi Felipe

I was actually trying to do a follow-up on the previous hypermedia thread, but never got around to it - but since you are asking :-)

The role of media types has been debated for a long time now, and it is difficult to reach any kind of consensus on how to apply media types to APIs. Some will argue in favor of generic media types like HAL, Siren and (X)HTML, while others will argue for very narrow domain specific media types. The thing is - you can do both and both will work.

I once wrote a piece called "The role of media types in RESTful web services" (http://soabits.blogspot.dk/2013/05/the-role-of-media-types-in-restful-web.html). Here I show how a fictive generic media type combined with a narrow domain knowledge can be used to drive a RESTful service with the tiniest possible amount of coupling between client and server. Maybe that can inspire you a bit.

There is also a recent piece called "The lie of the API" by Ruben Verborgh (http://ruben.verborgh.org/blog/2013/11/29/the-lie-of-the-api/) which touches upon the use of media types. It's a relatively provocative text but it does get you thinking.

But let's take the previous example of "Big corporation buys smaller companies and the result is a big unruly combination of customers, sales orders and other stuff living on different systems". We established earlier on that hypermedia and links are a necessity in such an environment in order for any client to navigate to a specific customer resource.

Now let's assume our fictive client is handed a link/URL to a customer resource in this mess of a heterogeneous mix of different company resources. The client can issue a GET on the URL and in return it will receive a stream of bytes. How does the client interpret those bytes? Hopefully everyone screams "using media types" at the top of their lungs. But which media types?

Let's assume the client understands a generic media type like HAL. So together with the GET request it sends an Accept header "Accept: application/hal+json". Luckily the server knows how to serve the customer resource as HAL, so the client gets a HAL document in return. Now what? We have integrated customer resources from three different organizations and each of these has been encoding customer info in HAL - but in different ways.

For instance: Company X has these customer properties:

{
  ID: 1234,
  Name: "John Larsson",
  Address: "Marienborg 1, 2830 Virum, Denmark"
}

and company Y uses these properties:

{
  ID: 1234,
  FirstName: "John",
  LastName: "Larsson",
  Address:
  {
    Address: "Marienborg 1",
    PostalCode: "2830",
    City: "Virum",
    Country: "Denmark"
  }
}

With nothing but this information our client must either give up or do some guessing like "If FirstName is present then assume format of company Y". So we need a bit more information than this.
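To make the guessing concrete, here is a minimal Python sketch of what such a client ends up doing. The function names are made up for illustration, and the two records are the company X and company Y examples from above written as Python dicts. It works, but only as long as nobody adds a third format or renames a property.

```python
def guess_customer_format(record: dict) -> str:
    """Guess which company produced a record by peeking at its keys (brittle)."""
    if "FirstName" in record:
        return "company-y"
    if "Name" in record:
        return "company-x"
    raise ValueError("unknown customer format")


def display_name(record: dict) -> str:
    """Build a display name, branching on the guessed format."""
    if guess_customer_format(record) == "company-y":
        return f'{record["FirstName"]} {record["LastName"]}'
    return record["Name"]


company_x = {
    "ID": 1234,
    "Name": "John Larsson",
    "Address": "Marienborg 1, 2830 Virum, Denmark",
}
company_y = {
    "ID": 1234,
    "FirstName": "John",
    "LastName": "Larsson",
    "Address": {
        "Address": "Marienborg 1",
        "PostalCode": "2830",
        "City": "Virum",
        "Country": "Denmark",
    },
}

print(display_name(company_x))
print(display_name(company_y))
```

Every new source of customer data means another branch in guess_customer_format, which is exactly the coupling we would like to avoid.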

Now we can either choose to add some kind of profile to the representation - either as a header or in the payload - or we can use a domain specific media type. 

A profile in the payload could be done like this:

{
  profile: "http://company-x.com/profiles/customer-care",
  ID: 1234,
  ... other properties ...
}

It would of course require all (different) implementations to include a profile string in the payload, otherwise this attempt would be futile.

The profile could also be part of the media type, so we would get "application/hal+json;profile=http://company-x.com/profiles/customer-care".

A domain specific media type could be something like "application/company-x.customer-care.hal+json" or similar. 
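On the client side, both variants come down to reading the Content-Type header of the response. The following is a rough Python sketch of that step, not a complete header parser: the helper name parse_content_type is invented here, and it assumes that parameter values never contain a semicolon.

```python
def parse_content_type(header: str) -> tuple[str, dict[str, str]]:
    """Split a Content-Type value into (media type, parameters).

    Simplification: splits on ';' and therefore does not support
    quoted parameter values that themselves contain a semicolon.
    """
    media_type, *raw_params = [part.strip() for part in header.split(";")]
    params = {}
    for raw in raw_params:
        if not raw:
            continue  # tolerate a trailing ';'
        name, _, value = raw.partition("=")
        params[name.strip().lower()] = value.strip().strip('"')
    return media_type.lower(), params


generic = parse_content_type(
    "application/hal+json;profile=http://company-x.com/profiles/customer-care"
)
specific = parse_content_type("application/company-x.customer-care.hal+json")

print(generic)
print(specific)
```

With the generic media type the domain information arrives in the profile parameter, while with the domain specific media type it is baked into the type name itself and there are no parameters to look at.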

As I see it, any of these approaches would work. There HAS to be some kind of shared information that tells the client how to interpret the response. Furthermore, such shared information can be split into two pieces: the WHAT element and the HOW element:

- The WHAT element is the profile that informs the client of what it is looking at (for instance a customer record from company-x).

- The HOW element is the media type that tells the client how to parse the byte stream before it starts reading WHAT the bytes represent.
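The HOW/WHAT split can be sketched as two lookup tables in Python. This is only an illustration under some assumptions: the company Y profile URL and the two interpreter functions are hypothetical, and the HAL document is treated as plain JSON.

```python
import json
from typing import Callable

# HOW: media type -> function that turns bytes into a generic document.
PARSERS: dict[str, Callable[[bytes], dict]] = {
    "application/hal+json": lambda body: json.loads(body.decode("utf-8")),
}


def customer_from_company_x(doc: dict) -> dict:
    return {"id": doc["ID"], "name": doc["Name"]}


def customer_from_company_y(doc: dict) -> dict:
    return {"id": doc["ID"], "name": f'{doc["FirstName"]} {doc["LastName"]}'}


# WHAT: profile -> function that knows what the parsed document means.
INTERPRETERS: dict[str, Callable[[dict], dict]] = {
    "http://company-x.com/profiles/customer-care": customer_from_company_x,
    # Hypothetical profile URL, invented for this sketch.
    "http://company-y.example/profiles/customer": customer_from_company_y,
}


def read_customer(body: bytes, media_type: str, profile: str) -> dict:
    parse = PARSERS[media_type]        # HOW to read the byte stream
    interpret = INTERPRETERS[profile]  # WHAT the parsed data represents
    return interpret(parse(body))


body_x = b'{"ID": 1234, "Name": "John Larsson"}'
print(read_customer(body_x, "application/hal+json",
                    "http://company-x.com/profiles/customer-care"))
```

Adding a new media type only touches PARSERS, and adding a new kind of customer record only touches INTERPRETERS.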

There are a few different ways to approach the client side code and figure out HOW to read the data and WHAT it is.

- One way is to assume the client was given a URL together with the WHAT information; if the client did a customer search and received a URL it would expect to find a customer (the WHAT element) - and then it can use a generic media type to figure out HOW to decode the byte stream into a customer record. Unfortunately there can be different ways to encode the same customer data in a single media type.

- The client can use very domain specific media types.

- The client can use generic media types with profiles.

- The client/server can use something like a profile, but less exact, as I presented here: http://soabits.blogspot.dk/2013/05/the-role-of-media-types-in-restful-web.html

The next question would be "how specific should a media type or profile be?". Well, you can either have very specific identifiers (profile or media type) that identify stuff at a very precise level, saying "this is a customer record with company-x properties". Or you can use more general identifiers like "this is stuff from company-x" and then let the client assume the data represents a customer record since it was told so by the search engine or by following some other specific link. In my view both approaches work.

Some people will say that there should be very few media types for a client to understand - the fewer the better - as it will simplify client side coding since the client won't need to understand so many different ways to interpret a byte stream. I tend to agree with that, but it will only cover the HOW (to parse bytes) problem - the client still needs some kind of domain specific information to figure out WHAT it is looking at. The client may assume it's a customer record, that is after all what it was looking for, but as I have said already, a customer record can be encoded in many different ways using the same media type.

So, as with the hypermedia problem: if you stick to "single server" implementations (aka "snowflake APIs") with dedicated clients (like a dedicated Twitter client) then none of this matters. The client can safely assume that there will be one, and only one, representation of whatever kind of resource it is looking for. But if you take a broader perspective and venture into a highly heterogeneous, loosely coupled, unorganized, incoherent and fragmented ecology (also called "The internet") - then you need more domain specific information about the resources - either through domain specific media types, or generic media types with profiles.

One well known example of a domain specific media type is the Atom syndication format (http://www.ietf.org/rfc/rfc4287.txt). A client that understands this format is able to consume information feeds from any kind of resource, from any place on the web, using any wacky URL structure you can think of - AND IT WILL WORK. That is the beauty of REST implemented at its finest, information utterly decoupled from ANY kind of server implementation.
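As a small illustration, here is a Python sketch of such a client using only the standard library. The sample feed and its URL are made up; the only thing the code relies on is the Atom namespace and element names from RFC 4287.

```python
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up sample feed; the link uses a deliberately odd URL.
SAMPLE_FEED = """
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry>
    <title>First post</title>
    <link href="http://example.org/wacky/url?id=1"/>
  </entry>
</feed>
"""


def list_entries(feed_xml: str) -> list[tuple[str, str]]:
    """Return (title, link href) for every entry in an Atom feed."""
    root = ET.fromstring(feed_xml.strip())
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        link = entry.find("atom:link", ATOM_NS)
        href = link.get("href", "") if link is not None else ""
        entries.append((title, href))
    return entries


print(list_entries(SAMPLE_FEED))
```

Nothing in list_entries knows which server produced the feed or how its URLs are structured; it only knows the media type.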

If hypermedia is the engine of application state then media types are the gas and oil it runs on.

/Jørn

mca

Dec 1, 2013, 6:56:28 PM
to api-...@googlegroups.com
I recommend using as few hypermedia message models as possible when implementing an API. Adding more than one message model raises the bar for coding, debugging, and evolving a running system.

IME, most times when there are "many message models" for the same API, it is because each model is actually an object serialization instead of a message model.

For example:
application/customer+json
application/warehouse+json
application/product+json

instead of:
application/siren+json

or at least:
application/supply-chain-mgmt+json
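A rough Python sketch of the single-message-model approach, assuming Siren-style entities that carry "class" and "properties" members. The class names and property names below are hypothetical, not taken from any real API:

```python
def describe(entity: dict) -> str:
    """Handle every response with one message model and branch on its class."""
    classes = entity.get("class", [])
    props = entity.get("properties", {})
    if "customer" in classes:
        return f'customer {props["name"]}'
    if "product" in classes:
        return f'product {props["sku"]}'
    return "unknown entity"


# Both documents would arrive as application/siren+json.
customer = {"class": ["customer"], "properties": {"name": "John Larsson"}}
product = {"class": ["product"], "properties": {"sku": "A-100"}}

print(describe(customer))
print(describe(product))
```

The client needs one parser and one set of link-handling code, and the distinction between "things" lives inside the message rather than in the media type name.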

For another POV, at the last RESTFest (2013) David Zuelke did a talk on his attempt to use multiple hypermedia types in a single client app:

Cheers.
