Media Type Design

paulkmoore

Mar 8, 2012, 10:36:18 AM
to restinpractice
Hi - I'm struggling a little with the design of our domain-specific
hypermedia format, and wondered if the group could help provide some
clarity?

I shall attempt to use the RiP vernacular to explain the problem...

Let's say I'm documenting the media type 'application/vnd.restbucks+xml'
(as per Chapter 5 of RiP). I have the notion of "order" and
"payment" resources, and I can convey the domain-specific and protocol
information. I interpret this as the representation design for the
'order' and 'payment' resources (appropriate XML schema) and the set
of Link Relations (standard or domain-specific) appropriate to the
domain's (i) structural and (ii) state transition semantics.

In the 'Media type design and formats' paragraph there is some
discourse about the balance of media types and representation
formats. Specifically, whether to use 'application/vnd.restbucks+xml'
as an umbrella type, or to use representation specific formats
'application/vnd.restbucks.order+xml' etc. In the book, the choice is
made to use the umbrella type, leaving identification of individual
representations to "their XML namespaces". This approach is also
chosen over the Media Type Parameter approach adopted by Atom with the
'type' parameter.

Now for the slight diversion...

My encoding format is JSON. I acknowledge that I will have to define
the hypermedia control elements etc, but I'm somewhat stuck with the
encoding format.

I have a few choices. I can either:

i) Add the resource name into the representation, in a manner similar
to the XML namespace, or
ii) Stamp out specific media types for each resource, or
iii) Add a Media Type Parameter, such as Atom's 'type', and
include the representation format in there, e.g.
'application/vnd.restbucks+xml;type=order'
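
To make the three options concrete, here is a minimal sketch of how a client might recover the resource name under each of them. The `+json` media type names, the `"type"` member in the body and the helper names are assumptions made up for illustration; none of them come from RiP or from a registered media type.

```python
import json
from email.message import Message


def parse_content_type(header_value):
    """Split a Content-Type value into (media_type, params)."""
    msg = Message()
    msg["Content-Type"] = header_value
    media_type = msg.get_content_type()
    # get_params() returns [(media_type, ''), (name, value), ...]
    params = dict(msg.get_params()[1:])
    return media_type, params


def resource_name(content_type, body):
    """Work out which resource a response represents (hypothetical conventions)."""
    media_type, params = parse_content_type(content_type)

    # Option (iii): a media type parameter, as Atom does with 'type'.
    if "type" in params:
        return params["type"]

    # Option (ii): one media type per resource,
    # e.g. application/vnd.restbucks.order+json (assumed name).
    subtype = media_type.split("/", 1)[1]
    stem = subtype.rsplit("+", 1)[0]
    prefix = "vnd.restbucks."
    if stem.startswith(prefix):
        return stem[len(prefix):]

    # Option (i): the resource name lives only inside the body.
    return json.loads(body).get("type")
```

Only options (ii) and (iii) can be answered from the headers alone; option (i) forces the client to parse the body first, which is the objection raised in the views that follow.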

My (current) views are as follows:

Option (i): I don't like the fact that the identification of the
resource is (only) encoded within the body of the HTTP Message, and
would prefer that it was drawn out into a header of some type (i.e.
Content-Type). Also, I'm not using XML, so trying to 'replicate' this
approach for JSON smells bad.

Option (ii): Seems overly verbose, and frankly wrong. I have a Media
Type with multiple resources, not multiple media types.

Option (iii): Seems reasonable and extensible, and aligned with Atom,
but I've trawled the IANA Media Type Registry and can't find a single
example of anybody doing it this way (except Atom). That fills me with
fear. There are some notable uses of Media Type Parameters in the
field of audio & video codecs
(http://www.iana.org/assignments/media-type-sub-parameters) which seem
analogous, but nothing substantial from the application/* tree.

Questions:

1. Does the Media Type Parameter approach of Option (iii) seem a)
reasonable and b) standards compliant?
2. Am I (unknowingly) implying anything else about the Media Type by
defining and using a Parameter?

Thanks for your time and thoughts

Paul

PS: As an addendum, Option (iii) also provides a deterministic way of
specifying the resource type when following a 'plain' link (i.e. not a
hypermedia control - no 'rel') from an email, or similar. That
*might* be useful.

Paul Moore

Mar 8, 2012, 12:46:49 PM
to restinp...@googlegroups.com
Quick update: the addendum is in fact only a re-statement of the header vs. message body inspection to determine type point.  Apologies.

Paul Moore

Mar 8, 2012, 1:23:03 PM
to restinp...@googlegroups.com
Or, Option (iv), the representation type of a link is entirely defined by the semantics of the 'rel' (as with the HTTP methods) and therefore further annotation either in the representation or in the header as a Content-Type is superfluous.  In which case the use of XML Schema in the book to "identify representations" is perhaps a little misleading?

This would explain why so few Media Type Sub-Parameters are registered....

Ian Robinson

Mar 9, 2012, 3:39:09 AM
to restinp...@googlegroups.com
Hi Paul

Good questions, which get to the heart of media type design - a topic that I don't think we covered in great depth in the book. We've learnt quite a bit since we wrote it, and there's been some very good work more recently elsewhere that goes into even more depth (e.g. Mike Amundsen's recent 'Building Hypermedia APIs with HTML5 and Node').

The question for me comes down to the difference between the representation format itself and how a document produced using that format relates to a particular domain. To illustrate: here's a representation format for representing lists of things (e.g. Atom); here's a document that uses that format to represent a stream of events. Where the 2 overlap (representation format = domain semantics) you end up with things like application/order+xml. Where the 2 are kept quite separate you end up with a far more generic media type, but then need some other mechanism for indicating what a particular representation means domain-wise.

Link relations help clue the client in to what it can expect next; that is, they indicate what meaning the client might attach to the response should they follow a particular link. If a link is annotated 'rel="payment"' then the client knows that it can treat the next response - no matter what media type the response uses - as being something that allows for payment. That's a nice separation of concerns between domain meaning (the link relation) and the representation format, but it only works if the client is already involved in the app, and always and only ever following annotated links. As you point out, many clients will enter the system by following 'plain' links, which give no hint as to how to treat the subsequent response.

So we need something on the wire, something that accompanies the response. The options then are: a header or header parameter (such as the Content-Type 'type' parameter), or something in the entity body itself. My preference is for something in the entity body itself: a 'class' attribute, for example. This is how HTML works today. We have a generic media type (HTML isn't specialised for displaying orders, or descriptions of books, or news stories) that allows for 'class' attributes that add domain semantics to some of the markup. And because this domain semantic is 'in' the representation itself, there's no chance of it being lost or never discovered (as there is if you follow 'plain' links or strip the headers from a response).

So design reasonably generic media types that allow for domain semantics using something like 'type' or 'class' attributes.
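
A rough JSON rendering of this advice might look like the sketch below. The `"class"` member (a list, by analogy with HTML's space-separated class attribute), the handler names and the document fields are all hypothetical; the thread does not prescribe a JSON shape.

```python
import json


def describe_order(doc):
    return f"order for {len(doc['items'])} item(s)"


def describe_payment(doc):
    return f"payment of {doc['amount']}"


# Domain semantics the client understands, keyed by 'class' value.
HANDLERS = {"order": describe_order, "payment": describe_payment}


def interpret(body):
    """Pick a domain handler from the 'class' annotation inside the body."""
    doc = json.loads(body)
    for name in doc.get("class", []):
        handler = HANDLERS.get(name)
        if handler is not None:
            return handler(doc)
    # Unknown or missing class: fall back to generic processing.
    return "generic document"
```

Because the annotation travels inside the document, `interpret` gives the same answer whether the body arrived via an annotated link, a 'plain' link, or a file with its headers stripped.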

Hope that helps

ian

Ian Robinson

Mar 9, 2012, 3:42:21 AM
to restinp...@googlegroups.com
And my addendum:

This doesn't invalidate the link relation approach - they're complementary. Link relations help a client decide what it might do next - which route might it take next - based on some domain semantics as expressed through the link relations. The in-response 'class' annotations then tell the client how it might interpret that specific document according to the 'types' it understands once it has received the response.
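
The two mechanisms working together can be sketched as follows. The in-memory `RESOURCES` table stands in for a server, and the `"links"`, `"rel"`, `"href"` and `"class"` member names are assumptions for illustration only.

```python
import json

# Hypothetical in-memory "server": URI -> JSON entity body.
RESOURCES = {
    "/order/1": json.dumps({
        "class": ["order"],
        "links": [{"rel": "payment", "href": "/payment/1"}],
    }),
    "/payment/1": json.dumps({
        "class": ["payment"],
        "amount": "2.50",
        "links": [],
    }),
}


def get(uri):
    """Stand-in for an HTTP GET that returns the parsed body."""
    return json.loads(RESOURCES[uri])


def find_link(doc, rel):
    """Return the href of the first link with the given relation, or None."""
    for link in doc.get("links", []):
        if link.get("rel") == rel:
            return link["href"]
    return None


order = get("/order/1")
href = find_link(order, "payment")      # the link relation guides what to do next
payment = get(href)
understood = "payment" in payment.get("class", [])  # 'class' says what arrived
```

The link relation is consulted before the request is made, and the 'class' annotation after the response arrives, so a client that lands on `/payment/1` from a 'plain' link can still perform the second step.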

ian

Daniel Roop

Mar 10, 2012, 2:43:55 PM
to restinp...@googlegroups.com
Paul,

I am in a similar situation to you: my technology group is going the JSON encoding route. I think this is great for many reasons and I champion it within my org, but the one downside is exactly this problem.  I have attempted to boil it down, and while JSON and XML are very similar there are two differences when it comes to defining hypermedia types with them.
- XML has no formal notion of an array/list (you can represent a list, but it can't actually tell the difference between a one-item list and a nested object; see the Jersey parsing problem).
- JSON has no property type (XML nodes are the types).

The second one is what causes this problem: you have to define a property like "links" that a user agent can leverage to know that all links for this object are within that property.  This has worked well for us for generic concepts like links, errors and media, but when it comes to domain-specific objects we have been a little looser.
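
The first difference in the list above (no formal list in XML) can be illustrated with a deliberately naive XML-to-dict converter. This is a sketch written for this example, not the behaviour of Jersey or any particular library; the element names are made up.

```python
import json
import xml.etree.ElementTree as ET


def naive_to_dict(element):
    """Naive conversion: a repeated child tag becomes a list, a single one does not."""
    children = list(element)
    if not children:
        return element.text
    result = {}
    for child in children:
        value = naive_to_dict(child)
        if child.tag in result:
            # Second occurrence of the tag: promote the entry to a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


one = naive_to_dict(ET.fromstring("<order><item>latte</item></order>"))
two = naive_to_dict(ET.fromstring(
    "<order><item>latte</item><item>mocha</item></order>"))

# JSON, by contrast, marks a one-item list explicitly.
json_one = json.loads('{"items": ["latte"]}')["items"]
```

Without a schema, the converter cannot know that the single `<item>` was meant to be a one-element list, so `one` and `two` end up with different shapes.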

If I haven't gone on too much of a tangent, I will attempt to give my opinion on your question now.

First, I think options 2 and 3 are actually the same, except 3 has the notion that you might embed types within types if you don't specify the "type" qualifier (as Atom does: it assumes a feed outside with entries inside if no type is provided, if I remember correctly).

So for that reason I favor 3.

I have convinced myself that 1 is a bad choice for JSON because of versioning.  Now, to be clear, I think there are multiple points at which you should consider versioning your API, and I think you should only version when you can't maintain backwards compatibility.  The best option is, of course, to avoid the need to version the content type (see HTML as an example of that: always building on, rarely removing from, the spec)... but when you absolutely need to, I think the user agent should be able to specify via the Accept header what version it understands.  If you have a single uber content type, any change to any sub content type is a version increase.  Even if this was okay, it means you are forcing all clients to move forward on all types at the same time instead of each individually.
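
A minimal sketch of that per-type versioning, assuming one media type per resource with a `version` parameter. The media type names, the parameter and the `SUPPORTED` table are hypothetical, and the negotiation is simplified: it ignores q-values and splits the Accept header naively on commas.

```python
from email.message import Message

# Versions the (hypothetical) server can produce, per resource media type.
SUPPORTED = {
    ("application/vnd.restbucks.order+json", "1"),
    ("application/vnd.restbucks.order+json", "2"),
    ("application/vnd.restbucks.payment+json", "1"),
}


def parse_media_range(value):
    """Split one Accept entry into (media_type, params)."""
    msg = Message()
    msg["Content-Type"] = value.strip()
    return msg.get_content_type(), dict(msg.get_params()[1:])


def negotiate(accept_header):
    """Return the first (media_type, version) the server supports, else None."""
    for media_range in accept_header.split(","):
        media_type, params = parse_media_range(media_range)
        candidate = (media_type, params.get("version"))
        if candidate in SUPPORTED:
            return candidate
    return None
```

Here orders have moved to version 2 while payments remain at version 1, so a client can upgrade its order handling without touching its payment handling. With a single umbrella type there would be only one version number to negotiate.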

And for that reason I am torn between 2 and 3.

That being said, I guess we do a degree of 3 by defining a concept, for example person, as a content type and then having a type property on that content type that can essentially subclass it: player, manager, ball boy, etc.

I would also throw out there a couple of examples where people are trying to figure this out:

WRML.org: a project to standardize the way you define media types. It has a generic type that accepts a format and a schema property.  Format is XML, JSON, etc., and schema is your domain concepts: person, game, team.  This creates a very verbose Accept string in some cases but does solve the problems I raised (I think).

hal+json: a specification that believes your domain concepts should not be in the content type, but does extend JSON to include hypermedia controls.

Not sure if that helps but I guess my answer right now is probably a mix of 3 and 1.

Daniel

Paul Moore

Mar 19, 2012, 3:14:41 PM
to restinp...@googlegroups.com
Ian,

Many thanks for the considered response - very useful.

I have (re-)read Mike Amundsen's 'Building Hypermedia APIs with HTML5 and Node' and frankly I'm not sure it adequately addresses the issue (or I've not fully comprehended it).  Mike articulates the problem space well in the section 'The Type-Marshaling Dilemma', specifically drawing out 'Shared schema', 'URI construction', 'Payload decoration' and 'Narrow media types' as (previous) solutions to the typing problem.  'Payload decoration' and 'Narrow media types' resonate well with my earlier mail; 'Shared schema' obviously encompasses the XML approach.  'URI construction' I would suggest falls into the "don't go there" camp due to the tight coupling of type and URI (specifically to be avoided from Roy's edicts).

Mike suggests that 'Hypermedia' is the solution, and then outlines a 'Payload decoration' approach with additional hypermedia controls.  I understand the hypermedia controls point, but the text doesn't really explore why 'Payload decoration' is the answer.  I will pick this up with Mike separately.

Your point about the overlap of representation format and domain semantics took me a couple of passes, but it's a very useful thought-experiment approach for determining the "Domain Style" (in Mike's terminology) from specific to generic.

The separation of concerns of link relation semantics and response type, I agree with.  In the HTML 4.01 specification, Section 12 (Links), the definition of the 'type' attribute (content-type) is as follows:

"This attribute gives an advisory hint as to the content type of the content available at the link target address"

If the specified 'type' attribute is only an 'advisory hint' then my proposed use of 'rel' to convey response type is a step too far.  Conclusion: scrap Option (iv).

With regard to the choice between Option (i) aka 'Payload Decoration' and Options (ii) & (iii) 'Narrow media types' I'm still not convinced.  I hear loud and clear the advantages of using a 'Generic' media type that allows the domain semantics to be embedded ('Payload decoration'), in terms of allowing flexibility for further types, changes etc.  However, I think the impact on user-agents is largely the same whether I introduce a new (narrow) content-type or specify a new type of decorated payload.  This may further depend on the type and nature of the user-agent.

I have carried out some further research into the MIME / Internet Media Standards (mainly RFC2045, RFC2046 and RFC4288).  The following quote [RFC2045] is useful:

"The purpose of the Content-Type field is to describe the data contained in the body fully enough that the receiving user agent can pick an appropriate agent or mechanism to present the data to the user, or otherwise deal with the data in an appropriate manner. The value in this field is called a media type."

The HTTPbis (draft) also echoes this sentiment:

"Content-Type specifies the media type of the underlying data, which defines both the data format and how that data SHOULD be processed by the recipient (within the scope of the request method semantics)"

Further in HTTPbis, and within the context of Content-Type mismatch handling, there is the following:

"Implementers are encouraged to provide a means of disabling such "content sniffing" when it is used."

This is supported by a ticket raised against the draft, which suggests that the use of 'sniffing' (payload inspection) is disallowed by the (HTTP) specification itself.

I'm therefore becoming more convinced that the specific media-type is more the intent.  I have written to Ned Freed in this regard and will report back if and when he has a chance to respond.

Best regards

Paul

Paul Moore

Mar 19, 2012, 5:22:56 PM
to restinp...@googlegroups.com
Daniel,

Thanks for the response, and great to hear from others out there battling with the same!

It's interesting that you highlight the differences between JSON / XML so early on in your reply, it's exactly the difference that is making me think about how best to apply the HTTP tools, rather than 'porting' XML solutions to the JSON space.

Skipping on slightly, I have looked at WRML.org & hal+json a little, and whilst a standard approach would be attractive I somewhat feel that the effort would be better spent documenting how to use JSON and Media Types rather than stamping out more frameworks.  To paraphrase slightly: my encoding format is JSON, and I need to use the HTTP tools consistently, rather than just use a framework with which I may / may not agree.  From a personal perspective, I dislike various aspects of both WRML and hal+json, mainly because they both introduce 'custom' ways of doing standard things.  I digress...

The versioning point is interesting and I like the granularity you suggest.  Whether I'd need to use that granularity is a different question of course.  I suspect that Option (ii) 'Specific media types' provides the most granularity here.  In Option (iii) any versioning would apply to the whole Media Type I think, i.e. application/vnd.restbucks;type=order;version=1.0 - does the 'version' attribute apply to the 'order' type or simply to 'vnd.restbucks'? I suspect the latter.  The HTTP spec also provides 'Product Tokens' as a way of identifying the capabilities of a user-agent, so I'm also looking into that.
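
The scoping question can be seen by parsing the two forms. In this sketch the per-resource name `application/vnd.restbucks.order` is a hypothetical Option (ii) counterpart to the media type quoted above.

```python
from email.message import Message


def params_of(content_type):
    """Return (media_type, params) for a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_type(), dict(msg.get_params()[1:])


# Option (iii): umbrella type with 'type' and 'version' parameters.
umbrella = params_of("application/vnd.restbucks;type=order;version=1.0")

# Option (ii): hypothetical per-resource type with its own 'version'.
specific = params_of("application/vnd.restbucks.order;version=1.0")
```

In the umbrella form the parameters come back as a flat set of name/value pairs, so nothing in the syntax ties `version` to `type`; any such link would have to be stated in the media type's own documentation. In the per-resource form the version can only belong to the order type.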

Appreciate the view point

Best regards

Paul

Ian Robinson

Mar 21, 2012, 5:34:32 PM
to restinp...@googlegroups.com
Hi Paul

Thank you for the feedback, useful criticism, thoughts and further investigative efforts.

I was wondering today whether the issue is clouded in part by the overloading of the word 'type'. I don't think the 'type' in content-type/media-type means quite what it does when we refer to a type system.

Reflecting on the prohibition on content sniffing, I would distinguish between Web-level/transfer concerns, and application/domain concerns. How something is represented, how that format is to be processed, and whether it supports hypermedia, is, to my mind, a Web-level/transfer concern. What it is in the context of the application is, naturally, an application-level concern.

(As an aside, the question as to whether a representation format supports hypermedia is quite different from the question as to whether it supports, e.g., orders, or line items, or contact details. Why different? Because hypermedia - the ability to link - is a foundational mechanism of the Web; it's what makes the Web a web, rather than a collection of documents. At the transfer protocol level, application/xml tells us a representation format, but it tells us nothing about hypermedia. To discover the Web-friendliness of an application/xml formatted document, we have to peer inside. application/atom+xml, on the other hand, is a key into a representation format that intrinsically supports hypermedia. It advertises its Web-friendliness at the level of the Web, not at the level of the application.)

There are no 'actual types' (Mark Masse's phrase) on the Web, in the way we traditionally take type to refer to a type system; rather, I would say, there are simply documents whose processing model is expressed by a media type. The 'type' in media-type is no more than a file or document format processing model. In this I disagree with Mark Masse, whose WRML seems intended to add a type system to resources. In 2008 Stefan Tilkov asked the question on rest-discuss, 'What do you call the concept of "classes" or "types" of resources in your RESTful designs?', to which Roy Fielding replied: 'We call them resources. If they had types, they would be strongly coupled to whatever expected that type.'

We transfer documents; what we care about at the level of document transfer is how a document is represented and whether its processing model can be inferred by the client based on the Content-Type header. 'Typing' - what domain concept does this document communicate - is an application-level concern, not a transfer concern. This 'late' typing can, I believe, be evaluated as part of that media-type's processing model, if necessary. (This is how I interpret the 'how that data SHOULD be processed by the recipient' part of the HTTPbis draft: if the processing model allows me to adjust my behaviour, whether presentational behaviour or some other business behaviour, based on the value of a 'class' attribute or some other profile value inside the representation, then my handling of the response in line with that content-type's processing model will, as a nice side effect, bring about the desired application behaviours.)

I take the ban on 'sniffing' to be directed against a client having to peer inside an entity body simply to determine what kind of format and processing model it uses, and whether said format supports hypermedia. I'm happier to allow a client, once it has established the format, the 'how this is represented and processed', to peer inside an entity body to discover application or domain semantics - the 'what this represents in the context of an application'.
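
One way to read this distinction is as a two-stage client, sketched below. The media type name, the `PARSERS` table and the `"class"` member are hypothetical; the point is only the ordering of the two steps.

```python
import json

# Transfer level: media type -> processing model (here, just a parser).
PARSERS = {"application/vnd.restbucks+json": json.loads}


def process(content_type, body):
    """Choose the format from the header, then read domain semantics from the body."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    parser = PARSERS.get(media_type)
    if parser is None:
        # No sniffing: the format is never guessed from the bytes.
        raise ValueError(f"unsupported media type: {media_type}")
    doc = parser(body)
    # Application level: 'late' typing from inside the representation.
    return doc.get("class", [])
```

A body that happens to be valid JSON is still rejected when it arrives as `application/octet-stream`, whereas the same body under the known media type yields its 'class' values for the application to act on.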

All that said, my current position is a delicate balancing act based on a journey that began with my relative ignorance of the significance or use of media types, progressed through a rather fervid adherence to the application/order+xml school of thought, and is now paused at the nuanced separation of representation format and domain semantics. It adjusts bit by bit every time I apply it, and each time I engage in conversations like this.

Hopefully this is a useful contribution to your investigation and thoughts...

Kind regards

ian
