The recent discussions about a new media type for APIs have some underlying assumptions about M2M communication. But do we really agree on what M2M communication is anyway?
Is it important to define a new term (M2M) for discussion? Doesn't API (in the context of this group), or Web API in general, mean the same thing as what you are trying to express with M2M? I think adding more terms, if they don't add important insight, only makes things more confusing.
Let me try to illustrate with some different examples:
* Team A is implementing a background service that reads stuff from one API, transforms it in some way, and writes it back to another API - a "typical" systems integration service that keeps two legacy systems synchronized.
* Team B is implementing a classic mobile application that interacts with a public API. Let's say something like implementing a Twitter client for mobile phones. Team B hard-codes the UI at compile time with the knowledge of what the API can do for them. They build a UI with X specific input fields and transfer these inputs to the API when the user clicks "Submit".
* Team C is also implementing a mobile application for a public API, just like team B. But this time the API exposes hypermedia controls that define inputs and buttons and so on. They take that information and render a dynamic UI that changes at runtime based on what information the server returns.
* Team D is also implementing a mobile application. But they decide to simply embed the HTML version of the service exposed by the API (so they simply ignore that an API exists).
...
So, the different implementations have different characteristics, going from autonomous/tightly coupled/low bandwidth to human driven/loosely coupled/high bandwidth. Your choice of media type will be affected by these characteristics.
What kind of M2M project are you involved in?
Given what I believe you meant as the purpose of your examples, I think they are contrived, and I think ultimately they all have the same characteristics if viewed in a different context:
- Team A is simply reading from API #1 and writing to API #2. It's not a special case.
- Team B simply implements read and write of an API which, for argument's sake, we'll say is API #1.
- Team C is doing the same thing as Team B, only they are doing it with API #2, for example.
- Team D is using only the services that could be a subset of services provided by either API #1 or API #2.
The only real difference is that the implementor of API #1 published docs for URL construction and API #2 published a hypermedia API. But there's no reason other than their choices that they did so; API #1 could have been hypermedia and API #2 could have been URL construction.
The entire point of the proposal in the other thread was that the differences between the four teams and the two APIs are arbitrary and we'd be better off without arbitrary differences.
How do we get rid of arbitrary differences? We create a set of standards so all teams and all API developers have guidance for doing it the same way, and we do our best to cultivate open-source implementations of these standards so that the easiest thing for everyone to do is to follow the standards rather than attempt to roll their own.
-Mike
--
Seems like it's a bit difficult for me to convey my thoughts ... and maybe it's just me thinking too loud in public ...
I think we might both be struggling with a bit of that.
Let me start with your last sentence: "Ultimately it's about one machine talking to another" ... well, no, there are different approaches to how they talk together - some more loosely coupled than others and some requiring "in line" end user documentation whereas others do not.
I understand you see them as different; however, I see them as potentially just aspects of a unified approach. For example, why can't we have an API that uses a HAL-like format but that can also serve HTML forms when needed for human interaction?
Let's take HAL as an example. It is perfectly suited for autonomous background integrations like Team A is doing. It is also a perfect match for Team B. But Team C will be left without any kind of support for in line end user documentation, input definitions, labels, buttons and so on. So, it's M2M communication, yes, but HAL (or similar) lacks something that Team C needs. You can of course layer your own interpretation of end user docs properties on top of HAL - but then you go beyond the media type specs.
So, where am I going with all this? Maybe I am trying to say that using a "low fidelity" media type like HAL (that only concerns itself with links and embedded resources) will force clients to be more tightly coupled to the server implementation than clients that use a "high fidelity" media type which extends itself to include end user documentation, UI generation and various kinds of form definitions that allow the client to build dynamic user interfaces.
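To make the "low fidelity" versus "high fidelity" contrast concrete, here is a minimal sketch. Both payloads are invented for illustration and do not come from any real API; the first uses HAL's `_links` convention, the second uses Siren-style `actions` with `fields`. The `render_forms` helper is hypothetical and only shows that a client like Team C's has something to render in one case and nothing in the other.

```python
# Hypothetical payloads for illustration only; neither comes from a real API.

# HAL-style: links (and, optionally, embedded resources), nothing about inputs.
hal_doc = {
    "_links": {
        "self": {"href": "/tweets"},
        "post": {"href": "/tweets"},
    }
}

# Siren-style: the same resource, plus an action that describes its inputs.
siren_doc = {
    "links": [{"rel": ["self"], "href": "/tweets"}],
    "actions": [
        {
            "name": "post-tweet",
            "title": "Post a tweet",
            "method": "POST",
            "href": "/tweets",
            "fields": [
                {"name": "status", "type": "text", "title": "What's happening?"},
            ],
        }
    ],
}


def render_forms(doc):
    """Return one plain-text form per action the document describes."""
    forms = []
    for action in doc.get("actions", []):
        heading = action.get("title", action["name"])
        method = action.get("method", "GET")
        lines = [f"{heading} ({method} {action['href']})"]
        for field in action.get("fields", []):
            label = field.get("title", field["name"])
            kind = field.get("type", "text")
            lines.append(f"  {label}: <{kind} name={field['name']}>")
        lines.append("  [Submit]")
        forms.append("\n".join(lines))
    return forms


# The HAL-style document carries no input definitions, so nothing is rendered.
print(render_forms(hal_doc))

# The Siren-style document carries enough to build a form at runtime.
for form in render_forms(siren_doc):
    print(form)
```

With the HAL-style document the client has to know out of band which fields `/tweets` accepts, which is the tighter coupling described above.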
I assert you are describing what *is* today, but I assert that what is does not invalidate what *could be* tomorrow. For example, we can choose to accept that HAL can only be used for low fidelity, or we can demand more, i.e. that HAL evolve, or that we give up on HAL and use something else. (I just inadvertently paraphrased the SyFy channel's motto. How apropos. :)
I am also trying to say that it is a trade-off. If you want to mash things together and build your own interface then you must get "closer to the metal" to get the raw data (compared to simply using the HTML UI).
I believe the same arguments were made to explain why TBL's vision of the web was not attainable. And like TBL I assert that it is not a given.
Furthermore, I am trying to say that, well, perhaps these are simply the constraints that we have to live with - IF we want to create clients with hard-coded UIs, because we want to go beyond what is in some existing HTML UI, then our implementations will be more tightly coupled to the service compared to a dynamic UI generated, on the fly, by the server.
A lot of clients today have hard-coded UIs, and I would assert that many of them have hard-coded UIs because that was the path of least resistance. I'm proposing we focus on giving them an easier path to dynamic solutions.
There were many people who said forms had to be hard-coded in the late 80s and early 90s. But yet here we are; HTML5 provides rich functionality for declarative forms.
Nothing I'm proposing keeps anyone from continuing to hard-code UIs. If what I'm proposing comes to pass, then there will still be people who choose to hard-code. But there will also be many in this future who do not hard-code, because it will be easier to use the dynamic, uncoupled, hypermedia approach.
I think it's good for me to restate one aspect of the proposal: this is for the 80th percentile[1] use-case. The other 20 percent will continue with business as usual; no capabilities will be taken away from them. I sense that most of the push back is because those pushing back cannot envision how we could do this for 100% of use-cases, and we probably cannot. But that doesn't mean addressing 80% is not doable or not a good idea.
So we would be building a highway, but that wouldn't mean the side roads would be closed. (And my analogy fails in a happy way in that we wouldn't have to take anyone's home to build our highway.)
And - in the opposite corner - if we really want to make truly loosely coupled clients then we need a "high fidelity" media type ... and we will end up re-inventing HTML.
No, for web APIs we don't need the vast majority of things that HTML provides for the benefit of human visitors who are visual, auditory or other sensing. We can continue to leverage HTML for those things that require a human UI.
What we instead need are machine affordances that allow for hands-off workflow orchestration. We need standard ways to discover authentication methods for APIs, standard ways to recognize the services offered by APIs, and standard ways to transition from one API to another. Each of these "standards" would and should be small. An example might be a way to define a fragment of JSON that can represent the list of entities exposed by an API. HAL might store that fragment in one location and Siren in another.
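As a rough sketch of that last idea: one small, shared JSON fragment listing the entities an API exposes, carried at a different location depending on the host media type. Everything specific here is an assumption made for illustration. The fragment shape (`name`, `href`), the key names `api:entities` and `apiEntities`, and the locations in `FRAGMENT_PATHS` are hypothetical; neither HAL nor Siren defines them.

```python
# Hypothetical shared fragment: the list of entities an API exposes.
ENTITY_LIST = [
    {"name": "orders", "href": "/orders"},
    {"name": "customers", "href": "/customers"},
]

# Hypothetical locations. Neither HAL nor Siren specifies where such a
# fragment would live; these paths are assumptions for the sketch.
FRAGMENT_PATHS = {
    "application/hal+json": ["_embedded", "api:entities"],
    "application/vnd.siren+json": ["properties", "apiEntities"],
}

hal_doc = {
    "_links": {"self": {"href": "/"}},
    "_embedded": {"api:entities": ENTITY_LIST},
}

siren_doc = {
    "class": ["api-root"],
    "properties": {"apiEntities": ENTITY_LIST},
    "links": [{"rel": ["self"], "href": "/"}],
}


def find_entity_list(doc, media_type):
    """Walk to wherever the given media type keeps the shared fragment."""
    node = doc
    for key in FRAGMENT_PATHS[media_type]:
        node = node[key]
    return node


# The client code after this point is identical for both formats.
for media_type, doc in [
    ("application/hal+json", hal_doc),
    ("application/vnd.siren+json", siren_doc),
]:
    names = [entity["name"] for entity in find_entity_list(doc, media_type)]
    print(media_type, names)
```

The point of the sketch is that only the lookup path varies per media type; the fragment itself, and the client logic that consumes it, stay the same.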
My final point is that, due to this "spectrum" of flexibility and loose coupling we can gain by including or excluding hypermedia elements, input forms and so on, there will never be such a thing as a media type that fits all M2M needs - unless we can parameterize it with the level of inline docs and so on that the client requires ... and maybe that's an idea worth exploring? Or, perhaps, it is just two media types where one is a subset of the other?
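One way to picture the "parameterized" idea is a media type parameter on the `Accept` header that tells the server how much inline documentation and form detail to include. The sketch below is only an illustration under stated assumptions: the media type `application/vnd.example+json`, the `fidelity` parameter, and the `HIGH_FIDELITY_KEYS` set are all made up, and the parser is deliberately simplistic (a single media type, no quality values).

```python
# Hypothetical: top-level keys that only a "high fidelity" client wants.
HIGH_FIDELITY_KEYS = {"actions", "docs"}

# Hypothetical full representation held by the server.
FULL_DOC = {
    "links": [{"rel": ["self"], "href": "/tweets"}],
    "properties": {"count": 2},
    "actions": [{"name": "post-tweet", "method": "POST", "href": "/tweets"}],
    "docs": {"post-tweet": "Publishes a new status update."},
}


def parse_media_type(value):
    """Split 'type/subtype; key=value' into (media_type, params)."""
    parts = [part.strip() for part in value.split(";")]
    params = {}
    for part in parts[1:]:
        if "=" in part:
            key, _, val = part.partition("=")
            params[key.strip().lower()] = val.strip().strip('"')
    return parts[0].lower(), params


def represent(full_doc, accept):
    """Return the full document, or the low-fidelity subset of it."""
    _, params = parse_media_type(accept)
    if params.get("fidelity", "low") == "high":
        return dict(full_doc)
    return {k: v for k, v in full_doc.items() if k not in HIGH_FIDELITY_KEYS}


low = represent(FULL_DOC, "application/vnd.example+json")
high = represent(FULL_DOC, "application/vnd.example+json; fidelity=high")
print(sorted(low))
print(sorted(high))
```

In this sketch the low-fidelity representation is a strict subset of the high-fidelity one, which is also what the "two media types where one is a subset of the other" alternative would look like from the client's side.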
My dad taught me never to say "never" (except in this one context. :)
Again, let's focus on the goal of the 80th percentile, not 100%. The former is doable; the latter is (realistically speaking) not.