Once again: editable references


Oliver Drotbohm

Jun 18, 2020, 6:11:28 AM
to HAL Discuss
I spent last night skimming through this group for a topic I'm currently facing while designing an API, and for which I think HAL slightly complicates the picture. I found a few threads with a related focus, but they all leave some aspects of my situation open.

Context

1. I have two resources, one referencing the other.
2. The reference(s) need to be updatable by clients alongside all other properties of the resource.

HAL's default approach to references looks as follows:

GET /foo/4711

{
  _links : {
    bar : { href : "…" }
  },
  property : "value",
  …
}

Assume bar is fundamentally defined to meet certain requirements like the document structure, supported HTTP methods etc. The aspect I'd like to focus on here is that relationships to other resources are moved out of the default JSON document into _links (as can be seen with bar here). When it comes to updating the relationship, this creates a few challenges:

GET / PUT symmetry is broken. Unless the server excludes the relationships when handling a PUT request, it needs the entire HAL document to be sent via PUT, which seems non-idiomatic. If the client simply strips everything known to be HAL (_links, _embedded), it will PUT a document without relationship information. If we strictly follow PUT semantics, that request would wipe out the relationships, as the document doesn't naturally include them. To avoid that, the client needs extra knowledge about which of the links constitute editable relationships and which do not.
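
To make the symmetry problem concrete, here is a minimal sketch in plain JavaScript of what a client that removes the HAL-reserved properties would end up sending. The `stripHal` helper and the `/bars/42` target are made up for illustration and are not part of any library.

```javascript
// Hypothetical helper: drop the properties reserved by HAL.
function stripHal(halDocument) {
  const { _links, _embedded, ...plain } = halDocument;
  return plain;
}

// Representation as it might be returned by GET /foo/4711 (values are made up).
const received = {
  _links: { self: { href: "/foo/4711" }, bar: { href: "/bars/42" } },
  property: "value",
};

// Body the client would PUT back: the "bar" relationship is no longer in it.
const putBody = stripHal(received);
console.log(JSON.stringify(putBody)); // {"property":"value"}
```

A server applying strict PUT semantics to `putBody` has no way to tell "leave bar alone" from "remove bar".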

So far I've worked around that issue by letting the bar link point to a dedicated resource that would manage the relationship:

{
  _links : {
    bar : { href : "/foo/4711/bar" }
  },
  property : "value",
  …
}

If a client followed that link, it would see the representation of the resource ultimately pointed to. It would be able to wipe the relationship by issuing a DELETE request, and PUT would accept requests with text/uri-list as media type to update the relationship (which also works nicely for 1:n relationships). However, that leads to a violation of requirement 2, which was to let the client update other properties and the relationships at the same time. The UI the client team is thinking about building is supposed to use a simple text field with auto-complete, submitted alongside the rest of the form. If the relationships were editable in a separate dialog or tab, the solution outlined above would just work. However, I can't and actually don't want to put constraints on the clients for such technical reasons.
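
For illustration, a server-side sketch of reading such a text/uri-list body could look like the following. It assumes the format described in RFC 2483 (one URI per line, lines starting with "#" are comments). The `parseUriList` name and the example.com URIs are hypothetical.

```javascript
// Parse a text/uri-list body: one URI per line, "#" lines are comments.
function parseUriList(body) {
  return body
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== "" && !line.startsWith("#"));
}

// Hypothetical body of PUT /foo/4711/bar for a 1:n relationship.
const body =
  "# bars assigned to foo 4711\r\n" +
  "https://api.example.com/bars/42\r\n" +
  "https://api.example.com/bars/43\r\n";

const targets = parseUriList(body);
console.log(targets.length); // 2
```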

Of course I could define which relationships are updatable (e.g. through HAL Forms), but that complicates the picture on both the client and the server: either the client always needs to add those additional attributes (one programmer mistake and the reference is wiped), or the server needs extra smarts to only process the relationships if they're given and thus partially violates PUT. I could let the client switch to PATCH to solve that, losing idempotency on the way.

It feels like quite a tricky beast for something that doesn't look like an exotic requirement.

Thanks for your input!
Ollie

Jørn Wildt

Jun 18, 2020, 6:57:10 AM
to hal-d...@googlegroups.com
My guess is that your relationship update is part of a certain business operation. What I usually do is to model business operations as actions on the resource. In this case I would add a "BusinessOperationX" link to your resource. Clients would then (a priori) know that POSTing to that action link means "Here is the data you need to perform your business operation server side - I don't care what it involves, just do it". The data then contains a reference to the other resource that you want to relate to the first resource. The reference may be either a simple identifier or a URL depending on the scenario.

POSTing (or PUTting) URLs as relationships is generally a bit dangerous in closed systems, as a client might POST a URL pointing to something completely unexpected (google.com for instance?). On the other hand, if you allow links between unrelated systems, then URLs are good to go - but maybe you should GET the resource pointed to and validate it before allowing the update.

Idempotency is solved by supplying a transaction/operation identifier (GUID) with the rest of the data. The server then keeps track of already executed actions to avoid multiple executions of the same operation. 
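
A minimal sketch of that bookkeeping, assuming the client sends an `operationId` field next to the data. The field name, the `handleOperation` function and the in-memory Map are hypothetical; a real server would persist the record.

```javascript
// Record of operations that were already executed, keyed by operation id.
const executed = new Map();

// Hypothetical handler for a POST to the action link.
function handleOperation(request, perform) {
  if (executed.has(request.operationId)) {
    // Replay: return the stored result instead of executing again.
    return executed.get(request.operationId);
  }
  const result = perform(request.data);
  executed.set(request.operationId, result);
  return result;
}

// Made-up operation that counts how often it really runs.
let executions = 0;
const perform = (data) => ({ linkedTo: data.bar, run: ++executions });

const request = { operationId: "op-1", data: { bar: "/bars/42" } };
const first = handleOperation(request, perform);
const second = handleOperation(request, perform); // same id, not executed again
```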

/Jørn


Oliver Drotbohm

Jun 18, 2020, 7:13:23 AM
to HAL Discuss
Thanks for your input Jørn, replies inline.


On Thursday, June 18, 2020 at 12:57:10 PM UTC+2, Jørn Wildt wrote:
My guess is that your relationship update is part of a certain business operation. What I usually do is to model business operations as actions on the resource. In this case I would add a "BusinessOperationX" link to your resource. Clients would then (a priori) know that POSTing to that action link means "Here is the data you need to perform your business operation server side - I don't care what it involves, just do it". The data then contains a reference to the other resource that you want to relate to the first resource. The reference may be either a simple identifier or a URL depending on the scenario.

The business operation here is literally: update the details of Foo. As part of that editing, the relationships shall be edited, too. Imagine some person details that include some kind of assignment of other people to that person. Isn't what you suggest roughly what I described in my "workaround"? Having a dedicated resource that gets its own lifecycle and specification? As indicated, that works fine if that business action is somehow separated from the rest of the editing. But unfortunately that separation is not something I want to (or can) impose on the client.

Right now, all our forms need extra acknowledgement (pressing a button like "save" or the like) to actually persist the data. There's no "persist the field as you change it". I could of course model the resources according to my "workaround" and let the client immediately issue the requests, but that would leave the form in a partially applied state: the references would already have been updated, while applying everything else would still require the extra acknowledgement. Also, leaving that form after making changes to the references would then not roll those changes back, or at least extra measures would have to be taken to do that. All of these are complexities I would impose on the client.
 
POSTing (or PUTting) URLs as relationships is generally a bit dangerous in closed systems, as a client might POST a URL pointing to something completely unexpected (google.com for instance?). On the other hand, if you allow links between unrelated systems, then URLs are good to go - but maybe you should GET the resource pointed to and validate it before allowing the update.

The server of course would validate the submitted payload and make sure the submitted URIs point to something it expects. I don't see any difference to anything else that could be completely bogus. My primary reason for using URIs is that that's how you identify resources and it's the established way to do so. I.e. I want to avoid exposing backend IDs separately, even if they're part of the URI. The client should simply not think of anything but URIs to identify related things.
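
As a rough sketch of such a check, assuming all valid "bar" resources live under one path prefix of the API's own origin. `API_ORIGIN`, `BAR_PREFIX` and `isAcceptableBarReference` are made-up names. This only checks the shape of the URI; the server would still have to resolve it to an existing resource.

```javascript
// Assumed values: the API's origin and the prefix of valid "bar" URIs.
const API_ORIGIN = "https://api.example.com";
const BAR_PREFIX = "/bars/";

function isAcceptableBarReference(uri) {
  let url;
  try {
    // Resolves relative references against the API origin as well.
    url = new URL(uri, API_ORIGIN);
  } catch {
    return false; // not parseable as a URI at all
  }
  return url.origin === API_ORIGIN && url.pathname.startsWith(BAR_PREFIX);
}

console.log(isAcceptableBarReference("https://api.example.com/bars/42")); // true
console.log(isAcceptableBarReference("https://google.com/")); // false
```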
 
Idempotency is solved by supplying a transaction/operation identifier (GUID) with the rest of the data. The server then keeps track of already executed actions to avoid multiple executions of the same operation. 

Sure. I'm aware of that, but it seems a bit over the top to require the clients to add all of this complexity when, in the casual client developer's mind, all they try to do is get some JSON, tweak some fields according to rules specified somewhere and send those fields back to the server.

Cheers,
Ollie
 

Jason Desrosiers

Jun 18, 2020, 11:29:07 AM
to HAL Discuss
Hi Oliver,

That's an interesting problem. Here are several options I can think of.

PUT HAL

If you want to PUT links, the best way to do that is with a media type that understands links. I don't understand the resistance to PUTting HAL in the threads you linked (I only briefly skimmed them). If the client already understands HAL, I see no reason to go to great lengths to avoid using it for what it's designed to do, which is describing data together with links. Honestly, I don't see how you can PUT JSON, then GET HAL and expect it to have links without breaking the semantics of PUT.

PUT Link headers + JSON

If you really don't want to use HAL for some reason, you still have to encode the links in the PUT request somehow. One approach I've seen people consider is using Link headers in PUT requests. This would allow you to use JSON without losing links. There are a few drawbacks to this approach. One problem is that the semantics of a Link header in a request rather than a response were never defined. It seems to make sense, but it was never standardized, so someone else might interpret it differently or it might get standardized with different semantics in the future. The other problem is that I don't think headers are an appropriate place to put this kind of data. In my opinion, HTTP headers are for meta-data about the interaction. Things like describing the content type, or what content type the client can accept, fall into that category. Things that need to be persisted with the data do not belong there. However, you should take that with a grain of salt because others have disagreed with me on this point and believe HTTP headers are a great place to put any kind of meta-data.
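
For reference, such a request might be put together as sketched below. The header value follows the `<target>; rel="relation"` syntax of RFC 8288; the `formatLink` helper and the `/bars/42` target are hypothetical, and nothing is actually sent. How a server interprets a Link header on a request remains an assumption, as noted above.

```javascript
// Format one Link header value: <target>; rel="relation".
function formatLink(rel, href) {
  return `<${href}>; rel="${rel}"`;
}

// Hypothetical request description; this sketch does not send anything.
const request = {
  method: "PUT",
  url: "/foo/4711",
  headers: {
    "Content-Type": "application/json",
    Link: formatLink("bar", "/bars/42"),
  },
  body: JSON.stringify({ property: "value" }),
};

console.log(request.headers.Link); // </bars/42>; rel="bar"
```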

Divergent HAL and JSON representations

HAL and JSON are just two ways to represent a resource. When the resource is serialized as HAL, the links are encoded under the _links property. When the resource is serialized as JSON, those links should still be serialized somehow. For example,

{
  property : "value",
  bar_url: "...",
  …
}

Then the client can work with JSON and send JSON without losing the links or violating PUT semantics. It's perfectly valid to PUT a representation in one content type and GET with another content type without breaking any HTTP semantics. The problem with this approach is that once you encode those links into JSON, they lose the semantics of being links. Both clients and the server need to have special knowledge that a particular JSON property should be treated as a link. This means neither the clients nor the server can be as generic as they could/should be.
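
A small sketch of the mapping both sides would need to share. The `LINK_PROPERTIES` table is exactly the "special knowledge" mentioned above; all names and values here are made up for illustration.

```javascript
// Assumed out-of-band knowledge: JSON property name -> link relation.
const LINK_PROPERTIES = { bar_url: "bar" };

// HAL representation -> plain JSON representation.
function halToPlain(hal) {
  const { _links = {}, _embedded, ...plain } = hal;
  for (const [property, rel] of Object.entries(LINK_PROPERTIES)) {
    if (_links[rel]) plain[property] = _links[rel].href;
  }
  return plain;
}

// Plain JSON representation -> HAL representation.
function plainToHal(plain) {
  const hal = { _links: {} };
  for (const [key, value] of Object.entries(plain)) {
    if (key in LINK_PROPERTIES) {
      hal._links[LINK_PROPERTIES[key]] = { href: value };
    } else {
      hal[key] = value;
    }
  }
  return hal;
}

const plain = halToPlain({ _links: { bar: { href: "/bars/42" } }, property: "value" });
console.log(JSON.stringify(plain)); // {"property":"value","bar_url":"/bars/42"}
```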

I hope that was helpful. I'll be curious to hear about what you end up doing.

Jason

Evert Pot

Jun 22, 2020, 2:51:48 AM
to HAL Discuss
What we do might be a bit unusual, but I thought it might be interesting.

Some people have a distinction between 'navigational links' (and sometimes put them only in Link headers) and links that are not meta-data and part of the datamodel.
When we do PUT requests, we basically allow links to be omitted, but if they are being updated they can be specified with the _links object.

We also don't include _embedded on PUT, because we consider that a 'hack' to speed things up, but not a semantic part of the result of GET.

So we're kinda making an exception for _links. If they are there, we process them, but they are optional... accepting that this could be considered technically incorrect.

But for the most part we consider the PUT/GET symmetry to be semantics based, not octet-based.
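
A sketch of what such lenient PUT handling could look like on the server. This is not the actual implementation described above; it assumes that a `_links` object, when present, replaces the stored links as a whole, and the `applyPut` name is hypothetical.

```javascript
// Properties are replaced as usual for PUT. Links are only touched when
// the request body carries a _links object; _embedded is always ignored.
function applyPut(stored, body) {
  const { _links, _embedded, ...properties } = body;
  return {
    ...properties,
    _links: _links === undefined ? stored._links : _links,
  };
}

const stored = {
  _links: { author: { href: "https://evertpot.com/" } },
  title: "Hello",
};

// PUT without _links: the stored links survive.
const kept = applyPut(stored, { title: "V2!!" });

// PUT with _links: the links are taken from the request.
const changed = applyPut(stored, {
  _links: { author: { href: "https://twitter.com/evertp" } },
  title: "V2!!",
});
```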

Our client (https://github.com/evert/ketting) will from version 6 onwards re-encode all links by default. This is currently in alpha. This effectively means that if you don't do anything, all links will be sent back.

So say we do a GET request and get this back (not valid HAL but summarized for brevity)

{
  _links: {
    self: '/foo/bar',
    collection: '/foo',
    author: 'https://evertpot.com/',
  },
  title: 'Hello'
}


Then we can interact with this object in this way:

const resource = await ketting.follow('...some rel');
const state = await resource.get();

state.links.set('author', 'https://twitter.com/evertp');
state.data.title = 'V2!!';

await resource.put(state);

This will result in a PUT request with all the original links intact, but 'author' updated.

Evert

Fredrik Sjöblom

Jun 23, 2020, 3:27:32 PM
to HAL Discuss
Hi,

I would not dismiss a PATCH format on the premise of losing idempotency. A custom PATCH format can always introduce specific constraints and rules that make PATCH quite safe. Not necessarily idempotent, but is that actually a hard requirement?

You could for example define a PATCH media type that behaves more or less like a PUT but by definition only allows changes to specific link relations, ignoring the rest. So if you sent back the resource as is, or only containing the relevant links, the end result would be the same.

Existing PATCH formats like http://jsonpatch.com/ might actually work for this if you introduce additional processing constraints that only allow change operations on some link relations. And you can always require that requests include a test operation on the value of a suitable state property to minimize the risk of missed state changes.
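
As a sketch of such additional processing constraints on a JSON Patch (RFC 6902) document: the rule set below, the `version` state property and the `isAcceptablePatch` name are assumptions made for illustration only.

```javascript
// Assumed rules: only "replace" on the href of these link relations is
// allowed, and every patch must contain at least one "test" operation.
const PATCHABLE_RELS = new Set(["bar"]);

function isAcceptablePatch(operations) {
  const hasTest = operations.some((op) => op.op === "test");
  const changesAllowed = operations
    .filter((op) => op.op !== "test")
    .every((op) => {
      const match = /^\/_links\/([^/]+)\/href$/.exec(op.path);
      return op.op === "replace" && match !== null && PATCHABLE_RELS.has(match[1]);
    });
  return hasTest && changesAllowed;
}

// Hypothetical patch: guard on a state property, then change the link.
const patch = [
  { op: "test", path: "/version", value: 7 },
  { op: "replace", path: "/_links/bar/href", value: "/bars/43" },
];

console.log(isAcceptablePatch(patch)); // true
```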

Experience from using Java libraries for processing specific JSON Patch operations has actually been quite positive. So using a generic PATCH format doesn't require that you allow patching everything on the resources. Specific constraints can still be dictated by the business logic in question.


-Fredrik
