Looking for feedback on a simple python HAL client

Matt Clark

Feb 9, 2015, 11:43:23 AM
to hal-d...@googlegroups.com
After being frustrated in various ways by current HAL clients (both python and javascript) I made one of my own in Python, called it HALEasy, and thought I'd be brave and let the people here give some feedback on my motivations and the resulting code.

The motivations were:
  • Clear from the code what actions are happening. There are no implicit names or actions apart from 'GET' so it is always clear to someone reading code that uses it what the class will do.  I find restnavigator for example to be rather difficult to follow.
  • Reuse existing mature libraries. All HAL processing is done by the dougrain library, all HTTP processing by requests.  HALEasy itself is <100 SLOC
  • Easy access to full URL, scheme+host and path+query+fragment of the document.  As a client sometimes I need relative paths, sometimes hosts, sometimes full URLs.
  • Uniform access to links regardless of whether there is a single one of a kind or more than one. With HALEasy you can always iterate over even single instances of a rel.
  • No distinction between embedded and linked resources as far as client usage goes.  Actually the current version just ignores embedded resources.  HTTP has perfectly good mechanisms for caching.  If you need to get the embedded resources you can do so via the .doc property of the HALEasy object
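
A minimal sketch of two of the motivations above (uniform access to links, and easy access to the parts of a URL). This is not HALEasy's actual code: `links_for` and `split_url` are hypothetical helper names, and the document and URL are made-up examples.

```python
from urllib.parse import urlsplit


def links_for(doc, rel):
    """Return the links for `rel` as a list.

    HAL allows a rel to map to either a single link object or an array
    of them, so a single object is wrapped in a list here.
    """
    value = doc.get("_links", {}).get(rel, [])
    return value if isinstance(value, list) else [value]


def split_url(url):
    """Split a URL into (scheme+host, path+query+fragment)."""
    parts = urlsplit(url)
    host = f"{parts.scheme}://{parts.netloc}"
    path = parts.path
    if parts.query:
        path += "?" + parts.query
    if parts.fragment:
        path += "#" + parts.fragment
    return host, path


# Made-up HAL document: "self" is a single link, "item" is an array.
doc = {
    "_links": {
        "self": {"href": "/orders/1"},
        "item": [{"href": "/items/1"}, {"href": "/items/2"}],
    }
}

# Both rels can be iterated the same way.
self_hrefs = [link["href"] for link in links_for(doc, "self")]
item_hrefs = [link["href"] for link in links_for(doc, "item")]

host, path = split_url("http://api.example.com/orders/1?page=2#top")
```

Calling code then never has to branch on whether the server sent one link or several.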
The code is at:


and yes I know the test coverage is best described as 'not entirely absent', but it covers the cases I need for my project right now.

Chris DaMour

Feb 13, 2015, 11:08:03 PM
to hal-d...@googlegroups.com
>HTTP has perfectly good mechanisms for caching

how do you cache what you haven't requested?

Matt Clark

Feb 14, 2015, 2:49:54 AM
to hal-d...@googlegroups.com
Ah, yes, I thought that might be a bit controversial.  From practical experience building HAL APIs the _embedded pattern causes a lot of confusion.  First of all, within an organisation you need to decide pretty firmly whether you're going to allow people to embed partial representations or only whole representations.  If you do partials then that's quite hard to document well and can cause funny bugs in conjunction with the practical reality of not using schemas - how do you know if the full representation has the "foo" property set by looking at the embedded one?  You don't.

If you then decide to embed full representations, what exactly have you gained?  The data to build the representation has to be fetched and the code to build it executed.  You may save the users an extra request, but then some other users will complain because your response is loaded down with resources _they_ don't want.

Further, when you embed a resource, you lose all that nice HTTP caching information in the headers.  Worse, you have to share that information across the primary and embedded resources.

So my strong personal preference in any real HAL API will always be to ignore the _embedded pattern completely, let clients request what they want, and indeed really the API user should have all of that hidden from them by the HAL client.  It really makes no sense to have two different methods for getting the same resource.

Of course if you take a different view on the issue of partial representations then my entire argument is moot, but literally days of my life spent trying to get multiple teams to agree on what to embed or not have left me scarred!

Chris DaMour

Feb 14, 2015, 3:01:22 AM
to hal-d...@googlegroups.com

>Further, when you embed a resource, you lose all that nice HTTP caching information in the headers.  Worse, you have to share that information across the primary and embedded resources.

that is very true, definitely the trade off with hypertext cache pattern, and i think this is why we'll all be happy when HTTP2 really gets on our roadmaps (even though browser support seems REALLY good).

Have you been using the prefetch link relationship to indicate to the client stuff it should go grab on a background i/o channel to pre-fill the cache?  The performance gains we see from hypertext (especially on cell network) just can't be beat.

> If you do partials then that's quite hard to document well and can cause funny bugs in conjunction with the practical reality of not using schemas 

I think we stumbled on a really easy way to deal with this.  If a partial representation is provided then the server must include the "full" link relationship so the client can get the full representation.  So say you have an item and you want to know if it has children.  the probe no longer becomes a check for a "child" relationship, but a check for a "child" and a "full", if not and you really need to know, get the full, then check.  It is actually rather simple in code and hasn't come up much for us.  
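
A rough Python sketch of that probe, under some assumptions: documents are plain parsed-JSON dicts, the "full" rel holds a single link object, and `fetch` is a hypothetical callable that takes an href and returns the parsed document. The names `has_rel` and `has_child` are illustrative only.

```python
def has_rel(doc, rel):
    """True if the document advertises at least one link for `rel`."""
    return rel in doc.get("_links", {})


def has_child(doc, fetch):
    """Decide whether a resource has children, allowing for partials.

    - "child" present: yes.
    - no "child" but "full" present: this is a partial, so fetch the
      full representation and check that instead.
    - neither: this already is the full representation, so no.
    """
    if has_rel(doc, "child"):
        return True
    if has_rel(doc, "full"):
        full_doc = fetch(doc["_links"]["full"]["href"])
        return has_rel(full_doc, "child")
    return False


# Made-up documents standing in for server responses.
full_item = {
    "_links": {
        "self": {"href": "/items/1"},
        "child": {"href": "/items/1/children/7"},
    }
}
partial_item = {
    "_links": {
        "self": {"href": "/items/1"},
        "full": {"href": "/items/1"},
    }
}
leaf_item = {"_links": {"self": {"href": "/items/2"}}}

fetched = []  # records which hrefs were actually requested


def fake_fetch(href):
    """Stand-in for an HTTP GET; serves from an in-memory table."""
    fetched.append(href)
    return {"/items/1": full_item}[href]


partial_result = has_child(partial_item, fake_fetch)
leaf_result = has_child(leaf_item, fake_fetch)
```

Only the partial triggers a request; the leaf is answered from the document already in hand.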

(Invalid representations...no idea what to do with those, still haven't found a use case)

Also a nice strategy is to put a profile link that indicates it's a partial resource.  That's dead simple too...but then it becomes a question of how to get the full?  I guess you could use self link by some convention...but that's kinda weird...

Personally I can't quite get past abandoning _embedded, and enough APIs use it that it makes the client a little less applicable to the various APIs out there.

Another thing we've been using the _embedded for is error responses.  Say you submit a request to build an "item" with some fields.   If that item doesn't pass validation, we've been returning an error object that contains error info (vnd.error actually) that embeds the "item" json representation to bind to the UI.  It's been a nice pattern cause it allows the API to make correction to the item and present them in the ui just as if they were using an edit form (instead of a create).

Matt Clark

Feb 14, 2015, 4:49:19 AM
to hal-d...@googlegroups.com
I think the points you make are really good, and I will definitely add _embedded support to the client, but I will probably try to spare users of the client from having to be aware of it for now.  The error example you give is a particularly good use case for _embedded now I think about it, but it still seems that as a user I shouldn't really _have_ to care whether res.link(rel="error").follow() was embedded or not...
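
One way that transparency could look, as a sketch only: `follow` here is a hypothetical free function (not the HALEasy method mentioned above), documents are plain dicts, and `fetch` is an assumed callable that performs the GET.

```python
def follow(doc, rel, fetch):
    """Return the resource for `rel`, preferring an embedded copy.

    If the server embedded the resource it is returned directly;
    otherwise the link's href is requested via `fetch`.
    """
    embedded = doc.get("_embedded", {})
    if rel in embedded:
        return embedded[rel]
    link = doc["_links"][rel]
    return fetch(link["href"])


# Made-up responses: one embeds the error, the other only links to it.
with_embedded = {
    "_links": {"error": {"href": "/errors/42"}},
    "_embedded": {"error": {"message": "name is required"}},
}
link_only = {"_links": {"error": {"href": "/errors/42"}}}

requests_made = []


def fake_fetch(href):
    """Stand-in for an HTTP GET."""
    requests_made.append(href)
    return {"message": "name is required"}


first = follow(with_embedded, "error", fake_fetch)
second = follow(link_only, "error", fake_fetch)
```

The caller gets the same result either way; only the second call costs a request.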

Chris DaMour

Feb 14, 2015, 4:55:18 AM
to hal-d...@googlegroups.com
yeah our java hal client that we built works exactly like that.

@Link(rel ="some-rel")
ResultType someMethod()

if it can resolve it locally, good, otherwise go make the request.

then we have stuff like

@Link(rel ="some-rel")
boolean canSomeMethod()

for if blocks

BUT...it starts to break down when you get into interesting environments like android where i/o tasks have to happen on another thread.  At that point we added

resource.canResolveLinkLocal("some-rel") to be an indicator.  It works, but devs mostly just end up writing everything based on the results when they looked at them..which is ok.  removing something from embedded is a breaking change in most cases.

what i really want to get to is:

@Link(rel ="some-rel")
Observable<ResultType> someMethod()

so they can just use Rx's awesome conventions to bounce around threads as needed by giving the library the scheduler to use.

You probably won't have such problems in python.

Matt Clark

Feb 14, 2015, 5:06:24 AM
to hal-d...@googlegroups.com
Well, Reactive Python exists (https://github.com/ReactiveX/RxPY), so it might be fun to do it anyway so you can have one pattern across multiple client languages :-)  The more we can create reusable client usage patterns the less the average developer has to think about HAL itself, which I'm pretty sure is a Good Thing

Merlyn Morgan-Graham

Feb 15, 2015, 11:39:21 PM
to hal-d...@googlegroups.com
Embedded objects aren't necessarily a "partial" representation. They can be totally different representations.

Non-exhaustively, _embedded can be used to:
1) Reduce calls to the server that aren't driven by the client
2) Surface computed properties (e.g. a property for the count of items in a linked collection, without surfacing every sub-link)
3) Surface display-specific properties (like "searchResultType": "Person", "searchResultType": "Video", etc). On a link, "title" could work instead, but there might be more than one such display property to expose.

The fewer-calls use case lends itself to a transparent interface, but computed properties and display-specific properties don't map as well.

Also, _embedded items aren't required to have a corresponding entry in _links.

I like the idea of an interface that makes HAL easy to use, but I wouldn't try to hide the fact that it is still JSON + hyperlinks + embedded. All three things are so different that I wouldn't try to bunch them together.

But being able to request the resource at the end of an _embedded object's self link is pretty important, and should be similar to requesting a _link, even if it isn't identical.
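
As a sketch of what that might look like in Python, keeping embedded objects visible as their own thing while still making the self link easy to request (`embedded_items`, `request_self` and `fetch` are hypothetical names, and the document is made up):

```python
def embedded_items(doc, rel):
    """Return the embedded resources for `rel` as a list.

    Like links, an embedded rel may hold one object or an array.
    """
    value = doc.get("_embedded", {}).get(rel, [])
    return value if isinstance(value, list) else [value]


def request_self(embedded_doc, fetch):
    """Request the resource at an embedded object's self link."""
    href = embedded_doc["_links"]["self"]["href"]
    return fetch(href)


# Made-up search result: the embedded entries carry display-specific
# properties and have no matching entry in the parent's _links.
search = {
    "_links": {"self": {"href": "/search?q=hal"}},
    "_embedded": {
        "result": [
            {"searchResultType": "Person",
             "_links": {"self": {"href": "/people/1"}}},
            {"searchResultType": "Video",
             "_links": {"self": {"href": "/videos/9"}}},
        ]
    },
}


def fake_fetch(href):
    """Stand-in for an HTTP GET."""
    return {"_links": {"self": {"href": href}}, "fetched": True}


results = embedded_items(search, "result")
types = [item["searchResultType"] for item in results]
first_full = request_self(results[0], fake_fetch)
```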