an idea for solving the inline/subresource problem


Dustin Getz

Nov 27, 2013, 5:05:17 PM
to collect...@googlegroups.com
  • Design your resources such that all objects are versioned
  • Every time a resource is modified, it increments the version
  • then the URI is a function of just ID + version, which means the URI is a permalink
  • a permalink'ed representation is a snapshot in time, it will never ever change
  • which means it's infinitely cacheable in a browser
  • If you squint at this a bit, it means a URI is kind of like a pointer in C
  • you can have pointers to subresources now, or build arbitrary recursive data structures
  • Since everything is perfectly cacheable, you don't have to care about querying a bunch of stuff at once
  • so you don't have to ever inline anything! Query freely and you will almost always hit cache
  • If you squint even harder this looks like a persistent data structure (in the functional programming sense)
  • If you mutate something, the self-describing URLs change and the HTTP client will make a request (cache miss), but only the first time
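
A minimal Python sketch of the list above, not the poster's implementation. The URI layout `/{kind}/{id}/v{version}` and the `Store` and `CachingClient` classes are hypothetical names invented for illustration. The only point is that a URI derived from ID + version never needs invalidation, so a mutation shows up as exactly one cache miss.

```python
from dataclasses import dataclass, field


def permalink(kind: str, obj_id: int, version: int) -> str:
    """Hypothetical URI scheme: the URI is a pure function of ID + version."""
    return f"/{kind}/{obj_id}/v{version}"


@dataclass
class Store:
    """Stand-in for the origin server: every write creates a new version."""
    snapshots: dict = field(default_factory=dict)  # permalink -> immutable body
    latest: dict = field(default_factory=dict)     # (kind, id) -> latest version

    def write(self, kind: str, obj_id: int, body: dict) -> str:
        version = self.latest.get((kind, obj_id), 0) + 1
        self.latest[(kind, obj_id)] = version
        uri = permalink(kind, obj_id, version)
        self.snapshots[uri] = dict(body)  # copy: a snapshot never changes
        return uri

    def get(self, uri: str) -> dict:
        return self.snapshots[uri]


@dataclass
class CachingClient:
    """Client cache with no invalidation logic at all."""
    store: Store
    cache: dict = field(default_factory=dict)
    network_requests: int = 0

    def get(self, uri: str) -> dict:
        if uri not in self.cache:  # cache miss: touch the network once
            self.network_requests += 1
            self.cache[uri] = self.store.get(uri)
        return self.cache[uri]


store = Store()
client = CachingClient(store)

v1 = store.write("post", 7, {"title": "draft"})
client.get(v1)
client.get(v1)  # served from cache
v2 = store.write("post", 7, {"title": "final"})
client.get(v2)  # new URI, so one cache miss
client.get(v2)  # served from cache
```

Four reads cost two network requests here, and the old permalink still resolves to the old snapshot after the write.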

mca

Nov 27, 2013, 5:22:57 PM
to collect...@googlegroups.com
i like the goal here (making inlining either a non-issue or trivial) and would like to offer another option on how to make that possible:
  • don't start w/ objects, start w/ representations - there is no versioning to expose to client apps miles (and years) removed from the origin server.
  • don't use resources as your inflection point, use templates and links (and their identifiers)
  • the URI is meaningless to clients and the rel/name of the link/template is the important value - the rel is the "permalink"
  • which means caching is orthogonal and always available
  • if you squint at this a bit, it means the rel is like a pointer (in any language)
  • there is no need for a hierarchy (sub-) since a pointer is just that. this is more like the "tagging" universe instead of a hierarchy
  • since querying is orthogonal, you don't have to worry about whether you need to "query a bunch of stuff at once" or not. you can decide any time and even change your decision later w/o breaking anything.  
  • you don't have to inline anything ...
  • if you squint harder you realize that you don't need to treat representations on the web as if they are data structures or treat the Web as a functional programming space. the Web is a unique programming space based on message models and state transitions, not objects and behavior/functions.
  • if you mutate something, the rel still points to the latest edition (version, copy, lifetime, etc.) and the client will make requests just like before w/o ever knowing if it came from an origin server or a cached copy at some other location (including pre-fetched, pre-parsed instances of state data stored at another location entirely)
just another POV. keep in mind that AFAIK, both POVs are supported using Cj.
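
A rough Python sketch of the rel-first approach described in the list above. The document follows the Collection+JSON layout (`collection.links[]`, `collection.items[].links[]`), but the hrefs and the rel names `feed` and `blog` are made up for illustration, and `hrefs_for_rel` is a hypothetical helper, not part of any Cj library.

```python
# A trimmed Collection+JSON document; the hrefs and rel names are made up.
representation = {
    "collection": {
        "version": "1.0",
        "href": "http://example.org/friends/",
        "links": [
            {"rel": "feed", "href": "http://example.org/friends/rss"},
        ],
        "items": [
            {
                "href": "http://example.org/friends/jdoe",
                "data": [{"name": "full-name", "value": "J. Doe"}],
                "links": [
                    {"rel": "blog", "href": "http://example.org/blogs/jdoe"},
                ],
            }
        ],
    }
}


def hrefs_for_rel(links: list, rel: str) -> list:
    """Return every href whose link carries the given rel.

    The client matches on the rel only; the href is treated as opaque.
    """
    return [link["href"] for link in links if link["rel"] == rel]


collection = representation["collection"]
feed_hrefs = hrefs_for_rel(collection["links"], "feed")
blog_hrefs = hrefs_for_rel(collection["items"][0]["links"], "blog")
```

Because the client keys on the rel rather than on the URL, the server can move or regenerate the target without breaking the client. The helper returns a list because the same rel can appear on several links.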

--
You received this message because you are subscribed to the Google Groups "Collection+JSON" group.
To unsubscribe from this group and stop receiving emails from it, send an email to collectionjso...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Dustin Getz

Nov 27, 2013, 5:53:01 PM
to collect...@googlegroups.com
Thanks, i just read that several times. Most of it makes great sense.

> since querying is orthogonal, you don't have to worry about whether you need to 
> "query a bunch of stuff at once" or not. you can decide any time and even change 
> your decision later w/o breaking anything.

My goal isn't to give the client the ability to decide what to query, it's to remove the client's need to care about the performance cost of making a request since the results are already in browser cache. So I can make a series of requests rather than one big request with a bunch of stuff inlined.

I'm not sure if our vocabulary is aligned-

by "perfect caching" i mean like how you can query a cloned git repo, as of a specific commit. If your local clone doesn't know about that commit then you have to touch network, but otherwise your local clone is guaranteed to be consistent for that commit; it's never stale and doesn't need to be invalidated.

> if you mutate something, the rel still points to the latest edition (version, copy, lifetime, etc.)

It seems to me that perfect caching isn't possible if this is the case. The rel would have to encode the version; otherwise I might get stale data from cache and my app state would be inconsistent.



--
You received this message because you are subscribed to a topic in the Google Groups "Collection+JSON" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/collectionjson/g0LdwFEVc2w/unsubscribe.
To unsubscribe from this group and all its topics, send an email to collectionjso...@googlegroups.com.

mca

Nov 27, 2013, 6:13:53 PM
to collect...@googlegroups.com
yeah - we're aiming at the same target, have diff POVs is all.

"perfect caching" is an interesting name. i imagine you are thinking it's "perfect" from the client's POV. you probably don't care about what it _means_ to create a "version". some examples that come to mind easily:
- is it just structure changes? data? 
- if i remove a template from a Cj representation is that a new "version"?
- if i add a template since this is for a logged in admin user, is that a new "version"?
- if this one user has a new data element that has a link that others do not, is that a new version?

i also suspect the number (and associated costs) of "versions" CDNs and intermediaries will handle for you is something to consider. Is it "cheaper" for me to pay for storing these versions (assuming they will be used "frequently") or to regen them from the server upon request?

my point here is that there may be other factors that make "perfect caching" less than desirable from the network POV. 

but, hey, i'd like to see this up and running and see what it takes for a server to generate these "perfect cache" resources/addresses and what payoff you get from perceived responsiveness on the client.

Glenn Block

Nov 27, 2013, 6:19:39 PM
to collect...@googlegroups.com
Mike, the rel as a pointer isn't really working for me. Isn't a rel more of a descriptor than a pointer? A pointer points ever to one thing at a moment in time. In the case of a rel that rel would be used across many different links with the uniqueness living at the URI level i.e. this URI points to one and only one resource.

What do you mean exactly......

Cheers
Glenn

Dustin Getz

Nov 27, 2013, 6:31:25 PM
to collectionjson
Interesting, you've caught me thinking in terms of database objects rather than resources. Anything that could impact the served representation (like the user) would have to be encoded in the URI, or at least the HTTP client would have to have a smart enough cache to hash the entire request (including cookies and stuff) as a cache key. I don't know if browsers do that, though it could probably be worked around in JavaScript.
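
A small Python sketch of the "hash the entire request" idea, under stated assumptions. The `cache_key` function and its default header list (`cookie`, `accept`) are hypothetical; which request headers actually influence a representation depends on the server.

```python
import hashlib


def cache_key(method: str, uri: str, headers: dict,
              vary_on: tuple = ("cookie", "accept")) -> str:
    """Hash the request line plus the headers that can change the representation."""
    # Header names are case-insensitive in HTTP, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    parts = [method.upper(), uri]
    for name in sorted(vary_on):
        parts.append(f"{name}:{normalized.get(name, '')}")
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


admin = cache_key("GET", "/posts/7/v2", {"Cookie": "session=admin"})
guest = cache_key("GET", "/posts/7/v2", {"Cookie": "session=guest"})
admin_again = cache_key("get", "/posts/7/v2", {"cookie": "session=admin"})
```

The same URI yields different keys for different users, so a per-user representation cannot be served to the wrong user from cache.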



mca

Nov 27, 2013, 6:33:28 PM
to collect...@googlegroups.com
hmm....

so the word "pointer" in the sense you (and I assume Dustin) are using means a "permanent location"?

I was using the word differently and that was my mistake, i think. "point me to a hotel" was my usage. "Here, let me point out all my friends in this room", etc. 

"A pointer points ever to one thing at a moment in time." to make sure i have this, you mean that a pointer identifies a _permanent_ location, one that will be true "forever" (relatively speaking, of course), right?

If that's true, that sounds like the URI "name" school (per RDF-land) rather than the URI "location" school (per REST-land). Is that right?


On Wed, Nov 27, 2013 at 6:19 PM, Glenn Block <glenn...@gmail.com> wrote:

mca

Nov 27, 2013, 6:38:12 PM
to collect...@googlegroups.com
well, yes, this is one of the important distinctions (Data Object v Resource Representation) and that's cool. lots of room here.

keep in mind that HTTP *does* account for the level of changes I talked about (both data and structure mods to a single representation of the same resource). That's the ETag (entity tag). Usually that's a hash value that is shipped w/ a response. There is also a Vary header that allows origin servers to mark other headers that "make up the uniqueness" of a representation. 

Intermediaries are supposed to use both the ETag and any accompanying Vary header to identify these variant representations delivered for the same URL. It's stock HTTP/1.1.
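
A simplified Python sketch of the ETag and Vary mechanics described above. It is not a full HTTP implementation: `make_etag` and `respond` are hypothetical helpers, the choice of `Vary: Cookie` is an assumption, and real `If-None-Match` handling also covers lists of tags and `*`.

```python
import hashlib
import json


def make_etag(body: dict) -> str:
    """ETag derived from a hash of the representation (one common choice)."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return '"' + hashlib.sha256(payload).hexdigest()[:16] + '"'


def respond(body: dict, request_headers: dict) -> tuple:
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Vary": "Cookie"}
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, None  # the client's copy is still current
    return 200, headers, body


post = {"id": 7, "title": "draft"}
status1, headers1, _ = respond(post, {})
status2, _, body2 = respond(post, {"If-None-Match": headers1["ETag"]})

post["title"] = "final"  # any data or structure change yields a new ETag
status3, headers3, _ = respond(post, {"If-None-Match": headers1["ETag"]})
```

The URL never changes here. The client still pays a round trip to revalidate, but an unchanged representation comes back as an empty 304.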

So, the goal you have in mind has, AFAICT, all the support in HTTP that would be needed. now WebSockets, FTP, XMPP, and other level-7 transfer protocols will not be as "smart" about this. If your representations would ever be sent over these other transfer protocols, adjustments to the "perfect caching" would need to be made.

Glenn Block

Nov 27, 2013, 6:41:56 PM
to collect...@googlegroups.com
OK, now I see what you mean. I was thinking you meant in the programming language sense which is how i read what Dustin was saying.

In terms of the pointer, I just meant that a URI refers to a specific resource while a REL in general applies to many resources.

Dustin Getz

Nov 27, 2013, 6:41:45 PM
to collect...@googlegroups.com
Yes, a "permalink" points to a permanent resource that will be the same forever. As an example, here are two links to the same resource, but only one of them has a version encoded.

https://github.com/dustingetz/ubercrud/blob/b4cdc1a39cecdad7aa45ba274f8a657868281b7e/app/controllers/Application.scala

https://github.com/dustingetz/ubercrud/blob/master/app/controllers/Application.scala
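
A hedged Python sketch of how a server could treat the two kinds of URL differently. These header values are not what GitHub actually sends; `cache_control_for` is a hypothetical helper, and treating a 40-character hex path segment as a commit SHA is an assumption.

```python
import re

# A 40-character hex segment after /blob/ is treated as a commit SHA (an assumption).
COMMIT_SEGMENT = re.compile(r"/blob/[0-9a-f]{40}/")


def cache_control_for(url: str) -> str:
    """Pick a Cache-Control value based on whether the URL pins a version."""
    if COMMIT_SEGMENT.search(url):
        # Snapshot: safe to cache for a year without revalidating.
        return "public, max-age=31536000, immutable"
    # Moving target: a cache may store it but must revalidate before reuse.
    return "no-cache"


pinned = ("https://github.com/dustingetz/ubercrud/blob/"
          "b4cdc1a39cecdad7aa45ba274f8a657868281b7e/app/controllers/Application.scala")
moving = "https://github.com/dustingetz/ubercrud/blob/master/app/controllers/Application.scala"
```

Here `cache_control_for(pinned)` selects the long-lived immutable policy, while `cache_control_for(moving)` forces revalidation on every use.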

mca

Nov 27, 2013, 7:00:16 PM
to collect...@googlegroups.com
ok, i think i see where i slipped off the rails. 

so this gets to the notion of how much version/change information i want to expose to the "outside" world.

for example we humans don't need to "know" that each day i lose/change about 1 million skin cells. in fact, there are about 1.6 trillion skin cells in the average adult human. that means the next time i see you (or anyone for that matter) they are tangibly a "different version" of themselves. this discounts the whole new outfit, hairdo, etc.  We just don't "act like" that's a new person. just a new representation of the same resource.

in fact, it would be incredibly fiddly if i had to (in conversation with others) refer to the "permalink" of you based on when we last met. "I was talking with Dustinb4cdc1a39cecdad7aa45ba274f8a657868281b7eGetz about Web architecture and it was quite interesting."

Instead we operate w/o going into that level of detail. IMO, this is the way the WWW should be operating, too. while it's possible that local data stores need to operate at the lower level for direct data storage (Rich Hickey does something like you describe for his Datomic Data Store), i don't think that is helpful at the WWW level.


Dustin Getz

Nov 27, 2013, 9:01:10 PM
to collectionjson
that's right, the philosophy i present is actually modeled after Datomic and intended to be backed by it, and i'm testing the hypothesis "is this helpful at the WWW level".

I'm not quite sure how to best model this in hypermedia and this is my first time (I just read your excellent book and need to read it again). The whole point of this approach is to traverse an ORM style object graph that might look something like this, so the concept of "pointers" seems to be a first-class notion which is part of the actual object, which is in addition to the usual hypermedia controls. 

Here is what i'm thinking currently:
  • start with Collection+JSON
  • all of my database objects have an "id" and "version" (which need not be part of the representation)
  • an object can have an attribute which is a pointer to another object or list of objects - using the URI as the "foreign key"
  • you can resolve any object just by GET on one of these URIs (this is an extension to Collection+JSON)
  • CRUD effects etc. still use standard Collection+JSON hypermedia controls
how does this look? i realize it's kind of sleazy in that i'm still designing my resources based on my database objects, but it doesn't feel particularly important to fix this. do you think i should reconsider this?
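
A minimal Python sketch of traversing such a graph, under stated assumptions. It assumes the "pointers" are carried in each item's `links[]` array, which the post does not confirm. The `ITEMS` table, the URIs, and the rel names are all hypothetical, and the in-memory dictionary stands in for real GET requests.

```python
# Hypothetical in-memory "server": permalink URI -> Collection+JSON item.
ITEMS = {
    "/post/7/v2": {
        "href": "/post/7/v2",
        "data": [{"name": "title", "value": "final"}],
        "links": [
            {"rel": "author", "href": "/user/3/v1"},
            {"rel": "comment", "href": "/comment/11/v1"},
            {"rel": "comment", "href": "/comment/12/v4"},
        ],
    },
    "/user/3/v1": {
        "href": "/user/3/v1",
        "data": [{"name": "name", "value": "dustin"}],
        "links": [],
    },
    "/comment/11/v1": {
        "href": "/comment/11/v1",
        "data": [{"name": "text", "value": "nice"}],
        "links": [{"rel": "author", "href": "/user/3/v1"}],
    },
    "/comment/12/v4": {
        "href": "/comment/12/v4",
        "data": [{"name": "text", "value": "+1"}],
        "links": [],
    },
}

fetched = []  # records each simulated network GET


def get(uri: str, cache: dict) -> dict:
    """Resolve a pointer with a plain GET, consulting the cache first."""
    if uri not in cache:
        fetched.append(uri)
        cache[uri] = ITEMS[uri]
    return cache[uri]


def walk(uri: str, cache: dict, seen: set = None) -> set:
    """Follow every link from `uri`, returning the set of reachable URIs."""
    seen = set() if seen is None else seen
    if uri in seen:
        return seen
    seen.add(uri)
    for link in get(uri, cache)["links"]:
        walk(link["href"], cache, seen)
    return seen


cache = {}
reachable = walk("/post/7/v2", cache)
first_pass = len(fetched)
walk("/post/7/v2", cache)  # second traversal is served entirely from cache
```

The first traversal fetches each object once, even though the author is referenced twice; the second traversal makes no requests at all.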

mca

Nov 27, 2013, 9:16:00 PM
to collect...@googlegroups.com
all-in-all this seems sensible. hard for me (at this distance) to be all too helpful, ;()

image is crazy-making (but that's just me).

i'd like to see some cj representations. certainly all the parts should be in place to support what you have here. the "resolve object w/ a GET" seems fine to me. are you offering up the "pointer to another object" as one of the collection.items[x].links[]? that would be my guess, but i've not seen your samples. 

how do the rels play into all this? do you have to mint quite a few or is a small set workable? 

one of the tips i've learned along the way is to model the Cj representations from the _client_ POV, not the server. IOW, ask "what does the client want/need at this point (both data and functionality)?" and make sure there is a resource representation that provides that. this accounts for some cases where clients need to do things that are not (strictly speaking) part of the data model (reporting, filtering of available records, other mgmt issues).

i'd love to watch you work through this (as much as is possible/allowed). seems a very interesting project.

keep me posted and let me know how it goes.

Cheers.


Glenn Block

Nov 28, 2013, 12:52:29 AM
to collect...@googlegroups.com
Dustin, you might also want to check out OData as it has object/relational semantics baked in including navigation of graphs http://www.odata.org/

To be clear, I am not endorsing it, but it seems like it has overlap in goals.

Dustin Getz

Nov 22, 2015, 8:53:01 PM
to Collection+JSON
Hey Mike - so I've implemented this, here is the beginnings of documentation

Here is the punchline - 

Hypercrud representations look like this, it exposes a graph, this is what we develop against (if there were no cache inlining)

Here is the general hypermedia client. Open up the dev tools network tab and filter by XHR: you will see one and only one request, and no more requests are made even as you navigate through the pages.

mca

Nov 22, 2015, 10:00:34 PM
to collect...@googlegroups.com
interesting...

would like to see the (eventual) media type write up to get a better sense of the design details.

 


mca
Mike Amundsen


Dustin Getz

Aug 5, 2017, 2:18:44 PM
to Collection+JSON
Mike so that rabbit hole ended up being really deep with a lot of yaks. But now I present to you, a fully general, and efficient, hypermedia client: 


My specific question for you is, do you agree with the above statement? Looking for any/all types of feedback.

To solve the subresource problem efficiently, it makes some very unusual architecture choices. Hyperfiddle apps are coded in JavaScript, but the JavaScript runs in both the browser and the server. The subresource problem (known as the N+1 query problem in database land) is solved by moving this JavaScript inside the database query engine itself. If queries have zero latency, it doesn't matter how many queries we make. And of course we need an ACID database with distributed reads so we can execute our application JavaScript on the same machine as the database query engine, so anything SQL is out, but Datomic is suitable.
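
A back-of-the-envelope Python sketch of why per-query latency dominates the N+1 pattern. The latency figures are illustrative assumptions, not measurements of Hyperfiddle or Datomic, and `total_latency_ms` is a hypothetical helper.

```python
def total_latency_ms(num_children: int, per_query_latency_ms: float) -> float:
    """One query for the parent plus one per child: the classic N+1 pattern."""
    queries = 1 + num_children
    return queries * per_query_latency_ms


# Assumed figures: 50 ms per query over a network, 0.01 ms in-process.
over_network = total_latency_ms(num_children=100, per_query_latency_ms=50.0)
in_process = total_latency_ms(num_children=100, per_query_latency_ms=0.01)
```

With these assumed numbers the same 101 queries cost about five seconds over the network and about one millisecond when run next to the query engine, which is the argument for moving the application code rather than batching the queries.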

Dustin Getz

Aug 5, 2017, 3:24:09 PM
to Collection+JSON
I just wrote you a blog post actually: Hyperfiddle vs REST