Designing URLs for "yourself" and "everyone"


Jack Snow

Jan 19, 2013, 11:01:57 PM
to api-...@googlegroups.com
We have an application that is deployed in "instances", resulting in one or multiple instances for each customer.
 
Each instance will need to talk to our internal web service to deal with things like billing and other centralized management features.
 
Some instances should be able to access all data in the web service, including that of other instances. Other instances should only access data belonging to themselves.
 
For the first case, it is easy to design URLs:
 
GET /instances/instance_id/ to get instance information
POST /instances/instance_id/logs to create a new log item.

The second case is harder. Do we design a special URL for those guys?
GET /instance to get instance information for his own instance
POST /instance/log to create a new log item for his own instance

If the guy in the second case has an instance id of 1234, should he also be able to access his information using GET /instance/1234?

Ted M. Young [@jitterted]

Jan 19, 2013, 11:48:17 PM
to api-...@googlegroups.com
That seems fine to me. There's no reason you can't have multiple URIs that reference the same resource. However, I'm not clear on why the "get my data" URI needs to be different from the "get anyone's data"? I see two possibilities for the "get my own data":

1. Use the standard URI template: GET /instances/1234/ but use authorization to ensure the client can access the 1234 instance data and no others.
2. Use a different URI: GET /instance/ and use the client's IP address as authorization.

#2 might be useful if you're managing an internal network, or pre-approved clients, but unless there was a need to do that, I'd go with #1.
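
Option #1 could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the token table, the `all_access` flag, and the helper names `can_access` and `get_instance` are hypothetical and do not come from any real service in this thread.

```python
# Hypothetical token table: token -> (own instance id, may read every instance?)
TOKENS = {
    "token-admin": ("1", True),
    "token-1234": ("1234", False),
}


def can_access(token: str, requested_instance_id: str) -> bool:
    """Return True if the caller may read /instances/<requested_instance_id>/."""
    entry = TOKENS.get(token)
    if entry is None:
        return False  # unknown caller
    own_id, all_access = entry
    return all_access or own_id == requested_instance_id


def get_instance(token: str, instance_id: str) -> int:
    """Return the HTTP status that GET /instances/<instance_id>/ would produce."""
    if token not in TOKENS:
        return 401  # caller has not identified itself
    return 200 if can_access(token, instance_id) else 403
```

The URI template stays the same for every caller; only the authorization decision differs.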

;ted
--


--
You received this message because you are subscribed to the Google Groups "API Craft" group.
To unsubscribe from this group, send email to api-craft+...@googlegroups.com.
Visit this group at http://groups.google.com/group/api-craft?hl=en.
 
 

Jack Snow

Jan 19, 2013, 11:55:39 PM
to api-...@googlegroups.com
I guess my rationale for #2 is that it makes it easier to get your own data. With #1, you would need to know your instance_id and use a URL template, whereas with #2, you just query /instance directly.
 
I noticed that Facebook also has a /me for accessing information about yourself.

Ted M. Young [@jitterted]

Jan 20, 2013, 12:13:42 AM
to api-...@googlegroups.com
Maybe I missed something, but how would the server know with #2 that "my" instance was #1234? Is there an authorization or identification mechanism that determines who "me" is?

For example, if I went to Facebook and asked for '/me', but I wasn't logged in, there'd be no way to return the right information, since I hadn't identified myself yet. Once I was logged in, the '/me' would make sense.

;ted

Mike Schinkel

Jan 20, 2013, 12:24:31 AM
to api-...@googlegroups.com
On Jan 19, 2013, at 11:48 PM, "Ted M. Young [@jitterted]" <tedy...@gmail.com> wrote:
That seems fine to me. There's no reason you can't have multiple URIs that reference the same resource.

I might be wrong, but I think you can run into potential caching issues if you do that. Consider this series of requests:

GET /my-instance
PUT /instance/<my-instance-id>
GET /my-instance

If I'm not mistaken, and HTTP caching is being used, the cache will think it still has a current copy for the 2nd GET, but that copy will actually be out of date, because the PUT went to a different URL.
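
The concern can be modelled with a toy Python cache. This is a deliberately simplified sketch, not how any particular HTTP cache is implemented: `NaiveUrlCache` is a hypothetical class that keys entries by URL only and drops a cached copy only when a PUT targets that same URL. Real caches also apply freshness lifetimes and validation.

```python
class NaiveUrlCache:
    """Toy cache keyed only by URL; a PUT invalidates just the URL it targets."""

    def __init__(self, origin):
        self.origin = origin  # dict: canonical URL -> body
        self.aliases = {"/my-instance": "/instance/1234"}  # assumed alias
        self.store = {}

    def _resolve(self, url):
        return self.aliases.get(url, url)

    def get(self, url):
        if url not in self.store:
            self.store[url] = self.origin[self._resolve(url)]
        return self.store[url]

    def put(self, url, body):
        self.origin[self._resolve(url)] = body
        self.store.pop(url, None)  # only this URL's cached copy is dropped


cache = NaiveUrlCache({"/instance/1234": "v1"})
first = cache.get("/my-instance")    # cached under /my-instance
cache.put("/instance/1234", "v2")    # invalidates /instance/1234 only
second = cache.get("/my-instance")   # served from the stale cached copy
```

In this model the second GET still sees the old body, while a GET on /instance/1234 would see the new one.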

Maybe someone else who has more real-world experience on this specific topic can chime in?

-Mike

Jack Snow

Jan 20, 2013, 12:25:32 AM
to api-...@googlegroups.com
We are using OAuth 2.0 for authorization, so implementing /instance to access the data that belongs to "me" is pretty trivial. We can just look up the instance number associated with the access token.
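
A minimal Python sketch of that lookup, assuming the token-to-instance mapping is available as a plain dictionary. The names `TOKEN_TO_INSTANCE`, `INSTANCES`, `bearer_token`, and `get_own_instance`, and the token value itself, are hypothetical placeholders.

```python
from typing import Optional

# Hypothetical mapping kept alongside the issued access tokens.
TOKEN_TO_INSTANCE = {"example-token": "1234"}

INSTANCES = {"1234": {"id": "1234", "name": "example"}}


def bearer_token(authorization_header: str) -> Optional[str]:
    """Extract the token from an 'Authorization: Bearer <token>' header value."""
    scheme, _, token = authorization_header.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    return token


def get_own_instance(authorization_header: str) -> Optional[dict]:
    """Handle GET /instance: return the caller's own instance, or None."""
    token = bearer_token(authorization_header)
    instance_id = TOKEN_TO_INSTANCE.get(token) if token else None
    return INSTANCES.get(instance_id) if instance_id else None
```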

Ted M. Young [@jitterted]

Jan 20, 2013, 12:39:53 AM
to api-...@googlegroups.com
Right. Though as Mike points out, you'll lose any potential for caching the information since the URL is the same, though I'm not sure if that's lost anyway due to the use of OAuth? It's been a while since I've looked at that.

I guess my understanding (and I'm still new to REST) is that the '/my-instance' breaks a REST constraint, i.e., that URI potentially points to _different_ resources. For example, machine A will see '/my-instance' as representing instance 1234, whereas machine B will see the same URI as representing instance 4567. That seems dangerous.

Is there a reason that you want to use a shortened URI? I assume it'd even foster code reuse if the URIs were always built the same way, regardless of whether I was getting someone else's instance info or "mine".

;ted
--

Jack Snow

Jan 20, 2013, 1:05:02 AM
to api-...@googlegroups.com
I am not sure why my reply was deleted, but here it is again:
 
@Mike: Good call on the caching. However, since I need information from the API to always be "fresh", I will be using ETags like Facebook. This means that every response will be validated against the server, so each call still costs a request, but freshness is very important for this web service.
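
A rough Python sketch of ETag validation on the server side, under the assumption that the ETag is derived by hashing the response body (one common approach; how Facebook does it is not shown here). `make_etag` and `conditional_get` are hypothetical helper names.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the body (assumed scheme: truncated SHA-256)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(current_body: bytes, if_none_match: Optional[str] = None):
    """Return (status, headers, body) for a GET carrying an optional If-None-Match."""
    etag = make_etag(current_body)
    if if_none_match == etag:
        # The client's copy is still current: no body is sent.
        return 304, {"ETag": etag}, b""
    # "no-cache" lets the copy be stored but requires revalidation before reuse.
    return 200, {"ETag": etag, "Cache-Control": "no-cache"}, current_body
```

Every call still reaches the server, but an unchanged resource is answered with a 304 and an empty body.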
 
@Ted: I guess the benefits can be easily outlined as below:
 
  • It's very easy to access your own information without having to generate a URL from a template or know your instance_id at all.
  • Facebook uses /me and /your_fb_id to give yourself access to the same information about yourself. This probably means that /me is providing some value to their clients and API consumers.
  • GitHub (which imo has a beautiful API) uses /users/user_id to access a user or /user to access the authenticated user.
Of course, this is not without disadvantages, as you have mentioned:
  • The URI will result in different content for different users.
  • Any other ones?
Also, I am implementing HATEOAS, so one of the URIs should be marked as canonical (I think). Which one should be marked as the "original" and which one the "canonical"?

Brian Mulloy

Jan 20, 2013, 3:02:47 AM
to api-...@googlegroups.com, api-...@googlegroups.com
Unless your caching mechanism includes the header, which is the beauty of using an API cache instead of a UI cache. 
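
One possible reading of this, sketched in Python: the cache key is built from the URL plus selected request headers, so two callers asking for /instance get separate entries. The function `cache_key` and its `vary` parameter are hypothetical; the idea resembles what the HTTP `Vary` response header asks caches to do.

```python
def cache_key(url, request_headers, vary=("Authorization",)):
    """Build a cache key from the URL plus the header values named in `vary`."""
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    return (url,) + tuple(lowered.get(name.lower(), "") for name in vary)


key_a = cache_key("/instance", {"Authorization": "Bearer token-a"})
key_b = cache_key("/instance", {"authorization": "Bearer token-b"})
```

With such a key, the same URL no longer collides across clients, at the cost of one cache entry per token.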


Mike Kelly

Jan 20, 2013, 5:21:56 AM
to api-...@googlegroups.com
If you're doing hypermedia why do you need to worry about having a
'simple' url? Your client should be traversing over a 'my-instance'
link, not constructing a url so its structure should be irrelevant to
them.

That being the case, the link should direct a client to a specific URL
- this will make your life a lot easier (caching, logging, debugging,
etc). e.g.:

GET /client/123/dashboard

{
  "_links": {
    "my-instance": { "href": "/instance/123" }
  }
}

Of course, this client specific URL doesn't have to be the same as the
admin specific URL (it just needs to be specific).
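
On the client side, traversing the link instead of constructing the URL might look like this in Python. It is a sketch that assumes the dashboard document has already been fetched and has the shape shown above; `follow` is a hypothetical helper.

```python
import json

# Entry-point response, shaped like the dashboard example above.
DASHBOARD = json.loads("""
{
  "_links": {
    "my-instance": { "href": "/instance/123" }
  }
}
""")


def follow(document: dict, rel: str) -> str:
    """Return the href of the link with relation `rel`, without building URLs."""
    try:
        return document["_links"][rel]["href"]
    except KeyError:
        raise LookupError(f"no link with rel {rel!r}") from None


instance_url = follow(DASHBOARD, "my-instance")
```

The client only knows the link relation name, so the server stays free to change the URL structure behind it.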

In my experience, it's best to avoid resources that vary per client.

Cheers,
M

Jack Snow

Jan 20, 2013, 5:44:40 AM
to api-...@googlegroups.com
Not sure why my previous message was deleted.
 
@landlessness: Could you expand on that?
 
@Mike Kelly: the /instances/1234 or /instance is actually the starting point for the clients to query.
 
So, if an instance wants to get information stored in the web service about himself, he can either do:
 
GET /instances/1234 or GET /instance
 
So, essentially in my case, /instances or /instance is the root, so to speak. Obviously, /instance makes it easier for the client, because he no longer has to know his instance_id nor construct a URL. But there are drawbacks to this, as mentioned before.
 
Could you expand on your experiences with resources that vary per client?
 
Cheers :)

Mike Kelly

Jan 20, 2013, 8:55:45 AM
to api-...@googlegroups.com
On Sun, Jan 20, 2013 at 10:44 AM, Jack Snow <infec...@gmail.com> wrote:
> Not sure why my previous message was deleted.
>
> @landlessness: Could you expand on that?
>
> @Mike Kelly: the /instances/1234 or /instance is actually the starting point
> for the clients to query.
>
> So, if an instance wants to get information store in the web service about
> himself, he can either do:
>
> GET /instances/1234 or GET /instance

The latter option smells bad to me, I think you would be better off
with something like:

/client/1234/instance

But, really, why give them a choice?

> So, essentially in my case, /instances or /instance is the root, so to
> speak. Obviously, /instance makes it easier for the client, because he no
> longer has to know his instance_id nor construct a URL. But there are draw
> backs to this as mentioned before.
>
> Could you expand on your experiences with resource that vary per client?
>

It makes HTTP caching (particularly reverse proxy) much more
difficult. It makes logging/debugging more difficult because the URL
doesn't identify a specific resource, making reproducing
requests/responses more challenging.

I'm all for API pragmatism and quality user experience, but this is an
issue where (at least in my opinion) the convenience you afford
clients with your proposed approach is not significant enough to
offset the costs you'll incur on the server side by not using URLs to
properly identify resources.

Cheers,
M

Daniel Roop

Jan 20, 2013, 10:49:33 AM
to api-...@googlegroups.com
Jack,

The thing I have accepted recently is that there are many competing constraints and you have to figure out what makes sense for your situation.  Sometimes that means favoring caching, sometimes that means favoring simpler APIs.  This is to say that sometimes I think Option 1 is the right option and sometimes Option 2 is the right option.

Couple points:
Resource Models are Different than Data Models
I like pointing this out when I hear people say you shouldn't have two URLs pointing to the same resource for caching etc...  A Resource Model describes how you want a client to interact with a system; this may include multiple ways to view a particular data model.  In your examples I would argue there is the potential for 2 resources for 1 Data Model:

Resource 1: The Data about the Instance
Resource 2: A Lookup Mechanism for finding the Authenticated User's Instance

REST Often Involves Layers
There is nothing wrong with building a lower level API that takes full advantage of caching and a very thin "application specific" layer that makes consuming the resources easier.  You can look for examples of this in the direction Netflix is moving in their API.  There is also a good picture (which I can't find right now) that shows how you combine resources from a very granular level to a very application specific layer in a REST architecture.  It may be the case that your core system does identify each of these instances like Option 1, and then you have a resource that sits higher on the stack to simplify user access.

URLs that are constructed by the client are hardest to change
As has been stated in this thread, because this is an entry point you should be very mindful of what it looks like.  This is a URL you won't be able to change without coordinating with the clients.  This isn't something hypermedia can fix "easily" unless you develop some sort of "shield" document that references this.  I wouldn't go that far for your use case; I would just make sure I am confident that the API I am providing makes sense and is flexible enough for future uses.  For instance, I tend to prefer exposing the Option 2 URL (using the pattern I will describe below) to my clients so I can change what the Option 1 URL looks like over time.

If you haven't guessed yet I would favor two resources for my API.

Resource One would be the raw data that everyone can access, using standard OAuth to make sure the requesting user is allowed to see the data being requested.  This gives me the flexibility to expand access in the future, to "friends" or "proxy admins", and lets me take advantage of the network cache for these items.  

Resource Two would follow the redirect pattern (made up that name) to redirect the client to the right instance.  Something like:

GET /instance
Authorization: BEARER 2398radlfj0239osd

HTTP/1.1 303 See Other
Location: /instance/1234

GET /instance/1234
Authorization: BEARER 2398radlfj0239osd

...
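
A server-side handler for the first request in that exchange might be sketched in Python as below. The token-to-instance dictionary and the function name `handle_get_instance` are hypothetical; the token value is the one from the exchange above.

```python
# Hypothetical lookup table; the token is the one used in the exchange above.
TOKEN_TO_INSTANCE = {"2398radlfj0239osd": "1234"}


def handle_get_instance(headers: dict):
    """Handle GET /instance by redirecting to the caller's specific URL."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    # The authentication scheme name is case-insensitive ("BEARER" == "Bearer").
    instance_id = TOKEN_TO_INSTANCE.get(token) if scheme.lower() == "bearer" else None
    if instance_id is None:
        return 401, {"WWW-Authenticate": "Bearer"}
    return 303, {"Location": f"/instance/{instance_id}"}
```

The handler returns no body of its own; the representation lives only at the specific URL it redirects to.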

But as I said at the beginning, you have to consider your use case and how flexible you need to be.  Also keep in mind that being RESTful does not mean exposing your raw data model.  Hope that helps.

- Daniel



Mike Kelly

Jan 20, 2013, 11:25:49 AM
to api-...@googlegroups.com
I'm not fond of this redirect approach for a couple of reasons:

Firstly, it's potentially more complicated as clients now have to
follow a redirect, which is not always handled transparently by http
client libraries.

And second, it doubles the number of requests on your entry point
which is already, by definition, the heaviest hit part of the
application in question.

I'm assuming the client (having been allocated an instance and
negotiated an Auth token) is aware of its own identity within the
application.. Why not re-use that identity in the entry URLs, i.e:

/client/645123/instance

That way you don't need the extra redirect noise and your URLs remain
unambiguous.

Cheers,
M

Daniel Roop

Jan 20, 2013, 5:14:08 PM
to api-...@googlegroups.com

Mike,

I think those are valid trade-offs to weigh in your decision.  And I wouldn't dispute that the most performant solution would be to have the client use what it knows to generate the call, but it is not without its complexities.

- you can't change that URL
- you add complexity to the client (albeit very little in this case) to construct the URL.

These may be the right risks to take in some circumstances but I am not sure it is always right.

To your point about clients not obeying 3xx redirects, I mostly reject it.  Most clients do this out of the box, and for those that don't, it is just another part of the API contract that needs to be agreed to, in my mind no different than telling them how to construct a URL, or that they need to send a PUT instead of a POST.

The extra network hop is something to consider in your design.  If you instead had both URLs work, you would have the potential caching problem to worry about, but depending on your use case that may be a non-issue.

And in some cases the client won't know the data that should go in the URL.  I like to use the example of most recent.  Imagine I have a set of documents I manage

/documents/<document-id>

And I also have a feature where I keep track of the last document that was edited for the client to reference.

/users/<user-id>/most-recent-document

In this case the client doesn't know what the most recent document-id is, because the server keeps track of that state for it.  You could have the /users/<user-id>/most-recent-document be the resource you interact with exclusively but that makes purging caches and sharing updates between users very difficult.  Instead you would want to have this resource reference the most recent document, or perform a redirect (either case does require two hops).
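
The "reference" variant could be sketched in Python like this. All names (`DOCUMENTS`, `MOST_RECENT`, `most_recent_document`, `edit_document`) and the `_links` response shape are assumptions made for illustration only.

```python
DOCUMENTS = {"d1": "first draft", "d2": "second draft"}
MOST_RECENT = {"u1": "d2"}  # server-side state: user id -> last edited document id


def most_recent_document(user_id):
    """Handle GET /users/<user-id>/most-recent-document by returning a link."""
    document_id = MOST_RECENT.get(user_id)
    if document_id is None:
        return 404, {}
    return 200, {"_links": {"document": {"href": f"/documents/{document_id}"}}}


def edit_document(user_id, document_id, text):
    """Update the canonical document and record it as the user's most recent."""
    DOCUMENTS[document_id] = text
    MOST_RECENT[user_id] = document_id
```

The document body is only ever served from /documents/<document-id>, so there is a single URL to cache and purge; the per-user resource carries nothing but the pointer.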

That is all a bit of a tangent from Jack's original question, so maybe if we want to continue that conversation we should spin off the thread, instead of hijacking this one.

-Daniel

Jack Snow

Jan 20, 2013, 5:21:07 PM
to api-...@googlegroups.com
Great insight guys! Definitely a lot of things to think about and a lot of decisions to make.
 
I am actually going to go and have a look at APIs in the wild that do this. So far, I have found that Facebook and GitHub do something like this, but I want to see if there are any others and perhaps investigate how they have implemented theirs. Then I will see if this is a good idea.
 
I am very new to designing RESTful web services, so all of your comments have been great help :)
 
In the meantime, keep your comments coming :)

Daniel Roop

Jan 22, 2013, 8:52:57 AM
to api-...@googlegroups.com
> The URI points to different things which is probably unrestful.

I don't think it is unrestful, it just depends on your goal.  You could argue it always points to the same thing: details about the Logged In User.  By going with this resource model you may lose some benefits of using something like /users/<user-id>/, but it sounds like in your case that tradeoff makes sense.


On Sun, Jan 20, 2013 at 12:53 AM, Jack Snow <infec...@gmail.com> wrote:
@Mike: Good thinking :)
 
I actually plan to do something similar to Facebook by using ETags. Due to the nature of the system, data can change often, so using Expires for caching is unsafe. This means, however, that I am not saving on requests (each request needs to be validated against the server), but I think it is a good compromise.
 
@Ted: Yes, that was one of the reasons why I started this discussion :) My reason for the shortened URIs are:
  • It is very simple to access your own information without having to build a URI or know your instance_id
  • Facebook is using them, so it might be offering something of value to clients/api consumers.
  • Github, which has a pretty beautiful API in my opinion (unlike twitter's ugly API) also has something like this: /users/user_id to get a user in the system and /user to get the authenticated user.
The disadvantages are of course as you have mentioned:
  • The URI points to different things which is probably unrestful.
 
 

Steve Klabnik

Jan 22, 2013, 9:30:13 AM
to api-...@googlegroups.com
On Sun, Jan 20, 2013 at 12:39 AM, Ted M. Young [@jitterted]
<tedy...@gmail.com> wrote:
> I guess my understanding (and I'm still new to REST) is that the
> '/my-instance' breaks a REST constraint, i.e., that URI potentially points
> to _different_ resources. For example, machine A will see '/my-instance' as
> representing instance 1234, whereas machine B will see the same URI as
> representing instance 4567. That seems dangerous.


Fielding actually VERY SPECIFICALLY calls this case out as one that's
perfectly fine. From the dissertation:
http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm#sec_5_2_1_1

> For example, the "authors' preferred version" of an academic paper is a mapping whose value changes over time, whereas a mapping to "the paper published in the proceedings of conference X" is static. These are two distinct resources, even if they both map to the same value at some point in time. The distinction is necessary so that both resources can be identified and referenced independently. A similar example from software engineering is the separate identification of a version-controlled source code file when referring to the "latest revision", "revision number 1.2.7", or "revision included with the Orange release."

One other note about your wording, note that these aren't 'a URI that
points to different resources,' these are 'two resources that point to
the same entity.' A resource is something you expose via a name and
representation, an entity is some sort of internal data. So you have
two resources: /my-instance, and /instance/1234, and they both expose,
say, a JSON representation of that singular entity, which is the
actually existing instance in your system.