Hi Vishal:
For the application I'm working on, we don't have end-user data or user-specific OAuth tokens, but we do have tenant-specific tokens. A tenant application uses its issued client credentials to obtain a bearer token, which it then hands out to the user agents accessing its site. The user agent uses that token to make cross-domain (JSONP) API calls directly to our server. By issuing the token, the tenant has authorized the user agent to access our APIs, as well as any tenant-specific content used to satisfy the requests.
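For context, here's roughly what the tenant-side token acquisition looks like. This is only a sketch: the endpoint URL, client id, and secret are placeholders, not our real values.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.Base64;

    // Sketch of a tenant application exchanging its issued client credentials
    // for a bearer token (client_credentials grant). All values are placeholders.
    public class TenantTokenFetcher {
        public static void main(String[] args) throws Exception {
            String clientId = "tenant-client-id";          // issued to the tenant
            String clientSecret = "tenant-client-secret";  // issued to the tenant
            String basicAuth = Base64.getEncoder()
                    .encodeToString((clientId + ":" + clientSecret).getBytes());

            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("https://api.example.com/oauth/token")) // placeholder URL
                    .header("Authorization", "Basic " + basicAuth)
                    .header("Content-Type", "application/x-www-form-urlencoded")
                    .POST(HttpRequest.BodyPublishers.ofString("grant_type=client_credentials"))
                    .build();

            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());

            // The JSON body contains access_token, token_type=bearer, expires_in, etc.
            // The tenant embeds that access_token in the page it serves to the user agent.
            System.out.println(response.body());
        }
    }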
Being a bearer token, whoever holds it gets access. Token acquisition and the API calls themselves are performed over HTTPS, so an eavesdropper can't just pull tokens out of thin air, but on the client side a user can inspect the DOM and extract the token, or a non-browser client can capture it and send it off elsewhere.
In our case, the server keeps additional token-specific information. My app is Java based, and I use Spring Security OAuth. However, that framework does not provide anti-hijacking, usage limits, etc. So, once Spring Security has okayed the token, I run an additional chain of request processing before my RESTful APIs satisfy any API request.
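Conceptually, that extra chain is just a filter that runs after Spring Security has already accepted the token. A minimal sketch follows; TokenGuardService and its allow() method are placeholders for my own bookkeeping, not Spring APIs.

    import javax.servlet.*;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import java.io.IOException;

    // Sketch of the additional per-request checks (binding, hijack detection,
    // usage limits, blacklisting) that run after Spring Security validation.
    public class TokenGuardFilter implements Filter {

        // Placeholder for the token bookkeeping described in the rest of this note.
        public interface TokenGuardService {
            boolean allow(String token, String clientIp);
        }

        private final TokenGuardService guard;

        public TokenGuardFilter(TokenGuardService guard) {
            this.guard = guard;
        }

        @Override
        public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
                throws IOException, ServletException {
            HttpServletRequest request = (HttpServletRequest) req;
            HttpServletResponse response = (HttpServletResponse) res;

            String token = extractBearerToken(request);
            // Behind a proxy or load balancer you would consult X-Forwarded-For instead.
            String clientIp = request.getRemoteAddr();

            if (token == null || !guard.allow(token, clientIp)) {
                response.sendError(HttpServletResponse.SC_UNAUTHORIZED, "invalid_token");
                return;
            }
            chain.doFilter(req, res);
        }

        private String extractBearerToken(HttpServletRequest request) {
            String header = request.getHeader("Authorization");
            return (header != null && header.startsWith("Bearer "))
                    ? header.substring("Bearer ".length())
                    : null;
        }

        @Override
        public void init(FilterConfig filterConfig) { }

        @Override
        public void destroy() { }
    }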
After a tenant app has obtained a token and handed it to a user agent, all I know is that the token was created, and that fact is persisted. The very first API call in which that token is used is what I call the 'token binding event': the client's source IP address is associated with the token. Then, for the lifetime of the token, if it is used from any other source IP address, the token is considered hijacked and is invalidated.
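The binding check itself is conceptually very simple. Here is an in-memory sketch; the real implementation persists the binding alongside the token record.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Sketch of the 'token binding event': the first call binds the token to the
    // caller's source IP; a later call from a different IP invalidates the token.
    public class TokenBinder {

        private final Map<String, String> boundIpByToken = new ConcurrentHashMap<>();

        /** Returns true if the call is allowed. */
        public boolean checkOrBind(String token, String sourceIp) {
            String boundIp = boundIpByToken.putIfAbsent(token, sourceIp); // binding event
            if (boundIp == null || boundIp.equals(sourceIp)) {
                return true;          // first use, or same client as before
            }
            invalidate(token);        // different source IP: treat as hijacked
            return false;
        }

        private void invalidate(String token) {
            boundIpByToken.remove(token);
            // In the real system the token itself is revoked server-side here.
        }
    }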
That may or may not be ironclad enough for you. As you can see, there is a window before binding during which the original recipient of the token could hand it off to any other client. But once that binding occurs, a particular IP address is locked in.
Originally, I thought having the tenant application supply the client IP address at the time the token is acquired could close that window. But many companies, institutions, etc. access their tenant application using private IP addresses, so the source address seen by the tenant app and the one seen by my server would differ. Also, with Network Address Translation (NAT), many individual clients within a private IP space share a single public IP address. So, my token-hijacking mechanism cannot distinguish clients at that level.
Anyway, that's what I have, warts and all. I'm very much interested in suggestions on how to improve on this or do it differently.
Moving beyond that, I'm also interested in other asset protection ideas. I'm not talking about end-user data here, but rather the large corpus of valuable, curated data our application serves up via its APIs. Allowing a client to systematically pull content is not in our best interest. So, for an individual token, I've implemented the following (a rough sketch of the bookkeeping appears after the list):
(1) A given token is only good for so many API points. Once that limit is reached, the token is invalidated.
(2) Since we control both sides of the conversation for the most part (we supply the embeddable widgets on the client and control the server), if token usage appears excessive over some interval of time, the API can respond that a captcha challenge must be satisfied. The objective is to stop, or at least slow down, a program that has acquired a token and is scraping content.
(3) If the captcha goes unsatisfied for x API calls, the token is invalidated.
(4) Even if the captchas are satisfied, if the number of challenges becomes excessive, the token is invalidated.
(5) The token itself expires in 24 hours.
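The per-token bookkeeping behind (1) through (5) is roughly the following. The thresholds are illustrative rather than our production numbers, and the decision of when usage 'appears excessive' for (2) happens elsewhere.

    import java.time.Duration;
    import java.time.Instant;

    // Sketch of the per-token counters for checks (1)-(5). Placeholder thresholds.
    public class TokenBudget {

        private static final int MAX_API_POINTS = 10_000;              // (1), placeholder
        private static final int MAX_CALLS_WHILE_UNSATISFIED = 5;      // (3), placeholder
        private static final int MAX_CAPTCHA_CHALLENGES = 20;          // (4), placeholder
        private static final Duration LIFETIME = Duration.ofHours(24); // (5)

        private final Instant issuedAt = Instant.now();
        private int pointsUsed;
        private int callsWhileUnsatisfied;
        private int captchaChallenges;
        private boolean challengePending;
        private boolean invalidated;

        /** Charge an API call against the token; returns false if the token is no longer valid. */
        public synchronized boolean charge(int apiPoints) {
            if (invalidated || Instant.now().isAfter(issuedAt.plus(LIFETIME))) {
                invalidated = true;                                    // expired (5) or already dead
                return false;
            }
            pointsUsed += apiPoints;
            if (pointsUsed > MAX_API_POINTS) invalidated = true;       // (1)
            if (challengePending && ++callsWhileUnsatisfied > MAX_CALLS_WHILE_UNSATISFIED) {
                invalidated = true;                                    // (3)
            }
            if (captchaChallenges > MAX_CAPTCHA_CHALLENGES) invalidated = true; // (4)
            return !invalidated;
        }

        /** Called when the API decides usage looks excessive and issues a captcha (2). */
        public synchronized void challengeIssued() {
            challengePending = true;
            captchaChallenges++;
        }

        /** Called when the client satisfies the captcha. */
        public synchronized void challengeSatisfied() {
            challengePending = false;
            callsWhileUnsatisfied = 0;
        }
    }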
That all seems well and good, but if tokens are easy to get (just ask the tenant application), then once the current token is invalidated a scraper can simply get another one and keep going. To counter that, we go back to the token binding event I described earlier. If a given source IP address binds too many tokens within some period of time, it gets blacklisted for a period of time. The server will not honor any token presented by that client while it is blacklisted. A repeat offense can result in permanent (administrative) blacklisting. There is a whitelisting mechanism as well.
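The binding-rate blacklist works along these lines. Again a sketch with made-up thresholds; the whitelist and the permanent administrative blacklist are handled separately.

    import java.time.Duration;
    import java.time.Instant;
    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.HashMap;
    import java.util.Map;

    // Sketch of the binding-rate blacklist: an IP that binds too many tokens
    // within a sliding window is refused for a cool-off period.
    public class BindingRateBlacklist {

        private static final int MAX_BINDINGS_PER_WINDOW = 10;          // placeholder
        private static final Duration WINDOW = Duration.ofMinutes(10);  // placeholder
        private static final Duration BAN = Duration.ofHours(1);        // placeholder

        private final Map<String, Deque<Instant>> bindingsByIp = new HashMap<>();
        private final Map<String, Instant> bannedUntil = new HashMap<>();

        /** Record a token binding event from this IP; returns false if the IP is blacklisted. */
        public synchronized boolean recordBinding(String sourceIp) {
            Instant now = Instant.now();
            if (isBlacklisted(sourceIp)) {
                return false;
            }
            Deque<Instant> bindings =
                    bindingsByIp.computeIfAbsent(sourceIp, ip -> new ArrayDeque<>());
            while (!bindings.isEmpty() && bindings.peekFirst().isBefore(now.minus(WINDOW))) {
                bindings.pollFirst();                     // drop bindings outside the window
            }
            bindings.addLast(now);
            if (bindings.size() > MAX_BINDINGS_PER_WINDOW) {
                bannedUntil.put(sourceIp, now.plus(BAN)); // temporary blacklist
                return false;
            }
            return true;
        }

        /** True if any token presented from this IP should be refused right now. */
        public synchronized boolean isBlacklisted(String sourceIp) {
            Instant ban = bannedUntil.get(sourceIp);
            return ban != null && Instant.now().isBefore(ban);
        }
    }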
As you can guess, the blacklisting mechanism itself has its problems. Going back to NAT, if there are thousands of clients sharing a single public IP address and legitimately going about their business, that can look like abuse. That address can be whitelisted, but doing so disables the ability to blacklist it at all. I'm not happy with this solution; it's not fine-grained enough.
Other suggestions would be greatly appreciated.
Cheers,
Jeff