Today one of the GitHub guys released a new Node.js library for PuSH.
His call for JSON support in the protocol is clear:
http://techno-weenie.net/2010/10/5/nub-nub/
We've been wanting to add JSON support for quite a while, with notable
contributions from Mart
(http://martin.atkins.me.uk/specs/pubsubhubbub-json) and others. A
while ago Monica wrote up this proposal about how to support arbitrary
content types in PubSubHubbub:
http://code.google.com/p/pubsubhubbub/wiki/ArbitraryContentTypes
I worry about option 2, translation to JSON, because I think it
dictates what format the JSON payload needs to be in when served by
publishers. A big benefit of JSON is making things easier to use and
more ad hoc. Dictating the packaging format would harm that. I also
don't see Facebook changing their JSON API
(http://developers.facebook.com/docs/api/realtime) to match and they
shouldn't have to.
The core issue with option 1, the REST approach, is security (replay
attacks). But I believe I've finally cracked the nut!
To explain: Feeds are good formats because they are self-describing.
Update times and IDs are part of the feed body and individual entries,
enabling idempotent synchronization and race-condition tie-breaks.
This also means the 'X-Hub-Signature' on the body of PuSH new-content
notifications is sufficient for security/verification: we can ignore
everything besides the body, since the other headers carry nothing we
rely on.
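For the body-only case, verification is just an HMAC recomputation on
the subscriber side. A minimal sketch in Python (the function name is
mine; the "sha1=" prefix matches how hubs I've seen format the header):

```python
import hmac
import hashlib

def verify_hub_signature(secret: bytes, body: bytes, header_value: str) -> bool:
    """Check X-Hub-Signature: HMAC-SHA1 of the raw body with the shared secret."""
    expected = "sha1=" + hmac.new(secret, body, hashlib.sha1).hexdigest()
    # Constant-time comparison avoids leaking the signature via timing.
    return hmac.compare_digest(expected, header_value)
```

Note that nothing outside the body enters the computation, which is
exactly why this scheme falls over once headers start to matter.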
With the REST approach to arbitrary content types, we *need* to
represent the HTTP headers in the new content notifications or else
we'll have no idea what order the messages came in (Date), etc. And
that's the security problem. If we rely on the headers, then we also
need to verify them (to prevent replay attacks). But X-Hub-Signature
only signs the body, so we're stuck. Some folks have discussed signing
headers too, similar to OAuth1.0, but that led to a lot of pain nobody
wants to repeat. Others have talked about putting headers into the
body (using MIME multipart), but that's just another world of hurt --
the so-called Turducken problem (thanks jsmarr for the name).
The solution is to borrow from the OAuth2 playbook: Treat the
X-Hub-Signature like a password/bearer token and require HTTPS for
callbacks. That means for security we have these components:
1. hub.secret used by subscriber to verify the hub is authorized to
post content (after delivery)
2. Subscriber's SSL cert used by Hub to verify authenticity of
subscriber endpoint (before delivery)
3. SSL connection between Hub and Subscriber is encrypted, protecting
the header values and preventing replay attacks (during delivery)
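To make the division of labor concrete, here's a rough sketch of what
the hub's delivery step could look like under this scheme. It's
illustrative only -- the function names are mine and nothing here is
spec'd -- but it shows the two pieces: the body-only signature as
bearer token, and ordinary HTTPS cert validation doing the rest:

```python
import hashlib
import hmac
import urllib.request

def sign(secret: bytes, body: bytes) -> str:
    # X-Hub-Signature value: HMAC-SHA1 over the raw body, hex-encoded.
    return "sha1=" + hmac.new(secret, body, hashlib.sha1).hexdigest()

def deliver(callback_url: str, secret: bytes, body: bytes, content_type: str):
    req = urllib.request.Request(
        callback_url, data=body, method="POST",
        headers={"Content-Type": content_type,
                 "X-Hub-Signature": sign(secret, body)})
    # For https:// URLs, urlopen validates the subscriber's certificate
    # chain by default; that TLS layer is what protects the headers in
    # transit and blocks replays.
    return urllib.request.urlopen(req)
```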
Thus, Monica's proposal as-is does the trick. All that's left to work
out are details around verbs and Link headers. I hope to bring
everyone back together to build out a spec around this proposal now
that I think the security issue has been solved. Hopefully we can
convince GitHub to be the first implementers of it.
Let me know what you think!
-Brett
Sorry for the delay in my response. Been a busy October! Thanks so
much for the detailed response. I've got some followup.
> My point, if anything, is that we are dividing what we protect and
> subjecting all of it to a defence which is already implemented quite poorly
> in practice. On the one hand, we have a signature for request bodies, and on
> the other hand, nothing for headers. If the SSL/TLS protection held true -
> why even rely on body signatures at all?
Why: Because the SSL/TLS request coming from the hub is of unknown
origin (unless you use SSL *client* certs, which is never going to
happen). You use the body signature as a bearer token to verify the
authenticity of the Hub that's pushing the new content to subscribers.
> In this scenario, SSL/TLS becomes a single point of failure in a system
> where failure is not only common but sometimes encouraged. In a network of
> Hubs and Subscribers where parties may fail to verify certs, MITM attacks
> will be possible. Extending from that, not signing ALL components of the
> request will allow for requests to be manipulated without invalidating the
> signature of whatever subset of data is signed.
The body is signed by an HMAC of a shared secret and the content. Even
if the hub *never* validates server certs for subscribers, the message
will not be subject to replay attacks unless the SSL connection itself
is compromised. The chance of such a MITM attack in the wild is low.
My claim is that hubs who care about the privacy of the data
they distribute should verify server certs properly and require valid
CA-signed subscriber certificates.
> That's why signatures remain compelling. Yes, they are a pain in the ass to
> develop. Yes, they can lead to incompatibilities between large posterchild
> implementations. Yes, programmers hate them with a vengeance. Why? Because
> programmers can't ignore them by setting curl's peer verification to false.
> They actually have to go and deal with them properly or nothing will work.
> In a sense, they are supposed to be a PITA.
The problem with this conclusion, as I see it, is that full signing is
way too hard and people won't do it. There are rare exceptions to this
(the Twitter API) that have much more to do with the current market
conditions than technical considerations.
So in this proposal I'm trying to split the difference of usability
vs. security. Thus far, every proposal I've seen for arbitrary content
type support in PubSubHubbub requires a level of complexity in signing
that is approaching that of OAuth1.0 (a bad thing). If the alternative
is way easier but requires hubs to do proper server cert validation,
is that worth it?
A potential alternative approach to this is the up-and-coming signing standard:
http://developers.facebook.com/docs/authentication/canvas
Which I hear will be integrated into OAuth2:
https://docs.google.com/document/pub?id=1kv6Oz_HRnWa0DaJx_SQ5Qlk_yqs_7zNAm75-FmKwNo4&pli=1
We could try to use that for a combined body/header signing scheme. It
could use HMAC-SHA1 as the signing algorithm and the shared secret
established at subscription time to sign the message.
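To sketch what I mean (the envelope layout and function names below
are purely my own invention, not from the Facebook doc or any spec):
the headers we care about and the body get packed into one blob, and
the whole blob is signed, so neither can be altered independently:

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    # URL-safe base64 without padding, as in Facebook's signed_request.
    return base64.urlsafe_b64encode(data).decode().rstrip("=")

def sign_message(secret: bytes, headers: dict, body: bytes) -> str:
    # One envelope covers headers and body together.
    envelope = b64url(json.dumps({"headers": headers, "body": b64url(body)},
                                 sort_keys=True).encode())
    sig = b64url(hmac.new(secret, envelope.encode(), hashlib.sha1).digest())
    return sig + "." + envelope

def verify_message(secret: bytes, signed: str) -> bool:
    sig, envelope = signed.split(".", 1)
    expected = b64url(hmac.new(secret, envelope.encode(), hashlib.sha1).digest())
    return hmac.compare_digest(sig, expected)
```

The nice property is that there's no header canonicalization step to
get wrong, which is where most of the OAuth1.0 pain came from.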
-Brett
Thanks a lot for the response and my apologies for taking so long to
get back to you.
On Thu, Oct 7, 2010 at 11:24 AM, Monica Keller <monica...@gmail.com> wrote:
> Concerns for Option1 here
> -Putting burden on subscribers to handle the different HTTP methods (DELETE,
> PUT) -- Not a huge concern
Indeed, and the method stuff may just be in the X-HTTP-Method-Override
header anyway.
> Would we now be asking all subscribers to have SSL certs? That is a fairly
> big requirement.
>
> OAuth 2 burdens the service providers with this so I have concerns about
> burdening the subscribers with it.
Yes, I agree that's an issue. My hope was that there is a way to have
Hubs cache SSL cert fingerprints, so that even a self-signed cert
could be accepted if it is the same one that was originally used to
establish the subscription.
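Something like trust-on-first-use, in other words. A sketch of how a
hub might pin a subscriber's cert fingerprint (names are mine, and
this glosses over rotation and revocation entirely):

```python
import hashlib
import ssl

def fingerprint(der_cert: bytes) -> str:
    # SHA-256 over the DER-encoded certificate, the usual pinning identity.
    return hashlib.sha256(der_cert).hexdigest()

def remote_fingerprint(host: str, port: int = 443) -> str:
    # Fetch the subscriber endpoint's current certificate and hash it.
    pem = ssl.get_server_certificate((host, port))
    return fingerprint(ssl.PEM_cert_to_DER_cert(pem))

# At subscription time the hub would store remote_fingerprint(host);
# before each delivery it would recompute it and refuse to deliver on
# a mismatch, even for an otherwise-untrusted self-signed cert.
```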
> My other question would be whether web hooks is a better fit today for APIs
> since there really isn't a need for a hub to fan out.
>
> As much as I love PubSubHubbub I think we should answer the question of how
> many service providers would want to push their response to another hub.
> MySpace and FB didn't really need an external hub. At Socialcast it's the
> same thing: we are going to add PuSH, but it's a private response for which
> you need to authenticate
>
> My experience leads me to believe there is a serious need to support a
> publisher who is its own hub.
Well I totally agree with you that the common case is becoming people
running their own hub. The old light-pings are mostly there for
bootstrapping and boosting adoption. However, even if you run your own
hub, how do you achieve "a private response for which you need to
authenticate"? What is Socialcast using for authentication from your
self-run hub to the subscriber?
X-Hub-Signature works well enough for payload-only messages, but what
about messages that have headers, like arbitrary content? I don't
think that running your own hub alleviates that problem, which is why
I'm looking for a general solution that all providers can employ. Does
that make sense?
-Brett
What if the application/http MIME type was used as the content body?
That would allow the headers for the content and the headers for the
PuSH message to be fully separated. I see some possible problems
related to the difficulty of parsing, however...
Eric,
This is the solution that Brett alluded to in the subject line when he
refers to "turducken". This was discussed as a possible solution but
some folks at the table found nesting an HTTP message in the body of an
HTTP message to be distasteful/confusing.
I don't have any major objection to it on principle, but I can
sympathize with the viewpoint that it puts an unusual burden on
subscribers: many web application frameworks don't expose a facility
to parse an arbitrary string as an HTTP message, so implementers would
end up working around their framework to implement such a thing, and
that is likely to lead to bugs and security issues.
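To illustrate the kind of workaround that would be needed: a minimal
hand-rolled parser for a nested HTTP message in Python (the stdlib
email parser happens to handle the RFC 822-style header block, but the
rest is manual string surgery, which is where the bugs creep in):

```python
from email.parser import BytesParser

def parse_nested_http(raw: bytes):
    """Split '<request line>\\r\\n<headers>\\r\\n\\r\\n<body>' by hand."""
    request_line, _, rest = raw.partition(b"\r\n")
    header_block, _, body = rest.partition(b"\r\n\r\n")
    # Lean on the stdlib email parser for the header block only.
    headers = dict(BytesParser().parsebytes(header_block).items())
    return request_line.decode(), headers, body
```

Even this toy version silently mishandles things like continuation
lines, chunked bodies, and bare-LF line endings, which is roughly the
objection being raised.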