Solving the "turduckin" problem for PuSHing arbitrary content types


Brett Slatkin

Oct 7, 2010, 2:32:03 AM
to pubsubhubbub, Monica Keller, Joseph Smarr, martin...@gmail.com
Hey all,

Today one of the GitHub guys released a new Node.js library for PuSH.
His call for JSON support in the protocol is clear:

http://techno-weenie.net/2010/10/5/nub-nub/

We've been wanting to add JSON support for quite a while, with notable
contributions from Mart
(http://martin.atkins.me.uk/specs/pubsubhubbub-json) and others. A
while ago Monica wrote up this proposal about how to support arbitrary
content types in PubSubHubbub:

http://code.google.com/p/pubsubhubbub/wiki/ArbitraryContentTypes

I worry about option 2, translation to JSON, because I think it
dictates what format the JSON payload needs to be in when served by
publishers. A big benefit of JSON is making things easier to use and
more ad hoc. Dictating the packaging format would harm that. I also
don't see Facebook changing their JSON API
(http://developers.facebook.com/docs/api/realtime) to match and they
shouldn't have to.

The core issue with option 1, the REST approach, is security (replay
attacks). But I believe I've finally cracked the nut!


To explain: Feeds are good formats because they are self-describing.
Update times and IDs are part of the feed body and individual entries,
enabling idempotent synchronization and race-condition tie-breaks.
This also means the 'X-Hub-Signature' over the body of PuSH new-content
notifications is sufficient for security/verification, because everything
besides the body -- including the other headers -- can safely be ignored.

With the REST approach to arbitrary content types, we *need* to
represent the HTTP headers in the new content notifications or else
we'll have no idea what order the messages came in (Date), etc. And
that's the security problem. If we rely on the headers, then we also
need to verify them (to prevent replay attacks). But X-Hub-Signature
only signs the body, so we're stuck. Some folks have discussed signing
headers too, similar to OAuth 1.0, but that led to a lot of pain
nobody wants to repeat. Others have talked about putting headers into
the body (using MIME multipart), but that's just another world of
hurt -- the so-called turducken problem (thanks jsmarr for the name).


The solution is to borrow from the OAuth2 playbook: Treat the
X-Hub-Signature like a password/bearer token and require HTTPS for
callbacks. That means for security we have these components:

1. hub.secret used by subscriber to verify the hub is authorized to
post content (after delivery)
2. Subscriber's SSL cert used by Hub to verify authenticity of
subscriber endpoint (before delivery)
3. SSL connection between Hub and Subscriber is encrypted, protecting
the header values and preventing replay attacks (during delivery)
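
To make that split concrete, here is a rough subscriber-side sketch in
Python (the names are illustrative, not spec text). The HMAC check proves
the POST came from a hub that knows hub.secret; HTTPS is what protects
the headers and blocks replays:

    import hashlib
    import hmac

    HUB_SECRET = b"shared-hub-secret"  # the hub.secret set at subscription

    def verify_notification(raw_body, signature_header):
        # X-Hub-Signature is "sha1=<hexdigest>" computed over the raw body.
        if not signature_header or not signature_header.startswith("sha1="):
            return False
        expected = "sha1=" + hmac.new(HUB_SECRET, raw_body,
                                      hashlib.sha1).hexdigest()
        # Compare in constant time so the check itself leaks nothing.
        return hmac.compare_digest(expected, signature_header)

Everything else -- Date, Content-Type, ordering -- rides inside the
encrypted connection, so there is nothing extra to sign.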

Thus, Monica's proposal as-is does the trick. All that's left to work
out are details around verbs and Link headers. I hope to bring
everyone back together to build out a spec around this proposal now
that I think the security issue has been solved. Hopefully we can
convince GitHub to be the first implementors of it.

Let me know what you think!

-Brett

Pádraic Brady

Oct 7, 2010, 11:16:40 AM
to pubsub...@googlegroups.com
Hi Brett,
 
The problem as I see it is that we're pushing more responsibility onto the Subscribers while simultaneously assuming all Hubs will use SSL/TLS correctly. While I wish I had more faith in my fellow programmers, I don't. I do after all come from PHP, where the SSL context of PHP Streams (a popular alternative to curl) does not verify SSL certificates by default. Hard to keep the faith alive over here ;). I can also cite a very large list of clients/libraries which, if not using PHP Streams, will disable cert verification anyway through curl options or similar. Apparently, certificates make testing really, really hard. I wish I were joking but that is a common explanation despite the obvious fact that end users are presumably not guinea pigs. We hope.
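
To illustrate -- in Python rather than PHP, purely as a sketch of the same
anti-pattern -- the gap between doing it right and making the warnings go
away is two lines, which is exactly why it happens so often:

    import ssl
    import urllib.request

    callback = "https://subscriber.example.com/callback"  # hypothetical

    # Correct: verify the server certificate against a CA store.
    good_ctx = ssl.create_default_context()
    urllib.request.urlopen(callback, context=good_ctx)

    # Depressingly common: switch verification off so "testing works",
    # silently giving up the MITM protection the whole scheme leans on.
    bad_ctx = ssl.create_default_context()
    bad_ctx.check_hostname = False
    bad_ctx.verify_mode = ssl.CERT_NONE
    urllib.request.urlopen(callback, context=bad_ctx)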
 
My point, if anything, is that we are dividing what we protect and subjecting all of it to a defence which is already implemented quite poorly in practice. On the one hand, we have a signature for request bodies, and on the other hand, nothing for headers. If the SSL/TLS protection held true, why even rely on body signatures at all? That would put us on a footing with OAuth 2.0's bearer token and the concepts behind it. Of course, we then end up with all the same problems.
 
The problem with this approach is exactly what I opened with: it requires clients to obey the rules. Not all of them will. Those with sufficient security oversight will get it perfectly right, and then get completely outnumbered by the little Hubs/Subscribers springing up which value a valid response over dealing with SSL exceptions.
 
In this scenario, SSL/TLS becomes a single point of failure in a system where failure is not only common but sometimes encouraged. In a network of Hubs and Subscribers where parties may fail to verify certs, MITM attacks will be possible. Extending from that, not signing ALL components of the request will allow for requests to be manipulated without invalidating the signature of whatever subset of data is signed. Further, and it's a problem in the existing scheme, replay attacks will remain possible until a non-repeating (within a reasonable time horizon) nonce value is introduced into the signing method. Otherwise you're not only subject to replay attacks, but also to attempts to craft a validly signed request via remote timing attacks.
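
A sketch of the two mitigations I mean, in Python -- the nonce and
timestamp parameters are hypothetical, nothing in the current spec
carries them:

    import hashlib
    import hmac
    import time

    SEEN_NONCES = {}  # nonce -> timestamp; prune entries older than WINDOW
    WINDOW = 300      # seconds a (nonce, timestamp) pair stays acceptable

    def verify(secret, body, nonce, timestamp, signature):
        if abs(time.time() - float(timestamp)) > WINDOW or nonce in SEEN_NONCES:
            return False  # stale or replayed
        SEEN_NONCES[nonce] = timestamp
        expected = hmac.new(secret, nonce.encode() + timestamp.encode() + body,
                            hashlib.sha1).hexdigest()
        # Constant-time comparison closes the remote timing side channel.
        return hmac.compare_digest(expected, signature)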
 
SSL/TLS would be perfect if not for all the programmers in the world happy to ignore it.
 
That's why signatures remain compelling. Yes, they are a pain in the ass to develop. Yes, they can lead to incompatibilities between large poster-child implementations. Yes, programmers hate them with a vengeance. Why? Because programmers can't ignore them by setting curl's peer verification to false. They actually have to go and deal with them properly or nothing will work. In a sense, they are supposed to be a PITA.
 
My 2c for what it's worth.
 
Pádraic Brady

http://blog.astrumfutura.com
http://www.survivethedeepend.com
Zend Framework Community Review Team




Monica Keller

Oct 7, 2010, 11:24:41 AM
to Brett Slatkin, pubsubhubbub, Joseph Smarr, martin...@gmail.com
Hey Brett,
Indeed, one of the major requirements for my proposal was to be able to use PubSubHubbub for APIs. That comes from my experience with real-time at MySpace and Facebook, and I believe it also made sense for the Buzz team.
I think that telling Service Providers that they can keep their response format will really help with adoption, as GitHub's request shows.
The concerns for Option 1 (http://code.google.com/p/pubsubhubbub/wiki/ArbitraryContentTypes) were:
- Signing of headers, which HTTPS would help with
- Putting burden on subscribers to handle the different HTTP methods (DELETE, PUT) -- not a huge concern

Would we now be asking all subscribers to have SSL certs? That is a fairly big requirement.

OAuth 2 burdens the service providers with this, so I have concerns about burdening the subscribers with it.

My other question would be whether webhooks are a better fit today for APIs, since there really isn't a need for a hub to fan out.

As much as I love PubSubHubbub, I think we should answer the question of how many service providers would want to push their response to another hub. MySpace and FB didn't really need an external hub. At Socialcast it's the same thing: we are going to add PuSH, but it's a private response for which you need to authenticate.

My experience leads me to believe there is a serious need to support a publisher who is its own hub.

Brett Slatkin

Nov 1, 2010, 5:37:28 PM
to pubsub...@googlegroups.com
Hey Pádraic,

Sorry for the delay in my response. Been a busy October! Thanks so
much for the detailed reply. I've got some follow-up.

> My point, if anything, is that we are dividing what we protect and
> subjecting all of it to a defence which is already implemented quite poorly
> in practice. On the one hand, we have a signature for request bodies, and on
> the other hand, nothing for headers. If the SSL/TLS protection held true -
> why even rely on body signatures at all?

Why: Because the SSL/TLS request coming from the hub is of unknown
origin (unless you use SSL *client* certs, which is never going to
happen). You use the body signature as a bearer token to verify the
authenticity of the Hub that's pushing the new content to subscribers.

> In this scenario, SSL/TLS becomes a single point of failure in a system
> where failure is not only common but sometimes encouraged. In a network of
> Hubs and Subscribers where parties may fail to verify certs, MITM attacks
> will be possible. Extending from that, not signing ALL components of the
> request will allow for requests to be manipulated without invalidating the
> signature of whatever subset of data is signed.

The body is signed by an HMAC of a shared secret and the content. Even
if the hub *never* validates server certs for subscribers, the message
will not be subject to replay attacks unless the SSL connection itself
is compromised. The existence/chance of such MITM attacks in the wild
is low. My claim is that hubs who care about the privacy of the data
they distribute should verify server certs properly and require valid
CA-signed subscriber certificates.
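
For concreteness, a minimal hub-side sketch in Python (the names are
illustrative): the signature covers the raw body bytes and nothing else,
and cert validation is the transport's job:

    import hashlib
    import hmac
    import urllib.request

    def push_notification(callback_url, secret, body):
        sig = "sha1=" + hmac.new(secret, body, hashlib.sha1).hexdigest()
        req = urllib.request.Request(callback_url, data=body, headers={
            "Content-Type": "application/atom+xml",
            "X-Hub-Signature": sig,
        })
        # A hub that cares about the privacy of this data must leave
        # server-cert verification on for this HTTPS request.
        return urllib.request.urlopen(req)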

> That's why signatures remain compelling. Yes, they are a pain in the ass to
> develop. Yes, they can lead to incompatibilities between large posterchild
> implementations. Yes, programmers hate them with a vengeance. Why? Because
> programmers can't ignore them by setting curl's peer verification to false.
> They actually have to go and deal with them properly or nothing will work.
> In a sense, they are supposed to be a PITA.

The problem with this conclusion, as I see it, is that full signing is
way too hard and people won't do it. There are rare exceptions to this
(the Twitter API) that have much more to do with the current market
conditions than technical considerations.

So in this proposal I'm trying to split the difference of usability
vs. security. Thus far, every proposal I've seen for arbitrary content
type support in PubSubHubbub requires a level of complexity in signing
that approaches that of OAuth 1.0 (a bad thing). If the alternative
is way easier but requires hubs to do proper server cert validation,
is that worth it?

A potential alternative approach to this is the up-and-coming signing standard:
http://developers.facebook.com/docs/authentication/canvas

Which I hear will be integrated into OAuth2:
https://docs.google.com/document/pub?id=1kv6Oz_HRnWa0DaJx_SQ5Qlk_yqs_7zNAm75-FmKwNo4&pli=1

We could try to use that for a combined body/header signing scheme. It
could use HMAC-SHA1 as the signing algorithm and the shared secret
established at subscription time to sign the message.
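
Very roughly, and only as a sketch -- the envelope layout below is
invented for illustration, not taken from the signed_request docs -- a
combined scheme could look like this in Python:

    import base64
    import hashlib
    import hmac
    import json

    def sign_message(secret, headers, body):
        envelope = json.dumps({
            "headers": headers,  # e.g. Date and Content-Type to protect
            "body": base64.urlsafe_b64encode(body).decode(),
        }).encode()
        sig = hmac.new(secret, envelope, hashlib.sha1).digest()
        return (base64.urlsafe_b64encode(sig) + b"." +
                base64.urlsafe_b64encode(envelope))

    def verify_message(secret, signed):
        sig_b64, env_b64 = signed.split(b".", 1)
        envelope = base64.urlsafe_b64decode(env_b64)
        expected = hmac.new(secret, envelope, hashlib.sha1).digest()
        if not hmac.compare_digest(base64.urlsafe_b64decode(sig_b64), expected):
            raise ValueError("bad signature")
        return json.loads(envelope)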

-Brett

Brett Slatkin

Nov 1, 2010, 5:43:33 PM
to Monica Keller, pubsubhubbub, Joseph Smarr
Hey Monica,

Thanks a lot for the response and my apologies for taking so long to
get back to you.

On Thu, Oct 7, 2010 at 11:24 AM, Monica Keller <monica...@gmail.com> wrote:
> Concerns for Option1 here


> -Putting burden on subscribers to handle the different HTTP methods (DELETE,
> PUT) -- Not a huge concern

Indeed, and the method stuff may just be in the X-HTTP-Method-Override
header anyway.

> Would we now be asking all subscribers to have SSL certs? That is a fairly
> big requirement.
>
> OAuth 2 burdens the service providers with this, so I have concerns about
> burdening the subscribers with it.

Yes, I agree that's an issue. My hope was that there is a way to have Hubs
cache SSL cert fingerprints, so even a self-signed cert could be trusted
if it was the same one that was originally used to establish the
subscription.
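
Something like this Python sketch (the helper is hypothetical) is what I
have in mind -- pin the fingerprint seen at subscription time and compare
it before each delivery:

    import hashlib
    import ssl

    def cert_fingerprint(host, port=443):
        pem = ssl.get_server_certificate((host, port))
        return hashlib.sha256(ssl.PEM_cert_to_DER_cert(pem)).hexdigest()

    # At subscription time, store alongside the subscription record:
    pinned = cert_fingerprint("subscriber.example.com")

    # Before delivering a notification:
    if cert_fingerprint("subscriber.example.com") != pinned:
        raise ValueError("subscriber certificate changed since subscription")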

> My other question would be whether webhooks are a better fit today for APIs,
> since there really isn't a need for a hub to fan out.
>
> As much as I love PubSubHubbub I think we should answer the question of how
> many service providers would want to push their response to another hub.
> MySpace and FB didn't really need an external hub. At Socialcast it's the
> same thing: we are going to add PuSH, but it's a private response for which
> you need to authenticate
>
> My experience leads me to believe there is a serious need to support a
> publisher who is its own hub.

Well I totally agree with you that the common case is becoming people
running their own hub. The old light-pings are mostly there for
bootstrapping and boosting adoption. However, even if you run your own
hub, how do you achieve "a private response for which you need to
authenticate"? What is Socialcast using for authentication from your
self-run hub to the subscriber?

X-Hub-Signature works well enough for payload-only messages, but what
about messages that have headers, like arbitrary content? I don't
think that running your own hub alleviates that problem, which is why
I'm looking for a general solution that all providers can employ. Does
that make sense?

-Brett

Eric Williams

Nov 1, 2010, 7:53:23 PM
to pubsub...@googlegroups.com
On 11/1/2010 4:43 PM, Brett Slatkin wrote:
> X-Hub-Signature works well enough for payload-only messages, but what
> about messages that have headers, like arbitrary content? I don't
> think that running your own hub alleviates that problem, which is why
> I'm looking for a general solution that all providers can employ. Does
> that make sense?

What if the application/http MIME type was used as the content body?
That would allow the headers for the content and the headers for the PuSH
message to be fully separated. I see some possible problems related to
the difficulty of parsing, however...
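
For illustration only (the header names and values are made up), the
notification would carry a complete HTTP message as its body, something
like:

    POST /callback HTTP/1.1
    Host: subscriber.example.com
    Content-Type: application/http
    X-Hub-Signature: sha1=...

    HTTP/1.1 200 OK
    Date: Mon, 01 Nov 2010 19:53:23 GMT
    Content-Type: application/json

    {"id": "12345", "status": "updated"}

That way the existing body-only signature would effectively cover the
content's headers as well.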

Martin Atkins

Nov 29, 2010, 12:10:53 PM
to pubsub...@googlegroups.com

Eric,

This is the approach Brett alluded to in the subject line when he
referred to "turducken". It was discussed as a possible solution, but
some folks at the table found nesting an HTTP message in the body of an
HTTP message to be distasteful/confusing.

I don't have any major objection to it on principle, but I can
sympathize with the viewpoint that it puts an unusual burden on
subscribers since many web application frameworks won't expose a
facility to parse an arbitrary string as an HTTP message and so
implementers would end up working around their framework to implement
such a thing, and that is likely to lead to bugs and security issues.

John Panzer

Nov 29, 2010, 1:14:26 PM
to pubsub...@googlegroups.com
As another data point, when we tried something similar for OpenSocial we found that PHP offered no facilities for parsing this sort of thing -- requiring new libraries to be rolled to do MIME parsing at least.  This was a discouraging finding at the time (2008).  I haven't heard that this situation has changed.

--
John Panzer / Google
jpa...@google.com / abstractioneer.org / @jpanzer