In the diagram you'll note that I've included mockups of the curl
commands needed to actually perform these steps. I think the fact
that this approach can be carried out using nothing but a few simple
HTTP[S] requests speaks volumes for its potential adoption, and I hope
that we can start with this and layer the crypto on as necessary once
we have a better idea of where the pain points of this sort of flow
might emerge (i.e., it's not yet clear which aspects of this approach
will be hardest to implement).
As Laurent very correctly pointed out earlier on this list and in a
conversation we had last week, this approach can also very neatly
represent a generic approach to obtaining protected content on the web
by an authenticated, named entity. The only thing missing from the
generic approach is a way to verify the requesting (delegate) site*.
b.
* possible candidate approaches: client-side SSL certs (associated
with the delegate domain), client-side SSL certs (hosted in the
webfinger profile of the user), Magic Signatures (either for
identifying the domain or the webfinger profile).
The pubsub dialback isn't including the 'From' user's address, which
makes it harder for me, as a subscribing server, to tell which
subscription is being confirmed. Is this meant to be distinguished
based on the challenge string, or should we include the 'From' here as
well?
(I'd still prefer it to be in the query string/POST form data with the
rest of the PuSH parameters; using a second data channel just for one
parameter may make it more likely for implementors to make mistakes.)
-- brion vibber (brion @ status.net)
Good point – the challenge string should absolutely be sufficient;
because we don't have a "To" (it's not guaranteed that the webfinger
address you use to look up a user will be the same as the address they
use to identify themselves on a publishing site), we can't tell if the
subscriber knows the address or just the feed URL. As such, sending
the address would disclose information unnecessarily.
So I think it's safer overall to rely on the PSHB mechanisms; this
has the side-benefit that the approach remains virtually identical for
non-PSHB implementations.
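For context, a minimal sketch of the subscriber side under standard PubSubHubbub 0.3 assumptions. Here a subscriber-chosen hub.verify_token plays the role the challenge plays in the dialback discussion: it lets the subscriber match the hub's verification GET to a pending request without any 'From' address travelling back. The callback URL and the pending-request store are invented for illustration.

```python
import secrets

pending = {}  # verify_token -> (local_user, topic)

def start_subscribe(local_user, topic):
    """Record a pending subscription before POSTing it to the hub."""
    token = secrets.token_urlsafe(16)
    pending[token] = (local_user, topic)
    return {
        "hub.mode": "subscribe",
        "hub.topic": topic,
        "hub.callback": "https://alice.example/push",  # hypothetical
        "hub.verify": "sync",
        "hub.verify_token": token,
    }

def verify_callback(params):
    """Handle the hub's verification GET; echoing hub.challenge confirms."""
    token = params.get("hub.verify_token")
    if token not in pending or pending[token][1] != params["hub.topic"]:
        return 404, ""                       # unknown verification -> refuse
    pending.pop(token)
    return 200, params["hub.challenge"]      # echo challenge to confirm
```

The point being: the subscriber already holds enough state to disambiguate, so sending the 'From' address back would disclose information without adding anything.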
> (I'd still prefer it to be in the query string/POST form data with the rest
> of the PuSH parameters; using a second data channel just for one parameter
> may make it more likely for implementors to make mistakes.)
The challenge there is that if we use query string or form-encoded
body parameters, there's a chance of interfering with
application-level parameters. One really nice property of using the
header-based approach is that it becomes a legitimate method for
generalised authentication. The same isn't true if we use get/post
variables.
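To make the namespace-collision point concrete, a small sketch (URLs and the "from" form field are invented): putting the authenticated address in a header keeps it in a separate namespace from whatever parameters the application itself defines, even if the application happens to use a field with the same name.

```python
from urllib.request import Request

# Hypothetical subscribe request: the application body defines its own
# "from" field (say, a date filter), while the authenticated identity
# travels in the From header -- two channels, no possible collision.
body = ("hub.mode=subscribe"
        "&hub.topic=https%3A%2F%2Fbob.example%2Ffeed"
        "&from=2010-01-01")
req = Request("https://hub.example/subscribe", data=body.encode(),
              headers={"From": "al...@example.com"})  # auth channel
```

Had the identity been folded into the form body instead, the application-level "from" and the authentication "from" would have to fight over one name.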
FWIW, the only reason we included support for query string and form
variables in OAuth was because, IIRC, older versions of Flash didn't
allow setting Authorization headers, so e.g., building OAuth API
clients on the Chumby or crappy phones with Flash Lite would have been
impossible (gasp! horrors! – what a difference a few years make ;-)).
b.
b.
> FWIW, the only reason we included support for query string and form
> variables in OAuth was because iirc, older versions of Flash didn't
> allow setting Authorization headers, so e.g., building OAuth API
> clients on the Chumby or crappy phones with Flash Lite would have been
> impossible (gasp! horrors! – what a difference a few years make ;-)
> ).
Haha... Good to have that historical context. Well, it's a nice
side-benefit that it lets you build capability URLs. Clearly someone
thought that was useful since OAuth2 reuses that approach heavily,
yeah?
I think it was also because CGI & PHP apps were not (still are not?)
given access to the Authorization headers by Apache and Lighttpd,
which meant that you couldn't run an OAuth-enabled service in a shared
hosting environment.
Hmm, well my worry is that we're bypassing the PSHB verification
mechanisms, making the system more fragile.
If I'm understanding this right, I've got a scenario where things break
because we're not actually confirming to the subscribing server that the
requesting user's identity is part of the subscription data.
When a private-aware client attempts to subscribe to feeds on a
non-private-aware hub we may have trouble, since there's a mismatch in
what uniquely identifies a particular subscription:
Alice's server: (from, topic, callback)
Bob's hub: (topic, callback)
Bob's hub is completely unaware of the 'From' header and ignores it.
The verification callback passes back all the PubSubHubbub parameters
for confirmation but not the 'From' header, so Alice's subscribing
server can't tell whether or not Bob's hub saw it. The subscription is
probably confirmed immediately (204 response), which Alice's server
assumes means that Bob has pre-approved Alice as a friend.
If Charlie, another user on Alice's server, now follows Bob, Alice's
server will send another subscription request, this time with Charlie in
the 'From' header. Bob's hub will again ignore the 'From' header, and
will consider it to be an update to the original subscription:
"Hubs MUST allow subscribers to re-request subscriptions
that are already activated. Each subsequent request to a
hub to subscribe or unsubscribe MUST override the previous
subscription state for a specific topic URL and callback
URL combination once the action is verified. Any failures to
confirm the subscription action MUST leave the subscription
state unchanged. This is required so subscribers can renew
their subscriptions before the lease seconds period is over
without any interruption."
Now, if either Charlie or Alice unsubscribes, Alice's server will send
an unsubscription request (with a From header which will be ignored),
which will tear down the single subscription that the hub knows about.
You can work around this by using a separate callback URL for every
individual feed+requester, but that feels fragile; it looks like things
are supposed to work with a single callback URL on each subscribing site.
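The identifier mismatch above can be sketched in a few lines (addresses and URLs are illustrative): Alice's server keys subscriptions by (from, topic, callback), while a non-private-aware hub keys only by (topic, callback), so Charlie's later request silently replaces Alice's from the hub's point of view.

```python
hub_subs = {}      # what Bob's non-private-aware hub tracks
server_subs = {}   # what Alice's subscribing server tracks

def subscribe(from_addr, topic, callback):
    server_subs[(from_addr, topic, callback)] = "active"
    hub_subs[(topic, callback)] = "active"   # the 'From' is dropped here

subscribe("al...@example.com", "https://bob.example/feed",
          "https://a.example/cb")
subscribe("char...@example.com", "https://bob.example/feed",
          "https://a.example/cb")
# Alice's server now believes there are two live subscriptions;
# Bob's hub sees exactly one.
```

One unsubscribe from either side then tears down the single record the hub holds, orphaning the other follower.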
This is an important point, but I think it's okay, because the
subscription response (assuming the preconditions succeed) is *always*
"OK, you're subscribed". The actual data flow is decided out-of-band,
so that even though it *looks like* Bob has approved the subscription,
he may or may not be sending actual data.
The upshot of this is that if Bob's hub ignores From data, then by
necessity it means that either:
1. Bob's feed is public, or
2. Alice has a capability URL that allows private access to Bob's feed.
In the latter case, Bob "SHOULD" keep track of who or what has access
to that capability URL.
> Now, if either Charlie or Alice unsubscribes, Alice's server will send an
> unsubscription request (with a From header which will be ignored), which
> will tear down the single subscription that the hub knows about.
This is true, except that it would be a bug in the subscribing server
– the subscribing server should only send an actual unsubscribe
request to Bob's hub when there are no remaining subscribers.
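A minimal sketch of that fix: the subscribing server reference-counts local followers per (topic, callback) and only talks to the hub on the first follow and the last unfollow. The function names are invented.

```python
from collections import defaultdict

followers = defaultdict(set)   # (topic, callback) -> set of local users

def follow(user, topic, callback):
    """Returns the hub action the server should take, if any."""
    first = not followers[(topic, callback)]
    followers[(topic, callback)].add(user)
    return "send subscribe" if first else "no-op"

def unfollow(user, topic, callback):
    followers[(topic, callback)].discard(user)
    return ("send unsubscribe"
            if not followers[(topic, callback)] else "no-op")
```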
> You can work around this by using a separate callback URL for every
> individual feed+requester, but that feels fragile; it looks like things are
> supposed to work with a single callback URL on each subscribing site.
That's definitely one approach, and allows for fine-grained access
control when the hub *is* tracking who is subscribing, but I don't
think it's necessary for the scenario you're describing here.
The other approach is for hubs to do a redirect on the feed URL
per-subscriber, so that the From address becomes a conditional on
which the topic is defined. It might be that this isn't possible with
PSHB, but I think it is – the hub simply needs to redirect the
subscribe endpoint to another subscribe endpoint. Whether that works
in the wild is another question. ;-)
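One way a private-aware hub might realise that redirect, sketched under invented assumptions (the URL scheme and hashing are purely illustrative): mint a per-subscriber topic URL that folds the From address in, so the (topic, callback) pair the PuSH machinery already tracks is unique per requester.

```python
import hashlib

def per_subscriber_topic(base_topic, from_addr):
    """Derive a hypothetical per-subscriber topic URL from the base feed."""
    tag = hashlib.sha256(f"{base_topic}|{from_addr}".encode()).hexdigest()[:16]
    return f"{base_topic}?for={tag}"

t1 = per_subscriber_topic("https://bob.example/feed", "al...@example.com")
t2 = per_subscriber_topic("https://bob.example/feed", "char...@example.com")
# Distinct topics, so the two subscriptions no longer collide on the hub.
```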
But absolutely all valid and important points. I suspect these issues
remain with the OAuth scenario UNLESS the actual subscribing user is
required to do an OAuth click-through on the AuthZ server, which has
its own set of (mostly UX) challenges.
b.
I'm not expecting to win any converts on this forum - but maybe help the wheels turn and get you thinking outside the box.
Evan Prodromou, CEO StatusNet Inc., 1124 rue Marie-Anne Est #32,
Montreal, QC H2J 2T5 T: 438-380-4801 x101 C: 514-554-3826
W: http://evan.status.net/ E: ev...@status.net
Indeed, and that's my goal for the OAuth/PubSubHubbub approach: User
from site X goes to site Y to actually request following a user over
there. My assumption is big companies like Google and Facebook need an
"interstitial" on-site in order to present a ToS to the user. The
lawyers freak out without it.
I'm concerned about the UX implications of that. I think before baking
interstitial UI into a protocol someone should see about doing some
usability testing to see if that interstitial page causes dropoff.
There's no point in going to the trouble to design a lawyer-friendly
protocol if users can't actually use it.
I think everyone is extremely concerned about the UX implications of
delegated authorization and cross-site privacy. The OAuth flows sites
already have in use have been user-tested quite a lot, as I hear. So that's
a concrete, existing system that we can build on, that people are
reasonably happy with, that the lawyers like, etc. If it causes
drop-off (likely), my hope is we can iterate and improve. It's in
everyone's best interest (wrt: FWS) to maximize the simplicity of
establishing cross-site connections.
If a site doesn't need to present the interstitial, then it can just
immediately redirect back to the OAuth consumer. I believe this
situation is already covered.
The OAuth flow has been tested as a means to sign into one site with an
account for another. It has not been tested in the context of a specific
interaction like following, which in the non-federated scenario is
simply a click of a button with no interstitial UI at all.
I got the feeling in the discussion at the summit that what the lawyers
really wanted was to get TOS agreement from the application developer,
not from the user. Usually the OAuth interstitial page is used for the
user to authorize the release of private data, but in this case the user
going through the flow is the user requesting the data, not the user
releasing it, so I don't think the two are really comparable from a UX
perspective.
I don't know about anyone else, but I'd be satisfied with a transitional
strategy where providers can choose whether to require client
pre-registration by weighing the desire for lawyer happiness vs. being
supported by all clients; I'd like to think that over time sanity will
prevail and client pre-registration will go away except for a few big,
stubborn providers that I (as a client developer) can begrudgingly
pre-register for.
Getting in the user's way to avoid client pre-registration is a step in
the wrong direction and I expect will cause *more* client
pre-registration (to escape the terrible UX) rather than less.
> I'm concerned about the UX implications of that. I think before baking
> interstitial UI into a protocol someone should see about doing some
> usability testing to see if that interstitial page causes dropoff.
> There's no point in going to the trouble to design a lawyer-friendly
> protocol if users can't actually use it.
At the summit, I was adamant that the legal issues must come after the
UX and protocol issues. The only scenario that was raised (by David)
that couldn't be neatly solved by having the publisher agree to a ToS
that allowed syndication of content [if the publisher allowed people
from other networks to view that publisher's content] was the
following:
- publ...@example.com posts content.
- content is syndicated to subsc...@example.com (no ToS problem)
and subsc...@notexample.com (no ToS problem as long as
publ...@example.com has asserted that syndication is OK, which is
explicit in ALL ToSs for websites with [public] feeds today).
- subsc...@example.com comments on publ...@example.com's content.
- Now, subsc...@example.com HAS NOT allowed
subsc...@notexample.com to see her content. Does the comment get
syndicated? What are the ToS implications if it is? What are the UX
implications if it is not?
In my mind, there are [at least] three ways to do this *without*
baking ToS click-throughs into the protocol, in order of
UX-friendliness preference:
1. example.com and notexample.com get ToS agreements from publisher,
subscriber1, and subscriber2 that give the services permission to
syndicate any content posted to the sites. Obviously trickier if
subscriber2 is commenting on publisher's post - that comment can be
syndicated to publisher, but it's somewhat unclear what happens for
subscriber1 and another subscriber on a 3rd site. Let the lawyers
figure that out.
2. Syndicate an alert to the subscriber saying "You need to click this
ToS link in order to receive comments on publ...@example.com's post,
and assert that we may re-syndicate your comments on
publ...@example.com's post."
3. Have a discoverable mechanism / extension to PSHB that transfers
ToS responsibility from server-to-server, like an NDA in reverse, or a
non-assert covenant. On subscription, the subscribing server says
something programmatic like "by publishing this feed, you assert that
you have permission to publish everything that you send to me", and
the publisher server needs to have some verified way of saying "yes, I
have permission". This mechanism could be extended to license
assertions like "I have permission to syndicate stuff, and you have
permission to re-syndicate it as long as a link back to the original
is included." or "I have permission to syndicate content, but you're
not allowed to re-syndicate it."
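Option 3 could look something like the following, purely illustrative sketch: a hypothetical extension parameter (nothing here is part of any real spec) carried alongside the normal subscribe parameters, which the hub checks before accepting.

```python
# Invented "x.tos.*" extension parameters for a server-to-server
# syndication assertion, riding along with a normal PSHB subscribe.
subscribe_params = {
    "hub.mode": "subscribe",
    "hub.topic": "https://example.com/publisher/feed",
    "hub.callback": "https://notexample.com/push",
    "x.tos.assertion": "may-syndicate",          # hypothetical
    "x.tos.resyndicate": "with-attribution",     # or "forbidden"
}

def hub_accepts(params):
    """Hub-side check: accept only if the publisher asserts it has
    permission to publish everything it will send."""
    return params.get("x.tos.assertion") == "may-syndicate"
```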
... So anyhow, I don't think it's necessary to bake in a ToS click
into the protocol at such a low level, for some theoretical legal
concerns (given that there are a number of potential solutions to the
problem).
And, anyhow, what the heck are we worried about GOOGLE's lawyers for?
Google is crawling the hell out of all sorts of copyrighted material,
and doing all sorts of vaguely legal stuff with it, all without ToSes
of any sort. The Google Book Settlement is explicitly a legal hack
after the fact to deal with the fact that Google has stolen and is
arguably illegally reproducing content. Given this state of affairs,
why on earth would we bake in affordances for the lawyers who will
find a way, when necessary, no matter what the protocol looks like?
> I got the feeling in the discussion at the summit that what the lawyers
> really wanted was to get TOS agreement from the application developer, not
> from the user. Usually the OAuth interstitial page is used for the user to
> authorize the release of private data, but in this case the user going
> through the flow is the user requesting the data, not the user releasing it,
> so I don't think the two are really comparable from a UX perspective.
You said:
#1 The OAuth page is "used for the user to authorize the release of
private data".
#2 Versus, "the user going through the flow is the user requesting the
data, not the user releasing it".
My claim is that the second flow *is also* the user releasing the
private data to a third-party site.
Consider the case where the OAuth Provider knows in advance *who* is
requesting access (determined through oauth_consumer_key, From:,
whatever). Once the Provider knows who the requestor is, it can make
the policy decision of, "yes, b...@status.net has access to
al...@gmail.com's feed". For example, Alice could have imported her
address book into Buzz and already given b...@status.net approval
before he ever requested it.
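That provider-side policy decision is trivial once the requester is identified; a sketch with invented names (the approvals store stands in for Alice's imported address book):

```python
# owner address -> set of requesters the owner has pre-approved
approvals = {"al...@gmail.com": {"b...@status.net"}}

def can_access(owner, requester):
    """Policy decision: has the owner already granted this requester?"""
    return requester in approvals.get(owner, set())
```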
Thus, the OAuth approval screen (with ToS) asserts that the user
looking at it (b...@status.net) does indeed already have access to the
requested resource. The only purpose of the approval screen, then, is
to confirm that the user (b...@status.net) is comfortable with making
the requested resource accessible by a third-party on their behalf.
What you've identified is a third situation where the OAuth provider
has never heard of b...@status.net and must consult Alice directly for
the policy decision. In that situation this flow conflates requesting
access to private data with approving release of private data.
I don't think it's a requirement to conflate these two actions. There
could just as well be a Salmon slap from b...@status.net to
al...@gmail.com before any OAuth dance occurs, requiring user approval
before syndication is allowed (this is Blaine's suggestion #2). My
concern is then there would be two approval dialogs for every
cross-site follow, since the OAuth approval dialog isn't going away. I
say OAuth approval isn't going away because of the experience I've had
with lawyers in the belly of the beast.
Does that make sense?
-Brett
On Thu, 5 Aug 2010 01:25:28 -0700 (PDT), "Mike (DFRN)"
<mi...@macgirvin.com> wrote:
> A re-tweet is on shaky ground if you abide by these rules. However a
> re-tweet is a different beast and you're already on shaky ground. You
> aren't forwarding an original post to a third party, you're creating a
> new post, which is now authored by you. Legally, by publishing it -
> you're engaged in copyright violation unless you use a fair-use
> defence and add a comment (hard to do sometimes in 140 chars)
If the re-tweet is a fundamental feature of the service, wouldn't that
imply that someone tweeting something is automatically
allowing that text to be re-tweeted? Just as posting something on Twitter
makes it publicly readable.