PSHB is not using fat pings. There are use cases for fat pings that
are under discussion, but fat pings are not in the spec at this time.
alexis
Alexis is referring to #1 not being fat. The story thus far has been
if a publisher wants to fat ping, they should integrate a Hub into
their CMS.
Otherwise, the benefits of fat content pushing versus URL forwarding
are discussed in some detail here:
http://code.google.com/p/pubsubhubbub/wiki/ComparingProtocols
To your second point, Subscribers should never synchronously process updates. They should be dumped immediately to a job queue for asynchronous processing. This will help spread the processing load more evenly over time instead of being clumped together which I gather is what you're against. So it's receive update, verify it is an update (input validation), dump update to queue, and respond with a 200 code.
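A minimal sketch of that receive, validate, enqueue, respond flow, in case it helps. Everything here is an assumption for illustration: the names `receive_update`, `worker` and `jobs` are made up, the queue is an in-process `queue.Queue` where a real subscriber would more likely use a durable job queue, and "validation" is reduced to checking that the body is well-formed XML.

```python
import queue
import xml.etree.ElementTree as ET

# Hypothetical in-process job queue; a real deployment would likely use a durable one.
jobs = queue.Queue()


def receive_update(body: bytes) -> int:
    """Handle one POST from the hub and return the HTTP status code to send back."""
    try:
        # Cheap input validation only: is the body well-formed XML at all?
        ET.fromstring(body)
    except ET.ParseError:
        return 400
    # No real processing here; just hand the payload to the queue.
    jobs.put(body)
    return 200


def worker(handle) -> None:
    """Drain the queue asynchronously; `None` is used as a stop signal."""
    while True:
        body = jobs.get()
        if body is None:
            break
        handle(body)  # the expensive work happens here, off the request path
```

The HTTP handler would call `receive_update` with the request body and return its status code right away, while one or more threads run `worker` with whatever function does the actual processing.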
On Tue, Oct 27, 2009 at 9:56 AM, Alex Barth <lxb...@gmail.com> wrote:
> How do you guys see the advantages/disadvantages of POSTing feed data
> in these scenarios:
>
> 1. Hub does not serve delta feed. In my mind, this can be
> interesting for 3 reasons: a) building simple hubs that don't inspect
> feeds at all b) building hubs that are completely agnostic to their
> feed formats, c) hubs convert feed to standard format, subscribers
> pull the first feed data from hub, not from original publisher (heck,
> how do the superfeedr guys do that?)
I think (a) isn't too compelling. We're going to have a few, very
well-tested hub implementations that people can run or use as a hosted
service.
We would like (b) to be part of the core spec eventually, with other
secondary specs that explain how to do differential updates for
secondary content types (if necessary).
For (c), Superfeedr is acting as a federated hub, meaning they
subscribe to all other hubs' updates and proxy them to their
subscribers. This allows for composition and data transformation.
> 2. Large data sets (e.g. DC's 2009 crime feed is 1.2MB) http://data.octo.dc.gov/
I think distributing just the changes is significantly more efficient
for large feeds. Instead of pushing 1.2MB each time the feed changes
to 1000+ subscribers, you can just send the newest 2KB update.
Combined with the Atom Tombstoning draft spec
(http://www.ietf.org/id/draft-snell-atompub-tombstones-06.txt) we
should be able to get Hubbub to communicate new and deleted content in
the same way.
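To put rough numbers on the bandwidth point, here is a back-of-the-envelope calculation. It assumes exactly 1000 subscribers, decimal units (1.2MB = 1,200,000 bytes, 2KB = 2,000 bytes), and ignores HTTP overhead; `fanout_bytes` is just an illustrative helper.

```python
def fanout_bytes(payload_bytes: int, subscribers: int) -> int:
    """Total bytes the hub sends when one payload goes to every subscriber."""
    return payload_bytes * subscribers


FULL_FEED = 1_200_000  # 1.2MB feed, decimal units (assumption)
DELTA = 2_000          # 2KB update, decimal units (assumption)
SUBSCRIBERS = 1000     # "1000+" rounded down for the example

full = fanout_bytes(FULL_FEED, SUBSCRIBERS)
delta = fanout_bytes(DELTA, SUBSCRIBERS)
print(full, delta, full // delta)
```

Under those assumptions that is about 1.2GB per change for the full feed versus about 2MB for the delta, a factor of 600.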
> 3. Many and often changing subscribers - wouldn't this lead to
> unnecessarily sent large POST requests to subscribers that actually
> don't exist anymore?
Subscriptions in the hub have a lease period and must be checked for
validity every so often. This allows the hub to prune old/bad
subscribers that aren't receiving the feed anymore. Again, this lets
the data flow be streamlined to the minimum bandwidth possible.
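Roughly, the pruning idea looks like the sketch below. This is not how any particular hub implements it; `Subscription`, `prune` and the `reverify` callable are hypothetical names, and the re-verification request itself is left out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Subscription:
    callback_url: str
    lease_expires_at: float  # seconds since the epoch


def prune(
    subs: List[Subscription],
    now: float,
    reverify: Callable[[Subscription], bool],
    lease_seconds: float,
) -> List[Subscription]:
    """Keep subscriptions whose lease is still valid or that re-verify successfully."""
    kept = []
    for sub in subs:
        if sub.lease_expires_at > now:
            kept.append(sub)  # lease still running, nothing to check yet
        elif reverify(sub):
            # Subscriber confirmed it still wants the feed: renew the lease.
            sub.lease_expires_at = now + lease_seconds
            kept.append(sub)
        # Otherwise the subscription is dropped and gets no further POSTs.
    return kept
```

A subscriber that has gone away simply fails the check when its lease runs out, so it stops receiving POSTs after at most one lease period.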
Hope that helps,
-Brett
I still have some concerns about feed-format-agnostic hubs and heavy
payloads, but I need to sit down and do more homework on this.
It may well be the case - as many posters argued here - that these
concerns are not relevant in practice.
Alex