Identifying the feed when distributed


Julien

Jul 11, 2009, 10:58:09 PM
to Pubsubhubbub
Hey,

I know the root of the problem is RSS and not PSHB, yet I think it
should be addressed to ease adoption of the protocol.

I recently added RSS feeds to the hub to see how it behaves. I took my
identi.ca feed: http://identi.ca/api/statuses/user_timeline/julien.rss
This is a valid RSS feed (not perfect, yet valid):
http://beta.feedvalidator.org/check.cgi?url=http%3A%2F%2Fidenti.ca%2Fapi%2Fstatuses%2Fuser_timeline%2Fjulien.rss

The problem is that its content doesn't include anything that
identifies the feed's own URI. Here is an example of what my subscriber
received from the "demo" hub:

<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>julien timeline</title>
<link>http://identi.ca/julien</link>
<link href="http://identi.ca/main/sup#3429" xmlns="http://www.w3.org/2005/Atom" type="application/json" rel="http://api.friendfeed.com/2008/03#sup"/>
<description>Updates from julien on Identi.ca!</description>
<language>en-us</language>
<ttl>40</ttl>

<item>
<title>julien: Testing something!</title>
<description>Testing something!</description>
<pubDate>Sun, 12 Jul 2009 01:21:02 +0000</pubDate>
<guid>http://identi.ca/notice/6341652</guid>
<link>http://identi.ca/notice/6341652</link>
</item>

</channel>
</rss>


As a subscriber, I can identify the URI of the feed only if I keep a
"mapping" of feed URIs <-> source URIs (http://identi.ca/julien in
that specific case).
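
The mapping workaround described above can be sketched roughly as follows. This is only an illustration: the function name identify_feed_uri and the SOURCE_TO_FEED table are hypothetical, and the check for an Atom rel="self" link is an assumption about what a feed might carry (the sample feed above does not have one).

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical table kept by the subscriber: channel <link> -> feed URI.
SOURCE_TO_FEED = {
    "http://identi.ca/julien":
        "http://identi.ca/api/statuses/user_timeline/julien.rss",
}


def identify_feed_uri(rss_bytes, source_to_feed=SOURCE_TO_FEED):
    """Return the feed URI for a pushed RSS document, or None if unknown."""
    root = ET.fromstring(rss_bytes)
    channel = root.find("channel")
    if channel is None:
        return None
    # If the feed happens to carry an Atom rel="self" link, trust it.
    for link in channel.findall("{%s}link" % ATOM_NS):
        if link.get("rel") == "self" and link.get("href"):
            return link.get("href")
    # Otherwise fall back to the subscriber-side mapping.
    source = channel.findtext("link")
    if source is None:
        return None
    return source_to_feed.get(source.strip())
```

The fallback only works for feeds the subscriber already knows about, which is the limitation being raised here.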


Yet, I know that the hub fetched this feed itself, so the hub has its
URI, and I think it should pass it to the subscribers.

My suggestion is therefore to add a "fetched_from_uri" param to the
content distribution, as a fallback for when the XML itself doesn't
include the URI of the feed.

Let me know what you think.

Julien

Jeff Lindsay

Jul 12, 2009, 1:24:46 AM
to pubsub...@googlegroups.com
Perhaps as a header?

--
Jeff Lindsay
http://webhooks.org -- Make the web more programmable
http://shdh.org -- A party for hackers and thinkers
http://tigdb.com -- Discover indie games
http://progrium.com -- More interesting things

Brett Slatkin

Jul 12, 2009, 4:21:59 AM
to pubsub...@googlegroups.com
Hey Julien,

I need to put together two "best practices" wiki pages, one for
subscribers and another for publishers. Questions like this would go
in there.

The "callback" URL for a subscriber is supposed to be a real callback,
i.e., a pure closure like in functional programming
(http://en.wikipedia.org/wiki/Closure_%28computer_science%29). That
means you can encode whatever you want in the callback URL. You can
even make these parameters secure by signing them or encrypting them.

For example, in your case:

> <?xml version="1.0" encoding="utf-8"?>
> <rss version="2.0">
>  <channel>
>  <title>julien timeline</title>
>  <link>http://identi.ca/julien</link>
>  <link href="http://identi.ca/main/sup#3429" xmlns="http://www.w3.org/2005/Atom" type="application/json" rel="http://api.friendfeed.com/2008/03#sup"/>
>  <description>Updates from julien on Identi.ca!</description>
>  <language>en-us</language>
>  <ttl>40</ttl>
>
> <item>
>   <title>julien: Testing something!</title>
>   <description>Testing something!</description>
>   <pubDate>Sun, 12 Jul 2009 01:21:02 +0000</pubDate>
>   <guid>http://identi.ca/notice/6341652</guid>
>   <link>http://identi.ca/notice/6341652</link>
> </item>
>
> </channel>
> </rss>
>
>
> As a subscriber, I can only identify the URI of the feed only if I
> keep a "mapping" of feed URIs <-> source URIs (http://identi.ca/julien
> in that specific case).

What's your subscriber URL? Why not make it something like:
http://yoursubscriber.com/subscriber/julien/identica/<hmac signature>

Then you know exactly what you need to know. You have a handle to the
original feed with the already parsed information that your subscriber
requires. You could use paths like I did above, or you could append
the whole original feed as a URL. It's up to you to integrate it with
your existing web-hook handlers however you want.
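
A minimal sketch of such a self-describing callback, assuming the subscriber appends the percent-encoded feed URL and an HMAC of it to the path. The names SECRET, BASE, build_callback and topic_from_callback_path are made up for illustration, and the choice of SHA-256 is an assumption, not something the hub requires.

```python
import hashlib
import hmac
from urllib.parse import quote, unquote

# Assumptions for illustration only: a private key known to the subscriber
# and the base path its web-hook handler is mounted on.
SECRET = b"replace-with-a-private-key"
BASE = "http://yoursubscriber.com/subscriber"


def sign(topic_url):
    """Hex HMAC-SHA256 of the feed URL under the subscriber's secret."""
    return hmac.new(SECRET, topic_url.encode("utf-8"), hashlib.sha256).hexdigest()


def build_callback(topic_url):
    """Callback URL to register with the hub; it carries the feed URL itself."""
    return "%s/%s/%s" % (BASE, quote(topic_url, safe=""), sign(topic_url))


def topic_from_callback_path(path):
    """Recover the feed URL from '/subscriber/<encoded topic>/<signature>'.

    Returns None when the path is malformed or the signature does not match.
    Assumes the web server hands over the path with encoded slashes intact;
    if it does not, a query parameter may be a safer place for the topic.
    """
    parts = path.split("/")
    if len(parts) != 4 or parts[1] != "subscriber":
        return None
    topic = unquote(parts[2])
    expected = sign(topic).encode("utf-8")
    if hmac.compare_digest(expected, parts[3].encode("utf-8")):
        return topic
    return None
```

With this shape the handler stays stateless: everything it needs to attribute a delivery to a feed arrives in the URL the hub calls back.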

Is this good enough for what you're trying to do? That said, I think
putting the original canonical source URL somewhere in the request
(probably as a header) could work, simply because the most common case
is people wanting this value. There's no point in making all
subscribers construct their callbacks correctly.


For Jeff: I think giving people these "web-hook closures" is the way
to go for all web-hooks. People like to do things differently. Some
people like to use parameters, some people like "pretty" paths, others
want to include some other information so their web-hook handlers can
be completely stateless.

-Brett

Tony Garnock-Jones

Jul 12, 2009, 7:43:38 AM
to pubsub...@googlegroups.com
Hi,

2009/7/12 Julien <julien.g...@gmail.com>

> Yet, I know that the hub fetched this feed itself, so the hub has its
> URI, and I think it should pass it to the subscribers.

RabbitHub does this by putting a hub.topic parameter in the URI when it invokes the callback, for symmetry with (un)subscription and validation.

(Aside: during subscription, the hub.topic is treated by RabbitHub as a topic *pattern*, but during publication, it's a topic *value*; RabbitHub topics don't have to be feed URLs)
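
On the receiving side, picking such a parameter out of the request could look roughly like this. It is a sketch that assumes the topic arrives percent-encoded in the query string under the name hub.topic; the exact request RabbitHub sends is not shown in this thread, and topic_from_request_uri is a made-up helper name.

```python
from urllib.parse import parse_qs, urlsplit


def topic_from_request_uri(request_uri):
    """Return the hub.topic query parameter of a callback request, or None."""
    query = urlsplit(request_uri).query
    values = parse_qs(query).get("hub.topic", [])
    return values[0] if values else None
```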
 
Regards,
  Tony


Tony Garnock-Jones

Jul 12, 2009, 9:13:17 AM
to pubsub...@googlegroups.com
2009/7/12 Brett Slatkin <bsla...@gmail.com>

> The "callback" URL for a subscriber is supposed to be a real callback,
> i.e., a pure closure like in functional programming

And this is a jolly nice property. One further Really Neat thing is that if you make the URLs also unguessable (i.e. avoid ambient authority) then these closures *also* act as security-capabilities! (http://www.erights.org/)

RabbitHub hasn't yet implemented changes to the spec that have arrived since June; I built it during the first week of June, and have had limited time to look at it since (as evidenced by the lack of documentation!). I'll have a read of the diffs and send any comments to the list in a bit.

Regards,
  Tony

Jeff Lindsay

Jul 12, 2009, 3:35:17 PM
to pubsub...@googlegroups.com
Brett,

When you say webhook closures, do you just mean that you pass along
whatever query string was included when you register the callback
webhook?

-jeff

--

Julien Genestoux

Jul 13, 2009, 8:22:47 PM
to pubsub...@googlegroups.com
Ha! 

This is indeed very smart. I am changing superfeedr's client implementation right now!

Jeff, yes that is what Brett meant (just had coffee with him and we chatted about that!).

Julien


Jeff Lindsay

Jul 13, 2009, 8:33:11 PM
to pubsub...@googlegroups.com
Yeah, I hate the idea that the query string would be stripped off.
I've been doing a sort of decorator pattern with webhooks ... for
example: PostBin can be used like a decorator to another callback by
prepending your callback with the postbin url plus ? and PostBin will
pass the post on to whatever is in the query string (which
theoretically could have another decorator):
http://www.postbin.org/xyz?http://example.com/mycallback
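
A rough sketch of how a decorator in that style could find its forwarding target, assuming everything after the first "?" is the next callback. The helper names are invented here and this is not PostBin's actual code.

```python
def split_decorated_callback(url):
    """Split 'decorator?next' into (decorator, next); next is None if absent."""
    head, sep, rest = url.partition("?")
    return head, (rest if sep and rest else None)


def unwrap_chain(url):
    """List every hop of a chain of decorated callbacks, outermost first."""
    chain = []
    current = url
    while current is not None:
        head, current = split_decorated_callback(current)
        chain.append(head)
    return chain
```

For the URL above, unwrap_chain would give the PostBin URL followed by http://example.com/mycallback. One caveat of this naive split: an inner callback that has a real query string of its own would be misread as a further hop, so it would need to be percent-encoded before being appended.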

Brett Slatkin

Jul 14, 2009, 4:10:28 PM
to pubsub...@googlegroups.com
On Mon, Jul 13, 2009 at 5:33 PM, Jeff Lindsay<prog...@gmail.com> wrote:
>
> Yeah, I hate the idea that the query string would be stripped off.
> I've been doing a sort of decorator pattern with webhooks ... for
> example: PostBin can be used like a decorator to another callback by
> prepending your callback with the postbin url plus ? and PostBin will
> pass the post on to whatever is in the query string (which
> theoretically could have another decorator):
> http://www.postbin.org/xyz?http://example.com/mycallback

Just to finish this thread, I agree with you guys that the query
params should be preserved. I'm tracking this in the following issue:

http://code.google.com/p/pubsubhubbub/issues/detail?id=25
