hackernews 100, feedburner/pubsubhubbub


John Huffaker

Apr 6, 2014, 4:10:04 PM
to pubsub...@googlegroups.com
Hi All,

I use an RSS feed for ycombinator that only posts stories that reach 100 points (described here: http://talkfast.org/2010/07/23/a-cure-for-hacker-news-overload/).  I used to get a fairly snappy, steady stream of posts throughout the day, but in the past month or so I've noticed that they tend to arrive in bursts every 6 hours or so.  I talked to the people at my RSS reader (theoldreader) and they said that they rely on pubsubhubbub for feeds that support it and do full refreshes on a much slower cycle.  They also mentioned that the order in which the items appear (oldest first, newest last) may be causing problems for pubsubhubbub.  All of my searching around made me think that you guys do a plain diff and should be fine with order differences.  I'm not 100% sure what is happening with this feed.  I've attached an image of the pubsubhubbub status page and a snapshot of the feedburner output.  Any help you guys can give me would be very welcome.

PS: I'm in contact with both the RSS reader people and the feed author so I can get more info if needed.

Regards,
John Huffaker
newsyc100.pubsubhubbub.20140406.png
newsyc100.feedburner.20140406.xml

Julien Genestoux

Apr 7, 2014, 3:23:33 AM
to Pubsubhubbub
John,

The order of items does not create a problem that I know of with most hubs. Diffing for RSS/Atom should be done using <guid>/<id>, and hence the publication time/order should not matter. This is what Superfeedr does with the hubs it hosts; I'll let Roman explain exactly how Google does its diffs.
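
To illustrate why order should not matter, here is a minimal sketch of an identifier-based diff. It is not the code of any particular hub; the function name `new_entries` and the dictionary shape of an entry are assumptions made for the example.

```python
def new_entries(previous_ids, current_entries):
    """Return the entries whose <guid>/<id> was not seen in the previous fetch.

    Hypothetical helper: entries are plain dicts with an "id" key.
    """
    return [entry for entry in current_entries if entry["id"] not in previous_ids]


# Identifiers already known from the previous fetch.
seen = {"a", "b"}

# The same feed content, listed oldest-first and newest-first.
oldest_first = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
newest_first = list(reversed(oldest_first))

# Both orderings report the same single new entry.
print([e["id"] for e in new_entries(seen, oldest_first)])
print([e["id"] for e in new_entries(seen, newest_first)])
```

Because membership is tested against a set of identifiers, neither the position of an item in the feed nor its publication date enters the comparison.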

Now, Feedburner cannot ping the hub itself if it is not pinged itself about an update. So, it would probably be useful to know if the "burnt" feed itself uses PubSubHubbub.

Thanks,

Roman

Apr 7, 2014, 4:41:56 AM
to pubsub...@googlegroups.com
Hi John,

On Mon, Apr 7, 2014 at 9:23 AM, Julien Genestoux <julien.g...@gmail.com> wrote:
> John,
>
> The order of items does not create a problem that I know of with most hubs. Diffing for RSS/Atom should be done using <guid>/<id>, and hence the publication time/order should not matter. This is what Superfeedr does with the hubs it hosts; I'll let Roman explain exactly how Google does its diffs.

It's the same with Google hub: the order of items doesn't matter.

> Now, Feedburner cannot ping the hub itself if it is not pinged itself about an update. So, it would probably be useful to know if the "burnt" feed itself uses PubSubHubbub.

I have verified that the hub is receiving pings for this feed from Feedburner every 6 hours. Let me find out what happens within Feedburner. I'll get back to you.

Roman.

John Huffaker

Apr 8, 2014, 2:30:23 AM
to pubsub...@googlegroups.com
Thanks for looking into it!

If it helps, the primary feed is:

Julien Genestoux

Apr 9, 2014, 11:00:46 AM
to Pubsubhubbub
John, you should make that original feed support PubSubHubbub first if you want feedburner to propagate the updates too.


Roman

Apr 9, 2014, 11:36:13 AM
to pubsub...@googlegroups.com
On Wed, Apr 9, 2014 at 5:00 PM, Julien Genestoux <julien.g...@gmail.com> wrote:
> John, you should make that original feed support PubSubHubbub first if you want feedburner to propagate the updates too.

+1.

Here's how FeedBurner works.

Definition: source feed is a feed that publisher hosts. http://jmillerinc.com/newsyc100.xml is a source feed.
Definition: burned feed is a feed that FeedBurner hosts. http://feeds.feedburner.com/newsyc100 is a burned feed corresponding to http://jmillerinc.com/newsyc100.xml.

When a burned feed is requested from FeedBurner (usually by an RSS reader), it checks the age of the cached source feed. If it never downloaded the source feed or it's too old, FeedBurner downloads the source feed. Then FeedBurner serves the requested burned feed, which is essentially a transformed source feed (the set of transformations is specified by the owner of the burned feed in the FeedBurner UI).

I believe by default FeedBurner will download the source feed if it's older than 30 minutes, but don't quote me on that.
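
The on-demand refresh described above can be sketched as a small age-based cache. This is only an illustration under stated assumptions, not FeedBurner's actual code: the class name `SourceFeedCache`, the injected `fetch` and `clock` callables, and the 30-minute default (taken from the unconfirmed figure above) are all hypothetical.

```python
import time

# Assumed default, based on the unconfirmed "30 minutes" figure above.
MAX_AGE_SECONDS = 30 * 60


class SourceFeedCache:
    """Download a source feed only when there is no cached copy or it is too old."""

    def __init__(self, fetch, max_age=MAX_AGE_SECONDS, clock=time.time):
        self._fetch = fetch      # callable: url -> feed body
        self._max_age = max_age  # seconds a cached copy stays fresh
        self._clock = clock      # injectable so the behaviour can be tested
        self._entries = {}       # url -> (fetched_at, body)

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is None or now - entry[0] > self._max_age:
            # Never downloaded, or the cached copy is too old: download again.
            entry = (now, self._fetch(url))
            self._entries[url] = entry
        return entry[1]
```

With a cache like this, a change in the source feed is only noticed the next time someone requests the burned feed after the cached copy has expired.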

All burned feeds support PubSubHubbub with https://pubsubhubbub.appspot.com/ as the hub end-point, regardless of the source feed. Whenever the burned feed changes (that happens when FeedBurner detects changes in the source feed), it notifies https://pubsubhubbub.appspot.com/, which in turn notifies all subscribers of the burned feed.

If the source feed supports PubSubHubbub (it can be any hub, not necessarily https://pubsubhubbub.appspot.com/), FeedBurner subscribes to changes. When the publisher notifies its hub about changes to the source feed, FeedBurner receives those notifications from the hub and instantly recomputes the burned feed.
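
For reference, the publisher-side notification is a small form-encoded POST to the hub with `hub.mode=publish` and `hub.url` set to the feed that changed. The sketch below only builds the request and does not send it; the helper name `build_publish_ping` is hypothetical, and the URLs are the ones from this thread, used purely as example values.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_publish_ping(hub_url, topic_url):
    """Build (but do not send) a 'publish' ping telling a hub that topic_url changed."""
    body = urlencode({"hub.mode": "publish", "hub.url": topic_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


ping = build_publish_ping(
    "https://pubsubhubbub.appspot.com/",
    "http://jmillerinc.com/newsyc100.xml",
)
print(ping.get_method(), ping.full_url)
print(ping.data.decode("ascii"))
```

Passing the request to `urllib.request.urlopen` would actually deliver the ping; that step is left out here so the example has no network side effects.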

Summary:
  1. If the source feed doesn't support PubSubHubbub, the burned feed will be up to 30 minutes behind the source feed.
  2. If the source feed does support PubSubHubbub, the burned feed will always be up to date and FeedBurner will not have to poll it.
HTH,
Roman.

John Huffaker

Apr 10, 2014, 12:56:54 PM
to pubsub...@googlegroups.com
Thanks for all of your help and clear explanations!

FeedBurner seems to auto-add a pubsubhubbub link to all feeds even if the underlying one would rather rely on the polling model.  If you look at the differences in XML between the original feed (http://jmillerinc.com/newsyc100.xml) and the feedburner one (http://feeds.feedburner.com/newsyc100) you can see that feedburner automatically adds an <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" />.  My feed reader (theoldreader) then respects that link, waits for the notification, and only refreshes every 6 or so hours.  I assume this is probably happening with other burned feeds as well.
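
A quick way to check whether a given feed advertises a hub is to look for Atom `link` elements with `rel="hub"`. The sketch below uses Python's standard `xml.etree.ElementTree`; the function name `hub_links` and the two sample documents are made up for the example and are much shorter than the real feeds.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def hub_links(feed_xml):
    """Return the href of every Atom <link rel="hub"> element found in the feed."""
    root = ET.fromstring(feed_xml)
    return [
        link.get("href")
        for link in root.iter("{%s}link" % ATOM_NS)
        if link.get("rel") == "hub"
    ]


# Hypothetical, heavily shortened stand-ins for the burned and original feeds.
burned = """<rss version="2.0"><channel><title>example</title>
<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" />
</channel></rss>"""
original = """<rss version="2.0"><channel><title>example</title></channel></rss>"""

print(hub_links(burned))
print(hub_links(original))
```

The element is matched by namespace URI rather than by the `atom10` prefix, so the check works whichever prefix a feed happens to use.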

It feels like feedburner is the one making the misstep in the chain (or maybe it's my expectation that a feed shouldn't have to implement notifications and should be allowed to rely on polling).  Do you guys know of a different group/list where I can post this for feedburner?

Regards,
Huff

Julien Genestoux

Apr 11, 2014, 6:21:04 AM
to Pubsubhubbub
Huff,
I understand your concern and I see 2 things you could do to avoid the problem easily:
- support PubSubHubbub on the original feed
- subscribe to the original feed in The Old Reader directly.

That said, I understand the frustration.

I hope this helps.
Thanks

Roman

Apr 15, 2014, 9:54:22 AM
to pubsub...@googlegroups.com
On Thu, Apr 10, 2014 at 6:56 PM, John Huffaker <jhuf...@gmail.com> wrote:
> Thanks for all of your help and clear explanations!
>
> FeedBurner seems to auto-add a pubsubhubbub link to all feeds even if the underlying one would rather rely on the polling model.  If you look at the differences in XML between the original feed (http://jmillerinc.com/newsyc100.xml) and the feedburner one (http://feeds.feedburner.com/newsyc100) you can see that feedburner automatically adds an <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" />.

This is working as intended. As I said, FeedBurner notifies http://pubsubhubbub.appspot.com/ when the source feed changes.
 
> Which my feed reader (theoldreader) then respects and waits for the notification and only refreshes every 6 or so hours.

Sounds good.
 
> I assume this is probably happening with other burned feeds as well.

Yes, FeedBurner enables PubSubHubbub for all feeds.

> It feels like feedburner is the one making the misstep in the chain

If you could be more specific, I might be able to help. It would be useful if you could provide:
1. T0: timestamp when http://jmillerinc.com/newsyc100.xml has changed.
2. T1: timestamp when http://jmillerinc.com/newsyc100.xml is still not updated in theoldreader.

If T1 - T0 > 4 hours, it definitely indicates a problem and I'll be happy to take a look.

Meanwhile, I checked the logs of FeedBurner and the Google PuSH Hub for the period between 2014/04/14-11:04:39 and 2014/04/14-15:07:04. During this 4-hour window the feed was downloaded by FeedBurner 9 times (roughly every 30 minutes); 3 changes were detected and theoldreader was notified about all of them within 3 seconds.

2014/04/14-11:04:39: downloaded
2014/04/14-11:35:10: downloaded and changed
2014/04/14-11:35:12: theoldreader notified
2014/04/14-12:05:20: downloaded and unchanged
2014/04/14-12:35:23: downloaded and unchanged
2014/04/14-13:05:32: downloaded and unchanged
2014/04/14-13:35:35: downloaded and unchanged
2014/04/14-14:06:08: downloaded and changed
2014/04/14-14:06:11: theoldreader notified
2014/04/14-14:36:52: downloaded and unchanged
2014/04/14-15:07:04: downloaded and changed
2014/04/14-15:07:07: theoldreader notified

All times UTC.
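
The figures quoted above can be re-derived from the log lines themselves. The sketch below copies the lines verbatim and pairs each "downloaded and changed" event with the notification that follows it; the function names `parse_line` and `notification_delays` are hypothetical and are not part of any FeedBurner or hub tooling.

```python
from datetime import datetime

LOG = """\
2014/04/14-11:04:39: downloaded
2014/04/14-11:35:10: downloaded and changed
2014/04/14-11:35:12: theoldreader notified
2014/04/14-12:05:20: downloaded and unchanged
2014/04/14-12:35:23: downloaded and unchanged
2014/04/14-13:05:32: downloaded and unchanged
2014/04/14-13:35:35: downloaded and unchanged
2014/04/14-14:06:08: downloaded and changed
2014/04/14-14:06:11: theoldreader notified
2014/04/14-14:36:52: downloaded and unchanged
2014/04/14-15:07:04: downloaded and changed
2014/04/14-15:07:07: theoldreader notified
"""


def parse_line(line):
    """Split 'YYYY/MM/DD-HH:MM:SS: event' into (datetime, event)."""
    stamp, _, event = line.partition(": ")
    return datetime.strptime(stamp, "%Y/%m/%d-%H:%M:%S"), event


def notification_delays(log_text):
    """Seconds between each detected change and the notification that follows it."""
    delays = []
    changed_at = None
    for line in log_text.splitlines():
        if not line.strip():
            continue
        when, event = parse_line(line)
        if event == "downloaded and changed":
            changed_at = when
        elif event.endswith("notified") and changed_at is not None:
            delays.append((when - changed_at).total_seconds())
            changed_at = None
    return delays


downloads = sum(1 for line in LOG.splitlines() if "downloaded" in line)
print(downloads, notification_delays(LOG))  # 9 [2.0, 3.0, 3.0]
```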

HTH,
Roman.