How to fix an undefined feed?


Anton Zuiker

Sep 4, 2015, 10:21:38 PM
to river4
Thank you River4 community! I appreciate your work on this. I've been wanting to create a comprehensive river of news for Duke University for many years, and I'm delighted to have a prototype of the Duke River running at river4.dukeriver.co.
My question today is about feeds.

The one feed I really need to work is the feed for my unit at Duke, the Department of Medicine - https://medicine.duke.edu/medicinenews/news-rss.xml - coming from our Drupal 7 site.

But, the River4 dashboard continues to say it is 'undefined'. See screenshot below.

I asked my web developer to make sure the feed validates (it hadn't previously), and now it does seem to be valid, albeit with recommendations -- http://validator.w3.org/feed/check.cgi?url=https%3A%2F%2Fmedicine.duke.edu%2Fmedicinenews%2Fnews-rss.xml

I did delete the feed from my list, and delete the previous file and folder associated with the feed in my S3 bucket, before adding the validated feed back to my list. Still getting undefined.

What can I do next?

Related feed issues

1. Sometimes the dashboard shows a feed name as 'null' and I think that relates to a missing feed title. Is this correct?

2. I discovered that adding a feed incorrectly can crash River4. Here's my narration:

Aug 26: Hmm, looks like the Heroku app has crashed. I find instructions for using Terminal to view the logs, and I see that node is crashing. I think it's because I added a bad feed earlier today, one that included the "view-source:" prefix I must have copied along with the feed URL. It's a mistake I've found I often make when copying a feed URL from the browser. I deleted the list and river and folders associated with the list that had the offending feed. Took the dog outside to play for a few minutes. Came back, and, w00t, the Duke River app is back up.
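
One way to guard against the "view-source:" mistake described above is to tidy a pasted URL before it goes into a list. The following is only a minimal sketch: `cleanFeedUrl` is a hypothetical helper, not part of River4, and nothing in this thread establishes whether River4 itself validates URLs.

```javascript
// Hypothetical helper (not part of River4): tidy a pasted feed URL
// before adding it to a list file.
function cleanFeedUrl(pasted) {
  // Trim whitespace and drop a leading "view-source:" prefix, if present.
  var candidate = String(pasted).trim().replace(/^view-source:/i, "");
  var parsed;
  try {
    parsed = new URL(candidate); // throws on malformed input
  } catch (err) {
    return null;
  }
  // Only http and https URLs are accepted as feeds in this sketch.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return null;
  }
  return parsed.href;
}
```

A `null` result means the pasted text should be checked by hand rather than added to the list.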

 

Andy Sylvester

Sep 7, 2015, 7:16:26 PM
to riv...@googlegroups.com
Anton -

I have added the feed to the "myreadinglist" tab at http://rivers.andysylvester.com/. From looking at the feed, it has not updated since September 3rd; hopefully there will be some posts after Labor Day to check. I have not had any problem with the other feeds in that feed list. I was able to read the feed in another feed reader (Inoreader). I will let you know what I observe.

Here are some other ideas for debugging this problem:

1. Use an add-on app in Heroku to view the console log messages from your app (I have added the Papertrail app; a free version is available).
2. Use a script to save some information when a new item is going to be added to a river. I have done some successful experiments with the callback feature in River4 (see https://github.com/scripting/river4/wiki/How-callbacks-work-in-River4 for more information). If you would be interested in this, contact me.

Andy Sylvester



--
GitHub repository: https://github.com/scripting/river4
How to ask for help: http://scripting.com/2014/03/19/howToAskForHelpWithSoftware.html
---
You received this message because you are subscribed to the Google Groups "river4" group.
To unsubscribe from this group and stop receiving emails from it, send an email to river4+un...@googlegroups.com.
To post to this group, send email to riv...@googlegroups.com.
Visit this group at http://groups.google.com/group/river4.
To view this discussion on the web visit https://groups.google.com/d/msgid/river4/edda8ecb-fa97-40d0-96ee-231721694105%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Dave Winer

Sep 8, 2015, 9:48:49 AM
to river4
First, I'm glad Andy is helping here. ;-)

Second, in general the RSS world is sloppy. People sometimes do crazy things, so you have to put up with a certain number of loose ends.

As the MLB post-season approaches, I'm putting together a river of news for baseball. I'm finding a lot of dead ends and poorly maintained feeds, but through all the mess, I think a good river will emerge.

But we are very far from perfection. 

We just do the best we can with what we have, which is actually pretty amazing when you think about it.

Dave

Anton Zuiker

Sep 8, 2015, 10:56:23 AM
to river4
Thanks, Andy. I'm traveling today, but will follow your tips tomorrow. We should be able to post a couple of items to MedicineNews today and tomorrow, so that should give us something to watch for.

Andy Sylvester

Sep 9, 2015, 1:02:28 AM
to riv...@googlegroups.com
Anton -

I saw that your feed updated today, but I did not see any items in my river. I did some testing this evening with the Node Inspector module on a filesystem installation of River4 with just your feed in the lists folder. I noticed that there was a 403 Access Denied error shown in the Node Inspector Network window for your feed. I then decided to use the curl utility to test further. I was able to get web pages for websites using http, but when I tried to do the following command:


I got the following response:


<html>
<head>
  <title>Page Unavailable</title>
  <style>
    body { background: #303030; text-align: center; color: white; }
    #page { border: 1px solid #CCC; width: 500px; margin: 100px auto 0; padding: 30px; background: #323232; }
    a, a:link, a:visited { color: #CCC; }
    .error { color: #222; }
  </style>
</head>
<body onload="setTimeout(function() { window.location = '/' }, 5000)">
  <div id="page">
    <h1 class="title">Page Unavailable</h1>
    <p>The page you requested is temporarily unavailable.</p>
    <p>We're redirecting you to the <a href="/">homepage</a> in 5 seconds.</p>
    <div class="error">(Error 403 Access denied.  Please contact via a different client configuration.)</div>
  </div>
</body>
</html>

I have a few thoughts:

1. When I access this URL in a web browser, I can see the feed text displayed. When River4 tries to access it, I am assuming there is a problem.
2. There could be some problem in River4 such that it cannot request feeds via https
3. There could be a problem on your server (medicine.duke.edu) that does not allow access to this feed via a client that is not a web browser.
4. There could be some other problem (grin!).

Do you have any other feeds in your river that come from this server? If you do, are they having problems?

Let me know what you think. Perhaps you could pass this information on to your developer. I am also available for further consultation; you can contact me at sylvest...@gmail.com.

Andy Sylvester


Anton Zuiker

Sep 9, 2015, 8:52:22 PM
to river4
Andy, I appreciate your help on this! I suspect the https is the issue, and will connect my web developer and our server admin to confirm and offer a solution.

I do see that a few of my other feeds are coming from https, such as https://sites.duke.edu/neurology/feed/ (that's a WordPress site, while my problematic feed is coming from Drupal).

Will keep digging, with your help. Thanks again.

Anton

Dave Winer

Sep 9, 2015, 9:56:28 PM
to riv...@googlegroups.com
River4 handles https fine. Pretty sure that's not a problem.

In a sentence, what is the problem with your setup?




--
Typed on an iPad with fat fingers.

Andy Sylvester

Sep 10, 2015, 2:44:16 PM
to riv...@googlegroups.com
Anton -

Thanks for supplying another example of a feed URL using https. The way that River4 gets a feed to read is by using the request module from NPM. I used the following script to test your two feeds:


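// Start example test script

var request = require("request");

// Fetch the feed and print whatever the server returns
request("https://medicine.duke.edu/medicinenews/news-rss.xml", function(error, response, body) {
  if (error) {
    console.log(error); // network-level failure; there is no response body
    return;
  }
  console.log(body);
});

// End example test script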


I had to perform "npm install request" prior to running this script. When I ran this on your problem feed, I got the same error response text (403 Access Denied) as I mentioned earlier in this thread. When I used it with your other example (https://sites.duke.edu/neurology/feed/), I was able to see the full text of the feed. 

Based on this test, I think that there is some problem with your server (medicine.duke.edu). Let me know if you have more questions.
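
To make the difference between the two feeds easier to see than by reading raw output, the callback could also look at the status code and check whether the body resembles a feed. This is only a sketch: `describeFeedResponse` is a hypothetical helper, not something River4 provides, and the root-element check is a rough heuristic rather than real feed parsing.

```javascript
// Hypothetical helper: summarize what came back from a feed request.
function describeFeedResponse(statusCode, body) {
  var text = String(body || "");
  // RSS, Atom, and RDF feeds each have a recognizable root element.
  var looksLikeFeed = /<(rss|feed|rdf:RDF)[\s>]/i.test(text);
  if (statusCode === 200 && looksLikeFeed) {
    return "ok: feed content received";
  }
  if (statusCode === 403) {
    return "blocked: server refused the request (403)";
  }
  return "unexpected: status " + statusCode +
    (looksLikeFeed ? "" : ", body is not a feed");
}

// Possible use inside the test script's callback:
// request(url, function (error, response, body) {
//   if (error) { console.log(error); return; }
//   console.log(describeFeedResponse(response.statusCode, body));
// });
```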

Andy Sylvester


Anton Zuiker

Sep 15, 2015, 5:02:22 PM
to river4, da...@smallpicture.com
My server admin reports that our server does have some blocks in place, based on user-agents, to curb spammers. What user-agent does River4 use when requesting the RSS feed? I'd like to ask him to unblock it so the feed can finally show up at dukeriver.co.

Anton

Andy Sylvester

Sep 16, 2015, 3:11:17 PM
to riv...@googlegroups.com
Anton -

I do not think that River4 is setting a user agent. From reviewing river4.js at https://github.com/scripting/river4/blob/master/river4.js, line 1384 appears to be the request for a feed, but no user agent is specified.

It also appears that Node.js apps typically do not have a default user agent when making an HTTP request (per this page: https://github.com/nodejs/node-v0.x-archive/issues/4552).

The NPM request module, which River4 uses to make feed requests, does have the capability to set a user agent (see https://www.npmjs.com/package/request#custom-http-headers). From my review, it looks like a code change might be necessary to set a user agent to work around your server block.
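
As a rough illustration of what such a change might look like, the request module accepts an options object with a `headers` field in place of a bare URL string. The helper name `buildFeedRequestOptions` and the user-agent string below are assumptions made for this sketch; this is not how River4 is written, and whether any particular user-agent value would pass the server's block is unknown.

```javascript
// Hypothetical helper: build an options object for the request module
// that carries an explicit User-Agent header.
function buildFeedRequestOptions(feedUrl, userAgent) {
  return {
    url: feedUrl,
    headers: {
      "User-Agent": userAgent
    }
  };
}

// Possible use (requires "npm install request"):
// var request = require("request");
// var options = buildFeedRequestOptions(
//   "https://medicine.duke.edu/medicinenews/news-rss.xml",
//   "river4-test" // made-up value for illustration only
// );
// request(options, function (error, response, body) {
//   console.log(error || response.statusCode);
// });
```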

Andy Sylvester
 




Dave Winer

Sep 16, 2015, 3:35:22 PM
to riv...@googlegroups.com
It's possible we might put a setting in there for user-agent, but really, feeds are supposed to be open to the world by default.

Dave



Anton Zuiker

Sep 17, 2015, 1:06:35 PM
to river4
Andy, thanks for your report -- that detail helped our sysadmin understand the issue. He says our system was blocking River4 from getting the feed because it was not sending a user-agent. He's opened that one rule to allow the request to flow through. ("We are attempting to balance spam/bot prevention with functionality.")

I'm seeing that river4 has now recognized both the Department of Medicine and School of Medicine feeds, so I'm eagerly anticipating items to show up in the Duke River.

Many thanks for your help and patience!

Andy Sylvester

Sep 17, 2015, 2:17:38 PM
to riv...@googlegroups.com

Anton,

That is great to hear! I am glad that I could help, and hope to be able to work with you again soon.

Andy Sylvester



