How to use the RSS feed discovery & parsing tool in another open source project?


matthie...@gmail.com

Sep 29, 2006, 5:03:17 AM
Hello,

I am developing open source software that needs to automatically
find RSS feeds in web pages and parse the feed content to get the
title, description, author, etc.
Ideally, it would also parse HTML to extract only the text (without
HTML tags).

Because Firefox already does this job, I would like to reuse the
Firefox/Mozilla code. But I can't understand how it works: the code
base is really big and I don't know anything about Firefox internals :-).

I found these source directories, which look relevant:
http://lxr.mozilla.org/mozilla/source/browser/components/feeds/
http://lxr.mozilla.org/mozilla/source/toolkit/components/feeds/
[...] ?

This page looks like documentation (Feed content access API):
http://developer.mozilla.org/en/docs/Feed_content_access_API
But I can't really understand what to do...

What I need precisely is:
- a tool to discover feed URLs in the HTML source of a page (<link>
tags, etc.)
- a tool to parse RSS/Atom feeds (all versions, if possible) and
return arrays of information such as author, title, description,
publication date, etc.

- a tool to parse the HTML inside the <description> tag: I would like to
extract only the text (I don't need <img src>, <a href>, etc.)
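
To make the three requirements above concrete, here is a minimal sketch
in Python using only the standard library. It is not Firefox code and
not a definitive implementation. The function names (discover_feeds,
parse_feed, strip_tags) are hypothetical, it assumes that "discovery"
means <link rel="alternate"> tags with an RSS or Atom MIME type, and it
only handles RSS 2.0-style and Atom 1.0 feeds (RSS 1.0/RDF is not covered).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

# Assumption: only these two MIME types are treated as feeds.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}
ATOM = "{http://www.w3.org/2005/Atom}"
DC = "{http://purl.org/dc/elements/1.1/}"


class _FeedLinkFinder(HTMLParser):
    """Collects href values of <link rel="alternate"> tags with a feed type."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = {name: (value or "") for name, value in attrs}
        rels = a.get("rel", "").lower().split()
        mime = a.get("type", "").strip().lower()
        if "alternate" in rels and mime in FEED_TYPES and a.get("href"):
            self.hrefs.append(a["href"])


def discover_feeds(html, base_url):
    """Return absolute feed URLs advertised in the HTML source of a page."""
    finder = _FeedLinkFinder()
    finder.feed(html)
    finder.close()
    return [urljoin(base_url, href) for href in finder.hrefs]


def parse_feed(xml_text):
    """Return a list of dicts (title, description, author, date) per entry."""
    root = ET.fromstring(xml_text)
    entries = []
    if root.tag == ATOM + "feed":
        for e in root.findall(ATOM + "entry"):
            entries.append({
                "title": e.findtext(ATOM + "title", default=""),
                "description": e.findtext(ATOM + "summary", default="")
                or e.findtext(ATOM + "content", default=""),
                "author": e.findtext(ATOM + "author/" + ATOM + "name", default=""),
                "date": e.findtext(ATOM + "updated", default=""),
            })
    else:
        # RSS 2.0 / 0.9x layout: <rss><channel><item>...</item></channel></rss>
        for item in root.iter("item"):
            entries.append({
                "title": item.findtext("title", default=""),
                "description": item.findtext("description", default=""),
                "author": item.findtext("author", default="")
                or item.findtext(DC + "creator", default=""),
                "date": item.findtext("pubDate", default=""),
            })
    return entries


class _TextOnly(HTMLParser):
    """Keeps text nodes and drops every tag (including <img> and <a>)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(html):
    """Return the text of an HTML fragment with whitespace collapsed."""
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())
```

With this sketch, discover_feeds(page_source, page_url) would give the
feed URLs, parse_feed(feed_xml) the list of entries, and
strip_tags(entry["description"]) the text without markup. Note that
strip_tags also keeps the text inside <script> and <style> elements, and
that feeds from untrusted sources may need a hardened XML parser.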

Can Firefox help me? If yes, how should I proceed? If not, do you have
any ideas for me?

Thank you very much in advance for your help :-)

Matthieu
Developer of the GPL web statistics software http://www.phpmyvisites.us/

PS: Am I in the right place for this kind of question?
