I'm going to use Feedzirra to parse ingredient information from lots of cooking websites. My plan is to subscribe to their respective RSS feeds. Unfortunately, the entries only contain snippets, and the actual ingredient info is back on each entry's corresponding webpage.
So my original idea was to use Feedzirra just to give me the latest entries for the various feeds. I'd then extract each entry's source URL (the URL of the original webpage) and pass it on to my own library to get the info I wanted.
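To make that concrete, here is a rough sketch of the separate-library approach. Everything in it is hypothetical and for illustration only: `Entry` is a stand-in for whatever entry objects the feed parser returns, `IngredientScraper` is a made-up name for my own library, and the `<li class="ingredient">` markup is an assumption that real sites won't necessarily follow.

```ruby
require "net/http"
require "uri"

# Hypothetical stand-in for a feed entry. The only thing the scraper
# relies on is that an entry responds to `url`.
Entry = Struct.new(:title, :url)

# Hypothetical scraper, kept completely separate from feed parsing.
class IngredientScraper
  # fetcher: anything callable that takes a URL string and returns HTML.
  # The default does a plain HTTP GET; tests can inject a fake instead.
  def initialize(fetcher = ->(url) { Net::HTTP.get(URI(url)) })
    @fetcher = fetcher
  end

  # Returns a hash mapping each entry's URL to its list of ingredients.
  def scrape(entries)
    entries.each_with_object({}) do |entry, result|
      html = @fetcher.call(entry.url)
      result[entry.url] = extract_ingredients(html)
    end
  end

  private

  # Assumption: ingredients are marked up as <li class="ingredient">...</li>.
  # A real HTML parser would be more robust than this regex.
  def extract_ingredients(html)
    html.scan(%r{<li class="ingredient">(.*?)</li>}m).flatten.map(&:strip)
  end
end
```

With this split, the feed side only has to hand over objects that respond to `url` (as far as I can tell, that would be the `entries` of a feed returned by `Feedzirra::Feed.fetch_and_parse`), and the scraper can be tested without touching the network by passing in a fake fetcher.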
But then I noticed this in the README:
The final feature of Feedzirra is the ability to define custom parsing classes. In truth, Feedzirra could be used to parse much more than feeds. Microformats, page scraping, and almost anything else are fair game.
So now I'm wondering if I should instead subclass the RSS parser that comes with Feedzirra and override its parse method so that it fetches (via curl) the URL in question and does the relevant parsing of the downloaded HTML.
Is this in line with what Paul was talking about, or should this functionality be separate from feedzirra?