Here is a scraper for news sites that came out of our last hacking session with yc and sweemeng:
Right now it scrapes:
* The Star Online
* The Malaysian Insider
* Free Malaysia Kini
* Malay Mail
* Utusan Malaysia (have a look at their HTML [hint: search for the <html> tag])
* Merdeka Review (Malay language edition)
Any other sites to scrape?
The scraper extracts the relevant content from each news article and tags it (so that it can be used in yc's Rails MP app).
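To give a rough idea of what "extract and tag" means, here is a minimal sketch using only the Python standard library. It is not the actual scraper code: the names `ParagraphExtractor`, `extract_content` and `tag_article` are made up for illustration, and it assumes the article body sits in <p> tags and that tagging is a plain keyword match.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text found inside <p> elements and ignore the rest."""

    def __init__(self):
        super().__init__()
        self._in_p = False      # True while we are inside a <p> element
        self._chunks = []       # text pieces of the paragraph being read
        self.paragraphs = []    # finished, whitespace-normalised paragraphs

    def _flush(self):
        # Join the pieces and collapse runs of whitespace into single spaces.
        text = " ".join("".join(self._chunks).split())
        if text:
            self.paragraphs.append(text)
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()       # copes with a previous <p> left unclosed
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._chunks.append(data)


def extract_content(html):
    """Return the article text, one paragraph per line (assumes <p> markup)."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()             # pick up a trailing unclosed paragraph
    return "\n".join(parser.paragraphs)


def tag_article(text, keywords):
    """Return the sorted keywords that occur in text (case-insensitive)."""
    lowered = text.lower()
    return sorted(k for k in keywords if k.lower() in lowered)
```

Calling `extract_content(page_html)` and then `tag_article(text, ["Parliament", "budget"])` gives the text and the matching tags. Real news pages are messier than this, so each site will likely need its own rules for where the article body lives.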
Next step: expose a web API for apps.
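Nothing is decided about the API yet, so the following is only a sketch of one possible shape, again using just the Python standard library. The `/articles` path, the `tag` query parameter, the `ARTICLES` list and the function names are all assumptions for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical in-memory store; the real scraper would fill this in.
ARTICLES = [
    {"title": "Budget debate", "source": "Example News", "tags": ["budget"]},
    {"title": "Election date", "source": "Example News", "tags": ["election"]},
]


def handle_request(path, articles):
    """Map a request path to (status code, JSON body)."""
    url = urlparse(path)
    if url.path != "/articles":
        return 404, json.dumps({"error": "not found"})
    # ?tag=a&tag=b keeps only articles carrying every requested tag.
    wanted = parse_qs(url.query).get("tag", [])
    hits = [a for a in articles if all(t in a["tags"] for t in wanted)]
    return 200, json.dumps(hits)


class ApiHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = handle_request(self.path, ARTICLES)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(host="127.0.0.1", port=8000):
    """Run the API until interrupted (host and port are placeholders)."""
    HTTPServer((host, port), ApiHandler).serve_forever()
```

Running `serve()` and requesting `/articles?tag=budget` would return the matching articles as JSON. Keeping the routing in `handle_request` means it can be tested without starting a server.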
It would be good to have a host for the scraper (low CPU and bandwidth requirements) and the API.
Happy hacking.
Ditesh