I'm looking for the Python equivalent of Arc90's readability.js
http://lab.arc90.com/experiments/readability/
http://lab.arc90.com/experiments/readability/js/readability.js
so that I can give it some input.html and get back a cleaned-up
version of that HTML. I want to use it on the server side (unlike
the JS version, which runs only in the browser).
Any ideas?
PS: I have tried Rhino + env.js. That combination works, but the
performance is unacceptable: it takes minutes to clean up most
HTML content :( (I still haven't figured out why there is such a big
performance difference).