GSoC 2018, page clustering questions

Олег Сериков

Feb 25, 2018, 10:33:09 AM
to portia-scraper
Hello, my name is Oleg and I hope to participate in GSoC.

I have a few questions about the "Increase Crawling Performance through page clustering" task.
How is clustering meant to reduce the number of rendered pages? What should the clusters look like? Is clustering a way to find pages that do not contain any useful information, or have I misunderstood the idea?