Hello,
My source site is structured like this:
http://example.com/page/1
http://example.com/page/2
and so on... But some pages may not contain what I'm looking for (images).
What is the most efficient way to crawl such a site with Abot?
For now, I use my own implementation of HyperLinkParser, which queues the next page by incrementing the page number in the URL of the page that was just crawled. Is there a more efficient way? I'm thinking of writing my own Scheduler implementation that is filled with pre-calculated URLs.
Thanks!
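
For illustration, the two approaches mentioned above (incrementing the current URL versus pre-calculating the whole list) might be sketched as below. This is only a sketch using the .NET base class library, not Abot's API: the class name PageUrls and the methods Next and Range are hypothetical, and it assumes the page number is always the last path segment, as in the example URLs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Abot.
public static class PageUrls
{
    // Approach 1: derive the next URL from the one just crawled.
    // Assumes the last path segment is an integer page number.
    public static Uri Next(Uri current)
    {
        string path = current.AbsolutePath.TrimEnd('/');
        int slash = path.LastIndexOf('/');
        int page = int.Parse(path.Substring(slash + 1));
        string prefix = path.Substring(0, slash + 1);   // e.g. "/page/"
        return new Uri(current.GetLeftPart(UriPartial.Authority) + prefix + (page + 1));
    }

    // Approach 2: pre-calculate a fixed range of page URLs up front.
    // The base address mirrors the example URLs in the question.
    public static IEnumerable<Uri> Range(int first, int count) =>
        Enumerable.Range(first, count)
                  .Select(n => new Uri("http://example.com/page/" + n));
}
```

With the second approach, the resulting list could be handed to whatever queue or scheduler the crawler uses before the crawl starts, so no link parsing is needed to discover the next page. The trade-off is that the last page number has to be known or guessed in advance.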
--
You received this message because you are subscribed to the Google Groups "Abot Web Crawler" group.
To unsubscribe from this group and stop receiving emails from it, send an email to abot-web-crawl...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/abot-web-crawler/41146505-3ece-4f50-9c79-1e3bd0c49e5cn%40googlegroups.com.
var crawler = new PoliteWebCrawler(null, null, null, scheduler, null, null, null, null, null);