Retry crawling for unsuccessful HTTP responses by increasing MinCrawlDelayPerDomainMilliSeconds.

Jun 14, 2020, 1:13:30 PM
to Abot Web Crawler
I am crawling a website. This is my configuration:
var config = new CrawlConfiguration
{
    MaxPagesToCrawl = 10000,
    MaxConcurrentThreads = 10,
    MinCrawlDelayPerDomainMilliSeconds = 200,
};

I was able to crawl 8457 of the 10000 pages successfully, but 1543 returned HTTP status NA.
If I increase MinCrawlDelayPerDomainMilliSeconds by some factor, I get better results. The disadvantage is that the overall crawl time increases, because even requests that would have succeeded with a 200 millisecond delay now take longer.

Is there a way to retry only the requests that failed (about 1543 in this case) with an increased MinCrawlDelayPerDomainMilliSeconds?


Jun 15, 2020, 4:09:35 PM
to Abot Web Crawler
You can do the following, which should allow a configurable number of retries.

            CrawlConfiguration configuration = new CrawlConfiguration
            {
                MaxRetryCount = 3,
            };
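
Since the question is about slowing down only the requests that failed, a minimal sketch of how the retry settings might be combined with the original configuration is shown here. It assumes Abot 2.x, where CrawlConfiguration is expected to expose MinRetryDelayInMilliSecs alongside MaxRetryCount; the values are illustrative, not recommendations, so check them against the version you have installed.

```csharp
using Abot2.Poco;

// Sketch only: the values are illustrative, and MinRetryDelayInMilliSecs is
// assumed to be available on CrawlConfiguration (Abot 2.x).
var config = new CrawlConfiguration
{
    MaxPagesToCrawl = 10000,
    MaxConcurrentThreads = 10,
    MinCrawlDelayPerDomainMilliSeconds = 200, // normal pace for first attempts
    MaxRetryCount = 3,                        // retry a failed request up to 3 times
    MinRetryDelayInMilliSecs = 2000           // wait longer only before a retry
};
```

With settings along these lines, first attempts keep the 200 millisecond delay and only the retried requests pay the longer wait.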

The other option is to use AbotX instead. It has dynamic AutoThrottling that slows the crawl down when it detects server stress and speeds it back up when it starts getting more successful responses.

        private static async Task DemoCrawlerX_Throttling(Uri siteToCrawl)
        {
            var config = GetSafeConfig();
            config.AutoThrottling = new AutoThrottlingConfig
            {
                IsEnabled = true,
                ThresholdHigh = 2,
                ThresholdMed = 2,
                MinAdjustmentWaitTimeInSecs = 10
            };
            //Optional, configure how aggressively to speed up or down during throttling
            config.Accelerator = new AcceleratorConfig();
            config.Decelerator = new DeceleratorConfig();

            //Now the crawl is able to "Throttle" itself if the site being crawled
            //is showing signs of stress.
            using (var crawler = new CrawlerX(config))
            {
                crawler.PageCrawlCompleted += (sender, args) =>
                {
                    //Check out args.CrawledPage for any info you need
                };
                await crawler.CrawlAsync(siteToCrawl);
            }
        }

Hope that helps