Strange behavior of the spider


Максим Горковский

Dec 26, 2011, 3:07:20 AM
to scrapy...@googlegroups.com
Hello.
When I start crawling the site, the first 1-3 attempts end with:

2011-12-26 15:58:05+0800 [scrapy] INFO: Scrapy 0.14.0.2845 started (bot: tenders)
2011-12-26 15:58:05+0800 [scrapy] DEBUG: Enabled extensions: LogStats, TelnetConsole, CloseSpider, WebService, CoreStats, MemoryUsage, SpiderState
2011-12-26 15:58:05+0800 [scrapy] DEBUG: Enabled downloader middlewares: DownloadTimeoutMiddleware, RandomAgentMiddleware, DefaultHeadersMiddleware, RetryChangeProxyMiddleware, RedirectCleanQueryMiddleware, CookiesMiddleware, HttpCompressionMiddleware, ChunkedTransferMiddleware, DownloaderStats
2011-12-26 15:58:05+0800 [scrapy] DEBUG: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2011-12-26 15:58:05+0800 [scrapy] DEBUG: Enabled item pipelines: TendersPipeline, CompanyPipeline
2011-12-26 15:58:05+0800 [companies] INFO: Spider opened
2011-12-26 15:58:05+0800 [companies] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2011-12-26 15:58:05+0800 [scrapy] DEBUG: Telnet console listening on 0.0.0.0:6034
2011-12-26 15:58:05+0800 [scrapy] DEBUG: Web service listening on 0.0.0.0:6091
2011-12-26 15:58:05+0800 [companies] DEBUG: Start URL: MY_URL
2011-12-26 15:58:10+0800 [companies] DEBUG: Crawled (200) <GET MY_URL> (referer: None)
2011-12-26 15:58:10+0800 [companies] INFO: Closing spider (finished)
2011-12-26 15:58:10+0800 [companies] INFO: Dumping spider stats:
{'downloader/request_bytes': 598,
'downloader/request_count': 1,
'downloader/request_method_count/GET': 1,
'downloader/response_bytes': 996,
'downloader/response_count': 1,
'downloader/response_status_count/200': 1,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2011, 12, 26, 7, 58, 10, 275469),
'request_depth_max': 1,
'scheduler/memory_enqueued': 1,
'start_time': datetime.datetime(2011, 12, 26, 7, 58, 5, 627757)}
2011-12-26 15:58:10+0800 [companies] INFO: Spider closed (finished)
2011-12-26 15:58:10+0800 [scrapy] INFO: Dumping global stats:
{'memusage/max': 141856768, 'memusage/startup': 141856768}

After that, everything works fine. Has anyone else run into this?

--
Best regards,
Максим Горковский

Kazimir

Dec 26, 2011, 4:19:04 AM
to scrapy-users
I completely forgot: the question is, how can I inspect the response that
causes this behavior? The debugger in Eclipse is unable to do it.
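
One way to look at such a response without a debugger is to save it to disk from inside the spider callback and open the file afterwards. The log above shows a 200 response of only 996 bytes, so the saved body should be quick to read. The sketch below is only an illustration: `dump_response` is a hypothetical helper, not part of Scrapy, and it assumes only that the response object has `status`, `url` and `body` attributes, as Scrapy responses do. Scrapy also ships `scrapy.shell.inspect_response`, which opens an interactive shell on a response; its exact signature depends on the Scrapy version, so check the docs for the release in use.

```python
import os
import tempfile


def dump_response(response, directory=None):
    """Save a response's status, URL and raw body to a file; return the path.

    Hypothetical helper (not a Scrapy API). It relies only on the
    `status`, `url` and `body` attributes of the response object.
    """
    directory = directory or tempfile.gettempdir()
    # mkstemp gives every dumped response its own file name.
    fd, path = tempfile.mkstemp(prefix="response-", suffix=".html", dir=directory)
    body = response.body
    if not isinstance(body, bytes):
        body = body.encode("utf-8")
    # First line of the file records where the body came from.
    header = "<!-- %s %s (%d bytes) -->\n" % (response.status, response.url, len(body))
    with os.fdopen(fd, "wb") as f:
        f.write(header.encode("utf-8"))
        f.write(body)
    return path


# Possible use inside a spider callback (the 2000-byte threshold is an
# arbitrary assumption, chosen only because the failing page was 996 bytes):
#
#     def parse(self, response):
#         if len(response.body) < 2000:
#             self.log("Suspicious response saved to %s" % dump_response(response))
```

Comparing a file saved from a failed attempt with one from a successful run should show what the site returned differently in the two cases.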

Pablo Hoffman

Dec 27, 2011, 6:32:33 PM
to scrapy...@googlegroups.com