Agreed. What do you consider the most suitable place to put this note? Could
you send a patch?
Pablo.
--
I've had precisely that dilemma too about BaseSpider vs CrawlSpider in the
tutorial. I agree the CrawlSpider doesn't get the visibility it deserves in the
tutorial. But maybe it would suffice to write a paragraph or two explaining
what it adds to BaseSpider, and why you'd need it, along with a quick example,
and then just point to the CrawlSpider doc:
http://doc.scrapy.org/topics/spiders.html#crawlspider
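To illustrate the difference: with BaseSpider you follow links by hand inside your callback, while CrawlSpider lets you declare rules mapping link patterns to callbacks. This is not real Scrapy code, just a stdlib sketch of that idea (the names `RULES`, `crawl`, and `parse_item` are made up for the example):

```python
import re

def parse_item(url):
    # Stand-in for a spider callback that scrapes a page.
    return {"url": url}

# Declarative rules in the spirit of CrawlSpider's `rules` attribute:
# (link pattern, callback).  A callback of None means "follow only".
RULES = [
    (re.compile(r"/category/"), None),        # follow, don't parse
    (re.compile(r"/product/"), parse_item),   # follow and parse
]

def crawl(links):
    """Apply the first matching rule to each link, as CrawlSpider would."""
    items = []
    for link in links:
        for pattern, callback in RULES:
            if pattern.search(link):
                if callback is not None:
                    items.append(callback(link))
                break  # first matching rule wins
    return items
```

With BaseSpider the same routing logic would have to be written out manually in every spider; the rules make it declarative and reusable.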
Another thing to consider is that we're working on a second revision of
CrawlSpider, based more on pluggable components: you'd have a link/request
extractor, a callback rules dispatcher, a canonicalizer, and so on, and you
could combine them as you wish (or leave some of them out). The basic idea
stays the same, though: crawling and scraping a site based on a set of rules
for crawling and parsing. If the API changes, the documentation can be quickly
updated for the new API, but the guide/introduction part should be mostly
reusable.
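To make the pluggable design concrete, here is a stdlib-only sketch of the three components mentioned above. All class and function names (`LinkExtractor`, `RuleDispatcher`, `canonicalize`) are hypothetical, not the actual Scrapy API:

```python
import re
from urllib.parse import urlsplit, urlunsplit

def canonicalize(url):
    """Canonicalizer: normalize a URL (lowercase host, drop the fragment)."""
    scheme, netloc, path, query, _fragment = urlsplit(url)
    return urlunsplit((scheme, netloc.lower(), path or "/", query, ""))

class LinkExtractor:
    """Link/request extractor: keep only hrefs matching an allow pattern."""
    def __init__(self, allow):
        self.allow = re.compile(allow)

    def extract(self, hrefs):
        return [h for h in hrefs if self.allow.search(h)]

class RuleDispatcher:
    """Callback rules dispatcher: route a URL to its first matching callback."""
    def __init__(self, rules):
        self.rules = [(re.compile(pattern), cb) for pattern, cb in rules]

    def dispatch(self, url):
        for pattern, callback in self.rules:
            if pattern.search(url):
                return callback(url)
        return None  # no rule matched
```

Because each piece stands alone, a spider could swap in a different extractor or skip canonicalization entirely, which is the point of the component-based revision.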
Thanks for your interest in improving the doc, I hope I've clarified your
questions,
Pablo.
You can grep the log for lines containing "Crawled". On Unix:
$ grep Crawled scrapy.log
Pablo.