There are currently two working patches in my MQ that add this functionality in
case anyone wants to try an early preview (they need to be applied in order):
http://hg.scrapy.org/users/pablo/mq/file/tip/scheduler_single_spider.patch
http://hg.scrapy.org/users/pablo/mq/file/tip/persistent_scheduler.patch
To run a spider as before (no persistence):
scrapy crawl thespider
To run a spider storing scheduler+dupefilter state in a dir:
scrapy crawl thespider --set SCHEDULER_DIR=run1
During the crawl, you can hit ^C to cancel the crawl and resume it later with:
scrapy crawl thespider --set SCHEDULER_DIR=run1
The SCHEDULER_DIR setting name is likely to change before the final release, but
the idea will stay the same: you pass a directory in which to persist the
state.
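To make the two modes explicit, here is a small shell sketch. crawl_cmd is a hypothetical helper, not part of Scrapy or of the patches. It only prints the command line for a non-persistent or a persistent run and does not start the crawl itself. SCHEDULER_DIR is the provisional setting name from this message and may change.

```shell
# Hypothetical helper (not part of Scrapy): print the crawl command for a run.
# With no state directory the crawl is not persisted; with one, the scheduler
# and dupefilter state is stored there, and re-running the same command
# resumes the crawl.
crawl_cmd() {
    spider="$1"
    state_dir="${2:-}"
    if [ -n "$state_dir" ]; then
        echo "scrapy crawl $spider --set SCHEDULER_DIR=$state_dir"
    else
        echo "scrapy crawl $spider"
    fi
}

crawl_cmd thespider        # prints: scrapy crawl thespider
crawl_cmd thespider run1   # prints: scrapy crawl thespider --set SCHEDULER_DIR=run1
```

Run the printed command to start the crawl. Using a different directory name (run2, say) starts a separate crawl with its own state.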
Pablo.
> --
> You received this message because you are subscribed to the Google Groups "scrapy-users" group.
> To post to this group, send email to scrapy...@googlegroups.com.
> To unsubscribe from this group, send email to scrapy-users...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/scrapy-users?hl=en.
http://hg.scrapy.org/users/pablo/mq/file/tip/scheduler_single_spider.patch
http://hg.scrapy.org/users/pablo/mq/file/tip/persistent_scheduler.patch
Both apply cleanly on trunk right now (in that order).
On Tue, Jul 19, 2011 at 12:57:10PM -0700, massabuntu wrote:
> Hi,
> Which version of Scrapy do I have to patch against?
hg clone http://hg.scrapy.org/scrapy
cd scrapy
wget http://hg.scrapy.org/users/pablo/mq/raw-file/b926f44f1aaf/scheduler_single_spider.patch
wget http://hg.scrapy.org/users/pablo/mq/raw-file/b926f44f1aaf/persistent_scheduler.patch
patch -p1 < scheduler_single_spider.patch
patch -p1 < persistent_scheduler.patch
python setup.py install
I wouldn't install Scrapy system-wide (with setup.py install). Instead, point
PYTHONPATH at the directory where you cloned it, so that Python finds it there.
Also, the patches are still in development and will change, so you may need to
repeat these steps in the future.
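One way to point the Python path at the clone is the PYTHONPATH environment variable. This is only a sketch: the location in scrapy_src is an assumption, so replace it with wherever you actually ran the clone.

```shell
# Assumption: the patched clone lives here; adjust to your own location.
scrapy_src="$HOME/src/scrapy"

# Put the clone first on the module search path, keeping any existing entries.
export PYTHONPATH="$scrapy_src${PYTHONPATH:+:$PYTHONPATH}"
```

This only affects the current shell session. To check which copy gets picked up, run python -c "import scrapy; print(scrapy.__file__)" and confirm the path is inside your clone.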