Re: How to run Scrapy spider on several cores in parallel?


Pablo Hoffman

unread,
Sep 29, 2012, 12:21:03 PM9/29/12
to scrapy...@googlegroups.com
Yes, you can run as many instances of a single spider in parallel as you like, and each run spawns a separate process (so it can use many cores). This was one of the design goals of Scrapyd: to circumvent Python's concurrency limitations and make use of many cores. The max_proc setting of scrapyd allows you to set how many concurrent processes to run; it defaults to the number of cores available on the system. Of course, each run is a separate process, completely isolated from the others, so it doesn't share the request queue. If you have a predefined list of URLs to crawl, you can partition the start URLs and give each run a different partition. Another option that has been mentioned (but that I haven't tested myself) is scrapy-redis, which allows many spider runs to share the same request queue.
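To illustrate the partitioning idea, here is a minimal sketch (the `partition` helper and the modulo scheme are my own assumptions, not part of Scrapy or scrapyd): each run would receive its own `part` number, e.g. as a spider argument, and crawl only its slice of the predefined URL list.

```python
# Sketch: split a predefined start-URL list across several independent
# spider runs. Each run gets a distinct `part` in [0, total_parts) and
# crawls only the URLs assigned to it (assignment by index modulo).

def partition(urls, part, total_parts):
    """Return the slice of `urls` assigned to run number `part` (0-based)."""
    return [u for i, u in enumerate(urls) if i % total_parts == part]

urls = ["http://example.com/page/%d" % i for i in range(6)]
print(partition(urls, 0, 2))  # pages 0, 2, 4
print(partition(urls, 1, 2))  # pages 1, 3, 5
```

Inside a spider you could set `self.start_urls = partition(ALL_URLS, int(part), int(total_parts))` in `__init__`, and pass `part` per run when scheduling, e.g. `curl http://localhost:6800/schedule.json -d project=myproject -d spider=myspider -d part=0` (extra parameters to schedule.json are forwarded to the spider as arguments).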

Good luck!

On Sat, Sep 29, 2012 at 11:47 AM, Ilya Persky <ilya....@gmail.com> wrote:
Hello guys!

Recently I've came across the documentation section which describes scrapyd. It is said there that scrapyd can run several spiders in parallel. So my questions are:

1) Can scrapyd run several instances of _one_ spider at a time? Say, I have a CPU-bounded spider and it would be a great thing to see it running on two cores. If yes - how exactly can I do that?

2) Again, if this is possible - would these spiders share one request queue or I'd need some additional programming to make this work?

Thank you in advance!
Regards,
Ilya.

--
You received this message because you are subscribed to the Google Groups "scrapy-users" group.
To view this discussion on the web visit https://groups.google.com/d/msg/scrapy-users/-/Uks3wzSM8dAJ.
To post to this group, send email to scrapy...@googlegroups.com.
To unsubscribe from this group, send email to scrapy-users...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/scrapy-users?hl=en.

Ilya Persky

unread,
Sep 29, 2012, 1:47:40 PM9/29/12
to scrapy...@googlegroups.com
Thanks, Pablo, that's awesome! I'll try it.

Pandu

unread,
Dec 20, 2012, 2:41:16 AM12/20/12
to scrapy...@googlegroups.com
Why has the link to scrapy_redis changed to the website becouply? Is that spam, intentional, or an error?