Why? It's certainly the most straightforward way to do things. Are you worried about performance? Why don't you try it out and see how it scales?
You can always test it by just issuing "scrapy crawl spider1 spider2 spider3 ..."
-Matthew
My primary concern is that I will be crawling many sites that are
trying to stop people from stealing their content (student research
papers). Ironically, I'm not interested in their content per se; I
want to index it for our source-tracking app. With these annoying
sites I would need extra-long delays between requests, so each
individual spider would spend a lot of time idling.
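
To make the idling concern concrete, here is a rough sketch in plain
Python asyncio. It is not Scrapy code, and the site names, delays and
page counts are made up for illustration. It only shows that when
several slow "spiders" share one process, their waits overlap, so the
total time is close to that of the slowest site rather than the sum of
all of them.

```python
import asyncio
import time

# Hypothetical per-site delays in seconds (made-up values for illustration).
SITE_DELAYS = {"site-a": 0.3, "site-b": 0.2, "site-c": 0.1}
PAGES_PER_SITE = 3


async def crawl_site(name, delay, pages, log):
    """Pretend to fetch `pages` pages, sleeping `delay` after each one."""
    for page in range(pages):
        log.append((name, page))    # stand-in for a real download
        await asyncio.sleep(delay)  # idle time; other sites run meanwhile


async def crawl_all(delays, pages):
    """Run one pretend spider per site concurrently in a single event loop."""
    log = []
    await asyncio.gather(
        *(crawl_site(name, delay, pages, log) for name, delay in delays.items())
    )
    return log


def run(delays=SITE_DELAYS, pages=PAGES_PER_SITE):
    """Return the fetch log and the wall-clock time the whole crawl took."""
    start = time.monotonic()
    log = asyncio.run(crawl_all(delays, pages))
    return log, time.monotonic() - start


if __name__ == "__main__":
    log, elapsed = run()
    back_to_back = sum(d * PAGES_PER_SITE for d in SITE_DELAYS.values())
    print(f"{len(log)} fetches in {elapsed:.2f}s "
          f"(one site after another would take about {back_to_back:.2f}s)")
```

If the real spiders behave like this, the long delays cost wall-clock
time per site but do not add up across sites.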
I have now dug into some of the Scrapy code, and I think I will,
indeed, go with option a). I probably won't have more than 50 of these
extra-clever sites, and Scrapy might get multiple spiders running in 0.16.
Still, I would appreciate any insight into my particular problem.
Thank you,
Nikolay Surovenko