Hi Adi,
I believe Scrapy would meet your needs, especially since you already have a decentralized queue to feed URLs into it.
Scrapyd is a convenient way to send jobs to different systems without having to copy your codebase; it's essentially a deployment tool. Scrapy itself is quite efficient for web scraping. Scraping is I/O-bound, and Scrapy is built on Twisted, an asynchronous HTTP framework. So Scrapy fires off a request, then forgets about it until the response comes back through Twisted. In the interim, it can process responses or fire off other requests.
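Scrapy handles this internally, but the fire-and-forget model is easy to sketch with Python's asyncio standing in for Twisted (the sleeps below simulate network round trips; a real crawler would await an HTTP client instead):

```python
import asyncio
import time

async def fetch(url: str) -> str:
    # Simulate a 100 ms network round trip; control returns to the
    # event loop here, which is what lets other requests proceed.
    await asyncio.sleep(0.1)
    return f"response from {url}"

async def crawl(urls):
    # Fire off every request at once; the event loop resumes each
    # coroutine as its "response" arrives, much like Twisted's reactor.
    return await asyncio.gather(*(fetch(u) for u in urls))

urls = [f"https://example.com/page/{i}" for i in range(200)]
start = time.perf_counter()
responses = asyncio.run(crawl(urls))
elapsed = time.perf_counter() - start

print(len(responses))  # 200 responses
print(elapsed < 1.0)   # completed in roughly the time of one request, not 200
```

Because the work is I/O-bound, 200 simulated requests finish in about the time of one, which is why a single event-loop process can sustain so many concurrent fetches.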
Processing requirements vary, but I would expect you could run hundreds, if not thousands, of concurrent scraping requests on a medium-sized EC2 instance.
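Concurrency in Scrapy is controlled by a handful of settings. A sketch of a starting point might look like the fragment below; the specific values are illustrative assumptions, not benchmarked recommendations, so tune them to your targets:

```python
# settings.py -- illustrative values, not benchmarked recommendations
CONCURRENT_REQUESTS = 200           # global cap on in-flight requests
CONCURRENT_REQUESTS_PER_DOMAIN = 8  # stay polite to any single site
DOWNLOAD_DELAY = 0.25               # seconds between requests per domain
REACTOR_THREADPOOL_MAXSIZE = 20     # thread pool for DNS lookups etc.
```

The global cap is what determines how far one box scales; the per-domain limit and delay mostly protect the sites you're scraping rather than your own server.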
In my experience, the only shortcomings of Scrapy are its architectural complexity (it takes some time to master) and its lack of JavaScript support. Many sites are single-page apps that load their content via JS, and Scrapy (to my knowledge) can't do anything with that.
Hope this helps,
Travis