Hey guys,

I'm planning to use Scrapy in a more distributed setup, and I'm not sure whether the spiders, pipelines, downloader, scheduler, and engine are all hosted in separate processes or threads. Could anyone share some info about this?

Also, can we change the process/thread count for each component? I know there are two settings, CONCURRENT_REQUESTS and CONCURRENT_ITEMS; do they determine the number of concurrent threads for the downloader and the pipelines? And if I want to deploy the spiders, pipelines, and downloader on different machines, I'll need to serialize the items/requests/responses, right?
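For concreteness, here is the kind of configuration I mean. A minimal settings.py sketch with the two settings (the values shown are Scrapy's documented defaults, as far as I know):

    # settings.py
    # CONCURRENT_REQUESTS caps how many requests the downloader has in
    # flight at once; CONCURRENT_ITEMS caps how many items the pipelines
    # process in parallel per response. Both are handled inside Twisted's
    # event loop, so I'm not sure they map to OS threads at all.
    CONCURRENT_REQUESTS = 16
    CONCURRENT_ITEMS = 100

And on the serialization point, I was imagining something like the sketch below for shipping requests between machines. It assumes the request_to_dict/request_from_dict helpers from scrapy.utils.reqser (the module Scrapy's disk queues use; treat the exact import path as an assumption, since it may differ across versions), and shared_queue is just a stand-in I made up for a real transport such as a Redis list:

    import pickle
    from scrapy.utils.reqser import request_to_dict, request_from_dict

    # Stand-in for a shared transport (Redis, a message broker, ...);
    # purely illustrative, not a real distribution mechanism.
    shared_queue = []

    def enqueue_request(request, spider):
        # request_to_dict replaces the callback with its method name,
        # giving a plain dict that pickles cleanly for the wire.
        shared_queue.append(pickle.dumps(request_to_dict(request, spider)))

    def dequeue_request(spider):
        # request_from_dict needs the spider to resolve the callback
        # name back into a bound method on the consuming side.
        return request_from_dict(pickle.loads(shared_queue.pop(0)), spider)

Is that roughly the right direction, or does Scrapy already provide something for this?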
Cheers,
Shane