Hi There,
This is my first post in this group. I haven't been able to find an answer to my question yet, so here it is. I'm trying to implement a CrawlSpider that crawls indefinitely, and I would like to feed it new domains to crawl dynamically from a Redis list, using the blocking pop method (blpop).
When I implement it as shown below, the spider does not pick up any work: it iterates through the list but never handles the yielded Request objects. I use an empty start_urls list and define start_requests as follows:
    def start_requests(self):
        while True:
            source, domain = self.server.blpop(['domains'])
            if not domain:
                continue
            self.log(domain)
            if domain:
                yield Request("http://%s" % domain)
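One possible explanation is that blpop blocks the single thread that Scrapy's event loop runs on, so the requests yielded earlier never get a chance to be downloaded. The sketch below is only a minimal illustration of a non-blocking alternative, not a confirmed fix. FakeRedis and next_urls are hypothetical names made up for this example, and FakeRedis is an in-memory stand-in for the real Redis client so that the snippet runs on its own.

```python
from collections import deque


class FakeRedis:
    """Hypothetical in-memory stand-in for a Redis client (illustration only)."""

    def __init__(self, domains):
        self._items = deque(domains)

    def lpop(self, key):
        # Mimics Redis LPOP: return the head of the list, or None when it is empty.
        # The key is ignored because this fake holds a single list.
        return self._items.popleft() if self._items else None


def next_urls(server, key="domains", batch_size=10):
    """Yield up to batch_size URLs without ever waiting on an empty list."""
    for _ in range(batch_size):
        domain = server.lpop(key)
        if domain is None:
            # The list is empty: stop here so the caller regains control
            # instead of blocking until a new domain arrives.
            return
        if isinstance(domain, bytes):
            # A real Redis client may return bytes rather than str.
            domain = domain.decode("utf-8")
        yield "http://%s" % domain
```

In a real spider, something like next_urls(self.server) could probably be called from a handler for Scrapy's spider_idle signal, wrapping each URL in a Request. That wiring is an assumption and is not shown here.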
My Python skills are not that great, so I'm wondering if someone could point me in the right direction.
Regards,
Roy