Pause the first parse function when a new request is yielded

Χρατς Χρουτς

May 25, 2017, 3:42:49 PM
to scrapy-users
Hello. I'm writing a spider that gathers some hyperlinks from a website, visits each one, checks whether something exists on the page, and writes the results to a text file.
I have a for loop that yields requests with parse2 as the callback; parse2 checks the link and updates the text file.


    evenselectorlist = response.css('table[id="result_table"] tr.even')
    for evenselector in evenselectorlist:
        relative = evenselector.css('a[title="Link"]::attr(href)').extract_first()
        yield scrapy.Request(response.urljoin(relative), callback=self.parse2, meta={'item': item}, dont_filter=True)


    def parse2(self, response):
        # txt file stuff


Is there a way to make the first parse function pause when a request is yielded? I would like to do some more work AFTER all the new requests have finished.
For example, I'd like a counter showing how many links have the information I want, and that number is only known once all the links have been visited.
I hope you understand what I'm trying to say. Thank you!
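
To make the counter idea concrete, here is a rough sketch of what I am after. It assumes the count can be kept on the spider and written out in Scrapy's closed(reason) hook, which Scrapy calls when the spider closes, rather than by pausing parse. The names LinkCounterMixin, wanted_selector and summary_path are made up for illustration, and the selector is a placeholder for whatever parse2 really checks.

```python
class LinkCounterMixin:
    """Combine with scrapy.Spider, e.g.
    class MySpider(LinkCounterMixin, scrapy.Spider): ...
    """

    # Hypothetical names: adjust the selector and path to the real site and file.
    wanted_selector = "div.wanted"
    summary_path = "summary.txt"
    visited = 0
    matched = 0

    def parse2(self, response):
        # Runs once per link that returned a response; parse() is never paused.
        self.visited += 1
        if response.css(self.wanted_selector).extract_first() is not None:
            self.matched += 1

    def closed(self, reason):
        # Called by Scrapy when the spider closes, i.e. after the scheduled
        # requests have been processed.
        with open(self.summary_path, "a") as f:
            f.write("%d of %d links matched (%s)\n"
                    % (self.matched, self.visited, reason))
```

Links whose requests fail never reach parse2, so in this sketch visited only counts the pages that actually came back.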