Hello. I'm writing a spider that gathers some hyperlinks from a website, visits each one to check whether something exists, and writes the results to a text file.
I have a for loop that yields a request for each link, with a parse2 function as the callback that checks the link and updates the text file.
    evenselectorlist = response.css('table[id="result_table"] tr.even')
    for evenselector in evenselectorlist:
        relative = evenselector.css('a[title="Link"]::attr(href)').extract_first()
        yield scrapy.Request(response.urljoin(relative), callback=self.parse2, meta={'item': item}, dont_filter=True)

    def parse2(self, response):
        # txt file stuff
Is there a way to make the first parse function pause when a request is yielded? I would like to run some more code only AFTER all the new requests have finished.
For example, I'd like to have a counter to see how many links have the information I want, which is available only after all the links have been visited.
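To make the goal concrete, here is a minimal sketch of the behaviour I'm after, written without Scrapy so it runs on its own. The names LinkTally, record and on_done are hypothetical, made up for this illustration and not part of Scrapy. The idea is that parse would know how many requests it yielded, each parse2 call would report whether the link had the information, and the summary would run only once the last one has reported.

```python
class LinkTally:
    """Counts finished link checks and reports once none are pending."""

    def __init__(self, total, on_done):
        self.pending = total      # link checks still outstanding
        self.hits = 0             # links that had the wanted information
        self.on_done = on_done    # called once with the final hit count

    def record(self, found):
        """Call once per finished link check (e.g. at the end of parse2)."""
        self.pending -= 1
        if found:
            self.hits += 1
        if self.pending == 0:
            # Last outstanding check has finished: safe to summarise now.
            self.on_done(self.hits)


# Stand-in for the crawl: three hypothetical links, two contain the information.
results = []
tally = LinkTally(total=3, on_done=results.append)
for found in (True, False, True):
    tally.record(found)   # in the spider this would happen inside parse2

print(results)  # [2]
```

In the spider, I assume such an object could be passed along in meta next to 'item'. Requests that fail would presumably need to be counted too, otherwise pending would never reach zero.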
I hope you understand what I'm trying to say. Thank you!