get calling URL


Raf Roger

Aug 27, 2016, 4:56:13 AM
to scrapy-users
Hi,

I would like to know how to get the calling/referring URL while parsing data from a child URL.

E.g.:

We are on the URL "http://myweb.com/list?page=2", which lists 20 contact URLs.

Once we are on a contact URL (e.g. http://myweb.com/contact?id=123), how can we get the calling URL (e.g. http://myweb.com/list?page=2)?

Here is a sample of my code:
def parse_start_url(self, response):
    urls = Selector(response).xpath('//div[contains(@class, "lien-ville")]/ul/li/a/@href').extract()
    for u in urls:
        yield scrapy.Request(response.urljoin(u), callback=self.parse_contact)

def parse_contact(self, response):
    yield {
        "count": self.counter,
        "page number": self.page_num,
        "contact page url": response.url,
        "reference url": self.url
    }

In parse_contact, I would like to be able to get the URL that was passed to parse_start_url.
How can I do that?

Thanks

Erik Dominguez

Nov 4, 2016, 10:00:09 PM
to scrapy-users
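There are two ways: either read it back from the Referer header of the request, or pass it as metadata on the Request instance.
# First method: put this code in parse_contact. Scrapy's RefererMiddleware
# (enabled by default) sets this header to the URL of the page that yielded the request.
url = response.request.headers.get('Referer')  # bytes, or None if no Referer was sent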
# Second method: pass the URL along when yielding the request
yield scrapy.Request(response.urljoin(u), callback=self.parse_contact, meta={'url': response.request.url})
# Then, inside parse_contact, read it back like this
url = response.meta['url']