Using contains for an attribute using xpath in scrapy


SreeSindhu Sruthi

Jul 23, 2015, 4:03:03 AM
to scrapy-users
How do I use contains() on an attribute of a tag in an XPath expression in Scrapy?

Lhassan Baazzi

Jul 23, 2015, 4:15:05 AM
to scrapy...@googlegroups.com
Hey,

response.xpath('//*[contains(@class, "div-class")]//a[contains(@href, "/blog/")]/@href').extract()


Best Regards.
Lhassan.



SreeSindhu Sruthi

Jul 23, 2015, 1:52:57 PM
to scrapy-users, sss...@gmail.com
Consider this part of the HTML that is available:

<span class = "titlelink"> 
       <a id = "xyz" onclick= "abc" href="#" title = "View this job description"> Associate
        </a> 
</span>

In my spider file,

item["title"] = hs.xpath(".//span[@class = 'titlelink']/a").extract() gives me some output,

whereas
item["title"] = hs.xpath(".//span[@class = 'titlelink']/a/text()").extract() results in an empty list.

extract() doesn't give me the whole list.

Please help me fix this.

Lhassan Baazzi

Jul 23, 2015, 2:40:01 PM
to scrapy...@googlegroups.com
Try:

hs.xpath(".//span[@class = 'titlelink']/a//text()").extract()


Best Regards.
Lhassan.



SreeSindhu Sruthi

Jul 24, 2015, 3:17:14 AM
to scrapy-users, baazzi...@gmail.com
No, even that didn't work. I can't figure out what the exact problem is.

bruce

Jul 24, 2015, 10:08:34 AM
to scrapy-users
hey guys...

just saw this.. do us/me a favor.. post the "html" and then tell us/me
what's the content/text you'd like to get back. preferably, if you
have a couple of different segments of html, post those, as well as
the "content"/text you'd like to get back from each chunk.

this will help generate the xpath you need..
i'm not a scrapy guy, but I can prob get you the basic xpath, and you
can then xlate the xpath to what it needs to be from scrapy