Can you use PyQuery in Scrapy?


Sayth Renshaw

Apr 19, 2016, 7:18:03 AM
to scrapy-users
Hi

Is it possible to use PyQuery in Scrapy, so that I get the easy jQuery-style selector syntax but can still use Scrapy's pipelines to manage the data?

PyQuery made selecting so easy that I wanted to see whether I could get the benefits of both the Scrapy framework and the easy selectors.

Cheers

Sayth

Paul Tremberth

Apr 19, 2016, 7:33:08 AM
to scrapy-users
Hello,

Sure, you can use PyQuery.
Callbacks get passed HTTP responses, which are `scrapy.http.HtmlResponse` most of the time,
so you can do something like:

from pyquery import PyQuery as pq
from scrapy import Spider
...

class MySpider(Spider):
    ...

    def parse_page(self, response):
        # Build a PyQuery document from the decoded response body
        d = pq(response.body_as_unicode())
        # or, if you're using Scrapy 1.1 (rc3) or later:
        # d = pq(response.text)

Hope this helps.

Paul.

Sayth Renshaw

Apr 19, 2016, 8:02:47 AM
to scrapy-users

Ah, cool. I had started writing a script with PyQuery and just loved how easy its selectors were. It is so easy and compact compared to XPath. Here is the main excerpt:

for items in fileResult:
    # d = pq(filename=items)
    d = pq(response.text)
    res = d('nomination')
    attrs = ('id', 'horse')
    data = [[res.eq(i).attr(x) for x in attrs] for i in range(len(res))]

So essentially my only change is replacing the filename argument with the HTTP response, and I get to use the rest of Scrapy's plumbing for free?

Sounds like a good deal.

Sayth

--
You received this message because you are subscribed to a topic in the Google Groups "scrapy-users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/scrapy-users/8Pi0Qw1pmok/unsubscribe.
To unsubscribe from this group and all its topics, send an email to scrapy-users...@googlegroups.com.
To post to this group, send email to scrapy...@googlegroups.com.
Visit this group at https://groups.google.com/group/scrapy-users.
For more options, visit https://groups.google.com/d/optout.

Travis Leleu

Apr 19, 2016, 12:36:32 PM
to scrapy-users
Sayth,

Out of curiosity, what do you like about the jQuery-style selectors that isn't available in Scrapy's .css() method?  I had thought they were pretty comparable:

    res = d('nomination')

With Scrapy, it would be something like:

    res = selector.css('.nomination')  # assuming nomination is a class you want to select on

I'm just curious to learn what you like more about PyQuery, and whether you were aware of the .css() method.
