Scrapy 1.0 official release out!


Julia Medina

Jun 19, 2015, 7:07:02 PM
to scrapy...@googlegroups.com
After nearly a month of testing release candidates, we've finally reached the desired stability to roll out Scrapy 1.0. As announced in the first candidate for this release, 1.0 brings a lot of improvements, but more importantly it marks a new stage of maturity for Scrapy.

You can check our Release Notes detailing some of the introduced changes, as well as the full Changelog in the project's docs. This little snippet from the first announcement will give you a quick glance at some of those changes:

import scrapy

class MySpider(scrapy.Spider):
    # …
    custom_settings = {
        'USER_AGENT': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)',
    }

    def parse(self, response):
        for href in response.xpath('//h2/a/@href').extract():
            full_url = response.urljoin(href)
            yield scrapy.Request(full_url, callback=self.parse_post)

    def parse_post(self, response):
        yield {
            'title': response.xpath('//h1').extract_first(),
            'body': response.xpath('//div[@class="content"]').extract_first(),
        }

Upgrade to 1.0 by running:

    $ pip install --upgrade Scrapy

Since this is a stable release, pip will fetch this version any time Scrapy is installed, unless explicitly told otherwise.

As a final note, we want to thank all our developers and users again for helping shape a release we're really proud of. Scrapy's community never ceases to amaze us :)

Happy hacking!

Vasco

Jun 20, 2015, 11:24:51 AM
to scrapy...@googlegroups.com
Hi Julia,

Congrats to you and the other contributors for reaching this milestone! The release notes show some very interesting changes! In particular, do per-spider settings mean we can have different pipelines for different spiders within the same project? For example, I currently have a lot of different projects that differ only in one or two pipelines, but they also share a lot of pipelines (which I now define in a separate package that I make available to all projects). If I understand things correctly, with 1.0 I could put all spiders in one project and specify different pipeline paths for each spider. If my understanding is correct, is this something you would typically suggest users do in 1.0?

I noticed that scrapy 0.24 has been completely removed from PyPI. For me this created a small issue because 1.0 breaks the scrapyd package available on PyPI. What are your thoughts on keeping a 0.24 build available on PyPI so users can install that version if 1.0 breaks their code?

I also noticed that the scrapyd package on pypi hasn't been updated in almost two years. Is using scrapyd to manage scrapy spider versions/runs considered current best practice?

Congrats again! Best,
Vasco

José Ricardo

Jun 20, 2015, 5:25:38 PM
to scrapy...@googlegroups.com
Hi Vasco, are you sure older versions of scrapy have been removed from PyPI?
I have just installed scrapy 0.24 and scrapyd without problems on a new virtualenv.

Regards,

José


Vasco

Jun 22, 2015, 5:38:55 AM
to scrapy...@googlegroups.com, ro...@josericardo.eti.br
Hi Jose,

You are right, I can install 0.24 with pip. The download section on the PyPI page doesn't list any versions other than 1.0.0. I assumed this was an exhaustive list of the versions available in the repo, but apparently it just shows the latest version. My bad.

thanks,
Vasco


Capi Etheriel

Jun 28, 2015, 9:47:18 PM
to scrapy...@googlegroups.com, ro...@josericardo.eti.br
Does scrapyd (from git) run with scrapy 1.x at all?
I don't see an open issue to track that.

Julia Medina

Jun 29, 2015, 11:53:58 PM
to scrapy...@googlegroups.com
On Sat, Jun 20, 2015 at 12:24 PM Vasco <vasco....@gmail.com> wrote:
Hi Julia,

Congrats to you and the other contributors for reaching this milestone! The release notes show some very interesting changes! In particular, do per-spider settings mean we can have different pipelines for different spiders within the same project? For example, I currently have a lot of different projects that differ only in one or two pipelines, but they also share a lot of pipelines (which I now define in a separate package that I make available to all projects). If I understand things correctly, with 1.0 I could put all spiders in one project and specify different pipeline paths for each spider. If my understanding is correct, is this something you would typically suggest users do in 1.0?


You can definitely redefine the spider pipelines with custom_settings and keep all your spiders in a single project. Actually, per-spider settings were introduced to avoid creating projects for simple spiders, since custom_settings lets you forget about creating a settings.py file to hold all your redefined settings. It surely opens up more code organization possibilities.
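As a rough sketch of what this looks like in practice (the spider and pipeline names below are hypothetical, and the small merge function only mimics how per-spider custom_settings overlay project settings; it is not Scrapy's actual implementation):

```python
# Hypothetical project-wide settings, i.e. what would live in settings.py.
PROJECT_SETTINGS = {
    'ITEM_PIPELINES': {
        'myproject.pipelines.SharedCleanupPipeline': 100,
    },
}

class BlogSpider:
    # In a real project this class would subclass scrapy.Spider.
    name = 'blog'
    # Per-spider overrides, new in Scrapy 1.0: this spider re-declares the
    # shared pipeline and adds one of its own.
    custom_settings = {
        'ITEM_PIPELINES': {
            'myproject.pipelines.SharedCleanupPipeline': 100,
            'myproject.pipelines.BlogOnlyPipeline': 200,
        },
    }

def effective_pipelines(spider_cls, project_settings):
    """Simplified stand-in for Scrapy's settings resolution: values set in a
    spider's custom_settings take precedence over the project's settings."""
    merged = dict(project_settings.get('ITEM_PIPELINES', {}))
    merged.update(getattr(spider_cls, 'custom_settings', {}).get('ITEM_PIPELINES', {}))
    return merged
```

With this setup, `effective_pipelines(BlogSpider, PROJECT_SETTINGS)` yields both the shared and the spider-specific pipeline, while a spider without custom_settings would fall back to the project-wide ones.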

There are still advantages to separating spiders into different projects. For example, keeping thousands of unrelated spiders with nothing in common in a single project is probably not a good idea, so I think users should weigh any suggestion against their particular use cases.
 
I also noticed that the scrapyd package on pypi hasn't been updated in almost two years. Is using scrapyd to manage scrapy spider versions/runs considered current best practice? 
 

That was a mistake on our side: we tagged a new version not so long ago and forgot to deploy it to PyPI :( Thanks for the heads-up, I've just deployed it: https://github.com/scrapy/scrapyd/releases/tag/1.1.0

Scrapyd is mostly community driven so its development deeply relies on contributions, but it's still our recommended method to deploy spiders: http://scrapy.readthedocs.org/en/1.0/topics/deploy.html
 
Congrats again! Best,
Vasco


Thanks for the support!

Julia Medina

Jun 29, 2015, 11:55:58 PM
to scrapy...@googlegroups.com
On Sun, Jun 28, 2015 at 10:47 PM Capi Etheriel <barra...@gmail.com> wrote:
Does scrapyd (from git) run with scrapy 1.x at all?
I don't see an open issue to track that.


Most functionality should work, so there isn't a general issue tracking 1.x support; we'll treat any incompatibilities separately as they are reported.