error when changing SSL method

Bill Ebeling

Jan 21, 2014, 9:00:38 AM
to scrapy...@googlegroups.com
I'm trying to crawl www.wegmans.com and had to follow these instructions to avoid a Twisted error: http://stackoverflow.com/questions/19578688/crawling-on-uncerficated-website

Now I'm getting this instead:

2014-01-21 07:56:24-0500 [scrapy] INFO: Scrapy 0.18.2 started (bot: scrapybot)
2014-01-21 07:56:24-0500 [Wegmans] INFO: Spider opened
2014-01-21 07:56:24-0500 [Wegmans] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2014-01-21 07:56:24-0500 [Wegmans] ERROR: Error downloading <GET https://www.wegmans.com>
    Traceback (most recent call last):
      File "/usr/local/lib/python2.7/dist-packages/twisted/web/client.py", line 1309, in request
        parsedURI.originForm)
      File "/usr/local/lib/python2.7/dist-packages/twisted/web/client.py", line 1186, in _requestWithEndpoint
        d = self._pool.getConnection(key, endpoint)
      File "/usr/local/lib/python2.7/dist-packages/twisted/web/client.py", line 1075, in getConnection
        return self._newConnection(key, endpoint)
      File "/usr/local/lib/python2.7/dist-packages/twisted/web/client.py", line 1087, in _newConnection
        return endpoint.connect(factory)
    --- <exception caught here> ---
      File "/usr/local/lib/python2.7/dist-packages/twisted/internet/endpoints.py", line 714, in connect
        timeout=self._timeout, bindAddress=self._bindAddress)
      File "/usr/local/lib/python2.7/dist-packages/twisted/internet/posixbase.py", line 494, in connectSSL
        tlsFactory = tls.TLSMemoryBIOFactory(contextFactory, True, factory)
      File "/usr/local/lib/python2.7/dist-packages/twisted/protocols/tls.py", line 608, in __init__
        contextFactory.getContext()
      File "/usr/local/lib/python2.7/dist-packages/twisted/web/client.py", line 794, in getContext
        return self._webContext.getContext(self._hostname, self._port)
    exceptions.TypeError: getContext() takes exactly 1 argument (3 given)



Anyone know what to do about this? I'm hoping this isn't because I'm still on 0.18!

Thanks,
Bill

Bill Ebeling

Jan 22, 2014, 9:53:05 AM
to scrapy...@googlegroups.com
Upon investigation I've learned this:

The error above is coming from Twisted. Twisted's getContext is here: "/usr/local/lib/python2.7/dist-packages/twisted/web/client.py", line 794

Scrapy's getContext is here: https://github.com/scrapy/scrapy/blob/master/scrapy/core/downloader/contextfactory.py

Scrapy's getContext takes 3 arguments (self, hostname and port); Twisted's takes one (self). Given the stack trace above, it seems I'm somehow calling the wrong getContext()?

Anyone got a clue on this one?
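
The mismatch described above can be reproduced without Twisted or Scrapy installed. The sketch below uses hypothetical stand-in classes (TwistedStyleFactory, ScrapyStyleFactory) and a hypothetical helper that mimics the call shown at line 794 of the traceback. It only illustrates the signature problem and is not the libraries' actual code.

```python
class TwistedStyleFactory(object):
    """Stand-in for a context factory whose getContext accepts only self."""

    def getContext(self):
        return "context"


class ScrapyStyleFactory(TwistedStyleFactory):
    """Stand-in for a factory whose getContext also accepts hostname and port."""

    def getContext(self, hostname=None, port=None):
        # The extra arguments are accepted and ignored in this sketch.
        return TwistedStyleFactory.getContext(self)


def web_client_get_context(factory, hostname, port):
    """Mimics the caller in the traceback: getContext(hostname, port)."""
    return factory.getContext(hostname, port)


def try_factory(factory):
    """Return the context, or a description of the TypeError if the call fails."""
    try:
        return web_client_get_context(factory, "www.wegmans.com", 443)
    except TypeError as exc:
        return "TypeError: %s" % exc


if __name__ == "__main__":
    print(try_factory(TwistedStyleFactory()))  # reports a TypeError
    print(try_factory(ScrapyStyleFactory()))   # returns the stand-in context
```

With the one-argument stand-in the call fails with a TypeError, as in the log above; with the stand-in that accepts hostname and port it succeeds.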

Bill Ebeling

Jan 27, 2014, 9:43:14 AM
to scrapy...@googlegroups.com
Has nobody come across this before?  Can someone run a scrapy shell to see if this problem isn't specific to me / my hardware?

Rolando Espinoza La Fuente

Jan 27, 2014, 10:49:35 AM
to scrapy...@googlegroups.com
What's the output of "scrapy version -v"?

Perhaps it's a mismatch between the Scrapy and Twisted versions.


--
You received this message because you are subscribed to the Google Groups "scrapy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scrapy-users...@googlegroups.com.
To post to this group, send email to scrapy...@googlegroups.com.
Visit this group at http://groups.google.com/group/scrapy-users.
For more options, visit https://groups.google.com/groups/opt_out.

Bill Ebeling

Jan 27, 2014, 11:08:12 AM
to scrapy...@googlegroups.com
I hadn't seen that command before. Here's the output:

Scrapy  : 0.18.2
lxml    : 3.2.3.0
libxml2 : 2.8.0
Twisted : 13.1.0
Python  : 2.7.3 (default, Sep 26 2013, 16:35:25) - [GCC 4.7.2]
Platform: Linux-3.5.0-45-generic-x86_64-with-Ubuntu-12.10-quantal

Rolando Espinoza La Fuente

Jan 27, 2014, 1:03:58 PM
to scrapy...@googlegroups.com
From what I've tested so far, fetching your website works with Twisted 10.2.0 + Scrapy 0.18.2.
With the latest Scrapy it works with Twisted 11.0.0, but it fails with later versions up to the latest, 13.1.0.


Rolando Espinoza La Fuente

Jan 27, 2014, 1:18:57 PM
to scrapy...@googlegroups.com
In fact, this was a bug in the Stack Overflow answer: it should import ScrapyClientContextFactory instead of ClientContextFactory. I've already fixed that.

Funny that it worked with some Twisted versions.

Bill Ebeling

Jan 27, 2014, 1:22:56 PM
to scrapy...@googlegroups.com
Sure is.  So the best fix would be to upgrade scrapy, then?


Rolando Espinoza La Fuente

Jan 27, 2014, 4:21:59 PM
to scrapy...@googlegroups.com
See the code in the updated answer: http://stackoverflow.com/a/19601182/140510

You have to change the import in the custom client context factory to this one:

from scrapy.core.downloader.contextfactory import ScrapyClientContextFactory

With that it should work without changing the twisted version.
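
For reference, a custom factory like this is normally wired in through Scrapy's DOWNLOADER_CLIENTCONTEXTFACTORY setting. A minimal sketch, assuming the factory class from the Stack Overflow answer is saved as CustomContextFactory in a hypothetical module myproject/context.py (both names are placeholders, not taken from this thread):

```python
# settings.py (sketch; the dotted path below is a placeholder for your own project)
# Dotted path Scrapy uses to load the SSL context factory class.
DOWNLOADER_CLIENTCONTEXTFACTORY = "myproject.context.CustomContextFactory"
```

If the module or class is named differently in your project, adjust the dotted path to match.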