BeautifulSoup4 issue with https : CERTIFICATE_VERIFY_FAILED


Dominique Delcourt

Jan 28, 2018, 9:50:48 AM
to beautifulsoup
Hi,

I am testing BeautifulSoup4 and tried to scrape some data from different websites.
With https websites I get the following error message:

File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1320, in do_open raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:777)>

The full message:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1318, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 964, in send
    self.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1400, in connect
    server_hostname=server_hostname)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py", line 407, in wrap_socket
    _context=self, _session=session)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py", line 814, in __init__
    self.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py", line 1068, in do_handshake
    self._sslobj.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py", line 689, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:777)


My script works fine when I use it with an http website.

from urllib.request import urlopen
from urllib.error import HTTPError
from urllib.error import URLError
from bs4 import BeautifulSoup
import ssl

try:
    html = urlopen("https://www.artistescontemporains.org/evenements_artistiques/")
except HTTPError as e:
    print(e)
    print("Server down or incorrect domain")
else:
    res = BeautifulSoup(html.read(), "lxml")
    if res.title is None:
        print("Tag not found")
    else:
        print(res.title)
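
Note that the script above catches only HTTPError, while the traceback ends in URLError, so the except branch never runs for this failure. A minimal sketch of telling the two cases apart, using only the standard library; the helper name describe_fetch_error is hypothetical and not part of urllib or bs4:

```python
import ssl
from urllib.error import HTTPError, URLError


def describe_fetch_error(err):
    """Return a short description of an exception raised by urlopen()."""
    # HTTPError is a subclass of URLError, so it has to be checked first.
    if isinstance(err, HTTPError):
        return "HTTP error {}".format(err.code)
    if isinstance(err, URLError):
        # URLError.reason holds the underlying cause; for a failed TLS
        # handshake it is an ssl.SSLError instance.
        if isinstance(err.reason, ssl.SSLError):
            return "TLS error: {}".format(err.reason)
        return "URL error: {}".format(err.reason)
    return "unexpected error: {!r}".format(err)
```

It could be called from an `except URLError as e:` clause around urlopen(), printing describe_fetch_error(e) instead of letting the traceback escape.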

My configuration:
Mac OS X 10.8.5
Python 3.6.4
libxml2.2
OpenSSL 0.9.8zg 14 July 2015 ==> not possible to upgrade easily as it is part of the core system
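
The OpenSSL version above is the system one. The Python build itself may be linked against a different OpenSSL and may look for CA certificates in its own locations, so it can help to ask Python directly. A minimal sketch, assuming nothing beyond the standard ssl module; the function name tls_diagnostics is made up for illustration:

```python
import ssl


def tls_diagnostics():
    """Collect the OpenSSL build and default CA locations Python uses."""
    paths = ssl.get_default_verify_paths()
    return {
        # OpenSSL version string of the library this interpreter loaded.
        "openssl_version": ssl.OPENSSL_VERSION,
        # cafile/capath are None when the default location does not exist.
        "cafile": paths.cafile,
        "capath": paths.capath,
        # Compiled-in defaults, reported even if nothing exists there.
        "openssl_cafile": paths.openssl_cafile,
        "openssl_capath": paths.openssl_capath,
    }


if __name__ == "__main__":
    for key, value in tls_diagnostics().items():
        print("{}: {}".format(key, value))
```

If cafile and capath both come back as None, Python has no default CA certificates to verify against, which would be consistent with CERTIFICATE_VERIFY_FAILED on every https site.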

Any help would be very much appreciated!
Thanks
D.

Dominique Delcourt

Jan 28, 2018, 3:48:54 PM
to beautifulsoup
I have installed Scrapy and I get the same error message.
After searching the web, I found this suggested solution:

On OS X, using MacPorts, installing curl-ca-bundle solves it:

sudo port install curl-ca-bundle
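
Once a CA bundle is on disk, urlopen can be pointed at it explicitly rather than relying on the defaults. A minimal sketch, assuming the MacPorts bundle ends up at /opt/local/share/curl/curl-ca-bundle.crt (that path is an assumption and has not been checked on this machine); make_context is a hypothetical helper name:

```python
import os
import ssl

# Assumed location of the bundle installed by MacPorts; adjust if it differs.
ASSUMED_BUNDLE = "/opt/local/share/curl/curl-ca-bundle.crt"


def make_context(cafile=ASSUMED_BUNDLE):
    """Build a verifying SSL context, loading cafile when it exists."""
    if cafile and os.path.isfile(cafile):
        return ssl.create_default_context(cafile=cafile)
    # Fall back to whatever default CA locations OpenSSL knows about.
    return ssl.create_default_context()
```

The context would then be passed as urlopen(url, context=make_context()). Certificate verification stays enabled either way; only the source of trusted CAs changes.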


I am currently installing MacPorts (https://github.com/macports/macports-base/releases/tag/v2.4.2).
I will post the result.

Dominique Delcourt

Jan 29, 2018, 6:02:10 AM
to beautifulsoup
MacPorts has been installed.

When I run the command sudo port install curl-ca-bundle, the system now requires Xcode...

The latest version available for Mac OS X 10.8 is Xcode 5.1.1, a 2.1 GB download:
https://download.developer.apple.com/Developer_Tools/xcode_5.1.1/xcode_5.1.1.dmg

A never ending story?
