Programmatic API calls fail due to "unable to get local issuer certificate"


Daniel Himmelstein

Dec 6, 2019, 12:44:47 PM
to arXiv API
Originally documented here, where API calls from Python fail.

I can also replicate the issue locally on Linux using curl:


```
curl: (60) SSL certificate problem: unable to get local issuer certificate

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
```

Thorsten

Dec 6, 2019, 1:00:38 PM
to arXiv API

The certificate on export.arxiv.org is valid (in fact, it was just renewed last month).

It works for me:

In [1]: import requests

In [2]: x = requests.get('https://export.arxiv.org/api/query?id_list=1806.05726v1&max_results=1')

In [3]: x
Out[3]: <Response [200]>

In [4]: x.text
Out[4]: u'<?xml version="1.0" encoding="UTF-8"?>\n<feed xmlns="http://www.w3.org/2005/Atom">\n  <link href="http://arxiv.org/api/query?search_query.............



$ curl  'https://export.arxiv.org/api/query?id_list=1806.05726v1&max_results=1'
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <link href="http://arxiv.org/api/query?search_query%3D%26id_list%3D1806.05726v1%26start%3D0%26max_results%3D1" rel="self" type="application/atom+xml"/>
  <title ...........


I have seen this error with older versions of Python and urllib or requests. I have also seen it with some tools when the server certificate file doesn't include the full chain back to the root CA.
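
One way to tell these cases apart is to attempt a verified handshake with Python's standard `ssl` module and look at the reason OpenSSL reports. The following is only a minimal sketch, not something arXiv provides: `check_tls_verification` is a hypothetical helper name, and the default port and timeout are assumptions.

```python
import socket
import ssl


def check_tls_verification(host, port=443, timeout=10.0, context=None):
    """Attempt a verified TLS handshake and report the outcome.

    Returns (True, None) when the certificate chain verifies, or
    (False, reason) when certificate verification fails. Other network
    errors (DNS failure, refused connection, ...) propagate to the caller.
    """
    if context is None:
        # System trust store, hostname checking on, CERT_REQUIRED.
        context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # The handshake, including chain verification, happens here.
            with context.wrap_socket(sock, server_hostname=host):
                return True, None
    except ssl.SSLCertVerificationError as exc:
        # verify_message holds OpenSSL's reason string when available,
        # e.g. "unable to get local issuer certificate".
        return False, getattr(exc, "verify_message", str(exc))
```

Calling `check_tls_verification("export.arxiv.org")` on an affected host would be expected to return `False` together with the OpenSSL reason string, while a host that can build the full chain gets `(True, None)`.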
 
 
Cheers
T.

Daniel Himmelstein

Dec 6, 2019, 1:20:34 PM
to arXiv API
> The certificate on export.arxiv.org is valid (in fact it was just renewed last month).

I am guessing the renewal is what triggered the problem. Previously, our API calls succeeded.

> I have seen this error with older versions of python and urllib or requests.

We're getting the error using curl 7.64.1 on Ubuntu 19.10 as well as with Python 3.6, 3.7, and 3.8 on Travis CI's Ubuntu 18.04 LTS. So we are not using outdated systems or software. I'm guessing more users will start reporting this issue over time.

> I have also seen it with some tools when the server certificate file doesn't include the full chain back to root CA.

If the new server certificate included "the full chain back to root CA", would that resolve the issue? It seems likely that the issue is on arXiv's end given that recent Linux systems can't verify the certificate.

Best,
Daniel

Thorsten

Dec 6, 2019, 1:39:54 PM
to arXiv API


Hi Daniel,

Note that I don't work at arXiv any longer, so I was not involved in the certificate renewal. I will, however, reach out to the relevant contacts.

Thanks for providing further details. Indeed, Ubuntu appears to have problems with export.arxiv.org.

What I can say is that I tried this from 3 different hosts with Python 2.7.17 and Python 3.7.5, with various versions of the requests module and the old urllib2.
I cannot reproduce the issue on my RHEL or Fedora instances.

$ ipython
Python 3.7.5 (default, Oct 17 2019, 12:16:48)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.2.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import requests

In [2]: requests.get('https://export.arxiv.org/api/query?id_list=1806.05726v1&max_results=1')
Out[2]: <Response [200]>

In [3]: quit

$ python -V
Python 3.7.5

$ pip freeze | grep requests
requests==2.22.0
requests-file==1.4.3
requests-ftp==0.3.1
requests-kerberos==0.12.0
requests-toolbelt==0.9.1

$ curl -V
curl 7.66.0 (x86_64-redhat-linux-gnu) libcurl/7.66.0 OpenSSL/1.1.1d-fips zlib/1.2.11 brotli/1.0.7 libidn2/2.3.0 libpsl/0.21.0 (+libidn2/2.2.0) libssh/0.9.2/openssl/zlib nghttp2/1.39.2
Release-Date: 2019-09-11
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: AsynchDNS brotli GSS-API HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz Metalink NTLM NTLM_WB PSL SPNEGO SSL TLS-SRP UnixSockets


However, I do see it on a fresh install of Ubuntu 19.10:

$ python3
Python 3.7.5 (default, Nov 20 2019, 09:21:52)
[GCC 9.2.1 20191008] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
>>> requests.get('https://export.arxiv.org/api/query?id_list=1806.05726v1&max_results=1')
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 343, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 841, in _validate_conn
    conn.connect()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 344, in connect
    ssl_context=context)
  File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 345, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/lib/python3.7/ssl.py", line 423, in wrap_socket
    session=session
  File "/usr/lib/python3.7/ssl.py", line 870, in _create
    self.do_handshake()
  File "/usr/lib/python3.7/ssl.py", line 1139, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1076)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 398, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='export.arxiv.org', port=443): Max retries exceeded with url: /api/query?id_list=1806.05726v1&max_results=1 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1076)')))

Cheers
T.

Daniel Himmelstein

Dec 6, 2019, 1:59:29 PM
to arxi...@googlegroups.com
Thanks Thorsten.

If anyone is aware of a workaround that we could implement on our end in Python, that's of interest. We could temporarily specify "verify=False" in our requests.get call.

Switching operating systems is a bit out of scope at the moment (:
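
If the cause is an incomplete chain sent by the server, a less drastic alternative to `verify=False` is to keep verification on and add the missing intermediate certificate to the trust store used for the request. Below is a minimal sketch using only the standard library. `build_context`, `fetch`, and the `intermediate.pem` file name are hypothetical, and it assumes the intermediate certificate has been obtained in PEM format from the issuing CA.

```python
import ssl
import urllib.request


def build_context(extra_ca_file=None):
    """Return a verifying SSL context, optionally trusting extra certificates."""
    # Start from the system trust store; verification stays enabled.
    context = ssl.create_default_context()
    if extra_ca_file is not None:
        # Add PEM certificates (for example a missing intermediate) to the
        # store so OpenSSL can complete the chain up to a trusted root.
        context.load_verify_locations(cafile=extra_ca_file)
    return context


def fetch(url, extra_ca_file=None, timeout=30.0):
    """GET a URL with certificate verification enabled; return the body bytes."""
    context = build_context(extra_ca_file)
    with urllib.request.urlopen(url, timeout=timeout, context=context) as response:
        return response.read()


# Example call (hypothetical file name, not executed here):
# fetch("https://export.arxiv.org/api/query?id_list=1806.05726v1&max_results=1",
#       extra_ca_file="intermediate.pem")
```

With requests, the `verify` argument also accepts a path to a CA bundle, but that bundle is used in place of the default one, so it would need to contain the root certificate as well as the intermediate.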

--
You received this message because you are subscribed to the Google Groups "arXiv API" group.
To unsubscribe from this group and stop receiving emails from it, send an email to arxiv-api+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/arxiv-api/3575def7-5e75-485d-b531-ae21de0fe890%40googlegroups.com.

Martin Lessmeister

Dec 6, 2019, 2:01:18 PM
to arXiv API
Hi Thorsten, Daniel,

Thanks for reporting. There was a misconfiguration on one of our export nodes causing this problem--now fixed.

Best,
Martin 
arXiv

Daniel Himmelstein

Dec 6, 2019, 2:46:06 PM
to arxi...@googlegroups.com
Confirming that it's fixed on our end, thanks!
