Authenticating to Wikimedia Commons Query Service using OAuth with SPARQLWrapper

Frankie Robertson

Dec 18, 2020, 2:51:05 AM
to rdflib-dev
Dear RDFlib devs,

I'm cross-posting this from StackOverflow since it's a bit niche: https://stackoverflow.com/questions/65303450/how-to-authenticate-to-wikimedia-commons-query-service-using-oauth-in-python . I hope this is okay.

I am trying to use the Wikimedia Commons Query Service[1] programmatically using Python, but am having trouble authenticating via OAuth 1.

Below is a self-contained Python example which does not work as expected. The expected behaviour is that a result set is returned, but instead an HTML response containing the login page is returned. You can get the dependencies with `pip install --user sparqlwrapper oauthlib certifi`. The script should then be given the path to a text file containing the pasted output shown after applying for an owner-only token[2], e.g.

```
Consumer token
    deadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef
Consumer secret
    deadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef
Access token
    deadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef
Access secret
    deadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef
```

[1] https://wcqs-beta.wmflabs.org/ ; https://diff.wikimedia.org/2020/10/29/sparql-in-the-shadow-of-structured-data-on-commons/

[2] https://www.mediawiki.org/wiki/OAuth/Owner-only_consumers
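
The credentials file format above can also be read by label rather than by line position, which fails loudly if a line is missing. This is only a minimal sketch: the function names `parse_credentials` and `ordered_credentials` are hypothetical, and it assumes each label line is followed by exactly one value line, as in the sample.

```python
# Labels as they appear in the pasted owner-only consumer output above.
REQUIRED = ("consumer token", "consumer secret", "access token", "access secret")


def parse_credentials(lines):
    """Map each label line to the value on the non-blank line that follows it."""
    creds = {}
    label = None
    for raw in lines:
        text = raw.strip()
        if not text:
            continue  # tolerate blank lines introduced by copy-and-paste
        if label is None:
            label = text.lower()
        else:
            creds[label] = text
            label = None
    return creds


def ordered_credentials(lines):
    """Return the four values in the order listed in REQUIRED, or raise ValueError."""
    creds = parse_credentials(lines)
    missing = [name for name in REQUIRED if name not in creds]
    if missing:
        raise ValueError(f"missing credentials: {', '.join(missing)}")
    return [creds[name] for name in REQUIRED]
```

The returned list has the same order as the sample file (consumer token, consumer secret, access token, access secret), which is the order the script passes to `Client`.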

```python
import sys
from functools import partial

import certifi
from oauthlib.oauth1 import Client
from SPARQLWrapper import JSON, SPARQLWrapper, Wrapper


ENDPOINT = "https://wcqs-beta.wmflabs.org/sparql"
QUERY = """
SELECT ?file WHERE {
  ?file wdt:P180 wd:Q42 .
}
"""
 
 
def monkeypatch_sparqlwrapper():
    # Deal with old system certificates
    if not hasattr(Wrapper.urlopener, "monkeypatched"):
        Wrapper.urlopener = partial(Wrapper.urlopener, cafile=certifi.where())
        setattr(Wrapper.urlopener, "monkeypatched", True)
 
 
def oauth_client(auth_file):
    # Read credential from file
    creds = []
    for idx, line in enumerate(auth_file):
        if idx % 2 == 0:
            continue
        creds.append(line.strip())
    return Client(*creds)
 
 
class OAuth1SPARQLWrapper(SPARQLWrapper):
    # OAuth sign SPARQL requests

    def __init__(self, *args, **kwargs):
        self.client = kwargs.pop("client")
        super().__init__(*args, **kwargs)
 
    def _createRequest(self):
        request = super()._createRequest()
        uri = request.get_full_url()
        method = request.get_method()
        body = request.data
        headers = request.headers
        new_uri, new_headers, new_body = self.client.sign(uri, method, body, headers)
        request.full_url = new_uri
        request.headers = new_headers
        request.data = new_body
        print("Sending request")
        print("Url", request.full_url)
        print("Headers", request.headers)
        print("Data", request.data)
        return request
 
 
monkeypatch_sparqlwrapper()
# Close the credentials file once it has been read
with open(sys.argv[1]) as auth_file:
    client = oauth_client(auth_file)
sparql = OAuth1SPARQLWrapper(ENDPOINT, client=client)
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()
 
print("Results")
print(results)
```

Best regards,
Frankie
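
When debugging the symptom described in this post, it can help to fail explicitly when the server answers with something other than SPARQL JSON (such as the HTML login page). The following is a minimal sketch; the function name `parse_sparql_response` is hypothetical, and the set of accepted media types is an assumption rather than anything the service documents.

```python
import json

# Assumed media types for a successful JSON result set.
ACCEPTED = ("application/sparql-results+json", "application/json")


def parse_sparql_response(content_type, body):
    """Parse a SPARQL JSON body, or raise ValueError if the server sent another type."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type not in ACCEPTED:
        snippet = body[:80].decode("utf-8", errors="replace")
        raise ValueError(
            f"expected SPARQL JSON, got {media_type or 'no content type'}: {snippet!r}"
        )
    return json.loads(body)
```

Here `content_type` would be the value of the response's `Content-Type` header and `body` the raw response bytes.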

Nicholas Car

Dec 23, 2020, 1:50:22 AM
to rdfli...@googlegroups.com
Why don't you try to get a SPARQL query answered "by hand", using `requests` + OAuth etc.? If you can, you'll know that we've got a bug in SPARQLWrapper as opposed to an issue within your application code.

The `requests` code should look something like the following + OAuth stuff:

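```python
import requests

# `auth` is the OAuth object you still need to construct (the "OAuth stuff");
# ENDPOINT and QUERY are the constants from your script.
r = requests.get(
    ENDPOINT,
    params={"query": QUERY},
    auth=auth,
    headers={"Accept": "application/sparql-results+json"},
)
```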

Nick
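
For a dependency-free version of this "by hand" check, the OAuth 1.0a signature can be computed with the standard library alone. This is only a sketch under stated assumptions: it handles HMAC-SHA1 on body-less requests, does not strip default ports from the URL, and the names `pct`, `oauth1_header`, and `signed_sparql_request` are hypothetical. Whether the endpoint accepts owner-only OAuth 1 signatures at all is the open question in this thread.

```python
import base64
import hashlib
import hmac
import secrets
import time
from urllib.parse import parse_qsl, quote, urlencode, urlsplit
from urllib.request import Request


def pct(value):
    """Percent-encode for OAuth 1.0a: only unreserved characters stay literal."""
    return quote(str(value), safe="~")


def oauth1_header(method, url, creds, nonce=None, timestamp=None):
    """Build an HMAC-SHA1 Authorization header value for a body-less request.

    `creds` is (consumer_key, consumer_secret, access_token, access_secret).
    """
    consumer_key, consumer_secret, token, token_secret = creds
    oauth_params = {
        "oauth_consumer_key": consumer_key,
        "oauth_token": token,
        "oauth_signature_method": "HMAC-SHA1",
        "oauth_timestamp": str(int(time.time()) if timestamp is None else timestamp),
        "oauth_nonce": nonce or secrets.token_hex(16),
        "oauth_version": "1.0",
    }
    parts = urlsplit(url)
    # Signed parameters: the URL query plus the oauth_* protocol parameters.
    pairs = parse_qsl(parts.query, keep_blank_values=True) + list(oauth_params.items())
    encoded = sorted((pct(k), pct(v)) for k, v in pairs)
    normalized = "&".join(f"{k}={v}" for k, v in encoded)
    base_uri = f"{parts.scheme.lower()}://{parts.netloc.lower()}{parts.path}"
    base_string = "&".join([method.upper(), pct(base_uri), pct(normalized)])
    key = f"{pct(consumer_secret)}&{pct(token_secret)}".encode()
    digest = hmac.new(key, base_string.encode(), hashlib.sha1).digest()
    oauth_params["oauth_signature"] = base64.b64encode(digest).decode()
    fields = ", ".join(f'{pct(k)}="{pct(v)}"' for k, v in sorted(oauth_params.items()))
    return "OAuth " + fields


def signed_sparql_request(endpoint, query, creds):
    """Build (but do not send) a signed GET request asking for SPARQL JSON results."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    headers = {
        "Accept": "application/sparql-results+json",
        "Authorization": oauth1_header("GET", url, creds),
    }
    return Request(url, headers=headers, method="GET")
```

Passing the result to `urllib.request.urlopen` sends the request; the response can then be inspected to see whether it is a result set or the login page. With `requests`, an alternative is to pass a `requests_oauthlib.OAuth1` object as `auth`.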


Wes Turner

Dec 23, 2020, 8:08:04 AM
to rdfli...@googlegroups.com
Could optional OAuth 1 (and 2) support just be added to SPARQLWrapper?

Are there other OAuth-protected SPARQL services?
