SPARQL query in R

Ekaterina Chuprikova

Sep 23, 2021, 11:31:24 AM
to ontop4obda
Hi all,

I would like to ask for help with querying a SPARQL endpoint from R. The endpoint is running and I can run queries against it, but when I try to access the data from R I get the following error:
SPARQL(endpoint, query)
Space required after the Public Identifier
SystemLiteral " or ' expected
SYSTEM or PUBLIC, the URI is missing
Error: 1: Space required after the Public Identifier
2: SystemLiteral " or ' expected
3: SYSTEM or PUBLIC, the URI is missing

When I do the same in Python it works partially: I can access the data, but I cannot create a data frame, and I get the following warning:

 RuntimeWarning: unknown response content type 'text/sparql-results+csv;charset=UTF-8' returning raw response...
  warnings.warn("unknown response content type '%s' returning raw response..." %(ct), RuntimeWarning)

My R code below.

Thank you in advance!
Best,
Ekaterina

library(DBI)
library(RPostgreSQL)
library(ggplot2)
library(XML)
library(RCurl)
library(SPARQL) # SPARQL querying package


prefix <- c("","http://ob-visly.com/apples/")
sparql_prefix <- "PREFIX : <http://ob-visly.com/apples/>
                  PREFIX base: <http://ob-visly.com/apples/>
                  PREFIX owl: <http://www.w3.org/2002/07/owl#>
                  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
                  PREFIX xml: <http://www.w3.org/XML/1998/namespace>
                  PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
                  PREFIX obda: <https://w3id.org/obda/vocabulary#>
                  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
                  "
q <- paste(sparql_prefix,
          'SELECT DISTINCT ?apple {
             ?apple a :sorts .
           }'
)

SPARQL(endpoint,q,ns=prefix,extra=options)$results

roman.ko...@gmail.com

Sep 26, 2021, 12:16:52 PM
to ontop4obda
Hi Ekaterina

I am no R guru, but a quick search online suggests that this error ("Space required after the Public Identifier ...") is usually the result of a misconfigured endpoint: the server returns an HTML page that begins with "<!DOCTYPE HTML PUBLIC" instead of the expected SPARQL result, and the XML parser on the R side then fails on the DOCTYPE declaration. When you say "the endpoint is running", what exactly do you mean? You also mention Python code. Could you provide it, please?

Best
Roman
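
One way to test this hypothesis is to look at the raw reply before any SPARQL client library parses it. The following is a minimal sketch, not something posted in the thread: the function names `classify_response` and `fetch` are made up for illustration, and it assumes the endpoint accepts the query as a `query` parameter on a GET request, as the SPARQL protocol allows.

```python
import urllib.parse
import urllib.request


def classify_response(content_type, body):
    """Guess whether an endpoint reply is a SPARQL result or an HTML page.

    Hypothetical helper: it only inspects the Content-Type header and the
    first characters of the body.
    """
    head = body.lstrip()[:200].lower()
    ctype = (content_type or "").split(";")[0].strip().lower()
    if ctype == "text/html" or head.startswith(("<!doctype html", "<html")):
        return "html"
    if ctype.startswith(("application/sparql-results", "text/sparql-results")):
        return "sparql-results"
    return "unknown"


def fetch(endpoint, query, accept="application/sparql-results+xml"):
    """Send the query with an explicit Accept header and return
    (content_type, body). urlopen raises HTTPError on 4xx/5xx replies."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(url, headers={"Accept": accept})
    with urllib.request.urlopen(request) as response:
        body = response.read().decode("utf-8", "replace")
        return response.headers.get("Content-Type"), body
```

Calling `classify_response(*fetch(endpoint, q))` with the endpoint URL and query in question would then show whether the client receives an HTML page (for example an error page or a query form) rather than a result set.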

Ekaterina Chuprikova

Sep 29, 2021, 12:24:15 PM
to ontop4obda
Thank you, Roman! I may have used the wrong term. What I meant by "running endpoint" is that I can access the data through http://localhost:8080/sparql and query it.
What would you recommend checking for an endpoint misconfiguration? Could it be an encoding problem? Everything in the ontology is in English, and I have checked that the files are saved as UTF-8.
Thank you again!

My python code and its output:
from SPARQLWrapper import SPARQLWrapper, JSON
import sparql_dataframe

endpoint = "http://localhost:8080/sparql"
q = """
SELECT DISTINCT ?apple {
?apple a :sorts .
}
"""

sparql = SPARQLWrapper("http://localhost:8080/sparql")
sparql.setQuery(q)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()
print(results)

df = sparql_dataframe.get(endpoint, q)



Output:


"C:\Program Files\Python39\python.exe" "C:/PATH/sparql/tests.py"
{'head': {'vars': ['apple']}, 'results': {'bindings': [{'apple': {'type': 'uri', 'value': 'http://ob-visly.com/apples/sorts/ID_sort=7'}}, {'apple': {'type': 'uri', 'value': 'http://ob-visly.com/apples/sorts/ID_sort=21'}},.........OTHER ID_SORT ....., {'apple': {'type': 'uri', 'value': 'http://ob-visly.com/apples/sorts/ID_sort=1076483'}}]}}
C:\ PATH \AppData\Roaming\Python\Python39\site-packages\SPARQLWrapper\Wrapper.py:1346: RuntimeWarning: unknown response content type 'text/sparql-results+csv;charset=UTF-8' returning raw response...
  warnings.warn("unknown response content type '%s' returning raw response..." %(ct), RuntimeWarning)

Process finished with exit code 0
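
Since the JSON request above already returns the bindings, one possible workaround for the data frame step is to flatten that dictionary directly instead of going through the CSV response that triggers the warning. This is a minimal sketch, assuming the result has the standard SPARQL JSON layout shown in the output; `bindings_to_rows` is a hypothetical helper name, not part of SPARQLWrapper or sparql_dataframe.

```python
def bindings_to_rows(results):
    """Flatten a SPARQL JSON result into (variables, rows).

    Each row is a dict mapping a variable name to the plain value string;
    variables left unbound in a solution are stored as None.
    """
    variables = results["head"]["vars"]
    rows = [
        {var: binding[var]["value"] if var in binding else None
         for var in variables}
        for binding in results["results"]["bindings"]
    ]
    return variables, rows


# Two bindings copied from the output above, used as sample input.
sample = {
    "head": {"vars": ["apple"]},
    "results": {"bindings": [
        {"apple": {"type": "uri",
                   "value": "http://ob-visly.com/apples/sorts/ID_sort=7"}},
        {"apple": {"type": "uri",
                   "value": "http://ob-visly.com/apples/sorts/ID_sort=21"}},
    ]},
}

variables, rows = bindings_to_rows(sample)
```

With pandas installed, `pandas.DataFrame(rows, columns=variables)` should then give one column per query variable. Note that this keeps only the value strings and drops the `type` information of each binding.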



Ekaterina Chuprikova

Oct 14, 2021, 11:46:35 AM
to ontop4obda
Thank you all for answering my question.
@Peter Hopfgartner suggested a solution that worked for me. I am posting it below:

I was able to solve the issue, which came from an out-of-date libcurl version.
To do that, you have to compile RCurl yourself using Rtools (https://cran.r-project.org/bin/windows/Rtools/). With Rtools installed, install the current version of libcurl (see https://github.com/r-windows/docs/blob/master/rtools40.md#readme) by opening the Rtools bash shell and running these commands:

pacman -Sy
pacman -S mingw-w64-{i686,x86_64}-curl

After that, the RCurl package can be reinstalled from source:

install.packages("RCurl", type="source")

All the best,
Ekaterina