unexpected error from federated query on PC


Matt Karikomi

Sep 21, 2020, 3:33:20 PM
to pathway-commons-help
Hi,
First of all, thanks for making Pathway Commons and the BioPAX standard. It has really opened my eyes to semantic web and ontology-related ideas as a solution to information overload.

I am getting an unexpected error from a federated query on PC (Ex.3), which combines the patterns in Ex.1 (tested on PC) and Ex.2 (tested on UniProt).

Ex.1: tested on PC

SELECT DISTINCT ?protein ?acc
WHERE {
    ?protein rdf:type bp:Protein .
    ?protein bp:entityReference ?eref .
    ?eref bp:xref ?xref .
    ?xref rdf:type bp:UnificationXref .
    ?xref bp:db ?bpdb .
    ?xref bp:id ?acc
    # property filters
    FILTER regex(?bpdb, "^uniprot knowledgebase")
}
LIMIT 10

Ex.2: tested on UniProt (uses two accessions "?acc" from the first 10 results of Ex.1)

SELECT DISTINCT ?protein ?goTerm
WHERE {
    VALUES (?acc) { ("O94808") ("O60936") }
    BIND (IRI(CONCAT("http://purl.uniprot.org/uniprot/", ?acc)) AS ?protein)
    ?entry a up:Protein .
    ?entry up:classifiedWith ?goTerm .
}
LIMIT 10

Ex.3: a federated query on PC that binds accessions from the PC query to IRIs on UP

SELECT DISTINCT ?protein ?acc ?goTerm
WHERE {
    ?protein rdf:type bp:Protein .
    ?protein bp:entityReference ?eref .
    ?eref bp:xref ?xref .
    ?xref rdf:type bp:UnificationXref .
    ?xref bp:db ?bpdb .
    ?xref bp:id ?acc
    # property filters
    FILTER regex(?bpdb, "^uniprot knowledgebase")
    SERVICE <https://sparql.uniprot.org/sparql> {
        BIND (IRI(CONCAT("http://purl.uniprot.org/uniprot/", ?acc)) AS ?protein)
        ?entry a up:Protein .
        ?entry up:classifiedWith ?goTerm .
    }
}
LIMIT 10

The following error is given for Ex.3:
Virtuoso 37000 Error SP031: SPARQL compiler: The list of return values contains '*' but the pattern does not contain variables

SPARQL query:
define sql:big-data-const 0 
#output-format:text/html
define sql:signal-void-variables 1 define input:default-graph-uri <http://pathwaycommons.org> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT DISTINCT ?protein ?acc ?goTerm
WHERE {
    ?protein rdf:type bp:Protein .
    ?protein bp:entityReference ?eref .
    ?eref bp:xref ?xref .
    ?xref rdf:type bp:UnificationXref .
    ?xref bp:db ?bpdb .
    ?xref bp:id ?acc
    # property filters
    FILTER regex(?bpdb, "^uniprot knowledgebase")
    SERVICE <https://sparql.uniprot.org/sparql> {
        BIND (IRI(CONCAT("http://purl.uniprot.org/uniprot/", ?acc)) AS ?protein)
        ?entry a up:Protein .
        ?entry up:classifiedWith ?goTerm .
    }
}
LIMIT 10


Gary Bader

Sep 21, 2020, 3:44:30 PM
to pathway-commons-help

Hi Matt - great that this has been useful for you - thanks for the positive feedback. I assume you’re making these calls at http://rdf.pathwaycommons.org/. Unfortunately, we don’t have a lot of SPARQL query expertise, as we normally use the REST API and Java PaxTools to query Pathway Commons, so I’m not sure what the issue is with this query. If you’re able to figure it out and it turns out to be a data issue rather than, for example, an issue with the Virtuoso database software, we would be interested to know so we could investigate how to fix it.

Best,
Gary

--
You received this message because you are subscribed to the Google Groups "pathway-commons-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pathway-commons-...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/pathway-commons-help/3fc8473a-c7b2-429f-98c2-2136ee001c1fn%40googlegroups.com.

Matt Karikomi

Sep 21, 2020, 4:37:20 PM
to pathway-co...@googlegroups.com
Thanks Gary, 
I submitted a ticket to UniProt as well and will post any advice they provide
Best, Matt

NB: Ex.2 and Ex.3 contain the following typo (correcting it does not change the error for Ex.3):

  original: BIND (IRI(CONCAT("http://purl.uniprot.org/uniprot/",?acc)) AS ?protein)
  should be: BIND (IRI(CONCAT("http://purl.uniprot.org/uniprot/",?acc)) AS ?entry)
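
With the variable name corrected, the Ex.2 pattern can be generated for any list of accessions. The following is a minimal Python sketch, not the author's code: the function name `build_go_query` is hypothetical, the `up:` prefix is assumed to resolve to `http://purl.uniprot.org/core/`, and the accession check is a loose sanity filter rather than the official UniProt accession pattern. It selects `?entry` instead of `?protein`, since `?protein` is no longer bound after the fix, and it only builds the query text without contacting any endpoint.

```python
import re

# Loose sanity check for accession-like tokens (an assumption for this sketch,
# not the official UniProt accession pattern).
_ACCESSION = re.compile(r"^[A-Z0-9]{6,10}$")


def build_go_query(accessions, limit=10):
    """Return SPARQL text for the corrected Ex.2 pattern (BIND ... AS ?entry)."""
    for acc in accessions:
        if not _ACCESSION.match(acc):
            raise ValueError(f"unexpected accession: {acc!r}")
    # One single-column row per accession, as in the VALUES clause of Ex.2.
    values = " ".join(f'("{acc}")' for acc in accessions)
    return "\n".join([
        "PREFIX up: <http://purl.uniprot.org/core/>",
        "SELECT DISTINCT ?entry ?goTerm",
        "WHERE {",
        f"  VALUES (?acc) {{ {values} }}",
        '  BIND (IRI(CONCAT("http://purl.uniprot.org/uniprot/", ?acc)) AS ?entry)',
        "  ?entry a up:Protein .",
        "  ?entry up:classifiedWith ?goTerm .",
        "}",
        f"LIMIT {int(limit)}",
    ])


if __name__ == "__main__":
    print(build_go_query(["O94808", "O60936"]))
```

Validating the accessions before interpolating them keeps stray quotes or braces out of the generated query.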


Igor R

Sep 21, 2020, 5:55:10 PM
to pathway-co...@googlegroups.com
I don’t think there are entities in the PC graph with purl.uniprot.org in the URI. You can download the full Pathway Commons BioPAX RDF from pathwaycommons.org/downloads to see the ProteinReference URIs there. Perhaps try the http://identifiers.org/uniprot/ prefix instead, or search and browse at rdf.pathwaycommons.org/fct/.
Most PC entities have a URI namespace like http://pathwaycommons.org/pc12/, while reference/utility objects and controlled vocabularies have http://identifiers.org/ based URIs.

IR
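
The namespace difference described above can be bridged on the client side. This is a small Python sketch under the assumption that PC reference URIs for UniProt entries have the form `http://identifiers.org/uniprot/<accession>`, which should be checked against the downloaded BioPAX RDF; the function name `to_uniprot_purl` is hypothetical.

```python
IDENTIFIERS_PREFIX = "http://identifiers.org/uniprot/"
PURL_PREFIX = "http://purl.uniprot.org/uniprot/"


def to_uniprot_purl(uri):
    """Map an identifiers.org UniProt URI to the purl.uniprot.org form.

    Returns None for URIs outside the identifiers.org/uniprot/ namespace,
    for example PC entity URIs under http://pathwaycommons.org/pc12/.
    """
    if not uri.startswith(IDENTIFIERS_PREFIX):
        return None
    accession = uri[len(IDENTIFIERS_PREFIX):]
    # An empty remainder means the URI was just the bare prefix.
    return PURL_PREFIX + accession if accession else None
```

For example, `to_uniprot_purl("http://identifiers.org/uniprot/O94808")` gives the IRI that UniProt's own endpoint uses for that entry.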


Jerven Bolleman

Sep 21, 2020, 6:20:10 PM
to pathway-co...@googlegroups.com
Hi All,

I will reply to the helpdesk ticket at UniProt, but this is a technical issue in the Virtuoso software used on both sides.

The same query sent from the UniProt endpoint to rdf.pathwaycommons.org/sparql works, as one can see at https://tinyurl.com/y35evd8u

SELECT
  ?protein
  ?acc
  ?entry
  ?goTerm
WHERE
{
  {
    SELECT
      ?protein
      (IRI(CONCAT("http://purl.uniprot.org/uniprot/", ?acc)) AS ?entry)
      ?acc
      ?bpdb
    WHERE {
      SERVICE <https://rdf.pathwaycommons.org/sparql/> {
        ?protein rdf:type bp:Protein .
        ?protein bp:entityReference ?eref .
        ?eref bp:xref ?xref .
        ?xref rdf:type bp:UnificationXref .
        ?xref bp:db "uniprot knowledgebase"^^xsd:string .
        ?xref bp:id ?acc .
      }
    }
  }
  ?entry up:classifiedWith ?goTerm .
  ?goTerm a owl:Class . # needed otherwise one could get a uniprot keyword as well.
}

It is an interesting query, as it uses both SPARQL endpoints to enrich the pathway data with GO term annotations from the current UniProt release.
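
A query like the one above can also be submitted programmatically instead of through the web form. The sketch below uses only the Python standard library and follows the SPARQL 1.1 Protocol (form-encoded POST, JSON results). The endpoint URL `https://sparql.uniprot.org/sparql` is assumed to be the current UniProt SPARQL endpoint, the helper names `build_request` and `run_query` are hypothetical, and `run_query` needs network access and both endpoints to be online.

```python
import json
import urllib.parse
import urllib.request

# Assumed public UniProt SPARQL endpoint; adjust if it has moved.
UNIPROT_ENDPOINT = "https://sparql.uniprot.org/sparql"


def build_request(query, endpoint=UNIPROT_ENDPOINT):
    """Prepare a SPARQL Protocol POST request asking for JSON results."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,  # supplying a body makes urllib send a POST
        headers={
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def run_query(query, endpoint=UNIPROT_ENDPOINT, timeout=120):
    """Send the query and return the list of result bindings."""
    request = build_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["results"]["bindings"]
```

Each returned row is a dictionary keyed by variable name, so `row["goTerm"]["value"]` would hold the GO term IRI for that row. Federated queries can be slow, hence the generous default timeout.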

Yes, in Pathway Commons the IRI used for UniProt resources is an identifiers.org one, but life would be easier if it used the purl.uniprot.org/uniprot/ ones.

Regards,
Jerven



--
Jerven Bolleman
m...@jerven.eu

Jerven Bolleman

Sep 21, 2020, 6:24:07 PM
to pathway-co...@googlegroups.com
I forgot: support for federated queries might improve if the following is run in isql on a Virtuoso instance.


It basically tells Virtuoso that UniProt supports SPARQL 1.1, so no down-translation to SPARQL 1.0 is needed.

Regards,
Jerven
--
Jerven Bolleman
m...@jerven.eu

Matt Karikomi

Sep 21, 2020, 10:20:08 PM
to pathway-co...@googlegroups.com
Hi Jerven, 
Thanks for this solution, it works perfectly. It's really great to see how well you guys at UniProt (and SIB in general) are supporting linked data, especially now that funding for this effort seems to have dried up at EBI.

Best, Matt


Jerven Bolleman

Sep 22, 2020, 2:36:22 AM
to pathway-co...@googlegroups.com
Hi Matt,

SIB and EBI are completely different kinds of institutes. SIB is federated and consists of lean groups that are often hardware poor, while EBI is centralised and hardware rich.
So for SIB any federated query capability is a major advantage. SPARQL has that built in, and it immediately ticks the A, I, and R of FAIR off the checklist.
Requests for direct access to the data, in any kind of database, used to be common on our helpdesks.
Now we point people to the SPARQL endpoints and they are happy.

We still have a lot of work to do to improve our endpoints, and faster federated queries are request number 1.

Regards,
Jerven





--
Jerven Bolleman
m...@jerven.eu

Matt Karikomi

Sep 22, 2020, 1:57:00 PM
to pathway-co...@googlegroups.com
Hi Jerven,
Thanks for that explanation. As a 'blind consumer', I hadn't really considered how an institute's business model could affect the data model.
Best, Matt
