Aggregate query with subquery failing on ONTOP endpoint

76 views
Skip to first unread message

Rashif Rahman

unread,
Jul 24, 2022, 1:47:25 PM7/24/22
to ontop4obda
Hi all

I am supporting someone with an ONTOP implementation via the Protege plug-in. The data is mapped and we are now trying to translate some application-side SQL queries into SPARQL for testing and validation.

However, a seemingly simple query turned out to be unresolvable. We are not having luck trying to get results from a query of the form:

```
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX : <http://foobar.abc/ontology/demo#>

SELECT (SUM(?length)/?totalLength as ?calc) WHERE {
  ?foo :length ?length .

  {
    SELECT (SUM(?length) as ?totalLength) WHERE {
      ?foo :length ?length .
    }
  }  
}
GROUP BY ?totalLength
```

This is resulting in the error:

```
it.unibz.inf.ontop.exception.OntopReformulationException: it.unibz.inf.ontop.exception.MinorOntopInternalBugException: Was expecting RDF term type functional terms to have a variable as argument [RDF_TYPE{0=xsd:decimal}(CASE5_UNORDERED(STRICT_EQ2(v2,"0"^^BIGINT),"0"^^BIGINT,STRICT_EQ2(v2,"1"^^BIGINT),"0"^^BIGINT,NULL))]
```

The lengths are all `xsd:decimal`, if that matters. The expected result for this example is `1` because we are dividing a sum by the same sum (that was bound in a subquery). (The actual query has a complex filter for which ?foo to get, which is a subset of all the ?foos, and therefore the expected result would be a fraction.)

Trying to create an MWE in RDF appears to work in a plain vanilla triplestore, but as ONTOP is a virtual layer over an SQL database this becomes difficult to debug. Would be really grateful for some pointers, thanks!

Example MWE data:

```
@prefix : <http://foobar.abc/ontology/demo#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

:data_1 :type "A" ;
        :length "123.5"^^xsd:decimal .

:data_2 :type "A" ;
        :length "1234.1"^^xsd:decimal .

:data_3 :type "B" ;
        :length "122.67"^^xsd:decimal .

:data_4 :type "C" ;
        :length "1214.678"^^xsd:decimal .
```

Rashif Rahman

unread,
Jul 25, 2022, 3:11:01 PM7/25/22
to ontop4obda
We just discovered (by accident) that with a mathematical trick this somehow works. If you divide by 1 in the subquery you get the result as expected:

`    SELECT (SUM(?length)/1 as ?totalLength) WHERE {`

However, this still leaves us questioning why we need to employ this trick to get it working.

Benjamin Cogrel

unread,
Aug 1, 2022, 8:08:41 AM8/1/22
to Rashif Rahman, ontop4obda
Hi Rashif,

Thank you for reporting the issue! I managed to reproduce and fix it: https://github.com/ontop/ontop/commit/56b55aaa745b18b9a5c02d14bb35d1746e723fa3

It will be part of the 4.3.0 release.

Best,
Benjamin


--
Please follow our guidelines on how to report a bug https://ontop-vkg.org/community/contributing/bug-report
---
You received this message because you are subscribed to the Google Groups "ontop4obda" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ontop4obda+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ontop4obda/4ad3f9f9-b5d8-4194-87db-4ae6cdaff212n%40googlegroups.com.

Rashif Rahman

unread,
Aug 1, 2022, 2:30:53 PM8/1/22
to ontop4obda
Thank goodness, we almost lost hope after pulling our hair over and over and exhausting all tricks in the books! The division trick did not work for another (MySQL) database (the one that worked was a Postgres).  How about the Protege plug-in? Do we report that elsewhere? It exhibits the same problem.

Benjamin Cogrel

unread,
Aug 2, 2022, 5:06:01 AM8/2/22
to Rashif Rahman, ontop4obda
Hi Rashif,

If you experience any other problems with aggregation, please let us know.

For the Protégé plugin, I just built for you a snapshot version:

You need to put the JAR in the plugins directory of Protégé and remove the previous Ontop JAR.

This fix will be included in the next release of Ontop-protégé (4.3.0).

Best,
Benjamin

Rashif Rahman

unread,
Aug 2, 2022, 8:15:00 PM8/2/22
to ontop4obda
Thanks a bunch! Just one more request for assistance:

We did not have luck running with docker; the existing environment was set up "manually", involving configuring the jdbc stuff in the properties file and calling the endpoint with the respective arguments. However, we don't know how exactly we should translate that approach to docker.

It occurred to us that we might need to use docker-compose as other databases/services are involved. We tried editing the original docker-compose file in the project directory to reflect this new snapshot version, but we got what looked like infinite errors.

Given this situation, could you suggest how we might translate the original/older approach into the new docker approach? We don't know what ONTOP_PORTAL_FILE is, and how to replace all the default "bgee" stuff for postgresql. Is there some documentation somewhere?

We are also running two instances because we are mapping two databases (one mysql, another pgsql) - is there a better approach for aligning more than one database (possibly of different SQL flavours)?

Example command:

/home/ontop/ontop endpoint -t /home/ontop/data/ontology.ttl  --mapping=/home/ontop/data/ontology.obda --properties=/home/ontop/data/ontology.properties --port 8080 > /home/ontop/data/log/ontop.log 2>&1

P.S: It would be great if you could provide a "friendly" explanation of the error we initially got and of what the underlying problem or root cause was.

Benjamin Cogrel

unread,
Aug 4, 2022, 3:14:25 AM8/4/22
to Rashif Rahman, ontop4obda
Hi Rashif,


Given this situation, could you suggest how we might translate the original/older approach into the new docker approach? We don't know what ONTOP_PORTAL_FILE is, and how to replace all the default "bgee" stuff for postgresql. Is there some documentation somewhere?

The main documentation page for the Docker image is on Docker Hub: https://hub.docker.com/r/ontop/ontop-endpoint . Most env variables correspond to parameters of the CLI command.

For an example of a Docker compose file, you can have a look at our destination tutorial: https://github.com/ontopic-vkg/destination-tutorial/blob/master/docker-compose.solution.yml .

 

We are also running two instances because we are mapping two databases (one mysql, another pgsql) - is there a better approach for aligning more than one database (possibly of different SQL flavours)?

If you would like to query multiple databases at the same time, I recommend using a database federator like Dremio (https://ontop-vkg.org/tutorial/federation/) or Spark (I don't have personal experience with it). The database federators expose the tables from all the sources and then you can map them. Ontop send the SQL query to the federator and the latter takes care of the rest.

Alternatively, you can set one SPARQL endpoint per database, but then if you want to query them together, you need to use a SPARQL federator like FedX (https://rdf4j.org/documentation/programming/federation/). This is way less effective in my personal experience.

Another idea is to use the Foreign Data Wrapper capability of PostgreSQL, but I never tried it.
 

P.S: It would be great if you could provide a "friendly" explanation of the error we initially got and of what the underlying problem or root cause was.

The typing system in SPARQL is quite different from the one in SQL. Most SQL systems are strongly typed, i.e. one column can only have one datatype. In SPARQL, that's not the case, so when we translate SPARQL queries into SQL ones, we need to keep track of the possible types and encode the typing rules associated to SPARQL functions.

For instance, in SPARQL, the SUM aggregation returns an xsd:decimal value when all values in the group are xsd:decimal; but if there is no value in the group, it returns an xsd:integer .

In order to be compliant with SPARQL, we need to keep track of all these corner cases. The error you experienced happened for a particular combination of these corner cases (2 aggregations mixed with division).

Best,
Benjamin

 

Rashif Rahman

unread,
Aug 4, 2022, 4:33:06 PM8/4/22
to ontop4obda
Awesome, thanks a lot for all your support! This really helped us get the progress we needed. We'll look into the federation, that sounds like something we'd want.

P.S: I realized we didn't need docker-compose, and saw that there was a tutorial linked from the docker project page, which gave us the command we needed.
Reply all
Reply to author
Forward
0 new messages