OBDA mapping of rdf:type table


Lorenz B.

Jul 14, 2020, 1:55:59 AM
to ontop4obda
Hi all,

just a quick question, as I couldn't find anything in the current documentation and, to be honest, the GitHub wiki is confusing given that it refers to different versions, some of which are already outdated.
I'm using Ontop v4 RC1 now.

So my question is: given a table containing class assertions, i.e. one with two columns s and C, how does the mapping for it work? I tried the obvious one, i.e.

SELECT s, C FROM RDF_TYPE_TABLE_NAME

<{s}> rdf:type <{C}> .


but when I use a SPARQL query like

SELECT ?s ?o WHERE {?s rdf:type <SOME_CLASS_URI>; :p ?o }


it always returns an empty SQL query, I guess because it tries to make use of some metadata. But it's not clear to me how to either i) avoid this or ii) provide Ontop with whatever it needs. Note that you do not always have an ontology containing the schema axioms/triples. Is this necessary? Do you have any examples?


Thanks in advance!

Cheers,
Lorenz

Benjamin Cogrel

Jul 14, 2020, 2:35:24 AM
to ontop...@googlegroups.com, Lorenz B.

Hi Lorenz,

In your SPARQL query, ":p" is not a variable but a constant. If you replace it with "?p", I would expect your query to return results (provided, of course, that the DB table is not empty).

What Ontop does not include at the moment in the virtual RDF graph are the T-Box axioms (e.g. :Employee rdfs:subClassOf :Person). We are planning to work on that in the coming months.

In the past (v1.x), Ontop had some limitations treating properties and classes as variables, as they were having a special status internally (they were called "predicates"). It is not the case anymore. Mappings are now internally represented as triples or quads.

One final remark: beware that using very generic storage tables like RDF_TYPE_TABLE_NAME prevents Ontop from applying most of its optimization techniques such as self-join elimination and pruning the tree based on incompatible IRI templates. In such a setting, it can easily produce inefficient queries. In other words, structural information is essential to Ontop and it would be better not to hide it behind triplestore-like generic storage tables.
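
To illustrate the point about IRI templates, here is a minimal sketch of the idea, not Ontop's actual implementation. The class and method names are hypothetical, and the check is deliberately coarse: it only compares the constant part of each template before the first placeholder.

```java
public class IriTemplatePruningSketch {

    /** Returns the constant part of a template before its first "{column}" placeholder. */
    public static String staticPrefix(String template) {
        int i = template.indexOf('{');
        return i < 0 ? template : template.substring(0, i);
    }

    /**
     * Two templates can only produce the same IRI if one static prefix is a
     * prefix of the other. If this returns false, a join between terms built
     * from the two templates is necessarily empty and can be pruned.
     * (Simplified assumption: only the leading constant part is compared.)
     */
    public static boolean mayProduceSameIri(String t1, String t2) {
        String p1 = staticPrefix(t1);
        String p2 = staticPrefix(t2);
        return p1.startsWith(p2) || p2.startsWith(p1);
    }

    public static void main(String[] args) {
        // Specific templates with different prefixes: the join can be pruned.
        System.out.println(mayProduceSameIri(
                "http://example.org/person/{id}", "http://example.org/dept/{id}"));
        // Generic template "{s}" has an empty prefix: nothing can be ruled out.
        System.out.println(mayProduceSameIri(
                "{s}", "http://example.org/dept/{id}"));
    }
}
```

With a generic template such as <{s}>, the static prefix is empty, so a check of this kind can never exclude anything. That is the kind of structural information that gets lost behind a generic storage table.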


Best,

Benjamin

--
Please follow our guidelines on how to report a bug https://ontop-vkg.org/community/contributing/bug-report
---
You received this message because you are subscribed to the Google Groups "ontop4obda" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ontop4obda+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ontop4obda/698ca2b0-55ca-4be3-ad36-c16bd9170d51o%40googlegroups.com.

Lorenz B.

Jul 14, 2020, 2:59:20 AM
to ontop4obda
Hi Benjamin,

Thanks for the quick response (as usual).

OK, I guess I was not clear enough and my example was confusing. Let's ignore the property :p. I was keeping the example short, but there was indeed another mapping for :p.

Let's make the example even simpler, which is always better:

I have a single Ontop native mapping


[MappingDeclaration] @collection [[
mappingId    "httpP3aP2fP2fwwwCw3CorgP2f1999P2f02P2f22dashrdfdashsyntaxdashnsP23type"
source       SELECT "s", "o" FROM "httpP3aP2fP2fwwwCw3CorgP2f1999P2f02P2f22dashrdfdashsyntaxdashnsP23type"
target       <{s}> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <{o}> .
]]


and the database contains just this single table.

If I run the SPARQL query

select * where {?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?c }

the query will be empty.

I also dug into the code:

At some point the QuestQueryProcessor calls

IQ unfoldedIQ = queryUnfolder.optimize(rewrittenIQ);

which itself tries to map those intensional data nodes to definitions based on the mappings:


private Optional<IQ> getDefinition(RDFAtomPredicate predicate,
                                   ImmutableList<? extends VariableOrGroundTerm> arguments) {
    return predicate.getPropertyIRI(arguments)
            .map(i -> i.equals(RDF.TYPE)
                    ? getRDFClassDefinition(predicate, arguments)
                    : mapping.getRDFPropertyDefinition(predicate, i))
            .orElseGet(() -> getStarDefinition(predicate));
}

The rdf:type triples are indeed handled differently, but I'm wondering how Ontop knows about classes at this point when calling getRDFClassDefinition, because when I look into MappingImpl the classDefinitions ImmutableTable object is empty. To me it looks like it would only be non-empty if I provided an ontology in advance, and I can't see where Ontop issues a SQL query to gather all classes from the table. Without those definitions the IQ becomes an empty node, as happens in my case, and so the generated SQL query is indeed empty.

Am I missing an important part?

Cheers,
Lorenz

Benjamin Cogrel

Jul 20, 2020, 2:55:38 AM
to Lorenz B., ontop4obda

Hi Lorenz,

I didn't manage to reproduce your problem. I have created a simple mapping for H2 like yours (but with a short table name) in the branch releasing/v4.0 and it was working. Out of curiosity, have you tried with a shorter table name?

Regarding the implementation, the distinction between properties and classes you pointed out in the unfolder is there for indexing purposes, that is for being faster at finding the relevant definitions.

However, there is a bit of "magic" that happens earlier, at mapping processing time. Your mapping entry is what we call a meta-mapping entry, as the class is not a constant but comes directly from the DB. At mapping processing time, Ontop issues a SQL query to get all the possible classes and then replaces the meta-mapping entry with regular ones that have constant classes (we call this mechanism meta-mapping expansion). Although it is not part of R2RML, it is needed to be able to saturate the mapping using the ontology (the main reasoning task performed by Ontop). The downside is that new classes appearing in the DB after the initialization of Ontop won't be considered (you would need to restart Ontop).
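
To make the expansion step more concrete, here is a rough sketch of the idea only, not the actual Ontop code. The class, method, and field names are hypothetical, and the distinct class values are passed in as a list where a real system would obtain them from the DB (for example with a SELECT DISTINCT on the class column).

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class MetaMappingExpansionSketch {

    /** Simplified mapping entry: a SQL source query and a triple template as target. */
    public static final class MappingEntry {
        public final String source;
        public final String target;

        public MappingEntry(String source, String target) {
            this.source = source;
            this.target = target;
        }
    }

    /**
     * Replaces the meta-mapping "<{s}> rdf:type <{o}> ." over the given table
     * by one regular entry per distinct class value, each with a constant class.
     */
    public static List<MappingEntry> expand(String table, List<String> classColumnValues) {
        // Deduplicate while keeping the first-seen order.
        Set<String> distinctClasses = new LinkedHashSet<>(classColumnValues);
        List<MappingEntry> expanded = new ArrayList<>();
        for (String cls : distinctClasses) {
            // Escape single quotes so the class IRI is a valid SQL string literal.
            String literal = cls.replace("'", "''");
            String source = "SELECT \"s\" FROM \"" + table + "\" WHERE \"o\" = '" + literal + "'";
            String target = "<{s}> rdf:type <" + cls + "> .";
            expanded.add(new MappingEntry(source, target));
        }
        return expanded;
    }

    public static void main(String[] args) {
        List<MappingEntry> entries = expand("RDF_TYPE_TABLE_NAME", List.of(
                "http://example.org/Person",
                "http://example.org/Employee",
                "http://example.org/Person"));
        for (MappingEntry e : entries) {
            System.out.println(e.source);
            System.out.println("    " + e.target);
        }
    }
}
```

Since the list of classes is computed once, a class that shows up in the table later has no corresponding entry, which matches the restart caveat above.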

Note that Ontop does not require all the classes used in the mapping to be present in the ontology (except if you are using Protégé). The Ontop endpoint can also work without an ontology.


Best,

Benjamin


Lorenz B.

Jul 20, 2020, 3:17:52 AM
to ontop4obda
Hi Benjamin,

well, you're right. I figured that Ontop does this kind of mapping expansion somewhere in between loading the OBDA mappings from file and transforming the SPARQL query to SQL. So, it was a misunderstanding from my end - I got it working now, thanks for the very helpful explanation.

Out of curiosity, how much does the number of mappings affect the query transformation? Do you have any metrics, numbers, or even a theoretical complexity estimate? I'm wondering because it looks like classes and language tags make the number of mappings explode. If we consider Wikidata as an example, which has many languages and also lots of classes, it looks like one would get tens of thousands of mappings if one did, for example, vertical partitioning, i.e. used one table (s,o) per property p.
Do you have any benchmarks or numbers in that regard?

Also, I can see that Ontop supports a lot of SPARQL 1.1 features, but not all. I found SPARQL compliance tests in the GitHub project. Are those still up to date, or is there a document somewhere listing the supported/unsupported SPARQL language constructs? Something like "all of SPARQL 1.1 except for bnode(), strdt(), ..."


Thanks for the support as usual, I really appreciate it. And congrats on the impressive Ontop project.

Cheers,
Lorenz

Benjamin Cogrel

Jul 20, 2020, 2:30:54 PM
to Lorenz B., ontop4obda

Hi Lorenz,

In a typical VKG setting over legacy DBs, the number of language tags is usually small. Having thousands of classes would also imply a fairly complex data model. I am not aware of a benchmark for testing a large number of mappings caused by many classes and language tags.

I would expect something like Wikidata to be challenging for Ontop. Wikidata has a very generic data structure, in particular when it comes to relations between high-level entities. Since the generic structure is very regular, it would be easy to write some scripts for generating the R2RML mapping. In any case, I would not expect the performance with Ontop to be great, as this generic structure does not provide enough information for applying its most common optimization techniques.

Thanks for the remark on documenting the fragment of SPARQL 1.1 that is currently supported; we definitely need to bring it to the website. We already have some material, which we included in a recent paper submission.

Best,

Benjamin
