SQL INNER JOIN using ONTOP mappings

67 views

Skip to first unread message

Jaime Andrade Pinto

unread,

Jun 15, 2020, 1:43:31 PM6/15/20

to ontop4obda

All,

I have a question about the use of ONTOP mappings. I'm sorry about the lengthy message, but I think I have an interesting situation that might be useful to other people in a similar situation. My environment is Windows 8, H2 database, Protege v5 and Ontop v3 plugin.

1) Suppose I have 2 database schemas with 1 table each having the following structure and data:

CREATE SCHEMA schema1;
CREATE TABLE schema1.t1 (
t1_id1 varchar(255) NOT NULL,
t1_id2 varchar(255) NOT NULL,
t1_value INT NOT NULL);

CREATE SCHEMA schema2;
CREATE TABLE schema2.t2 (
t2_id1 varchar(255) NOT NULL,
t2_id2 varchar(255) NOT NULL,
t2_value INT NOT NULL);

INSERT INTO schema1.t1 VALUES ('aaaa', 'bbbb', 100);
INSERT INTO schema1.t1 VALUES ('aaaa', 'cccc', 200);
INSERT INTO schema1.t1 VALUES ('aaaa', 'dddd', 300);
INSERT INTO schema1.t1 VALUES ('eeee', 'ffff', 400);
INSERT INTO schema1.t1 VALUES ('gggg', 'hhhh', 500);

INSERT INTO schema2.t2 VALUES ('aaaa', 'bbbb', 150);
INSERT INTO schema2.t2 VALUES ('aaaa', 'iiii', 250);
INSERT INTO schema2.t2 VALUES ('eeee', 'kkkk', 350);
INSERT INTO schema2.t2 VALUES ('gggg', 'hhhh', 450);
INSERT INTO schema2.t2 VALUES ('llll', 'mmmm', 550);

2) This is the model of a typical SQL query that I want to execute and the actual results returned from this hypothetical data:

SELECT T1_ID1 , T1_ID2 , T1_VALUE , T2_VALUE
FROM SCHEMA1.T1 as A
INNER JOIN SCHEMA2.T2 as B
ON (
A.T1_ID1 = B.T2_ID1 AND
A.T1_ID2 = B.T2_ID2 );

RESULTS
----------------
T1_ID1 T1_ID2 T1_VALUE T2_VALUE
aaaa bbbb 100 150
gggg hhhh 500 450
(2 rows, 2 ms)

3) I want to run the same query using SPARQL, then I built a simple new ontology and did these mappings :

a) define 1 single object, as subclass_of thing (in this simplification I think it is like a "blank node"):
<Declaration>
<Class IRI="#myObject"/>
</Declaration>
<SubClassOf>
<Class IRI="#myObject"/>
<Class abbreviatedIRI="owl:Thing"/>
</SubClassOf>
b) 1 data property for each column with that same template:
<Declaration>
<DataProperty IRI="#has_property_t1id1"/>
</Declaration>
<DataPropertyDomain>
<DataProperty IRI="#has_property_t1id1"/>
<Class IRI="#myObject"/>
</DataPropertyDomain>
<DataPropertyRange>
<DataProperty IRI="#has_property_t1id1"/>
<Datatype abbreviatedIRI="xsd:string"/>
</DataPropertyRange>
  c) Mappings (template for 3 triples, one for each tab/column in RDB)
t1id1 mapping
:SCHEMA1/T1/ a :myObject ;
:has_property_t1id1 {T1_ID1}^^xsd:string ;
:has_property_t1id2 {T1_ID2}^^xsd:string ;
:has_property_t1value {T1_VALUE}^^xsd:int .

4) So I tried many different SPARQL solutions and couldn't replicate the same result. See some wrong attempts below. The first query uses "nested select" and the second uses the UNION clause.

SELECT DISTINCT ?T1_ID1 ?T1_ID2 ?T1_VALUE ?T2_VALUE
WHERE
{:myObject
:has_property_t1id1 ?T1_ID1 ;
:has_property_t1id2 ?T1_ID2 ;
:has_property_t1value ?T1_VALUE .
FILTER (?T1_ID1 = ?T2_ID1 && ?T1_ID2 = ?T2_ID2 )
{SELECT DISTINCT ?T2_ID1 ?T2_ID2 ?T2_VALUE
WHERE
{:myObject
:has_property_t2id1 ?T2_ID1 ;
:has_property_t2id2 ?T2_ID2 ;
:has_property_t2value ?T2_VALUE . } } }

SELECT DISTINCT ?T1_ID1 ?T1_ID2 ?T1_VALUE ?T2_VALUE
WHERE {
{:myObject
:has_property_t1id1 ?T1_ID1 ;
:has_property_t1id2 ?T1_ID2 ;
:has_property_t1value ?T1_VALUE }
UNION { :myObject :has_property_t2value ?T2_VALUE } }

5) Notes and questions:
  a) a simple SPARQL query runs fine, like this one: SELECT ?T1ID1 WHERE {:myObject :has_property_t1id1 ?T1ID1} .
  b) all SQL from mappings editor runs OK.
  c) I tried some SPARQL variations like using FILTER or UNION, without success.
  d) the situation I am describing here is an oversimplification of a real problem of legacy RDB integration.
  e) Question 1: using this ONTOP mappings, how to write an SPARQL query to works as an SQL "INNER JOIN"?
   f) Question 2: how to model this type of RDB integration without modifying the original data model?

Thank you all for any help.

Jaime A. Pinto

UFMG - Brazil

Benjamin Cogrel

unread,

Jun 24, 2020, 2:21:36 AM6/24/20

to ontop...@googlegroups.com

Hi Jaime,

The standard practice is to use the Primary Key (here composite) in the IRI template of the subject of the triple.

In your mapping entry, the subject does not use any variable.

Instead of

:SCHEMA1/T1/ a :myObject ;

I would write

:SCHEMA1/T1/{t1_id1}/{t1_id2} a :myObject ;

To better understand the basics of Ontop, I suggest you to have a look at our official tutorial: https://ontop-vkg.org/tutorial/ . It should clearly answer your two questions. If something remains unclear, let us know.

Best,

Benjamin

--
Please follow our guidelines on how to report a bug https://ontop-vkg.org/community/contributing/bug-report
---
You received this message because you are subscribed to the Google Groups "ontop4obda" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ontop4obda+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ontop4obda/6cf5526a-efb9-46e3-a222-aacba41b2728o%40googlegroups.com.

Reply all

Reply to author

Forward

0 new messages