SQL INNER JOIN using ONTOP mappings

67 views
Skip to first unread message

Jaime Andrade Pinto

unread,
Jun 15, 2020, 1:43:31 PM6/15/20
to ontop4obda
All,

I have a question about the use of ONTOP mappings. I'm sorry about the lengthy message, but I think I have an interesting situation that might be useful to other people in a similar situation. My environment is Windows 8, H2 database, Protege v5 and Ontop v3 plugin.

1) Suppose I have 2 database schemas with 1 table each having the following structure and data:

CREATE SCHEMA schema1;
CREATE TABLE schema1.t1 (
  t1_id1 varchar(255) NOT NULL,
  t1_id2 varchar(255) NOT NULL,
  t1_value INT NOT NULL);

CREATE SCHEMA schema2;
CREATE TABLE schema2.t2 (
  t2_id1 varchar(255) NOT NULL,
  t2_id2 varchar(255) NOT NULL,
  t2_value INT NOT NULL);
 
INSERT INTO schema1.t1 VALUES ('aaaa', 'bbbb', 100);
INSERT INTO schema1.t1 VALUES ('aaaa', 'cccc', 200);
INSERT INTO schema1.t1 VALUES ('aaaa', 'dddd', 300);
INSERT INTO schema1.t1 VALUES ('eeee', 'ffff', 400);
INSERT INTO schema1.t1 VALUES ('gggg', 'hhhh', 500);

INSERT INTO schema2.t2 VALUES ('aaaa', 'bbbb', 150);
INSERT INTO schema2.t2 VALUES ('aaaa', 'iiii', 250);
INSERT INTO schema2.t2 VALUES ('eeee', 'kkkk', 350);
INSERT INTO schema2.t2 VALUES ('gggg', 'hhhh', 450);
INSERT INTO schema2.t2 VALUES ('llll', 'mmmm', 550);

2) This is the model of a typical SQL query that I want to execute and the actual results returned from this hypothetical data:

SELECT T1_ID1 , T1_ID2 , T1_VALUE , T2_VALUE
  FROM SCHEMA1.T1   as A
  INNER JOIN SCHEMA2.T2 as B
  ON (
    A.T1_ID1 = B.T2_ID1 AND
    A.T1_ID2 = B.T2_ID2 );

RESULTS
----------------
T1_ID1   T1_ID2   T1_VALUE   T2_VALUE  
aaaa        bbbb       100             150
gggg        hhhh       500             450
(2 rows, 2 ms)

3) I want to run the same query using SPARQL, then I built a simple new ontology and did these mappings :

 a) define 1 single object, as subclass_of thing (in this simplification I think it is like a "blank node"):
   <Declaration>
        <Class IRI="#myObject"/>
   </Declaration>
   <SubClassOf>
      <Class IRI="#myObject"/>
      <Class abbreviatedIRI="owl:Thing"/>
   </SubClassOf>
 b) 1 data property for each column with that same template:
    <Declaration>
        <DataProperty IRI="#has_property_t1id1"/>
    </Declaration>
    <DataPropertyDomain>
        <DataProperty IRI="#has_property_t1id1"/>
        <Class IRI="#myObject"/>
    </DataPropertyDomain>
    <DataPropertyRange>
        <DataProperty IRI="#has_property_t1id1"/>
        <Datatype abbreviatedIRI="xsd:string"/>
    </DataPropertyRange>
  c) Mappings (template for 3 triples, one for each tab/column in RDB)
   t1id1 mapping
   :SCHEMA1/T1/ a :myObject ;
     :has_property_t1id1 {T1_ID1}^^xsd:string ;
     :has_property_t1id2 {T1_ID2}^^xsd:string ;
     :has_property_t1value {T1_VALUE}^^xsd:int .

4) So I tried many different SPARQL solutions and couldn't replicate the same result. See some wrong attempts below. The first query uses "nested select" and the second uses the UNION clause.

SELECT DISTINCT ?T1_ID1 ?T1_ID2 ?T1_VALUE ?T2_VALUE  
  WHERE
    {:myObject
       :has_property_t1id1 ?T1_ID1 ;
       :has_property_t1id2 ?T1_ID2 ;
       :has_property_t1value ?T1_VALUE .
      FILTER (?T1_ID1 = ?T2_ID1 && ?T1_ID2 = ?T2_ID2 )
     {SELECT DISTINCT ?T2_ID1 ?T2_ID2 ?T2_VALUE  
       WHERE
         {:myObject
            :has_property_t2id1 ?T2_ID1 ;
            :has_property_t2id2 ?T2_ID2 ;
            :has_property_t2value ?T2_VALUE . } } }

SELECT DISTINCT ?T1_ID1 ?T1_ID2 ?T1_VALUE ?T2_VALUE
  WHERE {
    {:myObject  
       :has_property_t1id1 ?T1_ID1 ;
       :has_property_t1id2 ?T1_ID2 ;
       :has_property_t1value ?T1_VALUE }
   UNION { :myObject :has_property_t2value ?T2_VALUE } }
 
5) Notes and questions:
  a) a simple SPARQL query runs fine, like this one: SELECT ?T1ID1 WHERE {:myObject :has_property_t1id1 ?T1ID1} .
  b) all SQL from mappings editor runs OK.
  c) I tried some SPARQL variations like using FILTER or UNION, without success.
  d) the situation I am describing here is an oversimplification of a real problem of legacy RDB integration.
  e) Question 1: using this ONTOP mappings, how to write an SPARQL query to works as an SQL "INNER JOIN"?
   f) Question 2: how to model this type of RDB integration without modifying the original data model?

Thank you all for any help.

Jaime A. Pinto
UFMG - Brazil

Benjamin Cogrel

unread,
Jun 24, 2020, 2:21:36 AM6/24/20
to ontop...@googlegroups.com

Hi Jaime,

The standard practice is to use the Primary Key (here composite) in the IRI template of the subject of the triple.

In your mapping entry, the subject does not use any variable.

Instead of

:SCHEMA1/T1/ a :myObject ;

I would write

:SCHEMA1/T1/{t1_id1}/{t1_id2} a :myObject ;


To better understand the basics of Ontop, I suggest you to have a look at our official tutorial: https://ontop-vkg.org/tutorial/ . It should clearly answer your two questions. If something remains unclear, let us know.

Best,
Benjamin
--
Please follow our guidelines on how to report a bug https://ontop-vkg.org/community/contributing/bug-report
---
You received this message because you are subscribed to the Google Groups "ontop4obda" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ontop4obda+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ontop4obda/6cf5526a-efb9-46e3-a222-aacba41b2728o%40googlegroups.com.
Reply all
Reply to author
Forward
0 new messages