Mapping questions


Ricardo G. Martini

Oct 13, 2015, 11:39:04 AM
to ontop4obda
Hi,

I have some questions about the best way to map my source.

An example of my DB schema is presented below, with two tables:

1) identificacaoEmigrante table
2) localidade table

Here is more information about the fields:

1)

`identificacaoEmigrante` (
  `idEmigrante` int(11) NOT NULL,
  `nome` varchar(64) NOT NULL,
  `dtNasc` date NOT NULL,
  `idConj` varchar(10) ,
  `nomeConj` varchar(64) ,
  `idFiliacao` int(11) NOT NULL,
  `idNaturalidade` int(11) NOT NULL,
  PRIMARY KEY (`idEmigrante`),
  KEY `idFiliacao` (`idFiliacao`),
  KEY `idNaturalidade` (`idNaturalidade`),
  CONSTRAINT `identificacaoemigrante_ibfk_1` FOREIGN KEY (`idFiliacao`) REFERENCES `Filiacao` (`idFiliacao`),
  CONSTRAINT `identificacaoemigrante_ibfk_2` FOREIGN KEY (`idNaturalidade`) REFERENCES `Localidade` (`idLocalidade`)
)

2)

`localidade` (
  `idLocalidade` int(11) NOT NULL AUTO_INCREMENT,
  `freguesia` varchar(64) NOT NULL,
  `concelho` varchar(64) NOT NULL,
  `distrito` varchar(64) NOT NULL,
  PRIMARY KEY (`idLocalidade`)
)

An example of my ontology schema (CIDOC-CRM, an ontology for museums) describing the DB schema is presented below:

So, I want the identificacaoEmigrante table, which describes an emigrant, i.e., a Person, to be related to the E21 Person concept.
Similarly, the localidade table describes places, so it should be related to the E53 Place concept.

This raises the first question:
1) Should I map the emigrant (identificacaoEmigrante) as E21 Person and the places (E53 Place) separately?
Ex:
    mappingID: emigrant
    Source: SELECT idEmigrante, nome FROM identificacaoEmigrante
    Target: {idEmigrante} a :E21_Person ; :P131_is_identified_by {nome} . {nome} a :E82_Actor_Appellation .

    mappingID: places
    Source: SELECT freguesia, concelho, distrito FROM localidade, identificacaoEmigrante WHERE identificacaoEmigrante.idNaturalidade = localidade.idLocalidade
    Target: {freguesia} a :E53_Place ; :P89_falls_within {concelho} . {concelho} a :E53_Place ; :P89_falls_within {distrito} . {distrito} a :E53_Place .

OR
2) Should I map the places separately?

    mappingID: freguesia
    Source: SELECT freguesia FROM localidade, identificacaoEmigrante WHERE identificacaoEmigrante.idNaturalidade = localidade.idLocalidade
    Target: {freguesia} a :E53_Place ; :P89_falls_within :E53_Place .

Repeating the same for concelho and distrito.

This raises the second question: how can I query (SPARQL) a list of all freguesias (parishes), concelhos (councils) and distritos (districts)? For example:

SELECT ?f ?c ?d
WHERE { ?f a :E53_Place ; :P89_falls_within ?c .
        ?c a :E53_Place ; :P89_falls_within ?d .
        ?d a :E53_Place . }

Note that there are other tables in my database describing persons (not only identificacaoEmigrante), so I don't want to map identificacaoEmigrante simply as {idEmigrante} a :E21_Person, as in this example. The identificacaoEmigrante mapping should include another concept of my ontology (E9 Move) that describes the emigration movement event. Thus, what identifies an emigrant is having participated in an emigration movement.

Sorry for the very long question, but I wanted to give as much detail as possible :)

Thanks in advance.

Benjamin Cogrel

Oct 15, 2015, 8:24:30 AM
to Ricardo G. Martini, ontop4obda
Hi Ricardo,
Note that CIDOC-CRM was not designed with RDF/OWL in mind.
For instance, it differs from common OWL ontologies in the way it handles names (of persons, for instance).

With a common OWL ontology (FOAF) you would say:

<http://example.org/person/Peter> foaf:name "Peter"^^xsd:string .

<http://example.org/person/Peter> is the URI of an object, while "Peter" is a literal (a string, more precisely),
and thus foaf:name is a *datatype* property. Remember that a literal and a URI are two different things in RDF.

In CIDOC-CRM, cidoc-crm:P131_is_identified_by is an *object* property because its range is the Appellation class.

This has consequences for the way you can write your mappings. The above target should become something like:


Target: <http://example.org/emigrant/{idEmigrante}> a :E21_Person ; :P131_is_identified_by <http://example.org/person-name/{nome}> .
            <http://example.org/person-name/{nome}> a :E82_Actor_Appellation ; :hasNote "{nome}"^^xsd:string .


NB: the only way I have quickly found to keep the name as a string is to use the datatype property :hasNote .
Maybe there is something more elegant.
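
To make the IRI-versus-literal distinction concrete, here is a minimal Python sketch of what the target above produces for one row. It is only an illustration, not Ontop code: the helper names (iri, string_literal, expand_row), the expansion chosen for the ':' prefix, and the sample row are assumptions.

```python
from urllib.parse import quote

BASE = "http://example.org/"            # base IRI taken from the example above
CRM = "http://example.org/crm#"         # assumed expansion of the ':' prefix
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
XSD_STRING = "<http://www.w3.org/2001/XMLSchema#string>"


def iri(template, **columns):
    """Fill an IRI template with percent-encoded column values."""
    encoded = {k: quote(str(v), safe="") for k, v in columns.items()}
    return "<" + template.format(**encoded) + ">"


def string_literal(value):
    """Build an xsd:string literal in N-Triples syntax."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"^^' + XSD_STRING


def expand_row(row):
    """Return the four triples that the target above yields for one row."""
    person = iri(BASE + "emigrant/{idEmigrante}", **row)
    name = iri(BASE + "person-name/{nome}", **row)
    return [
        (person, RDF_TYPE, "<" + CRM + "E21_Person>"),
        # Object property: the object is an IRI (the appellation node).
        (person, "<" + CRM + "P131_is_identified_by>", name),
        (name, RDF_TYPE, "<" + CRM + "E82_Actor_Appellation>"),
        # Datatype property: the object is a literal (the name string).
        (name, "<" + CRM + "hasNote>", string_literal(row["nome"])),
    ]


# Hypothetical sample row.
for s, p, o in expand_row({"idEmigrante": 7, "nome": "João Silva"}):
    print(s, p, o, ".")
```

The same column value {nome} appears twice: once percent-encoded inside an IRI, and once verbatim as a string literal.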



>     mappingID: places
>     Source: SELECT freguesia, concelho, distrito FROM localidade, identificacaoEmigrante WHERE identificacaoEmigrante.idNaturalidade = localidade.idLocalidade
>     Target: {freguesia} a :E53_Place ; :P89_falls_within {concelho} . {concelho} a :E53_Place ; :P89_falls_within {distrito} . {distrito} a :E53_Place .


Another remark: why don't you extend this ontology by adding your own concepts of Emigrant, Parish, Council and District?
These would be subclasses of the classes you are currently using in your mappings.

Having such subclasses would be helpful for querying your data.



> OR
> 2) Should I map the places separately?
>
>     mappingID: freguesia
>     Source: SELECT freguesia FROM localidade, identificacaoEmigrante WHERE identificacaoEmigrante.idNaturalidade = localidade.idLocalidade
>     Target: {freguesia} a :E53_Place ; :P89_falls_within :E53_Place .
>
> Repeating the same for concelho and distrito.

As you prefer, it is the same for Ontop. It is usually better not to have too many triples in the target part.




> This raises the second question: how can I query (SPARQL) a list of all freguesias (parishes), concelhos (councils) and distritos (districts)?
>
> SELECT ?f ?c ?d
> WHERE { ?f a :E53_Place ; :P89_falls_within ?c .
>         ?c a :E53_Place ; :P89_falls_within ?d .
>         ?d a :E53_Place . }


As I said before, you would do better to use the classes Parish, Council and District instead of :E53_Place .
Note that stating that ?f, ?c and ?d are instances of :E53_Place is redundant, because :E53_Place is both the domain and the range of the property :P89_falls_within .
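
The redundancy can be seen with a small Python sketch of the RDFS domain and range rules. This is a toy illustration rather than how Ontop works internally; the shortened identifiers and the sample places are assumptions.

```python
FALLS_WITHIN = "P89_falls_within"
PLACE = "E53_Place"

# Only property assertions are stored; there are no explicit rdf:type triples.
# Sample data, made up for the illustration.
triples = {
    ("freguesia/Maximinos", FALLS_WITHIN, "concelho/Braga"),
    ("concelho/Braga", FALLS_WITHIN, "distrito/Braga"),
}


def infer_types(triples, prop, domain, range_):
    """Apply the RDFS domain and range rules for a single property."""
    inferred = set()
    for s, p, o in triples:
        if p == prop:
            inferred.add((s, "rdf:type", domain))   # subject is in the domain
            inferred.add((o, "rdf:type", range_))   # object is in the range
    return inferred


types = infer_types(triples, FALLS_WITHIN, PLACE, PLACE)
places = sorted(s for s, _, _ in types)
print(places)
```

Every resource that takes part in a P89_falls_within triple is already typed as a place, so the extra `a :E53_Place` patterns do not restrict the query. They also cannot tell a parish from a council, which is why dedicated subclasses help.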




Best,
Benjamin



--
You received this message because you are subscribed to the Google Groups "ontop4obda" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ontop4obda+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Benjamin Cogrel

Oct 15, 2015, 8:28:19 AM
to Ricardo G. Martini, ontop4obda
I was looking at http://bloody-byte.net/rdf/cidoc-crm/  not http://erlangen-crm.org .
I hope they do not differ too much.

Ricardo G. Martini

Oct 15, 2015, 11:02:44 AM
to Benjamin Cogrel, ontop4obda
Hi Benjamin,

Thank you for your reply.

It's just an example. In my RDF files I have used 'P3 has note' for strings.

>>     mappingID: places
>>     Source: SELECT freguesia, concelho, distrito FROM localidade, identificacaoEmigrante WHERE identificacaoEmigrante.idNaturalidade = localidade.idLocalidade
>>     Target: {freguesia} a :E53_Place ; :P89_falls_within {concelho} . {concelho} a :E53_Place ; :P89_falls_within {distrito} . {distrito} a :E53_Place .
>
> Another remark: why don't you extend this ontology by adding your own concepts of Emigrant, Parish, Council and District?
> These would be subclasses of the classes you are currently using in your mappings.
>
> Having such subclasses would be helpful for querying your data.
>
>> OR
>> 2) Should I map the places separately?
>>
>>     mappingID: freguesia
>>     Source: SELECT freguesia FROM localidade, identificacaoEmigrante WHERE identificacaoEmigrante.idNaturalidade = localidade.idLocalidade
>>     Target: {freguesia} a :E53_Place ; :P89_falls_within :E53_Place .
>>
>> Repeating the same for concelho and distrito.
>
> As you prefer, it is the same for Ontop. It is usually better not to have too many triples in the target part.

I have thought about it. It is an option. Extending the ontology may be a good solution for keeping the mappings and the SPARQL queries simple.

>> This raises the second question: how can I query (SPARQL) a list of all freguesias (parishes), concelhos (councils) and distritos (districts)?
>>
>> SELECT ?f ?c ?d
>> WHERE { ?f a :E53_Place ; :P89_falls_within ?c .
>>         ?c a :E53_Place ; :P89_falls_within ?d .
>>         ?d a :E53_Place . }
>
> As I said before, you would do better to use the classes Parish, Council and District instead of :E53_Place .
> Note that stating that ?f, ?c and ?d are instances of :E53_Place is redundant, because :E53_Place is both the domain and the range of the property :P89_falls_within .

You're right! I'll make this change so my queries will be simpler!



Best Regards,

--
Ricardo G. Martini, MSc.
---------------------------------------------------------------------------
PhD student in Informatics
Doctoral Programme in Informatics (PDInf)
Centro Algoritmi
Universidade do Minho (UMinho) - Braga, Portugal

MSc in Computing - Universidade Federal de Santa Maria (UFSM)
Santa Maria, RS, Brazil
Lattes CV: http://lattes.cnpq.br/8710772034085679
---------------------------------------------------------------------------

Martín Rezk

Oct 22, 2015, 4:02:42 AM
to Ricardo G. Martini, Alessandro Mosca, Benjamin Cogrel, ontop4obda
Hi Ricardo,

We have worked on mapping and integrating CIDOC-CRM and the EPNet ontology.
You might find this paper useful:


We have a journal version as well that will be out soon.

Cheers

Ricardo G. Martini

Oct 22, 2015, 9:30:13 AM
to Martín Rezk, Alessandro Mosca, Benjamin Cogrel, ontop4obda
Hi Martín,

I'll read it.
thank you very much.

Best Regards,

Ricardo G. Martini

Oct 26, 2015, 3:15:16 PM
to User support for WebProtege and Protege Desktop, ontop4obda
Hi,

I have another question about my mappings with Ontop (OBDA) and CIDOC-CRM.

In the identificacaoEmigrante table there is a field dtNasc, which represents a birth date.

I would like to map this as follows:

mappingID    birthdate emigrant
target            :{dtNasc} a :E67_Birth ; :P4_has_time-span {dtNasc} .
source          SELECT dtNasc FROM identificacaoEmigrante

mappingID    date
target            :{dtNasc} a :E52_Time-Span ; :P78_is_identified_by {dtNasc} .
source          SELECT dtNasc FROM identificacaoEmigrante

This should reflect this schema:


My question is: should I create two mappings with the same source?
Note that there is an E21 Person concept related to E67 Birth that should be mapped too. I can't relate E21 Person directly to E52 Time-Span or E50 Date.

Thanks in advance,


On Wed, Oct 14, 2015 at 6:57 PM, Josef Hardi <joh...@stanford.edu> wrote:
Hi Ricardo,

let me try to answer your questions. 

A) For your first question, the answer is to separate them. AFAIK -ontop- mappings must have only 1 subject map. What I mean by subject map is the rdf:type or the ‘a’ class definition. So in your first example, you should not have the definition of :E21_Person and :E82_Actor_Appellation in the same mapping. The rest of your ‘Target’ schema is the predicate-object map which can be zero or many. These maps are the data or object properties that belong to the initial subject class.

One thing to note is that you should put a prefix in front of your column placeholder (e.g. :{idEmigrante}), especially when you are working with class and object property assertions.

For example, the right syntax to write is:

Example 1:
:{idEmigrante} a :E21_Person ; :P131_is_identified_by {nome} 

Example 2:
:{freguesia} a :E53_Place ; :P89_falls_within :{concelho}


B) Now here comes the tricky part of developing the mappings (related to your second and third question).

Before we get into your SPARQL query test, I suggest you rewrite the places mappings as follows:

MappingID: Places
Source: SELECT idLocalidade, freguesia FROM localidade
Target: :{idLocalidade} a :E53_Place ; :P89_falls_within :freguesia/{freguesia} .

MappingID: Parish
Source: SELECT freguesia, concelho FROM localidade
Target: :freguesia/{freguesia} a :E53_Place ; :P89_falls_within :concelho/{concelho} .

MappingID: Council
Source: SELECT concelho, distrito FROM localidade
Target: :concelho/{concelho} a :E53_Place ; :P89_falls_within :distrito/{distrito} .

MappingID: District
Source: SELECT distrito FROM localidade
Target: :distrito/{distrito} a :E53_Place .

Testing queries:

SELECT DISTINCT ?x WHERE {
   ?x a :E53_Place . # Will show all places IRI defined in the mappings, i.e., Places, Parish, Council, District
}

SELECT DISTINCT ?x ?y WHERE {
   ?x :P89_falls_within ?y . # Will show the city administration hierarchy
}

Notice the strategy that I used to make the IRI template unique and more informative for each Localidade, Freguesia, Concelho, and Distrito; also notice that I didn’t use a JOIN in the source query. I think the best practice when creating an OBDA model (i.e., the mappings) is to avoid join expressions and let the reasoner formulate the joins when it translates SPARQL queries to SQL.
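
As a rough illustration of the four mappings above, the following Python sketch (standard-library sqlite3 only, not Ontop) runs each source query and fills in the IRI templates. It materializes the triples only to make the effect visible, whereas Ontop answers queries virtually by rewriting SPARQL into SQL. The sample rows and the helper name materialize are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE localidade ("
    " idLocalidade INTEGER PRIMARY KEY,"
    " freguesia TEXT NOT NULL,"
    " concelho TEXT NOT NULL,"
    " distrito TEXT NOT NULL)"
)
# Made-up sample rows.
conn.executemany(
    "INSERT INTO localidade VALUES (?, ?, ?, ?)",
    [(1, "Maximinos", "Braga", "Braga"), (2, "Azurem", "Guimaraes", "Braga")],
)

# One entry per mapping: (source SQL, subject template, object template or None).
MAPPINGS = [
    ("SELECT idLocalidade, freguesia FROM localidade",
     ":{idLocalidade}", ":freguesia/{freguesia}"),
    ("SELECT freguesia, concelho FROM localidade",
     ":freguesia/{freguesia}", ":concelho/{concelho}"),
    ("SELECT concelho, distrito FROM localidade",
     ":concelho/{concelho}", ":distrito/{distrito}"),
    ("SELECT distrito FROM localidade",
     ":distrito/{distrito}", None),
]


def materialize(conn, mappings):
    """Return (set of E53_Place subjects, set of P89_falls_within pairs)."""
    conn.row_factory = sqlite3.Row
    places, falls_within = set(), set()
    for sql, subject_template, object_template in mappings:
        for row in conn.execute(sql):
            columns = dict(row)
            subject = subject_template.format(**columns)
            places.add(subject)
            if object_template is not None:
                falls_within.add((subject, object_template.format(**columns)))
    return places, falls_within


places, falls_within = materialize(conn, MAPPINGS)
for pair in sorted(falls_within):
    print(pair)
```

Two things are visible in the result. The district shared by both rows collapses into a single IRI, and the council "Braga" and the district "Braga" stay distinct because each level has its own IRI prefix. The hierarchy is connected through matching IRIs alone, without a JOIN in any source query.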


C) If you have other information about the :E21_Person class, then just create a new mapping. For example, suppose you have that information in another table, say, table X:

MappingID: Person in Table X
Source: SELECT personID FROM X
Target: :{personID} a :E21_Person .

The consequence is that when you query SELECT ?x WHERE { ?x a :E21_Person . } the -ontop- reasoner will collect the answers from both tables. But note that if you know table X doesn’t give you new information about Person (for example, the data is actually just a foreign key to the “master table” identificacaoEmigrante), then I suggest you don’t create a new mapping. A new class definition mapping (or subject map) should only be created when you have different table sources that give you different (or new) information. This will keep your model less complicated.
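
The effect of having two mappings for the same class can be sketched in Python with sqlite3. This is an assumption-laden toy, not Ontop: table X, its personID column and the sample rows are hypothetical, and both mappings are assumed to use the same IRI template.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE identificacaoEmigrante (idEmigrante INTEGER PRIMARY KEY, nome TEXT);
CREATE TABLE X (personID INTEGER PRIMARY KEY);
INSERT INTO identificacaoEmigrante VALUES (1, 'Ana'), (2, 'Rui');
INSERT INTO X VALUES (2), (3);
""")

# Two mappings whose targets both assert "a :E21_Person".
PERSON_SOURCES = [
    ("SELECT idEmigrante FROM identificacaoEmigrante", ":{0}"),
    ("SELECT personID FROM X", ":{0}"),
]


def persons(conn):
    """Answer '?x a :E21_Person' as the union over all person mappings."""
    found = set()
    for sql, template in PERSON_SOURCES:
        for (key,) in conn.execute(sql):
            found.add(template.format(key))
    return sorted(found)


print(persons(conn))
```

Because both mappings share the template `:{0}`, key 2 in X and key 2 in identificacaoEmigrante become the same individual. That is only appropriate when the keys really denote the same person, as in the foreign-key case described above; otherwise each table needs its own template.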

Hope my answers can help. They are quite long and hopefully you won’t get lost :)

Cheers!

/Josef

PS. There might be errors if you just copy-and-paste everything since I haven’t tested these mappings. Just an illustration from what I understand of the system.


On Oct 14, 2015, at 2:17 AM, Ricardo G. Martini <giulian...@gmail.com> wrote:

Hi,

I have some questions about ontop mappings and the best way to map my source.
If someone has already worked with ontop, please help me.
_______________________________________________
protege-user mailing list
proteg...@lists.stanford.edu
https://mailman.stanford.edu/mailman/listinfo/protege-user



Ricardo G. Martini

Oct 29, 2015, 8:58:08 PM
to User support for WebProtege and Protege Desktop, ontop4obda
Hi,

Just to thank you all.

Now my mappings are correct and working.

For similar issues in the future, I solved my last question this way (2 mappings):

mappingId    Nascimento Emigrante
target            :nascimentoEmigrante#{idEmigrante} a :E67_Birth ; :P4_has_time-span :dataNascimentoEmigrante/{dtNasc} ; :P96_by_mother :Filiacao#{idFiliacao} ; :P97_from_father :Filiacao#{idFiliacao} ; :P98_brought_into_life :Emigrante#{idEmigrante} ; :P7_took_place_at :Naturalidade#{idNaturalidade} .
source          SELECT idEmigrante, dtNasc, idFiliacao, idNaturalidade FROM identificacaoEmigrante

mappingId    Data Nascimento Emigrante
target            :dataNascimentoEmigrante/{dtNasc} a :E52_Time-Span ; :P3_has_note {dtNasc} .
source          SELECT dtNasc FROM identificacaoEmigrante

That's how I chose to map E67 Birth, P4 has time-span and E52 Time-Span.
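
For readers with the same question, here is a small Python sketch of why two mappings over the same source fit together. It is only an illustration with made-up rows, restricted to three of the properties in the first mapping; the function names are hypothetical.

```python
# Made-up sample rows from identificacaoEmigrante (only the columns used here).
ROWS = [
    {"idEmigrante": 1, "dtNasc": "1890-05-12"},
    {"idEmigrante": 2, "dtNasc": "1890-05-12"},
    {"idEmigrante": 3, "dtNasc": "1901-01-30"},
]


def birth_triples(row):
    """First mapping: one E67_Birth per emigrant, pointing at a time-span IRI."""
    birth = ":nascimentoEmigrante#{idEmigrante}".format(**row)
    span = ":dataNascimentoEmigrante/{dtNasc}".format(**row)
    person = ":Emigrante#{idEmigrante}".format(**row)
    return [
        (birth, "a", ":E67_Birth"),
        (birth, ":P4_has_time-span", span),
        (birth, ":P98_brought_into_life", person),
    ]


def time_span_triples(row):
    """Second mapping: describes the time-span built from the same template."""
    span = ":dataNascimentoEmigrante/{dtNasc}".format(**row)
    return [
        (span, "a", ":E52_Time-Span"),
        (span, ":P3_has_note", '"{dtNasc}"'.format(**row)),
    ]


graph = set()
for row in ROWS:
    graph.update(birth_triples(row))
    graph.update(time_span_triples(row))

births = {s for s, p, o in graph if o == ":E67_Birth"}
spans = {s for s, p, o in graph if o == ":E52_Time-Span"}
print(len(births), len(spans))
```

The two mappings are linked only by the shared template :dataNascimentoEmigrante/{dtNasc}. Each emigrant gets a separate birth event, while emigrants born on the same day share one time-span node.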

Best Regards,

Benjamin Cogrel

Nov 2, 2015, 7:30:39 AM
to ontop...@googlegroups.com
Hi Ricardo,

Good to hear that!
Thanks for sharing your solution, should be useful for other users of CIDOC-CRM.

Best,
Benjamin