Hi Kerstin,
Good to hear your thoughts about terminology mapping and about adopting
the Nanopublications schema to deal with mapping provenance issues.
Please see my response in-line:
On 9/3/13 8:50 AM, Kerstin Forsberg wrote:
Hi Eric,
Thanks for the link to the RIM RDF tutorial, will read with great
interest. I'll not be able to join tomorrow. Two thoughts re. mappings.
- How to align this with the interest in getting RDF (SKOS) versions
directly from source (e.g. we have a good interaction with MedDRA MSSO
about this).
- Mapping provenance
The justification and attribution of the mappings (between
concepts/terms) are key to trusting them. At the ICBO conference earlier
this summer we discussed the idea of turning, for example, the mappings
in the BioPortal into Nanopublications, based on some great work by Jim
McCusker, so that the BioPortal mappings stated as skos:closeMatch would
also carry their justification as being the results of using the LOOM
lexical algorithm. An alternative would be to treat mappings as linksets,
as done by Open PHACTS, and provide the justification for the
links/mappings (between entities) as part of the linkset description in
VoID. Alasdair Gray is working on a nice proposal 1) on this, based on
the W3C HCLS task force for dataset discovery and description that
Michel led.
Here, it would be interesting to distinguish three categories of
mappings and their possible provenance measures:
1) Manually defined mappings: In the Nanopublications schema, provenance
is captured by the property nanopub:hasProvenance, which ties to the
property nanopub:hasSupporting, which could capture the mapping curation
information (e.g. creator, author, version, rights etc.).
2) (Semi-)automatically found mappings: You have already discussed this
case above. Here, the information about the LOOM lexical algorithm could
be described using the nanopub:hasSupporting property in
Nanopublications, or alternatively using the Open PHACTS approach ...
3) Inferred mappings via reasoning: New mappings can be inferred via a
reasoning process (i.e. terminology reasoning). In this case, a reasoning
proof (i.e. the set of inference steps under rule-based reasoning) is
well suited to provide some provenance information.
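To make the three categories a bit more tangible, here is a minimal
Python sketch of how a mapping could carry its category and the kind of
supporting information suggested above. All names in it (MappingOrigin,
Mapping, provenance_kind, the "icd9cm:" and "meddra:" prefixes, and the
"proof" key) are my own assumptions for illustration; none of them are
terms from the Nanopublications schema.

```python
from dataclasses import dataclass
from enum import Enum


class MappingOrigin(Enum):
    """The three mapping categories (names are illustrative)."""
    MANUAL = "manually defined"
    AUTOMATIC = "(semi-)automatically found"
    INFERRED = "inferred via reasoning"


@dataclass(frozen=True)
class Mapping:
    source: str
    relation: str
    target: str
    origin: MappingOrigin
    # Free-form (key, value) pairs; the keys are hypothetical, not nanopub terms.
    provenance: tuple = ()


def provenance_kind(mapping):
    """Return the kind of supporting information suggested for the category."""
    return {
        MappingOrigin.MANUAL: "curation metadata",
        MappingOrigin.AUTOMATIC: "algorithm description",
        MappingOrigin.INFERRED: "reasoning proof",
    }[mapping.origin]


# The inferred mapping from the test case, with hypothetical prefixes.
inferred = Mapping(
    source="icd9cm:999.4",
    relation="skos:exactMatch",
    target="meddra:10067113",
    origin=MappingOrigin.INFERRED,
    provenance=(("proof", "example-term-map-proof.n3"),),
)
print(provenance_kind(inferred))
```

In a real setting the provenance would live in the supporting graphs of
the Nanopublication rather than in a Python tuple; the sketch only shows
that the category decides what kind of evidence is attached.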
Just to make my point clear, I would like to share a concrete case:
Test-case:
--------
ICD-9-CM code (999.4) <---exactMatch --> SNOMED-CT code
(213320003) <---exactMatch --> MedDRA code (10067113), for
details see term-mapping-example.png and example-term-map.n3.
Results:
-----
- ICD-9-CM code (999.4) <---exactMatch --> MedDRA code
(10067113), because skos:exactMatch is a transitive property.
A) The proof of this inferred mapping is shown in
example-term-map-proof.n3.
B) A summary of the reasoning results is shown in
example-term-map-ances.n3, which gives an overview of which asserted
facts (i.e. asserted mappings) were used to derive this inferred mapping.
C) Finally, an example Nanopublication describing this inferred mapping
is shown in example-term-map-nano.n3, where the reasoning information
from A) and B) is used as provenance, in the form of two supporting
graphs ":NanoPub_1_Supporting_1" and ":NanoPub_1_Supporting_2".
Interestingly, :NanoPub_1_Supporting_2 can be validated by a proof
checker such as cwm (http://www.w3.org/2000/10/swap/doc/cwm) or euler
(http://eulersharp.sourceforge.net/).
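
For readers without an N3 reasoner at hand, the inference in the test
case can be imitated with a few lines of plain Python. This is only a
sketch of the idea behind B): it treats skos:exactMatch as symmetric and
transitive and records which asserted mappings were used for each
inferred one. It is not the proof format produced by cwm or euler, and
the prefixes and the function name infer_exact_matches are assumptions
made for the example.

```python
from collections import deque

# Asserted mappings from the test case; the prefixes are illustrative.
ASSERTED = [
    ("icd9cm:999.4", "snomedct:213320003"),
    ("snomedct:213320003", "meddra:10067113"),
]


def infer_exact_matches(asserted):
    """Return {(a, b): [asserted edges used]} for every inferred pair."""
    # skos:exactMatch is symmetric, so index each asserted edge both ways.
    neighbours = {}
    for a, b in asserted:
        neighbours.setdefault(a, []).append((b, (a, b)))
        neighbours.setdefault(b, []).append((a, (a, b)))

    asserted_pairs = {frozenset(edge) for edge in asserted}
    inferred = {}
    for start in neighbours:
        # Breadth-first search; remember the asserted edges along each path.
        paths = {start: []}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt, edge in neighbours[node]:
                if nxt not in paths:
                    paths[nxt] = paths[node] + [edge]
                    queue.append(nxt)
        # Keep only pairs that were not asserted directly.
        for end, used in paths.items():
            if end != start and frozenset((start, end)) not in asserted_pairs:
                inferred[(start, end)] = used
    return inferred


for (a, b), used in infer_exact_matches(ASSERTED).items():
    print(a, "exactMatch", b, "derived from", used)
```

The list of asserted edges attached to each inferred pair plays the role
of the summary in B); a real proof as in A) would also name the rule
applied at each step.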
I plan to attend the COI call Wed 4 Sep.
Kind Regards,
Sajjad