How to work best with BFO and Python-Owlready2 – 3 issues

Carsten Knoll

Feb 7, 2021, 1:21:58 PM

to BFO Discuss

My long-term aim is to develop a suite of ontologies which represent part of the knowledge of control engineering. As far as I know, it is recommended to build domain ontologies (reference ontologies, application ontologies) on top of (i.e. by extending) some upper ontology (also called "foundational ontology"). BFO seems to be an obvious candidate for this. My main tool is the Python programming language together with the Owlready2 package [2]. However, I encounter some unexpected hurdles (issues) when I want to work with BFO in Owlready2.

My attempts are documented in [1] in more detail.

[1]: https://nbviewer.jupyter.org/github/cknoll/demo-material/blob/main/semantic_methods/unsorted/bfo-experiments.ipynb
[2]: https://owlready2.readthedocs.io/

Issue 1:

`bfo.base_iri` is 'http://purl.obolibrary.org/obo/bfo.owl#'

but

`bfo.search(label="entity").first().iri` is
'http://purl.obolibrary.org/obo/BFO_0000001'

i.e. the substring 'bfo.owl#' from the base iri is missing.

This is the reason why `bfo.BFO_0000001` returns `None` which is quite unintuitive.

Issue 2:

The concepts of BFO like "generically dependent continuant" or "object aggregate" are IMHO quite hard to grasp for an ontology newbie like me. Being obliged to reference them via their `.name` attribute like "BFO_0000031" or "BFO_0000027" makes it even harder. Is there some more convenient way to reference objects from BFO via their labels than something like

`bfo.search(label="generically dependent continuant").first()`

Ideally some solution with auto-completion (in IPython or Jupyter) would be nice.

Issue 3:

If I do the following

```
import owlready2 as owl2

# Download and parse BFO from its PURL
bfo = owl2.get_ontology("http://purl.obolibrary.org/obo/bfo.owl").load()

annotation_properties = list(bfo.annotation_properties())

# works: prints a list of label strings
print(annotation_properties[2].label)

# works: prints an empty list (no label)
print(annotation_properties[3].label)

# does not work: raises TypeError
print(annotation_properties[0].label)
```

I get a TypeError as the first annotation seems not to have a label. However, the 4th (index 3) also has no label but does not raise a TypeError.

Questions for each issue:

a) Am I on the wrong track? Are my assumptions or my expectation wrong? Did I miss something?
b) Is there a problem within the BFO (or the version I am using)?
c) Is there a problem within Owlready2?

Best,
Carsten

Pierre Grenon

Feb 7, 2021, 8:58:31 PM

to BFO Discuss

Hello Carsten,

Couple of prefatory comments:

- really cool to see you do that

- I don't think you are in the right place to look for help. I think there used to be a mailing list for BFO OWL (if this is: https://groups.google.com/g/bfo-owl-devel it may not be very active -- perhaps Alan knows best). I'm not familiar with Owlready2, for that it might be best to contact the author (J-B Lamy) https://bitbucket.org/jibalamy/owlready2/src/master/

Issue 1: Short answer -- This is not an issue.

> a) Am I on the wrong track? Are my assumptions or my expectation wrong? Did I miss something?

Yes, I think so. The base IRI is just the base within the file. You'd use it for something that's defined there but nothing's defined in that namespace (I think). BFO terms (as well as others) come from a different namespace and they are reused with that namespace in the file you use. So the actual namespace you want to use for BFO classes is 'obo', i.e., 'http://purl.obolibrary.org/obo/', which gives the full IRI 'http://purl.obolibrary.org/obo/BFO_0000001'.

print(bfo.BFO_0000001) returns None because there is no term defined in that namespace, in particular not one with that short ID. If you think it is counterintuitive, which it may be, consider doing: bfofile = owl2.get_ontology("http://purl.obolibrary.org/obo/bfo.owl").load(). Maybe it will help manage your intuitions around the namespace oddity. (I think this is an Owlready2 thing though.)

It looks like with Owlready2, at any rate, you can call a namespace. As I understand, there's no way around knowing which namespace to call (so you need to know to use 'obo'), but, again, I don't know how .load() works.
However, this is still useful: https://owlready2.readthedocs.io/en/latest/namespace.html

You can do:

obo = bfo.get_namespace("http://purl.obolibrary.org/obo/")
Then print(obo.BFO_0000001) prints obo.BFO_0000001, whereas print(bfo.BFO_0000001) printed None.

As I understand, this is the correct way of going about it.
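
To make the namespace point concrete, here is a toy model of attribute lookup by namespace. It is only a sketch of the idea (name is appended to the namespace's base IRI and the result is looked up), not Owlready2's actual implementation; `ToyNamespace` and `KNOWN_IRIS` are made-up names for illustration.

```python
# Toy model of attribute lookup by namespace; NOT Owlready2's real code.
# KNOWN_IRIS stands in for the set of IRIs actually defined in the loaded file.
KNOWN_IRIS = {
    "http://purl.obolibrary.org/obo/BFO_0000001": "entity",
}


class ToyNamespace:
    """Resolve `ns.NAME` to whatever is stored under `base_iri + NAME`."""

    def __init__(self, base_iri):
        self.base_iri = base_iri

    def __getattr__(self, name):
        # Only reached for attributes not found normally, i.e. for term names.
        if name.startswith("_"):
            raise AttributeError(name)
        return KNOWN_IRIS.get(self.base_iri + name)


bfo = ToyNamespace("http://purl.obolibrary.org/obo/bfo.owl#")
obo = ToyNamespace("http://purl.obolibrary.org/obo/")

print(bfo.BFO_0000001)  # None: ".../obo/bfo.owl#BFO_0000001" is not a known IRI
print(obo.BFO_0000001)  # entity
```

In this model the first lookup fails simply because the concatenated IRI does not exist, which seems to match what you observed.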

> b) Is there a problem within the BFO (or the version I am using)?

I don't think so -- others may have differing opinions depending on how purist their thinking is.

> c) Is there a problem within Owlready2?

I don't fully understand how Owlready2's .load() works. It looks like for this particular call, you are getting the advertised behaviour though. I'm not sure at all how this lib handles namespaces and namespace abbreviations (see your other issues). I'd suggest asking J-B Lamy, he would have a smarter and more useful answer.

Issue 2: This looks like not a BFO issue, but Owlready2 might provide what you are asking for.

> a) Am I on the wrong track? Are my assumptions or my expectation wrong? Did I miss something?

It's not wrong to ask for autocompletion and stuff like that but it's kinda expected people sort this out. (There's a nice service by EBI https://www.ebi.ac.uk/ols/index and people have been known to make their own local equivalent.) But this sort of stuff is a library question, nothing to do with BFO as such.

Yes, BFO has technical language which is a little hardcore until you are so deeply traumatised that it becomes second nature, perhaps. I don't think you can escape but people have put lots of effort in documentation.

Using numerical IDs is purposeful and methodological, considered as best practice. It comes from here: http://www.obofoundry.org/principles/fp-003-uris.html

This being said, Owlready2 seems to help: https://owlready2.readthedocs.io/en/latest/onto.html#simple-queries

Try bfo.search(label="generically*").first(); this should return BFO_0000031.
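
For label-based access with tab completion, one possible approach is to build a plain object whose attribute names are derived from the labels. The following is a minimal sketch, assuming each entity exposes `.name` and a list-like `.label`, as the Owlready2 objects in this thread appear to do. `label_namespace` is a hypothetical helper, not part of Owlready2, and the stand-in objects below only exist so that the sketch runs without the library or network access.

```python
import re
from types import SimpleNamespace


def label_namespace(entities):
    """Map each entity's first label to an attribute, e.g. ns.object_aggregate."""
    ns = SimpleNamespace()
    for entity in entities:
        labels = list(entity.label)
        if not labels:
            continue  # entity without a label: nothing to index
        # Turn "generically dependent continuant" into a valid identifier.
        attr = re.sub(r"\W+", "_", str(labels[0])).strip("_")
        if attr and not attr[0].isdigit():
            setattr(ns, attr, entity)  # a later duplicate label overwrites an earlier one
    return ns


# Stand-in objects (assumption: same .name/.label shape as real entities).
fake_classes = [
    SimpleNamespace(name="BFO_0000031", label=["generically dependent continuant"]),
    SimpleNamespace(name="BFO_0000027", label=["object aggregate"]),
    SimpleNamespace(name="BFO_9999999", label=[]),  # hypothetical unlabeled entity
]

by_label = label_namespace(fake_classes)
print(by_label.generically_dependent_continuant.name)  # BFO_0000031
print(by_label.object_aggregate.name)                  # BFO_0000027
```

With a loaded ontology this would presumably be called as `label_namespace(bfo.classes())`. Since the attributes are ordinary instance attributes, IPython or Jupyter should offer them on tab completion.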

> b) Is there a problem within the BFO (or the version I am using)?

> c) Is there a problem within Owlready2?

Issue 3: I think this is related to the way .load() works in Owlready2 and how it manages namespaces, somehow.

> a) Am I on the wrong track? Are my assumptions or my expectation wrong? Did I miss something?

Kinda possibly but you may not be entirely at fault. On the other hand, I don't know enough about the library you are trying to use and how it behaves.

First do this:

for a in annotation_properties:
    print(a)

rdf-schema.isDefinedBy
rdf-schema.seeAlso
obo.IAO_0000116
1.1.contributor
terms.license
0.1.homepage
0.1.mbox
obo.BFO_0000179
obo.IAO_0000115
obo.IAO_0000232
obo.BFO_0000180
obo.IAO_0000119
obo.IAO_0000111
obo.IAO_0000112
obo.IAO_0000117
obo.IAO_0000118
obo.IAO_0000412
obo.IAO_0000600
obo.IAO_0000601
obo.IAO_0000602
obo.IAO_0010000
1.1.member

Compare with the namespace abbreviations and the Annotation properties section in the file https://raw.githubusercontent.com/BFO-ontology/BFO/master/releases/2.0/bfo.owl

My guess is that Owlready2 does something dumb, such as parsing each IRI and taking the bit between the last / and the # (or something weirder depending on the case) to create its own namespace abbreviations. If you look at the file, you will see that the annotation properties in the obo namespace (nearly all of them IAO, with a few exceptions) all come with an rdfs:label, for example:

<owl:AnnotationProperty rdf:about="http://purl.obolibrary.org/obo/IAO_0000112">
    <rdfs:label xml:lang="en">example of usage</rdfs:label>
    <rdfs:isDefinedBy rdf:resource="http://purl.obolibrary.org/obo/iao.owl"/>
</owl:AnnotationProperty>

For these you can retrieve the label.
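
My guess about where the printed prefixes come from can be sketched as follows. This is only an illustration of the guess, not Owlready2's actual code; `guessed_prefix` is a made-up helper, and the Dublin Core and FOAF IRIs are assumed from the standard namespaces of those vocabularies.

```python
def guessed_prefix(iri):
    """Split an IRI at its last '#' or '/' and abbreviate the namespace part."""
    cut = max(iri.rfind("#"), iri.rfind("/"))
    namespace, local_name = iri[: cut + 1], iri[cut + 1 :]
    # Abbreviation = last path segment of the namespace, without '/' or '#'.
    prefix = namespace.rstrip("#/").rsplit("/", 1)[-1]
    return f"{prefix}.{local_name}"


examples = [
    "http://www.w3.org/2000/01/rdf-schema#isDefinedBy",  # rdf-schema.isDefinedBy
    "http://purl.obolibrary.org/obo/IAO_0000116",        # obo.IAO_0000116
    "http://purl.org/dc/elements/1.1/contributor",       # 1.1.contributor
    "http://purl.org/dc/terms/license",                  # terms.license
    "http://xmlns.com/foaf/0.1/homepage",                # 0.1.homepage
]
for iri in examples:
    print(guessed_prefix(iri))
```

The results match the first few lines of the listing above, which is at least consistent with the guess.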

However, the somewhat background stuff (RDFS, FOAF, Dublin Core) that doesn't quite belong is only mentioned and has no label. For example,

<owl:AnnotationProperty rdf:about="http://www.w3.org/2000/01/rdf-schema#isDefinedBy"/>

For these, in principle, you get no label so you end up with your empty list in your call.
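
To check independently of Owlready2 which annotation properties carry a label, the file can be inspected with Python's standard XML parser. A minimal sketch, using an inline excerpt built from the two declarations quoted above instead of the full bfo.owl; `annotation_labels` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Inline excerpt; the real file would be read from disk instead.
SNIPPET = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:AnnotationProperty rdf:about="http://purl.obolibrary.org/obo/IAO_0000112">
    <rdfs:label xml:lang="en">example of usage</rdfs:label>
    <rdfs:isDefinedBy rdf:resource="http://purl.obolibrary.org/obo/iao.owl"/>
  </owl:AnnotationProperty>
  <owl:AnnotationProperty rdf:about="http://www.w3.org/2000/01/rdf-schema#isDefinedBy"/>
</rdf:RDF>"""


def annotation_labels(xml_text):
    """Return {IRI: label or None} for every owl:AnnotationProperty element."""
    root = ET.fromstring(xml_text)
    result = {}
    for prop in root.iter(f"{{{OWL}}}AnnotationProperty"):
        label = prop.find(f"{{{RDFS}}}label")
        result[prop.get(f"{{{RDF}}}about")] = None if label is None else label.text
    return result


for iri, label in annotation_labels(SNIPPET).items():
    print(iri, "->", label)
```

For the excerpt this reports 'example of usage' for IAO_0000112 and None for rdfs:isDefinedBy, i.e. the missing label is a property of the file, not of the library.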

However, for the RDFS ones, because you get 'rdf-schema.', you just choke on '-'.

for a in annotation_properties:
    if '-' not in str(a):
        l = a.label.first()
    else:
        l = "dunno cause it's broken because of '-'"
    print(f"{a} has label '{l}'")

rdf-schema.isDefinedBy has label 'dunno cause it's broken because of '-''
rdf-schema.seeAlso has label 'dunno cause it's broken because of '-''
obo.IAO_0000116 has label 'editor note'
1.1.contributor has label 'None'
terms.license has label 'None'
0.1.homepage has label 'None'
0.1.mbox has label 'None'
obo.BFO_0000179 has label 'BFO OWL specification label'
obo.IAO_0000115 has label 'definition'
obo.IAO_0000232 has label 'curator note'
obo.BFO_0000180 has label 'BFO CLIF specification label'
obo.IAO_0000119 has label 'definition source'
obo.IAO_0000111 has label 'editor preferred term'
obo.IAO_0000112 has label 'example of usage'
obo.IAO_0000117 has label 'term editor'
obo.IAO_0000118 has label 'alternative term'
obo.IAO_0000412 has label 'imported from'
obo.IAO_0000600 has label 'elucidation'
obo.IAO_0000601 has label 'has associated axiom(nl)'
obo.IAO_0000602 has label 'has associated axiom(fol)'
obo.IAO_0010000 has label 'has axiom label'
1.1.member has label 'None'
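
Rather than testing for '-' in the printed name, a possibly more robust workaround is to catch the TypeError itself. A minimal sketch, assuming the failure always surfaces as a TypeError on `.label` access as reported in this thread; `first_label` and the stand-in classes are hypothetical and only serve to make the sketch runnable without Owlready2.

```python
from types import SimpleNamespace


def first_label(entity):
    """Return the entity's first label, or None if there is none or lookup fails."""
    try:
        labels = list(entity.label)
    except TypeError:
        # Reported in this thread for the rdf-schema annotation properties.
        return None
    return labels[0] if labels else None


class BrokenLabel:
    """Stand-in for an entity whose .label access raises TypeError."""

    name = "isDefinedBy"

    @property
    def label(self):
        raise TypeError("simulated failure")


print(first_label(SimpleNamespace(label=["editor note"])))  # editor note
print(first_label(SimpleNamespace(label=[])))               # None
print(first_label(BrokenLabel()))                           # None
```

This does not explain why the error occurs, it only keeps a loop over all annotation properties from stopping at the first failure.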

> b) Is there a problem within the BFO (or the version I am using)?

I don't think so. The fact that the annotation properties are in the file is probably a side effect of using Protégé. The fact that there's no label on really basic stuff like W3C recommendations seems reasonable.

> c) Is there a problem within Owlready2?

I don't know but the behaviour is odd and looks like some mean string hacking and it's unfortunate you get stuck with this '-'. Perhaps you can work around this and I'm missing something obvious or perhaps there is a problem with the lib.

You could try importing the other vocabularies maybe --- but do you really need to? Or just add the labels yourself (it's just a few). I think Owlready2 will allow you to do that. You can probably play with the namespaces but this behaviour with 'rdf-schema' looks grim.

Hope this helps.

with many thanks and kind regards,

Pierre


Carsten Knoll

Feb 9, 2021, 11:35:37 AM

to BFO Discuss

Dear Pierre,

thank you very much for that comprehensive answer! I will need some time to fully digest it.

FYI: I asked the same questions on the Owlready mailing list [1] but there has been no answer yet.

Generally, I see big potential in the combination of semantic technologies and Python – especially in the field of engineering and physics. I think that for possible developments an upper ontology like BFO should be taken into account at an early stage. If someone is interested: I try to document my findings (and maybe achievements) in [2].

When aiming at other areas of application (like engineering), the question arises why BFO entities live in an obo.* namespace. In my understanding BFO aims to be domain-neutral, but on the other hand there is the (abbreviated) term "biology" in the IRI. Is this "just" for historical reasons or is there a deeper meaning behind that?

[1] http://owlready.8326.n8.nabble.com/How-to-work-best-with-the-Basic-Formal-Ontology-BFO-and-Owlready2-3-issues-td2248.html

[2] https://github.com/cknoll/semantic-python-overview/

Thanks again!

Best regards,

Carsten
