Translation of BFO to information system models

Thomas Beale

Apr 3, 2025, 2:33:44 PM

to BFO Discuss

Something I have been involved with for a long time is information modelling for the healthcare domain. Conceptually I've been informally grounding things in BFO categories for many years, but only fairly recently tried to come up with some solid theory of how to get from BFO to proper information system models. Naive direct use of, say, the BFO IS-A hierarchy plus whatever relations one wants doesn't quite do the trick.

My headline question is: is there a body of work, including theoretical foundations for doing such modelling, based in underlying realist ontologies?

Here are a couple of the challenges.

1. Representing kinds *and* individuals

For many categories of real world physical entity (organisms, devices, substances including medications, etc) plus aggregates of the same, we want to represent both the 'kind' level as well as individuals. An obvious example is 'normal' species level phenotype versus individual phenotype which includes genetic anomalies, personal history of injuries, sickness, surgery etc.

An even more basic one for health providers is representing inventory. A hospital system may use the Philips Compact 5300 series Obstetric ultrasound machine in its antenatal clinics. The model-level description is important, since it includes all kinds of useful information like resolution, radiation dose (for CT scanners etc), and hundreds of other details, including cost / price info.

This level of information is common to the (say) 30 Philips Compact 5300s the hospital actually has. Each of these individual machines will have a UDI, use & maintenance history, cleaning history, software / firmware upgrade history and so on - just like a human individual phenotype.

Both levels of information are needed, and need to be distinct, so for example, upgrade histories of different machines don't get confused.

This same need to represent kinds extends to substances, e.g. the RxNorm idea of Amoxycillin versus the 20 numbered lots of 100 boxes each of that drug in the hospital right now. The individual level information very much applies to vaccines for obvious reasons (contamination etc). The kind level information is used to determine interactions, general prescribing guidelines and so on.

We could go on to individual processes and 'models of processes', aka CPGs and similar in the healthcare domain.

For any real world individual box of tablets, ultrasound machine etc, its properties and history are the sum of kind-level information + individual-level information.
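As a rough illustration of this "kind-level + individual-level" sum, here is a minimal Python sketch. All names and values (`kind_info`, `individuals`, `full_view`, the UDI strings and firmware versions) are hypothetical placeholders, not taken from any real system; it only shows one way of layering individual records over a shared kind description.

```python
from collections import ChainMap

# Kind-level description, shared by every machine of this model (placeholder values).
kind_info = {"model": "Philips Compact 5300", "modality": "obstetric ultrasound"}

# Individual-level records, one per physical machine (hypothetical identifiers).
individuals = {
    "m1": {"udi": "UDI-0001", "firmware_history": ["1.0", "1.1"]},
    "m2": {"udi": "UDI-0002", "firmware_history": ["1.0"]},
}

def full_view(individual_id):
    """Combine individual-level and kind-level information for one machine."""
    # ChainMap looks up individual-level keys first, then falls back to the kind.
    return ChainMap(individuals[individual_id], kind_info)
```

Because each individual keeps its own record, the firmware history of `m1` can never be confused with that of `m2`, while both still resolve `model` from the single kind-level description.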

The challenge is that ontologies in general don't explicitly distinguish kind and individual as universals that could both be instantiated in the real world. Well, one could say that they do, if you just treat the real world instance of the kind level as a 'description', i.e. information standing in the IS-ABOUT relation to the whole extension of category / species / machine type etc. In which case, such individuals are instances of categories in an extended version of IAO. Then you presumably have to recreate parts of BFO under IAO.

The part that I think is missing is the specific IS-A hierarchy and the many relations that would be in an ontology like BFO, pertaining to individuals but also to their kinds, such as part-of, various dependent qualities like has-function, etc.

So far so obvious. The BFO book of course mentions the distinction between category- and instance-level relations that can mimic each other. But there is IMO a methodological gap of how to create information models based on relevant ontologies that make explicit (most of?) the same relations and properties at both kind and instance level.

2. Social Entities versus Physical Entities

Modelling of what are commonly called demographic entities such as party, Person, Organisation, and their relations, e.g. accountability, 'role' and so on should theoretically be done via categories in the specifically dependent Entity hierarchy, where the social entities are roles or similar, whose bearers are some material entity like a human organism (bfo: object). However in real information systems, including admin systems in hospitals, no-one cares about the material bearers of social 'entities' - the latter stand on their own.

Everything to do with accountabilities, job posts, contracts, teams etc is relations among social entities, not the material bearers. This social realm can be thought of as the realm of autonomous agents, rather than 'objects'.

We only care about the material organism etc when we get to the care part of healthcare, and now the emphasis is reversed - we want a model of bodily processes, and the person's name and id are just details to make sure we attach our observations to the right patient.

This conundrum comes about due to BFO's material-world orientation (which is perfectly correct). But again, I am interested in methodology for bridging from ontologies like BFO to real world information systems that try to maintain 'digital twins' of the social world.

I have ways of handling all of the above, but they feel somewhat ad hoc. Hence the interest in generalised methodology addressing this and other gaps going from ontology to information systems.

Werner Ceusters

Apr 3, 2025, 2:54:52 PM

to BFO Discuss

I think that is called 'Referent Tracking', no?

Latest publication here: https://osf.io/preprints/osf/q8hts_v1 (published in https://link.springer.com/book/10.1007/978-3-031-11039-9)

By the way, BFO2020-FOL does have universals AND individuals in its domain of discourse. The problem is only in OWL-ontologies.

Lastly, social entities exist at universal (or defined class) level just as material entities do. That does not require any other mechanism.

Giancarlo Guizzardi has lately been working on connecting ontologies to information models. He gave a good talk at the Ontology Summit a few months ago: https://ontologforum.com/index.php/ConferenceCall_2025_01_29

W

Barry Smith

Apr 3, 2025, 3:38:46 PM

to bfo-d...@googlegroups.com

TB: Everything to do with accountabilities, job posts, contracts, teams etc is relations among social entities, not the material bearers. This social realm can be thought of as the realm of autonomous agents, rather than 'objects'.

BS: What you say is true of everything which matters, but there have to be organisms forming the teams, which are subject to the accountabilities and so forth. BFO has not worked hard enough on these sorts of things, though there is the material here:

https://www.youtube.com/playlist?list=PLyngZgIl3WTht-ilt-WpUhCv7rbB1Crd9

We are also working on a more sophisticated treatment of digital entities in the BFO framework. Currently we deal with, for example, the functions of a given piece of software -- where we live in a world in which only material entities can have functions -- by pointing out that software entities have functions (or at least are capable of exercising their functions) only when they are installed on a (material) computer. Thus what has the function is the computer-with-that-software-installed. This, unfortunately, does not take us very far. Consider, for example, the price of an item on Amazon. I

BS

--
You received this message because you are subscribed to the Google Groups "BFO Discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bfo-discuss...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/bfo-discuss/b4d70df4-4de2-4a90-b976-1d4d6e948ccbn%40googlegroups.com.

wolandscat

Apr 3, 2025, 5:23:34 PM

to bfo-d...@googlegroups.com

Werner,

thanks for the very useful information.

On 4/3/25 12:54, Werner Ceusters wrote:
> WC: I think that is called 'Referent Tracking', no?

Well RT is related, since it's a way to know if two reported instances
of something in the real world really have the same referent. What I am
interested in is the information model construction that clearly allows
for representation of instances (say 5 real ultrasound machines on floor
2 of building 7), such that those representations all refer to an
appropriate kind-level description (Philips 5300 model obstetric US
machine), which is (potentially at least) an instance of a class (in the
UML / info model sense) that represents kind-of-machine (or 'machine
description', if you like).

But let me read your latest...

> WC: Latest publication here: https://osf.io/preprints/osf/q8hts_v1
> (published in https://link.springer.com/book/10.1007/978-3-031-11039-9)
>
> By the way, BFO2020-FOL does have universals AND individuals in its
> domain of discourse. The problem is only in OWL-ontologies.

I'm somewhat aware of this, but what do we practically do about it? I
guess you are suggesting a methodological pathway from FOL
representation rather than OWL?

>
> WC: Lastly, social entities exist at universal (or defined class)
> level just as material entities do. That does not require any other
> mechanism.

I assume you mean what I already mentioned, i.e. the entities like Role,
Capability, Function in BFO, or do you mean something else?

>
> WC: Giancarlo Guizzardi has lately been working on connecting ontologies to
> information models. He gave a good talk at the Ontology Summit a few
> months ago: https://ontologforum.com/index.php/ConferenceCall_2025_01_29

I just read the main paper there, really interesting. However, we don't
build naive models in the first place (such as the ones they dissect in the
paper, which I've been criticizing for 20 years); ours are pretty good
at separating out e.g. accidental v essential characteristics, or
representing 'Patient' as a kind of relation (between a subject and a
provider), not a kind of Person. And I think they miss a couple of
things: for example, one would not represent 'phases' of being (being
sick, being healthy) as UML classes; you use a model of 'state' to do
that, e.g. a state machine or equivalent. OTOH they're way ahead of anyone
doing 'UML modelling'. I need to look more at their work.
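
To make the two modelling choices above concrete, here is a minimal Python sketch. The class names (`Person`, `Provider`, `PatientRelation`, `HealthState`) and the two-state transition table are assumptions made purely for illustration; they are not taken from any published model.

```python
from dataclasses import dataclass
from enum import Enum

class HealthState(Enum):
    HEALTHY = "healthy"
    SICK = "sick"

# Allowed transitions of a deliberately tiny state model (hypothetical).
TRANSITIONS = {
    HealthState.HEALTHY: {HealthState.SICK},
    HealthState.SICK: {HealthState.HEALTHY},
}

@dataclass
class Person:
    name: str
    # A phase is held as state, not expressed as a subclass of Person.
    state: HealthState = HealthState.HEALTHY

    def change_state(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state

@dataclass(frozen=True)
class Provider:
    name: str

@dataclass
class PatientRelation:
    """'Patient' modelled as a relation between a subject and a provider."""
    subject: Person
    provider: Provider
```

In this sketch there is no `Patient` or `SickPerson` class at all: being a patient is an instance of `PatientRelation`, and being sick is a value of `Person.state`.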

Thomas

wolandscat

Apr 3, 2025, 6:01:05 PM

to bfo-d...@googlegroups.com

On 4/3/25 13:38, Barry Smith wrote:

TB: Everything to do with accountabilities, job posts, contracts, teams etc is relations among social entities, not the material bearers. This social realm can be thought of as the realm of autonomous agents, rather than 'objects'.

BS: What you say is true of everything which matters, but there have to be organisms forming the teams, which are subject to the accountabilities and so forth.

They are (of course), all I am saying is that in the social world, there is little to no interest in those material bearers that are 'really there', because (again, pretty obviously to all) social relationships obtain between either default social entities (person as citizen, organization as its legal entity) or social personae (aka 'roles', a term I avoid like the plague!) such as general practitioner, CEO etc. A large category of relations only obtain between these social entities, either a bare Agent (in my model - the default social entity) or an Agent in a Persona. They don't make any sense at all between material entities (underlying the social entities).
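
A minimal Python sketch of that idea follows. `Agent`, `Persona` and `Accountability` are hypothetical class names chosen for illustration only; the point is simply that both ends of the relation are typed as social entities, and nothing in the model refers to a material bearer.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Agent:
    """Default social entity, e.g. person as citizen or organisation as legal entity."""
    name: str

@dataclass(frozen=True)
class Persona:
    """An Agent acting in a social persona such as 'general practitioner' or 'CEO'."""
    agent: Agent
    title: str

# Either a bare Agent or an Agent in a Persona can take part in social relations.
SocialEntity = Union[Agent, Persona]

@dataclass(frozen=True)
class Accountability:
    """A social relation; both ends are social entities, never material bearers."""
    responsible: SocialEntity
    accountable_to: SocialEntity
    scope: str
```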

I don't know that any of this is a problem per se, perhaps all I'm pointing out is that there is a hidden world of complex relationships between entities that in BFO terms are dependent roles, but in the social world are their own thing.

From the POV of information systems, the world looks more like the following:

physical entity
    all of BFO
social entity / aka agent
info entity

The contents of the latter two top level categories can be rolled into BFO, but only in a somewhat trivialising sense that always subordinates them to their physical bearers or processes.

Information systems tend to be pretty clearly oriented to representing digital twins of either physical or social real world entities (e.g. respectively vital signs of in-patient; patient administrative record) as their first order entities, or else statements about either (historical clinical information for example).

I don't think there is anything wrong with the ontologies we use; again, my interest here is ontology -> information model translation methodology. The paper that Werner referred to is addressing that space, but not (that I can see) the specific challenges I raised.

BFO has not worked hard enough on these sorts of things, though there is the material here:

https://www.youtube.com/playlist?list=PLyngZgIl3WTht-ilt-WpUhCv7rbB1Crd9

This is a nice collection - watching now.

We are also working on a more sophisticated treatment of digital entities in the BFO framework. Currently we deal with, for example, the functions of a given piece of software -- where we live in a world in which only material entities can have functions -- by pointing out that software entities have functions (or at least are capable of exercising their functions) only when they are installed on a (material) computer. Thus what has the function is the computer-with-that-software-installed. This, unfortunately, does not take us very far. Consider, for example, the price of an item on Amazon. I

Presumably this might be understood as 'script' + 'execution engine'. Abstractly that would also cover e.g. mRNA and ribosome translation -> protein.

Thomas

seanno...@gmail.com

Apr 3, 2025, 6:02:24 PM

to bfo-d...@googlegroups.com

On 4/3/25 12:54, Werner Ceusters wrote:
> WC: I think that is called 'Referent Tracking', no?

TB: Well RT is related, since it's a way to know if two reported instances
of something in the real world really have the same referent. What I am
interested in is the information model construction that clearly allows
for representation of instances (say 5 real ultrasound machines on floor
2 of building 7), such that those representations all refer to an
appropriate kind-level description (Philips 5300 model obstetric US
machine), which is (potentially at least) an instance of a class (in the
UML / info model sense) that represents kind-of-machine (or 'machine
description', if you like).

WC: RT is more than that. It uses in essence the same sorts of
representations as in the BFO-FOL specifications but with references rather
than variables.
This paper gives a good idea of what I mean:
https://philpapers.org/archive/OTTBBF.pdf

Your example is simple. If m1, m2, ..., are references for each of your 5
machines, then all you need to assert is

(instance-of m1 philips-5300-model-obstetric-US-machine m1t1)
(instance-of m2 philips-5300-model-obstetric-US-machine m2t1)
...
(located-in m1 floor-2-of-building-7 m1f2b7t1)
...
(instance-of floor-2-of-building-7 site t...)

It is not too difficult to map appropriate levels of your ontology to
columns and tables in your information model.
The only tricky part is to make sure the time-stamps are provided consistently,
but you can go into as much detail as required. That is why the paper I just
cited is a good example of how to deal with that.
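
One possible reading of "columns and tables" for the assertions above, sketched in Python with an in-memory SQLite database. The table and column names (`instance_of`, `located_in`, `particular`, `universal`, `time_ref`) and the helper `machines_of_kind_at` are assumptions for illustration; the sketch stores the time references but, for brevity, does not reason over them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table per time-indexed relation; each row is one assertion.
conn.execute("CREATE TABLE instance_of (particular TEXT, universal TEXT, time_ref TEXT)")
conn.execute("CREATE TABLE located_in (particular TEXT, location TEXT, time_ref TEXT)")

conn.executemany(
    "INSERT INTO instance_of VALUES (?, ?, ?)",
    [
        ("m1", "philips-5300-model-obstetric-US-machine", "m1t1"),
        ("m2", "philips-5300-model-obstetric-US-machine", "m2t1"),
    ],
)
conn.execute(
    "INSERT INTO located_in VALUES (?, ?, ?)",
    ("m1", "floor-2-of-building-7", "m1f2b7t1"),
)

def machines_of_kind_at(universal, location):
    """Return references of particulars of a given kind asserted to be at a location."""
    rows = conn.execute(
        "SELECT i.particular FROM instance_of i "
        "JOIN located_in l ON i.particular = l.particular "
        "WHERE i.universal = ? AND l.location = ? "
        "ORDER BY i.particular",
        (universal, location),
    )
    return [row[0] for row in rows]
```

A fuller treatment would also compare the time references of the two assertions before joining them, which is the "tricky part" mentioned above.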

> WC: By the way, BFO2020-FOL does have universals AND individuals in its
> domain of discourse. The problem is only in OWL-ontologies.

TB: I'm somewhat aware of this, but what do we practically do about it? I
guess you are suggesting a methodological pathway from FOL
representation rather than OWL?

WC: Well, I actually meant Common Logic, but BFO-FOL uses only FOL in its
CLIF files.
You can create axioms that make the bridge between your BFO-based ontology
and your information model. Check the axioms for consistency, and then
translate into whatever software language your system is implemented in.

>
> WC: Lastly, social entities exist at universal (or defined class) level
> just as material entities do. That does not require any other mechanism.

TB: I assume you mean what I already mentioned, i.e. the entities like Role,
Capability, Function in BFO, or do you mean something else?

WC: I mean they fit in exactly the same framework. They don't require
anything special in the bridge to your information model.

W

wolandscat

Apr 3, 2025, 7:29:49 PM

to bfo-d...@googlegroups.com

In an inventory information system there will be something like m1-info, m2-info etc., that are instances of a model class representing 'individual device'. Data attributes of these information instances are things like:

unique device ID
date of manufacture
serial number
software version
etc
artifact kind (R)

These information instances could have instance-level relationships to parts (e.g. if there is some second screen that can be added or whatever). That 'has-part' relation needs its semantics defined somewhere.

One of the attributes (marked 'R' above) of that class is a reference to another information entity that describes (in this example) 'device type'. There is an information instance dt1-info ('dt' = device type) of this latter class that represents the Philips model 5300 obstetric ultrasound machine, with details like:

machine type
description
manufacturer
catalogue number
regulatory status
etc

If there are possible 'parts' or other potential material relationships, these should be represented as instances of an Entity relationship that obtains between the main machine and those parts. Both the Entities and relationships here are all at the kind (universal) level, i.e. potentially true of all Philips 5300 US machines.

So in information modelling terms, m1-info and m2-info are (data) instances of some class M-INDIVIDUAL (let's say this is equivalent to bfo:object) and dt1-info is a (data) instance of some class M-TYPE (still bfo:object, but a kind-level description). What does the class model containing these classes look like?

M-TYPE should belong to a BFO-based hierarchy that contains universal-universal level relationships; the class M-INDIVIDUAL belongs to another hierarchy whose classes represent individual instances of machines or devices. The instance-level relations possible for instances of that second hierarchy should be inferred from the relations modelled in the M-TYPE hierarchy.
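
A minimal Python sketch of what such a pair of classes could look like, assuming hypothetical names (`MType`, `MIndividual`, `possible_part_types`, `add_part`). It is not an answer to the methodology question, only an illustration of instance-level has-part being checked against the kind-level relation.

```python
from dataclasses import dataclass, field

@dataclass
class MType:
    """Kind-level description (M-TYPE), e.g. a device model."""
    name: str
    # Kind-level has-part: names of part types this kind may have.
    possible_part_types: set = field(default_factory=set)

@dataclass
class MIndividual:
    """Individual-level record (M-INDIVIDUAL) that refers to its kind."""
    identifier: str
    kind: MType
    # Instance-level has-part: the actual parts of this individual.
    parts: list = field(default_factory=list)

    def add_part(self, part):
        # An instance-level has-part is allowed only if the kind level permits it.
        if part.kind.name not in self.kind.possible_part_types:
            raise ValueError(
                f"{self.kind.name} cannot have a part of type {part.kind.name}"
            )
        self.parts.append(part)
```

For example, an `MType` for the ultrasound model could list a second screen among its `possible_part_types`, after which a particular screen can be attached to machine `m1` but not the other way round.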

Note that all the specific attributes in both hierarchies are achieved using archetyping, so we don't literally define classes that have hard-wired attributes like 'unique device ID' etc.

So part of my original question was: is there any methodology that states how to create this overall information model, containing classes M-TYPE and M-INDIVIDUAL? Because we need both levels of description.

That use case paper is interesting, although it doesn't address this problem that I can see.

Thomas

seanno...@gmail.com

Apr 3, 2025, 8:25:44 PM

to bfo-d...@googlegroups.com

This seems to be HL7 RIM all over.

I would say: either drop the information bits (an artifact kind, a manufacturer, etc., are as much information as a human being is, though some elements, like ‘description’, are information content entities) or drop the ontology approach.

It seems you are looking more in the direction of ISO/IEC 11179. Not my cup of tea. Some have made an effort to link that to an ontology: see the method section of https://jbiomedsem.biomedcentral.com/articles/10.1186/2041-1480-2-S4-S1

W


wolandscat

Apr 4, 2025, 1:42:39 PM

to bfo-d...@googlegroups.com

Werner,

thanks for that further reference, again very useful.

I know your allergy to anything but pure ontologising, but we must face the fact that most data being created (probably Terabytes / second these days) are technical instances of badly formed 'information models' containing close to zero coherent semantics. Hence the basic problem of health IT and most other domains - unknown semantics in any given database multiplied by incommensurability of any one database (or schema thereof) with every other. We still live in a semantic apocalypse...

Most industry sectors will never move to purely ontology-based representation (notwithstanding the very good reasons to do so, at least for areas like terrorism, biosecurity and so on). They might however move to information systems that represent key semantics in a well-formed way - such that data relate correctly to appropriate BFO categories, and the (main) relationships also have semantics rooted in appropriate ontology (e.g. parthood; the notion of Participation of Roles in a Process and so on).

The question is: what are 'key' semantics? The most likely answer is: anything that needs to support proper inferencing. (One response is: you never know, formalize everything ... not super-helpful.) But there is a lot of data outside of that. For example, characteristics of a device like UDI, date of manufacture, serial number could clearly be formalized via BFO & referent tracking; so could manufacturer, and indeed every other detail of any object or process. However, there are costs associated with doing that (both the Guizzardi and the BFO papers indicate quite literally the extra complexity of representation), and if there will never be any inferencing beyond the simplest value-matching or set-membership, then the purist representation of such properties probably has no value. Cleaning history of an ultrasound machine comes to mind...

To restate part of the question in my original post: I believe we need a methodology that takes account of what should be ontology-based and what shouldn't - i.e. what may be represented as 'data properties' within a coherent ontology-based skeleton.

In the language of Guizzardi we might say something like: to what extent within an overall domain do we apply 'ontological unpacking'? Or in the GFO-based approach you cited just now, to what extent do we apply the General method of ontological reduction in a domain? This really matters in the real world, because there are significant costs associated with the full formalization of everything in a domain. If no query or 'deep' inferencing engine ever touches those things, then the extra cost has not been worth it.

We can actually go a bit further: non-ontologically unpacked data items can still be considered 'keys' into other systems where the same information is found in its ontologically unpacked form, and fully usable.

Thus there is a gap here, and it needs methodology. I think that there is likely a need for 'bridging ontologies' that reify 'data' and 'information' in ways more directly suited to realisation in IT systems, that would get us from pure realist ontology such as BFO to a real information system. YAMATO and GFO for example contain departures from BFO that potentially make sense if we think of them as such bridging ontologies, rather than competitors in the pure realist space. I also think Referent Tracking is almost always an essential part of what needs to be done 'properly'.

I am quite interested to know if there is any resonance within this community of the above issues.

Thomas

p.s. please, not the HL7 RIM. Barry might remember a 20pp paper I showed him in about 2007 demolishing it just in terms of information & modelling theory. There are good ways to do modelling, and there are really terrible ways...

seanno...@gmail.com

Apr 4, 2025, 5:27:11 PM

to bfo-d...@googlegroups.com

Thomas,

I agree with 99% of what you said in your last reply. Just these two comments on some details.

TB:

YAMATO and GFO for example contain departures from BFO that potentially make sense if we think of them as such bridging ontologies, rather than competitors in the pure realist space. I also think Referent Tracking is almost always an essential part of what needs to be done 'properly'.

WC: There is no need to have an upper ontology be designed in such a way that it can serve as ‘bridging ontology’. If some user needs such a bridge to another ontology, it can be designed without impacting the ontology by using bridging axioms between the two. I did some work in that direction with a PhD student, using Snomed:

https://pubmed.ncbi.nlm.nih.gov/38808039/

https://pubmed.ncbi.nlm.nih.gov/38269770/

TB: To restate part of the question in my original post: I believe we need a methodology that takes account of what should be ontology-based and what doesn't - i.e. what may be represented as 'data properties' within a coherent ontology-based skeleton. I am quite interested to know if there is any resonance within this community of the above issues.

WC: It resonates absolutely with me. But I did not get the impression from your original post that that is what you were looking for. I believe that indeed too much of what shouldn’t be in an ontology is put in an ‘ontology’, and that too many of these ‘ontologies’ are no different from simple vocabularies. Anyhow, my interest is (1) not in what shouldn’t be in an ontology, but what should, (2) how to do that using BFO such that the result is compatible with and extends all BFO2020-FOL axioms, and (3) doing these exercises so that BFO itself can be improved. Alan Ruttenberg’s axiomatization of BFO is a fantastic resource that is unfortunately under-used. That is what I spend my research time on. And that is the context in which my replies are given. ‘Information modeling’ is outside my scope; that is why I pointed you to Guizzardi, who at least works in both areas.

W


wolandscat

Apr 5, 2025, 10:05:02 AM

to bfo-d...@googlegroups.com

Thanks for yet another reference (which I skimmed already, I need to look properly at it).

Just in the interests of clarity, my original question was about:

1. the general question of bridging from ontology to information systems, which could be understood as 'how much ontological unpacking is needed?' I didn't formulate it like this, as I had not seen the papers you cited.
2. a specific question on how to deal with both kinds and individuals as first order information system entities.

The second question is one that I think needs a better solution than the current one, which is (most likely) that information about 'kinds' (description of a certain model ultrasound machine etc) are just instances of the Descriptive (or maybe even Representational) ICE category (is-a ICE is-a bfo:gd-Continuant). This is not very helpful, because it doesn't provide any direct expression of the qualities and relationships of the kind of artifact (or other entity) in question, understood in BFO terms, which are potentially found in the descendants of Material Artifact (is-a bfo:material entity) from the artifact ontology.

One could imagine a mirror of the BFO material entity category (including Material artifact hierarchy) that appears under the iao:descriptive ICE where all the nodes and edges have the meaning 'description of <original ontology element>'. This could make sense if we agree that you can only describe what can actually exist.
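
A minimal Python sketch of such a mirror, under loud assumptions: the `IS_A` fragment is a hypothetical three-node excerpt rather than the real BFO or artifact ontology hierarchy, and `describe` and `mirror_under_descriptive_ice` are illustrative helper names.

```python
# Hypothetical fragment of an is-a hierarchy, stored as child -> parent.
IS_A = {
    "material artifact": "material entity",
    "ultrasound machine": "material artifact",
}

def describe(term):
    """Name of the mirrored node for an original ontology element."""
    return f"description of {term}"

def mirror_under_descriptive_ice(is_a, root="material entity", ice_root="descriptive ICE"):
    """Return a mirrored child -> parent map hanging under the descriptive ICE node."""
    # The mirrored root attaches to the ICE category instead of its original parent.
    mirrored = {describe(root): ice_root}
    # Every original is-a edge becomes an edge between the two mirrored nodes.
    for child, parent in is_a.items():
        mirrored[describe(child)] = describe(parent)
    return mirrored
```

Only is-a edges are mirrored here; other relations (part-of, has-function and so on) would need the same treatment for the mirror to carry the kind-level semantics discussed above.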

So I think this particular question is distinct from the first one about ontological unpacking.
