Hi Chris,
I am back from holiday now and wading through a lot of FDM group emails! You've been busy. Thanks for taking the time to go through my review comments - I've gained a lot more from them than you have :-). Here are some further comments. If nothing else it helps me to articulate what I think about the subject...
--------------------------------------------------------------------
5.3.2 My question on "normalization"
- My question is more about whether a TLO that claims to both
stratify and unify can also be in normal (and/or denormalised)
form. How can I use a TLO that claims to allow both and still end
up with (in the limit) zero ambiguity? I don't think I can, as I
have no means "within the system" of translating between (a range
of) Three- and Four-Dimensionalist representations of things (or I
have to choose in advance which approach I am going to take, which
defeats the point of claiming to support both) - see the sketch
below. You made this comment in response to mine on Normal Form:
"And I can see opportunities for the redundant data to be
inconsistent – the classic denormalization problem." I think we
are in agreement, but is this the sort of issue that you were
referring to?
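To make the ambiguity concrete, here is a toy sketch (hypothetical Python; the record shapes and names are mine, not from any TLO) of the same fact captured under a 3D-style and a 4D-style reading, with nothing in the system to say the two records are the same assertion:

```python
# Hypothetical sketch: the same fact ("Pump P1 weighs 120kg during 2023")
# captured under two readings a dual 3D/4D TLO would both admit.

# 3D-style (endurant): the object persists whole; the property is
# time-indexed on the side.
three_d = {
    "object": "Pump-P1",
    "property": ("mass_kg", 120),
    "valid_time": ("2023-01-01", "2023-12-31"),
}

# 4D-style (perdurant): a temporal part of the pump carries the
# property directly.
four_d = {
    "object": "Pump-P1-state-2023",    # a temporal part of Pump-P1
    "temporal_part_of": "Pump-P1",
    "property": ("mass_kg", 120),
}

# Nothing in the (dual) TLO tells a query engine that these two records
# state the same fact, so the redundancy can drift into inconsistency -
# the classic denormalization problem Chris mentions.
```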
- My next point is that when we talk about Normal Form we are
talking about data models (at least I am). You use the phrase
"ontological denormalization analogous to database normalization",
which implies that the TLO must be normalized. If so, I agree. Is
this an accepted point, or is it just a by-product of the careful
choices made 'in the round'? I can't see us having a normalized
TLO without us ensuring that this is the case (see links below).
- Back to the stratify/unify choice. I have a problem with an
approach that views it as optional whether a (particular) thing
exists in space and time (this is a separate point to the
stratify/unify one, I think). At one level this looks convenient,
but it seems to me unfortunate, as it allows many short-cuts that
we wish to avoid/minimise. However, there is an even bigger issue
(at least I think it is big). In BFO, for example, I have found
very few examples of it being used to explicitly represent (in
data) an instance of something that exists (a particular) - ok, I
haven't found any! I have found diagrams in books/papers that use
the instance_of relation to point to a counterpart in 'reality',
but none of these diagrams have included space or time! I have
spoken before of why I find all that concerning: I don't think it
is practical to use BFO as a data model. It is one thing to claim
that BFO is internally normalized (I am not sure that it makes
this claim), but I don't think there is any claim to be made about
particulars (expressed as data - i.e. counterparts to the 'real'
particulars). To be fair to the BFO community, they have never
claimed that it was a data model - they claim it to be a taxonomy
- so my hope is that there isn't a real argument to be had there.
Our goal is to create an FDM (that really is a data model!). This
may raise the question of what a "data model" actually is. At the
risk of avoiding that and taking a shortcut, my view is that the
FDM we require must allow all universals and particulars / things
/ objects constructed on it to be universally normal (or mappable
into a normalized form with zero loss - i.e. by only removing
repetition) from the outset and as it is extended (see the sketch
below). 'No' ambiguity - at least as the goal. That requirement
is a hard one, but my view is that it is one we can't compromise
on, as the goal is to enable the FDM to be used as a data model.
In this sense most of the reviewed TLOs are not in a suitable
Normal Form for our intended use.
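To pin down "only removing repetition", a toy sketch (again hypothetical; the triple layout is just for illustration): if the data is a bag of assertions, the only thing a normalization step may discard is a duplicate assertion, and zero loss is then a testable property:

```python
# A minimal hypothetical sketch: data as a bag of
# subject-predicate-object assertions.

raw = [
    ("Pump-P1", "mass_kg", 120),
    ("Pump-P1", "mass_kg", 120),      # repetition - the only droppable item
    ("Pump-P1", "located_in", "Site-A"),
]

normalized = set(raw)                 # removes the duplicate, nothing else

# Zero-loss check: every original assertion is still stated afterwards.
assert all(fact in normalized for fact in raw)
```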
- I hope that the points above don't raise more questions than
they answer. However, the resulting utility of the FDM (based on
our national TLO choices) will depend significantly on them - we
have to implement quality management processes so that we can
subsequently use the FDM, and rigorously extend and use the RDL
and the resulting user datasets.
- I tried to look for normalization papers relating to ontology and data and didn't come up with much. These two have some interesting perspectives, if only to illustrate how other teams have approached the topic. I don't think there is much in them that could influence what you have in your TLO review, but here they are if anyone is interested: "Normalisation of ontology implementations" and "Ontology Design Principles and Normalization Techniques in the Web".
5.3.3 - Ghosts and neutrinos. I agree that my example offered no
advantage over yours - and had extra complications :-). However,
does this not indicate that there may be less of a case for
allowing interpenetration? Does it offer benefit if we can't come
up with robust examples? This needn't change anything in your
report, but it is an important consideration for subsequent
choice(s). My experience (admittedly limited) is that there is
more analytic benefit to be had by only allowing things into the
TLO that assist the subsequent users in solving the challenges
that they have to deal with. Extras can provide opportunity for
inconsistent interpretations/applications of the FDM. I am not
sure that I am against admitting interpenetration, but I am wary
of it.
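For what it's worth, a toy sketch (invented grid extents, nothing from any TLO) of what interpenetration amounts to in extensional terms - two distinct objects whose extents overlap although neither is part of the other, which is the shape of the road/boundary example Chris mentions in his reply below:

```python
# Hypothetical toy extents as sets of grid cells.
road     = {(0, 0), (1, 0), (2, 0)}   # runs east-west
boundary = {(1, 0), (1, 1), (1, 2)}   # runs north-south, crossing the road

# The extents overlap...
assert road & boundary == {(1, 0)}
# ...yet neither object is a (spatial) part of the other.
assert not road <= boundary and not boundary <= road
```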
5.3.6 Identity; extension vs. intension
Thanks for hinting that I was a bit off the mark with my comments.
It caused me to give it some more thought - still probably not yet
enough! My thinking was attached more to the treatment of the
representations (in data) and not their counterparts in
reality/possibility.
So I still think the points I raised were valid... but they
weren't in the context of the identity section of your report.
One of the most important points for the use of the resulting FDM
for this work is a cautionary comment that you made to one of my
points about the handling of identity (of what is being
represented) in information systems and the (possible)
counterpart(s)*. You stated: "There is a connection, but I am not
so sure that in practice the link is so close or clear." I
strongly agree, and I feel that we need a diplomatic way of
stating this through the work of the group. Trying to ensure that
there is a high degree of consistency between what is stated in
(FDM-conformant) data and the possible counterparts has a cost. I
would describe much of this as (information) assurance activity,
and you only get what you pay for (as long as you spend wisely) -
modelling is part of this, but assuring information system
implementations, data 'objects', their sources, any
processing/storage/protection/exchange points that involve the
data, the decisions that are dependent upon the data, etc. are all
part of the information assurance 'landscape'. Even when a lot of
cost/effort has been expended there still won't be a 'complete'
correspondence between the counterparts - the goal is to get the
data representations sufficient for their intended uses. When I
read articles about "Digital Twins" (for example) this tends to be
overlooked - they often imply that the data is cheap/low cost and
that it can deliver radical benefit, without recognising that
serious work needs to be done to make it that useful. This is why
we need a data quality management approach to our ongoing work.
The work on the FDM is just the start of this journey.
BTW, I'll still value a chat/further correction on this subject at some point but don't consider it high priority.
--------------------------
Thanks and I will now embark on reading the dialogue between you and Steven!
Al
Dear Chris and Al,
I just wanted to pick up on the normalisation question you raise and the difference (it seems to me) between database normalisation and ontological normalisation (destratification).
What I notice that seems interesting is that they adopt opposite strategies. In database normalisation, the reduction of redundant data (in principle to reduce the risk of update anomalies) is achieved by creating additional tables that factor out replicated data into a single record. Although the normalisation process is in principle syntactic, it actually benefits greatly from an ontological approach; indeed, if you take an ontological approach to developing data models, they will be normalised syntactically as a by-product.
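As a toy sketch of that factoring-out (table and field names invented purely for illustration):

```python
# Denormalised: the supplier's city is repeated on every order row, so
# two rows can silently disagree (the update anomaly).
orders_denorm = [
    {"order": 1, "supplier": "Acme", "supplier_city": "Leeds"},
    {"order": 2, "supplier": "Acme", "supplier_city": "Leeds"},  # repeated
]

# Normalised: the city is stated once; orders just reference the supplier.
suppliers = {"Acme": {"city": "Leeds"}}
orders = [
    {"order": 1, "supplier": "Acme"},
    {"order": 2, "supplier": "Acme"},
]

# Rebuilding the original rows is a join: only repetition was removed.
rebuilt = [{**o, "supplier_city": suppliers[o["supplier"]]["city"]}
           for o in orders]
assert rebuilt == orders_denorm
```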
Now the ontological normalisation that de-stratification achieves is the opposite process. Here you might be taking data that is fully normalised in a syntactic sense, but combining (rather than splitting) objects because the distinct objects are unnecessary, and you reduce the amount of data at least by no longer needing the relationships among them. The only thing in data modelling that is equivalent to this is that when you come across a one-to-one relationship, one is traditionally encouraged to consider whether there is really only one thing rather than two. I'm not sure de-stratification would always fall into this pattern.
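And a toy sketch of the opposite, de-stratifying move (the pump/life-cycle split is again invented for illustration):

```python
# Stratified: a pump and a separate pump-life-cycle object, one-to-one.
pump = {"id": "P1", "model": "X200"}
pump_lifecycle = {"of_pump": "P1", "commissioned": "2019-05-01"}

# De-stratified: if the two are never needed separately, merge them;
# the one-to-one relationship (of_pump) disappears along with one object.
pump_merged = {"id": "P1", "model": "X200", "commissioned": "2019-05-01"}
```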
Anyway, the key thing here is that whilst there is an analogy to normalisation, this would be a distinct new level of normalisation (not yet documented). So I would have that as 8th normal form.
Regards
Matthew West


CP> This seems to be doing simple facetisation (we (BORO) tend to do this in the early stages of legacy re-engineering) - I can see why this is called normalization. From my perspective it is barely ontological. I'd also worry about the tree structure, as it is not economical enough (those familiar with my book (and papers) will be familiar with this argument).
5.3.3 - Ghosts and neutrinos. I agree that my example offered no advantage over yours - and had extra complications :-). However, does this not indicate that there may be less of a case for allowing interpenetration? Does it offer benefit if we can't come up with robust examples? This needn't change anything in your report, but it is an important consideration for subsequent choice(s). My experience (admittedly limited) is that there is more analytic benefit to be had by only allowing things into the TLO that assist the subsequent users in solving the challenges that they have to deal with. Extras can provide opportunity for inconsistent interpretations/applications of the FDM. I am not sure that I am against admitting interpenetration, but I am wary of it.
CP> In the latest paper I've used an example, suggested by a comment from Peter (Parslow), of a road and a boundary.
CP> As we discuss this, I'm becoming more and more convinced that one can have degrees of interpenetration - and this would be interesting to map, as it would show the results of the details of the stratification choices. One way of thinking of it is the varieties of whole-part-like relations one needs for the stratification. Maybe when I (we?) get some time we could map this out. It might also be useful, for the proto-TLOs with little commitment, to map the interpenetration vagueness - where it is not clear which choice has been made.
CP> Yes, like you, I think there are arguments that each new interpenetration brings a cost that needs to be outweighed by a benefit. I'm working on a way to articulate this (when I get the time).
5.3.6 Identity; extension vs. intension
Thanks for hinting that I was a bit off the mark with my comments. It caused me to give it some more thought - still probably not yet enough! My thinking was attached more to the treatment of the representations (in data) and not their counterparts in reality/possibility.
CP> I thought so, and this is a good area to explore.
So I still think the points I raised were valid... but they weren't in the context of the identity section of your report. One of the most important points for the use of the resulting FDM for this work is a cautionary comment that you made to one of my points about the handling of identity (of what is being represented) in information systems and the (possible) counterpart(s)*. You stated: "There is a connection, but I am not so sure that in practice the link is so close or clear." I strongly agree, and I feel that we need a diplomatic way of stating this through the work of the group. Trying to ensure that there is a high degree of consistency between what is stated in (FDM-conformant) data and the possible counterparts has a cost. I would describe much of this as (information) assurance activity, and you only get what you pay for (as long as you spend wisely) - modelling is part of this, but assuring information system implementations, data 'objects', their sources, any processing/storage/protection/exchange points that involve the data, the decisions that are dependent upon the data, etc. are all part of the information assurance 'landscape'. Even when a lot of cost/effort has been expended there still won't be a 'complete' correspondence between the counterparts - the goal is to get the data representations sufficient for their intended uses. When I read articles about "Digital Twins" (for example) this tends to be overlooked - they often imply that the data is cheap/low cost and that it can deliver radical benefit, without recognising that serious work needs to be done to make it that useful. This is why we need a data quality management approach to our ongoing work. The work on the FDM is just the start of this journey.
CP> I couldn't agree more that this is a problem we need to solve well-enough to cross the tipping point.
BTW, I'll still value a chat/further correction on this subject at some point but don't consider it high priority.
CP> Give me a few days and then I'll free up (hopefully) - and I'd like to pick your brain on OWL, etc.
Regards,
Chris Partridge
Chris Partridge |
Chief Ontologist | BORO Solutions Limited | www.BOROSolutions.co.uk
M: +44 790 5167263 | e: partr...@borogroup.co.uk
BORO Solutions Limited | Registered Office: 2 West Street, Henley on Thames, Oxfordshire RG9 2DU
Registered in England & Wales | Company No: 06025010 | VAT No. GB 905 6100 58