What about next year's Summit: Formal ontologies justification🎉

Alex Shkotin

Sep 26, 2024, 4:24:12 AM
to ontolog-forum

Colleagues,


After all the interesting discussions here, let me propose a topic for next year's Summit: Formal ontologies justification.

We are supposed to use formal ontologies to verify AI output, but how justified are our ontologies themselves?

It may be seen as a continuation of our Summit: Ontology for scientists.

OBO Foundry has many rules for how to make a good ontology.

But strictly speaking, every statement in a formal ontology must be justified.

The idea is simple: to hear from different groups about how they justify their formal ontologies.

After all, consistency or, as in the case of DL, the existence of a model is not enough, ontologically speaking 👁️


Alex


Alex Shkotin

Oct 15, 2024, 9:04:59 AM
to ontolog-forum
IN ADDITION: scientific knowledge, theories we have as a background for our ontologies.

Thu, Sep 26, 2024 at 11:23, Alex Shkotin <alex.s...@gmail.com>:

Michael DeBellis

Oct 24, 2024, 12:15:22 AM
to ontolog-forum
First, regarding: "scientific knowledge, theories we have as a background for our ontologies." Why? Do we need scientific knowledge as a background for our databases? I mean, for some databases sure, but for most industry databases, no. No one thinks there is a science of shopping carts and catalogs, or a science of checking people into hospitals and keeping medical records. The goal is to build useful technology that solves business problems. Or have we given up on OWL as something that is relevant to the business world? If we have, why? There are many features that OWL and other Semantic Web technology bring that are right in line with what people like Gartner (Data Fabrics), Martin Fowler (Data Mesh) and Dave McComb (Data Centric Revolution) describe as the latest and greatest ideas for how to use data in modern systems. So why are Semantic Web technologies currently being mostly ignored by the business world? (See 2)

I would like to propose 2 different topics:

1) Data lakes and ontologies/knowledge graphs. Data warehouses are going away. They are expensive, complex, and take a lot of effort to develop and maintain. In their place are what people call Data Lakes. The differences between a data warehouse and a data lake are:
1.1) Data warehouses focus on structured data. Data lakes handle both structured data (tables, OWL models, graph schemas) and unstructured data (files with raw data, documents, video and other media).
1.2) Data warehouses require very expensive software. Data lakes can be developed using mostly open source tools such as Hadoop, Spark, Flink, Kafka (for data pipelines) and Avro. In spite of requiring some custom development, data lakes are still cheaper than data warehouses and provide more functionality. You don't have to precompute all possible queries to design a cube for a data lake.
1.3) It would seem that an ontology or RDF (or Neo4J) graph model would be the perfect model for a Data Lake, especially OWL, given the extra semantics. Yet all of the open source tools mentioned above use either key-value pairs or a table model. Why isn't OWL used more for Data Lakes, and what can be done to increase its use?
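
As a rough illustration of the gap described in 1.3, here is a minimal Python sketch of how a key-value record of the kind these tools pass around could be flattened into subject-predicate-object triples, the shape an RDF or OWL store expects. The function `record_to_triples`, the `http://example.org/` base IRI, and the sample order record are all hypothetical, and the sketch ignores datatypes, blank nodes, and vocabulary reuse.

```python
# Hypothetical sketch: flatten one key-value record into RDF-style triples.
# The base IRI and the field names are made up for illustration only.

def record_to_triples(record, id_key, base="http://example.org/"):
    """Return (subject, predicate, object) tuples for one flat record."""
    subject = f"{base}{record[id_key]}"
    return [
        (subject, f"{base}{key}", value)
        for key, value in record.items()
        if key != id_key  # the id becomes the subject, not a property
    ]

# One row as it might come out of a table or a key-value store.
row = {"order_id": "o42", "customer": "c7", "total": 19.99}
triples = record_to_triples(row, "order_id")

for triple in triples:
    print(triple)
```

The mapping itself is mechanical; what it does not supply is a shared vocabulary or any axioms about the data, which is the part an ontology would have to add.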

2) Ontologies are not gaining traction in industry, especially compared to property graphs and Neo4J in the US. Why? What are we doing wrong? Here is some info from ChatGPT's Market Analyzer that supports my intuition that Neo4J is kicking our butts. Note: I didn't ask it a loaded question, the prompt was: "I have a question about the knowledge graph market in US industry. There are two types of knowledge graphs: those that use the Web Ontology Language and those that use Property Graphs, primarily Neo4J. Do you have any information about the market share of the two approaches?":

Start ChatGPT Market analyzer -----------------------------------------------------------------------------------------------------------------------------------------------
Deleted some overview descriptions of the two different technologies...
Market Dynamics
  • Neo4j Dominance: Neo4j, as the leading Property Graph database, holds a significant market share in the commercial adoption of graph databases, driven by its versatility, developer community, and focus on practical applications like fraud detection, network analysis, and recommendation engines.
  • OWL's Niche: OWL-based knowledge graphs, while less commercially widespread than Neo4j, dominate specific verticals that rely heavily on standardization and formal knowledge representation, such as healthcare and government research.
Market Share Estimates
  • Neo4j and Property Graphs: Neo4j is estimated to capture a substantial share of the graph database market, which is expected to grow from about $1.3 billion in 2022 to $5.5 billion by 2030, according to recent reports. While Neo4j doesn't represent the entire Property Graph market, its dominance indicates that Property Graphs hold the larger share of the market compared to OWL.
  • OWL-Based Knowledge Graphs: OWL-based systems, though less commercially adopted in terms of the broader database market, are crucial for specialized use cases. Stardog, Ontotext, and other OWL-based platforms typically cater to a more niche segment, often used in knowledge-intensive domains.
Key Trends to Consider
  • Hybrid Graph Systems: Platforms like Amazon Neptune are increasingly supporting both RDF/OWL and Property Graph models, reflecting the growing trend toward multi-model databases. This hybrid approach allows organizations to leverage the strengths of both types of graphs, depending on the use case.
  • Growing Adoption of Graphs in AI: Knowledge graphs, regardless of their type, are becoming critical in AI/ML-driven applications, particularly for enhancing context and improving data integration. The intersection of knowledge graphs with AI and machine learning is expected to drive significant future growth for both OWL and Property Graph models.

In summary, Property Graphs, especially Neo4j, hold the majority market share in terms of broader commercial applications, while OWL-based knowledge graphs are essential in more specialized sectors that require formal semantics and inferencing. Both approaches serve distinct use cases, and the future may see increasing adoption of hybrid models combining the strengths of both.

End ChatGPT Market analyzer -----------------------------------------------------------------------------------------------------------------------------------------------

I think part of the problem is that we like to make ontologies sound much more complex than they really are. E.g., we invent terminology like Continuant and Occurrent that only a handful of people understand rather than use common sense terms such as Process and Physical Object. This leads people in industry to think that ontologies are only for specialized scientific use cases rather than the very powerful tool based on the intuitive foundations of logic and set theory that OWL really is. You don't need to understand Turing or model theory or the difference between first and second order logic and Description Logic to use OWL, just as you don't need to understand the difference between a tensor, a matrix and a vector to use an LLM. All you need to understand OWL is set theory and the basic operators of logic which can be taught to someone in elementary school. I've done workshops where I teach the basics of OWL and Protege to people with no technical background such as library scientists and they grasp it right away. 
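
To make the set-theory point concrete, here is a minimal sketch in plain Python that reads a few OWL class expressions as ordinary set operations over one small, finite interpretation. It is not an OWL reasoner and it ignores the open-world assumption; every individual, class, and property name in it is hypothetical.

```python
# Toy illustration: OWL class expressions read as set operations.
# All individuals, classes and the works_for property are made up.

# The universe of individuals plays the role of owl:Thing.
thing = {"alice", "bob", "acme", "laptop1"}

# Named classes are subsets of the universe.
person = {"alice", "bob"}
employee = {"alice"}
organization = {"acme"}

# An object property is a set of (subject, object) pairs.
works_for = {("alice", "acme")}

def some_values_from(prop, filler):
    """ObjectSomeValuesFrom: individuals with at least one link into filler."""
    return {s for (s, o) in prop if o in filler}

def all_values_from(prop, filler, universe):
    """ObjectAllValuesFrom: individuals whose links (if any) all land in filler."""
    return {x for x in universe
            if all(o in filler for (s, o) in prop if s == x)}

person_and_employee = person & employee   # ObjectIntersectionOf
person_or_org = person | organization     # ObjectUnionOf
not_person = thing - person               # ObjectComplementOf
employee_is_person = employee <= person   # SubClassOf as a subset test

works_for_some_org = some_values_from(works_for, organization)
works_only_for_orgs = all_values_from(works_for, organization, thing)
```

In this interpretation `works_only_for_orgs` contains every individual, including those with no `works_for` link at all, which is the usual vacuous-truth behaviour of universal restrictions that tends to surprise newcomers.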

Michael

Alex Shkotin

Oct 24, 2024, 5:08:10 AM
to ontolo...@googlegroups.com

Michael,


Thank you for your interest in my proposal for Summit-2025. Currently, after some fruitful discussion, the proposed slogan of the Summit is

"The Two Sides of Ontology: Relating Ontologies to the world and to theories about it."


Let me answer your questions in line.


MDB:First regarding: "scientific knowledge, theories we have as a background for our ontologies." Why? 

AS:It seems to me that you have a slightly elevated view of science. In Russian you can say "his example is a science to others" (Pushkin, "Eugene Onegin"). In this case, we are talking about theoretical knowledge. And continuing Kant's thought about a priori knowledge, we can say that each of our deliberate actions is based on theoretical knowledge.


MDB:Do we need scientific knowledge as a background for our databases? 

AS:Each database schema, including the comments on each table, relationship and attribute, contains a precise description of their purpose, i.e. the database's own specific theoretical knowledge.


MDB:I mean, for some databases sure, but for most industry databases, no. No one thinks there is a science of shopping carts and catalogs, or a science of checking people into hospitals and keeping medical records.

AS:What the structure of the theoretical knowledge behind industrial databases is needs to be examined; to a first approximation, it is contained in their documentation. The theoretical knowledge for a specific system is the documentation about it.


MDB:The goal is to build useful technology that solves business problems. Or have we given up on OWL as something that is relevant to the business world? If we have, why?

AS:The choice of OWL and related technologies is the responsibility of the project leader. But one drawback is widely known and acknowledged by OWL's authors: its expressive power is not sufficient.


MDB:There are many features that OWL and other Semantic Web technology bring that are right in line with what people like Gartner (Data Fabrics), Martin Fowler (Data Mesh) and Dave McComb (Data Centric Revolution) describe as the latest and greatest ideas for how to use data in modern systems. So why are Semantic Web technologies currently being mostly ignored by the business world? (See 2) 

AS:A whole study needs to be conducted here.


AS:

So: wherever we act consciously, we think logically and theorize. Systematized theoretical knowledge is science, especially once it applies not only the methods of one logic or another but also mathematical methods.

Therefore, I propose that Summit-2025 consider where in our formal ontologies theoretical knowledge is formalized, what that knowledge is, and how it relates to theories in particular sciences.

If the ontology is expressed in OWL, then each TBox sentence is theoretical knowledge, expressed, by the way, in a rather HOL (higher-order logic) manner. This is especially evident if you use Functional Syntax.
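
As a minimal sketch of what such a TBox sentence looks like in Functional Syntax, the following Python snippet builds one axiom as a string. The builder functions and the names `:Employee`, `:worksFor` and `:Organization` are hypothetical; only the keywords `SubClassOf` and `ObjectSomeValuesFrom` come from the OWL 2 Functional Syntax itself. Note how the constructors take class and property names as arguments, which is one way to see the higher-order flavour.

```python
# Tiny builders that emit OWL 2 Functional Syntax strings.
# The class and property names used below are hypothetical.

def object_some_values_from(prop: str, filler: str) -> str:
    """Class expression: things related by prop to at least one filler."""
    return f"ObjectSomeValuesFrom({prop} {filler})"

def sub_class_of(sub: str, sup: str) -> str:
    """TBox axiom: every instance of sub is an instance of sup."""
    return f"SubClassOf({sub} {sup})"

axiom = sub_class_of(
    ":Employee",
    object_some_values_from(":worksFor", ":Organization"),
)
print(axiom)

# One common first-order reading of the same sentence, written as plain text.
fol_reading = (
    "forall x. Employee(x) -> exists y. worksFor(x, y) & Organization(y)"
)
print(fol_reading)
```

Read as theoretical knowledge, this single sentence claims that every employee works for at least one organization; whether that claim is justified is exactly the kind of question proposed for the Summit.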


Alex



Thu, Oct 24, 2024 at 07:15, Michael DeBellis <mdebe...@gmail.com>:
--
All contributions to this forum are covered by an open-source license.
For information about the wiki, the license, and how to subscribe or
unsubscribe to the forum, see http://ontologforum.org/info