FYI (art): Trends in Integration of Knowledge and Large Language Models: A Survey and Taxonomy of Methods, Benchmarks, and Applications


alex.shkotin

Jun 6, 2024, 12:33:35 PM
to ontolog-forum
https://arxiv.org/abs/2311.05876  [Submitted on 10 Nov 2023 (v1), last revised 7 Dec 2023 (this version, v2)]  
 Large language models (LLMs) exhibit superior performance on various natural language tasks, but they are susceptible to issues stemming from outdated data and domain-specific limitations. In order to address these challenges, researchers have pursued two primary strategies, knowledge editing and retrieval augmentation, to enhance LLMs by incorporating external information from different aspects. Nevertheless, there is still a notable absence of a comprehensive survey. In this paper, we propose a review to discuss the trends in integration of knowledge and large language models, including taxonomy of methods, benchmarks, and applications. In addition, we conduct an in-depth analysis of different methods and point out potential research directions in the future. We hope this survey offers the community quick access and a comprehensive overview of this research area, with the intention of inspiring future research endeavors.       

John F Sowa

Jun 6, 2024, 3:13:49 PM
to ontolo...@googlegroups.com, CG
Alex,

Thanks for the reference to that article.   But the trends it discusses (from Dec 2023) are based on the assumption that all reasoning is performed by LLM-based methods.   It assumes that any additional knowledge is somehow integrated with or added to data stored in LLMs.  Figure 4 from that article illustrates the methods the authors discuss:

Note that the results they produce come from LLMs that have been modified by adding something new.   That article is interesting.  But without an independent method of testing and verification, Figure 4 is DIAMETRICALLY OPPOSED to the methods we have been discussing and recommending in Ontolog Forum for the past 20 or more years.  

The methods we have been discussing (which have been implemented and used by most subscribers) are based on ontologies as a fundamental resource for supplementing, testing, and reasoning with and about data from any source, including the WWW.

Most LLM-based methods, however, use untested data from the WWW.  A large volume of that data may be based on reliable documents.  But an even larger volume is based on unreliable or irrelevant data from untested, unreliable, erroneous, or deliberately deceptive and malicious sources.  

Even if the data sources are reliable, there is no guarantee that a mixture of reliable data on different topics, when combined by LLMs, will be combined in a way that preserves the accuracy of the original sources.  Since LLMs do not preserve links to the original sources, a random mixture of facts is not likely to remain factual.

In summary, the most reliable applications of LLMs are translations from one language (natural or artificial) to another. Any other applications must be verified by testing against ontologies, databases, and other reliable sources. 

There are more issues to be discussed.  LLMs are an important addition to the toolbox of AI and computer science.  But they are not a replacement for the precision of traditional databases, knowledge bases, and methods of reasoning and computation.

John
______________________________________

Alex Shkotin

Jun 7, 2024, 3:52:36 AM
to ontolo...@googlegroups.com, CG

John,


You're welcome! And briefly:


If I want a very reliable LLM, I have to train it myself.


JFS: "That article is interesting.  But without an independent method of testing and verification, Figure 4 is DIAMETRICALLY OPPOSED to the methods we have been discussing and recommending in Ontolog Forum for the past 20 or more years."

But that green box in Figure 4 is exactly about your point.


One interesting point from a talk I saw on the Internet is that Huge Language Models (from ChatGPT onward) already use ALL of the knowledge available on the World Wide Web, and even that is not enough to make them good. But we have no more to give them🙂


Alex




Alex Shkotin

Jun 7, 2024, 6:45:06 AM
to ontolo...@googlegroups.com, CG

John and All,


I have begun, but not yet finished, a report [1] on the ability of an LLM to verbalize a formal language, in this case OWL 2.

The bad places are highlighted in yellow and red.

And here [2] is an example of our dialogue.


But the summary for me is clear: we can't trust an LLM even for "translations from one language (natural or artificial) to another".

It is mostly correct but sometimes unexpectedly wrong.⚽

That is, even in this case we need a "Revision" step before handing the LLM output over to decision making.


Alex


[1] The modest charm of Claude 3 Sonnet(-:PUBLIC:-)

[2] DATA

OWL2P: every childless person is a person for which there does not exist anybody who has the first person as parent.

OWL2: SubClassOf(
        :ChildlessPerson
        ObjectIntersectionOf(
          :Person
          ObjectComplementOf(
            ObjectSomeValuesFrom(
              ObjectInverseOf( :hasParent )
              owl:Thing ) ) ) )

C3S: Subclass Axiom with Complex Class Expression:
- The ChildlessPerson class is a subclass of the intersection between:
   1) The Person class
   2) The complement of the class of individuals that have at least one value for the inverse of the hasParent property.
(An individual is an instance of ChildlessPerson if it is a Person and does not have any parents)

AS: the paraphrase is faithful, but the plain natural-language gloss is wrong: it describes a parentless person, not a childless one.
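In description-logic notation the slip is easy to see (a standard DL rendering, using the names from the axiom above): the axiom negates the inverse of hasParent, that is, "being somebody's parent", not hasParent itself.

ChildlessPerson ⊑ Person ⊓ ¬∃hasParent⁻.⊤   (nobody has this person as parent, i.e. childless)

A parentless person would instead be Person ⊓ ¬∃hasParent.⊤ (this person has no parent), which is what the parenthetical gloss actually describes.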



John F Sowa

Jun 7, 2024, 5:16:25 PM
to ontolo...@googlegroups.com, CG
Alex,

No, that third method is NOT what I was saying.   

ALTHOUGH their third method (below) may use precise methods, which could include ontology and databases as input, their FINAL process uses LLM-based methods to combine the information.  (See Figure 4 below, which I copied from their publication.)

When absolute precision is required, the final reasoning process MUST be absolutely precise.   That means precise methods of logic, mathematics, and computer science must be the final step.  Probabilistic methods CANNOT guarantee precision.

Our Permion.ai company does use LLM-based methods for many purposes.  But when absolute precision is necessary, we use mathematics and mathematical logic (i.e. FOL, Common Logic, and metalanguage extensions).

Wolfram also uses LLMs for communication with humans in English, but ALL computation is done by mathematical methods, which include mathematical (formal) logic.  Kingsley has also added LLM methods for communication in English.  But his system uses precise methods of logic and computer science for precise computation when precision is essential.

For examples of precise reasoning by our old VivoMind company (prior to 2010), see https://jfsowa.com/talks/cogmem.pdf .  Please look at the examples in the final section of those slides.  The results computed by those systems (from 2000 to 2010) were far more precise and reliable than anything computed by LLMs today.

I am not denying that systems based on LLMs may produce reliable results.  But to do so, they must use formal methods of mathematics, logic, and computer science at the final stage of reasoning, evaluation, and testing.

John
 


From: "Alex Shkotin" <alex.s...@gmail.com>
Sent: 6/7/24 3:53 AM

John F Sowa

Jun 7, 2024, 5:31:59 PM
to ontolo...@googlegroups.com, CG
Alex,

I like your note below, which is consistent with my previous note that criticized your earlier note. If this is your final position, I'm glad that we agree.

As for translations from one language to another, we can't even depend on humans.  When absolute precision is essential, it's important to produce an "echo" -- a translation from (1) the user's original language (natural or formal) to (2) the computer system's internal representation to (3) the same language as the user's original, and (4) a question: "Is this what you mean?"

John
 


From: "Alex Shkotin" <alex.s...@gmail.com>

Alex Shkotin

Jun 8, 2024, 5:26:51 AM
to ontolo...@googlegroups.com, CG

John,


For me, on this fragment of Fig. 4,


"Revision" is the final process. And it is crucial, as it changes the wrong output of the LLM (Netherlands) into the correct one (Germany).


What kinds of knowledge and methods we use, and in which form, is an important question not only for ontologists but for all scientists and engineers.

From databases up to knowledge graphs and formal ontologies, we duplicate the same knowledge from 10 to 1000 times around the world.

One funny thing about Huge Language Models is that they concentrate all of that knowledge in one place!

By the way, we have the same situation in theorem proving: there may be ten tools like Coq, Isabelle, and HOL4, each keeping mostly the same mathematical theories formalized.

I wonder what has by now been formalized in the Wolfram Language as a whole.


But to keep theoretical knowledge properly, we need one framework for every particular science and technology.

This is huge and subtle work for science and technology experts. In some form this work is being done by formal ontologists.

But today formal ontologies like GENO, GO, etc. are a special kind of artifact.


We need a framework as a kind of unified knowledge maintenance for everyone around the world.

One science or technology, one place to keep its verified knowledge.

My own small contribution was to show a possible framework for the simple but powerful theory of undirected graphs [1], and how to apply this theory to solve the simplest tasks on the simplest structure [2].


Theoretical knowledge is a treasure. Let's concentrate this knowledge before we formalize it.


It is clear that we can fully formalize geometry following Hilbert's axiomatic theory, and mechanics following Lagrange's approach.

What about the frameworks of these theories?


An important feature of the framework is that we keep the same unit of knowledge in different languages, natural and formal. Like this:


rus

Пусть e - ребро. ребро e есть петля если и только если у e одна концевая вершина.

eng

Let e be an edge. edge e is a loop if and only if e has one endpoint vertex.

yfl

declaration loop func(TV edge) (e) ≝ (Count(enp(Singl(e)))=1).

And it is possible to add a line here for CL, Coq, Isabelle, etc. But I am not sure about a DL line, because of Count.
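For instance, a line in plain first-order notation could read as follows (a possible rendering, taking enp(e) to denote the set of endpoint vertices of the edge e):

∀e ∈ Edge: loop(e) ⇔ |enp(e)| = 1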

We discussed the advantages of formal definitions at our 2021 Summit [3].


Alex


[1] https://www.researchgate.net/publication/374265191_Theory_framework_-_knowledge_hub_message_1

[2] https://www.researchgate.net/publication/380576198_Specific_tasks_of_Ugraphia_on_a_particular_structure_formulations_solutions_placement_in_the_framework 

[3] https://www.researchgate.net/publication/349164216_Advantages_of_Formal_Definitions 




Alex Shkotin

Jun 8, 2024, 5:32:25 AM
to ontolo...@googlegroups.com, CG
John,

Exactly!

The_same_meaning(Verbalization(Formalization(natural_text1)), natural_text1)
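Spelled out as a small runnable sketch (formalize, verbalize, and same_meaning are hypothetical placeholders for whatever formalizer, verbalizer, and confirmation step are actually used):

# Minimal sketch of the "echo" check: translate the user's text into the
# internal (formal) representation, translate it back, and confirm that the
# round trip preserved the meaning.  All three helpers are placeholders.

def formalize(natural_text: str) -> str:
    # Placeholder: natural language -> formal representation (e.g. OWL2, CL).
    raise NotImplementedError

def verbalize(formal_text: str) -> str:
    # Placeholder: formal representation -> natural language.
    raise NotImplementedError

def same_meaning(echo: str, original: str) -> bool:
    # Placeholder: in practice this is the question put back to the user,
    # "Is this what you mean?"
    raise NotImplementedError

def echo_check(natural_text: str) -> bool:
    # Accept the formalization only if the round-trip echo is confirmed.
    return same_meaning(verbalize(formalize(natural_text)), natural_text)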

Alex


John F Sowa

Jun 11, 2024, 10:56:59 PM
to ontolo...@googlegroups.com, ontolog...@googlegroups.com, CG
Jon,

Thanks for that link.  It shows why hopes that LLMs will magically lead to AGI (some kind of intelligence that competes with or goes beyond the human level) are hopelessly MISGUIDED.  For passing a math test, they can get an A+ if they're lucky enough to find the answers in their petabytes of random stuffing.  But if they can't find a correct solution, they're lucky to earn a C.  Even worse, the LLMs are so stupid that they can't say whether their results are good, bad, or indifferent.

The major strength of generative AI technology is in providing an English-like (more generally, a natural-language-like) interface to the AI reasoning technology of the past 60 years. That is extremely valuable, since the complex reasoning methods of GOFAI (Good Old Fashioned AI) require years of study to learn and use correctly.

But the hope that devoting billions of $$$ to computer horsepower will produce AGI is hopelessly misguided.  A good state-of-the-art laptop with GOFAI and a modest amount of LLM processing can outperform the biggest and most expensive LLM systems on the planet.  And it will do so with guaranteed accuracy.  If it can't solve a problem, it will say so.  It won't produce garbage and claim that it's accurate.

Following Jon Awbrey's note is an excerpt that I extracted from the  link that Jon cited.  

John
 


From: "Jon Awbrey" <jaw...@att.net>

John, Alex, ...

I haven't found a use myself for the new spawn of chatbots but
the following is typical of reports I read from those who do
attempt to use them for research and not just entertainment.

Peter Smith • Another Round with ChatGPT

Cheers,

Jon
_____________________________

Another round with ChatGPT

By Peter Smith, June 2, 2024

ChatGPT is utterly unreliable when it comes to reproducing even very simple mathematical proofs. It is like a weak C-grade student, producing scripts that look like proofs but mostly are garbled or question-begging at crucial points. Or at least, that’s been my experience when asking for (very elementary) category-theoretic proofs. Not at all surprising, given what we know about its capabilities or lack of them.

But this did surprise me (though maybe it shouldn’t have done so: I’ve not really been keeping up with discussions of  the latest iteration of ChatGPT). I asked — and this was a genuine question, hoping to save time on a literature search — where in the literature I could find a proof of a certain simple result about pseudo-complements (and I wasn’t trying to trick the system, I already knew one messy proof and wanted to know where else a proof could be found, hopefully a nicer one). And this came back:

So I take a look. Every single reference is a total fantasy. None of the chapters/sections have those titles or are about anything even in the right vicinity. They are complete fabrications.

I complained to ChatGPT that it was wrong about Mac Lane and Moerdijk. It replied “I apologize for the confusion earlier. Here are more accurate references to works that cover the concept of complements and pseudo-complements in a topos, along with their proofs.” And then it served up a complete new set of fantasies, including quite different suggestions for the other two books.

[Following this example are two more paragraphs by Peter Smith and a few notes by other readers who had similar experiences.]
