FYI: Galactica: A Large Language Model for Science


alex.shkotin

Nov 23, 2022, 12:32:20 PM
to ontolog-forum
https://galactica.org/static/paper.pdf
It would be great to compare it later with Knowledge concentrator.

alex.shkotin

Nov 26, 2022, 3:26:43 AM
to ontolog-forum
A pragmatic note from the Galactica page:
"
  • Language Models can Hallucinate. There are no guarantees for truthful or reliable output from language models, even large ones trained on high-quality data like Galactica. NEVER FOLLOW ADVICE FROM A LANGUAGE MODEL WITHOUT VERIFICATION.
"

Azamat Abdoullaev

Nov 26, 2022, 5:50:13 AM
to ontolo...@googlegroups.com
Alex wrote:
  • Language Models can Hallucinate. There are no guarantees for truthful or reliable output from language models, even large ones trained on high-quality data like Galactica. NEVER FOLLOW ADVICE FROM A LANGUAGE MODEL WITHOUT VERIFICATION.
As I mentioned before, all LLMs are "stochastic parrots", ideal for mindlessly spitting out biases and nonsense. Like all statistical learning applications trained on giga-, tera- or petabytes of corpus data, they are dull and dumb by design.
Galactica is an LLM for science, "trained on 48 million examples of scientific articles, websites, textbooks, lecture notes, and encyclopedias". In the company’s words, Galactica “can summarize academic papers, solve math problems, generate Wiki articles, write scientific code, annotate molecules and proteins, and more.”
Its demo failed. It was not all the fault of Meta's team. LeCun, Meta's chief AI scientist, defended it to the last. On the day the model was released, LeCun tweeted: “Type a text and Galactica will generate a paper with relevant references, formulas, and everything.” Three days later, he tweeted: “Galactica demo is off line for now. It’s no longer possible to have some fun by casually misusing it. Happy?” The same happened with Microsoft's bot on Twitter.
The weakest point of all LLMs is that they have only language models and lack world models entirely. It is the same with human intelligence: without a world view, the core of sentience, consciousness and awareness, conscience or morality, you are a living zombie or a soldier. In other words, such ML models are not aware of what they do, read, compose, translate, transcribe, recognize, drive, etc. Everything is performed automatically, mechanically and mindlessly, without sentience, knowing or self-knowing.
Again, LeCun is aware of this, and now states that without controllable, predictable world models there is no true path to autonomous machine intelligence. This is why he is absent from the article you mentioned.
Another weak point is the lack of a mathematical model/theory of intelligence in terms of reality, interaction and data.
So they have a lot of money rather than a lot of mind or a theoretical background for real-world AI/ML systems.
Then why do they do it? Meta does it because it can afford to ... until it goes down.
https://www.linkedin.com/pulse/mathematical-theory-intelligence-azamat-abdoullaev/?published=t

--
All contributions to this forum are covered by an open-source license.
For information about the wiki, the license, and how to subscribe or
unsubscribe to the forum, see http://ontologforum.org/info/
---
You received this message because you are subscribed to the Google Groups "ontolog-forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ontolog-foru...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ontolog-forum/dbe5f9d7-243f-4a4f-8a23-ac173a20f0a3n%40googlegroups.com.

John F Sowa

Nov 26, 2022, 11:20:32 PM
to ontolo...@googlegroups.com
Alex and Azamat,
 
I agree with Alex.  Language models can produce good results for many kinds of machine translation because every phrase is tied to a corresponding phrase in the source language.  But those methods can fail on various kinds of highly important technical texts:
 
For human translations, the following examples are ones for which subject-matter knowledge is more important than native experience in the source language.  These are also areas for which even the best MT systems are useless.
 
1. Scientific texts that contain large numbers of rare words and symbols.  The probabilities of those symbols are so low that even language models with huge numbers of words from the same branch of science cannot make reliable translations.  Examples include new chemical compounds, organic molecules, and drugs.  A chemist can distinguish the name of a new chemical from a misspelling of an old chemical.  But an MT system that does not understand chemistry cannot.
 
2. Texts on financial transactions, which have highly specialized technical terms that have low probabilities and require absolute precision.
 
3. Texts on legal terminology, especially for international organizations such as the UN and EU.  International treaties have specialized terms with precise definitions for which the slightest error could cause an international conflict.  Furthermore, many of those international treaties may involve highly specialized terminology for navigation, mineral rights, geography, etc.  Any errors could be a disaster.
 
4. Scientific, engineering, and business innovation.  New inventions, scientific discoveries, or new business products can introduce new terminology or use old terms in new senses.  Every new publication or patent introduces new terminology in new patterns for which language models become obsolete or misleading for the new subject matter.
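
Point 1 can be illustrated with a small sketch. It assumes a purely string-based corrector as a stand-in for a system with no chemical knowledge; the vocabulary and the `naive_correct` helper are hypothetical and do not describe how any particular MT system works. Ethanal (acetaldehyde) and ethanol are different compounds, yet string similarity alone treats the first as a misspelling of the second.

```python
from difflib import get_close_matches

# Hypothetical vocabulary of "old" chemical names known to the system.
KNOWN_TERMS = ["ethanol", "methanol", "propanol"]


def naive_correct(term, vocabulary=KNOWN_TERMS, cutoff=0.8):
    """Return the closest known term, or the input unchanged if none is close."""
    if term in vocabulary:
        return term
    matches = get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else term


# Without chemical knowledge, the new name is "corrected" to an old one.
print(naive_correct("ethanal"))

# Once the term is in the vocabulary, it is left alone.
print(naive_correct("ethanal", vocabulary=KNOWN_TERMS + ["ethanal"]))
```

Only adding the term to the vocabulary, which takes subject-matter knowledge, prevents the false correction.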
 
Finally, the worst MT systems are those that are so good for familiar texts that people fail to recognize the kinds of texts for which they can fail in dangerous, catastrophic, or extremely expensive ways.
 
John

Alex Shkotin

Nov 27, 2022, 2:46:30 AM
to ontolo...@googlegroups.com
John,

Let's be careful, or precise: AA wrongly attributed to me a quotation from the Galactica page, so you are agreeing with Meta AI, which is more important.
For me, what matters is that:
- they understand the limits of LLM usage,
- they use it to concentrate knowledge from all sciences in one place.

Alex



Barry Smith

Nov 27, 2022, 10:59:03 AM
to ontolo...@googlegroups.com
On Sat, Nov 26, 2022 at 11:20 PM John F Sowa <so...@bestweb.net> wrote:
Alex and Azamat,
 
I agree with Alex.  Language models can produce good results for many kinds of machine translation because every phrase is tied to a corresponding phrase in the source language.  But those methods can fail on various kinds of highly important technical texts:
 
For human translations, the following examples are ones for which subject-matter knowledge is more important than native experience in the source language.  These are also areas for which even the best MT systems are useless.
 
1. Scientific texts that contain large numbers of rare words and symbols.  The probabilities of those symbols are so low that even language models with huge numbers of words from the same branch of science cannot make reliable translations.  Examples include new chemical compounds, organic molecules, and drugs.  A chemist can distinguish the name of a new chemical from a misspelling of an old chemical.  But an MT system that does not understand chemistry cannot.
 
Example: 53 different labels used for one and the same cytokine in different provinces of the biochemical literature (see attached, from Building Ontologies with Basic Formal Ontology)
I agree wholeheartedly with everything that John has to say here.
BS
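
The problem of many labels for one entity is usually handled with an explicit synonym table. What follows is a minimal sketch, assuming placeholder labels and a made-up identifier; none of them come from the attached page, and `build_index` and `resolve` are hypothetical helper names.

```python
# Placeholder labels and identifier, for illustration only.
SYNONYMS = {
    "CYT:0001": ["cytokine X", "factor X-1", "CX", "X-stimulating factor"],
}


def build_index(synonyms):
    """Map every lower-cased label to its canonical identifier."""
    index = {}
    for canonical_id, labels in synonyms.items():
        for label in labels:
            key = label.strip().lower()
            # Refuse a label that would point to two different entities.
            if index.get(key, canonical_id) != canonical_id:
                raise ValueError(f"ambiguous label: {label!r}")
            index[key] = canonical_id
    return index


def resolve(label, index):
    """Return the canonical identifier for a label, or None if unknown."""
    return index.get(label.strip().lower())


INDEX = build_index(SYNONYMS)
print(resolve("Factor X-1", INDEX))
print(resolve("some new label", INDEX))
```

The table has to be curated by people who know the literature; nothing in the strings themselves says that the labels name the same thing.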

 
53-names-for-one-cytokine.pdf

Azamat Abdoullaev

Nov 28, 2022, 11:54:00 AM
to ontolo...@googlegroups.com
JS: For human translations, the following examples are ones for which subject-matter knowledge is more important than native experience in the source language.  These are also areas for which even the best MT systems are useless.
 
1. Scientific texts that contain large numbers of rare words and symbols.  The probabilities of those symbols are so low that even language models with huge numbers of words from the same branch of science cannot make reliable translations.  Examples include new chemical compounds, organic molecules, and drugs.  A chemist can distinguish the name of a new chemical from a misspelling of an old chemical.  But an MT system that does not understand chemistry cannot...
John.
We are flooded by the data deluge. To avoid drowning in the ocean of information, we invent task-specific data-processing machines, led by the narrow and weak AI of ML models and deep NN algorithms.
All this is a substitute for one giant, superstrong Noah's Ark that would save all the world's creatures from a global deluge: a general, superstrong machine intelligence that would save us all from information overabundance, cognitive fog, and deepfake data clouds, rains, swamps and lakes.
What marks today's AI is its overspecialization in micro-domains.
We have all sorts and kinds of automated specialists, experts and professionals, from Lex Machina to Machine Scientists.
Legalese, the formal technical language, is not a big issue for large language models, pretrained models with some 200bn parameters (NN link weights) trained on half a trillion words.
They could be applied to any special domain with its own terminology, be it
  • bureaucratese
  • computerese
  • psychobabble
  • gobbledygook
  • technobabble
  • gibberish
  • bafflegab
  • educationese
That the LLM is essentially dumb and dull and mindless is only a big plus for the mumbo jumbo language.


John F Sowa

Nov 28, 2022, 1:44:52 PM
to ontolo...@googlegroups.com
Azamat>  That the LLM is essentially dumb and dull and mindless is only a big plus for the mumbo jumbo language.
 
No!  A big, dumb, dull,  and mindless system that is almost always good enough is extremely dangerous.
 
Any computer system that works successfully 99% of the time is a catastrophe waiting to happen.  Most people make little mistakes from time to time, but most people live to a ripe old age without a major disaster.
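
The arithmetic behind this can be sketched under a simplifying assumption that is not in the message itself: each use succeeds independently with the same probability. The function name is hypothetical.

```python
def prob_at_least_one_failure(success_rate: float, n_uses: int) -> float:
    """Probability of one or more failures in n independent uses."""
    return 1.0 - success_rate ** n_uses


# How quickly a 99% success rate erodes as the number of uses grows.
for n in (10, 69, 100, 1000):
    print(n, round(prob_at_least_one_failure(0.99, n), 3))
```

Under that assumption, at 69 uses at least one failure is already more likely than not, and by 100 uses the chance is roughly 63%.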
 
For language understanding, there are excellent ways to avoid misunderstanding:  Ask a question.  Restate the sentence in different words.  Use pointing.  Ask for a second opinion -- or even a third opinion.  If it's important or dangerous, sleep on it before acting.
 
Please note the example of the Tesla almost self-driving cars.  They are almost good enough that people have been driving while texting or even sleeping -- until the system makes a mistake and they die.
 
Those self-driving cars work well in well-designed, low-speed areas such as Google's campus and similar areas.  But the demanding tests are by the Carnegie Mellon teams in and around the city of Pittsburgh, Pennsylvania.  That's my old home town, and its roads, hills, bridges, and valleys are notoriously challenging.
 
The test experts who sit in the driver's seat with their hands in their laps admit that they have to grab the wheel at least once or twice on every trial run to avoid an accident.  That is far, far worse than even a novice human driver.
 
John
 

Adrian Walker

Nov 28, 2022, 2:04:01 PM
to ontolo...@googlegroups.com
JS wrote: A big, dumb, dull,  and mindless system that is almost always good enough is extremely dangerous.

Yes indeed.  However, in a not-so-dumb question answering system, we can (a) ask the user to approve a paraphrase of the question, and (b) automatically produce an English explanation of the answer.

That's what the Executable English online platform does.   
 


John F Sowa

Nov 28, 2022, 2:43:41 PM
to ontolo...@googlegroups.com
Barry and Alex,
 
Barry's attachment is a good refutation of the claims for LLMs.  In any branch of science, different researchers may use diverse methods with different terminology and different kinds of applications to diverse cases.
 
For both science and common sense, an LLM might provide a hint or a suggestion.  But any evaluation of those suggestions requires some reasoning and testing method that can analyze the options and generate an explanation.
 
In every branch of science, consensus is only attained *after* careful analysis and repetition of the critical experiments under a wide range of test cases.  Scientific consensus is *never* obtained by mixing and matching a bunch of words.
 
For anybody who has any doubts, please read the one-page pdf.
 
John
 
 

From: "Barry Smith" <ifo...@gmail.com>
53-names-for-one-cytokine.pdf

Ravi Sharma

Nov 28, 2022, 3:42:57 PM
to ontolo...@googlegroups.com
John and Barry and other distinguished communicators.
Thanks.

I agree that words alone do not create or prove theories, experiments and observations do!

Consensus is only one way, and it carries the danger of sometimes relying on crowd techniques for opinion creation. I am thinking of the Big Bang! There are probably other mechanisms for understanding the physical Universe, and as we extend the Standard Model, if we base all R&D (LHC, JWST, etc.) only on the BB, we may lead investments into a "Black Hole" (joking).

Language has been a major tool for expressing knowledge recently. Media enrichment of language has made it more explicit.
What can we, as ontologists, do with logic, besides domain-vocabulary alignments, separations, etc., to improve or create criteria for scientific verification, provide balance, and remove excess biases?

Barry - your one page is a great example of confusion among communities of usage!

Regards
Thanks.
Ravi
(Dr. Ravi Sharma, Ph.D. USA)
NASA Apollo Achievement Award
Chair, Ontology Summit 2022
Senior Enterprise Architect
Particle and Space Physicist
Elk Grove CA




Paola Di Maio

Nov 29, 2022, 12:41:17 AM
to ontolog-forum
Alex
just to follow your exhortation, and to be precise

I shared this exact reference to Galactica with the same quote days earlier on the W3C AI KR mailing list, where a rather well-informed discussion on language models in relation to KR is taking place.
Here is the exact thread (21 Nov)
The thread should be read in conjunction with other threads on the subject,
since Galactica is not the only model we study there.

It is an open mailing list, and my posts are reposted without
attribution.

May I invite people to sign up to our CG mailing list if they are interested in the topics we discuss there.

cheers
Paola Di Maio
W3C AI KR
Just saying



Alex Shkotin

Nov 29, 2022, 2:51:35 AM
to ontolo...@googlegroups.com
John,

Exactly! And this is a main direction of inquiry:
- What is the structure of the System of Sciences (SoS)?
- What is the structure of theoretical knowledge for one or another science (petrology and undirected graphs in my case) or technology?
- What kind of advantages do we get from formalizing part of theoretical knowledge?

Alex


Alex Shkotin

Nov 29, 2022, 3:19:10 AM
to ontolo...@googlegroups.com
Paola, 

Super! We are thinking about the same quotation, in different directions. My point is to emphasize that the developers of LLMs absolutely understand that the model can produce wrong answers, and not only understand it but urge their users to have the LLM output checked by an expert. For example, for Cicero [1], if we use it in real diplomacy we should have every AI output verified by a real diplomat :-)

I take it that the advice is for Galactica users, not "advice for its developers"; see

As far as AI is concerned, I'm just a spectator, i.e. a fan.

Alex


You received this message because you are subscribed to a topic in the Google Groups "ontolog-forum" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/ontolog-forum/_V0AZPcV_EA/unsubscribe.
To unsubscribe from this group and all its topics, send an email to ontolog-foru...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ontolog-forum/CAMXe%3DSo%3DkMAxMrSfRztgwqbG-gVZFPBnuGqWFyOjweSeQSWixQ%40mail.gmail.com.

Azamat Abdoullaev

Nov 29, 2022, 7:25:58 AM
to ontolo...@googlegroups.com
JS: Please note the example of the Tesla almost self-driving cars.  They are almost good enough that people have been driving while texting or even sleeping -- until the system makes a mistake and they die.
Right you are, but only partly.
The [Fake] AI/ML/DL is smartly divided into several general classes:
  • Generative AI, which learns the joint probability distribution p(x,y)
  • Discriminative AI, which learns the conditional probability distribution p(y|x)
with several subclasses:
  • Analytic AI
  • Functional AI
  • Descriptive AI...
Some examples of generative models:
  • Naïve Bayes
  • Bayesian networks
  • Markov random fields
  • Hidden Markov Models (HMMs)
  • Latent Dirichlet Allocation (LDA)
  • Generative Adversarial Networks (GANs)
  • Autoregressive models
Generative AI models are permitted anything: any uncertainty, biases and errors, any "emergent behavior", unlike the strict discriminative models of self-driving cars.
Prompt-triggered, they are unsupervised ML algorithms that enable computers to use existing content such as text, audio and video files, images and code to create new content: original artifacts that look like the real deal, be they articles, photos, pictures, art, videos, films, etc.
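
The difference between the two distributions can be made concrete with a toy example. The sketch below assumes a small hand-written joint table p(x, y) over hypothetical labels; it shows that the conditional p(y|x) can be recovered from the joint by normalizing with the marginal p(x), which is why a generative model carries at least as much information as a discriminative one.

```python
from collections import defaultdict

# Hypothetical joint distribution p(x, y); the four values sum to 1.
JOINT = {
    ("rain", "umbrella"): 0.30,
    ("rain", "no_umbrella"): 0.10,
    ("sun", "umbrella"): 0.05,
    ("sun", "no_umbrella"): 0.55,
}


def marginal_x(joint):
    """p(x) = sum over y of p(x, y)."""
    px = defaultdict(float)
    for (x, _y), p in joint.items():
        px[x] += p
    return dict(px)


def conditional_y_given_x(joint):
    """p(y | x) = p(x, y) / p(x)."""
    px = marginal_x(joint)
    return {(x, y): p / px[x] for (x, y), p in joint.items()}


COND = conditional_y_given_x(JOINT)
for key, value in COND.items():
    print(key, round(value, 3))
```

Going the other way, from p(y|x) back to p(x,y), is not possible without also knowing p(x).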


John F Sowa

Dec 1, 2022, 12:09:30 AM
to ontolo...@googlegroups.com
Alex and Paola,
 
Please note that there are professionals, called terminologists, who work for the United Nations, EU, and many other organizations for which maximum precision in choice of words, expressions, and translation is essential.
 
Their tools are based on precisely specified terminology, **NOT** on Large Language Models.
 
See the excerpt below.
 
John
_____________________________
 
 
Terminologists facilitate the editing and translation process by researching and locating information or past publications which might help language staff produce high-quality translations. Terminologists are dedicated professionals who ensure accuracy, appropriateness and consistency of usage of terms in the United Nations.

They advise and consult other UN offices and bodies who draft, translate or edit in their respective language, and answer queries and provide guidance in terminology usage.

Their tasks encompass monitoring documentations and identifying changes, developments or linguistic inconsistencies and variations in different areas of terminology such as organisational nomenclature, functional titles, administrative and budgetary matters, and various other areas.

Terminologists use various electronic tools for their trade. They investigate organizational and technological developments particularly in the field of machine-assisted translations and data bank systems for the improvement of efficiency and productivity, making suggestions as to the development of the United Nations Multilingual Terminology Database (UNTERM). They also rely on their extensive language skills to produce terminology that is clear and coherent.

 

Alex Shkotin

Dec 1, 2022, 2:49:58 AM
to ontolo...@googlegroups.com
John,

We are discussing the future. Is there any area today where LLMs are used by professionals in their practice?

Alex


John F Sowa

Dec 2, 2022, 11:17:59 PM
to ontolo...@googlegroups.com
Alex,
 
I cited terminologists from the UN as an example of a profession that requires a precisely defined set of terms that have precisely specified translations to and from each of the official languages.  They have had these requirements since they were founded almost 80 years ago, and they are not going to abandon them.  Other international groups, such as the EU, have similar requirements.
 
The international banking system for funds transfer also has very precise requirements.  Any errors in translations could create financial disasters.  I have spoken with programmers who worked in large international banks.  After lunch, one of them casually said "I have to go back and look for that $90 billion I lost."   There were enough checks and balances that he found it without too much trouble.
 
Better automated systems could help support those transactions, but the emphasis is on precision with a limited terminology, not on approximations with an immense terminology.
 
All these issues have been true for many years in the past, and they are not going to change in the future.  These systems require the exact opposite of Large Language Models.  They are designed for Restricted Language Models that have a limited number of precisely defined terms that have exact translations to and from terms in each of the designated languages.
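
A restricted terminology of this kind behaves like an exact lookup table that refuses to guess. The following is a minimal sketch, assuming a tiny illustrative termbase; the entries are not taken from UNTERM, and `translate_term` and `UnknownTermError` are hypothetical names.

```python
# Hypothetical miniature termbase: approved terms with exact translations.
TERMBASE = {
    "treaty": {"fr": "traité", "es": "tratado"},
    "ceasefire": {"fr": "cessez-le-feu", "es": "alto el fuego"},
}


class UnknownTermError(KeyError):
    """Raised when no approved translation exists."""


def translate_term(term, target_lang, termbase=TERMBASE):
    """Return the approved translation, or fail loudly instead of approximating."""
    try:
        return termbase[term][target_lang]
    except KeyError:
        raise UnknownTermError(
            f"no approved {target_lang!r} translation for {term!r}"
        ) from None


print(translate_term("treaty", "fr"))

try:
    translate_term("armistice", "fr")
except UnknownTermError as err:
    print("rejected:", err)
```

The design choice is the error path: an unknown term goes back to a human terminologist rather than being filled in by a statistically likely guess.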
 
Prediction:  LLMs are not on the path toward more intelligent systems.  They can be useful as a component in systems that have suitable testing, error checking, and correction.  But the methods for testing and checking would have to use more traditional AI reasoning systems.
 
John
 

Alex Shkotin

Dec 3, 2022, 12:42:46 AM
to ontolo...@googlegroups.com
John,

Exactly! There are a lot of domains where we have terminology with precise definitions, which is what we need for a good translation. And this is not just one clear term from one area and another clear term from another area. We have a system of definitions which is the backbone of the theory of structures and processes in one or another domain.
Domain --> primary terms --> axioms --> system of definitions for secondary terms --> hypotheses/theorems (proofs)
and we have a theory for this particular domain:-)

But let me get back to the topic of this thread: we have Galactica and Cicero, to mention the latest projects. What they can do is interesting.
And we know precisely from the creators of the LLMs: "Language Models can Hallucinate".

My point from the last Ontology Summit planning meeting is this: let's take any formal ontology, for example from OBO Foundry, and check which sciences are involved, and where in the theoretical texts of these sciences the formal statements of our ontology are situated.
Then we may embed these formal statements into the usual theoretical texts with suitable markup, practically in the way we embed math text.
The advantage: the experts responsible for the theoretical text itself will check the correctness of the formalization on their own, or with the help of math people.
No silos, you know:-)
Partial extraction of formalizations from theoretical texts is the duty of a task manager: I have a task, and I am looking for the formalizations I need for this task. This is how we use theoretical textbooks:-)


Alex


Alex Shkotin

Dec 3, 2022, 1:41:34 AM
to ontolo...@googlegroups.com
Paola,

Another approach may also be interesting, described like this: "The researchers hadn’t spotted the crucial pattern in their data themselves." [1] Pattern, you know:-)
And later "Symbolic regression algorithms are distinct from deep neural networks, the famous artificial intelligence algorithms that may take in thousands of pixels, let them percolate through a labyrinth of millions of nodes, and output the word “dog” through opaque mechanisms. Symbolic regression similarly identifies relationships in complicated data sets, but it reports the findings in a format human researchers can understand: a short equation."
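
The idea in the quotation, reporting a finding as a short equation, can be shown with a toy sketch. It assumes a tiny hand-enumerated set of candidate formulas and synthetic data generated here from y = 2*x**2; real symbolic regression systems search a vastly larger expression space, so this illustrates the output format and not any actual algorithm.

```python
# Synthetic data, generated from y = 2 * x**2 purely for illustration.
XS = [1.0, 2.0, 3.0, 4.0]
YS = [2.0 * x ** 2 for x in XS]

# A tiny, hand-enumerated hypothesis space of human-readable equations.
CANDIDATES = {
    "y = 2*x": lambda x: 2 * x,
    "y = x**2": lambda x: x ** 2,
    "y = 2*x**2": lambda x: 2 * x ** 2,
    "y = x**3": lambda x: x ** 3,
}


def squared_error(f, xs, ys):
    """Sum of squared residuals of candidate f on the data."""
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys))


def best_equation(candidates, xs, ys):
    """Return the name of the candidate with the smallest squared error."""
    return min(candidates, key=lambda name: squared_error(candidates[name], xs, ys))


print(best_equation(CANDIDATES, XS, YS))
```

The result is a string a researcher can read and test, in contrast to a vector of network weights.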

To math or not to math, that is the question:-)

Alex



Anatoly Levenchuk

Dec 3, 2022, 5:09:48 AM
to ontolo...@googlegroups.com

John,
1. LLMs have emergent properties that appear with size. See https://arxiv.org/abs/2206.07682, and the exciting .gif picture at https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html. There is a theory about such emergence in random structures: https://www.quantamagazine.org/elegant-six-page-proof-reveals-the-emergence-of-random-structure-20220425/. We will see the new properties of GPT-4.

 

2. An LLM seems to be only one part of a higher-level cognitive architecture. Most "applications" of LLMs contain not only an LLM but much more (interfaces, checkers, another LLM or a well-trained neural network for a particular task, an adversarial architecture with several networks, etc.). Therefore, the theoretical limit of an LLM's cleverness does not apply to these cognitive architectures with an LLM inside. There we have the usual system emergence. The intellectual properties of such a complex system are not reducible to the LLM's properties alone.
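
One simple form of such an architecture is "propose, then check". Below is a minimal sketch, assuming a fixed list of strings as a stand-in for model output and an exact arithmetic checker; `first_verified` and `arithmetic_check` are hypothetical names, and no real model is called.

```python
from typing import Callable, Iterable, Optional


def first_verified(
    candidates: Iterable[str], check: Callable[[str], bool]
) -> Optional[str]:
    """Return the first candidate the checker accepts, or None if all fail."""
    for candidate in candidates:
        if check(candidate):
            return candidate
    return None


# Stand-in for model output: hypothetical answers to "17 * 3", one of them wrong.
proposals = ["41", "51"]


def arithmetic_check(answer: str) -> bool:
    """Independent verification that does not rely on the proposer."""
    return answer.isdigit() and int(answer) == 17 * 3


print(first_verified(proposals, arithmetic_check))
print(first_verified(["41", "fifty-one"], arithmetic_check))
```

The reliability of the whole then depends on the checker, not on the proposer, which is the sense in which the system's properties differ from those of the LLM alone.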


Best regards,
Anatoly Levenchuk

 

