Why do the language model and the vision model align?


John Clark

Feb 7, 2026, 6:55:59 AM (6 days ago) Feb 7
to ExI Chat, extro...@googlegroups.com, 'Brent Meeker' via Everything List

The following quote is from the above: 

"More powerful AI models seem to have more similarities in their representations than weaker ones. Successful AI models are all alike, and every unsuccessful model is unsuccessful in its own particular way.[...] He would feed the pictures into the vision models and the captions into the language models, and then compare clusters of vectors in the two types. He observed a steady increase in representational similarity as models became more powerful. It was exactly what the Platonic representation hypothesis predicted."
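The measurement described in the quote can be illustrated with a toy sketch. This is not the researcher's actual code: the embeddings below are invented 2-D points, and mutual k-nearest-neighbor overlap is just one simple alignment metric one could use for such a comparison. The idea is that for each image/caption pair, the image's nearest neighbors among the vision-model embeddings should be the same items as the caption's nearest neighbors among the language-model embeddings, if the two models organize the world alike.

```python
# Toy illustration (not the actual study's code): compare the neighborhood
# structure of a vision model's image embeddings with a language model's
# caption embeddings. High overlap = similar internal representations.
# All vectors below are invented 2-D stand-ins for real embeddings.

def knn_indices(vecs, i, k):
    """Indices of the k nearest neighbors of vecs[i] (squared Euclidean, excluding i)."""
    dists = sorted((sum((a - b) ** 2 for a, b in zip(vecs[i], v)), j)
                   for j, v in enumerate(vecs) if j != i)
    return {j for _, j in dists[:k]}

def mutual_knn_alignment(vision_embs, text_embs, k=2):
    """Average fraction of shared nearest neighbors across the two spaces.

    Item i's image embedding and its caption embedding should have the
    same neighbors (by item index) if the two models carve up the world alike.
    """
    n = len(vision_embs)
    return sum(len(knn_indices(vision_embs, i, k) & knn_indices(text_embs, i, k)) / k
               for i in range(n)) / n

# Five items: two clusters of similar things plus one outlier, as "seen" by
# two hypothetical models that agree on the overall layout.
vision = [(0, 0), (1, 0), (10, 10), (11, 10), (50, 50)]
text   = [(0, 1), (1, 1), (10, 11), (11, 11), (51, 50)]
print(mutual_knn_alignment(vision, text))  # → 1.0: neighborhood structures match
```

The reported finding is that a score of this general kind rises as models become more capable; a weaker model would place items in neighborhoods the other model does not share, pulling the score toward zero.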

In my opinion the above finding has profound philosophical implications. 

John K Clark    See what's on my new list at  Extropolis

Brent Meeker

Feb 7, 2026, 11:53:16 PM (5 days ago) Feb 7
to everyth...@googlegroups.com
Humans perceive a common reality and invented language and images to represent it. 

I wonder where this goes when we consider one of our most successful representations of reality: quantum mechanics, in its various formulations.

Brent

--
You received this message because you are subscribed to the Google Groups "Everything List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to everything-li...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/everything-list/CAJPayv0x3DLq-Ekcxg%2B9yftO9s5WWpTLeUD%2BJDRWFKhMQyZ3Uw%40mail.gmail.com.

John Clark

Feb 8, 2026, 7:13:45 AM (5 days ago) Feb 8
to Stefano Ticozzi, ExI chat list, extro...@googlegroups.com, 'Brent Meeker' via Everything List
On Sat, Feb 7, 2026 at 12:14 PM Stefano Ticozzi <stefano...@gmail.com> wrote:

Scientific thought has long since moved beyond Platonism,

Philosophical thought perhaps, but scientific thought never embraced Platonism, because the most famous of the ancient Greeks were good philosophers but lousy scientists. Neither Socrates, Plato, nor Aristotle used the Scientific Method. Aristotle wrote that women had fewer teeth than men; it is known that he was married, twice in fact, yet he never thought of simply looking into his wife's mouth and counting. Today, thanks to AI, some very abstract philosophical ideas can for the first time actually be tested scientifically. 

1. Ideas do not exist independently of the human mind.  Rather, they are constructs we develop to optimize and structure our thinking.

True but irrelevant.  
 
2. Ideas are neither fixed, immutable, nor perfect; they evolve over time, as does the world in which we live—in a Darwinian sense. For instance, the concept of a sheep held by a human prior to the agricultural era would have differed significantly from that held by a modern individual.

The meanings of words and of groups of words evolve over the eons in fundamental ways, but camera pictures do not.  And yet minds educated by those two very different things become more similar as they become smarter. That is a surprising revelation that has, I think, interesting implications. 

In my view, the convergence of AI “ideas” (i.e., language and visual models) is more plausibly explained by a process of continuous self-optimization, performed by systems that are trained on datasets and information which are, at least to a considerable extent, shared across models.

Do you claim that the very recent discovery that minds trained exclusively on words and minds trained exclusively on pictures behave similarly, and that the smarter those two minds become the greater the similarity, has no important philosophical ramifications? 

John K Clark    See what's on my new list at  Extropolis


Brent Meeker

Feb 8, 2026, 3:34:24 PM (5 days ago) Feb 8
to everyth...@googlegroups.com


On 2/8/2026 4:13 AM, John Clark wrote:
Do you claim that the very recent discovery that the behavior of minds that are trained exclusively by words and minds that are trained exclusively by pictures are similar and the discovery that the smarter those two minds become the greater the similarities, has no important philosophical ramifications? 

Words were invented to describe what we see and otherwise experience, but since sight has the highest information bandwidth of the senses, the convergence is most noticeable for sight.

What do you think the philosophical implications are?

Brent



John Clark

Feb 9, 2026, 6:37:33 AM (4 days ago) Feb 9
to Stefano Ticozzi, ExI chat list, extro...@googlegroups.com, 'Brent Meeker' via Everything List
On Sun, Feb 8, 2026 at 10:30 AM Stefano Ticozzi <stefano...@gmail.com> wrote:

The article you linked here appeared to refer to a convergence toward a Platonic concept of the Idea; it therefore seemed relevant to recall that Platonic Ideas have been extensively demonstrated to be “false” by science.

No. You can't use a tape measure to prove that a poem is "false". Science deals with what you can see, hear, feel, taste, and smell; Plato was dealing with the metaphysical, the underlying nature of being. However, far from disproving it, in the 20th century Quantum Mechanics actually gave some support to Plato's ideas. In Plato's Allegory of the Cave we can see only the "shadows" of the fundamental underlying reality, and in a similar way modern physics says we can observe reality only through a probability (not a certainty) obtained from the Quantum Wavefunction.  

human language has grown and developed around images, driven almost exclusively by the need to emulate the sense of sight.

We may not be able to directly observe fundamental underlying reality but we are certainly affected by it, and over the eons human language has been optimized to maximize the probability that one's genes get into the next generation. So although words are not the fundamental reality they must be congruent with it. That has been known for a long time but very recently AI has taught us that the connection is much deeper and far more subtle than previously suspected. 

Just a few years ago many people (including me) were saying that words were not enough, and that for a machine to be truly intelligent it would need a body, or at least sense organs that can interact with the real physical world. But we now know that is untrue. It is still not entirely clear, at least not to me, exactly how it is possible for words alone to produce intelligence, but it is an undeniable fact that somehow they do.

John K Clark    See what's on my new list at  Extropolis


Brent Meeker

Feb 9, 2026, 8:46:30 PM (3 days ago) Feb 9
to everyth...@googlegroups.com


On 2/9/2026 3:36 AM, John Clark wrote:
Just a few years ago many people (including me) were saying that words were not enough and that for a machine to be truly intelligent it would need a body, or at least sense organs that can interact with the real physical world. But we now know that is untrue. It is still not entirely clear, at least not to me, exactly how it is possible for words alone to do that, but it is an undeniable fact that somehow it is.
Isn't it just a matter of bandwidth? An image contains a lot more information than a paragraph taking up the same space on a page, so in principle it provides a lot more bandwidth. But much of that is not available to us, because we don't process images to the finest degree of our vision (hence the possibility of "hidden" messages in pictures). So I don't think it's because words convey more information than we thought; it's because images convey less. Part of the reason they convey less is that we don't see everything in an image; also, we don't necessarily have the connections to other concepts that would help us remember the details of an image, except when we have words or a name for the detail.
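The bandwidth point can be made concrete with a back-of-envelope calculation. The numbers below are assumed round figures, not measurements: the raw pixel data of a modest photo versus a rough entropy estimate for a paragraph occupying similar page space (Shannon's classic estimate for English text is on the order of one bit per character).

```python
# Back-of-envelope comparison (assumed round numbers, not measurements).

# Raw data in a modest 600x400 photo: 3 color channels at 8 bits each.
image_bits = 600 * 400 * 3 * 8       # 5,760,000 bits

# A ~100-word paragraph at ~5 characters per word, taking Shannon's
# classic estimate of roughly 1 bit of entropy per character of English.
paragraph_bits = 100 * 5 * 1         # ~500 bits

print(image_bits // paragraph_bits)  # → 11520: a raw ratio on the order of 10^4
```

Most of those raw image bits never reach conscious processing, which is the point above: the gap lies in what we extract from an image, not only in what is physically there.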

Brent

PGC

Feb 11, 2026, 7:40:52 AM (2 days ago) Feb 11
to Everything List
This is all missing the forest for the trees. Despite its utility for coding, anyone can spot AI poems, songs, and writing a mile away. But that isn't the point. The point is what is happening in economic terms; see here:
https://www.youtube.com/watch?v=NIXd3PEbsNk

And what is happening in the subversion of criminal justice for profit, and in the control of allegedly democratic voting populations and of people in authoritarian states:

Yes, I know these videos are long, but that's what depth and nuance on such complex topics require. The optimism around AI is baffling when those themes are considered in depth. And yes, those videos were produced by a computer gamer, not the NYT or some liberal darling outlet. But at this point, the frustration of someone who just wants to play affordable games and is prevented from doing so for the foreseeable future, because of AI, the companies that exploit it, economic considerations, and their implications for self-determination, turns any singularity into mere dominion by a few insecure folks in need of therapy.

Pronouns don't kill; they have not hurt anybody physically, robbed anyone of the chance to make a living, or searched homes without discernible justification. Secret police, fed by the allegedly beneficial AI that is supposed to create this utopian progress-narrative future, already perform all of these tasks daily. 

John Clark

Feb 11, 2026, 8:53:25 AM (2 days ago) Feb 11
to everyth...@googlegroups.com
On Sat, Feb 7, 2026 at 11:53 PM Brent Meeker <meeke...@gmail.com> wrote:


>> Why do the language model and the vision model align? Because they’re both shadows of the same world

Humans perceive a common reality and invented language and images to represent it. 
I wonder where this goes when considering several of our most successful representations of reality, quantum mechanics in its various formulations?


It would be even more interesting if an AI program were run on a quantum computer. In his 1986 book "The Ghost in the Atom" David Deutsch proposed a way to test Everett's Many Worlds idea. The experiment involves a quantum computer and would be very difficult to perform, but Deutsch argues that is not Many Worlds' fault: the conventional view says conscious observers obey different laws of physics, while Many Worlds says they do not, so to test who is right we need a mind that is intelligent (and therefore can be presumed to be conscious) and that uses quantum properties.

In Deutsch's experiment to prove or disprove the existence of many worlds other than this one, a conscious quantum computer shoots electrons (or photons, or some other subatomic particle), one at a time, at a metal plate that has two small slits in it. The quantum computer has detectors near each slit, so it knows which slit each electron went through. The quantum mind then signs a document for each and every electron, saying it has observed the electron and knows which slit it went through. It is very important that the document does NOT say which slit the electron went through; it says only that the electron went through one and only one slit and that the mind has knowledge of which one. Then, just before the electron hits the photographic plate, the mind uses quantum erasure to completely destroy its memory of which slit the electron went through, but all other memories, including all the documents, remain undamaged. 

After the document is signed, the electron continues on its way and hits the photographic plate. Then, after thousands of electrons have been observed and all which-way information has been erased, develop the photographic plate and look at it. If you see interference bands, then the Many Worlds interpretation is correct. If you do not see interference bands, then there are no worlds but this one and the conventional interpretation is correct.

Deutsch is saying that in the Copenhagen interpretation, when the result of a measurement enters the consciousness of an observer the wave function collapses; in effect all the universes except one disappear without a trace, so you get no interference. In the Many Worlds model, the two universes converge back into one when the electrons hit the photographic film, because the two universes will no longer be different (even though they had different histories), but their influence will still be felt. In the merged universe you'll see indications that the electron went through slit X only and indications that it went through slit Y only, and that is what causes interference.
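The interference logic described above can be sketched numerically with a toy two-amplitude model (illustrative parameters, not a simulation of Deutsch's full experiment): when the two slit amplitudes remain coherent because the which-way record was erased, detection intensity goes as |ψ_X + ψ_Y|² and fringes appear; when which-way information survives, the probabilities add as |ψ_X|² + |ψ_Y|² and the pattern is flat.

```python
# Toy two-slit model (illustrative, not a full quantum simulation):
# compare intensities with and without coherence between the slit amplitudes.
import cmath

def intensities(phases, coherent):
    out = []
    for phi in phases:
        psi_x = 1.0                    # amplitude via slit X
        psi_y = cmath.exp(1j * phi)    # amplitude via slit Y, relative phase phi
        if coherent:                   # which-way info erased: amplitudes add
            out.append(abs(psi_x + psi_y) ** 2)
        else:                          # which-way info kept: probabilities add
            out.append(abs(psi_x) ** 2 + abs(psi_y) ** 2)
    return out

phases = [k * 0.5 for k in range(13)]          # relative phase across the screen
fringes = intensities(phases, coherent=True)   # varies between ~0 and 4
flat    = intensities(phases, coherent=False)  # constant 2.0

print(max(fringes) - min(fringes) > 1.0)  # True: visible interference bands
print(max(flat) - min(flat) < 1e-9)       # True: no bands
```

This is the distinction the developed film would reveal in Deutsch's proposal: a flat pattern means the erased which-way record really did collapse the alternatives, while fringes mean both branches persisted and recombined.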