--You received this message because you are subscribed to the Google Groups "Everything List" group.To unsubscribe from this group and stop receiving emails from it, send an email to everything-li...@googlegroups.com.To view this discussion on the web visit https://groups.google.com/d/msgid/everything-list/CAJPayv17gDvhRbpF3p-HDFZzE9krv3ETeqCYmPSsSqH-qNESNw%40mail.gmail.com.
On Mon, Mar 20, 2023 at 4:25 AM Telmo Menezes <te...@telmomenezes.net> wrote:

> Are you worried that some of us are not being sufficiently obsequious?

No, I'm not worried about that, because fortunately GPT-4 has not been behaving like the biblical Yahweh. I have seen no evidence that GPT-4 demands, or even would enjoy, constant flattery by humans. All I want is for you to look at this video and then do the rational thing and retract your claim that GPT-4 is "not even close" to human intelligence.

> I don't understand your preoccupation John.

You don't?! Can you think of anything more important to be preoccupied with? Can you think of anything that has happened in the world in your lifetime that was more significant than passing the Turing Test with flying colors? I can't.

> If GPT-4 is indeed close to human intelligence, this will become undeniable in the next few weeks.

It's been undeniable to all rational observers since last Tuesday, but you denied it.

> I want to discuss scientific research and peer-reviewed academic articles, but you want me to get excited about YouTube clickbait instead. What happened to you John?
> You are SUPER EXCITED about ChatGPT but you do not give a shit about the fundamentals of machine learning
> Human beings can form coherent memories and are capable of long-term goals, strategy and slow thinking -- the Turing complete kind.
> I have even seen people now claim that ChatGPT is good at chess. It is incredibly good at chess given that it is a language model trained with chess books
> Is it capable of navigating a min-max tree? Of course not, because it lacks recurrence. It cannot possibly win against older generation AIs
> you want to convince me that ChatGPT is the answer to everything.
> Ok, maybe you are right and I am crazy.
The video John shared is worth watching. This is significant. It is now solving complex math problems which require a long sequence of steps.
Over-fitting is less of an issue here because it's trivial to write a sentence that's never before been written by any human in history.
You can tweak the parameters of the problem to guarantee it's a problem that has never been seen before, and it can still solve it.
You can choose to wait for the academic write ups to come out a few months down the line but by then things will have advanced another few levels from where we are today.
On Mon, Mar 20, 2023, at 14:28, Jason Resch wrote:

> The video John shared is worth watching. This is significant. It is now solving complex math problems which require a long sequence of steps.

I agree that it is significant and extremely impressive. I never said the opposite. What baffles me is that John is now requiring religious reverence towards a scientific result, and criticizing me when I ask questions that are part of the same standard machine learning methodology that got us here.

> Over-fitting is less of an issue here because it's trivial to write a sentence that's never before been written by any human in history.

That is not enough. A small variation on a standard IQ test is still the same IQ test for a super powerful pattern detector such as GPT-4.

I have no doubt that GPT-4 can generalize in its domain. It was rigorously designed and tested for that by people who know what they are doing. My doubt is that you can give it an IQ test and claim "OMG GPT-4 IQ > 140". This is just silly and it is junk science.

> You can tweak the parameters of the problem to guarantee it's a problem that has never been seen before, and it can still solve it.

Some yes, some no. Almost one century of computer science still applies.

> You can choose to wait for the academic write ups to come out a few months down the line but by then things will have advanced another few levels from where we are today.

I am not waiting for anything. I am asking questions that can be addressed right now:

- Are there IQ tests in the training data of GPT-4? Yes or no?
- Can we conceive of human-level intelligence without recurrent connections or some form of ongoing recursivity / Turing completeness? Yes or no?
On Mon, Mar 20, 2023 at 10:15 AM Jason Resch <jason...@gmail.com> wrote:
Jason, that was a very interesting and insightful post, thanks for posting it.
>> Over-fitting is less of an issue here because it's trivial to write a sentence that's never before been written by any human in history.

> That is not enough. A small variation on a standard IQ test is still the same IQ test for a super powerful pattern detector such as GPT-4.
>
> I have no doubt that GPT-4 can generalize in its domain. It was rigorously designed and tested for that by people who know what they are doing. My doubt is that you can give it an IQ test and claim OMG GPT-4 IQ > 140. This is just silly and it is junk science.

It's true that once one learns a way to solve problems it becomes easier to reapply that method when you next encounter a related problem.

But isn't that partly what intelligence is? If a system has read the whole Internet and seen every type of problem we know how to solve, and it can generalize to know what method to use in any situation, that's an incredible level of intelligence which, until now, we haven't had in machine form.

>> You can tweak the parameters of the problem to guarantee it's a problem that has never been seen before, and it can still solve it.

> Some yes, some no. Almost one century of computer science still applies.

>> You can choose to wait for the academic write ups to come out a few months down the line but by then things will have advanced another few levels from where we are today.

> I am not wanting to wait for anything, I am asking questions that can be addressed right now:
> - Are there IQ tests in the training data of GPT-4. Yes or no?
> - Can we conceive of human-level intelligence without recurrent connections or some form of ongoing recursivity / Turing completeness? Yes or no?

I've been thinking about this a lot.

My friend with access to GPT-4 asked it: "Does your neural network contain such reflexive loops, or is it strictly feed forward?" Below is its reply:

--------
"As an AI language model, my underlying architecture is based on the Transformer model, which is primarily feedforward in nature but includes some elements of recurrence or reflexivity, specifically in the self-attention mechanism.

The Transformer model consists of a multi-layer feedforward neural network with self-attention mechanisms that allow the model to learn complex relationships between input and output sequences. The self-attention mechanism weighs the importance of different input elements relative to each other, effectively capturing long-range dependencies and relationships within the input data. This mechanism introduces a form of reflexivity or recurrence, as it allows the model to consider the context of the entire input sequence when generating outputs.

However, it is important to note that the self-attention mechanism in the Transformer model is not equivalent to traditional recurrent neural networks (RNNs), which maintain a hidden state that is updated at each time step. The Transformer model processes input sequences in parallel rather than sequentially, which makes it fundamentally different from RNNs.

In summary, while my neural network architecture is primarily feedforward, it includes some elements of reflexivity in the form of self-attention mechanisms that allow the model to capture complex relationships within input sequences."
--------

Is this enough to meet Hofstadter's requirements of recursion? I do not have the expertise to say. But I do see recursion exist in a way no one seems to ever mention:

The output of the LLM is fed back in, as input to the LLM that produced it. So all the high-level processing of the network, used to produce a few characters of output, then reaches back down to affect the lowest level: the input layers of the network.

If you asked the network where the input that it sees came from, it would have no other choice but to refer back to itself, as "I": "I generated that text."

Loops are needed to maintain and modify a persistent state or memory, to create a strange loop of self-reference, and to achieve Turing completeness. But a loop need not exist entirely in the "brain" of an entity; it might offload part of the loop into the environment in which it is operating. I think that is the case for things like thermostats, guided missiles, AlphaGo, and perhaps even ourselves.

We observe our own actions, and they become part of our sensory awareness and input. We cannot say exactly where they came from or how they were done, aside from modeling an "I" who seems to intercede in physics itself, but this is a consequence of being a strange loop. In a sense, our actions do come in from "on high", a higher level of abstraction in the hierarchy of processing, and this seems as if it is a dualistic interaction by a soul in heaven, as Descartes described.

In the case of GPT-4, its own output buffer can act as a scratch pad memory buffer, to which it continuously appends its thoughts. Is this not a form of memory and recursion?

For one of the problems in John's video, it looked like it solved a Chinese remainder theorem problem in a series of discrete steps. Each step is written to and saved in its output buffer, which becomes readable as its input buffer.

Given this, I am not sure we can say that GPT-4, in its current architecture and implementation, is entirely devoid of a memory, or a loop/recursion.

I am anxious to hear your opinion though.

Jason
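
To make the output-buffer argument concrete, here is a minimal Python sketch under loudly stated assumptions: `step` is a hypothetical stand-in for one feed-forward pass (it is not how GPT-4 works internally), and the three congruences are an illustrative textbook instance, not the problem from the video. The only point is that a function with no hidden state can still carry a multi-step computation forward if its own output is appended to what it reads next.

```python
# Illustrative congruences (an assumption, not taken from the video):
# find x with x % 3 == 2, x % 5 == 3, x % 7 == 2.
CONGRUENCES = [(2, 3), (3, 5), (2, 7)]  # (remainder, modulus) pairs


def step(scratchpad):
    """One stateless pass: everything it knows about progress is read
    back from the scratchpad; nothing survives between calls."""
    if not scratchpad:
        candidate = 0
    else:
        # Recover the last candidate from the previously written line.
        candidate = int(scratchpad[-1].split()[1]) + 1
    ok = all(candidate % modulus == remainder
             for remainder, modulus in CONGRUENCES)
    return f"try {candidate} {'ANSWER' if ok else 'no'}"


def generate(max_steps=200):
    """Outer loop: each output line is appended and becomes future input."""
    scratchpad = []
    for _ in range(max_steps):
        line = step(scratchpad)
        scratchpad.append(line)
        if line.endswith("ANSWER"):
            break
    return scratchpad


pad = generate()
print(pad[-1])
```

The loop lives outside `step`, in the "environment" that appends and re-feeds the buffer, which is the sense in which the recurrence is offloaded rather than built into the network.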
> the important methodological distinction here is between learning intelligent behavior and demonstrating intelligent behavior. Obviously it is possible to learn and generalize from a dataset, otherwise there would be no point in wasting time with ML. But if you want to convince other people that you have indeed achieved generalization, then the scientific gold standard is to demonstrate this on data that was not used in training,
> I am really just insisting on sticking to the scientific attitude.
> I do not understand what I could be saying that is so controversial...
> There is still a huge chasm between Human Intelligence (HI) and GPT-4.
> How long will it take to cross that chasm?
> But this only goes so far. It can never defeat a competent chess player with such an architecture. Of course, we can integrate GPT-4 with some API and let it call some explore_deep_tree() function, but this is not the sort of deep integration that one imagines in sophisticated AI. True recurrence would allow for true computational power within the model.
>>> Over-fitting is less of an issue here because it's trivial to write a sentence that's never before been written by any human in history.

>> That is not enough. A small variation on a standard IQ test is still the same IQ test for a super powerful pattern detector such as GPT-4.
>>
>> I have no doubt that GPT-4 can generalize in its domain. It was rigorously designed and tested for that by people who know what they are doing. My doubt is that you can give it an IQ test and claim OMG GPT-4 IQ > 140. This is just silly and it is junk science.

> It's true that once one learns a way to solve problems it becomes easier to reapply that method when you next encounter a related problem.
>
> But isn't that partly what intelligence is? If a system has read the whole Internet and seen every type of problem we know how to solve, and it can generalize to know what method to use in any situation, that's an incredible level of intelligence which until now, we haven't had in machine form before.

I would say that the important methodological distinction here is between learning intelligent behavior and demonstrating intelligent behavior. Obviously it is possible to learn and generalize from a dataset, otherwise there would be no point in wasting time with ML. But if you want to convince other people that you have indeed achieved generalization, then the scientific gold standard is to demonstrate this on data that was not used in training, because beyond generalization there can also be (and often is) overfitting. This is not a controversial statement. Take any published ML result and apply it to the training data, and 99.9999999% of the time it will perform better / much better on the training data, because it also learned the little details (over-fitting) that guide it towards the correct answer.

An extreme case of this is stock trading. I am not kidding, and I suspect you know it: I can easily produce an ML model that achieves >1000% profit per month on the derivatives market, as long as we only test on in-corpus data. But I will raise the stakes! Are you ready?

I promise I will train my algorithm only on ONE crypto coin from 2020 to 2022. Then we will apply it to OTHER crypto coins. I still promise >1000% profit per month. Do you want it now?

I understand that GPT-4 is trained on most available text in natural language. That is amazing, I love it. But this comes with additional methodological challenges. I am pretty sure that the GPT-4 team knows about them, and they probably have a rigorously reserved training set to guide their own research. Also, I fully believe that they are serious researchers and would never embark on this IQ test bullshit.

I am really just insisting on sticking to the scientific attitude. I do not understand what I could be saying that is so controversial...

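
The in-corpus versus held-out gap can be shown with a toy Python sketch. Everything here is an assumption made for illustration: the data is synthetic, the labels are pure coin flips (so there is nothing real to generalize), and the "model" is a 1-nearest-neighbour memorizer rather than anything resembling GPT-4 or a trading system.

```python
import random


def make_points(rng, n):
    # Each 2-D point gets a coin-flip label: there is no pattern to learn.
    return [((rng.random(), rng.random()), rng.randint(0, 1)) for _ in range(n)]


def predict(train, x):
    # 1-nearest-neighbour: return the label of the closest memorized point.
    nearest = min(
        train,
        key=lambda p: (p[0][0] - x[0]) ** 2 + (p[0][1] - x[1]) ** 2,
    )
    return nearest[1]


def accuracy(train, data):
    hits = sum(predict(train, x) == y for x, y in data)
    return hits / len(data)


rng = random.Random(0)            # fixed seed so the run is repeatable
train = make_points(rng, 200)     # data the memorizer has seen
held_out = make_points(rng, 200)  # data it has never seen

train_acc = accuracy(train, train)
held_out_acc = accuracy(train, held_out)
print(f"in-corpus accuracy: {train_acc:.2f}")
print(f"held-out accuracy:  {held_out_acc:.2f}")
```

Scored on its own training points the memorizer looks perfect, because each point's nearest neighbour is itself. On held-out points it has nothing but noise to go on, which is why only the second number says anything about generalization.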
>>> You can tweak the parameters of the problem to guarantee it's a problem that has never been seen before, and it can still solve it.

>> Some yes, some no. Almost one century of computer science still applies.

>>> You can choose to wait for the academic write ups to come out a few months down the line but by then things will have advanced another few levels from where we are today.

>> I am not wanting to wait for anything, I am asking questions that can be addressed right now:
>> - Are there IQ tests in the training data of GPT-4. Yes or no?
>> - Can we conceive of human-level intelligence without recurrent connections or some form of ongoing recursivity / Turing completeness? Yes or no?

> I've been thinking about this a lot.
>
> My friend with access to GPT-4 asked it: "Does your neural network contain such reflexive loops, or is it strictly feed forward?", below is its reply:
>
> --------
> "As an AI language model, my underlying architecture is based on the Transformer model, which is primarily feedforward in nature but includes some elements of recurrence or reflexivity, specifically in the self-attention mechanism.
>
> The Transformer model consists of a multi-layer feedforward neural network with self-attention mechanisms that allow the model to learn complex relationships between input and output sequences. The self-attention mechanism weighs the importance of different input elements relative to each other, effectively capturing long-range dependencies and relationships within the input data. This mechanism introduces a form of reflexivity or recurrence, as it allows the model to consider the context of the entire input sequence when generating outputs.
>
> However, it is important to note that the self-attention mechanism in the Transformer model is not equivalent to traditional recurrent neural networks (RNNs), which maintain a hidden state that is updated at each time step. The Transformer model processes input sequences in parallel rather than sequentially, which makes it fundamentally different from RNNs.
>
> In summary, while my neural network architecture is primarily feedforward, it includes some elements of reflexivity in the form of self-attention mechanisms that allow the model to capture complex relationships within input sequences."
> --------
>
> Is this enough to meet Hofstadter's requirements of recursion? I do not have the expertise to say. But I do see recursion exist in a way no one seems to ever mention:
>
> The output of the LLM is fed back in, as input to the LLM that produced it. So all the high level processing and operation of the network at the highest level, used to produce a few characters of output, then reaches back down to the lowest level to affect the lowest level of the input layers of the network.
>
> If you asked the network, where did that input that it sees come from, it would have no other choice but to refer back to itself, as "I". "I generated that text."
>
> Loops are needed to maintain and modify a persistent state or memory, to create a strange loop of self-reference, and to achieve Turing completeness. But a loop may not exist entirely in the "brain" of an entity, it might offload part of the loop into the environment in which it is operating. I think that is the case for things like thermostats, guided missiles, AlphaGo, and perhaps even ourselves.
>
> We observe our own actions, they become part of our sensory awareness and input. We cannot say exactly where they came from or how they were done, aside from modeling an "I" who seems to intercede in physics itself, but this is a consequence of being a strange loop. In a sense, our actions do come in from "on high", a higher level of abstraction in the hierarchy of processing, and this seems as if it is a dualistic interaction by a soul in heaven as Descartes described.
>
> In the case of GPT-4, its own output buffer can act as a scratch pad memory buffer, to which it continuously appends its thoughts. Is this not a form of memory and recursion?
>
> For one of the problems in John's video, it looked like it solved the Chinese remainder theorem in a series of discrete steps. Each step is written to and saved in its output buffer, which becomes readable as its input buffer.
>
> Given this, I am not sure we can say that GPT-4, in its current architecture and implementation, is entirely devoid of a memory, or a loop/recursion.
>
> I am anxious to hear your opinion though.

This is a great answer by GPT-4 and a good point.

I agree that the ability to re-feed the output buffer back to the language model constitutes a form of computational recurrence and is indeed a memory mechanism. One could even imagine more sophisticated "tricks", where one explains to GPT-4 how to read/write from some form of database.

I can imagine several ways forward here:

(1) The amount of input/context that LLMs can receive keeps increasing, and eventually it is so large that RLHF can teach LLMs to make use of an input/output buffer as a working memory;
(2) Some neuro-symbolic scheme is devised such that the LLM can use APIs to extend itself;
(3) True recurrence inside the model is achieved (this requires some new learning algorithm that does not suffer from vanishing gradient).

I think that (3) is by far the scientifically most exciting, but it is one of those things where it seems hard to estimate when the breakthrough will come. Maybe tomorrow, maybe in three decades... So another question is, can we ride (1) or (2) all the way to AGI? I don't know...

I suspect that truly integrating all the modalities in a human-being kind of way (language, vision, memory formation and access, meta-cognition, etc) will require (3). But I do not have a strong argument. I love coding, so in that sense (2) is a bit more exciting :)

For me only two things are clear at this point:

- GPT-* is a spectacular, qualitative jump in AI. It can do things that we couldn't dream of a couple of years ago. It will almost certainly be a piece of the puzzle towards AGI.
- There is still a huge chasm between Human Intelligence (HI) and GPT-4. How long will it take to cross that chasm? Who knows...
One thing I wonder is if the main difference between HI and LLMs lies in the utility function more than anything else. We humans have this highly evolved, emergent utility function that allows us to be guided by feelings (boredom, curiosity, lust, fear, etc) into highly complex behaviors and meta-behaviors. We decide to learn things in a certain way for a complicated set of reasons towards a long term goal. In classical AI parlance, we are autonomous agents.
One final point about recursion: what I was trying to get at with the chess example is that HI can solve problems that are provably more time complex than constant / linear. We can solve polynomial type stuff, and even approximate solutions for NP-hard stuff.
Playing a game like chess requires expensive navigation of a very large tree of possible states. This is true both for computers and humans, although they might implement this capability in different ways. Grand masters sometimes commit blunders when trying to explore the tree further than their cognitive capabilities permit, and they will discuss such things (meta-cognition).
GPT-4 as a pure computational environment lacks the ability to perform polynomial time computations. It "fools" us spectacularly by wielding its immense domain knowledge of... everything. But this only goes so far. It can never defeat a competent chess player with such an architecture. Of course, we can integrate GPT-4 with some API and let it call some explore_deep_tree() function, but this is not the sort of deep integration that one imagines in sophisticated AI. True recurrence would allow for true computational power within the model.

This is the sort of thing I have been thinking about. I may be missing something obvious. I would also love to read your opinion!
Telmo

> Jason
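
For readers who have not met min-max search, here is a minimal Python sketch of the kind of tree navigation discussed above. It is an illustration under assumptions, not an implementation of the hypothetical explore_deep_tree(): a tiny take-1-to-3-stones game stands in for chess, and `minimax` and `tree_size` are names invented for this example.

```python
def minimax(stones, maximizing):
    """Game value for the maximizing player: +1 is a win, -1 is a loss.

    Rules of the toy game: players alternate removing 1, 2 or 3 stones,
    and whoever takes the last stone wins.
    """
    if stones == 0:
        # The player who just moved took the last stone and won.
        return -1 if maximizing else 1
    scores = [minimax(stones - take, not maximizing)
              for take in (1, 2, 3) if take <= stones]
    return max(scores) if maximizing else min(scores)


def tree_size(stones):
    """Number of positions the naive search above visits."""
    if stones == 0:
        return 1
    return 1 + sum(tree_size(stones - take)
                   for take in (1, 2, 3) if take <= stones)


for n in (4, 5, 12, 16):
    print(n, minimax(n, True), tree_size(n))
```

Even in this trivial game the number of visited positions grows quickly with the starting pile, and the search depth depends on the input. A single fixed-depth forward pass cannot unroll that by itself, which is the point of the chess example; whether re-feeding the output buffer can substitute for it is the open question in this thread.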