Example of me using chatGPT

Félix

Apr 13, 2023, 1:15:17 AM
to leo-editor
Maybe a bit off topic, but I started using ChatGPT, and thought you might find it amusing to see an example screenshot of this :)

Here I am (simple screenshot below), working on leojs, converting the stripBOM function in leoGlobals.py from Python to TypeScript.

Have you tried? Any thoughts or particular productivity tips to share?

(I eventually plan to use Leo to organize and automate calls to its API, to make some kind of AGI-assistant experiment.)

Félix 

[Attachment: Screenshot from 2023-04-13 01-06-10.png]

Edward K. Ream

Apr 13, 2023, 8:15:59 AM
to leo-editor
On Thursday, April 13, 2023 at 12:15:17 AM UTC-5 Félix wrote:

Here I am (simple screenshot below), working on leojs, converting the stripBOM function in leoGlobals.py from Python to TypeScript.

Have you tried? Any thoughts or particular productivity tips to share?

My impression is that chatGPT does well on small tests. I wouldn't trust it with larger tasks.
 
(I eventually plan to use Leo to organize and automate calls to its API, to make some kind of AGI-assistant experiment.)

chatGPT has spawned many creative ideas including yours. Please let us know what happens :-)

Edward

Thomas Passin

Apr 13, 2023, 8:34:45 AM
to leo-editor
There was a recent thread about this on python-list, including someone's experiments.  Here's what I wrote -

" People need to remember that ChatGPT-like systems put words together the way that many humans usually do.  So what they emit usually sounds smooth and human-like.  If it's code they emit, it will tend to seem plausible because lines of code are basically sentences, and learning how to construct plausible sentences is what these systems are built to do. That's **plausible**, not "logical" or "correct".

The vast size of these systems means that they can include a larger context in figuring out what words to place next compared with earlier, smaller systems.

But consider: what if you wrote code as a stream-of-consciousness process?  That code might seem plausible, but why would you have any confidence in it?  Or to put it another way, what if most of ChatGPT's exposure to code came from StackOverflow archives?

On top of that, ChatGPT-like systems do not know your requirements or the reasons behind your requests.  They only know that when other people put words and phrases together the way you did, the responses tended to sound like what the chatbot emits next.  It's basically cargo-culting its responses.

Apparently researchers have been learning that the more parameters that a system like this has, the more likely it is to learn how to emit responses that the questioner likes.  Essentially, it could become the ultimate yes-man!

So there is some probability that the system will tell you interesting or useful things, some probability that it will try to tell you what it thinks you want [to] hear, some probability that it will tell you incorrect things that other people have repeated, and some probability that it will perseverate - simply make things up.

If I were going to write a novel about an alternate history, I think that a chatGPT-like system would be a fantastic writing assistant. Code? Not so much."   

Félix

Apr 14, 2023, 9:10:26 PM
to leo-editor
Indeed, for smaller self-contained chunks of code, it's very useful and efficient. For more ambitious code, the risk of hallucinated (erroneous) chunks of code is too great.

Perhaps the years ahead will lead to other neural architectures that will complement the "textual-generation-prediction" ones that are currently in vogue.

Edward K. Ream

Apr 15, 2023, 6:31:41 AM
to leo-e...@googlegroups.com
On Fri, Apr 14, 2023 at 8:10 PM Félix <felix...@gmail.com> wrote:

Perhaps the years ahead will lead to other neural architectures that will complement the "textual-generation-prediction" ones that are currently in vogue.

A new architecture is already here! I was going to write this up in another thread, but I might as well reply here :-)

- Starting point: this article in Quanta magazine.
- The "solve a classic problem" link in the article leads to a paywalled article in Nature.
  However, googling shows there is a free arXiv preprint.

Important: the "classic problem" is not a math problem, but a psych test called Raven's Progressive Matrices.

I have been studying this preprint closely. The article describes NVSA: Neuro-vector-symbolic Architecture. Googling this term found the home page for Vector Symbolic Architecture (HD/VSA). This must be one of the best home pages ever written!

The preprint proposes an architecture whose front end involves perception and whose back end involves reasoning. Here is an excerpt from the summary:

"The efficacy of NVSA is demonstrated by solving the Raven’s progressive matrices datasets...end-to-end training of NVSA achieves a new record of 87.7% average accuracy in RAVEN, and 88.1% in I-RAVEN datasets. Moreover, ...[our method] is two orders of magnitude faster [than existing state-of-the art]. Our code is available at
https://github.com/IBM/neuro-vector-symbolic-architectures."

This GitHub repo contains nothing but Python code. Naturally, I imported the code into Leo; leo-editor-contrib now contains nvsa.leo.

As expected, the code's math consists entirely of matrix operations using the torch and numpy libs. The only "if" statements involve handling user options. There are a few "for" loops. I don't know how those loops affect performance.

Summary of the ideas

Here is my summary of my present study, taken mainly from the HD/VSA page:

The math is unbelievably simple. Anybody who knows high-school algebra will likely understand it. All data are vectors in a high-dimensional space; n = 10,000 dimensions is typical. The natural distance between two vectors is cosine similarity, the n-dimensional analog of the cosine. In other words, the distance between two vectors is taken to be the angle between them.

Here's the kicker: almost all vectors in this high-dimensional space are almost orthogonal to one another, and that fact is extremely useful. Queries are very simple compositions of vectors. These queries contain "cruft", but the cruft doesn't matter: the query result is very nearly equal to the desired answer! These vectors remind me of git's hashes: "vector collisions" essentially never happen in a high-dimensional space!

Furthermore, with a clever design of the contents of the vectors, queries are bi-directional! Given the result of a query, it's trivial to find which vectors were involved in the original query.
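Here is a toy numpy sketch of the idea (my own illustration, not the paper's code; bipolar +1/-1 vectors with multiplication as "binding" are just one common VSA recipe):

    # A toy illustration (not the paper's code) of near-orthogonality,
    # binding, bundling, and querying with random bipolar vectors.
    import numpy as np

    rng = np.random.default_rng(42)
    d = 10_000  # typical dimensionality

    def rand_vec():
        # Any two such random vectors are almost certainly nearly orthogonal.
        return rng.choice([-1, 1], size=d)

    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    # Hypothetical role and filler vectors; the names are illustrative only.
    color, shape = rand_vec(), rand_vec()
    red, square = rand_vec(), rand_vec()
    print(round(cosine(red, square), 3))    # ~0.0: nearly orthogonal

    # Bind role to filler by elementwise multiplication, then bundle the
    # two bindings into a single "scene" vector by adding them.
    scene = color * red + shape * square

    # Query: unbinding the color role gives `red` plus "cruft"
    # (color * shape * square).  The cruft is nearly orthogonal to
    # everything, so a cosine lookup still finds `red` easily.
    query = scene * color
    print(round(cosine(query, red), 3))     # ~0.7
    print(round(cosine(query, square), 3))  # ~0.0

Unbinding works in the other direction too (multiply the scene by red to recover which role it filled), which is the bi-directionality mentioned above.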

Summary

The advantages of nvsa:

- Integration of perception (front end) with reasoning (back end).
  I don't yet understand the details.
- No searching required!
- No back propagation needed for training!
- Full transparency of reasoning in the back end.
- Dead simple math, supported by far-from-dead-simple theory.

Edward

Edward K. Ream

Apr 15, 2023, 6:35:02 AM
to leo-editor
On Saturday, April 15, 2023 at 5:31:41 AM UTC-5 Edward K. Ream wrote:

The article describes NVSA: Neuro-vector-symbolic Architecture. Googling this term found the home page for Vector Symbolic Architecture (HD/VSA). This must be one of the best home pages ever written!

I recommend following all the links on this page. Most links point to Wikipedia pages for easy math concepts. I bookmarked many of these links.

Edward

Thomas Passin

Apr 15, 2023, 9:00:12 AM
to leo-editor
Very interesting!  I just started reading the home page link.  I was struck by this statement:

" HD/VSA addresses these challenges by providing a binding operator associating individual (JohnMary) with roles (AGENTPATIENT) and a superposition operator that allows multiple associations to be composed into a coherent whole."

The Topic Maps model for organizing knowledge has topics - a topic is anything that can be talked about - and relationships.  A relationship type has a number of roles, and those roles are filled by topics. It sounds very similar, at a basic level.  A Topic Maps relationship would be the equivalent of the HD/VSA binding operator.

I have some reservations about using cosine similarity with vectors like this.  I have experimented with them a little, not in the area of AI but for answering queries in a certain space of questions and knowledge.  The trouble is that the components of a vector are not often orthogonal, so the simple ways to compute their projections are not valid.  You can crank out the results, but they will not be correct, to a degree that depends on the other vector involved.  I will be interested to learn how these investigators handle this.

As an example of what I mean, consider a vector of words, and you want to know how similar it is to another vector of words.  A simpleminded approach makes each word into a vector component.  So here are two sentences:

"Which comes first, the chicken or the egg"
"Evolutionarily speaking a bird can be considered to be the reason for an egg"

Now make vectors of these two sentences, where every word is on its own axis.  You take the cosine by multiplying the value of each component in one vector by  the value of the same component in the other vector.  Each component here has a value of 0 or 1 (since the word is either present or not).  The only components that  match are "the" and "egg".  So the score - the cosine - will be very low.  However, we can see that the two sentences are actually very similar in meaning.
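For concreteness, here is that bag-of-words computation in a few lines of Python (the scoring details are my own simplification):

    # Bag-of-words cosine similarity for the two example sentences above.
    import math

    s1 = "which comes first the chicken or the egg".split()
    s2 = ("evolutionarily speaking a bird can be considered "
          "to be the reason for an egg").split()

    vocab = sorted(set(s1) | set(s2))          # one axis per distinct word
    v1 = [1 if w in s1 else 0 for w in vocab]  # 0/1: word present or not
    v2 = [1 if w in s2 else 0 for w in vocab]

    dot = sum(a * b for a, b in zip(v1, v2))   # only "the" and "egg" overlap
    cos = dot / (math.sqrt(sum(v1)) * math.sqrt(sum(v2)))
    print(dot, round(cos, 2))                  # 2, ~0.21: a very low score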

And how can we determine how orthogonal  a bird is to a chicken?

So this approach is too simple.  It will be interesting to see what these folks are really doing.  Personally, I expect that an approach using fuzzy logic would be promising.  It would be similar to using cosines to project one vector onto another, but with fuzzy operators instead of multiplication.  Why fuzzy logic?  Because it matches how people (and no doubt animals) actually assess things in the real world.  How do you judge how tall a person is?  You don't divide up the arithmetic range into spans - 5 ft to 5'2", 5'2" to 5'4", etc. (sorry, non-US units people) - and see which bin the person falls into.  No, you have a notion of what "tall", "medium", "short" and "very short" mean, and you see how well the person matches each of them.  So the person might be "somewhat tall but not far from medium".
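Here is a toy sketch of what I mean (the membership functions and breakpoints, in cm, are made up purely for illustration):

    # Fuzzy membership: a height belongs to each label to some degree.
    def tri(x, lo, peak, hi):
        # Triangular membership: 0 outside [lo, hi], rising to 1 at peak.
        if x <= lo or x >= hi:
            return 0.0
        return (x - lo) / (peak - lo) if x < peak else (hi - x) / (hi - peak)

    labels = {
        "short":  lambda h: tri(h, 140, 155, 170),
        "medium": lambda h: tri(h, 160, 172, 185),
        "tall":   lambda h: tri(h, 175, 190, 210),
    }

    height = 182
    print({name: round(f(height), 2) for name, f in labels.items()})
    # {'short': 0.0, 'medium': 0.23, 'tall': 0.47} -- "somewhat tall,
    # but not far from medium", rather than a single hard bin.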

Edward K. Ream

Apr 15, 2023, 9:54:08 AM
to leo-e...@googlegroups.com
On Sat, Apr 15, 2023 at 8:00 AM Thomas Passin <tbp1...@gmail.com> wrote:

I have some reservations about using cosine similarity with vectors like this.

Good study question. It seems the authors are not concerned, presumably for good reasons.

A related question: what goes into the 10,000-element vectors? ;-)

Edward

Thomas Passin

Apr 15, 2023, 11:05:51 AM
to leo-editor
For one thing, these vectors are very sparse - most elements are empty.  I don't know how sparse vectors and matrices are handled by people who know what they are doing, but for playing around a little I used dictionaries.
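Something along these lines (a toy illustration of the dictionary idea only):

    # Sparse vectors as dictionaries: store only the nonzero components.
    import math

    def sparse_dot(a, b):
        # Iterate over the smaller dict; missing keys contribute nothing.
        small, large = (a, b) if len(a) <= len(b) else (b, a)
        return sum(v * large.get(k, 0.0) for k, v in small.items())

    def sparse_cosine(a, b):
        norm = lambda v: math.sqrt(sum(x * x for x in v.values()))
        return sparse_dot(a, b) / (norm(a) * norm(b))

    u = {"the": 1, "egg": 1, "chicken": 1}
    w = {"the": 1, "egg": 1, "bird": 1}
    print(round(sparse_cosine(u, w), 2))   # 0.67: only shared keys contribute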

Thomas Passin

Apr 16, 2023, 8:51:15 AM
to leo-editor
Reading more of the material, and the published paper on solving the Raven's Progressive Matrices, I'm not convinced that the RPM situation is as impressive as it seems.  It reminded me of Bart Kosko's writings from around 20 years ago on using a Neural Network to find a ruleset to be used by a fuzzy logic processor to back up a tractor-trailer truck to a loading dock.  And the HD/VSA reminded me of the fuzzy logic processor, though of course they are not identical.  Here's something I just wrote a friend about it:

"I'm not sure that the Raven's Progressive Matrices test is as big a deal as it sounds, though. It can be solved basically by a large lookup table, which gets constructed by a neural network. The NN is not necessarily an intrinsic part of their system, but it is a way to recognize the problems to be solved.

In this case, there are a finite number of permutations of the matrices, so if the NN can recognize the input arrangement, the system can look up how to proceed. This new system can be looked at as a way to encode the permutations and an algebra for computing the lookup.

Back in the late 90s or early 2000s, Bart Kosko wrote a couple of interesting books on fuzzy logic and NNs. One of the then-classic tasks was to back a tractor-trailer truck up to a loading dock, in computer space, of course: they weren't hooked up to a real truck. He showed that with the right collection of fuzzy rules and a fuzzy processing algorithm, the truck could be backed up effectively and robustly.

And guess what? One way to discover the rules was with an NN. Given the rules, you didn't need the NN to park the truck, just the fuzzy ruleset and processor. It sounds so similar to the present work. In fact, the two main operations are correlation and projection. The rules for doing so are a little different from the fuzzy logic case, and the values in the vectors' cells are binary instead of (possibly) continuous. But that doesn't seem so different to me.

One difference with the current work is that you can do symbolic math with their sparse vectors. In simple cases you can sometimes find the answer using an algebra without actually evaluating anything. That's way cool, but I think for most large cases you would have to do numerical evaluation. "

Edward K. Ream

Apr 16, 2023, 9:16:33 AM
to leo-e...@googlegroups.com
On Sun, Apr 16, 2023 at 7:51 AM Thomas Passin <tbp1...@gmail.com> wrote:
Reading more of the material, and the published paper on solving the Raven's Progressive Matrices, I'm not convinced that the RPM situation is as impressive as it seems. 

Hi Thomas. Thanks for your thoughtful comments.

I've mostly given up trying to understand the paper :-) In particular, I have no idea how the NN creates the "universe" of possible answers.  Still, the mathematical trick of using a big space is something to remember.

I like to think in pictures, so I paid most attention to the figures at the end.  Here is a quote (emphasis mine):

QQQ
At the lowest level of the hierarchy, the four attribute values are represented by randomly drawing four d-dimensional vectors (x[red], ...). The vectors are dense binary, and arranged as d = 10 x 10 for the sake of visual illustration.

At the next level, the red square object is described as a fixed-width product vector by binding two corresponding vectors (x[red] dot x[square]) whose similarity is nearly zero to all attribute vectors and other possible product vectors such as (x[blue] dot x[triangle]), (x[red] dot x[triangle]), etc. as shown in (c).

This quasi-orthogonality allows the VSA representations to be co-activated with minimal interference.

At the highest level, the two object vectors are bundled together by similarity-preserving bundling to describe the scene. The bundled vector is similar solely to those objects' vectors and dissimilar to others.
QQQ

This trick/tool is what I take from the paper. Everything else is mysterious :-)
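In code, the binding and bundling described in the quote look roughly like this (a toy sketch using the standard binary-VSA recipe of XOR for binding and majority vote for bundling, not the paper's implementation):

    # Toy sketch of the quoted hierarchy (not the paper's code): binding by
    # elementwise XOR, bundling by elementwise majority vote.
    import numpy as np

    rng = np.random.default_rng(0)
    d = 10_000

    def rand_bits():
        return rng.integers(0, 2, size=d, dtype=np.uint8)

    def bind(a, b):          # XOR: the result is dissimilar to both inputs
        return a ^ b

    def bundle(*vs):         # majority vote; a random vector breaks ties
        vs = vs + (rand_bits(),) if len(vs) % 2 == 0 else vs
        return (np.sum(vs, axis=0) * 2 > len(vs)).astype(np.uint8)

    def sim(a, b):           # 1.0 = identical, ~0.5 = unrelated
        return float(np.mean(a == b))

    red, blue, square, triangle = (rand_bits() for _ in range(4))

    red_square = bind(red, square)        # the "product vector" for the object
    blue_triangle = bind(blue, triangle)
    scene = bundle(red_square, blue_triangle)   # similarity-preserving bundling

    print(sim(red_square, red))             # ~0.5: quasi-orthogonal to attributes
    print(sim(scene, red_square))           # ~0.75: similar to its own objects
    print(sim(scene, bind(red, triangle)))  # ~0.5: dissimilar to other combos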

Edward