----CLEAN SUMMARY OF MY AGI WORK----

A 10GB dataset of observations (images and/or text from the real world) has only patterns in it; that's all that exists. The same letter/phrase occurs in it more than once, whereas a random dataset would look like 'abcd1234'. All deeper patterns stem from exact matches, such as the translation described below. A brain can accurately predict the next letter/word/phrase or pixel(s)/frame(s) in a sentence/image by using past experiences that match/reoccur, allowing it to answer known and new problems. What is the next letter after 'i was walking down the stree'? T? If 't' occurs more often than 'z' in the 10GB dataset, you predict it more. Ex. count every letter's occurrences in the 10GB: 'a' seen 456765 times, 'b' 46457, ... 'z' 564; now you can get the percentages: a 10% likely, b 9%, c 8%, z 3%, so out of all letters, 'a' makes up 10% of them. A brain only stores a letter/phrase once; it strengthens a connection/neuron to represent how many times it was seen. This can also fade away, to forget it!

You can do better here: match the last 2 words of 'i was walking down the stree' and get the next possible letters, ex. 'the stree[t]'. This gives you a better, more narrowed-down set of predictions. A brain stores walk and walking only once; since walk is IN walking, it just stores walking, and the shorter word walk is in there. The more data you feed the AI, the better it predicts (it's so fun), but it does have a limit: at some point you'll need 100x more data to see an improvement. We need another way, to extract deeper patterns from the same dataset. A faster CPU, GPU, or neuromorphic chip also improves the speed at which you can find patterns and make predictions. You can match the last 17 letters, but because those 17 letters rarely appear in the dataset, you don't really know well enough what may come next. So you can blend prediction sets from the last 17 letters down to the last 1. You start at the 17th letter, or the furthest match found, and if you get enough counts as you shrink the window, you can stop; you may stop at the last 4 letters, for example (see the code sketch below).

Exact matches are great, but we can make some allowances. You can match an input to a memory even if there is some time delay or some missing letters, ex. you can recognize 'walkinjjg' and 'wlking', just with some weight taken off to slightly lower the prediction since it isn't fully a match. You can also recognize similar letters/words/phrases. If ate and swallowed both predict food, tasty, fries, and do so many times, then they are probably interchangeable, so they should share their non-shared predictions as well as their shared ones. Not only does this allow translation, it lets you take ate>fish and predict (by however much similar they are) swallowed>fish. Note: if dog has 10 billion occurrences (and therefore that many words that follow it), and cat does too, but out of those 10B they rarely predict similar things, then they are much less similar; this normalizes it. The evidence is centered on the right-side context, cat>sleep. There is also evidence for similarity in 'cute cat' and 'dog cute', just with less weight since the shared word is farther away or in a different position; 'cat and cute' versus 'dog and cute' keep the same position but are farther away, so they give less proof. And very rare or very common words/phrases get much less weight, because they would 'prove' that everything is similar to everything.
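Here is a minimal sketch of that shrinking-window prediction in Python, assuming a short plain-text string stands in for the 10GB dataset; the names train/predict_next and the constants MAX_CONTEXT and MIN_COUNT are just illustrative picks, not the exact design:

```python
from collections import defaultdict, Counter

MAX_CONTEXT = 17   # longest suffix we try to match first
MIN_COUNT = 2      # "enough counts" to stop shrinking the window (illustrative)

def train(text):
    """For every context of length 1..MAX_CONTEXT, count which letter followed it."""
    model = defaultdict(Counter)
    for i in range(1, len(text)):
        for n in range(1, MAX_CONTEXT + 1):
            if i - n < 0:
                break
            model[text[i - n:i]][text[i]] += 1
    return model

def predict_next(model, prompt):
    """Start from the longest matching suffix and shrink until there is enough evidence."""
    for n in range(min(MAX_CONTEXT, len(prompt)), 0, -1):
        counts = model.get(prompt[-n:])
        if counts and sum(counts.values()) >= MIN_COUNT:
            total = sum(counts.values())
            return {ch: c / total for ch, c in counts.items()}, n
    return {}, 0

data = "i was walking down the street and i was walking down the street again"
model = train(data)
probs, window = predict_next(model, "we drove up the stree")
print(window, probs)   # falls back to the longest context seen often enough, predicts 't'
```

On this toy data the 17-letter suffix of the prompt never occurred, so the window shrinks until ' the stree' is found often enough and 't' is predicted from there.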
A brain only stores sames once, but now we must store both cat and dog, so what a brain does is connect them closely/strongly using a 2nd type of connection to 'merge' them. When we predict cat, we also predict a domain cat/dog etc., without even translating the prompt/question yet. Labrador, poodle, terrier etc. are all far more similar to each other than to anything else; they form a cluster. We can merge them all into a tight web and name the cluster: dog. Other clusters, dog, cat, monkey etc., all fall under a higher-level cluster, 'animals'. Now we have a hierarchy of sentences made of smaller parts, and a hierarchy made of translated parts.

Given the sentence 'my cat and so i went in and picked up my ?', you don't have cat as the predicted next word, so you'll fail; the next trick is to give things you recently saw high weight. Put cat there, as long as it can follow. This is actually just long-term storage: new neurons have been seen only 1 or 2 times, so they have very limited evidence for being predicted, yet their recency gives them strong weight, as much as a 1- or 17-letter memory that was seen 6,000 times last year but not this year. To store energy we just store it to the net as long-term and the energy fades until it's forgotten; there's no separate short-term memory in a true brain. In a dataset, the same things clump together: my cat saw a cat and the cat played with another cat... my cat and dog saw an ape near a cage... then the domain changes and we now see the rocket went to Mars and the rocket was in a rocket, the rocket and crew went fast in space. In our brain, energized nodes predict themselves and related nodes, so they are bound to cluster together, just like things cluster in real life. The cortical-column clusters of similar memories I mentioned share energy. When we see cat, the node spends all its energy and expects not to see it again for a short while, depending on its learnt neural strength; the nodes in its cluster flash off and begin to grow their energy back (cat, dog, horse, ape), helping it know not to say them again for now. When the cluster itself fires all its energy, the domain animals will not be predicted much for a while and will slowly regrow energy (see the code sketch below). This lets us know we only have a few things to say, either cat/dog/cage, and that we won't be saying rocket etc.; then, once done with the dog domain, we know to talk about the rocket-ship domain.

When we predict the next word or sentence 'plan', we predict things we love, like food and sex. "I was walking down the street and found food." This curves it toward predicting sex and food all day, keeping it on track. Again, once we finish sex we won't want it for a while either; they flip on and off. We can learn new Rewards by translation and a>b causality, like AI and plates: plates are similar to food, and AI may position you at a table and then get you the food smell/vision/taste/feel/sound you want. Now we predict "i want AI". It has a new specialization! We can also place our end goal, not here yet, at the end, and where we are now at the start, to get a fill-in-the-hole type question, ex. 'i am in my house typing, i will ? ? ? ? AI will get made ? ? ? ? then we will stop death and pain for all animals and humans for good thanks to AI and nanobots'. This really keeps it on track now: all day it talks about AI, and it is also anchored by a bi-directional prompt to fill in. Having both sides improves prediction. You can collect data internally in your brain from the same dataset and store it, all in the box! You can generate likely-true statements (based on data patterns) and then store them back into the brain, ex. maybe dog=cat, and so maybe dog>meows.
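Here is a toy sketch of that energy idea, assuming base next-word probabilities already come from the matcher above; the class name EnergyField and the boost/recover constants are made up for illustration:

```python
class EnergyField:
    """Toy per-word 'energy': recently seen words get a boost, words just said
    are temporarily suppressed (refractory), and energy drifts back to normal."""

    def __init__(self, boost=3.0, recover=0.1):
        self.energy = {}            # word -> multiplier on its base probability
        self.boost = boost
        self.recover = recover

    def observe(self, word):
        # A word we just read or heard becomes more likely to be predicted soon.
        self.energy[word] = self.energy.get(word, 1.0) + self.boost

    def fire(self, word):
        # Once said, the word spends its energy and is suppressed for a while.
        self.energy[word] = 0.1

    def step(self):
        # Each time step, energy drifts back toward the neutral value 1.0.
        for w, e in self.energy.items():
            self.energy[w] = e + (1.0 - e) * self.recover

    def rescore(self, base_probs):
        scored = {w: p * self.energy.get(w, 1.0) for w, p in base_probs.items()}
        total = sum(scored.values()) or 1.0
        return {w: round(s / total, 3) for w, s in scored.items()}

field = EnergyField()
base = {"cat": 0.05, "dog": 0.05, "bag": 0.90}
field.observe("cat")                 # we just read "my cat ..."
print(field.rescore(base))           # cat is boosted well above its base rate
field.fire("cat")                    # after saying "cat" it is suppressed
print(field.rescore(base))
for _ in range(30):                  # given enough time the suppression wears off
    field.step()
print(field.rescore(base))
```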
You also usually collect data from desired sources, like the AI domain or the food domain, especially right before a question whose answer you aren't confident enough about. The body has learnt reflexes for survival: sneeze, cough, retract arm, yelp, mimic, camouflage, flee, spray, shiver, sweat. These are patterns too, solutions to problems in a very limited way. The vestibular reflex makes the eyes keep tracking a point unless you deliberately move your eyes (if it moves or you move, you still track it). The eyes cut out input during a saccade, because neither you nor the room is actually moving, so you should not see flash-lag blur. Flash-lag blur shows motion in a single image: a single input image is built from multiple short exposures, causing a blur; this blends/merges data to help the brain. A similarly limited solution is learning to walk by trying random motor speeds and directions, then strengthening whichever got the accelerometer reading higher, then tweaking the best actions to walk faster. This doesn't solve a wide range of problems, but it fills in holes. The dumbest solution is Brute Force Search: try all possible doors. It can recreate all machines from the dead and find any solution, but it takes too long to run/store. We can again blend/merge it into our AI, to help when we have, ex., just a few possible combinations to check.

So far we have said our brain uses text for prediction; if we use text + image + sound, we will get a refined, merged prediction! Using all 3 helps. We can't brute force search the world, but we can capture observations that cover most of it using various sensors: images capture its structure, sound captures slight motion, just like light has various spectrums of color. Better pixel counts, etc., and more eyes help too. The cerebellum makes sure actions, dictated by senses, are smooth and meet the predicted image/video. Byte Pair Encoding finds, based on how many times they reoccurred, the building blocks of memories, so instead of storing ex. walking, alking, lking, king... you store just walking and king, see? There may be a loss of accuracy here, but very little, and it will make the brain much smaller (see the code sketch below).

If a sentence or image we have seen is now bigger, rotated, or brighter, there will be error in matching it, but there is a pattern to the error: it's ALL bigger/brighter/color-changed. Ex. you see hello but bigger next time, ex. 'h e l l o'; it is hello, you recognize it. You don't count every space error separately, you merge them and only see one tiny error. A brain pays more attention to flashing/change. An image can be converted into lines using a Sobel edge filter; this represents the whole image, since the lines show where the same shade/color sits inside/outside of a region (a pattern!), and the left/right side of a line shows which side is brighter, darker, or color-changed. We do it for video too: when the same still image finally changes, or when a train's sound starts and ends, we pay the most attention there; the whole time the train was making sound got less attention, mostly in the middle. A brain can store very long memories by being sparse at the end; the eyes do this, and each eye even has a blind spot. Ex. having seen the sentence 'i[ ]was w[a]lki[ng] d[own the road]', see how we store a long memory? We can't store all of it, but we can get the long tail. If we see matches along 900 words, we will know this is most probably the same 900 words. Longer matches get more weight. We can also grow the allowance: if we've seen some 17 letters a lot, we can store 19 letters of it.
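A minimal sketch of that Byte Pair Encoding step, run on a hand-made word list; the merge count and the word list are only for illustration:

```python
from collections import Counter

def byte_pair_encode(words, num_merges=10):
    """Repeatedly merge the most frequent adjacent symbol pair, so frequent
    chunks like 'alk' or 'walk' become single building blocks."""
    vocab = [list(w) for w in words]        # start with every word split into characters
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols in vocab:
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += 1
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:
            break                           # nothing reoccurs, stop merging
        merges.append(a + b)
        new_vocab = []
        for symbols in vocab:               # replace every winning pair with the merged symbol
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and symbols[i] == a and symbols[i + 1] == b:
                    merged.append(a + b)
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab.append(merged)
        vocab = new_vocab
    return vocab, merges

words = ["walking", "walked", "walker", "talking", "talked", "king"]
tokens, merges = byte_pair_encode(words, num_merges=8)
print(merges)   # frequent chunks such as 'alk' and 'walk' emerge
print(tokens)
```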
This works because we prune rare long branches in the neural network, ex. down to 5 letters long: variable-length storage. We can also do it for multi-sensory input: we can store vision 2x as often as sound, and have slow motors and fast vision, if one is more important than the other. I have mentioned the big patterns, but there are many more, less useful patterns. For AGI we only need to code the dozen big patterns; the rest it can learn on its own, or we can hardwire them by hand. Ex. a pattern of translation OF translation is "cat home dog, wind road breeze, moon gold Mars, loop gum circle, up tooth ?": you must match the 3-tuples, seeing they are all alike (and the task is strongly energized here too), so now you know the 1st and 3rd words should be similar (see the code sketch below). Another, rarer pattern is ex. that 3rd names are similar, ex. my mom Sally Cath has a son Tom Cath. For any prediction, ex. the last 15 versus 8 letters, sound versus vision, domain versus related domain, energy versus no energy, there is a pooling effect: the bigger get bigger very fast and the small end up smaller than they actually are. It acts like a yes-or-no switch, and as it gets close to 1, ex. 0.9, or even surpasses the threshold, it never reaches 100% sure, only 0.999868456; never perfect.

The reason we have 2 parents is that evolution learnt to merge DNA like brains merge memories. Maybe man + wings is a good idea; better than random DNA mutation. Finally, no structure of particles is any different from any other structure: man, toaster, rocks are all no different, none are unique/special or smart or cute or sexy. The machines/animals in evolution that live longer and spread more are the machines that take over. All these machines are, are data patterns; immortality is just persisting/repairing your structure/body. Your goal is to clone too, to improve the workforce and back up yourself. We use patterns to be patterns; prediction is key to living longer. Our homeworld is already becoming a fractal of patterns: everything is cube-shaped, circular, lined up (the same word stored only once, with a number to represent times seen); similar buildings are grouped near similar buildings; roads are fractals like arteries and plants, so as to probabilistically reach the most space in the least time for survival. I predict our world will leave air and gravity like we came out of the water (most of Earth is water, and water allows mutation, which allows improvement but also death). We will live in a very huge but very low-density 3D (not 2D land anymore) metal nanobot world; "you" will be large so you can't die as easily; everything will be a pattern/fractal, a cold dark place; things will stay in motion and not need light to know where they are going since everything is organized nicely; they can predict better in this world: no gravity, no air, no light or loss of energy. The universe cools down as it expands, and lifespans become longer for structures. We will only move when we need to repair or defend; we seek immortality so we can get food / breed / shed waste radiation, and our desires return us to immortality. That's all Life is: longer-living structures. Atoms are Life, not just aliens.
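A toy sketch of that translation-of-translation pattern: the similarity table, the candidate list, and the function names are all made up for illustration (in the full system the similarity scores would come from the shared-prediction method described earlier, and the candidates from the predictor itself):

```python
# Hand-made similarity scores, standing in for the learnt shared-prediction similarity.
SIMILAR = {("cat", "dog"): 0.9, ("wind", "breeze"): 0.9,
           ("moon", "Mars"): 0.7, ("loop", "circle"): 0.8, ("up", "down"): 0.8}

def sim(a, b):
    return SIMILAR.get((a, b), SIMILAR.get((b, a), 0.0))

def fill_blank(examples, partial, candidates):
    """Check that the example triples follow '1st word is similar to 3rd word',
    then fill the blank with the candidate most similar to the 1st word."""
    pattern_strength = sum(sim(x, z) for x, _, z in examples) / len(examples)
    first = partial[0]
    best = max(candidates, key=lambda c: sim(first, c))
    return best, pattern_strength

examples = [("cat", "home", "dog"), ("wind", "road", "breeze"),
            ("moon", "gold", "Mars"), ("loop", "gum", "circle")]
answer, strength = fill_blank(examples, ("up", "tooth", None), ["gold", "down", "tooth"])
print(answer, round(strength, 2))    # picks "down", the candidate most similar to "up"
```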
Atoms naturally form together, and there are clones, but only cells directly clone themselves, and only AIs can clone their brains; we can't. Cloning yourself lets you do many similar but different things you wanted to do in parallel, and it is faster to clone. AIs will also be better pattern finders than us and will think and move faster than us: neural signals travel about 0.07386 miles per second, light 186,282, roughly 2.5 million times faster, 2.5M years of thinking per year, and a brain can be smaller too, so signals must travel less far. Thinking/simulating is faster than real experiments because thought moves faster, and imagination is expense-free / can see impossible-to-measure things / can teleport / is safer / etc. Our homeworld will be less dense, because dense things, like a uranium atom (the heaviest naturally occurring atom) or our sun (which could fit about 1.3 million Earths inside it), are too dense and want to radiate; they are very hot and explode, like the atom bomb and the sun. They extract free hidden virtual energy, like batteries and food and wood on fire. They do the opposite of gravity's attraction: they repel. They merge patterns and e-merge new patterns. We will soon reach the highest regenerative technology; evolution moves exponentially fast.

The best evaluation for AGI is Lossless Compression. You have to include your algorithm's size and tally up all the corrections for non-100%-accurate predictions. If your dataset has patterns in it, you'll be able to compress it. Deeper patterns are all based on exact matches; all there is is patterns. Random is a wide range of outputs, ex. 5, 2, 9, 5, 7, 3, 0, 8, 4...; non-random is 4, 4, 4, 4, 4, 4, 4, 4. Our physics has immortal laws for particles; things only appear random to us because we can't capture all the details of how a machine works. 4, 4, 4, 4, 4, 4, 4, 4 is an easier machine, one we can compress losslessly.
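A minimal sketch of that compression yardstick, using a tiny adaptive byte-count model as a stand-in predictor; the function name and the test strings are illustrative, and a full score would also add the size of the algorithm itself, as said above:

```python
import math, random
from collections import Counter

def compression_cost_bits(data: bytes) -> float:
    """Estimate the lossless compressed size of `data` in bits: an arithmetic
    coder pays -log2(p) per symbol under the model's prediction, so better
    predictions mean fewer bits (this is the tally of 'corrections')."""
    counts = Counter()
    seen = 0
    bits = 0.0
    for b in data:
        p = (counts[b] + 1) / (seen + 256)   # predict the byte before seeing it
        bits += -math.log2(p)                # pay for how surprised we were
        counts[b] += 1                       # then learn from it
        seen += 1
    return bits

patterned = b"4" * 200                       # non-random: 4, 4, 4, 4, ...
random.seed(0)
noisy = bytes(random.randrange(256) for _ in range(200))
print(round(compression_cost_bits(patterned)))  # far below the raw 1600 bits
print(round(compression_cost_bits(noisy)))      # close to the raw 1600 bits: no pattern to exploit
```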