NARS and Visual Perception


Shubhamkar Ayare

Apr 17, 2023, 8:57:55 AM
to open-nars
Hello, NARS has been exciting because it seems to provide a reasonable formalism for the processes that appear to be going on in our heads. I was looking into the application of NARS to, say, visual perception, since perception seems like a core requirement for an AGI system to work in the real world.

And while I'm aware of the Perception from an AGI Perspective paper, I was wondering if more work has been done on this front. One example would be MNIST digit classification, or maybe something more fundamental corresponding to the early processes in the human visual system. A related line of work would involve what kinds of preprocessing should be done before the input is forwarded to NARS, and how different kinds of non-NARS preprocessing would limit the system. Has anyone explored, or is anyone exploring, this?

Maxim Tarasov

Apr 17, 2023, 9:51:46 AM
to open-nars

Shubhamkar Ayare

Apr 17, 2023, 10:45:46 PM
to open-nars
Ah, thank you, this does lead to several questions!

1. How would the performance change with an increasing number of training images?
2. As far as humans are concerned, Pylyshyn, in Is Vision Continuous with Cognition?, seems to present evidence that early vision is cognitively impenetrable, while visual perception as a whole can indeed be cognitively penetrable. He also raises the question of what even counts as the input to the system. In light of that, it will be interesting to see how the performance changes as we change the inputs to the MNIST system - say, by using edge information instead of pixel-level activations.

I also have two other thoughts related to this work. One is that even if it is in theory possible to learn everything-that-is-necessary-to-perform-tasks-like-a-human, would a constraint on available computing power actually permit learning all of that, and would it not be advisable to provide the right kinds of hard-coding? I think a similar question also came up during the AGI 2022 NARS workshop - there was a hyperparameter that was hardcoded for the purposes of exploration, but it was also demonstrated that the hyperparameter could be learned.

The other is this: if an AGI system is to communicate effectively with humans, or learn from the books and works written and made for humans - which is also a large part of how humans learn - it seems reasonable that it should at least have the capability to emulate humans in certain basic experiences, even if its own primary experience is fairly different from ours.

These are rather open questions at this point, but if anyone has any comments, I'd be glad to read them! Again, thank you for the pointers :)!

Christian Hahm

Apr 18, 2023, 11:33:51 PM
to open-nars
Dear Shubhamkar, thank you for your interest in NARS vision.


I was looking into the application of NARS to, say, visual perception, since perception seems like a very core requirement for an AGI system to work in the real world.
I agree.

maybe something more fundamental corresponding to the early processes in the human visual system. A related line of work would involve what kinds of preprocessing should be done before the input is forwarded to NARS, and how different kinds of non-NARS preprocessing would limit the system.
Preprocessing: For vision in NARS, a pre-processing step could use a deep convolutional network to extract the bounding boxes of objects. That way, NARS will receive both the location and class of the various objects captured by its vision sensor. The benefit of this approach is that NARS receives high-level information directly (e.g., "there is a car to the left of the image, and a person to the right"). The downside is that NARS misses out on using the low-level sensory information, and the high-level info is static (e.g., NARS cannot recognize a picture of a "car" unless the pre-processor is also designed to recognize a "car" and pass that information to NARS).
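For concreteness, here is a minimal sketch of such a detector front-end, assuming a pretrained torchvision Faster R-CNN; the Narsese-style encoding of class plus left/right location is purely illustrative and is not NARS-Python's actual input API.

# Hypothetical sketch: a detector front-end that re-encodes bounding boxes as
# Narsese-like events. The encoding below is illustrative, not NARS-Python's API.
import torch
from torchvision.models.detection import (fasterrcnn_resnet50_fpn,
                                          FasterRCNN_ResNet50_FPN_Weights)

weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
model = fasterrcnn_resnet50_fpn(weights=weights).eval()
categories = weights.meta["categories"]

def image_to_events(image, score_threshold=0.8):
    """image: float tensor, CxHxW in [0, 1]. Returns Narsese-like strings."""
    with torch.no_grad():
        (det,) = model([image])
    width = image.shape[-1]
    events = []
    for k, (box, label, score) in enumerate(
            zip(det["boxes"], det["labels"], det["scores"])):
        if score < score_threshold:
            continue  # keep only confident detections
        name = categories[int(label)].replace(" ", "_")
        side = "left" if (box[0] + box[2]) / 2 < width / 2 else "right"
        events.append(f"<{{det{k}}} --> {name}>. :|:")    # class, e.g. <{det0} --> car>
        events.append(f"<{{det{k}}} --> [{side}]>. :|:")  # location, e.g. <{det0} --> [left]>
    return events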

Pure NARS: There is also the possibility of inputting the raw image data directly into NARS, with no pre-processing involved, as shown in the MNIST experiment. Using this method, NARS was able to memorize images of digits with up to 100% accuracy, and could classify new images of digits with up to ~43% accuracy. This approach, however, is speculative. It is reminiscent of a neuro-symbolic technique.
See also the AGI-22 NARS workshop presentation as video: https://www.youtube.com/watch?v=BUlqrbHWW5A
and the raw video recording of the NARS MNIST experiment: https://www.youtube.com/watch?v=m0jHzOPS3wk
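For contrast with the detector pipeline, a minimal sketch of the pure-NARS raw-input idea described above, assuming each bright pixel becomes one positive Narsese-style event (the exact statement format NARS-Python expects may differ):

# Hypothetical sketch: encode a 28x28 MNIST image as per-pixel events, with no
# feature preprocessing at all. Statement syntax is illustrative.
import numpy as np

def image_to_pixel_events(image: np.ndarray, threshold: int = 128):
    """image: 28x28 uint8 array. One positive event per bright pixel."""
    return [f"<{{pixel_{i}_{j}}} --> [bright]>. :|:"
            for i, j in zip(*np.nonzero(image >= threshold))]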


1. How would the performance change with an increasing number of training images?
Theoretically the accuracy should improve, since NARS would be able to pick out and memorize new visual features it has not seen before, and so could recognize new images. In practice, the performance did not improve further. Perhaps the new features learned by NARS were drowned out by already-learned, more common features.

2. As far as humans are concerned, Pylyshyn, in Is Vision Continuous with Cognition?, seems to present evidence that early vision is cognitively impenetrable, while visual perception as a whole can indeed be cognitively penetrable. He also raises the question of what even counts as the input to the system. In light of that, it will be interesting to see how the performance changes as we change the inputs to the MNIST system - say, by using edge information instead of pixel-level activations.
Thank you for the reference, I will be sure to read it. As for edge detection, that is a keen insight and would be interesting to try in NARS. In the previous MNIST experiment, the images are black-and-white and very low resolution (28x28 pixels), so the pixel values in this case are already very similar to edge detections.
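If anyone wants to try the edge-based variant, here is a minimal sketch, assuming OpenCV's Canny detector as one arbitrary choice of edge extractor:

# Hypothetical sketch: replace raw pixel activations with edge responses before
# encoding events. Canny is just one possible edge detector.
import cv2
import numpy as np

def image_to_edge_events(image: np.ndarray):
    """image: 28x28 uint8 array. Returns events only for edge pixels."""
    edges = cv2.Canny(image, threshold1=100, threshold2=200)  # 0 or 255 per pixel
    return [f"<{{pixel_{i}_{j}}} --> [edge]>. :|:"
            for i, j in zip(*np.nonzero(edges))]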

I also have two other thoughts related to this work. One is that even if it is in theory possible to learn everything-that-is-necessary-to-perform-tasks-like-a-human, would a constraint on available computing power actually permit learning all of that, and would it not be advisable to provide the right kinds of hard-coding? I think a similar question also came up during the AGI 2022 NARS workshop - there was a hyperparameter that was hardcoded for the purposes of exploration, but it was also demonstrated that the hyperparameter could be learned.
Hopefully I understand the question: It is feasible to determine the "correct" hard-coding for a specific task, but it is much harder to determine the correct system for accomplishing various tasks, especially various tasks across different domains. In that case, a more general learning mechanism is useful.

The other is this: if an AGI system is to communicate effectively with humans, or learn from the books and works written and made for humans - which is also a large part of how humans learn - it seems reasonable that it should at least have the capability to emulate humans in certain basic experiences, even if its own primary experience is fairly different from ours.
I agree, if you mean to experience embodiment (physical movement, vision, hunger, etc.). Otherwise, the real-world experiences the AGI system reads about will not be grounded by its own personal experience.


Regards,
Christian Hahm

Christian Hahm

Apr 18, 2023, 11:42:33 PM
to open-nars
The source code for the NARS-with-vision-buffer and MNIST experiment can be accessed here: https://github.com/ccrock4t/NARS-Python/releases/tag/v0.4-alpha


Please see/execute the file /TestCases/MNISTVisionTests.Py if you wish to examine the experiment yourself.

Shubhamkar Ayare

Apr 21, 2023, 1:51:31 AM
to open-nars
Hello Christian, 

Thank you very much for the pointers to the presentation videos as well as the MNIST experiment code. (And thank you for the Python version of NARS too :)!)

Preprocessing: For vision in NARS, a pre-processing step could use a deep convolutional network to extract the bounding boxes of objects.
...
Pure NARS: There is also the possibility of inputting the raw image data directly into NARS, with no pre-processing involved, as shown in the MNIST experiment.
Yes, while these are two possibilities, especially in light of the research on visual perception in humans (as also discussed in Pylyshyn's paper), I want to think of them as two ends of a spectrum. In the case of humans, it seems that pixel-level input is not the kind of input that is penetrable by cognition, but on the other hand, object-level input certainly is cognitively penetrable. However, in between these two, there could be multiple levels of inputs/registrations/representations.

Theoretically the accuracy should improve, since NARS would be able to pick out and memorize new visual features it has not seen before, and so could recognize new images. In practice, the performance did not improve further. Perhaps the new features learned by NARS were drowned out by already-learned more-common features.
I see. It will be fun to investigate what is happening and to think about what more could be done. I'm hoping to run the experiments as well as try out new approaches in the upcoming weeks!

Hopefully I understand the question: It is feasible to determine the "correct" hard-coding for a specific task, but it is much harder to determine the correct system for accomplishing various tasks, especially various tasks across different domains. In that case, a more general learning mechanism is useful.
Indeed, the general learning mechanism is certainly useful; and, once it has learnt something, one could use the learnt knowledge and experience directly in another system - so long as that system is similar to the original system which performed the learning.

Shubhamkar Ayare

May 15, 2023, 4:21:59 AM
to open-nars
So, I have a bit of free time now and am looking to dive deeper into this.

I have been able to set up NARS-Python v0.4-alpha, and can run "python -m TestCases.MNISTVisionTests" after installing some dependencies like keras, tensorflow, depq, and dill.
However, it runs for a long time, which is prohibitive for experimentation. I was looking for some parameters I could modify to reduce the number of training/test images and epochs, but was unable to find them. Is there a simple way to change them by modifying some global parameters? (I could issue a PR on GitHub if there are no global parameters for this yet!)

One of the things that caught my eye in the paper was how V0 is constructed from E(i,j) on page 9. It uses only the positive features; I wonder what would happen if we also used negative features. Eye-balling the script run, it seemed like some of the mistakes the system was making involved misidentifying a 0 as an 8, or an 8 as a 0. In general, constructing a confusion matrix seems useful for identifying the kinds of errors the system is making; a sketch of that bookkeeping follows.
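(A minimal sketch, assuming a hypothetical record() hook that the test loop would call once per test image:)

# Sketch: accumulate a 10x10 confusion matrix during the test loop.
# record() is a hypothetical hook; the test script would need to call it.
import numpy as np

confusion = np.zeros((10, 10), dtype=int)

def record(expected: int, predicted: int):
    confusion[expected, predicted] += 1

# Afterwards, confusion[e, p] counts how often digit e was classified as p;
# off-diagonal hot spots reveal systematic confusions such as 0 vs 8.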

I'm also trying to find where the V0 from E(i,j) computation is implemented in the code, so I would be glad if someone could point it out!

Thank you,
Shubhamkar

Christian Hahm

May 15, 2023, 2:50:54 PM
to open-nars
Hello again! I am very glad you were able to get the experiment running without issue!


Is there a simple way to go about changing [epochs/images] by modifying some global parameters?
Yes, except for training epochs. There is no epochs parameter; training only runs once, but you can easily add epochs by calling train() multiple times (see the sketch after the list below).
You can also change the number of working cycles that NARS gets to train on each image.
To change the train/test split proportions, here are the line numbers:
  • There are 4 tests. A test is run by calling a function:
    • Binary Memorization - binary_memorization() - line 127
    • Digit Memorization - digit_memorization() - line 159
    • Binary Classification - binary_classification(percent_of_train_img:) - line 198
    • Digit Classification - digit_classification(percent_of_train_img:) - line 226
  • Since "Memorization" tests don't have separate train/test sets, set the dataset length on one line:
    • Binary Memorization - images_per_class - line 131
    • Digit Memorization - images_per_class - line 163
  • For "Classification", first set the entire dataset length:
    • Binary Classification - images_per_class - line 201
    • Digit Classification - images_per_class - line 228
  • Then, set the parameter to split up the train/test set when calling the test function (e.g., binary_classification(percent_of_train_img=0.5) for a 50/50 split)
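Putting those knobs together, a usage sketch (the function names come from the list above; whether they are importable at module level, and where train() lives, are assumptions, so treat this as pseudocode against /TestCases/MNISTVisionTests.Py):

from TestCases import MNISTVisionTests as tests

# Classification: edit images_per_class in the file (lines 201 / 228) to shrink
# the dataset, then pass the train/test split when calling the test function:
tests.digit_classification(percent_of_train_img=0.5)  # 50/50 train/test split

# There is no epochs parameter; to emulate epochs, call train() repeatedly:
# for _ in range(3):
#     tests.train()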
It uses only the positive features; I wonder what would happen if we also used negative features?
This is what we tried first, but using only positive features in this case yielded better accuracy scores. It also made the conjunctions/compound terms syntactically much simpler, since all the negated terms were omitted. So, for this experiment we discarded negative events. Regardless, I agree with you that using negative features should still be explored in the future, since it would produce compounds with higher specificity.
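To illustrate the trade-off, a small sketch of building a compound from a patch with and without negative features (the Narsese syntax here is illustrative, not NARS-Python's exact output):

# Sketch: positive-only vs. positive-plus-negated pixel terms in one conjunction.
def patch_to_conjunction(patch, include_negatives=False):
    """patch: 2D array of 0/1 pixel activations."""
    terms = []
    for i, row in enumerate(patch):
        for j, value in enumerate(row):
            if value:
                terms.append(f"<{{pixel_{i}_{j}}} --> [bright]>")
            elif include_negatives:
                terms.append(f"(--, <{{pixel_{i}_{j}}} --> [bright]>)")
    return "(&&, " + ", ".join(terms) + ")"

# With include_negatives=True every pixel contributes a term, so the compound is
# more specific but syntactically much larger, matching the trade-off above.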

Where is V0 from E(i,j) computation implemented in the code?
Every pixel event is input to NARS in parallel and stored topographically in an array structure we simply called the "spatial buffer". This is "sensation".

At each cycle, NARS processes a conjunction (i.e., an AND statement) made from these sensations (e.g., V0). The conjunction is constructed by the spatial buffer, in the take() function found at line 157 of Buffers.py. The NARS control mechanism simply calls take() on the spatial buffer at each cycle, in NARS.py line 124.
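As a toy model of that mechanism (NARS-Python's actual take() in Buffers.py is more involved; this only shows the idea of a topographic store plus a conjunction-producing take()):

# Toy "spatial buffer": pixel events stored topographically; take() assembles a
# conjunction over a window. Illustrative only; see Buffers.py for the real one.
import numpy as np

class SpatialBuffer:
    def __init__(self, height: int, width: int):
        self.grid = np.zeros((height, width), dtype=bool)  # one slot per sensor

    def put(self, i: int, j: int, active: bool):
        self.grid[i, j] = active  # "sensation": parallel per-pixel input

    def take(self, top: int, left: int, size: int) -> str:
        window = self.grid[top:top + size, left:left + size]
        terms = [f"<{{pixel_{top + i}_{left + j}}} --> [bright]>"
                 for i, j in zip(*np.nonzero(window))]
        return "(&&, " + ", ".join(terms) + ")"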


Regards,
Christian

Shubhamkar Ayare

May 16, 2023, 2:19:53 AM
to open-nars
here are the line numbers
Thanks a lot!

This is what we tried first, but using only positive features in this case yielded better accuracy scores.
That's interesting.

I also obtained the confusion matrix - the code has been pushed here.
Unfortunately, there does not seem to be any pattern to the errors, except NARS confusing 9 for 7 in this particular run!

======= Test Subtotals ========
+!+ Digit 0 examples (correct | incorrect | total):8 | 2 | 10
+!+ Digit 1 examples (correct | incorrect | total):8 | 2 | 10
+!+ Digit 2 examples (correct | incorrect | total):2 | 8 | 10
+!+ Digit 3 examples (correct | incorrect | total):1 | 9 | 10
+!+ Digit 4 examples (correct | incorrect | total):4 | 6 | 10
+!+ Digit 5 examples (correct | incorrect | total):5 | 5 | 10
+!+ Digit 6 examples (correct | incorrect | total):6 | 4 | 10
+!+ Digit 7 examples (correct | incorrect | total):7 | 3 | 10
+!+ Digit 8 examples (correct | incorrect | total):3 | 7 | 10
+!+ Digit 9 examples (correct | incorrect | total):1 | 9 | 10
+!+ OVERALL TOTAL examples (correct | incorrect | total):45 | 55 | 100
=!=========!= OVERALL TOTAL Test Accuracy: 45.0%
CONFUSION MATRIX (Expected \ Predicted)
[[ 0. -1.  0.  1.  2.  3.  4.  5.  6.  7.  8.  9.]
 [-1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  8.  0.  1.  1.  0.  0.  0.  0.  0.  0.]
 [ 1.  0.  0.  8.  0.  0.  2.  0.  0.  0.  0.  0.]
 [ 2.  0.  2.  0.  2.  0.  0.  0.  2.  0.  1.  3.]
 [ 3.  0.  1.  3.  1.  1.  1.  1.  0.  1.  1.  0.]
 [ 4.  0.  0.  2.  0.  0.  4.  0.  0.  1.  1.  2.]
 [ 5.  0.  2.  2.  0.  0.  0.  5.  0.  1.  0.  0.]
 [ 6.  0.  1.  0.  1.  0.  0.  2.  6.  0.  0.  0.]
 [ 7.  0.  0.  0.  0.  0.  2.  0.  0.  7.  1.  0.]
 [ 8.  0.  1.  2.  2.  0.  0.  1.  0.  0.  3.  1.]
 [ 9.  0.  0.  0.  0.  0.  0.  0.  0.  9.  0.  1.]]


I will try to dive deeper using the pointers you provided, and will get back if I am stuck or if I make any progress.

Regards,
Shubhamkar

Shubhamkar Ayare

May 26, 2023, 3:37:06 AM
to open-nars
Is there a term-inheritance or concept-network visualizer for the python version yet?