Hi everyone!
It's a great paper that I recommend to anyone interested in Animal Cognition. One aspect bugs me, though, and caught my eye immediately in Table 1: "Self-control - The ability to delay gratification by resisting the temptation of an immediate reward in preference for a better but delayed reward". As I found, this view is quite widespread in psychology and hence not limited to this paper.
Clearly, for us in the AGI field, self-control is so much more than this narrow ability to delay gratification. In particular, as I will now show, the presence of this ability is fundamentally insufficient to determine whether a system has self-related mechanisms similar to the ones we described in "Self in NARS, an AGI System" -
https://www.frontiersin.org/articles/10.3389/frobt.2018.00020/full
as many AI systems can delay gratification successfully even without these mechanisms!
To show this, we will run two such systems on the Stanford Marshmallow test, a common way to measure the ability to delay gratification:
The subject (which is assumed to like marshmallows, and to like two of them more than one) gets a marshmallow. If it resists eating it, it will get a second one after some time and can eat both. Either from experiencing a few example runs, or by knowledge transfer (speech/text) of this fact, the subject is then expected to refrain from eating the first marshmallow so that it gets two marshmallows to eat instead.
First I tried a Q-Learner with Eligibility Traces, a purely reactive, Behaviorism-based approach which is very mainstream. Because of how it handles temporal credit assignment and balances near- and long-term reward, it had no issue with this task, as expected. To replicate (on any Linux/Mac/UNIX/Android via Termux shell):
cd OpenNARS-for-Applications
git checkout QLearner
./build.sh
./NAR shell < ./examples/nal/marshmallow.nal | python3 colorize.py
Output: QLearner.png, see attachment
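For readers who want to see the mechanism without building the repository, here is a minimal Python sketch of a tabular Q-learner with eligibility traces (Watkins-style Q(lambda)) on a toy version of the task. It is not the code in the QLearner branch and it does not read the marshmallow.nal file. The state encoding (one state per waiting step), the names `train` and `greedy_policy`, and all parameter values are assumptions made purely for illustration.

```python
import random

WAIT, EAT = 0, 1  # hypothetical action encoding for this sketch


def train(wait_steps=3, episodes=2000, alpha=0.1, gamma=0.95,
          lam=0.8, epsilon=0.1, seed=0):
    """Tabular Q-learning with eligibility traces on a toy marshmallow task.

    State s = number of steps already waited. EAT always ends the episode
    with reward 1. WAIT gives reward 0, except in the last state, where the
    second marshmallow arrives and both are eaten (reward 2).
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(wait_steps)]
    for _ in range(episodes):
        trace = [[0.0, 0.0] for _ in range(wait_steps)]
        s, done = 0, False
        while not done:
            # Epsilon-greedy action selection (ties go to WAIT).
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = WAIT if q[s][WAIT] >= q[s][EAT] else EAT
            # Watkins-style cut: a non-greedy action breaks the credit
            # chain back to earlier steps, so all traces are reset.
            if q[s][a] < max(q[s]):
                trace = [[0.0, 0.0] for _ in range(wait_steps)]
            # Environment step.
            if a == EAT:
                reward, done = 1.0, True
            elif s == wait_steps - 1:
                reward, done = 2.0, True
            else:
                reward, done = 0.0, False
            # One-step Q-learning error, spread over all traced pairs.
            target = reward if done else reward + gamma * max(q[s + 1])
            delta = target - q[s][a]
            trace[s][a] = 1.0  # replacing trace for the current pair
            for i in range(wait_steps):
                for j in (WAIT, EAT):
                    q[i][j] += alpha * delta * trace[i][j]
                    trace[i][j] *= gamma * lam
            s += 1
    return q


def greedy_policy(q):
    """Return the greedy action name for every waiting step."""
    return ["wait" if row[WAIT] >= row[EAT] else "eat" for row in q]


if __name__ == "__main__":
    print(greedy_policy(train()))            # patient discounting
    print(greedy_policy(train(gamma=0.5)))   # steep discounting
```

In this toy model, waiting from the first step is worth 2 * gamma^(wait_steps - 1) while eating at once is worth 1, so whether the learner "delays gratification" comes down to the discount factor. With gamma=0.95 it should learn to wait at every step, and with gamma=0.5 and three waiting steps it should learn to eat immediately.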
Then I tried `OpenNARS for Applications` (ONA), a simplified NARS which in this experiment is fully restricted to temporal and procedural reasoning and learning (NAL-7, NAL-8, as in the book
"Non-Axiomatic Logic" https://www.worldscientific.com/worldscibooks/10.1142/8665 ). In this setup it lacks all of the SELF-related mechanisms described in
"Self in NARS, an AGI System", which we see as central to the higher-level cognitive functioning we ultimately want to understand to a replicable degree. Again, it had no issue with this task, as expected. To re-run the experiment:
cd OpenNARS-for-Applications
./build.sh
./NAR shell < ./examples/nal/marshmallow.nal | python3 colorize.py
Output: ONA.png, see attachment
Please see the attached info.txt for how to interpret the example file's content!
This post is for discussion and also serves as a courtesy to researchers in other fields (especially Cognitive Science) who are interested in exploring the limits of what the Marshmallow experiment can show, and who want to explore a broader notion of self-control. This will potentially lead them to better experiments that make it possible to demonstrate higher-level cognitive functioning (especially self-related mechanisms such as introspective reasoning ability) in cephalopods and many others, beyond the simple ability to delay gratification, which is easy even for current AI.
Best regards,
Patrick