Language learning from simple robot head experience


Ben Goertzel

Jul 28, 2016, 9:32:11 PM
to opencog
(proposed R&D project for fall 2016 - 2017)

We are now pretty close (a month away, perhaps?) to having an initial,
reasonably reliable version of an OpenCog-controlled Hanson robot
head, carrying out basic verbal and nonverbal interactions. This
will be able to serve as a platform for Hanson Robotics product
development, and also for ongoing OpenCog R&D aimed at increasing
levels of embodied intelligence.

This email makes a suggestion regarding the thrust of the R&D side of
the ongoing work, to be done once the initial version is ready. This
R&D could start around the beginning of September, and is expected to
take 9-12 months…


GENERAL IDEA:
Initial experiment on using OpenCog for learning language from
experience, using the Hanson robot heads and associated tools

In other words, the idea is to use simple conversational English
regarding small groups of people observed by a robot head, as a
context in which to experiment with our already-written-down ideas
about experience-based language learning.

BASIC PERCEPTION:

I think we can do some interesting language-learning work without
dramatic extensions of our current perception framework. Extending
the perception framework is valuable but can be done in parallel with
using the current framework to drive language learning work.

What I think we need initially, to drive language-learning work, is for
the robot to be able to tell, at each point in time (a toy sketch of one
possible percept-stream representation follows this list):

— where people’s faces are (and assign a persistent label to each person’s face)

— which people are talking

— whether an utterance is happy or unhappy (and maybe some additional sentiment)

— if person A’s face is pointed at person B’s face (so that if A is
talking, A is likely talking to B) [not yet implemented, but can be
done soon]

— the volume of a person’s voice

— via speech-to-text, what people are saying

— where a person’s hand is pointing [not yet implemented, but can be done soon]

— when a person is moving, leaving or arriving [not yet implemented,
but can be done soon]

— when a person sits down or stands up [not yet implemented, but can
be done soon]

— gender recognition (woman/man), maybe age recognition
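
As a concrete (and purely illustrative) picture of what this gives us, here
is a minimal Python sketch of a time-stamped percept stream and one simple
query over it. The event names, fields and the is_talking_to() heuristic are
assumptions invented for this sketch, not an existing OpenCog or Hanson
Robotics API; in the real system these percepts would of course land in the
AtomSpace rather than in a Python list.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class PerceptEvent:
    """One time-stamped observation from the robot's perception pipeline."""
    time: float                   # seconds since the session started
    kind: str                     # e.g. "face-at", "gaze", "speaking", "utterance"
    args: Dict[str, object] = field(default_factory=dict)

# A few illustrative events, mirroring the Basic Perceptions listed above.
stream: List[PerceptEvent] = [
    PerceptEvent(1.0, "face-at",   {"person": "Bob",  "x": 0.3, "y": 0.5}),
    PerceptEvent(1.0, "face-at",   {"person": "Jane", "x": 0.7, "y": 0.5}),
    PerceptEvent(2.5, "gaze",      {"from": "Bob", "to": "Jane"}),
    PerceptEvent(2.6, "speaking",  {"person": "Bob", "volume": 0.8}),
    PerceptEvent(2.6, "utterance", {"person": "Bob",
                                    "text": "I have some bad news",
                                    "sentiment": "unhappy"}),
]

def is_talking_to(stream, a, b, t, window=1.0):
    """The heuristic from the list above: A is likely talking to B if A is
    speaking while A's face is pointed at B, within a small time window."""
    gazing = any(e.kind == "gaze" and e.args.get("from") == a
                 and e.args.get("to") == b and abs(e.time - t) <= window
                 for e in stream)
    speaking = any(e.kind == "speaking" and e.args.get("person") == a
                   and abs(e.time - t) <= window for e in stream)
    return gazing and speaking

print(is_talking_to(stream, "Bob", "Jane", 2.6))   # True

The point is simply that each Basic Perception becomes a time-stamped,
queryable record that utterances can be grounded against.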

EXAMPLES OF LANGUAGE ABOUT THESE BASIC PERCEPTIONS

While simple, this set of initial basic perceptions lets a wide variety
of linguistic constructs be uttered, e.g.

Bob is looking at Ben

Bob is telling Jane some bad news

Bob looked at Jane before walking away

Bob said he was tired and then sat down

People more often talk to the people they are next to

Men are generally taller than women

Jane is a woman

Do you think women tend to talk more quietly than men?

Do you think women are quieter than men?

etc. etc.

It seems clear that this limited domain nevertheless supports a large
amount of linguistic and communicative complexity.

SECOND STAGE OF PERCEPTIONS

A second stage of perceptual sophistication, beyond the Basic
Perceptions, would be recognition of a closed class of objects, events
and properties (a toy enumeration is sketched after these lists), e.g.:

Objects:
— Feet, hands, hair, arms, legs (we should be able to get a lot of
this from the skeleton tracker)
— Beard
— Glasses
— Head
— Bottle (e.g. water bottle), cup (e.g. coffee cup)
— Phone
— Tablet

Properties:
— Colors: a list of color values can be recognized, I guess
— Tall, short, fat, thin, bald — for people
— Big, small — for a person
— Big, small — for a bottle, phone or tablet

Events:
— Handshake (between people)
— Kick (person A kicks person B)
— Punch
— Pat on the head
— Jump up and down
— Fall down
— Get up
— Drop (object)
— Pick up (object)
— Give (A gives object X to B)
— Put down (object) on table or floor
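
Just to make this target concrete as well, a toy enumeration of such a
closed class might look like the sketch below. The event names, role slots
and the EventPercept structure are assumptions for illustration, not an
existing OpenCog schema.

from dataclasses import dataclass
from typing import Dict

# Closed-class vocabularies for the second stage of perception (illustrative).
OBJECTS    = {"foot", "hand", "hair", "arm", "leg", "beard", "glasses",
              "head", "bottle", "cup", "phone", "tablet"}
PROPERTIES = {"color", "tall", "short", "fat", "thin", "bald", "big", "small"}
EVENTS = {
    # event name -> required role slots
    "handshake": ["agent", "partner"],
    "kick":      ["agent", "patient"],
    "punch":     ["agent", "patient"],
    "pat-head":  ["agent", "patient"],
    "jump":      ["agent"],
    "fall-down": ["agent"],
    "get-up":    ["agent"],
    "drop":      ["agent", "object"],
    "pick-up":   ["agent", "object"],
    "give":      ["agent", "object", "recipient"],
    "put-down":  ["agent", "object", "surface"],
}

@dataclass
class EventPercept:
    """A recognized closed-class event with its role fillers."""
    time: float
    name: str
    roles: Dict[str, str]

    def __post_init__(self):
        missing = [r for r in EVENTS[self.name] if r not in self.roles]
        assert not missing, f"event '{self.name}' is missing roles: {missing}"

# "Vytas fell down, then Ruiting picked him up" as two event percepts:
e1 = EventPercept(10.0, "fall-down", {"agent": "Vytas"})
e2 = EventPercept(12.0, "pick-up",   {"agent": "Ruiting", "object": "Vytas"})

The value of keeping the class closed is that every recognizable event has a
fixed name and a fixed set of role slots, so grounding descriptive sentences
against the percept stream stays tractable.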


CORPUS PREPARATION

While the crux of the proposed project is learning via real-time
interaction between the robot and humans, in the early stages it will
also be useful to experiment with “batch learning” from recorded
videos of human interactions, filmed from the robot’s point of view.

As one part of supporting this effort, I’d suggest that we

1) create a corpus of videos of 1-5 people interacting in front of the
robot, from the robot’s cameras

2) create a corpus of sentences describing the people, objects and
events in the videos, associating each sentence with a particular
time-interval in one of the videos (one possible record format is
sketched after this list)

3) translate the sentences to Lojban and add them to our parallel
Lojban corpus, so we can be sure we have good logical mappings of all
the sentences in the corpus
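
For concreteness, one plausible record format for this aligned corpus is
sketched below. The field names and file layout are assumptions, chosen just
to show how a video interval, its English description and (later) its Lojban
translation stay linked; one JSON record per line keeps the corpus easy to
stream into batch-learning jobs.

import json

# One aligned corpus record (an illustrative schema, not an existing format).
record = {
    "video":    "session-03.mp4",      # footage from the robot's cameras
    "interval": [12.0, 18.5],          # start / end time in seconds
    "english":  "Bob is telling Jane some bad news",
    "lojban":   None,                  # filled in by the translation pass in step 3
}

# Append the record to a JSON Lines file, one record per line.
with open("records.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record) + "\n")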

Obviously, including the Second Stage Perceptions along with the Basic
Perceptions allows a much wider range of descriptions, e.g. …

A tall man with a hat is next to a short woman with long brown hair

The tall man is holding a briefcase in his left hand

The girl who just walked in is a midget with only one leg

Fred is bald

Vytas fell down, then Ruiting picked him up

Jim is pointing at her hat.

Jim pointing at her hat and smiling made her blush.

However, for initial work, I would say it’s best if at least 50% of
the descriptive sentences involve only Basic Perceptions … so we can
get language learning experimentation rolling right away, without
waiting for extended perception…

LANGUAGE LEARNING

What I then suggest is that we

1) Use the ideas from Linas & Ben’s “unsupervised language learning”
paper to learn a small “link grammar dictionary” from the corpus
mentioned above. Critically, the features associated with each word
should include features from non-linguistic PERCEPTION, not just
features from language. (The algorithms in the paper support this,
even though non-linguistic features are only very briefly mentioned in
the paper.) A toy illustration of word features that mix linguistic and
perceptual context is given at the end of this section. There are
various ways to use PLN inference chaining and Shujing’s
information-theoretic Pattern Miner (both within OpenCog) in the
implementation of these ideas…

2) Once (1) is done, we then have a parallel corpus of quintuples of the form

[audiovisual scene, English sentence, parse of sentence via link
grammar with learned dictionary, Lojban sentence, PLN-Atomese
interpretation of Lojban sentence]

We can take the pairs

[parse of sentence via link grammar with learned dictionary,
PLN-Atomese interpretation of Lojban sentence]

from this corpus and use them as the input to a pattern mining process
(maybe a suitably restricted version of the OpenCog Pattern Miner,
maybe a specialized implementation), which will mine ImplicationLinks
serving the function of current RelEx2Logic rules.
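
As a minimal illustration of what this mining step is looking for -- a
sketch only, this is neither the OpenCog Pattern Miner nor RelEx2Logic, and
the dependency/predicate notation is invented -- we can abstract the named
entities out of each [parse pattern, logic pattern] pair and count which
abstract shapes co-occur; the frequent co-occurrences are the candidate
ImplicationLink-style rules.

from collections import Counter

# Each pair: (dependency relations from the learned-grammar parse,
#             predicates from the PLN-Atomese interpretation of the Lojban).
# The relation and predicate names here are made up for illustration.
pairs = [
    ({("subj", "look", "Bob"), ("obj", "look", "Ben")},
     {("looking-at", "Bob", "Ben")}),
    ({("subj", "look", "Bob"), ("obj", "look", "Jane")},
     {("looking-at", "Bob", "Jane")}),
    ({("subj", "sit", "Jane")},
     {("sit-down", "Jane")}),
]

def shape(item):
    """Replace capitalized names (the people) with a variable marker."""
    return tuple("$X" if x[0].isupper() else x for x in item)

rule_counts = Counter()
for deps, preds in pairs:
    for d in deps:
        for p in preds:
            rule_counts[(shape(d), shape(p))] += 1

# Frequent (parse shape, logic shape) pairs are candidates for rules such as
#   subj(look, $X) & obj(look, $Y)  =>  looking-at($X, $Y)
for (d, p), count in rule_counts.most_common(3):
    print(count, d, "=>", p)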

The above can be done for sentences about Basic Perceptions only, and
also for sentences about Second Stage Perceptions.
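
Coming back to step (1) for a moment: purely to illustrate the point that
the features attached to each word can mix linguistic and perceptual
context, here is a toy sketch that counts co-occurrences of each word with
both its neighbouring words and the percept labels active in the same time
interval, and scores them with pointwise mutual information. This is not
the algorithm from the paper, just the flavour of statistics it would start
from, and the percept-label notation is made up.

import math
from collections import Counter

# Toy aligned data: (sentence tokens, percept labels active in the same interval).
corpus = [
    (["bob", "is", "looking", "at", "ben"],  {"gaze:bob->ben", "face:bob", "face:ben"}),
    (["bob", "is", "talking", "to", "jane"], {"speaking:bob", "gaze:bob->jane"}),
    (["jane", "sat", "down"],                {"sit-down:jane", "face:jane"}),
]

word_counts, feat_counts, pair_counts, n_pairs = Counter(), Counter(), Counter(), 0
for tokens, percepts in corpus:
    for i, word in enumerate(tokens):
        # A word's features: its immediate neighbours plus the scene's percepts.
        features = set(tokens[max(0, i - 1):i] + tokens[i + 1:i + 2]) | percepts
        for f in features:
            word_counts[word] += 1
            feat_counts[f] += 1
            pair_counts[(word, f)] += 1
            n_pairs += 1

def pmi(word, feat):
    """Pointwise mutual information between a word and a feature
    (linguistic or perceptual), estimated from the toy counts above."""
    if pair_counts[(word, feat)] == 0:
        return float("-inf")
    p_wf = pair_counts[(word, feat)] / n_pairs
    return math.log2(p_wf / ((word_counts[word] / n_pairs) * (feat_counts[feat] / n_pairs)))

print(pmi("looking", "gaze:bob->ben"))   # positive: 'looking' co-occurs with a gaze percept
print(pmi("sat", "sit-down:jane"))       # positive: 'sat' co-occurs with a sit-down percept

Words that reliably co-occur with particular percepts end up with
distinctive mixed feature vectors, which is exactly the extra signal the
dictionary-learning step can exploit.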

NEXT STEPS FOR LANGUAGE LEARNING

The link grammar dictionary learned as described above will have
limited scope. However, it can potentially be used as the SEED for a
larger link grammar dictionary to be learned from unsupervised
analysis of a larger text corpus, for which nonlinguistic correlates
of the linguistic constructs are not available. This will be a next
step of experimentation.

NEXT STEPS FOR INTEGRATION

Obviously, what can be done with simple perceptions can be done with
more complex perceptions as well … we assume simple perceptions here
because that’s what we have working or almost working right now… but
Hanson Robotics will put significant effort into making better visual
perception for their robots, and as this becomes a reality we will be
able to use it within the above process...



--
Ben Goertzel, PhD
http://goertzel.org

Super-benevolent super-intelligence is the thought the Global Brain is
currently struggling to form...

Andi

Jul 30, 2016, 6:08:57 AM
to opencog
Ben, my congratulations for reaching this point!

I have been watching the AI scene for more than 40 years - as far as I can see, the OpenCog system is the richest and by far the most likely to lead to an AGI.

Why doesn't Google chain you and your team to a desk at their headquarters?
They need to spend about 500,000,000 USD on new projects every week. Why don't they put one week's worth on you and your team?
AGI is essential for them. OpenCog is the only complete approach to reaching it. Is there nobody at Google who watches OpenCog closely -and- understands what it is doing?

Respect!
Andi

Noah Bliss

Jul 30, 2016, 9:19:58 AM
to opencog
I genuinely believe that OpenCog's obscurity may be for the best. We need time to see how it will behave; if dangerous issues arise, we can patch and modify. If a ton of different companies forked it and developed a dangerous AGI, that wouldn't be very good. Right now there are only a handful of OpenCog devs/users, and almost all, if not all, regularly pull and recompile.

Noah Bliss

Jul 30, 2016, 9:28:46 AM
to opencog
Sorry for 2 emails. Congrats on making it this far. I do share the sentiment that the link grammar should be allowed to grow on its own. Getting a better visual processing system would probably prove among the most useful things at the moment, as our eyes provide highly critical context for much of our (human) development. Also, being able to process "region updates" and isolate objects from an image without using prestaged trigger blobs would be far more efficient than just running a "diff" on each visual frame. This wouldn't really be cheating, as the rods and cones in our eyes stop sending signals if they continuously receive the same data. (Try staring at a newspaper for 5 minutes and then looking at a blank wall.) Perhaps OpenCV could be modified for this purpose?
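
For what it's worth, a crude starting point can already be put together with
stock OpenCV. The sketch below is just plain frame differencing on a live
feed (OpenCV 4.x Python API) -- not the retina-style adaptation I have in
mind, but it shows how cheaply "changed regions only" can be pulled out of
each frame:

import cv2

cap = cv2.VideoCapture(0)                 # robot camera / webcam feed
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Only regions that changed since the previous frame survive the diff,
    # loosely analogous to receptors adapting to an unchanging stimulus.
    diff = cv2.absdiff(prev_gray, gray)
    _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    for c in contours:
        if cv2.contourArea(c) > 500:      # ignore tiny flickers and noise
            x, y, w, h = cv2.boundingRect(c)
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

    cv2.imshow("region updates", frame)
    prev_gray = gray
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()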

Ben Goertzel

Jul 30, 2016, 11:09:04 AM
to opencog
From what I can tell, Google's tech leaders are smart, inventive and
good-hearted people who are not, however, deep thinkers about AGI ...
they are too busy for that...

Demis and Shane are brilliant, inventive, deep-thinking people who are
however apparently convinced that loose brain emulation (Demis) or
some mix of loose brain emulation and algorithmic information based
approaches (Shane) are the best approach to AGI ... so they are simply
not that intellectually interested in approaches like OpenCog, even
though they're aware of it...

I would like to have more funds so we could hire more senior
developers and proceed faster. However, I don't want this *as badly*
as I want the project to remain free and open source, as I feel FOSS
AGI will be the best course for the good of humanity and will increase
the odds of a positive Singularity. So being fully sucked into a big
company or gov't agency or a typical VC-funded AI startup situation is
not so compelling to me at this point... indeed, opportunities for
this have been presented ...

The odds seem reasonable that with the current favorable climate
toward AGI, OpenCog will be able to secure greater funding during the
next year, so it can grow faster in various directions without
"selling out" ....

Once we get the project past a certain critical threshold in terms of
funky downloadable demos AND clear documentation and simpler usability
by developers, then I think the thing can really take off quickly, and
become as I've said "the Linux of AGI" (just for a start). Getting
to this threshold is proving slow and laborious given the complexity
of the design. But from the inside, the progress is clear. A
moderate-sized burst of funding would get us there, but we can also
very likely get there without that, just not quite as fast...

-- Ben

Andi

Jul 30, 2016, 2:15:09 PM
to opencog
Thank you very much for this detailed answer, Ben. It helps me to understand the situation better.
Now I am digging into the code to understand how things are done. This will take some time.
But then I hope to be able to contribute something useful.

--Andi

Daniel Gross

Jul 30, 2016, 7:56:23 PM
to opencog
Hi Ben, 

Thank you for this update. 

What I am a bit unclear about is what part of language learning is explicit, pre-defined knowledge, and what is learned without explicit instruction.

Stated differently, if explicitly defined knowledge defines a scope, does implicit language learning extend this scope -- or is it about mapping the implicit to the explicit?


thank you,

Daniel

Gaurav Gautam

Jul 31, 2016, 12:35:30 AM
to opencog
I did not know Google was that rich!

Andi

Aug 1, 2016, 4:06:34 AM
to opencog
Gaurav,
speaking about Google, I mean Alphabet Inc., which was formerly Google Inc.
https://en.wikipedia.org/wiki/Alphabet_Inc.
Revenue in 2016 will be about 80,000,000,000 USD, earnings about 20,000,000,000 USD, and liquid money about 75,000,000,000.

In a TV interview, the head of Google's X-labs in Switzerland said Google has to spend 1.5 billion USD every week. So it was a loose estimate that they could spend 500,000,000 USD every week on new projects.

Looking more closely at this, I found out that it is usual for companies to spend a big part of their earnings on growth and new projects so as not to pay too much tax. The tech people would like to spend everything on new projects, but the shareholders want to have some profits.

The losses produced by the X-labs, where all the new things are done, are about 4 billion every year, and shareholders are starting to complain. So it is more realistic to say that they have some tens of millions to spend on new projects every week, not 500,000,000 as I said.

But AGI is essential for Google! More essential than just spending money that can be wasted to save tax.
And it is not an easy thing!
For intelligent people it is easier than for others - they just need to watch how things are done in their own mind... and put it into software. But if you look around, you will find that not so many are able to do this. But Ben obviously can. I see no one who has the same overview and has been working so hard and so long on this topic as he has. Or see what Linas is doing with link grammar. It's overwhelming.

So if Google wants to advance quickly, they should fund several teams in parallel. And OpenCog would be worth some millions in any case.

--Andi

Mark Nuzz

Aug 1, 2016, 5:44:20 PM
to ope...@googlegroups.com
One theory is that Google may not be aware that OpenCog has a linking
exception in its AGPL license, since the Wikipedia page lists the
project as AGPL without mentioning the linking exception. This is a
huge, huge detail that could potentially affect interest from a party
such as Google.

Andi

Aug 1, 2016, 6:49:38 PM
to opencog
Ty for pointing out this interesting aspect, Mark! I think the Wikipedia page should be changed.