What is Brainchild?

Joe Devin

Sep 13, 2009, 5:33:59 PM

Dear Friends,

Brainchild is a project begun in the 1990s shortly after my first release
of SEMLEX, upon which it is based. This piece of software is now in its
5th incarnation, hence the name, Brainchild 5.

These software packages (SEMLEX and Brainchild) are available in several
languages, but for the purposes of this message I will confine myself to
English.

SEMLEX is an experimental lexicon/ontology containing over 19,000 English
words, their potential parts of speech, their meanings, and the most
important relationships between their meanings (synonymy, antonymy,
hypernymy, holonymy and the negatives of hypernymy and holonymy). The
collection of meanings and the relationships between meanings is the
ontology, and the collection of English words is the lexicon. But the
architecture of the system is such that everything is part of the ontology,
because the meanings in the ontology link to word-sense definitions and
English words in the same basic fashion that they link to each other. In
other words, a meaning #1259 may link to an English word-sense definition
#2058 in the definitions file, and it may link to English word #1072 in the
English word list as a noun and to English word #3925 in the English word
list as a verb, and to meaning #1326 as a hyponym, etc. SEMLEX comes
packaged with the wav files for over 71,000 clearly pronounced English
words, and sells for $100.
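
The linking scheme in the example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SEMLEX's actual file format or code: the `Link` tuple, the `links_from` helper, and the link type names ("def", "noun", "verb", "hyponym") are hypothetical, and only the ID numbers come from the text.

```python
from collections import namedtuple

# Hypothetical representation: each link has one origin, one type, one destination.
Link = namedtuple("Link", ["origin", "type", "dest"])

links = [
    Link(1259, "def", 2058),      # meaning -> entry in the definitions file
    Link(1259, "noun", 1072),     # meaning -> English word used as a noun
    Link(1259, "verb", 3925),     # meaning -> English word used as a verb
    Link(1259, "hyponym", 1326),  # meaning -> another meaning
]

def links_from(origin, link_type=None):
    """Return destinations of links leaving `origin`, optionally filtered by type."""
    return [l.dest for l in links
            if l.origin == origin and (link_type is None or l.type == link_type)]

print(links_from(1259))          # [2058, 1072, 3925, 1326]
print(links_from(1259, "verb"))  # [3925]
```

The point of the sketch is that definitions, words, and other meanings are all reached the same way, by following typed links out of a meaning.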

Brainchild cannot operate without SEMLEX, and sells for $200 including
SEMLEX. It is an English computer interface that can parse plain English
sentences and be made to perform any kind of action based on its results.
Because the parsing is exhaustive, it can differentiate between sentence
types and readily recognize whether a sentence is a question, command,
declaration, or exclamatory sentence, so it knows whether you are asking it
a question, telling it to do something, or just giving it a piece of
information. It can accept new words and add them to the ontology during
parsing or interaction. It recognizes unknown capitalized words or groups
of words as names and assigns them temporary part-of-speech values as
required. It recognizes English possessive names and nouns and gives them
special treatment. You can assign event codes to generic sentences of
particular types so that Brainchild will recognize such user input as being
of a particular type and select a specific event handler (a program
supplied by the user) for special processing, for example in order to build an
expert system. It can be made to play audio, show photographs and diagrams
and videos, or do pretty much anything else the user may require in
response to natural-language commands.
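
The idea of attaching event codes to sentence types and routing them to user-supplied handlers can be sketched as follows. This is a minimal illustration and not Brainchild's interface: the names `sentence_type`, `on`, and `dispatch` are hypothetical, and the punctuation-and-first-word heuristic is a crude stand-in for the exhaustive parsing described above.

```python
def sentence_type(sentence):
    """Crude stand-in for a real parser: guess the type from surface cues."""
    s = sentence.strip()
    words = s.split()
    if s.endswith("?"):
        return "question"
    if s.endswith("!"):
        return "exclamation"
    # Hypothetical list of command verbs, for illustration only.
    if words and words[0].lower() in {"play", "show", "read", "open", "stop"}:
        return "command"
    return "declaration"

handlers = {}  # event code -> user-supplied handler

def on(event_code):
    """Decorator that registers a handler for one event code."""
    def register(fn):
        handlers[event_code] = fn
        return fn
    return register

@on("question")
def answer(sentence):
    return "looking up: " + sentence

@on("command")
def obey(sentence):
    return "executing: " + sentence

def dispatch(sentence):
    """Pick the handler for the sentence's type, or just note the input."""
    handler = handlers.get(sentence_type(sentence))
    return handler(sentence) if handler else "noted: " + sentence

print(dispatch("Is a crow a bird?"))
print(dispatch("Show the diagram."))
print(dispatch("A crow is a bird."))
```

A real system would take the event code from the parse result rather than from punctuation; the registry pattern is the part that carries over.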

And besides these things, both SEMLEX and Brainchild can do many other
things too numerous to mention here. Both SEMLEX and Brainchild employ
speech output, and Brainchild can read your favorite texts in a clear
American English voice.

These systems are based upon the revolutionary new linguistic theory
described in http://panlingua.net, which postulates a subsurface language
underlying all known languages and shows how easy it is to "see" and
understand this subsurface language and implement it on automated systems
(computers).

I am looking for your collaboration. The prices I am asking for SEMLEX and
Brainchild are really very modest when considered in terms of the time,
energy, and brainpower invested in their development. What I am REALLY
after is your collaboration in order to perfect these systems for all known
human languages, because my goal is (insofar as it is humanly possible) to
make the totality of human knowledge instantly available to anyone in any
language in any place at any time, and this includes blind people (I have
already made excellent speech synthesizers and text editors available free
to the blind in Indonesian and Malay and Vietnamese, and intend to do the
same for more and more languages as we bring them on line).

The next big goal will be machine translation. My test results in MT have
been very positive, and I expect to establish the interlingua method as the
standard means for all machine translation in the future. The market for
translation is quite large, and growing. People speaking all the languages
of the world need materials written in English, and if everyone in China
were to pay a mere 1 cent per year for translations from English, the total
take would be $10 million.

The world of search engines is another area in which the ability to ask for
things in plain English and other languages will become ever more critical.
As more and more information becomes available on the web, people are not
going to have time to sit around pointing and clicking their mouses all day
long for information. They are going to need direct answers to direct
questions, and this is what I have begun to experiment with on
http://witchit.com. This job is too big for one man, but I believe the
foundation has been laid correctly, and if I can get a response anything
like that gotten by Wikipedia, we will change the world.

And at last I need your collaboration in order to develop true AI. I am
afraid that many people currently working on AI don't really have much of a
clue how to continue. I have attempted to show again and again that
intelligence is language-based, and that it is their particular kind of
language use that makes humans really human. By building interactive
systems in all the various languages of the world, and by doing it in the
right way, with a real understanding of language and intelligence, we will
be laying the foundations of artificial intelligence for all the various
personality types most typically found for all the languages of the world.
Personality is a function of language with variations within each
language. Every language of the world is exceedingly precious, and with
its written and oral literature, turn of phrase, etc., constitutes a world
unto itself. Each time the world loses another language, humanity has lost
another of its most precious treasures. By creating interactive systems
for every language in the world, we will be documenting these languages for
posterity in a way not possible before.

And finally I want to collaborate with people studying the innermost
processes happening within the human mind in order to accurately model them
upon computers. If intelligence is linguistic, and if we truly learn to
understand the inner workings of language, then we ought to be able to
model any kind of intelligence upon computers.

Sincerely,
Joe Devin.

Mok-Kong Shen

Sep 14, 2009, 3:51:15 PM

Joe Devin wrote:

> SEMLEX is an experimental lexicon/ontology containing over 19,000 English
> words, their potential parts of speech, their meanings, and the most
> important relationships between their meanings (synonymy, antonymy,
> hypernymy, holonymy and the negatives of hypernymy and holonymy).

A look at a synonym dictionary shows that a word can have many
synonyms, which could by themselves have quite different meanings
depending on context. Isn't it quite hard to practically build a
network connecting meanings and words even for the size of vocabulary
you mentioned?

Thanks,

M. K. Shen

Ian Parker

Sep 15, 2009, 6:29:25 AM

Absolutely right. A complete synonym will give identical rows/columns
in the co-occurrence matrix leading to a null eigenvalue in LSA. I
STILL can see no real alternative to the traditional rubric in
disambiguation.

Now a synonym which is not a complete synonym will, for the same
reason, show up the DIFFERENT meanings in LSA.
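
The first claim can be checked on a toy example. The sketch below uses an invented 3-word by 4-context count matrix (the words and counts are assumptions for illustration) in which two rows are identical, and shows that the word-by-word matrix A·Aᵀ sends the difference of those two words to zero. That vector is therefore an eigenvector with eigenvalue 0, which corresponds to a zero singular value in the decomposition LSA performs.

```python
# Toy word-by-context counts; rows 0 and 1 are "complete synonyms".
A = [
    [2, 0, 1, 3],   # "car"
    [2, 0, 1, 3],   # "automobile" (identical distribution)
    [0, 4, 1, 0],   # "bird"
]

def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

G = matmul(A, transpose(A))   # word-by-word matrix A * A^T
v = [[1], [-1], [0]]          # "car" minus "automobile"
Gv = matmul(G, v)

print(G)   # [[14, 14, 1], [14, 14, 1], [1, 1, 17]]
print(Gv)  # [[0], [0], [0]]
```

A near-synonym would have slightly different rows, so the same product would be small but non-zero, which is the second point above.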

- Ian Parker

Mok-Kong Shen

Sep 15, 2009, 4:02:11 PM

Ian Parker wrote:

> Absolutely right. A complete synonym will give identical rows/columns
> in the co-occurrence matrix leading to a null eigenvalue in LSA. I
> STILL can see no real alternative to the traditional rubric in
> disambiguation.
>
> Now a synonym which is not a complete synonym will, for the same
> reason, show up the DIFFERENT meanings in LSA.

I am interested in obtaining a rather comprehensive list of
complete synonyms. Is any such list readily available?

Thanks,

M. K. Shen

Joe Devin

Sep 15, 2009, 1:10:02 PM

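Mok-Kong Shen wrote:

> A look at a synonym dictionary shows that a word can have many
> synonyms, which could by themselves have quite different meanings
> depending on context. Isn't it quite hard to practically build a
> network connecting meanings and words even for the size of vocabulary
> you mentioned?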

No. Different meanings are simply assigned different identifiers (ordinal
numbers), so that one (English) word can be linked to by a number of
semantic nodes ("semnods," or meanings). In this way the distinction
between meanings can be refined just as far as one wishes (usually as far
as required for accurate parsing).

Synonymy and hypernymy both express "what something is," and in some cases
can be handled identically. Hypernymy is transitive. For example, "Is a
crow a bird?" Yes, because a crow is a corvid and a corvid is a bird. Now
take "Is a shack a structure?" where a shack is the same thing as a hut
(these two are taken to be synonyms) and a hut is a structure. Yes,
because shack is a synonym of hut and a hut is a structure. So in cases
involving transitivity in hypernymy, being a synonym of something works
just the same as being a hyponym (the opposite of a hypernym) of something.

Or, to put it more rigorously, hypernymy is unidirectional. For example, a
Chevy is an automobile, but an automobile is not (or not JUST) a Chevy.
But a Chevrolet is a Chevy and a Chevy is a Chevrolet, so Chevy and
Chevrolet are synonyms, and neither one is the hypernym of the other. So
the hypernym relation is unidirectional, whereas the synonymy relation is
bidirectional.
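
The crow, shack, and Chevy examples can be sketched as a small reachability query. This is an illustration under stated assumptions rather than SEMLEX code: plain word strings stand in for semnods, and the `is_a` function and dictionary layout are hypothetical. Hypernym links are stored one way only, and synonym links are stored both ways.

```python
hypernym = {            # one-way links: key IS A value
    "crow": {"corvid"},
    "corvid": {"bird"},
    "hut": {"structure"},
    "Chevy": {"automobile"},
}

synonym = {}            # two-way links, built from unordered pairs
for a, b in [("shack", "hut"), ("Chevy", "Chevrolet")]:
    synonym.setdefault(a, set()).add(b)
    synonym.setdefault(b, set()).add(a)

def is_a(x, y):
    """True if y can be reached from x by following hypernym and synonym links."""
    seen, stack = set(), [x]
    while stack:
        node = stack.pop()
        if node == y:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(hypernym.get(node, set()) | synonym.get(node, set()))
    return False

print(is_a("crow", "bird"))         # True: crow -> corvid -> bird
print(is_a("shack", "structure"))   # True: shack = hut -> structure
print(is_a("Chevy", "Chevrolet"))   # True: synonyms
print(is_a("automobile", "Chevy"))  # False: hypernymy runs one way only
```

During the search a synonym link is followed exactly like a hypernym link, which is the sense in which the two "can be handled identically."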

This can make for clutter if synonym links must be created in both
directions for every pair of synonyms in the ontology, but the pay-off may
make it worth it if greater speed is desired, else whenever the poor
computer needs to determine synonymy, it may have to look through an
average of half the nodes of the ontology in order to determine whether
something is a synonym of something else instead of just looking at the
links emanating from a single semnod.
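
The trade-off can be sketched by storing the same synonym pairs two ways. The data and function names below are assumptions for illustration. With one-way storage, finding the synonyms of a word means scanning every node, because the link may be recorded on the other side; with two-way storage, the links leaving a single node are enough, at the cost of twice as many links.

```python
# One-way storage: each synonym pair is recorded once, on one node only.
one_way = {"hut": ["shack"], "Chevy": ["Chevrolet"], "car": ["automobile"]}

def synonyms_by_scan(word):
    """Look at every node, since the link may be stored on the other side."""
    found = set(one_way.get(word, []))
    for origin, dests in one_way.items():
        if word in dests:
            found.add(origin)
    return found

# Two-way storage: every pair is recorded in both directions.
two_way = {}
for origin, dests in one_way.items():
    for dest in dests:
        two_way.setdefault(origin, set()).add(dest)
        two_way.setdefault(dest, set()).add(origin)

def synonyms_by_lookup(word):
    """Inspect only the links leaving one node."""
    return two_way.get(word, set())

print(synonyms_by_scan("shack"))    # {'hut'}
print(synonyms_by_lookup("shack"))  # {'hut'}
```

Both functions return the same answers; only the amount of the ontology they have to touch differs.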

In all of this, it pays to be aware of the total architecture of the
linguistic system. What we usually call a "word" is actually a pattern of
characters or an ideogram in some human-readable linear text, or else it
may be some grouping of sounds in a stream of sounds. Very respectable
authors are guilty of writing that "humans are different because they think
symbolically--in terms of symbols." This idea, wherever it came from, is
fundamentally flawed. The symbols, be they written or spoken, exist only
in the external world, where they appear as scribblings or sounds, and may
exist momentarily in some kind of temporary buffer inside our heads. But
then, after examination and processing, the linear stream of symbols is
converted into nothing but links and nodes. The external symbols fall
quickly away, and we are left with the thought itself in pure form (a
Panlingua representation), which exists only as a structure of links and
nodes inside our heads. This is why it is so easy to remember what someone
has said and so hard to remember their exact words after the passage of
time.

Now this structure of links and nodes (which can be so easily modeled on a
computer) becomes part of the "corpus" containing all the sentences that
the human machine has ever parsed during its lifetime, or at least those
that have not fallen off the edge of the world over time. Just think for
five seconds, and you will probably be able to remember something your
mother told you when you were five. You may not remember her exact words,
but you will remember the meaning.

And if you have memorized my theorem, or at least remember its meaning, it
states that every word in the corpus is nothing but two links (a syntactic
link and a semantic link) emanating from the same node. Okay, so we can
account for the syntactic links easily because these just link to other
words within the same sentence, which has become part of the corpus. But
what about the semantic links? These have to go somewhere, but they don't
go to other words, so where do they go? They go to the semnods of the
ontology, that box of meanings in which each meaning is a node (semnod), or
linking point (a node is just a place where two or more links are
connected, and nothing less or more).

So now let us say that the human linguistic apparatus has the following
components, which you might like to diagram as little boxes:

1. The phonology box, or lexicon, the nodes (connecting points) of which
get activated in response to the recognition of a spoken or written word.
This is the box that has the logic required to recognize any spoken or
written word that it knows and map it to a single connecting point, or
node. Letus call these nodes "lexnods."

2. The ontology, whose nodes (semnods) are just meanings equivalent to the
word-sense definitions you might find in a dictionary. The nodes inside
this box are connected to each other by links (relations) of type hypernym,
holonym, synonym, etc. And there are links exiting this box like wires
leading to the nodes of the lexicon, each node in the ontology usually
linking to one or more nodes in the lexicon. For example, one node in the
ontology might link to two nodes in the lexicon, one for "car," and the
other for "automobile."

3. The corpus, the nodes of which are words in their purest form, otherwise
called Panlingua atoms. The nodes of this box (the corpus) link to other
identical nodes within the same box by the links we call syntactic links,
and into the ontology (like a great bundle of wires) to various semnods.

It should be noted that in every case without exception, the links within and
between all of these structures are binary and directional in character,
having an origin, a type, and a destination. In no case do any of these
links ever have more than one origin, more than one type, or more than one
destination. Furthermore, in none of these linguistic components, or
"black boxes" do we ever find any kind of data other than just links and
nodes. Thus we have determined that all of the internal workings of
language can be modeled in terms of this one kind of link, which we will
call the "linguistic link," the nodes being only connecting points or
junctions and nothing more.
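
The three boxes and the single kind of link can be sketched together. The node labels, the link type names ("word", "syntactic", "semantic", "hypernym"), and the "root" marker for the head word are assumptions made for illustration, not part of the theory as stated. The sketch stores nothing but (origin, type, destination) links and follows them from a corpus atom to a semnod to lexnods.

```python
from typing import NamedTuple

class Link(NamedTuple):
    origin: str
    type: str
    dest: str

# Lexicon: each known spelling reduces to a single node (lexnod).
lexnod = {"car": "lex:car", "automobile": "lex:automobile", "runs": "lex:runs"}

links = [
    # Ontology: semnods link to each other and out to lexnods.
    Link("sem:car", "hypernym", "sem:vehicle"),
    Link("sem:car", "word", "lex:car"),
    Link("sem:car", "word", "lex:automobile"),
    Link("sem:run", "word", "lex:runs"),
    # Corpus: each atom has one syntactic link and one semantic link.
    Link("atom:1", "syntactic", "atom:2"),  # "car" attaches to "runs"
    Link("atom:1", "semantic", "sem:car"),
    Link("atom:2", "syntactic", "root"),    # assumption: head word links to a root marker
    Link("atom:2", "semantic", "sem:run"),
]

def follow(origin, link_type):
    """Destinations of all links of one type leaving one node."""
    return [l.dest for l in links if l.origin == origin and l.type == link_type]

def surface_forms(atom):
    """Atom -> semnod -> lexnods -> spellings."""
    spelling = {node: text for text, node in lexnod.items()}
    return [spelling[lex]
            for sem in follow(atom, "semantic")
            for lex in follow(sem, "word")]

print(surface_forms("atom:1"))  # ['car', 'automobile']
```

The atom for "car" carries no spelling of its own, so it can be rendered as either "car" or "automobile," in line with the remark above that the meaning is remembered while the exact words fall away.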

So are non-human animals linguistic as well? Apparently so, because they
are intelligent and it can be shown that intelligence is linguistic. So
they must have some kind of corpus and ontology, but no very good
phonological system that can reduce an external pattern to a single node or
vice versa. Some dogs can recognize a few hundred words, but they are
unable to reproduce them. Parrots recognize words and reproduce them, but
often fail to develop good connections from the semnods of the ontology to
the lexnods of the lexicon. Nevertheless it would seem far too expensive
in evolutionary terms to develop the ability to recognise and generate
words as certain birds do unless these capabilities were used for more
advanced communications at one time. I would therefore conjecture that
some predecessor of modern birds must have used speech communications at
one time. If birds are descended from dinosaurs as we are told, then
perhaps the true ancestors of birds belonged to the most intelligent of
dinosaur species, and this is why they have survived. Science has been
slow to recognize these indicators, but we will probably be forced to it in
due time.

--Chaumont (Joe) Devin.

Mok-Kong Shen

Sep 16, 2009, 4:53:59 PM

Joe Devin wrote:
> Mok-Kong Shen wrote:

>> A look at a synonym dictionary shows that a word can have many
>> synonyms, which could by themselves have quite different meanings
>> depending on context. Isn't it quite hard to practically build a
>> network connecting meanings and words even for the size of vocabulary
>> you mentioned?

> No. Different meanings are simply assigned different identifiers (ordinal
> numbers), so that one (English) word can be linked to by a number of
> semantic nodes ("semnods," or meanings). In this way the distinction
> between meanings can be refined just as far as one wishes (usually as far
> as required for accurate parsing).

[snip]

In your post "A few hundred English synonyms" you listed e.g. 'nobody'
to be synonymous to 'human being'. I suppose this is not in the
sense of synonyms being treated e.g. in Merriam-Webster's Collegiate
Thesaurus. I would like to pose another question: How do you specify
(describe) a semnod, using one of the words in the set of relevant
synonyms or employing something different to capture the meaning
involved?

Thanks,

M. K. Shen

Joe Devin

Sep 16, 2009, 10:01:18 PM

Mok-Kong Shen wrote:

> In your post "A few hundred English synonyms" you listed e.g. 'nobody'
> to be synonymous to 'human being'. I suppose this is not in the
> sense of synonyms being treated e.g. in Merriam-Webster's Collegiate
> Thesaurus.

No, these synonyms would tend to be just originals shoved into the ontology
to make Brainchild work. As I have already pointed out, this ontology is
experimental. Maybe "nobody" is not a synonym of "person" but a hyponym.
Everything is in the process of changing, and it will all get sorted out
in the end.

> I would like to pose another question: How do you specify
> (describe) a semnod, using one of the words in the set of relevant
> synonyms or employing something different to capture the meaning
> involved?
>
> Thanks,
>
> M. K. Shen

An excellent question, and one that needs answers from different
perspectives. From a human and biological point of view, language provides
a window into the mind, but not very FAR into the mind. Thus it is
comparatively easy to use language to trace human linguistic intelligence
down to the semnod (ontology) level, but not easy to go beyond. We know
that biological organisms develop their own software, whereas computer
science remains locked out of that capability. So has the biological
organism developed something like a C function for each semnod--one that
gets activated every time one activates that semnod? Quite probably, and
this may be why the human brain is large compared to that of other mammals.
In other words, not only do we need the space to store at least one neuron
for each semnod (or "concept"), but also the space in which to develop
complex computer-like functions related to that semnod. So, as an example,
when you read the word, "LION," you may immediately "see" a lion in your
mind, smell the zooish lion smell, and even hear and see it roar. What is
generating this "video," unless it is some kind of computer-like function
dedicated to "lion?"

But when we are talking about our as-yet primitive computer systems, then
the semnod is about as far as we can get because we still know nothing
further. So we can kludge (as I have), and simply write down or imagine a
word-sense definition for each semnod. Now as it turns out, there are many
semnods that may not need such word-sense definitions because their
meanings can be inferred simply by an examination of the ontology. And yet
this all depends on just what kind of resolution we are looking for in our
system. When I am working with a foreign language, I am careful to provide
English word-sense definitions for almost every semnod in a
word-sense-definitions file. Each semnod will then have a link emanating
from it whose type is "def," and whose destination is a line number in the
word-sense-definitions file.
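
The "def" link can be sketched as a lookup from a semnod to a line number. The semnod IDs, the definition texts, and the in-memory stand-in for the definitions file are all invented for illustration; the only thing taken from the description above is that the link's destination is a line number.

```python
# Stand-in for the word-sense-definitions file, one definition per line.
definitions_text = """\
a large carnivorous cat of Africa and India
a feathered, egg-laying vertebrate
a small, crudely built shelter
"""
definition_lines = definitions_text.splitlines()

# Hypothetical "def" links: semnod id -> 1-based line number in the file.
def_link = {101: 1, 102: 2, 103: 3}

def definition(semnod):
    """Follow the semnod's "def" link, if it has one."""
    line_no = def_link.get(semnod)
    if line_no is None:
        return None  # no definition stored; meaning must come from the ontology
    return definition_lines[line_no - 1]

print(definition(102))  # a feathered, egg-laying vertebrate
print(definition(999))  # None
```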

But if, for example, we just want some kind of rough-and-ready English
system that doesn't care much about details, we won't need separate
word-sense definitions for semnods linked to words like magpie, raven,
parrot, duck, chicken, sparrow, etc., because all of these are simply
birds. And we may not even need a word-sense definition for the semnod
linked to "bird," because the hypernymy and holonymy of "bird" will already
have described everything we need to know about birds, for example, that
birds have eyes, birds have legs, birds have feet, etc. So in a
rough-and-ready sort of way, it may never be necessary to further define
"bird" at all. And a lot of city folk probably do not know or care to know
any more about the subject of birds than that.
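
The "rough-and-ready" approach can be sketched as inheritance along hypernym links. The data layout and the `parts_of` helper are hypothetical, and the "animal" level with its "body" part is an added assumption for illustration; the bird parts are the ones listed above. A magpie needs no definition of its own because everything recorded for "bird" applies to it.

```python
hypernym = {"magpie": "bird", "raven": "bird", "bird": "animal"}
has_part = {"bird": {"eyes", "legs", "feet"}, "animal": {"body"}}

def parts_of(concept):
    """Collect parts recorded on the concept and on every hypernym above it."""
    parts = set()
    while concept is not None:
        parts |= has_part.get(concept, set())
        concept = hypernym.get(concept)
    return parts

print(sorted(parts_of("magpie")))  # ['body', 'eyes', 'feet', 'legs']
```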

So then the question arises, in the biological organism is it really
necessary to have a special "bird" function? Maybe so, and maybe no. The
truth is that we simply don't know, and mountains of further research are
needed. However in the end it may turn out that if an ontology is
correctly constructed (that is, constructed in some way that we do not yet
consciously understand), all meaning can be inferred by just examining
links, and from these linkages generic algorithms may be able to create the
image of a lion, its smell, and even its roar. At this point we simply do
not know.

But what is exciting follows from this assumption: unlike other phenomena,
human language and intelligence cannot be modeled in any way other than the
way they really are, give or take minor modifications, if the model is really
to run on automated systems. Thus it is by hacking and chipping away at
examples of real human language and intelligence that we are slowly but
surely learning how to build linguistic systems and AI.

Now some of us have government grants but not much imagination or
knowledge, and others of us have a lot of knowledge but find slammed doors
to academic recognition and hence to good government funding. There are
many reasons for this, most of them arising from human selfishness and
greed, which are always flapping in the face of REAL science. No matter
how right your knowledge may be, there will always be hundreds or thousands
of people who want to suppress it if it means the smallest thing to
them--either real or imagined. I have encountered a steady stream of this
at every step in my research. People get really upset and jealous if ever
you tell them anything obvious that they did not happen to see, and when
this happens, they will try to come down on you like a ton of bricks if
they can.

But my theory of language and intelligence could be guiding people in very
important ways now. For example:

1. If biologists are able to connect monkey brains to computers and have
them operating robot arms, then why doesn't some bright scientist (and I am
saying this facetiously because it should be an obvious "no brainer")
connect a dog brain up to 16 sensors and give the animal a vocabulary of
65535 English words, depending on the patterns its brain makes on the 16
connections?

2. If indeed it is true that single neurons are activated by the identities
of individual human beings, then my theory predicts that the particular
neuron getting activated has to be in the lexicon or ontology. More
experimentation might find which and where it is, and start the process of
mapping out how the three black-box components my theory has described map
to the human brain--if they do map to the human brain. Why work in the
dark when we have theory to go on and to prove or disprove in order to make
progress?

3. If neurons have directional axons, and we know that all of the inner
components of language can be modeled in terms of nothing but directional
links and nodes, then might not the entire nervous system be modeled in
precisely the same way? Wouldn't it be better to explore neurobiology with
these facts in mind in order to prove or disprove this hypothesis instead
of just groping in the dark?

4. If language is really a thing of such great complexity, and yet parrots
are found to have the ability to reduce spoken words to the activation of
single neurons or small clusters of neurons, then doesn't this indicate
that ancestors of birds must have once used language for communication?

Etc., etc.
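
The vocabulary figure in point 1 can be checked with a line of arithmetic. The sketch assumes each of the 16 connections is simply on or off and that the all-off pattern is reserved for "no word"; both are assumptions, since the post does not say how the patterns are encoded.

```python
SENSORS = 16  # number of connections named in point 1

patterns = 2 ** SENSORS      # every on/off combination, including all-off
usable_words = patterns - 1  # assumption: all-off means "no word"

print(patterns, usable_words)  # 65536 65535
```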

So as you can see, I have many reservations about "science," not all of
which even begins to be good science. Drastic changes are in order. We need to
wake up and get with it and figure out what is REALLY happening in the real
world even if it turns out to conflict with our religious or Darwinian
ideals.

--Chaumont (Joe) Devin.
