OK, let me try to clarify more thoroughly...
Firstly, I take as the premise of my discussion here that we are
building an AGI system which has explicit, abstract logical inference
as a significant component (e.g. PLN). If you want to argue for an
AGI path that is purely subsymbolic, then I'm not going to dispute the
viability of such a path, but I don't think it's optimal and anyway my
suggestion of Lojban for OpenCog is premised on the fact that PLN is a
big component of OpenCog...
The question then is how to map natural-language relationships into
logical relationships.... Four approaches are obvious given current
technologies:
1) Hand-code mapping rules in some form
2) Learn mapping rules via supervised learning, from a training corpus
3) Learn mapping rules via unsupervised learning, from e.g. a big
corpus of texts or speech
4) Learn mapping rules via an embodied system's experience, i.e. via
reinforcement and imitation learning combined with unsupervised
learning
...
(4) is obviously appealing to me. For (4) to work one probably needs
to hand-code mappings from nonlinguistic perception (e.g. vision,
audition) into logical representation, but this is perhaps less
problematic than hand-coding mappings from language into logic,
because vision and audition have simpler structures, in a sense.
Without hand-coded mappings from nonlinguistic perception into logic,
it's hard to see how (4) would work *unless* one was also willing to
have the logic itself emerge via reinforcement / imitation /
unsupervised learning. That is, unless one was willing to give up
starting from a fixed logic like PLN and let the logic be learned....
I think this is possible but IMO it gets into "evolution of a brain
architecture" territory rather than "learning within a brain
architecture" territory...
What I am hoping to do is seed (4) with a combination of (1) and (3).
Specifically, regarding (3), Linas and I already wrote a paper
pointing in the direction of what we want to do....
https://arxiv.org/abs/1401.3372
However, at the moment I don't personally see how that approach is
going to let us learn something analogous to the RelEx2Logic rules. I
think it can let us learn something analogous to the link parser
grammar plus the RelEx rules. But I don't see how the unsupervised
learning paradigm we describe there is going to learn rules that
connect to PLN logic specifically.... I can sorta imagine how this
might happen, but it seems really hard...
So then we could use our unsupervised learning method for (3) and then
do (4) just for learning R2L rules. That might be viable....
However, Lojban seems to me like it could yield a robust way of doing
(1), which could potentially accelerate the overall process of making
an AGI that really understands language...
Our current R2L rule-base is kind of a mess and is also very
incomplete. So if we're going to do practical NLP dialogue
applications with OpenCog in the near future we need to either
extend/improve R2L or replace it. Taking approach (4) or "(4) on top
of (3)" is too researchy and difficult to be relevant to near-term
application development, though it's an important research direction..
The value of Lojban for an R2L-type layer is based on the facts that
A) Lojban directly maps into predicate logic, so into PLN-friendly Atomese
B) Lojban expresses everything that natural language expresses, in
ways that are reasonably elegant and already worked-out by other
people, and honed by decades of practice
On the other hand, the current system of R2L outputs is kind of
unsystematic and messy... and turning it into something elegant and
coherent would be a lot of work...
C) by generating Relex2Lojban or LinkGrammar2Lojban rules from a
parallel English/Lojban corpus, one avoids hand-coding any rules...
instead one can use this sample corpus to generate an R2L-like layer
for any syntax parser, including one learned via (3) or (3)+(4)... or
Google's newly released parser... or whatever...
D) unlike hand-coding R2L rules, the approach is largely
language-independent (to extend the approach to a new language, one
only needs to create a parallel corpus pairing Lojban with that language)
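To make point C concrete, here is a minimal, hypothetical sketch of how mapping rules might be induced from a parallel corpus. The relation names, the two-sentence "corpus", and the word-alignment table are all invented for illustration; a real Relex2Lojban learner would of course need actual parsing and alignment machinery far beyond this:

```python
from collections import Counter, defaultdict

# Toy aligned corpus: each entry pairs dependency-style relations from an
# English parse with the Lojban predicate-argument structure of the same
# sentence. (Relation names and examples are invented for illustration.)
parallel_corpus = [
    # "I go to the market" / "mi klama le zarci"
    ([("subj", "go", "I"), ("to", "go", "market")],
     [("klama", 1, "mi"), ("klama", 2, "le zarci")]),
    # "She goes to the market" / "ko'a klama le zarci" (toy gloss)
    ([("subj", "go", "she"), ("to", "go", "store")],
     [("klama", 1, "ko'a"), ("klama", 2, "le zarci")]),
]

# Toy English -> Lojban word alignment (in practice this would itself
# be learned from the corpus).
word_align = {"I": "mi", "she": "ko'a", "market": "le zarci",
              "store": "le zarci"}

# Count how often each English relation's dependent word aligns with a
# given (Lojban predicate, place number) pair across sentence pairs.
rule_counts = defaultdict(Counter)
for eng_rels, loj_rels in parallel_corpus:
    for (erel, _, edep) in eng_rels:
        for (pred, place, arg) in loj_rels:
            if word_align.get(edep) == arg:
                rule_counts[erel][(pred, place)] += 1

# Keep, for each English relation, its most frequent Lojban mapping: a
# crude stand-in for learned English-parse -> Lojban-place rules.
rules = {erel: counts.most_common(1)[0][0]
         for erel, counts in rule_counts.items()}
print(rules)
```

The design point is that the rules fall out of co-occurrence statistics over the parallel corpus, so the same procedure applies unchanged to whatever syntax parser produced the English-side relations.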
...
Regarding B, please do not minimize this point. FrameNet doesn't do
this, CycL doesn't do this, SUMO doesn't do this... the system of R2L
outputs doesn't currently do this... Lojban does this...
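As a concrete illustration of point A: a Lojban bridi is already a predicate application, so mapping it into PLN-friendly Atomese is essentially transliteration. Here is a minimal sketch; the Atomese rendering is schematic (plain s-expression strings), not the exact output of any existing OpenCog pipeline:

```python
# A parsed Lojban bridi is a predicate (selbri) plus numbered argument
# places (sumti). "mi klama le zarci" ("I go to the market") is
# klama(x1=mi, x2=le zarci); klama's standard places are x1 goer,
# x2 destination, x3 origin, x4 route, x5 means.
bridi = {"selbri": "klama", "sumti": {1: "mi", 2: "le zarci"}}

def bridi_to_atomese(b):
    """Render a bridi as a schematic EvaluationLink s-expression.
    (Schematic only -- not the exact Atomese any OpenCog tool emits.)"""
    args = " ".join(
        f'(ConceptNode "{b["sumti"][i]}")' for i in sorted(b["sumti"])
    )
    return (f'(EvaluationLink (PredicateNode "{b["selbri"]}") '
            f"(ListLink {args}))")

print(bridi_to_atomese(bridi))
# The predicate and argument order come straight from the Lojban parse;
# no hand-written per-verb mapping rule is needed.
```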
I hope this long email at least conveys my line of thinking a bit better...
...
The point is not relabeling ConceptNodes with Lojban word-names
instead of English word-names. The point is that Lojban contains
B1) a more complete and commonsensical list of argument-structures for
verbs than FrameNet provides
B2) systematic, commonsensical ways of dealing with everyday uses of
time, space, conjunction, possession, comparisons, etc. etc. in formal
logic
It's not the Lojban word-names that matter, it's the precisely-stated
logical relationships between the Lojban words...
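To make B1 concrete: every Lojban gismu carries a fixed, numbered place structure, so "who does what to whom, where, how" is pinned down by position rather than by a per-verb frame inventory. A tiny sketch of what such a lexicon looks like, with place glosses paraphrased from the standard gismu definitions:

```python
# Three gismu and their fixed argument places (glosses paraphrased from
# the standard gismu list): klama "x1 goes to x2 from x3 via x4 by x5",
# dunda "x1 gives x2 to x3", vecnu "x1 sells x2 to x3 for price x4".
gismu_places = {
    "klama": ["goer", "destination", "origin", "route", "means"],
    "dunda": ["donor", "gift", "recipient"],
    "vecnu": ["seller", "goods", "buyer", "price"],
}

def place_of(gismu, n):
    """Return the role of the n-th place (1-indexed) of a gismu."""
    return gismu_places[gismu][n - 1]

print(place_of("klama", 2))
print(place_of("vecnu", 4))
```

Because these place structures are shared, fixed conventions of the language itself, a logic layer built on them inherits their systematicity for free.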
-- Ben