unified distributive inference engine


YKY (Yan King Yin, 甄景贤)

Jul 6, 2011, 9:01:54 AM
to ope...@googlegroups.com
Hi Joel and others,

As I've mentioned before, I have designed a distributive architecture that is equally applicable to the inference algorithms in PLN, NARS, and Genifer.  Some partners in my group and I would like to start implementing it, but I'm wondering whether this would duplicate the efforts of Joel and others, since I've heard Joel has been planning a distributive version of PLN too.

I have refactored the inference algorithm, which allowed me to see that it would fit NARS, PLN, and Genifer, and even binary-logic engines like Cyc and Ayane.

So I hope we can develop the distributive version together...

-- 
KY
Notice:  50% of all my income derived from AGI will be instantaneously donated to charity.

"The ultimate goal of mathematics is to eliminate any need for intelligent thought" -- Alfred North Whitehead

Joel Pitt

Jul 6, 2011, 9:37:18 AM
to ope...@googlegroups.com
Hi YKY,

My current efforts are related to embodied intelligence on a smaller
scale (inside virtual learning environments), but I certainly have a
lot of ideas about distributive architecture. For something like a
distributed PLN, matching the architecture you previously shared with
me, I would suggest using Erlang for its built-in pattern matching
and message-passing behaviour.

Unfortunately, my current work demands mean I can't contribute
directly to such an effort, but I'm happy to advise on distributed db
topics such as CAP, ACID, the utility problem and other aspects of
such a system.

Cheers,

Joel Pitt, PhD | http://ferrouswheel.me | +852 60744202
M-Lab AI Project and OpenCog Developer | http://opencog.org



YKY (Yan King Yin, 甄景贤)

Jul 6, 2011, 11:20:51 AM
to ope...@googlegroups.com
On Wed, Jul 6, 2011 at 9:37 PM, Joel Pitt <jo...@opencog.org> wrote:
 
> My current efforts are related to embodied intelligence on a smaller
> scale (inside virtual learning environments), but I certainly have a
> lot of ideas about distributive architecture. For something like a
> distributed PLN, matching the architecture you previously shared with
> me, I would suggest using Erlang for its built-in pattern matching
> and message-passing behaviour.

Yeah, I agree, Erlang is great for this type of app.

But I have a partner, Sandeep, who prefers Haskell.  And another new partner, William, is a Haskell guru who also seems competent in many other languages, including Erlang.

> Unfortunately, my current work demands mean I can't contribute
> directly to such an effort, but I'm happy to advise on distributed db
> topics such as CAP, ACID, the utility problem and other aspects of
> such a system.

Thanks, I've looked up CAP and ACID.  What do you mean by "utility"?

Another option is to build the inference engine on top of an existing distributed database app with MapReduce.  That may be more "agile" and easier.  I've just briefly looked at MapReduce and it seems possible to fit my distributive algorithm in it;  though I'll have to verify more carefully...

KY

Joel Pitt

Jul 7, 2011, 4:26:51 AM
to ope...@googlegroups.com
2011/7/6 YKY (Yan King Yin, 甄景贤) <generic.in...@gmail.com>:
>> My current efforts are related to embodied intelligence on a smaller
>> scale (inside virtual learning environments), but I certainly have a
>> lot of ideas about distributive architecture. For something like a
>> distributed PLN, matching the architecture you previously shared with
>> me, I would suggest using Erlang for its built-in pattern matching
>> and message-passing behaviour.
>
> Yeah, I agree, Erlang is great for this type of app.
> But I have a partner, Sandeep, who prefers Haskell.  And another new partner
> William is a Haskell guru but he seems competent in many other languages
> including Erlang.

Although I'm less familiar with Haskell, it's also a good choice. The
reason I like Erlang, however, is that message passing happens
transparently between network nodes, i.e. the entire language is based
on message passing and it lets you (for the most part) forget whether
you are sending to a local process or to a process across the network.
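
A minimal sketch of that location transparency, written in Python rather than Erlang and with the network replaced by an in-process byte "wire". The class and function names (`LocalMailbox`, `RemoteMailbox`, `send`) are hypothetical and only illustrate the idea that the sender uses one primitive regardless of where the receiver lives:

```python
import json
import queue


class LocalMailbox:
    """Mailbox of a process living in the same runtime as the sender."""

    def __init__(self):
        self.inbox = queue.Queue()

    def deliver(self, message):
        self.inbox.put(message)


class RemoteMailbox:
    """Stand-in for a process on another node (assumption: no real network).

    The message is serialised to bytes and back, as it would be on a wire,
    before being handed to the target mailbox.
    """

    def __init__(self, target):
        self._target = target

    def deliver(self, message):
        wire_bytes = json.dumps(message).encode("utf-8")
        self._target.deliver(json.loads(wire_bytes.decode("utf-8")))


def send(address, message):
    """The sender's only primitive; it never checks where `address` lives."""
    address.deliver(message)


local = LocalMailbox()
far_side = LocalMailbox()            # pretend this queue is on another machine
remote = RemoteMailbox(far_side)

for address in (local, remote):      # identical call for both destinations
    send(address, {"goal": "flies(tweety)"})
```

In Erlang the runtime plays the role of `RemoteMailbox`, so application code sees only the equivalent of `send`.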

>> Unfortunately, my current work demands mean I can't contribute
>> directly to such an effort, but I'm happy to advise on distributed db
>> topics such as CAP, ACID, the utility problem and other aspects of
>> such a system.
>
> Thanks, I've looked up CAP and ACID.  What do you mean by "utility"?

By utility, I mean the "utility problem" of machine learning where the
system can slow down or get worse with more knowledge available:

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.50.958

> Another option is to build the inference engine on top of an existing
> distributed database app with MapReduce.  That may be more "agile" and
> easier.  I've just briefly looked at MapReduce and it seems possible to fit
> my distributive algorithm in it;  though I'll have to verify more
> carefully...

Yes, MapReduce would be one possibility, but it is generic and would
essentially be a way to query a distributed knowledge base for
existing facts.

It would certainly be worth thinking about the design of MapReduce as
inspiration though, with the Map part being inference and the Reduce
part aggregating the solutions from different inference paths.
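
A toy illustration of that split, in Python. Everything here is an assumption made for the example: the rule table, multiplying strengths in the Map step, and keeping the maximum in the Reduce step are placeholders, not PLN, NARS, or Genifer truth-value formulas.

```python
from collections import defaultdict

# Hypothetical single-premise rules: (premise, conclusion, rule strength).
RULES = [
    ("bird", "flies", 0.9),
    ("bird", "has_feathers", 0.99),
    ("plane", "flies", 0.95),
]


def map_infer(fact):
    """Map step: apply every matching rule to one fact.

    Yields one (conclusion, strength) pair per inference path.
    """
    name, strength = fact
    for premise, conclusion, rule_strength in RULES:
        if premise == name:
            yield conclusion, strength * rule_strength


def reduce_merge(conclusion, strengths):
    """Reduce step: merge the paths that reached the same conclusion.

    Taking the maximum is an assumed placeholder for a real revision rule.
    """
    return conclusion, max(strengths)


def run(facts):
    grouped = defaultdict(list)          # the "shuffle": group by conclusion
    for fact in facts:
        for conclusion, strength in map_infer(fact):
            grouped[conclusion].append(strength)
    return dict(reduce_merge(c, s) for c, s in grouped.items())


results = run([("bird", 1.0), ("plane", 0.5)])
print(results)
```

Here "flies" is reached along two paths (from "bird" and from "plane"), and the Reduce step is where those two estimates get combined into one.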

J

Abram Demski

Jul 18, 2011, 1:52:54 AM
to ope...@googlegroups.com
Joel,

I was curious: how would distributed code written in languages like Erlang interface with the existing OpenCog code? I was under the impression that making OpenCog more distributed would mostly be a matter of extending the C++ code.

--Abram

Ben Goertzel

Jul 18, 2011, 11:30:29 AM
to ope...@googlegroups.com
One idea that's been bandied about is to replace the current C++
Atomspace and Scheduler and some related code with Erlang, probably
living behind the current C++ interfaces ...

We have no resources to dedicate to this at the moment, so this is not
a near-term plan, just an idea we think would be worth exploring...

ben g

--
Ben Goertzel, PhD
CEO, Novamente LLC and Biomind LLC
CTO, Genescient Corp
Vice Chairman, Humanity+
Adjunct Professor of Cognitive Science, Xiamen University, China
Advisor, Singularity University and Singularity Institute
b...@goertzel.org

"My humanity is a constant self-overcoming" -- Friedrich Nietzsche

YKY (Yan King Yin, 甄景贤)

Jul 18, 2011, 11:10:08 PM
to ope...@googlegroups.com
On Mon, Jul 18, 2011 at 11:30 PM, Ben Goertzel <b...@goertzel.org> wrote:

> One idea that's been bandied about is to replace the current C++
> Atomspace and Scheduler and some related code with Erlang, probably
> living behind the current C++ interfaces ...

I guess Haskell can interface with C/C++ very easily (haven't tried it though).  Cloud Haskell is a library that provides Erlang-like message-passing functions.

I'm currently exploring that possibility...

Also, in my distributive architecture, the BIT-tree in PLN backward chaining can be realized implicitly via message-passing, i.e., there is no need to represent the proof tree explicitly as a data structure.  That can be a great simplification IMO...
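
To make the idea concrete, here is a rough Python sketch, not the actual design. It assumes propositional Horn rules with no variables, no truth values and no cycles, and it simulates the processes with a single mailbox. Each goal process remembers only its own remaining work and the address of whoever asked; the proof tree is never built, it exists only as the chain of reply addresses.

```python
from collections import deque
from itertools import count

# Hypothetical knowledge base: known facts, and for each head a list of
# alternative rule bodies (each body is a list of subgoals).
FACTS = {"a", "b"}
RULES = {"c": [["a", "b"]], "d": [["x"], ["c"]]}


def prove(goal):
    mailbox = deque()     # pending messages: (kind, process id, result)
    procs = {}            # process table standing in for independent processes
    ids = count()
    verdict = []

    def spawn(subgoal, parent):
        pid = next(ids)
        procs[pid] = {"goal": subgoal, "parent": parent,
                      "alts": [list(body) for body in RULES.get(subgoal, [])],
                      "todo": []}
        mailbox.append(("start", pid, None))

    def reply(pid, ok):
        parent = procs.pop(pid)["parent"]   # the process ends after replying
        if parent is None:
            verdict.append(ok)
        else:
            mailbox.append(("result", parent, ok))

    def next_alternative(pid):
        proc = procs[pid]
        if not proc["alts"]:
            reply(pid, False)               # every rule body has failed
            return
        proc["todo"] = proc["alts"].pop(0)
        next_subgoal(pid)

    def next_subgoal(pid):
        proc = procs[pid]
        if proc["todo"]:
            spawn(proc["todo"].pop(0), pid)
        else:
            reply(pid, True)                # whole body proved

    spawn(goal, None)
    while mailbox:
        kind, pid, ok = mailbox.popleft()
        if kind == "start":
            if procs[pid]["goal"] in FACTS:
                reply(pid, True)
            else:
                next_alternative(pid)
        elif ok:                            # a subgoal succeeded
            next_subgoal(pid)
        else:                               # a subgoal failed: try another rule
            next_alternative(pid)
    return verdict[0]
```

For example, `prove("d")` first tries the body `["x"]`, receives a failure reply, then succeeds through `["c"]`, which in turn asks for `"a"` and `"b"`.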

KY

Joel Pitt

Jul 20, 2011, 9:32:03 AM
to ope...@googlegroups.com
As Ben mentioned, the idea was to place Erlang inside the core of
OpenCog and have C++ interfaces to allow existing code to continue
working with minimal changes.

However, I'm currently not sure how much benefit we'll get just from
doing that. I currently think that architecturally changing the code
so that it works well distributed won't be easy. We really need to define
the right programming paradigms for a distributed context and port or
rewrite the existing code to fit them. Given the size of the codebase,
this could be a mammoth task and I'd be more inclined to start from a
clean slate and gradually port or integrate modules from legacy
OpenCog (part of this is also because it seems like there are large
parts of it that are not useful anymore). I would probably make a case
for rewriting many of them in Python/Cython.

Just making a distributed AtomSpace is a simpler task, but I've become
convinced there is less utility to doing that when we can easily get
servers with 12 GB+ of RAM.

As Ben said, this isn't a near-term plan at all, just me hypothesising
after getting somewhat frustrated with dealing with C++ code which
isn't very elegantly designed[1]


[1] Which I think is partly due to it being in C++... which makes the
overhead of creating new classes/functions etc enough to lead to
bloated methods/classes. Especially if a team is on a deadline.

Joel Pitt, PhD | http://ferrouswheel.me | +852 60744202
M-Lab AI Project and OpenCog Developer | http://opencog.org

Ben Goertzel

Jul 20, 2011, 9:47:21 AM
to ope...@googlegroups.com
I would add that Joel and I have significantly different opinions on
some aspects of this topic, at present. I personally doubt it will be
necessary (or near-necessary) to "start from a clean slate" to make
OpenCog work well on a massively distributed scale. It was designed
with extensibility to distributed processing in mind, from the start.

However, I don't think it would be wise for Joel or me or others to
divert a lot of attention to discussing the details of this issue at
the present time -- in the absence of resources (in the form of $$ or
volunteered programmer time) to carry out a massive rewrite...

I am not opposed to rewriting OpenCog from scratch eventually -- the
key point is the ideas, algorithms and structures, not the code.

However, I believe there is a lot to be learned from working with and
extending the current codebase on single powerful machines, or small
networks of distributed machines. I think we can potentially get all
the key cognitive algorithms of OpenCog working together in this way,
which is really the main problem in the way of achieving AGI (IMO)....
Once this is done, we may or may not wish to re-code various portions
(or the whole thing), for use on distributed or parallel processors or
for other reasons based on what we've learned by that point... But
for now, I really feel the current codebase is quite appropriate for
getting to the point of demonstrated human-like cognition in various
simple situations, via cognitive synergy between learning algorithms
associated with different memory types...

-- Ben


Joel Pitt

Jul 20, 2011, 10:38:55 AM
to ope...@googlegroups.com
On 20 July 2011 21:32, Joel Pitt <joel...@gmail.com> wrote:
> As Ben said, this isn't a near-term plan at all, just me hypothesising
> after getting somewhat frustrated with dealing with C++ code which
> isn't very elegantly designed[1]

I want to add that this is only in certain parts of the codebase, and the
Hong Kong team is rapidly improving these! Most parts of the C++ code
*are* well designed but for optimal utilisation of resources in a
distributed architecture, they would of course have to be slightly
reworked.

We also won't really know how exactly a distributed version should be
designed until an integrated single instance is capable of at least
quasi-intelligent behaviour. Until then, a distributed version of
OpenCog is mostly conjecture.

On the road to that point however, experiments like YKY's distributed
reasoning engine, along with existing distributed graph algorithms and
other horizontally scalable data processing engines (like Disco + Pig)
will be worth watching.

Matt Mahoney

Jul 22, 2011, 11:33:29 AM
to ope...@googlegroups.com, general-intelligence
I don't think that distributing OpenCog would require a substantial redesign. You might be familiar with my distributed AGI design, which has been posted for a while at http://mattmahoney.net/agi2.html

Briefly, the design consists of lots of specialists and a distributed index for routing natural language messages to the right experts. The design provides economic incentives for independent owners of knowledge bases and computing resources to establish a reputation of providing intelligent, reliable, and useful services in a hostile environment. The protocol supports this by providing a secure means of identifying message authorship and by the absence of a global mechanism for editing or deleting data once it has been posted to the global pool.

I haven't worked on the OpenCog code, but it seems to me that its goals are already in line with what is required of network peers. To distribute OpenCog, you would make multiple copies and have each peer specialize in a different domain of expertise. Each peer would be responsible for understanding incoming messages related to its specialty, ignoring everything else, and responding intelligently either by answering the question or forwarding it to peers that it knows are experts on related topics. By specializing, this would reduce the computational load so that each expert could run on a single computer.
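
The answer-or-forward behaviour described above might look roughly like the following Python sketch. The `Peer` class, the one-topic-per-peer simplification, the referral table, and the hop limit are all assumptions made for illustration; the actual design routes natural language messages rather than topic labels.

```python
class Peer:
    """A specialist that answers its own topic and forwards the rest."""

    def __init__(self, name, specialty):
        self.name = name
        self.specialty = specialty   # the single topic this peer handles
        self.referrals = {}          # topic -> Peer believed to be an expert

    def handle(self, topic, question, hops_left=3):
        if topic == self.specialty:
            return f"{self.name} answers: {question}"
        expert = self.referrals.get(topic)
        if expert is None or hops_left == 0:
            return None              # not my specialty, nobody to ask: ignore
        return expert.handle(topic, question, hops_left - 1)


logic = Peer("logic-peer", "logic")
language = Peer("language-peer", "language")
vision = Peer("vision-peer", "vision")

# Each peer knows only a few others, so a message may need several hops.
logic.referrals["vision"] = language
language.referrals["vision"] = vision

reply = logic.handle("vision", "what is in this image?")
```

The hop limit is there only to keep a badly configured referral loop from forwarding forever.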

I am cross posting to the GI list because I know some people there have expressed interest in my design. To get started, I suggested an application called Mailpool which would work like an email client, but without a "to" box. You would post a message about anything to nobody in particular, and it would be routed to anyone who cares. At first, it would predict your preferences by keyword matching to messages you have posted previously. There are smarter ways to do this, obviously, which is what makes the problem interesting, and something that OpenCog could be tasked to solve.
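
A minimal sketch of that first, naive keyword-matching step, in Python. The stop-word list, the overlap score, and the threshold are assumptions chosen for the example, not part of the Mailpool proposal:

```python
import re
from collections import Counter

# Assumed tiny stop-word list so that common words do not count as interests.
STOPWORDS = {"a", "an", "the", "about", "for", "to", "of", "and", "in", "on"}


def keywords(text):
    """Lower-cased word counts with stop words removed."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def overlap(message, past_posts):
    """Number of keyword occurrences shared by a message and a user's posts."""
    shared = keywords(message) & keywords(" ".join(past_posts))
    return sum(shared.values())


def route(message, history, threshold=1):
    """Users whose past posts overlap with the message, best match first."""
    scores = {user: overlap(message, posts) for user, posts in history.items()}
    interested = [user for user, score in scores.items() if score >= threshold]
    return sorted(interested, key=lambda user: -scores[user])


history = {
    "alice": ["Erlang message passing between nodes"],
    "bob": ["gardening tips for tomatoes"],
}
recipients = route("question about Erlang nodes", history)
```

With this history, the question about Erlang nodes goes to alice only, since bob's posts share no keywords with it.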

 
-- Matt Mahoney, matma...@yahoo.com

Matt Mahoney

Jul 22, 2011, 5:31:25 PM
to general-in...@googlegroups.com, OpenCog
swkane <diss...@gmail.com> wrote:
>I'm curious. You have supposedly been able to make '20x what you need to live on' but you haven't been able to get the resources to start building Mailpool? Due to the fact that you have brought up Mailpool and have linked to your proposal for it many times, I'm assuming it is important to you. If you don't have time yourself to work on it, perhaps all that extra money you have could be paid to a contractor to build it?

Good point. First, even if I were Bill Gates, I would not have enough money to build AGI. I think it will take decades of global effort to make machines smart enough to do what we now pay people worldwide US$60 trillion per year to do. Sorry if I don't think there is a quick solution to the problem. You have seen my cost estimate in my proposal.

But there are still things we can do. Nobody had the resources to build the web, but it only took one person 6 weeks to write the Mosaic browser and Apache server, which were the first implementations of HTTP and HTML. Which brings me to my second point. Any good programmer could probably implement the Mailpool protocol. But peers also have to be intelligent. That's a much harder problem. I don't think there are a lot of people on this planet who could do it well. If there are, they are probably here. But anyone who is smart enough to work on it (and see the benefits of doing so) is also going to see problems with my proposal and want to have some input on the design. That is why I am discussing it here.

-- Matt Mahoney, matma...@yahoo.com

swkane

Jul 22, 2011, 1:36:29 PM
to general-in...@googlegroups.com, ope...@googlegroups.com
Matt,

I'm curious. You have supposedly been able to make '20x what you need to live on' but you haven't been able to get the resources to start building Mailpool? Due to the fact that you have brought up Mailpool and have linked to your proposal for it many times, I'm assuming it is important to you. If you don't have time yourself to work on it, perhaps all that extra money you have could be paid to a contractor to build it?

Steven