My current efforts are related to embodied intelligence on a smaller
scale (inside virtual learning environments), but I certainly have a
lot of ideas about distributed architecture. For something like a
distributed PLN, matching the architecture you previously shared with
me, I would suggest using Erlang for its built-in pattern matching
and message-passing behaviour.
Unfortunately, my current work demands mean I can't contribute
directly to such an effort, but I'm happy to advise on distributed db
topics such as CAP, ACID, the utility problem and other aspects of
such a system.
Cheers,
Joel Pitt, PhD | http://ferrouswheel.me | +852 60744202
M-Lab AI Project and OpenCog Developer | http://opencog.org
2011/7/6 YKY (Yan King Yin, 甄景贤) <generic.in...@gmail.com>:
Although I'm less familiar with Haskell, it's also a good choice. The
reason I like Erlang, however, is that message passing happens
transparently between network nodes. That is, the entire language is based
on message passing, and it lets you (for the most part) forget whether
you are sending to a local process or to a process across the network.
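The location transparency Joel describes can be illustrated with a minimal sketch in Python (all names here are hypothetical; Erlang's runtime gives you this for free, whereas here the remote case is faked with a proxy object):

```python
import queue

# Minimal actor registry: callers address actors by name and never need
# to know whether the mailbox is local or on another node. In Erlang
# this transparency is built into the runtime; here it is simulated.

class Actor:
    def __init__(self):
        self.mailbox = queue.Queue()

    def send(self, msg):
        self.mailbox.put(msg)

    def receive(self):
        return self.mailbox.get()

class RemoteActorProxy:
    """Stand-in for an actor on another node: same send() interface,
    but the message would be serialised and shipped over the network."""
    def __init__(self, node, name):
        self.node, self.name = node, name
        self.sent = []  # placeholder for a real network connection

    def send(self, msg):
        self.sent.append((self.node, self.name, msg))

registry = {
    "local_pln": Actor(),
    "remote_pln": RemoteActorProxy("node2@cluster", "pln"),
}

def send(name, msg):
    # Calling code is identical for local and remote targets.
    registry[name].send(msg)

send("local_pln", ("infer", "InheritanceLink A B"))
send("remote_pln", ("infer", "InheritanceLink B C"))
```

The point is that `send()` has one call site shape regardless of where the target lives, which is exactly what Erlang's `!` operator gives you between nodes.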
>> Unfortunately, my current work demands mean I can't contribute
>> directly to such an effort, but I'm happy to advise on distributed db
>> topics such as CAP, ACID, the utility problem and other aspects of
>> such a system.
>
> Thanks, I've looked up CAP and ACID. What do you mean by "utility"?
By utility, I mean the "utility problem" of machine learning where the
system can slow down or get worse with more knowledge available:
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.50.958
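A toy illustration of the utility problem (hypothetical rule format, not OpenCog code): with naive matching, every stored rule adds to the cost of answering a query, so a larger knowledge base can make the system slower without making its answers any better.

```python
# Naive forward matching: query cost grows linearly with the number of
# stored rules, so adding knowledge can slow the system down even when
# the extra rules never contribute to the answer.
def matches(query, rules):
    # Each rule is a (premise, conclusion) pair; scan all of them.
    return [concl for prem, concl in rules if prem == query]

small_kb = [("bird(x)", "flies(x)")]
big_kb = small_kb + [("dummy%d" % i, "noop") for i in range(100000)]

# Both calls return the same answer, but the second scans 100001 rules.
assert matches("bird(x)", small_kb) == matches("bird(x)", big_kb) == ["flies(x)"]
```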
> Another option is to build the inference engine on top of an existing
> distributed database app with MapReduce. That may be more "agile" and
> easier. I've just briefly looked at MapReduce and it seems possible to fit
> my distributive algorithm in it; though I'll have to verify more
> carefully...
Yes, MapReduce would be one possibility, but it is generic and would
essentially be a way to query a distributed knowledge base for
existing facts.
It would certainly be worth thinking about the design of MapReduce as
inspiration though, with the Map part being inference and the Reduce
part aggregating the solutions based on different inference paths.
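The shape described above can be sketched as follows (the premise strings and toy rule table are hypothetical stand-ins for a real inference engine, not actual PLN rules):

```python
from functools import reduce
from itertools import chain

# MapReduce-shaped inference: the map step expands one inference path
# per premise, the reduce step merges conclusions reached by different
# paths, keeping the highest confidence seen for each statement.

def infer(premise):
    """Map: expand one premise into (conclusion, confidence) pairs."""
    rules = {
        "A->B": [("A->C", 0.8)],
        "B->C": [("A->C", 0.6), ("B->D", 0.9)],
    }
    return rules.get(premise, [])

def merge(acc, pair):
    """Reduce: aggregate paths that reached the same conclusion."""
    conclusion, conf = pair
    acc[conclusion] = max(acc.get(conclusion, 0.0), conf)
    return acc

premises = ["A->B", "B->C"]
mapped = chain.from_iterable(infer(p) for p in premises)  # map phase
conclusions = reduce(merge, mapped, {})                   # reduce phase
# conclusions == {"A->C": 0.8, "B->D": 0.9}
```

Here "A->C" is reached by two inference paths and the reduce step reconciles them; in a real system the merge rule would be a proper revision formula rather than max().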
J
We have no resources to dedicate to this at the moment, so this is not
a near-term plan, just an idea we think would be worth exploring...
ben g
--
Ben Goertzel, PhD
CEO, Novamente LLC and Biomind LLC
CTO, Genescient Corp
Vice Chairman, Humanity+
Adjunct Professor of Cognitive Science, Xiamen University, China
Advisor, Singularity University and Singularity Institute
b...@goertzel.org
"My humanity is a constant self-overcoming" -- Friedrich Nietzsche
One idea that's been bandied about is to replace the current C++
Atomspace and Scheduler and some related code with Erlang, probably
living behind the current C++ interfaces ...
However, I'm currently not sure how much benefit we'll get just from
doing that. I currently think that architecturally changing the code
so that it works well distributed won't be easy. We really need to define
the right programming paradigms for a distributed context and port or
rewrite the existing code to fit them. Given the size of the codebase,
this could be a mammoth task and I'd be more inclined to start from a
clean slate and gradually port or integrate modules from legacy
OpenCog (part of this is also because it seems like there are large
parts of it that are not useful anymore). I would probably make a case
for rewriting many of them in Python/Cython.
Just making a distributed AtomSpace is a simpler task, but I've become
convinced there is less utility to doing that when we can easily get
servers with 12 GB+ of RAM.
As Ben said, this isn't a near-term plan at all, just me hypothesising
after getting somewhat frustrated with dealing with C++ code which
isn't very elegantly designed.[1]
[1] Which I think is partly due to it being in C++... which makes the
overhead of creating new classes/functions etc enough to lead to
bloated methods/classes. Especially if a team is on a deadline.
However, I don't think it would be wise for Joel or me or others to
divert a lot of attention to discussing the details of this issue at
the present time -- in the absence of resources (in the form of $$ or
volunteered programmer time) to carry out a massive rewrite...
I am not opposed to rewriting OpenCog from scratch eventually -- the
key point is the ideas, algorithms and structures, not the code.
However, I believe there is a lot to be learned from working with and
extending the current codebase on single powerful machines, or small
networks of distributed machines. I think we can potentially get all
the key cognitive algorithms of OpenCog working together in this way,
which is really the main problem in the way of achieving AGI (IMO)....
Once this is done, we may or may not wish to re-code various portions
(or the whole thing), for use on distributed or parallel processors or
for other reasons based on what we've learned by that point... But
for now, I really feel the current codebase is quite appropriate for
getting to the point of demonstrated human-like cognition in various
simple situations, via cognitive synergy between learning algorithms
associated with different memory types...
-- Ben
I want to add that this is only in certain parts of the codebase, and the
Hong Kong team is rapidly improving these! Most parts of the C++ code
*are* well designed, but for optimal utilisation of resources in a
distributed architecture they would of course have to be slightly
reworked.
We also won't really know how exactly a distributed version should be
designed until an integrated single instance is capable of at least
quasi-intelligent behaviour. Until then, a distributed version of
OpenCog is mostly conjecture.
On the road to that point however, experiments like YKY's distributed
reasoning engine, along with existing distributed graph algorithms and
other horizontally scalable data processing engines (like Disco + Pig)
will be worth watching.
Briefly, the design consists of lots of specialists and a distributed index for routing natural language messages to the right experts. The design provides economic incentives for independent owners of knowledge bases and computing resources to establish a reputation of providing intelligent, reliable, and useful services in a hostile environment. The protocol supports this by providing a secure means of identifying message authorship and by the absence of a global mechanism for editing or deleting data once it has been posted to the global pool.
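The two protocol properties mentioned above, verifiable message authorship and the absence of any global edit/delete, can be sketched like this (HMAC is used here only as a stand-in for real public-key signatures, and all names are hypothetical):

```python
import hashlib
import hmac

# Append-only message pool with authenticated authorship. HMAC is a
# stand-in for public-key signatures: with HMAC the verifier must share
# the key, so a real deployment would use Ed25519 or similar instead.

POOL = []  # global pool: messages may be appended, never edited/removed

def sign(key: bytes, body: str) -> str:
    return hmac.new(key, body.encode(), hashlib.sha256).hexdigest()

def post(author: str, key: bytes, body: str) -> None:
    POOL.append({"author": author, "body": body, "sig": sign(key, body)})

def verify(msg: dict, key: bytes) -> bool:
    return hmac.compare_digest(msg["sig"], sign(key, msg["body"]))

alice_key = b"alice-secret"
post("alice", alice_key, "Who knows about distributed PLN?")

assert verify(POOL[0], alice_key)       # authorship checks out
tampered = dict(POOL[0], body="spam")
assert not verify(tampered, alice_key)  # tampering is detectable
```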
I haven't worked on the OpenCog code, but it seems to me that its goals are already in line with what is required of network peers. To distribute OpenCog, you would make multiple copies and have each peer specialize in a different domain of expertise. Each peer would be responsible for understanding incoming messages related to its specialty, ignoring everything else, and responding intelligently either by answering the question or forwarding it to peers that it knows are experts on related topics. By specializing, this would reduce the computational load so that each expert could run on a single computer.
I am cross posting to the GI list because I know some people there have expressed interest in my design. To get started, I suggested an application called Mailpool which would work like an email client, but without a "to" box. You would post a message about anything to nobody in particular, and it would be routed to anyone who cares. At first, it would predict your preferences by keyword matching to messages you have posted previously. There are smarter ways to do this, obviously, which is what makes the problem interesting, and something that OpenCog could be tasked to solve.
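The first-cut routing described above, predicting which peers care about a message by keyword overlap with what they have posted before, might look like this minimal sketch (peer names and posts are invented for illustration):

```python
# First-cut Mailpool routing: a message addressed to "nobody in
# particular" is delivered to peers whose past posts share keywords
# with it. Smarter matching could be substituted for keywords().

def keywords(text):
    return set(text.lower().split())

peer_history = {
    "pln_expert":  ["probabilistic logic inference", "pln truth values"],
    "vision_peer": ["image segmentation", "object recognition pixels"],
}

def route(message, history, min_overlap=1):
    """Return peers whose past posts share enough keywords with message."""
    kw = keywords(message)
    recipients = []
    for peer, posts in history.items():
        seen = set().union(*(keywords(p) for p in posts))
        if len(kw & seen) >= min_overlap:
            recipients.append(peer)
    return recipients

print(route("question about pln inference", peer_history))
# -> ['pln_expert']
```

Replacing the keyword matcher with something that actually understands the message is exactly the slot where an OpenCog peer would fit.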
-- Matt Mahoney, matma...@yahoo.com
Good point. First, if I was Bill Gates, I would not have enough money to build AGI. I think it will take decades of global effort to make machines smart enough to do what we now pay people worldwide US$60 trillion per year to do. Sorry if I don't think there is a quick solution to the problem. You have seen my cost estimate in my proposal.
But there are still things we can do. Nobody had the resources to build the web, but it only took one person 6 weeks to write the Mosaic browser and Apache server, which were the first implementations of HTTP and HTML. Which brings me to my second point. Any good programmer could probably implement the Mailpool protocol. But peers also have to be intelligent. That's a much harder problem. I don't think there are a lot of people on this planet who could do it well. If there are, they are probably here. But anyone who is smart enough to work on it (and see the benefits of doing so) is also going to see problems with my proposal and want to have some input on the design. That is why I am discussing it here.
-- Matt Mahoney, matma...@yahoo.com