Re-imagining the AtomSpace

Amirouche Boubekki

Aug 17, 2021, 4:22:11 AM

to opencog

I am very eager to read more about the new AtomSpace design architecture.

I went through the wiki page at https://wiki.opencog.org/w/AtomSpace

There are still a few things I do not understand.

I will be very glad to be part of this endeavor wholly and fully.

Only the best,

Amirouche ~ https://hyper.dev

Amirouche Boubekki

Aug 17, 2021, 4:38:41 AM

to opencog

Is there a page that gathers all the publications regarding the new design?

Ben Goertzel

Aug 17, 2021, 4:24:47 PM

to opencog

https://wiki.opencog.org/w/Hyperon

On Tue, Aug 17, 2021 at 1:39 AM Amirouche Boubekki
<amirouche...@gmail.com> wrote:
>
> Is there a page that gathers all the publications regarding the new design ?
>

> --
> You received this message because you are subscribed to the Google Groups "opencog" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to opencog+u...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CAL7_Mo_7ihMrRCZY%2BCLwAZ5KcHGqbRqwQgpOb8pLz1qY9Rj%2BSw%40mail.gmail.com.

--
Ben Goertzel, PhD
http://goertzel.org

"He not busy being born is busy dying" -- Bob Dylan

Linas Vepstas

Aug 17, 2021, 6:40:33 PM

to opencog

I spent the last week trying to convince a group of Scheme enthusiasts to build a "database for s-expressions", which, after all is said and done, is all that the AtomSpace is. This idea went over like a lead balloon.

The primary stumbling block seems to be conceptual. People can visualize a database of rows and columns -- basically SQL -- very conventional. They can visualize a database of key-value pairs -- basically, NoSQL. Also very popular. The idea of a JSON database is now common enough. JSON is, after all, a nested, hierarchical key-value store, having the form {name1:value1, name2:value2, ...} -- you can think of name1, name2, ... as being like column labels, and (value1, value2, ...) as being rows. The biggest difference between tables and JSON is that tables are fixed-width, with fixed column labels, while JSON is free-form: every JSON expression carries its own labels. And, since it's hierarchical, each value can be another JSON expression, nesting arbitrarily deep. It's a labelled tree.

When I suggested that one can store just plain s-expressions -- i.e. just (value1, value2, ...) without the labels ... an unlabeled tree ... this seemed to make people's heads explode. So my efforts, it seems, were for naught. Perhaps I planted a seed, though.
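The difference between a labelled tree (JSON) and an unlabelled tree (s-expression) can be sketched in a few lines of Python. This is only an illustration: the nested-tuple encoding and the helper names `to_sexpr` and `strip_labels` are assumptions made for this example, not anything from the AtomSpace.

```python
import json

# A JSON document is a labelled tree: every value carries its own key.
labelled = {"predicate": "likes", "args": {"who": "Bob", "what": "baseball"}}

# An s-expression is an unlabelled tree: position, not a key, gives meaning.
unlabelled = ("likes", "Bob", "baseball")


def to_sexpr(tree):
    """Render a nested tuple as an s-expression string."""
    if isinstance(tree, tuple):
        return "(" + " ".join(to_sexpr(t) for t in tree) + ")"
    return str(tree)


def strip_labels(node):
    """Drop the keys of a nested dict, keeping only the values in order."""
    if isinstance(node, dict):
        return tuple(strip_labels(v) for v in node.values())
    return node


print(json.dumps(labelled))
print(to_sexpr(unlabelled))
print(to_sexpr(strip_labels(labelled)))
```

Stripping the labels keeps the shape of the tree; what is lost is only the per-value names, which is the point being made about s-expressions above.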

The goal of a generic, agnostic database of s-expressions is to overcome the marketing problem the AtomSpace has. If some other organization could explain to the world what that is, and provide agnostic, generic APIs, I think that would be a good thing. However, based on the cold reception I got, I'm thinking it may take another 10 years before the idea catches on.

(The reception I got was "why don't you use JSON?" so I had to explain the problem with the labels. Once that was clear, the next suggestion was "why don't you use Prolog/Datalog?" I tried to explain how Prolog is limited to crisp true/false values, and how Prolog does not allow trees - it's not hierarchical the way s-expressions and JSON are - but this argument did not seem to gain traction. Somehow, just saying "it's a database of s-expressions" is not enough to convey the idea. People stumble on this. And yet, that's all that it is...)

I'm saying this out loud, right here, right now, because if you are reading this, and you are thinking to yourself "I never quite understood what the AtomSpace is" -- well, it's that. It's a database of s-expressions. It's difficult to take the next step until this first basic idea becomes clear. I want this first, basic idea to become clear to everyone.

As to Hyperon -- Ben, I skimmed through everything written on Hyperon, and it seems (to me) like it could be "easily" implemented within the existing AtomSpace framework. I think this would be the right direction to move in, but I don't think that is possible until there is some sort of shared understanding about how things work, about how things could work, about what needs to be done. Reaching that shared understanding may require real work and hard thinking -- there's no magic wand of sudden enlightenment -- but it's doable. And it can be done with, ahhh, talking and email. I very strongly encourage discussion. Let the sun shine in.

-- Linas


--

Patrick: Are they laughing at us?

Sponge Bob: No, Patrick, they are laughing next to us.

Ben Goertzel

Aug 17, 2021, 7:24:36 PM

to opencog

> As to Hyperon -- Ben, I skimmed through everything written on Hyperon, and it seems (to me) like it could be "easily" implemented within the existing AtomSpace framework. I think this would be the right direction to move in, but I don't think that is possible until there is some sort of shared understanding about how things work, about how things could work, about what needs to be done. Reaching that shared understanding may require real work and hard thinking -- there's no magic wand of sudden enlightenment -- but its doable. And it can be done with, ahhh talking and email. I very strongly encourage discussion. Let the sun shine in.

OK thanks for skimming the materials over...

Alexey is soon (by end of August roughly, is the plan) going to
produce a "first full draft" document summarizing what he thinks we
need from "Atomese 2" language (which will probably get a wizzier and
more appropriate name soon). Assuming I agree w/ what he suggests
that will then be the time to have an in-depth discussion w you about
how it might be implemented w/in the current Atomspace framework

On roughly the same timescale, Senna will likely have a document
summarizing his current thoughts on distribution/persistence for
Hyperon, on which it also will be excellent to get your feedback ....

-- Ben

Anatoly Belikov

Aug 18, 2021, 6:32:30 AM

to ope...@googlegroups.com

What do you mean by Prolog not allowing trees?

On Wed, Aug 18, 2021 at 2:40 AM Linas Vepstas <linasv...@gmail.com> wrote:


Linas Vepstas

Aug 18, 2021, 12:36:49 PM

to opencog

Hi Ben,

On Tue, Aug 17, 2021 at 6:24 PM Ben Goertzel <b...@goertzel.org> wrote:

> need from "Atomese 2" language (which will probably get a wizzier and
> more appropriate name soon).

You're good at naming things. Naming things is hard.

> On roughly the same timescale, Senna will likely have a document
> summarizing his current thoughts on distribution/persistence for
> Hyperon, on which it also will be excellent to get your feedback ....

I can make some pre-emptive remarks now, based on what I saw in earlier drafts.

(1) UUIDs are a really bad idea. The problem is that issuing a UUID requires a central authority that will guarantee that the association between the UUID and the Atom is really unique and is the right one. It becomes hard to communicate with the central authority; it's a bottleneck.

Solution: Just use the Atom name. For example `(Evaluation (Predicate "foo") (List (Concept "bar")))`. This is globally unique: it's the same on Jupiter as anywhere else. It's easy to compute: there's nothing to do.

If you need to send this to a peer node, just send it as a string. If you want, then compress that string with some compression algorithm. Or invent a binary format. (I don't care.) There is a speed vs. CPU-time tradeoff. If you really want -- use an ID -- but that ID is for that one communications link only -- it's not "universally unique". The problem with IDs is that you need to keep a lookup table of ID vs. the actual Atom -- and this lookup table chews up RAM. It's a significant amount of RAM. And doing the lookup is slow -- it can be 30% of performance. That's because Atoms are so tiny, so fast, that doing almost anything at all will just make everything slower. So even very simple ideas, very simple concepts, make things slower. Performance is a straitjacket; there's just not a lot of wiggle room in what you can do without making things slower, fatter.

(2) Distributed, federated, decentralized .... Decouple the policy of what atoms are sent where from the mechanism of how they are sent. If the truth value (TV) on some Atom is being updated 1000 times a second, it makes no sense to try to broadcast it to the world at the same rate.

(3) Allow multiple policies to be used. Allow multiple channels to be used.

And finally:

(4) Senna and you may not like it, but points 1-3 are more-or-less done, already. The demos work. Run the demos. I mean -- they already do everything that was written down in the 2017 design documents. So, in a sense, there's "nothing more to do" to get a decentralized AtomSpace.
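Point (1) can be illustrated with a small Python sketch. It assumes Atoms are modelled as nested tuples; `atom_name` and `LinkLocalIds` are hypothetical names invented here, not AtomSpace APIs. The s-expression string is the name, and any numeric ID is valid only on one link and costs a lookup table.

```python
def atom_name(atom):
    """Serialize a nested-tuple atom to its s-expression string.

    The string itself serves as the globally unique name: two peers that
    build the same atom independently produce the same string.
    """
    if isinstance(atom, tuple):
        return "(" + " ".join(atom_name(a) for a in atom) + ")"
    return atom  # leaf: a type name or an already-quoted node name


class LinkLocalIds:
    """Hypothetical per-connection ID table (not an AtomSpace API).

    IDs are meaningful only on this one link; the two containers below
    are the extra RAM that the lookup table consumes.
    """

    def __init__(self):
        self.by_name = {}
        self.by_id = []

    def encode(self, name):
        # Assign the next free integer the first time a name is seen.
        if name not in self.by_name:
            self.by_name[name] = len(self.by_id)
            self.by_id.append(name)
        return self.by_name[name]

    def decode(self, ident):
        return self.by_id[ident]


atom = ("Evaluation", ("Predicate", '"foo"'), ("List", ("Concept", '"bar"')))
name = atom_name(atom)
link = LinkLocalIds()
print(name, link.encode(name))
```

Sending `name` as-is needs no shared state between peers; sending the integer requires both ends of the link to hold the table.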

The next steps are these:

(a) If you don't like the communications channels that are there, you can create new ones. I don't care. I think that what's there is "good enough for most purposes", but if you want to tune for something super-fancy .. sure. Do that. It can fit just fine in the existing API. I really doubt you'd need to change the API to get what you might want.

(b) Create some policies. A policy is just some Agent that passes some Atoms along from here to there. In what I do, I have not needed anything except for the most trivial policies, but certainly I can imagine getting fancy. This really would be new, cutting-edge work. But it's entirely detached from the question of "how do I send an Atom from A to B" (that's the commo layer) and "how do I keep an atom in RAM" (that's the core AtomSpace).

Working on (b) is what decentralized AtomSpaces are really about. That's where all the action is. That's the unexplored, unknown, interesting bit.

You can invent new commo layers - it's real easy; everything you need is there.

At this time, it would be hard to invent "a new way to store an Atom in RAM" ... and have it be API compatible. This can be done, I suppose, but it would be a challenge. I'm up for it, but it's hard to see that this is needed or useful. I mean - I'm open to anything, but it's real easy to make a system that is slower and bulkier, and hard to make one that is faster and slimmer.

I say this out loud because I suspect Senna (and you?) might be thinking of "new ways to store Atoms in RAM", and that's fine, but it is really not at all about distributed Atomspaces. I'm also concerned that there is a tangle between sending Atoms from A to B (the communications channel) and deciding which atoms to send (the policy). These two need to be kept separate. Tangling them together would be a fatal error.
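The separation of policy from mechanism argued for in points (2), (3) and (b) might look like the following Python sketch. The class names `Channel` and `EveryNthUpdate` and the function `publish` are assumptions made up for this illustration; the existing AtomSpace interfaces are not shown here.

```python
class Channel:
    """Mechanism: moves atom strings from A to B. Hypothetical interface."""

    def __init__(self):
        self.delivered = []

    def send(self, atom):
        # A real channel would write to a socket or a file; this one records.
        self.delivered.append(atom)


class EveryNthUpdate:
    """Policy: forward only every n-th update of a given atom."""

    def __init__(self, n):
        self.n = n
        self.seen = {}

    def should_send(self, atom):
        count = self.seen.get(atom, 0) + 1
        self.seen[atom] = count
        return count % self.n == 0


def publish(atom, policy, channels):
    """Glue: the policy decides, the channels deliver."""
    if policy.should_send(atom):
        for channel in channels:
            channel.send(atom)


channel = Channel()
policy = EveryNthUpdate(100)
for _ in range(1000):  # an atom whose value changes 1000 times
    publish('(Concept "bar")', policy, [channel])
print(len(channel.delivered))
```

Because `Channel` knows nothing about the policy and the policy knows nothing about transport, either can be swapped without touching the other, and several policies or channels can coexist.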

> -- Ben


Linas Vepstas

Aug 18, 2021, 1:02:31 PM

to opencog

On Wed, Aug 18, 2021 at 5:32 AM Anatoly Belikov <awbe...@gmail.com> wrote:

> What do you mean by Prolog not allowing trees?

You can write

:- likes (Bob, baseball);

but you cannot write

:- likes (Bob, :- exploration (space, :- or(rockets, solarsails)))

The second example is a 3-level deep binary tree ... but is not valid Prolog. Of course, you can convert it into valid Prolog, but then it is no longer a single, deep tree; it would have to be three shallow trees.
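What "three shallow trees" means can be sketched in Python. This only illustrates the flattening described above, under the assumption that terms are nested tuples; the generated identifiers `t1`, `t2`, `t3` and the function `flatten` are invented for the example.

```python
from itertools import count


def flatten(term, facts, ids):
    """Replace each nested sub-term with a generated id and record it
    as its own depth-1 fact. Returns the id (or the leaf itself)."""
    if not isinstance(term, tuple):
        return term
    head, *args = term
    flat_args = tuple(flatten(a, facts, ids) for a in args)
    ident = f"t{next(ids)}"
    facts.append((ident, head) + flat_args)
    return ident


deep = ("likes", "Bob",
        ("exploration", "space", ("or", "rockets", "solarsails")))
facts = []
flatten(deep, facts, count(1))
for fact in facts:
    print(fact)
```

The single deep tree becomes three flat records that refer to one another by id, which is the extra bookkeeping the message is pointing at.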

The game being played here is "how do you represent knowledge?" and there's a whole rainbow of choices: trees and graphs and directed graphs or undirected graphs or hypergraphs, ... or RDF or "semantic triples" or Datalog or JSON or tables, or whatever. And any one of these systems is really enough for "anything" - you can represent knowledge with any of these systems.

The real questions become: How easy is it to use? For example, you can represent "the green ball is under the couch" with "semantic triples" but it becomes hard and verbose. Another example: you can represent a hypergraph with just ordinary graphs, but how much extra RAM and CPU does that need? If CPU and RAM were free, if it weren't for these kinds of concerns, we could just layer Datalog on top of Apache TinkerPop and use GraphQL and declare victory.

--linas


Amirouche Boubekki

Aug 19, 2021, 2:16:36 AM

to opencog

On Tue, Aug 17, 2021 at 10:24 PM Ben Goertzel <b...@goertzel.org> wrote:
>
> https://wiki.opencog.org/w/Hyperon
>

Thanks, I will look into that. I am eager to read the upcoming publications.

Amirouche Boubekki

Aug 19, 2021, 2:20:28 AM

to opencog

Hello Linas :-)

On Wed, Aug 18, 2021 at 12:40 AM Linas Vepstas <linasv...@gmail.com> wrote:
>
> I spent the last week trying to convince a group of scheme enthusiasts to build a "database for s-expressions", which, after all is said and done, is all that the Atomspace is. This idea went over like a lead balloon.
>

Not really like a lead balloon, since I am back :-)

If you deliver a set of json or sexp files that is relevant to
opencog, I think about one terabyte or something like that, I can
demonstrate a JSON / s-exp database.

Anatoly Belikov

Aug 24, 2021, 3:53:28 AM

to ope...@googlegroups.com

You can write something like this:

?- assert(likes('Bob', exploration('space', ('rockets'; 'solarsails')))).
true.

It almost works, but I can't make Prolog return all the solutions:

?- likes('Bob', exploration('space', X)).
X = (rockets;solarsails).

On Wed, Aug 18, 2021 at 9:02 PM Linas Vepstas <linasv...@gmail.com> wrote:


Linas Vepstas

Aug 29, 2021, 5:51:19 PM

to opencog

Hi Anatoly,

I think it would be interesting (useful?) to write dynamic, run-time shims between the AtomSpace and prolog (or other systems). The user would write a small shim or conversion template, for example, something like

:- x(y,z); <<==>> (Evaluation (Predicate X) (List (Concept Y)(Concept Z)))

or whatever mapping the user wants, and then every time prolog needs this info, it could fish it out of the atomspace, and vice-versa: whenever prolog generates output, it would automatically be written into the atomspace.
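A conversion template of this kind could be prototyped as a pair of functions. The following Python sketch handles only binary facts and only the one mapping shown above; the regular expressions and the function names `prolog_to_atomese` and `atomese_to_prolog` are assumptions for illustration, and no such shim exists in the AtomSpace as far as this thread states.

```python
import re

# Matches a binary Prolog fact such as: likes(bob, baseball).
FACT = re.compile(r"^\s*(\w+)\(\s*(\w+)\s*,\s*(\w+)\s*\)\s*\.?\s*$")

# Matches the Atomese form produced by prolog_to_atomese below.
ATOM = re.compile(
    r'^\(Evaluation \(Predicate "(\w+)"\) '
    r'\(List \(Concept "(\w+)"\) \(Concept "(\w+)"\)\)\)$'
)


def prolog_to_atomese(fact):
    """x(y, z).  ==>  (Evaluation (Predicate "x") (List (Concept "y") (Concept "z")))"""
    m = FACT.match(fact)
    if m is None:
        raise ValueError(f"not a binary fact: {fact!r}")
    pred, a, b = m.groups()
    return (f'(Evaluation (Predicate "{pred}") '
            f'(List (Concept "{a}") (Concept "{b}")))')


def atomese_to_prolog(atom):
    """Inverse direction of the same template."""
    m = ATOM.match(atom)
    if m is None:
        raise ValueError(f"not an Evaluation of two Concepts: {atom!r}")
    pred, a, b = m.groups()
    return f"{pred}({a}, {b})."


atom = prolog_to_atomese("likes(bob, baseball).")
print(atom)
print(atomese_to_prolog(atom))
```

A real shim would be driven by user-supplied templates and would be called on demand from either side, rather than hard-coding one pattern as done here.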

I haven't really thought about this for prolog, but I did think about this for external data stores. Currently, the only way to get data into the atomspace is a giant batch import, and this can take an hour or two to run (e.g. the agi-bio dataset). It would be nicer to do this "on demand", "as needed". Output is likewise: instead of one big dump, just update the remote dataset bit by bit.

I thought about this a little bit, and I think it's doable and not that hard. I don't have any incentive to work on this, just right now.


Linas Vepstas

Aug 29, 2021, 5:59:26 PM

to opencog

Hi Amirouche,

On Thu, Aug 19, 2021 at 1:20 AM Amirouche Boubekki <amirouche...@gmail.com> wrote:

> If you deliver a set of json or sexp files that is relevant to
> opencog, I think about one terabyte or something like that, I can
> demonstrate a JSON / s-exp database.

I've been out of town. I can send you two. One will be a dump of (a portion of) the agi-bio dataset. That dataset is itself just an import into the atomspace of assorted external gene and protein databases. It's just "pure" s-expressions, no truth values or counts on them. It's not a terabyte; it's probably much smaller than a gigabyte. (I'll find out shortly.)

The other will be a natural language dataset. Here, each s-exp will have a numerical count on it. It's the counts that matter. I have small, medium, and large versions of this. I'll send the small one; no point in struggling with something huge.

The format will be "Atomese": Atoms in s-expressions are globally unique and immutable and indexed (thus, searchable). Values in s-expressions are fleeting, ephemeral, subject to change, and not indexed (thus, not searchable).

--linas

Amirouche Boubekki

Aug 30, 2021, 6:48:48 AM

to opencog

On Sun, Aug 29, 2021 at 11:59 PM Linas Vepstas <linasv...@gmail.com> wrote:
>
> Hi Amirouche,
>
> On Thu, Aug 19, 2021 at 1:20 AM Amirouche Boubekki <amirouche...@gmail.com> wrote:
>>
>>
>> If you deliver a set of json or sexp files that is relevant to
>> opencog, I think about one terabyte or something like that, I can
>> demonstrate a JSON / s-exp database.
>
>
> I've been out of town. I can send you two. One will be a dump of (a portion of) the agi-bio dataset. That dataset is itself just an import into the atomspace of assorted external gene and protein databases. It's just "pure" s-expressions, no truth values or counts on them. It's not a terabyte, its probably much smaller than a gigabyte (I'll find out shortly)
>
> The other will be a natural language dataset. Here, each s-exp will have a numerical count on it. It's the counts that matter. I have small, medium, large versions of this. I'll send the small one, no point in struggling with something huge.

That is wiser. Let me know where I can fetch the data, and whether the
server must be behind a login and password. My server is located in
Helsinki in Finland, and it is not encrypted so better keep secrets
away from it. I think it will be easier for me to make sense of the
natural language data, but anything sexp should do.

>
> The format will be "Atomese": Atoms in s-expressions are globally unique and immutable and indexed (thus, searchable). Values in s-expressions are fleeting, ephemeral, subject to change, and not indexed (thus, not searchable)
>
> --linas

--
Amirouche ~ https://hyper.dev

Linas Vepstas

Aug 31, 2021, 5:37:25 PM

to opencog

Hi Amirouche,

Here: https://linas.org/datasets/

It has the bio dataset, I'll add the language dataset shortly. The README explains:

* `mozi-data.scm.bz2` -- uncompressed, it's 371485784 bytes.
Contains the December 2019 version of the small public version of
the MOZI dataset. This is just genetic and proteomic data from
popular public datasets, converted to Atomese s-expressions.
Stats are:
`((ConceptNode . 454779) (PredicateNode . 12) (ListLink . 1925554) (MemberLink . 1850528) (AndLink . 98788) (EvaluationLink . 1887530) (InheritanceLink . 122184) (GeneNode . 49050) (MoleculeNode . 368909))`
so that's about 7 million atoms (It's 6757335 to be precise); loading
it into the AtomSpace results in an RSS of about 4.3 GBytes RAM.
That's about 632 bytes/atom when in RAM. It's just pure Atoms, no
Values. Compare to 55 Bytes/atom when stored as uncompressed s-exprs,
or about 4 Bytes/atom when bzip2-compressed. Clearly, indexes are
expensive!
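The per-atom figures quoted in the README can be rechecked with a few lines of Python. The counts and byte sizes are copied from the text above; reading "about 4.3 GBytes" as 4.3e9 bytes is an assumption. Two small observations: the listed counts sum to 6757334, one less than the quoted total, and 4.3e9 bytes gives roughly 636 bytes/atom, so the quoted 632 presumably comes from an unrounded RSS of about 4.27e9 bytes.

```python
# Atom counts per type, copied from the README excerpt above.
counts = {
    "ConceptNode": 454779,
    "PredicateNode": 12,
    "ListLink": 1925554,
    "MemberLink": 1850528,
    "AndLink": 98788,
    "EvaluationLink": 1887530,
    "InheritanceLink": 122184,
    "GeneNode": 49050,
    "MoleculeNode": 368909,
}

total_listed = sum(counts.values())
total_quoted = 6757335            # "It's 6757335 to be precise"
uncompressed_bytes = 371485784    # size of the uncompressed .scm file
rss_bytes = 4.3e9                 # assumption: "about 4.3 GBytes" as decimal GB

bytes_per_atom_text = uncompressed_bytes / total_quoted
bytes_per_atom_ram = rss_bytes / total_quoted

print(total_listed, total_quoted - total_listed)
print(round(bytes_per_atom_text), round(bytes_per_atom_ram))
print(round(bytes_per_atom_ram / bytes_per_atom_text, 1))
```

Either way, the in-RAM representation is more than ten times the size of the plain-text one, which supports the remark that indexes are expensive.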

--linas


Amirouche Boubekki

Sep 1, 2021, 1:54:42 PM

to opencog

Thanks for the quick reply. I spent 3 weeks coding in the wrong
direction (!), I need to rest a bit. I will let you know when I have
something usable.

Linas Vepstas

Sep 1, 2021, 5:33:32 PM

to opencog

On Wed, Sep 1, 2021 at 12:54 PM Amirouche Boubekki <amirouche...@gmail.com> wrote:

> Thanks for the quick reply. I spent 3 weeks coding in the wrong
> direction (!), I need to rest a bit. I will let you know when I have
> something usable.

Gahh! I can't imagine. Some quick notes to avoid future wrong directions:

If the dataset is

(Foo (Bar "stuff"))

(Baz (Bar "stuff"))

then the (Bar "stuff") is the "same Atom" in both expressions. This is because one common query is "find everything that contains (Bar "stuff")"
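One way to get "the same Atom in both expressions" is to intern every sub-expression and keep a reverse index from each atom to the atoms that contain it. The Python sketch below assumes atoms are nested tuples (which hash by value); `AtomTable`, `add` and `containing` are illustrative names, not the AtomSpace API.

```python
class AtomTable:
    """Toy interning table with a reverse ("incoming") index."""

    def __init__(self):
        self.atoms = set()
        self.incoming = {}  # atom -> set of atoms that directly contain it

    def add(self, atom):
        """Insert an atom and, recursively, all of its sub-atoms."""
        self.atoms.add(atom)
        self.incoming.setdefault(atom, set())
        for child in atom[1:]:
            if isinstance(child, tuple):
                self.add(child)
                self.incoming[child].add(atom)
        return atom

    def containing(self, atom):
        """Answer: find everything that directly contains this atom."""
        return self.incoming.get(atom, set())


table = AtomTable()
table.add(("Foo", ("Bar", "stuff")))
table.add(("Baz", ("Bar", "stuff")))
print(len(table.atoms))
print(table.containing(("Bar", "stuff")))
```

Only three atoms are stored, because `("Bar", "stuff")` is shared; the reverse index makes the containment query a single dictionary lookup, at the price of extra memory per atom.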

You are welcome to reinvent the query language. But, as a hint, a few practical examples:

(Query (Foo (Variable ?x)))

will find (Foo (Bar "stuff")) and attach ?x to (Bar "stuff") ... I'll leave you to figure out what to do if there are multiple matches. It becomes "pattern matching" when there are multiple variables, and multiple clauses that share common variables. For example

(Query (And (Foo (Variable ?x)) (Yowza (Variable ?x))))

fails because there is no (Yowza anything) in the dataset, much less a (Yowza (Bar "stuff")) which is what the ?x binding requires.
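The two queries above can be reproduced with a minimal matcher. This Python sketch is a naive illustration, not the AtomSpace pattern engine: variables are written as plain strings starting with `?` rather than `(Variable ?x)`, and `match` and `query` are names chosen for the example.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def match(pattern, atom, bindings):
    """Return extended bindings if pattern matches atom, else None."""
    if is_var(pattern):
        if pattern in bindings:
            # A variable seen before must be bound to the same atom.
            return bindings if bindings[pattern] == atom else None
        return {**bindings, pattern: atom}
    if isinstance(pattern, tuple) and isinstance(atom, tuple):
        if len(pattern) != len(atom):
            return None
        for p, a in zip(pattern, atom):
            bindings = match(p, a, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == atom else None


def query(clauses, dataset, bindings=None):
    """Yield every binding that satisfies all clauses (a conjunction)."""
    bindings = {} if bindings is None else bindings
    if not clauses:
        yield bindings
        return
    first, rest = clauses[0], clauses[1:]
    for atom in dataset:
        extended = match(first, atom, bindings)
        if extended is not None:
            yield from query(rest, dataset, extended)


dataset = [("Foo", ("Bar", "stuff")), ("Baz", ("Bar", "stuff"))]
single = list(query([("Foo", "?x")], dataset))
joined = list(query([("Foo", "?x"), ("Yowza", "?x")], dataset))
shared = list(query([("Foo", "?x"), ("Baz", "?x")], dataset))
print(single, joined, shared)
```

Multiple matches simply come out as multiple bindings from the generator. This version scans the whole dataset for every clause; a real engine would instead walk the graph outward from already-bound atoms.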

To speed up matches, it's useful to support typed variables. For example

(TypedVariable (Variable ?x) (Type 'Bar))

which prevents matches unless ?x is (Bar ... something...) Once you get into types, then the whole type-theoretical universe opens up. Complex, compound types, intersections, unions, dependent types, etc. The interpretation here is that the first word after the open-paren is a "raw type".
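A type restriction on a variable amounts to one extra check before binding. In this Python sketch the "raw type" is taken to be the first element of a nested tuple, as described above; `type_of` and `bind_typed` are hypothetical helper names, and a set of allowed types stands in for a simple union type.

```python
def type_of(atom):
    """The 'raw type' is the first word after the open paren."""
    return atom[0] if isinstance(atom, tuple) else None


def bind_typed(var, allowed_types, atom):
    """Bind var to atom only if the atom's raw type is allowed."""
    if type_of(atom) in allowed_types:
        return {var: atom}
    return None


candidates = [("Bar", "stuff"), ("Qux", "other")]

# Equivalent in spirit to (TypedVariable (Variable ?x) (Type 'Bar)).
only_bar = [b for b in (bind_typed("?x", {"Bar"}, a) for a in candidates)
            if b is not None]

# A union of two types admits both candidates.
bar_or_qux = [b for b in (bind_typed("?x", {"Bar", "Qux"}, a) for a in candidates)
              if b is not None]
print(only_bar, bar_or_qux)
```

The speed-up comes from rejecting candidates on a cheap type comparison before any structural matching is attempted.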

Unrelated remark: the language dataset has an "alist" in it. It is an association list associated with an Atom. It is NOT searchable or globally unique. Thus,

(Bar "stuff" (alist (cons (Key "a") (Value 1 2 3))))

says that (Bar "stuff") has an association list hanging off of it. Thus, "alist" and "cons" are very special reserved keywords in this system.
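The split between indexed, immutable Atoms and unindexed, mutable association lists can be modelled as two separate containers. This is a toy Python sketch under that assumption; `Store` and its methods are invented names and say nothing about how the AtomSpace lays out memory.

```python
class Store:
    """Toy store: atoms are indexed keys, values hang off them."""

    def __init__(self):
        self.atoms = set()   # globally unique, immutable, searchable
        self.values = {}     # atom -> {key: value}; mutable, not searchable

    def add(self, atom):
        self.atoms.add(atom)
        return atom

    def set_value(self, atom, key, value):
        """Attach or overwrite one entry of the atom's association list."""
        self.add(atom)
        self.values.setdefault(atom, {})[key] = value

    def get_value(self, atom, key):
        return self.values.get(atom, {}).get(key)

    def find(self, type_name):
        """Search works on atoms only; there is no search by value."""
        return {a for a in self.atoms if a[0] == type_name}


store = Store()
bar = store.add(("Bar", "stuff"))
store.set_value(bar, "a", (1, 2, 3))
store.set_value(("Bar", "stuff"), "a", (4, 5, 6))  # same atom, value replaced
print(store.get_value(bar, "a"), len(store.atoms))
```

Changing a value never changes the identity of the atom it is attached to, which is why counts can be updated at high rates without touching any index.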

I used capitalized words in this example, but they have no special meaning, they could be lower-case, as long as "alist" and "cons" are taken as reserved words.

The use of quotation marks is an idiosyncrasy. One must be careful not to overuse single-quote, as those go into the scheme symbol table, which will probably blow up if you put ten million symbols in there. Other than that, the whole idea is a kind of a glorified, searchable symbol table.

It's a bit of a fun-house of mirrors: all the usual ideas show up, but oddly distorted: interned vs. uninterned symbols, hygienic vs. unhygienic matches, the need for reserved keywords. The need for quote, unquote and quasiquote. The need for lambda and define. My current research is how to eliminate any need for lambda, by replacing it with a more general concept of a connector-set. Thus, lambda becomes a special case.

--linas

Douglas Miles

Nov 8, 2021, 1:14:45 AM

to opencog

guile-log would be an ideal system as it supports overloaded unification (overloaded from Scheme).

It would be nice to allow Prolog programs to be run and maintained as Atomese and vice versa.

- Douglas

Linas Vepstas

Nov 8, 2021, 1:02:41 PM

to opencog

Hi Douglas,

Interesting email...

On Mon, Nov 8, 2021 at 12:14 AM Douglas Miles <logi...@gmail.com> wrote:
>
>
> guile-log would be an ideal system as it supports overloaded unification (overloaded from scheme)
>
> It would be nice to allow Prolog programs to be run and maintained as Atomese and vice versa

I can suggest several solutions, and can even offer to write some of the code.

As of very recently, you can now store arbitrary s-expressions in the
atomspace. These are stored in such a way that they are searchable. So
you can write "(junk stunk)" and "(stuff stunk)" and then ask "find
all s-expressions with 'stunk' at the end of them". (or any other
query).
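The "stunk at the end" query can be mimicked in Python with a tiny s-expression reader and a filter. This is a sketch only: `parse` handles bare words and parentheses, nothing else, and the linear scan stands in for whatever index the atomspace actually uses.

```python
def parse(text):
    """Parse one s-expression of bare words into nested lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        if tokens[pos] == "(":
            items, pos = [], pos + 1
            while tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            return items, pos + 1  # skip the closing paren
        return tokens[pos], pos + 1

    tree, _ = read(0)
    return tree


stored = [parse(s) for s in ("(junk stunk)", "(stuff stunk)", "(stunk first)")]

# "Find all s-expressions with 'stunk' at the end of them."
ends_with_stunk = [e for e in stored if e and e[-1] == "stunk"]
print(ends_with_stunk)
```

Any language that can be parsed into such trees can be stored and filtered the same way, which is the argument made in the following paragraphs.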

So, if you can convert prolog to s-expressions, they can be stored.

Another possibility is that I can store prolog "natively". Or any
language, for that matter: python, perl, java, javascript ... just
provide me with a parser (written in C/C++) that will convert that
language into an abstract syntax tree, and I can store that tree. The
query engine can then query such trees.

I was going to do one for JSON but got immediately bored. Storing it
in the atomspace is easy, but so what? The only "interesting" part
would be to write a query language, but that exists already:
GraphQL. I could port GraphQL to sit on top of the pattern
engine, but so what? Who cares? No one would use such a thing. It
seemed pointless.

So again, provide me with trees, and I can store/search trees. It's
not hard. The "hard part" is inventing a query language that users
would want to use. The raw, low-level pattern matcher language is
powerful but verbose and intimidating to new users. Some
domain-specific language would be more socially appealing.

Hmm. But that is not what you wrote. You wrote

> It would be nice to allow Prolog programs to be run and maintained as Atomese and vice versa

and that is kind-of harder. One possibility is that prolog programs
could be converted into URE rules, but the URE was designed for
probabilistic inference, and so would run slowly for crisp-logic
prolog. I have a partial fix for that: I think I can make a "simple"
fast forward-chainer on the pattern engine; this would fit prolog much
better.

This demo: https://github.com/opencog/atomspace/blob/master/examples/pattern-matcher/recursive.scm
shows how to do recursive queries, and so prolog would be a
fleshed-out version of that demo.

To convert (a subset of) Atomese into Prolog is "easy": just write
some code that takes Atomese trees and prints them as Prolog. Of
course, you can't convert everything; Prolog isn't powerful enough.

-- Linas


Douglas Miles

Nov 8, 2021, 9:15:36 PM

to ope...@googlegroups.com

On Mon, Nov 8, 2021 at 10:02 AM Linas Vepstas <linasv...@gmail.com> wrote:

> Hi Douglas,
>
> Interesting email...
>
> On Mon, Nov 8, 2021 at 12:14 AM Douglas Miles <logi...@gmail.com> wrote:
>>
>>
>> guile-log would be an ideal system as it supports overloaded unification (overloaded from scheme)
>>
>> It would be nice to allow Prolog programs to be run and maintained as Atomese and vice versa
>
> I can suggest several solutions, and can even offer to write some of the code.
>
> As of very recently, you can now store arbitrary s-expressions in the
> atomspace. These are stored in such a way that they are searchable. So
> you can write "(junk stunk)" and "(stuff stunk)" and then ask "find
> all s-expressions with 'stunk' at the end of them". (or any other
> query).
>
> So, if you can convert prolog to s-expressions, they can be stored.
>
> Another possibility is that I can store prolog "natively". Or any
> language, for that matter: python, perl, java, javascript ... just
> provide me with a parser (written in C/C++) that will convert that
> language into an abstract syntax tree, and I can store that tree. The
> query engine can then query such trees.
>
> I was going to do one for JSON but got immediately bored. Storing it
> in the atomspace is easy, but so what? The only "interesting" part
> would be to write a query language, but that exists already:
> "GraphQL", and so I could port GraphQL to sit on top of the pattern
> engine, but so what? Who cares?
> No one would use such a thing. It seemed pointless.

I often realize too late that the utility such things provide only sounded good on paper; as soon as I have it, I ask "now what?" (Just because people can query my system in their format of choice does not mean that they will want to.)

> So again, provide me with trees, and I can store/search trees. It's
> not hard.

In Prolog, when I want to search trees, I organize them into graph nodes (a predicate per arc) and search that way. The tree structure is usually stored in a Prolog hashmap or other structure, but I have to do a small conversion to predicate arcs first to make the search efficient.

> The "hard part" is inventing a query language that users
> would want to use. The raw, low-level pattern matcher language is
> powerful but verbose and intimidating to new users. Some
> domain-specific language would be more socially appealing.

In my case, the reason I produce the "efficient [at least to Prolog] structure" is for my program to be able to search and use the tree. It would be hard to make it socially appealing; as you say, something else is going to be better.

> Hmm. But that is not what you wrote. You wrote
>
>> It would be nice to allow Prolog programs to be run and maintained as Atomese and vice versa
>
> and that is kind-of harder. One possibility is that prolog programs
> could be converted into URE rules, but the URE was designed for
> probabilistic inference, and so would run slowly for crisp-logic
> prolog.

I'll try to restate the problem I think you are pointing out: Normally lightning fast prolog programs are fast because they are leveraging crisp-logic but will run slower because they would now have to compute probability at each step?

Now, separately, my use case:

I have three Prolog programs that I was considering trying to convert to URE/Atomese:

The first, LPS-ALEPH, is an ALEPH-like inductive reasoner that computes/guesses new rules that recreate the observed data (high-level sensory data or whatever).

The second, PrologMUD, creates an imaginary playable world. To get an understanding, here are some of the rules it uses: https://github.com/logicmoo/prologmud/blob/master/prolog/prologmud/vworld/world_2d.pfc.pl#L210-L241

I am combining the two into a system that lets the ALEPH-like program observe the Prolog virtual world (hand-coded in rules) and create a third program, a Prolog/Atomese virtual world called NomicMU. So LPS-ALEPH's goal is to create an inductive copy of the PrologMUD program in NomicMU and to continue making additions and edits.

What I am imagining is making my system more socially acceptable to OpenCogers by having these three programs exist and run as Atomese instead of Prolog.


Ben Goertzel

Nov 8, 2021, 9:29:12 PM

to opencog

Doug, can you point me to documentation on the ALEPH-like inductive
reasoner you mention?

thx
ben


--
Ben Goertzel, PhD
b...@goertzel.org

"My humanity is a constant self-overcoming" -- Friedrich Nietzsche

Douglas Miles

Nov 8, 2021, 10:03:16 PM

to ope...@googlegroups.com

Ben,

On Mon, Nov 8, 2021 at 6:29 PM Ben Goertzel <bengo...@gmail.com> wrote:

> Doug, can you point me to documentation on the ALEPH-like inductive
> reasoner you mention?

Oops! I haven't written documentation. (That is one of my worst traits.)

But I can at least point to my code:

https://gitlab.logicmoo.org/gitlab/logicmoo/logicmoo_workspace/-/tree/master/packs_sys/logicmoo_ec/prolog/ec_planner

I used the word "inductive" in my email because the process of program creation is normally part of the culture of "Inductive Logic Programming" (ILP), but the actual method I am using is closer to "Abductive Event Calculus" (written about by Erik Mueller and Murray Shanahan). How this works: it takes a "starved list of events" (SLoE) that happen in the world and a "library of primitive [program] scripts" (LPS) that can make stuff turn out a certain way. It then takes the SLoE and fills the LPS into the middle of it, like: SLoE-1 --> LPS-a --> LPS-b --> SLoE-2. Then the system memorizes this as a part of the "induced" program.

I should actually document how it works, even for my own sanity.

Here is a paper on ALEPH (in case anyone isn't familiar) https://www.cs.ox.ac.uk/activities/programinduction/Aleph/aleph.html


Linas Vepstas

Nov 8, 2021, 11:04:41 PM

to opencog

On Mon, Nov 8, 2021 at 8:15 PM Douglas Miles <logi...@gmail.com> wrote:

>> > It would be nice to allow Prolog programs to be run and maintained as Atomese and vice versa
>>
>> and that is kind-of harder. One possibility is that prolog programs
>> could be converted into URE rules, but the URE was designed for
>> probabilistic inference, and so would run slowly for crisp-logic
>> prolog.
>
> I'll try to restate the problem I think you are pointing out: Normally lightning fast prolog programs are fast because they are leveraging crisp-logic but will run slower because they would now have to compute probability at each step?

Uh, it's more complicated. If you want to compute with probabilities,
then, at each step, you have to explore both possibilities: the "true"
branch with probability p and the "false" branch with probability 1-p.
After N steps, you have to have explored 2^N cases. The combinatorics
kills you. (The "real-life" formulas are not just p and 1-p, but
something more complicated. Running those formulas each step adds to
the system complexity and run-time.)
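The 2^N growth is easy to see by brute force. The Python sketch below uses the simplified p and 1-p weights mentioned above, not the more complicated "real-life" formulas; `explore` is a name made up for the illustration.

```python
from itertools import product


def explore(n, p):
    """Enumerate every true/false path of length n with its probability."""
    paths = []
    for branches in product((True, False), repeat=n):
        prob = 1.0
        for taken in branches:
            prob *= p if taken else 1.0 - p
        paths.append((branches, prob))
    return paths


paths = explore(10, 0.9)
print(len(paths))                    # number of cases after 10 steps
print(sum(pr for _, pr in paths))    # the probabilities still add up to ~1
```

Ten steps already give 1024 cases and each further step doubles that, whereas a crisp-logic walk follows a single true/false outcome at each step.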

The URE was designed to handle the combinatorics. It's not optimized
for crisp logic. The pattern engine, however, does do a crisp-logic
walk. It was not originally designed to be recursive, but it does seem
capable of that. No one has really explored that. I have "more demos"
on my todo list. It's a graph walker, not a SAT solver, so I'm not
sure how to compare performance.

> I have 3 prolog programs that I was considering trying to convert to URE/Atomeese:
>
> One is the ALEPH-like (an inductive reasoner) that computes/guesses new rules that recreates the data observed (high-level sensory data or whatever) called LPS-ALEPH

Doing this well is, of course, the holy grail of AI if not quite AGI.

> What I am imagining is making my system more socially acceptable to OpenCogers by having these (three) programs exist and run as Atomeese instead of prolog.

Heh. Well it's not exactly like we've got a deep bench of users here.

One way to attract users is to use your system to solve some
"important problem". There are other ways, too (like having good
documentation and easy-to-use APIs. And then there's the marketing
angle. But I digress....)

I could do some spare-time prolog-like hacking, just to see how that
might go. I would need regular encouragement, though. And I don't
have much spare time. Actually approximately zero, but whatever.

--linas
