Understanding AtomSpace through Java

Jack Park

May 30, 2019, 12:41:34 PM

to opencog

Since I have limited experience with C/C++ but some ability to transliterate programs into Java, I decided to build a Java version of AtomSpace, solely to better understand how AtomSpace is supposed to work. The project is far from functional, but it is already helping me understand the platform. The code is at

https://github.com/KnowledgeGarden/krr-explorer

My ambition might be similar to that of the OpenCog community, but I'll explain a few bits.

The project is now named OpenSherlock (it started life as SolrSherlock but migrated away from Solr in other directions). Fundamentally, the object in OpenSherlock that corresponds to AtomSpace is a topic map. Working on a PhD project, I asked the question: could a topic map learn to read, and learn by reading? After defending the thesis proposal, I crafted a topic map simulator which, in fact, performed far better than expected, and that grew up to be OpenSherlock.

An emerging explanation of that project is taking shape in this early draft manuscript:

https://docs.google.com/document/d/1kj9fe96srHhA5GOscYglR5XDLy8jEIHdWCZVNGMSXeI/edit?usp=sharing

I chose to explore AtomSpace because OpenSherlock was first inspired by LinkGrammar; it is called anticipatory for a reason.

AtomSpace, for me, is like Disneyland; so much to explore, so little time.

Cheers,

-Jack

Linas Vepstas

May 30, 2019, 4:01:56 PM

to opencog

Hi Jack,

Wow! I welcome the effort; it's nice to have projects influence one another.

I would like to point out something: to **understand** the atomspace, reading the source code is probably the hardest, most confusing, and most misleading way of doing it (as in, there lurk bugs and occasional bad design).

To understand the atomspace, it's best to run through the examples and/or read through the wiki. That would be a much easier path.

Here's a thumbnail sketch. So -- the atomspace is first a graph database (and so, to recreate it in Java, it might be easiest to start with some existing graph database).

Next, it's a bunch of predefined types. Some of these are relations: for example, InheritanceLink is the classic "is-a" relation -- x is-a y. For the so-called "semantic triples", we use EvaluationLink -- x R y, for R some arbitrary named relation. We call R a PredicateNode. So, for example, for "Jack owns a computer", x=Jack, R=owns, y=computer, and so (Evaluation (Predicate "owns") (List (Concept "Jack") (Concept "computer"))). In first-order logic, one writes P(x,y) instead of x R y, whence the name predicate.
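For a Java transliteration, the notation above can be sketched roughly as follows. This is only a minimal illustration: the class names (AtomSketch, Atom, Node, Link) and the helper jackOwnsComputer() are hypothetical and are not the AtomSpace API. It shows a Node as a typed, named atom, and a Link as a typed atom holding an ordered list of other atoms.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; these classes are illustrative, not the AtomSpace API.
final class AtomSketch {

    // Every atom has a type name such as "Concept" or "Evaluation".
    static abstract class Atom {
        final String type;
        Atom(String type) { this.type = type; }
    }

    // A Node is a typed atom that carries a name.
    static final class Node extends Atom {
        final String name;
        Node(String type, String name) { super(type); this.name = name; }
        @Override public String toString() {
            return "(" + type + " \"" + name + "\")";
        }
    }

    // A Link is a typed atom holding an ordered list of other atoms.
    // The varargs constructor accepts any number of arguments.
    static final class Link extends Atom {
        final List<Atom> outgoing;
        Link(String type, Atom... outgoing) {
            super(type);
            this.outgoing = Collections.unmodifiableList(Arrays.asList(outgoing.clone()));
        }
        @Override public String toString() {
            StringBuilder sb = new StringBuilder("(").append(type);
            for (Atom a : outgoing) sb.append(' ').append(a);
            return sb.append(')').toString();
        }
    }

    // Builds the "Jack owns a computer" example from the text.
    static Link jackOwnsComputer() {
        return new Link("Evaluation",
                new Node("Predicate", "owns"),
                new Link("List",
                        new Node("Concept", "Jack"),
                        new Node("Concept", "computer")));
    }

    public static void main(String[] args) {
        // Prints the atom in the same s-expression style used above.
        System.out.println(jackOwnsComputer());
    }
}
```

Because a Link simply holds other atoms, links can nest to any depth, which is what turns the collection into a graph rather than a table.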

Next, a conceptual leap: one is not limited to just P(x,y), but one can have arbitrary numbers of arguments. These arguments can be other atoms, which is what makes it a "graph database". And finally, there is no force-fit schema, which is why it's not SQL. (So, for example, "triple stores" have a force-fit schema: everything must be a triple, of the form x R y. In other words, a table with 3 columns. For the atomspace, "anything goes". Otherwise, it would be just SQL: SQL is a kind of graph database; it just forces you to pre-declare your schema, i.e. to use tables.)

Next, each predicate (more generally, each atom) has a truth-value. Classically, this is true/false (e.g. "it is true that Jack owns a computer"). The next conceptual leap is this: crisp true/false -> probability -> probability+confidence -> list-of-floats -> arbitrary JSON struct -> arbitrary key-value-db -> arbitrary key-value-db with time-dependent values.

So, in this example, "Jack owns a computer" has an associated key-value DB on it. One of the keys might hold the truthiness of this statement. Another key might hold its probability. Another key might hold the time-varying value of the physical distance between Jack and the computer, or maybe the pixel-values on the screen at this instant in time. These are called "Values".

So, you can imagine the AtomSpace as holding graphs -- those graphs are like pipes, plumbing. The Values are the water that flows through the pipes. Performance-wise, it's fairly hard/slow to change the graphs, but the values can change constantly. The pipes have a query language. The values do not (because key-value databases don't have a query language, by definition).
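The key-value idea can be sketched in Java along these lines. Everything here is an assumption made for illustration: the class names (ValueSketch, Atom), the methods setValue and getValue, the plain-string keys, and the numbers stored are all hypothetical, not the AtomSpace API. The point is only that the atom's structure stays fixed while the values hung on it are overwritten freely.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; not the AtomSpace API.
final class ValueSketch {

    static final class Atom {
        // The graph part: fixed once the atom is created.
        final String expression;
        // The value part: a small key-value store that may change at any time.
        // Keys are plain strings here purely for brevity.
        private final Map<String, double[]> values = new ConcurrentHashMap<>();

        Atom(String expression) { this.expression = expression; }

        // Stores (or overwrites) a list of floats under the given key.
        void setValue(String key, double... floats) {
            values.put(key, floats.clone());
        }

        // Returns a copy of the stored floats, or null if the key is absent.
        double[] getValue(String key) {
            double[] v = values.get(key);
            return v == null ? null : v.clone();
        }
    }

    public static void main(String[] args) {
        Atom fact = new Atom(
            "(Evaluation (Predicate \"owns\") (List (Concept \"Jack\") (Concept \"computer\")))");

        // Made-up numbers: a probability and a confidence.
        fact.setValue("truth", 0.9, 0.5);

        // A time-varying quantity: overwritten as new readings arrive.
        fact.setValue("distance-m", 1.5);
        fact.setValue("distance-m", 0.4);

        System.out.println(fact.expression);
        System.out.println(Arrays.toString(fact.getValue("distance-m")));
    }
}
```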

Finally, there are three or four more magic ingredients:

-- Some atoms are executable. For example, PlusLink knows how to actually add numbers together. PlusLink is backed by a C++ class that performs addition.

-- Queries from the query language are graphs themselves. So queries can be stored in the AtomSpace. (This is very unlike SQL, where you cannot store a query in the database itself. I think this is also very unlike any typical graph DB.)

-- A relation P(x,y), together with its truth-value, can be thought of as a matrix. So there is an API to access P(x,y) as if it were an actual matrix, doing typical matrix-math stuff to it.

Well, there are a few more tricks up its sleeve, but this email is too long already.
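To illustrate the first of these ingredients, here is a minimal Java sketch of an executable atom. The names (PlusSketch, Executable, NumberNode, execute) are hypothetical choices for this sketch and say nothing about how the C++ class is actually written; the idea shown is just that a link can carry behavior and compute a result from its arguments.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the AtomSpace API.
final class PlusSketch {

    interface Atom { }

    // An atom that knows how to compute a numeric result.
    interface Executable extends Atom {
        double execute();
    }

    // A leaf that simply evaluates to its own number.
    static final class NumberNode implements Executable {
        final double value;
        NumberNode(double value) { this.value = value; }
        @Override public double execute() { return value; }
    }

    // A link whose execution adds up the results of its arguments.
    static final class PlusLink implements Executable {
        final List<Executable> outgoing;
        PlusLink(Executable... args) {
            this.outgoing = Arrays.asList(args.clone());
        }
        @Override public double execute() {
            double sum = 0.0;
            for (Executable e : outgoing) sum += e.execute();
            return sum;
        }
    }

    public static void main(String[] args) {
        // Nested expression: 2 + (3 + 4)
        Executable expr = new PlusLink(
                new NumberNode(2),
                new PlusLink(new NumberNode(3), new NumberNode(4)));
        System.out.println(expr.execute()); // prints 9.0
    }
}
```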

I'm hoping that this gives you a flavor for what to shoot for. Grokking this email is surely easier than reading the source code :-)

-- Linas

--
You received this message because you are subscribed to the Google Groups "opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to opencog+u...@googlegroups.com.
To post to this group, send email to ope...@googlegroups.com.
Visit this group at https://groups.google.com/group/opencog.
To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/cfd6efd1-ee13-42e3-ab31-e718abddeb79%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

--

cassette tapes - analog TV - film cameras - you

Linas Vepstas

May 30, 2019, 5:19:48 PM

to opencog

I converted this email into a brand-new intro for the wiki page:

https://wiki.opencog.org/w/AtomSpace#Overview

-- linas

Linas Vepstas

May 31, 2019, 12:23:54 AM

to opencog

Hi Jack,

I looked over the Google Docs document on anticipatory machine reading. You might get new insight from reading Sylvain Kahane's review of Meaning-Text theory: http://www.coli.uni-saarland.de/courses/syntactic-theory-09/literature/MTT-Handbook2003.pdf I think you will enjoy the way meaning gets represented there.

Regarding atomspace in Java -- it would probably be more productive to just write Java bindings for the atomspace. That way, you can just use it -- it works, it's functioning and ready to go. There's 71 KLOC of code there, and it took years to develop it and get it fully debugged. There's another 49 KLOC of unit tests.

-- Linas

Jack Park

May 31, 2019, 11:07:34 AM

to opencog

Hi Linas,

(and apologies for misspelling your name before)

This thread, for me, is richly rewarding. Thank you very much for engaging.

In truth, if all I wanted to do was work with AtomSpace as AtomSpace, I would first refresh my C/C++ skills and also build a JNI interface to AtomSpace. In fact, that may well be in the cards. But I'd like to justify my transliteration game as a deeper mechanism, for me, for becoming more literate in complex software architectures.

In the past, when Forth was my game, I built a "discovery system" as a rational reconstruction of Lenat's Eurisko married to Forbus's QP Theory; that program was used to defend a PhD thesis in process control by another person. I transliterated Lisp programs of all kinds to Forth; later, and much more recently, I transliterated two structured conversation platforms from the Open University, Cohere and Evidence Hub, from PHP to Java. I'll never be a PHP programmer, but I did learn some interesting ideas, which I folded into a set of utilities that I use in Java. And today, I dusted off what little I recall about C/C++ and started learning about homoiconicity, hashing, and more. I'm a slow learner; hacking is my best tool.

AtomSpace is an awesome project and quest; I am happy to be able to surf the wake it is tossing.

Cheers,

-Jack
