ANNOUNCE: AtomSpace Frames


Linas Vepstas

May 9, 2022, 6:32:45 PM5/9/22
to opencog
I'd like to announce completion of work on a new AtomSpace feature, called "Frames". The name is meant to recall the concept of a "stackframe" or a "Kripke frame": each frame is a changeset, a delta, of all changes to the AtomSpace, sitting atop a stack (or DAG) of AtomSpace changesets underneath. (Much like git changesets, each layered on the last, and git branches, offering different merged histories of changes, with the ability to explore each changeset individually.)

Why is this useful? Let me provide two answers: some hand-wavey, general theoretical examples, and then the actual practical problem that was faced and needed a practical solution.

In logic, in theorem-proving, in logical inference and reasoning, one has a set of steps one goes through, to reach a particular conclusion. One starts with some initial set of assumptions (stored in the base AtomSpace, for example) and then applies a set of inference rules, one after the other, to reach a conclusion. At each step, one can apply different inferences, leading to different conclusions: there is a natural branching structure.  Some branches converge, others do not. AtomSpace frames allow you to take snapshots, and store them as you move along, and then revisit earlier snapshots, as needed. Different branches can be compared.

Another example is context-based knowledge. Consider the graph of all knowledge in some situation: say, the set of things that are true, when one is indoors. How does this change when one is in a forest?  Things that are true in one situation can become false in another; things that seem probable and certain in one case may not be in another. Each AtomSpace Frame provides a place to record that context-specific knowledge: a different set of Atoms, and different Values attached to each Atom.
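The idea of frame-local Values can be sketched in Scheme. This is only an illustrative sketch, modeled loosely on the multi-space demo linked below; it assumes that `cog-new-atomspace` accepts a parent AtomSpace to create a child frame, and that setting a TruthValue in the child overlays, rather than overwrites, the value in the base:

```scheme
(use-modules (opencog))

; Base space: general knowledge.
(define base-space (cog-atomspace))
(define sky (Concept "the sky is visible"))
(cog-set-tv! sky (stv 0.9 0.9))   ; usually true, outdoors

; A frame stacked on the base: the "indoors" context.
; (Assumes cog-new-atomspace takes an optional parent space.)
(define indoors (cog-new-atomspace base-space))
(cog-set-atomspace! indoors)

; Override the Value in this frame only.
(cog-set-tv! (Concept "the sky is visible") (stv 0.1 0.9))

; Pop back to the base: the base space should still hold
; the original TruthValue, untouched by the indoors frame.
(cog-set-atomspace! base-space)
(cog-tv sky)
```

The point of the sketch: the Atom is shared, but each frame can attach its own Values to it, which is exactly the "context-specific knowledge" described above.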

What does this mean in practice?  Well, the AtomSpace has long supported the idea of stacks of AtomSpaces, one on top of another. These are used very heavily in the graph query engine, to hold temporary results and temporary inferences. These temporary spaces are also used in the URE/PLN and in the pattern miner. They worked fine, but were perhaps less than complete and industrial-strength.  No one created stacks or DAGs that were thousands of spaces deep, with Atoms being added and removed at each stage, while still having an easy-to-traverse graph at the end of it all. Well, now there are.  Most importantly of all, the RocksDB storage backend fully supports the save/restore of these DAGs, with a very simple, easy-to-use API that behaves exactly how you'd expect as you move from space to space. A half-dozen new unit tests make sure it all works, bug-free.
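A save-to-disk sketch, again purely illustrative: `RocksStorageNode`, `cog-open`, `store-atomspace` and `cog-close` are the established persistence API, but the frame-specific call shown here (`store-frames`) is an assumption on my part; the unit test linked below is the authoritative reference for the actual calls:

```scheme
(use-modules (opencog) (opencog persist) (opencog persist-rocks))

; Build a tiny two-frame stack: a base space and one child.
(define base-space (cog-atomspace))
(Concept "bottom of the stack")
(define child (cog-new-atomspace base-space))
(cog-set-atomspace! child)
(Concept "top of the stack")

; Open a RocksDB store, save the frame DAG itself,
; then the contents of the current frame.
(define sto (RocksStorageNode "rocks:///tmp/frame-demo"))
(cog-open sto)
(store-frames child)    ; hypothetical: record the DAG structure
(store-atomspace)       ; store the Atoms of the current frame
(cog-close sto)
```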

I've used the terms "DAG" and "branch" in this email. A DAG is a "directed acyclic graph" (if unclear, look it up). Basically, an AtomSpace sitting on top of two others exposes the contents of those two as a set-union: the user of the top-most space sees all of the elements of the contributing members as a single, unified store. Thus, branches can be created, and they can also be merged together.
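The set-union behavior can be sketched like so. This assumes multi-parent support in `cog-new-atomspace` (passing a list of spaces), which I have not verified; treat it as a sketch of the intended semantics, not a tested recipe:

```scheme
(use-modules (opencog))

; Two independent branches, each holding one Atom.
(define left  (cog-new-atomspace))
(define right (cog-new-atomspace))
(cog-set-atomspace! left)  (Concept "seen on the left branch")
(cog-set-atomspace! right) (Concept "seen on the right branch")

; Merge: one space stacked on top of both branches.
; (Assumes a list of parents is accepted here.)
(define joined (cog-new-atomspace (list left right)))
(cog-set-atomspace! joined)

; The top space exposes the set-union of its parents,
; so both Concepts should be visible from here.
(cog-get-atoms 'ConceptNode)
```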

Why now, and not before? Well, until now, everyone has been happy working in just one big AtomSpace, and letting it accumulate changes as time went on. Maybe two: a second space for temporary storage. However, in the course of the (natural-language) learning project, it became clear that it would be useful to create a record, a chain of records, of the syntactic and semantic structure that has been learned so far.  There are several reasons for this. Different parameters to the algo cause different things to be learned: how do these differences change, diverge, converge over time? As the algo marches on, the quality of its learning often gets worse over time, going from eye-poppingly great in early stages to off-the-rails delusional at later stages. Not only is it nice to have a rewind function, to go back to the last known-good stage, but also to have snapshots, to study how things go bad, and to compare different parameters in different branches.  Having chains of snapshots and multiple branches provides robustness for "lifetime learning", in which an algo accumulates knowledge over time. To be robust, it must cross-check new knowledge against other branches, against earlier knowledge. Having explicit inference histories allows for explicit control over inference directions.  We've never had this before. (Well, apologies to Nil: the PLN reasoner does have BIT "Backwards Inference Trees", but these were never generic, never exposed to the general public.)

FYI, the implementation wasn't easy. This really is a significant new feature, and not some trite software glitter. I thought it would take a week; it took a month. Almost all of the complexity is in the RocksDB storage backend. Representing frames in RAM is not that hard, not really. Getting save and restore to disk working correctly was a long slog. Especially if one is trying to read news about the Ukraine War at the same time :-) Slava Ukraina!

Documentation and examples: There is an old, very simple (trivial) multi-atomspace demo that precedes this work; it is unchanged: see https://github.com/opencog/atomspace/blob/master/examples/atomspace/multi-space.scm

There aren't any demos (yet) of how to save and restore to disk; I'll try to get to that "any day now". In the meanwhile, the most basic unit test reveals all:  https://github.com/opencog/atomspace-rocks/blob/master/tests/persist/rocks/space-frame-test.scm

These two are written in Scheme, but everything should work equally well in Python.

-- Linas

--
Patrick: Are they laughing at us?
Sponge Bob: No, Patrick, they are laughing next to us.