Distributed Atomspace

346 views
Skip to first unread message

Linas Vepstas

unread,
Jul 16, 2020, 2:53:42 PM7/16/20
to Cassio Pennachin, opencog, Andre Luiz de Senna
Hi Cassio,

Just got done listening to your CogCon presentation ... so .. chunks!  I think you hit the nail on the head with the concept of "chunks". It is maybe the #1 most important (hardest) part of the design.  Let me explain why...

I (recently) took two extremely-naive attempts at implementing distributed atomspace -- atomspace-ipfs and atomspace-dht (see the github repos) and since I was naive, these efforts... well, they work, but they don't scale.

The core problem that wrecks both of those was the problem of "chunks" -- of knowing when two atoms are "near" each other.  Of knowing when a glob of atoms should travel together, should be lumped together.  Without knowing what belongs to a glob, a chunk, it was hard/impossible to have a good, scalable backend.

... and if you do know how to identify the boundaries of a glob, then almost all of the other aspects of building a distributed atomspace become "easy".

Clarifying this makes all the other problems go away (or turn into problems that any programmer can implement, without major stumbling blocks) .. so any thoughts, work to clarify the idea of "chunks" would be ... well, for me, it would be extremely important.

-- linas

--
Verbogeny is one of the pleasurettes of a creatific thinkerizer.
        --Peter da Silva

Matt Chapman

unread,
Jul 16, 2020, 4:51:59 PM7/16/20
to opencog
All general intelligences we know of today are time-bound. Human brains can't solve problems that take more than ~100 years of processing, or more than ~100 hours of conscious attention. We could also use time bounds to limit chunks.

If I understand correctly, the "chunk" is the set of atoms in some subgraphs. So then the question is, how much of our locally cached atomspace do we publish to the distributed atomspace, given that newly created atoms locally may have low probability of surviving attention allocation in the distributed atomspace. 

If you are the mind agent deciding this question, then the answer is, you push as many as you can in a fixed time, ordering the set of outgoing links from each success atom pushed by the local attention value in a breadth-first search. Each "message" in the (immutable) "distributed update event stream" should have a header stating whether it is the beginning of a new chunk, or the continuation of a chunk. Then each consumer of that update stream can likewise use a time limit to determine how many of the atoms in a chunk it can consume. If a given local pattern matching process fails to find a complete pattern match, but finds a partial match in atoms updated by one or more incompletely received chunks, then it can allocate additional time to retrieve the rest of the atoms in those chunks and continue it's search.

I think this even generalizes to Execution Nodes -- if an executed atomese program fails to return a result of the expected form, and the program contains atoms from incomplete chunks, then retrieve the rest of those chunks. Alternatively, a specialized mind agent that is expected to handle executions might implement a rule to always alocat as much time as necessary to retrieve all execution nodes from all published chunks.

Or I might be totally missing the point, but these are my thoughts after today's presentations. I've definitely made some assumptions about the architecture of distributed atomspaces (i.e., the use of immutable event streams) that may not be consistent with better established ideas, since I've not been paying much attention to OpenCog development lately.

If you've read this far, I'll throw out 2 more brief thoughts:
1. Rust is awesome.
2. Semantic version is essential. Hyperon can be released as 1.0.0 and Tacyon can be 2.0.0 if it does not maintain backward compatibility.

I can try to defend these claims when I have more time later, if they are controversial.

Be well,

Matt

--
Please interpret brevity as me valuing your time, and not as any negative intention.


--
You received this message because you are subscribed to the Google Groups "opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to opencog+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CAHrUA37bz5CN%2BvDGXsy%3DFCcNjp_TY4ivNuDWN51LgdPCdawnSg%40mail.gmail.com.

Linas Vepstas

unread,
Jul 23, 2020, 11:09:07 AM7/23/20
to opencog
OK,

I was hoping to have a technical conversation about chunks; my fault because perhaps I was not clear about that.

When one attempts to store data in a computer, it is usually best if related things are nearby. Literally -- nearby on disk, RAM, cache. When one attempts to store data in a distributed database, it is best to if related things are on the same network node, and travel together in the same packets.  That way, you can issue fewer requests, initiate fewer network connections and make fewer queries.

The actual technical problem that we actually currently have right now as a real problem (and not a hypothetical pie-in-the-sky sci-fi problem, but as a true real right-now-it-is-affecting-the-code problem, like its-a-bug-in-the-current-codebase problem) is that we do not have any effective strategy for identifying chunks in the atomspace.

We do have hashes on atoms -- these are currently 64-bit hashes, so not cryptographic, but "good enough", but these hashes destroy locality.  Two closely related atoms will have completely different hashes.  So hashes cannot be used to determine locality.

What can? I don't know. That's why I'm asking.

For this to work, the chunking has to be immutable. The goal is that user A on the other side of the planet can agree with user B on the name of the chunk, without having to talk between each other first. (because forcing users to talk is not scalable -- talk scales like N^2 but we want storage to scale like log N or better.)

What I am fishing for, is either some example pseudocode, or the name of some algorithm or some wikipedia article that describes that algorithm, which can compute chunks.  Ideally, an algo that can run in fewer than a few thousand CPU cycles, because, after that, performance becomes a problem.  If none of this, then some brainstorming for how to find a reasonable algorithm.

-- Linas 


Ben Goertzel

unread,
Jul 23, 2020, 12:34:42 PM7/23/20
to opencog
> What I am fishing for, is either some example pseudocode, or the name of some algorithm or some wikipedia article that describes that algorithm, which can compute chunks. Ideally, an algo that can run in fewer than a few thousand CPU cycles, because, after that, performance becomes a problem. If none of this, then some brainstorming for how to find a reasonable algorithm.
>

Linas, just to be sure we're in sync -- how large of a chunk are you
thinking this algorithm would typically find?

Linas Vepstas

unread,
Jul 23, 2020, 6:53:59 PM7/23/20
to opencog
Arbitrary. If you look at what happened with opencog-ipfs or opencog-dht, there are several key operations. One is, of course, "who's got this atom?" but that's easy: each atom has a 64-bit hash (or 80-bit on opendht by default, but that's settable). Next, "what's the incoming set of this atom?" Whoops, can't compute the hash of that, because we don't know what it is. So you can ask, and get back a list of N other atoms (or hashes) that are in the incoming set. Where are they? Well, each different atom gets a totally different hash, so they spread all over the planet (because that's how Kademlia works), when in fact, what we really wanted to say was "gee golly, the incoming set of an atom is 'close to' the atom itself, get me the ball of close-by stuffs".  But I can't figure out how to "say that". 

Anyway, that is what I am trying to define as a chunk: an atom and everything "nearby", with a variable conception of "nearby".

atomspace-ipfs had multiple major stumbling blocks. One is that the IPFS documents are immutable, so for each new atomspace, you have to publish a brand new document -- which has a completely different hash, so whoops, how do findout out the hash of that?. Well, IPFS has a DNS-like naming system, but it was horridly slow, totally unusuable (multi-secnod lookups with 60-second timeouts). The second problem is that its "centralized" -- you have to jam the *entire* atomspace into the document. So its klunky. Won't scale for large atomspaces. Some notion of chunks alleviates that. But maybe something less klunky than IPFS would be better.

So that suggests a lower-level building block - e.g. opendht. and that is how atomspace-dht was born. But that now seems to be maybe "too low". It suffered from the chunking problem.

Here's one, somewhere in the middle: "earthstar" -- https://github.com/cinnamon-bun/earthstar is a decentralized document store. Cross out "document" and write "atomspace" instead. Or rather cross out "atomspace" and write "chunk" instead. Or something like that. Quite unclear.

The reason atomspace-cog got created is it seems best to have "seeders", same idea as in bittorent, so at least one server that is the source of truth for a given atomspace, even if all the other servers are down/offline.  The current ipfs and dht backends do not use seeders, but I've got extremely vague plans to change that.

--linas


Matt Chapman

unread,
Jul 23, 2020, 11:06:36 PM7/23/20
to opencog
OK, so I think I now understand that a Chunk is defined as, minimally, an atomX and all the atoms in it's incoming set. In that case, then the "name" of the chunk" may as well be the hash of the central atomX. If a Chunk2 is defined as "an atom, its incoming set atoms, and their incoming set atoms," then the name of that chunk can be a hash of the central atoms hash concatenated to the hash of all the 1st generation atoms, and so on, turtles & hashes all the way down  to ChunkN. 

Except we have the problem that some process may add a new atom to,e.g., the 1st generation, so under the above strategy, we have a new hash of hashes for the chunks centered on atomX. This could be a feature, not a bug. It depends on what you want. If you want a name that dynamically refers to an atom and all it's incoming children to the Nth generation, then I guess that name is "atomX-N" no?

But if that process that is adding atoms exists on some other node in our datomspace cluster, how can we know about that new atom? I'll flail stupidly toward some hopefully relevant ideas in the rest of this message.


> The goal is that user A on the other side of the planet can agree with user B on the name of the chunk, without having to talk between each other first. 

What you really want is, of course, impossible in a distributed scenario, but we can get a close approximation. If it's not obvious why I think it's impossible, I can explain in a follow-up.

In most commercial distributed data storage systems I know of (several), you have both a Partition Strategy (how do I decide which cluster node owns this record/atom) and a Replication Strategy (where do I put copies of this record so that I can recover it if the owner disappears, or so I can accept writes from multiple nodes?). Some systems also have a Consistency strategy (how many replicas do I have to look at in order to be sure of the present state of the record?) In this nomenclature, I would suggest that Chunks are not a Partitioning problem, but rather a Replication & Consistency problem. 

I suggest this because I believe some things that may not be true, so let me write out my assumptions so that you can ignore the rest of this message as soon as you hit an assumptions that doesn't hold:

1. Some cluster node will "own" each atom by assignment via some simple division of the hash address space. 
2. Each cluster node will also contain replicas of many other atoms, not only for disaster recovery purposes, but also because mind agents on that node will need in local memory many atoms "owned" by other nodes. Once we've obtained them from their owners, we might as well keep them around until we need to recover memory space for other "borrowed" atoms more urgently needed.
3. A mind agent on a given node wants to be able to update atom properties (truth value, etc) locally, without having to talk to the "owner" node directly.
4. Perfect consistency of atom state between different nodes is not a strict requirement, but it is desirable for a node to be able to identify the 'authoritative' source for a given atom, and that source should reflect a reasonably recent state of the atom as updated by any replica node.
5. Relatively poor storage efficiency is acceptable. I.e., a single node may only be able to dedicate a relatively small portion of its memory to storing the atoms it owns; a majority of its space may go to replicated atoms. Nodes are cheap; we'll just buy more. :-)

Given those design goals, I think we're looking at a publish-subscribe model for replicating updates to atoms. So, the owner for a given atom would also subscribe to updates for all atoms in the chunk (i.e., all atoms in the owned atom's incoming set) thus committing on a best-effort basis to maintain a reasonably up-to-date subgraph of all the nodes in the chunk, so that when some node in the cluster requests the chunk (by reference to the central atom's hash) it can also get reasonably recent copies of all the connected atoms.

If a particular mind agent is very sensitive to consistency, it can, of course, take the time to request the authoritative state of each atom in the chunk from its owner, but (I assume) in most cases this won't be necessary. The mind agent can also choose to subscribe directly to the update stream from the authoritative node, if it desires to apply updates caused by other mind agents, or it can periodically request the chunk again from the central node owner, if it prefers to trade consistency/latency for bandwidth efficiency.

See also:


All the Best,


Matt

--
Please interpret brevity as me valuing your time, and not as any negative intention.
--
You received this message because you are subscribed to the Google Groups "opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to opencog+u...@googlegroups.com.

Ben Goertzel

unread,
Jul 23, 2020, 11:11:28 PM7/23/20
to opencog
Linas,

So it doesn't directly address the issue, but this paper looks vaguely relevant

https://www.sandia.gov/~srajama/publications/twolevel_ipdps18.pdf

They are looking at graph chunking for a different purpose (graph
algorithms on high bandwidth memory machines) and they experiment with
vertex chunking, edge chunking, random chunking and partitioning based
chunking...

An important point they note is that for real-world graphs, vertex and
edge chunking (which are very simple) work about as well as
partitioning based chunking, whereas this is not so true for random
graphs. The point being that in real-world graphs, much-interlinked
nodes tend to be highly related in terms of usage ...

In the Gemini distributed graph processing system,

https://www.usenix.org/sites/default/files/conference/protected-files/osdi16_slides_zhu.pdf

https://github.com/thu-pacman/GeminiGraph

they divide graphs into chunks based on "# vertices + # edges" , so a
mix of vertex chunking and edge chunking...

They report this works quite well emphasize that this is because of
the nice locality properties (nearby nodes tend to be used together)
of the graphs they're dealing with

-- Ben

On Thu, Jul 23, 2020 at 3:54 PM Linas Vepstas <linasv...@gmail.com> wrote:
>
>
>
> --
> You received this message because you are subscribed to the Google Groups "opencog" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to opencog+u...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CAHrUA34VPBdCw-hoXH-Y8RWa-eQRAjvUGQ%2BbRNSdU8wu95dbgA%40mail.gmail.com.



--
Ben Goertzel, PhD
http://goertzel.org

“The only people for me are the mad ones, the ones who are mad to
live, mad to talk, mad to be saved, desirous of everything at the same
time, the ones who never yawn or say a commonplace thing, but burn,
burn, burn like fabulous yellow roman candles exploding like spiders
across the stars.” -- Jack Kerouac

Ben Goertzel

unread,
Jul 23, 2020, 11:16:16 PM7/23/20
to opencog
Matt,

So regarding these requirements,

> 1. Some cluster node will "own" each atom by assignment via some simple division of the hash address space.
> 2. Each cluster node will also contain replicas of many other atoms, not only for disaster recovery purposes, but also because mind agents on that node will need in local memory many atoms "owned" by other nodes. Once we've obtained them from their owners, we might as well keep them around until we need to recover memory space for other "borrowed" atoms more urgently needed.
> 3. A mind agent on a given node wants to be able to update atom properties (truth value, etc) locally, without having to talk to the "owner" node directly.
> 4. Perfect consistency of atom state between different nodes is not a strict requirement, but it is desirable for a node to be able to identify the 'authoritative' source for a given atom, and that source should reflect a reasonably recent state of the atom as updated by any replica node.
> 5. Relatively poor storage efficiency is acceptable. I.e., a single node may only be able to dedicate a relatively small portion of its memory to storing the atoms it owns; a majority of its space may go to replicated atoms. Nodes are cheap; we'll just buy more. :-)
>
> Given those design goals, I think we're looking at a publish-subscribe model for replicating updates to atoms.


-- what Linas and Cassio and Senna have all posited, is that it may be
more sensible to replace "Atom" with "Chunk" (i.e. sub-metagraph) in
the above requirements..

What the references I sent in my just-prior email suggest is that, for
the sorts of graphs that tend to be created in real life, defining
Chunks in a fairly simple heuristic way (i.e. each chunk is just a
bunch of tightly-ish connected nodes and links) rather than via
running an expensive partitioning algorithm will generally be
adequate.

The requirements you state are in my view correct as regards Atoms.
However, the perspective being put forth is that handling these
requirements explicitly on the level of Atoms rather than Chunks will
become computationally intractable given the number of Atoms involved
and the dynamic nature of the Atomspace.

-- Ben

Ben Goertzel

unread,
Jul 23, 2020, 11:20:32 PM7/23/20
to opencog
Differently but indirectly relatedly, this caching system for graph
queries looks interesting,

https://openproceedings.org/2017/conf/edbt/paper-119.pdf

Linas Vepstas

unread,
Jul 24, 2020, 12:26:20 AM7/24/20
to opencog, Xabush Semrie
Took the dog for a walk, which helps w/ thinking. So .. a very short reply (as short as I can make it) and will ponder the entire email chain tomorrow morning.

The meta-question I pondered during the walk was "what can be prototyped in a few days/weeks?" and the answer becomes simpler (because the choices are fewer). Two parts.

Part one: create a custom atom (UpLink (Atom X) (Number N)) and it will return the incoming set up to N steps upward from X. This atom has a unique hash, and so can always be found.

What should a remote server do, when asked for this? Well, it could just do the look-up, then and there, and return the results.  Alternately, the remote node can do the lookup, attach the results, together with a timestamp, as a Value on that UpLink, and return that. That way, you know how old/stale your results are.

Part two: Oh wait, we already have UpLink. It's called JoinLink.

Part three: Oh wait, we could do this with an arbitrary BindLink/GetLink. We could take the most recent search results, attach a timestamp to the results and attach that as a Value on some key.  That way, you can ask the network for that Bind/Get, and if it comes back with a new-enough timestamp, you can be happy and just use the results, and not re-perform the search. If you're not happy with the results, you can re-run the pattern match, and publish your latest/greatest results to the world. (Or rather, you attach the results to the Bind/Get, and announce "I too have a copy of this atom")

There are still a few holes in what I describe above, but maybe they're not serious. Not sure. My sense is that some variant of this can be prototyped in not much time at all. It might even be usable for the genomics work, where the data is almost totally static, where many searches tend to be built on sub-searches which can be cached, and do not have to be recomputed each time.  Currently, we cache in a scheme wrapper, but caching search results as a Value on the Get/BindLink itself makes more sense.  (FWIW this kind of caching was briefly done for openpsi, many years ago, but fell into disuse. Amen might remember details, I don't. I just remember that caching made sense, at the time.) Habash was thinking about a server for the genomics data, but I think he was going in a different direction. But maybe this works for him?

--linas


--
You received this message because you are subscribed to the Google Groups "opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to opencog+u...@googlegroups.com.

Ben Goertzel

unread,
Jul 24, 2020, 1:11:30 PM7/24/20
to opencog, Xabush Semrie
This all sounds useful but doesn't address your question on chunking, right?

Musing a bit, it seems like ultimately what you're going to need is a
ChunkLink, n-ary links for fairly large n ... where a ChunkLink is
basically the same as a SetLink but with the added semantics that
"this set should be considered a chunk from the standpoint of
distributed processing or memory caching"

ChunkLInks could be formed via an heuristic algorithm that greedily
partitions Atomspace into chunks of a desired size (measured in terms
of both nodes and links) ... Potentially if the result-sets of prior
Pattern Matcher queries are saved these could also be used to
heuristically guide the formation of chunks. (But note that according
to my current understanding, chunks should be disjoint whereas the
results of PM queries need not be...)

In a distributed system each chunk (represented by a ChunkLink) has a
certain home local-Atomspace, and one then has some variant of the
publish-subscribe dynamics Matt Chapman suggested but on the chunk
rather than Atom level

In a random Atomspace, this could be horribly inefficient because the
heuristic for ChunkLink formation would badly reflect Atom usage
patterns. However, if I look at the BioAtomspace -- the biggest
useful Atomspace we have around right now -- it seems like the
heuristic would work OK regarding both imported and inferred links...

An added heuristic would have to be used to determine which machine a
given chunk lives on of course, But there are lots of cool
algorithms for this in the CS literature, e.g. Ja-Be-Ja which is
wholly distributed/decentralized/localized in operation...

https://www.researchgate.net/publication/279230270_A_Distributed_Algorithm_for_Large-Scale_Graph_Partitioning

But in what I'm suggesting something like Ja-Be-Ja would be carried
out on chunks not Atoms for figuring out which. machines in a
distributed-Atomspace network to put a given ChunkLink and its targets
on

This direction does not contradict what you just suggested with
timestamps and Bind/Get links etc., I'm just trying to more directly
address your earlier question on chunking...

Of course the dilemma re chunking is that to chunk a big knowledge
graph really effectively will require very expensive operations and we
are talking here about something that -- in the context of a dynamic
knowledge graph -- needs to be done quite rapidly and light-weight-ly
as it needs to be redone over and over and updated as the graph
changes.... So we are going to do chunking heuristically and
badly-ish, and rely on the rough overall semantic locality of the
graph to make it not-too-bad... there is nothing else to do is there?
But I guess you already know this Linas but are somehow holding out
for a miracle genius insight to bypass this cruel reality? Not sure
it exists in this swath of the multiverse...

ben
> To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CAHrUA34AfV5UbAJA2M0VzRe-iJtkg-Ska0xq3zWY-iEYW6babA%40mail.gmail.com.

Linas Vepstas

unread,
Jul 28, 2020, 7:17:52 PM7/28/20
to opencog, Xabush Semrie
So

> a miracle genius insight to bypass this cruel reality?

So .. hypergraphs really are different than graphs, in some important ways. The graph-algorithm people struggle to find "chunky" pieces, because they are limited to just ... vertexes and edges. We've got something ... better. Hypergraphs have "natural chunks". Given an atom X, the corresponding natural chunk is the entire, recursive incoming set of X.

It's worth dwelling on this point for a moment. For an ordinary graph, given a vertex V, it is tempting to define a "natural chunk" as V plus all the edges E attached to it. But then one has the overwhelming urge to make it just a tiny bit larger ... by also adding those edges that would form complete triangles. If one succumbs to this urge, then, heck, lets include those edges that make squares, or maybe instead, if an edge is attached to a dense hub, then include that hub and spokes as well... all of a sudden, it snowballs out of control, and the simple idea of a "natural chunk" in a graph is not natural at all, it's just some snowball of things that were deemed "well-connected-enough".

Somehow, hypergraphs chunks don't succumb to this temptation. The incoming set is very natural, and I can't think of any particular way that I would want to make it "larger".  I can't think of any heuristic, rule-of-thumb, common-sense idea to make it larger. So it really feels natural, it's resistant to snowballing, it's endowed with a naturalness property that ordinary graphs don't have.  ... and that's a good thing.  We can come back to this, but it seems reasonable to set aside the chunking problem for now.

OK ... so a full-decentralized peer-to-peer atomspace network. I think this might be easier than you imagine, and that we are not very far from having one.  So ... first, a peer-to-peer network needs to have a way for two peers to chat with each-other. I think that part is "more-or-less" done, with the https://github.com/opencog/atomspace-cog/ work. Basically, each peer runs a cogserver, and it can download or upload those parts of an atomspace that it wants with the other connected peers. We should talk more about "what parts each peer wants to up/download and share" (it would be better if you tell me what you imagine these might be. Right now, I'm thinking mostly "results of BindLinks or JoinLinks")

Next, a peer-to-peer network needs to have a peer-discovery system. I think this might be relatively straightforward using either opendht or ipfs. Several modes that can apply.

* Authorities: a peer could publish an "authoritative record" meaning basically it has the complete entire dataset, and is willing to share it. e.g. "Genomic Data version 2.0 (Complete)" and the URL it can be found at. e.g "cog://example.com:27001/". We might want to stick a time-stamp in there, maybe some other meta-data. Both IPFS and DHT provide natural ways to ask "who's got 'Genomic Data version 2.0 (Complete)' and get back a reply with the URL.

* Peer networks. The idea here is that (if) no one has the full dataset, but everyone has some part of it. Suppose you want some portion of it. Suppose that the portion you want can be described by a BindLink. You then look (in dht/ipfs) to see which peers in the network have recent results for that bindlink (i.e. they ran it recently, they've got hot, recent results cached). You then contact those peers (at cog://example.com/ using the existing cogserver) to get those results. It is up to you to merge them all together (deduplicate). And then you would announce "hey I've recently run this bindlink, I've got cached results, in case anyone cares" (this is all automated under the covers).

It's possible that none of the peers on the network have recent results for the bindlink. In which case, you have to ask them "oh pretty please can you run this query for me?" -- you ask because some might refuse to, because they may be overloaded. e.g. they may be willing to serve up incoming-sets, but not bindlinks. You can then punt, or you can ask for the incoming sets if you think you don't have everything you need.

I'm glossing over details, but I think all this is quite doable, and just not that hard.

The meta-issue here really is: "who needs this, and by when?" I know that Habush wants to have a genomic data server, but he has not expressed how much data he wants it to store, how many peers he wants to hang off of it, or whether he wants to have multiple redundant servers, or what. He was inventing his own server infrastructure to do this ... which is not a bad idea, but I think a network of cogservers could also do the trick.

Wait, let me rephrase that last sentence: I don't have any clear idea of what is wanted for some genomic-data-service, so I can't directly claim that the existing network of cogservers "does the trick". Maybe it does? If not, why not? What desirable function is missing?

See where I'm going with this? I'm basically saying "let's prototype some of these basics, and use them in an actual project, and see how it goes".
--linas

Ben Goertzel

unread,
Jul 29, 2020, 12:41:29 AM7/29/20
to opencog, Xabush Semrie
Linas,

> So .. hypergraphs really are different than graphs, in some important ways. The graph-algorithm people struggle to find "chunky" pieces, because they are limited to just ... vertexes and edges. We've got something ... better. Hypergraphs have "natural chunks". Given an atom X, the corresponding natural chunk is the entire, recursive incoming set of X.
>
> It's worth dwelling on this point for a moment. For an ordinary graph, given a vertex V, it is tempting to define a "natural chunk" as V plus all the edges E attached to it. But then one has the overwhelming urge to make it just a tiny bit larger ... by also adding those edges that would form complete triangles. If one succumbs to this urge, then, heck, lets include those edges that make squares, or maybe instead, if an edge is attached to a dense hub, then include that hub and spokes as well... all of a sudden, it snowballs out of control, and the simple idea of a "natural chunk" in a graph is not natural at all, it's just some snowball of things that were deemed "well-connected-enough".
>
> Somehow, hypergraphs chunks don't succumb to this temptation. The incoming set is very natural, and I can't think of any particular way that I would want to make it "larger". I can't think of any heuristic, rule-of-thumb, common-sense idea to make it larger. So it really feels natural, it's resistant to snowballing, it's endowed with a naturalness property that ordinary graphs don't have. ... and that's a good thing. We can come back to this, but it seems reasonable to set aside the chunking problem for now.


Hmm... you are right that OpenCog hypergraphs have natural chunks
defined by recursive incoming sets. However, I think these chunks
are going to be too small, in most real-life Atomspaces, to serve the
purpose of chunking for a distributed Atomspace

I.e. it is true that in most cases the recursive incoming set of an
Atom should all be in the same chunk. But I think we will probably
need to deal with chunks that are larger than the recursive incoming
set of a single Atom, in very many cases.


> * Peer networks. The idea here is that (if) no one has the full dataset, but everyone has some part of it. Suppose you want some portion of it. Suppose that the portion you want can be described by a BindLink. You then look (in dht/ipfs) to see which peers in the network have recent results for that bindlink (i.e. they ran it recently, they've got hot, recent results cached). You then contact those peers (at cog://example.com/ using the existing cogserver) to get those results. It is up to you to merge them all together (deduplicate). And then you would announce "hey I've recently run this bindlink, I've got cached results, in case anyone cares" (this is all automated under the covers).

This relates to the idea that results from a BindLink query should be
a chunk, right?

> It's possible that none of the peers on the network have recent results for the bindlink. In which case, you have to ask them "oh pretty please can you run this query for me?" -- you ask because some might refuse to, because they may be overloaded. e.g. they may be willing to serve up incoming-sets, but not bindlinks. You can then punt, or you can ask for the incoming sets if you think you don't have everything you need.
>
> I'm glossing over details, but I think all this is quite doable, and just not that hard.

What happens when the results for that (new) BindLink query are spread
among multiple peers on the network in some complex way?

>
> See where I'm going with this? I'm basically saying "let's prototype some of these basics, and use them in an actual project, and see how it goes".

The genomics use-case is a good one for this, and I think we could
come up with some further details for you on what we want from a
distributed Atomspace for a genomic data service.

-- Ben

Linas Vepstas

unread,
Jul 29, 2020, 12:47:07 AM7/29/20
to opencog, Xabush Semrie
Part the second.

This email only deals with the kinds of communications that two peers may want to have between each other, in a distributed AtomSpace.  It's a cut-n-paste of a proposed API (in C++; it is not hard to imagine the python/scheme equivalents).  I think it is general enough to do everything that Habush wants for the genomic data servers.

Since it's just an API for now, perhaps it's time to ponder an RFI process -- Request for Implementation.  It's a cut and paste of some code I plan to merge "real soon now", mostly because it's almost dirt-simple to implement.  The API is just ... one!!  new function:

      /**
       * Run the `query` on the remote server, and place the query
       * results onto the `key`, both locally, and remotely.
       * The `query` must be either a JoinLink, MeetLink or QueryLink.
       *
       * Because MeetLinks and QueryLinks can be cpu-intensive, not
       * all backends will honor this request. (JoinLinks will be
       * honored, in general; they can be thought of as a generalized
       * incoming set, and are much faster to process.) Backends are
       * free to return previously-cached results for the search,
       * rather than running a fresh search. If the flag `fresh` is
       * set to `true`, then the server may interpret this as a
       * request to perform a fresh search.  It is not required to
       * honor this request.

       *
       * If the `metadata_key` is provided, then metadata about the
       * search is returned. This may include a time-stamp indicating
       * when the search was last performed. If the search was refused,
       * a value indicating that will be returned.  The metadata is
       * intended to allow the receiver (i.e the user of this local
       * AtomSpace) what to do next.
       *
       * Note that the remote server may periodically purge search
       * results to save on storage usage. This is why the search
       * results are returned and placed in the local space.
       *
       * Only the Atoms that were the result of the search are returned.
       * Any Values hanging off those Atoms are not transfered from the
       * remote server to the local AtomSpace.
       *
       * FYI Design Note: in principle, I suppose that we could have
       * this method run any atom that has an `execute()` method on
       * it. At this time, this is not allowed, for somewhat vague
       * and arbitrary reasons: (1) we do not want to DDOS the remote
       * server with heavy CPU processing demands (you can use the
       * CogServer directly, if you want to do that). We also want to
       * limit the amount of complexity that the remote server implementation
       * must provide. For example, there's a slim chance that traditional
       * SQL ang GraphQL server might be able to support some of the
       * simpler queries.  If you want full-function hypergraph query,
       * just use the CogServer directly.
       */
      virtual void runQuery(const Handle& query, const Handle& key,
                            Handle metadata_key = Handle::UNDEFINED,
                            bool fresh=false);

That's it. This should take less than a day to implement (famous last words).

--linas

Linas Vepstas

unread,
Jul 29, 2020, 1:36:37 AM7/29/20
to opencog, Xabush Semrie
On Tue, Jul 28, 2020 at 11:41 PM Ben Goertzel <b...@goertzel.org> wrote:


Hmm... you are right that OpenCog hypergraphs have natural chunks
defined by recursive incoming sets.   However, I think these chunks
are going to be too small, in most real-life Atomspaces, to serve the
purpose of chunking for a distributed Atomspace

I.e. it is true that in most cases the recursive incoming set of an
Atom should all be in the same chunk.  But I think we will probably
need to deal with chunks that are larger than the recursive incoming
set of a single Atom, in very many cases.

I like the abstract to the Ja-be-ja paper, will read and ponder. It sounds exciting.

But ... the properties of a chunk depends on what you want to do with it.

For example: if some peer wants to declare a list of everything it holds, then clearly, creating a list of all of its atoms is self-defeating. But if some user wants some specific chunk, well, how does the user ask for that? How does the user know what to ask for?   How does the user say "hey I want that chunk which has these contents"?  Should the user say "deliver to me all chunks that contain Atom X"? If the user says this, then how does the peer/server know if it has any checks with Atom X in it?  Does the peer/server keep a giant index of all atoms it has, and what chunks they are in? Is every peer/server obliged to waste some CPU cycles to figure out if it's holding Atom X?  This gets yucky, fast.

This is where QueryLinks are marvelous: the Query clearly states "this is what I want" and the query is just a single Atom, and it can be given an unambiguous, locally-computable (easily-computable; we already do this)  80-bit or a 128-bit (or bigger) hash and that hash can be blasted out to the network (I'm thinking Kademlia, again) in a compact way - its not a lot of bytes.  The request for the "query chunk" is completely unambiguous, and the user does not have to make any guesses whatsoever about what may be contained in that chunk.  Whatever is in there, is in there. This solves the naming problem above.


What happens when the results for that (new) BindLink query are spread
among multiple peers on the network in some complex way?

I'm going to avoid this question for now, because "it depends" and "not sure" and "I have some ideas".

My gut impulse is that the problem splits into two parts: first, find the peers that you want to work with, second, figure out how to work with those peers.

The first part needs to be fairly static, where a peer can advertise "hey this is the kind of data I hold, this is the kind of work I'm willing to perform." Once a group of peers is located, many of the scaling issues go away: groups of peers tend to be small.  If they are not, you organize them hierarchically, they way you might organize people, with specialists for certain tasks.

I think it's a mistake to try to think of a distributed atomspace as one super-giant, universe-filling uniform, undifferentiated blob of storage. I think we'll run into all sorts of conceptual difficulties and design problems if you try to do that. If nothing else, it starts smelling like quorum-sensing in bacteria. Which is not an efficient way to communicate. You don't want broadcast messages going out to the whole universe. Think instead of atomspaces connecting to one-another like dendrites and axons: a limited number, a small number of connections between atomspaces,  but point-to-point, sharing only the data that is relevant for that particular peer-group.

-- Linas

Matt Chapman

unread,
Jul 29, 2020, 2:09:08 AM7/29/20
to opencog
>I think it's a mistake to try to think of a distributed atomspace as one super-giant, universe-filling uniform, undifferentiated blob of storage. 

> You don't want broadcast messages going out to the whole universe.

Not sure if you intended to imply it, but the reality of the first statmentt need not require the 2nd statement. Hashes of atoms/chunks can be mapped via modulo onto hashes of peer IDs so that messages need only go to one or few peers.

Specialization has a cost, in that you need to maintain some central directory or gossip protocol so that peers can learn which other peers are specialized to which purpose.

An ideal general intelligence network may very well include both a large number of generalist, undifferentiated peers and clusters of highly interconnected specialized peers. If peers are neurons, I think this describes the human nervous system also, no?

To borrow terms from my previous messsge, generalist peers own many atoms, and replicate few, while specialist peers own few or none, but replicate many.

Matt



--
You received this message because you are subscribed to the Google Groups "opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to opencog+u...@googlegroups.com.

Abdulrahman Semrie

unread,
Jul 29, 2020, 9:34:51 AM7/29/20
to opencog
 > I think it's a mistake to try to think of a distributed atomspace as one super-giant, universe-filling uniform, undifferentiated blob of storage.

It is not clear to me why this is a mistake. It obviously has its use cases.  When I think of a distributed atomspace, I think of multiple atomspaces that are partitioned & replicated over multiple nodes (the first case with mutiple atomspace support in #2138). To give an example, say we want to store human genomic variant data for both reference builds,  hg19 and hg38 (this link explains what the genome reference builds are). The variant data for each reference will go into two separate atomspaces. But since these atomspaces need to store 100GB+ data, they need to be distributed over multiple nodes, each node holding portions of each atomspace. A local client will then specify which atomspace it wants to query by id (something like "hg38") and send a query (You can see elements of this design in the atomspace-rpc code). From the client's perspective it is as if it is querying a single large atomspace,  but the query engine takes care of coming up with an execution plan (optimizing the query if it has to) and getting the results from the partitions and returning it to the client. Also when a user wants get variant data from the ghrc38 atomspace, only partitions of ghrc38 should be searched/queried.

> I think we'll run into all sorts of conceptual difficulties and design problems if you try to do that. If nothing else, it starts smelling like quorum-sensing in bacteria. Which is not an efficient way to communicate. You don't want broadcast messages going out to the whole universe.

I suggest you to look into the design docs of Nebula graph DB, which is a strongly typed distributed graph db. I believe they address the above issues you mentioned and  it is possible to implement something similar for the first version of the distributed Atomspace.   Here are the links


[Storage Design] - https://docs.nebula-graph.io/manual-EN/1.overview/3.design-and-architecture/2.storage-design/ - part of this currently implemented through the Postgres backend as demonstrated in this example

[Query Engine] - https://docs.nebula-graph.io/manual-EN/1.overview/3.design-and-architecture/3.query-engine/ - esp.  interesting how they implement access control through sessions, which partly relates to #1855

They implement sharding somewhat similar to what you described above using Edge-Cut - storing a destination vertex and all its incoming edges in the same partition, a source vertex and its outgoing edges in the same partition. They use Multi-raft groups (" Multi-Raft only means we manage multiple Raft consensus groups on one node") to achieve consistency across partitions for multiple databases. This is contrary to what you suggested in that each node doesn't broadcast its changes, only the elected leader will broadcast changes (i.e send log requests) and the rest of the nodes will update their partitions accordingly. Of course, a new leader can be elected if the current leader fails or it term ends. The above design also solves what you noted as the "unsolved part" in #2138

Anyways, I think implementing something similar to Nebula db as an initial version has immediate benefits for projects that use the atomspace to store and process out-of-RAM data such the genomic data.

Linas Vepstas

unread,
Jul 29, 2020, 12:37:49 PM7/29/20
to opencog
On Wed, Jul 29, 2020 at 1:09 AM Matt Chapman <ma...@chapmanmedia.com> wrote:
>I think it's a mistake to try to think of a distributed atomspace as one super-giant, universe-filling uniform, undifferentiated blob of storage. 

> You don't want broadcast messages going out to the whole universe.

Not sure if you intended to imply it, but the reality of the first statmentt need not require the 2nd statement. Hashes of atoms/chunks can be mapped via modulo onto hashes of peer IDs so that messages need only go to one or few peers.

Which peers?  How do you find them? You are thinking Kademlia (as do I, when I think of publishing) or OpenDHT or IPFS. Which is great, if all you're doing is publishing small amounts of static, infrequently-changing information.  Not so much, if interacting or blasting out millions of updates.  Neither system can handle that -- literally -- tried that, been there, done that. They are simply not designed for that.

Now, perhaps using only a hash-driven system, it is possible to overcome these issues. I do not know how to do this. Perhaps someone does -- perhaps there are even published papers ... I admit I did not do a careful literature search.

But, basically, before we are even out of the gate, we already have a snowball of problems with no obvious solution.  Haven't even written any code, and are beset by technical problems. That's not an auspicious beginning.

If you have something more specific, let me know. Right now, I simply don't know how to do this.

--linas

Ben Goertzel

unread,
Jul 29, 2020, 12:40:03 PM7/29/20
to opencog
On Wed, Jul 29, 2020 at 6:35 AM Abdulrahman Semrie <hsam...@gmail.com> wrote:
>
> > I think it's a mistake to try to think of a distributed atomspace as one super-giant, universe-filling uniform, undifferentiated blob of storage.
>
> It is not clear to me why this is a mistake.

It's a mistake because making a call from machine A to machine B is
just sooooooo much slower than making a call from machine A to machine
A ...

So if you try to ignore the underlying distributed nature of a
knowledge store, and treat it as if it was a single knowledge blob
living in one location, you will wind up making a system that is very,
very, very slow...

My Webmind colleagues and I were naive enough to try this in the late
1990s using Java 1.1 ;-)

One challenge though is: From a language and algorithm design
perspective, it is of course necessary to abstract away many of the
details of distributed infrastructure, while still respecting the
difference btw a localized and distributed knowledge store.

E.g. an AI algorithm may need to be aware that pieces of knowledge can
have three different statuses: Local, Remote (in RAM on some other
machine in Distributed Atomspace) or BackedUp (disk). So then when
it issues a query it may need specify whether its search for an answer
should be Local only, should include Remote machines, or should also
include BackedUp data... Because having an AI algorithm issue all
its queries across a distributed Atomspace + disk backup will just be
too slow. So in this case the existence of a distributed/persistent
infrastructure requires the AI algorithm to prioritize its queries w/
at least 3 levels of priority.

> I suggest you to look into the design docs of Nebula graph DB, which is a strongly typed distributed graph db. I believe they address the above issues you mentioned and it is possible to implement something similar for the first version of the distributed Atomspace. Here are the links
>
> [Overview] - https://docs.nebula-graph.io/manual-EN/1.overview/3.design-and-architecture/1.design-and-architecture/
>
> [Storage Design] - https://docs.nebula-graph.io/manual-EN/1.overview/3.design-and-architecture/2.storage-design/ - part of this currently implemented through the Postgres backend as demonstrated in this example
>
> [Query Engine] - https://docs.nebula-graph.io/manual-EN/1.overview/3.design-and-architecture/3.query-engine/ - esp. interesting how they implement access control through sessions, which partly relates to #1855
>
> They implement sharding somewhat similar to what you described above using Edge-Cut - storing a destination vertex and all its incoming edges in the same partition, a source vertex and its outgoing edges in the same partition. They use Multi-raft groups (" Multi-Raft only means we manage multiple Raft consensus groups on one node") to achieve consistency across partitions for multiple databases. This is contrary to what you suggested in that each node doesn't broadcast its changes, only the elected leader will broadcast changes (i.e send log requests) and the rest of the nodes will update their partitions accordingly. Of course, a new leader can be elected if the current leader fails or it term ends. The above design also solves what you noted as the "unsolved part" in #2138



There are some interesting things in Nebula and maybe some stuff for
us to learn there.

However, their assumption of complete consistency across the
distributed KB does not match our requirements for OpenCog. We need
complete consistency only regarding certain sorts of knowledge items
-- for other cases it's OK for us if different versions of an Atom in
different parts of a distributed system drift apart a little and are
then reconciled a little later.

The assumption of complete consistency is built into the RocksDB
infrastructure that they use, btw

ben

Matthew Ikle

unread,
Jul 29, 2020, 1:33:01 PM7/29/20
to 'Nil Geisweiller' via opencog

On Jul 29, 2020, at 10:39 AM, Ben Goertzel <b...@goertzel.org> wrote:

On Wed, Jul 29, 2020 at 6:35 AM Abdulrahman Semrie <hsam...@gmail.com> wrote:

I think it's a mistake to try to think of a distributed atomspace as one super-giant, universe-filling uniform, undifferentiated blob of storage.

It is not clear to me why this is a mistake.

It's a mistake because making a call from machine A to machine B is
just sooooooo much slower than making a call from machine A to machine
A ...

So if you try to ignore the underlying distributed nature of a
knowledge store, and treat it as if it was a single knowledge blob
living in one location, you will wind up making a system that is very,
very, very slow...

My Webmind colleagues and I were naive enough to try this in the late
1990s using Java 1.1   ;-)

Ah yes I recall those days and the (in)famous Java 1 with the original broken Java Memory Model, not fixed until 2004 (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.17.7914&rep=rep1&type=pdf).
--
You received this message because you are subscribed to the Google Groups "opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to opencog+u...@googlegroups.com.

Matt Chapman

unread,
Jul 29, 2020, 2:18:21 PM7/29/20
to opencog
> Which peers? 
As determined by a token ring:


I think you could almost replace "vnode" with "chunk" if you wanted to adopt the Cassandra architecture, although I wouldn't be surprised to see performance problems with a huge number of vnodes, so it might actually need to be a "chunk-hash modulo reasonable number of vnodes".

 > How do you find them?

By calculating the partition token via consistent hash, as Cassandra does with Murmur3. This tells you the authoritative source for the chunk you want. You might also have a local cache of other peers that have had replicas of that chunk, in case any of them are more responsive to you. Cassandra calls this process of finding potential replicas "Snitching".


 > You are thinking Kademlia (as do I, when I think of publishing) or OpenDHT or IPFS.

Nope. I've only played with IPFS a bit, but I don't expect it to be performance for the atomsoace use case. I'm only vaguely familiar with openDHT; it seems worth exploring, but I'm sure you understand it far better than I do. 

I'm not very familiar with p2p systems like kademlia, but I suspect that's optimized for consistency & availability over performance, so not the right choice for datomspace.

By this point, it should be clear that I look to Cassandra for how semi-conistent distributed data storage systems should be designed. (Fwiw, my inspiration for distributed messaging systems comes mostly from Apache Kafka.)


> Which is great, if all you're doing is publishing small amounts of static, infrequently-changing information.  Not so much, if interacting or blasting out millions of updates.  Neither system can handle that -- literally -- tried that, been there, done that. They are simply not designed for that.

Cassandra is.  To be fair, Cassandra is optimized for massive scale, with may involve some trade-offs that are not desirable for present-day atomspace use cases.

See also, ScyllaaDB for a C++ reimplementation of Cassandra.

> Now, perhaps using only a hash-driven system, it is possible to overcome these issues. I do not know how to do this. Perhaps someone does -- perhaps there are even published papers ... I admit I did not do a careful literature search.


Ben Goertzel

unread,
Jul 29, 2020, 2:59:20 PM7/29/20
to opencog
Matt,

I looked at Cassandra some time ago, haven't used it in practice though...

You are pointing it out here as a source of design ideas/inspirations,
but I'm also wondering: Do you think it would be a strong choice as an
ingredient in an OpenCog Hyperon (next-gen OpenCog) distributed
Atomspace? We have been looking at Apache Ignite which serves a
different purpose, and of course the two have been integrated
https://apacheignite-mix.readme.io/docs/ignite-with-apache-cassandra
as well...

It looks like graph databases aren't going to be apropos for the
persistent storage component in Hyperon, and key-value stores are
probably the right level to be looking at...

I haven't thought through how the various levels of non-ACID
consistency in Cassandra might help with distributed Atomspace,

https://blog.yugabyte.com/apache-cassandra-lightweight-transactions-secondary-indexes-tunable-consistency/

ben
> To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CAPE4pjALKeWmpzxwoYR7gCmS5ZcDqrrKPaB0V-UZe814G6cwTA%40mail.gmail.com.

Linas Vepstas

unread,
Jul 29, 2020, 3:35:14 PM7/29/20
to opencog
On Wed, Jul 29, 2020 at 11:40 AM Ben Goertzel <b...@goertzel.org> wrote:

have three different statuses: Local, Remote (in RAM on some other
machine in Distributed Atomspace) or BackedUp (disk). 

The distinctions between these can become rather blurry.  The point of disk-backup is to not lose data, but if your data is in RAM on a remote server, that might be enough. High-end database machines have battery-backed disk drives, so that a powerloss to the building does not lose in-flight transactional data.

Compute centers often have waterfalls of 100-gigabit optical fibers splashing out the back of the CPU's, going to the far end of the room, and attaching to RAM and/or disks there -- this is not new; in 1995, the infiniband storage interconnect ran tcp/ip to disk-drives and allowed remote-DMA access to RAM.   The now-defunct blue-rivers had fiber-optics going straight into the CPU chip itself, because this used less power than toggling electrical bits on a metal pin, thus saving on power and cooling. But it also meant that the RAM could be as far away as your fiber-optic cable is long.  The commercial storage market has some really wild stuff out there that addresses common issues faced by the market.  The local/remote/backed-up distinction is a heritage of 1960's-1990's tech and consumer-grade hardware.  It's not a bad distinction, it's just .. blurry, slightly archaic, rapidly-changing and we don't want to bake it into any AtomSpace specifications.

Ben Goertzel

unread,
Jul 29, 2020, 3:40:34 PM7/29/20
to opencog
***
The local/remote/backed-up distinction is a heritage of 1960's-1990's
tech and consumer-grade hardware. It's not a bad distinction, it's
just .. blurry, slightly archaic, rapidly-changing and we don't want
to bake it into any AtomSpace specifications.
***

Yeah, fair enough, however it's still a better approximation than "one
big undifferentiated Atomspace blob" ;-)

Indeed we would like the core Atomese language to allow more nuanced
distinctions, while allowing local/remote/backed-up distinctions to be
easily made when useful...
> --
> You received this message because you are subscribed to the Google Groups "opencog" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to opencog+u...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CAHrUA36Vhn-zFqizu6srMJ_gVbQttqt8jG-%2BOr1emD1c70X_LQ%40mail.gmail.com.

Linas Vepstas

unread,
Jul 29, 2020, 3:45:36 PM7/29/20
to opencog
On Wed, Jul 29, 2020 at 8:34 AM Abdulrahman Semrie <hsam...@gmail.com> wrote:
But since these atomspaces need to store 100GB+ data, they need to be distributed over multiple nodes, each node holding portions of each atomspace.

 
Sometimes, it's easier to solve complicated problems by throwing money at them.

--linas

Ben Goertzel

unread,
Jul 29, 2020, 3:48:03 PM7/29/20
to opencog
He may mean a 100GB flat file bio dataset, which when blown up into
Atoms will occupy more like...
> --
> You received this message because you are subscribed to the Google Groups "opencog" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to opencog+u...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CAHrUA37BjU2%2BaPrjFJH6LJBogsiJRuj6es1KMRLvS2VHk5Gb7A%40mail.gmail.com.

Linas Vepstas

unread,
Jul 29, 2020, 4:21:49 PM7/29/20
to opencog
On Wed, Jul 29, 2020 at 8:34 AM Abdulrahman Semrie <hsam...@gmail.com> wrote:
 > I think it's a mistake to try to think of a distributed atomspace as one super-giant, universe-filling uniform, undifferentiated blob of storage.

It is not clear to me why this is a mistake.

The mention of "quorum-sensing in bacteria" is not entirely spurious. The way that quorum sensing works is that a bacterium emits small polypeptides, which other bacteria can sense (i.e. can "smell"). The local strength of the "smell" provides feedback to bacteria as to how many neighbors it has. This can be used to perform computing; for example, in slime-molds, it is used to solve the two-armed bandit problem (the exploit-vs-explore problem - google it, it's fun). Turns out that tree-roots, shrubs, etc. use this technique as well.

The problem with this technique is it's slow -- rate-limited by diffusion, and high-cross-talk -- everybody smells everything, messages interfere with one-another.  The jellyfish solves this problem by inventing the neuron. The neuron is a Star Trek teleporter, a Star Gate for small polypeptides. Except now we call them neuro-transmitters, not polypeptides. So, the polypeptide walks into a star-gate, and is instantly transported (about a millisecond) to a location a few inches, a few feet away -- this is 5-10 orders of magnitude faster than diffusion.  And there's no cross-talk -- the shielding around a neuron means that nothing leaks out of the axon -- the star-gates are located at the dendrites, only. Message transmission is clean, no interference. High-fidelity.

Neurons partition themselves off into local groupings. They only talk to peers. There is no every-to-every connection. Yes, sometimes neurons grow new connections, or drop old ones, but this is slow.  This partitioning has a huge data-processing advantage over the universe-filling undifferentiated blob of bacteria or slime-mold.

If you want to have 2-3 or 5 or 10 or 20 atomspaces talk to each-other about genomic data, that is fine. But don't ask those atomspaces to also store robot data, and language-processing data, and face-recognition data. Partition them off into a group of peers who are interested only in genomics.  They don't need to process those atoms that say that Sophia just moved her arm, or that Sophia heard a round of applause after her speech at some conference at the opposite end of the planet. There is absolutely no need for that.

What there is a need for is to find out who's doing what. So, if I have a new robot-dancing algorithm, I want to find those Sophias with the newer  shoulder-joint designs, and talk to those atomspaces. The who-has-got-what-where info can be published in a global index, a universally-shared lookup table. e.g. IPFS. So this is like bit-torrent -- whatever content you want, you look it up in the global DHT, but then, to actually download on bit-torrent, you only talk to the local peers who actually have that data, You don't talk to the entire universe. 

So here -- you look up to see who's got the genomic data you want, and you talk to just them.  And if, for example, you recently imported a newer cell-ontology, or a new copy of some other bio dataset, you publish that, like bit-torrent, so that other atomspaces can find it, and connect, and download.

-- Linas

Matt Chapman

unread,
Jul 29, 2020, 4:22:18 PM7/29/20
to opencog
I don't know your performance requirements, but I always thought one way to do a distributed atomspace would simply be to have a bunch of independent atomspaces that all share one distributed Cassandra database as the "disk" storage layer.

Note that I continue to reference the name Cassandra because it is better known, but if you were going to adopt a third-party datastore whole cloth, I do recommend the Scylla C++ implementation of Cassandra, having used it in production for realtime, ML systems at moderate scale (6+ nodes in my case, though it is documented to scale to hundreds, as I recall).




Linas Vepstas

unread,
Jul 29, 2020, 4:36:07 PM7/29/20
to opencog
On Wed, Jul 29, 2020 at 8:34 AM Abdulrahman Semrie <hsam...@gmail.com> wrote:
 
Anyways, I think implementing something similar to Nebula db as an initial version has immediate benefits for projects that use the atomspace to store and process out-of-RAM data such the genomic data.

Someone should do that. Oh wait! Someone already did! Like 10 years ago, we had the Atomspace hooked up to BigTable or HyperTable or whatever it was called, and what happened is that absolutely no one ever actually used it for anything, because no one was writing any kind of atomspace algorithms that actually needed this.

Pie-in-the-sky design is usually terrible.  If you have actual, specific technical problems, if you have trouble accomplishing some specific task, that is when you talk, think, invent, write new code.  Because talking/thinking/writing requires energy, time, money. Creating something no one wants is pointless, even if it sounds like a cool sci-fi book. (I mean, I'm all for creating useless art, but let's call it Art, then, and not pretend that it's as useful as a steam-shovel.)

(This was also why Corto failed -- the charts were awesome, pie-in-the-sky stuff - but they failed to solve any actual problem that anyone actually had. This is not uncommon.)

--linas

Matt Chapman

unread,
Jul 29, 2020, 5:16:12 PM7/29/20
to opencog
> If you want to have 2-3 or 5 or 10 or 20 atomspaces talk to each-other about genomic data, that is fine. But don't ask those atomspaces to also store robot data, and language-processing data, and face-recognition data. Partition them off into a group of peers who are interested only in genomics.  They don't need to process those atoms that say that Sophia just moved her arm, or that Sophia heard a round of applause after her speech at some conference at the opposite end of the planet. There is absolutely no need for that.
> What there is a need for is to find out who's doing what. So, if I have a new robot-dancing algorithm, I want to find those Sophias with the newer  shoulder-joint designs, and talk to those atomspaces. 

I can think of two ways to handle this using the Cassandra Architecture:

1) Implement a custom org.apache.cassandra.locator.AbstractEndpointSnitch class where you "misuse" the `rack` and `datacenter` designations to instead refer to your groups of peers. I.e., you might have a 'rack' called 'genomics' and then pers in that rack would use a query consistency of "ONE" to ensure that they retrieve data only from the 'genomics' group if it exists there.

2) You extend your partition key to include grouping data besides the chunk central atom handle hash. I.e., the partition key can be calculated using some explicit grouping label, like adding a "group: genomics" value onto Atoms, or a bitmap for whether or not the chunk has an outgoing link to any of the predefined set of "NetworkTopologyNodes," i.e., a special atom type for the purpose. 

Using a pub-sub messaging paradigm, you could have nodes that update chunks publish their update to both a "local" topic used only by other group members for lower latency updates, and also to a "global" used by the undifferentiated masses to replicate those updates for data resiliency and for use by other peer groups. I'm sure clever topic partitioning (in the Kafka sense) could also be used here for performance optimizations.

> The who-has-got-what-where info can be published in a global index, a universally-shared lookup table. e.g. IPFS. So this is like bit-torrent -- whatever content you want, you look it up in the global DHT, but then, to actually download on bit-torrent, you only talk to the local peers who actually have that data, 

Anything this is globally shared and frequently updated doesn't scale. That's why it's better to use hash-based mappings to vnodes with a location that only rarely changes, combined with a local cache of peers having replicas. If your DHT never contains more than a small multiple of the total number of peers, and is only updated when a new peer joins or leaves the cluster (which I expect to be extremely rare in current opencog use cases, i.e., only in case of hardware failure) then I don't think you'll run into performance problems with your DHT. 

> You don't talk to the entire universe. 

I don't think anyone is suggesting this is a good idea.

> Pie-in-the-sky design is usually terrible.  If you have actual, specific technical problems, if you have trouble accomplishing some specific task, that is when you talk, think, invent, write new code.

You are absolutely correct; but pie-in-the-sky design is way more fun than real work... :-)

Is there a public document somewhere describing actual, present use-cases for distributed atomspace? Ideally with some useful guesses at performance requirements, in terms of updates per second to be processed on a single node and across the cluster, and reasonably estimated hardware specs (num cores, ram, disk) per peer?


All the Best,

Matt

--
Please interpret brevity as me valuing your time, and not as any negative intention.

--
You received this message because you are subscribed to the Google Groups "opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to opencog+u...@googlegroups.com.

Linas Vepstas

unread,
Jul 29, 2020, 7:05:38 PM7/29/20
to opencog
On Wed, Jul 29, 2020 at 4:16 PM Matt Chapman <ma...@chapmanmedia.com> wrote:

Is there a public document somewhere describing actual, present use-cases for distributed atomspace? Ideally with some useful guesses at performance requirements, in terms of updates per second to be processed on a single node and across the cluster, and reasonably estimated hardware specs (num cores, ram, disk) per peer?

No, because more or less no one does this. As far as I know, there are two cases: some experiments by the Ethiopians working on MOZI, and my work.  Briefly, then ...

In my case, it was language-learning ... one process chews through large quantities of text, and counts, incrementing counts on atoms, for days/weeks. Things like power-outages, or just plain screwups can result in the loss of days/weeks of processing, so backing to disk is vital. Once you have a dataset, may want to run different experiments on it, so want to be able to make copies.

Of course, I was using postgres for this. The experience was underwhelming. It was fairly easy to go into a mode where postgres is using 80% of the CPU, the atomspace is using 20%. Worse, the speed bottleneck was limited by rotating disk drives, so upgrading to SSD disks gave a huge performance boost (I don't recall the number) Even then, I found that traffic in the SATA link between disk and motherboard was 50% with bursts of 100% of grand-total SATA bandwidth -- so apparently postgres can saturate that just fine.  This is why PCIe SSD is interesting - PCIe has much higher bandwidth than SATA. 

Then there's postgres tuning. There are many things you can tweak in postgres, and one of them is to turn off the fsync flag (or something like that, I forget now) so things get buffered instead of being written to disk right away. This gives a huge performance boost -  sounds great! ... until the first thunderstorm knocked out power. The resulting database was corrupted, even the postgres tools could not recover it. So -- several weeks of work down the tubes. Of course, there are 100 warnings on 100 blogs that say you should never do this unless your disk-drive has batteries built into them. Which is how I know that you can get disk drives with batteries in them :-)

Now, you could respond by saying "oh postgres is bad, blah blah blah, you should use database X, blah blah blah it's so much better" but I find those arguments utterly unconvincing. One reason is I don't think the postgres people are stupid. The other is that the atomspace has a very unusual data structure. So -- it's very often the case that "normal" databases store objects that are thousands of bytes long, or larger -- say, javascript snippets, or gifs, or blog posts,  user-comments, web-pages, product ratings. Even when the data is small -- the data structure is nearly trivial "joe blow gave product X a five-star rating" -  a triple (name, product, rating).  The atomspace is almost the anti-universe of this: the atoms are almost always small - a dozen bytes, but have very complex inter-relationships, of atoms connected to atoms in every-which way in giant tangles. (Imagine the word "the" and a link to every sentence that word appears in. Then imagine links between other words in those sentences ...  most words are 3-7 bytes long. The tangle of what words are connected to what others is almost hopeless. This is also true for the genomics data.  I recently counted the number of gene-pairs, gene-triangles and gene tetragons in Habush's genome dataset, the number of tetragons is huge... but I digress)  So when someone says "database XYZ will solve all of the atomspaces problems", I mostly don't believe it.

----
The MOZI experience...

MOZI had a very different experience (but they said very little about it, so I'm not sure).  From what I can tell: (1) they struggled to configure postgres (2) the resulting performance was poor, (3) They needed a read-only underlay, and read-write overlays. (4) they probably needed user logins. (5) and a data dashboard.

Point 3 is that you have a genomic dataset -- large atomspace -- and a dozen scientists plinking on it. So, you can either make a dozen copies -- or more -- each scientist might run several experiments a day - so 12 scientists x 2 copies/scientist/day x 200 days/year x 50GB per copy = a big number.  Another approach is to have a single read-only atomspace, with read-write overlays, so everyone shares the read-only base, and makes changes that are isolated from the other scientists. The atomspace can now do this. The downside is that if you lose power ... and you also need a big-RAM machine, and each scientist might want to consume a lot of CPU, and there's still some lock contention in the atomspace, so if everyone is trying to modify everything all at the same time, you end up spending a lot of time waiting on locks.  I've tried to performance-tune the living daylights out of the atomspace, but workload-specific issues always remain .. some workloads are very different than others.

A third approach is one that Habush is working on now: a central server for the read-only copy, and local users copying over those subsets that they need for their particular experiments.  Details are TBD, but we've got several deuling approaches: I keep blathering about cogserver and backends because it works, and is proven, he's trying a different set of architectures, which is a good thing, because it's unhealthy to get trapped in Linas' tiny thought-bubble. You want to go out and explore the world.

Point (4) user logins ... no clue.  There's nothing in the atomspace regarding data protection, data security, partitioned access, safety, verifiability, loss recovery, homomorphic computing :-)  The security model for the atomspace is out of the 1960's - if you have physical access to the atomspace, you can do anything at all. 

Point (5) data dashboard -- a nice GUI that lists all of the atomspaces you've got, which ones are live, which ones are archived, what's in them, how big are they? What jobs are running now? Are they almost done? Do I have enough RAM/CPU to launch one more? (never mind, do I have enough disk/SATA bandwidth?)  Imagine one of those youtube ads with smiling millennials pretending to run a business by swiping right on their cell phone.

Point (5) is a HUGE GIANT BIG DEAL. I've spent years  pumping data through the atomspace and it's like sitting in front of a laundry machine watching your clothes spin round. It's boring as heck, but if you don't watch, something goes wrong or you forget what you're doing or you can't find the results from last week because you forgot where you put them. This is a major issue. I slog through this with a 100% manual process because I'm a masochist, but it is not a tenable state of affairs. 

And more or less everyone is ignoring this, although MOZI does have assorted GUI's and dashboards for genomics data.

-- linas

Ben Goertzel

unread,
Jul 29, 2020, 7:13:55 PM7/29/20
to opencog
>> Is there a public document somewhere describing actual, present use-cases for distributed atomspace? Ideally with some useful guesses at performance requirements, in terms of updates per second to be processed on a single node and across the cluster, and reasonably estimated hardware specs (num cores, ram, disk) per peer?

A couple years ago we put together a document collecting together some
use-cases for Distributed Atomspace (not ones we are currently using
distributed Atomspace for, but stuff we're doing or have previously
done that could use Distributed Atomspace). I will dig up that
document and share it.

Ben Goertzel

unread,
Jul 29, 2020, 7:30:25 PM7/29/20
to opencog
Regarding the genomics use-cases, there are many but here are three
which are clear and interesting

1) Genome annotation

Here, we have a large bioAtomspace containing diverse information
about e.g. human genes and their connections with proteins, diseases,
etc.

We then have a list of genes that results from some data-analysis, and
we want to find all the info in the (distributed) Atomspace related to
these genes.

This is very similar to what the Gearman distributed processing setup
that Mandeep etc. made years back was supposed to do (and probably
still does if it hasn't bitrotted too badly).

2) DeepWalk embedding vector creation

Here, we have the same bioAtomspace, and for each GeneNode (or e.g.
each ConceptNode representing a GO category) we want to generate a
large number of biased-random walks through that node. These walks
are then processed by DeepWalk algorithm and used to generate
embedding vectors.

The walks of length say K=10 through a node are very likely to wander
across multiple machines, in a distributed-Atomspace setting.

3) PLN inference

We have an implication such as

"overexpression of gene G in person P" ImplicationLink "Person P will
live past 90"

and we want to estimate its truth value using backward chaining
inference. As the backward chainer does its thing, its
premise-selection process will need to use distributed knowledge.

For example, suppose the BC wants to evaluate the truth value of the
premise P5= "Gene G IntensionalInheritance concept C" ... where
concept C is perhaps some GO category that G does not belong to
extensionally.

The BC may determine that the data about this premise P5 lies most
centrally on lobe M12 in the distributed Atomspace, and then it may
ask lobe M12 to evaluate the truth value of P5 using PLN.

Then, in evaluating P5, M12 may wind up wanting truth value
evaluations of other premises than rely on data centered on other
lobes... etc.

...

These are all things we are doing right now on a localized Atomspace,
but that would obviously merit from a distributed-Atomspace
implementation.

-- Ben

Matt Chapman

unread,
Jul 29, 2020, 7:45:51 PM7/29/20
to opencog
> So when someone says "database XYZ will solve all of the atomspaces problems", I mostly don't believe it.


If you think this is what I'm saying by describing Cassandra's architectural choices, then you are missing the point.

It was not so when I first discovered OpenCog, but in the years since, I've gained many man-months of hands on experience with PostgresSQL, wen more than my experience with Cassandra, and Postgres is a fantastic piece of software, if what you need is an ACID-compliant relational database that can fit on a single machine. It's various forms of distributed operation, of which I have limited experience, seem to range from OK to terrible. And the OK ones (proxied partitioning) leave the hardest problems unsolved (how to partition intelligently) and create scaling bottlenecks (the proxy/coordinator).

Cassandra, and many other modern non-relational data stores, are designed from the ground up to operate in distributed clusters over unreliable networks, and are willing to flex on things like ACID in order to gain performance for specialized use cases that don't require strict ACID. They are rarely interchangeable with each other, and they are almost never interchangable with a fully-consistent ACID-compliant relational database, if that's what you really need.

None of the use cases so far described convince me that the atomspace needs a fully-consistent ACID-compliant relational database. So I'm not suggesting a different one; i'm suggesting not to use one.

I'm not saying, "use Cassandra instead of Postgres" (although I think that could have benefits for some workloads). 

I'm saying, "use a Data distribution architecture like Cassandra's, instead of Kademelia." (Although Kademelia may be a fine choice for the small part of that architecture that does require a global DHT).

Or, "Use a tiered  pub-sub messaging paradigm, with a distribution architecture like Kafka or Pulsar, instead of trying to maintain global state at all." (I'm less sure about this, but it makes more sense to me as the independence of peer-subgroups increases.)

Matt

--
You received this message because you are subscribed to the Google Groups "opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to opencog+u...@googlegroups.com.

Linas Vepstas

unread,
Jul 29, 2020, 8:50:07 PM7/29/20
to opencog
On Wed, Jul 29, 2020 at 6:45 PM Matt Chapman <ma...@chapmanmedia.com> wrote:

If you think this is what I'm saying by describing Cassandra's

Sorry, it was not meant to be a jab at you ... over the last decade, something like a dozen different databases have been proposed, each with different reasons for using them. As I recall -- "nosql databases" -- BASE not ACID -- so we tried memdb (couchdb(?) was recommended). The bitter  lesson was that it was optimized for 100MByte mp3's and 1MByte gifs and had a throughput of about 100 atoms/second. The memdb developers couldn't care less - "what kind of moron stores 12 bytes in a database?" was the general reaction.

Then there was the neo4j work. The lesson there was that 95% of CPU was spent converting atoms into ZeroMQ packets (using google protocol buffers, if I recall) and RESTful API's written in python using python decorators ... lord knows how much CPU in neo4j itself unpacking the packets.  Again, I think this was also about 100 Atoms/second ... This is when the idea of chunks and chunking started getting discussed, since obviously things could run faster if we could ship thousands of atoms over at a time. Or maybe if we could get neo4j to do the pattern matching, and ship back only the results. How do you send a pattern-matcher query to neo4j?

By comparison, the current ASCII-file-reader for reading Atoms in s-expression format does about 100K atoms/second  (that's on my machine ... I'm told that the latest Apple laptops are maybe 5x faster?...) I actually measured: about 45% of CPU time is spent doing string-compares and string-copying and find-first-character-in-string and 55% of the cpu time was in the atomspace, actually adding Atoms. Or maybe it was 55/45 the other way around. I forget.

I do have extensive notes on atomspace performance in https://github.com/opencog/benchmark/ - on my machine, raw atomspace is 700K nodes/sec and 200K links/sec so maybe a million/sec on something modern. Running at 100 atoms/sec through some RESTful/zero-mq/whatever interface is embarrassing.

I'm writing in this flippant style because I'm trying to make it fun to read my emails. There's a serious lesson here: converting things that are 12 bytes long into other things has just a huge overhead. I'm not sure how c++ std::string is implemented -- how many cpu cycles it takes to compare a byte, add one and go to the next byte ... but if you do anything much more complicated than that, you pay a performance penalty.  This is where the performance bar is set. It's hard to figure out how to jump over that bar. Or even get near it.

-- Linas

Matt Chapman

unread,
Jul 29, 2020, 9:25:35 PM7/29/20
to opencog
I didnt take it personally, no need to apologize. I enjoy the more relaxed style, and often try for the same. Anyway, you're the expert here, and I should be disregarded if I'm speaking nonsense, but I'll make one more point in an attempt to convince you that I'm not:

If the unit of distribution is a a chunk, i.e., an Atom and the Atoms that make up it's outgoing set, then your average storage size is  (12bytes * average num outgoing^depth) and that rapidly gets to the point where the serialization overhead becomes a small fraction of the whole chunk-record.

I remember when protobuf over ZeroMQ was the toy of the month, and now that I understand those technologies better, I'm sure protobuf is a terrible idea, but I don't necessarily see anything wrong with using ZeroMQ to pass around 12 byte messages or 12n^d byte messages. In fact, the ML system that I mentiond earlier, which used Scylla/Cassandra as it's distributed feature store used ZeroMQ for communication between it's dozens of workers.

I also know that the creator of ZeroMQ has moved on to a new messaging project intended to replace zmq, but I don't recall the name of the new one yet.

Anyway, I guess my point here is just to encourage you not to discard technologies like ZeroMQ or NoSQL databases too quickly, just because someone failed to created a successful implementation in the past. 

Anyway, despite my off hand response to Ben, I'm not convinced replacing the current Postgres Backend with an interface to Cassandra would solve all the problems. But I am convinced that a token-ring-like architecture with tuneable consistency and topology-aware peer discovery and bloom-filter caching is essential for the kind of distributed data sharing performance you want, assuming a data set larger than can possibly fit in memory on a  single machine. I doubt it can be done with a DHT alone.

Distributed data processing of any kind is hard to get right. There are reasons we get a new unicorn start-up in this space every 6 months, and why most of them are on life support about a year later. (Cloudera. I predict Databricks reign will end too. Confluence may be on the rise now, but we'll see)

Also, keep in mind that you'll never get anything like the performance of loading from disk once you're dealing with other machines over a network. You're trading that performance for the ability to work with data larger than you can fit on your own machine. So comparisons to loading from a static file are unfair, although at least were finally talking about concrete numbers that can set the goal to reach for.

Will anybody out there fund my Distributed Data start-up if I can build a PoC that loads an atomspace from Kafka as fast as you can load from your ASCII files? ;-)

Best,

Matt





--
You received this message because you are subscribed to the Google Groups "opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to opencog+u...@googlegroups.com.

Linas Vepstas

unread,
Jul 29, 2020, 11:15:25 PM7/29/20
to opencog
Hi Matt,

zero-mq is nice, if you use it correctly. The meta message is that some subsystems in the atomspace have reached the point where one can count bytes and cycles, and it's not obvious how to shave anything more off. The other message is that it's real easy to pointlessly waste cycles converting atoms into packets of various kinds to ship them around. 

The biggest message (I'm starting to think) is that placing atoms into a database, any database at all is pointless and useless. You can't actually "do anything" with atoms that are in a database; they are useful only when you can crawl over them (i.e. with the pattern engine or with PLN or as-moses, or pattern miner, etc.).

And this is key: I can't dump atoms into cloudera (or databricks) and run data-analytics on them. That doesn't make sense. 

So the only usefulness left in having a database is treating it like an archive manager. so, e.g. dump a million atoms into a text file or a binary blob, stick a name, date and author on it, and archive it.  So I guess what I really want is an archive manager. Like the lowly file manager on your pee cee. Or your photo-album manager. Except I don't need any database at all to do that -- and I can just stick them on bit-torrent to make them "distributed" -- or IPFS, or dat:// if you want to be awesomely modern.

So it seems like what I need is some way of saying "find me the archive of genomic data that has covid-19 in it" and maybe "if it's too big find me a smaller chunk with only the covid and not the reactome data".   And slap a real slick GUI on this ... seriously, like some cell-phone app GUI.

There's still a hole in the above: I can hear someone saying "but what if I have a trillion atoms, and I need to search all of them, that would require a petabyte of RAM!" and .. well, I have some ideas for that, (and so do others on this mailing list) but none of those ideas require a database of any sort!  The database "gets in the way", as it were.  It provides very nearly zero added value (except in a few corner cases...) In particular, all existing data analytics systems seem to be 100% useless for doing anything with atoms... !?! 😮

It feels like .. all these years, we've been imagining the wrong solution...

We need a plain-old archive manager, for now, and a chunk manager once we figure out what chunks are...

And by archive manager I'm thinking, like -- click a button and that launches an atomspace, loads it with archive 42, and then starts pln or whatever crawling on it... so maybe like the MOZI GUI's which I have never used....

--linas


Ben Goertzel

unread,
Jul 30, 2020, 1:20:25 AM7/30/20
to opencog
Matt, others,

Here is a rough draft for a Distributed Atomspace requirements
document, along the lines of Cassio's presentation at OpenCogCon last
month

https://docs.google.com/document/d/1n0xM5d3C_Va4ti9A6sgqK_RV6zXi_xFqZ2ppQ5koqco/edit#

-- Ben

Vitaly Bogdanov

unread,
Jul 30, 2020, 4:44:53 AM7/30/20
to opencog
> Or maybe if we could get neo4j to do the pattern matching, and ship back only the results. How do you send a pattern-matcher query to neo4j?

Btw, there is an example of Pattern Matcher implementation built on top of Neo4j and other graph databases (JanusGraph, SQL, in-memory) which was developed by Alexander Sherbatiy: https://github.com/stellarspot/atomspace-storage

This implementation uses Neo4j as a local storage not remote one and it is integrated via Neo4j API, so it is not related to the protobuf/zmq discussion.
But it seems very relevant to the overall discussion. All storages are integrated via same API which is very similar to the opencog/atomspace BackingStore.

Best regards,
  Vitaly

Vitaly Bogdanov

unread,
Jul 30, 2020, 7:06:46 AM7/30/20
to opencog
> Worse, the speed bottleneck was limited by rotating disk drives, so upgrading to SSD disks gave a huge performance boost (I don't recall the number)

Again, I don't say that another DB will solve all of the problems, but in context of SSD, there is key-value storage optimized for SSDs:

Best regards,
  Vitaly

Ben Goertzel

unread,
Jul 30, 2020, 10:43:25 AM7/30/20
to opencog
Regarding Neo4j, as noted in Section 6.5 of
https://wiki.opencog.org/w/File:OpenCog_Hyperon.pdf , the sort of
pattern matching done efficiently via Cypher does not cover all the
critical aspects we need for Atomspace,

To quote a few points from there,

"
* Deduplication of subgraphs (which complicates solutions of some
tasks over ordinary graphs) is still a default choice for many
Atomspace tasks
* Representation and retrieval mechanisms should be biased towards
acyclic hy- pergraphs (and related more general sorts of metagraphs)
instead of ordinary graphs, in particular, grounding of variables to a
whole subgraphs (expressions) at once is to be supported
* Variables in KB should be supported in contrast to graph DBs where
these are not part of the infrastructures
* Pattern matching should ground variables in KB entries and queries
simultaneously
"

Realizing these limitations one sees that for OpenCog purposes one
likely better off w/ more basic key-value or column stores than with
graphDBs ...

ben

ben
> --
> You received this message because you are subscribed to the Google Groups "opencog" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to opencog+u...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/586c1c6b-fcf8-4a0f-80b6-adce343d95acn%40googlegroups.com.

Linas Vepstas

unread,
Jul 30, 2020, 11:34:14 AM7/30/20
to opencog
On Thu, Jul 30, 2020 at 9:43 AM Ben Goertzel <b...@goertzel.org> wrote:

Realizing these limitations one sees that for OpenCog purposes one
likely better off w/ more basic key-value or column stores than with
graphDBs ...

While I was writing the last message to Matt, I realized that having any DB at all is very nearly pointless. That having a DB-backed storage is very nearly an anti-pattern. 

The only useful function that a DB seems to provide is to be able to say "get me this particular atom X from the disk".  But how often do you need that?  Far more typical  is that you want to load zillions of atoms, and do something with them.  If zillions of atoms are too big to fit in RAM, then we are back to the chunking problem that started this conversation, and the chunking problem has nothing to do with databases.

The only other advantage of databases is incremental backup -- for multi-day, multi-week-long calculations, you want to save partial results, one atom at a time.

How much RAM are you willing to sacrifice to the storage subsystem? Every byte of RAM used by the storage subsystem is one less byte available to the AtomSpace.

For this, it seems that -- indeed, having a simple but very fast key-value store might be best.

It took me a decade to arrive at this chain of thought. It seems confusingly obvious, too simple to be true. But there it is...

-- linas

Ben Goertzel

unread,
Jul 30, 2020, 12:20:57 PM7/30/20
to opencog
> While I was writing the last message to Matt, I realized that having any DB at all is very nearly pointless. That having a DB-backed storage is very nearly an anti-pattern.
>
> The only useful function that a DB seems to provide is to be able to say "get me this particular atom X from the disk". But how often do you need that? Far more typical is that you want to load zillions of atoms, and do something with them. If zillions of atoms are too big to fit in RAM, then we are back to the chunking problem that started this conversation, and the chunking problem has nothing to do with databases.
>
> The only other advantage of databases is incremental backup -- for multi-day, multi-week-long calculations, you want to save partial results, one atom at a time.
>

This is almost right Linas, but just a little too extreme...

Let's think e.g. about the genomics use-case.

Consider the following situation. An OpenCog system is thinking hard
about say 100 human genes at a time, building new links connecting
them to various concepts and predicates etc. Then it saves its
conclusions to a backing-store DB -- and moves on to the next batch of
human genes.

But while thinking about gene G, OpenCog may relate it to gene H, and
may then want to grab information from the backing-store about gene H
...

In this case the "chunk" of information that we want to grab from the
backing-store is "the sub-metagraph giving the most intensely relevant
information about gene H" ....

Note that the chunk related to gene H, desired on a certain occasion,
may overlap with the chunk related to gene H1 ... or the chunk related
to GO category GO7 ... desired on other occasions...

So I think it's a correct point that

-- the quantity of Atom-stuff to be sucked out of the BackingStore
into the Atomspace will almost always be a "chunk" of Atoms rather
than an individual Atom

However, I think these chunks are not always going to be extremely
huge (they could be 100s of Atoms sometimes, or 1000s sometimes, not
always hundreds of thousands or millions...)... and also the chunks
needed are going to overlap w/ each other in ways that can't be
foreseen in advance

Thus I believe that we need some fairly powerful static pattern
matching operating against the BackingStore, and that a primary
operation to focus on it:

-- send a Pattern Matcher query to BackingStore
-- sent the Atom-chunk resulting from the query to Atomspace

This is pretty clearly what is needed in the genomics use-case. But I
could come up with similar stories for other use-cases, e.g. if an
OpenCog-controlled robot meets a person "Piotr Walarz" for the first
time, it may wish to fish into the BackingStore to pull in a whole
bunch of nodes and links comprising previously ingested or inferred
knowledge about "Piotr Walarz" .... This will be a sizeable chunk
but maybe

-- If the AI's knowledge about "Piotr Walarz" comes from online
profiles etc., this could be a 100s to 1000s to 10000s of Atoms chunk
...

-- if the robot or other robots sharing the same KB has had a lot of
direct interaction with "Piotr Walarz", then it could be a much larger
chunk ... which may need to get fished into RAM only partially and in
multiple stages....


-- Ben


ben

Linas Vepstas

unread,
Jul 30, 2020, 1:40:13 PM7/30/20
to opencog
On Thu, Jul 30, 2020 at 11:20 AM Ben Goertzel <b...@goertzel.org> wrote:

In this case the "chunk" of information that we want to grab from the
backing-store is "the sub-metagraph giving the most intensely relevant
information about gene H" ....

Agreed with all you say, but let me flip this over on its head, and look at it from a comp-sci fundamentals point of view. There are two ways to "grab something". One way is to provide a pointer to it -- e.g. in C/C++, literally a pointer.  Some number, some unique ID, a name for the thing that you want, a handle.

The other way to grab something is to describe it: "I want all X's that have property Y". These are the only two options available to the system programmer.  Everything else has to be built out of these two operations.  And this is generically true for all programming languages, all operating systems ... I suppose maybe it is even some mathematical theorem, but I don't know it's name.

For the current C++ Atomspace, the first corresponds to having Handle to some Atom, and the second corresponds to running a query.  (Which might include looking at the attention-bank, or using the space/time-server, or other indexing subsystems that are currently not integrated into the AtomSpace). So, to get "the sub-metagraph giving the most intensely relevant information about gene H", you have to specify that in some declarative or algorithmic or recursive fashion that either results in a pointer, or a query to an existing subsystem.

Since we are talking about "persistance", I have a very easy answer: don't turn off the electricity! Ta dah! Persistent!  Doesn't fit in RAM? Buy more RAM! Oh, still have problems? Well, let me think about that. Flash-file storage works with 64K blocks that can be flashed in 5 usecs, which have a pending write buffer of 128 MBytes on the other side of a PCIe 1x link which can run at about 8 Gbps. That means that if I create index blocks that are 64K in size and only flash them when ... hmm. Interesting. But what if I... wait, hang on ... one could always ... let see, uhh .... let me get back to you on that.

These are your choices. Yes, in theory it is possible to create a query system that runs fast from rotating storage or SSD.  Storage vendors like Oracle have pumped many billions of dollars of R&D into exactly this -- for the last half-century.

The meta question facing the current Atomspace is: what can be done using limited manpower, limited time, limited money? And I'm saying: buy more RAM, and use a dirt-simple, super-tiny, ultra-fast "database" to access disk. I wrote "database" in quotes because it barely needs most of the bells-n-whistles that databases have.  It only needs some kind of fast disk-access algorithm, to figure out which block to fetch next (taking into account PCIe and FFS and other bandwidth limitations.)  Once you've gotten those blocks into RAM, just run the existing query engine to get the rest of the job done.

Now, if you have the time, the money, and the interest in low-level stuff like speed-of-PCIe, and command-queueing in SCSI protocols (which is what those SSD drives actually use) then, sure, do some research in how to rapidly perform pattern-matching queries subject to these hardware constraints. It's fun, people do this all the time.

In the world of open-source, you might be able to find some existing library - in C++, java, whatever, that already does a superb job of disk-access, and has at least some query support in it. And then you can assign 5 systems programmers to add more pattern-query type things into it ... how much time/money do you want to spend on this low-level stuff, vs how much on high-level stuff?

-- Linas

Linas Vepstas

unread,
Aug 4, 2020, 10:45:20 AM8/4/20
to opencog
On Thu, Jul 30, 2020 at 11:20 AM Ben Goertzel <b...@goertzel.org> wrote:

-- send a Pattern Matcher query to BackingStore
-- sent the Atom-chunk resulting from the query to Atomspace


So,

Someone needed to prove me wrong, and who better to do that but me. I took the weekend to implement a file-based backing store, using RocksDB (which itself is a variant on LevelDB).  It's here: https://github.com/opencog/atomspace-rocks

-- It works, all of the old persistent store unit tests pass (there are 8 of them)
-- its faster than the SQL by factors of 2x to 5x depending on dataset. With tuning, maybe one could do better. (I have no plans to tune, right now)

I'm certain I know of a simple/easy way to "send a Pattern Matcher query to BackingStore and send the Atom-chunk resulting from the query to Atomspace" and will implement this afternoon (famous last words...)  BTW, you can *already* do this with the cogserver-based network client (i.e. without sql, just the network only) here:  https://github.com/opencog/atomspace-cog/blob/master/examples/remote-query.scm

By combining these two backends, I think you can get file-backed storage that is also network-enabled.  Or rather, you have two key building blocks for exploring both distributed and also decentralized designs.

Some background info, from the README:

AtomSpace RocksDB Backend
=========================

Save and restore AtomSpace contents to a RocksDB database. The RocksDB
database is a single-user, local-host-only file-backed database. That
means that only one AtomSpace can connect to it at any given moment.

In ASCII-art:

```
 +-------------+
 |  AtomSpace  |
 |             |
 +---- API-----+
 |             |
 |   RocksDB   |
 |    files    |
 +-------------+

```
RocksDB (see https://rocksdb.org/) is an "embeddable persistent key-value
store for fast storage." The goal of layering the AtomSpace on top of it
is to provide fast persistent storage for the AtomSpace.  There are
several advantages to doing this:

* RocksDB is file-based, and so it is straight-forward to make backup
  copies of datasets, as well as to share these copies with others.
* RocksDB runs locally, and so the overhead of pushing bytes through
  the network is eliminated. The remaining inefficiencies/bottlenecks
  have to do with converting between the AtomSpace's natural in-RAM
  format, and the position-independent format that all databases need.
  (Here, we say "position-independent" in that the DB format does not
  contain any C/C++ pointers; all references are managed with local
  unique ID's.)
* RocksDB is a "real" database, and so enables the storage of datasets
  that might not otherwise fit into RAM. This back-end does not try
  to guess what your working set is; it is up to you to load, work with
  and save those Atoms that are important for you. The [examples](examples)
  demonstrate exactly how that can be done.

This backend, together with the CogServer-based
[network AtomSpace](https://github.com/opencog/atomspace-cog)
backend provides a building-block out of which more complex
distributed and/or decentralized AtomSpaces can be built.

Status
------
This is **Version 0.8.0**.  All unit tests pass. All known issues
have been fixed. This could effectively be version 1.0; waiting on
user feedback.

-- Linas

Ben Goertzel

unread,
Aug 4, 2020, 12:51:38 PM8/4/20
to opencog
Wow!

--
You received this message because you are subscribed to the Google Groups "opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to opencog+u...@googlegroups.com.

Predrag Radovic

unread,
Aug 4, 2020, 4:55:14 PM8/4/20
to ope...@googlegroups.com
Hello everyone,

I just recently started following the development in more detail so excuse me if some of what I write below might be somewhat off for the system like Opencog or if something is already being done along the same lines and I didn't have a chance to read it so far. David Hart had similar ideas in the past. Constraints might be different this time.

I like minimalistic approach of Linas. But I'm thinking what if we go one layer further down - not use key/value database library, not do serialization/deserialization for core system operation. I'll admit some additional speculative aspects are interleaved in this proposal.

We could use memory-mapped files on SSD. Lets say memory-mapped size is 8x amount of physical RAM of that compute instance (should be tuned, for example, node is 16 CPUs / 64 GB RAM / 512 GB SSD mmap). Memory-mapped region is always at the same virtual memory address. This will require some cross-instance testing to find which virtual address is the best choice on 64-bit system. We can also patch Linux if necessary. Number of CPUs in the cluster is proportional to RAM size. Not sure what is the best ratio, will need experimentation. *Let the Linux do all the caching/loading/saving data*. We focus on Opencog subsystem algorithm development with some careful selection of data structures to be safe from corruption as this memory-mapped region will be shared by multiple processes (in the spirit of existing blackboard architecture). Concurrency could be handled optimistically even if some process degrades some part of the state effect should not be catastrophic, it will be corrected in the next incremental stochastic state update iteration. I'm implying here that Hyperon will feature stochastic runtime which might not be the case.

Each subsystem is function F from [past] GenericAtomSpace -> [future] GenericAtomSpace. Internally there will be functions which are ~ inverse of F. Subsystem categories are mapped to GenericAtomSpace in their own way, in separate processes. Opencog will ocassinally do fsync if we want to make a filesystem snapshot or to periodically clone the whole system state (for example, every day) to external storage for backup. Opencog compute instances only run Opencog processes forming a cluster of exactly the same instance configurations. If Opencog has to use some software packages this should be executed on additional instances, orchestrated by Opencog, to maintain stability of the cluster. There should be cgroup resource limit for memory usage by Opencog processes set to slightly below RAM size. This prevents system maintenance processes from being unavailable like sshd in case something freaky happens with Opencog. But I really don't expect this to happen because our fixed memory usage.

Distribution is tricky and from discussion so far I understand there are issues with network speed and latency which define activated atom chunking/batching/merging strategy. Maybe the solution is to log each network transfer with useful metadata so that knowledge/procedures for efficient distribution could be machine learned/inferred by PLN & MOSES Hyperon analogues. What also simplifies distribution is the fact that in the cluster all memory-mapped regions are mapped at exactly the same virtual memory address. Therefore we could just copy memory from one instance to another without any serialization/deserialization overhead as activation workload migrates. This could also be valuable characteristic for efficient Infiniband transfers. Okay, there might be some merging operation after the transfer to support possible superposition semantics of the new Atomese. Collapsing to more persitant closer-to-root atoms as form of compression/abstraction/embedding/entanglement. If we find correct translation from Atomese to fixed-sized data structures (i.e. N-dimensional arrays and corresponding highly parallel simple operations) that might simplify memory resource management and runtime design further. Adding nodes to the cluster while keeping the same working set size increases system reasoning fidelity (processing more possible worlds).

And at last, if Linux can't manage atomspace caching efficiently then we should forgo SSD backing of working set and maybe add more RAM as Linas suggested and similary handle really huge diverse datastores which are also much bigger than memory-mapped region anyway, specialized data access strategies for these should be automatically inferred by Opencog. My hope is that Opencog system should be more than capable to read and understand API specifications and software manuals to do intelligent data integration procedures. If this is not true, we are doing something wrong.

Final note regarding GPUs, I think that newer NVIDIA CUDA GPUs support memory sharing with the host system so we could see if smaller parts of our memory-mapped region can be shared with the GPU in this way too. Smaller shared parts will go through expansion/decompression/instancing into GPU memory by GPU kernels (generated from inferred Atomese which match CUDA kernel type system etc). Bulk of GPU compute should use on-board RAM. Reverse process will then happen for collapsing operation mentioned above or similar reduction operation. Maybe the use of GPUs/TPUs is distraction right now as Opencog system should eventually learn how to use those types of specialized hardware accelerators efficiently on its own. I'm still trying to understand how much should be built-in and how much learned in the system like Opencog. Even if the system is learning these things it's always easier to do that with the help from good teachers from the community.

best regards,
Pedja

Linas Vepstas

unread,
Aug 4, 2020, 5:51:35 PM8/4/20
to opencog
Hi Pedja,

I'm going to reply in small bites, to keep messages short. (actually, I'm just waiting for compiles to finish...)



On Tue, Aug 4, 2020 at 3:55 PM Predrag Radovic <pe...@crackleware.nu> wrote:

We could use memory-mapped files on SSD.

Ohhh! I like that! This is actually a very interesting idea! And with the appropriate programmer-fu, this should not be hard to proof-of-concept, I think ... so I'm guessing, sometime before the AtomSpace starts up, replace the memory allocator by something that is allocating out of the mapped memory (I think I've seen libraries out there that simplify this).

To get scientific about it, you'd want to create a heat-map -- load up some large datasets, say, some of the genomics datasets, run one of their standard work-loads as a bench-mark, and then see which pages are hit the most often. I mean -- what is the actual working-set size of the genomics processing? No one knows -- we know that during graph traversal, memory is hit "randomly" .. but what is the distribution? It's surely not uniform. Maybe 90% of the work is done on 10% of the pages? (Maybe it's Zipfian?  I'd love to see those charts...)

A lot of this is all about "how can one get the most function for the least amount of programming effort?"  Dinking around with a memory-mapped atomspace is surely worth a few weeks investment in effort!

(Cause -- let's look at the alternative ...  the "classic" opencog design would be to write a "forgetting agent" that scans the atomspace and deletes unused Atoms (maybe after saving them to disk, first). It's not technically hard to write such an agent, but still, it's work. Oh, but for it to work, we have to attach either a time-stamp to each atom, or an "attention-value" (or other data of your choosing) -- this is easy -- it's exactly what the value subsystem is for, but each of these timestamps eats up ... more RAM...!  And then creating a heat-map is icky, because we'd have to increment some counter on each atom every time it was accessed... yuck. Indeed, the OS is much better at this kind of stuff.

Flip side ... if we let the OS do this work, then maybe there is just one Atom on any given 4K page (or 64K page, or 2M page) that is interesting, and everything else on there is a waste of space. The granular attention-values/timestamps mean that we can isolate this one "hot" Atom from all the cold ones.  But none of this is known .. we don't know the heat map.

On the third hand, there's probably a need for a forgetting agent for other reasons... so one has to be created anyway (well there already is one, but is quite unusable, that code should be discarded so that it stops wasting human attention-span.)

Rest of your email in other replies.
--linas

Linas Vepstas

unread,
Aug 4, 2020, 7:43:45 PM8/4/20
to opencog


On Tue, Aug 4, 2020 at 11:51 AM Ben Goertzel <b...@goertzel.org> wrote:
Wow!

You're welcome.  Querying from the database is now supported. The demo is in

At the moment it works, but I'm rethinking the API.  Do check it out.  Feedback, opinions, suggestions, etc. invited.

--linas

Linas Vepstas

unread,
Aug 4, 2020, 7:51:19 PM8/4/20
to opencog
Hi Predra,

On Tue, Aug 4, 2020 at 3:55 PM Predrag Radovic <pe...@crackleware.nu> wrote:

Each subsystem is function F from [past] GenericAtomSpace -> [future] GenericAtomSpace. Internally there will be functions which are ~ inverse of F. Subsystem categories are mapped to GenericAtomSpace in their own way, in separate processes.

I don't understand what you are trying to say, above.

periodically clone the whole system state (for example, every day) to external storage for backup.

Keep in mind that every now and then, the RAM layout of the AtomSpace changes.  This may be due to bug fixes, new features, or simply that a new compiler decided to pack things differently (e.g. c++ std library changes, e.g. for strings, vectors, mutexes, etc).

--linas

Ben Goertzel

unread,
Aug 5, 2020, 12:39:05 AM8/5/20
to opencog
I wonder how different would be the API for RocksDB vs., say,
Cassandra which Matt Chapman has recommended (which may have some
advantages in terms of allowing more configurable/flexible notions of
consistency?)
> To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CAHrUA36cTFc-S7C%3D0SqQgfAGZ1bpVupihVfOs0g6hpD14UtSxw%40mail.gmail.com.

Linas Vepstas

unread,
Aug 5, 2020, 3:16:17 AM8/5/20
to opencog
LevelDB/RocksDB and Cassandra are apples and kumquats.

LevelDB/RocksDB are C++ libraries, single-user, non-networked, non-distributed, link directly into the app, store data directly in files, on the local system. They are "embedded databases". So conceptually, they are like the 50-year old unix dbm, except that they have 50 years of computer science behind them, such as bloom filters and log-structured merge trees and what-not (e.g. Rocks is explicitly optimized for SSD disks.).  LevelDB was created by google in 2011. Then facebook took levelDB in 2013 and forked it to create rocksdb, and added a bunch of stuff, made some parts run faster.

Just like dbm, people use leveldb/rocksdb to build *other* databases on top of it.  (that's the beauty of "embedded") For example, there's some version of MariaDB that uses RocksDB for the actual storage.

Cassandra is written in java, its a network database, so basically, its like postgres, except its not postgres, because it uses CQL instead of SQL so its not actually SQL compatible. Otherwise, it has exactly all of the exact same issues that any other networked client-server database has, including the need for an experienced DB Admin to set it up, run it, administer it. (This is an easily forgotten but important detail --vs rocksdb just ... writes to a file. No DBAdmin required.)

For the app developer (i.e. me) one must to write in a custom query language -- CQL, convert my data into CQL format, send that data via tcpip to the server, which  unpacks it, then runs it's interpreter to figure out what I said/wanted, unpacks my data packets, converts them into it's own internal format, (so, that's a second format conversion) and actually performs whatever operations I had specified.  This is conceptually identical to *any* client-server database.

For CQL I copy from wikipedia:
CREATE KEYSPACE MyKeySpace
  WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };

USE MyKeySpace;

CREATE COLUMNFAMILY MyColumns (id text, Last text, First text, PRIMARY KEY(id));

INSERT INTO MyColumns (id, Last, First) VALUES ('1', 'Doe', 'John');

SELECT * FROM MyColumns;
Looks identical to SQL except its not actually compatible. Yuck.  This offers exactly zero advantages of SQL that I can see; the fact that its key-value somewhere in there offers no perceivable advantage that I can make out.

I'll bet you a bottle of good wine or other neuroactive substance that the existing atomspace client-server infrastructure is faster than Cassandra. That is --  start a cogserver, as is, today, open the rocksdb backend under it (so everything going to the cogserver gets stored), and then let other atomspaces connect to the cogserver (using the existing client-server code) that you will have a distributed atomspace that runs faster than cassandra. 

OK, it doesn't have any of those other bells-n-whistles in cassandra, but no one really knows how to do anything useful with those other bells-n-whistles, other than to suggest that they might be somehow useful in some way, maybe, for something.  That surpasses my attention span.

--linas



 

Matt Chapman

unread,
Aug 6, 2020, 12:59:04 AM8/6/20
to opencog
> I'll bet you a bottle of good wine or other neuroactive substance that the existing atomspace client-server infrastructure is faster than Cassandra.

No, it won't be faster, but you'll never be able to store an atomspace bigger than what you can fit in memory on that single atomserver, and you'll never be able to perform more operations (on the canonical atomspace) in parallel than what that one atom server can support. Obviously distributed systems have a performance penalty. We don't build them because we need to go faster (at the level of a single process), we build them because we need to go bigger (in terms of storage space or parallel processes).

All the Best,

Matt

--
Please interpret brevity as me valuing your time, and not as any negative intention.

Linas Vepstas

unread,
Aug 6, 2020, 2:32:52 AM8/6/20
to opencog
On Wed, Aug 5, 2020 at 11:59 PM Matt Chapman <ma...@chapmanmedia.com> wrote:
> I'll bet you a bottle of good wine or other neuroactive substance that the existing atomspace client-server infrastructure is faster than Cassandra.

No, it won't be faster,

wrong.

but you'll never be able to store an atomspace bigger than what you can fit in memory

That is also wrong.

on that single atomserver, and you'll never be able to perform more operations (on the canonical atomspace) in parallel than what that one atom server can support.

That has been wrong for 12+ years.

Obviously distributed systems have a performance penalty.

We've had a distributed atomspace for 12+ years. 

We don't build them because we need to go faster (at the level of a single process), we build them because we need to go bigger (in terms of storage space or parallel processes).

Who is "we"?

We've had the ability to run distributed AtomSpaces that far exceed installed RAM, running on a cluster, for more than a decade.  People talk about this as if it doesn't exist or it doesn't work or there's something wrong with it, or they want something with more chocolate sprinkles on it. 

I'm annoyed.  Seriously, is no one actually paying attention to anything? WTF.

--linas

Matt Chapman

unread,
Aug 6, 2020, 12:30:55 PM8/6/20
to opencog
You misunderstood my first comment; I was agreeing with you that Cassandra-backed storage & distribution won't be faster than what I thought you were suggesting: a client-server model where one Rocks-backed atom server is used by many clients who retrieve, manipulate, and return atoms to the central server. 

Maybe you're suggesting something very different, or I'm just very confused, because I've been hearing people talk about the need for distributed atomspace on and off for 8+ years, and I've never seen an answer along the lines of "you can already have a cluster, here's the documentation on how to set it up." If that was the answer, and people rejected it because of the lack of disk-back persistence, the I'm agreeing with you that RocksDB may solve much of the problem. 

Maybe the unsolved part of the problem is consistency/consensus? I tend to agree with your sentiment that consistency is overrated for Atomspace use cases, often not needed or not desirable, but it seems like maybe Ben and others are seeking something like Tunable Consistency. Maybe this is the big chocolate sprinkle?

>Who is "we"?

Practitioners of the computing arts & sciences, in general.

> We've had the ability to run distributed AtomSpaces that far exceed installed RAM, running on a cluster, for more than a decade. 

Does it meet the 7 business requirements in Ben's document: https://docs.google.com/document/d/1n0xM5d3C_Va4ti9A6sgqK_RV6zXi_xFqZ2ppQ5koqco/edit ?

Points 2 & 3 are about performance improvements; Do you believe such improvements are impossible, or would require more effort than the likely benefits would justify?

Of the other 5, which already exist and which are what you call "chocolate sprinkles"?

All the Best,

Matt

--
Please interpret brevity as me valuing your time, and not as any negative intention.

Linas Vepstas

unread,
Aug 6, 2020, 2:18:10 PM8/6/20
to opencog
On Thu, Aug 6, 2020 at 11:30 AM Matt Chapman <ma...@chapmanmedia.com> wrote:

I've been hearing people talk about the need for distributed atomspace on and off for 8+ years,

Mee too. This was a head-scratcher, since we had a distributed atomspace. So I was never sure why they talked about it.
 
and I've never seen an answer along the lines of "you can already have a cluster, here's the documentation on how to set it up."


I changed the name of the tutorial 5 days ago, because we now have not one, not two, but four different distributed atomspace solutions (of which two don't scale well)

The instructions to set up each of the four are here:

The oldest one, which is SQL-based:

The newest one, which is cogserver-based, and my current favorite:

The IPFS one, which is the one I love to hate:

The DHT one, which I hope to revive maybe if we get a good chunking solution:
 

Does it meet the 7 business requirements in Ben's document: https://docs.google.com/document/d/1n0xM5d3C_Va4ti9A6sgqK_RV6zXi_xFqZ2ppQ5koqco/edit ?

I have no clue.  I've never seen this document before.   It's only the 41st document on this topic, and I'm suffering from reader-fatigue. Care to summarize what it says?

Performance: did anyone run any of the benchmarks on any of the distributed AtomSpaces that we currently have?  We *do* have benchmarks for them. They're in https://github.com/opencog/benchmark/ 

-- Linas

Predrag Radovic

unread,
Aug 7, 2020, 1:46:17 PM8/7/20
to ope...@googlegroups.com
Hi Linas,

On Aug 4, 2020, at 23:51, Linas Vepstas <linasv...@gmail.com> wrote:

We could use memory-mapped files on SSD.

Ohhh! I like that! This is actually a very interesting idea! And with the appropriate programmer-fu, this should not be hard to proof-of-concept, I think ... so I'm guessing, sometime before the AtomSpace starts up, replace the memory allocator by something that is allocating out of the mapped memory (I think I've seen libraries out there that simplify this).

I'm glad that you like the idea. I want to try to make the poc!

We may even use sparse memory-mapped files representing much much bigger virtual address space than physical RAM as well as SSD storage. All nodes have the same file as previously described. Memory management functionality will deallocate unused blocks to maintain the file sparse enough, keeping actual storage usage limited. I hope this could simplify handling of Atom's identity.

To get scientific about it, you'd want to create a heat-map -- load up some large datasets, say, some of the genomics datasets, run one of their standard work-loads as a bench-mark, and then see which pages are hit the most often. I mean -- what is the actual working-set size of the genomics processing? No one knows -- we know that during graph traversal, memory is hit "randomly" .. but what is the distribution? It's surely not uniform. Maybe 90% of the work is done on 10% of the pages? (Maybe it's Zipfian?  I'd love to see those charts...)


I would like to see the heat-map for realistic dataset too. That's a next step. 

Are you referring to this genomics dataset benchmark: https://github.com/opencog/benchmark/tree/master/query-loop or there is some bigger and better benchmark and dataset for this kind of experiments. What about https://github.com/opencog/agi-bio ?


best regards, Pedja

Linas Vepstas

unread,
Aug 7, 2020, 2:30:29 PM8/7/20
to opencog
On Fri, Aug 7, 2020 at 12:46 PM Predrag Radovic <pe...@crackleware.nu> wrote:
Hi Linas,

On Aug 4, 2020, at 23:51, Linas Vepstas <linasv...@gmail.com> wrote:

We could use memory-mapped files on SSD.

Ohhh! I like that! This is actually a very interesting idea! And with the appropriate programmer-fu, this should not be hard to proof-of-concept, I think ... so I'm guessing, sometime before the AtomSpace starts up, replace the memory allocator by something that is allocating out of the mapped memory (I think I've seen libraries out there that simplify this).

I'm glad that you like the idea. I want to try to make the poc!

We may even use sparse memory-mapped files representing much much bigger virtual address space than physical RAM as well as SSD storage. All nodes have the same file as previously described. Memory management functionality will deallocate unused blocks to maintain the file sparse enough, keeping actual storage usage limited. I hope this could simplify handling of Atom's identity.

Have you ever done anything like this before? Because some of what you wrote does not seem right; the kernel does page-faulting and page flush as needed. There's no "deallocation", there is only unmapping.  Mem usage may be fragmented, but it's never going to be sparse, unless the mem allocator is broken.


To get scientific about it, you'd want to create a heat-map -- load up some large datasets, say, some of the genomics datasets, run one of their standard work-loads as a bench-mark, and then see which pages are hit the most often. I mean -- what is the actual working-set size of the genomics processing? No one knows -- we know that during graph traversal, memory is hit "randomly" .. but what is the distribution? It's surely not uniform. Maybe 90% of the work is done on 10% of the pages? (Maybe it's Zipfian?  I'd love to see those charts...)


I would like to see the heat-map for realistic dataset too. That's a next step. 

Are you referring to this genomics dataset benchmark: https://github.com/opencog/benchmark/tree/master/query-loop or there is some bigger and better benchmark and dataset for this kind of experiments. What about https://github.com/opencog/agi-bio ?

The "query-loop" is a subset/sample from one of the agi-bio datasets. It's a good one to experiment with, since it will never change, so you can compare before-and-after results.  The agi-bio datasets change all the time, as they add and remove new features, new data sources, etc. They're bigger, but not stable.

--linas
 

Predrag Radovic

unread,
Aug 7, 2020, 2:34:18 PM8/7/20
to ope...@googlegroups.com

Hi Linas,

Thank you for your interest and I’m sorry for not presenting more complete descriptions. I’m trying to figure things out.

Besides speculation about Hyperon I want to support current development concretely by doing realistic large atomspace experiments with memory-mapped files and distribution, getting acquainted with Opencog AGI platform, capabilities, challenges, requirements along the way, experience which should be valuable in whatever direction Hyperon development goes on.

On Aug 5, 2020, at 01:51, Linas Vepstas <linasv...@gmail.com> wrote:

Each subsystem is function F from [past] GenericAtomSpace -> [future] GenericAtomSpace. Internally there will be functions which are ~ inverse of F. Subsystem categories are mapped to GenericAtomSpace in their own way, in separate processes.

I don't understand what you are trying to say, above.

GenericAtomSpace is renamed to UniversalAtomSpace (UAS).

Function Fstep: UAS -> UAS is in UAS and it’s total (in Idris REPL, :total Fstep prints Opencog.Fstep is Total). It maps past state of UAS to the future state of UAS. Core runtime engine is applying Fstep in the infinite loop. Since new Atomese and UAS are probabilistic linear dependently-typed, resource usage is fixed and terms/atoms represent superposition of possibilities. That’s why I assumed that core engine will benefit by having working set of fixed-size and fixed-size data structures for atomic operations. I know that atoms in AtomSpace right now are higher level knowledge representations. What if in our design we mirror abstraction hierarchy of the universe as we understand it. First there is something like atoms in the sense of quantum physics, that are forming molecules, more complex patterns where cogistric phenomena could take place, next there are alife structures manifesting Rosennean complexity, all the way up to cognitive level where system has sophisticated self-model(s), existing in perfect harmony or perfect disharmony?! We manually encode prior implementation of Fstep and maybe derive ~ inverse of Fstep in initial state of UAS. Existing algorithms and theory from current Opencog codebase like PLN, MOSES, URE…, Opencog Background Publications BgPub and parts of GTGI body of work are translated to new Atomese EDSL as prior knowledge to bootstrap the system using intelligent not-yet-AGI tools. Ultimate preferred mode of interaction with the system is at cognitive level with appropriate interfaces like screen, audio for voice, TCP, HTTP… for internet, web and let’s not forget published SignularityNET services with other cognitive agents elsewhere). I guess that development of narrow-AGI applications will happen on the sub-cognitive level. If we look at “High Level Mind Diagram” from EGI1 page 146, we will be able to excerpt control on this level, to intercept, inspect and modulate what’s going on between these components. We will inspect mechanistic low level structures of UAS only for the system diagnostics and optimization of distribution strategies but not for efficient communication with the cognitive system.

For rapid engineering of OCP Hyperon and for mutual understanding between OCP Hyperon and humans, all science knowledge, mathematics and especially, as Ben’s paper “Toward Formal Model of Cognitive Synergy” BenCS suggests, category theory (CT) should be encoded in the Fstep. I’m complete newbie in CT. In this regard BaezRS is pretty cool illumination of analogues from physics, topology, logic and computation demonstrating the unifying power of CT. Some projects already try to encode CT concepts, apart from many examples from the Haskell’s ecosystem, there are LeanCT and CoqCT. Robert Rossen’s (M,R)-systems are also described in CT terms RossMR

I was also looking at flat concatenative languages as low-level language (LLL) for the stochastic runtime. In that case new Atomese may be decompilation of LLL. I can see how pattern-mining may look like on LLL. It could be very simple built-in functionality to automatically refactor executed sub-sequences of LLL words. People say that concatenative languages are too low level for normal software development. They complain about having to do stack manipulation by hand. What if LLL permeates all layers of the cognitive stack including external use of natural language. In that case, some LLL words could be natural language words auto refactored/pattern-mined to satisfy limited bandwidth between intelligent organisms. If you wonder how non-deterministic concatenative language execution a la Prolog could look like, there is a post JoyND from the author of language Joy. For explanation of equivalence between linear language and stack machine (LLL) see “Linear Logic and Permutation Stacks” BakerLL. LLL words could act as operators. For people who code in Haskell “Compiling to categories” may be interesting ConalC2C. And also GHC is getting linear types LinHS. For relation between linear logic and linear algebra (~ GPU/TPU acceleration) see LinLA.



Getting back to more practical matters.

periodically clone the whole system state (for example, every day) to external storage for backup.

Keep in mind that every now and then, the RAM layout of the AtomSpace changes.  This may be due to bug fixes, new features, or simply that a new compiler decided to pack things differently (e.g. c++ std library changes, e.g. for strings, vectors, mutexes, etc).

Of course. If compiler changes data layout, export/import (backup/restore) mentioned in my previous email will have to be performed. I hope these compiler changes are not happening frequently.

Best regards,

Pedja


Linas Vepstas

unread,
Aug 7, 2020, 4:02:33 PM8/7/20
to opencog
On Fri, Aug 7, 2020 at 1:34 PM Predrag Radovic <pe...@crackleware.nu> wrote:



Keep in mind that every now and then, the RAM layout of the AtomSpace changes.  This may be due to bug fixes, new features, or simply that a new compiler decided to pack things differently (e.g. c++ std library changes, e.g. for strings, vectors, mutexes, etc).

Of course. If compiler changes data layout, export/import (backup/restore) mentioned in my previous email will have to be performed. I hope these compiler changes are not happening frequently.

A new C++ standard comes out every few years.  Both gcc and llvm issue new compilers every year or two, which different distros pick up at different times. This also holds for Apple. So there are 3-5 major distros to think about, if we include raspberry pi in the mix. Memory layout also depends on libc which changes maybe only every 2-4 years. The most recent change removes several dozen deprecated features/functions, including syscalls, and is moving pthread_create to the main library. So the linkage is changing. Also, linkage depends on kernel headers.  All syscalls go through kernel headers. It's relatively unlikely that kernel header changes will alter the atomspace mem layout, unless, of course, intel has another snooping bug which is fixed by some wacky change to the alignment of some struct somewhere.

Unlike ABI's for compiled binaries, there is no ABI for RAM layout. You must assume a different layout  any time you install, update, upgrade anything.   So the memory-mapped RAM thing is NOT suitable for long-term storage; it's good only on stable systems that are not being churned.

Oh, and then there's ASR -- address space randomization -- used to prevent viruses/worms from grabbing hold. I don't know how this might affect a file that has been closed and re-opened.

As I write this, it seems to me that the memory-mapped atomspace might just be a re-invention of swapping.  Perhaps you can get the same effect by telling the kernel to use a special swapfile for just this one process but it now seems to me that is effectively what you describe...

And there is a weird way to do that: set the core-file size to "unlimited", then force a core-dump, and then every time you restart, you are using the RAM layout in the core-dump...

Hmmm...

--linas

Predrag Radovic

unread,
Aug 7, 2020, 4:03:28 PM8/7/20
to ope...@googlegroups.com
On Aug 7, 2020, at 20:30, Linas Vepstas <linasv...@gmail.com> wrote:
We may even use sparse memory-mapped files representing much much bigger virtual address space than physical RAM as well as SSD storage. All nodes have the same file as previously described. Memory management functionality will deallocate unused blocks to maintain the file sparse enough, keeping actual storage usage limited. I hope this could simplify handling of Atom's identity.

Have you ever done anything like this before? Because some of what you wrote does not seem right; the kernel does page-faulting and page flush as needed. There's no "deallocation", there is only unmapping.  Mem usage may be fragmented, but it's never going to be sparse, unless the mem allocator is broken.

I did but not on this scale.

I should be more precise, by "Memory management functionality" I mean "Memory management library code running in Opencog process". By deallocation of unused blocks I'm referring to the hole punching. Here is the demo: https://gist.github.com/crackleware/e01519f4ec16fcba42f5c2bb6151185f  I'll make better demo with much bigger memory usage to trigger and test the paging, so we know our assumptions are correct.


Linas Vepstas

unread,
Aug 7, 2020, 4:45:20 PM8/7/20
to opencog
Sorry Matt, I was very snippy. Making things work takes a lot of time and effort. It's frustrating when that effort isn't recognized.

Ideas are cheap -- there's an unbounded supply of ideas, and one can pump them out rapidly -- new idea every 5 minutes. Converting an idea into reality takes 1000x or 10000x longer. This is a stereotype in software: the programmer who says "that's easy, I can do that in no time" and weeks later they're still working on it. But it's also true of reality in general: the ideas behind #BLM are not particularly sophisticated or complex, but it will take 100 million man-years to turn them into reality. (and that's a lower-bound)

--linas

On Thu, Aug 6, 2020 at 11:30 AM Matt Chapman <ma...@chapmanmedia.com> wrote:

Matt Chapman

unread,
Aug 10, 2020, 1:57:23 PM8/10/20
to opencog

>> Does it meet the 7 business requirements in Ben's document: https://docs.google.com/document/d/1n0xM5d3C_Va4ti9A6sgqK_RV6zXi_xFqZ2ppQ5koqco/edit ?

> I have no clue.  I've never seen this document before.   It's only the 41st document on this topic, and I'm suffering from reader-fatigue. Care to summarize what it says?

Provide effective management of AtomSpaces that are too big to fit in RAM of any one machine that is available.

Decrease the overall processing time required to carry out AI operations to reduce cost per AI operation.

Decrease memory footprint providing better overall throughput in comparing with current implementation to reduce cost per AI operation.

Provide ability to use AtomSpace in the manner of hierarchical cache structure. In other words, provide a way to look for a specific Atom locally before start searching it among other components and fetching remotely.

Provide ability to request for Atom(s) based on a given Atom's property.

Provide ability to request for subgraphs based on patterns.

To isolate application layer from source code modification keeping the AtomSpace API as-is or with minor changes.


The idea I'm suggesting, which I readily admit is worth 1/1000th of the effort required to implement it, is that a Cassandra-like architecture provides a very good solution for requirements number 4 & 5, and possibly a foundation for #6. It also provides #1 for some definition of "effective," arguably better than any centralized architecture, for some definition of "better." :-) It may very likely fail at #2 and #3 compared to current alternatives; we won't know until someone builds & benchmarks it, and that's 1000x more effort...

If you think that those requirements are already adequately served by existing solutions, then I will stop adding noise to the conversation. Otherwise, I'm happy to share more of my experiences if it might be helpful in formulating an approach.

[Aside Req. 1 here is in fundamental conflict with 2 & 3; usually "we" accept a local performance penalty in exchange for distributed & decentralized scalability. But Cassandra's Tunable Consistency model is the only way I know to expose this trade-off to the user on a per-query basis, which seems quite powerful to me, for the Atomspace use case. The value of Tunable Consistency (relative to its cost to implement) may be the thing I've failed to convince you of, in which case, I certainly trust your opinion more than mine.]

All the Best,

Matt

--
Please interpret brevity as me valuing your time, and not as any negative intention.

--
You received this message because you are subscribed to the Google Groups "opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to opencog+u...@googlegroups.com.

Linas Vepstas

unread,
Aug 11, 2020, 12:35:02 AM8/11/20
to opencog
On Fri, Aug 7, 2020 at 3:03 PM Predrag Radovic <pe...@crackleware.nu> wrote:


 By deallocation of unused blocks I'm referring to the hole punching. Here is the demo: https://gist.github.com/crackleware/e01519f4ec16fcba42f5c2bb6151185f  I'll make better demo with much bigger memory usage to trigger and test the paging, so we know our assumptions are correct.

The next obvious test is to allocate 256MB chunks, fill them with a test pattern, and keep doing that till RAM runs out.  Will the oldest untouched blocks be automatically flushed to disk? Will the system start thrashing instead? Will there be some error because maybe mmap won't automatically shrink RAM usage?

You might want to do this in a container, so that you don't accidentally render your desktop inoperable :-) Or do the test on some old victim machine that you can trash and reboot.

Might want to run iostat and vmstat while you do this...

--linas

Linas Vepstas

unread,
Aug 11, 2020, 12:44:36 AM8/11/20
to opencog

Hit send too soon...

so, don't write 10TB, just write a file that is maybe 10x larger than total RAM  on the victim. write the test pattern, all the way to the end, then loop back to the beginning and read the test pattern to make sure it's not corrupted.  Then write a new pattern, .. repeat in loop until system settles down to some steady-state. It should settle down to whatever the max disk-write/read speed is so maybe 200MBytes/sec for older disks; maybe faster for newer. of course spinners behave differently than ssd ...

A variant of that would be to map the whole file, then use a random number to touch random blocks. Does it get slower than the sequential sweep? Or faster (should be faster, I think, because I guess 10% of the time, the random block should still be in RAM, ... 10% because if the file is 10x larger than ram, 90% of it will NOT be in RAM...)


--linas

Ben Goertzel

unread,
Aug 11, 2020, 1:41:40 AM8/11/20
to opencog
We want a large Atomspace, parts of which are in RAM on various
machines, parts of which are in persistent storage, and the ability to
run a variety of queries and processes across this whole Atomspace. I
posted something about the "distributed PLN inference" use-case on
this list not long ago.

The current "distributed Atomspace" functionality is cool but it
doesn't do this yet. It could be the foundation for a system doing
the above, but it might also hit some serious problems. Matt is
pointing out how Cassandra could potentially help work around some of
these problems with its adjustable levels of consistency.

Coordinating a network of distributed sub-Atomspace via a postgres or
RocksDB backing store in a hub-and-spokes architecture seems like it's
not going to do what we need ultimately...

The document Matt Chapman linked above in this thread is the result of
a lot of thought by a number of us, and I think explains the above
points much more thoroughly than I could do in this brief email (plus
a bunch of other points I didn't get to in this email)

ben
> To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CAPE4pjAZMcDWXNCvxR-v_KCVJxyg-DVskwhN7kQMekQHjMKidw%40mail.gmail.com.

Ben Goertzel

unread,
Aug 11, 2020, 11:34:59 AM8/11/20
to opencog
Matt,

So regarding Cassandra, it's clear there are many cool things there...
From what I understand, the key differentiating functionality it seems
potentially able to offer would be: The ability to replicate atoms
locally accompanied by eventual consistency ...

As a first step, I wonder if it would make sense to try some simple
experiments w/ Cassandra to see if it really does this effectively for
an OpenCog context? If you or anyone else w/ Cassandra experience
has time to experiment w/ this, it might be quite interesting...

Is Cassandra's notion of eventual consistency significantly different
from that in Amazon's DynamoDB ?

It seems that in some cases in OpenCog we might want to let two
versions of an Atom drift even further/longer than is commonly allowed
to happen in most Dynamo-based systems... but this really comes down
to, how flexible is the eventual consistency management /
configuration in these things?

ben

On Wed, Jul 29, 2020 at 12:19 PM Matt Chapman <ma...@chapmanmedia.com> wrote:
>
> > Which peers?
> As determined by a token ring:
>
> https://docs.datastax.com/en/archived/cassandra/3.0/cassandra/architecture/archDataDistributeDistribute.html
>
> I think you could almost replace "vnode" with "chunk" if you wanted to adopt the Cassandra architecture, although I wouldn't be surprised to see performance problems with a huge number of vnodes, so it might actually need to be a "chunk-hash modulo reasonable number of vnodes".
>
> > How do you find them?
>
> By calculating the partition token via consistent hash, as Cassandra does with Murmur3. This tells you the authoritative source for the chunk you want. You might also have a local cache of other peers that have had replicas of that chunk, in case any of them are more responsive to you. Cassandra calls this process of finding potential replicas "Snitching".
>
>
> > You are thinking Kademlia (as do I, when I think of publishing) or OpenDHT or IPFS.
>
> Nope. I've only played with IPFS a bit, but I don't expect it to be performance for the atomsoace use case. I'm only vaguely familiar with openDHT; it seems worth exploring, but I'm sure you understand it far better than I do.
>
> I'm not very familiar with p2p systems like kademlia, but I suspect that's optimized for consistency & availability over performance, so not the right choice for datomspace.
>
> By this point, it should be clear that I look to Cassandra for how semi-conistent distributed data storage systems should be designed. (Fwiw, my inspiration for distributed messaging systems comes mostly from Apache Kafka.)
>
>
> > Which is great, if all you're doing is publishing small amounts of static, infrequently-changing information. Not so much, if interacting or blasting out millions of updates. Neither system can handle that -- literally -- tried that, been there, done that. They are simply not designed for that.
>
> Cassandra is. To be fair, Cassandra is optimized for massive scale, with may involve some trade-offs that are not desirable for present-day atomspace use cases.
>
> See also, ScyllaaDB for a C++ reimplementation of Cassandra.
>
> > Now, perhaps using only a hash-driven system, it is possible to overcome these issues. I do not know how to do this. Perhaps someone does -- perhaps there are even published papers ... I admit I did not do a careful literature search.
>
> http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf
> http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf
>
> Matt
>
>
>
> On Wed, Jul 29, 2020, 9:37 AM Linas Vepstas <linasv...@gmail.com> wrote:
>>
>>
>>
>> On Wed, Jul 29, 2020 at 1:09 AM Matt Chapman <ma...@chapmanmedia.com> wrote:
>>>
>>> >I think it's a mistake to try to think of a distributed atomspace as one super-giant, universe-filling uniform, undifferentiated blob of storage.
>>>
>>> > You don't want broadcast messages going out to the whole universe.
>>>
>>> Not sure if you intended to imply it, but the reality of the first statmentt need not require the 2nd statement. Hashes of atoms/chunks can be mapped via modulo onto hashes of peer IDs so that messages need only go to one or few peers.
>>
>>
>> Which peers? How do you find them? You are thinking Kademlia (as do I, when I think of publishing) or OpenDHT or IPFS. Which is great, if all you're doing is publishing small amounts of static, infrequently-changing information. Not so much, if interacting or blasting out millions of updates. Neither system can handle that -- literally -- tried that, been there, done that. They are simply not designed for that.
>>
>> Now, perhaps using only a hash-driven system, it is possible to overcome these issues. I do not know how to do this. Perhaps someone does -- perhaps there are even published papers ... I admit I did not do a careful literature search.
>>
>> But, basically, before we are even out of the gate, we already have a snowball of problems with no obvious solution. Haven't even written any code, and are beset by technical problems. That's not an auspicious beginning.
>>
>> If you have something more specific, let me know. Right now, I simply don't know how to do this.
>>
>> --linas
>>>
>>>
>>> Specialization has a cost, in that you need to maintain some central directory or gossip protocol so that peers can learn which other peers are specialized to which purpose.
>>>
>>> An ideal general intelligence network may very well include both a large number of generalist, undifferentiated peers and clusters of highly interconnected specialized peers. If peers are neurons, I think this describes the human nervous system also, no?
>>>
>>> To borrow terms from my previous messsge, generalist peers own many atoms, and replicate few, while specialist peers own few or none, but replicate many.
>>>
>>> Matt
>>>
>>>
>>>
>>> On Tue, Jul 28, 2020, 10:36 PM Linas Vepstas <linasv...@gmail.com> wrote:
>>>>
>>>>
>>>>
>>>> On Tue, Jul 28, 2020 at 11:41 PM Ben Goertzel <b...@goertzel.org> wrote:
>>>>>
>>>>>
>>>>>
>>>>> Hmm... you are right that OpenCog hypergraphs have natural chunks
>>>>> defined by recursive incoming sets. However, I think these chunks
>>>>> are going to be too small, in most real-life Atomspaces, to serve the
>>>>> purpose of chunking for a distributed Atomspace
>>>>>
>>>>> I.e. it is true that in most cases the recursive incoming set of an
>>>>> Atom should all be in the same chunk. But I think we will probably
>>>>> need to deal with chunks that are larger than the recursive incoming
>>>>> set of a single Atom, in very many cases.
>>>>
>>>>
>>>> I like the abstract to the Ja-be-ja paper, will read and ponder. It sounds exciting.
>>>>
>>>> But ... the properties of a chunk depends on what you want to do with it.
>>>>
>>>> For example: if some peer wants to declare a list of everything it holds, then clearly, creating a list of all of its atoms is self-defeating. But if some user wants some specific chunk, well, how does the user ask for that? How does the user know what to ask for? How does the user say "hey I want that chunk which has these contents"? Should the user say "deliver to me all chunks that contain Atom X"? If the user says this, then how does the peer/server know if it has any checks with Atom X in it? Does the peer/server keep a giant index of all atoms it has, and what chunks they are in? Is every peer/server obliged to waste some CPU cycles to figure out if it's holding Atom X? This gets yucky, fast.
>>>>
>>>> This is where QueryLinks are marvelous: the Query clearly states "this is what I want" and the query is just a single Atom, and it can be given an unambiguous, locally-computable (easily-computable; we already do this) 80-bit or a 128-bit (or bigger) hash and that hash can be blasted out to the network (I'm thinking Kademlia, again) in a compact way - its not a lot of bytes. The request for the "query chunk" is completely unambiguous, and the user does not have to make any guesses whatsoever about what may be contained in that chunk. Whatever is in there, is in there. This solves the naming problem above.
>>>>
>>>>>
>>>>> What happens when the results for that (new) BindLink query are spread
>>>>> among multiple peers on the network in some complex way?
>>>>
>>>>
>>>> I'm going to avoid this question for now, because "it depends" and "not sure" and "I have some ideas".
>>>>
>>>> My gut impulse is that the problem splits into two parts: first, find the peers that you want to work with, second, figure out how to work with those peers.
>>>>
>>>> The first part needs to be fairly static, where a peer can advertise "hey this is the kind of data I hold, this is the kind of work I'm willing to perform." Once a group of peers is located, many of the scaling issues go away: groups of peers tend to be small. If they are not, you organize them hierarchically, they way you might organize people, with specialists for certain tasks.
>>>>
>>>> I think it's a mistake to try to think of a distributed atomspace as one super-giant, universe-filling uniform, undifferentiated blob of storage. I think we'll run into all sorts of conceptual difficulties and design problems if you try to do that. If nothing else, it starts smelling like quorum-sensing in bacteria. Which is not an efficient way to communicate. You don't want broadcast messages going out to the whole universe. Think instead of atomspaces connecting to one-another like dendrites and axons: a limited number, a small number of connections between atomspaces, but point-to-point, sharing only the data that is relevant for that particular peer-group.
>>>>
>>>> -- Linas
>>>>
>>>> --
>>>> Verbogeny is one of the pleasurettes of a creatific thinkerizer.
>>>> --Peter da Silva
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google Groups "opencog" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send an email to opencog+u...@googlegroups.com.
>>>> To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CAHrUA35zN4aaSrZ2Dpu4qLUL1bYfjAF_rGiS_xxg2-E-SBqY3Q%40mail.gmail.com.
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups "opencog" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an email to opencog+u...@googlegroups.com.
>>> To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CAPE4pjCyzOcoRAOPj7aGsj_73dAUnWovbjeaM4qjeM43hzXA6A%40mail.gmail.com.
>>
>>
>>
>> --
>> Verbogeny is one of the pleasurettes of a creatific thinkerizer.
>> --Peter da Silva
>>
>> --
>> You received this message because you are subscribed to the Google Groups "opencog" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to opencog+u...@googlegroups.com.
>> To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CAHrUA36esvtcgGrZ%3D4rCVMDde74TYKF1%3DS-AwLG95UYrT5Mdrg%40mail.gmail.com.
>
> --
> You received this message because you are subscribed to the Google Groups "opencog" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to opencog+u...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CAPE4pjALKeWmpzxwoYR7gCmS5ZcDqrrKPaB0V-UZe814G6cwTA%40mail.gmail.com.

Linas Vepstas

unread,
Aug 11, 2020, 2:31:26 PM8/11/20
to opencog, Xabush Semrie
Hi Ben,

The reason I called these "building blocks" is because they make apparent the need for something you've been calling "agents", - a "remembering agent", a "forgetting agent", a "sharing agent" - I'm using these terms in a slightly different fashion than you do, so let me explain.

Let me start small. So, right now, you can take the cogserver, start it on a large-RAM machine, and have half-a-dozen other atomspaces connect to it. They can request Atoms, do some work, push those Atoms back to the cogserver. This gives you a distributed atomspace, as long as everything fits in the RAM of the cogserver, and you don't turn the power off.

The next obvious step is to enable storage-to-disk for the cogserver.  This is where the first design difficulties show up.  If a cogserver is running out of RAM, it should save some Atoms to disk, and clear them out of RAM. Which ones?  Answer: write a "remembering agent" that implements some policy for doing this.  There is no need to modify the cogserver itself, or any of the clients, to create this agent: so it's a nice, modular design. That's why I called the existing pieces "building blocks": with this third piece, this "remembering agent", one gets a truly functional small distributed system.

If things don't fit on one disk, or if there are hundreds of clients instead of dozens, then one needs a "sharing agent" to implement some policy for sharing portions of an atomspace across multiple cogservers. Again, this is an orthogonal block of code, and one can imagine having different kinds of agents for implementing different kinds of policies for doing this.

OK -- so that's the easy part -- the grand concept of modular design. Two modules are done: the module for "save to disk" (aka atomspace-rocks) and "communicate-over-network" (aka atomspace-cog) Also Xabush is working on a different kind of communicate-over-network module, and that's OK, too.

Let's now consider the simplest agent - the "remembering agent" that sometimes moves atoms from RAM to disk, and then frees up RAM.  How should that work? Well, it's surprisingly ... tricky.  One could stick a timestamp on each Atom (a timestamp Value) and store the oldest ones. But this eats urp RAM, to store the timestamp, and then eats more RAM to keep a sorted list of the oldest ones. If we don't keep a sorted list, then we have to search the atomspace for old ones, and that eats CPU.  Yuck and yuck. 

Every time a client asks for an Atom, we have to update the timestamp (like the access timestamp on a unix file.)  So unix files have three timestamps to aid in decision-making - created, modified and accessed. It works because most files are huge compared to the size of the timestamps. For us, the Atoms are the same size as the timestamps, so we have to be careful not to mandate that every atom must have some meta-data.

There's another problem. Suppose some client asks for the incoming set of some Atom. Well, is that in RAM already, or is it on disk, and needs to be fetched? The worst-case scenario is to assume it's not, and always re-fetch from disk. But this hurts performance. (what's the point of RAM, if we're always going to disk???) Can one be more clever? How?

There's a third problem: "vital" data vs "re-creatable" data. For example, the genomics dataset itself is "vital" in that if you erase anything, it's a permanent "dataloss".  As it is being used, the MOZI genomics code-base performs searches, and places the search results into the atomspace, as a cache, to avoid re-searching next time. These search results are "re-creatable".   Should re-creatable data be saved to disk? Sometimes? Always? Never? If one has a dozen Values attached to some Atom, how can you tell which of these Values are "vital", and which are "recreatable"?

So -- I've sketched three different problems that even the very simplest agent must solve to make even the simplest distributed system.   The obvious solution is not very good, the good solution is not obvious.

The ideas from ECAN and attention values can help some of these problems, but they're not enough in themselves.  More subtle approaches, that actually take into account RAM, CPU and network usage are needed.  The "good news" is that these are "classic OS problems" -- there is a very long history of these kinds of problems, as anyone who writes an operating system and/or a database and/or a cloud infrastructure has to solve exactly these kinds of problems. It a very active area of research - witness the clash between the "Microsoft Open Service Mesh" vs. the "Google Istio" service mesh: these are both "agent" systems that are solving exactly the kinds of problems I'm talking about, and they are light-years ahead of the AtomSpace in terms of sophistication, because there are more than billions of dollars at stake in getting these "agents" to work well.

Well, that's it. A few random end-notes:
 -- an unrelated problem is that the existing opencog "agents" infrastructure is horrible, and needs to be consigned to oblivion.
-- I'm planning on creating an "atomspace-agents" repo real soon now, as a research-area/dumping ground for the kind of agents I describe above.


-- Linas

Linas Vepstas

unread,
Aug 11, 2020, 6:04:20 PM8/11/20
to opencog
This appears to evade/avoid acknowledging issue #1, which is the (CPU) overhead of translating between multiple formats, the competition for RAM that those formats entail, and the need to ship the resulting bytes between API's, or, worse, over (network or local) sockets.

Sure, maybe cassandra has nice solutions for issues #2 #3 and #4, such as consistency, replication, etc. but until you address issue #1 frontally and completely, the remaining issues are utterly unimportant and even delusional.

--linas

Linas Vepstas

unread,
Aug 11, 2020, 8:04:13 PM8/11/20
to opencog
Sigh.

I dislike writing long emails because I fear no one reads them, or that they are viewed as overly aggressive and pugnacious. But until such time as we have mind-reading or neural laces, its .. email.

I want to talk about "service meshes". The problem with shopping for cassandra, or any of the other suggested databases, is that they are all "monolithic black boxes". You pick one, and you get what you get: whatever is provided, that's what it is. Sure, some configuration files somewhere allow you to tune this and that, but that's all.

The service mesh idea (and the npm/js idea before that) is to assemble your system out of small, self-contained pieces. Sure, the object-oriented folks have been talking about this for 3 or 4 decades, and it's cited as the raison-d'etre for things like C++. But C++ never lived up to this ideal.  There are no generic C++ frameworks. None. At All. (OK, so SGI had one or two in the early 1990's ...) Something is ... missing...  in C++.  Compare this to node.js and npm which are wildly successful over-achievers in this category.  People regularly build large applications by assembling a cacophony of tiny little javascript parts. Clearly, javascript has something that C++ does not.  Something that makes the OO dream achievable not just in theory, but regularly validated in practice.

Now, there are some down-sides to npm apps: they contain hundreds or thousands of parts, and not all of them are well-maintained, and many have published security vulnerabilities that remain unpatched. Worse, patching some of them require incompatible API changes that would break users. So it has its own prickly and thorny issues that are unique and different from those that other languages (python, scheme, c++) suffer from.

In the cloud world, there has long been, and continues to be a movement to meshes of containerized applications. Here, docker is the prototypical container -- lxc/lxd/lxe more generally.  Managing these containers requires kubernetes, and more: the "service meshes" (istio, microsoft open service mesh) provide a layer (a "control plane") that further manages deployments, error fallbacks, a/b testing, circuit-breakers, load-balancing, etc.   The mental model is that containerized apps are just like npm nodes, except they are million times bigger and beefier (literally) and they all have network interfaces instead of javascript methods/objects. And since they are so much bigger, they need more active management.

Now compare the service-mesh idea to the olde-fashioned ideas of "web shopping carts" or "content management systems" or "customer relationship management systems".  Those things were single, monolithic black boxes that you bought from a vendor (or installed via open-source) that automagically did everything for you, once you configured a few templates.   They worked great, as long as what you wanted was (a) a web shopping cart, and (b) was customizable via some template or config file. If not .. you were SOL.

These monolithic architectures were their downfall, were the driver to containers, kubernetes and service meshes. The founders of cloud startup XYZ can't spray-paint some config files onto a monolith and then raise $20M in venture funding.  But, give them a bunch of pieces-parts containers, that they can hook up in some new, novel and exciting way, plus a little secret sauce, and buzzword-bingo, a unicorn is born.

And this is why Cassandra makes me yawn with disinterest, if not a bit of hostility. It's a big monolithic block. Sure, I can take the AtomSpace, and plaster it onto Cassandra, like wrapping some wet paper around a rock. The ultimate shape is still that of the rock, no matter how brightly-colored or thoughtful that paper wrapped around it is.

So, I'm trying to grab hold of this idea of pieces-parts.  OpenCog needs pieces-parts that can be arranged and re-assembled into that mesh that provides the distributed-atomspace attributes and requirements du-jour.

Yes, of course, singularity.net is also pursuing a vision of pieces-parts that can be assembled. Which is why I am a bit dumb-founded that we are entertaining ideas like Cassandra -- it is the very antithesis of modular architecture. It's the opposite of a dapp -- It's a big giant lump, the one ring to rule them all. It's kind of exactly the poster-child for what not to do ...

For a distributed atomspace, what we really need to focus on is inter-operability, so that, like javascript (and unlike c++) it is easy to assemble modules out of other modules.  Like containers, there should be some fairly regularized API for communications (I nominate atomese-as-ascii-strings i.e. s-expressions and maybe plan-B atomese-as-json). With this under control, we can move on to creating unique, custom services aka agents aka dapps or whatever these other things might be.

Again, I nominate the building-blocks idea: I took the earlier email, and pasted it into the README, here: https://github.com/opencog/atomspace-agents

-- Linas
 

Linas Vepstas

unread,
Aug 11, 2020, 8:05:53 PM8/11/20
to opencog
Re-sending with different subject-line.  I was gonna change the subject-line and forgot.

Matt Chapman

unread,
Aug 13, 2020, 1:20:38 PM8/13/20
to opencog
I read it. I feel like you're talking about complete different things, and misrepresenting what Cassandra is. All those node.microservice frequently require a common data store (in additional to a local data store), and that store is more often than not, Dynamo, Firebase, or Cassandra.

But I feel like someone said, "Which kind of plane should we take to China?" and you answered, "Fishing is best done from a boat." True, and maybe we do want to do some fishing when we get to China, but that's not at all what I'm talking about here, and I make my living building the kinds of micro-service based applications that youre describing.

So, I conclude we're talking at levels that are too abstract to resolve our communication failures, and the only path forward is to build a PoC. It's unlikely I'll have the time given my current employment, but now I know it won't happen until I do.

If I can leave oy one impression: Tunable Consistency is valuable. Not Eventual Consistency. Not Cassandra or Scylla or Seastar, or Token Rings. An effective multi-mind-agent distributed atomspace probably requires Tunable Consistency, however it's implemented. Arguably, current DB backing offers a limited form of it, but more powerful forms exist.

Matt

Linas Vepstas

unread,
Aug 13, 2020, 6:46:25 PM8/13/20
to opencog
What can I say. Hard to say anything if there's a communications breakdown.

If you have something specific and technical to comment on, please do so. I can't deal with "fishing boats to China" as a technical critique. It's abundantly clear that Cassandra would be a total failure for this application, and I've now explained this in considerable detail in three different ways. If you think you can prove otherwise ... you are welcome to do so. Good luck with that.

-- linas



Predrag Radović

unread,
Aug 16, 2020, 6:24:15 PM8/16/20
to ope...@googlegroups.com
Hi Linas,

On Fri, Aug 07, 2020 at 01:30:14PM -0500, Linas Vepstas wrote:
> >
> > To get scientific about it, you'd want to create a heat-map -- load up
> > some large datasets, say, some of the genomics datasets, run one of their
> > standard work-loads as a bench-mark, and then see which pages are hit the
> > most often. I mean -- what is the actual working-set size of the genomics
> > processing? No one knows -- we know that during graph traversal, memory is
> > hit "randomly" .. but what is the distribution? It's surely not uniform.
> > Maybe 90% of the work is done on 10% of the pages? (Maybe it's Zipfian?
> > I'd love to see those charts...)
>
> The "query-loop" is a subset/sample from one of the agi-bio datasets. It's
> a good one to experiment with, since it will never change, so you can
> compare before-and-after results. The agi-bio datasets change all the
> time, as they add and remove new features, new data sources, etc. They're
> bigger, but not stable.

VM page heat-map for query-loop benchmark is here:

https://github.com/crackleware/opencog-experiments/tree/c0cc508dc5757635ce6c069b20f8ae13ccf8ef8a/mmapped-atomspace

Everything is getting dirty during loading. There is a "hot" subset of pages
being referenced during processing stage. Total size of referenced pages in
processing stage is around ~150MB of 1.6GB (total allocation). Heat-map is very
crude because it groups pages in linear order which is probably bad
grouping. I may experiment with page grouping to get more informative graphs
(could be useful chunking research).

I also did several experimental runs where I used swap-space on NFS and NBD
(network block device). 2 cores, 1GB RAM, 2GB swap. Performance was not very
good (~10%). CPU is too fast for this amount of memory. :-)

Intermittent peaks are probably garbage collections.

All in all, I expect much better performance with very concurrent workloads,
hundreds of threads. When a processing thread hits a page which is not yet in
physical RAM it blocks. Request for that page from storage is queued. Other
threads continue to work and after some time they will block too waiting for
some of their pages to load. Storage layer will collect multiple requests and
deliver data in batches, introducing latency. That's why when they benchmark
SSDs there are graphs for various queue depths. Deeper queue, better throughput.

Query-loop benchmark is single-threaded. I would like to run more concurrent
workload with bigger datasets. Any suggestions?


--pedja

Linas Vepstas

unread,
Aug 16, 2020, 9:43:48 PM8/16/20
to opencog, Xabush Semrie
CC'ing Xabush to answer the question at the bottom ..

On Sun, Aug 16, 2020 at 5:24 PM Predrag Radović <pe...@crackleware.nu> wrote:
Hi Linas,

On Fri, Aug 07, 2020 at 01:30:14PM -0500, Linas Vepstas wrote:
> >
> > To get scientific about it, you'd want to create a heat-map -- load up
> > some large datasets, say, some of the genomics datasets, run one of their
> > standard work-loads as a bench-mark, and then see which pages are hit the
> > most often. I mean -- what is the actual working-set size of the genomics
> > processing? No one knows -- we know that during graph traversal, memory is
> > hit "randomly" .. but what is the distribution? It's surely not uniform.
> > Maybe 90% of the work is done on 10% of the pages? (Maybe it's Zipfian?
> > I'd love to see those charts...)
>
> The "query-loop" is a subset/sample from one of the agi-bio datasets. It's
> a good one to experiment with, since it will never change, so you can
> compare before-and-after results.  The agi-bio datasets change all the
> time, as they add and remove new features, new data sources, etc. They're
> bigger, but not stable.

VM page heat-map for query-loop benchmark is here:

https://github.com/crackleware/opencog-experiments/tree/c0cc508dc5757635ce6c069b20f8ae13ccf8ef8a/mmapped-atomspace

Wow! I wasn't really expecting much to happen, so very definitely wow!


Everything is getting dirty during loading. There is a "hot" subset of pages
being referenced during processing stage. Total size of referenced pages in
processing stage is around ~150MB of 1.6GB (total allocation). Heat-map is very
crude because it groups pages in linear order which is probably bad
grouping. I may experiment with page grouping to get more informative graphs
(could be useful chunking research).

OK, some assorted random, disconnected remarks:

 * After initial data load is completed, run the benchmark for 10 seconds, sort the pages by hits, and then monitor to see how that changes over time...
 
* There's a way to monitor guile, while running.  In the main guile shell, say `(use-modules (opencog cogserver))` and `(start-cogserver)` which should print `Listening on port 17001
$1 = "Started CogServer"`   then, from somewhere else, you can `rlwrap telnet localhost 17001` and `scm` which prints `Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
opencog> scm
Entering scheme shell; use ^D or a single . on a line by itself to exit.`
and then any scheme is valid. An interesting one is (gc-stats) which prints info about guile's garbage collector. It burns through a huge amount of ram during data load (no surprise), but then settles down to a working-set size of 10MBytes (also no surprise) there is very little guile usage after the initial load.

* We have newer load procedures that don't use guile for loading, so the initial hugeness should come down.

* Using the non-guile load is surely a benefit, as it will mean that atomspace RAM and guile RAM allocations are far less likely to interleave and fragment one-another.  Less fragmentation means that the guile GC is less likely to invade every page just to scan a few hundreds bytes of guile-heap. (I'm not clear how it actually works, but I think the fragmentation is a valid concern.)

* I have no clue of the 150MBytes vs long-tail. It is possible that, during the file-load, all of the gene data ended up on a 150MB subset of RAM, and the protein and reactome data fills the rest. There are 30K genes and 150K proteins, so that is a 5x difference, but you are seeing a 10x difference between hot and luke-warm ... hmm.

* In principle, the long tail does not surprise me: the access patterns are "very random". So, first of all, the genes are likely to get splattered over most pages, (depending on how the files load) and the various links connecting genes together might get splattered onto even more pages.

The "triangle" benchmark is looking for three genes that interact pair-wise, thus forming a triangle. These have a very "fat tail": the distribution is square-root-Zipfian. That is, if you sort genes by the number of triangles they appear in, then rank them, then the distribution is 1/sqrt(rank) so much much fatter than the classic Zipf tail of 1/rank. I also looked at tetragons, and its fatter still. (I back-burnered that work, but have detailed graphs for this stuff at https://github.com/linas/biome-distribution/blob/master/paper/biome-distributions.pdf which I need to finish...) 

... all this has implications for which RAM pages get hit. ...

... I've never-ever thought about it before, but maybe there are some tricks where we could somehow force more locality during the data load, e.g. by having the Atomspace allocate out of a different pool, than say, where-ever other random allocations are being done.  Or some other clever locality stunts, like asking related atoms to be placed near each other, e.g. the way modern file systems allocate blocks... is there a file-system-like allocator for RAM? Where I can ask for RAM that is as "near as possible to this", and otherwise, far away, leaving gaps for growth (like what ext2 does as opposed to what DOS FAT did)"?

Well, the thing to do here is to stop using guile for file loading, and see if that fixes the long-tail ... that long tail might just be the guile GC touching every page, because the guile heap got fragmented everywhere ... Xabush will explain how to use the "fast file loader" on the new datasets.


I also did several experimental runs where I used swap-space on NFS and NBD
(network block device). 2 cores, 1GB RAM, 2GB swap. Performance was not very
good (~10%). CPU is too fast for this amount of memory. :-)

Intermittent peaks are probably garbage collections.

Yes. And not using guile to load data may avoid having GC touch all RAM.

All in all, I expect much better performance with very concurrent workloads,
hundreds of threads. When a processing thread hits a page which is not yet in
physical RAM it blocks. Request for that page from storage is queued. Other
threads continue to work and after some time they will block too waiting for
some of their pages to load.  Storage layer will collect multiple requests and
deliver data in batches, introducing latency. That's why when they benchmark
SSDs there are graphs for various queue depths. Deeper queue, better throughput.

OK, so these tests are "easily" parallelized, with appropriate definition of "easy". Each search is conducted on each gene separately, so these can be run in parallel. That's the good news. The bad news is that doing this with guile threads seems to fail; there is some kind of live-lock problem in guile that I have not been able to isolate. Pure guile multi-threads well, but not guile+atomspace ... it is 1.5x faster for two threads, and running three threads is like running one. Running four threads is slower than one. Yech.

The good news is that I believe that atomspace pure C++ code threads well. The other good news is that Atomese has several actual Atoms that run multiple threads -- one called ParallelLink, the other called JoinThreadLink. I've never-ever tried them with the pattern matcher before. I will try now ... anyway, I think it's possible to do pure-c++-threading without having to write any new C++ code... separate email when I have something to say...


Query-loop benchmark is single-threaded.  I would like to run more concurrent
workload with bigger datasets. Any suggestions?

Xabush, what are your biggest datasets? How do you load them?  How do you run them?

Meanwhile, I'll explore concurrency with the ParallelLink ... see if I can make that practical.

-- Linas

Linas Vepstas

unread,
Aug 16, 2020, 11:36:40 PM8/16/20
to opencog, Xabush Semrie
On Sun, Aug 16, 2020 at 8:43 PM Linas Vepstas <linasv...@gmail.com> wrote:
On Sun, Aug 16, 2020 at 5:24 PM Predrag Radović <pe...@crackleware.nu> wrote:
Hi Linas,

All in all, I expect much better performance with very concurrent workloads,
hundreds of threads. When a processing thread hits a page which is not yet in
physical RAM it blocks. Request for that page from storage is queued. Other
threads continue to work and after some time they will block too waiting for
some of their pages to load.  Storage layer will collect multiple requests and
deliver data in batches, introducing latency. That's why when they benchmark
SSDs there are graphs for various queue depths. Deeper queue, better throughput.

OK, so these tests are "easily" parallelized, with appropriate definition of "easy". Each search is conducted on each gene separately, so these can be run in parallel. That's the good news.
The other good news is that Atomese has several actual Atoms that run multiple threads -- one called ParallelLink, the other called JoinThreadLink. I've never-ever tried them with the pattern matcher before. I will try now ...

I tried. It works-ish. (actually, it works-ish great, with the appropriate definition of "great"; I'm thrilled.) The JoinThreadLink does parallelize, but it has a rather old API that predates modern ideas that makes it currently impractical. I need to fix it .. which should be "easy" cause no one currently uses it. But this will take at least a few days or longer, and I really should be doing other things... like plumbing in the upstairs bathroom...

I (really?) like this idea, since, done right, it might be the easiest way to parallelize queries, which has been a sticking point for a while.  We've got all the other parallel infrastructure in place, already, so now it just has to be wired up correctly.

-- Linas

Abdulrahman Semrie

unread,
Aug 17, 2020, 5:33:35 PM8/17/20
to opencog
Xabush, what are your biggest datasets? How do you load them?

Meanwhile, I'll explore concurrency with the ParallelLink ... see if I can make that practical.

I load the following datasets:

Loading all the datasets takes 84.8 seconds with 4.5 GB of RAM usage on my machine. When excluding the string datasets, it takes only 25.8 seconds to load with ~ 2GB of RAM usage.

I use the sexpr code to load them which is much faster than using guile’s primitive-load. 

How do you run them?

I didn't understand this question. 


Regards,

Abdulrahman Semrie

Reply all
Reply to author
Forward
0 new messages