ANNOUNCE: FileStorageNode

Linas Vepstas

Sep 1, 2021, 12:02:44 PM

to opencog

Yesterday, I put the finishing touches on the FileStorageNode. This uses the StorageNode API to read/write Atomese s-expressions to a flat file. It's fast, and it's compact. It's 10x faster than using plain scheme (guile) to dump Atoms: this is thanks to code originally written by Alexey Potapov and Anatoly Belikov -- I wrote a wrapper around it so that it can be used through the StorageNode API.

Some stats: I tested two datasets: a MOZI biology dataset, and a natural language dataset, of 7 million and 20 million Atoms, respectively. When these are loaded into the AtomSpace (in RAM), they take up 632 and 775 bytes/Atom of RSS (operating system resident set size). This is very typical for Atoms in the AtomSpace. (I put these two datasets up at https://linas.org/datasets/ for Amirouche.)

Dumped to a file, these become 55 and 154 bytes/Atom, for plain, uncompressed Atomese s-expressions. When compressed with bzip2, they shrink to 4 and 6 bytes/Atom! Tiny! Clearly, keeping searchable indexes in the AtomSpace costs a huge amount of RAM. The actual data content in typical Atoms is ... tiny.
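To put the per-Atom figures in perspective, here is a rough back-of-the-envelope sketch in Python that turns them into approximate total sizes. The per-Atom numbers and Atom counts are the ones quoted above; the Atom counts are rounded, so the totals are only estimates. The names `DATASETS`, `footprint` and `summarize` are made up for this illustration and are not part of any OpenCog API.

```python
# Per-Atom sizes (in bytes) and rounded Atom counts quoted in this post.
DATASETS = {
    "MOZI biology": {"atoms": 7_000_000, "ram": 632, "file": 55, "bzip2": 4},
    "natural language": {"atoms": 20_000_000, "ram": 775, "file": 154, "bzip2": 6},
}


def footprint(atoms, bytes_per_atom):
    """Approximate total size in bytes for a given per-Atom cost."""
    return atoms * bytes_per_atom


def summarize(name):
    """Return estimated totals and size ratios for one dataset."""
    d = DATASETS[name]
    return {
        "ram_bytes": footprint(d["atoms"], d["ram"]),
        "file_bytes": footprint(d["atoms"], d["file"]),
        "bzip2_bytes": footprint(d["atoms"], d["bzip2"]),
        # How much larger the in-RAM form is than the flat file.
        "ram_to_file": d["ram"] / d["file"],
        # How much bzip2 shrinks the flat file.
        "file_to_bzip2": d["file"] / d["bzip2"],
    }


for name in DATASETS:
    s = summarize(name)
    print(
        f"{name}: RAM ~{s['ram_bytes'] / 1e9:.1f} GB, "
        f"file ~{s['file_bytes'] / 1e6:.0f} MB, "
        f"bzip2 ~{s['bzip2_bytes'] / 1e6:.0f} MB"
    )
```

With the rounded counts, this works out to roughly 4.4 GB of RSS versus 385 MB on disk and 28 MB compressed for the biology dataset, and roughly 15.5 GB versus 3080 MB and 120 MB for the language dataset.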

See https://wiki.opencog.org/w/FileStorageNode and the demo in https://github.com/opencog/atomspace/blob/master/examples/atomspace/persist-store.scm

-- Linas

Patrick: Are they laughing at us?

Sponge Bob: No, Patrick, they are laughing next to us.
