ANNOUNCE: FileStorageNode

Linas Vepstas

Sep 1, 2021, 12:02:44 PM
to opencog
Yesterday, I put the finishing touches on the FileStorageNode.  This uses the StorageNode API to read/write Atomese s-expressions to a flat file.  It's fast, and it's compact.  It's 10x faster than using plain scheme (guile) to dump Atoms: this is thanks to code originally written by Alexey Potapov and Anatoly Belikov -- I wrote a wrapper around it to use the StorageNode API.

Some stats: I tested two datasets: a MOZI biology dataset, and a natural language dataset, of 7 million and 20 million Atoms, respectively. When these are loaded into the AtomSpace (in RAM), they take up 632 and 775 bytes/Atom of RSS (operating system resident set size). This is very typical for Atoms in the AtomSpace. (I put these two datasets up at https://linas.org/datasets/ for Amirouche.)

Dumped to a file, this becomes 55 and 154 bytes/Atom, for plain, uncompressed Atomese s-expressions. When compressed with bzip2, it shrinks to 4 and 6 bytes/Atom!  Tiny!  Clearly, keeping searchable indexes in the AtomSpace costs a huge amount of RAM.  The actual data content in typical Atoms is ... tiny.
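
For a rough sense of scale, here is a small back-of-the-envelope sketch in Python that turns the per-Atom figures above into approximate totals and ratios. It assumes the round Atom counts quoted above (7 million and 20 million) and decimal megabytes; the function and variable names are made up for illustration and are not part of any AtomSpace API.

```python
# Figures quoted in the post, per dataset:
# (number of Atoms, RAM bytes/Atom, flat-file bytes/Atom, bzip2 bytes/Atom)
# The Atom counts are round numbers, so all totals are approximate.
DATASETS = {
    "MOZI biology": (7_000_000, 632, 55, 4),
    "natural language": (20_000_000, 775, 154, 6),
}

MB = 1_000_000  # decimal megabytes (an assumption; the post gives no totals)


def summarize(atoms, ram_bpa, file_bpa, bz2_bpa):
    """Return approximate total sizes (in MB) and RAM-to-file size ratios."""
    return {
        "ram_mb": atoms * ram_bpa / MB,
        "file_mb": atoms * file_bpa / MB,
        "bz2_mb": atoms * bz2_bpa / MB,
        "ram_vs_file": ram_bpa / file_bpa,
        "ram_vs_bz2": ram_bpa / bz2_bpa,
    }


for name, figures in DATASETS.items():
    s = summarize(*figures)
    print(
        f"{name}: RAM ~{s['ram_mb']:.0f} MB, "
        f"file ~{s['file_mb']:.0f} MB, "
        f"bzip2 ~{s['bz2_mb']:.0f} MB, "
        f"RAM is {s['ram_vs_file']:.1f}x the file "
        f"and {s['ram_vs_bz2']:.0f}x the bzip2 size"
    )
```

On these numbers the in-RAM AtomSpace is very roughly 5x to 11x the size of the plain s-expression dump, and over 100x the size of the bzip2-compressed dump.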


-- Linas

--
Patrick: Are they laughing at us?
Sponge Bob: No, Patrick, they are laughing next to us.
 
