Very large tree data structure?

Emanuele Olivetti

Dec 30, 2018, 1:08:19 PM
to pytables-users
Hi,

I'm very new to PyTables and wondering whether it is the right tool for my problem. I'm building a very large binary tree (millions to billions of nodes) that quickly fills up the RAM. Each node holds only a tiny amount of information: a few numbers and references to its child nodes. I need something like numpy.memmap(), but for trees rather than arrays, to offload data from RAM to disk and to handle persistence. Is PyTables a good candidate for the job, or should I look for something else?

Thanks for answers/suggestions,

Emanuele

Francesc Alted

Dec 31, 2018, 3:07:40 AM
to Emanuele Olivetti, pytables-users
Hi Emanuele,

PyTables is designed specifically for tabular data and *also* provides indexing on columns so that queries run faster.  It is indeed possible to store a tree on top of a table, but you are responsible for simulating the links between the different branches and managing them yourself.  If you give it a try, keep us informed; that would be an interesting application.
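
To make the "links as row numbers" idea concrete, here is a minimal sketch in plain Python. It does not use the PyTables API: it only assumes one fixed-size row per node (a float payload plus two integer columns holding the row numbers of the children), which is the kind of layout a table with integer link columns would give you. The names append_node, set_children, read_node and in_order, and the use of -1 for "no child", are made up for this illustration.

```python
import struct
import tempfile

# One fixed-size "row" per node: a float payload plus the row numbers of
# the left and right children. -1 marks a missing child (an assumption of
# this sketch, not a PyTables convention).
NODE = struct.Struct("<dqq")
NO_CHILD = -1


def append_node(f, value):
    """Append a leaf row to the open binary file and return its row number."""
    f.seek(0, 2)  # jump to the end of the file
    row = f.tell() // NODE.size
    f.write(NODE.pack(value, NO_CHILD, NO_CHILD))
    return row


def read_node(f, row):
    """Return (value, left, right) for the given row number."""
    f.seek(row * NODE.size)
    return NODE.unpack(f.read(NODE.size))


def set_children(f, row, left=NO_CHILD, right=NO_CHILD):
    """Rewrite one row in place so that it links to its children."""
    value, _, _ = read_node(f, row)
    f.seek(row * NODE.size)
    f.write(NODE.pack(value, left, right))


def in_order(f, root):
    """Yield payloads in left, node, right order, reading rows on demand."""
    stack, row = [], root
    while stack or row != NO_CHILD:
        while row != NO_CHILD:
            value, left, right = read_node(f, row)
            stack.append((value, right))
            row = left
        value, row = stack.pop()
        yield value


# Tiny demo: a root with two children, kept on disk rather than in RAM.
with tempfile.TemporaryFile() as f:
    root = append_node(f, 2.0)
    lo = append_node(f, 1.0)
    hi = append_node(f, 3.0)
    set_children(f, root, left=lo, right=hi)
    print(list(in_order(f, root)))  # [1.0, 2.0, 3.0]
```

Only the rows being visited are read from disk, so memory use stays small regardless of the tree size; the cost is one seek and read per node visited.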

Best wishes,
Francesc

--
You received this message because you are subscribed to the Google Groups "pytables-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pytables-user...@googlegroups.com.
To post to this group, send an email to pytable...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--
Francesc Alted

Gonçalo Lopes

Dec 31, 2018, 6:09:36 AM
to Francesc Alted, Emanuele Olivetti, pytables-users
Hi Emanuele,

This is just a thought, but depending on your application, it is sometimes very easy to represent a large binary tree implicitly as an array. A binary heap is an example of one such application: https://en.wikipedia.org/wiki/Binary_heap#Heap_implementation

In this case, you can use an array directly to represent the tree, without any need for pointers.
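
As a rough illustration of the implicit layout, here is a small sketch using the standard-library array module. It assumes the usual binary-heap indexing with the root at index 0; the helper names left, right, parent and in_order are just for this example.

```python
from array import array


def left(i):
    """Index of the left child of node i."""
    return 2 * i + 1


def right(i):
    """Index of the right child of node i."""
    return 2 * i + 2


def parent(i):
    """Index of the parent of node i (not meaningful for the root, i == 0)."""
    return (i - 1) // 2


def in_order(tree, i=0):
    """Yield the values in left, node, right order using index arithmetic only."""
    if i >= len(tree):
        return
    yield from in_order(tree, left(i))
    yield tree[i]
    yield from in_order(tree, right(i))


# A complete binary tree with 7 nodes, stored level by level:
#         4
#       2   6
#      1 3 5 7
tree = array("d", [4.0, 2.0, 6.0, 1.0, 3.0, 5.0, 7.0])
print(list(in_order(tree)))  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
```

Since the tree is then just a flat array of numbers, the same index arithmetic should work on a numpy.memmap array. Note that the layout is only compact when the tree is complete or nearly so; an unbalanced tree needs empty slots for its missing nodes.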

Best,
Goncalo

Emanuele Olivetti

Dec 31, 2018, 9:06:35 AM
to Gonçalo Lopes, Francesc Alted, pytables-users
Hi Gonçalo,

Thank you for pointing me to the binary heap! I was thinking about using just an array and now I'll have a deeper look into that.

Best,

Emanuele
