EOSS-3 Ideas


Josh Moore

Mar 3, 2020, 5:21:44 AM
to pytabl...@googlegroups.com
All-

I'm just back from the EOSS-1 [0] meeting in Berkeley, in part wearing my zarr hat. numpy, matplotlib, pandas, and a few other NumFOCUS projects were represented as well [1].

EOSS round 2 has just closed [2], but EOSS-3 will be opening in June. Would anyone be interested in a PyTables proposal? I was fielding questions about PyTables at the meeting, so my first, perhaps poorly thought-through, idea would be to propose a zarr backend. You all likely have better ideas for the maximum of 250K USD, though. The money would need to be spent within one year.

~Josh

Francesc Alted

Mar 3, 2020, 12:56:30 PM
to pytabl...@googlegroups.com, Josh Moore, Alberto Sabater, Aleix Alcacer
Thanks Josh for the heads-up.

I suppose you know that we have already tried to put PyTables on top of h5py on at least a couple of occasions. Here is our last attempt (made with a NumFOCUS small grant): https://github.com/PyTables/PyTables/pull/634 .

In my experience, that still implies a great deal of work, and I am afraid that using zarr as a backend would not improve things significantly. In fact, from the beginning we tried to define an interface that would make it relatively easy to plug in a backend other than h5py (e.g. zarr). In all honesty, I don't think it is worth continuing this effort: the functionality that PyTables provides beyond what h5py (or zarr) provides is essentially the indexing for accelerating queries, and for users it should be easier to use e.g. a relational database for this.

Having said this, I'd say that it perhaps makes more sense to implement an existing *columnar store* such as bcolz (https://bcolz.readthedocs.io) on top of things like zarr or Caterva (see our plans for it here: https://github.com/Blosc/caterva/blob/master/ROADMAP.md), or better yet, on top of an interface that would allow plugging in different multidimensional storage backends (including h5py). We will think seriously about submitting a proposal for EOSS-3.
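
To make the idea of a columnar store on top of a pluggable storage interface a bit more concrete, here is a minimal sketch. Every name in it (ArrayBackend, InMemoryBackend, ColumnStore) is hypothetical and does not correspond to any existing API in PyTables, bcolz, zarr, Caterva, or h5py; the in-memory backend only stands in for a real one, and the query does a plain full scan rather than using an index.

```python
from typing import Callable, Dict, List, Protocol, Sequence


class ArrayBackend(Protocol):
    """Hypothetical minimal storage interface (an assumption, not a real API)."""

    def write(self, name: str, values: Sequence[float]) -> None: ...

    def read(self, name: str) -> List[float]: ...


class InMemoryBackend:
    """Stand-in backend that keeps arrays in a dict.

    A real backend would wrap zarr, Caterva, h5py, etc. behind the same methods.
    """

    def __init__(self) -> None:
        self._arrays: Dict[str, List[float]] = {}

    def write(self, name: str, values: Sequence[float]) -> None:
        self._arrays[name] = list(values)

    def read(self, name: str) -> List[float]:
        return list(self._arrays[name])


class ColumnStore:
    """Columnar table: each column is stored as one named array in the backend."""

    def __init__(self, backend: ArrayBackend) -> None:
        self._backend = backend
        self._columns: List[str] = []

    def add_column(self, name: str, values: Sequence[float]) -> None:
        self._backend.write(name, values)
        self._columns.append(name)

    def where(
        self, column: str, predicate: Callable[[float], bool]
    ) -> List[Dict[str, float]]:
        """Return the rows whose value in `column` satisfies `predicate`.

        This is a full scan of one column; no index is involved.
        """
        keys = self._backend.read(column)
        hits = [i for i, value in enumerate(keys) if predicate(value)]
        data = {c: self._backend.read(c) for c in self._columns}
        return [{c: data[c][i] for c in self._columns} for i in hits]


# Usage: the store only talks to the backend through write() and read().
store = ColumnStore(InMemoryBackend())
store.add_column("x", [1.0, 2.0, 3.0])
store.add_column("y", [10.0, 20.0, 30.0])
rows = store.where("x", lambda v: v > 1.5)
```

Because ColumnStore depends only on the two methods of the interface, swapping the storage layer would, under these assumptions, mean writing another small class with the same write() and read() methods.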

Best,
Francesc

--
Francesc Alted