Using a different storage framework to store pandas DataFrames

Shan Tulshi

Jun 6, 2016, 6:39:51 PM
to PyData
Hi! 

I'm fairly new to pandas, and I'm trying to integrate a different storage system with it so that dataframes aren't kept in local memory but are stored on a server, with all further modifications done server-side. Is there a storage interface I can modify to make this happen?

Thanks a ton,
Shan

Stephan Hoyer

Jun 7, 2016, 2:45:45 AM
to pyd...@googlegroups.com
I'm afraid pandas does not have any such storage interface available at present -- data is always stored in NumPy arrays in memory.
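
A minimal sketch of what that looks like in practice, assuming a small made-up frame (the column names and values are hypothetical). For plain numeric columns, the values come back as a NumPy array held in local memory:

```python
import numpy as np
import pandas as pd

# Hypothetical example data; any small numeric frame would do.
df = pd.DataFrame({"a": [1, 2, 3], "b": [0.5, 1.5, 2.5]})

# The values of a numeric column are exposed as a NumPy array.
col = df["a"].values
print(type(col))

# Per-column memory footprint (in bytes) of the in-memory data.
print(df.memory_usage(deep=True))
```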

CM

Jun 7, 2016, 9:35:04 AM
to PyData
While not exactly what you are asking for, there are these related projects:

http://blaze.pydata.org/
Blaze is a Python library and interface to query data on different storage systems. Blaze works by translating a subset of modified NumPy and Pandas-like syntax to databases and other computing systems. Blaze gives Python users a familiar interface to query data living in other data storage systems such as SQL databases, NoSQL data stores, Spark, Hive, Impala, and raw data files such as CSV, JSON, and HDF5.

Arctic is a high performance datastore for numeric data. It supports Pandas, numpy arrays and pickled objects out-of-the-box, with pluggable support for other data types and optional versioning.

Michael Hooreman

Jun 8, 2016, 3:57:10 AM
to PyData
Hi,

I'm using pickle to "store" pandas DataFrames. But you have to unpickle everything; you cannot extract only a filtered subset.
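
A minimal sketch of that limitation, assuming a small made-up frame and a temporary file path (both hypothetical). The whole frame has to be read back before any filter can be applied:

```python
import os
import tempfile

import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"city": ["Oslo", "Lima", "Oslo"], "sales": [10, 20, 30]})

# Write the frame to a pickle file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "frame.pkl")
df.to_pickle(path)

# Reading is all-or-nothing: the entire frame is loaded into memory.
restored = pd.read_pickle(path)

# Filtering is only possible after the full load.
oslo = restored[restored["city"] == "Oslo"]
```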

So, that's far from perfection...

Best regards.

Michael