Andreas.
--
but one of the general advantages of HDF5 is that it's **files**
which you can exchange. It's not (easily) possible to exchange MySQL data.
--
I see that there is a bias towards using HDF5 in pandas. Can someone elaborate on the design decision to have better support for HDF5? What are the pros/cons of using HDF5 vs. SQL?
--
Having said that, I prefer a solution where I have a centralized database and multiple users fetch data from it. I think this is where HDF5 becomes a subpar solution. There is OpenDAP, which aims to provide access to remote HDF5 sources, but Python library support seems lacking in this area. pydap is the OpenDAP library, and it used to have a PyTables plugin which no longer works. Therefore I believe it would not be possible to use pandas with an OpenDAP server at this point, since pandas uses PyTables.
Of course I might be wrong, since I have not tried this myself -- this is just an observation from what I have read so far. Anyone with experience using pandas with an OpenDAP server, please pitch in.
On Wednesday, 28 November 2012 13:40:34 UTC-5, Adam wrote:
I am very interested in building a database around results in my research. At first I had decided to use PySQL and attempt to store data as pickled dataframes, and then build a small interface to allow users to read the pickles into memory upon selection. I wasn't really aware of pandas' built-in support for PyTables and HDF5, so now I'm leaning in that direction. Has anyone already built a database around pandas-based research? If so, can you share your experience, especially any tips, pointers and gotchas that I should be aware of?
On Wed, Nov 28, 2012 at 1:30 PM, Goyo <goyo...@gmail.com> wrote:
On Wednesday, 28 November 2012 15:17:35 UTC+1, Andreas Hilboll wrote:
but one of the general advantages of HDF5 is that it's **files**
which you can exchange. It's not (easily) possible to exchange MySQL data.
There's SQLite for that.
Goyo
--
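Goyo's SQLite point can be illustrated with a minimal sketch. It assumes pandas is installed and uses Python's standard-library sqlite3 module. The table name `results`, the file name `results.sqlite`, and the example columns are hypothetical and do not come from the thread. The idea is only that the whole database is a single ordinary file, so it can be exchanged much like an HDF5 file.

```python
import os
import sqlite3
import tempfile

import pandas as pd

# Hypothetical example data; the column names are made up for illustration.
df = pd.DataFrame({"sample": ["a", "b", "c"], "value": [1.5, 2.5, 3.5]})

# The whole database lives in one ordinary file that can be copied or mailed.
db_path = os.path.join(tempfile.mkdtemp(), "results.sqlite")

# Write the frame to a table in the SQLite file.
conn = sqlite3.connect(db_path)
try:
    df.to_sql("results", conn, index=False, if_exists="replace")
finally:
    conn.close()

# A second user opens the same file and reads the table back.
conn = sqlite3.connect(db_path)
try:
    restored = pd.read_sql("SELECT * FROM results", conn)
finally:
    conn.close()
```

This says nothing about speed relative to HDF5 on large datasets; it only shows that exchanging a file is possible with an SQL backend too.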
Jeff uses HDF5 a lot and has invested a lot of time making it work well. He can better evangelise :) See also http://stackoverflow.com/q/14262433/1240268
The SQL support has recently moved to SQLAlchemy for greater language support, but this doesn't lend itself to speed (it previously used pure Python, IIRC, so it wasn't super fast either). IMO few devs are using this (on large enough datasets)... ?
Note that Continuum have a (non-free) solution: io-pro.
Using a faster/lower-level (C) API has been mooted before, but I don't think we found one, or someone to implement it...