I like to stick with DataFrame whenever possible to keep things
simple, so you could also create separate columns X, Y, and T for
every datapoint. You can then use set_index(['X', 'Y', 'T']) or
set_index(['T', 'X', 'Y']) depending on whether you want to process
over timepoints or grid cells. As far as I know, set_index does not
sort by itself, so call sort_index() afterwards to retain your
meaningful data order.
To recover the 3d array representation, you would set the index in the
order you want, and then:
df.values.reshape((Ndim1, Ndim2, Ndim3)).
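The round trip described above can be sketched as follows. This is only a minimal illustration: the grid sizes, the column name "value", and the shuffling step are assumptions made up for the example, not part of the original data.

```python
import numpy as np
import pandas as pd

# Hypothetical grid: 2 timepoints, 3 x-positions, 4 y-positions.
n_t, n_x, n_y = 2, 3, 4
t, x, y = np.meshgrid(
    np.arange(n_t), np.arange(n_x), np.arange(n_y), indexing="ij"
)

# Long format: one row per datapoint, with separate T, X, Y columns.
df = pd.DataFrame({
    "T": t.ravel(),
    "X": x.ravel(),
    "Y": y.ravel(),
    "value": np.arange(n_t * n_x * n_y, dtype=float),  # made-up data
})

# Shuffle the rows so the effect of sorting the index is visible.
shuffled = df.sample(frac=1.0, random_state=0)

# Set the index in the order you want to process, then sort explicitly.
indexed = shuffled.set_index(["T", "X", "Y"]).sort_index()

# Recover the 3-D array; the axes follow the index order (T, X, Y).
cube = indexed["value"].values.reshape((n_t, n_x, n_y))
```

The reshape only gives a meaningful array if the index is sorted and every (T, X, Y) combination appears exactly once, which is worth checking on real data.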
One caveat is that if your X, Y, and T values are floats, you should
be careful when you set them as indices. Certainly you can't count on
testing for equality with another floating point number (eg: df[df.X == 0.1])
due to rounding error. More importantly, I would not count on set_index
to group nearly equal values (ie 0.0999... with 0.1), since index labels
are compared exactly, so it's something to check.
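A small sketch of the float caveat, assuming made-up coordinates and a hypothetical helper column "X_key". Rounding to 6 decimals is just one possible choice of tolerance, not a recommendation for every dataset.

```python
import numpy as np
import pandas as pd

# 0.1 + 0.2 is stored as 0.30000000000000004, not exactly 0.3.
x = np.array([0.1 + 0.2, 0.3, 0.5])
df = pd.DataFrame({"X": x, "value": [1.0, 2.0, 3.0]})

# Exact comparison misses the row that differs only by rounding error.
exact = df[df.X == 0.3]

# A tolerance-based comparison catches both rows.
close = df[np.isclose(df.X, 0.3)]

# One option before indexing: round the coordinates to a fixed number
# of decimals so that nearly equal values share the same label.
df["X_key"] = df.X.round(6)
indexed = df.set_index("X_key").sort_index()
```

Here `exact` has one row and `close` has two, and after rounding the index has only two distinct labels.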
Chris
--
Graduate Student
Helen Wills Neuroscience Institute
University of California - Berkeley
I would suggest using the Panel object, which is really just a 3-D
labeled array that is capable of having heterogeneous "slices" along
the first axis. I would be happy to get feedback about it if you find
that it does not suit your needs.
thanks,
Wes
Thanks Chris and Wes for your suggestions! So I implemented my 3D data
structure using a Panel. The problem arises when I use `MultiIndex` for
the `items` axis of the Panel structure. Most of the (multi)indexing
capabilities shown in the documentation* don't work when using a Panel
structure.
* http://pandas.sourceforge.net/indexing.html
Example:
import numpy as np
import pandas as pn

ind = pn.MultiIndex.from_tuples(
    [('a', 1), ('a', 2), ('b', 1), ('b', 2)], names=['first', 'second'])
wp = pn.Panel(np.random.random((4, 5, 5)), items=ind,
              major_axis=np.arange(5), minor_axis=np.arange(5))
In [10]: wp['a']
...
KeyError: 'no item named a'
In [11]: wp.ix['a']
...
KeyError: 'no item named a'
Why doesn't the Panel structure follow the same behavior as DataFrame
and Series for MultiIndex? Or perhaps I'm missing something?
Thanks so much,
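For comparison, the DataFrame and Series behavior referred to in the question can be sketched like this. It is a minimal illustration only: the data is random or made up, and it uses DataFrame and Series rather than Panel, since Panel is no longer available in recent pandas releases.

```python
import numpy as np
import pandas as pd

ind = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1), ("b", 2)], names=["first", "second"]
)

# DataFrame with hierarchical columns: selecting the outer label "a"
# returns the sub-frame whose columns are the inner labels 1 and 2.
df = pd.DataFrame(np.random.random((5, 4)), columns=ind)
sub = df["a"]

# Series with a hierarchical index: the same partial selection works.
ser = pd.Series(np.arange(4.0), index=ind)
ser_a = ser["a"]
```

In both cases the outer level is dropped from the result, which is the behavior the Panel example was expected to mirror.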
hi Fernando,
I believe you've misunderstood. Panel has been patched in git master
https://github.com/pydata/pandas/commit/f5e5b1427744724ab2e54faed2b4f973a22abf62
https://github.com/pydata/pandas/commit/764ce5e44f83ec2f9fa30895c6061b009e664429
so the example you gave works now (this will be released in pandas
0.7.2, upcoming):
In [2]: import pandas as pn
In [3]: paste
ind = pn.MultiIndex.from_tuples(
    [('a', 1), ('a', 2), ('b', 1), ('b', 2)], names=['first', 'second'])
wp = pn.Panel(np.random.random((4, 5, 5)), items=ind,
              major_axis=np.arange(5), minor_axis=np.arange(5))
## -- End pasted text --
In [4]: wp['a']
Out[4]:
<class 'pandas.core.panel.Panel'>
Dimensions: 2 (items) x 5 (major) x 5 (minor)
Items: 1 to 2
Major axis: 0 to 4
Minor axis: 0 to 4
However, the use of hierarchical indexing in Panel generally needs
more users and more bug reports: parts of it work and parts of it do
not. There are too many granular tasks involved for this to be a
single issue, which is why Adam closed the issue.
- Wes