Neo and large files


Niccolò Bonacchi

Nov 19, 2015, 8:45:45 AM
to neurale...@googlegroups.com
I have these files:
datafile001.nev  3.5 GB
datafile001.ns3  21.2 MB
datafile001.ns5  9.9 GB

.ns3 is an analog sync pulse, so 1 channel @ 2 kS/s
.ns5 is 32 channels @ 30 kS/s (8 tetrodes)
The session is ~1 h long.

When I read the files with neo it loads everything: it occupies a lot of RAM and is very slow (> 30 min), with the files being read multiple times in full. As a comparison, I can read the 9.9 GB file in Matlab in ~6 minutes using the openNSx.m function Blackrock provides, plus an extra 4 to 5 minutes for saving to HDF5. (Matlab's implementation of HDF5 is a different annoying topic, so I won't get into it here.) I would prefer not to use Matlab, as I suffer from proprietary-file-format phobia, especially when talking about data; simple and open wins every time.
Right now I would like to use Pandas to store all my data, as I have a bunch of analyses already coded and waiting for properly formatted input. I've been reading your paper, and it has a bunch of interesting examples that make me want to start using the neo object model once I wrap my head around the details of your API. The loading time, however, is a problem.

This is what I'm doing:
data = neo.io.BlackrockIO(test_data)
d = data.read_segment(n_start=None, n_stop=None, load_waveforms=False,
                      nsx_num=None, lazy=False, cascade=True)


If I use nsx_num=5 it loads only the .ns5 file (still much slower than Matlab), and occasionally (depending on the file) I get this error:
Traceback (most recent call last):
  File "/media/nico/Dropbox/ego_allo/code/python/neurons/neoTest.py", line 165, in <module>
    nsx_num=5, lazy=False, cascade=True)
  File "/usr/local/lib/python2.7/dist-packages/neo/io/blackrockio.py", line 158, in read_segment
    self.read_nsx(filename_nsx, seg, lazy, cascade)
  File "/usr/local/lib/python2.7/dist-packages/neo/io/blackrockio.py", line 361, in read_nsx
    seg.rec_datetime = get_window_datetime(nsx_header['window_datetime'])
  File "/usr/local/lib/python2.7/dist-packages/neo/io/blackrockio.py", line 441, in get_window_datetime
    assert len(buf) == np.dtype(dt).itemsize, 'This buffer do not have the good size for window date'
AssertionError: This buffer do not have the good size for window date

If I play around with n_start and n_stop, they are ignored and all the data is loaded anyway.


So now for my questions:
What am I doing wrong?
Is there a way to load one channel at a time? And to load part of the raw signal, from all channels or from specific ones?
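In the meantime I have been experimenting with a generic workaround: memory-mapping the raw samples with numpy and slicing out only what I need. The header offset and the interleaved (samples x channels) layout below are guesses on my part, not the documented .ns5 format, and a small synthetic file stands in for the real 9.9 GB one:

```python
import os
import tempfile

import numpy as np

N_CHANNELS = 32    # hypothetical: matches the 32-channel .ns5 above
HEADER_BYTES = 0   # hypothetical: a real .ns5 has a header whose size must come from the spec

# Small synthetic file standing in for the big .ns5 (interleaved int16 samples)
n_samples = 1000
demo = np.arange(n_samples * N_CHANNELS, dtype=np.int16).reshape(n_samples, N_CHANNELS)
path = os.path.join(tempfile.mkdtemp(), "demo.bin")
demo.tofile(path)

# Memory-map the file: nothing is read from disk until a slice is touched
raw = np.memmap(path, dtype=np.int16, mode="r",
                offset=HEADER_BYTES).reshape(-1, N_CHANNELS)

channel_7 = np.array(raw[:, 7])       # one channel, all samples
window = np.array(raw[100:200, :])    # all channels, one 100-sample window
print(channel_7.shape, window.shape)  # (1000,) (100, 32)
```

This keeps RAM usage proportional to the slice actually copied, not to the file size, but it obviously only works if the byte layout is really that simple.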

Thanks for reading this far,

--
N

Samuel Garcia

Nov 19, 2015, 9:32:20 AM
to neurale...@googlegroups.com
Hi,
neo.io reads everything into memory; it is a major problem in neo.
We made this choice at the beginning to simplify the IO API. For long files it is really annoying.
Nobody has taken the time to change the API, because recoding all the IOs is an enormous job.
Andrew is now working on changing neo.core.
This won't change the fact that everything is read into memory in neo.io.
It is work that still needs to be done.

Concerning the BlackrockIO speed: I coded this third version to simplify the code, but never used it.
There are certainly places to improve performance.
I won't have time to do it; if you have knowledge of Python profiling, feel free to modify whatever you want.
I don't see any reason why a pure Python implementation should be slower than a pure Matlab one.
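As a starting point, the standard library's cProfile sorted by cumulative time will show where the call spends its time. The slow_io function below is only a stand-in for the real neo.io.BlackrockIO(...).read_segment(...) call:

```python
import cProfile
import io
import pstats

def slow_io():
    # stand-in for: neo.io.BlackrockIO(test_data).read_segment(...)
    return sum(i * i for i in range(100000))

pr = cProfile.Profile()
pr.enable()
slow_io()
pr.disable()

# Print the five most expensive functions by cumulative time
s = io.StringIO()
pstats.Stats(pr, stream=s).sort_stats("cumulative").print_stats(5)
print(s.getvalue())
```

If the profile shows many small file reads or repeated struct unpacking per sample, batching them into large numpy reads is usually the fix.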

Concerning the t_start/t_stop in read_segment(): it is fake, I am sorry.
It is a sad copy/paste from an old BlackrockIO and was never implemented.
I should remove it.


In the future branch of neo: https://github.com/NeuralEnsemble/python-neo/pull/210
Andrew has also changed the BlackrockIO. It reads all channels at once; maybe this will improve performance.
Try it and tell Andrew.
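Note that even when everything is read into memory, pulling out one channel afterwards is cheap, because the samples end up in a numpy array underneath. A sketch (FakeSegment here only stands in for the Segment neo returns; the exact shape and attribute layout of the analog signals depends on the neo version):

```python
import numpy as np

class FakeSegment:
    """Stand-in for neo.core.Segment holding one 2-D (samples x channels) signal."""
    def __init__(self, data):
        self.analogsignals = [data]

seg = FakeSegment(np.zeros((30000, 32)))   # 1 s @ 30 kS/s, 32 channels

sig = np.asarray(seg.analogsignals[0])
one_channel = sig[:, 7]    # a numpy view: no copy is made
tetrode_1 = sig[:, 0:4]    # first tetrode (channels 0-3)
print(one_channel.shape, tetrode_1.shape)  # (30000,) (30000, 4)
```

So the memory cost is paid once at load time; per-channel analysis afterwards does not duplicate the data.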


Samuel
