We are two postdocs in the Axel Laboratory at Columbia University setting up a spike sorting pipeline using the Klusta* Suite. So far our pilot experiments have been mostly successful but we have encountered problems with SpikeDetekt/KlustaKwik when we have tried to scale up. First a bit about our specs:
- Blackrock NSP acquisition system
- Neuronexus 32-channel probe (A1x32-Poly3-5mm-25s-177-A32)
(as these are pilot experiments, we are purchasing "B-stock" probes with one defective site per probe)
- Acquiring at 30,000 Hz
- 63-minute-long data file (~7GB), 16-bit signed int, 32 continuous channels (probe) + 1 continuous analog input
(before feeding the data to SpikeDetekt we strip the header and the 33rd channel, leaving only the raw continuous 32 probe channels)
- Win64 / Python 2.7.5 (WinPython distribution) / NumPy 1.7.1 / SciPy 0.12.0 / PyTables 3.1.0 / h5py 2.2.1 / Pandas 0.12.0 / Matplotlib 1.3.0 / PyOpenGL 3.0.2 / PyQt4
- SpikeDetekt downloaded from GitHub on February 26, 2014
- KlustaKwik downloaded that same day, last modified August 6, 2013
- KlustaViewa 0.1.0 downloaded April 18, 2014
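For reference, the channel-stripping step described in the list above could be sketched roughly as follows. This is a minimal sketch, not necessarily the exact procedure used: it assumes the header has already been removed, that the samples are interleaved (all 33 channel values for one time point stored together), and that the analog input is the last channel. The function name and its parameters are hypothetical.

```python
import numpy as np

def strip_extra_channels(src_path, dst_path, n_total=33, n_keep=32,
                         chunk_samples=30000 * 60):
    """Copy the first n_keep channels of a headerless int16 file to dst_path.

    Assumes sample-interleaved data: n_total values per time point.
    Returns the number of time points copied.
    """
    # Memory-map the source so the ~7GB file is never loaded all at once.
    raw = np.memmap(src_path, dtype=np.int16, mode="r")
    if raw.size % n_total != 0:
        raise ValueError("file size is not a multiple of n_total channels")
    data = raw.reshape(-1, n_total)  # rows = time points, columns = channels
    with open(dst_path, "wb") as out:
        # Work through the file one block of time points at a time.
        for start in range(0, data.shape[0], chunk_samples):
            block = data[start:start + chunk_samples, :n_keep]
            # tobytes() writes the block in row-major (interleaved) order.
            # On NumPy older than 1.9, use block.tostring() instead.
            out.write(block.tobytes())
    return data.shape[0]
```

With the default block size of one minute at 30,000 Hz, each block is about 4MB, so memory use stays small regardless of the file length.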
For shorter files (e.g. ~2 min of continuous data acquisition), SpikeDetekt only generated warnings about unalignable spikes, as discussed in another thread. This was solved by setting the USE_WEIGHTED_MEAN_PEAK_SAMPLE parameter to False. On these smaller files KlustaKwik did a reasonable job with the default parameters, although it tended to undercluster (i.e. 2+ clearly differentiable units often showed up in the same cluster).
The issues started arising when we fed the pipeline an actual experiment file (~60 min long, 7GB). SpikeDetekt runs normally, detecting 558488 spikes and generating a 304KB .FET file. However, when we run KlustaKwik from the new directory, the application crashes after listing the parameters (last console output: "MaxCluster = 300"). KlustaKwik itself does not report an error, but a Windows error dialog opens saying: "klustakwik.exe has stopped working. A problem caused the program to stop working correctly. Windows will close the program and notify you if a solution is available." We checked that KlustaKwik still runs on smaller datasets, including a 2 min subset of this same dataset.
Is the issue that SpikeDetekt and/or KlustaKwik cannot handle files of the size we're giving them? (Should we chunk our raw file into more manageable sizes, and if so, how big should they be?) Any advice about how to get past this problem would be very much appreciated.
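To make the chunking idea concrete, here is a minimal sketch of how the stripped 32-channel file could be split by time. The 10-minute chunk length is an arbitrary placeholder, not a recommended value, and the function names and the ".partNN" output naming are hypothetical. It assumes headerless, sample-interleaved int16 data.

```python
import numpy as np

def chunk_bounds(n_samples, sample_rate=30000, chunk_minutes=10):
    """Return (start, stop) time-point ranges covering n_samples."""
    step = int(sample_rate * 60 * chunk_minutes)
    return [(s, min(s + step, n_samples)) for s in range(0, n_samples, step)]

def split_raw_file(src_path, n_channels=32, **kwargs):
    """Write each time chunk of src_path to its own file; return the paths."""
    data = np.memmap(src_path, dtype=np.int16, mode="r").reshape(-1, n_channels)
    paths = []
    for i, (start, stop) in enumerate(chunk_bounds(data.shape[0], **kwargs)):
        path = "%s.part%02d" % (src_path, i)  # hypothetical naming scheme
        data[start:stop].tofile(path)         # rows stay interleaved on disk
        paths.append(path)
    return paths
```

For a 63-minute recording at 30,000 Hz this gives six full 10-minute chunks plus a 3-minute remainder. Two caveats with this approach: a spike that straddles a chunk boundary would be cut in two, and clusters found in separate chunks would still need to be matched up afterwards.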
And an unrelated question: is it possible to make full use of our computer's processing power (2 quad-core Xeon 3.6 GHz)? At the moment KlustaKwik only employs 12-15% of our CPU. Is there any way to have it use all of it?
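As a quick sanity check on that figure, the arithmetic below shows what share of the machine a single fully busy core represents. It assumes the operating system sees 8 logical cores (2 sockets x 4 cores, no hyperthreading); the function name is hypothetical.

```python
def single_core_share(n_sockets=2, cores_per_socket=4):
    """Percentage of total CPU that one fully busy core accounts for."""
    return 100.0 / (n_sockets * cores_per_socket)

print(single_core_share())  # 12.5
```

The observed 12-15% is therefore about what one would expect if KlustaKwik were running as a single thread that saturates one core.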
Many thanks,
Carl Schoonover and Andrew Fink
Axel Laboratory
Columbia University
Thank you very much for your post. We are entirely shameless and happy to go the "embarrassingly parallel" route. And we'd be very grateful for any advice concerning optimizing the -Penalty parameters for our data.
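To be sure we understand the "embarrassingly parallel" idea, here is a minimal sketch of launching several independent jobs at once, one per core. It is only an illustration: how the work would be divided between jobs is left open, the command lines below are placeholders rather than real KlustaKwik invocations, and the function name is hypothetical.

```python
import subprocess
import sys
from multiprocessing.pool import ThreadPool

def run_all(commands, n_workers=8):
    """Run independent command lines concurrently.

    Each command is a list of arguments, as accepted by subprocess.call.
    Returns the exit codes in the same order as the commands.
    """
    # Threads are enough here: each one just waits on an external process.
    pool = ThreadPool(n_workers)
    try:
        return pool.map(subprocess.call, commands)
    finally:
        pool.close()
        pool.join()

if __name__ == "__main__":
    # Placeholder jobs that start a Python interpreter and exit immediately;
    # a real run would list one clustering command per job instead.
    jobs = [[sys.executable, "-c", "pass"] for _ in range(4)]
    print(run_all(jobs))
```

Setting n_workers to the number of cores (8 on our machine) would keep every core busy as long as there are at least that many jobs.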
The .FET, .MASK, and .LOG files can be downloaded at this URL: http://we.tl/CnPuQbsH0e The .KLG file is a 0KB file; opening it in Notepad shows that it contains no text (this is consistent with KlustaKwik crashing at the very beginning). I'll email it to you directly, since WeTransfer won't take it for some reason.
Many thanks,
Carl and Andrew