Continual improvements in data collection and processing have had a huge impact on brain research, producing data sets that are often large and complicated. By emphasizing a few fundamental principles and a handful of ubiquitous techniques, Analysis of Neural Data provides a unified treatment of analytical methods that have become essential for contemporary researchers. Throughout the book, ideas are illustrated with more than 100 examples drawn from the literature, ranging from electrophysiology to neuroimaging to behavior. By demonstrating the commonality among various statistical approaches, the authors provide the crucial tools for gaining knowledge from diverse types of data. Aimed at experimentalists with only high-school-level mathematics, as well as computationally oriented neuroscientists who have limited familiarity with statistics, Analysis of Neural Data serves as both a self-contained introduction and a reference work.
"This is an outstanding book that fills a real need. Assuming no background in statistics, it covers the data analysis methods neuroscientists need to know, from standard material like hypothesis tests, to specialized methods that have recently found use in our field. It has the detail and insight needed for those developing their own statistical methods. And for the working neurobiologist it has plenty of practical tricks, tips, and examples, coming straight from the experts. This book is a must for anyone serious about quantitative analysis in neuroscience." (Kenneth D. Harris, Professor of Quantitative Neuroscience, University College London)
"Analysis of Neural Data is a thorough, authoritative textbook on the fastest growing statistical field. All relevant topics are covered in depth with examples from the literature and thoughtful comments. Particularly welcome is the discussion of multivariate statistics, time series and Bayesian methods, topics frequently encountered in neuroscience research but infrequently discussed in standard statistics textbooks. A highly readable, useful and commendable textbook!" (Apostolos P. Georgopoulos, Regents Professor of Neuroscience, University of Minnesota)
"Written by eminent statisticians, this book covers a range of topics from basic mathematics to state-of-the-art statistical analyses of neural data. Researchers conducting experiments will learn the principles of data analysis and will begin analyzing data using the methods provided. Theoreticians will be introduced to more than 100 intriguing experiments that will teach them to form persuasive interpretations. Analysis of Neural Data should become a standard reference for neuroscience research." (Shigeru Shinomoto, Department of Physics, Kyoto University)
Robert E. (Rob) Kass is Professor in the Department of Statistics, the Machine Learning Department, and the Center for the Neural Basis of Cognition at Carnegie Mellon University. Since 2001 his research has been devoted to statistical methods in neuroscience. Together with Emery Brown he has organized the highly successful series of international meetings, Statistical Analysis of Neural Data (SAND).
Uri T. Eden is Associate Professor in the Department of Mathematics and Statistics at Boston University. He received his Ph.D. in the Harvard/MIT Medical Engineering and Medical Physics program in the Health Sciences and Technology Department. His research focuses on developing mathematical and statistical methods to analyze neural spiking activity, using methods related to model identification, statistical inference, signal processing, and stochastic estimation and control.
Emery N. Brown is Edward Hood Taplin Professor of Medical Engineering, Professor of Computational Neuroscience, and Associate Director of the Institute of Medical Engineering and Science at MIT; he is also the Warren M. Zapol Professor of Anaesthesia at Harvard Medical School and Massachusetts General Hospital. He is both a statistician and an anesthesiologist. Since 1998 his research has focused on neural information processing, and his experimental work characterizes the way anesthetic drugs act in the brain to create the state of general anesthesia.
Copyright: 2019 Livezey et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Cortical surface electrical potentials (CSEPs) were recorded directly from the cortical surface with a high-density (4 mm pitch), 256-channel ECoG array and a multi-channel amplifier optically connected to a digital signal processor (Tucker-Davis Technologies [TDT], Alachua, FL). The time series from each channel was visually and quantitatively inspected, and channels with artifacts or excessive noise (typically 60 Hz line noise) were excluded from all subsequent analysis. The raw CSEP signals from the remaining channels were downsampled to 400 Hz in the frequency domain, common-average referenced, and used for spectro-temporal analysis. For each usable channel, the time-varying analytic amplitude was extracted from 40 frequency-domain bandpass filters (Gaussian filters with logarithmically increasing center frequencies and semi-logarithmically increasing bandwidths, equivalent to a frequency-domain Morlet wavelet). The amplitude for each filter band was z-scored to a baseline window, defined as a period of time in which the subject was silent, the room was silent, and the subject was resting. Finally, the amplitudes were downsampled to 200 Hz.
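The filtering and amplitude-extraction steps above can be sketched as follows. The center frequency, bandwidth, and baseline window used here are illustrative stand-ins, not the paper's 40-filter bank:

```python
import numpy as np
from scipy.signal import hilbert

def gaussian_band_amplitude(x, fs, cf, bw):
    """Analytic amplitude of x after a frequency-domain Gaussian bandpass
    centered at cf Hz with standard deviation bw Hz."""
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    gain = np.exp(-0.5 * ((freqs - cf) / bw) ** 2)   # Gaussian filter in the frequency domain
    filtered = np.fft.irfft(np.fft.rfft(x) * gain, n=len(x))
    return np.abs(hilbert(filtered))                 # time-varying analytic amplitude

fs = 400.0                                           # post-downsampling rate from the text
t = np.arange(0, 2.0, 1.0 / fs)
x = np.sin(2 * np.pi * 80.0 * t)                     # pure 80 Hz test signal
amp = gaussian_band_amplitude(x, fs, cf=80.0, bw=10.0)

# z-score the band amplitude against a quiet baseline window (here: the first 0.25 s)
base = amp[: int(0.25 * fs)]
z = (amp - base.mean()) / (base.std() + 1e-12)
```

For a pure sinusoid at the filter's center frequency, the recovered amplitude is approximately constant at the signal's amplitude, which makes this a convenient sanity check for the filter bank.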
For each of the bands theta [4-7 Hz], alpha [8-14 Hz], beta [15-29 Hz], gamma [30-59 Hz], and high gamma (Hγ) [70-150 Hz], the individual filters among the 40 Gaussian bandpassed amplitudes were grouped and averaged according to their center frequencies. The lower frequency bands are all highly oversampled at the Hγ sampling rate of 200 Hz, and given limited data, deep networks are tasked with deciding whether a change across input features is relevant or irrelevant for prediction; although the oversampled high-frequency content of the lower bands is irrelevant signal, it will not have exactly zero amplitude due to numerical noise. To make comparisons of CV decoding accuracy across frequency bands more interpretable, control for potential overfitting from training on oversampled signals, and reduce the computational complexity of training deep networks with concatenated input features, we therefore downsampled each of the lower frequency bands in time so that the center frequency-to-sampling rate ratio was constant (ratio = 112.5/200) for each band.
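Fixing the center frequency-to-sampling rate ratio can be sketched as below; the theta center frequency of 5.5 Hz is an illustrative band midpoint, not necessarily the value used in the paper:

```python
import numpy as np
from scipy.signal import resample

RATIO = 112.5 / 200.0          # Hγ center frequency over its 200 Hz sampling rate

def resample_band(amplitude, fs_in, center_freq):
    """FFT-resample one band's amplitude so that center_freq / fs_out equals RATIO."""
    fs_out = center_freq / RATIO
    n_out = int(round(len(amplitude) * fs_out / fs_in))
    return resample(amplitude, n_out), fs_out

fs_in = 200.0
band = np.random.randn(2000)   # 10 s of one band's amplitude sampled at 200 Hz
theta, fs_theta = resample_band(band, fs_in, center_freq=5.5)
```

A 5.5 Hz band ends up sampled at about 9.8 Hz, so every band carries the same number of samples per cycle of its center frequency as the Hγ band does.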
For the fully-connected deep networks used here, the CSEP features were rasterized into a large feature vector per trial in a window around CV production. These feature vectors are the input to the first layer of the fully connected network. The feature dimensionality is the number of electrodes times 258 time points, which corresponds to 22,188 features for Subject 1, 20,124 for Subject 2, 21,414 for Subject 3, and 25,542 for Subject 4. The final layer nonlinearity is chosen to be the softmax function:

softmax(h)_i = exp(h_i) / Σ_j exp(h_j),    (3)

where h_i is the ith element of the hidden representation. This nonlinearity transforms a vector of real numbers into a vector which parameterizes a one-draw multinomial distribution. It is the negative log-likelihood of this distribution over the training data which is minimized during training.
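The rasterization and the final-layer softmax of Eq 3 can be sketched as follows; the 86-electrode count is inferred from Subject 1's 22,188 features divided by the 258 time points, and the final-layer activations are toy values:

```python
import numpy as np

def softmax(h):
    """Eq 3: softmax(h)_i = exp(h_i) / sum_j exp(h_j). Subtracting max(h)
    is a standard numerical-stability trick that leaves the result unchanged."""
    e = np.exp(h - np.max(h))
    return e / e.sum()

# Rasterize one trial's (electrodes x time) window into a single feature vector.
n_electrodes, n_time = 86, 258                  # 86 x 258 = 22,188 (Subject 1)
trial = np.random.randn(n_electrodes, n_time)
features = trial.reshape(-1)                    # input to the first fully connected layer

p = softmax(np.array([2.0, 1.0, 0.1]))          # toy final-layer activations
```

The output `p` is non-negative and sums to one, so it can be read directly as the class probabilities of a one-draw multinomial distribution.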
As a baseline model, we used multinomial logistic regression. Logistic regression required no additional dimensionality reduction and had the highest classification accuracy on the Hγ features compared to other linear classifiers, i.e., linear support vector machines and linear discriminant analysis (10.4 ± 6.7% and 16.0 ± 10.0%, respectively, compared to 28.0 ± 12.9% for logistic regression). Additionally, the conditional class distribution used in logistic regression (multinomial) is the same as the one used for deep networks, which facilitated comparison of confusions.
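A minimal multinomial logistic regression of the kind used as the baseline can be written in a few lines of NumPy; the three-class toy data, learning rate, and iteration count below are illustrative, not the Hγ features or the paper's training setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the features: 3 well-separated classes in 4 dimensions.
n_per, d, k = 50, 4, 3
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(n_per, d)) for c in range(k)])
y = np.repeat(np.arange(k), n_per)
Y = np.eye(k)[y]                                  # one-hot targets

W, b = np.zeros((d, k)), np.zeros(k)
for _ in range(500):                              # plain gradient descent
    logits = X @ W + b
    P = np.exp(logits - logits.max(axis=1, keepdims=True))
    P /= P.sum(axis=1, keepdims=True)             # multinomial class probabilities (softmax)
    G = (P - Y) / len(X)                          # gradient of the negative log-likelihood
    W -= 0.2 * (X.T @ G)
    b -= 0.2 * G.sum(axis=0)

acc = (P.argmax(axis=1) == y).mean()              # training accuracy
```

Because the class-conditional model is the same softmax/multinomial form as the deep networks' output layer, the resulting probability vectors and confusion matrices are directly comparable between the two.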
Deep networks have a number of hyperparameters that govern network architecture and optimization such as the number of layers, the layer nonlinearity, and the optimization parameters. The full list of hyperparameters and their ranges is listed in S1 Table.
For all results that are based on training networks, 400 models were trained with hyperparameters selected by random search [47]. For each set of hyperparameters, 10 copies of the network were trained on the respective 10 folds as described in Deep networks, for a total of 4000 networks per subject per task. For each task, optimal hyperparameters were selected by choosing the model with the best mean validation classification accuracy across the 10 folds. Since our datasets were relatively small for training deep networks, we regularized the models in three ways: dropout, weight decay, and filter norm-clipping in all layers of the model. The dropout rate, activation-rescaling factor, max filter norm, and weight decay coefficient were all optimized hyperparameters. The optimal values for the hyperparameters were selected independently for each family of models in the paper, i.e., independently for each subject, model type (logistic or deep), input data type (frequency bands), and amount of training data (data scaling experiment). The search space for hyperparameters was shared across all models; however, for the logistic regression models, the number of hidden layers was set to zero and no other hidden-layer hyperparameters were used. The optimal hyperparameters for each model, along with links to trained model files and Docker images for running preprocessing and deep network training, are available in S1 Appendix.
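The random search and 10-fold selection procedure can be sketched as follows. The hyperparameter names and ranges are illustrative placeholders (the real search space is in S1 Table), and the training call is stubbed out with a random score so that the selection logic is runnable:

```python
import random
import statistics

random.seed(0)

def sample_hyperparameters():
    """Draw one configuration (illustrative names and ranges, not S1 Table)."""
    return {
        "n_layers": random.randint(1, 4),
        "dropout_rate": random.uniform(0.0, 0.7),
        "weight_decay": 10 ** random.uniform(-6.0, -2.0),
        "max_filter_norm": random.uniform(0.5, 4.0),
    }

def train_and_validate(hp, fold):
    """Stub for training one network copy on one fold; in the paper this
    would return that fold's validation classification accuracy."""
    return random.random()

configs = [sample_hyperparameters() for _ in range(400)]          # random search
scores = [statistics.mean(train_and_validate(hp, f) for f in range(10))
          for hp in configs]                                      # mean over 10 folds
best = configs[max(range(len(configs)), key=scores.__getitem__)]
```

With 10 folds per configuration, this reproduces the paper's count of 400 × 10 = 4000 trained networks per subject per task.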
Each subject produced a subset of the 57 CVs, and the classification methods were trained to predict that subset. Each CV can also be classified as containing 1 of 19 consonants or 1 of 3 vowels. Similarly, a subset of the consonants can be grouped into 1 of 3 vocal tract constriction location categories or 1 of 3 vocal tract constriction degree categories. The CV predictions were then tabulated within these restricted labelings in order to calculate consonant, vowel, constriction location, and constriction degree accuracies.
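Tabulating full-CV predictions into the coarser labelings can be sketched as follows; the syllable labels and grouping functions are illustrative (the paper's actual consonant/vowel inventory and articulatory categories are larger):

```python
def restricted_accuracy(true_cvs, pred_cvs, grouping):
    """Fraction of trials whose predicted group matches the true group."""
    return sum(grouping(t) == grouping(p)
               for t, p in zip(true_cvs, pred_cvs)) / len(true_cvs)

consonant = lambda cv: cv[:-1]        # everything before the final vowel character
vowel = lambda cv: cv[-1]

true_cvs = ["ba", "ba", "gi", "du"]
pred_cvs = ["bi", "ba", "gi", "gu"]   # "bi": right consonant, wrong vowel

cons_acc = restricted_accuracy(true_cvs, pred_cvs, consonant)    # 3 of 4 correct
vowel_acc = restricted_accuracy(true_cvs, pred_cvs, vowel)       # 3 of 4 correct
```

A prediction that misses the exact CV can still be credited under a restricted labeling, which is why consonant, vowel, and articulatory-feature accuracies are generally higher than full-CV accuracy.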