Probabilistic modeling of neural data for analysis and synthesis of speech
Matthews, Brett Alexander
This research consists of probabilistic modeling of speech audio signals and deep-brain neurological signals in brain-computer interfaces. A significant portion of this research is a collaborative effort with Neural Signals Inc., Duluth, GA, and Boston University to develop an intracortical neural prosthetic system for speech restoration in a human subject living with Locked-In Syndrome, i.e., he is paralyzed and unable to speak. The work is carried out in three major phases. In the first phase, we use kernel-based classifiers to detect evidence of articulation gestures and phonological attributes in speech audio signals, and we demonstrate that this articulatory information can be used to decode speech content. In the second phase, we use neurological signals collected from a human subject with Locked-In Syndrome to predict intended speech content. The neural data were collected with a microwire electrode surgically implanted in the speech motor cortex of the subject's brain, with the implant location chosen to capture extracellular electric potentials related to speech motor activity. The data include extracellular traces and, for neural clusters in the vicinity of the electrode identified by an expert, firing occurrence times. We compute continuous firing-rate estimates for the ensemble of neural clusters using several rate-estimation methods and apply statistical classifiers to the rate estimates to predict intended speech content. We use Gaussian mixture models to classify short frames of data into five vowel classes and to discriminate intended speech activity from non-speech. We then perform a series of data-collection experiments with the subject, designed to test explicitly for several speech articulation gestures, and decode the data offline.
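The frame-wise GMM classification step described above can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: the feature dimensionality, frame length, mixture orders, and the use of scikit-learn are all assumptions, and the firing-rate "data" here are synthetic.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Hypothetical setup: each frame is a vector of firing-rate estimates
# (spikes/s) for 40 neural clusters; each of 5 vowel classes is assumed
# to have its own characteristic rate profile.
n_classes, n_units, frames_per_class = 5, 40, 200
class_means = rng.uniform(5.0, 50.0, size=(n_classes, n_units))

def make_frames(mean_rates, n_frames):
    """Draw noisy synthetic firing-rate frames around a class rate profile."""
    return mean_rates + rng.normal(0.0, 2.0, size=(n_frames, mean_rates.size))

train = [make_frames(class_means[c], frames_per_class) for c in range(n_classes)]
test = [make_frames(class_means[c], 50) for c in range(n_classes)]

# Fit one diagonal-covariance GMM per vowel class on its training frames.
models = [GaussianMixture(n_components=2, covariance_type="diag",
                          random_state=0).fit(X) for X in train]

def classify(frame):
    """Assign a frame to the class whose GMM gives the highest log-likelihood."""
    scores = [m.score_samples(frame[None, :])[0] for m in models]
    return int(np.argmax(scores))

correct = sum(classify(f) == c for c, X in enumerate(test) for f in X)
accuracy = correct / (n_classes * 50)
```

The same per-class likelihood comparison extends directly to speech/non-speech discrimination by training one GMM on speech-active frames and one on rest frames.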
In the third phase of the research, we develop an original probabilistic method for spike sorting in intracortical brain-computer interfaces, i.e., identifying and distinguishing action potential waveforms in extracellular traces. Our method uses both action potential waveforms and their occurrence times to cluster the data, and we apply it to semi-artificial data and to partially labeled real data. We also classify neural spike waveforms, modeled with single multivariate Gaussians, using the minimum classification error method for parameter estimation. Finally, we apply our joint waveform-and-occurrence-time spike-sorting method to neurological data in the context of a neural prosthesis for speech.
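The core idea of clustering on waveforms and occurrence times jointly can be sketched as below. This is a simplified stand-in for the dissertation's probabilistic method, not a reproduction of it: the two synthetic units, the hand-picked waveform features, the use of log inter-spike interval as the timing feature, and the scikit-learn mixture model are all assumptions made for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)

def make_unit(n_spikes, amp, width, isi_scale):
    """Synthetic unit: a characteristic spike template plus noise,
    with exponentially distributed inter-spike intervals."""
    t = np.linspace(-1.0, 1.0, 32)
    template = amp * np.exp(-(t / width) ** 2) \
               - 0.4 * amp * np.exp(-((t - 0.3) / 0.5) ** 2)
    waves = template + rng.normal(0.0, 0.05, size=(n_spikes, 32))
    times = np.cumsum(rng.exponential(isi_scale, size=n_spikes))
    return waves, times

# Two units with different waveform shapes and different firing statistics.
w1, t1 = make_unit(300, amp=1.0, width=0.15, isi_scale=0.05)
w2, t2 = make_unit(300, amp=0.6, width=0.40, isi_scale=0.50)

waves = np.vstack([w1, w2])
times = np.concatenate([t1, t2])
labels = np.array([0] * 300 + [1] * 300)

# Joint feature vector: crude waveform-shape features plus a timing
# feature (log interval to the preceding detected spike of any unit).
order = np.argsort(times)
isi = np.empty_like(times)
isi[order] = np.diff(times[order], prepend=times[order][0] - 1e-3)
features = np.column_stack([waves.min(axis=1), waves.max(axis=1),
                            waves @ np.hanning(32), np.log(isi + 1e-6)])

gmm = GaussianMixture(n_components=2, covariance_type="full",
                      random_state=0).fit(features)
pred = gmm.predict(features)

# Cluster indices are arbitrary, so score agreement up to relabeling.
agree = max(np.mean(pred == labels), np.mean(pred != labels))
```

In this toy example the waveform features alone nearly separate the units; the appeal of a joint model is that timing statistics can break ties when waveforms overlap, which is the situation the dissertation's method targets.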