A framework for exploiting modulation spectral features in music data mining and other applications
Sephus, Nashlie H.
When a signal is decomposed into frequency bands, demodulated into modulator and carrier pairs, and portrayed in a carrier-frequency versus modulator-frequency domain, significant information about the signal may be observed automatically. We refer to this domain as the modulation spectral domain. The modulation spectrum is a windowed Fourier transform across time that produces an acoustic frequency versus modulation frequency representation of a signal. Previously, frameworks incorporating the discrete short-time modulation transform (DSTMT) and the modulation spectrum have been designed mostly for filtering speech signals. The modulation spectral domain is rarely, if ever, discussed in typical signal processing courses today, and we believe its current associated tools and applications are somewhat limited. We seek to revisit this domain to uncover more intuition, develop new concepts that extend its capabilities, and increase its applications, especially in the area of music data mining. Interest has recently arisen in using modulation spectral features, which are features in the modulation spectral domain, for music data mining. The field of music data mining, also known as music information retrieval (MIR), has been developing rapidly over the past decade or so. One reason for this development is the aim to build frameworks that leverage the particular characteristics of music signals instead of simply copying methods previously applied to its speech-centered predecessors, such as speech recognition, speech synthesis, and speaker identification. This research seeks to broaden the perspective and use of an existing modulation filterbank framework by exploiting modulation features well suited to music signals. The objective of this thesis is to develop a framework for extracting modulation spectral features from music and other signals.
The purpose of extracting features from these signals is to perform data mining tasks, such as unsupervised source identification, unsupervised source separation, and audio synthesis. More specifically, this research emphasizes the following: the usefulness of the DSTMT and the modulation spectrum for music data mining tasks; a new approach to unsupervised source identification using modulation spectral features; a new approach to unsupervised source separation; a newly introduced analysis of FM features in an AM-dominated modulation spectrum; and other applications.
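To make the acoustic frequency versus modulation frequency representation described above concrete, the following is a minimal sketch in Python with NumPy. It is not the DSTMT or the framework developed in the thesis. It assumes a simplified two-stage approach: a short-time Fourier transform supplies a magnitude envelope for each acoustic frequency band, and a second windowed Fourier transform across time is applied to each envelope. The function name `modulation_spectrum`, the parameters `frame_len` and `hop`, and the synthetic test tone are all illustrative assumptions.

```python
import numpy as np

def modulation_spectrum(x, fs, frame_len=256, hop=64):
    """Illustrative magnitude modulation spectrum (an assumed simplification).

    Returns (spec, acoustic_freqs, mod_freqs), where spec has shape
    (acoustic frequency bins, modulation frequency bins).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )

    # Stage 1: decompose into frequency bands and keep each band's
    # magnitude envelope over time (one row per acoustic frequency bin).
    envelopes = np.abs(np.fft.rfft(frames, axis=1)).T

    # Remove each band's mean so the 0 Hz modulation term does not dominate.
    envelopes = envelopes - envelopes.mean(axis=1, keepdims=True)

    # Stage 2: windowed Fourier transform across time for every band.
    spec = np.abs(np.fft.rfft(envelopes * np.hanning(n_frames), axis=1))

    acoustic_freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    mod_freqs = np.fft.rfftfreq(n_frames, d=hop / fs)  # envelope rate is fs / hop
    return spec, acoustic_freqs, mod_freqs


# Synthetic example: a 1000 Hz carrier amplitude-modulated at 8 Hz.
fs = 8000
t = np.arange(2 * fs) / fs
x = (1.0 + 0.8 * np.cos(2 * np.pi * 8 * t)) * np.cos(2 * np.pi * 1000 * t)

spec, acoustic_freqs, mod_freqs = modulation_spectrum(x, fs)
k, m = np.unravel_index(np.argmax(spec), spec.shape)
peak_acoustic = acoustic_freqs[k]
peak_mod = mod_freqs[m]
print(f"peak near {peak_acoustic:.1f} Hz acoustic, {peak_mod:.1f} Hz modulation")
```

For this test tone the largest entry of `spec` should fall close to the carrier on the acoustic frequency axis and close to the modulator on the modulation frequency axis, which is the kind of structure the modulation spectral domain exposes. A coherent demodulation into explicit modulator and carrier pairs, as the abstract describes, would replace the simple magnitude envelope used here.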