Acoustic Models for the Analysis and Synthesis of the Singing Voice
Lee, Matthew E.
MetadataShow full item record
Throughout our history, the singing voice has been a fundamental tool for musical expression. While analysis and digital synthesis techniques have been developed for normal speech, few models and techniques have been focused on the singing voice. The central theme of this research is the development of models aimed at the characterization and synthesis of the singing voice. First, a spectral model is presented in which asymmetric generalized Gaussian functions are used to represent the formant structure of a singing voice in a flexible manner. Efficient methods for searching the parameter space are investigated and challenges associated with smooth parameter trajectories are discussed. Next a model for glottal characterization is introduced by first presenting an analysis of the relationship between measurable spectral qualities of the glottal waveform and perceptually relevant time-domain parameters. A mathematical derivation of this relationship is presented and is extended as a method for parameter estimation. These concepts are then used to outline a procedure for modifying glottal textures and qualities in the frequency domain. By combining these models with the Analysis-by-Synthesis/Overlap-Add sinusoidal model, the spectral and glottal models are shown to be capable of characterizing the singing voice according to traits such as level of training and registration. An application is presented in which these parameterizations are used to implement a system for singing voice enhancement. Subjective listening tests were conducted in which listeners showed an overall preference for outputs produced by the proposed enhancement system over both unmodified voices and voices enhanced with competitive methods.