Bayesian adaptation and combination of deep models for automatic speech recognition
The objective of the proposed research is to deploy a Bayesian adaptation and combination framework for deep-model-based automatic speech recognition systems, in order to combat the degradation in recognition accuracy that is typically observed when training and testing conditions are mismatched. This dissertation addresses the problem in three directions. The first direction is to perform Bayesian adaptation directly on discriminative deep neural network models. Maximum a posteriori estimation and multi-task learning techniques are employed as regularization terms in the deep neural network update formula. In the second direction, the deep neural network is cast into a generative model to better leverage Bayesian techniques. Classic structured maximum a posteriori adaptation is adopted, using bottleneck features derived from deep neural networks. In the third direction, a hierarchical Bayesian system combination technique is employed to further enhance adaptation performance by leveraging the complementarity of the discriminative and generative adaptive models.
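The idea of maximum a posteriori estimation acting as regularization in the network update can be illustrated with a minimal sketch. It assumes, purely for illustration, an isotropic Gaussian prior centered on the unadapted weights, so that the log-prior contributes an L2 pull toward those weights in each gradient step. The function name `map_sgd_step` and the parameters `lr` and `lam` are hypothetical and are not taken from the dissertation, whose actual prior and update formula may differ.

```python
def map_sgd_step(weights, grad_loss, prior_weights, lr=0.1, lam=0.5):
    """One gradient step on: loss(w) + (lam / 2) * ||w - prior_weights||^2.

    Assumption: a Gaussian prior centered on prior_weights, so the
    negative log-prior is an L2 penalty with strength lam.
    """
    return [
        # data-fit gradient plus the prior's pull toward the unadapted weight
        w - lr * (g + lam * (w - w0))
        for w, g, w0 in zip(weights, grad_loss, prior_weights)
    ]


# Hypothetical usage: with a zero data gradient, the step only
# shrinks the weights toward the prior mean.
adapted = map_sgd_step([1.0, -2.0], [0.0, 0.0], [0.0, 0.0])
```

A larger `lam` keeps the adapted weights closer to the original model, which is the usual motivation for this kind of prior when adaptation data is scarce.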