Gales, M.J.F., Ragni, A. orcid.org/0000-0003-0634-4456, Zhang, A. et al. (1 more author) (2012) Structured discriminative models for speech recognition. In: Symposium on Machine Learning in Speech and Language Processing (MLSLP), 14 Sep 2012, Portland, Oregon, USA. International Speech Communication Association.
Abstract
Generative models, normally in the form of hidden Markov models, have been the dominant form of acoustic model for automatic speech recognition for more than two decades. In recent years there has been interest in applying structured discriminative models to this task. This talk discusses one particular form of discriminative model, log-linear models, and how they may be applied to continuous speech recognition tasks. Two important issues will be discussed in detail: the appropriate form of features for this model; and the training criterion to be used. Generative models are proposed to extract the features for the discriminative log-linear model. This combination of generative and discriminative models enables state-of-the-art adaptation and noise robustness approaches to be used to handle mismatches between the training and test conditions. An interesting aspect of these features is that the conditional independence assumptions of the underlying generative models are not necessarily reflected in the features that are derived from the models. Various forms of training criteria, including minimum Bayes' risk and large margin approaches, are discussed. The relationship between large-margin training of log-linear models and structured support vector machines is described. Results are presented on two noise-robustness tasks: AURORA-2 and AURORA-4.
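As a rough illustration of the log-linear model named in the abstract, the sketch below computes class posteriors as a normalised exponential of weighted features. It is only a minimal sketch under stated assumptions: the function name `log_linear_posterior`, the two labels, the feature values, and the weights are hypothetical and not taken from the paper. The feature values stand in for log-likelihoods that generative models might supply.

```python
import math

def log_linear_posterior(features, weights):
    """Return P(label | observation) for each label.

    features: dict mapping label -> feature vector phi(O, label)
    weights:  list of model parameters, one per feature dimension
    """
    # Linear score for each label: dot product of weights and features.
    scores = {
        label: sum(a * f for a, f in zip(weights, phi))
        for label, phi in features.items()
    }
    # Log of the normalisation term, computed stably (log-sum-exp).
    top = max(scores.values())
    log_z = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    return {label: math.exp(s - log_z) for label, s in scores.items()}

# Hypothetical feature vectors (made-up log-likelihood-style values).
features = {
    "yes": [-10.0, -14.0],
    "no": [-12.0, -11.0],
}
weights = [1.0, 0.5]  # assumed values, for illustration only

posterior = log_linear_posterior(features, weights)
print(posterior)
```

With these made-up numbers the label "yes" receives the higher linear score, so it also receives the higher posterior. The training criteria mentioned in the abstract would govern how `weights` is estimated, which this sketch does not cover.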
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | |
Copyright, Publisher and Additional Information: | © 2012 The Authors. |
Dates: | |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 11 Nov 2019 15:08 |
Last Modified: | 11 Nov 2019 15:08 |
Published Version: | https://www.isca-speech.org/archive/mlslp_2012/ml1... |
Status: | Published |
Publisher: | International Speech Communication Association |
Refereed: | Yes |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:152847 |