Structured discriminative models for speech recognition

Gales, M.J.F., Ragni, A. orcid.org/0000-0003-0634-4456, Zhang, A. et al. (1 more author) (2012) Structured discriminative models for speech recognition. In: Symposium on Machine Learning in Speech and Language Processing (MLSLP), 14 Sep 2012, Portland, Oregon, USA. International Speech Communication Association.

Abstract

Generative models, normally in the form of hidden Markov models, have been the dominant form of acoustic model for automatic speech recognition for more than two decades. In recent years there has been interest in applying structured discriminative models to this task. This talk discusses one particular form of discriminative model, log-linear models, and how they may be applied to continuous speech recognition tasks. Two important issues will be discussed in detail: the appropriate form of features for this model; and the training criterion to be used. Generative models are proposed to extract the features for the discriminative log-linear model. This combination of generative and discriminative models enables state-of-the-art adaptation and noise robustness approaches to be used to handle mismatches between the training and test conditions. An interesting aspect of these features is that the conditional independence assumptions of the underlying generative models are not necessarily reflected in the features that are derived from the models. Various forms of training criteria, including minimum Bayes' risk and large margin approaches, are discussed. The relationship between large-margin training of log-linear models and structured support vector machines is described. Results are presented on two noise-robustness tasks: AURORA-2 and AURORA-4.

Metadata

Item Type:	Proceedings Paper
Authors/Creators:	Gales, M.J.F.; Ragni, A. (https://orcid.org/0000-0003-0634-4456); Zhang, A.; Dalen, R.C.V.
Copyright, Publisher and Additional Information:	© 2012 The Authors.
Dates:	Published: 14 September 2012
Institution:	The University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield)
Depositing User:	Symplectic Sheffield
Date Deposited:	11 Nov 2019 15:08
Last Modified:	11 Nov 2019 15:08
Published Version:	https://www.isca-speech.org/archive/mlslp_2012/ml1...
Status:	Published
Publisher:	International Speech Communication Association
Refereed:	Yes
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:152847
