Chen, X., Liu, X., Wang, Y., et al. (2019) Exploiting future word contexts in neural network language models for speech recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 27 (9), pp. 1444-1454. ISSN 2329-9290
Abstract
Language modeling is a crucial component in a wide range of applications, including speech recognition. Language models (LMs) are usually constructed by splitting a sentence into words and computing the probability of each word given its word history. This sentence-probability calculation, built on conditional probability distributions, assumes that the approximations used in the LMs, including the word-history representation and the finite training data, have little impact. This motivates examining models that make use of additional information from the sentence. In this paper, future word information, in addition to the history, is used to predict the probability of the current word. For recurrent neural network LMs (RNNLMs), this information can be encapsulated in a bi-directional model. However, used directly, this form of model is computationally expensive to train on large quantities of data and can be problematic to apply to word lattices. This paper proposes a novel neural network language model structure, the succeeding-word RNNLM (su-RNNLM), to address these issues. Instead of using a recurrent unit to capture the complete future word context, a feedforward unit models a fixed, finite number of succeeding words. This makes training more efficient than for bi-directional models and allows the model to be applied to lattice rescoring. The generated lattices can be used for downstream applications such as confusion network decoding and keyword search. Experimental results on speech recognition and keyword spotting tasks illustrate the empirical usefulness of future word information and the flexibility of the proposed model in representing this information.
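The abstract describes the su-RNNLM architecture only at a high level. As a rough illustration, not the authors' implementation, the PyTorch sketch below combines a recurrent summary of the word history with a feedforward summary of a fixed window of k succeeding words; the class name, layer sizes, and the choice of k = 3 are assumptions made for this example, and practical details such as padding near sentence boundaries are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SuRNNLM(nn.Module):
    """Sketch of a succeeding-word RNNLM: a recurrent unit encodes the
    word history, a feedforward unit encodes a fixed number (k) of
    succeeding words, and both jointly predict the current word."""

    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, k=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Recurrent unit over the complete word history w_1 .. w_{t-1}.
        self.history_rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        # Feedforward unit over the k succeeding words w_{t+1} .. w_{t+k}
        # (their embeddings are concatenated into a single vector).
        self.future_ff = nn.Linear(k * embed_dim, hidden_dim)
        self.output = nn.Linear(2 * hidden_dim, vocab_size)

    def forward(self, history, future):
        # history: (batch, t-1) word ids; future: (batch, k) word ids.
        _, h = self.history_rnn(self.embed(history))       # h: (1, batch, hidden)
        f = torch.tanh(self.future_ff(self.embed(future).flatten(1)))
        joint = torch.cat([h.squeeze(0), f], dim=-1)
        return F.log_softmax(self.output(joint), dim=-1)   # log P(w_t | past, future)

# Hypothetical usage: a batch of 4 histories of length 7, each with k = 3
# succeeding words, yielding log-probabilities over the vocabulary.
model = SuRNNLM(vocab_size=10_000)
log_probs = model(torch.randint(0, 10_000, (4, 7)), torch.randint(0, 10_000, (4, 3)))
```

Replacing the recurrent future context of a bi-directional model with this fixed-window feedforward unit is what, per the abstract, keeps training efficient and makes lattice rescoring tractable.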
Metadata
| Field | Value |
| --- | --- |
| Item Type | Article |
| Copyright, Publisher and Additional Information | © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. Reproduced in accordance with the publisher's self-archiving policy. |
| Keywords | Recurrent neural network; language model; succeeding words; speech recognition; keyword search |
| Institution | The University of Sheffield |
| Academic Units | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
| Depositing User | Symplectic Sheffield |
| Date Deposited | 05 Sep 2019 13:50 |
| Last Modified | 11 Jun 2020 00:38 |
| Status | Published |
| Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
| Refereed | Yes |
| Identification Number (DOI) | 10.1109/taslp.2019.2922048 |
| Open Archives Initiative ID (OAI ID) | oai:eprints.whiterose.ac.uk:150520 |