Chen, X., Liu, X., Wang, Y., et al. (2019) Exploiting future word contexts in neural network language models for speech recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 27 (9), pp. 1444-1454. ISSN 2329-9290
Abstract
Language modeling is a crucial component in a wide range of applications, including speech recognition. Language models (LMs) are usually constructed by splitting a sentence into words and computing the probability of each word given its word history. This sentence-probability calculation, built on conditional probability distributions, assumes that the approximations used in the LMs, including the word-history representation and the finite training data, have little impact. This motivates examining models that make use of additional information from the sentence. In this paper, future word information, in addition to the history, is used to predict the probability of the current word. For recurrent neural network LMs (RNNLMs), this information can be encapsulated in a bi-directional model. However, used directly, this form of model is computationally expensive to train on large quantities of data and can be problematic to apply to word lattices. This paper proposes a novel neural network language model structure, the succeeding-word RNNLM (su-RNNLM), to address these issues. Instead of using a recurrent unit to capture the complete future word context, a feedforward unit models a fixed, finite number of succeeding words. This makes training more efficient than for bi-directional models and allows the model to be applied to lattice rescoring. The generated lattices can be used for downstream applications such as confusion network decoding and keyword search. Experimental results on speech recognition and keyword spotting tasks illustrate the empirical usefulness of future word information and the flexibility of the proposed model in representing this information.
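The abstract describes the su-RNNLM architecture only at a high level. As a rough illustration, not the authors' implementation, the PyTorch sketch below combines a recurrent summary of the word history with a feedforward summary of a fixed window of k succeeding words; the class name, layer sizes, and the choice of k = 3 are assumptions made for this example, and practical details such as padding near sentence boundaries are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SuRNNLM(nn.Module):
    """Sketch of a succeeding-word RNNLM: a recurrent unit encodes the
    word history, a feedforward unit encodes a fixed number (k) of
    succeeding words, and both jointly predict the current word."""

    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, k=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Recurrent unit over the complete word history w_1 .. w_{t-1}.
        self.history_rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        # Feedforward unit over the k succeeding words w_{t+1} .. w_{t+k}
        # (their embeddings are concatenated into a single vector).
        self.future_ff = nn.Linear(k * embed_dim, hidden_dim)
        self.output = nn.Linear(2 * hidden_dim, vocab_size)

    def forward(self, history, future):
        # history: (batch, t-1) word ids; future: (batch, k) word ids.
        _, h = self.history_rnn(self.embed(history))       # h: (1, batch, hidden)
        f = torch.tanh(self.future_ff(self.embed(future).flatten(1)))
        joint = torch.cat([h.squeeze(0), f], dim=-1)
        return F.log_softmax(self.output(joint), dim=-1)   # log P(w_t | past, future)

# Hypothetical usage: a batch of 4 histories of length 7, each with k = 3
# succeeding words, yielding log-probabilities over the vocabulary.
model = SuRNNLM(vocab_size=10_000)
log_probs = model(torch.randint(0, 10_000, (4, 7)), torch.randint(0, 10_000, (4, 3)))
```

Replacing the recurrent future context of a bi-directional model with this fixed-window feedforward unit is what, per the abstract, keeps training efficient and makes lattice rescoring tractable.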
Metadata
| Field | Value |
| --- | --- |
| Item Type | Article |
| Copyright, Publisher and Additional Information | © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. Reproduced in accordance with the publisher's self-archiving policy. |
| Keywords | Recurrent neural network; language model; succeeding words; speech recognition; keyword search |
| Institution | The University of Sheffield |
| Academic Units | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
| Depositing User | Symplectic Sheffield |
| Date Deposited | 05 Sep 2019 13:50 |
| Last Modified | 11 Jun 2020 00:38 |
| Status | Published |
| Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
| Refereed | Yes |
| Identification Number (DOI) | 10.1109/taslp.2019.2922048 |
| Open Archives Initiative ID (OAI ID) | oai:eprints.whiterose.ac.uk:150520 |