Wang, H., Ragni, A. (orcid.org/0000-0003-0634-4456), Gales, M.J.F. et al. (3 more authors) (2015) Joint decoding of tandem and hybrid systems for improved keyword spotting on low resource languages. In: INTERSPEECH 2015: 16th Annual Conference of the International Speech Communication Association, 06-10 Sep 2015, Dresden, Germany. International Speech Communication Association (ISCA), pp. 3660-3664.
Abstract
Keyword spotting (KWS) for low-resource languages has drawn increasing attention in recent years. State-of-the-art KWS systems are based on lattices or Confusion Networks (CN) generated by Automatic Speech Recognition (ASR) systems. It has been shown that considerable KWS gains can be obtained by combining the keyword detection results from different forms of ASR systems, e.g., Tandem and Hybrid systems. This paper investigates an alternative combination scheme for KWS using joint decoding. This scheme treats a Tandem system and a Hybrid system as two separate streams and linearly combines their individual acoustic model log-likelihoods. Joint decoding is more efficient than combining detection results from separately decoded systems, as it requires just a single pass of decoding and a single pass of keyword search. Experiments on six Babel OP2 development languages show that joint decoding provides consistent gains over each individual system. Moreover, the joint decoding lattices can be efficiently rescored with Tandem or Hybrid acoustic models, and further KWS gains can be obtained by merging the detection posting lists from the joint decoding lattices and the rescored lattices.
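The stream combination described in the abstract can be illustrated with a minimal sketch. It assumes each system supplies a frame-by-state table of acoustic log-likelihoods over a shared state inventory. The function name `combine_log_likelihoods`, the weight parameters, and their default values of 0.5 are hypothetical and chosen for illustration only; the abstract does not state the weights or any implementation details.

```python
from typing import List, Sequence


def combine_log_likelihoods(
    tandem_ll: Sequence[Sequence[float]],
    hybrid_ll: Sequence[Sequence[float]],
    tandem_weight: float = 0.5,  # assumed value, not taken from the paper
    hybrid_weight: float = 0.5,  # assumed value, not taken from the paper
) -> List[List[float]]:
    """Linearly combine per-frame, per-state log-likelihoods of two streams.

    Both inputs are indexed as [frame][state] and must have the same shape.
    """
    if len(tandem_ll) != len(hybrid_ll):
        raise ValueError("streams must cover the same number of frames")

    joint: List[List[float]] = []
    for tandem_frame, hybrid_frame in zip(tandem_ll, hybrid_ll):
        if len(tandem_frame) != len(hybrid_frame):
            raise ValueError("streams must share the same state inventory")
        # Weighted sum in the log domain, one joint score per state.
        joint.append(
            [
                tandem_weight * t + hybrid_weight * h
                for t, h in zip(tandem_frame, hybrid_frame)
            ]
        )
    return joint


# One frame, two states, with illustrative scores and weights.
scores = combine_log_likelihoods([[-2.0, -4.0]], [[-4.0, -2.0]], 0.25, 0.75)
print(scores)  # [[-3.5, -2.5]]
```

Because the combined score replaces the single-system acoustic score at each frame, a decoder using it would produce one lattice per utterance, which is consistent with the single decoding pass and single keyword search pass mentioned in the abstract.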
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Wang, H.; Ragni, A.; Gales, M.J.F.; et al. (3 more authors) |
Copyright, Publisher and Additional Information: | © 2015 International Speech Communication Association. Reproduced in accordance with the publisher's self-archiving policy. |
Keywords: | keyword spotting; joint decoding; deep neural network; Tandem; Hybrid |
Dates: | |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 12 Nov 2019 15:18 |
Last Modified: | 12 Nov 2019 15:18 |
Published Version: | https://www.isca-speech.org/archive/interspeech_20... |
Status: | Published |
Publisher: | International Speech Communication Association (ISCA) |
Refereed: | Yes |
Related URLs: | |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:152837 |