Doulaty, M., Saz, O., Ng, R.W.M. et al. (1 more author) (2015) Latent Dirichlet Allocation Based Organisation of Broadcast Media Archives for Deep Neural Network Adaptation. In: Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU). 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 13-17 Dec 2015, Scottsdale, AZ. IEEE, pp. 130-136. ISBN 978-1-4799-7291-3
Abstract
This paper presents a new method for discovering latent domains in diverse speech data, for use in adapting Deep Neural Networks (DNNs) for Automatic Speech Recognition. Our work focuses on the transcription of multi-genre broadcast media, which is often categorised only broadly, in terms of high-level genres such as sports, news and documentary. In terms of acoustic modelling, however, these categories are coarse. Instead, a mixture of latent domains is expected to represent the complex and diverse behaviours within a TV show better, and therefore to lead to better and more robust performance. We propose a new method whereby these latent domains are discovered with Latent Dirichlet Allocation (LDA) in an unsupervised manner. The discovered domains are then used to adapt DNNs through a Unique Binary Code (UBIC) representation of the LDA domains. Experiments conducted on a set of BBC TV broadcasts, with more than 2,000 shows for training and 47 shows for testing, show that LDA-UBIC DNNs reduce the error by up to 13% relative compared to the baseline hybrid DNN models.
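As a rough illustration of the idea in the abstract, the sketch below turns a vector of LDA domain posteriors into a binary code and appends it to acoustic feature frames. This record does not describe the paper's procedure, so the details are assumptions: that the UBIC is a one-hot vector selecting the most probable domain, and that the code is concatenated to every input frame. The function names `ubic_code` and `augment_frames`, the feature dimension and the number of domains are all hypothetical.

```python
def ubic_code(domain_posteriors):
    """Return a one-hot code for the most probable latent domain.

    Assumption: the binary code is one-hot over the LDA domains.
    """
    best = max(range(len(domain_posteriors)),
               key=domain_posteriors.__getitem__)
    return [1.0 if i == best else 0.0
            for i in range(len(domain_posteriors))]


def augment_frames(frames, code):
    """Append the same domain code to every acoustic feature frame.

    Assumption: the code is supplied to the DNN by concatenating it
    with the input features.
    """
    return [list(frame) + list(code) for frame in frames]


# Hypothetical example: 3 frames of 4-dimensional features, 5 latent domains.
frames = [[0.0, 0.0, 0.0, 0.0] for _ in range(3)]
posteriors = [0.05, 0.10, 0.70, 0.10, 0.05]

code = ubic_code(posteriors)
augmented = augment_frames(frames, code)
print(code)
print(len(augmented), len(augmented[0]))
```

In this sketch every frame gains one extra input per latent domain, so the network's input layer would need to be widened accordingly.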
Metadata
Item Type: | Proceedings Paper |
---|---|
Copyright, Publisher and Additional Information: | © 2015 IEEE. This is an author produced version of a paper subsequently published in 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU). Uploaded in accordance with the publisher's self-archiving policy. |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 16 Aug 2016 12:22 |
Last Modified: | 19 Dec 2022 13:34 |
Published Version: | http://dx.doi.org/10.1109/ASRU.2015.7404785 |
Status: | Published |
Publisher: | IEEE |
Refereed: | Yes |
Identification Number: | 10.1109/ASRU.2015.7404785 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:101808 |