Ragni, A. orcid.org/0000-0003-0634-4456 and Gales, M. (2018) Automatic speech recognition system development in the "wild". In: Interspeech 2018. Interspeech 2018, 02-06 Sep 2018, Hyderabad, India. International Speech Communication Association (ISCA), pp. 2217-2221.
Abstract
The standard framework for developing an automatic speech recognition (ASR) system is to generate training and development data for building the system and evaluation data for the final performance analysis. All the data is assumed to come from the domain of interest. Though this framework is matched to some tasks, it is more challenging for systems that are required to operate over broad domains, or where the ability to collect the required data is limited. This paper discusses ASR work performed under the IARPA MATERIAL program, which is aimed at cross-language information retrieval, and examines this challenging scenario. In terms of available data, only limited narrow-band conversational telephone speech data was provided. However, the system is required to operate over a range of domains, including broadcast data. As no data is available for the broadcast domain, this paper proposes an approach for system development based on scraping "related" data from the web and using ASR system confidence scores as the primary metric for developing the acoustic and language model components. As an initial evaluation of the approach, the Swahili development language is used, with the final system performance assessed on the IARPA MATERIAL Analysis Pack 1 data.
Metadata
Field | Value |
---|---|
Item Type: | Proceedings Paper |
Copyright, Publisher and Additional Information: | © 2018 ISCA. Reproduced in accordance with the publisher's self-archiving policy. |
Keywords: | cross-domain development; confidence; web data; speech recognition |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 21 Nov 2019 12:01 |
Last Modified: | 21 Nov 2019 12:01 |
Status: | Published |
Publisher: | International Speech Communication Association (ISCA) |
Refereed: | Yes |
Identification Number: | 10.21437/interspeech.2018-1085 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:152763 |