Olcoz, J., Saz Torralba, O. and Hain, T. (2016) Error correction in lightly supervised alignment of broadcast subtitles. In: Proceedings of Interspeech 2016. 17th Annual Conference of the International Speech Communication Association (Interspeech), 08-12 Sep 2016, San Francisco, CA. ISCA, pp. 2110-2114.
Abstract
This paper presents a range of error correction techniques aimed at improving the accuracy of a lightly supervised alignment task for broadcast subtitles. Lightly supervised approaches are frequently used in the multimedia domain, either for subtitling purposes or for providing a more reliable source for training speech-based systems. The proposed methods focus on directly correcting the alignment output, using different techniques to infer word insertions and words with inaccurate time boundaries. The features used by the classification models are outputs of the alignment system, such as confidence measures and word or segment duration. Experiments in this paper are based on broadcast material provided by the BBC to the Multi-Genre Broadcast (MGB) challenge participants. Results show that the order alignment F-measure improves by up to 2.6% absolute (15.8% relative) when combining insertion and word-boundary correction.
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Olcoz, J.; Saz Torralba, O.; Hain, T. |
Copyright, Publisher and Additional Information: | © 2016 ISCA |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 08 Jul 2016 09:37 |
Last Modified: | 19 Dec 2022 13:34 |
Published Version: | https://doi.org/10.21437/Interspeech.2016-56 |
Status: | Published |
Publisher: | ISCA |
Refereed: | Yes |
Identification Number: | 10.21437/Interspeech.2016-56 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:101810 |