Saz Torralba, O., Deena, S., Doulaty, M. et al. (6 more authors) (2018) Lightly supervised alignment of subtitles on multi-genre broadcasts. Multimedia Systems, 77 (23). pp. 30533-30550. ISSN 0942-4962
Abstract
This paper describes a system for performing alignment of subtitles to audio on multigenre broadcasts using a lightly supervised approach. Accurate alignment of subtitles plays a substantial role in the daily work of media companies and currently still requires large human effort. Here, a comprehensive approach to performing this task in an automated way using lightly supervised alignment is proposed. The paper explores the different alternatives to speech segmentation, lightly supervised speech recognition and alignment of text streams. The proposed system uses lightly supervised decoding to improve the alignment accuracy by performing language model adaptation using the target subtitles. The system thus built achieves the third best reported result in the alignment of broadcast subtitles in the Multi–Genre Broadcast (MGB) challenge, with an F1 score of 88.8%. This system is available for research and other non–commercial purposes through webASR, the University of Sheffield’s cloud–based speech technology web service. Taking as inputs an audio file and untimed subtitles, webASR can produce timed subtitles in multiple formats, including TTML, WebVTT and SRT.
Metadata
Item Type: | Article |
---|---|
Authors/Creators: |
|
Copyright, Publisher and Additional Information: | © The Author(s) 2018. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. |
Keywords: | Multigenre broadcasts; Lightly supervised alignment; Language model adaptation; Subtitles |
Dates: |
|
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 31 May 2018 12:17 |
Last Modified: | 15 Dec 2020 12:17 |
Status: | Published |
Publisher: | Springer Nature |
Refereed: | Yes |
Identification Number: | 10.1007/s11042-018-6050-1 |
Related URLs: | |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:131147 |