Huber, R., Baumgartner, H., Goetze, S. orcid.org/0000-0003-1044-7343 et al. (1 more author) (2020) ASR-based, single-ended modeling of listening effort - a tool for TV sound engineers. In: Proceedings of the FA2020 Conference. Forum Acusticum, 07-11 Dec 2020, Lyon, France. Forum Acusticum, pp. 2441-2445.
Abstract
This paper reviews our research towards a listening effort model and its application as a tool to automatically measure and display the perceived listening effort required to understand speech in a variety of background sounds. The model is single-ended, i.e. it does not require a clean speech reference, and is based on an automatic speech recognition (ASR) system. Speech distortions and interfering background sounds increase the uncertainty of the ASR system, which can be quantified and mapped to a perceptually interpretable scale using a psychoacoustic modeling approach. This performance measure correlates well with mean subjective listening effort ratings for a variety of distortions and acoustic backgrounds typical of TV broadcast material (r > 0.9). In principle, the tool can be integrated as a software plugin for digital audio workstations (DAWs) to support the work of sound engineers, or used in other applications such as speech quality monitoring of communication channels or real-time control of signal-enhancement algorithms.
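The abstract does not say how the ASR uncertainty is computed or how it is mapped to the rating scale. As a rough illustration of the general idea only, the sketch below assumes that uncertainty is measured as the mean per-frame entropy of the recognizer's class posteriors, that the mapping to a rating scale is a least-squares line, and that agreement with listeners is checked with Pearson's r. All function names and all numbers are hypothetical and are not taken from the paper.

```python
import math


def mean_posterior_entropy(posteriors):
    """Average per-frame entropy (in bits) of ASR class posteriors.

    `posteriors` is a list of frames; each frame is a list of class
    probabilities summing to 1. Higher entropy means a less certain
    recognizer. (Assumed uncertainty measure, not the paper's.)
    """
    total = 0.0
    for frame in posteriors:
        total += -sum(p * math.log2(p) for p in frame if p > 0.0)
    return total / len(posteriors)


def fit_linear_map(uncertainties, ratings):
    """Least-squares line from uncertainty to mean subjective rating.

    Returns (slope, intercept). A linear map is an assumption made here
    for simplicity.
    """
    n = len(uncertainties)
    mean_x = sum(uncertainties) / n
    mean_y = sum(ratings) / n
    sxx = sum((x - mean_x) ** 2 for x in uncertainties)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(uncertainties, ratings))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical toy data: three clips with increasingly flat posteriors
    # and made-up mean listening effort ratings.
    clips = [
        [[0.97, 0.01, 0.01, 0.01], [0.94, 0.02, 0.02, 0.02]],
        [[0.70, 0.10, 0.10, 0.10], [0.60, 0.20, 0.10, 0.10]],
        [[0.25, 0.25, 0.25, 0.25], [0.30, 0.30, 0.20, 0.20]],
    ]
    ratings = [2.0, 6.5, 12.0]
    uncertainties = [mean_posterior_entropy(c) for c in clips]
    slope, intercept = fit_linear_map(uncertainties, ratings)
    predicted = [slope * u + intercept for u in uncertainties]
    print("predicted ratings:", predicted)
    print("Pearson r:", pearson_r(predicted, ratings))
```

Because the measure needs only the recognizer's output for the degraded signal, no clean reference is involved, which is what makes such an approach single-ended.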
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Huber, R.; Baumgartner, H.; Goetze, S. (orcid.org/0000-0003-1044-7343); et al. (1 more author) |
Copyright, Publisher and Additional Information: | © 2020. |
Dates: | |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 18 May 2022 08:58 |
Last Modified: | 18 May 2022 08:58 |
Status: | Published |
Publisher: | Forum Acusticum |
Identification Number: | 10.48465/fa.2020.0317 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:186828 |