ASR-based, single-ended modeling of listening effort - a tool for TV sound engineers

Abstract

This paper reviews our research towards a listening effort model and its application as a tool that automatically measures and displays the perceived listening effort required to understand speech in a variety of background sounds. The model is single-ended, i.e. it does not require a clean speech reference, and is based on an automatic speech recognition (ASR) system. Speech distortions and interfering background sounds increase the uncertainty of the ASR system; this uncertainty can be quantified and mapped to a perceptually interpretable scale using a psychoacoustic modeling approach. The resulting measure correlates well with mean subjective listening effort ratings for a variety of distortions and acoustic backgrounds typical of TV broadcast material (r > 0.9). In principle, the tool can be integrated as a software plugin for digital audio workstations (DAWs) to support the work of sound engineers, or used in other applications such as speech quality monitoring of communication channels or real-time control of signal-enhancement algorithms.

Metadata

Item Type:	Proceedings Paper
Authors/Creators:	Huber, R.; Baumgartner, H.; Goetze, S. (https://orcid.org/0000-0003-1044-7343); Rennies-Hochmuth, J.
Copyright, Publisher and Additional Information:	© 2020.
Dates:	Published: 7 December 2020
Institution:	The University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield)
Depositing User:	Symplectic Sheffield
Date Deposited:	18 May 2022 08:58
Last Modified:	18 May 2022 08:58
Status:	Published
Publisher:	Forum Acusticum
Identification Number:	10.48465/fa.2020.0317
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:186828
