This is the latest version of this eprint.
Sutherland, R., Close, G., Hain, T. orcid.org/0000-0003-0939-3464 et al. (2 more authors) (2024) Using speech foundational models in loss functions for hearing aid speech enhancement. In: Proceedings of 2024 32nd European Signal Processing Conference (EUSIPCO). 2024 32nd European Signal Processing Conference (EUSIPCO), 26-30 Aug 2024, Lyon, France. Institute of Electrical and Electronics Engineers (IEEE), pp. 421-425. ISBN: 9798-331519773. ISSN: 2219-5491. EISSN: 2076-1465.
Abstract
Machine learning techniques are an active area of research for speech enhancement for hearing aids, with one particular focus on improving the intelligibility of a noisy speech signal. Recent work has shown that feature encodings from self-supervised speech representation models can effectively capture speech intelligibility. In this work, it is shown that the distance between self-supervised speech representations of clean and noisy speech correlates more strongly with human intelligibility ratings than other signal-based metrics. Experiments show that training a speech enhancement model using this distance as part of a loss function improves performance over using an SNR-based loss function, as demonstrated by an increase in HASPI, STOI, PESQ and SI-SNR scores. This method requires inference with the high-parameter-count model only at training time, so the speech enhancement model itself can remain small, as is required for hearing aids.
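The idea of using a representation distance as part of a training loss can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: `toy_encoder` is a hypothetical stand-in for a frozen self-supervised speech model, the distance is taken to be a mean squared error, and the weighting `alpha` between the feature term and a negative SI-SNR term is invented. The abstract does not specify the model, the distance measure, or how the terms are combined.

```python
import math
from typing import Callable, List, Sequence

Signal = Sequence[float]
Features = List[List[float]]


def neg_si_snr(estimate: Signal, target: Signal, eps: float = 1e-8) -> float:
    """Negative scale-invariant SNR in dB, so that lower values are better."""
    # Remove the mean of each signal, as is usual for SI-SNR.
    est_mean = sum(estimate) / len(estimate)
    tgt_mean = sum(target) / len(target)
    est = [e - est_mean for e in estimate]
    tgt = [t - tgt_mean for t in target]
    # Project the estimate onto the target to separate signal from residual.
    scale = sum(e * t for e, t in zip(est, tgt)) / (sum(t * t for t in tgt) + eps)
    projection = [scale * t for t in tgt]
    signal_power = sum(p * p for p in projection)
    noise_power = sum((e - p) ** 2 for e, p in zip(est, projection))
    return -10.0 * math.log10((signal_power + eps) / (noise_power + eps))


def toy_encoder(signal: Signal, frame_len: int = 4) -> Features:
    """Hypothetical stand-in for a frozen speech representation model.

    Maps each non-overlapping frame to [mean absolute value, RMS].
    A real system would call a large pretrained model here instead.
    """
    features = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        mean_abs = sum(abs(x) for x in frame) / frame_len
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        features.append([mean_abs, rms])
    return features


def feature_distance(a: Features, b: Features) -> float:
    """Mean squared distance between two feature sequences (assumed measure)."""
    total, count = 0.0, 0
    for frame_a, frame_b in zip(a, b):
        for xa, xb in zip(frame_a, frame_b):
            total += (xa - xb) ** 2
            count += 1
    return total / max(count, 1)


def combined_loss(
    estimate: Signal,
    target: Signal,
    encoder: Callable[[Signal], Features] = toy_encoder,
    alpha: float = 0.5,  # assumed weighting, not from the paper
) -> float:
    """Weighted sum of a representation distance and a signal-level term."""
    rep_term = feature_distance(encoder(estimate), encoder(target))
    return alpha * rep_term + (1.0 - alpha) * neg_si_snr(estimate, target)
```

In this sketch the encoder is only evaluated inside the loss, which mirrors the point made in the abstract: the large model is needed while training, and the enhancement model that produces `estimate` can stay small at inference time.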
Metadata
| Item Type: | Proceedings Paper |
|---|---|
| Authors/Creators: | Sutherland, R.; Close, G.; Hain, T. (orcid.org/0000-0003-0939-3464); et al. (2 more authors) |
| Copyright, Publisher and Additional Information: | © 2024 The Authors. Except as otherwise noted, this author-accepted version of a paper published in Proceedings of 2024 32nd European Signal Processing Conference (EUSIPCO) is made available via the University of Sheffield Research Publications and Copyright Policy under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution and reproduction in any medium, provided the original work is properly cited. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ |
| Keywords: | self-supervised speech representations; speech enhancement; loss functions |
| Dates: | |
| Institution: | The University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
| Date Deposited: | 06 Aug 2025 13:47 |
| Last Modified: | 17 Oct 2025 12:17 |
| Status: | Published |
| Publisher: | Institute of Electrical and Electronics Engineers (IEEE) |
| Refereed: | Yes |
| Identification Number: | 10.23919/eusipco63174.2024.10714933 |
| Related URLs: | |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:230087 |
Available Versions of this Item
- Using Speech Foundational Models in Loss Functions for Hearing Aid Speech Enhancement. (deposited 17 Oct 2025 12:15)
- Using speech foundational models in loss functions for hearing aid speech enhancement. (deposited 06 Aug 2025 13:47) [Currently Displayed]
