
There is a more recent version of this eprint available.
Sutherland, R., Close, G., Hain, T. orcid.org/0000-0003-0939-3464 et al. (2 more authors) (Submitted: 2024) Using Speech Foundational Models in Loss Functions for Hearing Aid Speech Enhancement. [Preprint - arXiv] (Submitted)
Abstract
Machine learning techniques are an active area of research for speech enhancement for hearing aids, with one particular focus on improving the intelligibility of a noisy speech signal. Recent work has shown that feature encodings from self-supervised speech representation models can effectively capture speech intelligibility. In this work, it is shown that the distance between self-supervised speech representations of clean and noisy speech correlates more strongly with human intelligibility ratings than other signal-based metrics. Experiments show that training a speech enhancement model using this distance as part of a loss function improves performance over using an SNR-based loss function, as demonstrated by an increase in HASPI, STOI, PESQ and SI-SNR scores. This method requires inference with a high-parameter-count model only at training time, so the speech enhancement model itself can remain small, as is required for hearing aids.
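The idea in the abstract, using a representation distance as one term of a training loss, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the preprint: `toy_encoder` is a hypothetical stand-in for a frozen self-supervised speech model (a fixed random projection of signal frames), the mean-squared distance, the SI-SNR term, and the `weight` parameter are assumed choices, and NumPy is used in place of a deep learning framework.

```python
import numpy as np


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two 1-D signals."""
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target to obtain the scaled reference.
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    projection = scale * target
    noise = estimate - projection
    ratio = (np.sum(projection ** 2) + eps) / (np.sum(noise ** 2) + eps)
    return 10.0 * np.log10(ratio)


def toy_encoder(signal, frame_len=400, hop=320, dim=32, seed=0):
    """Hypothetical stand-in for a frozen speech foundation model.

    A fixed random projection of overlapping frames; a real system
    would run a large pretrained encoder here, at training time only.
    """
    rng = np.random.default_rng(seed)  # fixed seed keeps the "model" frozen
    weights = rng.standard_normal((frame_len, dim)) / np.sqrt(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop: i * hop + frame_len] for i in range(n_frames)]
    )
    return np.tanh(frames @ weights)


def representation_distance(enhanced, clean):
    """Mean squared distance between encoder features (assumed choice)."""
    diff = toy_encoder(enhanced) - toy_encoder(clean)
    return float(np.mean(diff ** 2))


def combined_loss(enhanced, clean, weight=0.5):
    """Weighted sum of representation distance and negative SI-SNR.

    `weight` is an illustrative hyperparameter, not a value from the paper.
    """
    distance_term = representation_distance(enhanced, clean)
    snr_term = -si_snr(enhanced, clean)  # higher SI-SNR means lower loss
    return weight * distance_term + (1.0 - weight) * snr_term
```

Because the encoder appears only inside the loss, it is needed while training and not at inference, which is how the enhancement model itself can stay small.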
Metadata
| Item Type: | Preprint | 
|---|---|
| Authors/Creators: | Sutherland, R.; Close, G.; Hain, T. (orcid.org/0000-0003-0939-3464); et al. (2 more authors) |
| Copyright, Publisher and Additional Information: | © 2024 The Author(s). This preprint is made available under a Creative Commons Attribution 4.0 International License. (https://creativecommons.org/licenses/by/4.0/) | 
| Keywords: | Biomedical and Clinical Sciences; Allied Health and Rehabilitation Science; Clinical Sciences; Health Sciences; Information and Computing Sciences; 4602 Artificial Intelligence; 4603 Computer Vision and Multimedia Computation; Neurosciences; Prevention; Clinical Research; Rehabilitation; Bioengineering; Machine Learning and Artificial Intelligence; Assistive Technology; Ear | 
| Institution: | The University of Sheffield | 
| Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) | 
| Date Deposited: | 17 Oct 2025 12:15 | 
| Last Modified: | 17 Oct 2025 12:15 | 
| Status: | Submitted | 
| Identification Number: | 10.48550/arxiv.2407.13333 | 
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:233134 | 
Available Versions of this Item
- Using Speech Foundational Models in Loss Functions for Hearing Aid Speech Enhancement. (deposited 17 Oct 2025 12:15) [Currently Displayed]
