This is the latest version of this eprint.
Cross, M. and Ragni, A. (2025) Flowing straighter with conditional flow matching for accurate speech enhancement. In: Proceedings of the 2nd ECAI Workshop on "Machine Learning Meets Differential Equations: From Theory to Applications". 2nd ECAI Workshop on "Machine Learning Meets Differential Equations: From Theory to Applications", 25-30 Oct 2025, Bologna, Italy. Proceedings of Machine Learning Research, pp. 121-132.
Abstract
Current flow-based generative speech enhancement methods learn curved probability paths which model a mapping between clean and noisy speech. Despite impressive performance, the implications of curved probability paths are unknown. Methods such as Schrödinger bridges focus on curved paths, where time-dependent gradients and variance do not promote straight paths. Findings in machine learning research suggest that straight paths, such as those used in conditional flow matching, are easier to train and offer better generalisation. In this paper we quantify the effect of path straightness on speech enhancement quality. We report experiments with the Schrödinger bridge, where we show that certain configurations lead to straighter paths. Conversely, we propose independent conditional flow matching for speech enhancement, which models straight paths between noisy and clean speech. We demonstrate empirically that a time-independent variance has a greater effect on sample quality than the gradient. Although conditional flow matching improves several speech quality metrics, it requires multiple inference steps. We rectify this with a one-step solution by inferring the trained flow-based model as if it were directly predictive. Our work suggests that straighter time-independent probability paths improve generative speech enhancement over curved time-dependent paths.
Metadata
| Item Type: | Proceedings Paper |
|---|---|
| Authors/Creators: | Cross, M.; Ragni, A. |
| Copyright, Publisher and Additional Information: | © The authors 2025. Except as otherwise noted, this paper published in Proceedings of Machine Learning Research is made available under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution and reproduction in any medium, provided the original work is properly cited. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ |
| Keywords: | speech enhancement; conditional flow matching; neural ordinary differential equations |
| Dates: | |
| Institution: | The University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
| Funding Information: | Funder: UK RESEARCH AND INNOVATION / UKRI / RCUK; Grant number: UNSPECIFIED |
| Date Deposited: | 05 Jan 2026 13:28 |
| Last Modified: | 05 Jan 2026 13:30 |
| Published Version: | https://proceedings.mlr.press/v277/cross25a.html |
| Status: | Published |
| Publisher: | Proceedings of Machine Learning Research |
| Refereed: | Yes |
| Identification Number: | https://proceedings.mlr.press/v277/cross25a.html |
| Related URLs: | |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:236039 |
Available Versions of this Item
- Flowing straighter with conditional flow matching for accurate speech enhancement. (deposited 30 Sep 2025 15:44)
- Flowing straighter with conditional flow matching for accurate speech enhancement. (deposited 05 Jan 2026 13:28) [Currently Displayed]