Sun, W. and Ragni, A. (2025) Score-based training for energy-based TTS models. In: Interspeech 2025. Interspeech 2025, 17-21 Aug 2025, Rotterdam, The Netherlands. ISCA, pp. 5528-5532. ISSN: 2308-457X. EISSN: 2958-1796.
Abstract
Noise contrastive estimation (NCE) is a popular method for training energy-based models (EBMs) with intractable normalisation terms. The key idea of NCE is to learn by comparing unnormalised log-likelihoods of the reference and noisy samples, thus avoiding explicit computation of normalisation terms. However, NCE critically relies on the quality of noisy samples. Recently, sliced score matching (SSM) has been popularised by closely related diffusion models (DMs). Unlike NCE, SSM learns the gradient of the log-likelihood, or score, by learning the distribution of its projections onto randomly chosen directions. However, both NCE and SSM disregard the form of the log-likelihood function, which is problematic given that EBMs and DMs make use of first-order optimisation during inference. This paper proposes a new criterion that learns scores more suitable for first-order schemes. Experiments contrast these approaches for training EBMs.
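As a rough illustration of the two baseline objectives the abstract contrasts, the sketch below writes out the commonly used textbook forms of the NCE and SSM losses for a toy energy. It is not the criterion proposed in the paper, and it is not taken from the authors' code. The quadratic energy with diagonal precision, the standard-normal noise distribution, the learned scalar `log_z`, and all function names are assumptions made purely for illustration.

```python
import math


def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def energy(x, mu, prec):
    """Hypothetical quadratic energy; -energy is the unnormalised log-likelihood."""
    return 0.5 * sum(p * (xi - m) ** 2 for xi, m, p in zip(x, mu, prec))


def score(x, mu, prec):
    """Gradient of the log-likelihood w.r.t. x, i.e. -dE/dx (no normaliser needed)."""
    return [-p * (xi - m) for xi, m, p in zip(x, mu, prec)]


def std_normal_logpdf(x):
    """Log-density of the assumed noise distribution N(0, I)."""
    return sum(-0.5 * xi * xi - 0.5 * math.log(2.0 * math.pi) for xi in x)


def nce_loss(data, noise, mu, prec, log_z):
    """Binary NCE: classify reference samples against noise samples.

    The logit is log p_model(u) - log p_noise(u), where log p_model(u) is
    -energy(u) - log_z and log_z is a learned scalar standing in for the
    intractable normalisation term.
    """
    def logit(u):
        return -energy(u, mu, prec) - log_z - std_normal_logpdf(u)

    pos = sum(log_sigmoid(logit(x)) for x in data) / len(data)
    neg = sum(log_sigmoid(-logit(y)) for y in noise) / len(noise)
    return -(pos + neg)


def ssm_loss(data, directions, mu, prec):
    """Sliced score matching: mean of v^T J v + 0.5 * (v^T s)^2.

    s is the model score at x, J its Jacobian w.r.t. x, and v a random
    projection direction paired with each sample.
    """
    total = 0.0
    for x, v in zip(data, directions):
        s = score(x, mu, prec)
        proj = sum(vi * si for vi, si in zip(v, s))
        # For this toy energy J = diag(-prec), so v^T J v has a closed form;
        # a neural energy would need automatic differentiation here.
        curv = -sum(p * vi * vi for p, vi in zip(prec, v))
        total += curv + 0.5 * proj * proj
    return total / len(data)
```

In this sketch `nce_loss` depends on the noise samples and the noise density, whereas `ssm_loss` touches only the score and its Jacobian, which mirrors the contrast drawn in the abstract. Neither function looks at the shape of the log-likelihood surface that a first-order optimiser would later have to traverse at inference time.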
Metadata
| Item Type: | Proceedings Paper |
|---|---|
| Authors/Creators: | Sun, W. and Ragni, A. |
| Copyright, Publisher and Additional Information: | © 2025 The Authors. Except as otherwise noted, this author-accepted version of a paper published in Interspeech 2025 is made available via the University of Sheffield Research Publications and Copyright Policy under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution and reproduction in any medium, provided the original work is properly cited. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ |
| Dates: | |
| Institution: | The University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
| Date Deposited: | 09 Jan 2026 09:50 |
| Last Modified: | 09 Jan 2026 09:50 |
| Status: | Published online |
| Publisher: | ISCA |
| Refereed: | Yes |
| Identification Number: | 10.21437/interspeech.2025-1066 |
| Related URLs: | |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:236037 |
Download
Filename: 2505.13771v1 (1).pdf
Licence: CC-BY 4.0