Loweimi, E., Barker, J. orcid.org/0000-0002-1684-5660 and Hain, T. orcid.org/0000-0003-0939-3464 (2018) On the usefulness of the speech phase spectrum for pitch extraction. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Interspeech 2018, 02-06 Sep 2018, Hyderabad, India. ISCA, pp. 696-700.
Abstract
© 2018 International Speech Communication Association. All rights reserved. Most frequency-domain techniques for pitch extraction, such as the cepstrum, harmonic product spectrum (HPS) and summation of residual harmonics (SRH), operate on the magnitude spectrum and turn it into a function in which the fundamental frequency emerges as the argmax. In this paper, we investigate the extension of these three techniques to the phase and group delay (GD) domains. Our extensions exploit the observation that the bin at which F(magnitude) becomes maximum, for some monotonically increasing function F, is equivalent to the bin at which F(phase) has the maximum negative slope and F(group delay) has the maximum value. To extract the pitch track from the speech phase spectrum, these techniques were coupled with the source-filter model in the phase domain that we proposed in earlier publications and a novel voicing detection algorithm proposed here. The accuracy and robustness of the phase-based pitch extraction techniques are illustrated and compared with their magnitude-based counterparts using six pitch evaluation metrics. On average, it is observed that the phase spectrum can be successfully employed in pitch tracking with accuracy and robustness comparable to those of the speech magnitude spectrum.
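For orientation, the magnitude-domain baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration of a harmonic product spectrum estimator on a single frame, not the authors' implementation and not their phase or group delay extension. The function name `hps_pitch`, its parameters (`n_harmonics`, `n_fft`, `f_min`, `f_max`) and their default values are assumptions made for this example. The product of decimated spectra is computed as a sum of log-magnitudes, which leaves the position of the maximum unchanged.

```python
import numpy as np


def hps_pitch(frame, fs, n_harmonics=4, n_fft=8192, f_min=60.0, f_max=400.0):
    """Estimate F0 (Hz) of one frame with a log-domain harmonic product spectrum.

    Hypothetical helper for illustration; all defaults are assumptions.
    """
    # Taper the frame to limit spectral leakage, then take the magnitude spectrum.
    windowed = np.asarray(frame, dtype=float) * np.hanning(len(frame))
    log_mag = np.log(np.abs(np.fft.rfft(windowed, n_fft)) + 1e-12)

    # Decimate the spectrum by 1, 2, ..., n_harmonics and add the copies.
    # Summing logs is equivalent to multiplying magnitudes.
    n_bins = len(log_mag) // n_harmonics
    hps = np.zeros(n_bins)
    for h in range(1, n_harmonics + 1):
        hps += log_mag[::h][:n_bins]

    # The fundamental emerges as the argmax within the allowed pitch range.
    freqs = np.arange(n_bins) * fs / n_fft
    in_range = (freqs >= f_min) & (freqs <= f_max)
    k = int(np.argmax(np.where(in_range, hps, -np.inf)))
    return float(freqs[k])
```

Calling `hps_pitch(frame, fs)` on a voiced frame returns a frequency quantised to the FFT bin spacing `fs / n_fft`. A full pitch tracker would also need a voicing decision and frame-to-frame smoothing, which this sketch leaves out.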
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Loweimi, E.; Barker, J. (orcid.org/0000-0002-1684-5660); Hain, T. (orcid.org/0000-0003-0939-3464) |
Copyright, Publisher and Additional Information: | © 2018 ISCA. Reproduced in accordance with the publisher's self-archiving policy. |
Keywords: | Pitch extraction; voicing detection; phase spectrum; group delay; source-filter separation |
Dates: |
|
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 26 Nov 2018 11:39 |
Last Modified: | 28 Nov 2018 15:33 |
Published Version: | https://doi.org/10.21437/Interspeech.2018-1062 |
Status: | Published |
Publisher: | ISCA |
Refereed: | Yes |
Identification Number: | 10.21437/Interspeech.2018-1062 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:139148 |