Meghanani, A. and Hain, T. orcid.org/0000-0003-0939-3464 (2024) Deriving translational acoustic sub-word embeddings. In: 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) Proceedings. 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 16-20 Dec 2023, Taipei, Taiwan. Institute of Electrical and Electronics Engineers (IEEE) ISBN 9798350306903
Abstract
There is a growing interest in understanding the representational geometry of acoustic word embeddings (AWEs), which are fixed-dimensional representations of spoken words. However, not much research has been conducted on acoustic sub-word embeddings (ASWEs), which can provide a better understanding of the AWE space. This work focuses on decomposing AWEs to obtain ASWEs while retaining the ability to reconstruct AWEs by translating ASWEs in the embedding space, under constrained settings. Initially, high-quality AWEs are obtained with an Average Precision (AP) score of 0.97 on the word discrimination task. Subsequently, ASWEs are derived through the decomposition of AWEs. Three adapted versions of the AP metric, utilized for evaluating the quality of the derived ASWEs and their translational properties, are proposed. The results demonstrate that the derived ASWEs exhibit high quality, and the reconstruction of AWEs from the ASWEs is achievable by translating them in the embedding space.
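The AP score quoted in the abstract comes from a word discrimination (same-different) task. As a rough illustration only, the sketch below shows one common way such a score can be computed: every pair of embeddings is ranked by cosine distance, and average precision is taken over the pairs that share a word label. The function name `average_precision`, the use of cosine distance, and the toy data are assumptions made for this example; they are not taken from the paper, and the three adapted AP metrics it proposes are not reproduced here.

```python
import numpy as np


def average_precision(embeddings, labels):
    """Illustrative AP for a same-different word discrimination task.

    Assumption: pairs are ranked by cosine distance; a pair counts as
    positive when both items carry the same word label.
    """
    X = np.asarray(embeddings, dtype=float)
    y = np.asarray(labels)
    # L2-normalise rows so the dot product equals cosine similarity.
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    # Indices of all unordered pairs (i < j).
    i, j = np.triu_indices(len(X), k=1)
    dist = 1.0 - np.sum(X[i] * X[j], axis=1)  # cosine distance per pair
    same = y[i] == y[j]                       # True where the words match
    # Rank pairs from most to least similar.
    hits = same[np.argsort(dist, kind="stable")]
    if hits.sum() == 0:
        raise ValueError("no same-word pairs to evaluate")
    # Precision at each rank, averaged over the ranks of the positives.
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return float(precision_at_k[hits].mean())


# Toy data (hypothetical): two well-separated words, two tokens each.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_labels = ["cat", "cat", "dog", "dog"]
print(average_precision(toy_embeddings, toy_labels))  # 1.0 for this toy data
```

In this toy case both same-word pairs rank ahead of every different-word pair, so the score is 1.0. A score such as the reported 0.97 would mean that a small share of different-word pairs are ranked above some same-word pairs.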
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Meghanani, A. and Hain, T. |
Copyright, Publisher and Additional Information: | © 2024 The Author(s). Except as otherwise noted, this author-accepted version of a journal article published in 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) Proceedings is made available via the University of Sheffield Research Publications and Copyright Policy under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution and reproduction in any medium, provided the original work is properly cited. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ |
Keywords: | acoustic word embeddings; acoustic subword embeddings; translational; word discrimination task |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 20 Dec 2023 16:34 |
Last Modified: | 02 Feb 2024 16:34 |
Status: | Published |
Publisher: | Institute of Electrical and Electronics Engineers (IEEE) |
Refereed: | Yes |
Identification Number: | 10.1109/ASRU57964.2023.10389747 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:206866 |