Empirical evaluation of sequence-to-sequence models for word discovery in low-resource settings

Boito, M.Z., Villavicencio, A. orcid.org/0000-0002-3731-9168 and Besacier, L. (2019) Empirical evaluation of sequence-to-sequence models for word discovery in low-resource settings. In: Kubin, G. and Kačič, Z. (eds.) Interspeech 2019 - Proceedings of the Annual Conference of the International Speech Communication Association. Interspeech 2019, 15-19 Sep 2019, Graz, Austria. Interspeech Proceedings. International Speech Communication Association (ISCA), pp. 2688-2692.

Abstract

Since Bahdanau et al. [1] first introduced attention for neural machine translation, most sequence-to-sequence models have made use of attention mechanisms [2, 3, 4]. While they produce soft-alignment matrices that can be interpreted as alignments between target and source languages, we lack metrics to quantify their quality, so it remains unclear which approach produces the best alignments. This paper presents an empirical evaluation of three of the main sequence-to-sequence models for word discovery from unsegmented phoneme sequences: CNN-, RNN- and Transformer-based. This task consists of aligning word sequences in a source language with phoneme sequences in a target language, and inferring from this alignment a word segmentation on the target side [5]. Evaluating word segmentation quality can be seen as an extrinsic evaluation of the soft-alignment matrices produced during training. Our experiments in a low-resource scenario on Mboshi and English (both aligned to French) show that RNNs surprisingly outperform CNNs and Transformers for this task. Our results are confirmed by an intrinsic evaluation of alignment quality through the use of Average Normalized Entropy (ANE). Lastly, we improve our best word discovery model by using an alignment entropy confidence measure that accumulates ANE over all the occurrences of a given alignment pair in the collection.
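
The abstract does not spell out how segmentation is read off a soft-alignment matrix or how ANE is computed, so the following is only a minimal sketch under stated assumptions. It assumes each row of the matrix is the attention distribution of one target phoneme over the source words, that a word boundary is placed wherever two consecutive phonemes attend most strongly to different source words, and that ANE is the per-row entropy divided by the log of the source length, averaged over rows. The function names and the toy matrix are hypothetical and are not taken from the paper.

```python
import math


def normalized_entropy(row):
    """Entropy of one attention distribution, scaled to [0, 1].

    Assumption: normalization divides by log(len(row)), the maximum
    entropy of a distribution over len(row) source words.
    """
    n = len(row)
    if n < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in row if p > 0.0)
    return h / math.log(n)


def average_normalized_entropy(matrix):
    """Mean normalized entropy over all target positions (rows)."""
    return sum(normalized_entropy(r) for r in matrix) / len(matrix)


def segment(phonemes, matrix):
    """Group phonemes into words using the argmax source word per row.

    Assumption: a boundary is inserted whenever two consecutive
    phonemes have different most-attended source words.
    """
    best = [max(range(len(r)), key=r.__getitem__) for r in matrix]
    words, current = [], [phonemes[0]]
    for ph, prev, cur in zip(phonemes[1:], best, best[1:]):
        if cur != prev:
            words.append("".join(current))
            current = []
        current.append(ph)
    words.append("".join(current))
    return words


# Toy example: four target phonemes attending over two source words.
toy_phonemes = ["a", "b", "c", "d"]
toy_matrix = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
]
toy_words = segment(toy_phonemes, toy_matrix)
toy_ane = average_normalized_entropy(toy_matrix)
```

Under these assumptions, a lower ANE indicates sharper (more confident) alignments, while a value close to 1 indicates attention spread almost uniformly over the source words.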

Metadata

Item Type:	Proceedings Paper
Authors/Creators:	Boito, M.Z.; Villavicencio, A. (https://orcid.org/0000-0002-3731-9168); Besacier, L.
Editors:	Kubin, G.; Kačič, Z.
Copyright, Publisher and Additional Information:	© 2019 International Speech Communication Association. Reproduced in accordance with the publisher's self-archiving policy.
Keywords:	sequence-to-sequence models; soft-alignment matrices; word discovery; low-resource languages; computational language documentation
Dates:	Published (online): 15 September 2019; Published: 15 September 2019
Institution:	The University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield)
Depositing User:	Symplectic Sheffield
Date Deposited:	04 Feb 2020 14:50
Last Modified:	04 Feb 2020 14:50
Published Version:	https://www.isca-speech.org/archive/Interspeech_20...
Status:	Published
Publisher:	International Speech Communication Association (ISCA)
Series Name:	Interspeech Proceedings
Refereed:	Yes
Identification Number:	10.21437/Interspeech.2019-2029
Related URLs:	Publisher Conference
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:155716
