Boito, M.Z., Villavicencio, A. orcid.org/0000-0002-3731-9168 and Besacier, L. (2021) Investigating alignment interpretability for low-resource NMT. Machine Translation, 34. pp. 305-323. ISSN 0922-6567
Abstract
The attention mechanism in Neural Machine Translation (NMT) models added flexibility to translation systems and made it possible to visualize soft-alignments between source and target representations. While there is much debate about the relationship between attention and model output (Jain and Wallace 2019; Serrano and Smith 2019; Wiegreffe and Pinter 2019; Vashishth et al. 2019), in this paper we propose a different assessment, investigating soft-alignment interpretability in low-resource scenarios. We experimented with different architectures (RNN (Bahdanau et al. 2015), 2D-CNN (Elbayad et al. 2018), and Transformer (Vaswani et al. 2017)), comparing them with regard to their ability to produce directly exploitable alignments. To evaluate exploitability, we replicated the Unsupervised Word Segmentation (UWS) task from Godard et al. (2018), in which source words are translated into unsegmented phone sequences. After training, the resulting soft-alignments are used to produce a segmentation over the target side. Our results showed that an RNN-based NMT model produced the most exploitable alignments in this scenario. We then investigated methods for increasing its UWS scores by comparing the following methodologies: monolingual pre-training, input representation augmentation (hybrid model), and explicit word length optimization during training. We reached the best results with the hybrid model, which uses an intermediate monolingual-rooted segmentation from a non-parametric Bayesian model (Goldwater 2007) to enrich the input representation before training.
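To illustrate the UWS step described above, the sketch below shows one simple way a soft-alignment (attention) matrix could be turned into a target-side segmentation: assign each phone to the source word it attends to most, and insert a word boundary wherever that assignment changes. This is a minimal illustrative sketch, not the authors' implementation; the function name and the toy attention matrix are assumptions for demonstration only.

```python
import numpy as np

def segment_from_attention(attention, phones):
    """Derive target-side word boundaries from a soft-alignment matrix.

    attention: (num_phones, num_source_words) array of attention weights.
    phones: list of phone symbols, one per attention row.
    A boundary is inserted whenever the most-attended source word changes.
    """
    alignments = attention.argmax(axis=1)  # best source word per phone
    segments, current = [], [phones[0]]
    for prev, cur, phone in zip(alignments, alignments[1:], phones[1:]):
        if cur != prev:                    # attended word changed: new segment
            segments.append("".join(current))
            current = [phone]
        else:
            current.append(phone)
    segments.append("".join(current))
    return segments

# toy example: 5 phones attending to 2 source words
att = np.array([
    [0.9, 0.1],
    [0.8, 0.2],
    [0.2, 0.8],
    [0.1, 0.9],
    [0.3, 0.7],
])
print(segment_from_attention(att, ["t", "a", "k", "o", "n"]))
# → ['ta', 'kon']
```

The segmentation quality of such a procedure depends directly on how sharp and monotone the attention distributions are, which is why the paper compares architectures on alignment exploitability.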
Metadata
Item Type: Article
Authors/Creators: Boito, M.Z.; Villavicencio, A.; Besacier, L.
Copyright, Publisher and Additional Information: © The Author(s), under exclusive licence to Springer Nature B.V. part of Springer Nature 2021. This is an author-produced version of a paper subsequently published in Machine Translation. Uploaded in accordance with the publisher's self-archiving policy.
Keywords: low-resource languages; attention mechanism; sequence-to-sequence models; unsupervised word segmentation; computational language documentation; neural machine translation
Dates: Published 2021
Institution: The University of Sheffield
Academic Units: The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield)
Funding Information: Engineering and Physical Sciences Research Council, grant number EP/T02450X/1
Depositing User: Symplectic Sheffield
Date Deposited: 19 Oct 2020 10:29
Last Modified: 06 Feb 2022 01:38
Status: Published
Publisher: Springer
Identification Number (DOI): 10.1007/s10590-020-09254-w
Refereed: Yes
Open Archives Initiative ID (OAI ID): oai:eprints.whiterose.ac.uk:166668