Fomicheva, M., Specia, L. and Aletras, N. (orcid.org/0000-0003-4285-1965) (2021) Translation error detection as rationale extraction. arXiv. (Submitted)
Abstract
Recent Quality Estimation (QE) models based on multilingual pre-trained representations have achieved very competitive results when predicting the overall quality of translated sentences. Predicting translation errors, i.e. detecting specifically which words are incorrect, is a more challenging task, especially with limited amounts of training data. We hypothesize that, not unlike humans, successful QE models rely on translation errors to predict overall sentence quality. By exploring a set of feature attribution methods that assign relevance scores to the inputs to explain model predictions, we study the behaviour of state-of-the-art sentence-level QE models and show that explanations (i.e. rationales) extracted from these models can indeed be used to detect translation errors. We therefore (i) introduce a novel semi-supervised method for word-level QE and (ii) propose to use the QE task as a new benchmark for evaluating the plausibility of feature attribution, i.e. how interpretable model explanations are to humans.
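The core idea sketched in the abstract — extracting word-level error labels from the feature attributions of a sentence-level QE model — can be illustrated with a toy example. The snippet below is a minimal sketch, not the paper's models: it assumes a hypothetical linear quality regressor over token features, for which the input-times-gradient attribution of a token reduces to its weight times its occurrence, and flags tokens with strongly negative attributions as likely translation errors.

```python
# Hypothetical per-token weights of a linear sentence-level quality regressor
# (positive = token supports high quality, negative = token hurts quality).
# These values are illustrative, not learned from data.
WEIGHTS = {"the": 0.1, "cat": 0.2, "sat": 0.2, "chaise": -0.9, "on": 0.1}

def attribute(tokens):
    """Relevance score per token. For a linear model, the input-x-gradient
    attribution is simply weight * input (here: 1 per token occurrence)."""
    return [(tok, WEIGHTS.get(tok, 0.0)) for tok in tokens]

def extract_rationale(tokens, threshold=-0.5):
    """Word-level QE from sentence-level attributions: tokens whose
    attribution falls below the threshold are flagged as errors."""
    return [tok for tok, score in attribute(tokens) if score < threshold]

# The mistranslated token receives a strongly negative attribution
# and is extracted as the rationale for a low sentence-level score.
print(extract_rationale(["the", "cat", "sat", "on", "the", "chaise"]))
```

In the paper's setting the regressor is a multilingual pre-trained model rather than a linear one, so attributions come from generic feature attribution methods instead of this closed form; the thresholding step stands in for the semi-supervised word-level labelling described above.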
Metadata
Authors/Creators: Fomicheva, M.; Specia, L.; Aletras, N. (orcid.org/0000-0003-4285-1965)
Copyright, Publisher and Additional Information: © 2021 The Author(s). Preprint available under a Creative Commons Attribution Licence (http://creativecommons.org/licenses/by/4.0).
Keywords: cs.CL
Dates: Submitted: 2021
Institution: The University of Sheffield
Academic Units: The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield)
Funding Information:
Depositing User: Symplectic Sheffield
Date Deposited: 14 Sep 2021 15:40
Last Modified: 25 Nov 2022 11:03
Published Version: https://arxiv.org/abs/2108.12197
Status: Submitted
Related URLs: |