Fomicheva, M., Sun, S., Fonseca, E.R. et al. (6 more authors) (Submitted: 2020) MLQE-PE : a multilingual quality estimation and post-editing dataset. arXiv. (Submitted)
Abstract
We present MLQE-PE, a new dataset for Machine Translation (MT) Quality Estimation (QE) and Automatic Post-Editing (APE). The dataset contains seven language pairs, with human labels for 9,000 translations per language pair in the following formats: sentence-level direct assessments and post-editing effort, and word-level good/bad labels. It also contains the post-edited sentences, the titles of the articles from which the sentences were extracted, and the neural MT models used to translate the text.
Metadata
Field | Value
---|---
Copyright, Publisher and Additional Information: | © 2020 The Author(s). For reuse permissions, please contact the Author(s).
Institution: | The University of Sheffield
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield)
Depositing User: | Symplectic Sheffield
Date Deposited: | 02 Nov 2020 14:28
Last Modified: | 02 Nov 2020 14:28
Published Version: | https://arxiv.org/abs/2010.04480
Status: | Submitted