Fomicheva, M., Sun, S., Fonseca, E.R. et al. (6 more authors) (Submitted: 2020) MLQE-PE: a multilingual quality estimation and post-editing dataset. arXiv. (Submitted)
Abstract
We present MLQE-PE, a new dataset for Machine Translation (MT) Quality Estimation (QE) and Automatic Post-Editing (APE). The dataset contains seven language pairs, with human labels for 9,000 translations per language pair in the following formats: sentence-level direct assessments and post-editing effort, and word-level good/bad labels. It also contains the post-edited sentences, the titles of the articles from which the sentences were extracted, and the neural MT models used to translate the text.
Metadata
| Item Type: | Article |
|---|---|
| Authors/Creators: | |
| Copyright, Publisher and Additional Information: | © 2020 The Author(s). For reuse permissions, please contact the Author(s). |
| Dates: | |
| Institution: | The University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
| Funding Information: | Funder: European Commission - Horizon 2020; Grant number: 825303 |
| Depositing User: | Symplectic Sheffield |
| Date Deposited: | 02 Nov 2020 14:28 |
| Last Modified: | 02 Nov 2020 14:28 |
| Published Version: | https://arxiv.org/abs/2010.04480 |
| Status: | Submitted |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:167469 |

CORE (COnnecting REpositories)