Alva-Manchego, F., Scarton, C. orcid.org/0000-0002-0103-4072 and Specia, L.
(2020)
Data-driven sentence simplification: Survey and benchmark.
Computational Linguistics, 46 (1).
pp. 135-187.
ISSN 0891-2017
Abstract
Sentence Simplification (SS) aims to modify a sentence in order to make it easier to read and understand. In order to do so, several rewriting transformations can be performed such as replacement, reordering, and splitting. Executing these transformations while keeping sentences grammatical, preserving their main idea, and generating simpler output, is a challenging and still far from solved problem. In this article, we survey research on SS, focusing on approaches that attempt to learn how to simplify using corpora of aligned original-simplified sentence pairs in English, which is the dominant paradigm nowadays. We also include a benchmark of different approaches on common datasets so as to compare them and highlight their strengths and limitations. We expect that this survey will serve as a starting point for researchers interested in the task and help spark new ideas for future developments.
Metadata
Item Type: | Article |
---|---|
Authors/Creators: |
|
Copyright, Publisher and Additional Information: | © 2020 Association for Computational Linguistics Published under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license, which permits you to copy and redistribute in any medium or format, for non-commercial use only, provided that the original work is not remixed, transfromed, or built upon, and the appropriate credit to the original source is given. For a full description of the license, please visit https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode. |
Dates: |
|
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 20 Mar 2020 13:34 |
Last Modified: | 06 Dec 2021 16:09 |
Status: | Published |
Publisher: | MIT Press |
Refereed: | Yes |
Identification Number: | 10.1162/coli_a_00370 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:158279 |