Khallaf, N., Eugeni, C. orcid.org/0000-0002-4465-8897 and Sharoff, S. orcid.org/0000-0002-4877-0210 (2025) Reading Between the Lines: A dataset and a study on why some texts are tougher than others. In: Zock, M., Inui, K. and Zheng, Y., (eds.) Proceedings of the First Workshop on Writing Aids at the Crossroads of AI, Cognitive Science and NLP (WRAICOGS 2025). First Workshop on Writing Aids at the Crossroads of AI, Cognitive Science and NLP (WRAICOGS 2025), 20 Jan 2025, Abu Dhabi, UAE. International Committee on Computational Linguistics, pp. 24-34.
Abstract
Our research aims to better understand what makes a text difficult to read for specific audiences with intellectual disabilities, more specifically, people who have limitations in cognitive functioning, such as reading and understanding skills, an IQ below 70, and challenges in conceptual domains. We introduce a scheme for the annotation of difficulties which is based on empirical research in psychology as well as on research in translation studies. The paper describes the annotated dataset, primarily derived from the parallel texts (standard English and Easy to Read English translations) made available online. We fine-tuned four different pre-trained transformer models to perform the task of multiclass classification to predict the strategies required for simplification. We also investigate the possibility of interpreting the decisions of this language model when it is aimed at predicting the difficulty of sentences.
Metadata
| Item Type: | Proceedings Paper |
|---|---|
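| Authors/Creators: | Khallaf, N., Eugeni, C. (orcid.org/0000-0002-4465-8897) and Sharoff, S. (orcid.org/0000-0002-4877-0210) |
| Editors: | Zock, M., Inui, K. and Zheng, Y. |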
| Copyright, Publisher and Additional Information: | This item is protected by copyright. This is an open access conference paper under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits unrestricted use, distribution and reproduction in any medium, provided the original work is properly cited. |
| Dates: | |
| Institution: | The University of Leeds |
| Academic Units: | The University of Leeds > Faculty of Arts, Humanities and Cultures (Leeds) > School of Languages Cultures & Societies (Leeds) |
| Funding Information: | Funder: EU - European Union; Grant number: 10103529 |
| Date Deposited: | 25 Mar 2026 11:12 |
| Last Modified: | 25 Mar 2026 11:12 |
| Published Version: | https://aclanthology.org/2025.wraicogs-1.3/ |
| Status: | Published |
| Publisher: | International Committee on Computational Linguistics |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:239124 |
