Alharbi, E. and Stevenson, M. orcid.org/0000-0002-9483-6006 (Accepted: 2026) Can NLP models detect when one publication outweighs twenty? Predicting systematic review conclusion changes. In: Proceedings of the 25th Workshop on Biomedical Language Processing. 25th Workshop on Biomedical Language Processing (BioNLP 2026) collocated with 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), 03-04 Jul 2026, San Diego, California, USA. Association for Computational Linguistics (ACL). (In Press)
Abstract
Systematic reviews underpin evidence-based medicine but can become outdated quickly when new evidence appears. We formulate a novel prediction task: given a review and the new studies that have appeared since its publication, predict whether the review's conclusions will change. We construct a dataset of 3,326 Cochrane review-update pairs and explore a range of approaches, including feature-based baselines, zero- and few-shot LLMs, and parameter-efficient fine-tuning. Fine-tuning Qwen2.5-14B achieves the highest AUC-ROC (70.4%).
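To make the reported metric concrete, the sketch below computes AUC-ROC for a binary "conclusions changed" classifier using the pairwise-ranking definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. It is an illustration only, not the authors' evaluation code. The function name `auc_roc` and the toy labels and scores are assumptions invented for this example and are not drawn from the paper's dataset.

```python
from typing import Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC-ROC needs at least one example of each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (not from the paper):
# label 1 = the update changed the review's conclusions, 0 = it did not.
toy_labels = [1, 0, 1, 0, 0, 1]
# Hypothetical model scores: predicted probability of a conclusion change.
toy_scores = [0.9, 0.2, 0.4, 0.5, 0.1, 0.7]

toy_auc = auc_roc(toy_labels, toy_scores)
print(f"toy AUC-ROC: {toy_auc:.3f}")
```

On this scale, 0.5 corresponds to random ranking and 1.0 to perfect ranking of changed versus unchanged reviews. The quadratic pairwise loop is chosen for readability; a sort-based computation would be preferable for a dataset of several thousand pairs.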
Metadata
| Item Type: | Proceedings Paper |
|---|---|
| Authors/Creators: | Alharbi, E.; Stevenson, M. (ORCID: 0000-0002-9483-6006) |
| Copyright, Publisher and Additional Information: | © 2026 The Association for Computational Linguistics. |
| Dates: | Accepted: 2026 |
| Institution: | The University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
| Date Deposited: | 20 May 2026 13:37 |
| Last Modified: | 20 May 2026 13:37 |
| Status: | In Press |
| Publisher: | Association for Computational Linguistics (ACL) |
| Refereed: | Yes |
| Related URLs: | |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:241278 |
Download
Filename: BioNLP2026.pdf
