Akinseloyin, O., Jiang, X. orcid.org/0000-0003-4255-5445 and Palade, V. orcid.org/0000-0002-6768-8394 (2024) A question-answering framework for automated abstract screening using large language models. Journal of the American Medical Informatics Association, 31 (9). pp. 1939-1952. ISSN 1067-5027
Abstract
Objective
This paper aims to address the challenges in abstract screening within systematic reviews (SR) by leveraging the zero-shot capabilities of large language models (LLMs).
Methods
We employ an LLM to prioritize candidate studies by aligning abstracts with the selection criteria outlined in an SR protocol. Abstract screening was transformed into a novel question-answering (QA) framework, treating each selection criterion as a question addressed by the LLM. The framework involves breaking down the selection criteria into multiple questions, prompting the LLM to answer each question, scoring and re-ranking each answer, and combining the responses to make nuanced inclusion or exclusion decisions.
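The screening loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `ask_llm` is a hypothetical stand-in for a real LLM call (eg, to GPT-3.5), stubbed here with crude keyword overlap so the example runs without an API, and the mean is only one possible way to combine per-criterion answers.

```python
def ask_llm(question: str, abstract: str) -> float:
    """Return a score in [0, 1] for how well the abstract satisfies the
    selection criterion posed as `question`.
    Stub: keyword overlap; a real system would prompt an LLM here."""
    q_terms = set(question.lower().split())
    a_terms = set(abstract.lower().split())
    return len(q_terms & a_terms) / max(len(q_terms), 1)

def screen(abstract: str, criteria: list[str]) -> float:
    """Pose each selection criterion as a question, score each answer,
    and combine the scores (here: mean) into one relevance score."""
    scores = [ask_llm(c, abstract) for c in criteria]
    return sum(scores) / len(scores)

# Rank candidate abstracts for prioritized screening (toy data).
criteria = [
    "Does the study report a randomized controlled trial?",
    "Does the study evaluate a diagnostic test?",
]
abstracts = {
    "A": "a randomized controlled trial of a diagnostic test in adults",
    "B": "a narrative review of clinical practice guidelines",
}
ranked = sorted(abstracts, key=lambda k: screen(abstracts[k], criteria),
                reverse=True)
```

Under this sketch, abstract "A" ranks ahead of "B" because it overlaps more criteria; in the paper's setting the ranking would instead reflect the LLM's zero-shot answers.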
Results and Discussion
Large-scale validation was performed on the benchmark of CLEF eHealth 2019 Task 2: Technology-Assisted Reviews in Empirical Medicine. Focusing on GPT-3.5 as a case study, the proposed QA framework consistently exhibited a clear advantage over traditional information retrieval approaches and bespoke BERT-family models fine-tuned for prioritizing candidate studies (ie, from BERT to PubMedBERT) across 31 datasets covering 4 categories of SRs, underscoring its high potential for facilitating abstract screening. The experiments also showcased the viability of using selection criteria as a query for reference prioritization, and of the framework with LLMs other than GPT-3.5.
Conclusion
The investigation justified the indispensable value of leveraging selection criteria to improve the performance of automated abstract screening. LLMs demonstrated proficiency in prioritizing candidate studies for abstract screening using the proposed QA framework. Significant performance improvements were obtained by re-ranking answers using the semantic alignment between abstracts and selection criteria, further highlighting the pertinence of utilizing selection criteria to enhance abstract screening.
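The re-ranking step mentioned above can be sketched as follows. The paper's actual semantic-alignment model is not specified here, so a toy bag-of-words cosine similarity stands in for whatever embedding model is used; all names are illustrative.

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def alignment(abstract: str, criteria: list[str]) -> float:
    """Semantic alignment of an abstract with the selection criteria:
    the best similarity against any single criterion (stub vectors;
    a real system would use dense embeddings)."""
    ab = Counter(abstract.lower().split())
    return max(cosine(ab, Counter(c.lower().split())) for c in criteria)
```

An answer-level score from the LLM could then be blended with this alignment score (eg, a weighted sum) before the final ranking; the specific combination used in the paper is not reproduced here.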
Metadata
| Item Type: | Article |
| --- | --- |
| Copyright, Publisher and Additional Information: | © The Author(s) 2024. Published by Oxford University Press on behalf of the American Medical Informatics Association. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
| Keywords: | abstract screening; automated systematic review; large language model; question answering; zero-shot re-ranking; Natural Language Processing; Abstracting and Indexing; Systematic Reviews as Topic; Humans; Information Storage and Retrieval |
| Institution: | The University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield) |
| Depositing User: | Symplectic Sheffield |
| Date Deposited: | 18 Oct 2024 11:38 |
| Last Modified: | 18 Oct 2024 11:38 |
| Status: | Published |
| Publisher: | Oxford University Press (OUP) |
| Refereed: | Yes |
| Identification Number: | 10.1093/jamia/ocae166 |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:218488 |