Aslam, A. orcid.org/0000-0002-2654-4255, Walker, L., Abaho, M. et al. (16 more authors) (2025) An automation framework for clinical codelist development validated with UK data from patients with multiple long-term conditions. BMC Medical Research Methodology, 25 (1). 138. ISSN 1471-2288
Abstract
Background Codelists play a crucial role in ensuring accurate and standardized communication within healthcare. However, preparation of high-quality codelists is a rigorous and time-consuming process. The literature focuses on transparency of clinical codelists and overlooks the utility of automation.

Methods (Automated Framework Design and Use‑case: DynAIRx) Here we present a Codelist Generation Framework that can automate generation of codelists with minimal input from clinical experts. We demonstrate the process using a specific project, DynAIRx, producing appropriate codelists and a framework allowing future projects to take advantage of automated codelist generation. Both the framework and codelist are publicly available. DynAIRx is an NIHR-funded project aiming to develop AIs to help optimise prescribing of medicines in patients with multiple long-term conditions. DynAIRx requires complex codelists to describe the trajectory of each patient and the interaction between their conditions. We promptly generated ≈214 codelists for DynAIRx using the proposed framework and validated them with a panel of experts, significantly reducing the amount of time required by making effective use of automation.

Results The framework reduced the clinician time required to validate codes, automatically shrank codelists using trusted sources, and added new codes for review against existing codelists. In the DynAIRx case study, a codelist of ≈14000 codes ultimately required only 7-9 hours of clinician’s time (whereas existing methods take months), and application of the automation framework reduced the workload by >80%.

Conclusion This work examines current methodologies for codelist development and the challenges associated with ensuring transparency and reproducibility. A key benefit of this approach is its emphasis on automation and reliance on trusted sources, which significantly lowers the workload, minimizes human error, and saves substantial time, particularly the time needed from clinical experts.
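As a rough illustration of how reliance on trusted sources can cut the clinician's review workload, the following Python sketch splits a candidate codelist into codes confirmed by at least one trusted source and codes left for expert review. It is not the framework's actual implementation: the `triage_codes` function, the `Triage` type, the acceptance rule, and the placeholder codes are all assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Triage:
    """Result of splitting candidate codes (hypothetical structure)."""
    auto_accepted: frozenset[str]  # confirmed by at least one trusted source
    needs_review: frozenset[str]   # left for a clinician to check

    @property
    def workload_reduction(self) -> float:
        """Fraction of candidate codes that no longer need manual review."""
        total = len(self.auto_accepted) + len(self.needs_review)
        return len(self.auto_accepted) / total if total else 0.0


def triage_codes(candidates: Iterable[str],
                 trusted_sources: Iterable[Iterable[str]]) -> Triage:
    """Assumed rule: a candidate is accepted if any trusted source lists it."""
    candidate_set = frozenset(candidates)
    # Union of all trusted codelists; an empty collection yields an empty set.
    trusted = frozenset().union(*trusted_sources)
    accepted = candidate_set & trusted
    return Triage(auto_accepted=accepted,
                  needs_review=candidate_set - accepted)


# Placeholder identifiers, not real clinical codes.
candidates = ["A1", "A2", "A3", "A4", "A5"]
trusted_sources = [{"A1", "A2"}, {"A2", "A3", "A4", "Z9"}]

result = triage_codes(candidates, trusted_sources)
print(sorted(result.needs_review))
print(f"{result.workload_reduction:.0%} of codes need no manual review")
```

In this toy case only one of the five candidates is left for review. A real pipeline would also need rules for conflicting sources and for codes that are new relative to an existing codelist, which this sketch does not attempt.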
Metadata
Item Type: | Article |
---|---|
Copyright, Publisher and Additional Information: | © The Author(s) 2025. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
Keywords: | Codelist; Automation; Multiple long term conditions (MLTC); SNOMEDs; DynAIRx |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 27 May 2025 14:51 |
Last Modified: | 27 May 2025 14:51 |
Status: | Published |
Publisher: | Springer Science and Business Media LLC |
Refereed: | Yes |
Identification Number: | 10.1186/s12874-025-02541-1 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:227140 |