Jiang, X. orcid.org/0000-0003-4255-5445, Khan, K. orcid.org/0009-0008-0588-1974, Vasantha, S.T. orcid.org/0009-0001-1935-5552 et al. (1 more author) (2025) Evidence extraction for automated medical coding: preliminary evaluation. In: NLPIR '24: Proceedings of the 2024 8th International Conference on Natural Language Processing and Information Retrieval. NLPIR 2024: 2024 8th International Conference on Natural Language Processing and Information Retrieval, 13-15 Dec 2024, Okayama University, Japan. Association for Computing Machinery (ACM), pp. 18-23. ISBN 9798400717383
Abstract
Coding clinical texts in a standard language such as ICD is an important but tedious and error-prone process. Automated medical coding algorithms struggle with the combined challenges of handling the significant length of clinical text, the complexity of the huge code hierarchy, and the lack of interpretability needed to ensure user trust. Recent studies have also shown that large language models (LLMs) struggle with this task. Recent efforts have been made to annotate an evidence-supported medical coding dataset. The current study makes the first empirical investigation into how well (small) fine-tuned pretrained language models (PLMs) and LLMs can identify the sentences containing medical evidence supporting the assigned codes. Hierarchical sequential sentence classification and GPT-3.5 in the zero-shot setting were tested for evidence sentence extraction. Additional evaluation was performed to investigate how evidence extraction impacts clinical coding and what implications it has for future generations of automated medical coding algorithms.
Metadata
Item Type: | Proceedings Paper |
---|---|
Copyright, Publisher and Additional Information: | © 2024 The Authors. Except as otherwise noted, this author-accepted version of a journal article published in NLPIR '24: Proceedings of the 2024 8th International Conference on Natural Language Processing and Information Retrieval is made available via the University of Sheffield Research Publications and Copyright Policy under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution and reproduction in any medium, provided the original work is properly cited. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ |
Keywords: | Information and Computing Sciences; Language, Communication and Culture; Linguistics; Clinical Research |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 17 Jun 2025 13:40 |
Last Modified: | 18 Jun 2025 02:58 |
Status: | Published |
Publisher: | Association for Computing Machinery (ACM) |
Refereed: | Yes |
Identification Number: | 10.1145/3711542.3711580 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:227917 |
Download
Filename: Evidence_Extraction_for_Automated_Medical_Coding__Preliminary_Evaluation.pdf
Licence: CC-BY 4.0