Griffis, D., Shivade, C., Fosler-Lussier, E. et al. (1 more author) (2016) A quantitative and qualitative evaluation of sentence boundary detection for the clinical domain. In: Proceedings of the 2016 Summit on Translational Bioinformatics. 2016 Summit on Translational Bioinformatics, 21-24 Mar 2016, San Francisco, CA, United States. AMIA Joint Summits on Translational Science Proceedings, 2016. American Medical Informatics Association, pp. 88-97.
Abstract
Sentence boundary detection (SBD) is a critical preprocessing task for many natural language processing (NLP) applications. However, there has been little work on evaluating how well existing methods for SBD perform in the clinical domain. We evaluate five popular off-the-shelf NLP toolkits on the task of SBD in various kinds of text using a diverse set of corpora, including the GENIA corpus of biomedical abstracts, a corpus of clinical notes used in the 2010 i2b2 shared task, and two general-domain corpora (the British National Corpus and Switchboard). We find that, with the exception of the cTAKES system, the toolkits we evaluate perform noticeably worse on clinical text than on general-domain text. We identify and discuss major classes of errors, and suggest directions for future work to improve SBD methods in the clinical domain. We also make the code used for SBD evaluation in this paper available for download at http://github.com/drgriffis/SBD-Evaluation.
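As an illustration of what scoring an SBD system can involve, the sketch below compares predicted sentence-boundary offsets against gold-standard offsets and reports precision, recall, and F1. It is a minimal sketch under stated assumptions, not the evaluation code from the repository linked above: the names `BoundaryScores` and `score_boundaries` are hypothetical, boundaries are assumed to be character offsets into the same text, and only exact offset matches count as correct.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class BoundaryScores:
    """Boundary-level scores for one system on one document (hypothetical type)."""
    precision: float
    recall: float
    f1: float


def score_boundaries(gold: Iterable[int], predicted: Iterable[int]) -> BoundaryScores:
    """Score predicted sentence-boundary offsets against gold offsets.

    Assumption: a predicted boundary is correct only if its character
    offset exactly matches a gold boundary offset.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)

    # Guard against empty inputs so the ratios are always defined.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return BoundaryScores(precision, recall, f1)


if __name__ == "__main__":
    # Toy offsets: the system finds two of three gold boundaries
    # and proposes one spurious boundary at offset 33.
    scores = score_boundaries(gold=[10, 25, 40], predicted=[10, 25, 33])
    print(scores)
```

Exact matching is the strictest choice; an evaluation could instead allow a small offset tolerance, which matters for clinical notes where sentences often end without punctuation.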
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Griffis, D., Shivade, C., Fosler-Lussier, E. et al. (1 more author) |
Copyright, Publisher and Additional Information: | © 2016 AMIA. |
Keywords: | Mental Health |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 17 Feb 2023 14:38 |
Last Modified: | 17 Feb 2023 14:50 |
Published Version: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC50017... |
Status: | Published |
Publisher: | American Medical Informatics Association |
Series Name: | AMIA Joint Summits on Translational Science Proceedings |
Refereed: | Yes |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:196482 |