Derczynski, L. and Bontcheva, K. (2015) Efficient named entity annotation through pre-empting. In: International Conference Recent Advances in Natural Language Processing, RANLP. Recent Advances in Natural Language Processing, 05-11 Sep 2015, Hissar, Bulgaria. Association for Computational Linguistics, pp. 123-130.
Abstract
Linguistic annotation is time-consuming and expensive. One common annotation task is to mark entities - such as names of people, places and organisations - in text. In a document, many segments of text contain no entities at all. We show that these segments are worth skipping, and demonstrate a technique for reducing the amount of entity-less text examined by annotators, which we call "preempting". This technique is evaluated in a crowdsourcing scenario, where it provides downstream performance improvements for a corpus of the same size.
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Derczynski, L.; Bontcheva, K. |
Copyright, Publisher and Additional Information: | Article licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License (https://creativecommons.org/licenses/by-nc-sa/3.0/). Permission is granted to make copies for the purposes of teaching and research. |
Dates: | |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 03 Feb 2016 16:10 |
Last Modified: | 19 Dec 2022 13:32 |
Published Version: | http://anthology.aclweb.org/R/R15/R15-1018.pdf |
Status: | Published |
Publisher: | Association for Computational Linguistics |
Refereed: | Yes |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:94051 |