Gao, J., Zhang, Z. and Gentile, A.L. The LODIE team (University of Sheffield) Participation at the TAC2015 Entity Discovery Task of the Cold Start KBP Track. In: Proceedings of the 2015 Text Analysis Conference. TAC Knowledge Base Population (KBP) 2015, 16-17 Nov 2015, Gaithersburg, Maryland USA. (Submitted)
Abstract
This paper describes the LODIE team (from the OAK lab of the University of Sheffield) participation at TAC-KBP 2015 for the Entity Discovery task in the Cold Start KBP track. We have taken a cross-document coreference resolution approach that starts with Named Entity Recognition to locate and classify mentions of named entities, followed by a clustering procedure that groups mentions referring to the same entity. Our primary interest was studying different features and their effect on the clustering process, as well as scalable methods to cope with very large data. We experimented with several feature combinations and conclude that the best results are obtained using features based on entity surface forms and distributed word embeddings. To cope with large-scale data, the clustering process takes a two-step approach that breaks the data into smaller batches. Our method on the 2015 evaluation dataset obtains a best CEAF mention F-measure of 63.21.
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Gao, J.; Zhang, Z.; Gentile, A.L. |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Funding Information: | INNOVATE UK (TSB): 101947 / 41205-293373; ENGINEERING AND PHYSICAL SCIENCE RESEARCH COUNCIL (EPSRC): EP/J019488/1 |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 06 Apr 2016 09:46 |
Last Modified: | 19 Dec 2022 13:33 |
Status: | Submitted |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:96380 |