Zhang, Z., Petrak, J. orcid.org/0000-0001-8038-3096 and Maynard, D. orcid.org/0000-0002-1773-7020 (2018) Adapted TextRank for Term Extraction: a generic method of improving automatic term extraction algorithms. In: Procedia Computer Science. 14th International Conference on Semantic Systems, 10-13 Sep 2018, Vienna, Austria. Elsevier, pp. 102-108.
Abstract
Automatic Term Extraction (ATE) is a fundamental Natural Language Processing (NLP) task used in many knowledge acquisition processes. It is a challenging NLP task due to its high domain dependence: no existing method consistently outperforms the others in all domains, and good ATE remains very much an unsolved problem. We propose a generic method for improving the ranking of terms extracted by a potentially wide range of existing ATE methods. We re-design the well-known TextRank algorithm to work at corpus level, using easily obtainable domain resources in the form of seed words or phrases, to compute a score for each word in the target dataset. This score is used to refine a candidate term’s score computed by an existing ATE method, potentially improving the ranking of real terms to be selected for tasks such as ontology engineering. Evaluation shows consistent improvement on 10 state-of-the-art ATE methods, by up to 25 percentage points in average precision measured at the top-ranked K candidates.
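The idea outlined in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation: the co-occurrence window, the damping factor, the seed-only random jump (personalised PageRank), and the `refine` combination rule are all assumptions chosen for illustration, and every function name here is hypothetical.

```python
from collections import defaultdict


def build_cooccurrence_graph(sentences, window=3):
    """Undirected word graph over the whole corpus.

    Two words are linked if they appear in the same sentence with at most
    `window - 2` tokens between them (assumed windowing scheme).
    """
    graph = defaultdict(set)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            for other in tokens[i + 1 : i + window]:
                if other != word:
                    graph[word].add(other)
                    graph[other].add(word)
    return graph


def seeded_textrank(graph, seeds, damping=0.85, iterations=50):
    """PageRank-style scoring where the random jump lands only on seed words."""
    nodes = list(graph)
    seed_nodes = [n for n in nodes if n in seeds]
    if not seed_nodes:
        raise ValueError("no seed word occurs in the corpus graph")
    jump = {n: (1.0 / len(seed_nodes) if n in seeds else 0.0) for n in nodes}
    score = dict(jump)
    for _ in range(iterations):
        new_score = {}
        for n in nodes:
            # Each neighbour passes on an equal share of its current score.
            incoming = sum(score[m] / len(graph[m]) for m in graph[n])
            new_score[n] = (1 - damping) * jump[n] + damping * incoming
        score = new_score
    return score


def refine(base_scores, word_scores):
    """Assumed combination rule: scale a base ATE score by 1 + mean word score."""
    refined = {}
    for term, base in base_scores.items():
        words = term.split()
        mean = sum(word_scores.get(w, 0.0) for w in words) / len(words)
        refined[term] = base * (1.0 + mean)
    return refined


if __name__ == "__main__":
    corpus = [
        ["cell", "membrane", "protein"],
        ["protein", "binding", "site"],
        ["the", "weather", "report"],
    ]
    word_scores = seeded_textrank(build_cooccurrence_graph(corpus), {"protein"})
    base = {"protein binding": 1.0, "weather report": 1.0}  # made-up base ATE scores
    for term, value in sorted(refine(base, word_scores).items(), key=lambda kv: -kv[1]):
        print(term, round(value, 4))
```

In this toy corpus, words connected to the seed gain score while words in an unrelated part of the graph receive none, so two candidates with equal base scores are re-ranked in favour of the one closer to the seed vocabulary.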
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Zhang, Z.; Petrak, J. (ORCID: 0000-0001-8038-3096); Maynard, D. (ORCID: 0000-0002-1773-7020) |
Copyright, Publisher and Additional Information: | © 2018 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/) Peer-review under responsibility of the scientific committee of the SEMANTiCS 2018 – 14th International Conference on Semantic Systems. |
Keywords: | Automatic term extraction; NLP; terminology; ontology engineering |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Funding Information: | Funder: EUROPEAN COMMISSION - HORIZON 2020; Grant number: 726992 |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 13 Sep 2018 11:42 |
Last Modified: | 08 Mar 2019 11:40 |
Published Version: | https://doi.org/10.1016/j.procs.2018.09.010 |
Status: | Published |
Publisher: | Elsevier |
Refereed: | Yes |
Identification Number: | 10.1016/j.procs.2018.09.010 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:135565 |