Huang, Z., Rong, W., Zhang, X. et al. (3 more authors) (2023) Token relation aware Chinese named entity recognition. ACM Transactions on Asian and Low-Resource Language Information Processing, 22 (1). pp. 1-21. ISSN 2375-4699
Abstract
Due to the lack of natural delimiters, most Chinese Named Entity Recognition (NER) approaches are character-based and utilize an external lexicon to leverage word-level information. Although they have achieved promising results, the latent words they introduce are still non-contextualized. In this paper, we investigate three relations, i.e., the adjacent relation between characters, the character co-occurrence relation between latent words, and the dependency relation among tokens, to address this issue. Specifically, we first establish the local context for latent words and then propose a masked self-attention mechanism to incorporate such local contextual information. In addition, since introducing external knowledge such as a lexicon and dependency relations inevitably brings in some noise, we propose a gated information controller to handle this problem. Extensive experimental results show that the proposed approach surpasses most similar methods on public datasets and demonstrates its promising potential.
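The abstract names two generic building blocks, masked self-attention and a gate on external information, without giving their exact formulation. The sketch below illustrates only the general techniques, not the authors' model: the function names `masked_attention` and `gated_fuse`, the boolean mask layout, and the use of a single scalar sigmoid gate are all assumptions made for illustration.

```python
import math


def masked_attention(queries, keys, values, mask):
    """Scaled dot-product attention with a boolean mask.

    mask[i][j] == False blocks key/value j for query i.
    Assumption: every query row allows at least one position.
    """
    d = len(keys[0])
    outputs = []
    for q, allowed in zip(queries, mask):
        # Blocked positions get a score of -inf, so their softmax weight is 0.
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d) if ok else float("-inf")
            for k, ok in zip(keys, allowed)
        ]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        outputs.append(
            [sum(w * v[c] for w, v in zip(weights, values)) for c in range(len(values[0]))]
        )
    return outputs


def gated_fuse(char_vec, external_vec, gate_weights, gate_bias):
    """Add external information to a character vector through a scalar gate.

    Hypothetical form: g = sigmoid(w . [char; external] + b), output = char + g * external.
    """
    z = sum(w * x for w, x in zip(gate_weights, char_vec + external_vec)) + gate_bias
    g = 1.0 / (1.0 + math.exp(-z))
    return [c + g * e for c, e in zip(char_vec, external_vec)]


# One query that may attend only to the first of two positions.
context = masked_attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[1.0, 2.0], [3.0, 4.0]],
    mask=[[True, False]],
)
fused = gated_fuse([1.0, 1.0], [2.0, 4.0], gate_weights=[0.0] * 4, gate_bias=0.0)
```

In this toy setup the mask restricts a token to its permitted context, and a gate value near zero would leave the character representation almost unchanged, which is one plausible way to suppress noisy external signals.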
Metadata
Item Type: | Article |
---|---|
Authors/Creators: | |
Copyright, Publisher and Additional Information: | © 2022 Association for Computing Machinery. |
Keywords: | Chinese NER; dependency relation; character co-occurrence; character adjacency; gated mechanism |
Dates: | |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 23 Nov 2022 16:40 |
Last Modified: | 28 Jun 2024 15:43 |
Status: | Published |
Publisher: | Association for Computing Machinery (ACM) |
Refereed: | Yes |
Identification Number: | 10.1145/3531534 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:193587 |