Jiang, X. orcid.org/0000-0003-4255-5445 and Chen, J. (2023) Contextualised segment-wise citation function classification. Scientometrics, 128 (9). pp. 5117-5158. ISSN 0138-9130
Abstract
Much effort has been devoted to citation function classification over the past decades, but noteworthy issues remain. Annotation difficulty has resulted in limited data size, especially for minority classes, and in inadequate representativeness of the underlying scientific domains. Concerning algorithmic classification, state-of-the-art deep learning-based methods are flawed in that they generate a feature vector for the whole citation context (or sentence) and fail to exploit the full range of citation modelling options. Responding to these issues, this paper studied contextualised citation function classification. Specifically, a large new citation context dataset was created by merging and re-annotating six computational linguistics datasets. A variety of strong SciBERT-based citation function classification models were proposed, and new state-of-the-art results were achieved. Through deeper performance analysis, this study focused on answering several research questions about effective ways of performing citation function classification. More specifically, the study justified the necessity of modelling in-text citations in context and confirmed the superiority of performing citation function classification at the citation (segment) level. Particular emphasis was placed on in-depth per-class performance analysis to understand whether citation function classification is robust enough to suit various popular downstream applications and what further efforts are required to meet such analytic needs. Finally, a naïve ensemble classifier was proposed, which greatly improved citation function classification performance.
Metadata
Item Type: | Article |
---|---|
Authors/Creators: | Jiang, X. (orcid.org/0000-0003-4255-5445); Chen, J. |
Copyright, Publisher and Additional Information: | © Akadémiai Kiadó, Budapest, Hungary 2023. This is an author-produced version of a paper subsequently published in Scientometrics. Uploaded in accordance with the publisher's self-archiving policy. |
Keywords: | Citation context analysis; Citation function classification; Deep learning; SciBERT; Ensemble |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 17 Jun 2025 07:51 |
Last Modified: | 17 Jun 2025 07:53 |
Status: | Published |
Publisher: | Springer Science and Business Media LLC |
Refereed: | Yes |
Identification Number: | 10.1007/s11192-023-04778-3 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:227924 |