Cheng, M., Zhang, L. orcid.org/0000-0003-0526-9677, Sun, B. et al. (1 more author) (2026) How do large language models and reference analysis differ in tracing disciplinary contributions to interdisciplinary fields? Scientometrics. ISSN: 0138-9130
Abstract
Interdisciplinary research is a key driver of scientific innovation, yet how knowledge from different disciplines integrates to address research problems remains understudied. Recent advances in large language models (LLMs) provide new tools for analyzing complex texts and extracting domain knowledge. Using bioinformatics as a case study, this paper explores the potential of LLM-based analysis of titles and abstracts to identify contributing disciplines and their functional roles within interdisciplinary research, and compares the results with those derived from the topics of references and enriched cited references (ECR) data. The findings suggest that the performance of LLMs in extracting the contributing disciplines and their roles can be enhanced through stepwise prompt optimization. Compared with the ECR-enhanced reference analysis, the LLM-based method focuses more directly on disciplines closely related to the core content of the study and suggests disciplinary contributions that the ECR-enhanced reference analysis may overlook. In the co-occurrence networks of disciplines, the ECR-enhanced reference analysis reports broader and more diverse integration pathways, while the LLM-based network exhibits a more centralized structure, highlighting the strong connections between core fields such as Biochemistry, Genetics & Molecular Biology and Computer Science. Additionally, the LLM-based method suggests more fine-grained contributions of different disciplines across key components of the research process, while the ECR-enhanced reference analysis tends to capture a broader range of disciplinary contexts involved in different sections of a study. Overall, the findings demonstrate the complementary strengths of LLMs and reference analysis in understanding the process of interdisciplinary knowledge integration. Future work could combine these two approaches to develop a more comprehensive methodological framework.
Metadata
| Item Type: | Article |
|---|---|
| Authors/Creators: | Cheng, M.; Zhang, L. (orcid.org/0000-0003-0526-9677); Sun, B.; et al. (1 more author) |
| Copyright, Publisher and Additional Information: | © 2026 The Authors. Except as otherwise noted, this author-accepted version of a journal article published in Scientometrics is made available via the University of Sheffield Research Publications and Copyright Policy under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution and reproduction in any medium, provided the original work is properly cited. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ |
| Keywords: | Interdisciplinary knowledge integration; Large language models; Reference analysis; Disciplinary contribution |
| Institution: | The University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Social Sciences (Sheffield) > School of Information, Journalism and Communication |
| Date Deposited: | 25 Feb 2026 10:39 |
| Last Modified: | 25 Feb 2026 10:39 |
| Status: | Published online |
| Publisher: | Springer Science and Business Media LLC |
| Refereed: | Yes |
| Identification Number: | 10.1007/s11192-026-05566-5 |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:238398 |
