Research evaluation with ChatGPT: is it age, country, length, or field biased?

Abstract

Some research now suggests that ChatGPT can estimate the quality of journal articles from their titles and abstracts. This has created the possibility of using ChatGPT quality scores, perhaps alongside citation-based formulae, to support peer review for research evaluation. Nevertheless, ChatGPT’s internal processes are effectively opaque, despite it writing a report to support its scores, and its biases are unknown. This article investigates whether publication date and field are biasing factors. Based on submitting a monodisciplinary journal-balanced set of 117,650 articles from 26 fields published in the years 2003, 2008, 2013, 2018 and 2023 to ChatGPT 4o-mini, the results show that average scores increased over time, and this was not due to author nationality or to changes in title and abstract length. The results also varied substantially between fields and between first author countries. In addition, articles with longer abstracts tended to receive higher scores, mostly because such articles tend to be better (e.g., more likely to be in higher impact journals) but also partly because ChatGPT has more text to analyse. For the most accurate research quality evaluation results from ChatGPT, it is important to normalise ChatGPT scores for field and year and to check for anomalies caused by sets of articles with short abstracts.
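
The abstract's closing recommendation (normalise scores for field and year, and check for short abstracts) could be sketched roughly as follows. This is an illustration only, not the authors' procedure: the abstract does not say which normalisation was used, so dividing each score by the mean of its field and year group is an assumption, as are the record keys (`field`, `year`, `score`, `abstract`), the function names, and the 100-word threshold.

```python
from collections import defaultdict
from statistics import mean


def normalise_scores(records):
    """Return copies of records with a 'normalised' key added.

    Assumption: normalisation means dividing each score by the mean
    score of all records sharing the same (field, year). Groups whose
    mean is zero get None rather than raising a division error.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["field"], rec["year"])].append(rec["score"])
    group_means = {key: mean(scores) for key, scores in groups.items()}

    normalised = []
    for rec in records:
        group_mean = group_means[(rec["field"], rec["year"])]
        value = rec["score"] / group_mean if group_mean else None
        normalised.append({**rec, "normalised": value})
    return normalised


def flag_short_abstracts(records, min_words=100):
    """Return the records whose abstract has fewer than min_words words.

    The default threshold is hypothetical; it is not taken from the article.
    """
    return [rec for rec in records if len(rec["abstract"].split()) < min_words]
```

Under this scheme a normalised value above 1 means an article scored above the average for its own field and year, which makes scores from different fields and years more comparable. Articles returned by `flag_short_abstracts` would be candidates for manual checking rather than automatic exclusion.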

Metadata

Item Type:	Article
Authors/Creators:	Thelwall, M. (https://orcid.org/0000-0001-6065-205X); Kurt, Z.
Copyright, Publisher and Additional Information:	© The Author(s) 2025. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Keywords:	ChatGPT; research impact; publication date; research excellence framework
Dates:	Accepted: 22 July 2025; Published (online): 8 August 2025; Published: October 2025
Institution:	The University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield)
Funding Information:	Funder: UK RESEARCH AND INNOVATION; Grant number: UKRI1079
Date Deposited:	31 Jul 2025 09:24
Last Modified:	26 Nov 2025 15:56
Status:	Published
Publisher:	Springer
Refereed:	Yes
Identification Number:	10.1007/s11192-025-05393-0
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:229531

Download

Published Version

Filename: s11192-025-05393-0.pdf

Licence: CC-BY 4.0
