Thelwall, M. orcid.org/0000-0001-6065-205X (2025) Can smaller large language models evaluate research quality? Malaysian Journal of Library and Information Science, 30 (2). pp. 68-81. ISSN: 1394-6234
Abstract
Academic librarians often construct bibliometric indicators to support research evaluation. Traditionally, these have been citation-based, but AI alternatives have recently emerged. Although both Google Gemini (1.5 Flash) and ChatGPT (4o and 4o-mini) provide research quality evaluation scores that correlate positively with expert scores in nearly all fields, and more strongly than citations in most, it is not known whether this holds for smaller Large Language Models (LLMs). In response, this article assesses Google’s Gemma-3-27b-it, a downloadable LLM (60 GB). Results for 104,187 articles show that Gemma-3-27b-it scores correlate positively with an expert research quality score proxy for all 34 Units of Assessment (broad fields) from the UK Research Excellence Framework 2021. The Gemma-3-27b-it correlations have 83.8% of the strength of ChatGPT 4o and 94.7% of the strength of ChatGPT 4o-mini correlations. Unlike the two larger LLMs, the Gemma-3-27b-it correlations do not increase substantially when scores are averaged across five repetitions, its scores tend to be lower, and its reports are relatively uniform in style. Overall, the results show that research quality score estimation can be conducted by offline LLMs, so this capability is not an emergent property of only the largest LLMs. Moreover, score improvement through repetition is not a universal feature of LLMs. In conclusion, although the largest LLMs still have the highest research evaluation score estimation capability, smaller ones can also be used for this task, which can be helpful for cost saving or when secure offline processing is required.
Metadata
| Item Type: | Article |
|---|---|
| Authors/Creators: | Thelwall, M. (orcid.org/0000-0001-6065-205X) |
| Copyright, Publisher and Additional Information: | © Malaysian Journal of Library & Information Science 2025. |
| Dates: | Published: 2025 |
| Institution: | The University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield); The University of Sheffield > Faculty of Social Sciences (Sheffield) > Department of Journalism Studies (Sheffield) |
| Funding Information: | UK Research and Innovation (grant number UKRI1079) |
| Date Deposited: | 20 Nov 2025 15:31 |
| Last Modified: | 20 Nov 2025 15:44 |
| Published Version: | https://mjlis.um.edu.my/index.php/MJLIS/article/vi... |
| Status: | Published |
| Publisher: | Master of Library and Information Science Program, University of Malaya |
| Refereed: | Yes |
| Identification Number: | 10.22452/mjlis.vol30no2.4 |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:234732 |
Download
| Filename: | ChatGPT_how_varied2a_r1_preprint.pdf |
CORE (COnnecting REpositories)