Paul, S., Majumdar, S. orcid.org/0000-0003-3935-4087, Bandyopadhyay, A. et al. (5 more authors) (2024) Efficiency of Large Language Models to scale up Ground Truth: Overview of the IRSE Track at Forum for Information Retrieval 2023. In: The 15th Annual Meeting of the Forum for Information Retrieval Evaluation, 15-18 Dec 2023, Panjim, India.
Abstract
The Software Engineering Information Retrieval (IRSE) track aims to devise solutions for the automated evaluation of code comments within a machine learning framework, with labels generated by both humans and large language models. Within this track, there is a binary classification task: discerning comments as either useful or not useful. The dataset includes 9,048 pairs of code comments and surrounding code snippets drawn from open-source C-based projects on GitHub and an additional dataset generated by teams employing large language models. In total, 17 teams representing various universities and software companies have contributed 56 experiments. These experiments were assessed through quantitative metrics, primarily the F1-Score, and qualitative evaluations based on the features developed, the supervised learning models employed, and their respective hyperparameters. It is worth noting that labels generated by large language models introduce bias into the prediction model but lead to less over-fitted results.
Metadata
| Item Type: | Conference or Workshop Item |
|---|---|
| Authors/Creators: | |
| Dates: | |
| Institution: | The University of Leeds |
| Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
| Date Deposited: | 06 Feb 2026 11:36 |
| Last Modified: | 06 Feb 2026 11:39 |
| Published Version: | https://dl.acm.org/doi/10.1145/3632754.3633480 |
| Status: | Published |
| Publisher: | Association for Computing Machinery (ACM) |
| Identification Number: | 10.1145/3632754.3633480 |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:237532 |

CORE (COnnecting REpositories)