Wang, H., Jacob, D., Kelly, D. et al. (3 more authors) (2025) SecureMind: A Framework for Benchmarking Large Language Models in Memory Bug Detection and Repair. In: ISMM '25: Proceedings of the 2025 ACM SIGPLAN International Symposium on Memory Management. 2025 ACM SIGPLAN International Symposium on Memory Management (ISMM 2025), 17 Jun 2025, Seoul, South Korea. Association for Computing Machinery, pp. 27-40. ISBN: 979-8-4007-1610-2/25/06.
Abstract
Large language models (LLMs) hold great promise for automating software vulnerability detection and repair, but ensuring their correctness remains a challenge. While recent work has developed benchmarks for evaluating LLMs in bug detection and repair, existing studies rely on hand-crafted datasets that quickly become outdated. Moreover, systematic evaluation of advanced reasoning-based LLMs using chain-of-thought prompting for software security is lacking. We introduce SecureMind, an open-source framework for evaluating LLMs in vulnerability detection and repair, focusing on memory-related vulnerabilities. SecureMind provides a user-friendly Python interface for defining test plans, which automates data retrieval, preparation, and benchmarking across a wide range of metrics. Using SecureMind, we assess 10 representative LLMs, including 7 state-of-the-art reasoning models, on 16K test samples spanning 8 Common Weakness Enumeration (CWE) types related to memory safety violations. Our findings highlight the strengths and limitations of current LLMs in handling memory-related vulnerabilities.
Metadata
| Item Type: | Proceedings Paper |
|---|---|
| Authors/Creators: | Wang, H., Jacob, D., Kelly, D. et al. (3 more authors) |
| Copyright, Publisher and Additional Information: | © 2025 Copyright held by the owner/author(s). This is an open access conference paper under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits unrestricted use, distribution and reproduction in any medium, provided the original work is properly cited. |
| Keywords: | Software bug detection, Bug repair, Large language models |
| Dates: | |
| Institution: | The University of Leeds |
| Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
| Funding Information: | EPSRC (Engineering and Physical Sciences Research Council), grant number EP/X018202/1; EPSRC (Engineering and Physical Sciences Research Council), grant number EP/X037304/1 |
| Date Deposited: | 16 May 2025 12:50 |
| Last Modified: | 12 Aug 2025 10:25 |
| Status: | Published |
| Publisher: | Association for Computing Machinery |
| Identification Number: | 10.1145/3735950.3735954 |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:226674 |
Download
Filename: 3735950.3735954.pdf
Licence: CC-BY 4.0