Wang, H., Jacob, D., Kelly, D. et al. (3 more authors) (2025) SecureMind: A Framework for Benchmarking Large Language Models in Memory Bug Detection and Repair. In: ISMM '25: Proceedings of the 2025 ACM SIGPLAN International Symposium on Memory Management. 2025 ACM SIGPLAN International Symposium on Memory Management (ISMM 2025), 17 Jun 2025, Seoul, South Korea. Association for Computing Machinery, pp. 27-40. ISBN: 979-8-4007-1610-2/25/06
Abstract
Large language models (LLMs) hold great promise for automating software vulnerability detection and repair, but ensuring their correctness remains a challenge. While recent work has developed benchmarks for evaluating LLMs in bug detection and repair, existing studies rely on hand-crafted datasets that quickly become outdated. Moreover, systematic evaluation of advanced reasoning-based LLMs using chain-of-thought prompting for software security is lacking. We introduce SecureMind, an open-source framework for evaluating LLMs in vulnerability detection and repair, focusing on memory-related vulnerabilities. SecureMind provides a user-friendly Python interface for defining test plans, which automates data retrieval, preparation, and benchmarking across a wide range of metrics. Using SecureMind, we assess 10 representative LLMs, including 7 state-of-the-art reasoning models, on 16K test samples spanning 8 Common Weakness Enumeration (CWE) types related to memory safety violations. Our findings highlight the strengths and limitations of current LLMs in handling memory-related vulnerabilities.
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Wang, H., Jacob, D., Kelly, D. et al. (3 more authors) |
Copyright, Publisher and Additional Information: | © 2025 Copyright held by the owner/author(s). This is an open access conference paper under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits unrestricted use, distribution and reproduction in any medium, provided the original work is properly cited. |
Keywords: | Software bug detection, Bug repair, Large language models |
Dates: |
|
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
Funding Information: | EPSRC (Engineering and Physical Sciences Research Council), grant number EP/X018202/1; EPSRC (Engineering and Physical Sciences Research Council), grant number EP/X037304/1 |
Depositing User: | Symplectic Publications |
Date Deposited: | 16 May 2025 12:50 |
Last Modified: | 12 Aug 2025 10:25 |
Status: | Published |
Publisher: | Association for Computing Machinery |
Identification Number: | 10.1145/3735950.3735954 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:226674 |