Wang, H., Tang, Z., Tan, S.H. et al. (5 more authors) (2024) Combining Structured Static Code Information and Dynamic Symbolic Traces for Software Vulnerability Prediction. In: Proceedings of the 46th International Conference on Software Engineering. 2024 IEEE/ACM 46th International Conference on Software Engineering (ICSE), 14-20 Apr 2024, Lisbon, Portugal. ACM ISBN 979-8-4007-0217-4
Abstract
Deep learning (DL) has emerged as a viable means for identifying software bugs and vulnerabilities. The success of DL relies on having a suitable representation of the problem domain. However, existing DL-based solutions for learning program representations have limitations: they either cannot capture the deep, precise program semantics or suffer from poor scalability. We present Concoction, the first DL system to learn program representations by combining static source code information and dynamic program execution traces. Concoction employs unsupervised active learning techniques to determine a subset of important paths to collect dynamic symbolic execution traces. By implementing a focused symbolic execution solution, Concoction brings the benefits of static and dynamic code features while reducing the expensive symbolic execution overhead. We integrate Concoction with fuzzing techniques to detect function-level code vulnerabilities in C programs from 20 open-source projects. In 200 hours of automated concurrent test runs, Concoction has successfully uncovered vulnerabilities in all tested projects, identifying 54 unique vulnerabilities and yielding 37 new, unique CVE IDs. Concoction also significantly outperforms 16 prior methods by providing higher accuracy and lower false positive rates.
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Wang, H., Tang, Z., Tan, S.H. et al. (5 more authors) |
Copyright, Publisher and Additional Information: | © 2024 Copyright held by the owner/author(s). This work is licensed under a Creative Commons Attribution International 4.0 License. |
Keywords: | Software vulnerability detection, Deep learning, Symbolic execution |
Dates: | |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
Funding Information: | Funder: EPSRC (Engineering and Physical Sciences Research Council); Grant number: EP/X018202/1 |
Depositing User: | Symplectic Publications |
Date Deposited: | 22 Jan 2024 14:49 |
Last Modified: | 16 May 2024 12:40 |
Published Version: | https://dl.acm.org/doi/abs/10.1145/3597503.3639212 |
Status: | Published |
Publisher: | ACM |
Identification Number: | 10.1145/3597503.3639212 |
Related URLs: | |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:208077 |
Download
Filename: Combining Structured Static Code Information and Dynamic.pdf
Licence: CC-BY 4.0