Zhao, Y, Tang, Z, Ye, G et al. (4 more authors) (2020) Semantics-aware obfuscation scheme prediction for binary. Computers & Security, 99. 102072. ISSN 0167-4048
Abstract
By restoring a program to a more easily understood form, deobfuscation is an important technique for detecting and analyzing malicious software. To enable deobfuscation, one must know whether the target program is obfuscated and which types of obfuscation schemes may have been used. However, obtaining such information is challenging without access to the original program source code.
This paper presents a new way to estimate the obfuscation scheme of a compiled binary. It uses semantic information from the disassembled binary to predict whether the program has been obfuscated and, if so, what type of obfuscation scheme may have been used. At the core of our approach is a set of deep neural networks that can effectively characterize and leverage the contextual information available in the assembly code. Our models are first trained offline, and the learned models can then be applied to new, previously unseen obfuscated binaries. We evaluate our approach by applying it to a large dataset of over 277,000 obfuscated samples with different individual obfuscation schemes and their combinations. Experimental results show that our approach is highly effective in identifying the obfuscation scheme, with a prediction accuracy of at least 83% (up to 98%).
Metadata
Item Type: | Article |
---|---|
Copyright, Publisher and Additional Information: | © 2020 Elsevier Ltd. All rights reserved. This is an author produced version of an article published in Computers & Security. Uploaded in accordance with the publisher's self-archiving policy. |
Keywords: | Deobfuscation; Reverse engineering; Deep neural networks; Disassembled binary analysis; Semantic expression |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 16 Oct 2020 10:42 |
Last Modified: | 03 Oct 2021 00:38 |
Status: | Published |
Publisher: | Elsevier |
Identification Number: | 10.1016/j.cose.2020.102072 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:166686 |