Ye, G, Tang, Z, Wang, H et al. (4 more authors) (2020) Deep Program Structure Modeling Through Multi-Relational Graph-based Learning. In: PACT '20: Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques. PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 03-07 Oct 2020, Virtual Event, USA. Association for Computing Machinery, pp. 111-123. ISBN 978-1-4503-8075-1
Abstract
Deep learning is emerging as a promising technique for building predictive models to support code-related tasks such as performance optimization and code vulnerability detection. One of the critical aspects of building a successful predictive model is having the right representation to characterize the model input for the given task. Existing approaches in the area typically treat the program structure as a sequence and so fail to capitalize on the rich semantics of data and control flow information, for which graphs are a proven representation structure.
We present POEM, a novel framework that automatically learns useful code representations from graph-based program structures. At the core of POEM is a graph neural network (GNN) that is specially designed to capture the syntactic and semantic information from the program abstract syntax tree and the control and data flow graph. As a departure from existing GNN-based code modeling techniques, our network simultaneously learns over multiple relations of a program graph. This capability enables the learning framework to distinguish and reason about the diverse code relationships, be it data flow, control flow, or any other relationship that may be important for the downstream processing task.
We apply POEM to four representative tasks that require a strong ability to reason about the program structure: heterogeneous device mapping, parallel thread coarsening, loop vectorization and code vulnerability detection. We evaluate POEM on programs written in OpenCL, C, Java and Swift, and compare it against nine learning-based methods. Experimental results show that POEM consistently outperforms all competing methods across evaluation settings.
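To make the idea of learning over multiple relations of a program graph more concrete, the following is a minimal sketch of one round of relation-aware message passing. It is not POEM's architecture: the edge labels (`ast`, `control`, `data`), the function name `relational_step`, and the use of a single scalar weight per relation (standing in for a learned weight matrix) are all assumptions made for illustration.

```python
from collections import defaultdict


def relational_step(features, edges, weights):
    """One round of message passing with a separate weight per relation.

    features: dict mapping node -> list of floats (node state)
    edges:    list of (src, dst, relation) triples
    weights:  dict mapping relation -> float; a hypothetical stand-in
              for the per-relation weight matrix a real GNN would learn
    """
    dim = len(next(iter(features.values())))

    # Sum incoming neighbour states separately for each (node, relation).
    sums = defaultdict(lambda: [0.0] * dim)
    counts = defaultdict(int)
    for src, dst, rel in edges:
        acc = sums[(dst, rel)]
        for i, value in enumerate(features[src]):
            acc[i] += value
        counts[(dst, rel)] += 1

    updated = {}
    for node, own in features.items():
        new = list(own)  # self-connection: start from the node's own state
        for rel, w in weights.items():
            n = counts.get((node, rel), 0)
            if n == 0:
                continue
            # Mean over neighbours of this relation, scaled by its weight.
            for i in range(dim):
                new[i] += w * sums[(node, rel)][i] / n
        updated[node] = [max(0.0, v) for v in new]  # ReLU non-linearity
    return updated


# Toy program graph with three nodes and three typed edges (hypothetical).
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
weights = {"ast": 0.5, "control": 1.0, "data": 2.0}
edges = [("a", "c", "data"), ("b", "c", "control"), ("a", "b", "ast")]

out = relational_step(features, edges, weights)
print(out["c"])  # [3.0, 2.0]

# Same topology, but every edge collapsed into a single relation.
merged_edges = [(s, d, "data") for s, d, _ in edges]
merged = relational_step(features, merged_edges, weights)
print(merged["c"])  # [2.0, 2.0]
```

The two runs share the same graph topology and differ only in edge labels, yet node `c` ends up with a different state. That is the property the abstract highlights: a network that keeps relations separate can tell a data dependence from a control dependence, while one that merges them cannot.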
Metadata
| Field | Value |
|---|---|
| Item Type | Proceedings Paper |
| Authors/Creators | |
| Copyright, Publisher and Additional Information | © 2020 Association for Computing Machinery. This is an author produced version of a paper published in PACT '20: Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques. Uploaded in accordance with the publisher's self-archiving policy. |
| Keywords | Program Modeling, Code Optimization, Machine Learning |
| Dates | |
| Institution | The University of Leeds |
| Academic Units | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
| Funding Information | Funder: Royal Society; Grant number: IEC\NSFC\191465 |
| Depositing User | Symplectic Publications |
| Date Deposited | 20 Aug 2020 11:44 |
| Last Modified | 26 Oct 2020 16:04 |
| Status | Published |
| Publisher | Association for Computing Machinery |
| Identification Number | 10.1145/3410463.3414670 |
| Open Archives Initiative ID (OAI ID) | oai:eprints.whiterose.ac.uk:164551 |