Wang, Z orcid.org/0000-0001-6157-0662, Grewe, D and O’boyle, MFP (2015) Automatic and Portable Mapping of Data Parallel Programs to OpenCL for GPU-Based Heterogeneous Systems. ACM Transactions on Architecture and Code Optimization, 11 (4). 42. ISSN 1544-3566
Abstract
General-purpose GPU-based systems are highly attractive, as they give potentially massive performance at little cost. Realizing such potential is challenging due to the complexity of programming. This article presents a compiler-based approach to automatically generate optimized OpenCL code from data parallel OpenMP programs for GPUs. A key feature of our scheme is that it leverages existing transformations, especially data transformations, to improve performance on GPU architectures and uses automatic machine learning to build a predictive model to determine if it is worthwhile running the OpenCL code on the GPU or OpenMP code on the multicore host. We applied our approach to the entire NAS parallel benchmark suite and evaluated it on distinct GPU-based systems. We achieved average (up to) speedups of 4.51× and 4.20× (143× and 67×) on Core i7/NVIDIA GeForce GTX580 and Core i7/AMD Radeon 7970 platforms, respectively, over a sequential baseline. Our approach achieves, on average, greater than 10× speedups over two state-of-the-art automatic GPU code generators.
Metadata
Item Type: | Article |
---|---|
Authors/Creators: |
|
Copyright, Publisher and Additional Information: | © 2014 ACM. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ACM Transactions on Architecture and Code Optimization, Vol. 11, No. 4, Article 42, https://doi.org/10.1145/2677036. Uploaded in accordance with the publisher's self-archiving policy. |
Dates: |
|
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 09 Oct 2019 15:40 |
Last Modified: | 09 Oct 2019 15:40 |
Status: | Published |
Publisher: | Association for Computing Machinery |
Identification Number: | 10.1145/2677036 |
Related URLs: | |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:151217 |