Saxena, G, Jimack, PK and Walkley, MA orcid.org/0000-0003-2541-4173 (2018) A quasi‐cache‐aware model for optimal domain partitioning in parallel geometric multigrid. Concurrency and Computation: Practice and Experience, 30 (9). e4328. ISSN 1532-0626
Abstract
Stencil computations form the heart of numerical simulations to solve Partial Differential Equations using Finite Difference, Finite Element, and Finite Volume methods. Geometric Multigrid is an optimal O(N), hierarchical tool employing stencil computations in its chief constituents, namely, smoothing, restriction, and interpolation. When Multigrid is parallelized over distributed‐shared memory architectures, traditionally, the domain partitioning creates cubic partitions of the mesh to minimize overall communication. Thus, the orthodox approach considers only load‐balancing and communication minimization for completely determining the domain partitioning. In this article, we show that these two factors are not sufficient to obtain optimal partitions for Parallel Geometric Multigrid. To this effect, we develop and validate a high level analytical model to show that “close to 2‐D” partitions for Geometric Multigrid can give higher performance than the partitions returned by the MPI_Dims_create() function which minimizes the communication volume by default. We quantify sub‐domain level cache‐misses in Parallel Geometric Multigrid and obtain families of optimal domain partitions. We conclude that the sub‐domain level cache‐misses for the application‐specific stencil computational kernel and communicated planes should be taken into account in addition to communication minimization/load‐balance to obtain optimal partitions for Parallel Geometric Multigrid.
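As a rough illustration of the communication-minimizing baseline mentioned in the abstract, the sketch below enumerates the 3-D process grids for `p` processes and counts the halo cells one interior sub-domain exchanges on an `n^3` mesh. It is not the authors' model: the function names, the one-cell halo, the interior sub-domain, and the requirement that `n` divide evenly are all illustrative assumptions, and cache behaviour is deliberately left out.

```python
def process_grids(p):
    """Return all ordered triples (px, py, pz) with px * py * pz == p."""
    grids = []
    for px in range(1, p + 1):
        if p % px:
            continue
        q = p // px
        for py in range(1, q + 1):
            if q % py == 0:
                grids.append((px, py, q // py))
    return grids


def halo_cells(n, grid):
    """Cells exchanged per sweep by one interior sub-domain.

    Assumptions (hypothetical, for illustration only): an n^3 mesh,
    a one-cell-deep halo, and n divisible by every grid dimension.
    A dimension with a single process has no neighbours, so it
    contributes no communicated planes.
    """
    px, py, pz = grid
    sx, sy, sz = n // px, n // py, n // pz
    total = 0
    if px > 1:
        total += 2 * sy * sz  # two planes normal to x
    if py > 1:
        total += 2 * sx * sz  # two planes normal to y
    if pz > 1:
        total += 2 * sx * sy  # two planes normal to z
    return total


def min_halo_grid(n, p):
    """Process grid that minimizes the halo count alone."""
    return min(process_grids(p), key=lambda g: halo_cells(n, g))
```

Under these assumptions, with `n = 512` and `p = 64`, the cubic grid `(4, 4, 4)` exchanges fewer cells than a 2-D grid such as `(1, 8, 8)`. The abstract's point is that this count alone does not determine performance: sub-domain level cache misses in the stencil kernel and in the communicated planes also need to be considered.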
Metadata
Item Type: | Article |
---|---|
Authors/Creators: | Saxena, G; Jimack, PK; Walkley, MA (orcid.org/0000-0003-2541-4173) |
Copyright, Publisher and Additional Information: | © 2017 John Wiley & Sons, Ltd. This is the peer reviewed version of the following article: Saxena G, Jimack PK, Walkley MA. A quasi‐cache‐aware model for optimal domain partitioning in parallel geometric multigrid. Concurrency Computat Pract Exper. 2018;30:e4328. https://doi.org/10.1002/cpe.4328, which has been published in final form at https://doi.org/10.1002/cpe.4328. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving. Uploaded in accordance with the publisher's self-archiving policy. |
Keywords: | cache misses; domain partitioning; geometric multigrid; quasi‐cache‐aware; stencil; topology |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 21 Aug 2017 10:09 |
Last Modified: | 09 Oct 2018 00:38 |
Status: | Published |
Publisher: | Wiley |
Identification Number: | 10.1002/cpe.4328 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:120330 |