University of Southampton Institutional Repository

Large-scale performance of a DSL-based multi-block structured-mesh application for Direct Numerical Simulation


Mudalige, G.R., Reguly, I.Z., Jammy, S.P., Jacobs, C.T., Giles, M.B. and Sandham, N.D. (2019) Large-scale performance of a DSL-based multi-block structured-mesh application for Direct Numerical Simulation. Journal of Parallel and Distributed Computing, 131, 130-146. (doi:10.1016/j.jpdc.2019.04.019).

Record type: Article

Abstract

SBLI (Shock-wave/Boundary-layer Interaction) is a large-scale Computational Fluid Dynamics (CFD) application, developed over 20 years at the University of Southampton and extensively used within the UK Turbulence Consortium. It is capable of performing Direct Numerical Simulation (DNS) or Large Eddy Simulation (LES) of shock-wave/boundary-layer interaction problems over highly detailed multi-block structured mesh geometries. SBLI presents major challenges in data organization and movement that need to be overcome for continued high performance on emerging massively parallel hardware platforms. In this paper we present research in achieving this goal through the OPS embedded domain-specific language. OPS targets the domain of multi-block structured mesh applications. It provides an API embedded in C/C++ and Fortran and makes use of automatic code generation and compilation to produce executables capable of running on a range of parallel hardware systems. The core functionality of SBLI is captured using a new framework called OpenSBLI, which enables a developer to declare the partial differential equations using Einstein notation and then automatically carry out discretization and generation of OPS (C/C++) API code. OPS is then used to automatically generate a wide range of parallel implementations. Using this multi-layered abstraction approach we demonstrate how new opportunities for further optimizations can be gained, such as fine-tuning the computational intensity and reducing data movement, and how they can be applied automatically. Performance results demonstrate there is no performance loss due to the high-level development strategy with OPS and OpenSBLI, with performance matching or exceeding the hand-tuned original code on all CPU nodes tested. The data movement optimizations provide over 3× speedups on CPU nodes, while GPUs provide 5× speedups over the best performing CPU node. The OPS-generated parallel code also demonstrates excellent scalability on nearly 100K cores on a Cray XC30 (ARCHER at EPCC) and on over 4K GPUs on a Cray XK7 (Titan at ORNL).

Text
Large-scale Performance of a DSL-based MudaligeJPDS2019 - Accepted Manuscript
Restricted to Repository staff only until 6 May 2020.

More information

Accepted/In Press date: 21 April 2019
e-pub ahead of print date: 6 May 2019
Published date: September 2019

Identifiers

Local EPrints ID: 431598
URI: https://eprints.soton.ac.uk/id/eprint/431598
ISSN: 0743-7315
PURE UUID: f3594194-372c-4ce3-9171-3ddc51a867af
ORCID for S.P. Jammy: orcid.org/0000-0002-8099-8573
ORCID for N.D. Sandham: orcid.org/0000-0002-5107-0944

Catalogue record

Date deposited: 10 Jun 2019 16:30
Last modified: 03 Sep 2019 00:38
