The University of Southampton
University of Southampton Institutional Repository

Porting ONETEP to graphical processing unit-based coprocessors. 1. FFT box operations

Wilkinson, Karl and Skylaris, Chris-Kriton (2013) Porting ONETEP to graphical processing unit-based coprocessors. 1. FFT box operations. Journal of Computational Chemistry, 34 (28), 2446-2459. (doi:10.1002/jcc.23410).

Record type: Article

Abstract

We present the first graphical processing unit (GPU) coprocessor-enabled version of the Order-N Electronic Total Energy Package (ONETEP) code for linear-scaling first principles quantum mechanical calculations on materials. This work focuses on porting to the GPU the parts of the code that involve atom-localized fast Fourier transform (FFT) operations. These are among the most computationally intensive parts of the code and are used in core algorithms such as the calculation of the charge density, the local potential integrals, the kinetic energy integrals, and the nonorthogonal generalized Wannier function gradient. We have found that direct porting of the isolated FFT operations did not provide any benefit. Instead, it was necessary to tailor the port to each of the aforementioned algorithms to optimize data transfer to and from the GPU. A detailed discussion of the methods used and tests of the resulting performance are presented, which show that individual steps in the relevant algorithms are accelerated by a significant amount. However, the transfer of data between the GPU and host machine is a significant bottleneck in the reported version of the code. In addition, an initial investigation into a dynamic precision scheme for the ONETEP energy calculation has been performed to take advantage of the enhanced single precision capabilities of GPUs. The methods used here result in no disruption to the existing code base. Furthermore, as the developments reported here concern the core algorithms, they will benefit the full range of ONETEP functionality. Our use of a directive-based programming model ensures portability to other forms of coprocessors and will allow this work to form the basis of future developments to the code designed to support emerging high-performance computing platforms.

This record has no associated files available for download.

More information

Published date: 30 October 2013
Organisations: Chemistry, Faculty of Natural and Environmental Sciences, Computational Systems Chemistry

Identifiers

Local EPrints ID: 365351
URI: http://eprints.soton.ac.uk/id/eprint/365351
ISSN: 1096-987X
PURE UUID: 9dacff02-623d-4710-a09c-df8378c080d4
ORCID for Chris-Kriton Skylaris: orcid.org/0000-0003-0258-3433

Catalogue record

Date deposited: 03 Jun 2014 10:15
Last modified: 15 Mar 2024 03:26

Contributors

Author: Karl Wilkinson
Author: Chris-Kriton Skylaris
