CFDComm: an optimized library for scalable point-to-point communication for general CFD applications
Pages: 1001-1006
Haeri, Sina
feee833a-70f4-4457-af71-7c696286074c
Shrimpton, John S.
ddb51b00-ef21-42aa-bc74-20108e08a4b6
Haeri, Sina and Shrimpton, John S. (2012) CFDComm: an optimized library for scalable point-to-point communication for general CFD applications. In 2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems (HPCC-ICESS). IEEE. pp. 1001-1006. (doi:10.1109/HPCC.2012.146).
Record type: Conference or Workshop Item (Paper)
Abstract
Domain decomposition is the most widely used technique for achieving parallelism in CFD applications. For complicated geometries, graph partitioning programs are usually used to decompose the domain into smaller computational blocks such that the computational load is balanced and the communication cost is minimized. In this paper, an algorithm is presented and tested that avoids deadlocks in the complicated communication patterns inherited from the graph decomposition process. The basic algorithm is implemented in Fortran 95 and MPI, and several optimization techniques are then applied to increase the scalability of the library: the addition of topologies, non-blocking communication, and overlap of communication and computation to mask message-passing latency. The library is tested on up to 512 cores of the Iridis-3 cluster, which comprises 1008 compute nodes, each with two 2.4 GHz 6-core Westmere processors. I/O and inter-node communication run over a fast InfiniBand network composed of groups of 32 nodes connected by DDR links to a 48-port QDR leaf switch. The leaf switches then have 4 trunked QDR connections to 4 QDR 48-port core switches.
This record has no associated files available for download.
More information
Published date: 18 October 2012
Venue - Dates: 2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems (HPCC-ICESS), Liverpool, United Kingdom, 2012-07-25 - 2012-07-27
Keywords: computational fluid dynamics, domain decomposition, mpi, parallel performance, point-to-point communication
Organisations: Aeronautics, Astronautics & Comp. Eng
Identifiers
Local EPrints ID: 344617
URI: http://eprints.soton.ac.uk/id/eprint/344617
ISBN: 978-1-4673-2164-8
PURE UUID: e4312613-8168-4899-a155-3520414e1cf4
Catalogue record
Date deposited: 31 Oct 2012 11:00
Last modified: 14 Mar 2024 12:15
Contributors
Author: Sina Haeri
Author: John S. Shrimpton