GPU libraries speed performance analysis for RCWA simulation matrix operations
Xu, Jingxiao
Charlton, Martin D.B.
10 March 2023
Xu, Jingxiao and Charlton, Martin D.B. (2023) GPU libraries speed performance analysis for RCWA simulation matrix operations. Witzigmann, Bernd, Osiński, Marek and Arakawa, Yasuhiko (eds.) In Physics and Simulation of Optoelectronic Devices XXXI. vol. 12415, SPIE. 8 pp. (doi:10.1117/12.2650112).
Record type: Conference or Workshop Item (Paper)
Abstract
The Rigorous Coupled Wave Analysis (RCWA) method is highly efficient for simulating diffraction efficiency and field distribution patterns in periodic structures and textured optoelectronic devices. GPUs are increasingly used for complex scientific problems such as climate simulation and recent Covid-19 spread models. In this paper, we break the RCWA simulation problem down into its key computational steps (eigensystem solution, matrix inversion and matrix multiplication) and compare the speed of optimized GPU linear algebra libraries with that of the multithreaded Intel MKL CPU library, both running on the IRIDIS 5 supercomputer (1 NVIDIA V100 GPU and 40 Intel Xeon Gold 6138 CPU cores). Our work shows that the GPU significantly outperforms the CPU for all required steps. Eigensystem solution becomes 60% faster. The matrix inversion speedup grows with matrix size, reaching 8x for large matrices. Most significantly, matrix multiplication becomes 40x faster for small matrices and 5x faster for large matrices.
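The per-step comparison described in the abstract can be illustrated with a small timing harness. The following is only a minimal sketch, not the authors' benchmark code: the abstract does not name the GPU libraries or the calls that were timed, so `best_time`, `speedup`, `matmul` and `make_matrix` are hypothetical helpers, and a naive pure-Python matrix product stands in for the real workload. It also assumes that a figure such as "8x faster" means baseline time divided by accelerated time.

```python
import time
from typing import Callable, List

Matrix = List[List[float]]


def best_time(op: Callable[[], object], repeats: int = 5) -> float:
    """Return the smallest wall-clock time in seconds over `repeats` runs of `op`."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        op()
        times.append(time.perf_counter() - start)
    return min(times)


def speedup(baseline_s: float, accelerated_s: float) -> float:
    """Speedup factor (assumed definition): 8.0 means 1/8 of the baseline time."""
    if baseline_s <= 0 or accelerated_s <= 0:
        raise ValueError("timings must be positive")
    return baseline_s / accelerated_s


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Naive dense matrix product, used here only as a stand-in workload."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def make_matrix(n: int) -> Matrix:
    """Build a deterministic n x n test matrix (hypothetical test data)."""
    return [[float((i * n + j) % 7 + 1) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    # Time the stand-in workload at a few sizes, as one would for each RCWA step.
    for n in (16, 32, 64):
        a, b = make_matrix(n), make_matrix(n)
        elapsed = best_time(lambda: matmul(a, b))
        print(f"n={n}: {elapsed:.6f} s")
```

To compare two back ends, the same `best_time` call would be made once per implementation and matrix size, and the two results passed to `speedup`. Taking the minimum over several repeats is one common way to reduce timing noise; the paper may have used a different protocol.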
Text: 124150O - Version of Record
Text: 12415-48poster - Version of Record
More information
Published date: 10 March 2023
Venue - Dates: Physics and Simulation of Optoelectronic Devices XXXI, San Francisco, United States, 2023-01-28 - 2023-02-03
Identifiers
Local EPrints ID: 490034
URI: http://eprints.soton.ac.uk/id/eprint/490034
ISSN: 0277-786X
PURE UUID: 360a8822-23d2-4302-b001-f30857d840e1
Catalogue record
Date deposited: 14 May 2024 16:30
Last modified: 13 Jun 2024 01:57
Contributors
Author: Jingxiao Xu
Author: Martin D.B. Charlton
Editor: Bernd Witzigmann
Editor: Marek Osiński
Editor: Yasuhiko Arakawa