Implementation of a fully-parallel turbo decoder on a general-purpose graphics processing unit
Li, An, Maunder, Robert G., Al-Hashimi, Bashir and Hanzo, Lajos
(2016)
Implementation of a fully-parallel turbo decoder on a general-purpose graphics processing unit.
IEEE Access, 4, 5624-5639.
(doi:10.1109/ACCESS.2016.2586309).
Abstract
Turbo codes, which comprise a parallel concatenation of upper and lower convolutional codes, are widely employed in state-of-the-art wireless communication standards, since they facilitate transmission throughputs that closely approach the channel capacity. However, this necessitates high processing throughputs in order for the turbo code to support real-time communications. In state-of-the-art turbo code implementations, the processing throughput is typically limited by the data dependencies that occur within the forward and backward recursions of the Log-BCJR algorithm, which is employed during turbo decoding. In contrast to the highly serial Log-BCJR turbo decoder, we have recently proposed a novel Fully Parallel Turbo Decoder (FPTD) algorithm, which can eliminate the data dependencies and perform fully parallel processing. In this paper, we propose an optimized FPTD algorithm, which reformulates the operation of the FPTD algorithm so that the upper and lower decoders have identical operation, in order to support Single Instruction Multiple Data (SIMD) operation. This allows us to develop a novel General Purpose Graphics Processing Unit (GPGPU) implementation of the FPTD, which has application in Software-Defined Radios (SDRs) and virtualized Cloud-Radio Access Networks (C-RANs). As a benefit of its higher degree of parallelism, we show that our FPTD achieves a processing throughput between 2.3 and 9.2 times higher than that of the Log-BCJR turbo decoder when employing a high-specification GPGPU. However, this is achieved at the cost of a moderate increase in overall complexity, by a factor of between 1.7 and 3.3.
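The data dependency described in the abstract can be illustrated with a minimal sketch. It is not the authors' FPTD algorithm: it covers only a max-log forward recursion, omits the backward recursion and the extrinsic information exchange, and uses a simple "update every stage from the previous sweep" schedule chosen for illustration. The function names `serial_forward` and `parallel_sweep`, and the layout `gamma[k][s_prev][s_next]` for the branch metrics, are assumptions introduced here.

```python
def serial_forward(gamma, alpha0):
    """Max-log forward recursion: stage k+1 needs stage k, so the loop is serial."""
    alpha = [list(alpha0)]
    for g in gamma:  # g[s_prev][s_next] holds the branch metrics of one trellis stage
        prev = alpha[-1]
        n_states = len(prev)
        alpha.append([max(prev[i] + g[i][j] for i in range(n_states))
                      for j in range(n_states)])
    return alpha


def parallel_sweep(gamma, alpha):
    """One sweep in which every stage reads only the previous sweep's metrics.

    The iterations of the loop over k do not depend on each other, so each
    could be assigned to its own thread (hypothetical mapping, for illustration).
    """
    n_states = len(alpha[0])
    new = [list(alpha[0])]  # the initial state metrics are fixed
    for k, g in enumerate(gamma):
        new.append([max(alpha[k][i] + g[i][j] for i in range(n_states))
                    for j in range(n_states)])
    return new
```

In this sketch, correct metrics propagate by one trellis stage per sweep, so N sweeps over an N-stage trellis reproduce the serial result. Each sweep performs as much arithmetic as the whole serial recursion, which is one way to see why removing the dependency trades extra computation for parallelism.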
Text: FPTD_GPU_IEEEaccess.pdf - Accepted Manuscript
Text: 07501831.pdf - Accepted Manuscript
More information
Accepted/In Press date: 28 June 2016
Published date: 29 June 2016
Keywords:
cloud radio access network, fully-parallel turbo decoder, parallel processing, GPGPU computing, software-defined radio
Organisations:
Southampton Wireless Group
Identifiers
Local EPrints ID: 397525
URI: http://eprints.soton.ac.uk/id/eprint/397525
PURE UUID: 69af5b24-579d-4f5d-8c25-23eaf8f1cbbc
Catalogue record
Date deposited: 29 Jun 2016 13:48
Last modified: 18 Mar 2024 03:09
Contributors
Author: An Li
Author: Robert G. Maunder
Author: Bashir Al-Hashimi
Author: Lajos Hanzo