The University of Southampton
University of Southampton Institutional Repository

Channel coding algorithms for Ultra-Reliable Low Latency Communication

University of Southampton
Xiang, Luping
Maunder, Robert

Xiang, Luping (2019) Channel coding algorithms for Ultra-Reliable Low Latency Communication. University of Southampton, Doctoral Thesis, 161pp.

Record type: Thesis (Doctoral)

Abstract

The Ultra-Reliable Low Latency Communication (URLLC) concept has been conceived for the emerging Fifth Generation (5G) systems, targeting a round-trip end-to-end latency of less than 1 ms in conjunction with ultra-high reliability. Therefore, this thesis proposes several novel channel coding schemes in order to meet the latency requirements of the URLLC mobile communication standard.
First, an Arbitrarily Parallel Turbo Decoder (APTD) is proposed to support an arbitrarily high degree of parallel processing, facilitating significantly higher processing throughputs and substantially lower processing latencies than the State-Of-The-Art (SOTA) Long Term Evolution (LTE) turbo decoder. As in conventional turbo decoding algorithms, the proposed APTD decomposes each block of information bits into a sequence of windows, where the different windows are processed simultaneously, while the bits within each window are processed serially using forward and backward recursions. However, in contrast to conventional turbo decoding algorithms, the APTD does not require the different windows to be composed of an identical number of bits. This allows the use of an arbitrary number of windows and hence an arbitrary degree of parallelism, when decoding information bits of an arbitrary block length. Furthermore, conventional turbo decoding algorithms alternate between simultaneously processing the windows in the upper decoder and those in the lower decoder. By contrast, the APTD processes the odd-indexed windows in the upper decoder at the same time as the even-indexed windows in the lower decoder and alternates between this and the reversed arrangement, hence further improving the decoding throughput and latency. In addition, the APTD achieves a reduced hardware resource requirement by calculating the extrinsic information based only on the outputs of the forward recursions, rather than on the outputs of both the forward and backward recursions, as in conventional turbo decoding algorithms. We demonstrate that the proposed APTD achieves superior latency, throughput and computational efficiency compared to the SOTA LTE turbo decoder at all block lengths, but particularly at the short block lengths that are typically used in URLLC approaches.

For example, at a block length of N = 504 bits, the proposed APTD achieves a Block Error Rate (BLER) of 10^{-5} at the same E_b/N_0 as I = 8 iterations of a conventional turbo decoder, but with a computational efficiency that is 6 times higher than that of the SOTA turbo decoder, while achieving a latency and throughput that are 0.7 and 1.4 times those of the SOTA decoder, respectively.
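
The window decomposition and odd-even scheduling described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the near-equal split rule, the 1-based window indices and the function names `partition_windows` and `active_windows` are assumptions made purely for illustration, since the abstract does not state how the window lengths are chosen.

```python
from typing import List, Tuple


def partition_windows(block_length: int, num_windows: int) -> List[range]:
    """Split bit indices 0..block_length-1 into contiguous windows.

    Assumption: window lengths differ by at most one bit, so any number
    of windows can be used for any block length.
    """
    if not 1 <= num_windows <= block_length:
        raise ValueError("need 1 <= num_windows <= block_length")
    base, extra = divmod(block_length, num_windows)
    windows = []
    start = 0
    for w in range(num_windows):
        # The first `extra` windows absorb one additional bit each.
        length = base + (1 if w < extra else 0)
        windows.append(range(start, start + length))
        start += length
    return windows


def active_windows(num_windows: int, step: int) -> Tuple[List[int], List[int]]:
    """Return (upper-decoder windows, lower-decoder windows) for a step.

    Even steps: odd-indexed windows in the upper decoder and even-indexed
    windows in the lower decoder. Odd steps: the reversed arrangement.
    Window indices are 1-based (an assumption).
    """
    odd = [w for w in range(1, num_windows + 1) if w % 2 == 1]
    even = [w for w in range(1, num_windows + 1) if w % 2 == 0]
    return (odd, even) if step % 2 == 0 else (even, odd)


if __name__ == "__main__":
    windows = partition_windows(504, 10)
    print([len(w) for w in windows])
    print(active_windows(4, 0), active_windows(4, 1))
```

With N = 504 and ten windows, the split yields four windows of 51 bits and six of 50 bits, which shows why dropping the equal-length requirement permits a window count that does not divide the block length.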
Additionally, the URLLC service requires order-of-magnitude improvements in all layers of the wireless communication stack. This is a particular challenge for the physical layer, where a processing time on the order of microseconds is typically required for the computationally intensive demodulation and error correction processing, among other operations. Conventionally, the reception of signals, the demodulation processing and the error correction processing are performed consecutively at the receiver. However, this approach is associated with processing times on the order of hundreds of microseconds, preventing URLLC. Therefore, this thesis proposes a novel processing architecture, which is capable of performing reception, Orthogonal Frequency-Division Multiplexing (OFDM) demodulation and turbo decoding concurrently, rather than consecutively, hence significantly reducing the processing time. In order to achieve concurrent operation, the OFDM demodulation is performed using a novel cumulative Fast Fourier Transform (FFT), which produces successively more reliable estimates of all transmitted symbols in each successive clock cycle. At the same time, a Fully-Parallel Turbo Decoder (FPTD) is used to recover successively more reliable estimates of all bits in each successive clock cycle.
Then, a detailed tutorial on the Cyclic Redundancy Check (CRC)-aided Logarithmic Successive Cancellation Stack (Log-SCS) algorithm conceived for polar codes is provided, followed by a pair of refinements for improving the error correction performance. We also apply these algorithms to the ultra-reliable decoding of polar codes, which has relevance for the control channels of the URLLC version of the 3rd Generation Partnership Project (3GPP) New Radio (NR). In contrast to all previous work on SCS polar decoding, which operates on the basis of bit probabilities, the Log-SCS algorithm operates on the basis of Logarithmic-Likelihood Ratios (LLRs), which facilitates low-complexity fixed-point implementation and reduced storage requirements. Furthermore, we extend the computation to consider frozen bits in stack decoding when determining the most likely sequence of information bits, which improves the error correction performance despite reducing the decoding complexity. When exploiting the CRC code to improve the error correction performance, we propose a novel technique which limits the number of CRC checks performed, in order to maintain a consistent error detection performance. Additionally, a pair of techniques for further improving the performance of the Log-SCS polar decoder are proposed, and we demonstrate that the proposed S = 128 Improved Log-SCS decoder achieves an error correction capability similar to that of a Logarithmic Successive Cancellation List (Log-SCL) decoder having a list size of L = 128 across the full range of block lengths supported by the 3GPP NR Physical Uplink Control Channel (PUCCH). This is achieved without increasing its memory requirement, while dramatically reducing its complexity, which becomes up to seven times lower than that of an L = 8 Log-SCL decoder.
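
To make the LLR-based path metric concrete, the following Python sketch shows the update rule that is widely used in the LLR-based SCL decoding literature, together with its hardware-friendly approximation. Whether the thesis's Log-SCS decoder uses exactly this rule is an assumption; the function name `path_metric_update` and the LLR sign convention are likewise hypothetical. Because a frozen bit is known to be 0, applying the same update to frozen positions penalises paths whose LLR disagrees with that known value, which is one way of letting frozen bits influence the path ranking.

```python
import math


def path_metric_update(pm: float, llr: float, bit: int, exact: bool = False) -> float:
    """Extend a path metric by one decided bit (lower metric = more likely).

    Assumed convention: llr = ln(P(bit = 0) / P(bit = 1)).
    exact=True  : pm + ln(1 + exp(-(1 - 2*bit) * llr))
    exact=False : pm + |llr| if the bit contradicts the sign of the LLR,
                  otherwise pm unchanged (common approximation).
    """
    signed = (1 - 2 * bit) * llr
    if exact:
        # Numerically stable form of ln(1 + exp(-signed)).
        return pm + max(0.0, -signed) + math.log1p(math.exp(-abs(signed)))
    return pm + (abs(llr) if signed < 0 else 0.0)


if __name__ == "__main__":
    # Information bit: both hypotheses are extended and ranked by metric.
    print(path_metric_update(0.0, 2.5, 0), path_metric_update(0.0, 2.5, 1))
    # Frozen bit (known to be 0) whose LLR points towards 1: path is penalised.
    print(path_metric_update(1.0, -3.0, 0))
```

The approximate form needs only a comparison and an addition per bit, which is why LLR-domain metrics lend themselves to fixed-point arithmetic.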

Following the Improved Log-SCS algorithm, a novel fast Log-SCS polar decoder is proposed, which employs several techniques that were previously considered for fast SCL decoders. This fast Log-SCS polar decoder is capable of attaining a decoding latency that is lower than that of the SOTA fast SCL polar decoders without any loss of error correction performance. First, a 32-bit fixed-point Log-SCS polar decoder is developed in this thesis, which is capable of maintaining the same BLER as the floating-point Log-SCS polar decoder, allowing software implementation on x86 processors. In addition, the simplified path-metric computation of the rate-0, rate-1 and repetition subgraphs is applied in the proposed fast Log-SCS decoder, which reduces the decoding complexity by 50% on average. Furthermore, for the first time, a software implementation of the fast Log-SCS polar decoder is achieved on x86 processors that support Single Instruction Multiple Data (SIMD) instructions with 512-bit Advanced Vector Extensions (AVX-512), satisfying the low-latency requirements of Software-Defined Radio (SDR) systems. By implementing the 32-bit fast Log-SCS polar decoder on x86 processors in conjunction with AVX-512 SIMD instructions, a maximum parallelization degree of 16 may be attained, and an 80% latency reduction may be achieved.
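
The parallelization degree of 16 quoted above follows from dividing the 512-bit vector width by the 32-bit word size. The short Python sketch below reproduces that arithmetic and converts the stated 80% latency reduction into the equivalent speed-up factor. The function names are hypothetical, and the sketch makes no claim about the baseline against which the thesis measures the reduction.

```python
def simd_lanes(vector_bits: int, word_bits: int) -> int:
    """Number of fixed-point words processed by one SIMD instruction."""
    if vector_bits % word_bits != 0:
        raise ValueError("vector width must be a multiple of the word size")
    return vector_bits // word_bits


def speedup_from_latency_reduction(reduction_percent: int) -> float:
    """Speed-up factor equivalent to a given percentage latency reduction."""
    if not 0 <= reduction_percent < 100:
        raise ValueError("reduction must be in [0, 100)")
    return 100 / (100 - reduction_percent)


if __name__ == "__main__":
    print(simd_lanes(512, 32))                 # AVX-512 with 32-bit words
    print(speedup_from_latency_reduction(80))  # 80% lower latency
```

An 80% reduction corresponds to a fivefold speed-up, which is below the 16-lane ceiling, as would be expected if parts of the decoder do not vectorise perfectly.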

Text
Channel coding algorithms for Ultra-Reliable Low Latency Communication - Version of Record
Available under License University of Southampton Thesis Licence.
Download (3MB)

More information

Published date: December 2019

Identifiers

Local EPrints ID: 438625
URI: http://eprints.soton.ac.uk/id/eprint/438625
PURE UUID: 098bd5bc-e7dd-4b52-99aa-959e56f0e49b
ORCID for Luping Xiang: orcid.org/0000-0003-1465-6708
ORCID for Robert Maunder: orcid.org/0000-0002-7944-2615

Catalogue record

Date deposited: 18 Mar 2020 17:41
Last modified: 17 Mar 2024 05:19

Contributors

Author: Luping Xiang
Thesis advisor: Robert Maunder

Contact ePrints Soton: eprints@soton.ac.uk

ePrints Soton supports OAI 2.0 with a base URL of http://eprints.soton.ac.uk/cgi/oai2

This repository has been built using EPrints software, developed at the University of Southampton, but available to everyone to use.
