Search | arXiv e-print repository

High-Throughput Flexible Belief Propagation List Decoder for Polar Codes

Authors: Yuqing Ren, Yifei Shen, Leyu Zhang, Andreas Toftegaard Kristensen, Alexios Balatsoukas-Stimming, Andreas Burg, Chuan Zhang

Abstract: Owing to its high parallelism, belief propagation (BP) decoding is highly amenable to high-throughput implementations and thus represents a promising solution for meeting the ultra-high peak data rate of future communication systems. However, for polar codes, the error-correcting performance of BP decoding is far inferior to that of the widely used CRC-aided successive cancellation list (SCL) decoding algorithm. To close the performance gap to SCL, BP list (BPL) decoding expands the exploration of candidate codewords through multiple permuted factor graphs (PFGs). From an implementation perspective, designing a unified and flexible hardware architecture for BPL decoding that supports various PFGs and code configurations presents a big challenge. In this paper, we propose the first hardware implementation of a BPL decoder for polar codes and overcome the implementation challenge by applying a hardware-friendly algorithm that generates flexible permutations on-the-fly. First, we derive the graph selection gain and provide a sequential generation (SG) algorithm to obtain a near-optimal PFG set. We further prove that any permutation can be decomposed into a combination of multiple fixed routings, and we design a low-complexity permutation network to satisfy the decoding schedule. Our BPL decoder not only has a low decoding latency by executing the decoding and permutation generation in parallel, but also supports an arbitrary list size without any area overhead. Experimental results show that, for length-1024 polar codes with a code rate of one-half, our BPL decoder with 32 PFGs has a similar error-correcting performance to SCL with a list size of 4 and achieves a throughput of 25.63 Gbps and an area efficiency of 29.46 Gbps/mm$^{2}$ at SNR = 4.0 dB, which is 1.82$\times$ and 4.33$\times$ faster than the state-of-the-art BP flip and SCL decoders, respectively.

Submitted 19 March, 2023; v1 submitted 25 October, 2022; originally announced October 2022.

arXiv:2205.08857

doi 10.1109/TSP.2022.3216921

A Sequence Repetition Node-Based Successive Cancellation List Decoder for 5G Polar Codes: Algorithm and Implementation

Authors: Yuqing Ren, Andreas Toftegaard Kristensen, Yifei Shen, Alexios Balatsoukas-Stimming, Chuan Zhang, Andreas Burg

Abstract: Due to the low-latency and high-reliability requirements of 5G, low-complexity node-based successive cancellation list (SCL) decoding has received considerable attention for use in 5G communications systems. By identifying special constituent codes in the decoding tree and immediately decoding these, node-based SCL decoding provides a significant reduction in decoding latency compared to conventional SCL decoding. However, while there exist many types of nodes, the current node-based SCL decoders are limited by the lack of a more generalized node that can efficiently decode a larger number of different constituent codes to further reduce the decoding time. In this paper, we extend a recent generalized node, the sequence repetition (SR) node, to SCL decoding and we describe the first implementation of an SR-List decoder. By merging certain SR-List decoding operations and applying various optimizations for 5G New Radio (NR) polar codes, our optimized SR-List decoding algorithm increases the throughput by almost ${2\times}$ compared to a similar state-of-the-art node-based SCL decoder. We also present our hardware implementation of the optimized SR-List decoding algorithm, which supports all 5G NR polar codes. Synthesis results show that our SR-List decoder can achieve a $2.94 \, \mathrm{Gbps}$ throughput and $6.70\, \mathrm{Gbps} / \mathrm{mm}^2$ area efficiency for ${L=8}$.

Submitted 26 August, 2022; v1 submitted 18 May, 2022; originally announced May 2022.

arXiv:2005.01016

Lupulus: A Flexible Hardware Accelerator for Neural Networks

Authors: Andreas Toftegaard Kristensen, Robert Giterman, Alexios Balatsoukas-Stimming, Andreas Burg

Abstract: Neural networks have become indispensable for a wide range of applications, but they suffer from high computational and memory requirements, requiring optimizations from the algorithmic description of the network to the hardware implementation. Moreover, the high rate of innovation in machine learning makes it important that hardware implementations provide a high level of programmability to support current and future requirements of neural networks. In this work, we present a flexible hardware accelerator for neural networks, called Lupulus, supporting various methods for scheduling and mapping of operations onto the accelerator. Lupulus was implemented in a 28nm FD-SOI technology and demonstrates a peak performance of 380 GOPS/GHz with latencies of 21.4ms and 183.6ms for the convolutional layers of AlexNet and VGG-16, respectively.

Submitted 3 May, 2020; originally announced May 2020.

Comments: To be presented at the 2020 International Conference on Acoustics, Speech, and Signal Processing

arXiv:2001.09877

Identification of Non-Linear RF Systems Using Backpropagation

Authors: Andreas Toftegaard Kristensen, Andreas Burg, Alexios Balatsoukas-Stimming

Abstract: In this work, we use deep unfolding to view cascaded non-linear RF systems as model-based neural networks. This view enables the direct use of a wide range of neural network tools and optimizers to efficiently identify such cascaded models. We demonstrate the effectiveness of this approach through the example of digital self-interference cancellation in full-duplex communications where an IQ imbalance model and a non-linear PA model are cascaded in series. For a self-interference cancellation performance of approximately 44.5 dB, the number of model parameters can be reduced by 74% and the number of operations per sample can be reduced by 79% compared to an expanded linear-in-parameters polynomial model.

Submitted 31 May, 2020; v1 submitted 27 January, 2020; originally announced January 2020.

Comments: To be presented at the 2020 IEEE International Conference on Communications (Workshop on Full-Duplex Communications for Future Wireless Networks)

arXiv:2001.04543

Hardware Implementation of Neural Self-Interference Cancellation

Authors: Yann Kurzo, Andreas Toftegaard Kristensen, Andreas Burg, Alexios Balatsoukas-Stimming

Abstract: In-band full-duplex systems can transmit and receive information simultaneously on the same frequency band. However, due to the strong self-interference caused by the transmitter to its own receiver, the use of non-linear digital self-interference cancellation is essential. In this work, we describe a hardware architecture for a neural network-based non-linear self-interference (SI) canceller and we compare it with our own hardware implementation of a conventional polynomial-based SI canceller. In particular, we present implementation results for a shallow and a deep neural network SI canceller as well as for a polynomial SI canceller. Our results show that the deep neural network canceller achieves a hardware efficiency of up to $312.8$ Msamples/s/mm$^2$ and an energy efficiency of up to $0.9$ nJ/sample, which is $2.1\times$ and $2\times$ better than the polynomial SI canceller, respectively. These results show that NN-based methods applied to communications are not only useful from a performance perspective, but can also be a very effective means to reduce the implementation complexity.

Submitted 7 May, 2020; v1 submitted 13 January, 2020; originally announced January 2020.

Comments: Accepted for publication in IEEE Journal on Emerging and Selected Topics in Circuits and Systems

Showing 1–5 of 5 results for author: Kristensen, A T