Performance-Aware Predictive-Model-Based On-Chip Body-Bias Regulation Strategy for an ULP Multi-Core Cluster in 28nm UTBB FD-SOI
Authors:
Alfio Di Mauro,
Davide Rossi,
Antonio Pullini,
Philippe Flatresse,
Luca Benini
Abstract:
The performance and reliability of Ultra-Low-Power (ULP) computing platforms are adversely affected by environmental temperature and process variations. Mitigating the effect of these phenomena becomes crucial when these devices operate near-threshold, due to the magnification of process variations and to the strong temperature inversion effect that affects advanced technology nodes in low-voltage…
▽ More
The performance and reliability of Ultra-Low-Power (ULP) computing platforms are adversely affected by environmental temperature and process variations. Mitigating the effect of these phenomena becomes crucial when these devices operate near-threshold, due to the magnification of process variations and to the strong temperature inversion effect that affects advanced technology nodes in low-voltage corners, which causes huge overhead due to margining for timing closure. Supporting an extended range of reverse and forward body-bias, UTBB FD-SOI technology provides a powerful knob to compensate for such variations. In this work we propose a methodology to maximize energy efficiency at run-time exploiting body biasing on a ULP platform operating near-threshold. The proposed method relies on on-line performance measurements by means of Process Monitoring Blocks (PMBs) coupled with an on-chip low-power body bias generator. We correlate the measurement performed by the PMBs to the maximum achievable frequency of the system, deriving a predictive model able to estimate it with an error of 9.7% at 0.7V. To minimize the effect of process variations we propose a calibration procedure that allows to use a PMB model affected by only the temperature-induced error, which reduces the frequency estimation error by 2.4x (from 9.7% to 4%). We finally propose a controller architecture relying on the derived models to automatically regulate at run-time the body bias voltage. We demonstrate that adjusting the body bias voltage against environmental temperature variations leads up to 2X reduction in the leakage power and a 15% improvement on the global energy consumption when the system operates at 0.7V and 170MHz
△ Less
Submitted 27 July, 2020;
originally announced July 2020.
PolarBear: A 28-nm FD-SOI ASIC for Decoding of Polar Codes
Authors:
Pascal Giard,
Alexios Balatsoukas-Stimming,
Thomas Christoph Müller,
Andrea Bonetti,
Claude Thibeault,
Warren J. Gross,
Philippe Flatresse,
Andreas Burg
Abstract:
Polar codes are a recently proposed class of block codes that provably achieve the capacity of various communication channels. They received a lot of attention as they can do so with low-complexity encoding and decoding algorithms, and they have an explicit construction. Their recent inclusion in a 5G communication standard will only spur more research. However, only a couple of ASICs featuring de…
▽ More
Polar codes are a recently proposed class of block codes that provably achieve the capacity of various communication channels. They received a lot of attention as they can do so with low-complexity encoding and decoding algorithms, and they have an explicit construction. Their recent inclusion in a 5G communication standard will only spur more research. However, only a couple of ASICs featuring decoders for polar codes were fabricated, and none of them implements a list-based decoding algorithm. In this paper, we present ASIC measurement results for a fabricated 28 nm CMOS chip that implements two different decoders: the first decoder is tailored toward error-correction performance and flexibility. It supports any code rate as well as three different decoding algorithms: successive cancellation (SC), SC flip and SC list (SCL). The flexible decoder can also decode both non-systematic and systematic polar codes. The second decoder targets speed and energy efficiency. We present measurement results for the first silicon-proven SCL decoder, where its coded throughput is shown to be of 306.8 Mbps with a latency of 3.34 us and an energy per bit of 418.3 pJ/bit at a clock frequency of 721 MHz for a supply of 1.3 V. The energy per bit drops down to 178.1 pJ/bit with a more modest clock frequency of 308 MHz, lower throughput of 130.9 Mbps and a reduced supply voltage of 0.9 V. For the other two operating modes, the energy per bit is shown to be of approximately 95 pJ/bit. The less flexible high-throughput unrolled decoder can achieve a coded throughput of 9.2 Gbps and a latency of 628 ns for a measured energy per bit of 1.15 pJ/bit at 451 MHz.
△ Less
Submitted 1 September, 2017; v1 submitted 31 August, 2017;
originally announced August 2017.