Apparate: Evading Memory Hierarchy with GodSpeed Wireless-on-Chip
Authors:
Nitesh Narayana GS,
Abhijit Das
Abstract:
The rapid advancements in memory systems, CPU technology, and emerging technologies herald a transformative potential in computing, promising to revolutionize memory hierarchies. Innovations in DDR memory are delivering unprecedented bandwidth, while advancements in on-chip wireless technology are reducing size and increasing speed. The introduction of godspeed wireless transceivers on chip, alongside near high-speed DRAM, is poised to directly facilitate memory requests. This integration suggests the potential for eliminating traditional memory hierarchies, offering a new paradigm in computing efficiency and speed. These developments indicate a near-future where computing systems are significantly more responsive and powerful, leveraging direct, high-speed memory access mechanisms.
Submitted 23 April, 2024;
originally announced June 2024.
ReuseSense: With Great Reuse Comes Greater Efficiency; Effectively Employing Computation Reuse on General-Purpose CPUs
Authors:
Nitesh Narayana GS,
Marc Ordoñez,
Lokananda Hari,
Franyell Silfa,
Antonio González
Abstract:
Deep Neural Networks (DNNs) are the de facto algorithm for tackling cognitive tasks in real-world applications such as speech recognition and natural language processing. DNN inference comprises numerous dot product operations between inputs and weights that require numerous multiplications and memory accesses, which hinder their performance and energy consumption when evaluated in modern CPUs. In this work, we leverage the high degree of similarity between consecutive inputs in different DNN layers to improve the performance and energy efficiency of DNN inference on CPUs. To this end, we propose ReuseSense, a new hardware scheme that includes ReuseSensor, an engine to efficiently generate the compute and load instructions needed to evaluate a DNN layer accordingly when sensing similar inputs. By intelligently reusing previously computed product values, ReuseSense allows bypassing computations when encountering input values identical to previous ones. Additionally, it efficiently avoids redundant loads by skipping weight loads associated with the bypassed dot product computations.
Our experiments show that ReuseSense achieves an 8x speedup in performance and a 74% reduction in total energy consumption across several DNNs on average over the baseline.
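The reuse idea in this abstract can be illustrated in software. The following is a minimal sketch only, not the authors' hardware scheme: the class name ReuseLayer, the multiplies counter, and the delta-correction used for changed inputs are all assumptions made for illustration. The abstract itself states only that work is bypassed for inputs identical to previous ones and that the corresponding weight loads are skipped.

```python
from typing import List, Optional


class ReuseLayer:
    """Hypothetical fully connected layer that reuses work across consecutive inputs.

    Illustrative software analogy only; ReuseSense itself is a hardware scheme.
    """

    def __init__(self, weights: List[List[float]]) -> None:
        # weights[i][j] connects input i to output j.
        self.weights = weights
        self.prev_input: Optional[List[float]] = None
        self.prev_output: Optional[List[float]] = None
        self.multiplies = 0  # counts multiplications actually performed

    def forward(self, x: List[float]) -> List[float]:
        n_out = len(self.weights[0])
        if self.prev_input is None or self.prev_output is None:
            # First input: nothing to reuse, compute every dot product in full.
            out = [0.0] * n_out
            for i, xi in enumerate(x):
                row = self.weights[i]
                for j in range(n_out):
                    out[j] += xi * row[j]
                    self.multiplies += 1
        else:
            # Start from the previous outputs and correct only what changed.
            out = list(self.prev_output)
            for i, xi in enumerate(x):
                delta = xi - self.prev_input[i]
                if delta == 0:
                    # Identical input: skip the weight-row load and its multiplies.
                    continue
                row = self.weights[i]  # weight row is touched only for changed inputs
                for j in range(n_out):
                    out[j] += delta * row[j]  # assumed delta-correction step
                    self.multiplies += 1
        self.prev_input = list(x)
        self.prev_output = out
        return out


layer = ReuseLayer([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
first = layer.forward([1.0, 1.0, 1.0])   # full computation: 6 multiplications
second = layer.forward([1.0, 2.0, 1.0])  # only input 1 changed: 2 more multiplications
print(first, second, layer.multiplies)
```

In this sketch the second call touches a single weight row because only one input value differs from the previous one. With floating-point inputs the delta-correction can accumulate rounding error over many steps, so a real design would have to account for that.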
Submitted 17 November, 2023;
originally announced November 2023.