Showing 1–35 of 35 results for author: Kvatinsky, S

  1. arXiv:2407.02353  [pdf, other]

    eess.SP cs.AR eess.SY

    Roadmap to Neuromorphic Computing with Emerging Technologies

    Authors: Adnan Mehonic, Daniele Ielmini, Kaushik Roy, Onur Mutlu, Shahar Kvatinsky, Teresa Serrano-Gotarredona, Bernabe Linares-Barranco, Sabina Spiga, Sergey Savelev, Alexander G Balanov, Nitin Chawla, Giuseppe Desoli, Gerardo Malavena, Christian Monzio Compagnoni, Zhongrui Wang, J Joshua Yang, Ghazi Sarwat Syed, Abu Sebastian, Thomas Mikolajick, Beatriz Noheda, Stefan Slesazeck, Bernard Dieny, Tuo-Hung Hou, Akhil Varri, et al. (28 additional authors not shown)

    Abstract: The roadmap is organized into several thematic sections, outlining current computing challenges, discussing the neuromorphic computing approach, analyzing mature and currently utilized technologies, providing an overview of emerging technologies, addressing material challenges, exploring novel computing concepts, and finally examining the maturity level of emerging technologies while determining t…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 90 pages, 22 figures, roadmap

  2. arXiv:2406.02197  [pdf]

    eess.SY cs.NE

    A Pipelined Memristive Neural Network Analog-to-Digital Converter

    Authors: Loai Danial, Kanishka Sharma, Shahar Kvatinsky

    Abstract: With the advent of high-speed, high-precision, and low-power mixed-signal systems, there is an ever-growing demand for accurate, fast, and energy-efficient analog-to-digital (ADCs) and digital-to-analog converters (DACs). Unfortunately, with the downscaling of CMOS technology, modern ADCs trade off speed, power and accuracy. Recently, memristive neuromorphic architectures of four-bit ADC/DAC have…

    Submitted 4 June, 2024; originally announced June 2024.

  3. arXiv:2405.02169  [pdf]

    cs.AR

    Transimpedance Amplifier with Automatic Gain Control Based on Memristors for Optical Signal Acquisition

    Authors: Sariel Hodisan, Shahar Kvatinsky

    Abstract: Transimpedance amplifiers (TIA) play a crucial role in various electronic systems, especially in optical signal acquisition. However, their performance is often hampered by saturation issues due to high input currents, leading to prolonged recovery times. This paper addresses this challenge by introducing a novel approach utilizing a memristive automatic gain control (AGC) to adjust the TIA's gain…

    Submitted 3 May, 2024; originally announced May 2024.

  4. arXiv:2310.16843  [pdf, other]

    cs.AR cs.ET eess.SY

    Experimental Demonstration of Non-Stateful In-Memory Logic with 1T1R OxRAM Valence Change Mechanism Memristors

    Authors: Henriette Padberg, Amir Regev, Giuseppe Piccolboni, Alessandro Bricalli, Gabriel Molas, Jean Francois Nodin, Shahar Kvatinsky

    Abstract: Processing-in-memory (PIM) is attractive to overcome the limitations of modern computing systems. Numerous PIM systems exist, varying by the technologies and logic techniques used. Successful operation of specific logic functions is crucial for effective processing-in-memory. Memristive non-stateful logic techniques are compatible with CMOS logic and can be integrated into a 1T1R memory array, sim…

    Submitted 6 October, 2023; originally announced October 2023.

    Comments: 5 pages, 6 figures

    Journal ref: IEEE Transactions on Circuits and Systems II: Express Briefs (2023), pages 1-1

  5. TDPP: Two-Dimensional Permutation-Based Protection of Memristive Deep Neural Networks

    Authors: Minhui Zou, Zhenhua Zhu, Tzofnat Greenberg-Toledo, Orian Leitersdorf, Jiang Li, Junlong Zhou, Yu Wang, Nan Du, Shahar Kvatinsky

    Abstract: The execution of deep neural network (DNN) algorithms suffers from significant bottlenecks due to the separation of the processing and memory units in traditional computer systems. Emerging memristive computing systems introduce an in situ approach that overcomes this bottleneck. The non-volatility of memristive devices, however, may expose the DNN weights stored in memristive crossbars to potenti…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: 14 pages, 11 figures

  6. arXiv:2308.14007  [pdf, other]

    cs.AR

    CUDA-PIM: End-to-End Integration of Digital Processing-in-Memory from High-Level C++ to Microarchitectural Design

    Authors: Orian Leitersdorf, Ronny Ronen, Shahar Kvatinsky

    Abstract: Digital processing-in-memory (PIM) architectures mitigate the memory wall problem by facilitating parallel bitwise operations directly within memory. Recent works have demonstrated their algorithmic potential for accelerating data-intensive applications; however, there remains a significant gap in the programming model and microarchitectural design. This is further exacerbated by the emerging mode…

    Submitted 27 August, 2023; originally announced August 2023.

  7. An Asynchronous and Low-Power True Random Number Generator using STT-MTJ

    Authors: Ben Perach, Shahar Kvatinsky

    Abstract: The emerging Spin Transfer Torque Magnetic Tunnel Junction (STT-MTJ) technology exhibits interesting stochastic behavior combined with small area and low operation energy. It is, therefore, a promising technology for security applications, specifically the generation of random numbers. In this paper, STT-MTJ is used to construct an asynchronous true random number generator (TRNG) with low power an…

    Submitted 26 July, 2023; originally announced July 2023.

    Comments: Published in: IEEE Transactions on Very Large Scale Integration (VLSI) Systems (Volume: 27, Issue: 11, November 2019)

  8. arXiv:2307.00658  [pdf, other]

    cs.DB

    Accelerating Relational Database Analytical Processing with Bulk-Bitwise Processing-in-Memory

    Authors: Ben Perach, Ronny Ronen, Shahar Kvatinsky

    Abstract: Online Analytical Processing (OLAP) for relational databases is a business decision support application. The application receives queries about the business database, usually requesting to summarize many database records, and produces few results. Existing OLAP requires transferring a large amount of data between the memory and the CPU, having a few operations per datum, and producing a small outp…

    Submitted 2 July, 2023; originally announced July 2023.

    Comments: Presented at the 21st IEEE Interregional NEWCAS conference in Edinburgh, Scotland, in June 2023

  9. arXiv:2305.04122  [pdf, other]

    cs.AR

    ConvPIM: Evaluating Digital Processing-in-Memory through Convolutional Neural Network Acceleration

    Authors: Orian Leitersdorf, Ronny Ronen, Shahar Kvatinsky

    Abstract: Processing-in-memory (PIM) architectures are emerging to reduce data movement in data-intensive applications. These architectures seek to exploit the same physical devices for both information storage and logic, thereby dwarfing the required data transfer and utilizing the full internal memory bandwidth. Whereas analog PIM utilizes the inherent connectivity of crossbar arrays for approximate matri…

    Submitted 6 May, 2023; originally announced May 2023.

  10. FourierPIM: High-Throughput In-Memory Fast Fourier Transform and Polynomial Multiplication

    Authors: Orian Leitersdorf, Yahav Boneh, Gonen Gazit, Ronny Ronen, Shahar Kvatinsky

    Abstract: The Discrete Fourier Transform (DFT) is essential for various applications ranging from signal processing to convolution and polynomial multiplication. The groundbreaking Fast Fourier Transform (FFT) algorithm reduces DFT time complexity from the naive O(n^2) to O(n log n), and recent works have sought further acceleration through parallel architectures such as GPUs. Unfortunately, accelerators su…

    Submitted 5 April, 2023; originally announced April 2023.

    Journal ref: Memories - Materials, Devices, Circuits and Systems, Volume 4, 2023

  11. arXiv:2302.08284  [pdf, other]

    cs.LG eess.SY

    ClaPIM: Scalable Sequence CLAssification using Processing-In-Memory

    Authors: Marcel Khalifa, Barak Hoffer, Orian Leitersdorf, Robert Hanhan, Ben Perach, Leonid Yavits, Shahar Kvatinsky

    Abstract: DNA sequence classification is a fundamental task in computational biology with vast implications for applications such as disease prevention and drug design. Therefore, fast high-quality sequence classifiers are significantly important. This paper introduces ClaPIM, a scalable DNA sequence classification architecture based on the emerging concept of hybrid in-crossbar and near-crossbar memristive…

    Submitted 5 November, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

  12. Enabling Relational Database Analytical Processing in Bulk-Bitwise Processing-In-Memory

    Authors: Ben Perach, Ronny Ronen, Shahar Kvatinsky

    Abstract: Bulk-bitwise processing-in-memory (PIM), an emerging computational paradigm utilizing memory arrays as computational units, has been shown to benefit database applications. This paper demonstrates how GROUP-BY and JOIN, database operations not supported by previous works, can be performed efficiently in bulk-bitwise PIM for relational database analytical processing. We extend the gem5 simulator an…

    Submitted 2 November, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

    Comments: Included in conference proceedings of the 2023 IEEE 36th International System-on-Chip Conference (SOCC)

  13. Stateful Logic using Phase Change Memory

    Authors: Barak Hoffer, Nicolás Wainstein, Christopher M. Neumann, Eric Pop, Eilam Yalon, Shahar Kvatinsky

    Abstract: Stateful logic is a digital processing-in-memory technique that could address von Neumann memory bottleneck challenges while maintaining backward compatibility with standard von Neumann architectures. In stateful logic, memory cells are used to perform the logic operations without reading or moving any data outside the memory array. Stateful logic has been previously demonstrated using several res…

    Submitted 29 December, 2022; originally announced December 2022.

    Journal ref: IEEE Journal on Exploratory Solid-State Computational Devices and Circuits (Volume: 8, Issue: 2, December 2022)

  14. arXiv:2212.09347  [pdf, other]

    cs.CR cs.ET

    Review of security techniques for memristor computing systems

    Authors: Minhui Zou, Nan Du, Shahar Kvatinsky

    Abstract: Neural network (NN) algorithms have become the dominant tool in visual object recognition, natural language processing, and robotics. To enhance the computational efficiency of these algorithms, in comparison to the traditional von Neumann computing architectures, researchers have been focusing on memristor computing systems. A major drawback when using memristor computing systems today is that, in…

    Submitted 19 December, 2022; originally announced December 2022.

    Comments: 15 pages, 5 figures

    Journal ref: Front. Electron. Mater, 19 December 2022, Sec. Semiconducting Materials and Devices

  15. arXiv:2211.07542  [pdf, other]

    cs.AR

    On Consistency for Bulk-Bitwise Processing-in-Memory

    Authors: Ben Perach, Ronny Ronen, Shahar Kvatinsky

    Abstract: Processing-in-memory (PIM) architectures allow software to explicitly initiate computation in the memory. This effectively makes PIM operations a new class of memory operations, alongside standard memory operations (e.g., load, store). For software correctness, it is crucial to have ordering rules for a PIM operation with other PIM operations and other memory operations, i.e., a consistency model…

    Submitted 7 December, 2022; v1 submitted 14 November, 2022; originally announced November 2022.

    Comments: To be published in the 29th IEEE International Symposium on High-Performance Computer Architecture (HPCA-29)

  16. arXiv:2208.14472  [pdf, other]

    cs.ET

    abstractPIM: A Technology Backward-Compatible Compilation Flow for Processing-In-Memory

    Authors: Adi Eliahu, Rotem Ben-Hur, Ronny Ronen, Shahar Kvatinsky

    Abstract: The von Neumann architecture, in which the memory and the computation units are separated, demands massive data traffic between the memory and the CPU. To reduce data movement, new technologies and computer architectures have been explored. The use of memristors, which are devices with both memory and computation capabilities, has been considered for different processing-in-memory (PIM) solutions,…

    Submitted 30 August, 2022; originally announced August 2022.

    Comments: Link to the book chapter in Springer: https://link.springer.com/chapter/10.1007/978-3-030-81641-4_16

  17. arXiv:2208.00741  [pdf, ps, other]

    cs.ET

    Performing Stateful Logic Using Spin-Orbit Torque (SOT) MRAM

    Authors: Barak Hoffer, Shahar Kvatinsky

    Abstract: Stateful logic is a promising processing-in-memory (PIM) paradigm to perform logic operations using emerging nonvolatile memory cells. While most stateful logic circuits to date focused on technologies such as resistive RAM, we propose two approaches to designing stateful logic using spin orbit torque (SOT) MRAM. The first approach utilizes the separation of read and write paths in SOT devices to…

    Submitted 1 August, 2022; originally announced August 2022.

    Comments: Published in 2022 22nd IEEE International Conference on Nanotechnology (NANO)

  18. arXiv:2206.15165  [pdf, other]

    cs.AR

    MatPIM: Accelerating Matrix Operations with Memristive Stateful Logic

    Authors: Orian Leitersdorf, Ronny Ronen, Shahar Kvatinsky

    Abstract: The emerging memristive Memory Processing Unit (mMPU) overcomes the memory wall through memristive devices that unite storage and logic for real processing-in-memory (PIM) systems. At the core of the mMPU is stateful logic, which is accelerated with memristive partitions to enable logic with massive inherent parallelism within crossbar arrays. This paper vastly accelerates the fundamental operatio…

    Submitted 30 June, 2022; originally announced June 2022.

  19. Enhancing Security of Memristor Computing System Through Secure Weight Mapping

    Authors: Minhui Zou, Junlong Zhou, Xiaotong Cui, Wei Wang, Shahar Kvatinsky

    Abstract: Emerging memristor computing systems have demonstrated great promise in improving the energy efficiency of neural network (NN) algorithms. The NN weights stored in memristor crossbars, however, may face potential theft attacks due to the nonvolatility of the memristor devices. In this paper, we propose to protect the NN weights by mapping selected columns of them in the form of 1's complements and…

    Submitted 29 June, 2022; originally announced June 2022.

    Comments: 6 pages, 4 figures, accepted by IEEE ISVLSI 2022

  20. arXiv:2206.04218  [pdf, other]

    cs.AR

    AritPIM: High-Throughput In-Memory Arithmetic

    Authors: Orian Leitersdorf, Dean Leitersdorf, Jonathan Gal, Mor Dahan, Ronny Ronen, Shahar Kvatinsky

    Abstract: Digital processing-in-memory (PIM) architectures are rapidly emerging to overcome the memory-wall bottleneck by integrating logic within memory elements. Such architectures provide vast computational power within the memory itself in the form of parallel bitwise logic operations. We develop novel algorithmic techniques for PIM that, combined with new perspectives on computer arithmetic, extend thi…

    Submitted 15 April, 2023; v1 submitted 8 June, 2022; originally announced June 2022.

    Comments: Accepted to IEEE Transactions on Emerging Topics in Computing (TETC)

  21. arXiv:2206.04200  [pdf, other]

    cs.AR

    PartitionPIM: Practical Memristive Partitions for Fast Processing-in-Memory

    Authors: Orian Leitersdorf, Ronny Ronen, Shahar Kvatinsky

    Abstract: Digital memristive processing-in-memory overcomes the memory wall through a fundamental storage device capable of stateful logic within crossbar arrays. Dynamically dividing the crossbar arrays by adding memristive partitions further increases parallelism, thereby overcoming an inherent trade-off in memristive processing-in-memory. The algorithmic topology of partitions is highly unique, and was r…

    Submitted 8 June, 2022; originally announced June 2022.

  22. Making Real Memristive Processing-in-Memory Faster and Reliable

    Authors: Shahar Kvatinsky

    Abstract: Memristive technologies are attractive candidates to replace conventional memory technologies, and can also be used to perform logic and arithmetic operations using a technique called 'stateful logic.' Combining data storage and computation in the memory array enables a novel non-von Neumann architecture, where both the operations are performed within a memristive Memory Processing Unit (mMPU). Th…

    Submitted 29 May, 2022; originally announced May 2022.

  23. arXiv:2205.13559  [pdf, other]

    cs.AR cs.CR

    HashPIM: High-Throughput SHA-3 via Memristive Digital Processing-in-Memory

    Authors: Batel Oved, Orian Leitersdorf, Ronny Ronen, Shahar Kvatinsky

    Abstract: Recent research has sought to accelerate cryptographic hash functions as they are at the core of modern cryptography. Traditional designs, however, suffer from the von Neumann bottleneck that originates from the separation of processing and memory units. An emerging solution to overcome this bottleneck is processing-in-memory (PIM): performing logic within the same devices responsible for memory t…

    Submitted 1 June, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: Accepted to International Conference on Modern Circuits and Systems Technologies (MOCAST) 2022

  24. Understanding Bulk-Bitwise Processing In-Memory Through Database Analytics

    Authors: Ben Perach, Ronny Ronen, Benny Kimelfeld, Shahar Kvatinsky

    Abstract: Bulk-bitwise processing-in-memory (PIM), where large bitwise operations are performed in parallel by the memory array itself, is an emerging form of computation with the potential to mitigate the memory wall problem. This paper examines the capabilities of bulk-bitwise PIM by constructing PIMDB, a fully-digital system based on memristive stateful logic, utilizing and focusing on in-memory bulk-bit…

    Submitted 26 September, 2023; v1 submitted 20 March, 2022; originally announced March 2022.

    Comments: Published in IEEE Transactions on Emerging Topics in Computing

  25. arXiv:2203.09046  [pdf]

    physics.app-ph cond-mat.dis-nn cond-mat.mtrl-sci cs.ET

    A memristive deep belief neural network based on silicon synapses

    Authors: Wei Wang, Loai Danial, Yang Li, Eric Herbelin, Evgeny Pikhay, Yakov Roizin, Barak Hoffer, Zhongrui Wang, Shahar Kvatinsky

    Abstract: Memristor-based neuromorphic computing could overcome the limitations of traditional von Neumann computing architectures -- in which data are shuffled between separate memory and processing units -- and improve the performance of deep neural networks. However, this will require accurate synaptic-like device performance, and memristors typically suffer from poor yield and a limited number of reliab…

    Submitted 20 July, 2023; v1 submitted 16 March, 2022; originally announced March 2022.

    Journal ref: Nature Electronics, 5, 870 (2022)

  26. arXiv:2202.10228  [pdf, other]

    cs.ET physics.app-ph

    Physical based compact model of Y-Flash memristor for neuromorphic computation

    Authors: Wei Wang, Loai Danial, Eric Herbelin, Barak Hoffer, Batel Oved, Tzofnat Greenberg-Toledo, Evgeny Pikhay, Yakov Roizin, Shahar Kvatinsky

    Abstract: Y-Flash memristors utilize the mature technology of single polysilicon floating gate non-volatile memories (NVM). It can be operated in a two-terminal configuration similar to the other emerging memristive devices, i.e., resistive random-access memory (RRAM), phase-change memory (PCM), etc. Fabricated in production complementary metal-oxide-semiconductor (CMOS) technology, Y-Flash memristors allow…

    Submitted 16 February, 2022; originally announced February 2022.

  27. arXiv:2109.09687  [pdf, other]

    cs.AR

    Making Memristive Processing-in-Memory Reliable

    Authors: Orian Leitersdorf, Ronny Ronen, Shahar Kvatinsky

    Abstract: Processing-in-memory (PIM) solutions vastly accelerate systems by reducing data transfer between computation and memory. Memristors possess a unique property that enables storage and logic within the same device, which is exploited in the memristive Memory Processing Unit (mMPU). The mMPU expands fundamental stateful logic techniques, such as IMPLY, MAGIC and FELIX, to high-throughput parallel log…

    Submitted 20 September, 2021; originally announced September 2021.

    Comments: Accepted to 28th International Conference on Electronics Circuits and Systems (ICECS) 2021

  28. arXiv:2108.13378  [pdf, other]

    cs.AR

    MultPIM: Fast Stateful Multiplication for Processing-in-Memory

    Authors: Orian Leitersdorf, Ronny Ronen, Shahar Kvatinsky

    Abstract: Processing-in-memory (PIM) seeks to eliminate computation/memory data transfer using devices that support both storage and logic. Stateful logic techniques such as IMPLY, MAGIC and FELIX can perform logic gates within memristive crossbar arrays with massive parallelism. Multiplication via stateful logic is an active field of research due to the wide implications. Recently, RIME has become the stat…

    Submitted 20 September, 2021; v1 submitted 30 August, 2021; originally announced August 2021.

    Comments: Accepted to IEEE Transactions On Circuits And Systems-II (TCAS-II)

  29. arXiv:2107.10308  [pdf, other]

    cs.AR

    The Bitlet Model: A Parameterized Analytical Model to Compare PIM and CPU Systems

    Authors: Ronny Ronen, Adi Eliahu, Orian Leitersdorf, Natan Peled, Kunal Korgaonkar, Anupam Chattopadhyay, Ben Perach, Shahar Kvatinsky

    Abstract: Nowadays, data-intensive applications are gaining popularity and, together with this trend, processing-in-memory (PIM)-based systems are being given more attention and have become more relevant. This paper describes an analytical modeling tool called Bitlet that can be used, in a parameterized fashion, to estimate the performance and the power/energy of a PIM-based system and thereby assess the af…

    Submitted 21 July, 2021; originally announced July 2021.

    Comments: Accepted to ACM JETC

  30. arXiv:2105.04212  [pdf, other]

    cs.AR

    Efficient Error-Correcting-Code Mechanism for High-Throughput Memristive Processing-in-Memory

    Authors: Orian Leitersdorf, Ben Perach, Ronny Ronen, Shahar Kvatinsky

    Abstract: Inefficient data transfer between computation and memory inspired emerging processing-in-memory (PIM) technologies. Many PIM solutions enable storage and processing using memristors in a crossbar-array structure, with techniques such as memristor-aided logic (MAGIC) used for computation. This approach provides highly-paralleled logic computation with minimal data movement. However, memristors are…

    Submitted 10 May, 2021; originally announced May 2021.

    Comments: Accepted to 58th Design Automation Conference (DAC) 2021

  31. arXiv:2009.00881  [pdf, other]

    cs.AR cs.ET

    CONTRA: Area-Constrained Technology Mapping Framework For Memristive Memory Processing Unit

    Authors: Debjyoti Bhattacharjee, Anupam Chattopadhyay, Srijit Dutta, Ronny Ronen, Shahar Kvatinsky

    Abstract: Data-intensive applications are poised to benefit directly from processing-in-memory platforms, such as memristive Memory Processing Units, which allow leveraging data locality and performing stateful logic operations. Developing design automation flows for such platforms is a challenging and highly relevant research problem. In this work, we investigate the problem of minimizing delay under arbit…

    Submitted 2 September, 2020; originally announced September 2020.

    Comments: 9 pages

  32. arXiv:1912.12636  [pdf, other]

    cs.ET cs.AR cs.LG cs.NE

    Training of Quantized Deep Neural Networks using a Magnetic Tunnel Junction-Based Synapse

    Authors: Tzofnat Greenberg Toledo, Ben Perach, Itay Hubara, Daniel Soudry, Shahar Kvatinsky

    Abstract: Quantized neural networks (QNNs) are being actively researched as a solution for the computational complexity and memory intensity of deep neural networks. This has sparked efforts to develop algorithms that support both inference and training with quantized weight and activation values, without sacrificing accuracy. A recent example is the GXNOR framework for stochastic training of ternary (TNN)…

    Submitted 29 May, 2022; v1 submitted 29 December, 2019; originally announced December 2019.

    Comments: Published in Semiconductor Science and Technology, Vol 36

    Journal ref: Semicond. Sci. Technol. 36 114003 (2021)

  33. arXiv:1910.10234  [pdf, other]

    cs.AR cs.PF

    The Bitlet Model: Defining a Litmus Test for the Bitwise Processing-in-Memory Paradigm

    Authors: Kunal Korgaonkar, Ronny Ronen, Anupam Chattopadhyay, Shahar Kvatinsky

    Abstract: This paper describes an analytical modeling tool called Bitlet that can be used, in a parameterized fashion, to understand the affinity of workloads to processing-in-memory (PIM) as opposed to traditional computing. The tool uncovers interesting trade-offs between operation complexity (cycles required to perform an operation through PIM) and other key parameters, such as system memory bandwidth, d…

    Submitted 22 October, 2019; originally announced October 2019.

  34. arXiv:1606.04209  [pdf, other]

    cs.DC cs.NE

    A Systematic Approach to Blocking Convolutional Neural Networks

    Authors: Xuan Yang, Jing Pu, Blaine Burton Rister, Nikhil Bhagdikar, Stephen Richardson, Shahar Kvatinsky, Jonathan Ragan-Kelley, Ardavan Pedram, Mark Horowitz

    Abstract: Convolutional Neural Networks (CNNs) are the state of the art solution for many computer vision problems, and many researchers have explored optimized implementations. Most implementations heuristically block the computation to deal with the large data sizes and high data reuse of CNNs. This paper explores how to block CNN computations for memory locality by creating an analytical model for CNN-li…

    Submitted 14 June, 2016; originally announced June 2016.

  35. Dark Memory and Accelerator-Rich System Optimization in the Dark Silicon Era

    Authors: Ardavan Pedram, Stephen Richardson, Sameh Galal, Shahar Kvatinsky, Mark A. Horowitz

    Abstract: The key challenge to improving performance in the age of Dark Silicon is how to leverage transistors when they cannot all be used at the same time. In modern SOCs, these transistors are often used to create specialized accelerators which improve energy efficiency for some applications by 10-1000X. While this might seem like the magic bullet we need, for most CPU applications more energy is dissipa…

    Submitted 26 April, 2016; v1 submitted 12 February, 2016; originally announced February 2016.

    Comments: 8 pages, To appear in IEEE Design and Test Journal

    Journal ref: IEEE Design & Test (Volume: 34, Issue: 2, April 2017)