Showing 1–7 of 7 results for author: Bruschi, N

  1. arXiv:2308.00154  [pdf, other]

    cs.AR

    PATRONoC: Parallel AXI Transport Reducing Overhead for Networks-on-Chip targeting Multi-Accelerator DNN Platforms at the Edge

    Authors: Vikram Jain, Matheus Cavalcante, Nazareno Bruschi, Michael Rogenmoser, Thomas Benz, Andreas Kurth, Davide Rossi, Luca Benini, Marian Verhelst

    Abstract: Emerging deep neural network (DNN) applications require high-performance multi-core hardware acceleration with large data bursts. Classical network-on-chips (NoCs) use serial packet-based protocols suffering from significant protocol translation overheads towards the endpoints. This paper proposes PATRONoC, an open-source fully AXI-compliant NoC fabric to better address the specific needs of multi…

    Submitted 31 July, 2023; originally announced August 2023.

    Comments: Accepted and presented at 60th DAC

  2. arXiv:2307.01056  [pdf, other]

    cs.AR

    A 3 TOPS/W RISC-V Parallel Cluster for Inference of Fine-Grain Mixed-Precision Quantized Neural Networks

    Authors: Alessandro Nadalini, Georg Rutishauser, Alessio Burrello, Nazareno Bruschi, Angelo Garofalo, Luca Benini, Francesco Conti, Davide Rossi

    Abstract: The emerging trend of deploying complex algorithms, such as Deep Neural Networks (DNNs), increasingly poses strict memory and energy efficiency requirements on Internet-of-Things (IoT) end-nodes. Mixed-precision quantization has been proposed as a technique to minimize a DNN's memory footprint and maximize its execution efficiency, with negligible end-to-end precision degradation. In this work, we…

    Submitted 3 July, 2023; originally announced July 2023.

  3. arXiv:2211.12877  [pdf, other]

    cs.DC

    End-to-End DNN Inference on a Massively Parallel Analog In Memory Computing Architecture

    Authors: Nazareno Bruschi, Giuseppe Tagliavini, Angelo Garofalo, Francesco Conti, Irem Boybat, Luca Benini, Davide Rossi

    Abstract: The demand for computation resources and energy efficiency of Convolutional Neural Networks (CNN) applications requires a new paradigm to overcome the "Memory Wall". Analog In-Memory Computing (AIMC) is a promising paradigm since it performs matrix-vector multiplications, the critical kernel of many ML applications, in-place in the analog domain within memory arrays structured as crossbars of memo…

    Submitted 23 November, 2022; originally announced November 2022.

  4. arXiv:2206.04796  [pdf, other]

    cs.AR eess.SY

    Scale up your In-Memory Accelerator: Leveraging Wireless-on-Chip Communication for AIMC-based CNN Inference

    Authors: Nazareno Bruschi, Giuseppe Tagliavini, Francesco Conti, Sergi Abadal, Alberto Cabellos-Aparicio, Eduard Alarcón, Geethan Karunaratne, Irem Boybat, Luca Benini, Davide Rossi

    Abstract: Analog In-Memory Computing (AIMC) is emerging as a disruptive paradigm for heterogeneous computing, potentially delivering orders of magnitude better peak performance and efficiency over traditional digital signal processing architectures on Matrix-Vector multiplication. However, to sustain this throughput in real-world applications, AIMC tiles must be supplied with data at very high bandwidth and…

    Submitted 3 June, 2022; originally announced June 2022.

  5. GVSoC: A Highly Configurable, Fast and Accurate Full-Platform Simulator for RISC-V based IoT Processors

    Authors: Nazareno Bruschi, Germain Haugou, Giuseppe Tagliavini, Francesco Conti, Luca Benini, Davide Rossi

    Abstract: The last few years have seen the emergence of IoT processors: ultra-low power systems-on-chips (SoCs) combining lightweight and flexible micro-controller units (MCUs), often based on open-ISA RISC-V cores, with application-specific accelerators to maximize performance and energy efficiency. Overall, this heterogeneity level requires complex hardware and a full-fledged software stack to orchestrate…

    Submitted 20 January, 2022; originally announced January 2022.

  6. arXiv:2008.07127  [pdf, other]

    cs.DC cs.AR cs.NE

    DORY: Automatic End-to-End Deployment of Real-World DNNs on Low-Cost IoT MCUs

    Authors: Alessio Burrello, Angelo Garofalo, Nazareno Bruschi, Giuseppe Tagliavini, Davide Rossi, Francesco Conti

    Abstract: The deployment of Deep Neural Networks (DNNs) on end-nodes at the extreme edge of the Internet-of-Things is a critical enabler to support pervasive Deep Learning-enhanced applications. Low-Cost MCU-based end-nodes have limited on-chip memory and often replace caches with scratchpads, to reduce area overheads and increase energy efficiency -- requiring explicit DMA-based memory transfers between di…

    Submitted 19 March, 2021; v1 submitted 17 August, 2020; originally announced August 2020.

    Comments: 14 pages, 12 figures, 4 tables, 2 listings. Accepted for publication in IEEE Transactions on Computers (https://ieeexplore.ieee.org/document/9381618)

  7. Enabling Mixed-Precision Quantized Neural Networks in Extreme-Edge Devices

    Authors: Nazareno Bruschi, Angelo Garofalo, Francesco Conti, Giuseppe Tagliavini, Davide Rossi

    Abstract: The deployment of Quantized Neural Networks (QNN) on advanced microcontrollers requires optimized software to exploit digital signal processing (DSP) extensions of modern instruction set architectures (ISA). As such, recent research proposed optimized libraries for QNNs (from 8-bit to 2-bit) such as CMSIS-NN and PULP-NN. This work presents an extension to the PULP-NN library targeting the accelera…

    Submitted 15 July, 2020; originally announced July 2020.

    Comments: 4 pages, 6 figures, published in 17th ACM International Conference on Computing Frontiers (CF '20), May 11-13, 2020, Catania, Italy