Showing 1–14 of 14 results for author: Di Mauro, A

  1. arXiv:2312.14750

    cs.AR

    Siracusa: A 16 nm Heterogenous RISC-V SoC for Extended Reality with At-MRAM Neural Engine

    Authors: Arpan Suravi Prasad, Moritz Scherer, Francesco Conti, Davide Rossi, Alfio Di Mauro, Manuel Eggimann, Jorge Tómas Gómez, Ziyun Li, Syed Shakib Sarwar, Zhao Wang, Barbara De Salvo, Luca Benini

    Abstract: Extended reality (XR) applications are Machine Learning (ML)-intensive, featuring deep neural networks (DNNs) with millions of weights, tightly latency-bound (10-20 ms end-to-end), and power-constrained (low tens of mW average power). While ML performance and efficiency can be achieved by introducing neural engines within low-power systems-on-chip (SoCs), system-level power for nontrivial DNNs dep…

    Submitted 14 April, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

    Comments: Final accepted manuscript pre-print submitted to the IEEE Journal of Solid-State Circuits

  2. arXiv:2305.18371

    cs.CV cs.AI cs.AR eess.SY

    ColibriUAV: An Ultra-Fast, Energy-Efficient Neuromorphic Edge Processing UAV-Platform with Event-Based and Frame-Based Cameras

    Authors: Sizhen Bian, Lukas Schulthess, Georg Rutishauser, Alfio Di Mauro, Luca Benini, Michele Magno

    Abstract: The interest in dynamic vision sensor (DVS)-powered unmanned aerial vehicles (UAV) is raising, especially due to the microsecond-level reaction time of the bio-inspired event sensor, which increases robustness and reduces latency of the perception tasks compared to a RGB camera. This work presents ColibriUAV, a UAV platform with both frame-based and event-based cameras interfaces for efficient per…

    Submitted 27 May, 2023; originally announced May 2023.

  3. Marsellus: A Heterogeneous RISC-V AI-IoT End-Node SoC with 2-to-8b DNN Acceleration and 30%-Boost Adaptive Body Biasing

    Authors: Francesco Conti, Gianna Paulin, Angelo Garofalo, Davide Rossi, Alfio Di Mauro, Georg Rutishauser, Gianmarco Ottavi, Manuel Eggimann, Hayate Okuhara, Luca Benini

    Abstract: Emerging Artificial Intelligence-enabled Internet-of-Things (AI-IoT) System-on-a-Chip (SoC) for augmented reality, personalized healthcare, and nano-robotics need to run many diverse tasks within a power envelope of a few tens of mW over a wide range of operating conditions: compute-intensive but strongly quantized Deep Neural Network (DNN) inference, as well as signal processing and control requi…

    Submitted 28 November, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: Post-print accepted by IEEE Journal of Solid-State Circuits. Fixed metadata (was missing one co-author), added DOI of IEEE JSSC

  4. arXiv:2302.07957

    cs.AR

    ColibriES: A Milliwatts RISC-V Based Embedded System Leveraging Neuromorphic and Neural Networks Hardware Accelerators for Low-Latency Closed-loop Control Applications

    Authors: Georg Rutishauser, Robin Hunziker, Alfio Di Mauro, Sizhen Bian, Luca Benini, Michele Magno

    Abstract: End-to-end event-based computation has the potential to push the envelope in latency and energy efficiency for edge AI applications. Unfortunately, event-based sensors (e.g., DVS cameras) and neuromorphic spike-based processors (e.g., Loihi) have been designed in a decoupled fashion, thereby missing major streamlining opportunities. This paper presents ColibriES, the first-ever neuromorphic hardwa…

    Submitted 15 February, 2023; originally announced February 2023.

  5. arXiv:2212.00688

    cs.AR

    TCN-CUTIE: A 1036 TOp/s/W, 2.72 uJ/Inference, 12.2 mW All-Digital Ternary Accelerator in 22 nm FDX Technology

    Authors: Moritz Scherer, Alfio Di Mauro, Tim Fischer, Georg Rutishauser, Luca Benini

    Abstract: Tiny Machine Learning (TinyML) applications impose uJ/Inference constraints, with a maximum power consumption of tens of mW. It is extremely challenging to meet these requirements at a reasonable accuracy level. This work addresses the challenge with a flexible, fully digital Ternary Neural Network (TNN) accelerator in a RISC-V-based System-on-Chip (SoC). Besides supporting Ternary Convolutional N…

    Submitted 1 December, 2022; originally announced December 2022.

    Comments: Accepted at IEEE MICRO Journal

  6. An Energy-Efficient Spiking Neural Network for Finger Velocity Decoding for Implantable Brain-Machine Interface

    Authors: Jiawei Liao, Lars Widmer, Xiaying Wang, Alfio Di Mauro, Samuel R. Nason-Tomaszewski, Cynthia A. Chestek, Luca Benini, Taekwang Jang

    Abstract: Brain-machine interfaces (BMIs) are promising for motor rehabilitation and mobility augmentation. High-accuracy and low-power algorithms are required to achieve implantable BMI systems. In this paper, we propose a novel spiking neural network (SNN) decoder for implantable BMI regression tasks. The SNN is trained with enhanced spatio-temporal backpropagation to fully leverage its ability in handlin…

    Submitted 7 October, 2022; originally announced October 2022.

    Journal ref: 2022 IEEE 4th International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2022, pp. 134-137

  7. arXiv:2209.01065

    cs.AR eess.SP

    Kraken: A Direct Event/Frame-Based Multi-sensor Fusion SoC for Ultra-Efficient Visual Processing in Nano-UAVs

    Authors: Alfio Di Mauro, Moritz Scherer, Davide Rossi, Luca Benini

    Abstract: Small-size unmanned aerial vehicles (UAV) have the potential to dramatically increase safety and reduce cost in applications like critical infrastructure maintenance and post-disaster search and rescue. Many scenarios require UAVs to shrink toward nano and pico-size form factors. The key open challenge to achieve true autonomy on Nano-UAVs is to run complex visual tasks like object detection, trac…

    Submitted 18 August, 2022; originally announced September 2022.

  8. arXiv:2204.10687

    cs.AR

    SNE: an Energy-Proportional Digital Accelerator for Sparse Event-Based Convolutions

    Authors: Alfio Di Mauro, Arpan Suravi Prasad, Zhikai Huang, Matteo Spallanzani, Francesco Conti, Luca Benini

    Abstract: Event-based sensors are drawing increasing attention due to their high temporal resolution, low power consumption, and low bandwidth. To efficiently extract semantically meaningful information from sparse data streams produced by such sensors, we present a 4.5TOP/s/W digital accelerator capable of performing 4-bits-quantized event-based convolutional neural networks (eCNN). Compared to standard co…

    Submitted 29 April, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

    Comments: Accepted at DATE22

  9. Dustin: A 16-Cores Parallel Ultra-Low-Power Cluster with 2b-to-32b Fully Flexible Bit-Precision and Vector Lockstep Execution Mode

    Authors: Gianmarco Ottavi, Angelo Garofalo, Giuseppe Tagliavini, Francesco Conti, Alfio Di Mauro, Luca Benini, Davide Rossi

    Abstract: Computationally intensive algorithms such as Deep Neural Networks (DNNs) are becoming killer applications for edge devices. Porting heavily data-parallel algorithms on resource-constrained and battery-powered devices poses several challenges related to memory footprint, computational throughput, and energy efficiency. Low-bitwidth and mixed-precision arithmetic have been proven to be valid strateg…

    Submitted 16 March, 2023; v1 submitted 21 January, 2022; originally announced January 2022.

    Comments: 13 pages, 17 figures, 2 tables, Journal

  10. Vega: A 10-Core SoC for IoT End-Nodes with DNN Acceleration and Cognitive Wake-Up From MRAM-Based State-Retentive Sleep Mode

    Authors: Davide Rossi, Francesco Conti, Manuel Eggimann, Alfio Di Mauro, Giuseppe Tagliavini, Stefan Mach, Marco Guermandi, Antonio Pullini, Igor Loi, Jie Chen, Eric Flamand, Luca Benini

    Abstract: The Internet-of-Things requires end-nodes with ultra-low-power always-on capability for a long battery lifetime, as well as high performance, energy efficiency, and extreme flexibility to deal with complex and fast-evolving near-sensor analytics algorithms (NSAAs). We present Vega, an IoT end-node SoC capable of scaling from a 1.7 µW fully retentive cognitive sleep mode up to 32.2 GOPS (@…

    Submitted 18 October, 2021; originally announced October 2021.

    Comments: 13 pages, 11 figures, 8 tables, journal paper

  11. arXiv:2010.04566

    cs.AR

    An Energy-Efficient Low-Voltage Swing Transceiver for mW-Range IoT End-Nodes

    Authors: Hayate Okuhara, Ahmed Elnaqib, Davide Rossi, Alfio Di Mauro, Philipp Mayer, Pierpaolo Palestri, Luca Benini

    Abstract: As the Internet-of-Things (IoT) applications become more and more pervasive, IoT end nodes are requiring more and more computational power within a few mW of power envelope, coupled with high-speed and energy-efficient inter-chip communication to deal with the growing input/output and memory bandwidth for emerging near-sensor analytics applications. While traditional interfaces such as SPI cannot…

    Submitted 9 October, 2020; originally announced October 2020.

    Comments: ISCAS2020

  12. Performance-Aware Predictive-Model-Based On-Chip Body-Bias Regulation Strategy for an ULP Multi-Core Cluster in 28nm UTBB FD-SOI

    Authors: Alfio Di Mauro, Davide Rossi, Antonio Pullini, Philippe Flatresse, Luca Benini

    Abstract: The performance and reliability of Ultra-Low-Power (ULP) computing platforms are adversely affected by environmental temperature and process variations. Mitigating the effect of these phenomena becomes crucial when these devices operate near-threshold, due to the magnification of process variations and to the strong temperature inversion effect that affects advanced technology nodes in low-voltage…

    Submitted 27 July, 2020; originally announced July 2020.

    Journal ref: Integration, Volume 72, 2020, Pages 194-207

  13. arXiv:2007.08952

    cs.AR cs.LG eess.SP

    Always-On 674uW @ 4GOP/s Error Resilient Binary Neural Networks with Aggressive SRAM Voltage Scaling on a 22nm IoT End-Node

    Authors: Alfio Di Mauro, Francesco Conti, Pasquale Davide Schiavone, Davide Rossi, Luca Benini

    Abstract: Binary Neural Networks (BNNs) have been shown to be robust to random bit-level noise, making aggressive voltage scaling attractive as a power-saving technique for both logic and SRAMs. In this work, we introduce the first fully programmable IoT end-node system-on-chip (SoC) capable of executing software-defined, hardware-accelerated BNNs at ultra-low voltage. Our SoC exploits a hybrid memory schem…

    Submitted 17 July, 2020; originally announced July 2020.

    Comments: Submitted to ISICAS2020 journal special issue

  14. arXiv:2006.14256

    cs.AR

    Arnold: an eFPGA-Augmented RISC-V SoC for Flexible and Low-Power IoT End-Nodes

    Authors: Pasquale Davide Schiavone, Davide Rossi, Alfio Di Mauro, Frank Gurkaynak, Timothy Saxe, Mao Wang, Ket Chong Yap, Luca Benini

    Abstract: A wide range of Internet of Things (IoT) applications require powerful, energy-efficient and flexible end-nodes to acquire data from multiple sources, process and distill the sensed data through near-sensor data analytics algorithms, and transmit it wirelessly. This work presents Arnold: a 0.5 V to 0.8 V, 46.83 uW/MHz, 600 MOPS fully programmable RISC-V Microcontroller unit (MCU) fabricated in 22…

    Submitted 25 June, 2020; originally announced June 2020.