Search | arXiv e-print repository

doi 10.1109/JSEN.2024.3406948

Low Latency Visual Inertial Odometry with On-Sensor Accelerated Optical Flow for Resource-Constrained UAVs

Authors: Jonas Kühne, Michele Magno, Luca Benini

Abstract: Visual Inertial Odometry (VIO) is the task of estimating the movement trajectory of an agent from an onboard camera stream fused with additional Inertial Measurement Unit (IMU) measurements. A crucial subtask within VIO is the tracking of features, which can be achieved through Optical Flow (OF). As the calculation of OF is a resource-demanding task in terms of computational load and memory footprint, which needs to be executed at low latency, especially in robotic applications, OF estimation is today performed on powerful CPUs or GPUs. This restricts its use in a broad spectrum of applications where the deployment of such powerful, power-hungry processors is unfeasible due to constraints related to cost, size, and power consumption. On-sensor hardware acceleration is a promising approach to enable low latency VIO even on resource-constrained devices such as nano drones. This paper assesses the speed-up in a VIO sensor system exploiting a compact OF sensor consisting of a global shutter camera and an Application Specific Integrated Circuit (ASIC). By replacing the feature tracking logic of the VINS-Mono pipeline with data from this OF camera, we demonstrate a 49.4% reduction in latency and a 53.7% reduction of compute load of the VIO pipeline over the original VINS-Mono implementation, allowing VINS-Mono operation up to 50 FPS instead of 20 FPS on the quad-core ARM Cortex-A72 processor of a Raspberry Pi Compute Module 4.

Submitted 19 June, 2024; originally announced June 2024.

Comments: This article has been accepted for publication in the IEEE Sensors Journal (JSEN)

arXiv:2406.06404 [pdf, other]

doi 10.1109/IWASI58316.2023.10164426

A LoRa-based Energy-efficient Sensing System for Urban Data Collection

Authors: Lukas Schulthess, Tiago Salzmann, Christian Vogt, Michele Magno

Abstract: Nowadays, cities provide much more than shopping opportunities or working spaces. Individual locations such as parks and squares are used as meeting points and local recreation areas by many people. To ensure that they remain attractive in the future, the design of such squares must be regularly adapted to the needs of the public. These utilization trends can be derived using public data collection. The more diverse and rich the data sets are, the easier it is to optimize public space design through data analysis. Traditional data collection methods such as questionnaires, observations, or videos are either labor intensive or cannot guarantee to preserve the individual's privacy. This work presents a privacy-preserving, low-power, and low-cost smart sensing system that is capable of anonymously collecting data about public space utilization by analyzing the occupancy distribution of public seating. To support future urban planning the sensor nodes are capable of monitoring environmental noise, chair utilization, and their position, temperature, and humidity and provide them over a city-wide Long Range Wide Area Network (LoRaWAN). The final sensing system's robust operation is proven in a trial run at two public squares in a city with 16 sensor nodes over a duration of two months. By consuming 33.65 mWh per day with all subsystems enabled, including sitting detection based on a continuous acceleration measurement operating on a robust and simple threshold algorithm, the custom-designed sensor node achieves continuous monitoring during the 2-month trial run. The evaluation of the experimental results clearly shows how the two locations are used, which confirms the practicability of the proposed solution. All data collected during the field trial is publicly available as open data.

Submitted 10 June, 2024; originally announced June 2024.
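
The abstract above mentions sitting detection via "a robust and simple threshold algorithm" on continuous acceleration data but gives no details. The following is only an illustrative sketch of what such a threshold detector could look like, not the authors' implementation. The function name, the 1 g baseline, the 0.15 g threshold, and the three-sample debounce are all assumptions chosen for the example.

```python
def detect_sitting(samples_g, baseline_g=1.0, threshold_g=0.15, min_samples=3):
    """Return True if the acceleration trace suggests a sitting event.

    Hypothetical rule (not from the paper): an event is flagged when at least
    `min_samples` consecutive readings deviate from the resting baseline by
    more than `threshold_g`. All default values are illustrative assumptions.
    """
    run = 0  # length of the current run of above-threshold samples
    for a in samples_g:
        if abs(a - baseline_g) > threshold_g:
            run += 1
            if run >= min_samples:
                return True
        else:
            run = 0  # deviation ended before the debounce length was reached
    return False


if __name__ == "__main__":
    # A sustained deviation is flagged, an isolated spike is not.
    print(detect_sitting([1.0, 1.3, 1.4, 1.3, 1.0]))
    print(detect_sitting([1.0, 1.3, 1.0, 1.3, 1.0]))
```

Requiring several consecutive samples above the threshold is one common way to keep a single vibration spike from being counted as occupancy; whether the deployed nodes use a similar debounce is not stated in the abstract.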

arXiv:2406.06245 [pdf, other]

A Lora-Based and Maintenance-Free Cattle Monitoring System for Alpine Pastures and Remote Locations

Authors: Lukas Schulthess, Fabrice Longchamp, Christian Vogt, Michele Magno

Abstract: The advent of the Internet of Things (IoT) is boosting the proliferation of sensors and smart devices in industry and daily life. Continuous monitoring IoT systems are also finding applications in agriculture, particularly in the realm of smart farming. The adoption of wearable sensors to record the activity of livestock has garnered increasing interest. Such a device enables farmers to locate, monitor, and constantly assess the health status of their cattle more efficiently and effectively, even in challenging terrain and remote locations. This work presents a maintenance-free and robust smart sensing system that is capable of tracking cattle in remote locations and collecting activity parameters, such as the individual's grazing and resting time. To support the paradigm of smart farming, the cattle tracker is capable of monitoring the cow's activity by analyzing data from an accelerometer, magnetometer, temperature sensor, and Global Navigation Satellite System (GNSS) module, providing them over Long Range Wide Area Network (LoRaWAN) to a backend server. By consuming 511.9 J per day with all subsystems enabled and a data transmission every 15 minutes, the custom-designed sensor node achieves a battery lifetime of 4 months. When exploiting the integrated solar energy harvesting subsystem, this can be increased by 40%, to up to 6 months. The final sensing system's robust operation is proven in a trial run with two cows on a pasture for over three days. Evaluations of the experimental results clearly show behavior patterns, which confirms the practicability of the proposed solution.

Submitted 10 June, 2024; originally announced June 2024.

arXiv:2405.18000 [pdf, other]

A Passive and Asynchronous Wake-up Receiver for Acoustic Underwater Communication

Authors: Lukas Schulthess, Philipp Mayer, Luca Benini, Michele Magno

Abstract: Establishing reliable data exchange in an underwater domain using energy and power-efficient communication methods is crucial and challenging. Radio frequencies are absorbed by the salty and mineral-rich water and optical signals are obstructed and scattered after short distances. In contrast, acoustic communication benefits from low absorption and enables communication over long distances. Underwater communication must match low power and energy requirements as underwater sensor systems must have a long battery lifetime and need to work reliably due to their deployment and maintenance cost. For long-term deployments, the sensors' overall power consumption is determined by the power consumption during idle state. It can be reduced by integrating asynchronous always-on wake-up circuits with nano-watt power consumption. However, this approach reduces but does not eliminate idle power consumption, leaving a margin for improvement. This paper presents a passive and asynchronous wake-up receiver for acoustic underwater communication enabling zero-power always-on listening. Zero-power listening is achieved by combining energy and information transmission using a low-power wake-up receiver that extracts energy out of the acoustic signal and eliminates radio frontend idle consumption. In-field evaluations demonstrate that the wake-up circuit requires only 63 µW to detect and compare an 8-bit UUID at a data rate of 200 bps up to a distance of 5 m and that the needed energy can directly be extracted from the acoustic signal.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2403.16696 [pdf, other]

BatDeck: Advancing Nano-drone Navigation with Low-power Ultrasound-based Obstacle Avoidance

Authors: Hanna Müller, Victor Kartsch, Michele Magno, Luca Benini

Abstract: Nano-drones, distinguished by their agility, minimal weight, and cost-effectiveness, are particularly well-suited for exploration in confined, cluttered and narrow spaces. Recognizing transparent, highly reflective or absorbing materials, such as glass and metallic surfaces is challenging, as classical sensors, such as cameras or laser rangers, often do not detect them. Inspired by bats, which can fly at high speeds in complete darkness with the help of ultrasound, this paper introduces BatDeck, a pioneering sensor-deck employing a lightweight and low-power ultrasonic sensor for nano-drone autonomous navigation. This paper first provides insights about sensor characteristics, highlighting the influence of motor noise on the ultrasound readings, then it introduces the results of extensive experimental tests for obstacle avoidance (OA) in a diverse environment. Results show that BatDeck allows exploration for a flight time of 8 minutes while covering 136 m on average before crash in a challenging environment with transparent and reflective obstacles, proving the effectiveness of ultrasonic sensors for OA on nano-drones.

Submitted 25 March, 2024; originally announced March 2024.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2403.11875 [pdf, other]

Towards Real-Time Fast Unmanned Aerial Vehicle Detection Using Dynamic Vision Sensors

Authors: Jakub Mandula, Jonas Kühne, Luca Pascarella, Michele Magno

Abstract: Unmanned Aerial Vehicles (UAVs) are gaining popularity in civil and military applications. However, uncontrolled access to restricted areas threatens privacy and security. Thus, prevention and detection of UAVs are pivotal to guarantee confidentiality and safety. Although active scanning, mainly based on radars, is one of the most accurate technologies, it can be expensive and less versatile than passive inspections, e.g., object recognition. Dynamic vision sensors (DVS) are bio-inspired event-based vision models that leverage timestamped pixel-level brightness changes in fast-moving scenes that adapt well to low-latency object detection. This paper presents F-UAV-D (Fast Unmanned Aerial Vehicle Detector), an embedded system that enables fast-moving drone detection. In particular, we propose a setup to exploit DVS as an alternative to RGB cameras in a real-time and low-power configuration. Our approach leverages the high-dynamic range (HDR) and background suppression of DVS and, when trained with various fast-moving drones, outperforms RGB input in suboptimal ambient conditions such as low illumination and fast-moving scenes. Our results show that F-UAV-D can (i) detect drones by using less than 15 W on average and (ii) perform real-time inference (i.e., <50 ms) by leveraging the CPU and GPU nodes of our edge computer.

Submitted 18 March, 2024; originally announced March 2024.

Comments: Accepted at 2024 IEEE International Instrumentation and Measurement Technology Conference (I2MTC)

arXiv:2403.11784 [pdf, other]

ForzaETH Race Stack -- Scaled Autonomous Head-to-Head Racing on Fully Commercial off-the-Shelf Hardware

Authors: Nicolas Baumann, Edoardo Ghignone, Jonas Kühne, Niklas Bastuck, Jonathan Becker, Nadine Imholz, Tobias Kränzlin, Tian Yi Lim, Michael Lötscher, Luca Schwarzenbach, Luca Tognoni, Christian Vogt, Andrea Carron, Michele Magno

Abstract: Autonomous racing in robotics combines high-speed dynamics with the necessity for reliability and real-time decision-making. While such racing pushes software and hardware to their limits, many existing full-system solutions necessitate complex, custom hardware and software, and usually focus on Time-Trials rather than full unrestricted Head-to-Head racing, due to financial and safety constraints. This limits their reproducibility, making advancements and replication feasible mostly for well-resourced laboratories with comprehensive expertise in mechanical, electrical, and robotics fields. Researchers interested in the autonomy domain but with only partial experience in one of these fields need to spend significant time on familiarization and integration. The ForzaETH Race Stack addresses this gap by providing an autonomous racing software platform designed for F1TENTH, a 1:10 scaled Head-to-Head autonomous racing competition, which simplifies replication by using commercial off-the-shelf hardware. This approach enhances the competitive aspect of autonomous racing and provides an accessible platform for research and development in the field. The ForzaETH Race Stack is designed with modularity and operational ease of use in mind, allowing customization and adaptability to various environmental conditions, such as track friction and layout. Capable of handling both Time-Trials and Head-to-Head racing, the stack has demonstrated its effectiveness, robustness, and adaptability in the field by winning the official F1TENTH international competition multiple times.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.07915 [pdf, other]

CycloWatt: An Affordable, TinyML-enhanced IoT Device Revolutionizing Cycling Power Metrics

Authors: Victor Luder, Sizhen Bian, Michele Magno

Abstract: Cycling power measurement is an indispensable metric with profound implications for cyclists' performance and fitness levels. It empowers riders with real-time feedback, supports precise training regimen planning, mitigates injury risks, and enhances muscular development. Despite these advantages, the widespread adoption of cycling power meters has been hampered by their prohibitive cost and deployment complexity. This paper pioneers a groundbreaking approach to power measurement in cycling, prioritizing affordability and user-friendliness. To achieve this goal, we introduce a cutting-edge Internet of Things (IoT) device that seamlessly integrates force signals with inertial sensor data while leveraging the power of edge machine learning techniques. In-field experimental evaluations demonstrate that our prototype can estimate power with remarkable accuracy, boasting a Mean Absolute Error (MAE) of only 12.29 Watts (4.1%). Notably, our design emphasizes energy efficiency, operating in a low-power mode that consumes a mere 50 milliwatts and offers an exceptional battery life of up to 25.8 hours in always-on active mode. With an ultra-low latency of 4.33 milliseconds for data processing and inference, our system ensures real-time power estimation during cycling activities. Incorporating IoT concepts and devices, this paper marks a significant milestone in developing cost-effective and accurate cycling power meters.

Submitted 27 February, 2024; originally announced March 2024.

arXiv:2401.07658 [pdf, other]

Robustness Evaluation of Localization Techniques for Autonomous Racing

Authors: Tian Yi Lim, Edoardo Ghignone, Nicolas Baumann, Michele Magno

Abstract: This work introduces SynPF, an MCL-based algorithm tailored for high-speed racing environments. Benchmarked against Cartographer, a state-of-the-art pose-graph SLAM algorithm, SynPF leverages synergies from previous particle-filtering methods and synthesizes them for the high-performance racing domain. Our extensive in-field evaluations reveal that while Cartographer excels under nominal conditions, it struggles when subjected to wheel-slip, a common phenomenon in a racing scenario due to varying grip levels and aggressive driving behaviour. Conversely, SynPF demonstrates robustness in these challenging conditions and a low-latency computation time of 1.25 ms on on-board computers without a GPU. Using the F1TENTH platform, a 1:10 scaled autonomous racing vehicle, this work not only highlights the vulnerabilities of existing algorithms in high-speed scenarios, tested up until 7.6 m/s, but also emphasizes the potential of SynPF as a viable alternative, especially in deteriorating odometry conditions.

Submitted 26 March, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

Comments: Accepted at the Design, Automation and Test in Europe Conference 2024 as an extended abstract

arXiv:2401.06000 [pdf, other]

Body-Area Capacitive or Electric Field Sensing for Human Activity Recognition and Human-Computer Interaction: A Comprehensive Survey

Authors: Sizhen Bian, Mengxi Liu, Bo Zhou, Paul Lukowicz, Michele Magno

Abstract: Due to the fact that roughly sixty percent of the human body is essentially composed of water, the human body is inherently a conductive object, being able to, firstly, form an inherent electric field from the body to the surroundings and secondly, deform the distribution of an existing electric field near the body. Body-area capacitive sensing, also called body-area electric field sensing, is becoming a promising alternative for wearable devices to accomplish certain tasks in human activity recognition and human-computer interaction. Over the last decade, researchers have explored plentiful novel sensing systems backed by the body-area electric field. On the other hand, despite the pervasive exploration of the body-area electric field, a comprehensive survey does not exist for an enlightening guideline. Moreover, the various hardware implementations, applied algorithms, and targeted applications result in a challenging task to achieve a systematic overview of the subject. This paper aims to fill in the gap by comprehensively summarizing the existing works on body-area capacitive sensing so that researchers can have a better view of the current exploration status. To this end, we first sorted the explorations into three domains according to the involved body forms: body-part electric field, whole-body electric field, and body-to-body electric field, and enumerated the state-of-the-art works in the domains with a detailed survey of the backed sensing tricks and targeted applications. We then summarized the three types of sensing frontends in circuit design, which is the most critical part in body-area capacitive sensing, and analyzed the data processing pipeline categorized into three kinds of approaches. Finally, we described the challenges and outlooks of body-area electric sensing.

Submitted 11 January, 2024; originally announced January 2024.

arXiv:2312.13672 [pdf, other]

doi 10.1109/TIM.2023.3282289

Angle of Arrival and Centimeter Distance Estimation on a Smart UWB Sensor Node

Authors: Tobias Margiani, Silvano Cortesi, Milena Keller, Christian Vogt, Tommaso Polonelli, Michele Magno

Abstract: Accurate and low-power indoor localization is becoming more and more of a necessity to empower novel consumer and industrial applications. In this field, the most promising technology is based on UWB modulation; however, current UWB positioning systems do not reach centimeter accuracy in general deployments due to multipath and nonisotropic antennas, still necessitating several fixed anchors to estimate an object's position in space. This article presents an in-depth study and assessment of angle of arrival (AoA) UWB measurements using a compact, low-power solution integrating a novel commercial module with phase difference of arrival (PDoA) estimation as integrated feature. Results demonstrate the possibility of reaching centimeter distance precision and 2.4° average angular accuracy in many operative conditions, e.g., in a 90° range around the center. Moreover, integrating the channel impulse response, the phase differential of arrival, and the point-to-point distance, an error correction model is discussed to compensate for reflections, multipaths, and front-back ambiguity.

Submitted 21 December, 2023; originally announced December 2023.

Comments: This article has been accepted for publication in the IEEE Transactions on Instrumentation and Measurement journal. DOI: https://doi.org/10.1109/TIM.2023.3282289

arXiv:2312.09854 [pdf]

Q-Segment: Segmenting Images In-Sensor for Vessel-Based Medical Diagnosis

Authors: Pietro Bonazzi, Yawei Li, Sizhen Bian, Michele Magno

Abstract: This paper addresses the growing interest in deploying deep learning models directly in-sensor. We present "Q-Segment", a quantized real-time segmentation algorithm, and conduct a comprehensive evaluation on a low-power edge vision platform with an in-sensor processor, the Sony IMX500. One of the main goals of the model is to achieve end-to-end image segmentation for vessel-based medical diagnosis. Deployed on the IMX500 platform, Q-Segment achieves an ultra-low in-sensor inference time of only 0.23 ms and a power consumption of only 72 mW. We compare the proposed network with state-of-the-art models, both float and quantized, demonstrating that the proposed solution outperforms existing networks on various platforms in computing efficiency, e.g., by a factor of 75x compared to ERFNet. The network employs an encoder-decoder structure with skip connections, and results in a binary accuracy of 97.25% and an Area Under the Receiver Operating Characteristic Curve (AUC) of 96.97% on the CHASE dataset. We also present a comparison of the IMX500 processing core with the Sony Spresense, a low-power multi-core ARM Cortex-M microcontroller, and a single-core ARM Cortex-M4 showing that it can achieve in-sensor processing with end-to-end low latency (17 ms) and power consumption (254 mW). This research contributes valuable insights into edge-based image segmentation, laying the foundation for efficient algorithms tailored to low-power environments.

Submitted 4 March, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

arXiv:2311.14523 [pdf, other]

doi 10.1109/SENSORS56945.2023.10325219

Evaluation of a Non-Coherent Ultra-Wideband Transceiver for Micropower Sensor Nodes

Authors: Jonah Imfeld, Silvano Cortesi, Philipp Mayer, Michele Magno

Abstract: Spatial and contextual awareness has the potential to revolutionize sensor nodes, enabling spatially augmented data collection and location-based services. With its high bandwidth, superior energy efficiency, and precise time-of-flight measurements, ultra-wideband (UWB) technology emerges as an ideal solution for such devices. This paper presents an evaluation and comparison of a non-coherent UWB transceiver within the context of highly energy-constrained wireless sensing nodes and pervasive Internet of Things (IoT) devices. Experimental results highlight the unique properties of UWB transceivers, showcasing efficient data transfer ranging from 2 kbit/s to 7.2 Mbit/s while reaching an energy consumption of 0.29 nJ/bit and 1.39 nJ/bit for transmitting and receiving, respectively. Notably, a ranging accuracy of up to +/-25 cm can be achieved. Moreover, the peak power consumption of the UWB transceiver, at 6.7 mW in TX and 23 mW in RX, is significantly lower than that of other commercial UWB transceivers.

Submitted 21 December, 2023; v1 submitted 24 November, 2023; originally announced November 2023.

Comments: This article has been accepted for publication in the Proceedings of the 2023 IEEE SENSORS conference. DOI: https://doi.org/10.1109/SENSORS56945.2023.10325219

arXiv:2311.01881 [pdf, other]

Quantitative Evaluation of a Multi-Modal Camera Setup for Fusing Event Data with RGB Images

Authors: Julian Moosmann, Jakub Mandula, Philipp Mayer, Luca Benini, Michele Magno

Abstract: Event-based cameras, also called silicon retinas, potentially revolutionize computer vision by detecting and reporting significant changes in intensity as asynchronous events, offering extended dynamic range, low latency, and low power consumption, enabling a wide range of applications from autonomous driving to longtime surveillance. As an emerging technology, there is a notable scarcity of publicly available datasets for event-based systems that also feature frame-based cameras, in order to exploit the benefits of both technologies. This work quantitatively evaluates a multi-modal camera setup for fusing high-resolution DVS data with RGB image data by static camera alignment. The proposed setup, which is intended for semi-automatic DVS data labeling, combines two recently released Prophesee EVK4 DVS cameras and one global shutter XIMEA MQ022CG-CM RGB camera. After alignment, state-of-the-art object detection or segmentation networks label the image data by mapping bounding boxes or labeled pixels directly to the aligned events. To facilitate this process, various time-based synchronization methods for DVS data are analyzed, and calibration accuracy, camera alignment, and lens impact are evaluated. Experimental results demonstrate the benefits of the proposed system: the best synchronization method yields an image calibration error of less than 0.90 px and a pixel cross-correlation deviation of 1.6 px, while a lens with 8 mm focal length enables detection of objects with size 30 cm at a distance of 350 m against homogeneous background.

Submitted 3 November, 2023; originally announced November 2023.

arXiv:2310.20331 [pdf, other]

doi 10.1145/3628353.3628545

Energy-Aware Adaptive Sampling for Self-Sustainability in Resource-Constrained IoT Devices

Authors: Marco Giordano, Silvano Cortesi, Prodromos-Vasileios Mekikis, Michele Crabolu, Giovanni Bellusci, Michele Magno

Abstract: In the ever-growing Internet of Things (IoT) landscape, smart power management algorithms combined with energy harvesting solutions are crucial to obtain self-sustainability. This paper presents an energy-aware adaptive sampling rate algorithm designed for embedded deployment in resource-constrained, battery-powered IoT devices. The algorithm, based on a finite state machine (FSM) and inspired by Transmission Control Protocol (TCP) Reno's additive increase and multiplicative decrease, maximizes sensor sampling rates, ensuring power self-sustainability without risking battery depletion. Moreover, we characterized our solar cell with data acquired over 48 days and used the model created to obtain energy data from an open-source world-wide dataset. To validate our approach, we introduce the EcoTrack device, a versatile device with global navigation satellite system (GNSS) capabilities and Long-Term Evolution Machine Type Communication (LTE-M) connectivity, supporting MQTT protocol for cloud data relay. This multi-purpose device can be used, for instance, as a health and safety wearable, remote hazard monitoring system, or as a global asset tracker. The results, validated on data from three different European cities, show that the proposed algorithm enables self-sustainability while maximizing sampled locations per day. In experiments conducted with a 3000 mAh battery capacity, the algorithm consistently maintained a minimum of 24 localizations per day and achieved peaks of up to 3000.

Submitted 21 December, 2023; v1 submitted 31 October, 2023; originally announced October 2023.

Comments: This article has been accepted for publication in the Proceedings of the 11th International Workshop on Energy Harvesting and Energy-Neutral Sensing Systems (ENSSys '23). DOI: https://doi.org/10.1145/3628353.3628545
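The additive-increase, multiplicative-decrease (AIMD) idea named in the abstract above can be illustrated with a minimal sketch. This is not the authors' finite state machine: the function name `next_rate`, the `energy_ok` flag, the step size, the back-off factor, and the clamp bounds are all assumptions chosen for illustration (the bounds merely echo the 24 and 3000 localizations per day reported in the abstract).

```python
def next_rate(rate, energy_ok, *, add_step=1.0, mult_factor=0.5,
              min_rate=24.0, max_rate=3000.0):
    """Return the next sampling rate (samples/day) under a generic AIMD rule.

    All parameters are illustrative assumptions, not values from the paper.
    """
    if energy_ok:
        # Additive increase: probe for a slightly higher rate.
        rate = rate + add_step
    else:
        # Multiplicative decrease: back off quickly when energy is short.
        rate = rate * mult_factor
    # Keep the rate inside the assumed operating range.
    return max(min_rate, min(max_rate, rate))


if __name__ == "__main__":
    rate = 100.0
    # Hypothetical energy budget checks over six update periods.
    for ok in [True, True, True, False, True, True]:
        rate = next_rate(rate, ok)
        print(rate)
```

The asymmetry is the point of the design: slow increases probe for spare harvested energy, while a fast back-off protects the battery when the budget is exceeded.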

arXiv:2310.14758 [pdf, other]

Optimizing IoT-Based Asset and Utilization Tracking: Efficient Activity Classification with MiniRocket on Resource-Constrained Devices

Authors: Marco Giordano, Silvano Cortesi, Michele Crabolu, Lavinia Pedrollo, Giovanni Bellusci, Tommaso Bendinelli, Engin Türetken, Andrea Dunbar, Michele Magno

Abstract: This paper introduces an effective solution for retrofitting construction power tools with low-power IoT to enable accurate activity classification. We address the challenge of distinguishing between when a power tool is being moved and when it is actually being used. To achieve both high classification accuracy and low power consumption, a newly released algorithm called MiniRocket was employed. Known for its accuracy, scalability, and fast training for time-series classification, it is proposed in this paper as a TinyML algorithm for inference on resource-constrained IoT devices. The paper demonstrates the portability and performance of MiniRocket on a resource-constrained, ultra-low power sensor node for floating-point and fixed-point arithmetic, with the fixed-point version matching the floating-point accuracy to within 1%. The hyperparameters of the algorithm have been optimized for the task at hand to find a Pareto point that balances memory usage, accuracy and energy consumption. For the classification problem, we rely on an accelerometer as the sole sensor source, and BLE for data transmission. Extensive real-world construction data, using 16 different power tools, were collected, labeled, and used to validate the algorithm's performance directly embedded in the IoT device. Experimental results demonstrate that the proposed solution achieves an accuracy of 96.9% in distinguishing between real usage status and other motion statuses while consuming only 7kB of flash and 3kB of RAM. The final application exhibits an average current consumption of less than 15μW for the whole system, resulting in battery life performance ranging from 3 to 9 years depending on the battery capacity (250-500mAh) and the number of power tool usage hours (100-1500h).

Submitted 23 October, 2023; originally announced October 2023.

arXiv:2310.14704 [pdf, other]

doi 10.1109/WiMob52687.2021.9606272

Design and Implementation of an RSSI-Based Bluetooth Low Energy Indoor Localization System

Authors: Silvano Cortesi, Marc Dreher, Michele Magno

Abstract: An Indoor Positioning System (IPS) is a crucial technology that enables medical staff and hospital management to accurately locate and track persons or assets inside medical buildings. Among other technologies, Bluetooth Low Energy (BLE) can be exploited for achieving an energy-efficient and low-cost solution. This work presents the design and implementation of a received signal strength indicator (RSSI)-based indoor localization system. The paper shows the implementation of a low-complexity weighted k-Nearest Neighbors algorithm that processes raw RSSI data from connection-less iBeacons. The designed hardware and firmware are implemented around the low-power and low-cost nRF52832 from Nordic Semiconductor. The system, with real-time data processing, has been evaluated experimentally in a 7.2 m by 7.2 m room with furniture and 5 beacon nodes. The experimental results show an average error of only 0.72 m in realistic conditions. Finally, the overall power consumption of the fixed beacon with a periodic advertisement of 100 ms is only 50 uA at 3 V, which leads to a long-lasting solution of over one year with a 500 mAh coin battery.

Submitted 23 October, 2023; originally announced October 2023.

Comments: This article has been accepted for publication in the proceedings of the 2021 IEEE International Conference on Wireless and Mobile Computing, Networking And Communications (WiMob). DOI: 10.1109/WiMob52687.2021.9606272
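As a rough illustration of the weighted k-Nearest Neighbors localization mentioned in the abstract above, the sketch below estimates a position from an RSSI vector by inverse-distance weighting over the k closest reference fingerprints. It is a generic textbook formulation, not the paper's implementation: the function name `wknn_locate`, the fingerprint layout, the Euclidean distance in RSSI space, and the inverse-distance weights are assumptions.

```python
import math


def wknn_locate(fingerprints, rssi, k=3, eps=1e-6):
    """Estimate (x, y) from an observed RSSI vector.

    fingerprints: list of ((x, y), [rssi_dbm per beacon]) reference points
                  (hypothetical layout, assumed for this sketch).
    rssi:         observed RSSI vector, same beacon order as the references.
    """
    scored = []
    for (x, y), ref in fingerprints:
        # Euclidean distance in signal space between reference and observation.
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(ref, rssi)))
        scored.append((d, x, y))
    scored.sort(key=lambda t: t[0])
    nearest = scored[:k]

    # Closer fingerprints get larger weights; eps avoids division by zero.
    weights = [1.0 / (d + eps) for d, _, _ in nearest]
    total = sum(weights)
    x = sum(w * px for w, (_, px, _) in zip(weights, nearest)) / total
    y = sum(w * py for w, (_, _, py) in zip(weights, nearest)) / total
    return x, y


if __name__ == "__main__":
    refs = [((0.0, 0.0), [-50.0, -70.0]),
            ((2.0, 0.0), [-70.0, -50.0]),
            ((1.0, 3.0), [-80.0, -80.0])]
    print(wknn_locate(refs, [-60.0, -60.0], k=2))
```

With k=1 the estimate snaps to the single closest fingerprint; larger k trades responsiveness for smoothing of noisy RSSI readings.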

arXiv:2310.14681 [pdf, other]

doi 10.1109/WiMob58348.2023.10187799

Latency and Power Consumption in 2.4 GHz IoT Wireless Mesh Nodes: An Experimental Evaluation of Bluetooth Mesh and Wirepas Mesh

Authors: Silvano Cortesi, Christian Vogt, Elio Reinschmidt, Michele Magno

Abstract: The rapid growth of the Internet of Things paradigm is pushing the need to connect billions of battery-operated devices to the internet and to each other. To address this need, the introduction of energy-efficient wireless mesh networks based on Bluetooth provides an effective solution. This paper proposes a testbed setup to accurately evaluate and compare the standard Bluetooth Mesh 5.0 and the emerging energy-efficient Wirepas protocol that promises better performance. The paper presents the evaluation in terms of power consumption, energy efficiency, and transmission latency, which are the most crucial features, in a controlled and reproducible test setup consisting of 10 nodes. Experimental results demonstrated that Wirepas has a median latency of 2.83 ms in Low-Latency mode and around 2 s in Low-Energy mode. The corresponding power consumption is 6.2 mA in Low-Latency mode and 38.9 uA in Low-Energy mode. For Bluetooth Mesh the median latency is 4.54 ms with a power consumption of 6.2 mA at 3.3 V. Based on this comparison, conclusions about the advantages and disadvantages of both technologies can be drawn.

Submitted 23 October, 2023; originally announced October 2023.

Comments: This article has been accepted for publication in the proceedings of the 2023 IEEE International Conference on Wireless and Mobile Computing, Networking And Communications (WiMob). DOI: 10.1109/WiMob58348.2023.10187799

arXiv:2309.14569 [pdf, other]

Towards a Novel Ultrasound System Based on Low-Frequency Feature Extraction From a Fully-Printed Flexible Transducer

Authors: Marco Giordano, Kirill Keller, Francesco Greco, Luca Benini, Michele Magno, Christoph Leitner

Abstract: Ultrasound is a key technology in healthcare, and it is being explored for non-invasive, wearable, continuous monitoring of vital signs. However, its widespread adoption in this scenario is still hindered by the size, complexity, and power consumption of current devices. Moreover, such an application demands adaptability to human anatomy, which is hard to achieve with current transducer technology. This paper presents a novel ultrasound system prototype based on a fully printed, lead-free, and flexible polymer ultrasound transducer, whose bending radius promises good adaptability to the human anatomy. Our application scenario focuses on continuous blood flow monitoring. We implemented a hardware envelope filter to efficiently transpose high-frequency ultrasound signals to a lower-frequency spectrum. This reduces computational and power demands with little to no degradation in the task proposed for this work. We validated our method on a setup that mimics human blood flow by using a flow phantom and a peristaltic pump simulating 3 different heartbeat rhythms: 60, 90 and 120 beats per minute. Our ultrasound setup reconstructs peristaltic pump frequencies with errors of less than 0.05 Hz (3 bpm) from the set pump frequency, both for the raw echo and the enveloped echo. The analog pre-processing showed a promising reduction of signal bandwidth of more than 6x: pulse-echo signals of transducers excited at 12.5 MHz were reduced to about 2 MHz. This allows consumer MCUs to acquire and process signals within the mW power range in an inexpensive fashion.

Submitted 25 September, 2023; originally announced September 2023.

Comments: 5 pages, 2 tables, 3 figures, Accepted at IEEE BioCAS 2023

arXiv:2309.14455 [pdf, other]

Skilog: A Smart Sensor System for Performance Analysis and Biofeedback in Ski Jumping

Authors: Lukas Schulthess, Thorir Mar Ingolfsson, Marc Nölke, Michele Magno, Luca Benini, Christoph Leitner

Abstract: In ski jumping, low repetition rates of jumps limit the effectiveness of training. Thus, increasing the learning rate within every single jump is key to success. A critical element of athlete training is motor learning, which has been shown to be accelerated by feedback methods. In particular, a fine-grained control of the center of gravity in the in-run is essential. This is because the actual takeoff occurs within a blink of an eye ($\sim$300ms), thus any unbalanced body posture during the in-run will affect flight. This paper presents a smart, compact, and energy-efficient wireless sensor system for real-time performance analysis and biofeedback during ski jumping. The system operates by gauging foot pressures at three distinct points on the insoles of the ski boot at 100Hz. Foot pressure data can either be directly sent to coaches to improve their feedback, or fed into an ML model to give athletes instantaneous in-action feedback using a vibration motor in the ski boot. In the biofeedback scenario, foot pressures act as input variables for an optimized XGBoost model. We achieve a high predictive accuracy of 92.7% for center of mass predictions (dorsal shift, neutral stand, ventral shift). Subsequently, we parallelized and fine-tuned our XGBoost model for a RISC-V based low power parallel processor (GAP9), based on the PULP architecture. We demonstrate real-time detection and feedback (0.0109ms/inference) using our on-chip deployment. The proposed smart system is unobtrusive with a slim form factor (13mm baseboard, 3.2mm antenna) and a lightweight build (26g). Power consumption analysis reveals that the system's energy-efficient design enables sustained operation over multiple days (up to 300 hours) without requiring recharge.

Submitted 25 September, 2023; originally announced September 2023.

Comments: 5 pages, 2 tables, 4 figures, Accepted at IEEE BioCAS 2023

arXiv:2309.08317 [pdf, other]

doi 10.1109/MeMeA57477.2023.10171940

Investigation of mmWave Radar Technology For Non-contact Vital Sign Monitoring

Authors: Steven Marty, Federico Pantanella, Andrea Ronco, Kanika Dheman, Michele Magno

Abstract: Non-contact vital sign monitoring has many advantages over conventional methods in being comfortable, unobtrusive and without any risk of spreading infection. The use of millimeter-wave (mmWave) radars is one of the most promising approaches that enable contact-less monitoring of vital signs. Novel low-power implementations of this technology promise to enable vital sign sensing in embedded, battery-operated devices. The nature of these new low-power sensors exacerbates the challenges of accurate and robust vital sign monitoring and especially the problem of heart-rate tracking. This work focuses on the investigation and characterization of three Frequency Modulated Continuous Wave (FMCW) low-power radars with different carrier frequencies of 24 GHz, 60 GHz and 120 GHz. The evaluation platforms were first tested on phantom models that emulated human bodies to accurately evaluate the baseline noise, error in range estimation, and error in displacement estimation. Additionally, the systems were also used to collect data from three human subjects to gauge the feasibility of identifying heartbeat peaks and breathing peaks with simple and lightweight algorithms that could potentially run in low-power embedded processors. The investigation revealed that the 24 GHz radar has the highest baseline noise level, 0.04mm at 0° angle of incidence, and an error in range estimation of 3.45 +- 1.88 cm at a distance of 60 cm. At the same distance, the 60 GHz and the 120 GHz radar systems show the lowest noise level, 0.01mm at 0° angle of incidence, and errors in range estimation of 0.64 +- 0.01 cm and 0.04 +- 0.0 cm respectively. Additionally, tests on humans showed that all three radar systems were able to identify heart and breathing activity but the 120 GHz radar system outperformed the other two.

Submitted 15 September, 2023; originally announced September 2023.

arXiv:2309.02393 [pdf, other]

doi 10.1145/3576842.3582365

In-Ear-Voice: Towards Milli-Watt Audio Enhancement With Bone-Conduction Microphones for In-Ear Sensing Platforms

Authors: Philipp Schilk, Niccolò Polvani, Andrea Ronco, Milos Cernak, Michele Magno

Abstract: The recent ubiquitous adoption of remote conferencing has been accompanied by omnipresent frustration with distorted or otherwise unclear voice communication. Audio enhancement can compensate for low-quality input signals from, for example, small true wireless earbuds, by applying noise suppression techniques. Such processing relies on voice activity detection (VAD) with low latency and the added capability of discriminating the wearer's voice from others - a task of significant computational complexity. The tight energy budget of devices as small as modern earphones, however, requires any system attempting to tackle this problem to do so with minimal power and processing overhead, while not relying on speaker-specific voice samples and training due to usability concerns. This paper presents the design and implementation of a custom research platform for low-power wireless earbuds based on novel, commercial, MEMS bone-conduction microphones. Such microphones can record the wearer's speech with much greater isolation, enabling personalized voice activity detection and further audio enhancement applications. Furthermore, the paper accurately evaluates a proposed low-power personalized speech detection algorithm based on bone conduction data and a recurrent neural network running on the implemented research platform. This algorithm is compared to an approach based on traditional microphone input. The performance of the bone conduction system, achieving detection of speech within 12.8ms at an accuracy of 95%, is evaluated. Different SoC choices are contrasted, with the final implementation based on the cutting-edge Ambiq Apollo 4 Blue SoC achieving 2.64mW average power consumption at 14uJ per inference, reaching 43h of battery life on a miniature 32mAh li-ion cell and without duty cycling.

Submitted 5 September, 2023; originally announced September 2023.

arXiv:2309.01647 [pdf, other]

doi 10.1109/IWASI58316.2023.10164312

Towards Robust Velocity and Position Estimation of Opponents for Autonomous Racing Using Low-Power Radar

Authors: Andrea Ronco, Nicolas Baumann, Marco Giordano, Michele Magno

Abstract: This paper presents the design and development of an intelligent subsystem that includes a novel low-power radar sensor integrated into an autonomous racing perception pipeline to robustly estimate the position and velocity of dynamic obstacles. The proposed system, based on the Infineon BGT60TR13D radar, is evaluated in a real-world scenario with scaled race cars. The paper explores the benefits and limitations of using such a sensor subsystem and draws conclusions based on field-collected data. The results demonstrate a tracking error up to 0.21 +- 0.29 m in distance estimation and 0.39 +- 0.19 m/s in velocity estimation, despite a power consumption in the range of tens of milliwatts. The presented system provides complementary information to other sensors such as LiDAR and camera, and can be used in a wide range of applications beyond autonomous racing.

Submitted 4 September, 2023; originally announced September 2023.

arXiv:2307.05999 [pdf, other]

doi 10.1109/ACCESS.2024.3404878

Flexible and Fully Quantized Ultra-Lightweight TinyissimoYOLO for Ultra-Low-Power Edge Systems

Authors: Julian Moosmann, Hanna Mueller, Nicky Zimmerman, Georg Rutishauser, Luca Benini, Michele Magno

Abstract: This paper deploys and explores variants of TinyissimoYOLO, a highly flexible and fully quantized ultra-lightweight object detection network designed for edge systems with a power envelope of a few milliwatts. With experimental measurements, we present a comprehensive characterization of the network's detection performance, exploring the impact of various parameters, including input resolution, number of object classes, and hidden layer adjustments. We deploy variants of TinyissimoYOLO on state-of-the-art ultra-low-power extreme edge platforms, presenting an in-depth comparison on latency, energy efficiency, and their ability to efficiently parallelize the workload. In particular, the paper presents a comparison between a novel parallel RISC-V processor (GAP9 from Greenwaves) with and without use of its on-chip hardware accelerator, an ARM Cortex-M7 core (STM32H7 from ST Microelectronics), two ARM Cortex-M4 cores (STM32L4 from STM and Apollo4b from Ambiq), and a multi-core platform with a CNN hardware accelerator (Analog Devices MAX78000). Experimental results show that the GAP9's hardware accelerator achieves the lowest inference latency and energy at 2.12ms and 150uJ respectively, which is around 2x faster and 20% more efficient than the next best platform, the MAX78000. The hardware accelerator of GAP9 can even run an increased resolution version of TinyissimoYOLO with 112x112 pixels and 10 detection classes within 3.2ms, consuming 245uJ. To showcase the competitiveness of a versatile general-purpose system we also deployed and profiled a multi-core implementation on GAP9 at different operating points, achieving 11.3ms with the lowest-latency and 490uJ with the most energy-efficient configuration. With this paper, we demonstrate the suitability and flexibility of TinyissimoYOLO on state-of-the-art detection datasets for real-time ultra-low-power edge inference.

Submitted 14 July, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

Comments: * The first three authors contributed equally to this research

arXiv:2306.00001 [pdf, other]

doi 10.1109/AICAS57966.2023.10168657

TinyissimoYOLO: A Quantized, Low-Memory Footprint, TinyML Object Detection Network for Low Power Microcontrollers

Authors: Julian Moosmann, Marco Giordano, Christian Vogt, Michele Magno

Abstract: This paper introduces a highly flexible, quantized, memory-efficient, and ultra-lightweight object detection network, called TinyissimoYOLO. It aims to enable object detection on microcontrollers in the power domain of milliwatts, with less than 0.5MB memory available for storing convolutional neural network (CNN) weights. The proposed quantized network architecture with 422k parameters enables real-time object detection on embedded microcontrollers, and it has been evaluated to exploit CNN accelerators. In particular, the proposed network has been deployed on the MAX78000 microcontroller, achieving a high frame rate of up to 180fps and an ultra-low energy consumption of only 196μJ per inference with an inference efficiency of more than 106 MAC/Cycle. TinyissimoYOLO can be trained for any multi-object detection. However, considering the small network size, adding object detection classes will increase the size and memory consumption of the network, thus object detection with up to 3 classes is demonstrated. Furthermore, the network is trained using quantization-aware training and deployed with 8-bit quantization on different microcontrollers, such as STM32H7A3, STM32L4R9, Apollo4b and on the MAX78000's CNN accelerator. Performance evaluations are presented in this paper.

Submitted 12 July, 2023; v1 submitted 22 May, 2023; originally announced June 2023.

Comments: Published In: 2023 IEEE 5th International Conference on Artificial Intelligence Circuits and Systems (AICAS)

arXiv:2305.18371 [pdf, other]

ColibriUAV: An Ultra-Fast, Energy-Efficient Neuromorphic Edge Processing UAV-Platform with Event-Based and Frame-Based Cameras

Authors: Sizhen Bian, Lukas Schulthess, Georg Rutishauser, Alfio Di Mauro, Luca Benini, Michele Magno

Abstract: The interest in dynamic vision sensor (DVS)-powered unmanned aerial vehicles (UAV) is rising, especially due to the microsecond-level reaction time of the bio-inspired event sensor, which increases robustness and reduces latency of the perception tasks compared to an RGB camera. This work presents ColibriUAV, a UAV platform with both frame-based and event-based camera interfaces for efficient perception and near-sensor processing. The proposed platform is designed around Kraken, a novel low-power RISC-V System on Chip with two hardware accelerators targeting spiking neural networks and deep ternary neural networks. Kraken is capable of efficiently processing both event data from a DVS camera and frame data from an RGB camera. A key feature of Kraken is its integrated, dedicated interface with a DVS camera. This paper benchmarks the end-to-end latency and power efficiency of the neuromorphic and event-based UAV subsystem, demonstrating state-of-the-art event data with a throughput of 7200 frames of events per second and a power consumption of 10.7 mW, which is over 6.6 times faster and a hundred times less power-consuming than the widely-used data reading approach through the USB interface. The overall sensing and processing power consumption is below 50 mW, achieving latency in the milliseconds range, making the platform suitable for low-latency autonomous nano-drones as well.

Submitted 27 May, 2023; originally announced May 2023.

arXiv:2305.17594 [pdf, other]

doi 10.1109/MetroInd4.0IoT57462.2023.10180177

Fully Automatic Gym Exercises Recording: An IoT Solution

Authors: Sizhen Bian, Alexander Rupp, Michele Magno

Abstract: In recent years, working out in the gym has become increasingly data-focused, and many gym enthusiasts are recording their exercises to have a better overview of their historical gym activities and to make a better exercise plan for the future. As a side effect, this recording process has led to a lot of time spent painstakingly operating these apps by entering the types of equipment used and the repetitions. This project aims to automate this process using an Internet of Things (IoT) approach. Specifically, beacons with embedded ultra-low-power inertial measurement units (IMUs) are attached to the types of equipment to recognize the usage and transmit the information to gym-goers and managers. We have created a small ecosystem composed of beacons, a gateway, smartwatches, Android/iPhone applications, a Firebase cloud server, and a dashboard, all communicating over a mixture of Bluetooth and Wi-Fi to distribute collected data from machines to users and gym managers in a compact and meaningful way. The system we have implemented is a working prototype of a bigger end goal and is supposed to initialize progress toward a smarter, more efficient, and still privacy-respecting gym environment in the future. A small-scale real-life test shows 94.6% accuracy in user gym session recording, which can reach up to 100% easily with a more suitable assembling of the beacons. This promising result shows the potential of a fully automatic exercise recording system, which enables comprehensive monitoring and analysis of the exercise sessions and frees the user from manual recording. The estimated battery life of the beacon is 400 days with a 210 mAh coin battery. We also discussed the shortcomings of the current demonstration system and the future work for a reliable and ready-to-deploy automatic gym workout recording system.

Submitted 27 May, 2023; originally announced May 2023.

arXiv:2305.13087 [pdf, other]

doi 10.1109/IWASI58316.2023.10164626

A Fast and Accurate Optical Flow Camera for Resource-Constrained Edge Applications

Authors: Jonas Kühne, Michele Magno, Luca Benini

Abstract: Optical Flow (OF) is the movement pattern of pixels or edges that is caused in a visual scene by the relative motion between an agent and a scene. OF is used in a wide range of computer vision algorithms and robotics applications. While the calculation of OF is a resource-demanding task in terms of computational load and memory footprint, it needs to be executed at low latency, especially in robotics applications. Therefore, OF estimation is today performed on powerful CPUs or GPUs to satisfy the stringent requirements in terms of execution speed for control and actuation. On-sensor hardware acceleration is a promising approach to enable low latency OF calculations and fast execution even on resource-constrained devices such as nano drones and AR/VR glasses and headsets. This paper analyzes the achievable accuracy, frame rate, and power consumption when using a novel optical flow sensor consisting of a global shutter camera with an Application Specific Integrated Circuit (ASIC) for optical flow computation. The paper characterizes the optical flow sensor in high frame-rate, low-latency settings, with a frame rate of up to 88 fps at the full resolution of 1124 by 1364 pixels and up to 240 fps at a reduced camera resolution of 280 by 336, for both classical camera images and optical flow data.

Submitted 22 May, 2023; originally announced May 2023.

Comments: Accepted by IWASI 2023

arXiv:2305.13055 [pdf, other]

doi 10.1109/ISCAS48785.2022.9937215

Parallelizing Optical Flow Estimation on an Ultra-Low Power RISC-V Cluster for Nano-UAV Navigation

Authors: Jonas Kühne, Michele Magno, Luca Benini

Abstract: Optical flow estimation is crucial for autonomous navigation and localization of unmanned aerial vehicles (UAV). On micro and nano UAVs, real-time calculation of the optical flow is run on low power and resource-constrained microcontroller units (MCUs). Thus, lightweight algorithms for optical flow have been proposed targeting real-time execution on traditional single-core MCUs. This paper introduces an efficient parallelization strategy for optical flow computation targeting new-generation multicore low power RISC-V based microcontroller units. Our approach enables higher frame rates at lower clock speeds. It has been implemented and evaluated on the eight-core cluster of a commercial octa-core MCU (GAP8) reaching a parallelization speedup factor of 7.21 allowing for a frame rate of 500 frames per second when running on a 50 MHz clock frequency. The proposed parallel algorithm significantly boosts the camera frame rate on micro unmanned aerial vehicles, which enables higher flight speeds: the maximum flight speed can be doubled, while using less than a third of the clock frequency of previous single-core implementations.

Submitted 22 May, 2023; originally announced May 2023.

Comments: Accepted by ISCAS 2022

arXiv:2303.14028 [pdf]

Non-invasive urinary bladder volume estimation with artefact-suppressed bio-impedance measurements

Authors: Kanika Dheman, Stefan Walser, Philipp Mayer, Manuel Eggimann, Marko Kozomara, Denise Franke, Thomas Hermanns, Hugo Sax, Simone Schürle, Michele Magno

Abstract: Urine output is a vital parameter to gauge kidney health. Current monitoring methods include manually written records, invasive urinary catheterization or ultrasound measurements performed by highly skilled personnel. Catheterization bears high risks of infection while intermittent ultrasound measures and manual recording are time consuming and might miss early signs of kidney malfunction. Bioimpedance (BI) measurements may serve as a non-invasive alternative for measuring urine volume in vivo. However, limited robustness has prevented its clinical translation. Here, a deep learning-based algorithm is presented that processes the local BI of the lower abdomen and suppresses artefacts to measure the bladder volume quantitatively, non-invasively and without the continuous need for additional personnel. A tetrapolar BI wearable system called ANUVIS was used to collect continuous bladder volume data from three healthy subjects to demonstrate feasibility of operation, while clinical gold standards of urodynamic (n=6) and uroflowmetry tests (n=8) provided the ground truth. An optimized location for electrode placement and a model for the change in BI with changing bladder volume are deduced. The average error for full bladder volume estimation and for residual volume estimation was -29 ± 87.6 ml, thus comparable to commercial portable ultrasound devices (Bland Altman analysis showed a bias of -5.2 ml with LoA between 119.7 ml and -130.1 ml), while providing the additional benefit of hands-free, non-invasive, and continuous bladder volume estimation. The combination of the wearable BI sensor node and the presented algorithm provides an attractive alternative to current standard of care with potential benefits in providing insights into kidney function.

Submitted 24 March, 2023; originally announced March 2023.

arXiv:2301.05748 [pdf, other]

doi 10.1109/IGSC55832.2022.9969370

Exploring Automatic Gym Workouts Recognition Locally On Wearable Resource-Constrained Devices

Authors: Sizhen Bian, Xiaying Wang, Tommaso Polonelli, Michele Magno

Abstract: Automatic gym activity recognition on energy- and resource-constrained wearable devices removes the human-interaction requirement during intense gym sessions - like soft-touch tapping and swiping. This work presents a tiny and highly accurate residual convolutional neural network that runs in milliwatt microcontrollers for automatic workouts classification. We evaluated the inference performance of the deep model with quantization on three resource-constrained devices: two microcontrollers with ARM-Cortex M4 and M7 core from ST Microelectronics, and a GAP8 system on chip, which is an open-sourced, multi-core RISC-V computing platform from GreenWaves Technologies. Experimental results show an accuracy of up to 90.4% for eleven workouts recognition with full precision inference. The paper also presents the trade-off performance of the resource-constrained system. While keeping the recognition accuracy (88.1%) with minimal loss, each inference takes only 3.2 ms on GAP8, benefiting from the 8 RISC-V cluster cores. We measured that it features an execution time that is 18.9x and 6.5x faster than the Cortex-M4 and Cortex-M7 cores, showing the feasibility of real-time on-board workouts recognition based on the described data set with 20 Hz sampling rate. The energy consumed for each inference on GAP8 is 0.41 mJ compared to 5.17 mJ on Cortex-M4 and 8.07 mJ on Cortex-M7 with the maximum clock. It can lead to longer battery life when the system is battery-operated. We also introduced an open data set composed of fifty sessions of eleven gym workouts collected from ten subjects that is publicly available.

Submitted 13 January, 2023; originally announced January 2023.

arXiv:2212.04896 [pdf, other]

doi 10.1109/JIOT.2023.3289568

Self-sustaining Ultra-wideband Positioning System for Event-driven Indoor Localization

Authors: Philipp Mayer, Michele Magno, Luca Benini

Abstract: Smart and unobtrusive mobile sensor nodes that accurately track their own position have the potential to augment data collection with location-based functions. To attain this vision of unobtrusiveness, the sensor nodes must have a compact form factor and operate over long periods without battery recharging or replacement. This paper presents a self-sustaining and accurate ultra-wideband-based indoor location system with conservative infrastructure overhead. An event-driven sensing approach allows for balancing the limited energy harvested in indoor conditions with the power consumption of ultra-wideband transceivers. The presented tag-centralized concept, which combines heterogeneous system design with embedded processing, minimizes idle consumption without sacrificing functionality. Despite modest infrastructure requirements, high localization accuracy is achieved with error-correcting double-sided two-way ranging and embedded optimal multilateration. Experimental results demonstrate the benefits of the proposed system: the node achieves a quiescent current of 47 nA and operates at 1.2 μA while performing energy harvesting and motion detection. The energy consumption for position updates, with an accuracy of 40 cm (2D) in realistic non-line-of-sight conditions, is 10.84 mJ. In an asset tracking case study within a 200 m² multi-room office space, the achieved accuracy level allows for identifying 36 different desk and storage locations with an accuracy of over 95%. The system's long-time self-sustainability has been analyzed over 700 days in multiple indoor lighting situations.

Submitted 3 July, 2023; v1 submitted 9 December, 2022; originally announced December 2022.

Journal ref: IEEE Internet of Things Journal 2023

arXiv:2212.00712 [pdf, other]

Nonlinear and Machine Learning Analyses on High-Density EEG data of Math Experts and Novices

Authors: Hanna Poikonen, Tomasz Zaluska, Xiaying Wang, Michele Magno, Manu Kapur

Abstract: A current trend in the neurosciences is to use naturalistic stimuli, such as cinema, class-room biology or video gaming, aiming to understand the brain functions during ecologically valid conditions. Naturalistic stimuli recruit complex and overlapping cognitive, emotional and sensory brain processes. Brain oscillations form underlying mechanisms for such processes, and further, these processes can be modified by expertise. Human cortical oscillations are often analyzed with linear methods even though the brain, as a biological system, is highly nonlinear. This study applies a relatively robust nonlinear method, Higuchi fractal dimension (HFD), to classify cortical oscillations of math experts and novices when they solve long and complex math demonstrations in an EEG laboratory. Brain imaging data, which is collected over a long time span during naturalistic stimuli, enables the application of data-driven analyses. Therefore, we also explore the neural signature of math expertise with machine learning algorithms. There is a need for novel methodologies in analyzing naturalistic data because formulation of theories of the brain functions in the real world based on reductionist and simplified study designs is both challenging and questionable. Data-driven intelligent approaches may be helpful in developing and testing new theories on complex brain functions. Our results clarify the different neural signature, analyzed by HFD, of math experts and novices during complex math and suggest machine learning as a promising data-driven approach to understand the brain processes in expertise and mathematical cognition.

Submitted 1 December, 2022; originally announced December 2022.

arXiv:2212.00710 [pdf, other]

Fully On-board Low-Power Localization with Multizone Time-of-Flight Sensors on Nano-UAVs

Authors: Hanna Müller, Nicky Zimmerman, Tommaso Polonelli, Michele Magno, Jens Behley, Cyrill Stachniss, Luca Benini

Abstract: Nano-size unmanned aerial vehicles (UAVs) hold enormous potential to perform autonomous operations in complex environments, such as inspection, monitoring or data collection. Moreover, their small size allows safe operation close to humans and agile flight. An important part of autonomous flight is localization, which is a computationally intensive task especially on a nano-UAV that usually has strong constraints in sensing, processing and memory. This work presents a real-time localization approach with low element-count multizone range sensors for resource-constrained nano-UAVs. The proposed approach is based on a novel miniature 64-zone time-of-flight sensor from ST Microelectronics and a RISC-V-based parallel ultra low-power processor, to enable accurate and low latency Monte Carlo Localization on-board. Experimental evaluation using a nano-UAV open platform demonstrated that the proposed solution is capable of localizing on a 31.2 m² map with 0.15 m accuracy and an above 95% success rate. The achieved accuracy is sufficient for localization in common indoor environments. We analyze tradeoffs in using full and half-precision floating point numbers as well as a quantized map and evaluate the accuracy and memory footprint across the design space. Experimental evaluation shows that parallelizing the execution for 8 RISC-V cores brings a 7x speedup and allows us to execute the algorithm on-board in real-time with a latency of 0.2-30 ms (depending on the number of particles), while only increasing the overall drone power consumption by 3-7%. Finally, we provide an open-source implementation of our approach.

Submitted 25 November, 2022; originally announced December 2022.

Comments: DATE 2023

arXiv:2209.04346 [pdf, other]

doi 10.1109/ICRA48891.2023.10161472

Model- and Acceleration-based Pursuit Controller for High-Performance Autonomous Racing

Authors: Jonathan Becker, Nadine Imholz, Luca Schwarzenbach, Edoardo Ghignone, Nicolas Baumann, Michele Magno

Abstract: Autonomous racing is a research field gaining large popularity, as it pushes autonomous driving algorithms to their limits and serves as a catalyst for general autonomous driving. For scaled autonomous racing platforms, the computational constraint and complexity often limit the use of Model Predictive Control (MPC). As a consequence, geometric controllers are the most frequently deployed controllers. They prove to be performant while yielding implementation and operational simplicity. Yet, they inherently lack the incorporation of model dynamics, thus limiting the race car to a velocity domain where tire slip can be neglected. This paper presents Model- and Acceleration-based Pursuit (MAP), a high-performance model-based trajectory tracking algorithm that preserves the simplicity of geometric approaches while leveraging tire dynamics. The proposed algorithm allows accurate tracking of a trajectory at unprecedented velocities compared to State-of-the-Art (SotA) geometric controllers. The MAP controller is experimentally validated and outperforms the reference geometric controller four-fold in terms of lateral tracking error, yielding a tracking error of 0.055 m at tested speeds up to 11 m/s.

Submitted 7 July, 2023; v1 submitted 9 September, 2022; originally announced September 2022.

Comments: 6 pages, 6 figures, 1 table

Journal ref: 2023 IEEE International Conference on Robotics and Automation (ICRA)

arXiv:2208.12624 [pdf, other]

Robust and Efficient Depth-based Obstacle Avoidance for Autonomous Miniaturized UAVs

Authors: Hanna Müller, Vlad Niculescu, Tommaso Polonelli, Michele Magno, Luca Benini

Abstract: Nano-size drones hold enormous potential to explore unknown and complex environments. Their small size makes them agile and safe for operation close to humans and allows them to navigate through narrow spaces. However, their tiny size and payload restrict the possibilities for on-board computation and sensing, making fully autonomous flight extremely challenging. The first step towards full autonomy is reliable obstacle avoidance, which has proven to be technically challenging by itself in a generic indoor environment. Current approaches utilize vision-based or 1-dimensional sensors to support nano-drone perception algorithms. This work presents a lightweight obstacle avoidance system based on a novel millimeter form factor 64 pixels multi-zone Time-of-Flight (ToF) sensor and a generalized model-free control policy. Reported in-field tests are based on the Crazyflie 2.1, extended by a custom multi-zone ToF deck, featuring a total flight mass of 35 g. The algorithm only uses 0.3% of the on-board processing power (210 µs execution time) with a frame rate of 15 fps, providing an excellent foundation for many future applications. Less than 10% of the total drone power is needed to operate the proposed perception system, including both lifting and operating the sensor. The presented autonomous nano-size drone reaches 100% reliability at 0.5 m/s in a generic and previously unexplored indoor environment. The proposed system is released open-source with an extensive dataset including ToF and gray-scale camera data, coupled with UAV position ground truth from motion capture.

Submitted 26 August, 2022; originally announced August 2022.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2205.11902 [pdf, other]

Aerosense: A Self-Sustainable And Long-Range Bluetooth Wireless Sensor Node for Aerodynamic and Aeroacoustic Monitoring on Wind Turbines

Authors: Tommaso Polonelli, Hanna Müller, Weikang Kong, Raphael Fischer, Luca Benini, Michele Magno

Abstract: This paper presents a low-power, self-sustainable, and modular wireless sensor node for aerodynamic and acoustic measurements on wind turbines and other industrial structures. It includes 40 high-accuracy barometers, 10 microphones, 5 differential pressure sensors, and implements a lossy and a lossless on-board data compression algorithm to decrease the transmission energy cost. The wireless transmitter is based on Bluetooth Low Energy 5.1 tuned for long-range and high throughput while maintaining adequate per-bit energy efficiency (80 nJ). Moreover, we field-assessed the node capability to collect precise and accurate aerodynamic data. Outdoor experimental tests revealed that the system can acquire and sustain a data rate of 850 kbps over 438 m. The power consumption while collecting and streaming all measured data is 120 mW, enabling self-sustainability and long-term in-situ monitoring with a 111 cm^2 photovoltaic panel.

Submitted 24 May, 2022; originally announced May 2022.

Comments: 9 pages, 4 figures, 3 tables, IEEE Journal

arXiv:2203.15069 [pdf, other]

Leveraging Tactile Sensors for Low Latency Embedded Smart Hands for Prosthetic and Robotic Applications

Authors: Xiaying Wang, Fabian Geiger, Vlad Niculescu, Michele Magno, Luca Benini

Abstract: Tactile sensing is a crucial perception mode for robots and human amputees in need of controlling a prosthetic device. Today robotic and prosthetic systems are still missing the important feature of accurate tactile sensing. This lack is mainly due to the fact that the existing tactile technologies have limited spatial and temporal resolution and are either expensive or not scalable. In this paper, we present the design and the implementation of a hardware-software embedded system called SmartHand. It is specifically designed to enable the acquisition and the real-time processing of high-resolution tactile information from a hand-shaped multi-sensor array for prosthetic and robotic applications. During data collection, our system can deliver a high throughput of 100 frames per second, which is 13.7x higher than previous related work. We collected a new tactile dataset while interacting with daily-life objects during five different sessions. We propose a compact yet accurate convolutional neural network that requires one order of magnitude less memory and 15.6x fewer computations compared to related work without degrading classification accuracy. The top-1 and top-3 cross-validation accuracies are respectively 98.86% and 99.83%. We further analyze the inter-session variability and obtain the best top-3 leave-one-out-validation accuracy of 77.84%. We deploy the trained model on a high-performance ARM Cortex-M7 microcontroller achieving an inference time of only 100 ms minimizing the response latency. The overall measured power consumption is 505 mW. Finally, we fabricate a new control sensor and perform additional experiments to provide analyses on sensor degradation and slip detection. This work is a step forward in giving robotic and prosthetic devices a sense of touch and demonstrates the practicality of a smart embedded system empowered by tiny machine learning.

Submitted 28 March, 2022; originally announced March 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:2107.14598

arXiv:2203.14592 [pdf, other]

MI-BMInet: An Efficient Convolutional Neural Network for Motor Imagery Brain-Machine Interfaces with EEG Channel Selection

Authors: Xiaying Wang, Michael Hersche, Michele Magno, Luca Benini

Abstract: A brain-machine interface (BMI) based on motor imagery (MI) enables the control of devices using brain signals while the subject imagines performing a movement. It plays a vital role in prosthesis control and motor rehabilitation. To improve user comfort, preserve data privacy, and reduce the system's latency, a new trend in wearable BMIs is to execute algorithms on low-power microcontroller units (MCUs) embedded on edge devices to process the electroencephalographic (EEG) data in real-time close to the sensors. However, most of the classification models present in the literature are too resource-demanding, making them unfit for low-power MCUs. This paper proposes an efficient convolutional neural network (CNN) for EEG-based MI classification that achieves comparable accuracy while being orders of magnitude less resource-demanding and significantly more energy-efficient than state-of-the-art (SoA) models for a long-lifetime battery operation. To further reduce the model complexity, we propose an automatic channel selection method based on spatial filters and quantize both weights and activations to 8-bit precision with negligible accuracy loss. Finally, we implement and evaluate the proposed models on leading-edge parallel ultra-low-power (PULP) MCUs. The final 2-class solution consumes as little as 30 uJ/inference with a runtime of 2.95 ms/inference and an accuracy of 82.51% while using 6.4x fewer EEG channels, becoming the new SoA for embedded MI-BMI and defining a new Pareto frontier in the three-way trade-off among accuracy, resource cost, and power usage.

Submitted 13 January, 2023; v1 submitted 28 March, 2022; originally announced March 2022.

arXiv:2201.03386 [pdf, other]

Sub-mW Keyword Spotting on an MCU: Analog Binary Feature Extraction and Binary Neural Networks

Authors: Gianmarco Cerutti, Lukas Cavigelli, Renzo Andri, Michele Magno, Elisabetta Farella, Luca Benini

Abstract: Keyword spotting (KWS) is a crucial function enabling the interaction with the many ubiquitous smart devices in our surroundings, either activating them through wake-word or directly as a human-computer interface. For many applications, KWS is the entry point for our interactions with the device and, thus, an always-on workload. Many smart devices are mobile and their battery lifetime is heavily impacted by continuously running services. KWS and similar always-on services are thus the focus when optimizing the overall power consumption. This work addresses KWS energy-efficiency on low-cost microcontroller units (MCUs). We combine analog binary feature extraction with binary neural networks. By replacing the digital preprocessing with the proposed analog front-end, we show that the energy required for data acquisition and preprocessing can be reduced by 29x, cutting its share from a dominating 85% to a mere 16% of the overall energy consumption for our reference KWS application. Experimental evaluations on the Speech Commands Dataset show that the proposed system outperforms state-of-the-art accuracy and energy efficiency, respectively, by 1% and 4.3x on a 10-class dataset while providing a compelling accuracy-energy trade-off including a 2% accuracy drop for a 71x energy reduction.

Submitted 10 January, 2022; originally announced January 2022.

arXiv:2107.14598 [pdf, other]

SmartHand: Towards Embedded Smart Hands for Prosthetic and Robotic Applications

Authors: Xiaying Wang, Fabian Geiger, Vlad Niculescu, Michele Magno, Luca Benini

Abstract: The sophisticated sense of touch of the human hand significantly contributes to our ability to safely, efficiently, and dexterously manipulate arbitrary objects in our environment. Robotic and prosthetic devices lack refined, tactile feedback from their end-effectors, leading to counterintuitive and complex control strategies. To address this lack, tactile sensors have been designed and developed, but they often offer an insufficient spatial and temporal resolution. This paper focuses on overcoming these issues by designing a smart embedded system, called SmartHand, enabling the acquisition and real-time processing of high-resolution tactile information from a hand-shaped multi-sensor array for prosthetic and robotic applications. We acquire a new tactile dataset consisting of 340,000 frames while interacting with 16 everyday objects and the empty hand, i.e., a total of 17 classes. The design of the embedded system minimizes response latency in classification, by deploying a small yet accurate convolutional neural network on a high-performance ARM Cortex-M7 microcontroller. Compared to related work, our model requires one order of magnitude less memory and 15.6x fewer computations while achieving similar inter-session accuracy and up to 98.86% and 99.83% top-1 and top-3 cross-validation accuracy, respectively. Experimental results show a total power consumption of 505mW and a latency of only 100ms.

Submitted 23 July, 2021; originally announced July 2021.

arXiv:2104.11042 [pdf, other]

High-Accuracy Ranging and Localization with Ultra-Wideband Communications for Energy-Constrained Devices

Authors: L. Flueratoru, S. Wehrli, M. Magno, E. S. Lohan, D. Niculescu

Abstract: Ultra-wideband (UWB) communications have gained popularity in recent years for being able to provide distance measurements and localization with high accuracy, which can enhance the capabilities of devices in the Internet of Things (IoT). Since energy efficiency is of utmost concern in such applications, in this work we evaluate the power and energy consumption, distance measurements, and localization performance of two types of UWB physical interfaces (PHYs), which use either a low- or high-rate pulse repetition (LRP and HRP, respectively). The evaluation is done through measurements acquired in identical conditions, which is crucial in order to have a fair comparison between the devices. The LRP devices that we tested have the same ranging and localization performance, but ten times (10x) lower power consumption, 6x lower energy consumption per distance measurement, and at least 8x higher coverage than the HRP devices. Therefore, UWB LRP devices can offer high-accuracy ranging and localization even to ultra-low-power devices in the IoT. We performed measurements in typical LOS and NLOS scenarios and propose theoretical models for the distance errors obtained in these situations. The models can be used to simulate realistic building deployments and we illustrate such an example. This paper, therefore, provides a comprehensive overview of the energy demands, ranging characteristics, and localization performance of state-of-the-art UWB devices.

Submitted 3 November, 2021; v1 submitted 22 April, 2021; originally announced April 2021.

Comments: Accepted at the IEEE Internet of Things Journal (IoT-J)

arXiv:2101.04446 [pdf, other]

doi 10.1145/3370748.3406588

Sound Event Detection with Binary Neural Networks on Tightly Power-Constrained IoT Devices

Authors: Gianmarco Cerutti, Renzo Andri, Lukas Cavigelli, Michele Magno, Elisabetta Farella, Luca Benini

Abstract: Sound event detection (SED) is a hot topic in consumer and smart city applications. Existing approaches based on Deep Neural Networks are very effective, but highly demanding in terms of memory, power, and throughput when targeting ultra-low power always-on devices. Latency, availability, cost, and privacy requirements are pushing recent IoT systems to process the data on the node, close to the sensor, with a very limited energy supply, and tight constraints on the memory size and processing capabilities precluding to run state-of-the-art DNNs. In this paper, we explore the combination of extreme quantization to a small-footprint binary neural network (BNN) with the highly energy-efficient, RISC-V-based (8+1)-core GAP8 microcontroller. Starting from an existing CNN for SED whose footprint (815 kB) exceeds the 512 kB of memory available on our platform, we retrain the network using binary filters and activations to match these memory constraints. (Fully) binary neural networks come with a natural drop in accuracy of 12-18% on the challenging ImageNet object recognition challenge compared to their equivalent full-precision baselines. This BNN reaches a 77.9% accuracy, just 7% lower than the full-precision version, with 58 kB (7.2 times less) for the weights and 262 kB (2.4 times less) memory in total. With our BNN implementation, we reach a peak throughput of 4.6 GMAC/s and 1.5 GMAC/s over the full network, including preprocessing with Mel bins, which corresponds to an efficiency of 67.1 GMAC/s/W and 31.3 GMAC/s/W, respectively. Compared to the performance of an ARM Cortex-M4 implementation, our system has a 10.3 times faster execution time and a 51.1 times higher energy-efficiency.

Submitted 12 January, 2021; originally announced January 2021.

Comments: 6 pages conference

arXiv:2006.16281 [pdf, other]

doi 10.1109/JIOT.2021.3067382

TinyRadarNN: Combining Spatial and Temporal Convolutional Neural Networks for Embedded Gesture Recognition with Short Range Radars

Authors: Moritz Scherer, Michele Magno, Jonas Erb, Philipp Mayer, Manuel Eggimann, Luca Benini

Abstract: This work proposes a low-power high-accuracy embedded hand-gesture recognition algorithm targeting battery-operated wearable devices using low power short-range RADAR sensors. A 2D Convolutional Neural Network (CNN) using range frequency Doppler features is combined with a Temporal Convolutional Neural Network (TCN) for time sequence prediction. The final algorithm has a model size of only 46 thousand parameters, yielding a memory footprint of only 92 KB. Two datasets containing 11 challenging hand gestures performed by 26 different people have been recorded containing a total of 20,210 gesture instances. On the 11 hand gesture dataset, accuracies of 86.6% (26 users) and 92.4% (single user) have been achieved, which are comparable to the state-of-the-art, which achieves 87% (10 users) and 94% (single user), while using a TCN-based network that is 7500x smaller than the state-of-the-art. Furthermore, the gesture recognition classifier has been implemented on a Parallel Ultra-Low Power Processor, demonstrating that real-time prediction is feasible with only 21 mW of power consumption for the full TCN sequence prediction network, while a system-level power consumption of less than 100 mW is achieved. We provide open-source access to all the code and data collected and used in this work on tinyradar.ethz.ch.

Submitted 16 March, 2021; v1 submitted 25 June, 2020; originally announced June 2020.

arXiv:2004.00077 [pdf, other]

doi 10.1109/MeMeA49120.2020.9137134

An Accurate EEGNet-based Motor-Imagery Brain-Computer Interface for Low-Power Edge Computing

Authors: Xiaying Wang, Michael Hersche, Batuhan Tömekce, Burak Kaya, Michele Magno, Luca Benini

Abstract: This paper presents an accurate and robust embedded motor-imagery brain-computer interface (MI-BCI). The proposed novel model, based on EEGNet, matches the requirements of memory footprint and computational resources of low-power microcontroller units (MCUs), such as the ARM Cortex-M family. Furthermore, the paper presents a set of methods, including temporal downsampling, channel selection, and narrowing of the classification window, to further scale down the model to relax memory requirements with negligible accuracy degradation. Experimental results on the Physionet EEG Motor Movement/Imagery Dataset show that standard EEGNet achieves 82.43%, 75.07%, and 65.07% classification accuracy on 2-, 3-, and 4-class MI tasks in global validation, outperforming the state-of-the-art (SoA) convolutional neural network (CNN) by 2.05%, 5.25%, and 5.48%. Our novel method further scales down the standard EEGNet at a negligible accuracy loss of 0.31% with 7.6x memory footprint reduction and a small accuracy loss of 2.51% with 15x reduction. The scaled models are deployed on a commercial Cortex-M4F MCU taking 101ms and consuming 4.28mJ per inference for operating the smallest model, and on a Cortex-M7 with 44ms and 18.1mJ per inference for the medium-sized model, enabling a fully autonomous, wearable, and accurate low-power BCI.

Submitted 16 January, 2023; v1 submitted 31 March, 2020; originally announced April 2020.

arXiv:2003.00041 [pdf, other]

doi 10.23919/DATE48585.2020.9116218

InfiniWolf: Energy Efficient Smart Bracelet for Edge Computing with Dual Source Energy Harvesting

Authors: Michele Magno, Xiaying Wang, Manuel Eggimann, Lukas Cavigelli, Luca Benini

Abstract: This work presents InfiniWolf, a novel multi-sensor smartwatch that can achieve self-sustainability exploiting thermal and solar energy harvesting while performing computationally demanding tasks. The smartwatch embeds both a System-on-Chip (SoC) with an ARM Cortex-M processor and Bluetooth Low Energy (BLE) and Mr. Wolf, an open-hardware RISC-V based parallel ultra-low-power processor that boosts the processing capabilities on board by more than one order of magnitude, while also increasing energy efficiency. We demonstrate its functionality based on a sample application scenario performing stress detection with multi-layer artificial neural networks on a wearable multi-sensor bracelet. Experimental results show the benefits in terms of energy efficiency and latency of Mr. Wolf over an ARM Cortex-M4F microcontroller and the possibility, under specific assumptions, to be self-sustainable using thermal and solar energy harvesting while performing up to 24 stress classifications per minute in indoor conditions.

Submitted 16 January, 2023; v1 submitted 28 February, 2020; originally announced March 2020.

arXiv:1912.04441 [pdf, other]

doi 10.1109/SAS48726.2020.9220068

HR-SAR-Net: A Deep Neural Network for Urban Scene Segmentation from High-Resolution SAR Data

Authors: Xiaying Wang, Lukas Cavigelli, Manuel Eggimann, Michele Magno, Luca Benini

Abstract: Synthetic aperture radar (SAR) data is becoming increasingly available to a wide range of users through commercial service providers with resolutions reaching 0.5m/px. Segmenting SAR data still requires skilled personnel, limiting the potential for large-scale use. We show that it is possible to automatically and reliably perform urban scene segmentation from next-gen resolution SAR data (0.15m/px) using deep neural networks (DNNs), achieving a pixel accuracy of 95.19% and a mean IoU of 74.67% with data collected over a region of merely 2.2km$^2$. The presented DNN is not only effective, but is very small with only 63k parameters and computationally simple enough to achieve a throughput of around 500Mpx/s using a single GPU. We further identify that additional SAR receive antennas and data from multiple flights massively improve the segmentation accuracy. We describe a procedure for generating a high-quality segmentation ground truth from multiple inaccurate building and road annotations, which has been crucial to achieving these segmentation results.

Submitted 16 January, 2023; v1 submitted 9 December, 2019; originally announced December 2019.

arXiv:1911.03314 [pdf, other]

FANN-on-MCU: An Open-Source Toolkit for Energy-Efficient Neural Network Inference at the Edge of the Internet of Things

Authors: Xiaying Wang, Michele Magno, Lukas Cavigelli, Luca Benini

Abstract: The growing number of low-power smart devices in the Internet of Things is coupled with the concept of "Edge Computing", that is moving some of the intelligence, especially machine learning, towards the edge of the network. Enabling machine learning algorithms to run on resource-constrained hardware, typically on low-power smart devices, is challenging in terms of hardware (optimized and energy-efficient integrated circuits), algorithmic and firmware implementations. This paper presents FANN-on-MCU, an open-source toolkit built upon the Fast Artificial Neural Network (FANN) library to run lightweight and energy-efficient neural networks on microcontrollers based on both the ARM Cortex-M series and the novel RISC-V-based Parallel Ultra-Low-Power (PULP) platform. The toolkit takes multi-layer perceptrons trained with FANN and generates code targeted at execution on low-power microcontrollers either with a floating-point unit (i.e., ARM Cortex-M4F and M7F) or without (i.e., ARM Cortex M0-M3 or PULP-based processors). This paper also provides an architectural performance evaluation of neural networks on the most popular ARM Cortex-M family and the parallel RISC-V processor called Mr. Wolf. The evaluation includes experimental results for three different applications using a self-sustainable wearable multi-sensor bracelet. Experimental results show a measured latency in the order of only a few microseconds and a power consumption of a few milliwatts while keeping the memory requirements below the limitations of the targeted microcontrollers. In particular, the parallel implementation on the octa-core RISC-V platform reaches a speedup of 22x and a 69% reduction in energy consumption with respect to a single-core implementation on Cortex-M4 for continuous real-time classification.

Submitted 17 February, 2022; v1 submitted 8 November, 2019; originally announced November 2019.

arXiv:1611.03130 [pdf]

doi 10.1117/12.2241383

Computationally Efficient Target Classification in Multispectral Image Data with Deep Neural Networks

Authors: Lukas Cavigelli, Dominic Bernath, Michele Magno, Luca Benini

Abstract: Detecting and classifying targets in video streams from surveillance cameras is a cumbersome, error-prone and expensive task. Often, the incurred costs are prohibitive for real-time monitoring. This leads to data being stored locally or transmitted to a central storage site for post-incident examination. The required communication links and archiving of the video data are still expensive and this setup excludes preemptive actions to respond to imminent threats. An effective way to overcome these limitations is to build a smart camera that transmits alerts when relevant video sequences are detected. Deep neural networks (DNNs) have come to outperform humans in visual classification tasks. The concept of DNNs and Convolutional Networks (ConvNets) can easily be extended to make use of higher-dimensional input data such as multispectral data. We explore this opportunity in terms of achievable accuracy and required computational effort. To analyze the precision of DNNs for scene labeling in an urban surveillance scenario we have created a dataset with 8 classes obtained in a field experiment. We combine an RGB camera with a 25-channel VIS-NIR snapshot sensor to assess the potential of multispectral image data for target classification. We evaluate several new DNNs, showing that the spectral information fused together with the RGB frames can be used to improve the accuracy of the system or to achieve similar accuracy with a 3x smaller computation effort. We achieve a very high per-pixel accuracy of 99.1%. Even for scarcely occurring, but particularly interesting classes, such as cars, 75% of the pixels are labeled correctly with errors occurring only around the border of the objects. This high accuracy was obtained with a training set of only 30 labeled images, paving the way for fast adaptation to various application scenarios.

Submitted 9 November, 2016; originally announced November 2016.

Comments: Presented at SPIE Security + Defence 2016 Proc. SPIE 9997, Target and Background Signatures II

Showing 1–49 of 49 results for author: Magno, M