-
Resource-Aware Collaborative Monte Carlo Localization with Distribution Compression
Authors:
Nicky Zimmerman,
Alessandro Giusti,
Jérôme Guzzi
Abstract:
Global localization is essential in enabling robot autonomy, and collaborative localization is key for multi-robot systems. In this paper, we address the task of collaborative global localization under computational and communication constraints. We propose a method which reduces the amount of information exchanged and the computational cost. We also analyze, implement and open-source seminal approaches, which we believe to be a valuable contribution to the community. We exploit techniques for distribution compression in near-linear time, with error guarantees. We evaluate our approach and the implemented baselines on multiple challenging scenarios, simulated and real-world. Our approach can run online on an onboard computer. We release an open-source C++/ROS2 implementation of our approach, as well as the baselines.
Submitted 2 April, 2024;
originally announced April 2024.
-
Predicting the Intention to Interact with a Service Robot: the Role of Gaze Cues
Authors:
Simone Arreghini,
Gabriele Abbate,
Alessandro Giusti,
Antonio Paolillo
Abstract:
For a service robot, it is crucial to perceive as early as possible that an approaching person intends to interact: in this case, it can proactively enact friendly behaviors that lead to an improved user experience. We solve this perception task with a sequence-to-sequence classifier of a potential user intention to interact, which can be trained in a self-supervised way. Our main contribution is a study of the benefit of features representing the person's gaze in this context. Extensive experiments on a novel dataset show that the inclusion of gaze cues significantly improves the classifier performance (AUROC increases from 84.5% to 91.2%); the distance at which an accurate classification can be achieved improves from 2.4 m to 3.2 m. We also quantify the system's ability to adapt to new environments without external supervision. Qualitative experiments show practical applications with a waiter robot.
Submitted 2 April, 2024;
originally announced April 2024.
-
On-device Self-supervised Learning of Visual Perception Tasks aboard Hardware-limited Nano-quadrotors
Authors:
Elia Cereda,
Manuele Rusci,
Alessandro Giusti,
Daniele Palossi
Abstract:
Sub-50 g nano-drones are gaining momentum in both academia and industry. Their most compelling applications rely on onboard deep learning models for perception despite severe hardware constraints (i.e., sub-100 mW processor). When deployed in unknown environments not represented in the training data, these models often underperform due to domain shift. To cope with this fundamental problem, we propose, for the first time, on-device learning aboard nano-drones, where the first part of the in-field mission is dedicated to self-supervised fine-tuning of a pre-trained convolutional neural network (CNN). Leveraging a real-world vision-based regression task, we thoroughly explore performance-cost trade-offs of the fine-tuning phase along three axes: i) dataset size (more data increases the regression performance but requires more memory and longer computation); ii) methodologies (e.g., fine-tuning all model parameters vs. only a subset); and iii) self-supervision strategy. Our approach demonstrates an improvement in mean absolute error up to 30% compared to the pre-trained baseline, requiring only 22 s fine-tuning on an ultra-low-power GWT GAP9 System-on-Chip. Addressing the domain shift problem via on-device learning aboard nano-drones not only marks a novel result for hardware-limited robots but lays the ground for more general advancements for the entire robotics community.
Submitted 6 March, 2024;
originally announced March 2024.
-
High-throughput Visual Nano-drone to Nano-drone Relative Localization using Onboard Fully Convolutional Networks
Authors:
Luca Crupi,
Alessandro Giusti,
Daniele Palossi
Abstract:
Relative drone-to-drone localization is a fundamental building block for any swarm operations. We address this task in the context of miniaturized nano-drones, i.e., 10cm in diameter, which show an ever-growing interest due to novel use cases enabled by their reduced form factor. The price for their versatility comes with limited onboard resources, i.e., sensors, processing units, and memory, which limits the complexity of the onboard algorithms. A traditional solution to overcome these limitations is represented by lightweight deep learning models directly deployed aboard nano-drones. This work tackles the challenging relative pose estimation between nano-drones using only a gray-scale low-resolution camera and an ultra-low-power System-on-Chip (SoC) hosted onboard. We present a vertically integrated system based on a novel vision-based fully convolutional neural network (FCNN), which runs at 39Hz within 101mW onboard a Crazyflie nano-drone extended with the GWT GAP8 SoC. We compare our FCNN against three State-of-the-Art (SoA) systems. Considering the best-performing SoA approach, our model results in an R-squared improvement from 32 to 47% on the horizontal image coordinate and from 18 to 55% on the vertical image coordinate, on a real-world dataset of 30k images. Finally, our in-field tests show a reduction of the average tracking error of 37% compared to a previous SoA work and an endurance performance up to the entire battery lifetime of 4 minutes.
Submitted 17 April, 2024; v1 submitted 21 February, 2024;
originally announced February 2024.
-
Self-Supervised Learning of Visual Robot Localization Using LED State Prediction as a Pretext Task
Authors:
Mirko Nava,
Nicholas Carlotti,
Luca Crupi,
Daniele Palossi,
Alessandro Giusti
Abstract:
We propose a novel self-supervised approach for learning to visually localize robots equipped with controllable LEDs. We rely on a few training samples labeled with position ground truth and many training samples in which only the LED state is known, whose collection is cheap. We show that using LED state prediction as a pretext task significantly helps to learn the visual localization end task. The resulting model does not require knowledge of LED states during inference. We instantiate the approach to visual relative localization of nano-quadrotors: experimental results show that using our pretext task significantly improves localization accuracy (from 68.3% to 76.2%) and outperforms alternative strategies, such as a supervised baseline, model pre-training, and an autoencoding pretext task. We deploy our model aboard a 27-g Crazyflie nano-drone, running at 21 fps, in a position-tracking task of a peer nano-drone. Our approach, relying on position labels for only 300 images, yields a mean tracking error of 4.2 cm versus 11.9 cm of a supervised baseline model trained without our pretext task. Videos and code of the proposed approach are available at https://github.com/idsia-robotics/leds-as-pretext
Submitted 15 February, 2024;
originally announced February 2024.
-
A Sim-to-Real Deep Learning-based Framework for Autonomous Nano-drone Racing
Authors:
Lorenzo Lamberti,
Elia Cereda,
Gabriele Abbate,
Lorenzo Bellone,
Victor Javier Kartsch Morinigo,
Michał Barcis,
Agata Barcis,
Alessandro Giusti,
Francesco Conti,
Daniele Palossi
Abstract:
Autonomous drone racing competitions are a proxy to improve unmanned aerial vehicles' perception, planning, and control skills. The recent emergence of autonomous nano-sized drone racing imposes new challenges, as their ~10cm form factor heavily restricts the resources available onboard, including memory, computation, and sensors. This paper describes the methodology and technical implementation of the system winning the first autonomous nano-drone racing international competition: the IMAV 2022 Nanocopter AI Challenge. We developed a fully onboard deep learning approach for visual navigation trained only on simulation images to achieve this goal. Our approach includes a convolutional neural network for obstacle avoidance, a sim-to-real dataset collection procedure, and a navigation policy that we selected, characterized, and adapted through simulation and actual in-field experiments. Our system ranked 1st among seven competing teams at the competition. In our best attempt, we scored 115m of traveled distance in the allotted 5-minute flight, never crashing while dodging static and dynamic obstacles. Sharing our knowledge with the research community, we aim to provide a solid groundwork to foster future development in this field.
Submitted 14 December, 2023;
originally announced December 2023.
-
Self-Supervised Prediction of the Intention to Interact with a Service Robot
Authors:
Gabriele Abbate,
Alessandro Giusti,
Viktor Schmuck,
Oya Celiktutan,
Antonio Paolillo
Abstract:
A service robot can provide a smoother interaction experience if it has the ability to proactively detect whether a nearby user intends to interact, in order to adapt its behavior e.g. by explicitly showing that it is available to provide a service. In this work, we propose a learning-based approach to predict the probability that a human user will interact with a robot before the interaction actually begins; the approach is self-supervised because after each encounter with a human, the robot can automatically label it depending on whether it resulted in an interaction or not. We explore different classification approaches, using different sets of features considering the pose and the motion of the user. We validate and deploy the approach in three scenarios. The first collects 3442 natural sequences (both interacting and non-interacting) representing employees in an office break area: a real-world, challenging setting, where we consider a coffee machine in place of a service robot. The other two scenarios represent researchers interacting with service robots (200 and 72 sequences, respectively). Results show that, even in challenging real-world settings, our approach can learn without external supervision, and can achieve accurate classification (i.e. AUROC greater than 0.9) of the user's intention to interact with an advance of more than 3 s before the interaction actually occurs.
Submitted 14 September, 2023;
originally announced September 2023.
-
Sim-to-Real Vision-depth Fusion CNNs for Robust Pose Estimation Aboard Autonomous Nano-quadcopter
Authors:
Luca Crupi,
Elia Cereda,
Alessandro Giusti,
Daniele Palossi
Abstract:
Nano-quadcopters are versatile platforms attracting the interest of both academia and industry. Their tiny form factor, i.e., 10 cm diameter, makes them particularly useful in narrow scenarios and harmless in human proximity. However, these advantages come at the price of ultra-constrained onboard computational and sensorial resources for autonomous operations. This work addresses the task of estimating human pose aboard nano-drones by fusing depth and images in a novel CNN exclusively trained in simulation yet capable of robust predictions in the real world. We extend a commercial off-the-shelf (COTS) Crazyflie nano-drone -- equipped with a 320×240 px camera and an ultra-low-power System-on-Chip -- with a novel multi-zone (8×8) depth sensor. We design and compare different deep-learning models that fuse depth and image inputs. Our models are trained exclusively on simulated data for both inputs, and transfer well to the real world: field testing shows an improvement of 58% and 51% of our depth+camera system w.r.t. a camera-only State-of-the-Art baseline on the horizontal and angular mean pose errors, respectively. Our prototype is based on COTS components, which facilitates reproducibility and adoption of this novel class of systems.
Submitted 3 August, 2023;
originally announced August 2023.
-
Secure Deep Learning-based Distributed Intelligence on Pocket-sized Drones
Authors:
Elia Cereda,
Alessandro Giusti,
Daniele Palossi
Abstract:
Palm-sized nano-drones are an appealing class of edge nodes, but their limited computational resources prevent running large deep-learning models onboard. Adopting an edge-fog computational paradigm, we can offload part of the computation to the fog; however, this poses security concerns if the fog node, or the communication link, cannot be trusted. To tackle this concern, we propose a novel distributed edge-fog execution scheme that validates fog computation by redundantly executing a random subnetwork aboard our nano-drone. Compared to a State-of-the-Art visual pose estimation network that entirely runs onboard, a larger network executed in a distributed way improves the $R^2$ score by +0.19; in case of attack, our approach detects it within 2 s with 95% probability.
Submitted 4 July, 2023;
originally announced July 2023.
-
Cyber Security aboard Micro Aerial Vehicles: An OpenTitan-based Visual Communication Use Case
Authors:
Maicol Ciani,
Stefano Bonato,
Rafail Psiakis,
Angelo Garofalo,
Luca Valente,
Suresh Sugumar,
Alessandro Giusti,
Davide Rossi,
Daniele Palossi
Abstract:
Autonomous Micro Aerial Vehicles (MAVs), with a form factor of 10cm in diameter, are an emerging technology thanks to the broad applicability enabled by their onboard intelligence. However, these platforms are strongly limited in the onboard power envelope for processing, i.e., less than a few hundred mW, which confines the onboard processors to the class of simple microcontroller units (MCUs). These MCUs lack advanced security features opening the way to a wide range of cyber security vulnerabilities, from the communication between agents of the same fleet to the onboard execution of malicious code. This work presents an open source System on Chip (SoC) design that integrates a 64 bit Linux capable host processor accelerated by an 8 core 32 bit parallel programmable accelerator. The heterogeneous system architecture is coupled with a security enclave based on an open source OpenTitan root of trust. To demonstrate our design, we propose a use case where OpenTitan detects a security breach on the SoC aboard the MAV and drives its exclusive GPIOs to start a LED blinking routine. This procedure embodies an unconventional visual communication between two palm sized MAVs: the receiver MAV classifies the LED state of the sender (on or off) with an onboard convolutional neural network running on the parallel accelerator. Then, it reconstructs a high-level message in 1.3s, 2.3 times faster than current commercial solutions.
Submitted 29 March, 2023;
originally announced March 2023.
-
Ultra-low Power Deep Learning-based Monocular Relative Localization Onboard Nano-quadrotors
Authors:
Stefano Bonato,
Stefano Carlo Lambertenghi,
Elia Cereda,
Alessandro Giusti,
Daniele Palossi
Abstract:
Precise relative localization is a crucial functional block for swarm robotics. This work presents a novel autonomous end-to-end system that addresses the monocular relative localization, through deep neural networks (DNNs), of two peer nano-drones, i.e., sub-40g of weight and sub-100mW processing power. To cope with the ultra-constrained nano-drone platform, we propose a vertically-integrated framework, from the dataset collection to the final in-field deployment, including dataset augmentation, quantization, and system optimizations. Experimental results show that our DNN can precisely localize a 10cm-size target nano-drone by employing only low-resolution monochrome images, up to ~2m distance. On a disjoint testing dataset our model yields a mean R2 score of 0.42 and a root mean square error of 18cm, which results in a mean in-field prediction error of 15cm and in a closed-loop control error of 17cm, over a ~60s-flight test. Ultimately, the proposed system improves the State-of-the-Art by showing long-endurance tracking performance (up to 2min continuous tracking), generalization capabilities being deployed in a never-seen-before environment, and requiring a minimal power consumption of 95mW for an onboard real-time inference-rate of 48Hz.
Submitted 3 March, 2023;
originally announced March 2023.
-
Deep Neural Network Architecture Search for Accurate Visual Pose Estimation aboard Nano-UAVs
Authors:
Elia Cereda,
Luca Crupi,
Matteo Risso,
Alessio Burrello,
Luca Benini,
Alessandro Giusti,
Daniele Jahier Pagliari,
Daniele Palossi
Abstract:
Miniaturized autonomous unmanned aerial vehicles (UAVs) are an emerging and trending topic. With their form factor as big as the palm of one hand, they can reach spots otherwise inaccessible to bigger robots and safely operate in human surroundings. The simple electronics aboard such robots (sub-100mW) make them particularly cheap and attractive but pose significant challenges in enabling sophisticated onboard intelligence. In this work, we leverage a novel neural architecture search (NAS) technique to automatically identify several Pareto-optimal convolutional neural networks (CNNs) for a visual pose estimation task. Our work demonstrates how real-life and field-tested robotics applications can concretely leverage NAS technologies to automatically and efficiently optimize CNNs for the specific hardware constraints of small UAVs. We deploy several NAS-optimized CNNs and run them in closed-loop aboard a 27-g Crazyflie nano-UAV equipped with a parallel ultra-low power System-on-Chip. Our results improve the State-of-the-Art by reducing the in-field control error by 32% while achieving a real-time onboard inference-rate of ~10Hz@10mW and ~50Hz@90mW.
Submitted 3 March, 2023;
originally announced March 2023.
-
Visual Servoing with Geometrically Interpretable Neural Perception
Authors:
Antonio Paolillo,
Mirko Nava,
Dario Piga,
Alessandro Giusti
Abstract:
An increasing number of nonspecialist robotic users demand easy-to-use machines. In the context of visual servoing, the removal of explicit image processing is becoming a trend, allowing an easy application of this technique. This work presents a deep learning approach for solving the perception problem within the visual servoing scheme. An artificial neural network is trained using the supervision coming from the knowledge of the controller and the visual features motion model. In this way, it is possible to give a geometrical interpretation to the estimated visual features, which can be used in the analytical law of the visual servoing. The approach keeps perception and control decoupled, conferring flexibility and interpretability on the whole framework. Simulated and real experiments with a robotic manipulator validate our approach.
Submitted 19 October, 2022;
originally announced October 2022.
-
Challenges in Visual Anomaly Detection for Mobile Robots
Authors:
Dario Mantegazza,
Alessandro Giusti,
Luca M. Gambardella,
Andrea Rizzoli,
Jérôme Guzzi
Abstract:
We consider the task of detecting anomalies for autonomous mobile robots based on vision. We categorize relevant types of visual anomalies and discuss how they can be detected by unsupervised deep learning methods. We propose a novel dataset built specifically for this task, on which we test a state-of-the-art approach; we finally discuss deployment in a real scenario.
Submitted 22 September, 2022;
originally announced September 2022.
-
An Outlier Exposure Approach to Improve Visual Anomaly Detection Performance for Mobile Robots
Authors:
Dario Mantegazza,
Alessandro Giusti,
Luca Maria Gambardella,
Jérôme Guzzi
Abstract:
We consider the problem of building visual anomaly detection systems for mobile robots. Standard anomaly detection models are trained using large datasets composed only of non-anomalous data. However, in robotics applications, it is often the case that (potentially very few) examples of anomalies are available. We tackle the problem of exploiting these data to improve the performance of a Real-NVP anomaly detection model, by minimizing, jointly with the Real-NVP loss, an auxiliary outlier exposure margin loss. We perform quantitative experiments on a novel dataset (which we publish as supplementary material) designed for anomaly detection in an indoor patrolling scenario. On a disjoint test set, our approach outperforms alternatives and shows that exposing even a small number of anomalous frames yields significant performance improvements.
Submitted 20 September, 2022;
originally announced September 2022.
-
Distributed control for geometric pattern formation of large-scale multirobot systems
Authors:
Andrea Giusti,
Gian Carlo Maffettone,
Davide Fiore,
Marco Coraggio,
Mario di Bernardo
Abstract:
Geometric pattern formation is crucial in many tasks involving large-scale multi-agent systems. Examples include mobile agents performing surveillance, swarms of drones or robots, or smart transportation systems. Currently, most control strategies proposed to achieve pattern formation in network systems either show good performance but require expensive sensors and communication devices, or have lesser sensor requirements but behave more poorly. Also, they often require certain prescribed structural interconnections between the agents (e.g., regular lattices, all-to-all networks, etc.). In this paper, we provide a distributed displacement-based control law that allows large groups of agents to achieve triangular and square lattices, with low sensor requirements and without needing communication between the agents. Also, a simple, yet powerful, adaptation law is proposed to automatically tune the control gains in order to reduce the design effort, while improving robustness and flexibility. We show the validity and robustness of our approach via numerical simulations and experiments, comparing it with other approaches from the existing literature.
Submitted 29 July, 2022;
originally announced July 2022.
-
Vision-State Fusion: Improving Deep Neural Networks for Autonomous Robotics
Authors:
Elia Cereda,
Stefano Bonato,
Mirko Nava,
Alessandro Giusti,
Daniele Palossi
Abstract:
Vision-based deep learning perception fulfills a paramount role in robotics, facilitating solutions to many challenging scenarios, such as acrobatic maneuvers of autonomous unmanned aerial vehicles (UAVs) and robot-assisted high-precision surgery. Control-oriented end-to-end perception approaches, which directly output control variables for the robot, commonly take advantage of the robot's state estimation as an auxiliary input. When intermediate outputs are estimated and fed to a lower-level controller, i.e. mediated approaches, the robot's state is commonly used as an input only for egocentric tasks, which estimate physical properties of the robot itself. In this work, we propose to apply a similar approach for the first time -- to the best of our knowledge -- to non-egocentric mediated tasks, where the estimated outputs refer to an external subject. We prove how our general methodology improves the regression performance of deep convolutional neural networks (CNNs) on a broad class of non-egocentric 3D pose estimation problems, with minimal computational cost. By analyzing three highly-different use cases, spanning from grasping with a robotic arm to following a human subject with a pocket-sized UAV, our results consistently improve the $R^2$ regression metric, up to +0.51, compared to their stateless baselines. Finally, we validate the in-field performance of a closed-loop autonomous cm-scale UAV on the human pose estimation task. Our results show a significant reduction, i.e., 24% on average, on the mean absolute error of our stateful CNN, compared to a State-of-the-Art stateless counterpart.
Submitted 20 March, 2024; v1 submitted 13 June, 2022;
originally announced June 2022.
-
Collaborative Artificial Intelligence Needs Stronger Assurances Driven by Risks
Authors:
Jubril Gbolahan Adigun,
Matteo Camilli,
Michael Felderer,
Andrea Giusti,
Dominik T Matt,
Anna Perini,
Barbara Russo,
Angelo Susi
Abstract:
Collaborative AI systems (CAISs) aim at working together with humans in a shared space to achieve a common goal. This critical setting yields hazardous circumstances that could harm human beings. Thus, building such systems with strong assurances of compliance with requirements, domain-specific standards and regulations is of greatest importance. Only few scale impact has been reported so far for such systems since much work remains to manage possible risks. We identify emerging problems in this context and then we report our vision, as well as the progress of our multidisciplinary research team composed of software/systems, and mechatronics engineers to develop a risk-driven assurance process for CAISs.
Submitted 22 September, 2022; v1 submitted 1 December, 2021;
originally announced December 2021.
-
Sensing Anomalies as Potential Hazards: Datasets and Benchmarks
Authors:
Dario Mantegazza,
Carlos Redondo,
Fran Espada,
Luca M. Gambardella,
Alessandro Giusti,
Jérôme Guzzi
Abstract:
We consider the problem of detecting, in the visual sensing data stream of an autonomous mobile robot, semantic patterns that are unusual (i.e., anomalous) with respect to the robot's previous experience in similar environments. These anomalies might indicate unforeseen hazards and, in scenarios where failure is costly, can be used to trigger an avoidance behavior. We contribute three novel image-based datasets acquired in robot exploration scenarios, comprising a total of more than 200k labeled frames, spanning various types of anomalies. On these datasets, we study the performance of an anomaly detection approach based on autoencoders operating at different scales.
Submitted 20 September, 2022; v1 submitted 27 October, 2021;
originally announced October 2021.
-
Training Lightweight CNNs for Human-Nanodrone Proximity Interaction from Small Datasets using Background Randomization
Authors:
Marco Ferri,
Dario Mantegazza,
Elia Cereda,
Nicky Zimmerman,
Luca M. Gambardella,
Daniele Palossi,
Jérôme Guzzi,
Alessandro Giusti
Abstract:
We consider the task of visually estimating the pose of a human from images acquired by a nearby nano-drone; in this context, we propose a data augmentation approach based on synthetic background substitution to learn a lightweight CNN model from a small real-world training set. Experimental results on data from two different labs prove that the approach improves generalization to unseen environments.
Submitted 27 October, 2021;
originally announced October 2021.
-
Performance vs Programming Effort between Rust and C on Multicore Architectures: Case Study in N-Body
Authors:
Manuel Costanzo,
Enzo Rucci,
Marcelo Naiouf,
Armando De Giusti
Abstract:
Historically, Fortran and C have been the default programming languages in High-Performance Computing (HPC). In both, programmers have primitives and functions available that allow manipulating system memory and interacting directly with the underlying hardware, resulting in efficient code in both response times and resource use. On the other hand, it is a real challenge to generate code that is maintainable and scalable over time in these types of languages. In 2010, Rust emerged as a new programming language designed for concurrent and secure applications, which adopts features of procedural, object-oriented and functional languages. Among its design principles, Rust is aimed at matching C in terms of efficiency, but with increased code security and productivity. This paper presents a comparative study between C and Rust in terms of performance and programming effort, selecting as a case study the simulation of N computational bodies (N-Body), a popular problem in the HPC community. Based on the experimental work, it was possible to establish that Rust is a language that reduces programming effort while maintaining acceptable performance levels, meaning that it is a possible alternative to C for HPC.
Submitted 19 October, 2021; v1 submitted 25 July, 2021;
originally announced July 2021.
-
Uncertainty-Aware Self-Supervised Learning of Spatial Perception Tasks
Authors:
Mirko Nava,
Antonio Paolillo,
Jérôme Guzzi,
Luca Maria Gambardella,
Alessandro Giusti
Abstract:
We propose a general self-supervised learning approach for spatial perception tasks, such as estimating the pose of an object relative to the robot, from onboard sensor readings. The model is learned from training episodes, by relying on: a continuous state estimate, possibly inaccurate and affected by odometry drift; and a detector, that sporadically provides supervision about the target pose. We demonstrate the general approach in three different concrete scenarios: a simulated robot arm that visually estimates the pose of an object of interest; a small differential drive robot using 7 infrared sensors to localize a nearby wall; an omnidirectional mobile robot that localizes itself in an environment from camera images. Quantitative results show that the approach works well in all three scenarios, and that explicitly accounting for uncertainty yields statistically significant performance improvements.
Submitted 18 July, 2021; v1 submitted 22 March, 2021;
originally announced March 2021.
-
Fully Onboard AI-powered Human-Drone Pose Estimation on Ultra-low Power Autonomous Flying Nano-UAVs
Authors:
Daniele Palossi,
Nicky Zimmerman,
Alessio Burrello,
Francesco Conti,
Hanna Müller,
Luca Maria Gambardella,
Luca Benini,
Alessandro Giusti,
Jérôme Guzzi
Abstract:
Artificial intelligence-powered pocket-sized air robots have the potential to revolutionize the Internet-of-Things ecosystem, acting as autonomous, unobtrusive, and ubiquitous smart sensors. With a few cm$^{2}$ form-factor, nano-sized unmanned aerial vehicles (UAVs) are the natural fit for indoor human-drone interaction missions, such as the pose estimation task we address in this work. However, this scenario is challenged by the nano-UAVs' limited payload and computational power that severely relegates the onboard brain to the sub-100 mW microcontroller unit-class. Our work stands at the intersection of the novel parallel ultra-low-power (PULP) architectural paradigm and our general development methodology for deep neural network (DNN) visual pipelines, i.e., covering from perception to control. Addressing the DNN model design, from training and dataset augmentation to 8-bit quantization and deployment, we demonstrate how a PULP-based processor, aboard a nano-UAV, is sufficient for the real-time execution (up to 135 frame/s) of our novel DNN, called PULP-Frontnet. We showcase how, scaling our model's memory and computational requirement, we can significantly improve the onboard inference (top energy efficiency of 0.43 mJ/frame) with no compromise in the quality-of-result vs. a resource-unconstrained baseline (i.e., full-precision DNN). Field experiments demonstrate a closed-loop top-notch autonomous navigation capability, with a heavily resource-constrained 27-gram Crazyflie 2.1 nano-quadrotor. Compared against the control performance achieved using an ideal sensing setup, onboard relative pose inference yields excellent drone behavior in terms of median absolute errors, such as positional (onboard: 41 cm, ideal: 26 cm) and angular (onboard: 3.7$^{\circ}$, ideal: 4.1$^{\circ}$).
Submitted 19 March, 2021;
originally announced March 2021.
-
Towards Risk Modeling for Collaborative AI
Authors:
Matteo Camilli,
Michael Felderer,
Andrea Giusti,
Dominik T. Matt,
Anna Perini,
Barbara Russo,
Angelo Susi
Abstract:
Collaborative AI systems aim at working together with humans in a shared space to achieve a common goal. This setting imposes potentially hazardous circumstances due to contacts that could harm human beings. Thus, building such systems with strong assurances of compliance with requirements, domain-specific standards and regulations is of greatest importance. Challenges associated with the achievement of this goal become even more severe when such systems rely on machine learning components rather than on top-down rule-based AI. In this paper, we introduce a risk modeling approach tailored to Collaborative AI systems. The risk model includes goals, risk events and domain-specific indicators that potentially expose humans to hazards. The risk model is then leveraged to drive assurance methods that feed in turn the risk model through insights extracted from run-time evidence. Our envisioned approach is described by means of a running example in the domain of Industry 4.0, where a robotic arm endowed with a visual perception component, implemented with machine learning, collaborates with a human operator for a production-relevant task.
Submitted 12 March, 2021;
originally announced March 2021.
-
Semantic Segmentation on Swiss3DCities: A Benchmark Study on Aerial Photogrammetric 3D Pointcloud Dataset
Authors:
Gülcan Can,
Dario Mantegazza,
Gabriele Abbate,
Sébastien Chappuis,
Alessandro Giusti
Abstract:
We introduce a new outdoor urban 3D pointcloud dataset, covering a total area of 2.7 $km^2$, sampled from three Swiss cities with different characteristics. The dataset is manually annotated for semantic segmentation with per-point labels, and is built using photogrammetry from images acquired by multirotors equipped with high-resolution cameras. In contrast to datasets acquired with ground LiDAR sensors, the resulting point clouds are uniformly dense and complete, and are useful to disparate applications, including autonomous driving, gaming and smart city planning. As a benchmark, we report quantitative results of PointNet++, an established point-based deep 3D semantic segmentation model; on this model, we additionally study the impact of using different cities for model generalization.
Submitted 23 December, 2020;
originally announced December 2020.
-
Diabetes Link: Platform for Self-Control and Monitoring People with Diabetes
Authors:
Enzo Rucci,
Lisandro Delia,
Joaquín Pujol,
Paula Erbino,
Armando De Giusti,
Juan José Gagliardino
Abstract:
Diabetes Mellitus (DM) is a chronic disease characterized by an increase in blood glucose (sugar) above normal levels; it appears when the human body is not able to produce enough insulin to cover the peripheral tissue demand. Nowadays, DM affects 8.5% of the world's population and, even though no cure for it has been found, adequate monitoring and treatment allow patients to have an almost normal life. This paper introduces Diabetes Link, a comprehensive platform for the control and monitoring of people with DM. Diabetes Link allows recording various parameters relevant for the treatment and calculating different statistical charts using them. In addition, it allows connecting with other users (supervisors) so they can monitor the controls. Furthermore, the extensive comparative study carried out shows that Diabetes Link presents distinctive and superior features compared with other proposals. We conclude that Diabetes Link represents a broad and accessible tool that can help make day-to-day control easier and optimize the efficacy of DM control and treatment.
Submitted 29 October, 2020;
originally announced November 2020.
-
Learning to predict metal deformations in hot-rolling processes
Authors:
R. Omar Chavez-Garcia,
Emian Furger,
Samuele Kronauer,
Christian Brianza,
Marco Scarfò,
Luca Diviani,
Alessandro Giusti
Abstract:
Hot-rolling is a metal forming process that produces a workpiece with a desired target cross-section from an input workpiece through a sequence of plastic deformations; each deformation is generated by a stand composed of opposing rolls with a specific geometry. In current practice, the rolling sequence (i.e., the sequence of stands and the geometry of their rolls) needed to achieve a given final cross-section is designed by experts based on previous experience, and iteratively refined in a costly trial-and-error process. Finite Element Method simulations are increasingly adopted to make this process more efficient and to test potential rolling sequences, achieving good accuracy at the cost of long simulation times, limiting the practical use of the approach. We propose a supervised learning approach to predict the deformation of a given workpiece by a set of rolls with a given geometry; the model is trained on a large dataset of procedurally-generated FEM simulations, which we publish as supplementary material. The resulting predictor is four orders of magnitude faster than simulations, and yields an average Jaccard Similarity Index of 0.972 (against ground truth from simulations) and 0.925 (against real-world measured deformations); we additionally report preliminary results on using the predictor for automatic planning of rolling sequences.
Submitted 22 July, 2020;
originally announced July 2020.
-
Soft Errors Detection and Automatic Recovery based on Replication combined with different Levels of Checkpointing
Authors:
Diego Montezanti,
Enzo Rucci,
Armando De Giusti,
Marcelo Naiouf,
Dolores Rexachs,
Emilio Luque
Abstract:
Handling faults is a growing concern in HPC. In future exascale systems, it is projected that silent undetected errors will occur several times a day, increasing the occurrence of corrupted results. In this article, we propose SEDAR, which is a methodology that improves system reliability against transient faults when running parallel message-passing applications. Our approach, based on process replication for detection, combined with different levels of checkpointing for automatic recovery, has the goal of helping users of scientific applications to obtain executions with correct results. SEDAR is structured in three levels: (1) only detection and safe-stop with notification; (2) recovery based on multiple system-level checkpoints; and (3) recovery based on a single valid user-level checkpoint. As each of these variants supplies a particular coverage but involves limitations and implementation costs, SEDAR can be adapted to the needs of the system. In this work, a description of the methodology is presented and the temporal behavior of employing each SEDAR strategy is mathematically described, both in the absence and presence of faults. A model that considers all the fault scenarios on a test application is introduced to show the validity of the detection and recovery mechanisms. An overhead evaluation of each variant is performed with applications involving different communication patterns; this is also used to extract guidelines about when it is beneficial to employ each SEDAR protection level. As a result, we show its efficacy and viability to tolerate transient faults in target HPC environments.
Submitted 27 July, 2020; v1 submitted 16 July, 2020;
originally announced July 2020.
-
Blocked All-Pairs Shortest Paths Algorithm on Intel Xeon Phi KNL Processor: A Case Study
Authors:
Enzo Rucci,
Armando De Giusti,
Marcelo Naiouf
Abstract:
Manycores are consolidating in the HPC community as a way of improving performance while keeping power efficiency. Knights Landing is the recently released second generation of the Intel Xeon Phi architecture. While optimizing applications on CPUs, GPUs and first-generation Xeon Phis has been largely studied in the last years, the new features in Knights Landing processors require the revision of programming and optimization techniques for these devices. In this work, we selected the Floyd-Warshall algorithm as a representative case study of graph and memory-bound applications. Starting from the default serial version, we show how data, thread and compiler level optimizations help the parallel implementation to reach 338 GFLOPS.
Submitted 3 November, 2018;
originally announced November 2018.
-
Vision-based Control of a Quadrotor in User Proximity: Mediated vs End-to-End Learning Approaches
Authors:
Dario Mantegazza,
Jérôme Guzzi,
Luca M. Gambardella,
Alessandro Giusti
Abstract:
We consider the task of controlling a quadrotor to hover in front of a freely moving user, using input data from an onboard camera. On this specific task we compare two widespread learning paradigms: a mediated approach, which learns a high-level state from the input and then uses it for deriving control signals; and an end-to-end approach, which skips high-level state estimation altogether. We show that despite their fundamental difference, both approaches yield equivalent performance on this task. We finally qualitatively analyze the behavior of a quadrotor implementing such approaches.
Submitted 25 February, 2019; v1 submitted 24 September, 2018;
originally announced September 2018.
-
Learning Long-Range Perception Using Self-Supervision from Short-Range Sensors and Odometry
Authors:
Mirko Nava,
Jerome Guzzi,
R. Omar Chavez-Garcia,
Luca M. Gambardella,
Alessandro Giusti
Abstract:
We introduce a general self-supervised approach to predict the future outputs of a short-range sensor (such as a proximity sensor) given the current outputs of a long-range sensor (such as a camera); we assume that the former is directly related to some piece of information to be perceived (such as the presence of an obstacle in a given position), whereas the latter is information-rich but hard to interpret directly. We instantiate and implement the approach on a small mobile robot to detect obstacles at various distances using the video stream of the robot's forward-pointing camera, by training a convolutional neural network on automatically-acquired datasets. We quantitatively evaluate the quality of the predictions on unseen scenarios, qualitatively evaluate robustness to different operating conditions, and demonstrate usage as the sole input of an obstacle-avoidance controller. We additionally instantiate the approach on a different simulated scenario with complementary characteristics, to exemplify the generality of our contribution.
Submitted 17 January, 2019; v1 submitted 19 September, 2018;
originally announced September 2018.
-
Learning Ground Traversability from Simulations
Authors:
R. Omar Chavez-Garcia,
Jerome Guzzi,
Luca M. Gambardella,
Alessandro Giusti
Abstract:
Mobile ground robots operating on unstructured terrain must predict which areas of the environment they are able to pass in order to plan feasible paths. We address traversability estimation as a heightmap classification problem: we build a convolutional neural network that, given an image representing the heightmap of a terrain patch, predicts whether the robot will be able to traverse such patch from left to right. The classifier is trained for a specific robot model (wheeled, tracked, legged, snake-like) using simulation data on procedurally generated training terrains; the trained classifier can be applied to unseen large heightmaps to yield oriented traversability maps, and then plan traversable paths. We extensively evaluate the approach in simulation on six real-world elevation datasets, and run a real-robot validation in one indoor and one outdoor environment.
Submitted 18 February, 2019; v1 submitted 15 September, 2017;
originally announced September 2017.
-
First Experiences Optimizing Smith-Waterman on Intel's Knights Landing Processor
Authors:
Enzo Rucci,
Carlos Garcia,
Guillermo Botella,
Armando De Giusti,
Marcelo Naiouf,
Manuel Prieto-Matias
Abstract:
The well-known Smith-Waterman (SW) algorithm is the most commonly used method for local sequence alignments. However, SW is very computationally demanding for large protein databases. There exist several implementations that take advantage of computing parallelization on many-cores, FPGAs or GPUs, in order to increase the alignment throughput. In this paper, we have explored SW acceleration on the Intel KNL processor. The novelty of this architecture requires the revision of previous programming and optimization techniques on many-core architectures. To the best of the authors' knowledge, this is the first KNL architecture assessment for the SW algorithm. Our evaluation, using the renowned Environmental NR database as benchmark, has shown that multi-threading and SIMD exploitation yield competitive performance (351 GCUPS) in comparison with other implementations.
Submitted 23 February, 2017;
originally announced February 2017.
-
Assessment of algorithms for mitosis detection in breast cancer histopathology images
Authors:
Mitko Veta,
Paul J. van Diest,
Stefan M. Willems,
Haibo Wang,
Anant Madabhushi,
Angel Cruz-Roa,
Fabio Gonzalez,
Anders B. L. Larsen,
Jacob S. Vestergaard,
Anders B. Dahl,
Dan C. Cireşan,
Jürgen Schmidhuber,
Alessandro Giusti,
Luca M. Gambardella,
F. Boray Tek,
Thomas Walter,
Ching-Wei Wang,
Satoshi Kondo,
Bogdan J. Matuszewski,
Frederic Precioso,
Violet Snell,
Josef Kittler,
Teofilo E. de Campos,
Adnan M. Khan,
Nasir M. Rajpoot
, et al. (4 additional authors not shown)
Abstract:
The proliferative activity of breast tumors, which is routinely estimated by counting of mitotic figures in hematoxylin and eosin stained histology sections, is considered to be one of the most important prognostic markers. However, mitosis counting is laborious, subjective and may suffer from low inter-observer agreement. With the wider acceptance of whole slide images in pathology labs, automatic image analysis has been proposed as a potential solution for these issues. In this paper, the results from the Assessment of Mitosis Detection Algorithms 2013 (AMIDA13) challenge are described. The challenge was based on a data set consisting of 12 training and 11 testing subjects, with more than one thousand annotated mitotic figures by multiple observers. Short descriptions and results from the evaluation of eleven methods are presented. The top performing method has an error rate that is comparable to the inter-observer agreement among pathologists.
Submitted 21 November, 2014;
originally announced November 2014.
-
Fast Image Scanning with Deep Max-Pooling Convolutional Neural Networks
Authors:
Alessandro Giusti,
Dan C. Cireşan,
Jonathan Masci,
Luca M. Gambardella,
Jürgen Schmidhuber
Abstract:
Deep Neural Networks now excel at image classification, detection and segmentation. When used to scan images by means of a sliding window, however, their high computational complexity can bring even the most powerful hardware to its knees. We show how dynamic programming can speed up the process by orders of magnitude, even when max-pooling layers are present.
Submitted 7 February, 2013;
originally announced February 2013.
-
A Fast Learning Algorithm for Image Segmentation with Max-Pooling Convolutional Networks
Authors:
Jonathan Masci,
Alessandro Giusti,
Dan Cireşan,
Gabriel Fricout,
Jürgen Schmidhuber
Abstract:
We present a fast algorithm for training MaxPooling Convolutional Networks to segment images. This type of network yields record-breaking performance in a variety of tasks, but is normally trained on a computationally expensive patch-by-patch basis. Our new method processes each training image in a single pass, which is vastly more efficient.
We validate the approach in different scenarios and report a 1500-fold speed-up. In an application to automated steel defect detection and segmentation, we obtain excellent performance with short training times.
Submitted 7 February, 2013;
originally announced February 2013.
-
Automatic Mapping Tasks to Cores - Evaluating AMTHA Algorithm in Multicore Architectures
Authors:
Laura De Giusti,
Franco Chichizola,
Marcelo Naiouf,
Armando De Giusti,
Emilio Luque
Abstract:
The AMTHA (Automatic Mapping Task on Heterogeneous Architectures) algorithm for task-to-processors assignment and the MPAHA (Model of Parallel Algorithms on Heterogeneous Architectures) model are presented. The use of AMTHA is analyzed for multicore processor-based architectures, considering the communication model among processes in use. The results obtained in the tests carried out are presented, comparing the real execution times on multicores of a set of synthetic applications with the predictions obtained with AMTHA. Finally, current lines of research are presented, focusing on clusters of multicores and hybrid programming paradigms.
Submitted 19 April, 2010;
originally announced April 2010.
-
A Unified Perspective on Parity- and Syndrome-Based Binary Data Compression Using Off-the-Shelf Turbo Codecs
Authors:
Lorenzo Cappellari,
Andrea De Giusti
Abstract:
We consider the problem of compressing memoryless binary data with or without side information at the decoder. We review the parity- and the syndrome-based approaches and discuss their theoretical limits, assuming that there exists a virtual binary symmetric channel between the source and the side information, and that the source is not necessarily uniformly distributed. We take a factor-graph-based approach in order to devise how to take full advantage of the ready-available iterative decoding procedures when turbo codes are employed, in both a parity- or a syndrome-based fashion. We end up obtaining a unified decoder formulation that holds both for error-free and for error-prone encoder-to-decoder transmission over generic channels. To support the theoretical results, the different compression systems analyzed in the paper are also experimentally tested. They are compared against several different approaches proposed in literature and shown to be competitive in a variety of cases.
Submitted 2 August, 2010; v1 submitted 3 February, 2009;
originally announced February 2009.