-
Guarantees on Robot System Performance Using Stochastic Simulation Rollouts
Authors:
Joseph A. Vincent,
Aaron O. Feldman,
Mac Schwager
Abstract:
We provide finite-sample performance guarantees for control policies executed on stochastic robotic systems. Given an open- or closed-loop policy and a finite set of trajectory rollouts under the policy, we bound the expected value, value-at-risk, and conditional-value-at-risk of the trajectory cost, and the probability of failure in a sparse cost setting. The bounds hold, with user-specified probability, for any policy synthesis technique and can be seen as a post-design safety certification. Generating the bounds only requires sampling simulation rollouts, without assumptions on the distribution or complexity of the underlying stochastic system. We adapt these bounds to also give a constraint satisfaction test to verify safety of the robot system. We provide a thorough analysis of the bound sensitivity to sim-to-real distribution shifts and provide results for constructing robust bounds that can tolerate some specified amount of distribution shift. Furthermore, we extend our method to apply when selecting the best policy from a set of candidates, requiring a multi-hypothesis correction. We show the statistical validity of our bounds in the Ant, Half-cheetah, and Swimmer MuJoCo environments and demonstrate our constraint satisfaction test with the Ant. Finally, using the 20 degree-of-freedom MuJoCo Shadow Hand, we show the necessity of the multi-hypothesis correction.
Submitted 13 June, 2024; v1 submitted 19 September, 2023;
originally announced September 2023.
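
As a rough illustration of what a finite-sample guarantee computed from rollouts can look like, the sketch below applies the classical one-sided Hoeffding inequality to the empirical mean cost. This is not the construction used in the paper: it additionally assumes that rollouts are independent and that every cost lies in a known interval, and the function name and parameters (`costs`, `delta`, `lo`, `hi`) are hypothetical.

```python
import math


def hoeffding_upper_bound(costs, delta, lo=0.0, hi=1.0):
    """Upper bound on the expected cost that holds with probability >= 1 - delta.

    Assumptions (not taken from the paper): rollouts are i.i.d. and every
    cost lies in the known interval [lo, hi].
    """
    n = len(costs)
    if n == 0:
        raise ValueError("need at least one rollout cost")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    if any(c < lo or c > hi for c in costs):
        raise ValueError("all costs must lie in [lo, hi]")
    empirical_mean = sum(costs) / n
    # One-sided Hoeffding slack: (hi - lo) * sqrt(ln(1 / delta) / (2 n)).
    slack = (hi - lo) * math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    # The expectation can never exceed hi, so clip the bound there.
    return min(hi, empirical_mean + slack)
```

With binary costs (1 for a failed rollout, 0 otherwise), the same function gives an upper bound on the failure probability. Under these assumptions the slack shrinks as 1/sqrt(n), so quadrupling the number of rollouts halves it.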
-
Reachable Polyhedral Marching (RPM): An Exact Analysis Tool for Deep-Learned Control Systems
Authors:
Joseph A. Vincent,
Mac Schwager
Abstract:
We present a tool for computing exact forward and backward reachable sets of deep neural networks with rectified linear unit (ReLU) activation. We then develop algorithms using this tool to compute invariant sets and regions of attraction (ROAs) for control systems with neural networks in the feedback loop. Our algorithm is unique in that it builds the reachable sets by incrementally enumerating polyhedral regions in the input space, rather than iterating layer-by-layer through the network as in other methods. When performing safety verification, if an unsafe region is found, our algorithm can return this result without completing the full reachability computation, thus giving an anytime property that accelerates safety verification. Furthermore, we introduce a method to accelerate the computation of ROAs in the case that deep learned components are homeomorphisms, which we find is surprisingly common in practice. We demonstrate our tool in several test cases. We compute a ROA for a learned van der Pol oscillator model. We find a control invariant set for a learned torque-controlled pendulum model. We also verify specific safety properties for multiple deep networks related to the ACAS Xu aircraft collision advisory system. Finally, we apply our algorithm to find ROAs for an image-based aircraft runway taxi problem. Algorithm source code: https://github.com/StanfordMSL/Neural-Network-Reach .
Submitted 25 October, 2022; v1 submitted 15 October, 2022;
originally announced October 2022.
-
Fast Cross-Correlation for TDoA Estimation on Small Aperture Microphone Arrays
Authors:
François Grondin,
Marc-Antoine Maheux,
Jean-Samuel Lauzon,
Jonathan Vincent,
François Michaud
Abstract:
This paper introduces the Fast Cross-Correlation (FCC) method for Time Difference of Arrival (TDoA) Estimation for pairs of microphones on a small aperture microphone array. FCC relies on low-rank decomposition and exploits symmetry in even and odd bases to speed up computation while preserving TDoA accuracy. FCC reduces the number of flops by a factor of 4.5 and the execution speed by factors between 3.5 and 8.3 on embedded hardware, compared to the state-of-the-art Generalized Cross-Correlation (GCC) method that relies on the Fast Fourier Transform (FFT). This improvement can provide portable microphone arrays with extended battery life and allow real-time processing on low-cost hardware.
Submitted 10 March, 2023; v1 submitted 28 April, 2022;
originally announced April 2022.
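
For readers unfamiliar with TDoA estimation, the sketch below shows the underlying idea with a brute-force time-domain cross-correlation: the estimated delay is the lag that maximizes the correlation between the two microphone signals. It is neither FCC nor the FFT-based GCC method mentioned in the abstract, only a minimal baseline; the function names and the `max_lag` parameter are hypothetical.

```python
def estimate_tdoa_samples(x, y, max_lag):
    """Return the lag (in samples) that maximizes sum_n x[n] * y[n + lag].

    A positive result means that y is a delayed copy of x.
    Brute-force search over lags in [-max_lag, max_lag].
    """
    best_lag = 0
    best_score = float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, x_n in enumerate(x):
            m = n + lag
            if 0 <= m < len(y):  # ignore samples shifted outside y
                score += x_n * y[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def estimate_tdoa_seconds(x, y, max_lag, sample_rate_hz):
    """Convert the best lag from samples to seconds."""
    return estimate_tdoa_samples(x, y, max_lag) / sample_rate_hz
```

On a small-aperture array the physically possible delay is only a few samples, so `max_lag` can be kept small, which is what makes a restricted lag search attractive in the first place.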
-
Federated Learning Enables Big Data for Rare Cancer Boundary Detection
Authors:
Sarthak Pati,
Ujjwal Baid,
Brandon Edwards,
Micah Sheller,
Shih-Han Wang,
G Anthony Reina,
Patrick Foley,
Alexey Gruzdev,
Deepthi Karkada,
Christos Davatzikos,
Chiharu Sako,
Satyam Ghodasara,
Michel Bilello,
Suyash Mohan,
Philipp Vollmuth,
Gianluca Brugnara,
Chandrakanth J Preetha,
Felix Sahm,
Klaus Maier-Hein,
Maximilian Zenk,
Martin Bendszus,
Wolfgang Wick,
Evan Calabrese,
Jeffrey Rudie,
Javier Villanueva-Meyer,
et al. (254 additional authors not shown)
Abstract:
Although machine learning (ML) has shown promise in numerous domains, there are concerns about generalizability to out-of-sample data. This is currently addressed by centrally sharing ample, and importantly diverse, data from multiple sites. However, such centralization is challenging to scale (or even not feasible) due to various limitations. Federated ML (FL) provides an alternative to train accurate and generalizable ML models, by only sharing numerical model updates. Here we present findings from the largest FL study to-date, involving data from 71 healthcare institutions across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, utilizing the largest dataset of such patients ever used in the literature (25,256 MRI scans from 6,314 patients). We demonstrate a 33% improvement over a publicly trained model to delineate the surgically targetable tumor, and 23% improvement over the tumor's entire extent. We anticipate our study to: 1) enable more studies in healthcare informed by large and diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further quantitative analyses for glioblastoma via performance optimization of our consensus model for eventual public release, and 3) demonstrate the effectiveness of FL at such scale and task complexity as a paradigm shift for multi-site collaborations, alleviating the need for data sharing.
Submitted 25 April, 2022; v1 submitted 22 April, 2022;
originally announced April 2022.
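
The phrase "only sharing numerical model updates" can be made concrete with a generic sample-weighted averaging step, in which a server combines parameter vectors from several sites without ever seeing their data. This is a textbook-style sketch, not the aggregation scheme used in the study; the function name and arguments are hypothetical.

```python
def federated_average(site_params, site_sample_counts):
    """Combine per-site parameter vectors into one global vector.

    Each site contributes in proportion to the number of local training
    samples it used. Only parameters are exchanged, never raw data.
    """
    if not site_params or len(site_params) != len(site_sample_counts):
        raise ValueError("need one sample count per site")
    total = sum(site_sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(site_params[0])
    if any(len(p) != dim for p in site_params):
        raise ValueError("all parameter vectors must have the same length")
    return [
        sum(p[i] * n for p, n in zip(site_params, site_sample_counts)) / total
        for i in range(dim)
    ]
```

For example, two sites holding 1 and 3 samples with parameters `[1.0, 2.0]` and `[3.0, 4.0]` yield a global vector of `[2.5, 3.5]`.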
-
SMP-PHAT: Lightweight DoA Estimation by Merging Microphone Pairs
Authors:
François Grondin,
Marc-Antoine Maheux,
Jean-Samuel Lauzon,
Jonathan Vincent,
François Michaud
Abstract:
This paper introduces SMP-PHAT, which performs direction of arrival (DoA) of sound estimation with a microphone array by merging pairs of microphones that are parallel in space. This approach reduces the number of pairwise cross-correlation computations, and brings down the number of flops and memory lookups when searching for DoA. Experiments on low-cost hardware with commonly used microphone arrays show that the proposed method provides the same accuracy as the former SRP-PHAT approach, while reducing the computational load by 39% in some cases.
Submitted 27 March, 2022;
originally announced March 2022.
-
Heuristics for Customer-focused Ride-pooling Assignment
Authors:
Alexander Sundt,
Qi Luo,
John Vincent,
Mehrdad Shahabi,
Yafeng Yin
Abstract:
Ride-pooling has become an important service option offered by ride-hailing platforms as it serves multiple trip requests in a single ride. By leveraging customer data, connected vehicles, and efficient assignment algorithms, ride-pooling can be a critical instrument to address driver shortages and mitigate the negative externalities of ride-hailing operations. Recent literature has focused on computationally intensive optimization-based methods that maximize system throughput or minimize vehicle miles. However, individual customers may experience substantial service quality degradation due to the consequent waiting and detour time. In contrast, this paper examines heuristic methods for real-time ride-pooling assignments that are highly scalable and easily computable. We propose a restricted subgraph method and compare it with other existing heuristic and optimization-based matching algorithms using a variety of metrics. By fusing multiple sources of trip and network data in New York City, we develop a flexible, agent-based simulation platform to test these strategies on different demand levels and examine how they affect both the customer experience and the ride-hailing platform. Our results find a trade-off among heuristics between throughput and customer matching time. We show that our proposed ride-pooling strategy maintains system performance while limiting trip delays and improving customer experience. This work provides insight for policymakers and ride-hailing operators about the performance of simpler heuristics and raises concerns about prioritizing only specific platform metrics without considering service quality.
Submitted 23 July, 2021;
originally announced July 2021.
-
ODAS: Open embeddeD Audition System
Authors:
François Grondin,
Dominic Létourneau,
Cédric Godin,
Jean-Samuel Lauzon,
Jonathan Vincent,
Simon Michaud,
Samuel Faucher,
François Michaud
Abstract:
Artificial audition aims at providing hearing capabilities to machines, computers and robots. Existing frameworks in robot audition offer interesting sound source localization, tracking and separation performance, but involve a significant amount of computation that limits their use on robots with embedded computing capabilities. This paper presents ODAS, the Open embeddeD Audition System framework, which includes strategies to reduce the computational load and perform robot audition tasks on low-cost embedded computing systems. It presents key features of ODAS, along with cases illustrating its uses in different robots and artificial audition applications.
Submitted 11 May, 2022; v1 submitted 5 March, 2021;
originally announced March 2021.
-
Developing and Evaluating Deep Neural Network-based Denoising for Nanoparticle TEM Images with Ultra-low Signal-to-Noise
Authors:
Joshua L. Vincent,
Ramon Manzorro,
Sreyas Mohan,
Binh Tang,
Dev Y. Sheth,
Eero P. Simoncelli,
David S. Matteson,
Carlos Fernandez-Granda,
Peter A. Crozier
Abstract:
A deep convolutional neural network has been developed to denoise atomic-resolution TEM image datasets of nanoparticles acquired using direct electron counting detectors, for applications where the image signal is severely limited by shot noise. The network was applied to a model system of CeO2-supported Pt nanoparticles. We leverage multislice image simulations to generate a large and flexible dataset for training and testing the network. The proposed network outperforms state-of-the-art denoising methods by a significant margin both on simulated and experimental test data. Factors contributing to the performance are identified, including most importantly (a) the geometry of the images used during training and (b) the size of the network's receptive field. Through a gradient-based analysis, we investigate the mechanisms learned by the network to denoise experimental images. This shows that the network exploits global and local information in the noisy measurements, for example, by adapting its filtering approach when it encounters atomic-level defects at the nanoparticle surface. Extensive analysis has been done to characterize the network's ability to correctly predict the exact atomic structure at the nanoparticle surface. Finally, we develop an approach based on the log-likelihood ratio test that provides a quantitative measure of the agreement between the noisy observation and the atomic-level structure in the network-denoised image.
Submitted 17 March, 2021; v1 submitted 19 January, 2021;
originally announced January 2021.
-
Unsupervised Deep Video Denoising
Authors:
Dev Yashpal Sheth,
Sreyas Mohan,
Joshua L. Vincent,
Ramon Manzorro,
Peter A. Crozier,
Mitesh M. Khapra,
Eero P. Simoncelli,
Carlos Fernandez-Granda
Abstract:
Deep convolutional neural networks (CNNs) for video denoising are typically trained with supervision, assuming the availability of clean videos. However, in many applications, such as microscopy, noiseless videos are not available. To address this, we propose an Unsupervised Deep Video Denoiser (UDVD), a CNN architecture designed to be trained exclusively with noisy data. The performance of UDVD is comparable to the supervised state-of-the-art, even when trained only on a single short noisy video. We demonstrate the promise of our approach in real-world imaging applications by denoising raw video, fluorescence-microscopy and electron-microscopy data. In contrast to many current approaches to video denoising, UDVD does not require explicit motion compensation. This is advantageous because motion compensation is computationally expensive, and can be unreliable when the input data are noisy. A gradient-based analysis reveals that UDVD automatically adapts to local motion in the input noisy videos. Thus, the network learns to perform implicit motion compensation, even though it is only trained for denoising.
Submitted 19 August, 2021; v1 submitted 30 November, 2020;
originally announced November 2020.
-
Deep Denoising For Scientific Discovery: A Case Study In Electron Microscopy
Authors:
Sreyas Mohan,
Ramon Manzorro,
Joshua L. Vincent,
Binh Tang,
Dev Yashpal Sheth,
Eero P. Simoncelli,
David S. Matteson,
Peter A. Crozier,
Carlos Fernandez-Granda
Abstract:
Denoising is a fundamental challenge in scientific imaging. Deep convolutional neural networks (CNNs) provide the current state of the art in denoising natural images, where they produce impressive results. However, their potential has barely been explored in the context of scientific imaging. Denoising CNNs are typically trained on real natural images artificially corrupted with simulated noise. In contrast, in scientific applications, noiseless ground-truth images are usually not available. To address this issue, we propose a simulation-based denoising (SBD) framework, in which CNNs are trained on simulated images. We test the framework on data obtained from transmission electron microscopy (TEM), an imaging technique with widespread applications in material science, biology, and medicine. SBD outperforms existing techniques by a wide margin on a simulated benchmark dataset, as well as on real data. Apart from the denoised images, SBD generates likelihood maps to visualize the agreement between the structure of the denoised image and the observed data. Our results reveal shortcomings of state-of-the-art denoising architectures, such as their small field-of-view: substantially increasing the field-of-view of the CNNs allows them to exploit non-local periodic patterns in the data, which is crucial at high noise levels. In addition, we analyze the generalization capability of SBD, demonstrating that the trained networks are robust to variations of imaging parameters and of the underlying signal structure. Finally, we release the first publicly available benchmark dataset of TEM images, containing 18,000 examples.
Submitted 13 July, 2021; v1 submitted 24 October, 2020;
originally announced October 2020.
-
WHO 2016 subtyping and automated segmentation of glioma using multi-task deep learning
Authors:
Sebastian R. van der Voort,
Fatih Incekara,
Maarten M. J. Wijnenga,
Georgios Kapsas,
Renske Gahrmann,
Joost W. Schouten,
Rishi Nandoe Tewarie,
Geert J. Lycklama,
Philip C. De Witt Hamer,
Roelant S. Eijgelaar,
Pim J. French,
Hendrikus J. Dubbink,
Arnaud J. P. E. Vincent,
Wiro J. Niessen,
Martin J. van den Bent,
Marion Smits,
Stefan Klein
Abstract:
Accurate characterization of glioma is crucial for clinical decision making. A delineation of the tumor is also desirable in the initial decision stages but is a time-consuming task. Leveraging the latest GPU capabilities, we developed a single multi-task convolutional neural network that uses the full 3D, structural, pre-operative MRI scans to predict the IDH mutation status, the 1p/19q co-deletion status, and the grade of a tumor, while simultaneously segmenting the tumor. We trained our method using the largest, most diverse patient cohort to date containing 1508 glioma patients from 16 institutes. We tested our method on an independent dataset of 240 patients from 13 different institutes, and achieved an IDH-AUC of 0.90, 1p/19q-AUC of 0.85, grade-AUC of 0.81, and a mean whole tumor DICE score of 0.84. Thus, our method non-invasively predicts multiple, clinically relevant parameters and generalizes well to the broader clinical population.
Submitted 9 October, 2020;
originally announced October 2020.
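
The DICE score reported for the segmentation task is the Sørensen-Dice overlap between a predicted mask and a reference mask, 2|A ∩ B| / (|A| + |B|). The sketch below computes it for flattened binary masks; the function name and the convention of returning 1.0 when both masks are empty are assumptions made here, not details taken from the paper.

```python
def dice_score(pred_mask, true_mask):
    """Sørensen-Dice overlap between two flattened binary masks.

    Returns a value in [0, 1]; 1.0 means perfect overlap.
    Two empty masks are treated as a perfect match (an assumption).
    """
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    total = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total
```

A mean score over a test set, such as the one quoted in the abstract, would then be the average of this value across patients.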
-
Respiratory Sound Classification Using Long-Short Term Memory
Authors:
Chelsea Villanueva,
Joshua Vincent,
Alexander Slowinski,
Mohammad-Parsa Hosseini
Abstract:
Developing a reliable sound detection and recognition system offers many benefits and has many useful applications in different industries. This paper examines the difficulties that exist when attempting to perform sound classification as it relates to respiratory disease classification. Some methods that have been employed, such as independent component analysis and blind source separation, are examined. Finally, the use of deep learning and long short-term memory networks is examined in order to identify how such a task can be implemented.
Submitted 6 August, 2020;
originally announced August 2020.
-
Dynamic Object Tracking and Masking for Visual SLAM
Authors:
Jonathan Vincent,
Mathieu Labbé,
Jean-Samuel Lauzon,
François Grondin,
Pier-Marc Comtois-Rivet,
François Michaud
Abstract:
In dynamic environments, performance of visual SLAM techniques can be impaired by visual features taken from moving objects. One solution is to identify those objects so that their visual features can be removed for localization and mapping. This paper presents a simple and fast pipeline that uses deep neural networks, extended Kalman filters and visual SLAM to improve both localization and mapping in dynamic environments (around 14 fps on a GTX 1080). Results on the dynamic sequences from the TUM dataset using RTAB-Map as visual SLAM suggest that the approach achieves similar localization performance compared to other state-of-the-art methods, while also providing the position of the tracked dynamic objects, a 3D map free of those dynamic objects, and better loop closure detection, with the whole pipeline able to run on a robot moving at moderate speed.
Submitted 31 July, 2020;
originally announced August 2020.
-
GEV Beamforming Supported by DOA-based Masks Generated on Pairs of Microphones
Authors:
Francois Grondin,
Jean-Samuel Lauzon,
Jonathan Vincent,
Francois Michaud
Abstract:
Distant speech processing is a challenging task, especially when dealing with the cocktail party effect. Sound source separation is thus often required as a preprocessing step prior to speech recognition to improve the signal to distortion ratio (SDR). Recently, a combination of beamforming and speech separation networks has been proposed to improve the target source quality in the direction of arrival of interest. However, with this type of approach, the neural network needs to be trained in advance for a specific microphone array geometry, which limits versatility when adding/removing microphones, or changing the shape of the array. The solution presented in this paper is to train a neural network on pairs of microphones with different spacing and acoustic environmental conditions, and then use this network to estimate a time-frequency mask from all the pairs of microphones forming the array with an arbitrary shape. Using this mask, the target and noise covariance matrices can be estimated, and then used to perform generalized eigenvalue (GEV) beamforming. Results show that the proposed approach improves the SDR from 4.78 dB to 7.69 dB on average, for various microphone array geometries that correspond to commercially available hardware.
Submitted 5 August, 2020; v1 submitted 19 May, 2020;
originally announced May 2020.