-
Tensor Power Flow Formulations for Multidimensional Analyses in Distribution Systems
Authors:
Edgar Mauricio Salazar Duque,
Juan S. Giraldo,
Pedro P. Vergara,
Phuong H. Nguyen,
Han Slootweg
Abstract:
In this paper, we present two multidimensional power flow formulations based on a fixed-point iteration (FPI) algorithm to efficiently solve hundreds of thousands of power flows in distribution systems. The presented algorithms are the base for a new TensorPowerFlow (TPF) tool and stand out for their simplicity, benefiting from multicore CPU and GPU parallelization. We also focus on the mathematical convergence properties of the algorithm, showing that its unique solution is at the practical operational point, which is the high-voltage, low-current solution. The proof is validated using numerical simulations showing the robustness of the FPI algorithm compared to the classical Newton-Raphson (NR) approach. In the case study, a benchmark with different power flow (PF) solution methods is performed, showing that for applications requiring a yearly simulation at 1-minute resolution the computation time is decreased by a factor of 164 compared to NR in its sparse formulation.
Submitted 7 March, 2024;
originally announced March 2024.
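The fixed-point iteration named in the abstract above can be illustrated on a toy case. The sketch below is not the authors' TensorPowerFlow implementation: it assumes a two-bus feeder (one slack bus, one constant-power load bus), and the function name, parameters, and numerical values are hypothetical. It solves many load scenarios at once by vectorizing the update over a NumPy array, which is the spirit of the batched formulation.

```python
import numpy as np


def fixed_point_power_flow(y_line, v_source, s_load, tol=1e-10, max_iter=100):
    """Batched fixed-point power flow for a two-bus feeder (illustrative only).

    y_line   : complex series admittance of the single line (p.u.)
    v_source : complex slack-bus voltage (p.u.)
    s_load   : complex constant-power loads (p.u.), one entry per scenario
    """
    s_load = np.asarray(s_load, dtype=complex)
    v = np.full(s_load.shape, v_source, dtype=complex)  # flat start
    for iteration in range(1, max_iter + 1):
        # Current balance at the load bus: y * (v_source - v) = conj(s / v),
        # rearranged so the new voltage depends on the previous iterate.
        v_next = v_source - np.conj(s_load) / (y_line * np.conj(v))
        if np.max(np.abs(v_next - v)) < tol:
            return v_next, iteration
        v = v_next
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical data: one line impedance and three load scenarios.
y_line = 1.0 / (0.01 + 0.03j)
v_source = 1.0 + 0.0j
s_load = np.array([0.1 + 0.05j, 0.5 + 0.2j, 1.0 + 0.3j])

voltages, iterations = fixed_point_power_flow(y_line, v_source, s_load)

# Recompute the power drawn at the load bus from the solved voltages.
s_check = voltages * np.conj(y_line * (v_source - voltages))
```

Starting from a flat voltage profile, the iteration settles on the high-voltage solution for these lightly loaded scenarios, and `s_check` reproduces the specified loads to within the tolerance.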
-
An Unobtrusive and Lightweight Ear-worn System for Continuous Epileptic Seizure Detection
Authors:
Abdul Aziz,
Nhat Pham,
Neel Vora,
Cody Reynolds,
Jaime Lehnen,
Pooja Venkatesh,
Zhuoran Yao,
Jay Harvey,
Tam Vu,
Kan Ding,
Phuc Nguyen
Abstract:
Epilepsy is one of the most common neurological diseases globally, affecting around 50 million people worldwide. Fortunately, up to 70 percent of people with epilepsy could live seizure-free if properly diagnosed and treated, and a reliable technique to monitor the onset of seizures could improve the quality of life of patients who are constantly facing the fear of random seizure attacks. The scalp-based EEG test, despite being the gold standard for diagnosing epilepsy, is costly, necessitates hospitalization, demands skilled professionals for operation, and is discomforting for users. In this paper, we propose EarSD, a novel lightweight, unobtrusive, and socially acceptable ear-worn system to detect epileptic seizure onsets by measuring the physiological signals from behind the user's ears. EarSD includes an integrated custom-built sensing, computing, and communication PCB to collect and amplify the signals of interest, remove the noises caused by motion artifacts and environmental impacts, and stream the data wirelessly to the computer or mobile phone nearby, where data are uploaded to the host computer for further processing. We conducted both in-lab and in-hospital experiments with epileptic seizure patients who were hospitalized for seizure studies. The preliminary results confirm that EarSD can detect seizures with up to 95.3 percent accuracy by just using classical machine learning algorithms.
Submitted 1 January, 2024;
originally announced January 2024.
-
Real-Time Diagnostic Integrity Meets Efficiency: A Novel Platform-Agnostic Architecture for Physiological Signal Compression
Authors:
Neel R Vora,
Amir Hajighasemi,
Cody T. Reynolds,
Amirmohammad Radmehr,
Mohamed Mohamed,
Jillur Rahman Saurav,
Abdul Aziz,
Jai Prakash Veerla,
Mohammad S Nasr,
Hayden Lotspeich,
Partha Sai Guttikonda,
Thuong Pham,
Aarti Darji,
Parisa Boodaghi Malidarreh,
Helen H Shang,
Jay Harvey,
Kan Ding,
Phuc Nguyen,
Jacob M Luber
Abstract:
Head-based signals such as EEG, EMG, EOG, and ECG collected by wearable systems will play a pivotal role in clinical diagnosis, monitoring, and treatment of important brain disorder diseases.
However, the real-time transmission of the significant corpus of physiological signals over extended periods consumes substantial power and time, limiting the viability of battery-dependent physiological monitoring wearables.
This paper presents a novel deep-learning framework employing a variational autoencoder (VAE) for physiological signal compression to reduce wearables' computational complexity and energy consumption.
Our approach achieves an impressive compression ratio of 1:293 specifically for spectrogram data, surpassing state-of-the-art compression techniques such as JPEG2000, H.264, Discrete Cosine Transform (DCT), and Huffman Encoding, which do not excel in handling physiological signals.
We validate the efficacy of the compression algorithms using physiological signals collected from real patients in the hospital and deploy the solution on commonly used embedded AI chips (i.e., ARM Cortex V8 and Jetson Nano). The proposed framework achieves a 91% seizure detection accuracy using XGBoost, confirming the approach's reliability, practicality, and scalability.
Submitted 4 January, 2024; v1 submitted 19 December, 2023;
originally announced December 2023.
-
IncepSE: Leveraging InceptionTime's performance with Squeeze and Excitation mechanism in ECG analysis
Authors:
Tue Minh Cao,
Nhat Hong Tran,
Le Phi Nguyen,
Hieu Huy Pham,
Hung Thanh Nguyen
Abstract:
Our study focuses on the potential for modifications of Inception-like architectures within the electrocardiogram (ECG) domain. To this end, we introduce IncepSE, a novel network characterized by strategic architectural incorporation that leverages the strengths of both InceptionTime and channel attention mechanisms. Furthermore, we propose a training setup that employs stabilization techniques aimed at tackling the formidable challenges of the severely imbalanced PTB-XL dataset and gradient corruption. By this means, we set a new benchmark for supervised deep learning models across the majority of tasks. Our model consistently surpasses InceptionTime and other state-of-the-art methods in this domain by substantial margins, notably a 0.013 AUROC score improvement in the "all" task, while also mitigating the inherent dataset fluctuations during training.
Submitted 16 November, 2023;
originally announced December 2023.
-
Acousto-optic reconstruction of exterior sound field based on concentric circle sampling with circular harmonic expansion
Authors:
Phuc Duc Nguyen,
Kenji Ishikawa,
Noboru Harada,
Takehiro Moriya
Abstract:
Acousto-optic sensing provides an alternative approach to traditional microphone arrays by shedding light on the interaction of light with an acoustic field. Sound field reconstruction is a fascinating and advanced technique used in acousto-optic sensing. Current challenges in sound-field reconstruction methods pertain to scenarios in which the sound source is located within the reconstruction area, known as the exterior problem. Existing reconstruction algorithms, primarily designed for interior scenarios, often exhibit suboptimal performance when applied to exterior cases. This paper introduces a novel technique for exterior sound-field reconstruction. The proposed method leverages concentric circle sampling and a two-dimensional exterior sound-field reconstruction approach based on circular harmonic expansions. To evaluate the efficacy of this approach, both numerical simulations and practical experiments are conducted. The results highlight the superior accuracy of the proposed method when compared to conventional reconstruction methods, all while utilizing a minimal amount of measured projection data.
Submitted 3 November, 2023;
originally announced November 2023.
-
Bearing-Based Network Localization Under Randomized Gossip Protocol
Authors:
Nhat-Minh Le-Phan,
Minh Hoang Trinh,
Phuoc Doan Nguyen
Abstract:
In this paper, we consider a randomized gossip algorithm for the bearing-based network localization problem. Let each sensor node be able to obtain the bearing vectors and communicate its position estimates with several neighboring agents. Each update involves two agents, and the update sequence follows a stochastic process. Under the assumption that the network is infinitesimally bearing rigid and contains at least two beacon nodes, we show that when the updating step-size is properly selected, the proposed algorithm can successfully estimate the actual sensor nodes' positions with probability one. The randomized update provides a simple, distributed, and cost-effective method for localizing the network. The theoretical result is supported with a simulation of a 1089-node sensor network.
Submitted 17 January, 2024; v1 submitted 27 April, 2023;
originally announced April 2023.
-
Multimodal contrastive learning for diagnosing cardiovascular diseases from electrocardiography (ECG) signals and patient metadata
Authors:
Tue M. Cao,
Nhat H. Tran,
Phi Le Nguyen,
Hieu Pham
Abstract:
This work discusses the use of contrastive learning and deep learning for diagnosing cardiovascular diseases from electrocardiography (ECG) signals. While ECG signals usually contain 12 leads (channels), many healthcare facilities and devices lack access to all 12 leads. This raises the problem of how to use fewer ECG leads to produce meaningful diagnoses with high performance. We introduce a simple experiment to test whether contrastive learning can be applied to this task. More specifically, we added the similarity between the embedding vectors of the 12-lead signal and the reduced-lead ECG signal to the loss function to bring these representations closer together. Despite its simplicity, this improved diagnostic performance with all lead combinations, demonstrating the potential of contrastive learning for this task.
Submitted 18 April, 2023;
originally announced April 2023.
-
Randomized Matrix Weighted Consensus
Authors:
Nhat-Minh Le-Phan,
Minh Hoang Trinh,
Phuoc Doan Nguyen
Abstract:
In this paper, randomized gossip-type matrix-weighted consensus algorithms are proposed for both leaderless and leader-follower topologies. First, we introduce the notion of expected matrix-weighted network, which captures the multi-dimensional interactions between any two agents in a probabilistic sense. Under some mild assumptions on the distribution of the expected matrix weights and the upper bound of the updating step size, the proposed asynchronous pairwise update algorithms drive the network to achieve a consensus in expectation. An upper bound of the $ε$-convergence time of the algorithm is then derived. Furthermore, the proposed algorithms are applied to the bearing-based network localization and formation control problems. The theoretical results are supported by several numerical examples.
Submitted 6 February, 2024; v1 submitted 26 March, 2023;
originally announced March 2023.
-
ISAACS: Iterative Soft Adversarial Actor-Critic for Safety
Authors:
Kai-Chieh Hsu,
Duy Phuong Nguyen,
Jaime Fernández Fisac
Abstract:
The deployment of robots in uncontrolled environments requires them to operate robustly under previously unseen scenarios, like irregular terrain and wind conditions. Unfortunately, while rigorous safety frameworks from robust optimal control theory scale poorly to high-dimensional nonlinear dynamics, control policies computed by more tractable "deep" methods lack guarantees and tend to exhibit little robustness to uncertain operating conditions. This work introduces a novel approach enabling scalable synthesis of robust safety-preserving controllers for robotic systems with general nonlinear dynamics subject to bounded modeling error by combining game-theoretic safety analysis with adversarial reinforcement learning in simulation. Following a soft actor-critic scheme, a safety-seeking fallback policy is co-trained with an adversarial "disturbance" agent that aims to invoke the worst-case realization of model error and training-to-deployment discrepancy allowed by the designer's uncertainty. While the learned control policy does not intrinsically guarantee safety, it is used to construct a real-time safety filter (or shield) with robust safety guarantees based on forward reachability rollouts. This shield can be used in conjunction with a safety-agnostic control policy, precluding any task-driven actions that could result in loss of safety. We evaluate our learning-based safety approach in a 5D race car simulator, compare the learned safety policy to the numerically obtained optimal solution, and empirically validate the robust safety guarantee of our proposed safety shield against worst-case model discrepancy.
Submitted 7 June, 2024; v1 submitted 6 December, 2022;
originally announced December 2022.
-
Robust, General, and Low Complexity Acoustic Scene Classification Systems and An Effective Visualization for Presenting a Sound Scene Context
Authors:
Lam Pham,
Dusan Salovic,
Anahid Jalali,
Alexander Schindler,
Khoa Tran,
Canh Vu,
Phu X. Nguyen
Abstract:
In this paper, we present a comprehensive analysis of Acoustic Scene Classification (ASC), the task of identifying the scene of an audio recording from its acoustic signature. In particular, we first propose an inception-based and low-footprint ASC model, referred to as the ASC baseline. The proposed ASC baseline is then compared with benchmark and high-complexity network architectures of MobileNetV1, MobileNetV2, VGG16, VGG19, ResNet50V2, ResNet152V2, DenseNet121, DenseNet201, and Xception. Next, we improve the ASC baseline by proposing a novel deep neural network architecture which leverages residual-inception architectures and multiple kernels. Given the novel residual-inception (NRI) model, we further evaluate the trade-off between model complexity and model accuracy. Finally, we evaluate whether sound events occurring in a sound scene recording can help to improve ASC accuracy, then indicate how a sound scene context is well presented by combining both sound scene and sound event information. We conduct extensive experiments on various ASC datasets, including Crowded Scenes and the IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE) 2018 Task 1A and 1B, 2019 Task 1A and 1B, 2020 Task 1A, 2021 Task 1A, and 2022 Task 1. The experimental results on several different ASC challenges highlight two main achievements: the first is to propose robust, general, and low-complexity ASC systems which are suitable for real-life applications on a wide range of edge devices and mobiles; the second is to propose an effective visualization method for comprehensively presenting a sound scene context.
Submitted 16 October, 2022;
originally announced October 2022.
-
A Deep Reinforcement Learning-based Adaptive Charging Policy for WRSNs
Authors:
Ngoc Bui,
Phi Le Nguyen,
Viet Anh Nguyen,
Phan Thuan Do
Abstract:
Wireless sensor networks consist of randomly distributed sensor nodes for monitoring targets or areas of interest. Maintaining the network for continuous surveillance is a challenge due to the limited battery capacity in each sensor. Wireless power transfer technology is emerging as a reliable solution for energizing the sensors by deploying a mobile charger (MC) to recharge them. However, designing an optimal charging path for the MC is challenging because of uncertainties arising in the networks. The energy consumption rate of the sensors may fluctuate significantly due to unpredictable changes in the network topology, such as node failures. These changes also lead to shifts in the importance of each sensor, which is often assumed to be the same across sensors in existing works. We address these challenges in this paper by proposing a novel adaptive charging scheme using a deep reinforcement learning (DRL) approach. Specifically, we endow the MC with a charging policy that determines the next sensor to charge conditioned on the current state of the network. We then use a deep neural network to parametrize this charging policy, which is trained by reinforcement learning techniques. Our model can adapt to spontaneous changes in the network topology. The empirical results show that the proposed algorithm outperforms the existing on-demand algorithms by a significant margin.
Submitted 16 August, 2022;
originally announced August 2022.
-
A Soft-Bodied Aerial Robot for Collision Resilience and Contact-Reactive Perching
Authors:
Pham H. Nguyen,
Karishma Patnaik,
Shatadal Mishra,
Panagiotis Polygerinos,
Wenlong Zhang
Abstract:
Current aerial robots demonstrate limited interaction capabilities in unstructured environments when compared with their biological counterparts. Some examples include their inability to tolerate collisions and to successfully land or perch on objects of unknown shapes, sizes, and texture. Efforts to include compliance have introduced designs that incorporate external mechanical impact protection at the cost of reduced agility and flight time due to the added weight. In this work, we propose and develop a light-weight, inflatable, soft-bodied aerial robot (SoBAR) that can pneumatically vary its body stiffness to achieve intrinsic collision resilience. Unlike the conventional rigid aerial robots, SoBAR successfully demonstrates its ability to repeatedly endure and recover from collisions in various directions, not only limited to in-plane ones. Furthermore, we exploit its capabilities to demonstrate perching where the 3D collision resilience helps in improving the perching success rates. We also augment SoBAR with a novel hybrid fabric-based, bistable (HFB) grasper that can utilize impact energies to perform contact-reactive grasping through rapid shape conforming abilities. We exhaustively study and offer insights into the collision resilience, impact absorption, and manipulation capabilities of SoBAR with the HFB grasper. Finally, we compare the performance of conventional aerial robots with the SoBAR through collision characterizations, grasping identifications, and experimental validations of collision resilience and perching in various scenarios and on differently shaped objects.
Submitted 4 January, 2023; v1 submitted 27 April, 2022;
originally announced April 2022.
-
SHREC 2021: Classification in cryo-electron tomograms
Authors:
Ilja Gubins,
Marten L. Chaillet,
Gijs van der Schot,
M. Cristina Trueba,
Remco C. Veltkamp,
Friedrich Förster,
Xiao Wang,
Daisuke Kihara,
Emmanuel Moebel,
Nguyen P. Nguyen,
Tommi White,
Filiz Bunyak,
Giorgos Papoulias,
Stavros Gerolymatos,
Evangelia I. Zacharaki,
Konstantinos Moustakas,
Xiangrui Zeng,
Sinuo Liu,
Min Xu,
Yaoyu Wang,
Cheng Chen,
Xuefeng Cui,
Fa Zhang
Abstract:
Cryo-electron tomography (cryo-ET) is an imaging technique that allows three-dimensional visualization of macro-molecular assemblies under near-native conditions. Cryo-ET comes with a number of challenges, mainly a low signal-to-noise ratio and the inability to obtain images from all angles. Computational methods are key to analyzing cryo-electron tomograms.
To promote innovation in computational methods, we generate a novel simulated dataset to benchmark different methods of localization and classification of biological macromolecules in tomograms. Our publicly available dataset contains ten tomographic reconstructions of simulated cell-like volumes. Each volume contains twelve different types of complexes, varying in size, function and structure.
In this paper, we have evaluated seven different methods of finding and classifying proteins. Seven research groups present results obtained with learning-based methods and trained on the simulated dataset, as well as a baseline template matching (TM), a traditional method widely used in cryo-ET research. We show that learning-based approaches can achieve notably better localization and classification performance than TM. We also experimentally confirm that there is a negative relationship between particle size and performance for all methods.
Submitted 18 March, 2022;
originally announced March 2022.
-
Sound-Dr: Reliable Sound Dataset and Baseline Artificial Intelligence System for Respiratory Illnesses
Authors:
Truong V. Hoang,
Quang H. Nguyen,
Cuong Q. Nguyen,
Phong X. Nguyen,
Hoang D. Nguyen
Abstract:
As the burden of respiratory diseases continues to fall on society worldwide, this paper proposes a high-quality and reliable dataset of human sounds for studying respiratory illnesses, including pneumonia and COVID-19. It consists of coughing, mouth breathing, and nose breathing sounds together with metadata on related clinical characteristics. We also develop a proof-of-concept system for establishing baselines and benchmarking against multiple datasets, such as Coswara and COUGHVID. Our comprehensive experiments show that the Sound-Dr dataset has richer features, better performance, and is more robust to dataset shifts in various machine learning tasks. It is promising for a wide range of real-time applications on mobile devices. The proposed dataset and system will serve as practical tools to support healthcare professionals in diagnosing respiratory disorders. The dataset and code are publicly available here: https://github.com/ReML-AI/Sound-Dr/.
Submitted 4 August, 2023; v1 submitted 12 January, 2022;
originally announced January 2022.
-
An Audio-Visual Dataset and Deep Learning Frameworks for Crowded Scene Classification
Authors:
Lam Pham,
Dat Ngo,
Phu X. Nguyen,
Truong Hoang,
Alexander Schindler
Abstract:
This paper presents a task of audio-visual scene classification (SC) where input videos are classified into one of five real-life crowded scenes: 'Riot', 'Noise-Street', 'Firework-Event', 'Music-Event', and 'Sport-Atmosphere'. To this end, we first collect an audio-visual dataset (videos) of these five crowded contexts from YouTube (in-the-wild scenes). Then, a wide range of deep learning frameworks are proposed to deploy either audio or visual input data independently. Finally, results obtained from high-performing deep learning frameworks are fused to achieve the best accuracy score. Our experimental results indicate that audio and visual input factors independently contribute to the SC task's performance. Significantly, an ensemble of deep learning frameworks exploring either audio or visual input data can achieve the best accuracy of 95.7%.
Submitted 16 December, 2021;
originally announced December 2021.
-
CCS-GAN: COVID-19 CT-scan classification with very few positive training images
Authors:
Sumeet Menon,
Jayalakshmi Mangalagiri,
Josh Galita,
Michael Morris,
Babak Saboury,
Yaacov Yesha,
Yelena Yesha,
Phuong Nguyen,
Aryya Gangopadhyay,
David Chapman
Abstract:
We present a novel algorithm that is able to classify COVID-19 pneumonia from CT scan slices using a very small sample of training images exhibiting COVID-19 pneumonia in tandem with a larger number of normal images. This algorithm is able to achieve high classification accuracy using as few as 10 positive training slices (from 10 positive cases), which to the best of our knowledge is one order of magnitude fewer than the next closest published work at the time of writing. Deep learning with extremely small positive training volumes is a very difficult problem and has been an important topic during the COVID-19 pandemic, because for quite some time it was difficult to obtain large volumes of COVID-19 positive images for training. Algorithms that can learn to screen for diseases using few examples are an important area of research. We present the Cycle Consistent Segmentation Generative Adversarial Network (CCS-GAN). CCS-GAN combines style transfer with pulmonary segmentation and relevant transfer learning from negative images in order to create a larger volume of synthetic positive images for the purposes of improving diagnostic classification performance. A VGG-19 classifier plus CCS-GAN was trained using a small sample of positive image slices ranging from at most 50 down to as few as 10 COVID-19 positive CT-scan images. CCS-GAN achieves high accuracy with few positive images and thereby greatly reduces the barrier of acquiring large training volumes in order to train a diagnostic classifier for COVID-19.
Submitted 1 October, 2021;
originally announced October 2021.
-
Automated Workers Ergonomic Risk Assessment in Manual Material Handling using sEMG Wearable Sensors and Machine Learning
Authors:
Srimantha E. Mudiyanselage,
Phuong H. D. Nguyen,
Mohammad Sadra Rajabi,
Reza Akhavian
Abstract:
Manual material handling tasks have the potential to be highly unsafe from an ergonomic viewpoint. Safety inspections to monitor body postures can help mitigate ergonomic risks of material handling. However, the real effect of awkward muscle movements, strains, and excessive forces that may result in an injury may not be identified by external cues. This paper evaluates the ability of surface electromyogram (EMG)-based systems together with machine learning algorithms to automatically detect body movements that may harm muscles in material handling. The analysis utilized a lifting equation developed by the U.S. National Institute for Occupational Safety and Health (NIOSH). This equation determines a Recommended Weight Limit, which suggests the maximum acceptable weight that a healthy worker can lift and carry as well as a Lifting Index value to assess the risk extent. Four different machine learning models, namely Decision Tree, Support Vector Machine, K-Nearest Neighbor, and Random Forest are developed to classify the risk assessments calculated based on the NIOSH lifting equation. The sensitivity of the models to various parameters is also evaluated to find the best performance using each algorithm. Results indicate that Decision Tree models have the potential to predict the risk level with close to 99.35% accuracy.
Submitted 27 September, 2021;
originally announced September 2021.
-
Back to the Future: Efficient, Time-Consistent Solutions in Reach-Avoid Games
Authors:
Dennis R. Anthony,
Duy P. Nguyen,
David Fridovich-Keil,
Jaime F. Fisac
Abstract:
We study the class of reach-avoid dynamic games in which multiple agents interact noncooperatively, and each wishes to satisfy a distinct target criterion while avoiding a failure criterion. Reach-avoid games are commonly used to express safety-critical optimal control problems found in mobile robot motion planning. Here, we focus on finding time-consistent solutions, in which future motion plans remain optimal even when a robot diverges from the plan early on due to, e.g., intrinsic dynamic uncertainty or extrinsic environment disturbances. Our main contribution is a computationally-efficient algorithm for multi-agent reach-avoid games which renders time-consistent solutions for all players. We demonstrate our approach in two- and three-player simulated driving scenarios, in which our method provides safe control strategies for all agents.
Submitted 2 March, 2022; v1 submitted 15 September, 2021;
originally announced September 2021.
-
Behavior Analysis and Design of Concrete-Filled Steel Circular-Tube Short Columns Subjected to Axial Compression
Authors:
Duc-Duy Pham,
Phu-Cuong Nguyen
Abstract:
In this paper, a new finite element (FE) model using ABAQUS software was developed to investigate the compressive behavior of Concrete-Filled Steel Circular-Tube (CFSCT) columns. Experimental studies indicated that the confinement offered by the circular steel tube in a CFSCT column increased both the strength and ductility of the filled concrete. Based on a database of 663 test results for CFSCT columns under axial compression collected from the available literature, a formula is proposed to determine the lateral confining pressures on the concrete. The Concrete-Damaged Plasticity Model (CDPM) and the parameters available in ABAQUS are used in the analysis. From the analysis results, a formula is proposed for predicting the ultimate load by determining intensification and diminution for concrete and steel. The proposed formula is then compared with the FE model, the previous study, and the current design codes in strength prediction of CFSCT columns under compression. The comparison shows that the FE model and the proposed formula are more stable and accurate than the previous study and current standards for both normal- and high-strength materials.
Submitted 14 July, 2021;
originally announced July 2021.
-
Toward Generating Synthetic CT Volumes using a 3D-Conditional Generative Adversarial Network
Authors:
Jayalakshmi Mangalagiri,
David Chapman,
Aryya Gangopadhyay,
Yaacov Yesha,
Joshua Galita,
Sumeet Menon,
Yelena Yesha,
Babak Saboury,
Michael Morris,
Phuong Nguyen
Abstract:
We present a novel conditional Generative Adversarial Network (cGAN) architecture that is capable of generating 3D Computed Tomography scans in voxels from noisy and/or pixelated approximations and with the potential to generate full synthetic 3D scan volumes. We believe cGAN to be a tractable approach to generate 3D CT volumes, even though the problem of generating full resolution deep fakes is presently impractical due to GPU memory limitations. We present results for autoencoder, denoising, and depixelating tasks which are trained and tested on two novel COVID19 CT datasets. In our evaluation, the Peak Signal-to-Noise Ratio (PSNR) ranges from 12.53 to 46.46 dB, and the Structural Similarity Index (SSIM) ranges from 0.89 to 1.
Submitted 2 April, 2021;
originally announced April 2021.
-
Enhancement of Distribution System State Estimation Using Pruned Physics-Aware Neural Networks
Authors:
Minh-Quan Tran,
Ahmed S. Zamzam,
Phuong H. Nguyen
Abstract:
Realizing complete observability in the three-phase distribution system remains a challenge that hinders the implementation of classic state estimation algorithms. In this paper, a new method, called the pruned physics-aware neural network (P2N2), is developed to improve the voltage estimation accuracy in the distribution system. The method relies on the physical grid topology, which is used to design the connections between different hidden layers of a neural network model. To verify the proposed method, a numerical simulation based on one-year smart meter data of load consumptions for three-phase power flow is developed to generate the measurement and voltage state data. The IEEE 123-node system is selected as the test network to benchmark the proposed algorithm against the classic weighted least squares (WLS). Numerical results show that P2N2 outperforms WLS in terms of data redundancy and estimation accuracy.
Submitted 15 October, 2021; v1 submitted 7 February, 2021;
originally announced February 2021.
-
Lung Nodule Classification Using Biomarkers, Volumetric Radiomics and 3D CNNs
Authors:
Kushal Mehta,
Arshita Jain,
Jayalakshmi Mangalagiri,
Sumeet Menon,
Phuong Nguyen,
David R. Chapman
Abstract:
We present a hybrid algorithm to estimate lung nodule malignancy that combines imaging biomarkers from Radiologist's annotation with image classification of CT scans. Our algorithm employs a 3D Convolutional Neural Network (CNN) as well as a Random Forest in order to combine CT imagery with biomarker annotation and volumetric radiomic features. We analyze and compare the performance of the algorithm using only imagery, only biomarkers, combined imagery + biomarkers, combined imagery + volumetric radiomic features and finally the combination of imagery + biomarkers + volumetric features in order to classify the suspicion level of nodule malignancy. The National Cancer Institute (NCI) Lung Image Database Consortium (LIDC) IDRI dataset is used to train and evaluate the classification task. We show that the incorporation of semi-supervised learning by means of K-Nearest-Neighbors (KNN) can increase the available training sample size of the LIDC-IDRI thereby further improving the accuracy of malignancy estimation of most of the models tested although there is no significant improvement with the use of KNN semi-supervised learning if image classification with CNNs and volumetric features are combined with descriptive biomarkers. Unexpectedly, we also show that a model using image biomarkers alone is more accurate than one that combines biomarkers with volumetric radiomics, 3D CNNs, and semi-supervised learning. We discuss the possibility that this result may be influenced by cognitive bias in LIDC-IDRI because malignancy estimates were recorded by the same radiologist panel as biomarkers, as well as future work to incorporate pathology information over a subset of study participants.
Submitted 19 October, 2020;
originally announced October 2020.
-
Generating Realistic COVID19 X-rays with a Mean Teacher + Transfer Learning GAN
Authors:
Sumeet Menon,
Joshua Galita,
David Chapman,
Aryya Gangopadhyay,
Jayalakshmi Mangalagiri,
Phuong Nguyen,
Yaacov Yesha,
Yelena Yesha,
Babak Saboury,
Michael Morris
Abstract:
COVID-19 is a novel infectious disease responsible for over 800K deaths worldwide as of August 2020. The need for rapid testing is a high priority and alternative testing strategies including X-ray image classification are a promising area of research. However, at present, public datasets for COVID19 x-ray images have low data volumes, making it challenging to develop accurate image classifiers. Several recent papers have made use of Generative Adversarial Networks (GANs) in order to increase the training data volumes, but realistic synthetic COVID19 X-rays remain challenging to generate. We present a novel Mean Teacher + Transfer GAN (MTT-GAN) that generates COVID19 chest X-ray images of high quality. In order to create a more accurate GAN, we employ transfer learning from the Kaggle Pneumonia X-Ray dataset, a highly relevant data source orders of magnitude larger than public COVID19 datasets. Furthermore, we employ the Mean Teacher algorithm as a constraint to improve stability of training. Our qualitative analysis shows that the MTT-GAN generates X-ray images that are greatly superior to a baseline GAN and visually comparable to real X-rays, although board-certified radiologists can distinguish MTT-GAN fakes from real COVID19 X-rays. Quantitative analysis shows that MTT-GAN greatly improves the accuracy of both a binary COVID19 classifier and a multi-class Pneumonia classifier as compared to a baseline GAN. Our classification accuracy is favourable as compared to recently reported results in the literature for similar binary and multi-class COVID19 screening tasks.
Submitted 25 September, 2020;
originally announced September 2020.
-
PCA Reduced Gaussian Mixture Models with Applications in Superresolution
Authors:
Johannes Hertrich,
Dang Phoung Lan Nguyen,
Jean-Fancois Aujol,
Dominique Bernard,
Yannick Berthoumieu,
Abdellatif Saadaldin,
Gabriele Steidl
Abstract:
Despite the rapid development of computational hardware, the treatment of large and high dimensional data sets is still a challenging problem. This paper provides a twofold contribution to the topic. First, we propose a Gaussian Mixture Model in conjunction with a reduction of the dimensionality of the data in each component of the model by principal component analysis, called PCA-GMM. To learn the (low dimensional) parameters of the mixture model we propose an EM algorithm whose M-step requires the solution of constrained optimization problems. Fortunately, these constrained problems do not depend on the usually large number of samples and can be solved efficiently by an (inertial) proximal alternating linearized minimization algorithm. Second, we apply our PCA-GMM for the superresolution of 2D and 3D material images based on the approach of Sandeep and Jacob. Numerical results confirm the moderate influence of the dimensionality reduction on the overall superresolution result.
Submitted 6 May, 2021; v1 submitted 16 September, 2020;
originally announced September 2020.
-
A novel approach to remove foreign objects from chest X-ray images
Authors:
Hieu X. Le,
Phuong D. Nguyen,
Thang H. Nguyen,
Khanh N. Q. Le,
Thanh T. Nguyen
Abstract:
We initially proposed a deep learning approach for foreign-object inpainting in smartphone-camera-captured chest radiographs utilizing the cheXphoto dataset. Foreign objects, which can significantly affect the quality of a computer-aided diagnostic prediction, are captured under various settings. In this paper, we use a multi-method approach to tackle both removal and inpainting in chest radiographs. First, an object detection model is trained to separate the foreign objects from the given image. Subsequently, the binary mask of each object is extracted utilizing a segmentation model. Each pair of binary mask and extracted object is then used for inpainting. Finally, the inpainted regions are merged back into the original image, resulting in a clean output free of foreign objects. To conclude, we achieved state-of-the-art accuracy. The experimental results suggest possible new applications of this method to chest X-ray image detection.
Submitted 15 August, 2020;
originally announced August 2020.
-
RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions
Authors:
Chung-Cheng Chiu,
Arun Narayanan,
Wei Han,
Rohit Prabhavalkar,
Yu Zhang,
Navdeep Jaitly,
Ruoming Pang,
Tara N. Sainath,
Patrick Nguyen,
Liangliang Cao,
Yonghui Wu
Abstract:
In recent years, all-neural end-to-end approaches have obtained state-of-the-art results on several challenging automatic speech recognition (ASR) tasks. However, most existing works focus on building ASR models where train and test data are drawn from the same domain. This results in poor generalization characteristics on mismatched domains: e.g., end-to-end models trained on short segments perform poorly when evaluated on longer utterances. In this work, we analyze the generalization properties of streaming and non-streaming recurrent neural network transducer (RNN-T) based end-to-end models in order to identify model components that negatively affect generalization performance. We propose two solutions: combining multiple regularization techniques during training, and using dynamic overlapping inference. On a long-form YouTube test set, when the non-streaming RNN-T model is trained with shorter segments of data, the proposed combination improves word error rate (WER) from 22.3% to 14.8%; when the streaming RNN-T model is trained on short Search queries, the proposed techniques improve WER on the YouTube set from 67.0% to 25.3%. Finally, when trained on Librispeech, we find that dynamic overlapping inference improves WER on YouTube from 99.8% to 33.0%.
Submitted 23 December, 2020; v1 submitted 7 May, 2020;
originally announced May 2020.
-
Pre-processing Image using Brightening, CLAHE and RETINEX
Authors:
Thi Phuoc Hanh Nguyen,
Zinan Cai,
Khanh Nguyen,
Sokuntheariddh Keth,
Ningyuan Shen,
Mira Park
Abstract:
This paper focuses on finding the optimal pre-processing method among three common algorithms for image enhancement: Brightening, CLAHE and Retinex. For the purpose of image training in general, these methods are combined to find the optimal method for image enhancement. We have carried out research on the different permutations of the three methods: Brightening, CLAHE and Retinex. The evaluation is based on Canny Edge detection applied to all processed images. The sharpness of objects is then judged by comparing the number of true positive pixels between images. After applying different combinations of the pre-processing functions to images, CLAHE proves to be the most effective in improving edges, Brightening does not show much effect on edge enhancement, and Retinex even reduces the sharpness of images and contributes little to image enhancement.
Submitted 22 March, 2020;
originally announced March 2020.
-
UAV-Assisted Secure Communications in Terrestrial Cognitive Radio Networks: Joint Power Control and 3D Trajectory Optimization
Authors:
Phu X. Nguyen,
Van-Dinh Nguyen,
Hieu V. Nguyen,
Oh-Soon Shin
Abstract:
This paper considers secure communications for an underlay cognitive radio network (CRN) in the presence of an external eavesdropper (Eve). The secrecy performance of CRNs is usually limited by the primary receiver's interference power constraint. To overcome this issue, we propose to use an unmanned aerial vehicle (UAV) as a friendly jammer to interfere with Eve in decoding the confidential message from the secondary transmitter (ST). Our goal is to jointly optimize the transmit power and UAV's trajectory in the three-dimensional (3D) space to maximize the average achievable secrecy rate of the secondary system. The formulated optimization problem is nonconvex due to the nonconvexity of the objective and nonconvexity of constraints, which is very challenging to solve. To obtain a suboptimal but efficient solution to the problem, we first transform the original problem into a more tractable form and develop an iterative algorithm for its solution by leveraging the inner approximation framework. We further extend the proposed algorithm to the case of imperfect location information of Eve, where the average worst-case secrecy rate is considered as the objective function. Extensive numerical results are provided to demonstrate the merits of the proposed algorithms over existing approaches.
Submitted 25 March, 2020; v1 submitted 21 March, 2020;
originally announced March 2020.
-
Improved motion correction for functional MRI using an omnibus regression model
Authors:
Vyom Raval,
Kevin P. Nguyen,
Albert Montillo
Abstract:
Head motion during functional Magnetic Resonance Imaging acquisition can significantly contaminate the neural signal and introduce spurious, distance-dependent changes in signal correlations. This can heavily confound studies of development, aging, and disease. Previous approaches to suppress head motion artifacts have involved sequential regression of nuisance covariates, but this has been shown to reintroduce artifacts. We propose a new motion correction pipeline using an omnibus regression model that avoids this problem by simultaneously regressing out multiple artifacts using the best performing algorithms to estimate each artifact. We quantitatively evaluate its motion artifact suppression performance against sequential regression pipelines using a large heterogeneous dataset (n=151) which includes high-motion subjects and multiple disease phenotypes. The proposed concatenated regression pipeline significantly reduces the association between head motion and functional connectivity while significantly outperforming the traditional sequential regression pipelines in eliminating distance-dependent head motion artifacts.
Submitted 21 January, 2020; v1 submitted 22 November, 2019;
originally announced November 2019.
-
Prediction of individual progression rate in Parkinson's disease using clinical measures and biomechanical measures of gait and postural stability
Authors:
Vyom Raval,
Kevin P. Nguyen,
Ashley Gerald,
Richard B. Dewey Jr.,
Albert Montillo
Abstract:
Parkinson's disease (PD) is a common neurological disorder characterized by gait impairment. PD has no cure, and an impediment to developing a treatment is the lack of any accepted method to predict disease progression rate. The primary aim of this study was to develop a model using clinical measures and biomechanical measures of gait and postural stability to predict an individual's PD progression over two years. Data from 160 PD subjects were utilized. Machine learning models, including XGBoost and Feed Forward Neural Networks, were developed using extensive model optimization and cross-validation. The highest performing model was a neural network that used a group of clinical measures, achieved a PPV of 71% in identifying fast progressors, and explained a large portion (37%) of the variance in an individual's progression rate on held-out test data. This demonstrates the potential to predict individual PD progression rate and enrich trials by analyzing clinical and biomechanical measures with machine learning.
Submitted 22 November, 2019;
originally announced November 2019.
-
A comparison of end-to-end models for long-form speech recognition
Authors:
Chung-Cheng Chiu,
Wei Han,
Yu Zhang,
Ruoming Pang,
Sergey Kishchenko,
Patrick Nguyen,
Arun Narayanan,
Hank Liao,
Shuyuan Zhang,
Anjuli Kannan,
Rohit Prabhavalkar,
Zhifeng Chen,
Tara Sainath,
Yonghui Wu
Abstract:
End-to-end automatic speech recognition (ASR) models, including both attention-based models and the recurrent neural network transducer (RNN-T), have shown superior performance compared to conventional systems. However, previous studies have focused primarily on short utterances that typically last for just a few seconds or, at most, a few tens of seconds. Whether such architectures are practical on long utterances that last from minutes to hours remains an open question. In this paper, we both investigate and improve the performance of end-to-end models on long-form transcription. We first present an empirical comparison of different end-to-end models on a real world long-form task and demonstrate that the RNN-T model is much more robust than attention-based systems in this regime. We next explore two improvements to attention-based systems that significantly improve their performance: restricting the attention to be monotonic, and applying a novel decoding algorithm that breaks long utterances into shorter overlapping segments. Combining these two improvements, we show that attention-based end-to-end models can be very competitive with RNN-T on long-form speech recognition.
Submitted 6 November, 2019;
originally announced November 2019.
-
Anatomically-Informed Data Augmentation for functional MRI with Applications to Deep Learning
Authors:
Kevin P. Nguyen,
Cherise Chin Fatt,
Alex Treacher,
Cooper Mellema,
Madhukar H. Trivedi,
Albert Montillo
Abstract:
The application of deep learning to build accurate predictive models from functional neuroimaging data is often hindered by limited dataset sizes. Though data augmentation can help mitigate such training obstacles, most data augmentation methods have been developed for natural images as in computer vision tasks such as CIFAR, not for medical images. This work helps fill this gap by proposing a method for generating new functional Magnetic Resonance Images (fMRI) with realistic brain morphology. This method is tested on a challenging task of predicting antidepressant treatment response from pre-treatment task-based fMRI and demonstrates a 26% improvement in performance in predicting response using augmented images. This improvement compares favorably to state-of-the-art augmentation methods for natural images. Through an ablative test, augmentation is also shown to substantively improve performance when applied before hyperparameter optimization. These results suggest the optimal order of operations and support the role of data augmentation methods in improving predictive performance in tasks using fMRI.
Submitted 17 October, 2019;
originally announced October 2019.
-
BUZz: BUffer Zones for defending adversarial examples in image classification
Authors:
Kaleel Mahmood,
Phuong Ha Nguyen,
Lam M. Nguyen,
Thanh Nguyen,
Marten van Dijk
Abstract:
We propose a novel defense against all existing gradient based adversarial attacks on deep neural networks for image classification problems. Our defense is based on a combination of deep neural networks and simple image transformations. While straightforward in implementation, this defense yields a unique security property which we term buffer zones. We argue that our defense based on buffer zones offers significant improvements over state-of-the-art defenses. We are able to achieve this improvement even when the adversary has access to the {\em entire} original training data set and unlimited query access to the defense. We verify our claim through experimentation using Fashion-MNIST and CIFAR-10: We demonstrate $<11\%$ attack success rate -- significantly lower than what other well-known state-of-the-art defenses offer -- at only a price of a $11-18\%$ drop in clean accuracy. By using a new intuitive metric, we explain why this trade-off offers a significant improvement over prior work.
Submitted 16 June, 2020; v1 submitted 3 October, 2019;
originally announced October 2019.
-
DeepUSPS: Deep Robust Unsupervised Saliency Prediction With Self-Supervision
Authors:
Duc Tam Nguyen,
Maximilian Dax,
Chaithanya Kumar Mummadi,
Thi Phuong Nhung Ngo,
Thi Hoai Phuong Nguyen,
Zhongyu Lou,
Thomas Brox
Abstract:
Deep neural network (DNN) based salient object detection in images based on high-quality labels is expensive. Alternative unsupervised approaches rely on careful selection of multiple handcrafted saliency methods to generate noisy pseudo-ground-truth labels. In this work, we propose a two-stage mechanism for robust unsupervised object saliency prediction, where the first stage involves refinement of the noisy pseudo labels generated from different handcrafted methods. Each handcrafted method is substituted by a deep network that learns to generate the pseudo labels. These labels are refined incrementally in multiple iterations via our proposed self-supervision technique. In the second stage, the refined labels produced from multiple networks representing multiple saliency methods are used to train the actual saliency detection network. We show that this self-learning procedure outperforms all the existing unsupervised methods over different datasets. Results are even comparable to those of fully-supervised state-of-the-art approaches. The code is available at https://tinyurl.com/wtlhgo3 .
Submitted 15 March, 2021; v1 submitted 28 September, 2019;
originally announced September 2019.
-
Computation Offloading and Resource Allocation for Backhaul Limited Cooperative MEC Systems
Authors:
Phuong-Duy Nguyen,
Vu Nguyen Ha,
Long Bao Le
Abstract:
In this paper, we jointly optimize computation offloading and resource allocation to minimize the weighted sum of energy consumption of all mobile users in a backhaul limited cooperative MEC system with multiple fog servers. Considering the partial offloading strategy and TDMA transmission at each base station, the underlying optimization problem with constraints on maximum task latency and limited computation resource at mobile users and fog servers is non-convex. We propose to convexify the problem exploiting the relationship among some optimization variables from which an optimal algorithm is proposed to solve the resulting problem. We then present numerical results to demonstrate the significant gains of our proposed design compared to conventional designs without exploiting cooperation among fog servers and a greedy algorithm.
Submitted 22 June, 2019;
originally announced June 2019.
-
Hierarchical Generative Modeling for Controllable Speech Synthesis
Authors:
Wei-Ning Hsu,
Yu Zhang,
Ron J. Weiss,
Heiga Zen,
Yonghui Wu,
Yuxuan Wang,
Yuan Cao,
Ye Jia,
Zhifeng Chen,
Jonathan Shen,
Patrick Nguyen,
Ruoming Pang
Abstract:
This paper proposes a neural sequence-to-sequence text-to-speech (TTS) model which can control latent attributes in the generated speech that are rarely annotated in the training data, such as speaking style, accent, background noise, and recording conditions. The model is formulated as a conditional generative model based on the variational autoencoder (VAE) framework, with two levels of hierarchical latent variables. The first level is a categorical variable, which represents attribute groups (e.g. clean/noisy) and provides interpretability. The second level, conditioned on the first, is a multivariate Gaussian variable, which characterizes specific attribute configurations (e.g. noise level, speaking rate) and enables disentangled fine-grained control over these attributes. This amounts to using a Gaussian mixture model (GMM) for the latent distribution. Extensive evaluation demonstrates its ability to control the aforementioned attributes. In particular, we train a high-quality controllable TTS model on real found data, which is capable of inferring speaker and style attributes from a noisy utterance and use it to synthesize clean speech with controllable speaking style.
Submitted 27 December, 2018; v1 submitted 16 October, 2018;
originally announced October 2018.
-
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Authors:
Ye Jia,
Yu Zhang,
Ron J. Weiss,
Quan Wang,
Jonathan Shen,
Fei Ren,
Zhifeng Chen,
Patrick Nguyen,
Ruoming Pang,
Ignacio Lopez Moreno,
Yonghui Wu
Abstract:
We describe a neural network-based system for text-to-speech (TTS) synthesis that is able to generate speech audio in the voice of many different speakers, including those unseen during training. Our system consists of three independently trained components: (1) a speaker encoder network, trained on a speaker verification task using an independent dataset of noisy speech from thousands of speakers without transcripts, to generate a fixed-dimensional embedding vector from seconds of reference speech from a target speaker; (2) a sequence-to-sequence synthesis network based on Tacotron 2, which generates a mel spectrogram from text, conditioned on the speaker embedding; (3) an auto-regressive WaveNet-based vocoder that converts the mel spectrogram into a sequence of time domain waveform samples. We demonstrate that the proposed model is able to transfer the knowledge of speaker variability learned by the discriminatively-trained speaker encoder to the new task, and is able to synthesize natural speech from speakers that were not seen during training. We quantify the importance of training the speaker encoder on a large and diverse speaker set in order to obtain the best generalization performance. Finally, we show that randomly sampled speaker embeddings can be used to synthesize speech in the voice of novel speakers dissimilar from those used in training, indicating that the model has learned a high quality speaker representation.
Submitted 2 January, 2019; v1 submitted 12 June, 2018;
originally announced June 2018.
-
An Efficient Spectral Leakage Filtering for IEEE 802.11af in TV White Space
Authors:
Phu Xuan Nguyen,
Thinh Hung Pham,
Trang Hoang,
Oh-Soon Shin
Abstract:
Orthogonal frequency division multiplexing (OFDM) has been widely adopted for modern wireless standards and has become a key enabling technology for cognitive radios. However, one of its main drawbacks is significant spectral leakage due to the accumulation of multiple sinc-shaped subcarriers. In this paper, we present a novel pulse shaping scheme for efficient spectral leakage suppression in the OFDM-based physical layer of the IEEE 802.11af standard. With conventional pulse shaping filters such as a raised-cosine filter, vestigial symmetry can be used to reduce spectral leakage very effectively. However, these pulse shaping filters require a long guard interval, i.e., cyclic prefix in an OFDM system, to avoid inter-symbol interference (ISI), resulting in a loss of spectral efficiency. The proposed method, based on asymmetric pulse shaping, achieves better spectral leakage suppression and decreases the ISI caused by filtering as compared to conventional pulse shaping filters.
Submitted 22 December, 2017;
originally announced December 2017.
-
An analysis of incorporating an external language model into a sequence-to-sequence model
Authors:
Anjuli Kannan,
Yonghui Wu,
Patrick Nguyen,
Tara N. Sainath,
Zhifeng Chen,
Rohit Prabhavalkar
Abstract:
Attention-based sequence-to-sequence models for automatic speech recognition jointly train an acoustic model, language model, and alignment mechanism. Thus, the language model component is only trained on transcribed audio-text pairs. This leads to the use of shallow fusion with an external language model at inference time. Shallow fusion refers to log-linear interpolation with a separately trained language model at each step of the beam search. In this work, we investigate the behavior of shallow fusion across a range of conditions: different types of language models, different decoding units, and different tasks. On Google Voice Search, we demonstrate that the use of shallow fusion with a neural LM with wordpieces yields a 9.1% relative word error rate reduction (WERR) over our competitive attention-based sequence-to-sequence model, obviating the need for second-pass rescoring.
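The log-linear interpolation described in the abstract can be illustrated for a single beam-search step. This is a minimal sketch, not the authors' implementation: the function name `shallow_fusion_scores`, the toy vocabulary, the probabilities, and the interpolation weight `lm_weight` are all assumptions chosen for illustration.

```python
import math

def shallow_fusion_scores(asr_log_probs, lm_log_probs, lm_weight=0.5):
    """Score each candidate token as log p_asr + lm_weight * log p_lm.

    Both inputs are dicts mapping a candidate token to its log-probability
    at the current decoding step. `lm_weight` is a hypothetical tuning knob.
    """
    return {
        tok: asr_log_probs[tok] + lm_weight * lm_log_probs[tok]
        for tok in asr_log_probs
    }

# Toy distributions for one decoding step (made-up numbers).
asr = {"cat": math.log(0.5), "cap": math.log(0.4), "cab": math.log(0.1)}
lm = {"cat": math.log(0.2), "cap": math.log(0.7), "cab": math.log(0.1)}

fused = shallow_fusion_scores(asr, lm, lm_weight=0.5)
best_without_lm = max(asr, key=asr.get)
best_with_lm = max(fused, key=fused.get)
```

In this toy case the sequence-to-sequence model alone prefers one token, while the fused score prefers another that the external language model considers more likely. In a real decoder the same combination would be applied to every hypothesis in the beam at every step.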
Submitted 5 December, 2017;
originally announced December 2017.
-
No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models
Authors:
Tara N. Sainath,
Rohit Prabhavalkar,
Shankar Kumar,
Seungji Lee,
Anjuli Kannan,
David Rybach,
Vlad Schogol,
Patrick Nguyen,
Bo Li,
Yonghui Wu,
Zhifeng Chen,
Chung-Cheng Chiu
Abstract:
For decades, context-dependent phonemes have been the dominant sub-word unit for conventional acoustic modeling systems. This status quo has begun to be challenged recently by end-to-end models which seek to combine acoustic, pronunciation, and language model components into a single neural network. Such systems, which typically predict graphemes or words, simplify the recognition process since they remove the need for a separate expert-curated pronunciation lexicon to map from phoneme-based units to words. However, there has been little previous work comparing phoneme-based versus grapheme-based sub-word units in the end-to-end modeling framework, to determine whether the gains from such approaches are primarily due to the new probabilistic model, or from the joint learning of the various components with grapheme-based units.
In this work, we conduct detailed experiments which are aimed at quantifying the value of phoneme-based pronunciation lexica in the context of end-to-end models. We examine phoneme-based end-to-end models, which are contrasted against grapheme-based ones on a large vocabulary English Voice-search task, where we find that graphemes do indeed outperform phonemes. We also compare grapheme and phoneme-based approaches on a multi-dialect English task, which once again confirm the superiority of graphemes, greatly simplifying the system for recognizing multiple dialects.
Submitted 5 December, 2017;
originally announced December 2017.
-
Minimum Word Error Rate Training for Attention-based Sequence-to-Sequence Models
Authors:
Rohit Prabhavalkar,
Tara N. Sainath,
Yonghui Wu,
Patrick Nguyen,
Zhifeng Chen,
Chung-Cheng Chiu,
Anjuli Kannan
Abstract:
Sequence-to-sequence models, such as attention-based models in automatic speech recognition (ASR), are typically trained to optimize the cross-entropy criterion which corresponds to improving the log-likelihood of the data. However, system performance is usually measured in terms of word error rate (WER), not log-likelihood. Traditional ASR systems benefit from discriminative sequence training which optimizes criteria such as the state-level minimum Bayes risk (sMBR) which are more closely related to WER. In the present work, we explore techniques to train attention-based models to directly minimize expected word error rate. We consider two loss functions which approximate the expected number of word errors: either by sampling from the model, or by using N-best lists of decoded hypotheses, which we find to be more effective than the sampling-based method. In experimental evaluations, we find that the proposed training procedure improves performance by up to 8.2% relative to the baseline system. This allows us to train grapheme-based, uni-directional attention-based models which match the performance of a traditional, state-of-the-art, discriminative sequence-trained system on a mobile voice-search task.
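The N-best approximation of the expected number of word errors mentioned in the abstract can be sketched as follows. This is an illustrative reading, not the paper's code: it assumes the hypothesis probabilities are renormalized over the N-best list, and the function name and the toy numbers are hypothetical.

```python
import math

def nbest_expected_word_errors(log_probs, word_errors):
    """Approximate the expected word errors over an N-best list.

    log_probs:   model log-probability of each hypothesis in the list
    word_errors: number of word errors of each hypothesis vs. the reference
    The probabilities are renormalized so they sum to one over the list.
    """
    m = max(log_probs)  # subtract the max for numerical stability
    weights = [math.exp(lp - m) for lp in log_probs]
    total = sum(weights)
    probs = [w / total for w in weights]
    return sum(p * e for p, e in zip(probs, word_errors))

# Toy 3-best list (made-up values).
log_probs = [math.log(0.6), math.log(0.3), math.log(0.1)]
word_errors = [0, 2, 4]
expected = nbest_expected_word_errors(log_probs, word_errors)
```

Minimizing a quantity of this form pushes probability mass toward hypotheses with fewer word errors, which is the intuition behind training directly toward WER rather than log-likelihood.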
Submitted 5 December, 2017;
originally announced December 2017.
-
Improving the Performance of Online Neural Transducer Models
Authors:
Tara N. Sainath,
Chung-Cheng Chiu,
Rohit Prabhavalkar,
Anjuli Kannan,
Yonghui Wu,
Patrick Nguyen,
Zhifeng Chen
Abstract:
Having a sequence-to-sequence model which can operate in an online fashion is important for streaming applications such as Voice Search. Neural transducer is a streaming sequence-to-sequence model, but has shown a significant degradation in performance compared to non-streaming models such as Listen, Attend and Spell (LAS). In this paper, we present various improvements to NT. Specifically, we look at increasing the window over which NT computes attention, mainly by looking backwards in time so the model still remains online. In addition, we explore initializing a NT model from a LAS-trained model so that it is guided with a better alignment. Finally, we explore including stronger language models such as using wordpiece models, and applying an external LM during the beam search. On a Voice Search task, we find with these improvements we can get NT to match the performance of LAS.
Submitted 5 December, 2017;
originally announced December 2017.
-
State-of-the-art Speech Recognition With Sequence-to-Sequence Models
Authors:
Chung-Cheng Chiu,
Tara N. Sainath,
Yonghui Wu,
Rohit Prabhavalkar,
Patrick Nguyen,
Zhifeng Chen,
Anjuli Kannan,
Ron J. Weiss,
Kanishka Rao,
Ekaterina Gonina,
Navdeep Jaitly,
Bo Li,
Jan Chorowski,
Michiel Bacchiani
Abstract:
Attention-based encoder-decoder architectures such as Listen, Attend, and Spell (LAS), subsume the acoustic, pronunciation and language model components of a traditional automatic speech recognition (ASR) system into a single neural network. In previous work, we have shown that such architectures are comparable to state-of-the-art ASR systems on dictation tasks, but it was not clear if such architectures would be practical for more challenging tasks such as voice search. In this work, we explore a variety of structural and optimization improvements to our LAS model which significantly improve performance. On the structural side, we show that word piece models can be used instead of graphemes. We also introduce a multi-head attention architecture, which offers improvements over the commonly-used single-head attention. On the optimization side, we explore synchronous training, scheduled sampling, label smoothing, and minimum word error rate optimization, which are all shown to improve accuracy. We present results with a unidirectional LSTM encoder for streaming recognition. On a 12,500-hour voice search task, we find that the proposed changes improve the WER from 9.2% to 5.6%, while the best conventional system achieves 6.7%; on a dictation task our model achieves a WER of 4.1% compared to 5% for the conventional system.
Submitted 23 February, 2018; v1 submitted 5 December, 2017;
originally announced December 2017.
-
Multi-Dialect Speech Recognition With A Single Sequence-To-Sequence Model
Authors:
Bo Li,
Tara N. Sainath,
Khe Chai Sim,
Michiel Bacchiani,
Eugene Weinstein,
Patrick Nguyen,
Zhifeng Chen,
Yonghui Wu,
Kanishka Rao
Abstract:
Sequence-to-sequence models provide a simple and elegant solution for building speech recognition systems by folding separate components of a typical system, namely acoustic (AM), pronunciation (PM) and language (LM) models into a single neural network. In this work, we look at one such sequence-to-sequence model, namely listen, attend and spell (LAS), and explore the possibility of training a single model to serve different English dialects, which simplifies the process of training multi-dialect systems without the need for separate AM, PM and LMs for each dialect. We show that simply pooling the data from all dialects into one LAS model falls behind the performance of a model fine-tuned on each dialect. We then look at incorporating dialect-specific information into the model, both by modifying the training targets by inserting the dialect symbol at the end of the original grapheme sequence and also feeding a 1-hot representation of the dialect information into all layers of the model. Experimental results on seven English dialects show that our proposed system is effective in modeling dialect variations within a single LAS model, outperforming a LAS model trained individually on each of the seven dialects by 3.1 ~ 16.5% relative.
Submitted 5 December, 2017;
originally announced December 2017.
-
Speech recognition for medical conversations
Authors:
Chung-Cheng Chiu,
Anshuman Tripathi,
Katherine Chou,
Chris Co,
Navdeep Jaitly,
Diana Jaunzeikare,
Anjuli Kannan,
Patrick Nguyen,
Hasim Sak,
Ananth Sankar,
Justin Tansuwan,
Nathan Wan,
Yonghui Wu,
Xuedong Zhang
Abstract:
In this work we explored building automatic speech recognition models for transcribing doctor-patient conversations. We collected a large-scale dataset of clinical conversations (14,000 hr), designed the task to represent the real-world scenario, and explored several alignment approaches to iteratively improve data quality. We explored both CTC and LAS systems for building speech recognition models. The LAS was more resilient to noisy data and CTC required more data cleanup. A detailed analysis is provided for understanding the performance for clinical tasks. Our analysis showed the speech recognition models performed well on important medical utterances, while errors occurred in casual conversations. Overall we believe the resulting models can provide reasonable quality in practice.
Submitted 20 June, 2018; v1 submitted 20 November, 2017;
originally announced November 2017.
-
Range-Spread Targets Detection in Unknown Doppler Shift via Semi-Definite Programming
Authors:
Mai. P. T. Nguyen,
I. Song,
S. Lee,
S. Yoon
Abstract:
Based on the technique of generalized likelihood ratio test, we address detection schemes for Doppler-shifted range-spread targets in Gaussian noise. First, a detection scheme is derived by solving the maximization associated with the estimation of unknown Doppler frequency with semi-definite programming. To lower the computational complexity of the detector, we then consider a simplification of the detector by adopting maximization over a relaxed space. Both of the proposed detectors are shown to have constant false alarm rate via numerical or theoretical analysis. The detection performance of the proposed detector based on the semi-definite programming is shown to be almost the same as that of the conventional detector designed for known Doppler frequency.
Submitted 8 October, 2017;
originally announced October 2017.
-
Robust Radar Detection of a Mismatched Steering Vector Embedded in Compound Gaussian Clutter
Authors:
Mai P. T. Nguyen,
I. Song
Abstract:
The problem of radar detection in compound Gaussian clutter when a radar signature is not completely known has not been considered yet and is addressed in this paper. We propose a robust technique to detect, based on the generalized likelihood ratio test, a point-like target embedded in compound Gaussian clutter. Employing an array of antennas, we assume that the actual steering vector departs from the nominal one, but lies in a known interval. The detection is then secured by employing semi-definite programming. It is confirmed via simulation that the proposed detector experiences a negligible detection loss compared to an adaptive normalized matched filter in a perfectly matched case, but outperforms it in cases of mismatched signal. Remarkably, the proposed detector possesses a constant false alarm rate with respect to the clutter covariance matrix.
Submitted 7 October, 2017;
originally announced October 2017.
-
A Motion Planning Strategy for the Active Vision-Based Mapping of Ground-Level Structures
Authors:
Manikandasriram Srinivasan Ramanagopal,
André Phu-Van Nguyen,
Jerome Le Ny
Abstract:
This paper presents a strategy to guide a mobile ground robot equipped with a camera or depth sensor, in order to autonomously map the visible part of a bounded three-dimensional structure. We describe motion planning algorithms that determine appropriate successive viewpoints and attempt to fill holes automatically in a point cloud produced by the sensing and perception layer. The emphasis is on accurately reconstructing a 3D model of a structure of moderate size rather than mapping large open environments, with applications for example in architecture, construction and inspection. The proposed algorithms do not require any initialization in the form of a mesh model or a bounding box, and the paths generated are well adapted to situations where the vision sensor is used simultaneously for mapping and for localizing the robot, in the absence of additional absolute positioning system. We analyze the coverage properties of our policy, and compare its performance to the classic frontier based exploration algorithm. We illustrate its efficacy for different structure sizes, levels of localization accuracy and range of the depth sensor, and validate our design on a real-world experiment.
Submitted 10 November, 2017; v1 submitted 22 February, 2016;
originally announced February 2016.