-
Attention-Guided Erasing: A Novel Augmentation Method for Enhancing Downstream Breast Density Classification
Authors:
Adarsh Bhandary Panambur,
Hui Yu,
Sheethal Bhat,
Prathmesh Madhu,
Siming Bayer,
Andreas Maier
Abstract:
The assessment of breast density is crucial in breast cancer screening, especially in populations with a higher percentage of dense breast tissue. This study introduces a novel data augmentation technique termed Attention-Guided Erasing (AGE), devised to enhance the downstream classification of the four breast density categories in mammography, following the BI-RADS recommendation, in a Vietnamese cohort. The proposed method integrates supplementary information during transfer learning, utilizing visual attention maps derived from a vision transformer backbone trained using the self-supervised DINO method. These maps are used to erase background regions in the mammogram images, exposing only the potential areas of dense breast tissue to the network. By incorporating AGE during transfer learning with varying random probabilities, we consistently achieve better classification performance than scenarios without AGE or with the traditional random erasing transformation. We validate our methodology using the publicly available VinDr-Mammo dataset. Specifically, we attain a mean F1-score of 0.5910, outperforming values of 0.5594 and 0.5691 corresponding to scenarios without AGE and with random erasing (RE), respectively. This improvement is further substantiated by t-tests, which yield p < 0.0001, indicating that the difference is statistically significant.
Submitted 8 January, 2024;
originally announced January 2024.
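The erasing step described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `attention_guided_erase`, the fixed `threshold` rule for deciding what counts as background, and the `fill` value are all assumptions, and the attention map is taken as a precomputed input rather than extracted from a DINO-trained vision transformer.

```python
import random


def attention_guided_erase(image, attention, threshold=0.5, p=1.0, fill=0.0, rng=None):
    """Erase low-attention pixels of `image` with probability `p`.

    image, attention: equally sized 2-D lists (rows of floats).
    threshold, fill:  hypothetical parameters chosen for this sketch.
    """
    rng = rng or random.Random()
    # With probability 1 - p, return an unmodified copy of the image.
    if rng.random() >= p:
        return [row[:] for row in image]
    # Otherwise keep only pixels whose attention reaches the threshold.
    return [
        [pix if att >= threshold else fill for pix, att in zip(img_row, att_row)]
        for img_row, att_row in zip(image, attention)
    ]
```

Applying the transform with a probability `p` below 1 mirrors the abstract's "varying random probabilities"; how the background mask is actually derived from the attention maps is not stated in the abstract.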
-
Adaptive Sampling of Algal Blooms Using Autonomous Underwater Vehicle and Satellite Imagery: Experimental Validation in the Baltic Sea
Authors:
Joana Fonseca,
Sriharsha Bhat,
Matthew Lock,
Ivan Stenius,
Karl H. Johansson
Abstract:
This paper investigates using satellite data to improve adaptive sampling missions, particularly for front tracking scenarios such as algal blooms. Our proposed solution for finding and tracking algal bloom fronts combines satellite data with an Autonomous Underwater Vehicle (AUV) equipped with a sensor that measures the concentration of chlorophyll a. The proposed method learns the kernel parameters for a Gaussian process (GP) model using satellite images of chlorophyll a from the previous days. Then, using the data collected by the AUV, it models chlorophyll a concentration online. We take the gradient of this model to obtain the direction of the algal bloom front and feed it to our control algorithm. The performance of this method is evaluated through realistic simulations for an algal bloom front in the Baltic Sea, using the models of the AUV and the chlorophyll a sensor. We compare the performance of different estimation methods, from GP to curve interpolation using least squares. Sensitivity analysis is performed to evaluate the impact of sensor noise on the methods' performance. We implement our method on an AUV and run experiments in the Stockholm archipelago in the summer of 2022.
Submitted 1 May, 2023;
originally announced May 2023.
-
On the Bit Error Performance of OTFS Modulation using Discrete Zak Transform
Authors:
Vineetha Yogesh,
Vighnesh S Bhat,
Sandesh Rao Mattu,
A. Chockalingam
Abstract:
In orthogonal time frequency space (OTFS) modulation, the Zak transform approach is a natural way to convert information symbols multiplexed in the delay-Doppler (DD) domain directly to the time domain for transmission, and vice versa at the receiver. Past research on OTFS has primarily considered a two-step approach in which DD domain symbols are first converted to the time-frequency domain and then to the time domain for transmission, and vice versa at the receiver. The Zak transform approach can offer performance and complexity benefits compared to the two-step approach. This paper presents an early investigation of the bit error performance of OTFS realized using the discrete Zak transform (DZT). We develop a compact DD domain input-output relation for DZT-OTFS using matrix decomposition that is valid for both integer and fractional delay-Dopplers. We analyze the bit error performance of DZT-OTFS using pairwise error probability analysis and simulations. Simulation results show that 1) both DZT-OTFS and two-step OTFS perform better than OFDM, and 2) DZT-OTFS achieves better performance than two-step OTFS over a wide range of Doppler spreads.
Submitted 22 March, 2023;
originally announced March 2023.
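To make the one-step conversion between the time domain and the DD domain concrete, the sketch below computes a discrete Zak transform of a length-MN sequence and its inverse. It uses one common textbook convention, Z[n][k] = (1/sqrt(N)) * sum over l of x[n + l*M] * exp(-j*2*pi*k*l/N); the normalization, the index ordering, and the function names are assumptions here and may differ from the definitions used in the paper.

```python
import cmath
import math


def discrete_zak_transform(x, M, N):
    """Return Z[n][k] for a length M*N sequence x (n: delay index, k: Doppler index)."""
    if len(x) != M * N:
        raise ValueError("len(x) must equal M * N")
    scale = 1.0 / math.sqrt(N)
    return [
        [
            # DFT over the N samples spaced M apart, starting at offset n.
            scale * sum(x[n + l * M] * cmath.exp(-2j * cmath.pi * k * l / N) for l in range(N))
            for k in range(N)
        ]
        for n in range(M)
    ]


def inverse_discrete_zak_transform(Z, M, N):
    """Recover the length M*N time-domain sequence from Z[n][k]."""
    scale = 1.0 / math.sqrt(N)
    x = [0j] * (M * N)
    for n in range(M):
        for l in range(N):
            # Inverse DFT along the Doppler index k.
            x[n + l * M] = scale * sum(
                Z[n][k] * cmath.exp(2j * cmath.pi * k * l / N) for k in range(N)
            )
    return x
```

In this convention the transmitter would apply the inverse transform to the DD domain symbols and the receiver the forward transform, with no intermediate time-frequency step.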
-
Towards Composable Distributions of Latent Space Augmentations
Authors:
Omead Pooladzandi,
Jeffrey Jiang,
Sunay Bhat,
Gregory Pottie
Abstract:
We propose a composable framework for latent space image augmentation that allows for easy combination of multiple augmentations. Image augmentation has been shown to be an effective technique for improving the performance of a wide variety of image classification and generation tasks. Our framework is based on the Variational Autoencoder architecture and uses a novel approach for augmentation via linear transformation within the latent space itself. We explore losses and augmentation latent geometry to enforce the transformations to be composable and involutory, thus allowing the transformations to be readily combined or inverted. Finally, we show these properties are better performing with certain pairs of augmentations, but we can transfer the latent space to other sets of augmentations to modify performance, effectively constraining the VAE's bottleneck to preserve the variance of specific augmentations and features of the image which we care about. We demonstrate the effectiveness of our approach with initial results on the MNIST dataset against both a standard VAE and a Conditional VAE. This latent augmentation method allows for much greater control and geometric interpretability of the latent space, making it a valuable tool for researchers and practitioners in the field.
Submitted 6 March, 2023;
originally announced March 2023.
-
Safe Networked Robotics with Probabilistic Verification
Authors:
Sai Shankar Narasimhan,
Sharachchandra Bhat,
Sandeep P. Chinchali
Abstract:
Autonomous robots must utilize rich sensory data to make safe control decisions. To process this data, compute-constrained robots often require assistance from remote computation, or the cloud, that runs compute-intensive deep neural network perception or control models. However, this assistance comes at the cost of a time delay due to network latency, resulting in past observations being used in the cloud to compute the control commands for the present robot state. Such communication delays could potentially lead to the violation of essential safety properties, such as collision avoidance. This paper develops methods to ensure the safety of robots operated over communication networks with stochastic latency. To do so, we use tools from formal verification to construct a shield, i.e., a run-time monitor, that provides a list of safe actions for any delayed sensory observation, given the expected and maximum network latency. Our shield is minimally intrusive and enables networked robots to satisfy key safety constraints, expressed as temporal logic specifications, with desired probability. We demonstrate our approach on a real F1/10th autonomous vehicle that navigates in indoor environments and transmits rich LiDAR sensory data over congested WiFi links.
Submitted 12 July, 2023; v1 submitted 17 February, 2023;
originally announced February 2023.
-
Neural Continuous-Time Markov Models
Authors:
Majerle Reeves,
Harish S. Bhat
Abstract:
Continuous-time Markov chains are used to model stochastic systems where transitions can occur at irregular times, e.g., birth-death processes, chemical reaction networks, population dynamics, and gene regulatory networks. We develop a method to learn a continuous-time Markov chain's transition rate functions from fully observed time series. In contrast with existing methods, our method allows for transition rates to depend nonlinearly on both state variables and external covariates. The Gillespie algorithm is used to generate trajectories of stochastic systems where propensity functions (reaction rates) are known. Our method can be viewed as the inverse: given trajectories of a stochastic reaction network, we generate estimates of the propensity functions. While previous methods used linear or log-linear methods to link transition rates to covariates, we use neural networks, increasing the capacity and potential accuracy of learned models. In the chemical context, this enables the method to learn propensity functions from non-mass-action kinetics. We test our method with synthetic data generated from a variety of systems with known transition rates. We show that our method learns these transition rates with considerably more accuracy than log-linear methods, in terms of mean absolute error between ground truth and predicted transition rates. We also demonstrate an application of our methods to open-loop control of a continuous-time Markov chain.
Submitted 10 December, 2022;
originally announced December 2022.
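The abstract above frames the learning method as the inverse of the Gillespie algorithm. For orientation, here is a minimal sketch of the forward direction for a birth-death process, one of the example systems the abstract names. The specific propensities (a constant birth rate and a death rate proportional to the population), the function name, and the parameters are assumptions chosen for illustration, not taken from the paper.

```python
import random


def gillespie_birth_death(x0, birth_rate, death_rate, t_end, rng=None):
    """Simulate one trajectory; returns (event times, states)."""
    rng = rng or random.Random()
    t, x = 0.0, x0
    times, states = [t], [x]
    while True:
        a_birth = birth_rate        # assumed constant birth propensity
        a_death = death_rate * x    # assumed death propensity, linear in x
        a_total = a_birth + a_death
        if a_total <= 0.0:
            break                   # no reaction can fire
        # Waiting time to the next event is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, proportionally to its propensity.
        if x == 0 or rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1
        times.append(t)
        states.append(x)
    return times, states
```

Trajectories of this kind (irregular event times with fully observed states) are the form of data from which the abstract says the transition rate functions are estimated.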
-
Input-Output Relation and Performance of RIS-Aided OTFS with Fractional Delay-Doppler
Authors:
Vighnesh S Bhat,
Gandhodi Harshavardhan,
A. Chockalingam
Abstract:
Reconfigurable intelligent surfaces (RIS) and orthogonal time-frequency space (OTFS) modulation have gained attention in recent wireless research. RIS technology aids communication by reflecting the incident electromagnetic waves towards the receiver, and OTFS modulation is effective in high-Doppler channels. This paper presents an early investigation of RIS-aided OTFS in high-Doppler channels. We derive the end-to-end delay-Doppler (DD) domain input-output relation of a RIS-aided OTFS system, considering rectangular pulses and fractional delay-Doppler values. We also consider a Zak receiver for RIS-aided OTFS that converts the received time-domain signal to DD domain in one step using Zak transform, and derive its end-to-end input-output relation. Our simulation results show that $i)$ RIS-aided OTFS performs better than OTFS without RIS, $ii)$ Zak receiver performs better than a two-step receiver, and $iii)$ RIS-aided OTFS achieves superior performance compared to RIS-aided OFDM.
Submitted 25 October, 2022;
originally announced October 2022.
-
DiCOVA Challenge: Dataset, task, and baseline system for COVID-19 diagnosis using acoustics
Authors:
Ananya Muguli,
Lancelot Pinto,
Nirmala R.,
Neeraj Sharma,
Prashant Krishnan,
Prasanta Kumar Ghosh,
Rohit Kumar,
Shrirama Bhat,
Srikanth Raj Chetupalli,
Sriram Ganapathy,
Shreyas Ramoji,
Viral Nanda
Abstract:
The DiCOVA challenge aims at accelerating research in diagnosing COVID-19 using acoustics (DiCOVA), a topic at the intersection of speech and audio processing, respiratory health diagnosis, and machine learning. This challenge is an open call for researchers to analyze a dataset of sound recordings collected from COVID-19 infected and non-COVID-19 individuals for a two-class classification. These recordings were collected via crowdsourcing from multiple countries, through a website application. The challenge features two tracks, one focusing on cough sounds, and the other on using a collection of breath, sustained vowel phonation, and number counting speech recordings. In this paper, we introduce the challenge, describe the task in detail, and present a baseline system.
Submitted 17 June, 2021; v1 submitted 16 March, 2021;
originally announced March 2021.
-
Performance Analysis of OTFS Modulation with Receive Antenna Selection
Authors:
Vighnesh S Bhat,
G. D. Surabhi,
A. Chockalingam
Abstract:
In this paper, we analyze the performance of orthogonal time frequency space (OTFS) modulation with antenna selection at the receiver, where $n_s$ out of $n_r$ receive antennas with maximum channel Frobenius norms in the delay-Doppler (DD) domain are selected. Single-input multiple-output OTFS (SIMO-OTFS), multiple-input multiple-output OTFS (MIMO-OTFS), and space-time coded OTFS (STC-OTFS) systems with receive antenna selection (RAS) are considered. We consider these systems without and with phase rotation. Our diversity analysis results show that, with no phase rotation, SIMO-OTFS and MIMO-OTFS systems with RAS are rank deficient, and therefore they do not extract the full receive diversity as well as the diversity present in the DD domain. Also, Alamouti coded STC-OTFS system with RAS and no phase rotation extracts the full transmit diversity, but it fails to extract the DD diversity. On the other hand, SIMO-OTFS and STC-OTFS systems with RAS become full-ranked when phase rotation is used, because of which they extract the full spatial as well as the DD diversity present in the system. Also, when phase rotation is used, MIMO-OTFS systems with RAS extract the full DD diversity, but they do not extract the full receive diversity because of rank deficiency. Simulation results are shown to validate the analytically predicted diversity performance.
Submitted 2 March, 2021;
originally announced March 2021.
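The selection rule stated in the abstract above, keeping the $n_s$ of $n_r$ receive antennas whose DD domain channel matrices have the largest Frobenius norms, can be sketched in a few lines. The data layout (one nested-list matrix per receive antenna) and the function names are assumptions made for this illustration.

```python
def frobenius_norm(matrix):
    """Frobenius norm of a matrix given as a list of rows (real or complex entries)."""
    return sum(abs(v) ** 2 for row in matrix for v in row) ** 0.5


def select_receive_antennas(channels, n_s):
    """Return the sorted indices of the n_s antennas with the largest channel norms.

    channels: list of n_r channel matrices, one per receive antenna (assumed layout).
    """
    if not 0 < n_s <= len(channels):
        raise ValueError("n_s must satisfy 0 < n_s <= n_r")
    ranked = sorted(
        range(len(channels)),
        key=lambda i: frobenius_norm(channels[i]),
        reverse=True,
    )
    return sorted(ranked[:n_s])
```

The sketch covers only the selection step; the diversity results in the abstract depend on the full system model, including whether phase rotation is used.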
-
Brain Tumor Segmentation and Survival Prediction using Automatic Hard mining in 3D CNN Architecture
Authors:
Vikas Kumar Anand,
Sanjeev Grampurohit,
Pranav Aurangabadkar,
Avinash Kori,
Mahendra Khened,
Raghavendra S Bhat,
Ganapathy Krishnamurthi
Abstract:
We utilize 3-D fully convolutional neural networks (CNN) to segment gliomas and their constituents from multimodal Magnetic Resonance Images (MRI). The architecture uses dense connectivity patterns to reduce the number of weights and residual connections and is initialized with weights obtained from training this model on the BraTS 2018 dataset. Hard mining is done during training to focus on the difficult cases of the segmentation task by increasing the Dice similarity coefficient (DSC) threshold used to choose the hard cases as the epoch count increases. On the BraTS 2020 validation data (n = 125), this architecture achieved Dice scores of 0.744, 0.876, and 0.714 for tumor core, whole tumor, and active tumor, respectively. On the test dataset, we get an increment in DSC of tumor core and active tumor by approximately 7%. In terms of DSC, our network performances on the BraTS 2020 test data are 0.775, 0.815, and 0.85 for enhancing tumor, tumor core, and whole tumor, respectively. Overall survival of a subject is determined using conventional machine learning from radiomics features obtained using a generated segmentation mask. Our approach has achieved 0.448 and 0.452 as the accuracy on the validation and test dataset.
Submitted 5 January, 2021;
originally announced January 2021.
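The hard-mining idea in the abstract above, selecting cases whose Dice similarity coefficient falls below a threshold that rises with the epoch, can be sketched as below. The Dice formula is the standard one; the linear threshold schedule (`start`, `step`, `cap`) and the function names are hypothetical, since the abstract does not give the actual schedule.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient of two flat, equal-length sequences of 0/1 labels."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Two empty masks are treated as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def dsc_threshold(epoch, start=0.5, step=0.05, cap=0.9):
    """Hypothetical schedule: the threshold grows linearly with the epoch, up to a cap."""
    return min(cap, start + step * epoch)


def select_hard_cases(case_dscs, epoch):
    """Return the ids of cases whose DSC is below the current threshold.

    case_dscs: dict mapping case id to its most recent DSC.
    """
    threshold = dsc_threshold(epoch)
    return [case_id for case_id, dsc in case_dscs.items() if dsc < threshold]
```

Under this schedule, more cases count as hard in later epochs, so training keeps revisiting them even as overall segmentation quality improves.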
-
Estimating Vector Fields from Noisy Time Series
Authors:
Harish S. Bhat,
Majerle Reeves,
Ramin Raziperchikolaei
Abstract:
While there has been a surge of recent interest in learning differential equation models from time series, methods in this area typically cannot cope with highly noisy data. We break this problem into two parts: (i) approximating the unknown vector field (or right-hand side) of the differential equation, and (ii) dealing with noise. To deal with (i), we describe a neural network architecture consisting of tensor products of one-dimensional neural shape functions. For (ii), we propose an alternating minimization scheme that switches between vector field training and filtering steps, together with multiple trajectories of training data. We find that the neural shape function architecture retains the approximation properties of dense neural networks, enables effective computation of vector field error, and allows for graphical interpretability, all for data/systems in any finite dimension $d$. We also study the combination of either our neural shape function method or existing differential equation learning methods with alternating minimization and multiple trajectories. We find that retrofitting any learning method in this way boosts the method's robustness to noise. While in their raw form the methods struggle with 1% Gaussian noise, after retrofitting, they learn accurate vector fields from data with 10% Gaussian noise.
Submitted 6 December, 2020;
originally announced December 2020.
-
Improvement of plant performance using Closed loop Reference Model Simple Adaptive Control for Micro Air Vehicle
Authors:
Shuvrangshu Jana,
M. Seetharama Bhat
Abstract:
In this paper, we present a novel idea to improve the transient performance of the existing Simple Adaptive Control architecture without requiring high adaptation gains. The improvement is achieved by incorporating a closed-loop reference model, based on output feedback, into the Simple Adaptive Control architecture. In the proposed scheme, the reference model dynamics are driven by the desired command as well as the error signal between the plant output and the reference model output. It is shown that the modified control architecture improves the system performance without any additional control effort, which is then validated through simulations of the lateral dynamics model of a Micro Air Vehicle.
Submitted 11 November, 2020;
originally announced November 2020.
-
Noise dependent Super Gaussian-Coherence based dual microphone Speech Enhancement for hearing aid application using smartphone
Authors:
Nikhil Shankar,
Gautam S Bhat,
Chandan K A Reddy,
Issa Panahi
Abstract:
In this paper, the coherence between speech and noise signals is used to obtain a Speech Enhancement (SE) gain function, in combination with a Super Gaussian Joint Maximum a Posteriori (SGJMAP) single microphone SE gain function. The proposed SE method can be implemented on a smartphone that works as an assistive device to hearing aids. Although the coherence SE gain function suppresses the background noise well, it distorts the speech. In contrast, SE using SGJMAP improves speech quality but introduces additional musical noise, which we contain by using a post filter. The weighted union of these two gain functions strikes a balance between noise suppression and speech distortion. A 'weighting' parameter is introduced in the derived gain function to allow the smartphone user to control the weighting factor based on the background noise and their hearing comfort level. Objective and subjective measures of the proposed method show effective improvement in comparison to the standard techniques considered in this paper for several noisy conditions at signal-to-noise ratio levels of -5 dB, 0 dB and 5 dB.
Submitted 26 January, 2020;
originally announced January 2020.
-
Integrated guidance and control framework for the waypoint navigation of a miniature aircraft with highly coupled longitudinal and lateral dynamics
Authors:
K Harikumar,
**raj V. Pushpangathan,
Sidhant Dhall,
M. Seetharama Bhat
Abstract:
This paper presents a solution to the waypoint navigation problem for fixed-wing micro air vehicles (MAV) in the framework of integrated guidance and control (IGC). IGC yields a single-step solution to the waypoint navigation problem, unlike the conventional multiple-loop design. The pure proportional navigation (PPN) guidance law is integrated with the MAV dynamics. A multivariable static output feedback (SOF) controller is designed for the linear state space model formulated in the IGC framework. The waypoint navigation algorithm handles the minimum turn radius constraint of the MAV. The algorithm also evaluates the feasibility of reaching a waypoint. Extensive non-linear simulations are performed on a high-fidelity model of a 150 mm wingspan MAV to demonstrate the potential advantages of the proposed waypoint navigation algorithm.
Submitted 4 November, 2019;
originally announced November 2019.
-
Semi-Bagging Based Deep Neural Architecture to Extract Text from High Entropy Images
Authors:
Pranay Dugar,
Anirban Chatterjee,
Rajesh Shreedhar Bhat,
Saswata Sahoo
Abstract:
Extracting text of various sizes and shapes from images containing multiple objects is an important problem in many contexts, especially in connection with e-commerce, augmented reality assistance systems in natural scenes, etc. Existing works (based only on CNNs) often perform sub-optimally when the image contains regions of high entropy with multiple objects. This paper presents an end-to-end text detection strategy combining a segmentation algorithm and an ensemble of multiple text detectors of different types to detect text in each individual image segment independently. The proposed strategy involves a super-pixel based image segmenter which splits an image into multiple regions. A convolutional deep neural architecture is developed which works on each of the segments and detects text of multiple shapes, sizes, and structures. It outperforms the competing methods in terms of coverage in detecting text in images, especially those where text of various types and sizes is packed into a small region along with various other objects. Furthermore, the proposed text detection method along with a text recognizer outperforms the existing state-of-the-art approaches in extracting text from high entropy images. We validate the results on a dataset consisting of product images on an e-commerce website.
Submitted 2 July, 2019;
originally announced July 2019.