-
Distributed Stochastic Optimization of a Neural Representation Network for Time-Space Tomography Reconstruction
Authors:
K. Aditya Mohan,
Massimiliano Ferrucci,
Chuck Divin,
Garrett A. Stevenson,
Hyojin Kim
Abstract:
4D time-space reconstruction of dynamic events or deforming objects using X-ray computed tomography (CT) is an extremely ill-posed inverse problem. Existing approaches assume that the object remains static for the duration of several tens or hundreds of X-ray projection measurement images (reconstruction of consecutive limited-angle CT scans). However, this is an unrealistic assumption for many in-situ experiments that causes spurious artifacts and inaccurate morphological reconstructions of the object. To solve this problem, we propose to perform a 4D time-space reconstruction using a distributed implicit neural representation (DINR) network that is trained using a novel distributed stochastic training algorithm. Our DINR network learns to reconstruct the object at its output by iterative optimization of its network parameters such that the measured projection images best match the output of the CT forward measurement model. We use a continuous time and space forward measurement model that is a function of the DINR outputs at a sparsely sampled set of continuous valued object coordinates. Unlike existing state-of-the-art neural representation architectures that forward and back propagate through dense voxel grids that sample the object's entire time-space coordinates, we only propagate through the DINR at a small subset of object coordinates in each iteration resulting in an order-of-magnitude reduction in memory and compute for training. DINR leverages distributed computation across several compute nodes and GPUs to produce high-fidelity 4D time-space reconstructions even for extremely large CT data sizes. We use both simulated parallel-beam and experimental cone-beam X-ray CT datasets to demonstrate the superior performance of our approach.
Submitted 29 April, 2024;
originally announced April 2024.
-
Deciphering Heartbeat Signatures: A Vision Transformer Approach to Explainable Atrial Fibrillation Detection from ECG Signals
Authors:
Aruna Mohan,
Danne Elbers,
Or Zilbershot,
Fatemeh Afghah,
David Vorchheimer
Abstract:
Remote patient monitoring based on wearable single-lead electrocardiogram (ECG) devices has significant potential for enabling the early detection of heart disease, especially in combination with artificial intelligence (AI) approaches for automated heart disease detection. There have been prior studies applying AI approaches based on deep learning for heart disease detection. However, these models are yet to be widely accepted as a reliable aid for clinical diagnostics, in part due to the current black-box perception surrounding many AI algorithms. In particular, there is a need to identify the key features of the ECG signal that contribute toward making an accurate diagnosis, thereby enhancing the interpretability of the model. In the present study, we develop a vision transformer approach to identify atrial fibrillation based on single-lead ECG data. A residual network (ResNet) approach is also developed for comparison with the vision transformer approach. These models are applied to the Chapman-Shaoxing dataset to classify atrial fibrillation, as well as another common arrhythmia, sinus bradycardia, and normal sinus rhythm heartbeats. The models enable the identification of the key regions of the heartbeat that determine the resulting classification, and highlight the importance of P-waves and T-waves, as well as heartbeat duration and signal amplitude, in distinguishing normal sinus rhythm from atrial fibrillation and sinus bradycardia.
Submitted 28 April, 2024; v1 submitted 12 February, 2024;
originally announced February 2024.
-
Generating Images of the M87* Black Hole Using GANs
Authors:
Arya Mohan,
Pavlos Protopapas,
Keerthi Kunnumkai,
Cecilia Garraffo,
Lindy Blackburn,
Koushik Chatterjee,
Sheperd S. Doeleman,
Razieh Emami,
Christian M. Fromm,
Yosuke Mizuno,
Angelo Ricarte
Abstract:
In this paper, we introduce a novel data augmentation methodology based on Conditional Progressive Generative Adversarial Networks (CPGAN) to generate diverse black hole (BH) images, accounting for variations in spin and electron temperature prescriptions. These generated images are valuable resources for training deep learning algorithms to accurately estimate black hole parameters from observational data. Our model can generate BH images for any spin value within the range of [-1, 1], given an electron temperature distribution. To validate the effectiveness of our approach, we employ a convolutional neural network to predict the BH spin using both the GRMHD images and the images generated by our proposed model. Our results demonstrate a significant performance improvement when training is conducted with the augmented dataset while testing is performed using GRMHD simulated data, as indicated by the high $R^2$ score. Consequently, we propose that GANs can be employed as cost-effective models for black hole image generation and reliably augment training datasets for other parameterization algorithms.
Submitted 1 December, 2023;
originally announced December 2023.
-
Physics-based basis functions for low-dimensional representation of the refractive index in the high energy limit
Authors:
Saransh Singh,
K. Aditya Mohan
Abstract:
The relationship between the refractive index decrement, $δ$, and the real part of the atomic form factor, $f^\prime$, is used to derive a simple polynomial functional form for $δ(E)$ far from the K-edge of the element. The functional form, motivated by the underlying physics, follows an infinite power sum, with most of the energy dependence captured by a single term, $1/E^2$. The derived functional form shows excellent agreement with theoretical and experimentally recorded values. This work helps reduce the dimensionality of the refractive index across the energy range of x-ray radiation for efficient forward modeling and formulation of a well-posed inverse problem in propagation-based polychromatic phase-contrast computed tomography.
Submitted 25 April, 2023;
originally announced June 2023.
-
Maximum Likelihood based Phase-Retrieval using Fresnel Propagation Forward Models with Optional Constraints
Authors:
K. Aditya Mohan,
Jean-Baptiste Forien,
Venkatesh Sridhar,
Jefferson A. Cuadra,
Dilworth Parkinson
Abstract:
X-ray phase-contrast tomography (XPCT) is widely used for high-contrast 3D imaging using either synchrotron or laboratory microfocus X-ray sources. XPCT enables an order of magnitude improvement in image contrast of the reconstructed material interfaces with low X-ray absorption contrast. The dominant approaches to 3D reconstruction using XPCT rely on the use of phase-retrieval algorithms that make one or more limiting approximations for the experimental configuration and material properties. Since many experimental scenarios violate such approximations, the resulting reconstructions contain blur, artifacts, or other quantitative inaccuracies. Our solution to this problem is to formulate new iterative non-linear phase-retrieval (NLPR) algorithms that avoid such limiting approximations. Compared to the widely used state-of-the-art approaches, we show that our proposed algorithms result in sharp and quantitatively accurate reconstruction with reduced artifacts. Unlike existing NLPR algorithms, our approaches avoid the laborious manual tuning of regularization hyper-parameters while still achieving the stated goals. As an alternative to regularization, we propose explicit constraints on the material properties to constrain the solution space and solve the phase-retrieval problem. These constraints are easily user-configurable since they follow directly from the imaged object's dimensions and material properties.
Submitted 2 October, 2023; v1 submitted 29 April, 2023;
originally announced May 2023.
-
AutoRL Hyperparameter Landscapes
Authors:
Aditya Mohan,
Carolin Benjamins,
Konrad Wienecke,
Alexander Dockhorn,
Marius Lindauer
Abstract:
Although Reinforcement Learning (RL) has been shown to be capable of producing impressive results, its use is limited by the impact of its hyperparameters on performance. This often makes it difficult to achieve good results in practice. Automated RL (AutoRL) addresses this difficulty, yet little is known about the dynamics of the hyperparameter landscapes that hyperparameter optimization (HPO) methods traverse in search of optimal configurations. In view of existing AutoRL approaches dynamically adjusting hyperparameter configurations, we propose an approach to build and analyze these hyperparameter landscapes not just for one point in time but at multiple points in time throughout training. Addressing an important open question on the legitimacy of such dynamic AutoRL approaches, we provide thorough empirical evidence that the hyperparameter landscapes strongly vary over time across representative algorithms from RL literature (DQN, PPO, and SAC) in different kinds of environments (Cartpole, Bipedal Walker, and Hopper). This supports the theory that hyperparameters should be dynamically adjusted during training and shows the potential for more insights on AutoRL problems that can be gained through landscape analyses. Our code can be found at https://github.com/automl/AutoRL-Landscape
Submitted 5 June, 2023; v1 submitted 5 April, 2023;
originally announced April 2023.
-
X-ray Spectral Estimation using Dictionary Learning
Authors:
Wenrui Li,
Venkatesh Sridhar,
K. Aditya Mohan,
Saransh Singh,
Jean-Baptiste Forien,
Xin Liu,
Gregery T. Buzzard,
Charles A. Bouman
Abstract:
As computational tools for X-ray computed tomography (CT) become more quantitatively accurate, knowledge of the source-detector spectral response is critical for quantitative system-independent reconstruction and material characterization capabilities. Directly measuring the spectral response of a CT system is hard, which motivates spectral estimation using transmission data obtained from a collection of known homogeneous objects. However, the associated inverse problem is ill-conditioned, making accurate estimation of the spectrum challenging, particularly in the absence of a close initial guess. In this paper, we describe a dictionary-based spectral estimation method that yields accurate results without the need for any initial estimate of the spectral response. Our method utilizes a MAP estimation framework that combines a physics-based forward model along with an $L_0$ sparsity constraint and a simplex constraint on the dictionary coefficients. Our method uses a greedy support selection method and a new pair-wise iterated coordinate descent method to compute the above estimate. We demonstrate that our dictionary-based method outperforms a state-of-the-art method as shown in a cross-validation experiment on four real datasets collected at beamline 8.3.2 of the Advanced Light Source (ALS).
Submitted 26 February, 2023;
originally announced February 2023.
-
DOLCE: A Model-Based Probabilistic Diffusion Framework for Limited-Angle CT Reconstruction
Authors:
Jiaming Liu,
Rushil Anirudh,
Jayaraman J. Thiagarajan,
Stewart He,
K. Aditya Mohan,
Ulugbek S. Kamilov,
Hyojin Kim
Abstract:
Limited-Angle Computed Tomography (LACT) is a non-destructive evaluation technique used in a variety of applications ranging from security to medicine. The limited angle coverage in LACT is often a dominant source of severe artifacts in the reconstructed images, making it a challenging inverse problem. We present DOLCE, a new deep model-based framework for LACT that uses a conditional diffusion model as an image prior. Diffusion models are a recent class of deep generative models that are relatively easy to train due to their implementation as image denoisers. DOLCE can form high-quality images from severely under-sampled data by integrating data-consistency updates with the sampling updates of a diffusion model, which is conditioned on the transformed limited-angle data. We show through extensive experimentation on several challenging real LACT datasets that the same pre-trained DOLCE model achieves SOTA performance on drastically different types of images. Additionally, we show that, unlike standard LACT reconstruction methods, DOLCE naturally enables the quantification of the reconstruction uncertainty by generating multiple samples consistent with the measured data.
Submitted 22 November, 2022;
originally announced November 2022.
-
Actor-Critic based Improper Reinforcement Learning
Authors:
Mohammadi Zaki,
Avinash Mohan,
Aditya Gopalan,
Shie Mannor
Abstract:
We consider an improper reinforcement learning setting where a learner is given $M$ base controllers for an unknown Markov decision process, and wishes to combine them optimally to produce a potentially new controller that can outperform each of the base ones. This can be useful in tuning across controllers, learnt possibly in mismatched or simulated environments, to obtain a good controller for a given target environment with relatively few trials.
Towards this, we propose two algorithms: (1) a Policy Gradient-based approach; and (2) an algorithm that can switch between a simple Actor-Critic (AC) based scheme and a Natural Actor-Critic (NAC) scheme depending on the available information. Both algorithms operate over a class of improper mixtures of the given controllers. For the first case, we derive convergence rate guarantees assuming access to a gradient oracle. For the AC-based approach, we provide convergence rate guarantees to a stationary point in the basic AC case and to a global optimum in the NAC case. Numerical results on (i) the standard control theoretic benchmark of stabilizing a cartpole; and (ii) a constrained queueing task show that our improper policy optimization algorithm can stabilize the system even when the base policies at its disposal are unstable.
Submitted 19 July, 2022;
originally announced July 2022.
-
Music Generation using Three-layered LSTM
Authors:
Vaishali Ingale,
Anush Mohan,
Divit Adlakha,
Krishan Kumar,
Mohit Gupta
Abstract:
This paper explores the idea of utilising Long Short-Term Memory neural networks (LSTMNN) for the generation of musical sequences in ABC notation. The proposed approach takes ABC notations from the Nottingham dataset and encodes them to be fed as input to the neural networks. The primary objective is to provide the neural networks with an arbitrary note and let the network process and augment a sequence based on the note until a good piece of music is produced. Multiple calibrations have been done to amend the parameters of the network for optimal generation. The output is assessed on the basis of rhythm, harmony, and grammar accuracy.
Submitted 9 June, 2021; v1 submitted 19 May, 2021;
originally announced May 2021.
-
Dynamic CT Reconstruction from Limited Views with Implicit Neural Representations and Parametric Motion Fields
Authors:
Albert W. Reed,
Hyojin Kim,
Rushil Anirudh,
K. Aditya Mohan,
Kyle Champley,
Jingu Kang,
Suren Jayasuriya
Abstract:
Reconstructing dynamic, time-varying scenes with computed tomography (4D-CT) is a challenging and ill-posed problem common to industrial and medical settings. Existing 4D-CT reconstructions are designed for sparse sampling schemes that require fast CT scanners to capture multiple, rapid revolutions around the scene in order to generate high quality results. However, if the scene is moving too fast, then the sampling occurs along a limited view and is difficult to reconstruct due to spatiotemporal ambiguities. In this work, we design a reconstruction pipeline using implicit neural representations coupled with a novel parametric motion field warping to perform limited view 4D-CT reconstruction of rapidly deforming scenes. Importantly, we utilize a differentiable analysis-by-synthesis approach to compare with captured x-ray sinogram data in a self-supervised fashion. Thus, our resulting optimization method requires no training data to reconstruct the scene. We demonstrate that our proposed system robustly reconstructs scenes containing deformable and periodic motion and validate against state-of-the-art baselines. Further, we demonstrate an ability to reconstruct continuous spatiotemporal representations of our scenes and upsample them to arbitrary volumes and frame rates post-optimization. This research opens a new avenue for implicit neural representations in computed tomography reconstruction in general.
Submitted 23 April, 2021;
originally announced April 2021.
-
Algorithm-driven Advances for Scientific CT Instruments: From Model-based to Deep Learning-based Approaches
Authors:
S. V. Venkatakrishnan,
K. Aditya Mohan,
Amir Koushyar Ziabari,
Charles A. Bouman
Abstract:
Multi-scale 3D characterization is widely used by materials scientists to further their understanding of the relationships between microscopic structure and macroscopic function. Scientific computed tomography (CT) instruments are one of the most popular choices for 3D non-destructive characterization of materials at length scales ranging from the angstrom-scale to the micron-scale. These instruments typically have a source of radiation that interacts with the sample to be studied and a detector assembly to capture the result of this interaction. A collection of such high-resolution measurements is made by re-orienting the sample, which is mounted on a specially designed stage/holder, after which reconstruction algorithms are used to produce the final 3D volume of interest. The end goals of scientific CT scans include determining the morphology, chemical composition, or dynamic behavior of materials when subjected to external stimuli. In this article, we will present an overview of recent advances in reconstruction algorithms that have enabled significant improvements in the performance of scientific CT instruments, enabling faster, more accurate and novel imaging capabilities. In the first part, we will focus on model-based image reconstruction algorithms that formulate the inversion as solving a high-dimensional optimization problem involving a data-fidelity term and a regularization term. In the last part of the article, we will present an overview of recent approaches using deep-learning based algorithms for improving scientific CT instruments.
Submitted 15 September, 2021; v1 submitted 16 April, 2021;
originally announced April 2021.
-
Improper Reinforcement Learning with Gradient-based Policy Optimization
Authors:
Mohammadi Zaki,
Avinash Mohan,
Aditya Gopalan,
Shie Mannor
Abstract:
We consider an improper reinforcement learning setting where a learner is given $M$ base controllers for an unknown Markov decision process, and wishes to combine them optimally to produce a potentially new controller that can outperform each of the base ones. This can be useful in tuning across controllers, learnt possibly in mismatched or simulated environments, to obtain a good controller for a given target environment with relatively few trials.
We propose a gradient-based approach that operates over a class of improper mixtures of the controllers. We derive convergence rate guarantees for the approach assuming access to a gradient oracle. The value function of the mixture and its gradient may not be available in closed-form; however, we show that we can employ rollouts and simultaneous perturbation stochastic approximation (SPSA) for explicit gradient descent optimization. Numerical results on (i) the standard control theoretic benchmark of stabilizing an inverted pendulum and (ii) a constrained queueing task show that our improper policy optimization algorithm can stabilize the system even when the base policies at its disposal are unstable.
Submitted 3 July, 2021; v1 submitted 16 February, 2021;
originally announced February 2021.
-
AutoAtlas: Neural Network for 3D Unsupervised Partitioning and Representation Learning
Authors:
K. Aditya Mohan,
Alan D. Kaplan
Abstract:
We present a novel neural network architecture called AutoAtlas for fully unsupervised partitioning and representation learning of 3D brain Magnetic Resonance Imaging (MRI) volumes. AutoAtlas consists of two neural network components: one neural network to perform multi-label partitioning based on local texture in the volume, and a second neural network to compress the information contained within each partition. We train both of these components simultaneously by optimizing a loss function that is designed to promote accurate reconstruction of each partition, while encouraging spatially smooth and contiguous partitioning, and discouraging relatively small partitions. We show that the partitions adapt to the subject specific structural variations of brain tissue while consistently appearing at similar spatial locations across subjects. AutoAtlas also produces very low dimensional features that represent local texture of each partition. We demonstrate prediction of metadata associated with each subject using the derived feature representations and compare the results to prediction using features derived from FreeSurfer anatomical parcellation. Since our features are intrinsically linked to distinct partitions, we can then map values of interest, such as partition-specific feature importance scores onto the brain for visualization.
Submitted 11 November, 2021; v1 submitted 29 October, 2020;
originally announced October 2020.
-
University Operations During a Pandemic: A Flexible Decision Analysis Toolkit
Authors:
Himanshu Kharkwal,
Dakota Olson,
Jiali Huang,
Abhiraj Mohan,
Ankur Mani,
Jaideep Srivastava
Abstract:
Modeling infection spread during pandemics is not new, with models using past data to tune simulation parameters for predictions. These help understand the healthcare burden posed by a pandemic and respond accordingly. However, the problem of how college/university campuses should function during a pandemic is new for the following reasons: (i) social contact in colleges is structured and can be engineered for chosen objectives, (ii) the last pandemic to cause such societal disruption was over 100 years ago, when higher education was not a critical part of society, (iii) not much was known about causes of pandemics, and hence effective ways of safe operations were not known, and (iv) today with distance learning, remote operation of an academic institution is possible. Our approach is unique in presenting a flexible simulation system, containing a suite of model libraries, one for each major component. The system integrates agent-based modeling (ABM) and a stochastic network approach, and models the interactions among individual entities, e.g., students, instructors, classrooms, residences, etc. in great detail. For each decision to be made, the system can be used to predict the impact of various choices, and thus enable the administrator to make informed decisions. While current approaches are good for infection modeling, they lack accuracy in social contact modeling. Our ABM approach, combined with ideas from Network Science, presents a novel approach to contact modeling. A detailed case study of the University of Minnesota's Sunrise Plan is presented. For each decision made, its impact was assessed, and the results were used to get a measure of confidence. We believe this flexible tool can be a valuable asset for various kinds of organizations to assess their infection risks in pandemic-time operations, including middle and high schools, factories, warehouses, and small/medium sized businesses.
Submitted 20 October, 2020;
originally announced October 2020.
-
Constrained Non-Linear Phase Retrieval for Single Distance X-ray Phase Contrast Tomography
Authors:
K. Aditya Mohan,
Dilworth Y. Parkinson,
Jefferson A. Cuadra
Abstract:
X-ray phase contrast tomography (XPCT) is widely used for 3D imaging of objects with weak contrast in X-ray absorption index but strong contrast in refractive index decrement. To reconstruct an object imaged using XPCT, phase retrieval algorithms are first used to estimate the X-ray phase projections, which are the 2D projections of the refractive index decrement, at each view. Phase retrieval is followed by refractive index decrement reconstruction from the phase projections using an algorithm such as filtered back projection (FBP). In practice, phase retrieval is most commonly solved by approximating it as a linear inverse problem. However, this linear approximation often results in artifacts and blurring when the conditions for the approximation are violated. In this paper, we formulate phase retrieval as a non-linear inverse problem, where we solve for the transmission function, which is the negative exponential of the projections, from XPCT measurements. We use a constraint to enforce proportionality between phase and absorption projections. We do not use constraints such as large Fresnel number, slowly varying phase, or Born/Rytov approximations. Our approach also does not require any regularization parameter tuning since there is no explicit sparsity enforcing regularization function. We validate the performance of our non-linear phase retrieval (NLPR) method using both simulated and real synchrotron datasets. We compare NLPR with a popular linear phase retrieval (LPR) approach and show that NLPR achieves sharper reconstructions with higher quantitative accuracy.
Submitted 22 September, 2020;
originally announced September 2020.
-
Towards Relevance and Sequence Modeling in Language Recognition
Authors:
Bharat Padi,
Anand Mohan,
Sriram Ganapathy
Abstract:
The task of automatic language identification (LID) involving multiple dialects of the same language family in the presence of noise is a challenging problem. In these scenarios, the identity of the language/dialect may be reliably present only in parts of the temporal sequence of the speech signal. The conventional approaches to LID (and to speaker recognition) ignore the sequence information by extracting a long-term statistical summary of the recording, assuming independence of the feature frames. In this paper, we propose a neural network framework utilizing short-sequence information in language recognition. In particular, a new model is proposed for incorporating relevance in language recognition, where parts of the speech data are weighted more based on their relevance for the language recognition task. This relevance weighting is achieved using a bidirectional long short-term memory (BLSTM) network with attention modeling. We explore two approaches: the first uses segment-level i-vector/x-vector representations that are aggregated in the neural model, and the second directly models the acoustic features in an end-to-end neural model. Experiments are performed on the language recognition task of the NIST LRE 2017 Challenge using clean, noisy and multi-speaker speech data, as well as on the RATS language recognition corpus. In these experiments on noisy LRE tasks as well as the RATS dataset, the proposed approach yields significant improvements over the conventional i-vector/x-vector based language recognition approaches as well as over previous models incorporating sequence information.
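The relevance weighting described in this abstract can be illustrated with a minimal sketch of attention pooling over segment-level embeddings. This is not the authors' model: the abstract uses a BLSTM with attention, whereas the sketch below swaps in a single linear scoring vector, and the names `relevance_pool`, `segments`, and `w` are hypothetical.

```python
import numpy as np

def relevance_pool(segments, w):
    """Pool segment embeddings with softmax relevance weights.

    segments: (num_segments, dim) array of segment-level embeddings.
    w: (dim,) hypothetical scoring vector (a stand-in for a learned
       attention network; an assumption, not taken from the paper).
    """
    scores = segments @ w               # one relevance score per segment
    scores = scores - scores.max()      # shift for numerical stability
    exp_scores = np.exp(scores)
    weights = exp_scores / exp_scores.sum()  # softmax: positive, sums to 1
    pooled = weights @ segments         # relevance-weighted average embedding
    return pooled, weights

# Toy data: 6 segments with 4-dimensional embeddings (values are arbitrary).
rng = np.random.default_rng(0)
segments = rng.normal(size=(6, 4))
w = rng.normal(size=4)
pooled, weights = relevance_pool(segments, w)
```

Segments with higher scores contribute more to `pooled`, in contrast to a plain average that treats every segment as equally informative.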
Submitted 2 April, 2020;
originally announced April 2020.
-
On the Volatility of Optimal Control Policies and the Capacity of a Class of Linear Quadratic Regulators
Authors:
Avinash Mohan,
Shie Mannor,
Arman Kizilkale
Abstract:
It is well known that highly volatile control laws, while theoretically optimal for certain systems, are undesirable from an engineering perspective, being generally deleterious to the controlled system. In this article we are concerned with the temporal volatility of the control process of the regulator in discrete time Linear Quadratic Regulators (LQRs). Our investigation in this paper unearths a surprising connection between the cost functional which an LQR is tasked with minimizing and the temporal variations of its control laws.
We first show that optimally controlling the system always implies high levels of control volatility, i.e., it is impossible to reduce volatility in the optimal control process without sacrificing cost. We also show that, akin to communication systems, every LQR has a $Capacity~Region$ associated with it, that dictates and quantifies how much cost is achievable at a given level of control volatility. This additionally establishes the fact that no admissible control policy can simultaneously achieve low volatility and low cost. We then employ this analysis to explain the phenomenon of temporal price volatility frequently observed in deregulated electricity markets.
Submitted 21 September, 2020; v1 submitted 17 February, 2020;
originally announced February 2020.
-
Extreme Few-view CT Reconstruction using Deep Inference
Authors:
Hyo** Kim,
Rushil Anirudh,
K. Aditya Mohan,
Kyle Champley
Abstract:
Reconstruction of few-view x-ray Computed Tomography (CT) data is a highly ill-posed problem. It is often used in applications that require low radiation dose in clinical CT, rapid industrial scanning, or fixed-gantry CT. Existing analytic or iterative algorithms generally produce poorly reconstructed images, severely deteriorated by artifacts and noise, especially when the number of x-ray projections is very low. This paper presents a deep network-driven approach to address extreme few-view CT by incorporating convolutional neural network-based inference into state-of-the-art iterative reconstruction. The proposed method interprets few-view sinogram data using attention-based deep networks to infer the reconstructed image. The predicted image is then used as prior knowledge in the iterative algorithm for final reconstruction. We demonstrate the effectiveness of the proposed approach by performing reconstruction experiments on a chest CT dataset.
Submitted 11 October, 2019;
originally announced October 2019.
-
Improving Limited Angle CT Reconstruction with a Robust GAN Prior
Authors:
Rushil Anirudh,
Hyo** Kim,
Jayaraman J. Thiagarajan,
K. Aditya Mohan,
Kyle M. Champley
Abstract:
Limited angle CT reconstruction is an under-determined linear inverse problem that requires appropriate regularization techniques to be solved. In this work we study how pre-trained generative adversarial networks (GANs) can be used to clean noisy, highly artifact-laden reconstructions from conventional techniques, by effectively projecting onto the inferred image manifold. In particular, we use a robust version of the widely used GAN prior for inverse problems, based on a recent technique called corruption mimicking, that significantly improves the reconstruction quality. The proposed approach operates in the image space directly, as a result of which it does not need to be trained or require access to the measurement model, is scanner agnostic, and can work over a wide range of sensing scenarios.
Submitted 29 January, 2020; v1 submitted 3 October, 2019;
originally announced October 2019.
-
SABER: A Systems Approach to Blur Estimation and Reduction in X-ray Imaging
Authors:
K. Aditya Mohan,
Robert M. Panas,
Jefferson A. Cuadra
Abstract:
Blur in X-ray radiographs not only reduces the sharpness of image edges but also reduces the overall contrast. The effective blur in a radiograph is the combined effect of blur from multiple sources such as the detector panel, X-ray source spot, and system motion. In this paper, we use a systems approach to model the point spread function (PSF) of the effective radiographic blur as the convolution of multiple PSFs, where each PSF models one of the various sources of blur. In particular, we model the combined contribution of X-ray source and detector blurs while assuming negligible contribution from other forms of blur. Then, we present a numerical optimization algorithm for estimating the source and detector PSFs from multiple radiographs acquired at different X-ray source to object (SOD) and object to detector distances (ODD). Finally, we computationally reduce blur in radiographs using deblurring algorithms that use the estimated PSFs from the previous step. Our approach to estimate and reduce blur is called SABER, which is an acronym for systems approach to blur estimation and reduction.
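The idea of modeling the effective PSF as a convolution of per-source PSFs can be sketched in a few lines. This is only an illustration under stated assumptions: it works in 1D rather than 2D, it assumes Gaussian shapes for both the source and detector PSFs, and the widths are arbitrary; none of these choices come from the abstract, and the helper names are hypothetical.

```python
import numpy as np

def gaussian_psf(sigma, half_width):
    """Return a normalized 1D Gaussian PSF sampled on integer pixels.

    The Gaussian shape is an assumption made for illustration only.
    """
    x = np.arange(-half_width, half_width + 1, dtype=float)
    psf = np.exp(-0.5 * (x / sigma) ** 2)
    return psf / psf.sum()

def psf_variance(psf):
    """Second central moment of a normalized PSF, in pixel units."""
    x = np.arange(psf.size, dtype=float)
    mean = (x * psf).sum()
    return ((x - mean) ** 2 * psf).sum()

# Hypothetical blur widths (in pixels) for the two blur sources.
source_psf = gaussian_psf(sigma=2.0, half_width=20)
detector_psf = gaussian_psf(sigma=1.5, half_width=20)

# Effective PSF: convolution of the individual PSFs.
effective_psf = np.convolve(source_psf, detector_psf)
```

Because each PSF is normalized, the effective PSF also sums to one, and its variance equals the sum of the individual variances, which is why combining blur sources always widens the overall blur.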
Submitted 28 June, 2020; v1 submitted 10 May, 2019;
originally announced May 2019.
-
Adversarial Learning of Raw Speech Features for Domain Invariant Speech Recognition
Authors:
Aditay Tripathi,
Aanchan Mohan,
Saket Anand,
Maneesh Singh
Abstract:
Recent advances in neural network based acoustic modelling have shown significant improvements in automatic speech recognition (ASR) performance. For acoustic models to handle large acoustic variability, large amounts of labeled data are necessary, which are often expensive to obtain. This paper explores the application of adversarial training to learn features from raw speech that are invariant to acoustic variability. This acoustic variability is referred to as a domain shift in this paper. The experimental study presented in this paper leverages the architecture of Domain Adversarial Neural Networks (DANNs) [1], which use data from two different domains. The DANN is a Y-shaped network that consists of a multi-layer CNN feature extractor module that is common to a label (senone) classifier and a so-called domain classifier. The utility of DANNs is evaluated on multiple datasets with domain shifts caused by differences in gender and speaker accents. Promising empirical results indicate the strength of adversarial training for unsupervised domain adaptation in ASR, thereby emphasizing the ability of DANNs to learn domain invariant features from raw speech.
Submitted 21 May, 2018;
originally announced May 2018.