
Self-Supervised MRI Reconstruction with Unrolled Diffusion Models

Authors: Yilmaz Korkmaz, Tolga Cukur, Vishal M. Patel

Abstract: Magnetic Resonance Imaging (MRI) produces excellent soft tissue contrast, although it is an inherently slow imaging modality. Promising deep learning methods have recently been proposed to reconstruct accelerated MRI scans. However, existing methods still suffer from various limitations regarding image fidelity, contextual sensitivity, and reliance on fully-sampled acquisitions for model training. To comprehensively address these limitations, we propose a novel self-supervised deep reconstruction model, named Self-Supervised Diffusion Reconstruction (SSDiffRecon). SSDiffRecon expresses a conditional diffusion process as an unrolled architecture that interleaves cross-attention transformers for reverse diffusion steps with data-consistency blocks for physics-driven processing. Unlike recent diffusion methods for MRI reconstruction, a self-supervision strategy is adopted to train SSDiffRecon using only undersampled k-space data. Comprehensive experiments on public brain MR datasets demonstrate the superiority of SSDiffRecon against state-of-the-art supervised and self-supervised baselines in terms of reconstruction speed and quality. Implementation will be available at https://github.com/yilmazkorkmaz1/SSDiffRecon.

Submitted 15 April, 2024; v1 submitted 28 June, 2023; originally announced June 2023.
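
The SSDiffRecon abstract mentions data-consistency blocks for physics-driven processing. A minimal sketch of one common form of such a block (hard data consistency for a single-coil Cartesian acquisition) is given here in NumPy. This is an assumption for illustration, not the authors' implementation: the function name, the random sampling mask, and the random stand-in for a network output are all hypothetical.

```python
import numpy as np

def data_consistency(image, measured_kspace, mask):
    """Keep measured k-space samples where the mask is True and
    fall back to the k-space of the current estimate elsewhere."""
    predicted_kspace = np.fft.fft2(image)
    merged = np.where(mask, measured_kspace, predicted_kspace)
    return np.fft.ifft2(merged)

rng = np.random.default_rng(0)
truth = rng.standard_normal((8, 8))
mask = rng.random((8, 8)) < 0.4            # hypothetical sampling pattern
measured = np.fft.fft2(truth) * mask       # simulated undersampled acquisition
estimate = rng.standard_normal((8, 8))     # stand-in for a network output
corrected = data_consistency(estimate, measured, mask)
```

After this step the corrected image agrees with the acquisition at every sampled k-space location, while unsampled locations are left to the learned model.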

arXiv:2303.09536 [pdf, other]

Deep Metric Learning for Unsupervised Remote Sensing Change Detection

Authors: Wele Gedara Chaminda Bandara, Vishal M. Patel

Abstract: Remote Sensing Change Detection (RS-CD) aims to detect relevant changes from Multi-Temporal Remote Sensing Images (MT-RSIs), which aids in various RS applications such as land cover, land use, human development analysis, and disaster response. The performance of existing RS-CD methods is attributed to training on large annotated datasets. Furthermore, most of these models are less transferable in the sense that the trained model often performs very poorly when there is a domain gap between training and test datasets. This paper proposes an unsupervised CD method based on deep metric learning that can deal with both of these issues. Given an MT-RSI, the proposed method generates the corresponding change probability map by iteratively optimizing an unsupervised CD loss without training it on a large dataset. Our unsupervised CD method consists of two interconnected deep networks, namely Deep-Change Probability Generator (D-CPG) and Deep-Feature Extractor (D-FE). The D-CPG is designed to predict change and no-change probability maps for a given MT-RSI, while D-FE is used to extract deep features of MT-RSI that will be further used in the proposed unsupervised CD loss. We use transfer learning capability to initialize the parameters of D-FE. We iteratively optimize the parameters of D-CPG and D-FE for a given MT-RSI by minimizing the proposed unsupervised "similarity-dissimilarity loss". This loss is motivated by the principle of metric learning where we simultaneously maximize the distance between change pair-wise pixels while minimizing the distance between no-change pair-wise pixels in the bi-temporal image domain and their deep feature domain. The experiments conducted on three CD datasets show that our unsupervised CD method achieves significant improvements over the state-of-the-art supervised and unsupervised CD methods. Code available at https://github.com/wgcban/Metric-CD

Submitted 16 March, 2023; originally announced March 2023.

Comments: Code available at https://github.com/wgcban/Metric-CD

arXiv:2210.08664 [pdf, other]

Design and Modeling of a Smart Torque-Adjustable Rotary Electroadhesive Clutch for Application in Human-Robot Interaction

Authors: Navid Feizi, S. Farokh Atashzar, Mehrdad R. Kermani, Rajni V. Patel

Abstract: The increasing need for sharing workspace and interactive physical tasks between robots and humans has raised concerns regarding safety of such operations. In this regard, controllable clutches have shown great potential for addressing important safety concerns at the hardware level by separating the high-impedance actuator from the end effector by providing the power transfer from electromagnetic source to the human. However, the existing clutches suffer from high power consumption and large weight, which make them undesirable from the design point of view. In this paper, for the first time, the design and development of a novel, lightweight, and low-power torque-adjustable rotary clutch using electroadhesive materials is presented. The performance of three different pairs of clutch plates is investigated in the context of the smoothness and quality of output torque. The performance degradation issue due to the polarization of the insulator is addressed through the utilization of an alternating current waveform activation signal. Moreover, the effect of the activation frequency on the output torque and power consumption of the clutch is investigated. Finally, a time-dependent model for the output torque of the clutch is presented, and the performance of the clutch is evaluated through experiments, including physical human-robot interaction. The proposed clutch offers a torque to power consumption ratio that is six times better than commercial magnetic particle clutches. The proposed clutch presents great potential for developing safe, lightweight, and low-power physical human-robot interaction systems, such as exoskeletons and robotic walkers.

Submitted 16 October, 2022; originally announced October 2022.

Comments: Submitted to IEEE T-MECH; 11 pages, 14 figures

arXiv:2209.09498 [pdf, other]

NBD-GAP: Non-Blind Image Deblurring Without Clean Target Images

Authors: Nithin Gopalakrishnan Nair, Rajeev Yasarla, Vishal M. Patel

Abstract: In recent years, deep neural network-based restoration methods have achieved state-of-the-art results in various image deblurring tasks. However, one major drawback of deep learning-based deblurring networks is that large amounts of blurry-clean image pairs are required for training to achieve good performance. Moreover, deep networks often fail to perform well when the blurry images and the blur kernels during testing are very different from the ones used during training. This happens mainly because of the overfitting of the network parameters on the training data. In this work, we present a method that addresses these issues. We view the non-blind image deblurring problem as a denoising problem. To do so, we perform Wiener filtering on a pair of blurry images with the corresponding blur kernels. This results in a pair of images with colored noise. Hence, the deblurring problem is translated into a denoising problem. We then solve the denoising problem without using explicit clean target images. Extensive experiments are conducted to show that our method achieves results that are on par with the state-of-the-art non-blind deblurring works.

Submitted 20 September, 2022; originally announced September 2022.

Comments: Accepted at ICIP 2022
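
The NBD-GAP abstract describes Wiener filtering a blurry image with its known blur kernel, which leaves an image corrupted by colored noise. The following NumPy sketch shows a textbook frequency-domain Wiener filter under assumptions that are not taken from the paper: circular convolution, a constant noise-to-signal ratio, and a small hypothetical kernel. All function and variable names are illustrative.

```python
import numpy as np

def pad_kernel(kernel, shape):
    """Place the kernel in the top-left corner of a zero array of the image size."""
    padded = np.zeros(shape, dtype=float)
    padded[:kernel.shape[0], :kernel.shape[1]] = kernel
    return padded

def circular_blur(image, kernel):
    """Blur by pointwise multiplication in the Fourier domain (circular convolution)."""
    H = np.fft.fft2(pad_kernel(kernel, image.shape))
    return np.real(np.fft.ifft2(H * np.fft.fft2(image)))

def wiener_deconvolve(blurry, kernel, noise_to_signal=1e-2):
    """Apply the Wiener filter G = conj(H) / (|H|^2 + K) to the blurry image."""
    H = np.fft.fft2(pad_kernel(kernel, blurry.shape))
    G = np.conj(H) / (np.abs(H) ** 2 + noise_to_signal)
    return np.real(np.fft.ifft2(G * np.fft.fft2(blurry)))

rng = np.random.default_rng(0)
sharp = rng.random((16, 16))
kernel = np.array([[0.6, 0.2], [0.2, 0.0]])   # hypothetical blur kernel
blurry = circular_blur(sharp, kernel) + 0.01 * rng.standard_normal((16, 16))
restored = wiener_deconvolve(blurry, kernel)
```

Because the additive noise also passes through the filter G, the residual noise in `restored` no longer has a flat spectrum, which is why the remaining task can be treated as denoising under colored noise.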

arXiv:2207.03447 [pdf, other]

Learning to restore images degraded by atmospheric turbulence using uncertainty

Authors: Rajeev Yasarla, Vishal M. Patel

Abstract: Atmospheric turbulence can significantly degrade the quality of images acquired by long-range imaging systems by causing spatially and temporally random fluctuations in the index of refraction of the atmosphere. Variations in the refractive index cause the captured images to be geometrically distorted and blurry. Hence, it is important to compensate for the visual degradation in images caused by atmospheric turbulence. In this paper, we propose a deep learning-based approach for restoring a single image degraded by atmospheric turbulence. We make use of the epistemic uncertainty based on Monte Carlo dropout to capture regions in the image where the network has a hard time restoring. The estimated uncertainty maps are then used to guide the network to obtain the restored image. Extensive experiments are conducted on synthetic and real images to show the significance of the proposed work. Code is available at https://github.com/rajeevyasarla/AT-Net

Submitted 7 July, 2022; originally announced July 2022.

Comments: Recognized as Best Paper at IEEE International Conference on Image Processing, 2021. arXiv admin note: substantial text overlap with arXiv:2007.08404
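
The abstract above estimates epistemic uncertainty with Monte Carlo dropout. As a rough illustration of that general technique, and not of the authors' network, the sketch below keeps dropout active at inference in a toy two-layer model with random weights, runs several stochastic passes, and uses the per-output variance as the uncertainty estimate. The model, its sizes, and every name here are assumptions.

```python
import numpy as np

def mc_dropout_predict(x, w1, w2, rng, passes=20, drop_prob=0.5):
    """Run several forward passes with random dropout masks and
    return the mean prediction and the per-output variance."""
    outputs = []
    for _ in range(passes):
        hidden = np.maximum(x @ w1, 0.0)                 # ReLU hidden layer
        keep = rng.random(hidden.shape) >= drop_prob     # fresh mask each pass
        hidden = hidden * keep / (1.0 - drop_prob)       # inverted dropout scaling
        outputs.append(hidden @ w2)
    outputs = np.stack(outputs)
    return outputs.mean(axis=0), outputs.var(axis=0)

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 4))      # five toy inputs with four features each
w1 = rng.standard_normal((4, 16))    # hypothetical fixed weights
w2 = rng.standard_normal((16, 3))
mean, uncertainty = mc_dropout_predict(x, w1, w2, rng)
```

Outputs with a large variance across passes mark the inputs on which the model is least certain; in an image restoration setting the same statistic would be computed per pixel.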

arXiv:2206.08936 [pdf, other]

Simultaneous Bone and Shadow Segmentation Network using Task Correspondence Consistency

Authors: Aimon Rahman, Jeya Maria Jose Valanarasu, Ilker Hacihaliloglu, Vishal M Patel

Abstract: Segmenting the bone surface and the corresponding acoustic shadow are fundamental tasks in ultrasound (US) guided orthopedic procedures. However, these tasks are challenging due to minimal and blurred bone surface response in US images, cross-machine discrepancy, imaging artifacts, and low signal-to-noise ratio. Notably, bone shadows are caused by a significant acoustic impedance mismatch between the soft tissue and bone surfaces. To leverage this mutual information between these highly related tasks, we propose a single end-to-end network with a shared transformer-based encoder and task-independent decoders for simultaneous bone and shadow segmentation. To share complementary features, we propose a cross-task feature transfer block which learns to transfer meaningful features from the decoder of shadow segmentation to that of bone segmentation and vice versa. We also introduce a correspondence consistency loss which ensures that the network utilizes the inter-dependency between the bone surface and its corresponding shadow to refine the segmentation. Validation against expert annotations shows that the method outperforms the previous state-of-the-art for both bone surface and shadow segmentation.

Submitted 16 June, 2022; originally announced June 2022.

Comments: Accepted at MICCAI 2022

arXiv:2206.08481 [pdf, other]

Orientation-guided Graph Convolutional Network for Bone Surface Segmentation

Authors: Aimon Rahman, Wele Gedara Chaminda Bandara, Jeya Maria Jose Valanarasu, Ilker Hacihaliloglu, Vishal M Patel

Abstract: Due to imaging artifacts and low signal-to-noise ratio in ultrasound images, automatic bone surface segmentation networks often produce fragmented predictions that can hinder the success of ultrasound-guided computer-assisted surgical procedures. Existing pixel-wise predictions often fail to capture the accurate topology of bone tissues due to a lack of supervision to enforce connectivity. In this work, we propose an orientation-guided graph convolutional network to improve connectivity while segmenting the bone surface. We also propose an additional supervision on the orientation of the bone surface to further impose connectivity. We validated our approach on 1042 in vivo US scans of femur, knee, spine, and distal radius. Our approach improves over the state-of-the-art methods by 5.01% in the connectivity metric.

Submitted 16 June, 2022; originally announced June 2022.

Comments: Accepted at MICCAI 2022

arXiv:2206.04514 [pdf, ps, other]

doi 10.1109/LGRS.2023.3270799

SAR Despeckling using a Denoising Diffusion Probabilistic Model

Authors: Malsha V. Perera, Nithin Gopalakrishnan Nair, Wele Gedara Chaminda Bandara, Vishal M. Patel

Abstract: Speckle is a multiplicative noise which affects all coherent imaging modalities including Synthetic Aperture Radar (SAR) images. The presence of speckle degrades the image quality and adversely affects the performance of SAR image understanding applications such as automatic target recognition and change detection. Thus, SAR despeckling is an important problem in remote sensing. In this paper, we introduce SAR-DDPM, a denoising diffusion probabilistic model for SAR despeckling. The proposed method comprises a Markov chain that transforms clean images to white Gaussian noise by repeatedly adding random noise. The despeckled image is recovered by a reverse process which iteratively predicts the added noise using a noise predictor which is conditioned on the speckled image. In addition, we propose a new inference strategy based on cycle spinning to improve the despeckling performance. Our experiments on both synthetic and real SAR images demonstrate that the proposed method achieves significant improvements in both quantitative and qualitative results over the state-of-the-art despeckling methods.

Submitted 9 June, 2022; originally announced June 2022.

Comments: Our code is available at https://github.com/malshaV/SAR_DDPM
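
The SAR-DDPM abstract proposes an inference strategy based on cycle spinning. The general idea can be sketched as follows: shift the input circularly, restore each shifted copy, undo the shift, and average the results. In this sketch the diffusion model is replaced by a simple block-averaging function (`block_mean`, a hypothetical stand-in), and the shift set is an arbitrary choice; neither comes from the paper.

```python
import numpy as np

def cycle_spin(image, denoise, shifts):
    """Average the denoiser's outputs over circular shifts of the input."""
    accumulated = np.zeros_like(image, dtype=float)
    for dy, dx in shifts:
        shifted = np.roll(image, shift=(dy, dx), axis=(0, 1))
        restored = denoise(shifted)
        # Undo the shift so every output is aligned with the original image.
        accumulated += np.roll(restored, shift=(-dy, -dx), axis=(0, 1))
    return accumulated / len(shifts)

def block_mean(image, size=2):
    """Toy denoiser that is not shift-invariant: average over size x size blocks."""
    h, w = image.shape
    blocks = image.reshape(h // size, size, w // size, size).mean(axis=(1, 3))
    return np.repeat(np.repeat(blocks, size, axis=0), size, axis=1)

rng = np.random.default_rng(0)
noisy = rng.random((8, 8))
shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]
despeckled = cycle_spin(noisy, block_mean, shifts)
```

Averaging over shifts reduces artifacts that depend on where the restoration function happens to align with the image grid, at the cost of one restoration pass per shift.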

arXiv:2205.15906 [pdf, ps, other]

SAR Despeckling Using Overcomplete Convolutional Networks

Authors: Malsha V. Perera, Wele Gedara Chaminda Bandara, Jeya Maria Jose Valanarasu, Vishal M. Patel

Abstract: Synthetic Aperture Radar (SAR) despeckling is an important problem in remote sensing as speckle degrades SAR images, affecting downstream tasks like detection and segmentation. Recent studies show that convolutional neural networks (CNNs) outperform classical despeckling methods. Traditional CNNs try to increase the receptive field size as the network goes deeper, thus extracting global features. However, speckle is relatively small, and increasing the receptive field does not help in extracting speckle features. This study employs an overcomplete CNN architecture to focus on learning low-level features by restricting the receptive field. The proposed network consists of an overcomplete branch to focus on the local structures and an undercomplete branch that focuses on the global structures. We show that the proposed network improves despeckling performance compared to recent despeckling methods on synthetic and real SAR images.

Submitted 31 May, 2022; originally announced May 2022.

Comments: Accepted to International Geoscience and Remote Sensing Symposium (IGARSS), 2022. Our code is available at https://github.com/malshaV/sar_overcomplete

arXiv:2205.11740 [pdf]

doi 10.1002/aisy.202100099

Trends in Workplace Wearable Technologies and Connected-Worker Solutions for Next-Generation Occupational Safety, Health, and Productivity

Authors: Vishal Patel, Austin Chesmore, Christopher M. Legner, Santosh Pandey

Abstract: The workplace influences the safety, health, and productivity of workers at multiple levels. To protect and promote total worker health, smart hardware and software tools have emerged for the identification, elimination, substitution, and control of occupational hazards. Wearable devices enable constant monitoring of individual workers and the environment, whereas connected worker solutions provide contextual information and decision support. Here, the recent trends in commercial workplace technologies to monitor and manage occupational risks, injuries, accidents, and diseases are reviewed. Workplace safety wearables for safe lifting, ergonomics, hazard identification, sleep monitoring, fatigue management, and heat and cold stress are discussed. Examples of workplace productivity wearables for asset tracking, augmented reality, gesture and motion control, brain wave sensing, and work stress management are given. Workplace health wearables designed for work-related musculoskeletal disorders, functional movement disorders, respiratory hazards, cardiovascular health, outdoor sun exposure, and continuous glucose monitoring are shown. Connected worker platforms are discussed with information about the architecture, system modules, intelligent operations, and industry applications. Predictive analytics provide contextual information about occupational safety risks, resource allocation, equipment failure, and predictive maintenance. Altogether, these examples highlight the ground-level benefits of real-time visibility about frontline workers, work environment, distributed assets, workforce efficiency, and safety compliance.

Submitted 23 May, 2022; originally announced May 2022.

Journal ref: Advanced Intelligent Systems 2021

arXiv:2204.11669 [pdf]

doi 10.1038/s41746-023-00859-y

Deep-learning-enabled Brain Hemodynamic Mapping Using Resting-state fMRI

Authors: Xirui Hou, Pengfei Guo, Puyang Wang, Peiying Liu, Doris D. M. Lin, Hongli Fan, Yang Li, Zhiliang Wei, Zixuan Lin, Dengrong Jiang, ** **, Catherine Kelly, Jay J. Pillai, Judy Huang, Marco C. Pinho, Binu P. Thomas, Babu G. Welch, Denise C. Park, Vishal M. Patel, Argye E. Hillis, Hanzhang Lu

Abstract: Cerebrovascular disease is a leading cause of death globally. Prevention and early intervention are known to be the most effective forms of its management. Non-invasive imaging methods hold great promise for early stratification, but at present lack the sensitivity for personalized prognosis. Resting-state functional magnetic resonance imaging (rs-fMRI), a powerful tool previously used for mapping neural activity, is available in most hospitals. Here we show that rs-fMRI can be used to map cerebral hemodynamic function and delineate impairment. By exploiting time variations in breathing pattern during rs-fMRI, deep learning enables reproducible mapping of cerebrovascular reactivity (CVR) and bolus arrival time (BAT) of the human brain using resting-state CO2 fluctuations as a natural 'contrast media'. The deep-learning network was trained with CVR and BAT maps obtained with a reference method of CO2-inhalation MRI, which included data from young and older healthy subjects and patients with Moyamoya disease and brain tumors. We demonstrate the performance of deep-learning cerebrovascular mapping in the detection of vascular abnormalities, evaluation of revascularization effects, and vascular alterations in normal aging. In addition, cerebrovascular maps obtained with the proposed method exhibited excellent reproducibility in both healthy volunteers and stroke patients. Deep-learning resting-state vascular imaging has the potential to become a useful tool in clinical cerebrovascular imaging.

Submitted 25 April, 2022; originally announced April 2022.

Journal ref: npj Digital Medicine (2023) 116

arXiv:2204.08974 [pdf, other]

A comparison of different atmospheric turbulence simulation methods for image restoration

Authors: Nithin Gopalakrishnan Nair, Kangfu Mei, Vishal M. Patel

Abstract: Atmospheric turbulence deteriorates the quality of images captured by long-range imaging systems by introducing blur and geometric distortions to the captured scene. This leads to a drastic drop in performance when computer vision algorithms like object/face recognition and detection are performed on these images. In recent years, various deep learning-based atmospheric turbulence mitigation methods have been proposed in the literature. These methods are often trained using synthetically generated images and tested on real-world images. Hence, the performance of these restoration methods depends on the type of simulation used for training the network. In this paper, we systematically evaluate the effectiveness of various turbulence simulation methods on image restoration. In particular, we evaluate the performance of two state-of-the-art restoration networks using six simulation methods on a real-world LRFID dataset consisting of face images degraded by turbulence. This paper will provide guidance to researchers and practitioners working in this field to choose the suitable data generation models for training deep models for turbulence mitigation. The implementation codes for the simulation methods, source codes for the networks, and the pre-trained models will be made publicly available.

Submitted 19 April, 2022; originally announced April 2022.

arXiv:2204.08454 [pdf, other]

Revisiting Consistency Regularization for Semi-supervised Change Detection in Remote Sensing Images

Authors: Wele Gedara Chaminda Bandara, Vishal M. Patel

Abstract: Remote-sensing (RS) Change Detection (CD) aims to detect "changes of interest" from co-registered bi-temporal images. The performance of existing deep supervised CD methods is attributed to the large amounts of annotated data used to train the networks. However, annotating large amounts of remote sensing images is labor-intensive and expensive, particularly with bi-temporal images, as it requires pixel-wise comparisons by a human expert. On the other hand, we often have access to unlimited unlabeled multi-temporal RS imagery thanks to ever-increasing earth observation programs. In this paper, we propose a simple yet effective way to leverage the information from unlabeled bi-temporal images to improve the performance of CD approaches. More specifically, we propose a semi-supervised CD model in which we formulate an unsupervised CD loss in addition to the supervised Cross-Entropy (CE) loss by constraining the output change probability map of a given unlabeled bi-temporal image pair to be consistent under the small random perturbations applied to the deep feature difference map that is obtained by subtracting their latent feature representations. Experiments conducted on two publicly available CD datasets show that the proposed semi-supervised CD method can approach the performance of supervised CD even with access to as little as 10% of the annotated training data. Code available at https://github.com/wgcban/SemiCD

Submitted 21 April, 2022; v1 submitted 18 April, 2022; originally announced April 2022.

Comments: Code available at https://github.com/wgcban/SemiCD; 36 pages

arXiv:2203.06338 [pdf, other]

Auto-FedRL: Federated Hyperparameter Optimization for Multi-institutional Medical Image Segmentation

Authors: Pengfei Guo, Dong Yang, Ali Hatamizadeh, An Xu, Ziyue Xu, Wenqi Li, Can Zhao, Daguang Xu, Stephanie Harmon, Evrim Turkbey, Baris Turkbey, Bradford Wood, Francesca Patella, Elvira Stellato, Gianpaolo Carrafiello, Vishal M. Patel, Holger R. Roth

Abstract: Federated learning (FL) is a distributed machine learning technique that enables collaborative model training while avoiding explicit data sharing. The inherent privacy-preserving property of FL algorithms makes them especially attractive to the medical field. However, in the case of heterogeneous client data distributions, standard FL methods are unstable and require intensive hyperparameter tuning to achieve optimal performance. Conventional hyperparameter optimization algorithms are impractical in real-world FL applications as they involve numerous training trials, which are often not affordable with limited compute budgets. In this work, we propose an efficient reinforcement learning (RL)-based federated hyperparameter optimization algorithm, termed Auto-FedRL, in which an online RL agent can dynamically adjust hyperparameters of each client based on the current training progress. Extensive experiments are conducted to investigate different search strategies and RL agents. The effectiveness of the proposed method is validated on a heterogeneous data split of the CIFAR-10 dataset as well as two real-world medical image segmentation datasets for COVID-19 lesion segmentation in chest CT and pancreas segmentation in abdominal CT.

Submitted 31 August, 2022; v1 submitted 11 March, 2022; originally announced March 2022.

arXiv:2203.05574 [pdf, other]

On-the-Fly Test-time Adaptation for Medical Image Segmentation

Authors: Jeya Maria Jose Valanarasu, Pengfei Guo, Vibashan VS, Vishal M. Patel

Abstract: One major problem in deep learning-based solutions for medical imaging is the drop in performance when a model is tested on a data distribution different from the one that it is trained on. Adapting the source model to the target data distribution at test-time is an efficient solution for the data-shift problem. Previous methods solve this by adapting the model to the target distribution by using techniques like entropy minimization or regularization. In these methods, the models are still updated by back-propagation using an unsupervised loss on the complete test data distribution. In real-world clinical settings, it makes more sense to adapt a model to a new test image on-the-fly and avoid model updates during inference due to privacy concerns and lack of computing resources at deployment. To this end, we propose a new setting, On-the-Fly Adaptation, which is zero-shot and episodic (i.e., the model is adapted to a single image at a time and also does not perform any back-propagation during test-time). To achieve this, we propose a new framework called Adaptive UNet where each convolutional block is equipped with an adaptive batch normalization layer to adapt the features with respect to a domain code. The domain code is generated using a pre-trained encoder trained on a large corpus of medical images. During test-time, the model takes in just the new test image and generates a domain code to adapt the features of the source model according to the test data. We validate the performance on both 2D and 3D data distribution shifts, where it achieves better performance than previous test-time adaptation methods. Code is available at https://github.com/jeya-maria-jose/On-The-Fly-Adaptation

Submitted 10 March, 2022; originally announced March 2022.

Comments: Tech Report

arXiv:2203.04967 [pdf, other]

UNeXt: MLP-based Rapid Medical Image Segmentation Network

Authors: Jeya Maria Jose Valanarasu, Vishal M. Patel

Abstract: UNet and its latest extensions like TransUNet have been the leading medical image segmentation methods in recent years. However, these networks cannot be effectively adopted for rapid image segmentation in point-of-care applications as they are parameter-heavy, computationally complex and slow to use. To this end, we propose UNeXt, a convolutional multilayer perceptron (MLP)-based network for image segmentation. We design UNeXt in an effective way with an early convolutional stage and an MLP stage in the latent stage. We propose a tokenized MLP block where we efficiently tokenize and project the convolutional features and use MLPs to model the representation. To further boost the performance, we propose shifting the channels of the inputs while feeding them into the MLPs so as to focus on learning local dependencies. Using tokenized MLPs in latent space reduces the number of parameters and computational complexity while yielding a better representation to help segmentation. The network also consists of skip connections between various levels of encoder and decoder. We test UNeXt on multiple medical image segmentation datasets and show that we reduce the number of parameters by 72x, decrease the computational complexity by 68x, and improve the inference speed by 10x while also obtaining better segmentation performance over the state-of-the-art medical image segmentation architectures. Code is available at https://github.com/jeya-maria-jose/UNeXt-pytorch

Submitted 9 March, 2022; originally announced March 2022.

Comments: Tech Report

arXiv:2203.04292 [pdf, other]

Towards performant and reliable undersampled MR reconstruction via diffusion model sampling

Authors: Cheng Peng, Pengfei Guo, S. Kevin Zhou, Vishal Patel, Rama Chellappa

Abstract: Magnetic Resonance (MR) image reconstruction from under-sampled acquisition promises faster scanning time. To this end, current state-of-the-art (SoTA) approaches leverage deep neural networks and supervised training to learn a recovery model. While these approaches achieve impressive performance, the learned model can be fragile on unseen degradation, e.g. when given a different acceleration factor. These methods are also generally deterministic and provide a single solution to an ill-posed problem; as such, it can be difficult for practitioners to understand the reliability of the reconstruction. We introduce DiffuseRecon, a novel diffusion model-based MR reconstruction method. DiffuseRecon guides the generation process based on the observed signals and a pre-trained diffusion model, and does not require additional training on specific acceleration factors. DiffuseRecon is stochastic in nature and generates results from a distribution of fully-sampled MR images; as such, it allows us to explicitly visualize different potential reconstruction solutions. Lastly, DiffuseRecon proposes an accelerated, coarse-to-fine Monte-Carlo sampling scheme to approximate the most likely reconstruction candidate. The proposed DiffuseRecon achieves SoTA performance when reconstructing from raw acquisition signals in fastMRI and SKM-TEA. Code will be open-sourced at www.github.com/cpeng93/DiffuseRecon.

Submitted 10 March, 2022; v1 submitted 7 March, 2022; originally announced March 2022.

arXiv:2203.02503 [pdf, other]

HyperTransformer: A Textural and Spectral Feature Fusion Transformer for Pansharpening

Authors: Wele Gedara Chaminda Bandara, Vishal M. Patel

Abstract: Pansharpening aims to fuse a registered high-resolution panchromatic image (PAN) with a low-resolution hyperspectral image (LR-HSI) to generate an enhanced HSI with high spectral and spatial resolution. Existing pansharpening approaches neglect using an attention mechanism to transfer HR texture features from PAN to LR-HSI features, resulting in spatial and spectral distortions. In this paper, we present a novel attention mechanism for pansharpening called HyperTransformer, in which features of LR-HSI and PAN are formulated as queries and keys in a transformer, respectively. HyperTransformer consists of three main modules, namely two separate feature extractors for PAN and HSI, a multi-head feature soft attention module, and a spatial-spectral feature fusion module. Such a network improves both spatial and spectral quality measures of the pansharpened HSI by learning cross-feature space dependencies and long-range details of PAN and LR-HSI. Furthermore, HyperTransformer can be utilized across multiple spatial scales at the backbone for obtaining improved performance. Extensive experiments conducted on three widely used datasets demonstrate that HyperTransformer achieves significant improvement over the state-of-the-art methods on both spatial and spectral quality measures. Implementation code and pre-trained weights can be accessed at https://github.com/wgcban/HyperTransformer.

Submitted 28 March, 2022; v1 submitted 4 March, 2022; originally announced March 2022.

Comments: Accepted at CVPR'22. Project page: https://www.wgcban.com/research#h.ar24vwqlm021 Code available at: https://github.com/wgcban/HyperTransformer

arXiv:2201.09376 [pdf, other]

ReconFormer: Accelerated MRI Reconstruction Using Recurrent Transformer

Authors: Pengfei Guo, Yiqun Mei, Jinyuan Zhou, Shanshan Jiang, Vishal M. Patel

Abstract: Accelerating the magnetic resonance image (MRI) reconstruction process is a challenging ill-posed inverse problem due to the excessive under-sampling operation in k-space. In this paper, we propose a recurrent transformer model, namely ReconFormer, for MRI reconstruction which can iteratively reconstruct high-fidelity magnetic resonance images from highly under-sampled k-space data. In particular, the proposed architecture is built upon Recurrent Pyramid Transformer Layers (RPTL), which jointly exploit intrinsic multi-scale information at every architecture unit as well as the dependencies of the deep feature correlation through recurrent states. Moreover, the proposed ReconFormer is lightweight since it employs the recurrent structure for its parameter efficiency. We validate the effectiveness of ReconFormer on multiple datasets with different magnetic resonance sequences and show that it achieves significant improvements over the state-of-the-art methods with better parameter efficiency. Implementation code will be available at https://github.com/guopengf/ReconFormer.

Submitted 27 January, 2022; v1 submitted 23 January, 2022; originally announced January 2022.

arXiv:2201.09355 [pdf, ps, other]

Transformer-based SAR Image Despeckling

Authors: Malsha V. Perera, Wele Gedara Chaminda Bandara, Jeya Maria Jose Valanarasu, Vishal M. Patel

Abstract: Synthetic Aperture Radar (SAR) images are usually degraded by a multiplicative noise known as speckle, which makes processing and interpretation of SAR images difficult. In this paper, we introduce a transformer-based network for SAR image despeckling. The proposed despeckling network comprises a transformer-based encoder which allows the network to learn global dependencies between different image regions, aiding in better despeckling. The network is trained end-to-end with synthetically generated speckled images using a composite loss function. Experiments show that the proposed method achieves significant improvements over traditional and convolutional neural network-based despeckling methods on both synthetic and real SAR images.

Submitted 23 January, 2022; originally announced January 2022.

Comments: Submitted to International Geoscience and Remote Sensing Symposium (IGARSS), 2022. Our code is available at https://github.com/malshaV/sar_transformer

arXiv:2107.12775 [pdf, other]

doi 10.1007/978-3-030-87583-1_18

Realistic Ultrasound Image Synthesis for Improved Classification of Liver Disease

Authors: Hui Che, Sumana Ramanathan, David Foran, John L Nosher, Vishal M Patel, Ilker Hacihaliloglu

Abstract: With the success of deep learning-based methods applied in medical image analysis, convolutional neural networks (CNNs) have been investigated for classifying liver disease from ultrasound (US) data. However, the scarcity of available large-scale labeled US data has hindered the success of CNNs for classifying liver disease from US data. In this work, we propose a novel generative adversarial network (GAN) architecture for realistic diseased and healthy liver US image synthesis. We adopt the concept of stacking to synthesize realistic liver US data. Quantitative and qualitative evaluation is performed on 550 in-vivo B-mode liver US images collected from 55 subjects. We also show that the synthesized images, together with real in vivo data, can be used to significantly improve the performance of traditional CNN architectures for Nonalcoholic fatty liver disease (NAFLD) classification.

Submitted 27 July, 2021; originally announced July 2021.

Comments: Accepted for presentation at the 2021 MICCAI-International Workshop of Advances in Simplifying Medical UltraSound (ASMUS2021)

arXiv:2107.02630 [pdf, other]

Hyperspectral Pansharpening Based on Improved Deep Image Prior and Residual Reconstruction

Authors: Wele Gedara Chaminda Bandara, Jeya Maria Jose Valanarasu, Vishal M. Patel

Abstract: Hyperspectral pansharpening aims to synthesize a low-resolution hyperspectral image (LR-HSI) with a registered panchromatic image (PAN) to generate an enhanced HSI with high spectral and spatial resolution. Recently proposed HS pansharpening methods have obtained remarkable results using deep convolutional networks (ConvNets), which typically consist of three steps: (1) up-sampling the LR-HSI, (2) predicting the residual image via a ConvNet, and (3) obtaining the final fused HSI by adding the outputs from the first and second steps. Recent methods have leveraged Deep Image Prior (DIP) to up-sample the LR-HSI due to its excellent ability to preserve both spatial and spectral information, without learning from large data sets. However, we observed that the quality of up-sampled HSIs can be further improved by introducing an additional spatial-domain constraint to the conventional spectral-domain energy function. We define our spatial-domain constraint as the $L_1$ distance between the predicted PAN image and the actual PAN image. To estimate the PAN image of the up-sampled HSI, we also propose a learnable spectral response function (SRF). Moreover, we noticed that the residual image between the up-sampled HSI and the reference HSI mainly consists of edge information and very fine structures. In order to accurately estimate fine information, we propose a novel over-complete network, called HyperKite, which focuses on learning high-level features by constraining the receptive field from increasing in the deep layers. We perform experiments on three HSI datasets to demonstrate the superiority of our DIP-HyperKite over the state-of-the-art pansharpening methods. The deployment codes, pre-trained models, and final fusion outputs of our DIP-HyperKite and the methods used for the comparisons will be made publicly available at https://github.com/wgcban/DIP-HyperKite.git.

Submitted 6 July, 2021; originally announced July 2021.

arXiv:2106.08886 [pdf, other]

Over-and-Under Complete Convolutional RNN for MRI Reconstruction

Authors: Pengfei Guo, Jeya Maria Jose Valanarasu, Puyang Wang, Jinyuan Zhou, Shanshan Jiang, Vishal M. Patel

Abstract: Reconstructing magnetic resonance (MR) images from undersampled data is a challenging problem due to various artifacts introduced by the under-sampling operation. Recent deep learning-based methods for MR image reconstruction usually leverage a generic auto-encoder architecture which captures low-level features at the initial layers and high-level features at the deeper layers. Such networks focus heavily on global features, which may not be optimal for reconstructing the fully-sampled image. In this paper, we propose an Over-and-Under Complete Convolutional Recurrent Neural Network (OUCR), which consists of an overcomplete and an undercomplete Convolutional Recurrent Neural Network (CRNN). The overcomplete branch pays special attention to learning local structures by restraining the receptive field of the network. Combining it with the undercomplete branch leads to a network which focuses more on low-level features without losing out on the global structures. Extensive experiments on two datasets demonstrate that the proposed method achieves significant improvements over the compressed sensing and popular deep learning-based methods with fewer trainable parameters.

Submitted 24 June, 2021; v1 submitted 16 June, 2021; originally announced June 2021.

Comments: Accepted to MICCAI 2021

arXiv:2103.02148 [pdf, other]

Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning

Authors: Pengfei Guo, Puyang Wang, Jinyuan Zhou, Shanshan Jiang, Vishal M. Patel

Abstract: Fast and accurate reconstruction of magnetic resonance (MR) images from under-sampled data is important in many clinical applications. In recent years, deep learning-based methods have been shown to produce superior performance on MR image reconstruction. However, these methods require large amounts of data, which are difficult to collect and share due to the high cost of acquisition and medical data privacy regulations. In order to overcome this challenge, we propose a federated learning (FL) based solution in which we take advantage of the MR data available at different institutions while preserving patients' privacy. However, the generalizability of models trained with the FL setting can still be suboptimal due to domain shift, which results from the data being collected at multiple institutions with different sensors, disease types, acquisition protocols, etc. With the motivation of circumventing this challenge, we propose a cross-site modeling approach for MR image reconstruction in which the learned intermediate latent features among different source sites are aligned with the distribution of the latent features at the target site. Extensive experiments are conducted to provide various insights about FL for MR image reconstruction. Experimental results demonstrate that the proposed framework is a promising direction to utilize multi-institutional data without compromising patients' privacy for achieving improved MR image reconstruction. Our code will be available at https://github.com/guopengf/FLMRCM.

Submitted 10 March, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

Comments: Accepted at CVPR 2021

arXiv:2102.11942 [pdf, other]

doi 10.1109/EMBC46164.2021.9631069

Multi-Feature Multi-Scale CNN-Derived COVID-19 Classification from Lung Ultrasound Data

Authors: Hui Che, Jared Radbel, Jag Sunderram, John L. Nosher, Vishal M. Patel, Ilker Hacihaliloglu

Abstract: The global pandemic of the novel coronavirus disease 2019 (COVID-19) has put tremendous pressure on the medical system. Imaging plays a complementary role in the management of patients with COVID-19. Computed tomography (CT) and chest X-ray (CXR) are the two dominant screening tools. However, difficulty in eliminating the risk of disease transmission, radiation exposure and not being cost-effective are some of the challenges for CT and CXR imaging. This fact motivates the use of lung ultrasound (LUS) for evaluating COVID-19 due to its practical advantages of noninvasiveness, repeatability, and sensitive bedside property. In this paper, we utilize a deep learning model to perform the classification of COVID-19 from LUS data, which could produce objective diagnostic information for clinicians. Specifically, all LUS images are processed to obtain their corresponding local phase filtered images and radial symmetry transformed images before being fed into the multi-scale residual convolutional neural network (CNN). Secondly, image combination as the input of the network is used to explore rich and reliable features. A feature fusion strategy at different levels is adopted to investigate the relationship between the depth of feature aggregation and the classification accuracy. Our proposed method is evaluated on the point-of-care US (POCUS) dataset together with the Italian COVID-19 Lung US database (ICLUS-DB) and shows promising performance for COVID-19 prediction.

Submitted 23 February, 2021; originally announced February 2021.

arXiv:2101.09451 [pdf, other]

Error Diffusion Halftoning Against Adversarial Examples

Authors: Shao-Yuan Lo, Vishal M. Patel

Abstract: Adversarial examples contain carefully crafted perturbations that can fool deep neural networks (DNNs) into making wrong predictions. Enhancing the adversarial robustness of DNNs has gained considerable interest in recent years. Although image transformation-based defenses were widely considered at an earlier time, most of them have been defeated by adaptive attacks. In this paper, we propose a new image transformation defense based on error diffusion halftoning, and combine it with adversarial training to defend against adversarial examples. Error diffusion halftoning projects an image into a 1-bit space and diffuses quantization error to neighboring pixels. This process can remove adversarial perturbations from a given image while maintaining image quality that is acceptable for recognition. Experimental results demonstrate that the proposed method is able to improve adversarial robustness even under advanced adaptive attacks, while most of the other image transformation-based defenses do not. We show that a proper image transformation can still be an effective defense approach. Code: https://github.com/shaoyuanlo/Halftoning-Defense

Submitted 24 July, 2021; v1 submitted 23 January, 2021; originally announced January 2021.

Comments: Accepted at IEEE International Conference on Image Processing (ICIP) 2021
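
The abstract above describes error diffusion halftoning as projecting an image into a 1-bit space and diffusing the quantization error to neighboring pixels. A minimal sketch of that idea follows. It assumes the Floyd-Steinberg diffusion weights (7/16, 3/16, 5/16, 1/16) and a fixed 0.5 threshold, neither of which is stated in the abstract, and the function name `error_diffusion_halftone` is hypothetical. It is an illustration of the general technique, not the authors' implementation.

```python
def error_diffusion_halftone(image):
    """Halftone a grayscale image by error diffusion.

    `image` is a list of rows of floats in [0, 1].
    Returns a list of rows of ints, each 0 or 1.
    Assumption: Floyd-Steinberg weights and a 0.5 threshold.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    # Working copy that accumulates the diffused error; the input is not modified.
    work = [list(row) for row in image]
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = 1 if old >= 0.5 else 0  # quantize to 1 bit
            out[y][x] = new
            err = old - new  # quantization error at this pixel
            # Push the error onto pixels that have not been visited yet.
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out
```

Calling `error_diffusion_halftone([[0.5] * 8 for _ in range(8)])` on a uniform mid-gray patch returns a mix of 0s and 1s rather than a solid block, which is the property that lets a 1-bit image approximate intermediate gray levels.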

arXiv:2012.04262 [pdf, other]

Overcomplete Representations Against Adversarial Videos

Authors: Shao-Yuan Lo, Jeya Maria Jose Valanarasu, Vishal M. Patel

Abstract: Adversarial robustness of deep neural networks is an extensively studied problem in the literature and various methods have been proposed to defend against adversarial images. However, only a handful of defense methods have been developed for defending against attacked videos. In this paper, we propose a novel Over-and-Under complete restoration network for Defending against adversarial videos (OUDefend). Most restoration networks adopt an encoder-decoder architecture that first shrinks the spatial dimension and then expands it back. This approach learns undercomplete representations, which have large receptive fields to collect global information but overlook local details. On the other hand, overcomplete representations have opposite properties. Hence, OUDefend is designed to balance local and global features by learning those two representations. We attach OUDefend to target video recognition models as a feature restoration block and train the entire network end-to-end. Experimental results show that defenses focusing on images may be ineffective for videos, while OUDefend enhances robustness against different types of adversarial videos, ranging from additive and multiplicative attacks to physically realizable attacks. Code: https://github.com/shaoyuanlo/OUDefend

Submitted 14 June, 2021; v1 submitted 8 December, 2020; originally announced December 2020.

Comments: Accepted at IEEE International Conference on Image Processing (ICIP) 2021

arXiv:2010.10661 [pdf, other]

doi 10.1109/JSTSP.2020.3039393

Exploring Overcomplete Representations for Single Image Deraining using CNNs

Authors: Rajeev Yasarla, Jeya Maria Jose Valanarasu, Vishal M. Patel

Abstract: Removal of rain streaks from a single image is an extremely challenging problem since rainy images often contain rain streaks of different size, shape, direction and density. Most recent methods for deraining use a deep network following a generic "encoder-decoder" architecture which captures low-level features across the initial layers and high-level features in the deeper layers. For the task of deraining, the rain streaks which are to be removed are relatively small, and focusing heavily on global features is not an efficient way to solve the problem. To this end, we propose using an overcomplete convolutional network architecture which pays special attention to learning local structures by restraining the receptive field of filters. We combine it with U-Net so that it does not lose out on the global structures while focusing more on low-level features, to compute the derained image. The proposed network, called the Over-and-Under Complete Deraining Network (OUCD), consists of two branches: an overcomplete branch which is confined to a small receptive field size in order to focus on the local structures, and an undercomplete branch that has larger receptive fields to primarily focus on global structures. Extensive experiments on synthetic and real datasets demonstrate that the proposed method achieves significant improvements over the recent state-of-the-art methods.

Submitted 20 October, 2020; originally announced October 2020.

Report number: J-STSP-DLIVRC-00060-2020

Journal ref: IEEE Journal of Selected Topics in Signal Processing, 2020

arXiv:2010.01663 [pdf, other]

KiU-Net: Overcomplete Convolutional Architectures for Biomedical Image and Volumetric Segmentation

Authors: Jeya Maria Jose Valanarasu, Vishwanath A. Sindagi, Ilker Hacihaliloglu, Vishal M. Patel

Abstract: Most methods for medical image segmentation use U-Net or its variants as they have been successful in most of the applications. After a detailed analysis of these "traditional" encoder-decoder based approaches, we observed that they perform poorly in detecting smaller structures and are unable to segment boundary regions precisely. This issue can be attributed to the increase in receptive field size as we go deeper into the encoder. The extra focus on learning high level features causes the U-Net based approaches to learn less information about low-level features which are crucial for detecting small structures. To overcome this issue, we propose using an overcomplete convolutional architecture where we project our input image into a higher dimension such that we constrain the receptive field from increasing in the deep layers of the network. We design a new architecture for image segmentation, KiU-Net, which has two branches: (1) an overcomplete convolutional network Kite-Net which learns to capture fine details and accurate edges of the input, and (2) U-Net which learns high level features. Furthermore, we also propose KiU-Net 3D which is a 3D convolutional architecture for volumetric segmentation. We perform a detailed study of KiU-Net by performing experiments on five different datasets covering various image modalities like ultrasound (US), magnetic resonance imaging (MRI), computed tomography (CT), microscopic and fundus images. The proposed method achieves better performance compared to all the recent methods, with the additional benefit of fewer parameters and faster convergence. Additionally, we also demonstrate that the extensions of KiU-Net based on residual blocks and dense blocks result in further performance improvements. The implementation of KiU-Net can be found here: https://github.com/jeya-maria-jose/KiU-Net-pytorch

Submitted 14 October, 2021; v1 submitted 4 October, 2020; originally announced October 2020.

Comments: Journal Extension of KiU-Net (MICCAI-2020)

arXiv:2008.02859 [pdf, other]

Confidence-guided Lesion Mask-based Simultaneous Synthesis of Anatomic and Molecular MR Images in Patients with Post-treatment Malignant Gliomas

Authors: Pengfei Guo, Puyang Wang, Rajeev Yasarla, Jinyuan Zhou, Vishal M. Patel, Shanshan Jiang

Abstract: Data-driven automatic approaches have demonstrated their great potential in resolving various clinical diagnostic dilemmas in neuro-oncology, especially with the help of standard anatomic and advanced molecular MR images. However, data quantity and quality remain a key determinant of, and a significant limit on, the potential of such applications. In our previous work, we explored the synthesis of anatomic and molecular MR images network (SAMR) in patients with post-treatment malignant gliomas. Now, we extend it and propose Confidence Guided SAMR (CG-SAMR) that synthesizes data from lesion information to multi-modal anatomic sequences, including T1-weighted (T1w), gadolinium enhanced T1w (Gd-T1w), T2-weighted (T2w), and fluid-attenuated inversion recovery (FLAIR), and the molecular amide proton transfer-weighted (APTw) sequence. We introduce a module which guides the synthesis based on a confidence measure of the intermediate results. Furthermore, we extend the proposed architecture for unsupervised synthesis so that unpaired data can be used for training the network. Extensive experiments on real clinical data demonstrate that the proposed model can perform better than the state-of-the-art synthesis methods.

Submitted 6 August, 2020; originally announced August 2020.

Comments: Submitted to IEEE TMI. arXiv admin note: text overlap with arXiv:2006.14761

arXiv:2006.14761 [pdf, other]

Lesion Mask-based Simultaneous Synthesis of Anatomic and Molecular MR Images using a GAN

Authors: Pengfei Guo, Puyang Wang, Jinyuan Zhou, Vishal M. Patel, Shanshan Jiang

Abstract: Data-driven automatic approaches have demonstrated their great potential in resolving various clinical diagnostic dilemmas for patients with malignant gliomas in neuro-oncology with the help of conventional and advanced molecular MR images. However, the lack of sufficient annotated MRI data has vastly impeded the development of such automatic methods. Conventional data augmentation approaches, including flipping, scaling, rotation, and distortion, are not capable of generating data with diverse image content. In this paper, we propose a method, called synthesis of anatomic and molecular MR images network (SAMR), which can simultaneously synthesize data from arbitrary manipulated lesion information on multiple anatomic and molecular MRI sequences, including T1-weighted (T1w), gadolinium enhanced T1w (Gd-T1w), T2-weighted (T2w), fluid-attenuated inversion recovery (FLAIR), and amide proton transfer-weighted (APTw). The proposed framework consists of a stretch-out up-sampling module, a brain atlas encoder, a segmentation consistency module, and multi-scale label-wise discriminators. Extensive experiments on real clinical data demonstrate that the proposed model can perform significantly better than the state-of-the-art synthesis methods.

Submitted 26 August, 2020; v1 submitted 25 June, 2020; originally announced June 2020.

Comments: MICCAI 2020

arXiv:2006.04878 [pdf, other]

KiU-Net: Towards Accurate Segmentation of Biomedical Images using Over-complete Representations

Authors: Jeya Maria Jose, Vishwanath Sindagi, Ilker Hacihaliloglu, Vishal M. Patel

Abstract: Due to its excellent performance, U-Net has been the most widely used backbone architecture for biomedical image segmentation in recent years. However, in our studies, we observe that there is a considerable performance drop in the case of detecting smaller anatomical landmarks with blurred noisy boundaries. We analyze this issue in detail, and address it by proposing an over-complete architecture (Ki-Net) which involves projecting the data onto higher dimensions (in the spatial sense). This network, when augmented with U-Net, results in significant improvements in the case of segmenting small anatomical landmarks and blurred noisy boundaries while obtaining better overall performance. Furthermore, the proposed network has additional benefits like faster convergence and fewer parameters. We evaluate the proposed method on the task of brain anatomy segmentation from 2D Ultrasound (US) of preterm neonates, and achieve an improvement of around 4% in terms of the DICE accuracy and Jaccard index as compared to the standard U-Net, while outperforming the recent best methods by 2%. Code: https://github.com/jeya-maria-jose/KiU-Net-pytorch

Submitted 8 July, 2020; v1 submitted 8 June, 2020; originally announced June 2020.

Comments: Accepted at MICCAI 2020

arXiv:2004.13964 [pdf, other]

Sequential Bayesian Parameter Estimation of Stochastic Dynamic Load Models

Authors: Daniel Adrian Maldonado, Vishwas Rao, Mihai Anitescu, Vivak Patel

Abstract: In this paper we focus on the parameter estimation of dynamic load models with stochastic terms, in particular, load models where protection settings are uncertain, such as in aggregated air conditioning units. We show how the uncertainty in the aggregated protection characteristics can be formulated as a stochastic differential equation with process noise. We cast the parameter inversion within a Bayesian parameter estimation framework, and we present methods to include process noise. We demonstrate the benefits of considering stochasticity in the parameter estimation and the risks of ignoring it.

Submitted 29 April, 2020; originally announced April 2020.

Comments: 7 pages, 9 figures, to be published in Electric Power Systems Research PSCC 2020

MSC Class: 93E11 (Primary)

arXiv:2001.05745 [pdf, other]

doi 10.1016/j.ijhcs.2020.102394

A Technology-aided Multi-modal Training Approach to Assist Abdominal Palpation Training and its Assessment in Medical Education

Authors: A. Asadipour, K. Debattista, V. Patel, A. Chalmers

Abstract: Computer-assisted multimodal training is an effective way of learning complex motor skills in various applications. In particular disciplines (e.g. healthcare) incompetency in performing dexterous hands-on examinations (clinical palpation) may result in misdiagnosis of symptoms, serious injuries or even death. Furthermore, a high quality clinical examination can help to exclude significant pathology, and reduce time and cost of diagnosis by eliminating the need for unnecessary medical imaging. Medical palpation is used regularly as an effective preliminary diagnosis method all around the world but years of training are required currently to achieve competency. This paper focuses on a multimodal palpation training system to teach and improve clinical examination skills in relation to the abdomen. It is our aim to shorten significantly the palpation training duration by increasing the frequency of rehearsals as well as providing essential augmented feedback on how to perform various abdominal palpation techniques which has been captured and modelled from medical experts. Twenty three first year medical students divided into a control group (n=8), a semi-visually trained group (n=8), and a fully visually trained group (n=7) were invited to perform three palpation tasks (superficial, deep and liver). The medical students' performances were assessed using both computer-based and human-based methods where a positive correlation was shown between the generated scores, r=.62, p(one-tailed)<.05. The visually-trained group significantly outperformed the control group in which abstract visualisation of applied forces and their palmar locations were provided to the students during each palpation examination (p<.05). Moreover, a positive trend was observed between groups when visual feedback was presented, J=132, z=2.62, r=0.55.

Submitted 16 January, 2020; originally announced January 2020.

Comments: In Press

arXiv:1912.08364 [pdf, other]

Learning to Segment Brain Anatomy from 2D Ultrasound with Less Data

Authors: Jeya Maria Jose V., Rajeev Yasarla, Puyang Wang, Ilker Hacihaliloglu, Vishal M. Patel

Abstract: Automatic segmentation of anatomical landmarks from ultrasound (US) plays an important role in the management of preterm neonates with a very low birth weight due to the increased risk of developing intraventricular hemorrhage (IVH) or other complications. One major problem in developing an automatic segmentation method for this task is the limited availability of annotated data. To tackle this issue, we propose a novel image synthesis method using multi-scale self attention generator to synthesize US images from various segmentation masks. We show that our method can synthesize high-quality US images for every manipulated segmentation label with qualitative and quantitative improvements over the recent state-of-the-art synthesis methods. Furthermore, for the segmentation task, we propose a novel method, called Confidence-guided Brain Anatomy Segmentation (CBAS) network, where segmentation and corresponding confidence maps are estimated at different scales. In addition, we introduce a technique which guides CBAS to learn the weights based on the confidence measure about the estimate. Extensive experiments demonstrate that the proposed method for both synthesis and segmentation tasks achieve significant improvements over the recent state-of-the-art methods. In particular, we show that the new synthesis framework can be used to generate realistic US images which can be used to improve the performance of a segmentation algorithm.

Submitted 17 December, 2019; originally announced December 2019.

arXiv:1909.04207 [pdf, other]

doi 10.1109/TIP.2020.2973802

Confidence Measure Guided Single Image De-raining

Authors: Rajeev Yasarla, Vishal M. Patel

Abstract: Single image de-raining is an extremely challenging problem since the rainy images contain rain streaks which often vary in size, direction and density. This varying characteristic of rain streaks affects different parts of the image differently. Previous approaches have attempted to address this problem by leveraging some prior information to remove rain streaks from a single image. One of the major limitations of these approaches is that they do not consider the location information of rain drops in the image. The proposed Image Quality-based single image Deraining using Confidence measure (QuDeC) network addresses this issue by learning the quality or distortion level of each patch in the rainy image, and further processes this information to learn the rain content at different scales. In addition, we introduce a technique which guides the network to learn the network weights based on the confidence measure about the estimate of both quality at each location and residual rain streak information (residual map). Extensive experiments on synthetic and real datasets demonstrate that the proposed method achieves significant improvements over the recent state-of-the-art methods.

Submitted 9 September, 2019; originally announced September 2019.

Comments: TIP2019 submission. arXiv admin note: substantial text overlap with arXiv:1906.11129

arXiv:1906.11129 [pdf, other]

Uncertainty Guided Multi-Scale Residual Learning-using a Cycle Spinning CNN for Single Image De-Raining

Authors: Rajeev Yasarla, Vishal M. Patel

Abstract: Single image de-raining is an extremely challenging problem since the rainy image may contain rain streaks which may vary in size, direction and density. Previous approaches have attempted to address this problem by leveraging some prior information to remove rain streaks from a single image. One of the major limitations of these approaches is that they do not consider the location information of rain drops in the image. The proposed Uncertainty guided Multi-scale Residual Learning (UMRL) network attempts to address this issue by learning the rain content at different scales and using them to estimate the final de-rained output. In addition, we introduce a technique which guides the network to learn the network weights based on the confidence measure about the estimate. Furthermore, we introduce a new training and testing procedure based on the notion of cycle spinning to improve the final de-raining performance. Extensive experiments on synthetic and real datasets demonstrate that the proposed method achieves significant improvements over the recent state-of-the-art methods. Code is available at: https://github.com/rajeevyasarla/UMRL--using-Cycle-Spinning

Submitted 12 June, 2019; originally announced June 2019.

Comments: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019

arXiv:1803.08396 [pdf, other]

Densely Connected Pyramid Dehazing Network

Authors: He Zhang, Vishal M. Patel

Abstract: We propose a new end-to-end single image dehazing method, called Densely Connected Pyramid Dehazing Network (DCPDN), which can jointly learn the transmission map, atmospheric light and dehazing all together. The end-to-end learning is achieved by directly embedding the atmospheric scattering model into the network, thereby ensuring that the proposed method strictly follows the physics-driven scattering model for dehazing. Inspired by the dense network that can maximize the information flow along features from different levels, we propose a new edge-preserving densely connected encoder-decoder structure with multi-level pyramid pooling module for estimating the transmission map. This network is optimized using a newly introduced edge-preserving loss function. To further incorporate the mutual structural information between the estimated transmission map and the dehazed result, we propose a joint-discriminator based on generative adversarial network framework to decide whether the corresponding dehazed image and the estimated transmission map are real or fake. An ablation study is conducted to demonstrate the effectiveness of each module evaluated at both estimated transmission map and dehazed result. Extensive experiments demonstrate that the proposed method achieves significant improvements over the state-of-the-art methods. Code will be made available at: https://github.com/hezhangsprinter

Submitted 22 March, 2018; originally announced March 2018.
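
Note on the DCPDN entry above: the abstract refers to embedding the atmospheric scattering model into the network. A minimal per-pixel sketch of that model in its commonly stated form, I = J·t + A·(1 − t), is given below. The function names, the `t_min` clamp, and the sample values are illustrative assumptions and are not taken from the paper or its code.

```python
def hazy_pixel(clear, transmission, airlight):
    """Forward scattering model: I = J * t + A * (1 - t)."""
    return clear * transmission + airlight * (1.0 - transmission)


def dehaze_pixel(hazy, transmission, airlight, t_min=0.1):
    """Invert the model for J; t_min (an assumed clamp) avoids division by ~0."""
    t = max(transmission, t_min)
    return (hazy - airlight * (1.0 - t)) / t


# Hypothetical values: scene radiance J, transmission t, atmospheric light A.
clear, t, a = 0.25, 0.6, 0.9
hazy = hazy_pixel(clear, t, a)
recovered = dehaze_pixel(hazy, t, a)
```

With a known transmission and atmospheric light the inversion returns the original radiance; in the paper's setting both quantities are estimated by the network rather than given.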

arXiv:1801.04307 [pdf, other]

Robust Sparse Fourier Transform Based on The Fourier Projection-Slice Theorem

Authors: Shaogang Wang, Vishal M. Patel, Athina Petropulu

Abstract: The state-of-the-art automotive radars employ multidimensional discrete Fourier transforms (DFT) in order to estimate various target parameters. The DFT is implemented using the fast Fourier transform (FFT), at sample and computational complexity of $O(N)$ and $O(N \log N)$, respectively, where $N$ is the number of samples in the signal space. We have recently proposed a sparse Fourier transform based on the Fourier projection-slice theorem (FPS-SFT), which applies to multidimensional signals that are sparse in the frequency domain. FPS-SFT achieves sample complexity of $O(K)$ and computational complexity of $O(K \log K)$ for a multidimensional, $K$-sparse signal. While FPS-SFT considers the ideal scenario, i.e., exactly sparse data that contains on-grid frequencies, in this paper, by extending FPS-SFT into a robust version (RFPS-SFT), we emphasize on addressing noisy signals that contain off-grid frequencies; such signals arise from radar applications. This is achieved by employing a windowing technique and a voting-based frequency decoding procedure; the former reduces the frequency leakage of the off-grid frequencies below the noise level to preserve the sparsity of the signal, while the latter significantly lowers the frequency localization error stemming from the noise. The performance of the proposed method is demonstrated both theoretically and numerically.

Submitted 10 December, 2017; originally announced January 2018.

arXiv:1712.02923 [pdf, other]

SPRK: A Low-Cost Stewart Platform For Motion Study In Surgical Robotics

Authors: Vatsal Patel, Sanjay Krishnan, Aimee Goncalves, Ken Goldberg

Abstract: To simulate body organ motion due to breathing, heart beats, or peristaltic movements, we designed a low-cost, miniaturized SPRK (Stewart Platform Research Kit) to translate and rotate phantom tissue. This platform is 20cm x 20cm x 10cm to fit in the workspace of a da Vinci Research Kit (DVRK) surgical robot and costs $250, two orders of magnitude less than a commercial Stewart platform. The platform has a range of motion of +/- 1.27 cm in translation along x, y, and z directions and has motion modes for sinusoidal motion and breathing-inspired motion. Modular platform mounts were also designed for pattern cutting and debridement experiments. The platform's positional controller has a time-constant of 0.2 seconds and the root-mean-square error is 1.22 mm, 1.07 mm, and 0.20 mm in x, y, and z directions respectively. All the details, CAD models, and control software for the platform are available at github.com/BerkeleyAutomation/sprk.

Submitted 7 December, 2017; originally announced December 2017.

arXiv:1712.02917 [pdf, other]

Using Intermittent Synchronization to Compensate for Rhythmic Body Motion During Autonomous Surgical Cutting and Debridement

Authors: Vatsal Patel, Sanjay Krishnan, Aimee Goncalves, Carolyn Chen, Walter Doug Boyd, Ken Goldberg

Abstract: Anatomical structures are rarely static during a surgical procedure due to breathing, heartbeats, and peristaltic movements. Inspired by observing an expert surgeon, we propose an intermittent synchronization with the extrema of the rhythmic motion (i.e., the lowest velocity windows). We performed 2 experiments: (1) pattern cutting, and (2) debridement. In (1), we found that the intermittent synchronization approach, while 1.8x slower than tracking motion, was significantly more robust to noise and control latency, and it reduced the max cutting error by 2.6x. In (2), a baseline approach with no synchronization achieves 62% success rate for each removal, while intermittent synchronization achieves 80%.

Submitted 7 December, 2017; originally announced December 2017.

arXiv:1711.11407 [pdf, ps, other]

FPS-SFT: A Multi-dimensional Sparse Fourier Transform Based on the Fourier Projection-slice Theorem

Authors: Shaogang Wang, Vishal M. Patel, Athina Petropulu

Abstract: We propose a multi-dimensional (M-D) sparse Fourier transform inspired by the idea of the Fourier projection-slice theorem, called FPS-SFT. FPS-SFT extracts samples along lines (1-dimensional slices from an M-D data cube), which are parameterized by random slopes and offsets. The discrete Fourier transform (DFT) along those lines represents projections of M-D DFT of the M-D data onto those lines. The M-D sinusoids that are contained in the signal can be reconstructed from the DFT along lines with a low sample and computational complexity provided that the signal is sparse in the frequency domain and the lines are appropriately designed. The performance of FPS-SFT is demonstrated both theoretically and numerically. A sparse image reconstruction application is illustrated, which shows the capability of the FPS-SFT in solving practical problems.

Submitted 1 November, 2017; originally announced November 2017.
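
Note on the FPS-SFT entry above: the abstract states that the DFT along a line of samples represents a projection of the M-D DFT onto that line. The sketch below illustrates that idea only for the simplest case, a single on-grid 2-D sinusoid sampled along one line with wrap-around indexing. It is not the FPS-SFT algorithm itself; the function names, grid size, slope, and offset are illustrative assumptions.

```python
import cmath


def make_sinusoid(n, u1, u2, amp=1.0):
    """n x n complex sinusoid with on-grid frequency (u1, u2)."""
    return [[amp * cmath.exp(2j * cmath.pi * (u1 * r + u2 * c) / n)
             for c in range(n)] for r in range(n)]


def line_dft(signal, n, slope, offset):
    """Sample the signal along a wrapped line, then take a length-n 1-D DFT."""
    a1, a2 = slope
    b1, b2 = offset
    samples = [signal[(a1 * t + b1) % n][(a2 * t + b2) % n] for t in range(n)]
    return [sum(s * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, s in enumerate(samples))
            for k in range(n)]


n = 16
signal = make_sinusoid(n, u1=3, u2=5)
spectrum = line_dft(signal, n, slope=(1, 2), offset=(4, 7))

# The 2-D frequency (3, 5) projects onto the line as (3*1 + 5*2) mod 16.
peak = max(range(n), key=lambda k: abs(spectrum[k]))
```

Only n samples are read instead of n², and the peak bin equals the inner product of the frequency with the slope, modulo n. Recovering the full frequency from such projections requires several lines with different slopes, which this sketch does not attempt.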

arXiv:1711.06815 [pdf, ps, other]

WAKE: Wavelet Decomposition Coupled with Adaptive Kalman Filtering for Pathological Tremor Extraction

Authors: Soroosh Shahtalebi, Seyed Farokh Atashzar, Rajni V. Patel, Arash Mohammadi

Abstract: Pathological Hand Tremor (PHT) is among common symptoms of several neurological movement disorders, which can significantly degrade quality of life of affected individuals. Beside pharmaceutical and surgical therapies, mechatronic technologies have been utilized to control PHTs. Most of these technologies function based on estimation, extraction, and characterization of tremor movement signals. Real-time extraction of tremor signal is of paramount importance because of its application in assistive and rehabilitative devices. In this paper, we propose a novel on-line adaptive method which can adjust the hyper-parameters of the filter to the variable characteristics of the tremor. The proposed "WAKE: Wavelet decomposition coupled with Adaptive Kalman filtering technique for pathological tremor Extraction, referred to as the WAKE framework" is composed of a new adaptive Kalman filter and a wavelet transform core to provide indirect prediction of the tremor, one sample ahead of time, to be used for its suppression. In this paper, the design, implementation and evaluation of WAKE are given. The performance is evaluated based on three different datasets, the first one is a synthetic dataset, developed in this work, that simulates hand tremor under ten different conditions. The second and third ones are real datasets recorded from patients with PHTs. The results obtained from the proposed WAKE framework demonstrate significant improvements in the estimation accuracy in comparison with two well regarded techniques in the literature.

Submitted 10 October, 2018; v1 submitted 18 November, 2017; originally announced November 2017.

arXiv:1610.01050 [pdf, ps, other]

An Efficient High-Dimensional Sparse Fourier Transform

Authors: Shaogang Wang, Vishal M. Patel, Athina Petropulu

Abstract: We propose RSFT, which is an extension of the one dimensional Sparse Fourier Transform algorithm to higher dimensions in a way that it can be applied to real, noisy data. The RSFT allows for off-grid frequencies. Furthermore, by incorporating Neyman-Pearson detection, the frequency detection stages in RSFT do not require knowledge of the exact sparsity of the signal and are more robust to noise. We analyze the asymptotic performance of RSFT, and study the computational complexity versus the worst case signal SNR tradeoff. We show that by choosing the proper parameters, the optimal tradeoff can be achieved. We discuss the application of RSFT on short range ubiquitous radar signal processing, and demonstrate its feasibility via simulations.

Submitted 4 October, 2016; originally announced October 2016.

Showing 1–44 of 44 results for author: Patel, V