-
Beaconless Auto-Alignment for Single-Wavelength 5 Tbit/s Mode-Division Multiplexing Free-Space Optical Communications
Authors:
Yiming Li,
Gil Fernandes,
David Benton,
Antonin Billaud,
Mohammed Patel,
Andrew Ellis
Abstract:
Mode-division multiplexing has shown its ability to significantly increase the capacity of free-space optical communications. An accurate alignment is crucial to enable such links due to possible performance degradation induced by mode crosstalk and narrow beam divergence. Conventionally, a beacon beam is necessary for system alignment due to multiple local maximums in the mode-division multiplexe…
▽ More
Mode-division multiplexing has shown its ability to significantly increase the capacity of free-space optical communications. An accurate alignment is crucial to enable such links due to possible performance degradation induced by mode crosstalk and narrow beam divergence. Conventionally, a beacon beam is necessary for system alignment due to multiple local maximums in the mode-division multiplexed beam profile. However, the beacon beam introduces excess system complexity, power consumption, and alignment errors. Here we demonstrate a beaconless system with significantly higher alignment accuracy and faster acquisition. This system also excludes excess complexity, power consumption, and alignment errors, facilitating simplified system calibration and supporting a record-high 5.14 Tbit/s line rate in a single-wavelength free-space optical link. We anticipate our paper to be a starting point for more sophisticated alignment scenarios in future multi-Terabit mode-division multiplexing free-space optical communications for long-distance applications with a generalised mode basis.
△ Less
Submitted 31 May, 2024;
originally announced May 2024.
-
Weakly Supervised Detection of Pheochromocytomas and Paragangliomas in CT
Authors:
David C. Oluigboa,
Bikash Santra,
Tejas Sudharshan Mathai,
Pritam Mukherjee,
Jianfei Liu,
Abhishek Jha,
Mayank Patel,
Karel Pacak,
Ronald M. Summers
Abstract:
Pheochromocytomas and Paragangliomas (PPGLs) are rare adrenal and extra-adrenal tumors which have the potential to metastasize. For the management of patients with PPGLs, CT is the preferred modality of choice for precise localization and estimation of their progression. However, due to the myriad variations in size, morphology, and appearance of the tumors in different anatomical regions, radiolo…
▽ More
Pheochromocytomas and Paragangliomas (PPGLs) are rare adrenal and extra-adrenal tumors which have the potential to metastasize. For the management of patients with PPGLs, CT is the preferred modality of choice for precise localization and estimation of their progression. However, due to the myriad variations in size, morphology, and appearance of the tumors in different anatomical regions, radiologists are posed with the challenge of accurate detection of PPGLs. Since clinicians also need to routinely measure their size and track their changes over time across patient visits, manual demarcation of PPGLs is quite a time-consuming and cumbersome process. To ameliorate the manual effort spent for this task, we propose an automated method to detect PPGLs in CT studies via a proxy segmentation task. As only weak annotations for PPGLs in the form of prospectively marked 2D bounding boxes on an axial slice were available, we extended these 2D boxes into weak 3D annotations and trained a 3D full-resolution nnUNet model to directly segment PPGLs. We evaluated our approach on a dataset consisting of chest-abdomen-pelvis CTs of 255 patients with confirmed PPGLs. We obtained a precision of 70% and sensitivity of 64.1% with our proposed approach when tested on 53 CT studies. Our findings highlight the promising nature of detecting PPGLs via segmentation, and furthers the state-of-the-art in this exciting yet challenging area of rare cancer management.
△ Less
Submitted 12 February, 2024;
originally announced February 2024.
-
A Multi-Area Architecture for Real-Time Feedback-Based Optimization of Distribution Grids
Authors:
Ilyas Farhat,
Etinosa Ekomwenrenren,
John W. Simpson-Porco,
Evangelos Farantatos,
Mahendra Patel,
Aboutaleb Haddadi
Abstract:
A challenge in transmission-distribution coordination is how to quickly and reliably coordinate Distributed Energy Resources (DERs) across large multi-stakeholder Distribution Networks (DNs) to support the Transmission Network (TN), while ensuring operational constraints continue to be met within the DN. Here we propose a hierarchical feedback-based control architecture for coordination of DERs in…
▽ More
A challenge in transmission-distribution coordination is how to quickly and reliably coordinate Distributed Energy Resources (DERs) across large multi-stakeholder Distribution Networks (DNs) to support the Transmission Network (TN), while ensuring operational constraints continue to be met within the DN. Here we propose a hierarchical feedback-based control architecture for coordination of DERs in DNs, enabling the DN to quickly respond to power set-point requests from the Transmission System Operator (TSO) while maintaining local DN constraints. Our scheme allows for multiple independently-managed areas within the DN to optimize their local resources while coordinating to support the TN, and while maintaining data privacy; the only required inter-area communication is between physically adjacent areas within the DN control hierarchy. We conduct a rigorous stability analysis, establishing intuitive conditions for closed-loop stability, and provide detailed tuning recommendations. The proposal is validated via case studies on multiple feeders, including IEEE-123 and IEEE-8500, using a custom MATLAB-based application which integrates with OpenDSS. The simulation results show that the proposed structure is highly scalable and can quickly coordinate DERs in response to TSO commands, while responding to local disturbances within the DN and maintaining DN operational limits.
△ Less
Submitted 17 January, 2024;
originally announced January 2024.
-
A Cascaded Neural Network System For Rating Student Performance In Surgical Knot Tying Simulation
Authors:
Yunzhe Xue,
Olanrewaju Eletta,
Justin W. Ady,
Nell M. Patel,
Advaith Bongu,
Usman Roshan
Abstract:
As part of their training all medical students and residents have to pass basic surgical tasks such as knot tying, needle-passing, and suturing. Their assessment is typically performed in the operating room by surgical faculty where mistakes and failure by the student increases the operation time and cost. This evaluation is quantitative and has a low margin of error. Simulation has emerged as a c…
▽ More
As part of their training all medical students and residents have to pass basic surgical tasks such as knot tying, needle-passing, and suturing. Their assessment is typically performed in the operating room by surgical faculty where mistakes and failure by the student increases the operation time and cost. This evaluation is quantitative and has a low margin of error. Simulation has emerged as a cost effective option but it lacks assessment or requires additional expensive hardware for evaluation. Apps that provide training videos on surgical knot trying are available to students but none have evaluation. We propose a cascaded neural network architecture that evaluates a student's performance just from a video of themselves simulating a surgical knot tying task. Our model converts video frame images into feature vectors with a pre-trained deep convolutional network and then models the sequence of frames with a temporal network. We obtained videos of medical students and residents from the Robert Wood Johnson Hospital performing knot tying on a standardized simulation kit. We manually annotated each video and proceeded to do a five-fold cross-validation study on them. Our model achieves a median precision, recall, and F1-score of 0.71, 0.66, and 0.65 respectively in determining the level of knot related tasks of tying and pushing the knot. Our mean precision score averaged across different probability thresholds is 0.8. Both our F1-score and mean precision score are 8% and 30% higher than that of a recently published study for the same problem. We expect the accuracy of our model to further increase as we add more training videos to the model thus making it a practical solution that students can use to evaluate themselves.
△ Less
Submitted 9 December, 2023;
originally announced December 2023.
-
Self-Supervised MRI Reconstruction with Unrolled Diffusion Models
Authors:
Yilmaz Korkmaz,
Tolga Cukur,
Vishal M. Patel
Abstract:
Magnetic Resonance Imaging (MRI) produces excellent soft tissue contrast, albeit it is an inherently slow imaging modality. Promising deep learning methods have recently been proposed to reconstruct accelerated MRI scans. However, existing methods still suffer from various limitations regarding image fidelity, contextual sensitivity, and reliance on fully-sampled acquisitions for model training. T…
▽ More
Magnetic Resonance Imaging (MRI) produces excellent soft tissue contrast, albeit it is an inherently slow imaging modality. Promising deep learning methods have recently been proposed to reconstruct accelerated MRI scans. However, existing methods still suffer from various limitations regarding image fidelity, contextual sensitivity, and reliance on fully-sampled acquisitions for model training. To comprehensively address these limitations, we propose a novel self-supervised deep reconstruction model, named Self-Supervised Diffusion Reconstruction (SSDiffRecon). SSDiffRecon expresses a conditional diffusion process as an unrolled architecture that interleaves cross-attention transformers for reverse diffusion steps with data-consistency blocks for physics-driven processing. Unlike recent diffusion methods for MRI reconstruction, a self-supervision strategy is adopted to train SSDiffRecon using only undersampled k-space data. Comprehensive experiments on public brain MR datasets demonstrates the superiority of SSDiffRecon against state-of-the-art supervised, and self-supervised baselines in terms of reconstruction speed and quality. Implementation will be available at https://github.com/yilmazkorkmaz1/SSDiffRecon.
△ Less
Submitted 15 April, 2024; v1 submitted 28 June, 2023;
originally announced June 2023.
-
The Brain Tumor Segmentation (BraTS-METS) Challenge 2023: Brain Metastasis Segmentation on Pre-treatment MRI
Authors:
Ahmed W. Moawad,
Anastasia Janas,
Ujjwal Baid,
Divya Ramakrishnan,
Rachit Saluja,
Nader Ashraf,
Leon Jekel,
Raisa Amiruddin,
Maruf Adewole,
Jake Albrecht,
Udunna Anazodo,
Sanjay Aneja,
Syed Muhammad Anwar,
Timothy Bergquist,
Evan Calabrese,
Veronica Chiang,
Verena Chung,
Gian Marco Marco Conte,
Farouk Dako,
James Eddy,
Ivan Ezhov,
Ariana Familiar,
Keyvan Farahani,
Juan Eugenio Iglesias,
Zhifan Jiang
, et al. (206 additional authors not shown)
Abstract:
The translation of AI-generated brain metastases (BM) segmentation into clinical practice relies heavily on diverse, high-quality annotated medical imaging datasets. The BraTS-METS 2023 challenge has gained momentum for testing and benchmarking algorithms using rigorously annotated internationally compiled real-world datasets. This study presents the results of the segmentation challenge and chara…
▽ More
The translation of AI-generated brain metastases (BM) segmentation into clinical practice relies heavily on diverse, high-quality annotated medical imaging datasets. The BraTS-METS 2023 challenge has gained momentum for testing and benchmarking algorithms using rigorously annotated internationally compiled real-world datasets. This study presents the results of the segmentation challenge and characterizes the challenging cases that impacted the performance of the winning algorithms. Untreated brain metastases on standard anatomic MRI sequences (T1, T2, FLAIR, T1PG) from eight contributed international datasets were annotated in stepwise method: published UNET algorithms, student, neuroradiologist, final approver neuroradiologist. Segmentations were ranked based on lesion-wise Dice and Hausdorff distance (HD95) scores. False positives (FP) and false negatives (FN) were rigorously penalized, receiving a score of 0 for Dice and a fixed penalty of 374 for HD95. Eight datasets comprising 1303 studies were annotated, with 402 studies (3076 lesions) released on Synapse as publicly available datasets to challenge competitors. Additionally, 31 studies (139 lesions) were held out for validation, and 59 studies (218 lesions) were used for testing. Segmentation accuracy was measured as rank across subjects, with the winning team achieving a LesionWise mean score of 7.9. Common errors among the leading teams included false negatives for small lesions and misregistration of masks in space.The BraTS-METS 2023 challenge successfully curated well-annotated, diverse datasets and identified common errors, facilitating the translation of BM segmentation across varied clinical environments and providing personalized volumetric reports to patients undergoing BM treatment.
△ Less
Submitted 17 June, 2024; v1 submitted 1 June, 2023;
originally announced June 2023.
-
Data-Driven Fast Frequency Control using Inverter-Based Resources
Authors:
Etinosa Ekomwenrenren,
John W. Simpson-Porco,
Evangelos Farantatos,
Mahendra Patel,
Aboutaleb Haddadi,
Lin Zhu
Abstract:
To address the control challenges associated with the increasing share of inverter-connected renewable energy resources, this paper proposes a direct data-driven approach for fast frequency control in the bulk power system. The proposed control scheme partitions the power system into control areas, and leverages local dispatchable inverter-based resources to rapidly mitigate local power imbalances…
▽ More
To address the control challenges associated with the increasing share of inverter-connected renewable energy resources, this paper proposes a direct data-driven approach for fast frequency control in the bulk power system. The proposed control scheme partitions the power system into control areas, and leverages local dispatchable inverter-based resources to rapidly mitigate local power imbalances upon events. The controller design is based directly on historical measurement sequences, and does not require identification of a parametric power system model. Theoretical results are provided to support the approach. Simulation studies on a nonlinear three-area test system demonstrate that the controller provides fast and localized frequency control under several types of contingencies.
△ Less
Submitted 10 April, 2023;
originally announced April 2023.
-
Deep Metric Learning for Unsupervised Remote Sensing Change Detection
Authors:
Wele Gedara Chaminda Bandara,
Vishal M. Patel
Abstract:
Remote Sensing Change Detection (RS-CD) aims to detect relevant changes from Multi-Temporal Remote Sensing Images (MT-RSIs), which aids in various RS applications such as land cover, land use, human development analysis, and disaster response. The performance of existing RS-CD methods is attributed to training on large annotated datasets. Furthermore, most of these models are less transferable in…
▽ More
Remote Sensing Change Detection (RS-CD) aims to detect relevant changes from Multi-Temporal Remote Sensing Images (MT-RSIs), which aids in various RS applications such as land cover, land use, human development analysis, and disaster response. The performance of existing RS-CD methods is attributed to training on large annotated datasets. Furthermore, most of these models are less transferable in the sense that the trained model often performs very poorly when there is a domain gap between training and test datasets. This paper proposes an unsupervised CD method based on deep metric learning that can deal with both of these issues. Given an MT-RSI, the proposed method generates corresponding change probability map by iteratively optimizing an unsupervised CD loss without training it on a large dataset. Our unsupervised CD method consists of two interconnected deep networks, namely Deep-Change Probability Generator (D-CPG) and Deep-Feature Extractor (D-FE). The D-CPG is designed to predict change and no change probability maps for a given MT-RSI, while D-FE is used to extract deep features of MT-RSI that will be further used in the proposed unsupervised CD loss. We use transfer learning capability to initialize the parameters of D-FE. We iteratively optimize the parameters of D-CPG and D-FE for a given MT-RSI by minimizing the proposed unsupervised ``similarity-dissimilarity loss''. This loss is motivated by the principle of metric learning where we simultaneously maximize the distance between change pair-wise pixels while minimizing the distance between no-change pair-wise pixels in bi-temporal image domain and their deep feature domain. The experiments conducted on three CD datasets show that our unsupervised CD method achieves significant improvements over the state-of-the-art supervised and unsupervised CD methods. Code available at https://github.com/wgcban/Metric-CD
△ Less
Submitted 16 March, 2023;
originally announced March 2023.
-
NBD-GAP: Non-Blind Image Deblurring Without Clean Target Images
Authors:
Nithin Gopalakrishnan Nair,
Rajeev Yasarla,
Vishal M. Patel
Abstract:
In recent years, deep neural network-based restoration methods have achieved state-of-the-art results in various image deblurring tasks. However, one major drawback of deep learning-based deblurring networks is that large amounts of blurry-clean image pairs are required for training to achieve good performance. Moreover, deep networks often fail to perform well when the blurry images and the blur…
▽ More
In recent years, deep neural network-based restoration methods have achieved state-of-the-art results in various image deblurring tasks. However, one major drawback of deep learning-based deblurring networks is that large amounts of blurry-clean image pairs are required for training to achieve good performance. Moreover, deep networks often fail to perform well when the blurry images and the blur kernels during testing are very different from the ones used during training. This happens mainly because of the overfitting of the network parameters on the training data. In this work, we present a method that addresses these issues. We view the non-blind image deblurring problem as a denoising problem. To do so, we perform Wiener filtering on a pair of blurry images with the corresponding blur kernels. This results in a pair of images with colored noise. Hence, the deblurring problem is translated into a denoising problem. We then solve the denoising problem without using explicit clean target images. Extensive experiments are conducted to show that our method achieves results that are on par to the state-of-the-art non-blind deblurring works.
△ Less
Submitted 20 September, 2022;
originally announced September 2022.
-
Comparative analysis of segmentation and generative models for fingerprint retrieval task
Authors:
Megh Patel,
Devarsh Patel,
Sarthak Patel
Abstract:
Biometric Authentication like Fingerprints has become an integral part of the modern technology for authentication and verification of users. It is pervasive in more ways than most of us are aware of. However, these fingerprint images deteriorate in quality if the fingers are dirty, wet, injured or when sensors malfunction. Therefore, extricating the original fingerprint by removing the noise and…
▽ More
Biometric Authentication like Fingerprints has become an integral part of the modern technology for authentication and verification of users. It is pervasive in more ways than most of us are aware of. However, these fingerprint images deteriorate in quality if the fingers are dirty, wet, injured or when sensors malfunction. Therefore, extricating the original fingerprint by removing the noise and inpainting it to restructure the image is crucial for its authentication. Hence, this paper proposes a deep learning approach to address these issues using Generative (GAN) and Segmentation models. Qualitative and Quantitative comparison has been done between pix2pixGAN and cycleGAN (generative models) as well as U-net (segmentation model). To train the model, we created our own dataset NFD - Noisy Fingerprint Dataset meticulously with different backgrounds along with scratches in some images to make it more realistic and robust. In our research, the u-net model performed better than the GAN networks
△ Less
Submitted 2 November, 2022; v1 submitted 13 September, 2022;
originally announced September 2022.
-
Data-Driven Fast Frequency Control using Inverter-Based Resources
Authors:
Etinosa Ekomwenrenren,
John Simpson-Porco,
Evangelos Farantatos,
Mahendra Patel,
Aboutaleb Haddadi,
Lin Zhu
Abstract:
We develop and test a data-driven and area-based fast frequency control scheme, which rapidly redispatches inverter-based resources to compensate for local power imbalances within the bulk power system. The approach requires no explicit system model information, relying only on historical measurement sequences for the computation of control actions. Our technical approach fuses developments in low…
▽ More
We develop and test a data-driven and area-based fast frequency control scheme, which rapidly redispatches inverter-based resources to compensate for local power imbalances within the bulk power system. The approach requires no explicit system model information, relying only on historical measurement sequences for the computation of control actions. Our technical approach fuses developments in low-gain estimator design and data-driven control to provide a model-free and practical solution for fast frequency control. Theoretical results and extensive simulation scenarios on a three area system are provided to support the approach.
△ Less
Submitted 2 August, 2022;
originally announced August 2022.
-
Enhanced Atmospheric Turbulence Resiliency with Successive Interference Cancellation DSP in Mode Division Multiplexing Free-Space Optical Links
Authors:
Yiming Li,
Zhaozhong Chen,
Zhouyi Hu,
David M. Benton,
Abdallah A. I. Ali,
Mohammed Patel,
Martin P. J. Lavery,
Andrew D. Ellis
Abstract:
We experimentally demonstrate the enhanced atmospheric turbulence resiliency in a 137.8 Gbit/s/mode mode-division multiplexing free-space optical communication link through the application of a successive interference cancellation digital signal processing algorithm. The turbulence resiliency is further enhanced through redundant receive channels in the mode-division multiplexing link. The proof o…
▽ More
We experimentally demonstrate the enhanced atmospheric turbulence resiliency in a 137.8 Gbit/s/mode mode-division multiplexing free-space optical communication link through the application of a successive interference cancellation digital signal processing algorithm. The turbulence resiliency is further enhanced through redundant receive channels in the mode-division multiplexing link. The proof of concept demonstration is performed using commercially available mode-selective photonic lanterns, a commercial transponder, and a spatial light modulator based turbulence emulator. In this link, 5 spatial modes with each mode carrying 34.46 GBaud dual-polarization quadrature phase shift keying signals are successfully transmitted with an average bit error rate lower than the hard-decision forward error correction limit. As a result, we achieved a record-high mode- and polarization-division multiplexing channel number of 10, a record-high line rate of 689.23 Gbit/s, and a record-high net spectral efficiency of 13.9 bit/s/Hz in emulated turbulent links in a mode-division multiplexing free-space optical system.
△ Less
Submitted 19 July, 2022;
originally announced August 2022.
-
Learning to restore images degraded by atmospheric turbulence using uncertainty
Authors:
Rajeev Yasarla,
Vishal M. Patel
Abstract:
Atmospheric turbulence can significantly degrade the quality of images acquired by long-range imaging systems by causing spatially and temporally random fluctuations in the index of refraction of the atmosphere. Variations in the refractive index causes the captured images to be geometrically distorted and blurry. Hence, it is important to compensate for the visual degradation in images caused by…
▽ More
Atmospheric turbulence can significantly degrade the quality of images acquired by long-range imaging systems by causing spatially and temporally random fluctuations in the index of refraction of the atmosphere. Variations in the refractive index causes the captured images to be geometrically distorted and blurry. Hence, it is important to compensate for the visual degradation in images caused by atmospheric turbulence. In this paper, we propose a deep learning-based approach for restring a single image degraded by atmospheric turbulence. We make use of the epistemic uncertainty based on Monte Carlo dropouts to capture regions in the image where the network is having hard time restoring. The estimated uncertainty maps are then used to guide the network to obtain the restored image. Extensive experiments are conducted on synthetic and real images to show the significance of the proposed work. Code is available at : https://github.com/rajeevyasarla/AT-Net
△ Less
Submitted 7 July, 2022;
originally announced July 2022.
-
Simultaneous Bone and Shadow Segmentation Network using Task Correspondence Consistency
Authors:
Aimon Rahman,
Jeya Maria Jose Valanarasu,
Ilker Hacihaliloglu,
Vishal M Patel
Abstract:
Segmenting both bone surface and the corresponding acoustic shadow are fundamental tasks in ultrasound (US) guided orthopedic procedures. However, these tasks are challenging due to minimal and blurred bone surface response in US images, cross-machine discrepancy, imaging artifacts, and low signal-to-noise ratio. Notably, bone shadows are caused by a significant acoustic impedance mismatch between…
▽ More
Segmenting both bone surface and the corresponding acoustic shadow are fundamental tasks in ultrasound (US) guided orthopedic procedures. However, these tasks are challenging due to minimal and blurred bone surface response in US images, cross-machine discrepancy, imaging artifacts, and low signal-to-noise ratio. Notably, bone shadows are caused by a significant acoustic impedance mismatch between the soft tissue and bone surfaces. To leverage this mutual information between these highly related tasks, we propose a single end-to-end network with a shared transformer-based encoder and task independent decoders for simultaneous bone and shadow segmentation. To share complementary features, we propose a cross task feature transfer block which learns to transfer meaningful features from decoder of shadow segmentation to that of bone segmentation and vice-versa. We also introduce a correspondence consistency loss which makes sure that network utilizes the inter-dependency between the bone surface and its corresponding shadow to refine the segmentation. Validation against expert annotations shows that the method outperforms the previous state-of-the-art for both bone surface and shadow segmentation.
△ Less
Submitted 16 June, 2022;
originally announced June 2022.
-
Orientation-guided Graph Convolutional Network for Bone Surface Segmentation
Authors:
Aimon Rahman,
Wele Gedara Chaminda Bandara,
Jeya Maria Jose Valanarasu,
Ilker Hacihaliloglu,
Vishal M Patel
Abstract:
Due to imaging artifacts and low signal-to-noise ratio in ultrasound images, automatic bone surface segmentation networks often produce fragmented predictions that can hinder the success of ultrasound-guided computer-assisted surgical procedures. Existing pixel-wise predictions often fail to capture the accurate topology of bone tissues due to a lack of supervision to enforce connectivity. In this…
▽ More
Due to imaging artifacts and low signal-to-noise ratio in ultrasound images, automatic bone surface segmentation networks often produce fragmented predictions that can hinder the success of ultrasound-guided computer-assisted surgical procedures. Existing pixel-wise predictions often fail to capture the accurate topology of bone tissues due to a lack of supervision to enforce connectivity. In this work, we propose an orientation-guided graph convolutional network to improve connectivity while segmenting the bone surface. We also propose an additional supervision on the orientation of the bone surface to further impose connectivity. We validated our approach on 1042 vivo US scans of femur, knee, spine, and distal radius. Our approach improves over the state-of-the-art methods by 5.01% in connectivity metric.
△ Less
Submitted 16 June, 2022;
originally announced June 2022.
-
SAR Despeckling using a Denoising Diffusion Probabilistic Model
Authors:
Malsha V. Perera,
Nithin Gopalakrishnan Nair,
Wele Gedara Chaminda Bandara,
Vishal M. Patel
Abstract:
Speckle is a multiplicative noise which affects all coherent imaging modalities including Synthetic Aperture Radar (SAR) images. The presence of speckle degrades the image quality and adversely affects the performance of SAR image understanding applications such as automatic target recognition and change detection. Thus, SAR despeckling is an important problem in remote sensing. In this paper, we…
▽ More
Speckle is a multiplicative noise which affects all coherent imaging modalities including Synthetic Aperture Radar (SAR) images. The presence of speckle degrades the image quality and adversely affects the performance of SAR image understanding applications such as automatic target recognition and change detection. Thus, SAR despeckling is an important problem in remote sensing. In this paper, we introduce SAR-DDPM, a denoising diffusion probabilistic model for SAR despeckling. The proposed method comprises of a Markov chain that transforms clean images to white Gaussian noise by repeatedly adding random noise. The despeckled image is recovered by a reverse process which iteratively predicts the added noise using a noise predictor which is conditioned on the speckled image. In addition, we propose a new inference strategy based on cycle spinning to improve the despeckling performance. Our experiments on both synthetic and real SAR images demonstrate that the proposed method achieves significant improvements in both quantitative and qualitative results over the state-of-the-art despeckling methods.
△ Less
Submitted 9 June, 2022;
originally announced June 2022.
-
SAR Despeckling Using Overcomplete Convolutional Networks
Authors:
Malsha V. Perera,
Wele Gedara Chaminda Bandara,
Jeya Maria Jose Valanarasu,
Vishal M. Patel
Abstract:
Synthetic Aperture Radar (SAR) despeckling is an important problem in remote sensing as speckle degrades SAR images, affecting downstream tasks like detection and segmentation. Recent studies show that convolutional neural networks(CNNs) outperform classical despeckling methods. Traditional CNNs try to increase the receptive field size as the network goes deeper, thus extracting global features. H…
▽ More
Synthetic Aperture Radar (SAR) despeckling is an important problem in remote sensing as speckle degrades SAR images, affecting downstream tasks like detection and segmentation. Recent studies show that convolutional neural networks(CNNs) outperform classical despeckling methods. Traditional CNNs try to increase the receptive field size as the network goes deeper, thus extracting global features. However,speckle is relatively small, and increasing receptive field does not help in extracting speckle features. This study employs an overcomplete CNN architecture to focus on learning low-level features by restricting the receptive field. The proposed network consists of an overcomplete branch to focus on the local structures and an undercomplete branch that focuses on the global structures. We show that the proposed network improves despeckling performance compared to recent despeckling methods on synthetic and real SAR images.
△ Less
Submitted 31 May, 2022;
originally announced May 2022.
-
Deep-learning-enabled Brain Hemodynamic Map** Using Resting-state fMRI
Authors:
Xirui Hou,
Pengfei Guo,
Puyang Wang,
Peiying Liu,
Doris D. M. Lin,
Hongli Fan,
Yang Li,
Zhiliang Wei,
Zixuan Lin,
Dengrong Jiang,
** **,
Catherine Kelly,
Jay J. Pillai,
Judy Huang,
Marco C. Pinho,
Binu P. Thomas,
Babu G. Welch,
Denise C. Park,
Vishal M. Patel,
Argye E. Hillis,
Hanzhang Lu
Abstract:
Cerebrovascular disease is a leading cause of death globally. Prevention and early intervention are known to be the most effective forms of its management. Non-invasive imaging methods hold great promises for early stratification, but at present lack the sensitivity for personalized prognosis. Resting-state functional magnetic resonance imaging (rs-fMRI), a powerful tool previously used for mappin…
▽ More
Cerebrovascular disease is a leading cause of death globally. Prevention and early intervention are known to be the most effective forms of its management. Non-invasive imaging methods hold great promises for early stratification, but at present lack the sensitivity for personalized prognosis. Resting-state functional magnetic resonance imaging (rs-fMRI), a powerful tool previously used for map** neural activity, is available in most hospitals. Here we show that rs-fMRI can be used to map cerebral hemodynamic function and delineate impairment. By exploiting time variations in breathing pattern during rs-fMRI, deep learning enables reproducible map** of cerebrovascular reactivity (CVR) and bolus arrive time (BAT) of the human brain using resting-state CO2 fluctuations as a natural 'contrast media'. The deep-learning network was trained with CVR and BAT maps obtained with a reference method of CO2-inhalation MRI, which included data from young and older healthy subjects and patients with Moyamoya disease and brain tumors. We demonstrate the performance of deep-learning cerebrovascular map** in the detection of vascular abnormalities, evaluation of revascularization effects, and vascular alterations in normal aging. In addition, cerebrovascular maps obtained with the proposed method exhibited excellent reproducibility in both healthy volunteers and stroke patients. Deep-learning resting-state vascular imaging has the potential to become a useful tool in clinical cerebrovascular imaging.
△ Less
Submitted 25 April, 2022;
originally announced April 2022.
-
A comparison of different atmospheric turbulence simulation methods for image restoration
Authors:
Nithin Gopalakrishnan Nair,
Kangfu Mei,
Vishal M. Patel
Abstract:
Atmospheric turbulence deteriorates the quality of images captured by long-range imaging systems by introducing blur and geometric distortions to the captured scene. This leads to a drastic drop in performance when computer vision algorithms like object/face recognition and detection are performed on these images. In recent years, various deep learning-based atmospheric turbulence mitigation metho…
▽ More
Atmospheric turbulence deteriorates the quality of images captured by long-range imaging systems by introducing blur and geometric distortions to the captured scene. This leads to a drastic drop in performance when computer vision algorithms like object/face recognition and detection are performed on these images. In recent years, various deep learning-based atmospheric turbulence mitigation methods have been proposed in the literature. These methods are often trained using synthetically generated images and tested on real-world images. Hence, the performance of these restoration methods depends on the type of simulation used for training the network. In this paper, we systematically evaluate the effectiveness of various turbulence simulation methods on image restoration. In particular, we evaluate the performance of two state-or-the-art restoration networks using six simulations method on a real-world LRFID dataset consisting of face images degraded by turbulence. This paper will provide guidance to the researchers and practitioners working in this field to choose the suitable data generation models for training deep models for turbulence mitigation. The implementation codes for the simulation methods, source codes for the networks, and the pre-trained models will be publicly made available.
△ Less
Submitted 19 April, 2022;
originally announced April 2022.
-
Revisiting Consistency Regularization for Semi-supervised Change Detection in Remote Sensing Images
Authors:
Wele Gedara Chaminda Bandara,
Vishal M. Patel
Abstract:
Remote-sensing (RS) Change Detection (CD) aims to detect "changes of interest" from co-registered bi-temporal images. The performance of existing deep supervised CD methods is attributed to the large amounts of annotated data used to train the networks. However, annotating large amounts of remote sensing images is labor-intensive and expensive, particularly with bi-temporal images, as it requires…
▽ More
Remote-sensing (RS) Change Detection (CD) aims to detect "changes of interest" from co-registered bi-temporal images. The performance of existing deep supervised CD methods is attributed to the large amounts of annotated data used to train the networks. However, annotating large amounts of remote sensing images is labor-intensive and expensive, particularly with bi-temporal images, as it requires pixel-wise comparisons by a human expert. On the other hand, we often have access to unlimited unlabeled multi-temporal RS imagery thanks to ever-increasing earth observation programs. In this paper, we propose a simple yet effective way to leverage the information from unlabeled bi-temporal images to improve the performance of CD approaches. More specifically, we propose a semi-supervised CD model in which we formulate an unsupervised CD loss in addition to the supervised Cross-Entropy (CE) loss by constraining the output change probability map of a given unlabeled bi-temporal image pair to be consistent under the small random perturbations applied on the deep feature difference map that is obtained by subtracting their latent feature representations. Experiments conducted on two publicly available CD datasets show that the proposed semi-supervised CD method can reach closer to the performance of supervised CD even with access to as little as 10% of the annotated training data. Code available at https://github.com/wgcban/SemiCD
△ Less
Submitted 21 April, 2022; v1 submitted 18 April, 2022;
originally announced April 2022.
-
Auto-FedRL: Federated Hyperparameter Optimization for Multi-institutional Medical Image Segmentation
Authors:
Pengfei Guo,
Dong Yang,
Ali Hatamizadeh,
An Xu,
Ziyue Xu,
Wenqi Li,
Can Zhao,
Daguang Xu,
Stephanie Harmon,
Evrim Turkbey,
Baris Turkbey,
Bradford Wood,
Francesca Patella,
Elvira Stellato,
Gianpaolo Carrafiello,
Vishal M. Patel,
Holger R. Roth
Abstract:
Federated learning (FL) is a distributed machine learning technique that enables collaborative model training while avoiding explicit data sharing. The inherent privacy-preserving property of FL algorithms makes them especially attractive to the medical field. However, in case of heterogeneous client data distributions, standard FL methods are unstable and require intensive hyperparameter tuning t…
▽ More
Federated learning (FL) is a distributed machine learning technique that enables collaborative model training while avoiding explicit data sharing. The inherent privacy-preserving property of FL algorithms makes them especially attractive to the medical field. However, in case of heterogeneous client data distributions, standard FL methods are unstable and require intensive hyperparameter tuning to achieve optimal performance. Conventional hyperparameter optimization algorithms are impractical in real-world FL applications as they involve numerous training trials, which are often not affordable with limited compute budgets. In this work, we propose an efficient reinforcement learning (RL)-based federated hyperparameter optimization algorithm, termed Auto-FedRL, in which an online RL agent can dynamically adjust hyperparameters of each client based on the current training progress. Extensive experiments are conducted to investigate different search strategies and RL agents. The effectiveness of the proposed method is validated on a heterogeneous data split of the CIFAR-10 dataset as well as two real-world medical image segmentation datasets for COVID-19 lesion segmentation in chest CT and pancreas segmentation in abdominal CT.
△ Less
Submitted 31 August, 2022; v1 submitted 11 March, 2022;
originally announced March 2022.
-
On-the-Fly Test-time Adaptation for Medical Image Segmentation
Authors:
Jeya Maria Jose Valanarasu,
Pengfei Guo,
Vibashan VS,
Vishal M. Patel
Abstract:
One major problem in deep learning-based solutions for medical imaging is the drop in performance when a model is tested on a data distribution different from the one that it is trained on. Adapting the source model to target data distribution at test-time is an efficient solution for the data-shift problem. Previous methods solve this by adapting the model to target distribution by using techniqu…
▽ More
One major problem in deep learning-based solutions for medical imaging is the drop in performance when a model is tested on a data distribution different from the one that it is trained on. Adapting the source model to target data distribution at test-time is an efficient solution for the data-shift problem. Previous methods solve this by adapting the model to target distribution by using techniques like entropy minimization or regularization. In these methods, the models are still updated by back-propagation using an unsupervised loss on complete test data distribution. In real-world clinical settings, it makes more sense to adapt a model to a new test image on-the-fly and avoid model update during inference due to privacy concerns and lack of computing resource at deployment. To this end, we propose a new setting - On-the-Fly Adaptation which is zero-shot and episodic (i.e., the model is adapted to a single image at a time and also does not perform any back-propagation during test-time). To achieve this, we propose a new framework called Adaptive UNet where each convolutional block is equipped with an adaptive batch normalization layer to adapt the features with respect to a domain code. The domain code is generated using a pre-trained encoder trained on a large corpus of medical images. During test-time, the model takes in just the new test image and generates a domain code to adapt the features of source model according to the test data. We validate the performance on both 2D and 3D data distribution shifts where we get a better performance compared to previous test-time adaptation methods. Code is available at https://github.com/jeya-maria-jose/On-The-Fly-Adaptation
△ Less
Submitted 10 March, 2022;
originally announced March 2022.
-
UNeXt: MLP-based Rapid Medical Image Segmentation Network
Authors:
Jeya Maria Jose Valanarasu,
Vishal M. Patel
Abstract:
UNet and its latest extensions like TransUNet have been the leading medical image segmentation methods in recent years. However, these networks cannot be effectively adopted for rapid image segmentation in point-of-care applications as they are parameter-heavy, computationally complex and slow to use. To this end, we propose UNeXt which is a Convolutional multilayer perceptron (MLP) based network…
▽ More
UNet and its latest extensions like TransUNet have been the leading medical image segmentation methods in recent years. However, these networks cannot be effectively adopted for rapid image segmentation in point-of-care applications as they are parameter-heavy, computationally complex and slow to use. To this end, we propose UNeXt which is a Convolutional multilayer perceptron (MLP) based network for image segmentation. We design UNeXt in an effective way with an early convolutional stage and a MLP stage in the latent stage. We propose a tokenized MLP block where we efficiently tokenize and project the convolutional features and use MLPs to model the representation. To further boost the performance, we propose shifting the channels of the inputs while feeding in to MLPs so as to focus on learning local dependencies. Using tokenized MLPs in latent space reduces the number of parameters and computational complexity while being able to result in a better representation to help segmentation. The network also consists of skip connections between various levels of encoder and decoder. We test UNeXt on multiple medical image segmentation datasets and show that we reduce the number of parameters by 72x, decrease the computational complexity by 68x, and improve the inference speed by 10x while also obtaining better segmentation performance over the state-of-the-art medical image segmentation architectures. Code is available at https://github.com/jeya-maria-jose/UNeXt-pytorch
△ Less
Submitted 9 March, 2022;
originally announced March 2022.
-
HyperTransformer: A Textural and Spectral Feature Fusion Transformer for Pansharpening
Authors:
Wele Gedara Chaminda Bandara,
Vishal M. Patel
Abstract:
Pansharpening aims to fuse a registered high-resolution panchromatic image (PAN) with a low-resolution hyperspectral image (LR-HSI) to generate an enhanced HSI with high spectral and spatial resolution. Existing pansharpening approaches neglect using an attention mechanism to transfer HR texture features from PAN to LR-HSI features, resulting in spatial and spectral distortions. In this paper, we…
▽ More
Pansharpening aims to fuse a registered high-resolution panchromatic image (PAN) with a low-resolution hyperspectral image (LR-HSI) to generate an enhanced HSI with high spectral and spatial resolution. Existing pansharpening approaches neglect using an attention mechanism to transfer HR texture features from PAN to LR-HSI features, resulting in spatial and spectral distortions. In this paper, we present a novel attention mechanism for pansharpening called HyperTransformer, in which features of LR-HSI and PAN are formulated as queries and keys in a transformer, respectively. HyperTransformer consists of three main modules, namely two separate feature extractors for PAN and HSI, a multi-head feature soft attention module, and a spatial-spectral feature fusion module. Such a network improves both spatial and spectral quality measures of the pansharpened HSI by learning cross-feature space dependencies and long-range details of PAN and LR-HSI. Furthermore, HyperTransformer can be utilized across multiple spatial scales at the backbone for obtaining improved performance. Extensive experiments conducted on three widely used datasets demonstrate that HyperTransformer achieves significant improvement over the state-of-the-art methods on both spatial and spectral quality measures. Implementation code and pre-trained weights can be accessed at https://github.com/wgcban/HyperTransformer.
△ Less
Submitted 28 March, 2022; v1 submitted 4 March, 2022;
originally announced March 2022.
-
ReconFormer: Accelerated MRI Reconstruction Using Recurrent Transformer
Authors:
Pengfei Guo,
Yiqun Mei,
**yuan Zhou,
Shanshan Jiang,
Vishal M. Patel
Abstract:
Accelerating magnetic resonance image (MRI) reconstruction process is a challenging ill-posed inverse problem due to the excessive under-sampling operation in k-space. In this paper, we propose a recurrent transformer model, namely ReconFormer, for MRI reconstruction which can iteratively reconstruct high fertility magnetic resonance images from highly under-sampled k-space data. In particular, th…
▽ More
Accelerating magnetic resonance image (MRI) reconstruction process is a challenging ill-posed inverse problem due to the excessive under-sampling operation in k-space. In this paper, we propose a recurrent transformer model, namely ReconFormer, for MRI reconstruction which can iteratively reconstruct high fertility magnetic resonance images from highly under-sampled k-space data. In particular, the proposed architecture is built upon Recurrent Pyramid Transformer Layers (RPTL), which jointly exploits intrinsic multi-scale information at every architecture unit as well as the dependencies of the deep feature correlation through recurrent states. Moreover, the proposed ReconFormer is lightweight since it employs the recurrent structure for its parameter efficiency. We validate the effectiveness of ReconFormer on multiple datasets with different magnetic resonance sequences and show that it achieves significant improvements over the state-of-the-art methods with better parameter efficiency. Implementation code will be available in https://github.com/guopengf/ReconFormer.
△ Less
Submitted 27 January, 2022; v1 submitted 23 January, 2022;
originally announced January 2022.
-
Transformer-based SAR Image Despeckling
Authors:
Malsha V. Perera,
Wele Gedara Chaminda Bandara,
Jeya Maria Jose Valanarasu,
Vishal M. Patel
Abstract:
Synthetic Aperture Radar (SAR) images are usually degraded by a multiplicative noise known as speckle which makes processing and interpretation of SAR images difficult. In this paper, we introduce a transformer-based network for SAR image despeckling. The proposed despeckling network comprises of a transformer-based encoder which allows the network to learn global dependencies between different im…
▽ More
Synthetic Aperture Radar (SAR) images are usually degraded by a multiplicative noise known as speckle which makes processing and interpretation of SAR images difficult. In this paper, we introduce a transformer-based network for SAR image despeckling. The proposed despeckling network comprises of a transformer-based encoder which allows the network to learn global dependencies between different image regions - aiding in better despeckling. The network is trained end-to-end with synthetically generated speckled images using a composite loss function. Experiments show that the proposed method achieves significant improvements over traditional and convolutional neural network-based despeckling methods on both synthetic and real SAR images.
△ Less
Submitted 23 January, 2022;
originally announced January 2022.
-
Shallow geothermal energy potential for heating and cooling of buildings with regeneration under climate change scenarios
Authors:
Alina Walch,
Xiang Li,
Jonathan Chambers,
Nahid Mohajeri,
Selin Yilmaz,
Martin Patel,
Jean-Louis Scartezzini
Abstract:
Shallow ground-source heat pumps (GSHPs) are a promising technology for contributing to the decarbonisation of the energy sector. In heating-dominated climates, the combined use of GSHPs for both heating and cooling increases their technical potential, defined as the maximum energy that can be exchanged with the ground, as the re-injection of excess heat from space cooling leads to a seasonal rege…
▽ More
Shallow ground-source heat pumps (GSHPs) are a promising technology for contributing to the decarbonisation of the energy sector. In heating-dominated climates, the combined use of GSHPs for both heating and cooling increases their technical potential, defined as the maximum energy that can be exchanged with the ground, as the re-injection of excess heat from space cooling leads to a seasonal regeneration of the ground. This paper proposes a new approach to quantify the technical potential of GSHPs, accounting for effects of seasonal regeneration, and to estimate the useful energy to supply building energy demands at regional scale. The useful energy is obtained for direct heat exchange and for district heating and cooling (DHC) under several scenarios for climate change and market penetration levels of cooling systems. The case study in western Switzerland suggests that seasonal regeneration allows for annual maximum heat extraction densities above 300 kWh/m$^2$ at heat injection densities above 330 kWh/m$^2$. Results also show that GSHPs may cover up to 55% of heating demand while covering 57% of service-sector cooling demand for individual GSHPs in 2050, which increases to around 85% with DHC. The regional-scale results may serve to inform decision making on strategic areas for installing GSHPs.
△ Less
Submitted 2 December, 2021;
originally announced December 2021.
-
Simulating Realistic MRI variations to Improve Deep Learning model and visual explanations using GradCAM
Authors:
Muhammad Ilyas Patel,
Shrey Singla,
Razeem Ahmad Ali Mattathodi,
Sumit Sharma,
Deepam Gautam,
Srinivasa Rao Kundeti
Abstract:
In the medical field, landmark detection in MRI plays an important role in reducing medical technician efforts in tasks like scan planning, image registration, etc. First, 88 landmarks spread across the brain anatomy in the three respective views -- sagittal, coronal, and axial are manually annotated, later guidelines from the expert clinical technicians are taken sub-anatomy-wise, for better loca…
▽ More
In the medical field, landmark detection in MRI plays an important role in reducing medical technician efforts in tasks like scan planning, image registration, etc. First, 88 landmarks spread across the brain anatomy in the three respective views -- sagittal, coronal, and axial are manually annotated, later guidelines from the expert clinical technicians are taken sub-anatomy-wise, for better localization of the existing landmarks, in order to identify and locate the important atlas landmarks even in oblique scans. To overcome limited data availability, we implement realistic data augmentation to generate synthetic 3D volumetric data. We use a modified HighRes3DNet model for solving brain MRI volumetric landmark detection problem. In order to visually explain our trained model on unseen data, and discern a stronger model from a weaker model, we implement Gradient-weighted Class Activation Map** (Grad-CAM) which produces a coarse localization map highlighting the regions the model is focusing. Our experiments show that the proposed method shows favorable results, and the overall pipeline can be extended to a variable number of landmarks and other anatomies.
△ Less
Submitted 1 November, 2021;
originally announced November 2021.
-
Hierarchical Frequency and Voltage Control using Prioritized Utilization of Inverter Based Resources
Authors:
Rahul Chakraborty,
Aranya Chakrabortty,
Evangelos Farantatos,
Mahendra Patel,
Hossein Hooshyar,
Atena Darvishi
Abstract:
We propose a novel hierarchical frequency and voltage control design for multi-area power system integrated with inverter-based resources (IBRs). The design is based on the idea of prioritizing the use of IBRs over conventional generator-based control in compensating for sudden and unpredicted changes in loads and generations, and thereby mitigate any undesired dynamics in the frequency or the vol…
▽ More
We propose a novel hierarchical frequency and voltage control design for multi-area power system integrated with inverter-based resources (IBRs). The design is based on the idea of prioritizing the use of IBRs over conventional generator-based control in compensating for sudden and unpredicted changes in loads and generations, and thereby mitigate any undesired dynamics in the frequency or the voltage by exploiting their fast actuation time constants. A new sequential optimization problem, referred to as Area Prioritized Power Flow (APPF), is formulated to model this prioritization. It is shown that compared to conventional power flow APPF not only leads to a fairer balance between the dispatch of active and reactive power from the IBRs and the synchronous generators, but also limits the impact of any contingency from spreading out beyond its respective control area, thereby guaranteeing a better collective dynamic performance of the grid. This improvement, however, comes at the cost of adding an extra layer of communication needed for executing APPF in a hierarchical way. Results are validated using simulations of a 9-machine, 6-IBR, 33-bus, 3-area power system model, illustrating how APPF can mitigate a disturbance faster and more efficiently by prioritizing the use of local area-resources.
△ Less
Submitted 5 February, 2022; v1 submitted 8 September, 2021;
originally announced September 2021.
-
Realistic Ultrasound Image Synthesis for Improved Classification of Liver Disease
Authors:
Hui Che,
Sumana Ramanathan,
David Foran,
John L Nosher,
Vishal M Patel,
Ilker Hacihaliloglu
Abstract:
With the success of deep learning-based methods applied in medical image analysis, convolutional neural networks (CNNs) have been investigated for classifying liver disease from ultrasound (US) data. However, the scarcity of available large-scale labeled US data has hindered the success of CNNs for classifying liver disease from US data. In this work, we propose a novel generative adversarial netw…
▽ More
With the success of deep learning-based methods applied in medical image analysis, convolutional neural networks (CNNs) have been investigated for classifying liver disease from ultrasound (US) data. However, the scarcity of available large-scale labeled US data has hindered the success of CNNs for classifying liver disease from US data. In this work, we propose a novel generative adversarial network (GAN) architecture for realistic diseased and healthy liver US image synthesis. We adopt the concept of stacking to synthesize realistic liver US data. Quantitative and qualitative evaluation is performed on 550 in-vivo B-mode liver US images collected from 55 subjects. We also show that the synthesized images, together with real in vivo data, can be used to significantly improve the performance of traditional CNN architectures for Nonalcoholic fatty liver disease (NAFLD) classification.
△ Less
Submitted 27 July, 2021;
originally announced July 2021.
-
Hyperspectral Pansharpening Based on Improved Deep Image Prior and Residual Reconstruction
Authors:
Wele Gedara Chaminda Bandara,
Jeya Maria Jose Valanarasu,
Vishal M. Patel
Abstract:
Hyperspectral pansharpening aims to synthesize a low-resolution hyperspectral image (LR-HSI) with a registered panchromatic image (PAN) to generate an enhanced HSI with high spectral and spatial resolution. Recently proposed HS pansharpening methods have obtained remarkable results using deep convolutional networks (ConvNets), which typically consist of three steps: (1) up-sampling the LR-HSI, (2)…
▽ More
Hyperspectral pansharpening aims to synthesize a low-resolution hyperspectral image (LR-HSI) with a registered panchromatic image (PAN) to generate an enhanced HSI with high spectral and spatial resolution. Recently proposed HS pansharpening methods have obtained remarkable results using deep convolutional networks (ConvNets), which typically consist of three steps: (1) up-sampling the LR-HSI, (2) predicting the residual image via a ConvNet, and (3) obtaining the final fused HSI by adding the outputs from first and second steps. Recent methods have leveraged Deep Image Prior (DIP) to up-sample the LR-HSI due to its excellent ability to preserve both spatial and spectral information, without learning from large data sets. However, we observed that the quality of up-sampled HSIs can be further improved by introducing an additional spatial-domain constraint to the conventional spectral-domain energy function. We define our spatial-domain constraint as the $L_1$ distance between the predicted PAN image and the actual PAN image. To estimate the PAN image of the up-sampled HSI, we also propose a learnable spectral response function (SRF). Moreover, we noticed that the residual image between the up-sampled HSI and the reference HSI mainly consists of edge information and very fine structures. In order to accurately estimate fine information, we propose a novel over-complete network, called HyperKite, which focuses on learning high-level features by constraining the receptive from increasing in the deep layers. We perform experiments on three HSI datasets to demonstrate the superiority of our DIP-HyperKite over the state-of-the-art pansharpening methods. The deployment codes, pre-trained models, and final fusion outputs of our DIP-HyperKite and the methods used for the comparisons will be publicly made available at https://github.com/wgcban/DIP-HyperKite.git.
△ Less
Submitted 6 July, 2021;
originally announced July 2021.
-
Over-and-Under Complete Convolutional RNN for MRI Reconstruction
Authors:
Pengfei Guo,
Jeya Maria Jose Valanarasu,
Puyang Wang,
**yuan Zhou,
Shanshan Jiang,
Vishal M. Patel
Abstract:
Reconstructing magnetic resonance (MR) images from undersampled data is a challenging problem due to various artifacts introduced by the under-sampling operation. Recent deep learning-based methods for MR image reconstruction usually leverage a generic auto-encoder architecture which captures low-level features at the initial layers and high-level features at the deeper layers. Such networks focus…
▽ More
Reconstructing magnetic resonance (MR) images from undersampled data is a challenging problem due to various artifacts introduced by the under-sampling operation. Recent deep learning-based methods for MR image reconstruction usually leverage a generic auto-encoder architecture which captures low-level features at the initial layers and high-level features at the deeper layers. Such networks focus much on global features which may not be optimal to reconstruct the fully-sampled image. In this paper, we propose an Over-and-Under Complete Convolutional Recurrent Neural Network (OUCR), which consists of an overcomplete and an undercomplete Convolutional Recurrent Neural Network(CRNN). The overcomplete branch gives special attention in learning local structures by restraining the receptive field of the network. Combining it with the undercomplete branch leads to a network which focuses more on low-level features without losing out on the global structures. Extensive experiments on two datasets demonstrate that the proposed method achieves significant improvements over the compressed sensing and popular deep learning-based methods with less number of trainable parameters.
△ Less
Submitted 24 June, 2021; v1 submitted 16 June, 2021;
originally announced June 2021.
-
Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning
Authors:
Pengfei Guo,
Puyang Wang,
**yuan Zhou,
Shanshan Jiang,
Vishal M. Patel
Abstract:
Fast and accurate reconstruction of magnetic resonance (MR) images from under-sampled data is important in many clinical applications. In recent years, deep learning-based methods have been shown to produce superior performance on MR image reconstruction. However, these methods require large amounts of data which is difficult to collect and share due to the high cost of acquisition and medical dat…
▽ More
Fast and accurate reconstruction of magnetic resonance (MR) images from under-sampled data is important in many clinical applications. In recent years, deep learning-based methods have been shown to produce superior performance on MR image reconstruction. However, these methods require large amounts of data which is difficult to collect and share due to the high cost of acquisition and medical data privacy regulations. In order to overcome this challenge, we propose a federated learning (FL) based solution in which we take advantage of the MR data available at different institutions while preserving patients' privacy. However, the generalizability of models trained with the FL setting can still be suboptimal due to domain shift, which results from the data collected at multiple institutions with different sensors, disease types, and acquisition protocols, etc. With the motivation of circumventing this challenge, we propose a cross-site modeling for MR image reconstruction in which the learned intermediate latent features among different source sites are aligned with the distribution of the latent features at the target site. Extensive experiments are conducted to provide various insights about FL for MR image reconstruction. Experimental results demonstrate that the proposed framework is a promising direction to utilize multi-institutional data without compromising patients' privacy for achieving improved MR image reconstruction. Our code will be available at https://github.com/guopengf/FLMRCM.
△ Less
Submitted 10 March, 2021; v1 submitted 2 March, 2021;
originally announced March 2021.
-
Multi-Feature Multi-Scale CNN-Derived COVID-19 Classification from Lung Ultrasound Data
Authors:
Hui Che,
Jared Radbel,
Jag Sunderram,
John L. Nosher,
Vishal M. Patel,
Ilker Hacihaliloglu
Abstract:
The global pandemic of the novel coronavirus disease 2019 (COVID-19) has put tremendous pressure on the medical system. Imaging plays a complementary role in the management of patients with COVID-19. Computed tomography (CT) and chest X-ray (CXR) are the two dominant screening tools. However, difficulty in eliminating the risk of disease transmission, radiation exposure and not being costeffective…
▽ More
The global pandemic of the novel coronavirus disease 2019 (COVID-19) has put tremendous pressure on the medical system. Imaging plays a complementary role in the management of patients with COVID-19. Computed tomography (CT) and chest X-ray (CXR) are the two dominant screening tools. However, difficulty in eliminating the risk of disease transmission, radiation exposure and not being costeffective are some of the challenges for CT and CXR imaging. This fact induces the implementation of lung ultrasound (LUS) for evaluating COVID-19 due to its practical advantages of noninvasiveness, repeatability, and sensitive bedside property. In this paper, we utilize a deep learning model to perform the classification of COVID-19 from LUS data, which could produce objective diagnostic information for clinicians. Specifically, all LUS images are processed to obtain their corresponding local phase filtered images and radial symmetry transformed images before fed into the multi-scale residual convolutional neural network (CNN). Secondly, image combination as the input of the network is used to explore rich and reliable features. Feature fusion strategy at different levels is adopted to investigate the relationship between the depth of feature aggregation and the classification accuracy. Our proposed method is evaluated on the point-of-care US (POCUS) dataset together with the Italian COVID-19 Lung US database (ICLUS-DB) and shows promising performance for COVID-19 prediction.
△ Less
Submitted 23 February, 2021;
originally announced February 2021.
-
Error Diffusion Halftoning Against Adversarial Examples
Authors:
Shao-Yuan Lo,
Vishal M. Patel
Abstract:
Adversarial examples contain carefully crafted perturbations that can fool deep neural networks (DNNs) into making wrong predictions. Enhancing the adversarial robustness of DNNs has gained considerable interest in recent years. Although image transformation-based defenses were widely considered at an earlier time, most of them have been defeated by adaptive attacks. In this paper, we propose a ne…
▽ More
Adversarial examples contain carefully crafted perturbations that can fool deep neural networks (DNNs) into making wrong predictions. Enhancing the adversarial robustness of DNNs has gained considerable interest in recent years. Although image transformation-based defenses were widely considered at an earlier time, most of them have been defeated by adaptive attacks. In this paper, we propose a new image transformation defense based on error diffusion halftoning, and combine it with adversarial training to defend against adversarial examples. Error diffusion halftoning projects an image into a 1-bit space and diffuses quantization error to neighboring pixels. This process can remove adversarial perturbations from a given image while maintaining acceptable image quality in the meantime in favor of recognition. Experimental results demonstrate that the proposed method is able to improve adversarial robustness even under advanced adaptive attacks, while most of the other image transformation-based defenses do not. We show that a proper image transformation can still be an effective defense approach. Code: https://github.com/shaoyuanlo/Halftoning-Defense
△ Less
Submitted 24 July, 2021; v1 submitted 23 January, 2021;
originally announced January 2021.
-
Overcomplete Representations Against Adversarial Videos
Authors:
Shao-Yuan Lo,
Jeya Maria Jose Valanarasu,
Vishal M. Patel
Abstract:
Adversarial robustness of deep neural networks is an extensively studied problem in the literature and various methods have been proposed to defend against adversarial images. However, only a handful of defense methods have been developed for defending against attacked videos. In this paper, we propose a novel Over-and-Under complete restoration network for Defending against adversarial videos (OU…
▽ More
Adversarial robustness of deep neural networks is an extensively studied problem in the literature and various methods have been proposed to defend against adversarial images. However, only a handful of defense methods have been developed for defending against attacked videos. In this paper, we propose a novel Over-and-Under complete restoration network for Defending against adversarial videos (OUDefend). Most restoration networks adopt an encoder-decoder architecture that first shrinks spatial dimension then expands it back. This approach learns undercomplete representations, which have large receptive fields to collect global information but overlooks local details. On the other hand, overcomplete representations have opposite properties. Hence, OUDefend is designed to balance local and global features by learning those two representations. We attach OUDefend to target video recognition models as a feature restoration block and train the entire network end-to-end. Experimental results show that the defenses focusing on images may be ineffective to videos, while OUDefend enhances robustness against different types of adversarial videos, ranging from additive attacks, multiplicative attacks to physically realizable attacks. Code: https://github.com/shaoyuanlo/OUDefend
△ Less
Submitted 14 June, 2021; v1 submitted 8 December, 2020;
originally announced December 2020.
-
Design and Implementation of Path Trackers for Ackermann Drive based Vehicles
Authors:
Adarsh Patnaik,
Manthan Patel,
Vibhakar Mohta,
Het Shah,
Shubh Agrawal,
Aditya Rathore,
Ritwik Malik,
Debashish Chakravarty,
Ranjan Bhattacharya
Abstract:
This article is an overview of the various literature on path tracking methods and their implementation in simulation and realistic operating environments.The scope of this study includes analysis, implementation,tuning, and comparison of some selected path tracking methods commonly used in practice for trajectory tracking in autonomous vehicles. Many of these methods are applicable at low speed d…
▽ More
This article is an overview of the various literature on path tracking methods and their implementation in simulation and realistic operating environments.The scope of this study includes analysis, implementation,tuning, and comparison of some selected path tracking methods commonly used in practice for trajectory tracking in autonomous vehicles. Many of these methods are applicable at low speed due to the linear assumption for the system model, and hence, some methods are also included that consider nonlinearities present in lateral vehicle dynamics during high-speed navigation. The performance evaluation and comparison of tracking methods are carried out on realistic simulations and a dedicated instrumented passenger car, Mahindra e2o, to get a performance idea of all the methods in realistic operating conditions and develop tuning methodologies for each of the methods. It has been observed that our model predictive control-based approach is able to perform better compared to the others in medium velocity ranges.
△ Less
Submitted 5 December, 2020;
originally announced December 2020.
-
Exploring Overcomplete Representations for Single Image Deraining using CNNs
Authors:
Rajeev Yasarla,
Jeya Maria Jose Valanarasu,
Vishal M. Patel
Abstract:
Removal of rain streaks from a single image is an extremely challenging problem since the rainy images often contain rain streaks of different size, shape, direction and density. Most recent methods for deraining use a deep network following a generic "encoder-decoder" architecture which captures low-level features across the initial layers and high-level features in the deeper layers. For the tas…
▽ More
Removal of rain streaks from a single image is an extremely challenging problem since the rainy images often contain rain streaks of different size, shape, direction and density. Most recent methods for deraining use a deep network following a generic "encoder-decoder" architecture which captures low-level features across the initial layers and high-level features in the deeper layers. For the task of deraining, the rain streaks which are to be removed are relatively small and focusing much on global features is not an efficient way to solve the problem. To this end, we propose using an overcomplete convolutional network architecture which gives special attention in learning local structures by restraining the receptive field of filters. We combine it with U-Net so that it does not lose out on the global structures as well while focusing more on low-level features, to compute the derained image. The proposed network called, Over-and-Under Complete Deraining Network (OUCD), consists of two branches: overcomplete branch which is confined to small receptive field size in order to focus on the local structures and an undercomplete branch that has larger receptive fields to primarily focus on global structures. Extensive experiments on synthetic and real datasets demonstrate that the proposed method achieves significant improvements over the recent state-of-the-art methods.
△ Less
Submitted 20 October, 2020;
originally announced October 2020.
-
KiU-Net: Overcomplete Convolutional Architectures for Biomedical Image and Volumetric Segmentation
Authors:
Jeya Maria Jose Valanarasu,
Vishwanath A. Sindagi,
Ilker Hacihaliloglu,
Vishal M. Patel
Abstract:
Most methods for medical image segmentation use U-Net or its variants as they have been successful in most of the applications. After a detailed analysis of these "traditional" encoder-decoder based approaches, we observed that they perform poorly in detecting smaller structures and are unable to segment boundary regions precisely. This issue can be attributed to the increase in receptive field si…
▽ More
Most methods for medical image segmentation use U-Net or its variants as they have been successful in most of the applications. After a detailed analysis of these "traditional" encoder-decoder based approaches, we observed that they perform poorly in detecting smaller structures and are unable to segment boundary regions precisely. This issue can be attributed to the increase in receptive field size as we go deeper into the encoder. The extra focus on learning high level features causes the U-Net based approaches to learn less information about low-level features which are crucial for detecting small structures. To overcome this issue, we propose using an overcomplete convolutional architecture where we project our input image into a higher dimension such that we constrain the receptive field from increasing in the deep layers of the network. We design a new architecture for image segmentation- KiU-Net which has two branches: (1) an overcomplete convolutional network Kite-Net which learns to capture fine details and accurate edges of the input, and (2) U-Net which learns high level features. Furthermore, we also propose KiU-Net 3D which is a 3D convolutional architecture for volumetric segmentation. We perform a detailed study of KiU-Net by performing experiments on five different datasets covering various image modalities like ultrasound (US), magnetic resonance imaging (MRI), computed tomography (CT), microscopic and fundus images. The proposed method achieves a better performance as compared to all the recent methods with an additional benefit of fewer parameters and faster convergence. Additionally, we also demonstrate that the extensions of KiU-Net based on residual blocks and dense blocks result in further performance improvements. The implementation of KiU-Net can be found here: https://github.com/jeya-maria-jose/KiU-Net-pytorch
△ Less
Submitted 14 October, 2021; v1 submitted 4 October, 2020;
originally announced October 2020.
-
CinC-GAN for Effective F0 prediction for Whisper-to-Normal Speech Conversion
Authors:
Maitreya Patel,
Mirali Purohit,
Jui Shah,
Hemant A. Patil
Abstract:
Recently, Generative Adversarial Networks (GAN)-based methods have shown remarkable performance for the Voice Conversion and WHiSPer-to-normal SPeeCH (WHSP2SPCH) conversion. One of the key challenges in WHSP2SPCH conversion is the prediction of fundamental frequency (F0). Recently, authors have proposed state-of-the-art method Cycle-Consistent Generative Adversarial Networks (CycleGAN) for WHSP2SP…
▽ More
Recently, Generative Adversarial Networks (GAN)-based methods have shown remarkable performance for the Voice Conversion and WHiSPer-to-normal SPeeCH (WHSP2SPCH) conversion. One of the key challenges in WHSP2SPCH conversion is the prediction of fundamental frequency (F0). Recently, authors have proposed state-of-the-art method Cycle-Consistent Generative Adversarial Networks (CycleGAN) for WHSP2SPCH conversion. The CycleGAN-based method uses two different models, one for Mel Cepstral Coefficients (MCC) map**, and another for F0 prediction, where F0 is highly dependent on the pre-trained model of MCC map**. This leads to additional non-linear noise in predicted F0. To suppress this noise, we propose Cycle-in-Cycle GAN (i.e., CinC-GAN). It is specially designed to increase the effectiveness in F0 prediction without losing the accuracy of MCC map**. We evaluated the proposed method on a non-parallel setting and analyzed on speaker-specific, and gender-specific tasks. The objective and subjective tests show that CinC-GAN significantly outperforms the CycleGAN. In addition, we analyze the CycleGAN and CinC-GAN for unseen speakers and the results show the clear superiority of CinC-GAN.
△ Less
Submitted 18 August, 2020;
originally announced August 2020.
-
Confidence-guided Lesion Mask-based Simultaneous Synthesis of Anatomic and Molecular MR Images in Patients with Post-treatment Malignant Gliomas
Authors:
Pengfei Guo,
Puyang Wang,
Rajeev Yasarla,
**yuan Zhou,
Vishal M. Patel,
Shanshan Jiang
Abstract:
Data-driven automatic approaches have demonstrated their great potential in resolving various clinical diagnostic dilemmas in neuro-oncology, especially with the help of standard anatomic and advanced molecular MR images. However, data quantity and quality remain a key determinant of, and a significant limit on, the potential of such applications. In our previous work, we explored synthesis of ana…
▽ More
Data-driven automatic approaches have demonstrated their great potential in resolving various clinical diagnostic dilemmas in neuro-oncology, especially with the help of standard anatomic and advanced molecular MR images. However, data quantity and quality remain a key determinant of, and a significant limit on, the potential of such applications. In our previous work, we explored synthesis of anatomic and molecular MR image network (SAMR) in patients with post-treatment malignant glioms. Now, we extend it and propose Confidence Guided SAMR (CG-SAMR) that synthesizes data from lesion information to multi-modal anatomic sequences, including T1-weighted (T1w), gadolinium enhanced T1w (Gd-T1w), T2-weighted (T2w), and fluid-attenuated inversion recovery (FLAIR), and the molecular amide proton transfer-weighted (APTw) sequence. We introduce a module which guides the synthesis based on confidence measure about the intermediate results. Furthermore, we extend the proposed architecture for unsupervised synthesis so that unpaired data can be used for training the network. Extensive experiments on real clinical data demonstrate that the proposed model can perform better than the state-of-theart synthesis methods.
△ Less
Submitted 6 August, 2020;
originally announced August 2020.
-
Lesion Mask-based Simultaneous Synthesis of Anatomic and MolecularMR Images using a GAN
Authors:
Pengfei Guo,
Puyang Wang,
**yuan Zhou,
Vishal M. Patel,
Shanshan Jiang
Abstract:
Data-driven automatic approaches have demonstrated their great potential in resolving various clinical diagnostic dilemmas for patients with malignant gliomas in neuro-oncology with the help of conventional and advanced molecular MR images. However, the lack of sufficient annotated MRI data has vastly impeded the development of such automatic methods. Conventional data augmentation approaches, inc…
▽ More
Data-driven automatic approaches have demonstrated their great potential in resolving various clinical diagnostic dilemmas for patients with malignant gliomas in neuro-oncology with the help of conventional and advanced molecular MR images. However, the lack of sufficient annotated MRI data has vastly impeded the development of such automatic methods. Conventional data augmentation approaches, including flip**, scaling, rotation, and distortion are not capable of generating data with diverse image content. In this paper, we propose a method, called synthesis of anatomic and molecular MR images network (SAMR), which can simultaneously synthesize data from arbitrary manipulated lesion information on multiple anatomic and molecular MRI sequences, including T1-weighted (T1w), gadolinium enhanced T1w (Gd-T1w), T2-weighted (T2w), fluid-attenuated inversion recovery (FLAIR), and amide proton transfer-weighted (APTw). The proposed framework consists of a stretch-out up-sampling module, a brain atlas encoder, a segmentation consistency module, and multi-scale label-wise discriminators. Extensive experiments on real clinical data demonstrate that the proposed model can perform significantly better than the state-of-the-art synthesis methods.
△ Less
Submitted 26 August, 2020; v1 submitted 25 June, 2020;
originally announced June 2020.
-
Face-to-Music Translation Using a Distance-Preserving Generative Adversarial Network with an Auxiliary Discriminator
Authors:
Chelhwon Kim,
Andrew Port,
Mitesh Patel
Abstract:
Learning a map** between two unrelated domains-such as image and audio, without any supervision is a challenging task. In this work, we propose a distance-preserving generative adversarial model to translate images of human faces into an audio domain. The audio domain is defined by a collection of musical note sounds recorded by 10 different instrument families (NSynth \cite{nsynth2017}) and a d…
▽ More
Learning a map** between two unrelated domains-such as image and audio, without any supervision is a challenging task. In this work, we propose a distance-preserving generative adversarial model to translate images of human faces into an audio domain. The audio domain is defined by a collection of musical note sounds recorded by 10 different instrument families (NSynth \cite{nsynth2017}) and a distance metric where the instrument family class information is incorporated together with a mel-frequency cepstral coefficients (MFCCs) feature. To enforce distance-preservation, a loss term that penalizes difference between pairwise distances of the faces and the translated audio samples is used. Further, we discover that the distance preservation constraint in the generative adversarial model leads to reduced diversity in the translated audio samples, and propose the use of an auxiliary discriminator to enhance the diversity of the translations while using the distance preservation constraint. We also provide a visual demonstration of the results and numerical analysis of the fidelity of the translations. A video demo of our proposed model's learned translation is available in https://www.dropbox.com/s/the176w9obq8465/face_to_musical_note.mov?dl=0.
△ Less
Submitted 24 June, 2020;
originally announced June 2020.
-
KiU-Net: Towards Accurate Segmentation of Biomedical Images using Over-complete Representations
Authors:
Jeya Maria Jose,
Vishwanath Sindagi,
Ilker Hacihaliloglu,
Vishal M. Patel
Abstract:
Due to its excellent performance, U-Net is the most widely used backbone architecture for biomedical image segmentation in the recent years. However, in our studies, we observe that there is a considerable performance drop in the case of detecting smaller anatomical landmarks with blurred noisy boundaries. We analyze this issue in detail, and address it by proposing an over-complete architecture (…
▽ More
Due to its excellent performance, U-Net is the most widely used backbone architecture for biomedical image segmentation in the recent years. However, in our studies, we observe that there is a considerable performance drop in the case of detecting smaller anatomical landmarks with blurred noisy boundaries. We analyze this issue in detail, and address it by proposing an over-complete architecture (Ki-Net) which involves projecting the data onto higher dimensions (in the spatial sense). This network, when augmented with U-Net, results in significant improvements in the case of segmenting small anatomical landmarks and blurred noisy boundaries while obtaining better overall performance. Furthermore, the proposed network has additional benefits like faster convergence and fewer number of parameters. We evaluate the proposed method on the task of brain anatomy segmentation from 2D Ultrasound (US) of preterm neonates, and achieve an improvement of around 4% in terms of the DICE accuracy and Jaccard index as compared to the standard-U-Net, while outperforming the recent best methods by 2%. Code: https://github.com/jeya-maria-jose/KiU-Net-pytorch .
△ Less
Submitted 8 July, 2020; v1 submitted 8 June, 2020;
originally announced June 2020.
-
Deep Sensory Substitution: Noninvasively Enabling Biological Neural Networks to Receive Input from Artificial Neural Networks
Authors:
Andrew Port,
Chelhwon Kim,
Mitesh Patel
Abstract:
As is expressed in the adage "a picture is worth a thousand words", when using spoken language to communicate visual information, brevity can be a challenge. This work describes a novel technique for leveraging machine-learned feature embeddings to sonify visual (and other types of) information into a perceptual audio domain, allowing users to perceive this information using only their aural facul…
▽ More
As is expressed in the adage "a picture is worth a thousand words", when using spoken language to communicate visual information, brevity can be a challenge. This work describes a novel technique for leveraging machine-learned feature embeddings to sonify visual (and other types of) information into a perceptual audio domain, allowing users to perceive this information using only their aural faculty. The system uses a pretrained image embedding network to extract visual features and embed them in a compact subset of Euclidean space -- this converts the images into feature vectors whose $L^2$ distances can be used as a meaningful measure of similarity. A generative adversarial network (GAN) is then used to find a distance preserving map from this metric space of feature vectors into the metric space defined by a target audio dataset equipped with either the Euclidean metric or a mel-frequency cepstrum-based psychoacoustic distance metric. We demonstrate this technique by sonifying images of faces into human speech-like audio. For both target audio metrics, the GAN successfully found a metric preserving map**, and in human subject tests, users were able to accurately classify audio sonifications of faces.
△ Less
Submitted 25 August, 2021; v1 submitted 27 May, 2020;
originally announced May 2020.
-
Uncertainty Quantification for Hyperspectral Image Denoising Frameworks based on Low-rank Matrix Approximation
Authors:
**gwei Song,
Shaobo Xia,
Jun Wang,
Mitesh Patel,
Dong Chen
Abstract:
Sliding-window based low-rank matrix approximation (LRMA) is a technique widely used in hyperspectral images (HSIs) denoising or completion. However, the uncertainty quantification of the restored HSI has not been addressed to date. Accurate uncertainty quantification of the denoised HSI facilitates to applications such as multi-source or multi-scale data fusion, data assimilation, and product unc…
▽ More
Sliding-window based low-rank matrix approximation (LRMA) is a technique widely used in hyperspectral images (HSIs) denoising or completion. However, the uncertainty quantification of the restored HSI has not been addressed to date. Accurate uncertainty quantification of the denoised HSI facilitates to applications such as multi-source or multi-scale data fusion, data assimilation, and product uncertainty quantification, since these applications require an accurate approach to describe the statistical distributions of the input data. Therefore, we propose a prior-free closed-form element-wise uncertainty quantification method for LRMA-based HSI restoration. Our closed-form algorithm overcomes the difficulty of the HSI patch mixing problem caused by the sliding-window strategy used in the conventional LRMA process. The proposed approach only requires the uncertainty of the observed HSI and provides the uncertainty result relatively rapidly and with similar computational complexity as the LRMA technique. We conduct extensive experiments to validate the estimation accuracy of the proposed closed-form uncertainty approach. The method is robust to at least 10% random impulse noise at the cost of 10-20% of additional processing time compared to the LRMA. The experiments indicate that the proposed closed-form uncertainty quantification method is more applicable to real-world applications than the baseline Monte Carlo test, which is computationally expensive. The code is available in the attachment and will be released after the acceptance of this paper.
△ Less
Submitted 6 May, 2022; v1 submitted 22 April, 2020;
originally announced April 2020.
-
Learning to Segment Brain Anatomy from 2D Ultrasound with Less Data
Authors:
Jeya Maria Jose V.,
Rajeev Yasarla,
Puyang Wang,
Ilker Hacihaliloglu,
Vishal M. Patel
Abstract:
Automatic segmentation of anatomical landmarks from ultrasound (US) plays an important role in the management of preterm neonates with a very low birth weight due to the increased risk of develo** intraventricular hemorrhage (IVH) or other complications. One major problem in develo** an automatic segmentation method for this task is the limited availability of annotated data. To tackle this is…
▽ More
Automatic segmentation of anatomical landmarks from ultrasound (US) plays an important role in the management of preterm neonates with a very low birth weight due to the increased risk of develo** intraventricular hemorrhage (IVH) or other complications. One major problem in develo** an automatic segmentation method for this task is the limited availability of annotated data. To tackle this issue, we propose a novel image synthesis method using multi-scale self attention generator to synthesize US images from various segmentation masks. We show that our method can synthesize high-quality US images for every manipulated segmentation label with qualitative and quantitative improvements over the recent state-of-the-art synthesis methods. Furthermore, for the segmentation task, we propose a novel method, called Confidence-guided Brain Anatomy Segmentation (CBAS) network, where segmentation and corresponding confidence maps are estimated at different scales. In addition, we introduce a technique which guides CBAS to learn the weights based on the confidence measure about the estimate. Extensive experiments demonstrate that the proposed method for both synthesis and segmentation tasks achieve significant improvements over the recent state-of-the-art methods. In particular, we show that the new synthesis framework can be used to generate realistic US images which can be used to improve the performance of a segmentation algorithm.
△ Less
Submitted 17 December, 2019;
originally announced December 2019.
-
Confidence Measure Guided Single Image De-raining
Authors:
Rajeev Yasarla,
Vishal M. Patel
Abstract:
Single image de-raining is an extremely challenging problem since the rainy images contain rain streaks which often vary in size, direction and density. This varying characteristic of rain streaks affect different parts of the image differently. Previous approaches have attempted to address this problem by leveraging some prior information to remove rain streaks from a single image. One of the maj…
▽ More
Single image de-raining is an extremely challenging problem since the rainy images contain rain streaks which often vary in size, direction and density. This varying characteristic of rain streaks affect different parts of the image differently. Previous approaches have attempted to address this problem by leveraging some prior information to remove rain streaks from a single image. One of the major limitations of these approaches is that they do not consider the location information of rain drops in the image. The proposed Image Quality-based single image Deraining using Confidence measure (QuDeC), network addresses this issue by learning the quality or distortion level of each patch in the rainy image, and further processes this information to learn the rain content at different scales. In addition, we introduce a technique which guides the network to learn the network weights based on the confidence measure about the estimate of both quality at each location and residual rain streak information (residual map). Extensive experiments on synthetic and real datasets demonstrate that the proposed method achieves significant improvements over the recent state-of-the-art methods.
△ Less
Submitted 9 September, 2019;
originally announced September 2019.
-
Uncertainty Guided Multi-Scale Residual Learning-using a Cycle Spinning CNN for Single Image De-Raining
Authors:
Rajeev Yasarla,
Vishal M. Patel
Abstract:
Single image de-raining is an extremely challenging problem since the rainy image may contain rain streaks which may vary in size, direction and density. Previous approaches have attempted to address this problem by leveraging some prior information to remove rain streaks from a single image. One of the major limitations of these approaches is that they do not consider the location information of…
▽ More
Single image de-raining is an extremely challenging problem since the rainy image may contain rain streaks which may vary in size, direction and density. Previous approaches have attempted to address this problem by leveraging some prior information to remove rain streaks from a single image. One of the major limitations of these approaches is that they do not consider the location information of rain drops in the image. The proposed Uncertainty guided Multi-scale Residual Learning (UMRL) network attempts to address this issue by learning the rain content at different scales and using them to estimate the final de-rained output. In addition, we introduce a technique which guides the network to learn the network weights based on the confidence measure about the estimate. Furthermore, we introduce a new training and testing procedure based on the notion of cycle spinning to improve the final de-raining performance. Extensive experiments on synthetic and real datasets to demonstrate that the proposed method achieves significant improvements over the recent state-of-the-art methods. Code is available at: https://github.com/rajeevyasarla/UMRL--using-Cycle-Spinning
△ Less
Submitted 12 June, 2019;
originally announced June 2019.
-
Densely Connected Pyramid Dehazing Network
Authors:
He Zhang,
Vishal M. Patel
Abstract:
We propose a new end-to-end single image dehazing method, called Densely Connected Pyramid Dehazing Network (DCPDN), which can jointly learn the transmission map, atmospheric light and dehazing all together. The end-to-end learning is achieved by directly embedding the atmospheric scattering model into the network, thereby ensuring that the proposed method strictly follows the physics-driven scatt…
▽ More
We propose a new end-to-end single image dehazing method, called Densely Connected Pyramid Dehazing Network (DCPDN), which can jointly learn the transmission map, atmospheric light and dehazing all together. The end-to-end learning is achieved by directly embedding the atmospheric scattering model into the network, thereby ensuring that the proposed method strictly follows the physics-driven scattering model for dehazing. Inspired by the dense network that can maximize the information flow along features from different levels, we propose a new edge-preserving densely connected encoder-decoder structure with multi-level pyramid pooling module for estimating the transmission map. This network is optimized using a newly introduced edge-preserving loss function. To further incorporate the mutual structural information between the estimated transmission map and the dehazed result, we propose a joint-discriminator based on generative adversarial network framework to decide whether the corresponding dehazed image and the estimated transmission map are real or fake. An ablation study is conducted to demonstrate the effectiveness of each module evaluated at both estimated transmission map and dehazed result. Extensive experiments demonstrate that the proposed method achieves significant improvements over the state-of-the-art methods. Code will be made available at: https://github.com/hezhangsprinter
△ Less
Submitted 22 March, 2018;
originally announced March 2018.