Search | arXiv e-print repository

arXiv:2405.20723 [pdf, other]

Beaconless Auto-Alignment for Single-Wavelength 5 Tbit/s Mode-Division Multiplexing Free-Space Optical Communications

Authors: Yiming Li, Gil Fernandes, David Benton, Antonin Billaud, Mohammed Patel, Andrew Ellis

Abstract: Mode-division multiplexing has shown its ability to significantly increase the capacity of free-space optical communications. An accurate alignment is crucial to enable such links due to possible performance degradation induced by mode crosstalk and narrow beam divergence. Conventionally, a beacon beam is necessary for system alignment due to multiple local maximums in the mode-division multiplexed beam profile. However, the beacon beam introduces excess system complexity, power consumption, and alignment errors. Here we demonstrate a beaconless system with significantly higher alignment accuracy and faster acquisition. This system also excludes excess complexity, power consumption, and alignment errors, facilitating simplified system calibration and supporting a record-high 5.14 Tbit/s line rate in a single-wavelength free-space optical link. We anticipate our paper to be a starting point for more sophisticated alignment scenarios in future multi-Terabit mode-division multiplexing free-space optical communications for long-distance applications with a generalised mode basis.

Submitted 31 May, 2024; originally announced May 2024.

arXiv:2402.08697 [pdf, other]

Weakly Supervised Detection of Pheochromocytomas and Paragangliomas in CT

Authors: David C. Oluigboa, Bikash Santra, Tejas Sudharshan Mathai, Pritam Mukherjee, Jianfei Liu, Abhishek Jha, Mayank Patel, Karel Pacak, Ronald M. Summers

Abstract: Pheochromocytomas and Paragangliomas (PPGLs) are rare adrenal and extra-adrenal tumors which have the potential to metastasize. For the management of patients with PPGLs, CT is the preferred modality for precise localization and estimation of their progression. However, due to the myriad variations in size, morphology, and appearance of the tumors in different anatomical regions, radiologists are posed with the challenge of accurate detection of PPGLs. Since clinicians also need to routinely measure their size and track their changes over time across patient visits, manual demarcation of PPGLs is quite a time-consuming and cumbersome process. To ameliorate the manual effort spent for this task, we propose an automated method to detect PPGLs in CT studies via a proxy segmentation task. As only weak annotations for PPGLs in the form of prospectively marked 2D bounding boxes on an axial slice were available, we extended these 2D boxes into weak 3D annotations and trained a 3D full-resolution nnUNet model to directly segment PPGLs. We evaluated our approach on a dataset consisting of chest-abdomen-pelvis CTs of 255 patients with confirmed PPGLs. We obtained a precision of 70% and sensitivity of 64.1% with our proposed approach when tested on 53 CT studies. Our findings highlight the promising nature of detecting PPGLs via segmentation, and further the state-of-the-art in this exciting yet challenging area of rare cancer management.

Submitted 12 February, 2024; originally announced February 2024.

Comments: Accepted at SPIE 2024. arXiv admin note: text overlap with arXiv:2402.00175

arXiv:2401.09694 [pdf, other]

A Multi-Area Architecture for Real-Time Feedback-Based Optimization of Distribution Grids

Authors: Ilyas Farhat, Etinosa Ekomwenrenren, John W. Simpson-Porco, Evangelos Farantatos, Mahendra Patel, Aboutaleb Haddadi

Abstract: A challenge in transmission-distribution coordination is how to quickly and reliably coordinate Distributed Energy Resources (DERs) across large multi-stakeholder Distribution Networks (DNs) to support the Transmission Network (TN), while ensuring operational constraints continue to be met within the DN. Here we propose a hierarchical feedback-based control architecture for coordination of DERs in DNs, enabling the DN to quickly respond to power set-point requests from the Transmission System Operator (TSO) while maintaining local DN constraints. Our scheme allows for multiple independently-managed areas within the DN to optimize their local resources while coordinating to support the TN, and while maintaining data privacy; the only required inter-area communication is between physically adjacent areas within the DN control hierarchy. We conduct a rigorous stability analysis, establishing intuitive conditions for closed-loop stability, and provide detailed tuning recommendations. The proposal is validated via case studies on multiple feeders, including IEEE-123 and IEEE-8500, using a custom MATLAB-based application which integrates with OpenDSS. The simulation results show that the proposed structure is highly scalable and can quickly coordinate DERs in response to TSO commands, while responding to local disturbances within the DN and maintaining DN operational limits.

Submitted 17 January, 2024; originally announced January 2024.

Comments: 13 pages, 11 figures, for the supplement document (pdf), see https://1drv.ms/b/s!AmH_VfOdVVKazYUpDEVplHjtvj-7ZA?e=oyfXD4

arXiv:2312.14952 [pdf, other]

A Cascaded Neural Network System For Rating Student Performance In Surgical Knot Tying Simulation

Authors: Yunzhe Xue, Olanrewaju Eletta, Justin W. Ady, Nell M. Patel, Advaith Bongu, Usman Roshan

Abstract: As part of their training all medical students and residents have to pass basic surgical tasks such as knot tying, needle-passing, and suturing. Their assessment is typically performed in the operating room by surgical faculty where mistakes and failure by the student increase the operation time and cost. This evaluation is quantitative and has a low margin of error. Simulation has emerged as a cost effective option but it lacks assessment or requires additional expensive hardware for evaluation. Apps that provide training videos on surgical knot tying are available to students but none have evaluation. We propose a cascaded neural network architecture that evaluates a student's performance just from a video of themselves simulating a surgical knot tying task. Our model converts video frame images into feature vectors with a pre-trained deep convolutional network and then models the sequence of frames with a temporal network. We obtained videos of medical students and residents from the Robert Wood Johnson Hospital performing knot tying on a standardized simulation kit. We manually annotated each video and proceeded to do a five-fold cross-validation study on them. Our model achieves a median precision, recall, and F1-score of 0.71, 0.66, and 0.65 respectively in determining the level of knot related tasks of tying and pushing the knot. Our mean precision score averaged across different probability thresholds is 0.8. Our F1-score and mean precision score are 8% and 30% higher, respectively, than those of a recently published study for the same problem. We expect the accuracy of our model to further increase as we add more training videos to the model thus making it a practical solution that students can use to evaluate themselves.

Submitted 9 December, 2023; originally announced December 2023.

Comments: To appear in proceedings of 11th IEEE International Conference on Healthcare Informatics (ICHI) 2023

arXiv:2306.16654 [pdf, other]

Self-Supervised MRI Reconstruction with Unrolled Diffusion Models

Authors: Yilmaz Korkmaz, Tolga Cukur, Vishal M. Patel

Abstract: Magnetic Resonance Imaging (MRI) produces excellent soft tissue contrast, albeit it is an inherently slow imaging modality. Promising deep learning methods have recently been proposed to reconstruct accelerated MRI scans. However, existing methods still suffer from various limitations regarding image fidelity, contextual sensitivity, and reliance on fully-sampled acquisitions for model training. To comprehensively address these limitations, we propose a novel self-supervised deep reconstruction model, named Self-Supervised Diffusion Reconstruction (SSDiffRecon). SSDiffRecon expresses a conditional diffusion process as an unrolled architecture that interleaves cross-attention transformers for reverse diffusion steps with data-consistency blocks for physics-driven processing. Unlike recent diffusion methods for MRI reconstruction, a self-supervision strategy is adopted to train SSDiffRecon using only undersampled k-space data. Comprehensive experiments on public brain MR datasets demonstrate the superiority of SSDiffRecon against state-of-the-art supervised and self-supervised baselines in terms of reconstruction speed and quality. Implementation will be available at https://github.com/yilmazkorkmaz1/SSDiffRecon.

Submitted 15 April, 2024; v1 submitted 28 June, 2023; originally announced June 2023.

arXiv:2306.00838 [pdf, other]

The Brain Tumor Segmentation (BraTS-METS) Challenge 2023: Brain Metastasis Segmentation on Pre-treatment MRI

Authors: Ahmed W. Moawad, Anastasia Janas, Ujjwal Baid, Divya Ramakrishnan, Rachit Saluja, Nader Ashraf, Leon Jekel, Raisa Amiruddin, Maruf Adewole, Jake Albrecht, Udunna Anazodo, Sanjay Aneja, Syed Muhammad Anwar, Timothy Bergquist, Evan Calabrese, Veronica Chiang, Verena Chung, Gian Marco Marco Conte, Farouk Dako, James Eddy, Ivan Ezhov, Ariana Familiar, Keyvan Farahani, Juan Eugenio Iglesias, Zhifan Jiang, et al. (206 additional authors not shown)

Abstract: The translation of AI-generated brain metastases (BM) segmentation into clinical practice relies heavily on diverse, high-quality annotated medical imaging datasets. The BraTS-METS 2023 challenge has gained momentum for testing and benchmarking algorithms using rigorously annotated internationally compiled real-world datasets. This study presents the results of the segmentation challenge and characterizes the challenging cases that impacted the performance of the winning algorithms. Untreated brain metastases on standard anatomic MRI sequences (T1, T2, FLAIR, T1PG) from eight contributed international datasets were annotated in a stepwise method: published UNET algorithms, student, neuroradiologist, final approver neuroradiologist. Segmentations were ranked based on lesion-wise Dice and Hausdorff distance (HD95) scores. False positives (FP) and false negatives (FN) were rigorously penalized, receiving a score of 0 for Dice and a fixed penalty of 374 for HD95. Eight datasets comprising 1303 studies were annotated, with 402 studies (3076 lesions) released on Synapse as publicly available datasets to challenge competitors. Additionally, 31 studies (139 lesions) were held out for validation, and 59 studies (218 lesions) were used for testing. Segmentation accuracy was measured as rank across subjects, with the winning team achieving a LesionWise mean score of 7.9. Common errors among the leading teams included false negatives for small lesions and misregistration of masks in space. The BraTS-METS 2023 challenge successfully curated well-annotated, diverse datasets and identified common errors, facilitating the translation of BM segmentation across varied clinical environments and providing personalized volumetric reports to patients undergoing BM treatment.

Submitted 17 June, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

arXiv:2304.04927 [pdf, ps, other]

Data-Driven Fast Frequency Control using Inverter-Based Resources

Authors: Etinosa Ekomwenrenren, John W. Simpson-Porco, Evangelos Farantatos, Mahendra Patel, Aboutaleb Haddadi, Lin Zhu

Abstract: To address the control challenges associated with the increasing share of inverter-connected renewable energy resources, this paper proposes a direct data-driven approach for fast frequency control in the bulk power system. The proposed control scheme partitions the power system into control areas, and leverages local dispatchable inverter-based resources to rapidly mitigate local power imbalances upon events. The controller design is based directly on historical measurement sequences, and does not require identification of a parametric power system model. Theoretical results are provided to support the approach. Simulation studies on a nonlinear three-area test system demonstrate that the controller provides fast and localized frequency control under several types of contingencies.

Submitted 10 April, 2023; originally announced April 2023.

arXiv:2303.09536 [pdf, other]

Deep Metric Learning for Unsupervised Remote Sensing Change Detection

Authors: Wele Gedara Chaminda Bandara, Vishal M. Patel

Abstract: Remote Sensing Change Detection (RS-CD) aims to detect relevant changes from Multi-Temporal Remote Sensing Images (MT-RSIs), which aids in various RS applications such as land cover, land use, human development analysis, and disaster response. The performance of existing RS-CD methods is attributed to training on large annotated datasets. Furthermore, most of these models are less transferable in the sense that the trained model often performs very poorly when there is a domain gap between training and test datasets. This paper proposes an unsupervised CD method based on deep metric learning that can deal with both of these issues. Given an MT-RSI, the proposed method generates a corresponding change probability map by iteratively optimizing an unsupervised CD loss without training it on a large dataset. Our unsupervised CD method consists of two interconnected deep networks, namely Deep-Change Probability Generator (D-CPG) and Deep-Feature Extractor (D-FE). The D-CPG is designed to predict change and no change probability maps for a given MT-RSI, while D-FE is used to extract deep features of MT-RSI that will be further used in the proposed unsupervised CD loss. We use transfer learning capability to initialize the parameters of D-FE. We iteratively optimize the parameters of D-CPG and D-FE for a given MT-RSI by minimizing the proposed unsupervised "similarity-dissimilarity loss". This loss is motivated by the principle of metric learning where we simultaneously maximize the distance between change pair-wise pixels while minimizing the distance between no-change pair-wise pixels in bi-temporal image domain and their deep feature domain. The experiments conducted on three CD datasets show that our unsupervised CD method achieves significant improvements over the state-of-the-art supervised and unsupervised CD methods. Code available at https://github.com/wgcban/Metric-CD

Submitted 16 March, 2023; originally announced March 2023.

Comments: Code available at https://github.com/wgcban/Metric-CD

arXiv:2209.09498 [pdf, other]

NBD-GAP: Non-Blind Image Deblurring Without Clean Target Images

Authors: Nithin Gopalakrishnan Nair, Rajeev Yasarla, Vishal M. Patel

Abstract: In recent years, deep neural network-based restoration methods have achieved state-of-the-art results in various image deblurring tasks. However, one major drawback of deep learning-based deblurring networks is that large amounts of blurry-clean image pairs are required for training to achieve good performance. Moreover, deep networks often fail to perform well when the blurry images and the blur kernels during testing are very different from the ones used during training. This happens mainly because of the overfitting of the network parameters on the training data. In this work, we present a method that addresses these issues. We view the non-blind image deblurring problem as a denoising problem. To do so, we perform Wiener filtering on a pair of blurry images with the corresponding blur kernels. This results in a pair of images with colored noise. Hence, the deblurring problem is translated into a denoising problem. We then solve the denoising problem without using explicit clean target images. Extensive experiments are conducted to show that our method achieves results that are on par with the state-of-the-art non-blind deblurring works.

Submitted 20 September, 2022; originally announced September 2022.

Comments: Accepted at ICIP 2022

arXiv:2209.06172 [pdf]

Comparative analysis of segmentation and generative models for fingerprint retrieval task

Authors: Megh Patel, Devarsh Patel, Sarthak Patel

Abstract: Biometric Authentication like Fingerprints has become an integral part of the modern technology for authentication and verification of users. It is pervasive in more ways than most of us are aware of. However, these fingerprint images deteriorate in quality if the fingers are dirty, wet, injured or when sensors malfunction. Therefore, extricating the original fingerprint by removing the noise and inpainting it to restructure the image is crucial for its authentication. Hence, this paper proposes a deep learning approach to address these issues using Generative (GAN) and Segmentation models. Qualitative and Quantitative comparison has been done between pix2pixGAN and cycleGAN (generative models) as well as U-net (segmentation model). To train the model, we created our own dataset NFD - Noisy Fingerprint Dataset meticulously with different backgrounds along with scratches in some images to make it more realistic and robust. In our research, the U-net model performed better than the GAN networks.

Submitted 2 November, 2022; v1 submitted 13 September, 2022; originally announced September 2022.

Comments: This is a working draft and not intended for publication

arXiv:2208.01761 [pdf, other]

Data-Driven Fast Frequency Control using Inverter-Based Resources

Authors: Etinosa Ekomwenrenren, John Simpson-Porco, Evangelos Farantatos, Mahendra Patel, Aboutaleb Haddadi, Lin Zhu

Abstract: We develop and test a data-driven and area-based fast frequency control scheme, which rapidly redispatches inverter-based resources to compensate for local power imbalances within the bulk power system. The approach requires no explicit system model information, relying only on historical measurement sequences for the computation of control actions. Our technical approach fuses developments in low-gain estimator design and data-driven control to provide a model-free and practical solution for fast frequency control. Theoretical results and extensive simulation scenarios on a three area system are provided to support the approach.

Submitted 2 August, 2022; originally announced August 2022.

Comments: In proceedings of the 11th Bulk Power Systems Dynamics and Control Symposium (IREP 2022), July 25-30, 2022, Banff, Canada

Report number: IREP2022-63

arXiv:2208.00836 [pdf, other]

doi 10.1109/JLT.2022.3209092

Enhanced Atmospheric Turbulence Resiliency with Successive Interference Cancellation DSP in Mode Division Multiplexing Free-Space Optical Links

Authors: Yiming Li, Zhaozhong Chen, Zhouyi Hu, David M. Benton, Abdallah A. I. Ali, Mohammed Patel, Martin P. J. Lavery, Andrew D. Ellis

Abstract: We experimentally demonstrate the enhanced atmospheric turbulence resiliency in a 137.8 Gbit/s/mode mode-division multiplexing free-space optical communication link through the application of a successive interference cancellation digital signal processing algorithm. The turbulence resiliency is further enhanced through redundant receive channels in the mode-division multiplexing link. The proof of concept demonstration is performed using commercially available mode-selective photonic lanterns, a commercial transponder, and a spatial light modulator based turbulence emulator. In this link, 5 spatial modes with each mode carrying 34.46 GBaud dual-polarization quadrature phase shift keying signals are successfully transmitted with an average bit error rate lower than the hard-decision forward error correction limit. As a result, we achieved a record-high mode- and polarization-division multiplexing channel number of 10, a record-high line rate of 689.23 Gbit/s, and a record-high net spectral efficiency of 13.9 bit/s/Hz in emulated turbulent links in a mode-division multiplexing free-space optical system.

Submitted 19 July, 2022; originally announced August 2022.

arXiv:2207.03447 [pdf, other]

Learning to restore images degraded by atmospheric turbulence using uncertainty

Authors: Rajeev Yasarla, Vishal M. Patel

Abstract: Atmospheric turbulence can significantly degrade the quality of images acquired by long-range imaging systems by causing spatially and temporally random fluctuations in the index of refraction of the atmosphere. Variations in the refractive index cause the captured images to be geometrically distorted and blurry. Hence, it is important to compensate for the visual degradation in images caused by atmospheric turbulence. In this paper, we propose a deep learning-based approach for restoring a single image degraded by atmospheric turbulence. We make use of the epistemic uncertainty based on Monte Carlo dropouts to capture regions in the image where the network has a hard time restoring. The estimated uncertainty maps are then used to guide the network to obtain the restored image. Extensive experiments are conducted on synthetic and real images to show the significance of the proposed work. Code is available at https://github.com/rajeevyasarla/AT-Net

Submitted 7 July, 2022; originally announced July 2022.

Comments: Recognized as Best Paper at IEEE International Conference on Image Processing, 2021. arXiv admin note: substantial text overlap with arXiv:2007.08404

arXiv:2206.08936 [pdf, other]

Simultaneous Bone and Shadow Segmentation Network using Task Correspondence Consistency

Authors: Aimon Rahman, Jeya Maria Jose Valanarasu, Ilker Hacihaliloglu, Vishal M Patel

Abstract: Segmenting both bone surface and the corresponding acoustic shadow are fundamental tasks in ultrasound (US) guided orthopedic procedures. However, these tasks are challenging due to minimal and blurred bone surface response in US images, cross-machine discrepancy, imaging artifacts, and low signal-to-noise ratio. Notably, bone shadows are caused by a significant acoustic impedance mismatch between the soft tissue and bone surfaces. To leverage this mutual information between these highly related tasks, we propose a single end-to-end network with a shared transformer-based encoder and task independent decoders for simultaneous bone and shadow segmentation. To share complementary features, we propose a cross task feature transfer block which learns to transfer meaningful features from the decoder of shadow segmentation to that of bone segmentation and vice-versa. We also introduce a correspondence consistency loss which makes sure that the network utilizes the inter-dependency between the bone surface and its corresponding shadow to refine the segmentation. Validation against expert annotations shows that the method outperforms the previous state-of-the-art for both bone surface and shadow segmentation.

Submitted 16 June, 2022; originally announced June 2022.

Comments: Accepted at MICCAI 2022

arXiv:2206.08481 [pdf, other]

Orientation-guided Graph Convolutional Network for Bone Surface Segmentation

Authors: Aimon Rahman, Wele Gedara Chaminda Bandara, Jeya Maria Jose Valanarasu, Ilker Hacihaliloglu, Vishal M Patel

Abstract: Due to imaging artifacts and low signal-to-noise ratio in ultrasound images, automatic bone surface segmentation networks often produce fragmented predictions that can hinder the success of ultrasound-guided computer-assisted surgical procedures. Existing pixel-wise predictions often fail to capture the accurate topology of bone tissues due to a lack of supervision to enforce connectivity. In this work, we propose an orientation-guided graph convolutional network to improve connectivity while segmenting the bone surface. We also propose an additional supervision on the orientation of the bone surface to further impose connectivity. We validated our approach on 1042 in vivo US scans of femur, knee, spine, and distal radius. Our approach improves over the state-of-the-art methods by 5.01% in connectivity metric.

Submitted 16 June, 2022; originally announced June 2022.

Comments: Accepted at MICCAI 2022

arXiv:2206.04514 [pdf, ps, other]

doi 10.1109/LGRS.2023.3270799

SAR Despeckling using a Denoising Diffusion Probabilistic Model

Authors: Malsha V. Perera, Nithin Gopalakrishnan Nair, Wele Gedara Chaminda Bandara, Vishal M. Patel

Abstract: Speckle is a multiplicative noise which affects all coherent imaging modalities including Synthetic Aperture Radar (SAR) images. The presence of speckle degrades the image quality and adversely affects the performance of SAR image understanding applications such as automatic target recognition and change detection. Thus, SAR despeckling is an important problem in remote sensing. In this paper, we introduce SAR-DDPM, a denoising diffusion probabilistic model for SAR despeckling. The proposed method comprises a Markov chain that transforms clean images to white Gaussian noise by repeatedly adding random noise. The despeckled image is recovered by a reverse process which iteratively predicts the added noise using a noise predictor which is conditioned on the speckled image. In addition, we propose a new inference strategy based on cycle spinning to improve the despeckling performance. Our experiments on both synthetic and real SAR images demonstrate that the proposed method achieves significant improvements in both quantitative and qualitative results over the state-of-the-art despeckling methods.

Submitted 9 June, 2022; originally announced June 2022.

Comments: Our code is available at https://github.com/malshaV/SAR_DDPM

arXiv:2205.15906 [pdf, ps, other]

SAR Despeckling Using Overcomplete Convolutional Networks

Authors: Malsha V. Perera, Wele Gedara Chaminda Bandara, Jeya Maria Jose Valanarasu, Vishal M. Patel

Abstract: Synthetic Aperture Radar (SAR) despeckling is an important problem in remote sensing as speckle degrades SAR images, affecting downstream tasks like detection and segmentation. Recent studies show that convolutional neural networks (CNNs) outperform classical despeckling methods. Traditional CNNs try to increase the receptive field size as the network goes deeper, thus extracting global features. However, speckle is relatively small, and increasing the receptive field does not help in extracting speckle features. This study employs an overcomplete CNN architecture to focus on learning low-level features by restricting the receptive field. The proposed network consists of an overcomplete branch to focus on the local structures and an undercomplete branch that focuses on the global structures. We show that the proposed network improves despeckling performance compared to recent despeckling methods on synthetic and real SAR images.

Submitted 31 May, 2022; originally announced May 2022.

Comments: Accepted to International Geoscience and Remote Sensing Symposium (IGARSS), 2022. Our code is available at https://github.com/malshaV/sar_overcomplete

arXiv:2204.11669 [pdf]

doi 10.1038/s41746-023-00859-y

Deep-learning-enabled Brain Hemodynamic Mapping Using Resting-state fMRI

Authors: Xirui Hou, Pengfei Guo, Puyang Wang, Peiying Liu, Doris D. M. Lin, Hongli Fan, Yang Li, Zhiliang Wei, Zixuan Lin, Dengrong Jiang, ** **, Catherine Kelly, Jay J. Pillai, Judy Huang, Marco C. Pinho, Binu P. Thomas, Babu G. Welch, Denise C. Park, Vishal M. Patel, Argye E. Hillis, Hanzhang Lu

Abstract: Cerebrovascular disease is a leading cause of death globally. Prevention and early intervention are known to be the most effective forms of its management. Non-invasive imaging methods hold great promise for early stratification, but at present lack the sensitivity for personalized prognosis. Resting-state functional magnetic resonance imaging (rs-fMRI), a powerful tool previously used for mapping neural activity, is available in most hospitals. Here we show that rs-fMRI can be used to map cerebral hemodynamic function and delineate impairment. By exploiting time variations in breathing pattern during rs-fMRI, deep learning enables reproducible mapping of cerebrovascular reactivity (CVR) and bolus arrival time (BAT) of the human brain using resting-state CO2 fluctuations as a natural 'contrast media'. The deep-learning network was trained with CVR and BAT maps obtained with a reference method of CO2-inhalation MRI, which included data from young and older healthy subjects and patients with Moyamoya disease and brain tumors. We demonstrate the performance of deep-learning cerebrovascular mapping in the detection of vascular abnormalities, evaluation of revascularization effects, and vascular alterations in normal aging. In addition, cerebrovascular maps obtained with the proposed method exhibited excellent reproducibility in both healthy volunteers and stroke patients. Deep-learning resting-state vascular imaging has the potential to become a useful tool in clinical cerebrovascular imaging.

Submitted 25 April, 2022; originally announced April 2022.

Journal ref: npj Digital Medicine (2023) 116

arXiv:2204.08974 [pdf, other]

A comparison of different atmospheric turbulence simulation methods for image restoration

Authors: Nithin Gopalakrishnan Nair, Kangfu Mei, Vishal M. Patel

Abstract: Atmospheric turbulence deteriorates the quality of images captured by long-range imaging systems by introducing blur and geometric distortions to the captured scene. This leads to a drastic drop in performance when computer vision algorithms like object/face recognition and detection are performed on these images. In recent years, various deep learning-based atmospheric turbulence mitigation methods have been proposed in the literature. These methods are often trained using synthetically generated images and tested on real-world images. Hence, the performance of these restoration methods depends on the type of simulation used for training the network. In this paper, we systematically evaluate the effectiveness of various turbulence simulation methods on image restoration. In particular, we evaluate the performance of two state-of-the-art restoration networks using six simulation methods on a real-world LRFID dataset consisting of face images degraded by turbulence. This paper will provide guidance to researchers and practitioners working in this field in choosing suitable data generation models for training deep models for turbulence mitigation. The implementation codes for the simulation methods, source codes for the networks, and the pre-trained models will be made publicly available.

Submitted 19 April, 2022; originally announced April 2022.

arXiv:2204.08454 [pdf, other]

Revisiting Consistency Regularization for Semi-supervised Change Detection in Remote Sensing Images

Authors: Wele Gedara Chaminda Bandara, Vishal M. Patel

Abstract: Remote-sensing (RS) Change Detection (CD) aims to detect "changes of interest" from co-registered bi-temporal images. The performance of existing deep supervised CD methods is attributed to the large amounts of annotated data used to train the networks. However, annotating large amounts of remote sensing images is labor-intensive and expensive, particularly with bi-temporal images, as it requires pixel-wise comparisons by a human expert. On the other hand, we often have access to unlimited unlabeled multi-temporal RS imagery thanks to ever-increasing earth observation programs. In this paper, we propose a simple yet effective way to leverage the information from unlabeled bi-temporal images to improve the performance of CD approaches. More specifically, we propose a semi-supervised CD model in which we formulate an unsupervised CD loss in addition to the supervised Cross-Entropy (CE) loss by constraining the output change probability map of a given unlabeled bi-temporal image pair to be consistent under the small random perturbations applied on the deep feature difference map that is obtained by subtracting their latent feature representations. Experiments conducted on two publicly available CD datasets show that the proposed semi-supervised CD method can reach closer to the performance of supervised CD even with access to as little as 10% of the annotated training data. Code available at https://github.com/wgcban/SemiCD

Submitted 21 April, 2022; v1 submitted 18 April, 2022; originally announced April 2022.

Comments: Code available at https://github.com/wgcban/SemiCD 36 pages

arXiv:2203.06338 [pdf, other]

Auto-FedRL: Federated Hyperparameter Optimization for Multi-institutional Medical Image Segmentation

Authors: Pengfei Guo, Dong Yang, Ali Hatamizadeh, An Xu, Ziyue Xu, Wenqi Li, Can Zhao, Daguang Xu, Stephanie Harmon, Evrim Turkbey, Baris Turkbey, Bradford Wood, Francesca Patella, Elvira Stellato, Gianpaolo Carrafiello, Vishal M. Patel, Holger R. Roth

Abstract: Federated learning (FL) is a distributed machine learning technique that enables collaborative model training while avoiding explicit data sharing. The inherent privacy-preserving property of FL algorithms makes them especially attractive to the medical field. However, in case of heterogeneous client data distributions, standard FL methods are unstable and require intensive hyperparameter tuning to achieve optimal performance. Conventional hyperparameter optimization algorithms are impractical in real-world FL applications as they involve numerous training trials, which are often not affordable with limited compute budgets. In this work, we propose an efficient reinforcement learning (RL)-based federated hyperparameter optimization algorithm, termed Auto-FedRL, in which an online RL agent can dynamically adjust hyperparameters of each client based on the current training progress. Extensive experiments are conducted to investigate different search strategies and RL agents. The effectiveness of the proposed method is validated on a heterogeneous data split of the CIFAR-10 dataset as well as two real-world medical image segmentation datasets for COVID-19 lesion segmentation in chest CT and pancreas segmentation in abdominal CT.

Submitted 31 August, 2022; v1 submitted 11 March, 2022; originally announced March 2022.

arXiv:2203.05574 [pdf, other]

On-the-Fly Test-time Adaptation for Medical Image Segmentation

Authors: Jeya Maria Jose Valanarasu, Pengfei Guo, Vibashan VS, Vishal M. Patel

Abstract: One major problem in deep learning-based solutions for medical imaging is the drop in performance when a model is tested on a data distribution different from the one that it is trained on. Adapting the source model to the target data distribution at test-time is an efficient solution for the data-shift problem. Previous methods solve this by adapting the model to the target distribution using techniques like entropy minimization or regularization. In these methods, the models are still updated by back-propagation using an unsupervised loss on the complete test data distribution. In real-world clinical settings, it makes more sense to adapt a model to a new test image on-the-fly and avoid model updates during inference due to privacy concerns and the lack of computing resources at deployment. To this end, we propose a new setting - On-the-Fly Adaptation which is zero-shot and episodic (i.e., the model is adapted to a single image at a time and also does not perform any back-propagation during test-time). To achieve this, we propose a new framework called Adaptive UNet where each convolutional block is equipped with an adaptive batch normalization layer to adapt the features with respect to a domain code. The domain code is generated using a pre-trained encoder trained on a large corpus of medical images. During test-time, the model takes in just the new test image and generates a domain code to adapt the features of the source model according to the test data. We validate the performance on both 2D and 3D data distribution shifts where we get a better performance compared to previous test-time adaptation methods. Code is available at https://github.com/jeya-maria-jose/On-The-Fly-Adaptation

Submitted 10 March, 2022; originally announced March 2022.

Comments: Tech Report

arXiv:2203.04967 [pdf, other]

UNeXt: MLP-based Rapid Medical Image Segmentation Network

Authors: Jeya Maria Jose Valanarasu, Vishal M. Patel

Abstract: UNet and its latest extensions like TransUNet have been the leading medical image segmentation methods in recent years. However, these networks cannot be effectively adopted for rapid image segmentation in point-of-care applications as they are parameter-heavy, computationally complex and slow to use. To this end, we propose UNeXt which is a Convolutional multilayer perceptron (MLP) based network for image segmentation. We design UNeXt in an effective way with an early convolutional stage and an MLP stage in the latent stage. We propose a tokenized MLP block where we efficiently tokenize and project the convolutional features and use MLPs to model the representation. To further boost the performance, we propose shifting the channels of the inputs while feeding them into MLPs so as to focus on learning local dependencies. Using tokenized MLPs in latent space reduces the number of parameters and computational complexity while being able to result in a better representation to help segmentation. The network also consists of skip connections between various levels of encoder and decoder. We test UNeXt on multiple medical image segmentation datasets and show that we reduce the number of parameters by 72x, decrease the computational complexity by 68x, and improve the inference speed by 10x while also obtaining better segmentation performance over the state-of-the-art medical image segmentation architectures. Code is available at https://github.com/jeya-maria-jose/UNeXt-pytorch

Submitted 9 March, 2022; originally announced March 2022.

Comments: Tech Report

arXiv:2203.02503 [pdf, other]

HyperTransformer: A Textural and Spectral Feature Fusion Transformer for Pansharpening

Authors: Wele Gedara Chaminda Bandara, Vishal M. Patel

Abstract: Pansharpening aims to fuse a registered high-resolution panchromatic image (PAN) with a low-resolution hyperspectral image (LR-HSI) to generate an enhanced HSI with high spectral and spatial resolution. Existing pansharpening approaches neglect using an attention mechanism to transfer HR texture features from PAN to LR-HSI features, resulting in spatial and spectral distortions. In this paper, we present a novel attention mechanism for pansharpening called HyperTransformer, in which features of LR-HSI and PAN are formulated as queries and keys in a transformer, respectively. HyperTransformer consists of three main modules, namely two separate feature extractors for PAN and HSI, a multi-head feature soft attention module, and a spatial-spectral feature fusion module. Such a network improves both spatial and spectral quality measures of the pansharpened HSI by learning cross-feature space dependencies and long-range details of PAN and LR-HSI. Furthermore, HyperTransformer can be utilized across multiple spatial scales at the backbone for obtaining improved performance. Extensive experiments conducted on three widely used datasets demonstrate that HyperTransformer achieves significant improvement over the state-of-the-art methods on both spatial and spectral quality measures. Implementation code and pre-trained weights can be accessed at https://github.com/wgcban/HyperTransformer.

Submitted 28 March, 2022; v1 submitted 4 March, 2022; originally announced March 2022.

Comments: Accepted at CVPR'22. Project page: https://www.wgcban.com/research#h.ar24vwqlm021 Code available at: https://github.com/wgcban/HyperTransformer

arXiv:2201.09376 [pdf, other]

ReconFormer: Accelerated MRI Reconstruction Using Recurrent Transformer

Authors: Pengfei Guo, Yiqun Mei, **yuan Zhou, Shanshan Jiang, Vishal M. Patel

Abstract: Accelerating the magnetic resonance image (MRI) reconstruction process is a challenging ill-posed inverse problem due to the excessive under-sampling operation in k-space. In this paper, we propose a recurrent transformer model, namely ReconFormer, for MRI reconstruction which can iteratively reconstruct high-fidelity magnetic resonance images from highly under-sampled k-space data. In particular, the proposed architecture is built upon Recurrent Pyramid Transformer Layers (RPTL), which jointly exploit intrinsic multi-scale information at every architecture unit as well as the dependencies of the deep feature correlation through recurrent states. Moreover, the proposed ReconFormer is lightweight since it employs the recurrent structure for its parameter efficiency. We validate the effectiveness of ReconFormer on multiple datasets with different magnetic resonance sequences and show that it achieves significant improvements over the state-of-the-art methods with better parameter efficiency. Implementation code will be available at https://github.com/guopengf/ReconFormer.

Submitted 27 January, 2022; v1 submitted 23 January, 2022; originally announced January 2022.

arXiv:2201.09355 [pdf, ps, other]

Transformer-based SAR Image Despeckling

Authors: Malsha V. Perera, Wele Gedara Chaminda Bandara, Jeya Maria Jose Valanarasu, Vishal M. Patel

Abstract: Synthetic Aperture Radar (SAR) images are usually degraded by a multiplicative noise known as speckle which makes processing and interpretation of SAR images difficult. In this paper, we introduce a transformer-based network for SAR image despeckling. The proposed despeckling network comprises a transformer-based encoder which allows the network to learn global dependencies between different image regions - aiding in better despeckling. The network is trained end-to-end with synthetically generated speckled images using a composite loss function. Experiments show that the proposed method achieves significant improvements over traditional and convolutional neural network-based despeckling methods on both synthetic and real SAR images.

Submitted 23 January, 2022; originally announced January 2022.

Comments: Submitted to International Geoscience and Remote Sensing Symposium (IGARSS), 2022. Our code is available at https://github.com/malshaV/sar_transformer

arXiv:2112.01183 [pdf]

doi 10.1016/j.energy.2021.123086

Shallow geothermal energy potential for heating and cooling of buildings with regeneration under climate change scenarios

Authors: Alina Walch, Xiang Li, Jonathan Chambers, Nahid Mohajeri, Selin Yilmaz, Martin Patel, Jean-Louis Scartezzini

Abstract: Shallow ground-source heat pumps (GSHPs) are a promising technology for contributing to the decarbonisation of the energy sector. In heating-dominated climates, the combined use of GSHPs for both heating and cooling increases their technical potential, defined as the maximum energy that can be exchanged with the ground, as the re-injection of excess heat from space cooling leads to a seasonal regeneration of the ground. This paper proposes a new approach to quantify the technical potential of GSHPs, accounting for effects of seasonal regeneration, and to estimate the useful energy to supply building energy demands at regional scale. The useful energy is obtained for direct heat exchange and for district heating and cooling (DHC) under several scenarios for climate change and market penetration levels of cooling systems. The case study in western Switzerland suggests that seasonal regeneration allows for annual maximum heat extraction densities above 300 kWh/m$^2$ at heat injection densities above 330 kWh/m$^2$. Results also show that GSHPs may cover up to 55% of heating demand while covering 57% of service-sector cooling demand for individual GSHPs in 2050, which increases to around 85% with DHC. The regional-scale results may serve to inform decision making on strategic areas for installing GSHPs.

Submitted 2 December, 2021; originally announced December 2021.

Comments: Walch and Li contributed equally. Revision submitted to Energy

arXiv:2111.00837 [pdf, other]

Simulating Realistic MRI variations to Improve Deep Learning model and visual explanations using GradCAM

Authors: Muhammad Ilyas Patel, Shrey Singla, Razeem Ahmad Ali Mattathodi, Sumit Sharma, Deepam Gautam, Srinivasa Rao Kundeti

Abstract: In the medical field, landmark detection in MRI plays an important role in reducing medical technician efforts in tasks like scan planning, image registration, etc. First, 88 landmarks spread across the brain anatomy in the three respective views -- sagittal, coronal, and axial -- are manually annotated; later, guidelines from the expert clinical technicians are taken sub-anatomy-wise, for better localization of the existing landmarks, in order to identify and locate the important atlas landmarks even in oblique scans. To overcome limited data availability, we implement realistic data augmentation to generate synthetic 3D volumetric data. We use a modified HighRes3DNet model for solving the brain MRI volumetric landmark detection problem. In order to visually explain our trained model on unseen data, and discern a stronger model from a weaker model, we implement Gradient-weighted Class Activation Mapping (Grad-CAM) which produces a coarse localization map highlighting the regions the model is focusing on. Our experiments show that the proposed method shows favorable results, and the overall pipeline can be extended to a variable number of landmarks and other anatomies.

Submitted 1 November, 2021; originally announced November 2021.

Comments: 8 pages, 9 figures, IEEE-CCEM 2021 conference

arXiv:2109.03446 [pdf, other]

Hierarchical Frequency and Voltage Control using Prioritized Utilization of Inverter Based Resources

Authors: Rahul Chakraborty, Aranya Chakrabortty, Evangelos Farantatos, Mahendra Patel, Hossein Hooshyar, Atena Darvishi

Abstract: We propose a novel hierarchical frequency and voltage control design for multi-area power systems integrated with inverter-based resources (IBRs). The design is based on the idea of prioritizing the use of IBRs over conventional generator-based control in compensating for sudden and unpredicted changes in loads and generations, thereby mitigating any undesired dynamics in the frequency or the voltage by exploiting their fast actuation time constants. A new sequential optimization problem, referred to as Area Prioritized Power Flow (APPF), is formulated to model this prioritization. It is shown that, compared to conventional power flow, APPF not only leads to a fairer balance between the dispatch of active and reactive power from the IBRs and the synchronous generators, but also limits the impact of any contingency from spreading out beyond its respective control area, thereby guaranteeing a better collective dynamic performance of the grid. This improvement, however, comes at the cost of adding an extra layer of communication needed for executing APPF in a hierarchical way. Results are validated using simulations of a 9-machine, 6-IBR, 33-bus, 3-area power system model, illustrating how APPF can mitigate a disturbance faster and more efficiently by prioritizing the use of local area-resources.

Submitted 5 February, 2022; v1 submitted 8 September, 2021; originally announced September 2021.

arXiv:2107.12775 [pdf, other]

doi 10.1007/978-3-030-87583-1_18

Realistic Ultrasound Image Synthesis for Improved Classification of Liver Disease

Authors: Hui Che, Sumana Ramanathan, David Foran, John L Nosher, Vishal M Patel, Ilker Hacihaliloglu

Abstract: With the success of deep learning-based methods applied in medical image analysis, convolutional neural networks (CNNs) have been investigated for classifying liver disease from ultrasound (US) data. However, the scarcity of available large-scale labeled US data has hindered the success of CNNs for classifying liver disease from US data. In this work, we propose a novel generative adversarial network (GAN) architecture for realistic diseased and healthy liver US image synthesis. We adopt the concept of stacking to synthesize realistic liver US data. Quantitative and qualitative evaluation is performed on 550 in-vivo B-mode liver US images collected from 55 subjects. We also show that the synthesized images, together with real in vivo data, can be used to significantly improve the performance of traditional CNN architectures for Nonalcoholic fatty liver disease (NAFLD) classification.

Submitted 27 July, 2021; originally announced July 2021.

Comments: Accepted for presentation at the 2021 MICCAI-International Workshop of Advances in Simplifying Medical UltraSound (ASMUS2021)

arXiv:2107.02630 [pdf, other]

Hyperspectral Pansharpening Based on Improved Deep Image Prior and Residual Reconstruction

Authors: Wele Gedara Chaminda Bandara, Jeya Maria Jose Valanarasu, Vishal M. Patel

Abstract: Hyperspectral pansharpening aims to synthesize a low-resolution hyperspectral image (LR-HSI) with a registered panchromatic image (PAN) to generate an enhanced HSI with high spectral and spatial resolution. Recently proposed HS pansharpening methods have obtained remarkable results using deep convolutional networks (ConvNets), which typically consist of three steps: (1) up-sampling the LR-HSI, (2) predicting the residual image via a ConvNet, and (3) obtaining the final fused HSI by adding the outputs from the first and second steps. Recent methods have leveraged Deep Image Prior (DIP) to up-sample the LR-HSI due to its excellent ability to preserve both spatial and spectral information, without learning from large data sets. However, we observed that the quality of up-sampled HSIs can be further improved by introducing an additional spatial-domain constraint to the conventional spectral-domain energy function. We define our spatial-domain constraint as the $L_1$ distance between the predicted PAN image and the actual PAN image. To estimate the PAN image of the up-sampled HSI, we also propose a learnable spectral response function (SRF). Moreover, we noticed that the residual image between the up-sampled HSI and the reference HSI mainly consists of edge information and very fine structures. In order to accurately estimate fine information, we propose a novel over-complete network, called HyperKite, which focuses on learning high-level features by constraining the receptive field from increasing in the deep layers. We perform experiments on three HSI datasets to demonstrate the superiority of our DIP-HyperKite over the state-of-the-art pansharpening methods. The deployment codes, pre-trained models, and final fusion outputs of our DIP-HyperKite and the methods used for the comparisons will be made publicly available at https://github.com/wgcban/DIP-HyperKite.git.

Submitted 6 July, 2021; originally announced July 2021.

arXiv:2106.08886 [pdf, other]

Over-and-Under Complete Convolutional RNN for MRI Reconstruction

Authors: Pengfei Guo, Jeya Maria Jose Valanarasu, Puyang Wang, **yuan Zhou, Shanshan Jiang, Vishal M. Patel

Abstract: Reconstructing magnetic resonance (MR) images from undersampled data is a challenging problem due to various artifacts introduced by the under-sampling operation. Recent deep learning-based methods for MR image reconstruction usually leverage a generic auto-encoder architecture which captures low-level features at the initial layers and high-level features at the deeper layers. Such networks focus heavily on global features, which may not be optimal for reconstructing the fully-sampled image. In this paper, we propose an Over-and-Under Complete Convolutional Recurrent Neural Network (OUCR), which consists of an overcomplete and an undercomplete Convolutional Recurrent Neural Network (CRNN). The overcomplete branch gives special attention to learning local structures by restraining the receptive field of the network. Combining it with the undercomplete branch leads to a network which focuses more on low-level features without losing out on the global structures. Extensive experiments on two datasets demonstrate that the proposed method achieves significant improvements over the compressed sensing and popular deep learning-based methods with fewer trainable parameters.

Submitted 24 June, 2021; v1 submitted 16 June, 2021; originally announced June 2021.

Comments: Accepted to MICCAI 2021

arXiv:2103.02148 [pdf, other]

Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning

Authors: Pengfei Guo, Puyang Wang, Jinyuan Zhou, Shanshan Jiang, Vishal M. Patel

Abstract: Fast and accurate reconstruction of magnetic resonance (MR) images from under-sampled data is important in many clinical applications. In recent years, deep learning-based methods have been shown to produce superior performance on MR image reconstruction. However, these methods require large amounts of data, which are difficult to collect and share due to the high cost of acquisition and medical data privacy regulations. To overcome this challenge, we propose a federated learning (FL) based solution in which we take advantage of the MR data available at different institutions while preserving patients' privacy. However, the generalizability of models trained in the FL setting can still be suboptimal due to domain shift, which results from data being collected at multiple institutions with different sensors, disease types, and acquisition protocols. To circumvent this challenge, we propose cross-site modeling for MR image reconstruction, in which the learned intermediate latent features among different source sites are aligned with the distribution of the latent features at the target site. Extensive experiments are conducted to provide various insights about FL for MR image reconstruction. Experimental results demonstrate that the proposed framework is a promising direction for utilizing multi-institutional data without compromising patients' privacy to achieve improved MR image reconstruction. Our code will be available at https://github.com/guopengf/FLMRCM.

Submitted 10 March, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

Comments: Accepted at CVPR 2021

arXiv:2102.11942 [pdf, other]

doi 10.1109/EMBC46164.2021.9631069

Multi-Feature Multi-Scale CNN-Derived COVID-19 Classification from Lung Ultrasound Data

Authors: Hui Che, Jared Radbel, Jag Sunderram, John L. Nosher, Vishal M. Patel, Ilker Hacihaliloglu

Abstract: The global pandemic of the novel coronavirus disease 2019 (COVID-19) has put tremendous pressure on the medical system. Imaging plays a complementary role in the management of patients with COVID-19. Computed tomography (CT) and chest X-ray (CXR) are the two dominant screening tools. However, the difficulty of eliminating the risk of disease transmission, radiation exposure, and a lack of cost-effectiveness are some of the challenges for CT and CXR imaging. This motivates the use of lung ultrasound (LUS) for evaluating COVID-19 due to its practical advantages of noninvasiveness, repeatability, and sensitive bedside property. In this paper, we utilize a deep learning model to perform the classification of COVID-19 from LUS data, which could produce objective diagnostic information for clinicians. Specifically, all LUS images are processed to obtain their corresponding local phase filtered images and radial symmetry transformed images before being fed into the multi-scale residual convolutional neural network (CNN). Secondly, image combination as the input of the network is used to explore rich and reliable features. A feature fusion strategy at different levels is adopted to investigate the relationship between the depth of feature aggregation and the classification accuracy. Our proposed method is evaluated on the point-of-care US (POCUS) dataset together with the Italian COVID-19 Lung US database (ICLUS-DB) and shows promising performance for COVID-19 prediction.

Submitted 23 February, 2021; originally announced February 2021.

arXiv:2101.09451 [pdf, other]

Error Diffusion Halftoning Against Adversarial Examples

Authors: Shao-Yuan Lo, Vishal M. Patel

Abstract: Adversarial examples contain carefully crafted perturbations that can fool deep neural networks (DNNs) into making wrong predictions. Enhancing the adversarial robustness of DNNs has gained considerable interest in recent years. Although image transformation-based defenses were widely considered at an earlier time, most of them have been defeated by adaptive attacks. In this paper, we propose a new image transformation defense based on error diffusion halftoning, and combine it with adversarial training to defend against adversarial examples. Error diffusion halftoning projects an image into a 1-bit space and diffuses quantization error to neighboring pixels. This process can remove adversarial perturbations from a given image while maintaining image quality that is acceptable for recognition. Experimental results demonstrate that the proposed method is able to improve adversarial robustness even under advanced adaptive attacks, while most of the other image transformation-based defenses do not. We show that a proper image transformation can still be an effective defense approach. Code: https://github.com/shaoyuanlo/Halftoning-Defense

Submitted 24 July, 2021; v1 submitted 23 January, 2021; originally announced January 2021.

Comments: Accepted at IEEE International Conference on Image Processing (ICIP) 2021
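
Illustrative note on the entry above (arXiv:2101.09451): the abstract describes error diffusion halftoning as projecting an image into a 1-bit space and diffusing the quantization error to neighboring pixels. A minimal sketch of that idea follows. It assumes the classic Floyd-Steinberg diffusion weights, a threshold of 0.5, and a grayscale image given as rows of floats in [0, 1]; the abstract does not say which kernel or threshold the authors use, and the function name `floyd_steinberg_halftone` is hypothetical.

```python
def floyd_steinberg_halftone(image):
    """Binarize a grayscale image with Floyd-Steinberg error diffusion.

    `image` is a list of rows of floats in [0, 1]. Returns rows of 0/1 ints.
    The kernel (7, 3, 5, 1)/16 and the 0.5 threshold are assumptions made
    for illustration, not details taken from the paper.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    work = [list(row) for row in image]          # working copy that absorbs diffused error
    out = [[0] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = 1 if old >= 0.5 else 0         # 1-bit quantization
            out[y][x] = new
            err = old - new                      # quantization error to diffuse

            # Push the error onto not-yet-visited neighbors (raster scan order).
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out
```

Because the error is carried forward rather than discarded, the local average of the binary output tracks the local average of the input, which is why a halftoned image can remain recognizable.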

arXiv:2012.04262 [pdf, other]

Overcomplete Representations Against Adversarial Videos

Authors: Shao-Yuan Lo, Jeya Maria Jose Valanarasu, Vishal M. Patel

Abstract: Adversarial robustness of deep neural networks is an extensively studied problem in the literature and various methods have been proposed to defend against adversarial images. However, only a handful of defense methods have been developed for defending against attacked videos. In this paper, we propose a novel Over-and-Under complete restoration network for Defending against adversarial videos (OUDefend). Most restoration networks adopt an encoder-decoder architecture that first shrinks the spatial dimension and then expands it back. This approach learns undercomplete representations, which have large receptive fields to collect global information but overlook local details. On the other hand, overcomplete representations have opposite properties. Hence, OUDefend is designed to balance local and global features by learning those two representations. We attach OUDefend to target video recognition models as a feature restoration block and train the entire network end-to-end. Experimental results show that defenses focusing on images may be ineffective for videos, while OUDefend enhances robustness against different types of adversarial videos, ranging from additive and multiplicative attacks to physically realizable attacks. Code: https://github.com/shaoyuanlo/OUDefend

Submitted 14 June, 2021; v1 submitted 8 December, 2020; originally announced December 2020.

Comments: Accepted at IEEE International Conference on Image Processing (ICIP) 2021

arXiv:2012.02978 [pdf, other]

Design and Implementation of Path Trackers for Ackermann Drive based Vehicles

Authors: Adarsh Patnaik, Manthan Patel, Vibhakar Mohta, Het Shah, Shubh Agrawal, Aditya Rathore, Ritwik Malik, Debashish Chakravarty, Ranjan Bhattacharya

Abstract: This article is an overview of the literature on path tracking methods and their implementation in simulation and realistic operating environments. The scope of this study includes analysis, implementation, tuning, and comparison of some selected path tracking methods commonly used in practice for trajectory tracking in autonomous vehicles. Many of these methods are applicable at low speed due to the linear assumption for the system model, and hence, some methods are also included that consider nonlinearities present in lateral vehicle dynamics during high-speed navigation. The performance evaluation and comparison of tracking methods are carried out on realistic simulations and a dedicated instrumented passenger car, Mahindra e2o, to get a performance idea of all the methods in realistic operating conditions and develop tuning methodologies for each of the methods. It has been observed that our model predictive control-based approach is able to perform better compared to the others in medium velocity ranges.

Submitted 5 December, 2020; originally announced December 2020.

Comments: 24 pages, 24 figures

arXiv:2010.10661 [pdf, other]

doi 10.1109/JSTSP.2020.3039393

Exploring Overcomplete Representations for Single Image Deraining using CNNs

Authors: Rajeev Yasarla, Jeya Maria Jose Valanarasu, Vishal M. Patel

Abstract: Removal of rain streaks from a single image is an extremely challenging problem since rainy images often contain rain streaks of different size, shape, direction and density. Most recent methods for deraining use a deep network following a generic "encoder-decoder" architecture which captures low-level features across the initial layers and high-level features in the deeper layers. For the task of deraining, the rain streaks which are to be removed are relatively small, and focusing heavily on global features is not an efficient way to solve the problem. To this end, we propose using an overcomplete convolutional network architecture which pays special attention to learning local structures by restraining the receptive field of filters. We combine it with U-Net so that it does not lose out on the global structures while focusing more on low-level features, to compute the derained image. The proposed network, called Over-and-Under Complete Deraining Network (OUCD), consists of two branches: an overcomplete branch which is confined to a small receptive field size in order to focus on the local structures, and an undercomplete branch that has larger receptive fields to primarily focus on global structures. Extensive experiments on synthetic and real datasets demonstrate that the proposed method achieves significant improvements over the recent state-of-the-art methods.

Submitted 20 October, 2020; originally announced October 2020.

Report number: J-STSP-DLIVRC-00060-2020

Journal ref: IEEE Journal of Selected Topics in Signal Processing, 2020

arXiv:2010.01663 [pdf, other]

KiU-Net: Overcomplete Convolutional Architectures for Biomedical Image and Volumetric Segmentation

Authors: Jeya Maria Jose Valanarasu, Vishwanath A. Sindagi, Ilker Hacihaliloglu, Vishal M. Patel

Abstract: Most methods for medical image segmentation use U-Net or its variants as they have been successful in most of the applications. After a detailed analysis of these "traditional" encoder-decoder based approaches, we observed that they perform poorly in detecting smaller structures and are unable to segment boundary regions precisely. This issue can be attributed to the increase in receptive field size as we go deeper into the encoder. The extra focus on learning high-level features causes the U-Net based approaches to learn less information about low-level features, which are crucial for detecting small structures. To overcome this issue, we propose using an overcomplete convolutional architecture where we project our input image into a higher dimension such that we constrain the receptive field from increasing in the deep layers of the network. We design a new architecture for image segmentation, KiU-Net, which has two branches: (1) an overcomplete convolutional network, Kite-Net, which learns to capture fine details and accurate edges of the input, and (2) U-Net, which learns high-level features. Furthermore, we also propose KiU-Net 3D, which is a 3D convolutional architecture for volumetric segmentation. We perform a detailed study of KiU-Net by performing experiments on five different datasets covering various image modalities like ultrasound (US), magnetic resonance imaging (MRI), computed tomography (CT), microscopic and fundus images. The proposed method achieves better performance than all the recent methods, with the additional benefit of fewer parameters and faster convergence. Additionally, we also demonstrate that the extensions of KiU-Net based on residual blocks and dense blocks result in further performance improvements. The implementation of KiU-Net can be found here: https://github.com/jeya-maria-jose/KiU-Net-pytorch

Submitted 14 October, 2021; v1 submitted 4 October, 2020; originally announced October 2020.

Comments: Journal Extension of KiU-Net (MICCAI-2020)

arXiv:2008.07788 [pdf, other]

CinC-GAN for Effective F0 prediction for Whisper-to-Normal Speech Conversion

Authors: Maitreya Patel, Mirali Purohit, Jui Shah, Hemant A. Patil

Abstract: Recently, Generative Adversarial Network (GAN)-based methods have shown remarkable performance for Voice Conversion and WHiSPer-to-normal SPeeCH (WHSP2SPCH) conversion. One of the key challenges in WHSP2SPCH conversion is the prediction of fundamental frequency (F0). Recently, authors have proposed the state-of-the-art Cycle-Consistent Generative Adversarial Network (CycleGAN) method for WHSP2SPCH conversion. The CycleGAN-based method uses two different models, one for Mel Cepstral Coefficients (MCC) mapping, and another for F0 prediction, where F0 is highly dependent on the pre-trained model of MCC mapping. This leads to additional non-linear noise in the predicted F0. To suppress this noise, we propose Cycle-in-Cycle GAN (i.e., CinC-GAN). It is specially designed to increase the effectiveness of F0 prediction without losing the accuracy of MCC mapping. We evaluated the proposed method in a non-parallel setting and analyzed it on speaker-specific and gender-specific tasks. The objective and subjective tests show that CinC-GAN significantly outperforms the CycleGAN. In addition, we analyze the CycleGAN and CinC-GAN for unseen speakers and the results show the clear superiority of CinC-GAN.

Submitted 18 August, 2020; originally announced August 2020.

Comments: Accepted in 28th European Signal Processing Conference (EUSIPCO), 2020

arXiv:2008.02859 [pdf, other]

Confidence-guided Lesion Mask-based Simultaneous Synthesis of Anatomic and Molecular MR Images in Patients with Post-treatment Malignant Gliomas

Authors: Pengfei Guo, Puyang Wang, Rajeev Yasarla, Jinyuan Zhou, Vishal M. Patel, Shanshan Jiang

Abstract: Data-driven automatic approaches have demonstrated their great potential in resolving various clinical diagnostic dilemmas in neuro-oncology, especially with the help of standard anatomic and advanced molecular MR images. However, data quantity and quality remain a key determinant of, and a significant limit on, the potential of such applications. In our previous work, we explored the synthesis of anatomic and molecular MR image network (SAMR) in patients with post-treatment malignant gliomas. Now, we extend it and propose Confidence Guided SAMR (CG-SAMR), which synthesizes data from lesion information to multi-modal anatomic sequences, including T1-weighted (T1w), gadolinium enhanced T1w (Gd-T1w), T2-weighted (T2w), and fluid-attenuated inversion recovery (FLAIR), and the molecular amide proton transfer-weighted (APTw) sequence. We introduce a module which guides the synthesis based on a confidence measure about the intermediate results. Furthermore, we extend the proposed architecture for unsupervised synthesis so that unpaired data can be used for training the network. Extensive experiments on real clinical data demonstrate that the proposed model can perform better than the state-of-the-art synthesis methods.

Submitted 6 August, 2020; originally announced August 2020.

Comments: Submitted to IEEE TMI. arXiv admin note: text overlap with arXiv:2006.14761

arXiv:2006.14761 [pdf, other]

Lesion Mask-based Simultaneous Synthesis of Anatomic and Molecular MR Images using a GAN

Authors: Pengfei Guo, Puyang Wang, Jinyuan Zhou, Vishal M. Patel, Shanshan Jiang

Abstract: Data-driven automatic approaches have demonstrated their great potential in resolving various clinical diagnostic dilemmas for patients with malignant gliomas in neuro-oncology with the help of conventional and advanced molecular MR images. However, the lack of sufficient annotated MRI data has vastly impeded the development of such automatic methods. Conventional data augmentation approaches, including flipping, scaling, rotation, and distortion, are not capable of generating data with diverse image content. In this paper, we propose a method, called synthesis of anatomic and molecular MR images network (SAMR), which can simultaneously synthesize data from arbitrary manipulated lesion information on multiple anatomic and molecular MRI sequences, including T1-weighted (T1w), gadolinium enhanced T1w (Gd-T1w), T2-weighted (T2w), fluid-attenuated inversion recovery (FLAIR), and amide proton transfer-weighted (APTw). The proposed framework consists of a stretch-out up-sampling module, a brain atlas encoder, a segmentation consistency module, and multi-scale label-wise discriminators. Extensive experiments on real clinical data demonstrate that the proposed model can perform significantly better than the state-of-the-art synthesis methods.

Submitted 26 August, 2020; v1 submitted 25 June, 2020; originally announced June 2020.

Comments: MICCAI 2020

arXiv:2006.13469 [pdf, other]

Face-to-Music Translation Using a Distance-Preserving Generative Adversarial Network with an Auxiliary Discriminator

Authors: Chelhwon Kim, Andrew Port, Mitesh Patel

Abstract: Learning a mapping between two unrelated domains, such as image and audio, without any supervision is a challenging task. In this work, we propose a distance-preserving generative adversarial model to translate images of human faces into an audio domain. The audio domain is defined by a collection of musical note sounds recorded by 10 different instrument families (NSynth) and a distance metric where the instrument family class information is incorporated together with a mel-frequency cepstral coefficients (MFCCs) feature. To enforce distance preservation, a loss term that penalizes the difference between pairwise distances of the faces and the translated audio samples is used. Further, we discover that the distance preservation constraint in the generative adversarial model leads to reduced diversity in the translated audio samples, and propose the use of an auxiliary discriminator to enhance the diversity of the translations while using the distance preservation constraint. We also provide a visual demonstration of the results and numerical analysis of the fidelity of the translations. A video demo of our proposed model's learned translation is available at https://www.dropbox.com/s/the176w9obq8465/face_to_musical_note.mov?dl=0.

Submitted 24 June, 2020; originally announced June 2020.

Comments: 15 pages, 3 figures
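
Illustrative note on the entry above (arXiv:2006.13469): the abstract mentions a loss term that penalizes the difference between pairwise distances of the inputs and of their translations. One plausible form of such a term is sketched below. The squared-difference penalty, the averaging over pairs, and the use of plain Euclidean distance in both domains are assumptions; the abstract states that the audio-side metric combines instrument family information with MFCC features, which is not modeled here. The function names are hypothetical.

```python
import math


def pairwise_distances(vectors):
    """Return the full matrix of Euclidean distances between feature vectors."""
    n = len(vectors)
    return [[math.dist(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]


def distance_preservation_loss(source, translated):
    """Mean squared gap between source-domain and target-domain pairwise distances.

    `source[i]` and `translated[i]` are the feature vectors of sample i before
    and after translation. The loss is zero when the translation preserves
    every pairwise distance (an assumed, simplified formulation).
    """
    if len(source) != len(translated):
        raise ValueError("source and translated batches must have the same size")
    n = len(source)
    d_src = pairwise_distances(source)
    d_tgt = pairwise_distances(translated)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 0.0                                # fewer than two samples: nothing to compare
    return sum((d_src[i][j] - d_tgt[i][j]) ** 2 for i, j in pairs) / len(pairs)
```

A term of this kind only constrains relative geometry, so a translation that merely shifts every sample by the same offset incurs no penalty, while one that stretches or collapses distances does.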

arXiv:2006.04878 [pdf, other]

KiU-Net: Towards Accurate Segmentation of Biomedical Images using Over-complete Representations

Authors: Jeya Maria Jose, Vishwanath Sindagi, Ilker Hacihaliloglu, Vishal M. Patel

Abstract: Due to its excellent performance, U-Net has been the most widely used backbone architecture for biomedical image segmentation in recent years. However, in our studies, we observe that there is a considerable performance drop in the case of detecting smaller anatomical landmarks with blurred noisy boundaries. We analyze this issue in detail, and address it by proposing an over-complete architecture (Ki-Net) which involves projecting the data onto higher dimensions (in the spatial sense). This network, when augmented with U-Net, results in significant improvements in the case of segmenting small anatomical landmarks and blurred noisy boundaries while obtaining better overall performance. Furthermore, the proposed network has additional benefits like faster convergence and fewer parameters. We evaluate the proposed method on the task of brain anatomy segmentation from 2D Ultrasound (US) of preterm neonates, and achieve an improvement of around 4% in terms of the DICE accuracy and Jaccard index as compared to the standard U-Net, while outperforming the recent best methods by 2%. Code: https://github.com/jeya-maria-jose/KiU-Net-pytorch

Submitted 8 July, 2020; v1 submitted 8 June, 2020; originally announced June 2020.

Comments: Accepted at MICCAI 2020
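
Illustrative note on the entry above (arXiv:2006.04878): the abstract reports results in terms of the Dice score and Jaccard index. For reference, the standard definitions for binary masks A and B are Dice = 2|A ∩ B| / (|A| + |B|) and Jaccard = |A ∩ B| / |A ∪ B|. The sketch below computes both on flattened 0/1 masks; returning 1.0 when both masks are empty is a convention assumed here, and the helper name is hypothetical rather than taken from the paper's code.

```python
def dice_and_jaccard(pred, truth):
    """Compute (Dice, Jaccard) for two flattened binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_size = sum(1 for p in pred if p)
    truth_size = sum(1 for t in truth if t)
    union = pred_size + truth_size - intersection
    if union == 0:
        return 1.0, 1.0                           # both masks empty: assumed convention
    dice = 2 * intersection / (pred_size + truth_size)
    jaccard = intersection / union
    return dice, jaccard
```

The two scores are monotonically related (Dice = 2J / (1 + J)), so they always rank methods the same way on a single image even though their numerical values differ.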

arXiv:2005.13291 [pdf, other]

Deep Sensory Substitution: Noninvasively Enabling Biological Neural Networks to Receive Input from Artificial Neural Networks

Authors: Andrew Port, Chelhwon Kim, Mitesh Patel

Abstract: As is expressed in the adage "a picture is worth a thousand words", when using spoken language to communicate visual information, brevity can be a challenge. This work describes a novel technique for leveraging machine-learned feature embeddings to sonify visual (and other types of) information into a perceptual audio domain, allowing users to perceive this information using only their aural faculty. The system uses a pretrained image embedding network to extract visual features and embed them in a compact subset of Euclidean space, which converts the images into feature vectors whose $L^2$ distances can be used as a meaningful measure of similarity. A generative adversarial network (GAN) is then used to find a distance-preserving map from this metric space of feature vectors into the metric space defined by a target audio dataset equipped with either the Euclidean metric or a mel-frequency cepstrum-based psychoacoustic distance metric. We demonstrate this technique by sonifying images of faces into human speech-like audio. For both target audio metrics, the GAN successfully found a metric-preserving mapping, and in human subject tests, users were able to accurately classify audio sonifications of faces.

Submitted 25 August, 2021; v1 submitted 27 May, 2020; originally announced May 2020.

Comments: 9 pages, 3 figures

arXiv:2004.10959 [pdf, other]

Uncertainty Quantification for Hyperspectral Image Denoising Frameworks based on Low-rank Matrix Approximation

Authors: Jingwei Song, Shaobo Xia, Jun Wang, Mitesh Patel, Dong Chen

Abstract: Sliding-window based low-rank matrix approximation (LRMA) is a technique widely used in hyperspectral image (HSI) denoising or completion. However, the uncertainty quantification of the restored HSI has not been addressed to date. Accurate uncertainty quantification of the denoised HSI facilitates applications such as multi-source or multi-scale data fusion, data assimilation, and product uncertainty quantification, since these applications require an accurate approach to describe the statistical distributions of the input data. Therefore, we propose a prior-free closed-form element-wise uncertainty quantification method for LRMA-based HSI restoration. Our closed-form algorithm overcomes the difficulty of the HSI patch mixing problem caused by the sliding-window strategy used in the conventional LRMA process. The proposed approach only requires the uncertainty of the observed HSI and provides the uncertainty result relatively rapidly and with computational complexity similar to that of the LRMA technique. We conduct extensive experiments to validate the estimation accuracy of the proposed closed-form uncertainty approach. The method is robust to at least 10% random impulse noise at the cost of 10-20% of additional processing time compared to the LRMA. The experiments indicate that the proposed closed-form uncertainty quantification method is more applicable to real-world applications than the baseline Monte Carlo test, which is computationally expensive. The code is available in the attachment and will be released after the acceptance of this paper.

Submitted 6 May, 2022; v1 submitted 22 April, 2020; originally announced April 2020.

Comments: Accepted for publication by IEEE Transactions on Geoscience and Remote Sensing (TGRS)

arXiv:1912.08364 [pdf, other]

Learning to Segment Brain Anatomy from 2D Ultrasound with Less Data

Authors: Jeya Maria Jose V., Rajeev Yasarla, Puyang Wang, Ilker Hacihaliloglu, Vishal M. Patel

Abstract: Automatic segmentation of anatomical landmarks from ultrasound (US) plays an important role in the management of preterm neonates with a very low birth weight due to the increased risk of developing intraventricular hemorrhage (IVH) or other complications. One major problem in developing an automatic segmentation method for this task is the limited availability of annotated data. To tackle this issue, we propose a novel image synthesis method using a multi-scale self-attention generator to synthesize US images from various segmentation masks. We show that our method can synthesize high-quality US images for every manipulated segmentation label with qualitative and quantitative improvements over the recent state-of-the-art synthesis methods. Furthermore, for the segmentation task, we propose a novel method, called Confidence-guided Brain Anatomy Segmentation (CBAS) network, where segmentation and corresponding confidence maps are estimated at different scales. In addition, we introduce a technique which guides CBAS to learn the weights based on the confidence measure about the estimate. Extensive experiments demonstrate that the proposed method achieves significant improvements over the recent state-of-the-art methods for both synthesis and segmentation tasks. In particular, we show that the new synthesis framework can be used to generate realistic US images which can be used to improve the performance of a segmentation algorithm.

Submitted 17 December, 2019; originally announced December 2019.

arXiv:1909.04207 [pdf, other]

doi 10.1109/TIP.2020.2973802

Confidence Measure Guided Single Image De-raining

Authors: Rajeev Yasarla, Vishal M. Patel

Abstract: Single image de-raining is an extremely challenging problem since rainy images contain rain streaks which often vary in size, direction and density. This varying characteristic of rain streaks affects different parts of the image differently. Previous approaches have attempted to address this problem by leveraging some prior information to remove rain streaks from a single image. One of the major limitations of these approaches is that they do not consider the location information of rain drops in the image. The proposed Image Quality-based single image Deraining using Confidence measure (QuDeC) network addresses this issue by learning the quality or distortion level of each patch in the rainy image, and further processes this information to learn the rain content at different scales. In addition, we introduce a technique which guides the network to learn the network weights based on the confidence measure about the estimate of both quality at each location and residual rain streak information (residual map). Extensive experiments on synthetic and real datasets demonstrate that the proposed method achieves significant improvements over the recent state-of-the-art methods.

Submitted 9 September, 2019; originally announced September 2019.

Comments: TIP2019 submission. arXiv admin note: substantial text overlap with arXiv:1906.11129

arXiv:1906.11129 [pdf, other]

Uncertainty Guided Multi-Scale Residual Learning-using a Cycle Spinning CNN for Single Image De-Raining

Authors: Rajeev Yasarla, Vishal M. Patel

Abstract: Single image de-raining is an extremely challenging problem since the rainy image may contain rain streaks which may vary in size, direction and density. Previous approaches have attempted to address this problem by leveraging some prior information to remove rain streaks from a single image. One of the major limitations of these approaches is that they do not consider the location information of rain drops in the image. The proposed Uncertainty guided Multi-scale Residual Learning (UMRL) network attempts to address this issue by learning the rain content at different scales and using them to estimate the final de-rained output. In addition, we introduce a technique which guides the network to learn the network weights based on the confidence measure about the estimate. Furthermore, we introduce a new training and testing procedure based on the notion of cycle spinning to improve the final de-raining performance. Extensive experiments on synthetic and real datasets demonstrate that the proposed method achieves significant improvements over the recent state-of-the-art methods. Code is available at: https://github.com/rajeevyasarla/UMRL--using-Cycle-Spinning

Submitted 12 June, 2019; originally announced June 2019.

Comments: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019

arXiv:1803.08396 [pdf, other]

Densely Connected Pyramid Dehazing Network

Authors: He Zhang, Vishal M. Patel

Abstract: We propose a new end-to-end single image dehazing method, called Densely Connected Pyramid Dehazing Network (DCPDN), which can jointly learn the transmission map, atmospheric light and dehazing all together. The end-to-end learning is achieved by directly embedding the atmospheric scattering model into the network, thereby ensuring that the proposed method strictly follows the physics-driven scattering model for dehazing. Inspired by the dense network that can maximize the information flow along features from different levels, we propose a new edge-preserving densely connected encoder-decoder structure with multi-level pyramid pooling module for estimating the transmission map. This network is optimized using a newly introduced edge-preserving loss function. To further incorporate the mutual structural information between the estimated transmission map and the dehazed result, we propose a joint-discriminator based on generative adversarial network framework to decide whether the corresponding dehazed image and the estimated transmission map are real or fake. An ablation study is conducted to demonstrate the effectiveness of each module evaluated at both estimated transmission map and dehazed result. Extensive experiments demonstrate that the proposed method achieves significant improvements over the state-of-the-art methods. Code will be made available at: https://github.com/hezhangsprinter

Submitted 22 March, 2018; originally announced March 2018.

Showing 1–50 of 53 results for author: Patel, M