arXiv:2407.00967 [pdf, other]

Deep learning for automated detection of breast cancer in deep ultraviolet fluorescence images with diffusion probabilistic model

Authors: Sepehr Salem Ghahfarokhi, Tyrell To, Julie Jorns, Tina Yen, Bing Yu, Dong Hye Ye

Abstract: Data limitation is a significant challenge in applying deep learning to medical images. Recently, the diffusion probabilistic model (DPM) has shown the potential to generate high-quality images by converting Gaussian random noise into realistic images. In this paper, we apply the DPM to augment the deep ultraviolet fluorescence (DUV) image dataset with an aim to improve breast cancer classification for intraoperative margin assessment. For classification, we divide the whole surface DUV image into small patches and extract convolutional features for each patch by utilizing the pre-trained ResNet. Then, we feed them into an XGBoost classifier for patch-level decisions and then fuse them with a regional importance map computed by Grad-CAM++ for whole surface-level prediction. Our experimental results show that augmenting the training dataset with the DPM significantly improves breast cancer detection performance in DUV images, increasing accuracy from 93% to 97%, compared to using Affine transformations and ProGAN.

Submitted 1 July, 2024; originally announced July 2024.

Comments: IEEE International Symposium on Biomedical Imaging 2024

arXiv:2406.13129 [pdf, other]

M3T: Multi-Modal Medical Transformer to bridge Clinical Context with Visual Insights for Retinal Image Medical Description Generation

Authors: Nagur Shareef Shaik, Teja Krishna Cherukuri, Dong Hye Ye

Abstract: Automated retinal image medical description generation is crucial for streamlining medical diagnosis and treatment planning. Existing challenges include the reliance on learned retinal image representations, difficulties in handling multiple imaging modalities, and the lack of clinical context in visual representations. Addressing these issues, we propose the Multi-Modal Medical Transformer (M3T), a novel deep learning architecture that integrates visual representations with diagnostic keywords. Unlike previous studies focusing on specific aspects, our approach efficiently learns contextual information and semantics from both modalities, enabling the generation of precise and coherent medical descriptions for retinal images. Experimental studies on the DeepEyeNet dataset validate the success of M3T in meeting ophthalmologists' standards, demonstrating a substantial 13.5% improvement in BLEU@4 over the best-performing baseline model.

Submitted 18 June, 2024; originally announced June 2024.

Comments: This paper has been accepted for presentation at the IEEE International Conference on Image Processing (ICIP 2024)

arXiv:2406.13126 [pdf, other]

Guided Context Gating: Learning to leverage salient lesions in retinal fundus images

Authors: Teja Krishna Cherukuri, Nagur Shareef Shaik, Dong Hye Ye

Abstract: Effectively representing medical images, especially retinal images, presents a considerable challenge due to variations in appearance, size, and contextual information of pathological signs called lesions. Precise discrimination of these lesions is crucial for diagnosing vision-threatening issues such as diabetic retinopathy. While visual attention-based neural networks have been introduced to learn spatial context and channel correlations from retinal images, they often fall short in capturing localized lesion context. Addressing this limitation, we propose a novel attention mechanism called Guided Context Gating, a unique approach that integrates Context Formulation, Channel Correlation, and Guided Gating to learn global context, spatial correlations, and localized lesion context. Our qualitative evaluation against existing attention mechanisms emphasizes the superiority of Guided Context Gating in terms of explainability. Notably, experiments on the Zenodo-DR-7 dataset reveal a substantial 2.63% accuracy boost over advanced attention mechanisms and an impressive 6.53% improvement over the state-of-the-art Vision Transformer for assessing the severity grade of retinopathy, even with imbalanced and limited training samples for each class.

Submitted 18 June, 2024; originally announced June 2024.

Comments: This paper has been accepted for presentation at the IEEE International Conference on Image Processing (ICIP 2024)

arXiv:2406.12683 [pdf, other]

Spatial Sequence Attention Network for Schizophrenia Classification from Structural Brain MR Images

Authors: Nagur Shareef Shaik, Teja Krishna Cherukuri, Vince Calhoun, Dong Hye Ye

Abstract: Schizophrenia is a debilitating, chronic mental disorder that significantly impacts an individual's cognitive abilities, behavior, and social interactions. It is characterized by subtle morphological changes in the brain, particularly in the gray matter. These changes are often imperceptible through manual observation, demanding an automated approach to diagnosis. This study introduces a deep learning methodology for the classification of individuals with Schizophrenia. We achieve this by implementing a diversified attention mechanism known as Spatial Sequence Attention (SSA) which is designed to extract and emphasize significant feature representations from structural MRI (sMRI). Initially, we employ the transfer learning paradigm by leveraging pre-trained DenseNet to extract initial feature maps from the final convolutional block which contains morphological alterations associated with Schizophrenia. These features are further processed by the proposed SSA to capture and emphasize intricate spatial interactions and relationships across volumes within the brain. Our experimental studies conducted on a clinical dataset have revealed that the proposed attention mechanism outperforms the existing Squeeze & Excitation Network for Schizophrenia classification.

Submitted 18 June, 2024; originally announced June 2024.

Comments: This paper has been accepted for the 21st IEEE International Symposium on Biomedical Imaging (ISBI 2024)

arXiv:2210.13415 [pdf]

Deep Learning Approach for Dynamic Sampling for Multichannel Mass Spectrometry Imaging

Authors: David Helminiak, Hang Hu, Julia Laskin, Dong Hye Ye

Abstract: Mass Spectrometry Imaging (MSI), using traditional rectilinear scanning, takes hours to days for high spatial resolution acquisitions. Given that most pixels within a sample's field of view are often neither relevant to underlying biological structures nor chemically informative, MSI presents as a prime candidate for integration with sparse and dynamic sampling algorithms. During a scan, stochastic models determine which locations probabilistically contain information critical to the generation of low-error reconstructions. Decreasing the number of required physical measurements thereby minimizes overall acquisition times. A Deep Learning Approach for Dynamic Sampling (DLADS), utilizing a Convolutional Neural Network (CNN) and encapsulating molecular mass intensity distributions within a third dimension, demonstrates a simulated 70% throughput improvement for Nanospray Desorption Electrospray Ionization (nano-DESI) MSI tissues. Evaluations are conducted between DLADS and a Supervised Learning Approach for Dynamic Sampling, with Least-Squares regression (SLADS-LS) and a Multi-Layer Perceptron (MLP) network (SLADS-Net). When compared with SLADS-LS, limited to a single m/z channel, as well as multichannel SLADS-LS and SLADS-Net, DLADS respectively improves regression performance by 36.7%, 7.0%, and 6.2%, resulting in gains to reconstruction quality of 6.0%, 2.1%, and 3.4% for acquisition of targeted m/z.

Submitted 24 October, 2022; originally announced October 2022.

arXiv:2002.00011 [pdf, other]

Age-Conditioned Synthesis of Pediatric Computed Tomography with Auxiliary Classifier Generative Adversarial Networks

Authors: Chi Nok Enoch Kan, Najibakram Maheenaboobacker, Dong Hye Ye

Abstract: Deep learning is a popular and powerful tool in computed tomography (CT) image processing such as organ segmentation, but its requirement of large training datasets remains a challenge. Even though there is a large anatomical variability for children during their growth, the training datasets for pediatric CT scans are especially hard to obtain due to risks of radiation to children. In this paper, we propose a method to conditionally synthesize realistic pediatric CT images using a new auxiliary classifier generative adversarial network (ACGAN) architecture by taking age information into account. The proposed network generated age-conditioned high-resolution CT images to enrich pediatric training datasets.

Submitted 31 January, 2020; originally announced February 2020.

Comments: Accepted for publication at IEEE International Symposium on Biomedical Imaging (ISBI) 2020

Journal ref: 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI 2020)

arXiv:1807.02370 [pdf, other]

Deep Back Projection for Sparse-View CT Reconstruction

Authors: Dong Hye Ye, Gregery T. Buzzard, Max Ruby, Charles A. Bouman

Abstract: Filtered back projection (FBP) is a classical method for image reconstruction from sinogram CT data. FBP is computationally efficient but produces lower quality reconstructions than more sophisticated iterative methods, particularly when the number of views is lower than the number required by the Nyquist rate. In this paper, we use a deep convolutional neural network (CNN) to produce high-quality reconstructions directly from sinogram data. A primary novelty of our approach is that we first back project each view separately to form a stack of back projections and then feed this stack as input into the convolutional neural network. These single-view back projections convert the encoding of sinogram data into the appropriate spatial location, which can then be leveraged by the spatial invariance of the CNN to learn the reconstruction effectively. We demonstrate the benefit of our CNN based back projection on simulated sparse-view CT data over classical FBP.

Submitted 6 July, 2018; originally announced July 2018.

Comments: GlobalSIP 2018

arXiv:1703.04653 [pdf, other]

A Framework for Dynamic Image Sampling Based on Supervised Learning (SLADS)

Authors: G. M. Dilshan P. Godaliyadda, Dong Hye Ye, Michael D. Uchic, Michael A. Groeber, Gregery T. Buzzard, Charles A. Bouman

Abstract: Sparse sampling schemes have the potential to dramatically reduce image acquisition time while simultaneously reducing radiation damage to samples. However, for a sparse sampling scheme to be useful it is important that we are able to reconstruct the underlying object with sufficient clarity using the sparse measurements. In dynamic sampling, each new measurement location is selected based on information obtained from previous measurements. Therefore, dynamic sampling schemes have the potential to dramatically reduce the number of measurements needed for high fidelity reconstructions. However, most existing dynamic sampling methods for point-wise measurement acquisition tend to be computationally expensive and are therefore too slow for practical applications. In this paper, we present a framework for dynamic sampling based on machine learning techniques, which we call a supervised learning approach for dynamic sampling (SLADS). In each step of SLADS, the objective is to find the pixel that maximizes the expected reduction in distortion (ERD) given previous measurements. SLADS is fast because we use a simple regression function to compute the ERD, and it is accurate because the regression function is trained using data sets that are representative of the specific application. In addition, we introduce a method to terminate dynamic sampling at a desired level of distortion, and we extend the SLADS methodology to sample groups of pixels at each step. Finally, we present results on computationally-generated synthetic data and experimentally-collected data to demonstrate a dramatic improvement over state-of-the-art static sampling methods.

Submitted 14 March, 2017; originally announced March 2017.

arXiv:1605.04006 [pdf, other]

doi 10.1109/TCI.2016.2582042

A Gaussian Mixture MRF for Model-Based Iterative Reconstruction with Applications to Low-Dose X-ray CT

Authors: Ruoqiao Zhang, Dong Hye Ye, Debashish Pal, Jean-Baptiste Thibault, Ken D. Sauer, Charles A. Bouman

Abstract: Markov random fields (MRFs) have been widely used as prior models in various inverse problems such as tomographic reconstruction. While MRFs provide a simple and often effective way to model the spatial dependencies in images, they suffer from the fact that parameter estimation is difficult. In practice, this means that MRFs typically have very simple structure that cannot completely capture the subtle characteristics of complex images. In this paper, we present a novel Gaussian mixture Markov random field model (GM-MRF) that can be used as a very expressive prior model for inverse problems such as denoising and reconstruction. The GM-MRF forms a global image model by merging together individual Gaussian-mixture models (GMMs) for image patches. In addition, we present a novel analytical framework for computing MAP estimates using the GM-MRF prior model through the construction of surrogate functions that result in a sequence of quadratic optimizations. We also introduce a simple but effective method to adjust the GM-MRF so as to control the sharpness in low- and high-contrast regions of the reconstruction separately. We demonstrate the value of the model with experiments including image denoising and low-dose CT reconstruction.

Submitted 13 June, 2016; v1 submitted 12 May, 2016; originally announced May 2016.

Comments: accepted by IEEE Transactions on Computational Imaging

Showing 1–9 of 9 results for author: Ye, D H