Search | arXiv e-print repository

Unsupervised Iterative U-Net with an Internal Guidance Layer for Vertebrae Contrast Enhancement in Chest X-Ray Images

Authors: Ella Eidlin, Assaf Hoogi, Nathan S. Netanyahu

Abstract: X-ray imaging is a fundamental clinical tool for screening and diagnosing various diseases. However, the spatial resolution of radiographs is often limited, making it challenging to diagnose small image details and leading to difficulties in identifying vertebrae anomalies at an early stage in chest radiographs. To address this limitation, we propose a novel and robust approach to significantly improve the quality of X-ray images by iteratively training a deep neural network. Our framework includes an embedded internal guidance layer that enhances the fine structures of spinal vertebrae in chest X-ray images through fully unsupervised training, utilizing an iterative procedure that employs the same network architecture in each enhancement phase. Additionally, we have designed an optimized loss function that accurately identifies object boundaries and enhances spinal features, thereby further enhancing the quality of the images. Experimental results demonstrate that our proposed method surpasses existing detail enhancement methods in terms of BRISQUE scores, and is comparable in terms of LPC-SI. Furthermore, our approach exhibits superior performance in restoring hidden fine structures, as evidenced by our qualitative results. This innovative approach has the potential to significantly enhance the diagnostic accuracy and early detection of diseases, making it a promising advancement in X-ray imaging technology.

Submitted 6 June, 2023; originally announced June 2023.

Comments: 10 pages, 11 figures, submitted to Transactions on Medical Imaging

ACM Class: I.4; I.4.3

arXiv:2306.01423 [pdf, other]

Improving Gradient-Trend Identification: Fast-Adaptive Moment Estimation with Finance-Inspired Triple Exponential Moving Average

Authors: Roi Peleg, Teddy Lazebnik, Assaf Hoogi

Abstract: The performance improvement of deep networks significantly depends on their optimizers. With existing optimizers, precise and efficient recognition of gradient trends remains a challenge. Existing optimizers predominantly adopt techniques based on the first-order exponential moving average (EMA), which results in noticeable delays that impede the real-time tracking of gradient trends and consequently yield sub-optimal performance. To overcome this limitation, we introduce a novel optimizer called fast-adaptive moment estimation (FAME). Inspired by the triple exponential moving average (TEMA) used in the financial domain, FAME leverages the potency of higher-order TEMA to improve the precision of identifying gradient trends. TEMA plays a central role in the learning process as it actively influences optimization dynamics; this role differs from its conventional passive role as a technical indicator in financial contexts. Because of the introduction of TEMA into the optimization process, FAME can identify gradient trends with higher accuracy and fewer lag issues, thereby offering smoother and more consistent responses to gradient fluctuations compared to conventional first-order EMA. To study the effectiveness of our novel FAME optimizer, we conducted comprehensive experiments encompassing six diverse computer-vision benchmarks and tasks, spanning detection, classification, and semantic comprehension. We integrated FAME into 15 learning architectures and compared its performance with those of six popular optimizers. Results clearly showed that FAME is more robust and accurate and provides superior performance stability by minimizing noise (i.e., trend fluctuations). Notably, FAME achieves higher accuracy levels in remarkably fewer training epochs than its counterparts, clearly indicating its significance for optimizing deep networks in computer-vision tasks.

Submitted 21 December, 2023; v1 submitted 2 June, 2023; originally announced June 2023.
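
Note: the lag argument in the abstract above can be illustrated with the textbook finance definition of TEMA, TEMA = 3*EMA1 - 3*EMA2 + EMA3, where EMA2 is the EMA of EMA1 and EMA3 is the EMA of EMA2. The sketch below is only that indicator applied to a plain sequence of numbers; it is not the FAME update rule, which the abstract does not spell out. The function names, the smoothing factor `beta`, and the seeding with the first value are assumptions made for illustration.

```python
def ema(values, beta):
    """First-order exponential moving average, seeded with the first value."""
    out = []
    avg = values[0]  # assumption: seed the average with the first sample
    for v in values:
        avg = beta * avg + (1.0 - beta) * v
        out.append(avg)
    return out


def tema(values, beta):
    """Triple exponential moving average: 3*EMA1 - 3*EMA2 + EMA3."""
    e1 = ema(values, beta)  # EMA of the raw series
    e2 = ema(e1, beta)      # EMA of EMA1
    e3 = ema(e2, beta)      # EMA of EMA2
    return [3.0 * a - 3.0 * b + c for a, b, c in zip(e1, e2, e3)]


# On a linear ramp, a first-order EMA settles beta / (1 - beta) steps behind
# the signal, while the TEMA combination cancels that steady-state lag.
ramp = [float(t) for t in range(100)]
ema_last = ema(ramp, beta=0.8)[-1]
tema_last = tema(ramp, beta=0.8)[-1]
```

With `beta = 0.8` the first-order EMA ends roughly four steps behind the ramp, whereas the TEMA value ends essentially on it, which is the kind of delay reduction the abstract refers to.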

arXiv:1908.06933 [pdf, other]

Deep Active Lesion Segmentation

Authors: Ali Hatamizadeh, Assaf Hoogi, Debleena Sengupta, Wuyue Lu, Brian Wilcox, Daniel Rubin, Demetri Terzopoulos

Abstract: Lesion segmentation is an important problem in computer-assisted diagnosis that remains challenging due to the prevalence of low contrast, irregular boundaries that are unamenable to shape priors. We introduce Deep Active Lesion Segmentation (DALS), a fully automated segmentation framework that leverages the powerful nonlinear feature extraction abilities of fully Convolutional Neural Networks (CNNs) and the precise boundary delineation abilities of Active Contour Models (ACMs). Our DALS framework benefits from an improved level-set ACM formulation with a per-pixel-parameterized energy functional and a novel multiscale encoder-decoder CNN that learns an initialization probability map along with parameter maps for the ACM. We evaluate our lesion segmentation model on a new Multiorgan Lesion Segmentation (MLS) dataset that contains images of various organs, including brain, liver, and lung, across different imaging modalities (MR and CT). Our results demonstrate favorable performance compared to competing methods, especially for small training datasets. Source code: https://github.com/ahatamiz/dals

Submitted 30 August, 2020; v1 submitted 19 August, 2019; originally announced August 2019.

Comments: Accepted to Machine Learning in Medical Imaging (MLMI 2019). Link to source code added

Journal ref: MLMI 2019

arXiv:1907.02431 [pdf, other]

From voxels to pixels and back: Self-supervision in natural-image reconstruction from fMRI

Authors: Roman Beliy, Guy Gaziv, Assaf Hoogi, Francesca Strappini, Tal Golan, Michal Irani

Abstract: Reconstructing observed images from fMRI brain recordings is challenging. Unfortunately, acquiring sufficient "labeled" pairs of {Image, fMRI} (i.e., images with their corresponding fMRI responses) to span the huge space of natural images is prohibitive for many reasons. We present a novel approach which, in addition to the scarce labeled data (training pairs), allows fMRI-to-image reconstruction networks to be trained on "unlabeled" data as well (i.e., images without fMRI recordings, and fMRI recordings without images). The proposed model utilizes both an Encoder network (image-to-fMRI) and a Decoder network (fMRI-to-image). Concatenating these two networks back-to-back (Encoder-Decoder & Decoder-Encoder) allows augmenting the training with both types of unlabeled data. Importantly, it allows training on the unlabeled test-fMRI data. This self-supervision adapts the reconstruction network to the new input test-data, despite its deviation from the statistics of the scarce training data.

Submitted 3 July, 2019; originally announced July 2019.

Comments: *First two authors contributed equally. NeurIPS 2019

Journal ref: https://proceedings.neurips.cc/paper/2019/file/7d2be41b1bde6ff8fe45150c37488ebb-Paper.pdf

arXiv:1904.12483 [pdf, other]

Self-Attention Capsule Networks for Object Classification

Authors: Assaf Hoogi, Brian Wilcox, Yachee Gupta, Daniel L. Rubin

Abstract: We propose a novel architecture for object classification, called Self-Attention Capsule Networks (SACN). SACN is the first model that incorporates the Self-Attention mechanism as an integral layer within the Capsule Network (CapsNet). While the Self-Attention mechanism supplies long-range dependencies, which results in selecting the more dominant image regions to focus on, the CapsNet analyzes the relevant features and their spatial correlations inside these regions only. The features are extracted in the convolutional layer. Then, the Self-Attention layer learns to suppress irrelevant regions based on feature analysis and highlights salient features useful for a specific task. The attention map is then fed into the CapsNet primary layer that is followed by a classification layer. The proposed SACN model was designed to solve two main limitations of the baseline CapsNet: analysis of complex data and significant computational load. In this work, we use a shallow CapsNet architecture and compensate for the absence of a deeper network by using the Self-Attention module to significantly improve the results. The proposed Self-Attention CapsNet architecture was extensively evaluated on six different datasets, mainly on three different medical sets, in addition to the natural MNIST, SVHN and CIFAR10. The model was able to classify images and their patches with diverse and complex backgrounds better than the baseline CapsNet. As a result, the proposed Self-Attention CapsNet significantly improved classification performance within and across different datasets and outperformed the baseline CapsNet, ResNet-18 and DenseNet-40 not only in classification accuracy but also in robustness.

Submitted 19 November, 2019; v1 submitted 29 April, 2019; originally announced April 2019.

arXiv:1802.04403 [pdf, other]

TVAE: Triplet-Based Variational Autoencoder using Metric Learning

Authors: Haque Ishfaq, Assaf Hoogi, Daniel Rubin

Abstract: Deep metric learning has been demonstrated to be highly effective in learning semantic representation and encoding information that can be used to measure data similarity, by relying on the embedding learned from metric learning. At the same time, the variational autoencoder (VAE) has been widely used to approximate inference and has proved to perform well for directed probabilistic models. However, for traditional VAE, the data label or feature information are intractable. Similarly, traditional representation learning approaches fail to represent many salient aspects of the data. In this project, we propose a novel integrated framework to learn latent embedding in VAE by incorporating deep metric learning. The features are learned by optimizing a triplet loss on the mean vectors of VAE in conjunction with the standard evidence lower bound (ELBO) of VAE. This approach, which we call Triplet based Variational Autoencoder (TVAE), allows us to capture more fine-grained information in the latent embedding. Our model is tested on the MNIST data set and achieves a high triplet accuracy of 95.60%, while the traditional VAE (Kingma & Welling, 2013) achieves a triplet accuracy of 75.08%.

Submitted 8 February, 2023; v1 submitted 12 February, 2018; originally announced February 2018.

Comments: Old technical note

MSC Class: 68T30 (Primary); 68T01 (Secondary)
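
Note: the abstract above describes optimizing a triplet loss on the VAE mean vectors together with the standard ELBO. A minimal sketch of how such a combined objective could be written follows. Several details are assumptions not stated in the abstract: the squared Euclidean distance, the hinge form with a `margin`, the `triplet_weight` factor, and all function names. The reconstruction error is passed in as a plain number so that the example stays independent of any network code.

```python
import math


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(mu_anchor, mu_positive, mu_negative, margin=1.0):
    """Hinge-style triplet loss computed on latent mean vectors."""
    d_pos = squared_distance(mu_anchor, mu_positive)
    d_neg = squared_distance(mu_anchor, mu_negative)
    return max(0.0, d_pos - d_neg + margin)


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I)."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var)
    )


def tvae_style_loss(recon_error, mu, log_var, mu_pos, mu_neg,
                    margin=1.0, triplet_weight=1.0):
    """Negative ELBO plus a weighted triplet term (the weighting is assumed)."""
    negative_elbo = recon_error + kl_to_standard_normal(mu, log_var)
    return negative_elbo + triplet_weight * triplet_loss(
        mu, mu_pos, mu_neg, margin
    )
```

The triplet term is zero whenever the anchor mean is closer to the positive mean than to the negative mean by at least the margin, so in that case only the ELBO part contributes to the loss.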

arXiv:1703.06418 [pdf, other]

A Fully-Automated Pipeline for Detection and Segmentation of Liver Lesions and Pathological Lymph Nodes

Authors: Assaf Hoogi, John W. Lambert, Yefeng Zheng, Dorin Comaniciu, Daniel L. Rubin

Abstract: We propose a fully-automated method for accurate and robust detection and segmentation of potentially cancerous lesions found in the liver and in lymph nodes. The process is performed in three steps, including organ detection, lesion detection and lesion segmentation. Our method applies machine learning techniques such as marginal space learning and convolutional neural networks, as well as active contour models. The method proves to be robust in its handling of extremely high lesion diversity. We tested our method on volumetric computed tomography (CT) images, including 42 volumes containing liver lesions and 86 volumes containing 595 pathological lymph nodes. Preliminary results under 10-fold cross validation show that for both the liver lesions and the lymph nodes, a total detection sensitivity of 0.53 and average Dice score of $0.71 \pm 0.15$ for segmentation were obtained.

Submitted 19 March, 2017; originally announced March 2017.

Comments: Workshop on Machine Learning in Healthcare, Neural Information Processing Systems (NIPS). Barcelona, Spain, 2016
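
Note: the Dice score reported in the abstract above is the standard overlap measure 2|A ∩ B| / (|A| + |B|) between a predicted mask A and a reference mask B. The sketch below computes it for flattened binary masks. The function name and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details taken from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two flattened binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")
    # Count foreground pixels shared by both masks, and in each mask alone.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * intersection / total


# Two foreground pixels predicted, one of which matches the single true pixel.
score = dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])
```

Here the intersection is 1 and the mask sizes are 2 and 1, so the score is 2/3.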

arXiv:1606.03765 [pdf]

Adaptive Local Window for Level Set Segmentation of CT and MRI Liver Lesions

Authors: Assaf Hoogi, Christopher F. Beaulieu, Guilherme M. Cunha, Elhamy Heba, Claude B. Sirlin, Sandy Napel, Daniel L. Rubin

Abstract: We propose a novel method, the adaptive local window, for improving the level set segmentation technique. The window is estimated separately for each contour point, over iterations of the segmentation process, and for each individual object. Our method considers the object scale, the spatial texture, and changes of the energy functional over iterations. Global and local statistics are considered by calculating several gray level co-occurrence matrices. We demonstrate the capabilities of the method in the domain of medical imaging for segmenting 233 images with liver lesions. To illustrate the strength of our method, those images were obtained by either Computed Tomography or Magnetic Resonance Imaging. Moreover, we analyzed images using three different energy models. We compare our method to a global level set segmentation and to a local framework that uses predefined fixed-size square windows. The results indicate that our proposed method outperforms the other methods in terms of agreement with the manual marking and dependence on contour initialization or the energy model used. In case of complex lesions, such as low contrast lesions, heterogeneous lesions, or lesions with a noisy background, our method shows significantly better segmentation with an improvement of $0.25 \pm 0.13$ in Dice similarity coefficient, compared with state-of-the-art fixed-size local windows (Wilcoxon, p < 0.001).

Submitted 12 June, 2016; originally announced June 2016.

Comments: 24 pages, 11 figures, 3 tables

Showing 1–8 of 8 results for author: Hoogi, A