
doi 10.1148/ryai.230601

Deep Learning Segmentation of Ascites on Abdominal CT Scans for Automatic Volume Quantification

Authors: Benjamin Hou, Sung-Won Lee, Jung-Min Lee, Christopher Koh, Jing Xiao, Perry J. Pickhardt, Ronald M. Summers

Abstract: Purpose: To evaluate the performance of an automated deep learning method in detecting ascites and subsequently quantifying its volume in patients with liver cirrhosis and ovarian cancer. Materials and Methods: This retrospective study included contrast-enhanced and non-contrast abdominal-pelvic CT scans of patients with cirrhotic ascites and patients with ovarian cancer from two institutions, National Institutes of Health (NIH) and University of Wisconsin (UofW). The model, trained on The Cancer Genome Atlas Ovarian Cancer dataset (mean age, 60 years +/- 11 [s.d.]; 143 female), was tested on two internal (NIH-LC and NIH-OV) and one external dataset (UofW-LC). Its performance was measured by the Dice coefficient, standard deviations, and 95% confidence intervals, focusing on ascites volume in the peritoneal cavity. Results: On NIH-LC (25 patients; mean age, 59 years +/- 14 [s.d.]; 14 male) and NIH-OV (166 patients; mean age, 65 years +/- 9 [s.d.]; all female), the model achieved Dice scores of 0.855 +/- 0.061 (CI: 0.831-0.878) and 0.826 +/- 0.153 (CI: 0.764-0.887), with median volume estimation errors of 19.6% (IQR: 13.2-29.0) and 5.3% (IQR: 2.4-9.7) respectively. On UofW-LC (124 patients; mean age, 46 years +/- 12 [s.d.]; 73 female), the model had a Dice score of 0.830 +/- 0.107 (CI: 0.798-0.863) and median volume estimation error of 9.7% (IQR: 4.5-15.1). The model showed strong agreement with expert assessments, with r^2 values of 0.79, 0.98, and 0.97 across the test sets. Conclusion: The proposed deep learning method performed well in segmenting and quantifying the volume of ascites in concordance with expert radiologist assessments.
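Illustrative note: the abstract above reports Dice scores and volume estimation errors without stating the formulas. The following is a minimal sketch of how such metrics are commonly computed, not the authors' code. It assumes binary masks flattened to sequences of 0/1, a known per-voxel volume, and that "volume estimation error" means absolute relative error in percent (an assumption; the abstract does not define it). The function names `dice_coefficient`, `volume_ml`, and `volume_error_percent` are hypothetical.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


def volume_ml(mask: Sequence[int], voxel_volume_mm3: float) -> float:
    """Segmented volume in millilitres (1 mL = 1000 mm^3)."""
    return sum(1 for v in mask if v) * voxel_volume_mm3 / 1000.0


def volume_error_percent(pred_ml: float, truth_ml: float) -> float:
    """Absolute relative volume error in percent (assumed definition)."""
    if truth_ml <= 0:
        raise ValueError("reference volume must be positive")
    return abs(pred_ml - truth_ml) / truth_ml * 100.0


if __name__ == "__main__":
    # Toy 6-voxel masks, purely illustrative.
    pred = [1, 1, 1, 0, 0, 0]
    truth = [0, 1, 1, 1, 1, 0]
    print(dice_coefficient(pred, truth))
    print(volume_error_percent(volume_ml(pred, 2.0), volume_ml(truth, 2.0)))
```

In practice the masks would be 3D arrays and the voxel volume would come from the CT header spacing; flat Python sequences are used here only to keep the sketch dependency-free.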

Submitted 22 June, 2024; originally announced June 2024.

arXiv:2406.03688 [pdf, other]

Shadow and Light: Digitally Reconstructed Radiographs for Disease Classification

Authors: Benjamin Hou, Qingqing Zhu, Tejas Sudarshan Mathai, Qiao Jin, Zhiyong Lu, Ronald M. Summers

Abstract: In this paper, we introduce DRR-RATE, a large-scale synthetic chest X-ray dataset derived from the recently released CT-RATE dataset. DRR-RATE comprises 50,188 frontal Digitally Reconstructed Radiographs (DRRs) from 21,304 unique patients. Each image is paired with a corresponding radiology text report and binary labels for 18 pathology classes. Given the controllable nature of DRR generation, it facilitates the inclusion of lateral view images and images from any desired viewing position. This opens up avenues for research into novel multimodal applications involving paired CT, X-ray images from various views, text, and binary labels. We demonstrate the applicability of DRR-RATE alongside existing large-scale chest X-ray resources, notably the CheXpert dataset and CheXnet model. Experiments demonstrate that CheXnet, when trained and tested on the DRR-RATE dataset, achieves sufficient to high AUC scores for the six common pathologies cited in common literature: Atelectasis, Cardiomegaly, Consolidation, Lung Lesion, Lung Opacity, and Pleural Effusion. Additionally, CheXnet trained on the CheXpert dataset can accurately identify several pathologies, even when operating out of distribution. This confirms that the generated DRR images effectively capture the essential pathology features from CT images. The dataset and labels are publicly accessible at https://huggingface.co/datasets/farrell236/DRR-RATE.

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2405.07432 [pdf, other]

Compressed Online Learning of Conditional Mean Embedding

Authors: Boya Hou, Sina Sanjari, Alec Koppel, Subhonmesh Bose

Abstract: The conditional mean embedding (CME) encodes Markovian stochastic kernels through their actions on probability distributions embedded within the reproducing kernel Hilbert spaces (RKHS). The CME plays a key role in several well-known machine learning tasks such as reinforcement learning, analysis of dynamical systems, etc. We present an algorithm to learn the CME incrementally from data via an operator-valued stochastic gradient descent. As is well-known, function learning in RKHS suffers from scalability challenges from large data. We utilize a compression mechanism to counter the scalability challenge. The core contribution of this paper is a finite-sample performance guarantee on the last iterate of the online compressed operator learning algorithm with fast-mixing Markovian samples, when the target CME may not be contained in the hypothesis space. We illustrate the efficacy of our algorithm by applying it to the analysis of an example dynamical system.

Submitted 12 May, 2024; originally announced May 2024.

Comments: 39 pages

arXiv:2405.05944 [pdf, other]

MRISegmentator-Abdomen: A Fully Automated Multi-Organ and Structure Segmentation Tool for T1-weighted Abdominal MRI

Authors: Yan Zhuang, Tejas Sudharshan Mathai, Pritam Mukherjee, Brandon Khoury, Boah Kim, Benjamin Hou, Nusrat Rabbee, Abhinav Suri, Ronald M. Summers

Abstract: Background: Segmentation of organs and structures in abdominal MRI is useful for many clinical applications, such as disease diagnosis and radiotherapy. Current approaches have focused on delineating a limited set of abdominal structures (13 types). To date, there is no publicly available abdominal MRI dataset with voxel-level annotations of multiple organs and structures. Consequently, a segmentation tool for multi-structure segmentation is also unavailable. Methods: We curated a T1-weighted abdominal MRI dataset consisting of 195 patients who underwent imaging at National Institutes of Health (NIH) Clinical Center. The dataset comprises axial pre-contrast T1, arterial, venous, and delayed phases for each patient, thereby amounting to a total of 780 series (69,248 2D slices). Each series contains voxel-level annotations of 62 abdominal organs and structures. A 3D nnUNet model, dubbed MRISegmentator-Abdomen (MRISegmentator in short), was trained on this dataset, and evaluation was conducted on an internal test set and two large external datasets: AMOS22 and Duke Liver. The predicted segmentations were compared against the ground-truth using the Dice Similarity Coefficient (DSC) and Normalized Surface Distance (NSD). Findings: MRISegmentator achieved an average DSC of 0.861$\pm$0.170 and an NSD of 0.924$\pm$0.163 in the internal test set. On the AMOS22 dataset, MRISegmentator attained an average DSC of 0.829$\pm$0.133 and an NSD of 0.908$\pm$0.067. For the Duke Liver dataset, an average DSC of 0.933$\pm$0.015 and an NSD of 0.929$\pm$0.021 were obtained. Interpretation: The proposed MRISegmentator provides automatic, accurate, and robust segmentations of 62 organs and structures in T1-weighted abdominal MRI sequences. The tool has the potential to accelerate research on various clinical topics, such as abnormality detection, radiotherapy, and disease classification, among others.

Submitted 24 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

Comments: We made the segmentation model publicly available

arXiv:2312.06453 [pdf, other]

Semantic Image Synthesis for Abdominal CT

Authors: Yan Zhuang, Benjamin Hou, Tejas Sudharshan Mathai, Pritam Mukherjee, Boah Kim, Ronald M. Summers

Abstract: As an emerging and promising type of generative model, diffusion models have proven to outperform Generative Adversarial Networks (GANs) in multiple tasks, including image synthesis. In this work, we explore semantic image synthesis for abdominal CT using conditional diffusion models, which can be used for downstream applications such as data augmentation. We systematically evaluated the performance of three diffusion models, compared them with other state-of-the-art GAN-based approaches, and studied the different conditioning scenarios for the semantic mask. Experimental results demonstrated that diffusion models were able to synthesize abdominal CT images with better quality. Additionally, encoding the mask and the input separately is more effective than naive concatenation.

Submitted 11 December, 2023; originally announced December 2023.

Comments: This paper has been accepted at Deep Generative Models workshop at MICCAI 2023

arXiv:2305.01138 [pdf, other]

High-Fidelity Image Synthesis from Pulmonary Nodule Lesion Maps using Semantic Diffusion Model

Authors: Xuan Zhao, Benjamin Hou

Abstract: Lung cancer has been one of the leading causes of cancer-related deaths worldwide for years. With the emergence of deep learning, computer-assisted diagnosis (CAD) models based on learning algorithms can accelerate the nodule screening process, providing valuable assistance to radiologists in their daily clinical workflows. However, developing such robust and accurate models often requires large-scale and diverse medical datasets with high-quality annotations. Generating synthetic data provides a pathway for augmenting datasets at a larger scale. Therefore, in this paper, we explore the use of Semantic Diffusion Models (SDM) to generate high-fidelity pulmonary CT images from segmentation maps. We utilize annotation information from the LUNA16 dataset to create paired CT images and masks, and assess the quality of the generated images using the Fréchet Inception Distance (FID), as well as on two common clinical downstream tasks: nodule detection and nodule localization. Achieving improvements of 3.96% for detection accuracy and 8.50% for AP50 in the nodule localization task, respectively, demonstrates the feasibility of the approach.

Submitted 1 May, 2023; originally announced May 2023.

Comments: 4 pages, 1 figure, submitted to MIDL 2023

ACM Class: I.2.1; J.3

arXiv:2302.09215 [pdf, other]

Domain Agnostic Pipeline for Retina Vessel Segmentation

Authors: Benjamin Hou

Abstract: Automatic segmentation of retina vessels plays a pivotal role in clinical diagnosis of prevalent eye diseases, such as diabetic retinopathy or age-related macular degeneration. Due to the complex construction of blood vessels, with drastically varying thicknesses, accurate vessel segmentation can be quite a challenging task. In this work we show that it is possible to achieve near state-of-the-art performance by crafting a carefully thought-out pre-processing pipeline, without having to resort to complex networks and/or training routines. We also show that our model is able to maintain the same high segmentation performance across different datasets, very poor quality fundus images, as well as images of severe pathological cases. Code and models featured in this paper can be downloaded from http://github.com/farrell236/retina_segmentation. We also demonstrate the potential of our model at http://lazarus.ddns.net:8502.

Submitted 17 February, 2023; originally announced February 2023.

arXiv:2210.03460 [pdf, other]

Flexible Alignment Super-Resolution Network for Multi-Contrast MRI

Authors: Yiming Liu, Mengxi Zhang, Weiqin Zhang, Bo Jiang, Bo Hou, Dan Liu, Jie Chen, Heqing Lian

Abstract: Magnetic resonance imaging plays an essential role in clinical diagnosis by acquiring the structural information of biological tissue. Recently, many multi-contrast MRI super-resolution networks have achieved good results. However, most studies ignore the impact of the inappropriate foreground scale and patch size of multi-contrast MRI, which probably leads to inappropriate feature alignment. To tackle this problem, we propose the Flexible Alignment Super-Resolution Network (FASR-Net) for multi-contrast MRI Super-Resolution. The Flexible Alignment module of FASR-Net consists of two modules for feature alignment. (1) The Single-Multi Pyramid Alignment (S-A) module solves the situation where low-resolution (LR) images and reference (Ref) images have different scales. (2) The Multi-Multi Pyramid Alignment (M-A) module solves the situation where LR and Ref images have the same scale. Besides, we propose the Cross-Hierarchical Progressive Fusion (CHPF) module aiming at fusing the features effectively, further improving the image quality. Compared with other state-of-the-art methods, FASR-Net achieves the most competitive results on FastMRI and IXI datasets. Our code will be available at https://github.com/yimingliu123/FASR-Net.

Submitted 8 January, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

arXiv:2108.08963 [pdf, other]

Impact of Aviation Electrification on Airports: Flight Scheduling and Charging

Authors: Boya Hou, Subhonmesh Bose, Lavanya Marla, Kiruba Haran

Abstract: Electrification can help to reduce the carbon footprint of aviation. The transition away from conventional jet fuel-powered airplanes towards battery-powered electrified aircraft will impose extra charging requirements on airports. In this paper, we first quantify the increase in energy demands at several airports across the United States (US), when commercial airline carriers partially deploy hybrid electric aircraft (HEA). We then illustrate that smart charging and minor modifications to flight schedules can substantially reduce peak power demands, and in turn the need for grid infrastructure upgrades. Motivated by our data analysis, we then formulate an optimization problem for slot allocation that incorporates HEA charging considerations. This problem jointly decides flight schedules and charging profiles to manage airport congestion and peak power demands. We illustrate the efficacy of our formulation through a case study on the John F. Kennedy International Airport.

Submitted 31 May, 2022; v1 submitted 19 August, 2021; originally announced August 2021.

Comments: 19 pages, 8 figures

arXiv:2011.04994 [pdf, other]

AIM 2020 Challenge on Learned Image Signal Processing Pipeline

Authors: Andrey Ignatov, Radu Timofte, Zhilu Zhang, Ming Liu, Haolin Wang, Wangmeng Zuo, Jiawei Zhang, Ruimao Zhang, Zhanglin Peng, Sijie Ren, Linhui Dai, Xiaohong Liu, Chengqi Li, Jun Chen, Yuichi Ito, Bhavya Vasudeva, Puneesh Deora, Umapada Pal, Zhenyu Guo, Yu Zhu, Tian Liang, Chenghua Li, Cong Leng, Zhihong Pan, Baopu Li, et al. (14 additional authors not shown)

Abstract: This paper reviews the second AIM learned ISP challenge and provides the description of the proposed solutions and results. The participating teams were solving a real-world RAW-to-RGB mapping problem, where the goal was to map the original low-quality RAW images captured by the Huawei P20 device to the same photos obtained with the Canon 5D DSLR camera. The considered task embraced a number of complex computer vision subtasks, such as image demosaicing, denoising, white balancing, color and contrast correction, demoireing, etc. The target metric used in this challenge combined fidelity scores (PSNR and SSIM) with solutions' perceptual results measured in a user study. The proposed solutions significantly improved the baseline results, defining the state-of-the-art for practical image signal processing pipeline modeling.

Submitted 10 November, 2020; originally announced November 2020.

Comments: Published in ECCV 2020 Workshops (Advances in Image Manipulation), https://data.vision.ee.ethz.ch/cvl/aim20/

arXiv:2009.06236 [pdf, other]

Consensus of Multi-agent System via Constrained Invariant Set of a class of Unstable System

Authors: Chong Jin Ong, Bonan Hou

Abstract: This work shows an approach to achieve output consensus among heterogeneous agents in a multi-agent environment where each agent is subject to input constraints. The communication among agents is described by a time-varying directed/undirected graph. The approach is based on the well-known Internal Model Principle which uses an unstable reference system. One main contribution of this work is the characterization of the maximal constraint admissible invariant set (MCAI) for the combined agent-reference system. Typically, MCAI sets do not exist for unstable systems. This work shows that for an important class of unstable agent-reference systems, the MCAI set exists and can be computed. This MCAI set is used in a Reference Governor approach, combined with a projected consensus algorithm, to achieve output consensus of all agents while satisfying the constraints of each. Examples are provided to illustrate the approach.

Submitted 2 June, 2021; v1 submitted 14 September, 2020; originally announced September 2020.

Comments: 13 pages, 8 figures

arXiv:2006.12809 [pdf, other]

3D Probabilistic Segmentation and Volumetry from 2D projection images

Authors: Athanasios Vlontzos, Samuel Budd, Benjamin Hou, Daniel Rueckert, Bernhard Kainz

Abstract: X-Ray imaging is quick, cheap and useful for front-line care assessment and intra-operative real-time imaging (e.g., C-Arm Fluoroscopy). However, it suffers from projective information loss and lacks vital volumetric information on which many essential diagnostic biomarkers are based. In this paper we explore probabilistic methods to reconstruct 3D volumetric images from 2D imaging modalities and measure the models' performance and confidence. We show our models' performance on large connected structures and we test for limitations regarding fine structures and image domain sensitivity. We utilize fast end-to-end training of 2D-3D convolutional networks and evaluate our method on 117 CT scans, segmenting 3D structures from digitally reconstructed radiographs (DRRs) with a Dice score of $0.91 \pm 0.0013$. Source code will be made available by the time of the conference.

Submitted 23 June, 2020; originally announced June 2020.

arXiv:2003.06583 [pdf, other]

doi 10.1109/TGRS.2019.2948659

From W-Net to CDGAN: Bi-temporal Change Detection via Deep Learning Techniques

Authors: Bin Hou, Qingjie Liu, Heng Wang, Yunhong Wang

Abstract: Traditional change detection methods usually follow the image differencing, change feature extraction and classification framework, and their performance is limited by such simple image domain differencing and also the hand-crafted features. Recently, the success of deep convolutional neural networks (CNNs) has widely spread across the whole field of computer vision for their powerful representation abilities. In this paper, we therefore address the remote sensing image change detection problem with deep learning techniques. We first propose an end-to-end dual-branch architecture, termed W-Net, with each branch taking as input one of the two bi-temporal images as in the traditional change detection models. In this way, CNN features with more powerful representative abilities can be obtained to boost the final detection performance. Also, W-Net performs differencing in the feature domain rather than in the traditional image domain, which greatly alleviates loss of useful information for determining the changes. Furthermore, by reformulating change detection as an image translation problem, we apply the recently popular Generative Adversarial Network (GAN) in which our W-Net serves as the Generator, leading to a new GAN architecture for change detection which we call CDGAN. To train our networks and also facilitate future research, we construct a large scale dataset by collecting images from Google Earth and provide carefully manually annotated ground truths. Experiments show that our proposed methods can provide fine-grained change detection results superior to the existing state-of-the-art baselines.

Submitted 14 March, 2020; originally announced March 2020.

Comments: Accepted to TGRS

arXiv:1908.11312 [pdf, other]

Flexible Conditional Image Generation of Missing Data with Learned Mental Maps

Authors: Benjamin Hou, Athanasios Vlontzos, Amir Alansary, Daniel Rueckert, Bernhard Kainz

Abstract: Real-world settings often do not allow acquisition of high-resolution volumetric images for accurate morphological assessment and diagnosis. In clinical practice it is common to acquire only sparse data (e.g. individual slices) for initial diagnostic decision making. Thereby, physicians rely on their prior knowledge (or mental maps) of the human anatomy to extrapolate the underlying 3D information. Accurate mental maps require years of anatomy training, which in the first instance relies on normative learning, i.e. excluding pathology. In this paper, we leverage Bayesian Deep Learning and environment mapping to generate full volumetric anatomy representations from none to a small, sparse set of slices. We evaluate proof of concept implementations based on Generative Query Networks (GQN) and Conditional BRUNO using abdominal CT and brain MRI as well as in a clinical application involving sparse, motion-corrupted MR acquisition for fetal imaging. Our approach allows reconstruction of 3D volumes from 1 to 4 tomographic slices, with an SSIM of 0.7+ and cross-correlation of 0.8+ compared to the 3D ground truth.

Submitted 29 August, 2019; originally announced August 2019.

Showing 1–14 of 14 results for author: Hou, B