Search | arXiv e-print repository

RCAgent: Cloud Root Cause Analysis by Autonomous Agents with Tool-Augmented Large Language Models

Authors: Zefan Wang, Zichuan Liu, Yingying Zhang, Aoxiao Zhong, Lunting Fan, Lingfei Wu, Qingsong Wen

Abstract: Large language model (LLM) applications in cloud root cause analysis (RCA) have been actively explored recently. However, current methods are still reliant on manual workflow settings and do not unleash LLMs' decision-making and environment interaction capabilities. We present RCAgent, a tool-augmented LLM autonomous agent framework for practical and privacy-aware industrial RCA usage. Running on… ▽ More Large language model (LLM) applications in cloud root cause analysis (RCA) have been actively explored recently. However, current methods are still reliant on manual workflow settings and do not unleash LLMs' decision-making and environment interaction capabilities. We present RCAgent, a tool-augmented LLM autonomous agent framework for practical and privacy-aware industrial RCA usage. Running on an internally deployed model rather than GPT families, RCAgent is capable of free-form data collection and comprehensive analysis with tools. Our framework combines a variety of enhancements, including a unique Self-Consistency for action trajectories, and a suite of methods for context management, stabilization, and importing domain knowledge. Our experiments show RCAgent's evident and consistent superiority over ReAct across all aspects of RCA -- predicting root causes, solutions, evidence, and responsibilities -- and tasks covered or uncovered by current rules, as validated by both automated metrics and human evaluations. Furthermore, RCAgent has already been integrated into the diagnosis and issue discovery workflow of the Real-time Compute Platform for Apache Flink of Alibaba Cloud. △ Less

Submitted 24 October, 2023; originally announced October 2023.

arXiv:2309.08842 [pdf, other]

MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image Segmentation

Authors: Cheng Chen, Juzheng Miao, Dufan Wu, Zhiling Yan, Sekeun Kim, Jiang Hu, Aoxiao Zhong, Zhengliang Liu, Lichao Sun, Xiang Li, Tianming Liu, Pheng-Ann Heng, Quanzheng Li

Abstract: The Segment Anything Model (SAM), a foundation model for general image segmentation, has demonstrated impressive zero-shot performance across numerous natural image segmentation tasks. However, SAM's performance significantly declines when applied to medical images, primarily due to the substantial disparity between natural and medical image domains. To effectively adapt SAM to medical images, it… ▽ More The Segment Anything Model (SAM), a foundation model for general image segmentation, has demonstrated impressive zero-shot performance across numerous natural image segmentation tasks. However, SAM's performance significantly declines when applied to medical images, primarily due to the substantial disparity between natural and medical image domains. To effectively adapt SAM to medical images, it is important to incorporate critical third-dimensional information, i.e., volumetric or temporal knowledge, during fine-tuning. Simultaneously, we aim to harness SAM's pre-trained weights within its original 2D backbone to the fullest extent. In this paper, we introduce a modality-agnostic SAM adaptation framework, named as MA-SAM, that is applicable to various volumetric and video medical data. Our method roots in the parameter-efficient fine-tuning strategy to update only a small portion of weight increments while preserving the majority of SAM's pre-trained weights. By injecting a series of 3D adapters into the transformer blocks of the image encoder, our method enables the pre-trained 2D backbone to extract third-dimensional information from input data. The effectiveness of our method has been comprehensively evaluated on four medical image segmentation tasks, by using 10 public datasets across CT, MRI, and surgical video data. Remarkably, without using any prompt, our method consistently outperforms various state-of-the-art 3D approaches, surpassing nnU-Net by 0.9%, 2.6%, and 9.9% in Dice for CT multi-organ segmentation, MRI prostate segmentation, and surgical scene segmentation respectively. Our model also demonstrates strong generalization, and excels in challenging tumor segmentation when prompts are used. Our code is available at: https://github.com/cchen-cc/MA-SAM. △ Less

Submitted 15 September, 2023; originally announced September 2023.

arXiv:2309.06419 [pdf, other]

Radiology-Llama2: Best-in-Class Large Language Model for Radiology

Authors: Zhengliang Liu, Yiwei Li, Peng Shu, Aoxiao Zhong, Longtao Yang, Chao Ju, Zihao Wu, Chong Ma, Jie Luo, Cheng Chen, Sekeun Kim, Jiang Hu, Haixing Dai, Lin Zhao, Dajiang Zhu, Jun Liu, Wei Liu, Dinggang Shen, Tianming Liu, Quanzheng Li, Xiang Li

Abstract: This paper introduces Radiology-Llama2, a large language model specialized for radiology through a process known as instruction tuning. Radiology-Llama2 is based on the Llama2 architecture and further trained on a large dataset of radiology reports to generate coherent and clinically useful impressions from radiological findings. Quantitative evaluations using ROUGE metrics on the MIMIC-CXR and Op… ▽ More This paper introduces Radiology-Llama2, a large language model specialized for radiology through a process known as instruction tuning. Radiology-Llama2 is based on the Llama2 architecture and further trained on a large dataset of radiology reports to generate coherent and clinically useful impressions from radiological findings. Quantitative evaluations using ROUGE metrics on the MIMIC-CXR and OpenI datasets demonstrate that Radiology-Llama2 achieves state-of-the-art performance compared to other generative language models, with a Rouge-1 score of 0.4834 on MIMIC-CXR and 0.4185 on OpenI. Additional assessments by radiology experts highlight the model's strengths in understandability, coherence, relevance, conciseness, and clinical utility. The work illustrates the potential of localized language models designed and tuned for specialized domains like radiology. When properly evaluated and deployed, such models can transform fields like radiology by automating rote tasks and enhancing human expertise. △ Less

Submitted 29 August, 2023; originally announced September 2023.

arXiv:2306.08666 [pdf, other]

Radiology-GPT: A Large Language Model for Radiology

Authors: Zhengliang Liu, Aoxiao Zhong, Yiwei Li, Longtao Yang, Chao Ju, Zihao Wu, Chong Ma, Peng Shu, Cheng Chen, Sekeun Kim, Haixing Dai, Lin Zhao, Lichao Sun, Dajiang Zhu, Jun Liu, Wei Liu, Dinggang Shen, Xiang Li, Quanzheng Li, Tianming Liu

Abstract: We introduce Radiology-GPT, a large language model for radiology. Using an instruction tuning approach on an extensive dataset of radiology domain knowledge, Radiology-GPT demonstrates superior performance compared to general language models such as StableLM, Dolly and LLaMA. It exhibits significant versatility in radiological diagnosis, research, and communication. This work serves as a catalyst… ▽ More We introduce Radiology-GPT, a large language model for radiology. Using an instruction tuning approach on an extensive dataset of radiology domain knowledge, Radiology-GPT demonstrates superior performance compared to general language models such as StableLM, Dolly and LLaMA. It exhibits significant versatility in radiological diagnosis, research, and communication. This work serves as a catalyst for future developments in clinical NLP. The successful implementation of Radiology-GPT is indicative of the potential of localizing generative large language models, specifically tailored for distinctive medical specialties, while ensuring adherence to privacy standards such as HIPAA. The prospect of develo** individualized, large-scale language models that cater to specific needs of various hospitals presents a promising direction. The fusion of conversational competence and domain-specific knowledge in these models is set to foster future development in healthcare AI. A demo of Radiology-GPT is available at https://huggingface.co/spaces/allen-eric/radiology-gpt. △ Less

Submitted 19 March, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

arXiv:2209.04007 [pdf, other]

FedDAR: Federated Domain-Aware Representation Learning

Authors: Aoxiao Zhong, Hao He, Zhaolin Ren, Na Li, Quanzheng Li

Abstract: Cross-silo Federated learning (FL) has become a promising tool in machine learning applications for healthcare. It allows hospitals/institutions to train models with sufficient data while the data is kept private. To make sure the FL model is robust when facing heterogeneous data among FL clients, most efforts focus on personalizing models for clients. However, the latent relationships between cli… ▽ More Cross-silo Federated learning (FL) has become a promising tool in machine learning applications for healthcare. It allows hospitals/institutions to train models with sufficient data while the data is kept private. To make sure the FL model is robust when facing heterogeneous data among FL clients, most efforts focus on personalizing models for clients. However, the latent relationships between clients' data are ignored. In this work, we focus on a special non-iid FL problem, called Domain-mixed FL, where each client's data distribution is assumed to be a mixture of several predefined domains. Recognizing the diversity of domains and the similarity within domains, we propose a novel method, FedDAR, which learns a domain shared representation and domain-wise personalized prediction heads in a decoupled manner. For simplified linear regression settings, we have theoretically proved that FedDAR enjoys a linear convergence rate. For general settings, we have performed intensive empirical studies on both synthetic and real-world medical datasets which demonstrate its superiority over prior FL methods. △ Less

Submitted 8 September, 2022; originally announced September 2022.

arXiv:2208.03022 [pdf, other]

A Probabilistic Bound for Peak Age of Information Guarantee

Authors: Ailing Zhong, Zhidu Li, Tong Tang, Dapeng Wu, Ruyan Wang, Yuming Jiang

Abstract: This paper considers the distribution of a general peak age of information (AoI) model and develops a general analysis approach for probabilistic performance guarantee from the time-domain perspective. Firstly, a general relationship between the peak AoI and the inter-arrival and service times of packets is revealed. With the help of martingale theory, a probabilistic bound on the peak AoI is then… ▽ More This paper considers the distribution of a general peak age of information (AoI) model and develops a general analysis approach for probabilistic performance guarantee from the time-domain perspective. Firstly, a general relationship between the peak AoI and the inter-arrival and service times of packets is revealed. With the help of martingale theory, a probabilistic bound on the peak AoI is then derived for the general case of endogenous independently and identically distributed increments in information generation and transmission processes. Thereafter, the application of the obtained bound is illustrated with the M/M/1 and D/M/1 queuing models. The validity of the proposed bound is finally examined with numerical results. △ Less

Submitted 5 August, 2022; originally announced August 2022.

arXiv:2205.01672 [pdf, other]

Branch & Learn for Recursively and Iteratively Solvable Problems in Predict+Optimize

Authors: Xinyi Hu, Jasper C. H. Lee, Jimmy H. M. Lee, Allen Z. Zhong

Abstract: This paper proposes Branch & Learn, a framework for Predict+Optimize to tackle optimization problems containing parameters that are unknown at the time of solving. Given an optimization problem solvable by a recursive algorithm satisfying simple conditions, we show how a corresponding learning algorithm can be constructed directly and methodically from the recursive algorithm. Our framework applie… ▽ More This paper proposes Branch & Learn, a framework for Predict+Optimize to tackle optimization problems containing parameters that are unknown at the time of solving. Given an optimization problem solvable by a recursive algorithm satisfying simple conditions, we show how a corresponding learning algorithm can be constructed directly and methodically from the recursive algorithm. Our framework applies also to iterative algorithms by viewing them as a degenerate form of recursion. Extensive experimentation shows better performance for our proposal over classical and state-of-the-art approaches. △ Less

Submitted 1 May, 2022; originally announced May 2022.

arXiv:2105.07609 [pdf, ps, other]

Intrablock Interleaving for Batched Network Coding with Blockwise Adaptive Recoding

Authors: Hoover H. F. Yin, Ka Hei Ng, Allen Z. Zhong, Raymond W. Yeung, Shenghao Yang, Ian Y. Y. Chan

Abstract: Batched network coding (BNC) is a low-complexity solution to network transmission in multi-hop packet networks with packet loss. BNC encodes the source data into batches of packets. As a network coding scheme, the intermediate nodes perform recoding on the received packets belonging to the same batch instead of just forwarding them. A recoding scheme that may generate more recoded packets for batc… ▽ More Batched network coding (BNC) is a low-complexity solution to network transmission in multi-hop packet networks with packet loss. BNC encodes the source data into batches of packets. As a network coding scheme, the intermediate nodes perform recoding on the received packets belonging to the same batch instead of just forwarding them. A recoding scheme that may generate more recoded packets for batches of a higher rank is also called adaptive recoding. Meanwhile, in order to combat burst packet loss, the transmission of a block of batches can be interleaved. Stream interleaving studied in literature achieves the maximum separation among any two consecutive packets of a batch, but permutes packets across blocks and hence cannot bound the buffer size and the latency. To resolve the issue of stream interleaver, we design an intrablock interleaver for adaptive recoding that can preserve the advantages of using a block interleaver when the number of recoded packets is the same for all batches. We use potential energy in classical mechanics to measure the performance of an interleaver, and propose an algorithm to optimize the interleaver with this performance measure. Our problem formulation and algorithm for intrablock interleaving are also of independent interest. △ Less

Submitted 15 September, 2021; v1 submitted 17 May, 2021; originally announced May 2021.

Comments: This paper was presented in part at 2021 IEEE International Symposium on Information Theory

arXiv:2103.11269 [pdf]

Development and Validation of a Deep Learning Model for Prediction of Severe Outcomes in Suspected COVID-19 Infection

Authors: Varun Buch, Aoxiao Zhong, Xiang Li, Marcio Aloisio Bezerra Cavalcanti Rockenbach, Dufan Wu, Hui Ren, Jiahui Guan, Andrew Liteplo, Sayon Dutta, Ittai Dayan, Quanzheng Li

Abstract: COVID-19 patient triaging with predictive outcome of the patients upon first present to emergency department (ED) is crucial for improving patient prognosis, as well as better hospital resources management and cross-infection control. We trained a deep feature fusion model to predict patient outcomes, where the model inputs were EHR data including demographic information, co-morbidities, vital sig… ▽ More COVID-19 patient triaging with predictive outcome of the patients upon first present to emergency department (ED) is crucial for improving patient prognosis, as well as better hospital resources management and cross-infection control. We trained a deep feature fusion model to predict patient outcomes, where the model inputs were EHR data including demographic information, co-morbidities, vital signs and laboratory measurements, plus patient's CXR images. The model output was patient outcomes defined as the most insensitive oxygen therapy required. For patients without CXR images, we employed Random Forest method for the prediction. Predictive risk scores for COVID-19 severe outcomes ("CO-RISK" score) were derived from model output and evaluated on the testing dataset, as well as compared to human performance. The study's dataset (the "MGB COVID Cohort") was constructed from all patients presenting to the Mass General Brigham (MGB) healthcare system from March 1st to June 1st, 2020. ED visits with incomplete or erroneous data were excluded. Patients with no test order for COVID or confirmed negative test results were excluded. Patients under the age of 15 were also excluded. Finally, electronic health record (EHR) data from a total of 11060 COVID-19 confirmed or suspected patients were used in this study. Chest X-ray (CXR) images were also collected from each patient if available. Results show that CO-RISK score achieved area under the Curve (AUC) of predicting MV/death (i.e. severe outcomes) in 24 hours of 0.95, and 0.92 in 72 hours on the testing dataset. The model shows superior performance to the commonly used risk scores in ED (CURB-65 and MEWS). Comparing with physician's decisions, CO-RISK score has demonstrated superior performance to human in making ICU/floor decisions. △ Less

Submitted 28 March, 2021; v1 submitted 20 March, 2021; originally announced March 2021.

Comments: Varun Buch, Aoxiao Zhong and Xiang Li contribute equally to this work

arXiv:2012.03663 [pdf]

doi 10.1016/j.media.2021.101993

Deep Metric Learning-based Image Retrieval System for Chest Radiograph and its Clinical Applications in COVID-19

Authors: Aoxiao Zhong, Xiang Li, Dufan Wu, Hui Ren, Kyungsang Kim, Younggon Kim, Varun Buch, Nir Neumark, Bernardo Bizzo, Won Young Tak, Soo Young Park, Yu Rim Lee, Min Kyu Kang, Jung Gil Park, Byung Seok Kim, Woo ** Chung, Ning Guo, Ittai Dayan, Mannudeep K. Kalra, Quanzheng Li

Abstract: In recent years, deep learning-based image analysis methods have been widely applied in computer-aided detection, diagnosis and prognosis, and has shown its value during the public health crisis of the novel coronavirus disease 2019 (COVID-19) pandemic. Chest radiograph (CXR) has been playing a crucial role in COVID-19 patient triaging, diagnosing and monitoring, particularly in the United States.… ▽ More In recent years, deep learning-based image analysis methods have been widely applied in computer-aided detection, diagnosis and prognosis, and has shown its value during the public health crisis of the novel coronavirus disease 2019 (COVID-19) pandemic. Chest radiograph (CXR) has been playing a crucial role in COVID-19 patient triaging, diagnosing and monitoring, particularly in the United States. Considering the mixed and unspecific signals in CXR, an image retrieval model of CXR that provides both similar images and associated clinical information can be more clinically meaningful than a direct image diagnostic model. In this work we develop a novel CXR image retrieval model based on deep metric learning. Unlike traditional diagnostic models which aims at learning the direct map** from images to labels, the proposed model aims at learning the optimized embedding space of images, where images with the same labels and similar contents are pulled together. It utilizes multi-similarity loss with hard-mining sampling strategy and attention mechanism to learn the optimized embedding space, and provides similar images to the query image. The model is trained and validated on an international multi-site COVID-19 dataset collected from 3 different sources. Experimental results of COVID-19 image retrieval and diagnosis tasks show that the proposed model can serve as a robust solution for CXR analysis and patient management for COVID-19. The model is also tested on its transferability on a different clinical decision support task, where the pre-trained model is applied to extract image features from a new dataset without any further training. These results demonstrate our deep metric learning based image retrieval model is highly efficient in the CXR retrieval, diagnosis and prognosis, and thus has great clinical value for the treatment and management of COVID-19 patients. △ Less

Submitted 25 November, 2020; originally announced December 2020.

Comments: Aoxiao Zhong and Xiang Li contribute equally to this work

Journal ref: Medical Image Analysis. 70 (2021) 101993

arXiv:2011.01815 [pdf, other]

LQR with Tracking: A Zeroth-order Approach and Its Global Convergence

Authors: Zhaolin Ren, Aoxiao Zhong, Na Li

Abstract: There has been substantial recent progress on the theoretical understanding of model-free approaches to Linear Quadratic Regulator (LQR) problems. Much attention has been devoted to the special case when the goal is to drive the state close to a zero target. In this work, we consider the general case where the target is allowed to be arbitrary, which we refer to as the LQR tracking problem. We stu… ▽ More There has been substantial recent progress on the theoretical understanding of model-free approaches to Linear Quadratic Regulator (LQR) problems. Much attention has been devoted to the special case when the goal is to drive the state close to a zero target. In this work, we consider the general case where the target is allowed to be arbitrary, which we refer to as the LQR tracking problem. We study the optimization landscape of this problem, and show that similar to the zero-target LQR problem, the LQR tracking problem also satisfies gradient dominance and local smoothness properties. This allows us to develop a zeroth-order policy gradient algorithm that achieves global convergence. We support our arguments with numerical simulations on a linear system. △ Less

Submitted 12 April, 2021; v1 submitted 3 November, 2020; originally announced November 2020.

Comments: Modified parameterization for policy structure. Revised focus to be on optimization landscape and convergence for the LQR tracking problem

arXiv:1911.02549 [pdf, other]

MLPerf Inference Benchmark

Authors: Vijay Janapa Reddi, Christine Cheng, David Kanter, Peter Mattson, Guenther Schmuelling, Carole-Jean Wu, Brian Anderson, Maximilien Breughe, Mark Charlebois, William Chou, Ramesh Chukka, Cody Coleman, Sam Davis, Pan Deng, Greg Diamos, Jared Duke, Dave Fick, J. Scott Gardner, Itay Hubara, Sachin Idgunji, Thomas B. Jablin, Jeff Jiao, Tom St. John, Pankaj Kanwar, David Lee , et al. (22 additional authors not shown)

Abstract: Machine-learning (ML) hardware and software system demand is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance; they range from embedded devic… ▽ More Machine-learning (ML) hardware and software system demand is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance; they range from embedded devices to data-center solutions. Fueling the hardware are a dozen or more software frameworks and libraries. The myriad combinations of ML hardware and ML software make assessing ML-system performance in an architecture-neutral, representative, and reproducible manner challenging. There is a clear need for industry-wide standard ML benchmarking and evaluation criteria. MLPerf Inference answers that call. In this paper, we present our benchmarking method for evaluating ML inference systems. Driven by more than 30 organizations as well as more than 200 ML engineers and practitioners, MLPerf prescribes a set of rules and best practices to ensure comparability across systems with wildly differing architectures. The first call for submissions garnered more than 600 reproducible inference-performance measurements from 14 organizations, representing over 30 systems that showcase a wide range of capabilities. The submissions attest to the benchmark's flexibility and adaptability. △ Less

Submitted 9 May, 2020; v1 submitted 6 November, 2019; originally announced November 2019.

Comments: ISCA 2020

arXiv:1806.04325 [pdf, ps, other]

Augmenting Stream Constraint Programming with Eventuality Conditions

Authors: Jasper C. H. Lee, Jimmy H. M. Lee, Allen Z. Zhong

Abstract: Stream constraint programming is a recent addition to the family of constraint programming frameworks, where variable domains are sets of infinite streams over finite alphabets. Previous works showed promising results for its applicability to real-world planning and control problems. In this paper, motivated by the modelling of planning applications, we improve the expressiveness of the framework… ▽ More Stream constraint programming is a recent addition to the family of constraint programming frameworks, where variable domains are sets of infinite streams over finite alphabets. Previous works showed promising results for its applicability to real-world planning and control problems. In this paper, motivated by the modelling of planning applications, we improve the expressiveness of the framework by introducing 1) the "until" constraint, a new construct that is adapted from Linear Temporal Logic and 2) the @ operator on streams, a syntactic sugar for which we provide a more efficient solving algorithm over simple desugaring. For both constructs, we propose corresponding novel solving algorithms and prove their correctness. We present competitive experimental results on the Missionaries and Cannibals logic puzzle and a standard path planning application on the grid, by comparing with Apt and Brand's method for verifying eventuality conditions using a CP approach. △ Less

Submitted 6 August, 2018; v1 submitted 12 June, 2018; originally announced June 2018.

Comments: Added proofs and an appendix containing a constraint model that was not included in the previous version

arXiv:1710.10121 [pdf, other]

Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations

Authors: Yi** Lu, Aoxiao Zhong, Quanzheng Li, Bin Dong

Abstract: In our work, we bridge deep neural network design with numerical differential equations. We show that many effective networks, such as ResNet, PolyNet, FractalNet and RevNet, can be interpreted as different numerical discretizations of differential equations. This finding brings us a brand new perspective on the design of effective deep architectures. We can take advantage of the rich knowledge in… ▽ More In our work, we bridge deep neural network design with numerical differential equations. We show that many effective networks, such as ResNet, PolyNet, FractalNet and RevNet, can be interpreted as different numerical discretizations of differential equations. This finding brings us a brand new perspective on the design of effective deep architectures. We can take advantage of the rich knowledge in numerical analysis to guide us in designing new and potentially more effective deep networks. As an example, we propose a linear multi-step architecture (LM-architecture) which is inspired by the linear multi-step method solving ordinary differential equations. The LM-architecture is an effective structure that can be used on any ResNet-like networks. In particular, we demonstrate that LM-ResNet and LM-ResNeXt (i.e. the networks obtained by applying the LM-architecture on ResNet and ResNeXt respectively) can achieve noticeably higher accuracy than ResNet and ResNeXt on both CIFAR and ImageNet with comparable numbers of trainable parameters. In particular, on both CIFAR and ImageNet, LM-ResNet/LM-ResNeXt can significantly compress ($>50$\%) the original networks while maintaining a similar performance. This can be explained mathematically using the concept of modified equation from numerical analysis. Last but not least, we also establish a connection between stochastic control and noise injection in the training process which helps to improve generalization of the networks. Furthermore, by relating stochastic training strategy with stochastic dynamic system, we can easily apply stochastic training to the networks with the LM-architecture. As an example, we introduced stochastic depth to LM-ResNet and achieve significant improvement over the original LM-ResNet on CIFAR10. △ Less

Submitted 23 March, 2020; v1 submitted 27 October, 2017; originally announced October 2017.

arXiv:1707.06145 [pdf, other]

doi 10.1007/978-3-319-67389-9_25

Self-paced Convolutional Neural Network for Computer Aided Detection in Medical Imaging Analysis

Authors: Xiang Li, Aoxiao Zhong, Ming Lin, Ning Guo, Mu Sun, Arkadiusz Sitek, Jie** Ye, James Thrall, Quanzheng Li

Abstract: Tissue characterization has long been an important component of Computer Aided Diagnosis (CAD) systems for automatic lesion detection and further clinical planning. Motivated by the superior performance of deep learning methods on various computer vision problems, there has been increasing work applying deep learning to medical image analysis. However, the development of a robust and reliable deep… ▽ More Tissue characterization has long been an important component of Computer Aided Diagnosis (CAD) systems for automatic lesion detection and further clinical planning. Motivated by the superior performance of deep learning methods on various computer vision problems, there has been increasing work applying deep learning to medical image analysis. However, the development of a robust and reliable deep learning model for computer-aided diagnosis is still highly challenging due to the combination of the high heterogeneity in the medical images and the relative lack of training samples. Specifically, annotation and labeling of the medical images is much more expensive and time-consuming than other applications and often involves manual labor from multiple domain experts. In this work, we propose a multi-stage, self-paced learning framework utilizing a convolutional neural network (CNN) to classify Computed Tomography (CT) image patches. The key contribution of this approach is that we augment the size of training samples by refining the unlabeled instances with a self-paced learning CNN. By implementing the framework on high performance computing servers including the NVIDIA DGX1 machine, we obtained the experimental result, showing that the self-pace boosted network consistently outperformed the original network even with very scarce manual labels. The performance gain indicates that applications with limited training samples such as medical image analysis can benefit from using the proposed framework. △ Less

Submitted 19 July, 2017; originally announced July 2017.

Comments: accepted by 8th International Workshop on Machine Learning in Medical Imaging (MLMI 2017)

Showing 1–15 of 15 results for author: Zhong, A