Search | arXiv e-print repository

The ULS23 Challenge: a Baseline Model and Benchmark Dataset for 3D Universal Lesion Segmentation in Computed Tomography

Authors: M. J. J. de Grauw, E. Th. Scholten, E. J. Smit, M. J. C. M. Rutten, M. Prokop, B. van Ginneken, A. Hering

Abstract: Size measurements of tumor manifestations on follow-up CT examinations are crucial for evaluating treatment outcomes in cancer patients. Efficient lesion segmentation can speed up these radiological workflows. While numerous benchmarks and challenges address lesion segmentation in specific organs like the liver, kidneys, and lungs, the larger variety of lesion types encountered in clinical practic… ▽ More Size measurements of tumor manifestations on follow-up CT examinations are crucial for evaluating treatment outcomes in cancer patients. Efficient lesion segmentation can speed up these radiological workflows. While numerous benchmarks and challenges address lesion segmentation in specific organs like the liver, kidneys, and lungs, the larger variety of lesion types encountered in clinical practice demands a more universal approach. To address this gap, we introduced the ULS23 benchmark for 3D universal lesion segmentation in chest-abdomen-pelvis CT examinations. The ULS23 training dataset contains 38,693 lesions across this region, including challenging pancreatic, colon and bone lesions. For evaluation purposes, we curated a dataset comprising 775 lesions from 284 patients. Each of these lesions was identified as a target lesion in a clinical context, ensuring diversity and clinical relevance within this dataset. The ULS23 benchmark is publicly accessible via uls23.grand-challenge.org, enabling researchers worldwide to assess the performance of their segmentation methods. Furthermore, we have developed and publicly released our baseline semi-supervised 3D lesion segmentation model. This model achieved an average Dice coefficient of 0.703 $\pm$ 0.240 on the challenge test set. We invite ongoing submissions to advance the development of future ULS models. △ Less

Submitted 21 June, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

arXiv:2401.02192 [pdf]

Nodule detection and generation on chest X-rays: NODE21 Challenge

Authors: Ecem Sogancioglu, Bram van Ginneken, Finn Behrendt, Marcel Bengs, Alexander Schlaefer, Miron Radu, Di Xu, Ke Sheng, Fabien Scalzo, Eric Marcus, Samuele Papa, Jonas Teuwen, Ernst Th. Scholten, Steven Schalekamp, Nils Hendrix, Colin Jacobs, Ward Hendrix, Clara I Sánchez, Keelin Murphy

Abstract: Pulmonary nodules may be an early manifestation of lung cancer, the leading cause of cancer-related deaths among both men and women. Numerous studies have established that deep learning methods can yield high-performance levels in the detection of lung nodules in chest X-rays. However, the lack of gold-standard public datasets slows down the progression of the research and prevents benchmarking of… ▽ More Pulmonary nodules may be an early manifestation of lung cancer, the leading cause of cancer-related deaths among both men and women. Numerous studies have established that deep learning methods can yield high-performance levels in the detection of lung nodules in chest X-rays. However, the lack of gold-standard public datasets slows down the progression of the research and prevents benchmarking of methods for this task. To address this, we organized a public research challenge, NODE21, aimed at the detection and generation of lung nodules in chest X-rays. While the detection track assesses state-of-the-art nodule detection systems, the generation track determines the utility of nodule generation algorithms to augment training data and hence improve the performance of the detection systems. This paper summarizes the results of the NODE21 challenge and performs extensive additional experiments to examine the impact of the synthetically generated nodule training images on the detection algorithm performance. △ Less

Submitted 4 January, 2024; originally announced January 2024.

Comments: 15 pages, 5 figures

arXiv:2309.03383 [pdf, other]

Kidney abnormality segmentation in thorax-abdomen CT scans

Authors: Gabriel Efrain Humpire Mamani, Nikolas Lessmann, Ernst Th. Scholten, Mathias Prokop, Colin Jacobs, Bram van Ginneken

Abstract: In this study, we introduce a deep learning approach for segmenting kidney parenchyma and kidney abnormalities to support clinicians in identifying and quantifying renal abnormalities such as cysts, lesions, masses, metastases, and primary tumors. Our end-to-end segmentation method was trained on 215 contrast-enhanced thoracic-abdominal CT scans, with half of these scans containing one or more abn… ▽ More In this study, we introduce a deep learning approach for segmenting kidney parenchyma and kidney abnormalities to support clinicians in identifying and quantifying renal abnormalities such as cysts, lesions, masses, metastases, and primary tumors. Our end-to-end segmentation method was trained on 215 contrast-enhanced thoracic-abdominal CT scans, with half of these scans containing one or more abnormalities. We began by implementing our own version of the original 3D U-Net network and incorporated four additional components: an end-to-end multi-resolution approach, a set of task-specific data augmentations, a modified loss function using top-$k$, and spatial dropout. Furthermore, we devised a tailored post-processing strategy. Ablation studies demonstrated that each of the four modifications enhanced kidney abnormality segmentation performance, while three out of four improved kidney parenchyma segmentation. Subsequently, we trained the nnUNet framework on our dataset. By ensembling the optimized 3D U-Net and the nnUNet with our specialized post-processing, we achieved marginally superior results. Our best-performing model attained Dice scores of 0.965 and 0.947 for segmenting kidney parenchyma in two test sets (20 scans without abnormalities and 30 with abnormalities), outperforming an independent human observer who scored 0.944 and 0.925, respectively. In segmenting kidney abnormalities within the 30 test scans containing them, the top-performing method achieved a Dice score of 0.585, while an independent second human observer reached a score of 0.664, suggesting potential for further improvement in computerized methods. All training data is available to the research community under a CC-BY 4.0 license on https://doi.org/10.5281/zenodo.8014289 △ Less

Submitted 6 September, 2023; originally announced September 2023.

arXiv:2308.03586 [pdf]

SoilNet: An Attention-based Spatio-temporal Deep Learning Framework for Soil Organic Carbon Prediction with Digital Soil Map** in Europe

Authors: Nafiseh Kakhani, Moien Rangzan, Ali Jamali, Sara Attarchi, Seyed Kazem Alavipanah, Thomas Scholten

Abstract: Digital soil map** (DSM) is an advanced approach that integrates statistical modeling and cutting-edge technologies, including machine learning (ML) methods, to accurately depict soil properties and their spatial distribution. Soil organic carbon (SOC) is a crucial soil attribute providing valuable insights into soil health, nutrient cycling, greenhouse gas emissions, and overall ecosystem produ… ▽ More Digital soil map** (DSM) is an advanced approach that integrates statistical modeling and cutting-edge technologies, including machine learning (ML) methods, to accurately depict soil properties and their spatial distribution. Soil organic carbon (SOC) is a crucial soil attribute providing valuable insights into soil health, nutrient cycling, greenhouse gas emissions, and overall ecosystem productivity. This study highlights the significance of spatial-temporal deep learning (DL) techniques within the DSM framework. A novel architecture is proposed, incorporating spatial information using a base convolutional neural network (CNN) model and spatial attention mechanism, along with climate temporal information using a long short-term memory (LSTM) network, for SOC prediction across Europe. The model utilizes a comprehensive set of environmental features, including Landsat-8 images, topography, remote sensing indices, and climate time series, as input features. Results demonstrate that the proposed framework outperforms conventional ML approaches like random forest commonly used in DSM, yielding lower root mean square error (RMSE). This model is a robust tool for predicting SOC and could be applied to other soil properties, thereby contributing to the advancement of DSM techniques and facilitating land management and decision-making processes based on accurate information. △ Less

Submitted 24 May, 2024; v1 submitted 7 August, 2023; originally announced August 2023.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2304.04664 [pdf, other]

Inductive biases in deep learning models for weather prediction

Authors: Jannik Thuemmel, Matthias Karlbauer, Sebastian Otte, Christiane Zarfl, Georg Martius, Nicole Ludwig, Thomas Scholten, Ulrich Friedrich, Volker Wulfmeyer, Bedartha Goswami, Martin V. Butz

Abstract: Deep learning has gained immense popularity in the Earth sciences as it enables us to formulate purely data-driven models of complex Earth system processes. Deep learning-based weather prediction (DLWP) models have made significant progress in the last few years, achieving forecast skills comparable to established numerical weather prediction models with comparatively lesser computational costs. I… ▽ More Deep learning has gained immense popularity in the Earth sciences as it enables us to formulate purely data-driven models of complex Earth system processes. Deep learning-based weather prediction (DLWP) models have made significant progress in the last few years, achieving forecast skills comparable to established numerical weather prediction models with comparatively lesser computational costs. In order to train accurate, reliable, and tractable DLWP models with several millions of parameters, the model design needs to incorporate suitable inductive biases that encode structural assumptions about the data and the modelled processes. When chosen appropriately, these biases enable faster learning and better generalisation to unseen data. Although inductive biases play a crucial role in successful DLWP models, they are often not stated explicitly and their contribution to model performance remains unclear. Here, we review and analyse the inductive biases of state-of-the-art DLWP models with respect to five key design elements: data selection, learning objective, loss function, architecture, and optimisation method. We identify the most important inductive biases and highlight potential avenues towards more efficient and probabilistic DLWP models. △ Less

Submitted 30 April, 2024; v1 submitted 6 April, 2023; originally announced April 2023.

arXiv:2105.01181 [pdf, other]

doi 10.1002/mp.15655

Automated Estimation of Total Lung Volume using Chest Radiographs and Deep Learning

Authors: Ecem Sogancioglu, Keelin Murphy, Ernst Th. Scholten, Luuk H. Boulogne, Mathias Prokop, Bram van Ginneken

Abstract: Total lung volume is an important quantitative biomarker and is used for the assessment of restrictive lung diseases. In this study, we investigate the performance of several deep-learning approaches for automated measurement of total lung volume from chest radiographs. 7621 posteroanterior and lateral view chest radiographs (CXR) were collected from patients with chest CT available. Similarly, 92… ▽ More Total lung volume is an important quantitative biomarker and is used for the assessment of restrictive lung diseases. In this study, we investigate the performance of several deep-learning approaches for automated measurement of total lung volume from chest radiographs. 7621 posteroanterior and lateral view chest radiographs (CXR) were collected from patients with chest CT available. Similarly, 928 CXR studies were chosen from patients with pulmonary function test (PFT) results. The reference total lung volume was calculated from lung segmentation on CT or PFT data, respectively. This dataset was used to train deep-learning architectures to predict total lung volume from chest radiographs. The experiments were constructed in a step-wise fashion with increasing complexity to demonstrate the effect of training with CT-derived labels only and the sources of error. The optimal models were tested on 291 CXR studies with reference lung volume obtained from PFT. The optimal deep-learning regression model showed an MAE of 408 ml and a MAPE of 8.1\% and Pearson's r = 0.92 using both frontal and lateral chest radiographs as input. CT-derived labels were useful for pre-training but the optimal performance was obtained by fine-tuning the network with PFT-derived labels. We demonstrate, for the first time, that state-of-the-art deep learning solutions can accurately measure total lung volume from plain chest radiographs. The proposed model can be used to obtain total lung volume from routinely acquired chest radiographs at no additional cost and could be a useful tool to identify trends over time in patients referred regularly for chest x-rays. △ Less

Submitted 3 May, 2021; originally announced May 2021.

Comments: Under review

arXiv:2009.09823 [pdf, other]

Latent State Inference in a Spatiotemporal Generative Model

Authors: Matthias Karlbauer, Tobias Menge, Sebastian Otte, Hendrik P. A. Lensch, Thomas Scholten, Volker Wulfmeyer, Martin V. Butz

Abstract: Knowledge about the hidden factors that determine particular system dynamics is crucial for both explaining them and pursuing goal-directed interventions. Inferring these factors from time series data without supervision remains an open challenge. Here, we focus on spatiotemporal processes, including wave propagation and weather dynamics, for which we assume that universal causes (e.g. physics) ap… ▽ More Knowledge about the hidden factors that determine particular system dynamics is crucial for both explaining them and pursuing goal-directed interventions. Inferring these factors from time series data without supervision remains an open challenge. Here, we focus on spatiotemporal processes, including wave propagation and weather dynamics, for which we assume that universal causes (e.g. physics) apply throughout space and time. A recently introduced DIstributed SpatioTemporal graph Artificial Neural network Architecture (DISTANA) is used and enhanced to learn such processes, requiring fewer parameters and achieving significantly more accurate predictions compared to temporal convolutional neural networks and other related approaches. We show that DISTANA, when combined with a retrospective latent state inference principle called active tuning, can reliably derive location-respective hidden causal factors. In a current weather prediction benchmark, DISTANA infers our planet's land-sea mask solely by observing temperature dynamics and, meanwhile, uses the self inferred information to improve its own future temperature predictions. △ Less

Submitted 15 August, 2021; v1 submitted 21 September, 2020; originally announced September 2020.

Comments: As submitted to and accepted by the 30th International Conference on Artificial Neural Networks (ICANN)

arXiv:2009.09187 [pdf, other]

Inferring, Predicting, and Denoising Causal Wave Dynamics

Authors: Matthias Karlbauer, Sebastian Otte, Hendrik P. A. Lensch, Thomas Scholten, Volker Wulfmeyer, Martin V. Butz

Abstract: The novel DISTributed Artificial neural Network Architecture (DISTANA) is a generative, recurrent graph convolution neural network. It implements a grid or mesh of locally parameterizable laterally connected network modules. DISTANA is specifically designed to identify the causality behind spatially distributed, non-linear dynamical processes. We show that DISTANA is very well-suited to denoise da… ▽ More The novel DISTributed Artificial neural Network Architecture (DISTANA) is a generative, recurrent graph convolution neural network. It implements a grid or mesh of locally parameterizable laterally connected network modules. DISTANA is specifically designed to identify the causality behind spatially distributed, non-linear dynamical processes. We show that DISTANA is very well-suited to denoise data streams, given that re-occurring patterns are observed, significantly outperforming alternative approaches, such as temporal convolution networks and ConvLSTMs, on a complex spatial wave propagation benchmark. It produces stable and accurate closed-loop predictions even over hundreds of time steps. Moreover, it is able to effectively filter noise -- an ability that can be improved further by applying denoising autoencoder principles or by actively tuning latent neural state activities retrospectively. Results confirm that DISTANA is ready to model real-world spatio-temporal dynamics such as brain imaging, supply networks, water flow, or soil and weather data patterns. △ Less

Submitted 19 September, 2020; originally announced September 2020.

Comments: As accepted by the 29th International Conference on Artificial Neural Networks (ICANN20)

arXiv:2007.12475 [pdf]

doi 10.3390/rs12142234

Predicting and Map** of Soil Organic Carbon Using Machine Learning Algorithms in Northern Iran

Authors: Mostafa Emadi, Ruhollah Taghizadeh-Mehrjardi, Ali Cherati, Majid Danesh, Amir Mosavi, Thomas Scholten

Abstract: Estimation of the soil organic carbon content is of utmost importance in understanding the chemical, physical, and biological functions of the soil. This study proposes machine learning algorithms of support vector machines, artificial neural networks, regression tree, random forest, extreme gradient boosting, and conventional deep neural network for advancing prediction models of SOC. Models are… ▽ More Estimation of the soil organic carbon content is of utmost importance in understanding the chemical, physical, and biological functions of the soil. This study proposes machine learning algorithms of support vector machines, artificial neural networks, regression tree, random forest, extreme gradient boosting, and conventional deep neural network for advancing prediction models of SOC. Models are trained with 1879 composite surface soil samples, and 105 auxiliary data as predictors. The genetic algorithm is used as a feature selection approach to identify effective variables. The results indicate that precipitation is the most important predictor driving 15 percent of SOC spatial variability followed by the normalized difference vegetation index, day temperature index of moderate resolution imaging spectroradiometer, multiresolution valley bottom flatness and land use, respectively. Based on 10 fold cross validation, the DNN model reported as a superior algorithm with the lowest prediction error and uncertainty. In terms of accuracy, DNN yielded a mean absolute error of 59 percent, a root mean squared error of 75 percent, a coefficient of determination of 0.65, and Lins concordance correlation coefficient of 0.83. The SOC content was the highest in udic soil moisture regime class with mean values of 4 percent, followed by the aquic and xeric classes, respectively. Soils in dense forestlands had the highest SOC contents, whereas soils of younger geological age and alluvial fans had lower SOC. The proposed DNN is a promising algorithm for handling large numbers of auxiliary data at a province scale, and due to its flexible structure and the ability to extract more information from the auxiliary data surrounding the sampled observations, it had high accuracy for the prediction of the SOC baseline map and minimal uncertainty. △ Less

Submitted 12 July, 2020; originally announced July 2020.

Comments: 30pages, 9 figures

MSC Class: 68T05

Journal ref: Remote Sens. 2020, 12, 2234

arXiv:1912.11141 [pdf, other]

A Distributed Neural Network Architecture for Robust Non-Linear Spatio-Temporal Prediction

Authors: Matthias Karlbauer, Sebastian Otte, Hendrik P. A. Lensch, Thomas Scholten, Volker Wulfmeyer, Martin V. Butz

Abstract: We introduce a distributed spatio-temporal artificial neural network architecture (DISTANA). It encodes mesh nodes using recurrent, neural prediction kernels (PKs), while neural transition kernels (TKs) transfer information between neighboring PKs, together modeling and predicting spatio-temporal time series dynamics. As a consequence, DISTANA assumes that generally applicable causes, which may be… ▽ More We introduce a distributed spatio-temporal artificial neural network architecture (DISTANA). It encodes mesh nodes using recurrent, neural prediction kernels (PKs), while neural transition kernels (TKs) transfer information between neighboring PKs, together modeling and predicting spatio-temporal time series dynamics. As a consequence, DISTANA assumes that generally applicable causes, which may be locally modified, generate the observed data. DISTANA learns in a parallel, spatially distributed manner, scales to large problem spaces, is capable of approximating complex dynamics, and is particularly robust to overfitting when compared to other competitive ANN models. Moreover, it is applicable to heterogeneously structured meshes. △ Less

Submitted 23 December, 2019; originally announced December 2019.

Comments: 8 pages, 4 figures, video on https://www.youtube.com/watch?v=4VHhHYeWTzo

arXiv:1908.11762 [pdf, other]

Classifying single-qubit noise using machine learning

Authors: Travis L. Scholten, Yi-Kai Liu, Kevin Young, Robin Blume-Kohout

Abstract: Quantum characterization, validation, and verification (QCVV) techniques are used to probe, characterize, diagnose, and detect errors in quantum information processors (QIPs). An important component of any QCVV protocol is a map** from experimental data to an estimate of a property of a QIP. Machine learning (ML) algorithms can help automate the development of QCVV protocols, creating such maps… ▽ More Quantum characterization, validation, and verification (QCVV) techniques are used to probe, characterize, diagnose, and detect errors in quantum information processors (QIPs). An important component of any QCVV protocol is a map** from experimental data to an estimate of a property of a QIP. Machine learning (ML) algorithms can help automate the development of QCVV protocols, creating such maps by learning them from training data. We identify the critical components of "machine-learned" QCVV techniques, and present a rubric for develo** them. To demonstrate this approach, we focus on the problem of determining whether noise affecting a single qubit is coherent or stochastic (incoherent) using the data sets originally proposed for gate set tomography. We leverage known ML algorithms to train a classifier distinguishing these two kinds of noise. The accuracy of the classifier depends on how well it can approximate the "natural" geometry of the training data. We find GST data sets generated by a noisy qubit can reliably be separated by linear surfaces, although feature engineering can be necessary. We also show the classifier learned by a support vector machine (SVM) is robust under finite-sample noise. △ Less

Submitted 30 August, 2019; originally announced August 2019.

Comments: 20 pages (15 main, 5 supplemental), 11 figures, and 5 tables

arXiv:1903.03349 [pdf]

doi 10.1038/s41598-020-62148-y

Computer aided detection of tuberculosis on chest radiographs: An evaluation of the CAD4TB v6 system

Authors: Keelin Murphy, Shifa Salman Habib, Syed Mohammad Asad Zaidi, Saira Khowaja, Aamir Khan, Jaime Melendez, Ernst T. Scholten, Farhan Amad, Steven Schalekamp, Maurits Verhagen, Rick H. H. M. Philipsen, Annet Meijers, Bram van Ginneken

Abstract: There is a growing interest in the automated analysis of chest X-Ray (CXR) as a sensitive and inexpensive means of screening susceptible populations for pulmonary tuberculosis. In this work we evaluate the latest version of CAD4TB, a commercial software platform designed for this purpose. Version 6 of CAD4TB was released in 2018 and is here tested on a fully independent dataset of 5565 CXR images… ▽ More There is a growing interest in the automated analysis of chest X-Ray (CXR) as a sensitive and inexpensive means of screening susceptible populations for pulmonary tuberculosis. In this work we evaluate the latest version of CAD4TB, a commercial software platform designed for this purpose. Version 6 of CAD4TB was released in 2018 and is here tested on a fully independent dataset of 5565 CXR images with GeneXpert (Xpert) sputum test results available (854 Xpert positive subjects). A subset of 500 subjects (50% Xpert positive) was reviewed and annotated by 5 expert observers independently to obtain a radiological reference standard. The latest version of CAD4TB is found to outperform all previous versions in terms of area under receiver operating curve (ROC) with respect to both Xpert and radiological reference standards. Improvements with respect to Xpert are most apparent at high sensitivity levels with a specificity of 76% obtained at a fixed 90% sensitivity. When compared with the radiological reference standard, CAD4TB v6 also outperformed previous versions by a considerable margin and achieved 98% specificity at the 90% sensitivity setting. No substantial difference was found between the performance of CAD4TB v6 and any of the various expert observers against the Xpert reference standard. A cost and efficiency analysis on this dataset demonstrates that in a standard clinical situation, operating at 90% sensitivity, users of CAD4TB v6 can process 132 subjects per day at n average cost per screen of \$5.95 per subject, while users of version 3 process only 85 subjects per day at a cost of \$8.38 per subject. At all tested operating points version 6 is shown to be more efficient and cost effective than any other version. △ Less

Submitted 2 April, 2020; v1 submitted 8 March, 2019; originally announced March 2019.

Comments: Published in Scientific Reports

Journal ref: Scientific Reports 10, 5492 (2020)

arXiv:1612.08012 [pdf, other]

doi 10.1016/j.media.2017.06.015

Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: the LUNA16 challenge

Authors: Arnaud Arindra Adiyoso Setio, Alberto Traverso, Thomas de Bel, Moira S. N. Berens, Cas van den Bogaard, Piergiorgio Cerello, Hao Chen, Qi Dou, Maria Evelina Fantacci, Bram Geurts, Robbert van der Gugten, Pheng Ann Heng, Bart Jansen, Michael M. J. de Kaste, Valentin Kotov, Jack Yu-Hung Lin, Jeroen T. M. C. Manders, Alexander Sónora-Mengana, Juan Carlos García-Naranjo, Evgenia Papavasileiou, Mathias Prokop, Marco Saletta, Cornelia M Schaefer-Prokop, Ernst T. Scholten, Luuk Scholten , et al. (7 additional authors not shown)

Abstract: Automatic detection of pulmonary nodules in thoracic computed tomography (CT) scans has been an active area of research for the last two decades. However, there have only been few studies that provide a comparative performance evaluation of different systems on a common database. We have therefore set up the LUNA16 challenge, an objective evaluation framework for automatic nodule detection algorit… ▽ More Automatic detection of pulmonary nodules in thoracic computed tomography (CT) scans has been an active area of research for the last two decades. However, there have only been few studies that provide a comparative performance evaluation of different systems on a common database. We have therefore set up the LUNA16 challenge, an objective evaluation framework for automatic nodule detection algorithms using the largest publicly available reference database of chest CT scans, the LIDC-IDRI data set. In LUNA16, participants develop their algorithm and upload their predictions on 888 CT scans in one of the two tracks: 1) the complete nodule detection track where a complete CAD system should be developed, or 2) the false positive reduction track where a provided set of nodule candidates should be classified. This paper describes the setup of LUNA16 and presents the results of the challenge so far. Moreover, the impact of combining individual systems on the detection performance was also investigated. It was observed that the leading solutions employed convolutional networks and used the provided set of nodule candidates. The combination of these solutions achieved an excellent sensitivity of over 95% at fewer than 1.0 false positives per scan. This highlights the potential of combining algorithms to improve the detection performance. Our observer study with four expert readers has shown that the best system detects nodules that were missed by expert readers who originally annotated the LIDC-IDRI data. We released this set of additional nodules for further development of CAD systems. △ Less

Submitted 15 July, 2017; v1 submitted 23 December, 2016; originally announced December 2016.

arXiv:1610.09157 [pdf, other]

doi 10.1038/srep46479

Towards automatic pulmonary nodule management in lung cancer screening with deep learning

Authors: Francesco Ciompi, Kaman Chung, Sarah J. van Riel, Arnaud Arindra Adiyoso Setio, Paul K. Gerke, Colin Jacobs, Ernst Th. Scholten, Cornelia Schaefer-Prokop, Mathilde M. W. Wille, Alfonso Marchiano, Ugo Pastorino, Mathias Prokop, Bram van Ginneken

Abstract: The introduction of lung cancer screening programs will produce an unprecedented amount of chest CT scans in the near future, which radiologists will have to read in order to decide on a patient follow-up strategy. According to the current guidelines, the workup of screen-detected nodules strongly relies on nodule size and nodule type. In this paper, we present a deep learning system based on mult… ▽ More The introduction of lung cancer screening programs will produce an unprecedented amount of chest CT scans in the near future, which radiologists will have to read in order to decide on a patient follow-up strategy. According to the current guidelines, the workup of screen-detected nodules strongly relies on nodule size and nodule type. In this paper, we present a deep learning system based on multi-stream multi-scale convolutional networks, which automatically classifies all nodule types relevant for nodule workup. The system processes raw CT data containing a nodule without the need for any additional information such as nodule segmentation or nodule size and learns a representation of 3D data by analyzing an arbitrary number of 2D views of a given nodule. The deep learning system was trained with data from the Italian MILD screening trial and validated on an independent set of data from the Danish DLCST screening trial. We analyze the advantage of processing nodules at multiple scales with a multi-stream convolutional network architecture, and we show that the proposed deep learning system achieves performance at classifying nodule type that surpasses the one of classical machine learning approaches and is within the inter-observer variability among four experienced human observers. △ Less

Submitted 23 May, 2017; v1 submitted 28 October, 2016; originally announced October 2016.

Comments: Published on Scientific Reports

Journal ref: Sci. Rep. 7, 46479; (2017)

Showing 1–14 of 14 results for author: Scholten, T