-
Surrogate-free machine learning-based organ dose reconstruction for pediatric abdominal radiotherapy
Authors:
M. Virgolin,
Z. Wang,
B. V. Balgobind,
I. W. E. M. van Dijk,
J. Wiersma,
P. S. Kroon,
G. O. Janssens,
M. van Herk,
D. C. Hodgson,
L. Zadravec Zaletel,
C. R. N. Rasch,
A. Bel,
P. A. N. Bosman,
T. Alderliesten
Abstract:
To study radiotherapy-related adverse effects, detailed dose information (3D distribution) is needed for accurate dose-effect modeling. For childhood cancer survivors who underwent radiotherapy in the pre-CT era, only 2D radiographs were acquired, thus 3D dose distributions must be reconstructed from limited information. State-of-the-art methods achieve this by using 3D surrogate anatomies. These can lack personalization and lead to coarse reconstructions. We present and validate a surrogate-free dose reconstruction method based on Machine Learning (ML). Abdominal planning CTs ($n$=142) of recently-treated childhood cancer patients were gathered, their organs at risk were segmented, and 300 artificial Wilms' tumor plans were sampled automatically. Each artificial plan was automatically emulated on the 142 CTs, resulting in 42,600 3D dose distributions from which dose-volume metrics were derived. Anatomical features were extracted from digitally reconstructed radiographs simulated from the CTs to resemble historical radiographs. Further, patient and radiotherapy plan features typically available from historical treatment records were collected. An evolutionary ML algorithm was then used to link features to dose-volume metrics. Besides 5-fold cross-validation, a further evaluation was done on an independent dataset of five CTs each associated with two clinical plans. Cross-validation resulted in Mean Absolute Errors (MAEs) $\leq$0.6 Gy for organs completely inside or outside the field. For organs positioned at the edge of the field, MAEs $\leq$1.7 Gy for D$_{mean}$, $\leq$2.9 Gy for D$_{2cc}$, and $\leq$13% for V$_{5Gy}$ and V$_{10Gy}$, were obtained, without systematic bias. Similar results were found for the independent dataset. Our novel, ML-based organ dose reconstruction method is not only accurate but also efficient, as the setup of a surrogate is no longer needed.
Submitted 10 February, 2021; v1 submitted 16 February, 2020;
originally announced February 2020.
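The plan-by-CT pairing and the error metric described in the abstract above can be illustrated with a minimal sketch. The identifiers and dose values below are hypothetical placeholders, not data from the paper; only the counts (300 plans, 142 CTs) and the usual definition of mean absolute error are taken from the text.

```python
from itertools import product

# Hypothetical identifiers; only the counts come from the abstract.
plans = [f"plan_{i}" for i in range(300)]
cts = [f"ct_{j}" for j in range(142)]

# Every artificial plan is emulated on every CT.
pairs = list(product(plans, cts))
n_distributions = len(pairs)  # 300 * 142

def mean_absolute_error(predicted, reference):
    """Average absolute difference between predictions and reference values."""
    if len(predicted) != len(reference):
        raise ValueError("inputs must have the same length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)

# Made-up D_mean values in Gy, for illustration only.
example_mae = mean_absolute_error([10.0, 12.5, 3.0], [11.0, 12.0, 3.5])
```

The pairing count matches the 42,600 dose distributions reported in the abstract; the MAE helper only shows how such an error figure is computed, not how the authors evaluated their model.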
-
Overly Optimistic Prediction Results on Imbalanced Data: a Case Study of Flaws and Benefits when Applying Over-sampling
Authors:
Gilles Vandewiele,
Isabelle Dehaene,
György Kovács,
Lucas Sterckx,
Olivier Janssens,
Femke Ongenae,
Femke De Backere,
Filip De Turck,
Kristien Roelens,
Johan Decruyenaere,
Sofie Van Hoecke,
Thomas Demeester
Abstract:
Information extracted from electrohysterography recordings could prove to be a useful additional source of information for estimating the risk of preterm birth. Recently, a large number of studies have reported near-perfect results in distinguishing between recordings of patients who will deliver at term or preterm using a public resource, the Term/Preterm Electrohysterogram database. However, we argue that these results are overly optimistic because of a methodological flaw. In this work, we focus on one specific type of methodological flaw: applying over-sampling before partitioning the data into mutually exclusive training and testing sets. We show how this biases the results using two artificial datasets, and we reproduce the results of studies in which this flaw was identified. Moreover, we evaluate the actual impact of over-sampling on predictive performance, when applied prior to data partitioning, using the same methodologies of related studies, to provide a realistic view of these methodologies' generalization capabilities. We make our research reproducible by providing all the code under an open license.
Submitted 28 November, 2020; v1 submitted 15 January, 2020;
originally announced January 2020.
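The flaw named in the abstract above, over-sampling before partitioning, can be sketched in a few lines. This is an illustrative toy, not the authors' code: it uses simple random duplication of minority rows (the studies in question may use other over-samplers), and the helper names `random_oversample`, `split`, and `count_leaks` are hypothetical.

```python
import random

def random_oversample(rows, rng):
    """Duplicate minority-class rows at random until all classes are equal in size."""
    by_label = {}
    for row in rows:
        by_label.setdefault(row[1], []).append(row)
    target = max(len(group) for group in by_label.values())
    out = []
    for group in by_label.values():
        out.extend(group)
        out.extend(rng.choice(group) for _ in range(target - len(group)))
    return out

def split(rows, test_fraction, rng):
    """Shuffle a copy of the rows and return (train, test)."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def count_leaks(train, test):
    """Count test rows whose record id also appears in the training set."""
    train_ids = {row[0] for row in train}
    return sum(1 for row in test if row[0] in train_ids)

rng = random.Random(0)
# Rows are (record_id, label): 90 majority records (0) and 10 minority records (1).
data = [(i, 0) for i in range(90)] + [(i, 1) for i in range(90, 100)]

# Flawed order: over-sample first, then split. Copies of one record can land on both sides.
flawed_train, flawed_test = split(random_oversample(data, rng), 0.2, rng)
flawed_leaks = count_leaks(flawed_train, flawed_test)

# Sound order: split first, then over-sample the training set only.
train, test = split(data, 0.2, rng)
train = random_oversample(train, rng)
correct_leaks = count_leaks(train, test)
```

In the flawed order, duplicated minority records appear in both partitions, so the test set partly measures memorization. In the sound order the test set stays disjoint from the training set by construction, which is why `correct_leaks` is zero.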
-
Web Applicable Computer-aided Diagnosis of Glaucoma Using Deep Learning
Authors:
Mijung Kim,
Olivier Janssens,
Ho-min Park,
Jasper Zuallaert,
Sofie Van Hoecke,
Wesley De Neve
Abstract:
Glaucoma is a major eye disease, leading to vision loss in the absence of proper medical treatment. Current diagnosis of glaucoma is performed by ophthalmologists who are often analyzing several types of medical images generated by different types of medical equipment. Capturing and analyzing these medical images is labor-intensive and expensive. In this paper, we present a novel computational approach towards glaucoma diagnosis and localization, only making use of eye fundus images that are analyzed by state-of-the-art deep learning techniques. Specifically, our approach leverages Convolutional Neural Networks (CNNs) and Gradient-weighted Class Activation Mapping (Grad-CAM) for glaucoma diagnosis and localization, respectively. Quantitative and qualitative results, as obtained for a small-sized dataset with no segmentation ground truth, demonstrate that the proposed approach is promising, for instance achieving an accuracy of $0.91 \pm 0.02$ and an ROC-AUC score of 0.94 for the diagnosis task. Furthermore, we present a publicly available prototype web application that integrates our predictive model, with the goal of making effective glaucoma diagnosis available to a wide audience.
Submitted 3 April, 2019; v1 submitted 6 December, 2018;
originally announced December 2018.
-
GENESIM: genetic extraction of a single, interpretable model
Authors:
Gilles Vandewiele,
Olivier Janssens,
Femke Ongenae,
Filip De Turck,
Sofie Van Hoecke
Abstract:
Models obtained by decision tree induction techniques excel at being interpretable. However, they can be prone to overfitting, which results in low predictive performance. Ensemble techniques are able to achieve a higher accuracy. However, this comes at the cost of losing interpretability of the resulting model. This makes ensemble techniques impractical in applications where decision support, instead of decision making, is crucial.
To bridge this gap, we present the GENESIM algorithm that transforms an ensemble of decision trees to a single decision tree with an enhanced predictive performance by using a genetic algorithm. We compared GENESIM to prevalent decision tree induction and ensemble techniques using twelve publicly available data sets. The results show that GENESIM achieves a better predictive performance on most of these data sets than decision tree induction techniques and a predictive performance in the same order of magnitude as the ensemble techniques. Moreover, the resulting model of GENESIM has a very low complexity, making it very interpretable, in contrast to ensemble techniques.
Submitted 17 November, 2016;
originally announced November 2016.