DeepVoxNet2: Yet another CNN framework

Authors: Jeroen Bertels, David Robben, Robin Lemmens, Dirk Vandermeulen

Abstract: We know that both the CNN mapping function and the sampling scheme are of paramount importance for CNN-based image analysis. It is clear that both functions operate in the same space, with an image axis $\mathcal{I}$ and a feature axis $\mathcal{F}$. Remarkably, we found that no frameworks existed that unified the two and kept track of the spatial origin of the data automatically. Based on our own practical experience, we found the latter to often result in complex coding and pipelines that are difficult to exchange. This article introduces our framework for 1, 2 or 3D image classification or segmentation: DeepVoxNet2 (DVN2). This article serves as an interactive tutorial, and a pre-compiled version, including the outputs of the code blocks, can be found online in the public DVN2 repository. This tutorial uses data from the multimodal Brain Tumor Image Segmentation Benchmark (BRATS) of 2018 to show an example of a 3D segmentation pipeline.

Submitted 17 November, 2022; originally announced November 2022.

Comments: 15 pages, part of PhD thesis KU Leuven 2022 "Understanding Final Infarct Prediction in Acute Ischemic Stroke Using Convolutional Neural Networks"

Report number: lirias3838597

arXiv:2211.09562

Convolutional neural networks for medical image segmentation

Authors: Jeroen Bertels, David Robben, Robin Lemmens, Dirk Vandermeulen

Abstract: In this article, we look into some essential aspects of convolutional neural networks (CNNs) with the focus on medical image segmentation. First, we discuss the CNN architecture, thereby highlighting the spatial origin of the data, voxel-wise classification and the receptive field. Second, we discuss the sampling of input-output pairs, thereby highlighting the interaction between voxel-wise classification, patch size and the receptive field. Finally, we give a historical overview of crucial changes to CNN architectures for classification and segmentation, giving insights in the relation between three pivotal CNN architectures: FCN, U-Net and DeepMedic.

Submitted 17 November, 2022; originally announced November 2022.

Comments: 10 pages, 6 figures, part of PhD thesis KU Leuven 2022 "Understanding Final Infarct Prediction in Acute Ischemic Stroke Using Convolutional Neural Networks"

Report number: lirias3838597

arXiv:2211.04850

Final infarct prediction in acute ischemic stroke

Authors: Jeroen Bertels, David Robben, Dirk Vandermeulen, Robin Lemmens

Abstract: This article focuses on the control center of each human body: the brain. We will point out the pivotal role of the cerebral vasculature and how its complex mechanisms may vary between subjects. We then emphasize a specific acute pathological state, i.e., acute ischemic stroke, and show how medical imaging and its analysis can be used to define the treatment. We show how the core-penumbra concept is used in practice using mismatch criteria and how machine learning can be used to make predictions of the final infarct, either via deconvolution or convolutional neural networks.

Submitted 9 November, 2022; originally announced November 2022.

Comments: 17 pages, 5 figures, part of PhD thesis KU Leuven 2022 "Understanding Final Infarct Prediction in Acute Ischemic Stroke Using Convolutional Neural Networks"

Report number: lirias3838597

arXiv:2211.04161

doi 10.1016/j.media.2020.101833

Theoretical analysis and experimental validation of volume bias of soft Dice optimized segmentation maps in the context of inherent uncertainty

Authors: Jeroen Bertels, David Robben, Dirk Vandermeulen, Paul Suetens

Abstract: The clinical interest is often to measure the volume of a structure, which is typically derived from a segmentation. In order to evaluate and compare segmentation methods, the similarity between a segmentation and a predefined ground truth is measured using popular discrete metrics, such as the Dice score. Recent segmentation methods use a differentiable surrogate metric, such as soft Dice, as part of the loss function during the learning phase. In this work, we first briefly describe how to derive volume estimates from a segmentation that is, potentially, inherently uncertain or ambiguous. This is followed by a theoretical analysis and an experimental validation linking the inherent uncertainty to common loss functions for training CNNs, namely cross-entropy and soft Dice. We find that, even though soft Dice optimization leads to an improved performance with respect to the Dice score and other measures, it may introduce a volume bias for tasks with high inherent uncertainty. These findings indicate some of the method's clinical limitations and suggest doing a closer ad-hoc volume analysis with an optional re-calibration step.

Submitted 8 November, 2022; originally announced November 2022.

Comments: 18 pages, 7 figures, 3 tables, published in Elsevier Medical Image Analysis (2021)

Journal ref: Elsevier Medical Image Analysis, Volume 61, January 2021, 101833

arXiv:2207.09521

doi 10.1007/978-3-031-16443-9_51

The Dice loss in the context of missing or empty labels: Introducing $Φ$ and $ε$

Authors: Sofie Tilborghs, Jeroen Bertels, David Robben, Dirk Vandermeulen, Frederik Maes

Abstract: Albeit the Dice loss is one of the dominant loss functions in medical image segmentation, most research omits a closer look at its derivative, i.e. the real motor of the optimization when using gradient descent. In this paper, we highlight the peculiar action of the Dice loss in the presence of missing or empty labels. First, we formulate a theoretical basis that gives a general description of the Dice loss and its derivative. It turns out that the choice of the reduction dimensions $Φ$ and the smoothing term $ε$ is non-trivial and greatly influences its behavior. We find and propose heuristic combinations of $Φ$ and $ε$ that work in a segmentation setting with either missing or empty labels. Second, we empirically validate these findings in a binary and multiclass segmentation setting using two publicly available datasets. We confirm that the choice of $Φ$ and $ε$ is indeed pivotal. With $Φ$ chosen such that the reductions happen over a single batch (and class) element and with a negligible $ε$, the Dice loss deals with missing labels naturally and performs similarly compared to recent adaptations specific for missing labels. With $Φ$ chosen such that the reductions happen over multiple batch elements or with a heuristic value for $ε$, the Dice loss handles empty labels correctly. We believe that this work highlights some essential perspectives and hope that it encourages researchers to better describe their exact implementation of the Dice loss in future work.

Submitted 9 November, 2022; v1 submitted 19 July, 2022; originally announced July 2022.

Comments: 8 pages, 3 figures, 1 table, International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) 2022

Journal ref: Medical Image Computing and Computer Assisted Intervention (MICCAI) 2022. Lecture Notes in Computer Science, vol 13435. Springer, Cham
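
To make the role of the reduction dimensions $Φ$ and the smoothing term $ε$ in the abstract above more concrete, here is a minimal sketch in plain Python. It assumes one common soft Dice formulation, 1 − (2·Σpy + ε)/(Σp + Σy + ε), which may differ from the paper's exact definition. The function name `soft_dice_loss` and the flag `reduce_over_batch` are hypothetical stand-ins for two choices of $Φ$, and the toy data are invented for illustration.

```python
def soft_dice_loss(preds, labels, eps=1e-7, reduce_over_batch=False):
    """Soft Dice loss for a batch of flattened binary segmentations.

    preds, labels: lists of equal-length lists, one inner list per image.
    reduce_over_batch=False: sums run over each image separately and the
        per-image losses are averaged (one hypothetical choice of Phi).
    reduce_over_batch=True: sums run over all images in the batch at once.
    eps must be > 0 here, otherwise an empty image divides by zero.
    """
    def dice_term(p, y):
        intersection = sum(pi * yi for pi, yi in zip(p, y))
        total = sum(p) + sum(y)
        return 1.0 - (2.0 * intersection + eps) / (total + eps)

    if reduce_over_batch:
        flat_p = [v for image in preds for v in image]
        flat_y = [v for image in labels for v in image]
        return dice_term(flat_p, flat_y)
    losses = [dice_term(p, y) for p, y in zip(preds, labels)]
    return sum(losses) / len(losses)


# Toy batch: one image with foreground, one image with an empty label.
labels = [[1, 1, 0, 0], [0, 0, 0, 0]]
perfect = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
false_pos = [[1.0, 1.0, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0]]

per_image = soft_dice_loss(false_pos, labels)
over_batch = soft_dice_loss(false_pos, labels, reduce_over_batch=True)
print(per_image, over_batch)
```

In this sketch, the per-image reduction with a negligible ε assigns the empty-label image a loss close to 1 for any non-negligible false-positive mass, while the batch-wise reduction folds those false positives into a single shared Dice term. This only illustrates that the two settings behave differently; it does not reproduce the paper's analysis.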

arXiv:2112.12560

On the relationship between calibrated predictors and unbiased volume estimation

Authors: Teodora Popordanoska, Jeroen Bertels, Dirk Vandermeulen, Frederik Maes, Matthew B. Blaschko

Abstract: Machine learning driven medical image segmentation has become standard in medical image analysis. However, deep learning models are prone to overconfident predictions. This has led to a renewed focus on calibrated predictions in the medical imaging and broader machine learning communities. Calibrated predictions are estimates of the probability of a label that correspond to the true expected value of the label conditioned on the confidence. Such calibrated predictions have utility in a range of medical imaging applications, including surgical planning under uncertainty and active learning systems. At the same time it is often an accurate volume measurement that is of real importance for many medical applications. This work investigates the relationship between model calibration and volume estimation. We demonstrate both mathematically and empirically that if the predictor is calibrated per image, we can obtain the correct volume by taking an expectation of the probability scores per pixel/voxel of the image. Furthermore, we show that convex combinations of calibrated classifiers preserve volume estimation, but do not preserve calibration. Therefore, we conclude that having a calibrated predictor is a sufficient, but not necessary condition for obtaining an unbiased estimate of the volume. We validate our theoretical findings empirically on a collection of 18 different (calibrated) training strategies on the tasks of glioma volume estimation on BraTS 2018, and ischemic stroke lesion volume estimation on ISLES 2018 datasets.

Submitted 23 December, 2021; originally announced December 2021.

Comments: Published at MICCAI 2021
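
The volume estimate mentioned in the abstract above can be illustrated with a short sketch that contrasts summing per-voxel foreground probabilities with counting voxels after thresholding. The probability values, the voxel size, and the function names `expected_volume` and `thresholded_volume` are hypothetical. Whether the summed estimate is unbiased depends on the calibration condition stated in the abstract, which this toy example does not check.

```python
def expected_volume(probabilities, voxel_volume_ml):
    """Volume as the sum of per-voxel foreground probabilities."""
    return sum(probabilities) * voxel_volume_ml


def thresholded_volume(probabilities, voxel_volume_ml, threshold=0.5):
    """Volume from a hard segmentation obtained by thresholding."""
    count = sum(1 for p in probabilities if p >= threshold)
    return count * voxel_volume_ml


# Hypothetical per-voxel foreground probabilities for a tiny image.
probs = [0.9, 0.8, 0.6, 0.4, 0.4, 0.4]
voxel_ml = 0.001  # assumption: 1 mm^3 voxels, i.e. 0.001 ml each

soft = expected_volume(probs, voxel_ml)
hard = thresholded_volume(probs, voxel_ml)
print(soft, hard)
```

Here the three voxels at 0.4 contribute to the summed estimate but vanish from the thresholded count, so the two volumes differ.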

arXiv:2109.13630

Unsupervised Diffeomorphic Surface Registration and Non-Linear Modelling

Authors: Balder Croquet, Daan Christiaens, Seth M. Weinberg, Michael Bronstein, Dirk Vandermeulen, Peter Claes

Abstract: Registration is an essential tool in image analysis. Deep learning based alternatives have recently become popular, achieving competitive performance at a faster speed. However, many contemporary techniques are limited to volumetric representations, despite increased popularity of 3D surface and shape data in medical image analysis. We propose a one-step registration model for 3D surfaces that internalises a lower dimensional probabilistic deformation model (PDM) using conditional variational autoencoders (CVAE). The deformations are constrained to be diffeomorphic using an exponentiation layer. The one-step registration model is benchmarked against iterative techniques, trading in a slightly lower performance in terms of shape fit for a higher compactness. We experiment with two distance metrics, Chamfer distance (CD) and Sinkhorn divergence (SD), as specific distance functions for surface data in real-world registration scenarios. The internalised deformation model is benchmarked against linear principal component analysis (PCA) achieving competitive results and improved generalisability from lower dimensions.

Submitted 28 September, 2021; originally announced September 2021.

Journal ref: International Conference on Medical Image Computing and Computer-Assisted Intervention (2021) 118-128

arXiv:2011.11719

Explainable-by-design Semi-Supervised Representation Learning for COVID-19 Diagnosis from CT Imaging

Authors: Abel Díaz Berenguer, Hichem Sahli, Boris Joukovsky, Maryna Kvasnytsia, Ine Dirks, Mitchel Alioscha-Perez, Nikos Deligiannis, Panagiotis Gonidakis, Sebastián Amador Sánchez, Redona Brahimetaj, Evgenia Papavasileiou, Jonathan Cheung-Wai Chana, Fei Li, Shangzhen Song, Yixin Yang, Sofie Tilborghs, Siri Willems, Tom Eelbode, Jeroen Bertels, Dirk Vandermeulen, Frederik Maes, Paul Suetens, Lucas Fidon, Tom Vercauteren, David Robben, et al. (15 additional authors not shown)

Abstract: Our motivating application is a real-world problem: COVID-19 classification from CT imaging, for which we present an explainable Deep Learning approach based on a semi-supervised classification pipeline that employs variational autoencoders to extract efficient feature embedding. We have optimized the architecture of two different networks for CT images: (i) a novel conditional variational autoencoder (CVAE) with a specific architecture that integrates the class labels inside the encoder layers and uses side information with shared attention layers for the encoder, which make the most of the contextual clues for representation learning, and (ii) a downstream convolutional neural network for supervised classification using the encoder structure of the CVAE. With the explainable classification results, the proposed diagnosis system is very effective for COVID-19 classification. Based on the promising results obtained qualitatively and quantitatively, we envisage a wide deployment of our developed technique in large-scale clinical studies. Code is available at https://git.etrovub.be/AVSP/ct-based-covid-19-diagnostic-tool.git.

Submitted 2 September, 2021; v1 submitted 23 November, 2020; originally announced November 2020.

arXiv:2010.13499

doi 10.1109/TMI.2020.3002417

Optimization for Medical Image Segmentation: Theory and Practice when evaluating with Dice Score or Jaccard Index

Authors: Tom Eelbode, Jeroen Bertels, Maxim Berman, Dirk Vandermeulen, Frederik Maes, Raf Bisschops, Matthew B. Blaschko

Abstract: In many medical imaging and classical computer vision tasks, the Dice score and Jaccard index are used to evaluate the segmentation performance. Despite the existence and great empirical success of metric-sensitive losses, i.e. relaxations of these metrics such as soft Dice, soft Jaccard and Lovasz-Softmax, many researchers still use per-pixel losses, such as (weighted) cross-entropy to train CNNs for segmentation. Therefore, the target metric is in many cases not directly optimized. We investigate from a theoretical perspective, the relation within the group of metric-sensitive loss functions and question the existence of an optimal weighting scheme for weighted cross-entropy to optimize the Dice score and Jaccard index at test time. We find that the Dice score and Jaccard index approximate each other relatively and absolutely, but we find no such approximation for a weighted Hamming similarity. For the Tversky loss, the approximation gets monotonically worse when deviating from the trivial weight setting where soft Tversky equals soft Dice. We verify these results empirically in an extensive validation on six medical segmentation tasks and can confirm that metric-sensitive losses are superior to cross-entropy based loss functions in case of evaluation with Dice Score or Jaccard Index. This further holds in a multi-class setting, and across different object sizes and foreground/background ratios. These results encourage a wider adoption of metric-sensitive loss functions for medical segmentation tasks where the performance measure of interest is the Dice score or Jaccard index.

Submitted 26 October, 2020; originally announced October 2020.

Comments: 15 pages, 14 figures, accepted for publication in IEEE Transactions on Medical Imaging (2020)
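
As background to the close relation between the two metrics discussed in the abstract above, the sketch below computes the Dice score D and the Jaccard index J for two binary masks and checks the standard set-based identity J = D / (2 − D). The masks and the function names `dice_score` and `jaccard_index` are hypothetical, and the code assumes at least one mask is non-empty. It illustrates only the hard (binary) metrics, not the paper's approximation bounds for the soft surrogates.

```python
def dice_score(a, b):
    """Dice score 2|A∩B| / (|A| + |B|) for binary (0/1) masks."""
    intersection = sum(x * y for x, y in zip(a, b))
    return 2.0 * intersection / (sum(a) + sum(b))


def jaccard_index(a, b):
    """Jaccard index |A∩B| / |A∪B| for binary (0/1) masks."""
    intersection = sum(x * y for x, y in zip(a, b))
    union = sum(a) + sum(b) - intersection
    return intersection / union


# Hypothetical flattened prediction and ground-truth masks.
prediction = [1, 1, 1, 0, 0, 0]
truth = [0, 1, 1, 1, 0, 0]

d = dice_score(prediction, truth)
j = jaccard_index(prediction, truth)
print(d, j, d / (2.0 - d))
```

Because J is a monotonically increasing function of D on a single pair of masks, ranking individual segmentations by one metric gives the same order as ranking by the other.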

arXiv:2007.15546

Comparative study of deep learning methods for the automatic segmentation of lung, lesion and lesion type in CT scans of COVID-19 patients

Authors: Sofie Tilborghs, Ine Dirks, Lucas Fidon, Siri Willems, Tom Eelbode, Jeroen Bertels, Bart Ilsen, Arne Brys, Adriana Dubbeldam, Nico Buls, Panagiotis Gonidakis, Sebastián Amador Sánchez, Annemiek Snoeckx, Paul M. Parizel, Johan de Mey, Dirk Vandermeulen, Tom Vercauteren, David Robben, Dirk Smeets, Frederik Maes, Jef Vandemeulebroucke, Paul Suetens

Abstract: Recent research on COVID-19 suggests that CT imaging provides useful information to assess disease progression and assist diagnosis, in addition to help understanding the disease. There is an increasing number of studies that propose to use deep learning to provide fast and accurate quantification of COVID-19 using chest CT scans. The main tasks of interest are the automatic segmentation of lung and lung lesions in chest CT scans of confirmed or suspected COVID-19 patients. In this study, we compare twelve deep learning algorithms using a multi-center dataset, including both open-source and in-house developed algorithms. Results show that ensembling different methods can boost the overall test set performance for lung segmentation, binary lesion segmentation and multiclass lesion segmentation, resulting in mean Dice scores of 0.982, 0.724 and 0.469, respectively. The resulting binary lesions were segmented with a mean absolute volume error of 91.3 ml. In general, the task of distinguishing different lesion types was more difficult, with a mean absolute volume difference of 152 ml and mean Dice scores of 0.369 and 0.523 for consolidation and ground glass opacity, respectively. All methods perform binary lesion segmentation with an average volume error that is better than visual assessment by human raters, suggesting these methods are mature enough for a large-scale evaluation for use in clinical practice.

Submitted 10 January, 2022; v1 submitted 29 July, 2020; originally announced July 2020.

Comments: Updated acknowledgments

arXiv:1911.02278

doi 10.1007/978-3-030-46640-4_9

Optimization with soft Dice can lead to a volumetric bias

Authors: Jeroen Bertels, David Robben, Dirk Vandermeulen, Paul Suetens

Abstract: Segmentation is a fundamental task in medical image analysis. The clinical interest is often to measure the volume of a structure. To evaluate and compare segmentation methods, the similarity between a segmentation and a predefined ground truth is measured using metrics such as the Dice score. Recent segmentation methods based on convolutional neural networks use a differentiable surrogate of the Dice score, such as soft Dice, explicitly as the loss function during the learning phase. Even though this approach leads to improved Dice scores, we find that, both theoretically and empirically on four medical tasks, it can introduce a volumetric bias for tasks with high inherent uncertainty. As such, this may limit the method's clinical applicability.

Submitted 6 November, 2019; originally announced November 2019.

Comments: BrainLes Workshop - MICCAI 2019

Journal ref: LNCS 11992, Springer Nature Switzerland AG 2019

arXiv:1911.01816

Detection of vertebral fractures in CT using 3D Convolutional Neural Networks

Authors: Joeri Nicolaes, Steven Raeymaeckers, David Robben, Guido Wilms, Dirk Vandermeulen, Cesar Libanati, Marc Debois

Abstract: Osteoporosis induced fractures occur worldwide about every 3 seconds. Vertebral compression fractures are early signs of the disease and considered risk predictors for secondary osteoporotic fractures. We present a detection method to opportunistically screen spine-containing CT images for the presence of these vertebral fractures. Inspired by radiology practice, existing methods are based on 2D and 2.5D features but we present, to the best of our knowledge, the first method for detecting vertebral fractures in CT using automatically learned 3D feature maps. The presented method explicitly localizes these fractures allowing radiologists to interpret its results. We train a voxel-classification 3D Convolutional Neural Network (CNN) with a training database of 90 cases that has been semi-automatically generated using radiologist readings that are readily available in clinical practice. Our 3D method produces an Area Under the Curve (AUC) of 95% for patient-level fracture detection and an AUC of 93% for vertebra-level fracture detection in a five-fold cross-validation experiment.

Submitted 5 November, 2019; originally announced November 2019.

Comments: 13 pages, 7 figures, pre-print MICCAI CSI

arXiv:1911.01685

doi 10.1007/978-3-030-32245-8_11

Optimizing the Dice Score and Jaccard Index for Medical Image Segmentation: Theory & Practice

Authors: Jeroen Bertels, Tom Eelbode, Maxim Berman, Dirk Vandermeulen, Frederik Maes, Raf Bisschops, Matthew Blaschko

Abstract: The Dice score and Jaccard index are commonly used metrics for the evaluation of segmentation tasks in medical imaging. Convolutional neural networks trained for image segmentation tasks are usually optimized for (weighted) cross-entropy. This introduces an adverse discrepancy between the learning optimization objective (the loss) and the end target metric. Recent works in computer vision have proposed soft surrogates to alleviate this discrepancy and directly optimize the desired metric, either through relaxations (soft-Dice, soft-Jaccard) or submodular optimization (Lovász-softmax). The aim of this study is two-fold. First, we investigate the theoretical differences in a risk minimization framework and question the existence of a weighted cross-entropy loss with weights theoretically optimized to surrogate Dice or Jaccard. Second, we empirically investigate the behavior of the aforementioned loss functions w.r.t. evaluation with Dice score and Jaccard index on five medical segmentation tasks. Through the application of relative approximation bounds, we show that all surrogates are equivalent up to a multiplicative factor, and that no optimal weighting of cross-entropy exists to approximate Dice or Jaccard measures. We validate these findings empirically and show that, while it is important to opt for one of the target metric surrogates rather than a cross-entropy-based loss, the choice of the surrogate does not make a statistical difference on a wide range of medical segmentation tasks.

Submitted 5 November, 2019; originally announced November 2019.

Comments: MICCAI 2019

Journal ref: LNCS 11765, Springer Nature Switzerland AG 2019

arXiv:1102.4258

SHREC 2011: robust feature detection and description benchmark

Authors: E. Boyer, A. M. Bronstein, M. M. Bronstein, B. Bustos, T. Darom, R. Horaud, I. Hotz, Y. Keller, J. Keustermans, A. Kovnatsky, R. Litman, J. Reininghaus, I. Sipiran, D. Smeets, P. Suetens, D. Vandermeulen, A. Zaharescu, V. Zobel

Abstract: Feature-based approaches have recently become very popular in computer vision and image analysis applications, and are becoming a promising direction in shape retrieval. SHREC'11 robust feature detection and description benchmark simulates the feature detection and description stages of feature-based shape retrieval algorithms. The benchmark tests the performance of shape feature detectors and descriptors under a wide variety of transformations. The benchmark allows evaluating how algorithms cope with certain classes of transformations and strength of the transformations that can be dealt with. The present paper is a report of the SHREC'11 robust feature detection and description benchmark results.

Submitted 21 February, 2011; originally announced February 2011.

Comments: This is a full version of the SHREC'11 report published in 3DOR

Showing 1–14 of 14 results for author: Vandermeulen, D