Search | arXiv e-print repository

Automatic 3D bi-ventricular segmentation of cardiac images by a shape-refined multi-task deep learning approach

Authors: **ming Duan, Ghalib Bello, Jo Schlemper, Wenjia Bai, Timothy J W Dawes, Carlo Biffi, Antonio de Marvao, Georgia Doumou, Declan P O'Regan, Daniel Rueckert

Abstract: Deep learning approaches have achieved state-of-the-art performance in cardiac magnetic resonance (CMR) image segmentation. However, most approaches have focused on learning image intensity features for segmentation, whereas the incorporation of anatomical shape priors has received less attention. In this paper, we combine a multi-task deep learning approach with atlas propagation to develop a sha… ▽ More Deep learning approaches have achieved state-of-the-art performance in cardiac magnetic resonance (CMR) image segmentation. However, most approaches have focused on learning image intensity features for segmentation, whereas the incorporation of anatomical shape priors has received less attention. In this paper, we combine a multi-task deep learning approach with atlas propagation to develop a shape-constrained bi-ventricular segmentation pipeline for short-axis CMR volumetric images. The pipeline first employs a fully convolutional network (FCN) that learns segmentation and landmark localisation tasks simultaneously. The architecture of the proposed FCN uses a 2.5D representation, thus combining the computational advantage of 2D FCNs networks and the capability of addressing 3D spatial consistency without compromising segmentation accuracy. Moreover, the refinement step is designed to explicitly enforce a shape constraint and improve segmentation quality. This step is effective for overcoming image artefacts (e.g. due to different breath-hold positions and large slice thickness), which preclude the creation of anatomically meaningful 3D cardiac shapes. The proposed pipeline is fully automated, due to network's ability to infer landmarks, which are then used downstream in the pipeline to initialise atlas propagation. We validate the pipeline on 1831 healthy subjects and 649 subjects with pulmonary hypertension. Extensive numerical experiments on the two datasets demonstrate that our proposed method is robust and capable of producing accurate, high-resolution and anatomically smooth bi-ventricular 3D models, despite the artefacts in input CMR volumes. △ Less

Submitted 13 July, 2019; v1 submitted 26 August, 2018; originally announced August 2018.

arXiv:1808.08114 [pdf, other]

Attention Gated Networks: Learning to Leverage Salient Regions in Medical Images

Authors: Jo Schlemper, Ozan Oktay, Michiel Schaap, Mattias Heinrich, Bernhard Kainz, Ben Glocker, Daniel Rueckert

Abstract: We propose a novel attention gate (AG) model for medical image analysis that automatically learns to focus on target structures of varying shapes and sizes. Models trained with AGs implicitly learn to suppress irrelevant regions in an input image while highlighting salient features useful for a specific task. This enables us to eliminate the necessity of using explicit external tissue/organ locali… ▽ More We propose a novel attention gate (AG) model for medical image analysis that automatically learns to focus on target structures of varying shapes and sizes. Models trained with AGs implicitly learn to suppress irrelevant regions in an input image while highlighting salient features useful for a specific task. This enables us to eliminate the necessity of using explicit external tissue/organ localisation modules when using convolutional neural networks (CNNs). AGs can be easily integrated into standard CNN models such as VGG or U-Net architectures with minimal computational overhead while increasing the model sensitivity and prediction accuracy. The proposed AG models are evaluated on a variety of tasks, including medical image classification and segmentation. For classification, we demonstrate the use case of AGs in scan plane detection for fetal ultrasound screening. We show that the proposed attention mechanism can provide efficient object localisation while improving the overall prediction performance by reducing false positives. For segmentation, the proposed architecture is evaluated on two large 3D CT abdominal datasets with manual annotations for multiple organs. Experimental results show that AG models consistently improve the prediction performance of the base architectures across different datasets and training sizes while preserving computational efficiency. Moreover, AGs guide the model activations to be focused around salient regions, which provides better insights into how model predictions are made. The source code for the proposed AG models is publicly available. △ Less

Submitted 19 January, 2019; v1 submitted 22 August, 2018; originally announced August 2018.

Comments: Accepted for Medical Image Analysis (Special Issue on Medical Imaging with Deep Learning). arXiv admin note: substantial text overlap with arXiv:1804.03999, arXiv:1804.05338

arXiv:1808.05130 [pdf, other]

Deep Learning using K-space Based Data Augmentation for Automated Cardiac MR Motion Artefact Detection

Authors: Ilkay Oksuz, Bram Ruijsink, Esther Puyol-Anton, Aurelien Bustin, Gastao Cruz, Claudia Prieto, Daniel Rueckert, Julia A. Schnabel, Andrew P. King

Abstract: Quality assessment of medical images is essential for complete automation of image processing pipelines. For large population studies such as the UK Biobank, artefacts such as those caused by heart motion are problematic and manual identification is tedious and time-consuming. Therefore, there is an urgent need for automatic image quality assessment techniques. In this paper, we propose a method t… ▽ More Quality assessment of medical images is essential for complete automation of image processing pipelines. For large population studies such as the UK Biobank, artefacts such as those caused by heart motion are problematic and manual identification is tedious and time-consuming. Therefore, there is an urgent need for automatic image quality assessment techniques. In this paper, we propose a method to automatically detect the presence of motion-related artefacts in cardiac magnetic resonance (CMR) images. As this is a highly imbalanced classification problem (due to the high number of good quality images compared to the low number of images with motion artefacts), we propose a novel k-space based training data augmentation approach in order to address this problem. Our method is based on 3D spatio-temporal Convolutional Neural Networks, and is able to detect 2D+time short axis images with motion artefacts in less than 1ms. We test our algorithm on a subset of the UK Biobank dataset consisting of 3465 CMR images and achieve not only high accuracy in detection of motion artefacts, but also high precision and recall. We compare our approach to a range of state-of-the-art quality assessment methods. △ Less

Submitted 31 August, 2018; v1 submitted 15 August, 2018; originally announced August 2018.

Comments: Accepted for MICCAI2018 Conference

arXiv:1808.00273 [pdf, other]

Recurrent neural networks for aortic image sequence segmentation with sparse annotations

Authors: Wenjia Bai, Hideaki Suzuki, Chen Qin, Giacomo Tarroni, Ozan Oktay, Paul M. Matthews, Daniel Rueckert

Abstract: Segmentation of image sequences is an important task in medical image analysis, which enables clinicians to assess the anatomy and function of moving organs. However, direct application of a segmentation algorithm to each time frame of a sequence may ignore the temporal continuity inherent in the sequence. In this work, we propose an image sequence segmentation algorithm by combining a fully convo… ▽ More Segmentation of image sequences is an important task in medical image analysis, which enables clinicians to assess the anatomy and function of moving organs. However, direct application of a segmentation algorithm to each time frame of a sequence may ignore the temporal continuity inherent in the sequence. In this work, we propose an image sequence segmentation algorithm by combining a fully convolutional network with a recurrent neural network, which incorporates both spatial and temporal information into the segmentation task. A key challenge in training this network is that the available manual annotations are temporally sparse, which forbids end-to-end training. We address this challenge by performing non-rigid label propagation on the annotations and introducing an exponentially weighted loss function for training. Experiments on aortic MR image sequences demonstrate that the proposed method significantly improves both accuracy and temporal smoothness of segmentation, compared to a baseline method that utilises spatial information only. It achieves an average Dice metric of 0.960 for the ascending aorta and 0.953 for the descending aorta. △ Less

Submitted 1 August, 2018; originally announced August 2018.

Comments: Accepted for publication by MICCAI 2018

arXiv:1807.11368 [pdf, other]

Small Organ Segmentation in Whole-body MRI using a Two-stage FCN and Weighting Schemes

Authors: Vanya V. Valindria, Ioannis Lavdas, Juan Cerrolaza, Eric O. Aboagye, Andrea G. Rockall, Daniel Rueckert, Ben Glocker

Abstract: Accurate and robust segmentation of small organs in whole-body MRI is difficult due to anatomical variation and class imbalance. Recent deep network based approaches have demonstrated promising performance on abdominal multi-organ segmentations. However, the performance on small organs is still suboptimal as these occupy only small regions of the whole-body volumes with unclear boundaries and vari… ▽ More Accurate and robust segmentation of small organs in whole-body MRI is difficult due to anatomical variation and class imbalance. Recent deep network based approaches have demonstrated promising performance on abdominal multi-organ segmentations. However, the performance on small organs is still suboptimal as these occupy only small regions of the whole-body volumes with unclear boundaries and variable shapes. A coarse-to-fine, hierarchical strategy is a common approach to alleviate this problem, however, this might miss useful contextual information. We propose a two-stage approach with weighting schemes based on auto-context and spatial atlas priors. Our experiments show that the proposed approach can boost the segmentation accuracy of multiple small organs in whole-body MRI scans. △ Less

Submitted 30 July, 2018; originally announced July 2018.

Comments: Accepted at the MICCAI Workshop on Machine Learning in Medical Imaging (MLMI) 2018

arXiv:1807.10760 [pdf, other]

Deep nested level sets: Fully automated segmentation of cardiac MR images in patients with pulmonary hypertension

Authors: **ming Duan, Jo Schlemper, Wenjia Bai, Timothy J W Dawes, Ghalib Bello, Georgia Doumou, Antonio De Marvao, Declan P O'Regan, Daniel Rueckert

Abstract: In this paper we introduce a novel and accurate optimisation method for segmentation of cardiac MR (CMR) images in patients with pulmonary hypertension (PH). The proposed method explicitly takes into account the image features learned from a deep neural network. To this end, we estimate simultaneous probability maps over region and edge locations in CMR images using a fully convolutional network.… ▽ More In this paper we introduce a novel and accurate optimisation method for segmentation of cardiac MR (CMR) images in patients with pulmonary hypertension (PH). The proposed method explicitly takes into account the image features learned from a deep neural network. To this end, we estimate simultaneous probability maps over region and edge locations in CMR images using a fully convolutional network. Due to the distinct morphology of the heart in patients with PH, these probability maps can then be incorporated in a single nested level set optimisation framework to achieve multi-region segmentation with high efficiency. The proposed method uses an automatic way for level set initialisation and thus the whole optimisation is fully automated. We demonstrate that the proposed deep nested level set (DNLS) method outperforms existing state-of-the-art methods for CMR segmentation in PH patients. △ Less

Submitted 27 July, 2018; originally announced July 2018.

arXiv:1807.10583 [pdf, other]

EchoFusion: Tracking and Reconstruction of Objects in 4D Freehand Ultrasound Imaging without External Trackers

Authors: Bishesh Khanal, Alberto Gomez, Nicolas Toussaint, Steven McDonagh, Veronika Zimmer, Emily Skelton, Jacqueline Matthew, Daniel Grzech, Robert Wright, Chandni Gupta, Benjamin Hou, Daniel Rueckert, Julia A. Schnabel, Bernhard Kainz

Abstract: Ultrasound (US) is the most widely used fetal imaging technique. However, US images have limited capture range, and suffer from view dependent artefacts such as acoustic shadows. Compounding of overlap** 3D US acquisitions into a high-resolution volume can extend the field of view and remove image artefacts, which is useful for retrospective analysis including population based studies. However,… ▽ More Ultrasound (US) is the most widely used fetal imaging technique. However, US images have limited capture range, and suffer from view dependent artefacts such as acoustic shadows. Compounding of overlap** 3D US acquisitions into a high-resolution volume can extend the field of view and remove image artefacts, which is useful for retrospective analysis including population based studies. However, such volume reconstructions require information about relative transformations between probe positions from which the individual volumes were acquired. In prenatal US scans, the fetus can move independently from the mother, making external trackers such as electromagnetic or optical tracking unable to track the motion between probe position and the moving fetus. We provide a novel methodology for image-based tracking and volume reconstruction by combining recent advances in deep learning and simultaneous localisation and map** (SLAM). Tracking semantics are established through the use of a Residual 3D U-Net and the output is fed to the SLAM algorithm. As a proof of concept, experiments are conducted on US volumes taken from a whole body fetal phantom, and from the heads of real fetuses. For the fetal head segmentation, we also introduce a novel weak annotation approach to minimise the required manual effort for ground truth annotation. We evaluate our method qualitatively, and quantitatively with respect to tissue discrimination accuracy and tracking robustness. △ Less

Submitted 19 July, 2018; originally announced July 2018.

Comments: MICCAI Workshop on Perinatal, Preterm and Paediatric Image analysis (PIPPI), 2018

arXiv:1807.06843 [pdf, other]

Learning Interpretable Anatomical Features Through Deep Generative Models: Application to Cardiac Remodeling

Authors: Carlo Biffi, Ozan Oktay, Giacomo Tarroni, Wenjia Bai, Antonio De Marvao, Georgia Doumou, Martin Rajchl, Reem Bedair, Sanjay Prasad, Stuart Cook, Declan O'Regan, Daniel Rueckert

Abstract: Alterations in the geometry and function of the heart define well-established causes of cardiovascular disease. However, current approaches to the diagnosis of cardiovascular diseases often rely on subjective human assessment as well as manual analysis of medical images. Both factors limit the sensitivity in quantifying complex structural and functional phenotypes. Deep learning approaches have re… ▽ More Alterations in the geometry and function of the heart define well-established causes of cardiovascular disease. However, current approaches to the diagnosis of cardiovascular diseases often rely on subjective human assessment as well as manual analysis of medical images. Both factors limit the sensitivity in quantifying complex structural and functional phenotypes. Deep learning approaches have recently achieved success for tasks such as classification or segmentation of medical images, but lack interpretability in the feature extraction and decision processes, limiting their value in clinical diagnosis. In this work, we propose a 3D convolutional generative model for automatic classification of images from patients with cardiac diseases associated with structural remodeling. The model leverages interpretable task-specific anatomic patterns learned from 3D segmentations. It further allows to visualise and quantify the learned pathology-specific remodeling patterns in the original input space of the images. This approach yields high accuracy in the categorization of healthy and hypertrophic cardiomyopathy subjects when tested on unseen MR images from our own multi-centre dataset (100%) as well on the ACDC MICCAI 2017 dataset (90%). We believe that the proposed deep learning approach is a promising step towards the development of interpretable classifiers for the medical imaging domain, which may help clinicians to improve diagnostic accuracy and enhance patient risk-stratification. △ Less

Submitted 18 July, 2018; originally announced July 2018.

Comments: Accepted at MICCAI 2018

arXiv:1806.11216 [pdf, other]

Adversarial and Perceptual Refinement for Compressed Sensing MRI Reconstruction

Authors: Maximilian Seitzer, Guang Yang, Jo Schlemper, Ozan Oktay, Tobias Würfl, Vincent Christlein, Tom Wong, Raad Mohiaddin, David Firmin, Jennifer Keegan, Daniel Rueckert, Andreas Maier

Abstract: Deep learning approaches have shown promising performance for compressed sensing-based Magnetic Resonance Imaging. While deep neural networks trained with mean squared error (MSE) loss functions can achieve high peak signal to noise ratio, the reconstructed images are often blurry and lack sharp details, especially for higher undersampling rates. Recently, adversarial and perceptual loss functions… ▽ More Deep learning approaches have shown promising performance for compressed sensing-based Magnetic Resonance Imaging. While deep neural networks trained with mean squared error (MSE) loss functions can achieve high peak signal to noise ratio, the reconstructed images are often blurry and lack sharp details, especially for higher undersampling rates. Recently, adversarial and perceptual loss functions have been shown to achieve more visually appealing results. However, it remains an open question how to (1) optimally combine these loss functions with the MSE loss function and (2) evaluate such a perceptual enhancement. In this work, we propose a hybrid method, in which a visual refinement component is learnt on top of an MSE loss-based reconstruction network. In addition, we introduce a semantic interpretability score, measuring the visibility of the region of interest in both ground truth and reconstructed images, which allows us to objectively quantify the usefulness of the image quality for image post-processing and analysis. Applied on a large cardiac MRI dataset simulated with 8-fold undersampling, we demonstrate significant improvements ($p<0.01$) over the state-of-the-art in both a human observer study and the semantic interpretability score. △ Less

Submitted 28 June, 2018; originally announced June 2018.

Comments: To be published at MICCAI 2018

arXiv:1806.07486 [pdf, other]

doi 10.1007/978-3-030-00928-1_45

Standard Plane Detection in 3D Fetal Ultrasound Using an Iterative Transformation Network

Authors: Yuanwei Li, Bishesh Khanal, Benjamin Hou, Amir Alansary, Juan J. Cerrolaza, Matthew Sinclair, Jacqueline Matthew, Chandni Gupta, Caroline Knight, Bernhard Kainz, Daniel Rueckert

Abstract: Standard scan plane detection in fetal brain ultrasound (US) forms a crucial step in the assessment of fetal development. In clinical settings, this is done by manually manoeuvring a 2D probe to the desired scan plane. With the advent of 3D US, the entire fetal brain volume containing these standard planes can be easily acquired. However, manual standard plane identification in 3D volume is labour… ▽ More Standard scan plane detection in fetal brain ultrasound (US) forms a crucial step in the assessment of fetal development. In clinical settings, this is done by manually manoeuvring a 2D probe to the desired scan plane. With the advent of 3D US, the entire fetal brain volume containing these standard planes can be easily acquired. However, manual standard plane identification in 3D volume is labour-intensive and requires expert knowledge of fetal anatomy. We propose a new Iterative Transformation Network (ITN) for the automatic detection of standard planes in 3D volumes. ITN uses a convolutional neural network to learn the relationship between a 2D plane image and the transformation parameters required to move that plane towards the location/orientation of the standard plane in the 3D volume. During inference, the current plane image is passed iteratively to the network until it converges to the standard plane location. We explore the effect of using different transformation representations as regression outputs of ITN. Under a multi-task learning framework, we introduce additional classification probability outputs to the network to act as confidence measures for the regressed transformation parameters in order to further improve the localisation accuracy. When evaluated on 72 US volumes of fetal brain, our method achieves an error of 3.83mm/12.7 degrees and 3.80mm/12.6 degrees for the transventricular and transcerebellar planes respectively and takes 0.46s per plane. Source code is publicly available at https://github.com/yuanwei1989/plane-detection. △ Less

Submitted 6 October, 2018; v1 submitted 19 June, 2018; originally announced June 2018.

Comments: 8 pages, 2 figures, accepted for MICCAI 2018; Added link to source code

Journal ref: LNCS 11070 (2018) 392-400

arXiv:1806.06987 [pdf, other]

doi 10.1007/978-3-030-00928-1_64

Fast Multiple Landmark Localisation Using a Patch-based Iterative Network

Authors: Yuanwei Li, Amir Alansary, Juan J. Cerrolaza, Bishesh Khanal, Matthew Sinclair, Jacqueline Matthew, Chandni Gupta, Caroline Knight, Bernhard Kainz, Daniel Rueckert

Abstract: We propose a new Patch-based Iterative Network (PIN) for fast and accurate landmark localisation in 3D medical volumes. PIN utilises a Convolutional Neural Network (CNN) to learn the spatial relationship between an image patch and anatomical landmark positions. During inference, patches are repeatedly passed to the CNN until the estimated landmark position converges to the true landmark location.… ▽ More We propose a new Patch-based Iterative Network (PIN) for fast and accurate landmark localisation in 3D medical volumes. PIN utilises a Convolutional Neural Network (CNN) to learn the spatial relationship between an image patch and anatomical landmark positions. During inference, patches are repeatedly passed to the CNN until the estimated landmark position converges to the true landmark location. PIN is computationally efficient since the inference stage only selectively samples a small number of patches in an iterative fashion rather than a dense sampling at every location in the volume. Our approach adopts a multi-task learning framework that combines regression and classification to improve localisation accuracy. We extend PIN to localise multiple landmarks by using principal component analysis, which models the global anatomical relationships between landmarks. We have evaluated PIN using 72 3D ultrasound images from fetal screening examinations. PIN achieves quantitatively an average landmark localisation error of 5.59mm and a runtime of 0.44s to predict 10 landmarks per volume. Qualitatively, anatomical 2D standard scan planes derived from the predicted landmark locations are visually similar to the clinical ground truth. Source code is publicly available at https://github.com/yuanwei1989/landmark-detection. △ Less

Submitted 6 October, 2018; v1 submitted 18 June, 2018; originally announced June 2018.

Comments: 8 pages, 4 figures, Accepted for MICCAI 2018

Journal ref: LNCS 11070 (2018) 563-571

arXiv:1806.06244 [pdf, other]

Real-time Prediction of Segmentation Quality

Authors: Robert Robinson, Ozan Oktay, Wenjia Bai, Vanya Valindria, Mihir Sanghvi, Nay Aung, José Paiva, Filip Zemrak, Kenneth Fung, Elena Lukaschuk, Aaron Lee, Valentina Carapella, Young ** Kim, Bernhard Kainz, Stefan Piechnik, Stefan Neubauer, Steffen Petersen, Chris Page, Daniel Rueckert, Ben Glocker

Abstract: Recent advances in deep learning based image segmentation methods have enabled real-time performance with human-level accuracy. However, occasionally even the best method fails due to low image quality, artifacts or unexpected behaviour of black box algorithms. Being able to predict segmentation quality in the absence of ground truth is of paramount importance in clinical practice, but also in lar… ▽ More Recent advances in deep learning based image segmentation methods have enabled real-time performance with human-level accuracy. However, occasionally even the best method fails due to low image quality, artifacts or unexpected behaviour of black box algorithms. Being able to predict segmentation quality in the absence of ground truth is of paramount importance in clinical practice, but also in large-scale studies to avoid the inclusion of invalid data in subsequent analysis. In this work, we propose two approaches of real-time automated quality control for cardiovascular MR segmentations using deep learning. First, we train a neural network on 12,880 samples to predict Dice Similarity Coefficients (DSC) on a per-case basis. We report a mean average error (MAE) of 0.03 on 1,610 test samples and 97% binary classification accuracy for separating low and high quality segmentations. Secondly, in the scenario where no manually annotated data is available, we train a network to predict DSC scores from estimated quality obtained via a reverse testing strategy. We report an MAE=0.14 and 91% binary classification accuracy for this case. Predictions are obtained in real-time which, when combined with real-time segmentation methods, enables instant feedback on whether an acquired scan is analysable while the patient is still in the scanner. This further enables new applications of optimising image acquisition towards best possible analysis results. △ Less

Submitted 16 June, 2018; originally announced June 2018.

Comments: Accepted at MICCAI 2018

arXiv:1806.04224 [pdf, other]

NeuroNet: Fast and Robust Reproduction of Multiple Brain Image Segmentation Pipelines

Authors: Martin Rajchl, Nick Pawlowski, Daniel Rueckert, Paul M. Matthews, Ben Glocker

Abstract: NeuroNet is a deep convolutional neural network mimicking multiple popular and state-of-the-art brain segmentation tools including FSL, SPM, and MALPEM. The network is trained on 5,000 T1-weighted brain MRI scans from the UK Biobank Imaging Study that have been automatically segmented into brain tissue and cortical and sub-cortical structures using the standard neuroimaging pipelines. Training a s… ▽ More NeuroNet is a deep convolutional neural network mimicking multiple popular and state-of-the-art brain segmentation tools including FSL, SPM, and MALPEM. The network is trained on 5,000 T1-weighted brain MRI scans from the UK Biobank Imaging Study that have been automatically segmented into brain tissue and cortical and sub-cortical structures using the standard neuroimaging pipelines. Training a single model from these complementary and partially overlap** label maps yields a new powerful "all-in-one", multi-output segmentation tool. The processing time for a single subject is reduced by an order of magnitude compared to running each individual software package. We demonstrate very good reproducibility of the original outputs while increasing robustness to variations in the input data. We believe NeuroNet could be an important tool in large-scale population imaging studies and serve as a new standard in neuroscience by reducing the risk of introducing bias when choosing a specific software package. △ Less

Submitted 11 June, 2018; originally announced June 2018.

Comments: International conference on Medical Imaging with Deep Learning (MIDL) 2018

arXiv:1806.04066 [pdf, other]

Joint Learning of Motion Estimation and Segmentation for Cardiac MR Image Sequences

Authors: Chen Qin, Wenjia Bai, Jo Schlemper, Steffen E. Petersen, Stefan K. Piechnik, Stefan Neubauer, Daniel Rueckert

Abstract: Cardiac motion estimation and segmentation play important roles in quantitatively assessing cardiac function and diagnosing cardiovascular diseases. In this paper, we propose a novel deep learning method for joint estimation of motion and segmentation from cardiac MR image sequences. The proposed network consists of two branches: a cardiac motion estimation branch which is built on a novel unsuper… ▽ More Cardiac motion estimation and segmentation play important roles in quantitatively assessing cardiac function and diagnosing cardiovascular diseases. In this paper, we propose a novel deep learning method for joint estimation of motion and segmentation from cardiac MR image sequences. The proposed network consists of two branches: a cardiac motion estimation branch which is built on a novel unsupervised Siamese style recurrent spatial transformer network, and a cardiac segmentation branch that is based on a fully convolutional network. In particular, a joint multi-scale feature encoder is learned by optimizing the segmentation branch and the motion estimation branch simultaneously. This enables the weakly-supervised segmentation by taking advantage of features that are unsupervisedly learned in the motion estimation branch from a large amount of unannotated data. Experimental results using cardiac MRI images from 220 subjects show that the joint learning of both tasks is complementary and the proposed models outperform the competing methods significantly in terms of accuracy and speed. △ Less

Submitted 11 June, 2018; originally announced June 2018.

Comments: accepted by MICCAI 2018

arXiv:1806.03228 [pdf, other]

Automatic View Planning with Multi-scale Deep Reinforcement Learning Agents

Authors: Amir Alansary, Loic Le Folgoc, Ghislain Vaillant, Ozan Oktay, Yuanwei Li, Wenjia Bai, Jonathan Passerat-Palmbach, Ricardo Guerrero, Konstantinos Kamnitsas, Benjamin Hou, Steven McDonagh, Ben Glocker, Bernhard Kainz, Daniel Rueckert

Abstract: We propose a fully automatic method to find standardized view planes in 3D image acquisitions. Standard view images are important in clinical practice as they provide a means to perform biometric measurements from similar anatomical regions. These views are often constrained to the native orientation of a 3D image acquisition. Navigating through target anatomy to find the required view plane is te… ▽ More We propose a fully automatic method to find standardized view planes in 3D image acquisitions. Standard view images are important in clinical practice as they provide a means to perform biometric measurements from similar anatomical regions. These views are often constrained to the native orientation of a 3D image acquisition. Navigating through target anatomy to find the required view plane is tedious and operator-dependent. For this task, we employ a multi-scale reinforcement learning (RL) agent framework and extensively evaluate several Deep Q-Network (DQN) based strategies. RL enables a natural learning paradigm by interaction with the environment, which can be used to mimic experienced operators. We evaluate our results using the distance between the anatomical landmarks and detected planes, and the angles between their normal vector and target. The proposed algorithm is assessed on the mid-sagittal and anterior-posterior commissure planes of brain MRI, and the 4-chamber long-axis plane commonly used in cardiac MRI, achieving accuracy of 1.53mm, 1.98mm and 4.84mm, respectively. △ Less

Submitted 8 June, 2018; originally announced June 2018.

Comments: Accepted for MICCAI2018

arXiv:1806.03019 [pdf, ps, other]

doi 10.1007/978-3-319-67558-9_26

3D FCN Feature Driven Regression Forest-Based Pancreas Localization and Segmentation

Authors: Masahiro Oda, Natsuki Shimizu, Holger R. Roth, Ken'ichi Karasawa, Takayuki Kitasaka, Kazunari Misawa, Michitaka Fujiwara, Daniel Rueckert, Kensaku Mori

Abstract: This paper presents a fully automated atlas-based pancreas segmentation method from CT volumes utilizing 3D fully convolutional network (FCN) feature-based pancreas localization. Segmentation of the pancreas is difficult because it has larger inter-patient spatial variations than other organs. Previous pancreas segmentation methods failed to deal with such variations. We propose a fully automated… ▽ More This paper presents a fully automated atlas-based pancreas segmentation method from CT volumes utilizing 3D fully convolutional network (FCN) feature-based pancreas localization. Segmentation of the pancreas is difficult because it has larger inter-patient spatial variations than other organs. Previous pancreas segmentation methods failed to deal with such variations. We propose a fully automated pancreas segmentation method that contains novel localization and segmentation. Since the pancreas neighbors many other organs, its position and size are strongly related to the positions of the surrounding organs. We estimate the position and the size of the pancreas (localized) from global features by regression forests. As global features, we use intensity differences and 3D FCN deep learned features, which include automatically extracted essential features for segmentation. We chose 3D FCN features from a trained 3D U-Net, which is trained to perform multi-organ segmentation. The global features include both the pancreas and surrounding organ information. After localization, a patient-specific probabilistic atlas-based pancreas segmentation is performed. In evaluation results with 146 CT volumes, we achieved 60.6% of the Jaccard index and 73.9% of the Dice overlap. △ Less

Submitted 8 June, 2018; originally announced June 2018.

Comments: Presented in MICCAI 2017 workshop, DLMIA 2017 (Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support)

Report number: Published in LNCS Vol.10553

Journal ref: DLMIA 2017, ML-CDS 2017: Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, pp.222-230

arXiv:1806.02679 [pdf, other]

Semi-Supervised Learning via Compact Latent Space Clustering

Authors: Konstantinos Kamnitsas, Daniel C. Castro, Loic Le Folgoc, Ian Walker, Ryutaro Tanno, Daniel Rueckert, Ben Glocker, Antonio Criminisi, Aditya Nori

Abstract: We present a novel cost function for semi-supervised learning of neural networks that encourages compact clustering of the latent space to facilitate separation. The key idea is to dynamically create a graph over embeddings of labeled and unlabeled samples of a training batch to capture underlying structure in feature space, and use label propagation to estimate its high and low density regions. W… ▽ More We present a novel cost function for semi-supervised learning of neural networks that encourages compact clustering of the latent space to facilitate separation. The key idea is to dynamically create a graph over embeddings of labeled and unlabeled samples of a training batch to capture underlying structure in feature space, and use label propagation to estimate its high and low density regions. We then devise a cost function based on Markov chains on the graph that regularizes the latent space to form a single compact cluster per class, while avoiding to disturb existing clusters during optimization. We evaluate our approach on three benchmarks and compare to state-of-the art with promising results. Our approach combines the benefits of graph-based regularization with efficient, inductive inference, does not require modifications to a network architecture, and can thus be easily applied to existing networks to enable an effective use of unlabeled data. △ Less

Submitted 29 July, 2018; v1 submitted 7 June, 2018; originally announced June 2018.

Comments: Presented as a long oral in ICML 2018. Post-conference camera ready

arXiv:1806.01764 [pdf, other]

Graph Saliency Maps through Spectral Convolutional Networks: Application to Sex Classification with Brain Connectivity

Authors: Salim Arslan, Sofia Ira Ktena, Ben Glocker, Daniel Rueckert

Abstract: Graph convolutional networks (GCNs) allow to apply traditional convolution operations in non-Euclidean domains, where data are commonly modelled as irregular graphs. Medical imaging and, in particular, neuroscience studies often rely on such graph representations, with brain connectivity networks being a characteristic example, while ultimately seeking the locus of phenotypic or disease-related di… ▽ More Graph convolutional networks (GCNs) allow to apply traditional convolution operations in non-Euclidean domains, where data are commonly modelled as irregular graphs. Medical imaging and, in particular, neuroscience studies often rely on such graph representations, with brain connectivity networks being a characteristic example, while ultimately seeking the locus of phenotypic or disease-related differences in the brain. These regions of interest (ROIs) are, then, considered to be closely associated with function and/or behaviour. Driven by this, we explore GCNs for the task of ROI identification and propose a visual attribution method based on class activation map**. By undertaking a sex classification task as proof of concept, we show that this method can be used to identify salient nodes (brain regions) without prior node labels. Based on experiments conducted on neuroimaging data of more than 5000 participants from UK Biobank, we demonstrate the robustness of the proposed method in highlighting reproducible regions across individuals. We further evaluate the neurobiological relevance of the identified regions based on evidence from large-scale UK Biobank studies. △ Less

Submitted 5 June, 2018; originally announced June 2018.

arXiv:1806.01738 [pdf, ps, other]

doi 10.1016/j.media.2018.06.001

Disease Prediction using Graph Convolutional Networks: Application to Autism Spectrum Disorder and Alzheimer's Disease

Authors: Sarah Parisot, Sofia Ira Ktena, Enzo Ferrante, Matthew Lee, Ricardo Guerrero, Ben Glocker, Daniel Rueckert

Abstract: Graphs are widely used as a natural framework that captures interactions between individual elements represented as nodes in a graph. In medical applications, specifically, nodes can represent individuals within a potentially large population (patients or healthy controls) accompanied by a set of features, while the graph edges incorporate associations between subjects in an intuitive manner. This… ▽ More Graphs are widely used as a natural framework that captures interactions between individual elements represented as nodes in a graph. In medical applications, specifically, nodes can represent individuals within a potentially large population (patients or healthy controls) accompanied by a set of features, while the graph edges incorporate associations between subjects in an intuitive manner. This representation allows to incorporate the wealth of imaging and non-imaging information as well as individual subject features simultaneously in disease classification tasks. Previous graph-based approaches for supervised or unsupervised learning in the context of disease prediction solely focus on pairwise similarities between subjects, disregarding individual characteristics and features, or rather rely on subject-specific imaging feature vectors and fail to model interactions between them. In this paper, we present a thorough evaluation of a generic framework that leverages both imaging and non-imaging information and can be used for brain analysis in large populations. This framework exploits Graph Convolutional Networks (GCNs) and involves representing populations as a sparse graph, where its nodes are associated with imaging-based feature vectors, while phenotypic information is integrated as edge weights. The extensive evaluation explores the effect of each individual component of this framework on disease prediction performance and further compares it to different baselines. The framework performance is tested on two large datasets with diverse underlying data, ABIDE and ADNI, for the prediction of Autism Spectrum Disorder and conversion to Alzheimer's disease, respectively. Our analysis shows that our novel framework can improve over state-of-the-art results on both databases, with 70.4% classification accuracy for ABIDE and 80.0% for ADNI. △ Less

Submitted 5 June, 2018; originally announced June 2018.

Comments: in Press at Medical Image Analysis, MICCAI 2017 Special Issue

arXiv:1806.00363 [pdf, other]

Domain Adaptation for MRI Organ Segmentation using Reverse Classification Accuracy

Authors: Vanya V. Valindria, Ioannis Lavdas, Wenjia Bai, Konstantinos Kamnitsas, Eric O. Aboagye, Andrea G. Rockall, Daniel Rueckert, Ben Glocker

Abstract: The variations in multi-center data in medical imaging studies have brought the necessity of domain adaptation. Despite the advancement of machine learning in automatic segmentation, performance often degrades when algorithms are applied on new data acquired from different scanners or sequences than the training data. Manual annotation is costly and time consuming if it has to be carried out for e… ▽ More The variations in multi-center data in medical imaging studies have brought the necessity of domain adaptation. Despite the advancement of machine learning in automatic segmentation, performance often degrades when algorithms are applied on new data acquired from different scanners or sequences than the training data. Manual annotation is costly and time consuming if it has to be carried out for every new target domain. In this work, we investigate automatic selection of suitable subjects to be annotated for supervised domain adaptation using the concept of reverse classification accuracy (RCA). RCA predicts the performance of a trained model on data from the new domain and different strategies of selecting subjects to be included in the adaptation via transfer learning are evaluated. We perform experiments on a two-center MR database for the task of organ segmentation. We show that subject selection via RCA can reduce the burden of annotation of new data for the target domain. △ Less

Submitted 1 June, 2018; originally announced June 2018.

Comments: Accepted at the International Conference on Medical Imaging with Deep Learning (MIDL) 2018

arXiv:1805.12064 [pdf, other]

Stochastic Deep Compressive Sensing for the Reconstruction of Diffusion Tensor Cardiac MRI

Authors: Jo Schlemper, Guang Yang, Pedro Ferreira, Andrew Scott, Laura-Ann McGill, Zohya Khalique, Margarita Gorodezky, Malte Roehl, Jennifer Keegan, Dudley Pennell, David Firmin, Daniel Rueckert

Abstract: Understanding the structure of the heart at the microscopic scale of cardiomyocytes and their aggregates provides new insights into the mechanisms of heart disease and enables the investigation of effective therapeutics. Diffusion Tensor Cardiac Magnetic Resonance (DT-CMR) is a unique non-invasive technique that can resolve the microscopic structure, organisation, and integrity of the myocardium w… ▽ More Understanding the structure of the heart at the microscopic scale of cardiomyocytes and their aggregates provides new insights into the mechanisms of heart disease and enables the investigation of effective therapeutics. Diffusion Tensor Cardiac Magnetic Resonance (DT-CMR) is a unique non-invasive technique that can resolve the microscopic structure, organisation, and integrity of the myocardium without the need for exogenous contrast agents. However, this technique suffers from relatively low signal-to-noise ratio (SNR) and frequent signal loss due to respiratory and cardiac motion. Current DT-CMR techniques rely on acquiring and averaging multiple signal acquisitions to improve the SNR. Moreover, in order to mitigate the influence of respiratory movement, patients are required to perform many breath holds which results in prolonged acquisition durations (e.g., ~30 mins using the existing technology). In this study, we propose a novel cascaded Convolutional Neural Networks (CNN) based compressive sensing (CS) technique and explore its applicability to improve DT-CMR acquisitions. Our simulation based studies have achieved high reconstruction fidelity and good agreement between DT-CMR parameters obtained with the proposed reconstruction and fully sampled ground truth. When compared to other state-of-the-art methods, our proposed deep cascaded CNN method and its stochastic variation demonstrated significant improvements. To the best of our knowledge, this is the first study using deep CNN based CS for the DT-CMR reconstruction. In addition, with relatively straightforward modifications to the acquisition scheme, our method can easily be translated into a method for online, at-the-scanner reconstruction enabling the deployment of accelerated DT-CMR in various clinical applications. △ Less

Submitted 30 May, 2018; originally announced May 2018.

Comments: Accepted for MICCAI 2018

arXiv:1805.01026 [pdf, other]

Computing CNN Loss and Gradients for Pose Estimation with Riemannian Geometry

Authors: Benjamin Hou, Nina Miolane, Bishesh Khanal, Matthew C. H. Lee, Amir Alansary, Steven McDonagh, Jo V. Hajnal, Daniel Rueckert, Ben Glocker, Bernhard Kainz

Abstract: Pose estimation, i.e. predicting a 3D rigid transformation with respect to a fixed co-ordinate frame in, SE(3), is an omnipresent problem in medical image analysis with applications such as: image rigid registration, anatomical standard plane detection, tracking and device/camera pose estimation. Deep learning methods often parameterise a pose with a representation that separates rotation and tran… ▽ More Pose estimation, i.e. predicting a 3D rigid transformation with respect to a fixed co-ordinate frame in, SE(3), is an omnipresent problem in medical image analysis with applications such as: image rigid registration, anatomical standard plane detection, tracking and device/camera pose estimation. Deep learning methods often parameterise a pose with a representation that separates rotation and translation. As commonly available frameworks do not provide means to calculate loss on a manifold, regression is usually performed using the L2-norm independently on the rotation's and the translation's parameterisations, which is a metric for linear spaces that does not take into account the Lie group structure of SE(3). In this paper, we propose a general Riemannian formulation of the pose estimation problem. We propose to train the CNN directly on SE(3) equipped with a left-invariant Riemannian metric, coupling the prediction of the translation and rotation defining the pose. At each training step, the ground truth and predicted pose are elements of the manifold, where the loss is calculated as the Riemannian geodesic distance. We then compute the optimisation direction by back-propagating the gradient with respect to the predicted pose on the tangent space of the manifold SE(3) and update the network weights. We thoroughly evaluate the effectiveness of our loss function by comparing its performance with popular and most commonly used existing methods, on tasks such as image-based localisation and intensity-based 2D/3D registration. We also show that hyper-parameters, used in our loss function to weight the contribution between rotations and translations, can be intrinsically calculated from the dataset to achieve greater performance margins. △ Less

Submitted 17 July, 2018; v1 submitted 2 May, 2018; originally announced May 2018.

arXiv:1804.09102 [pdf]

Human-level Performance On Automatic Head Biometrics In Fetal Ultrasound Using Fully Convolutional Neural Networks

Authors: Matthew Sinclair, Christian F. Baumgartner, Jacqueline Matthew, Wenjia Bai, Juan Cerrolaza Martinez, Yuanwei Li, Sandra Smith, Caroline L. Knight, Bernhard Kainz, Jo Hajnal, Andrew P. King, Daniel Rueckert

Abstract: Measurement of head biometrics from fetal ultrasonography images is of key importance in monitoring the healthy development of fetuses. However, the accurate measurement of relevant anatomical structures is subject to large inter-observer variability in the clinic. To address this issue, an automated method utilizing Fully Convolutional Networks (FCN) is proposed to determine measurements of fetal… ▽ More Measurement of head biometrics from fetal ultrasonography images is of key importance in monitoring the healthy development of fetuses. However, the accurate measurement of relevant anatomical structures is subject to large inter-observer variability in the clinic. To address this issue, an automated method utilizing Fully Convolutional Networks (FCN) is proposed to determine measurements of fetal head circumference (HC) and biparietal diameter (BPD). An FCN was trained on approximately 2000 2D ultrasound images of the head with annotations provided by 45 different sonographers during routine screening examinations to perform semantic segmentation of the head. An ellipse is fitted to the resulting segmentation contours to mimic the annotation typically produced by a sonographer. The model's performance was compared with inter-observer variability, where two experts manually annotated 100 test images. Mean absolute model-expert error was slightly better than inter-observer error for HC (1.99mm vs 2.16mm), and comparable for BPD (0.61mm vs 0.59mm), as well as Dice coefficient (0.980 vs 0.980). Our results demonstrate that the model performs at a level similar to a human expert, and learns to produce accurate predictions from a large dataset annotated by many sonographers. Additionally, measurements are generated in near real-time at 15fps on a GPU, which could speed up clinical workflow for both skilled and trainee sonographers. △ Less

Submitted 24 April, 2018; originally announced April 2018.

Comments: EMBC 2018

arXiv:1804.05338 [pdf, other]

Attention-Gated Networks for Improving Ultrasound Scan Plane Detection

Authors: Jo Schlemper, Ozan Oktay, Liang Chen, Jacqueline Matthew, Caroline Knight, Bernhard Kainz, Ben Glocker, Daniel Rueckert

Abstract: In this work, we apply an attention-gated network to real-time automated scan plane detection for fetal ultrasound screening. Scan plane detection in fetal ultrasound is a challenging problem due the poor image quality resulting in low interpretability for both clinicians and automated algorithms. To solve this, we propose incorporating self-gated soft-attention mechanisms. A soft-attention mechan… ▽ More In this work, we apply an attention-gated network to real-time automated scan plane detection for fetal ultrasound screening. Scan plane detection in fetal ultrasound is a challenging problem due the poor image quality resulting in low interpretability for both clinicians and automated algorithms. To solve this, we propose incorporating self-gated soft-attention mechanisms. A soft-attention mechanism generates a gating signal that is end-to-end trainable, which allows the network to contextualise local information useful for prediction. The proposed attention mechanism is generic and it can be easily incorporated into any existing classification architectures, while only requiring a few additional parameters. We show that, when the base network has a high capacity, the incorporated attention mechanism can provide efficient object localisation while improving the overall performance. When the base network has a low capacity, the method greatly outperforms the baseline approach and significantly reduces false positives. Lastly, the generated attention maps allow us to understand the model's reasoning process, which can also be used for weakly supervised object localisation. △ Less

Submitted 15 April, 2018; originally announced April 2018.

Comments: Submitted to MIDL2018 (OpenReview: https://openreview.net/forum?id=BJtn7-3sM)

arXiv:1804.03999 [pdf, other]

Attention U-Net: Learning Where to Look for the Pancreas

Authors: Ozan Oktay, Jo Schlemper, Loic Le Folgoc, Matthew Lee, Mattias Heinrich, Kazunari Misawa, Kensaku Mori, Steven McDonagh, Nils Y Hammerla, Bernhard Kainz, Ben Glocker, Daniel Rueckert

Abstract: We propose a novel attention gate (AG) model for medical imaging that automatically learns to focus on target structures of varying shapes and sizes. Models trained with AGs implicitly learn to suppress irrelevant regions in an input image while highlighting salient features useful for a specific task. This enables us to eliminate the necessity of using explicit external tissue/organ localisation… ▽ More We propose a novel attention gate (AG) model for medical imaging that automatically learns to focus on target structures of varying shapes and sizes. Models trained with AGs implicitly learn to suppress irrelevant regions in an input image while highlighting salient features useful for a specific task. This enables us to eliminate the necessity of using explicit external tissue/organ localisation modules of cascaded convolutional neural networks (CNNs). AGs can be easily integrated into standard CNN architectures such as the U-Net model with minimal computational overhead while increasing the model sensitivity and prediction accuracy. The proposed Attention U-Net architecture is evaluated on two large CT abdominal datasets for multi-class image segmentation. Experimental results show that AGs consistently improve the prediction performance of U-Net across different datasets and training sizes while preserving computational efficiency. The code for the proposed architecture is publicly available. △ Less

Submitted 20 May, 2018; v1 submitted 11 April, 2018; originally announced April 2018.

Comments: Accepted to published in MIDL'18 (Revised Version) / OpenReview link: https://openreview.net/forum?id=Skft7cijM

arXiv:1803.09354 [pdf, other]

doi 10.1109/TMI.2018.2878509

Learning-Based Quality Control for Cardiac MR Images

Authors: Giacomo Tarroni, Ozan Oktay, Wenjia Bai, Andreas Schuh, Hideaki Suzuki, Jonathan Passerat-Palmbach, Antonio de Marvao, Declan P. O'Regan, Stuart Cook, Ben Glocker, Paul M. Matthews, Daniel Rueckert

Abstract: The effectiveness of a cardiovascular magnetic resonance (CMR) scan depends on the ability of the operator to correctly tune the acquisition parameters to the subject being scanned and on the potential occurrence of imaging artefacts such as cardiac and respiratory motion. In the clinical practice, a quality control step is performed by visual assessment of the acquired images: however, this proce… ▽ More The effectiveness of a cardiovascular magnetic resonance (CMR) scan depends on the ability of the operator to correctly tune the acquisition parameters to the subject being scanned and on the potential occurrence of imaging artefacts such as cardiac and respiratory motion. In the clinical practice, a quality control step is performed by visual assessment of the acquired images: however, this procedure is strongly operator-dependent, cumbersome and sometimes incompatible with the time constraints in clinical settings and large-scale studies. We propose a fast, fully-automated, learning-based quality control pipeline for CMR images, specifically for short-axis image stacks. Our pipeline performs three important quality checks: 1) heart coverage estimation, 2) inter-slice motion detection, 3) image contrast estimation in the cardiac region. The pipeline uses a hybrid decision forest method - integrating both regression and structured classification models - to extract landmarks as well as probabilistic segmentation maps from both long- and short-axis images as a basis to perform the quality checks. The technique was tested on up to 3000 cases from the UK Biobank as well as on 100 cases from the UK Digital Heart Project, and validated against manual annotations and visual inspections performed by expert interpreters. The results show the capability of the proposed pipeline to correctly detect incomplete or corrupted scans (e.g. on UK Biobank, sensitivity and specificity respectively 88% and 99% for heart coverage estimation, 85% and 95% for motion detection), allowing their exclusion from the analysed dataset or the triggering of a new acquisition. △ Less

Submitted 15 September, 2018; v1 submitted 25 March, 2018; originally announced March 2018.

Journal ref: IEEE Transactions on Medical Imaging Nov 2018

arXiv:1712.01751 [pdf, other]

Convolutional Recurrent Neural Networks for Dynamic MR Image Reconstruction

Authors: Chen Qin, Jo Schlemper, Jose Caballero, Anthony Price, Joseph V. Hajnal, Daniel Rueckert

Abstract: Accelerating the data acquisition of dynamic magnetic resonance imaging (MRI) leads to a challenging ill-posed inverse problem, which has received great interest from both the signal processing and machine learning community over the last decades. The key ingredient to the problem is how to exploit the temporal correlation of the MR sequence to resolve the aliasing artefact. Traditionally, such ob… ▽ More Accelerating the data acquisition of dynamic magnetic resonance imaging (MRI) leads to a challenging ill-posed inverse problem, which has received great interest from both the signal processing and machine learning community over the last decades. The key ingredient to the problem is how to exploit the temporal correlation of the MR sequence to resolve the aliasing artefact. Traditionally, such observation led to a formulation of a non-convex optimisation problem, which were solved using iterative algorithms. Recently, however, deep learning based-approaches have gained significant popularity due to its ability to solve general inversion problems. In this work, we propose a unique, novel convolutional recurrent neural network (CRNN) architecture which reconstructs high quality cardiac MR images from highly undersampled k-space data by jointly exploiting the dependencies of the temporal sequences as well as the iterative nature of the traditional optimisation algorithms. In particular, the proposed architecture embeds the structure of the traditional iterative algorithms, efficiently modelling the recurrence of the iterative reconstruction stages by using recurrent hidden connections over such iterations. In addition, spatiotemporal dependencies are simultaneously learnt by exploiting bidirectional recurrent hidden connections across time sequences. The proposed algorithm is able to learn both the temporal dependency and the iterative reconstruction process effectively with only a very small number of parameters, while outperforming current MR reconstruction methods in terms of computational complexity, reconstruction accuracy and speed. △ Less

Submitted 14 October, 2018; v1 submitted 5 December, 2017; originally announced December 2017.

Comments: Published in IEEE Transactions on Medical Imaging

arXiv:1711.06853 [pdf, other]

DLTK: State of the Art Reference Implementations for Deep Learning on Medical Images

Authors: Nick Pawlowski, Sofia Ira Ktena, Matthew C. H. Lee, Bernhard Kainz, Daniel Rueckert, Ben Glocker, Martin Rajchl

Abstract: We present DLTK, a toolkit providing baseline implementations for efficient experimentation with deep learning methods on biomedical images. It builds on top of TensorFlow and its high modularity and easy-to-use examples allow for a low-threshold access to state-of-the-art implementations for typical medical imaging problems. A comparison of DLTK's reference implementations of popular network arch… ▽ More We present DLTK, a toolkit providing baseline implementations for efficient experimentation with deep learning methods on biomedical images. It builds on top of TensorFlow and its high modularity and easy-to-use examples allow for a low-threshold access to state-of-the-art implementations for typical medical imaging problems. A comparison of DLTK's reference implementations of popular network architectures for image segmentation demonstrates new top performance on the publicly available challenge data "Multi-Atlas Labeling Beyond the Cranial Vault". The average test Dice similarity coefficient of $81.5$ exceeds the previously best performing CNN ($75.7$) and the accuracy of the challenge winning method ($79.0$). △ Less

Submitted 18 November, 2017; originally announced November 2017.

Comments: Submitted to Medical Imaging Meets NIPS 2017, Code at https://github.com/DLTK/DLTK

arXiv:1711.01468 [pdf, other]

Ensembles of Multiple Models and Architectures for Robust Brain Tumour Segmentation

Authors: Konstantinos Kamnitsas, Wenjia Bai, Enzo Ferrante, Steven McDonagh, Matthew Sinclair, Nick Pawlowski, Martin Rajchl, Matthew Lee, Bernhard Kainz, Daniel Rueckert, Ben Glocker

Abstract: Deep learning approaches such as convolutional neural nets have consistently outperformed previous methods on challenging tasks such as dense, semantic segmentation. However, the various proposed networks perform differently, with behaviour largely influenced by architectural choices and training settings. This paper explores Ensembles of Multiple Models and Architectures (EMMA) for robust perform… ▽ More Deep learning approaches such as convolutional neural nets have consistently outperformed previous methods on challenging tasks such as dense, semantic segmentation. However, the various proposed networks perform differently, with behaviour largely influenced by architectural choices and training settings. This paper explores Ensembles of Multiple Models and Architectures (EMMA) for robust performance through aggregation of predictions from a wide range of methods. The approach reduces the influence of the meta-parameters of individual models and the risk of overfitting the configuration to a particular database. EMMA can be seen as an unbiased, generic deep learning model which is shown to yield excellent performance, winning the first position in the BRATS 2017 competition among 50+ participating teams. △ Less

Submitted 4 November, 2017; originally announced November 2017.

Comments: The method won the 1st-place in the Brain Tumour Segmentation (BRATS) 2017 competition (segmentation task)

arXiv:1710.09289 [pdf, other]

Automated cardiovascular magnetic resonance image analysis with fully convolutional networks

Authors: Wenjia Bai, Matthew Sinclair, Giacomo Tarroni, Ozan Oktay, Martin Rajchl, Ghislain Vaillant, Aaron M. Lee, Nay Aung, Elena Lukaschuk, Mihir M. Sanghvi, Filip Zemrak, Kenneth Fung, Jose Miguel Paiva, Valentina Carapella, Young ** Kim, Hideaki Suzuki, Bernhard Kainz, Paul M. Matthews, Steffen E. Petersen, Stefan K. Piechnik, Stefan Neubauer, Ben Glocker, Daniel Rueckert

Abstract: Cardiovascular magnetic resonance (CMR) imaging is a standard imaging modality for assessing cardiovascular diseases (CVDs), the leading cause of death globally. CMR enables accurate quantification of the cardiac chamber volume, ejection fraction and myocardial mass, providing information for diagnosis and monitoring of CVDs. However, for years, clinicians have been relying on manual approaches fo… ▽ More Cardiovascular magnetic resonance (CMR) imaging is a standard imaging modality for assessing cardiovascular diseases (CVDs), the leading cause of death globally. CMR enables accurate quantification of the cardiac chamber volume, ejection fraction and myocardial mass, providing information for diagnosis and monitoring of CVDs. However, for years, clinicians have been relying on manual approaches for CMR image analysis, which is time consuming and prone to subjective errors. It is a major clinical challenge to automatically derive quantitative and clinically relevant information from CMR images. Deep neural networks have shown a great potential in image pattern recognition and segmentation for a variety of tasks. Here we demonstrate an automated analysis method for CMR images, which is based on a fully convolutional network (FCN). The network is trained and evaluated on a large-scale dataset from the UK Biobank, consisting of 4,875 subjects with 93,500 pixelwise annotated images. The performance of the method has been evaluated using a number of technical metrics, including the Dice metric, mean contour distance and Hausdorff distance, as well as clinically relevant measures, including left ventricle (LV) end-diastolic volume (LVEDV) and end-systolic volume (LVESV), LV mass (LVM); right ventricle (RV) end-diastolic volume (RVEDV) and end-systolic volume (RVESV). By combining FCN with a large-scale annotated dataset, the proposed automated method achieves a high performance on par with human experts in segmenting the LV and RV on short-axis CMR images and the left atrium (LA) and right atrium (RA) on long-axis CMR images. △ Less

Submitted 22 May, 2018; v1 submitted 25 October, 2017; originally announced October 2017.

Comments: Accepted for publication by Journal of Cardiovascular Magnetic Resonance

arXiv:1709.06341 [pdf, other]

3D Reconstruction in Canonical Co-ordinate Space from Arbitrarily Oriented 2D Images

Authors: Benjamin Hou, Bishesh Khanal, Amir Alansary, Steven McDonagh, Alice Davidson, Mary Rutherford, Jo V. Hajnal, Daniel Rueckert, Ben Glocker, Bernhard Kainz

Abstract: Limited capture range, and the requirement to provide high quality initialization for optimization-based 2D/3D image registration methods, can significantly degrade the performance of 3D image reconstruction and motion compensation pipelines. Challenging clinical imaging scenarios, which contain significant subject motion such as fetal in-utero imaging, complicate the 3D image and volume reconstru… ▽ More Limited capture range, and the requirement to provide high quality initialization for optimization-based 2D/3D image registration methods, can significantly degrade the performance of 3D image reconstruction and motion compensation pipelines. Challenging clinical imaging scenarios, which contain significant subject motion such as fetal in-utero imaging, complicate the 3D image and volume reconstruction process. In this paper we present a learning based image registration method capable of predicting 3D rigid transformations of arbitrarily oriented 2D image slices, with respect to a learned canonical atlas co-ordinate system. Only image slice intensity information is used to perform registration and canonical alignment, no spatial transform initialization is required. To find image transformations we utilize a Convolutional Neural Network (CNN) architecture to learn the regression function capable of map** 2D image slices to a 3D canonical atlas space. We extensively evaluate the effectiveness of our approach quantitatively on simulated Magnetic Resonance Imaging (MRI), fetal brain imagery with synthetic motion and further demonstrate qualitative results on real fetal MRI data where our method is integrated into a full reconstruction and motion compensation pipeline. Our learning based registration achieves an average spatial prediction error of 7 mm on simulated data and produces qualitatively improved reconstructions for heavily moving fetuses with gestational ages of approximately 20 weeks. Our model provides a general and computationally efficient solution to the 2D/3D registration initialization problem and is suitable for real-time scenarios. △ Less

Submitted 23 January, 2018; v1 submitted 19 September, 2017; originally announced September 2017.

arXiv:1709.01182 [pdf, other]

Is human face processing a feature- or pattern-based task? Evidence using a unified computational method driven by eye movements

Authors: Carlos E. Thomaz, Vagner Amaral, Gilson A. Giraldi, Duncan F. Gillies, Daniel Rueckert

Abstract: Research on human face processing using eye movements has provided evidence that we recognize face images successfully focusing our visual attention on a few inner facial regions, mainly on the eyes, nose and mouth. To understand how we accomplish this process of coding high-dimensional faces so efficiently, this paper proposes and implements a multivariate extraction method that combines face ima… ▽ More Research on human face processing using eye movements has provided evidence that we recognize face images successfully focusing our visual attention on a few inner facial regions, mainly on the eyes, nose and mouth. To understand how we accomplish this process of coding high-dimensional faces so efficiently, this paper proposes and implements a multivariate extraction method that combines face images variance with human spatial attention maps modeled as feature- and pattern-based information sources. It is based on a unified multidimensional representation of the well-known face-space concept. The spatial attention maps are summary statistics of the eye-tracking fixations of a number of participants and trials to frontal and well-framed face images during separate gender and facial expression recognition tasks. Our experimental results carried out on publicly available face databases have indicated that we might emulate the human extraction system as a pattern-based computational method rather than a feature-based one to properly explain the proficiency of the human system in recognizing visual face information. △ Less

Submitted 4 September, 2017; originally announced September 2017.

arXiv:1708.06297 [pdf, other]

Employing Weak Annotations for Medical Image Analysis Problems

Authors: Martin Rajchl, Lisa M. Koch, Christian Ledig, Jonathan Passerat-Palmbach, Kazunari Misawa, Kensaku Mori, Daniel Rueckert

Abstract: To efficiently establish training databases for machine learning methods, collaborative and crowdsourcing platforms have been investigated to collectively tackle the annotation effort. However, when this concept is ported to the medical imaging domain, reading expertise will have a direct impact on the annotation accuracy. In this study, we examine the impact of expertise and the amount of availab… ▽ More To efficiently establish training databases for machine learning methods, collaborative and crowdsourcing platforms have been investigated to collectively tackle the annotation effort. However, when this concept is ported to the medical imaging domain, reading expertise will have a direct impact on the annotation accuracy. In this study, we examine the impact of expertise and the amount of available annotations on the accuracy outcome of a liver segmentation problem in an abdominal computed tomography (CT) image database. In controlled experiments, we study this impact for different types of weak annotations. To address the decrease in accuracy associated with lower expertise, we propose a method for outlier correction making use of a weakly labelled atlas. Using this approach, we demonstrate that weak annotations subject to high error rates can achieve a similarly high accuracy as state-of-the-art multi-atlas segmentation approaches relying on a large amount of expert manual segmentations. Annotations of this nature can realistically be obtained from a non-expert crowd and can potentially enable crowdsourcing of weak annotation tasks for medical image analysis. △ Less

Submitted 21 August, 2017; originally announced August 2017.

arXiv:1706.07355 [pdf, other]

doi 10.1093/bioinformatics/btx552

Three-dimensional Cardiovascular Imaging-Genetics: A Mass Univariate Framework

Authors: Carlo Biffi, Antonio de Marvao, Mark I. Attard, Timothy J. W. Dawes, Nicola Whiffin, Wenjia Bai, Wenzhe Shi, Catherine Francis, Hannah Meyer, Rachel Buchan, Stuart A. Cook, Daniel Rueckert, Declan P. O'Regan

Abstract: MOTIVATION: Left ventricular (LV) hypertrophy is a strong predictor of cardiovascular outcomes, but its genetic regulation remains largely unexplained. Conventional phenoty** relies on manual calculation of LV mass and wall thickness, but advanced cardiac image analysis presents an opportunity for high-throughput map** of genotype-phenotype associations in three dimensions (3D). RESULTS: High-… ▽ More MOTIVATION: Left ventricular (LV) hypertrophy is a strong predictor of cardiovascular outcomes, but its genetic regulation remains largely unexplained. Conventional phenoty** relies on manual calculation of LV mass and wall thickness, but advanced cardiac image analysis presents an opportunity for high-throughput map** of genotype-phenotype associations in three dimensions (3D). RESULTS: High-resolution cardiac magnetic resonance images were automatically segmented in 1,124 healthy volunteers to create a 3D shape model of the heart. Mass univariate regression was used to plot a 3D effect-size map for the association between wall thickness and a set of predictors at each vertex in the mesh. The vertices where a significant effect exists were determined by applying threshold-free cluster enhancement to boost areas of signal with spatial contiguity. Experiments on simulated phenotypic signals and SNP replication show that this approach offers a substantial gain in statistical power for cardiac genotype-phenotype associations while providing good control of the false discovery rate. This framework models the effects of genetic variation throughout the heart and can be automatically applied to large population cohorts. AVAILABILITY: The proposed approach has been coded in an R package freely available at https://doi.org/10.5281/zenodo.834610 together with the clinical data used in this work. △ Less

Submitted 13 September, 2017; v1 submitted 22 June, 2017; originally announced June 2017.

Comments: 14 pages, 11 figures. Version accepted by Bioinformatics (Sept 2017). Includes Supplementary Materials

arXiv:1706.00935 [pdf, ps, other]

White matter hyperintensity and stroke lesion segmentation and differentiation using convolutional neural networks

Authors: R. Guerrero, C. Qin, O. Oktay, C. Bowles, L. Chen, R. Joules, R. Wolz, M. C. Valdes-Hernandez, D. A. Dickie, J. Wardlaw, D. Rueckert

Abstract: The accurate assessment of White matter hyperintensities (WMH) burden is of crucial importance for epidemiological studies to determine association between WMHs, cognitive and clinical data. The manual delineation of WMHs is tedious, costly and time consuming. This is further complicated by the fact that other pathological features (i.e. stroke lesions) often also appear as hyperintense. Several a… ▽ More The accurate assessment of White matter hyperintensities (WMH) burden is of crucial importance for epidemiological studies to determine association between WMHs, cognitive and clinical data. The manual delineation of WMHs is tedious, costly and time consuming. This is further complicated by the fact that other pathological features (i.e. stroke lesions) often also appear as hyperintense. Several automated methods aiming to tackle the challenges of WMH segmentation have been proposed, however cannot differentiate between WMH and strokes. Other methods, capable of distinguishing between different pathologies in brain MRI, are not designed with simultaneous WMH and stroke segmentation in mind. In this work we propose to use a convolutional neural network (CNN) that is able to segment hyperintensities and differentiate between WMHs and stroke lesions. Specifically, we aim to distinguish between WMH pathologies from those caused by stroke lesions due to either cortical, large or small subcortical infarcts. As far as we know, this is the first time such differentiation task has explicitly been proposed. The proposed fully convolutional CNN architecture, is comprised of an analysis path, that gradually learns low and high level features, followed by a synthesis path, that gradually combines and up-samples the low and high level features into a class likelihood semantic segmentation. Quantitatively, the proposed CNN architecture is shown to outperform other well established and state-of-the-art algorithms in terms of overlap with manual expert annotations. Clinically, the extracted WMH volumes were found to correlate better with the Fazekas visual rating score. Additionally, a comparison of the associations found between clinical risk-factors and the WMH volumes generated by the proposed method, were found to be in line with the associations found with the expert-annotated volumes. △ Less

Submitted 2 January, 2018; v1 submitted 3 June, 2017; originally announced June 2017.

arXiv:1705.08302 [pdf, other]

doi 10.1109/TMI.2017.2743464

Anatomically Constrained Neural Networks (ACNN): Application to Cardiac Image Enhancement and Segmentation

Authors: Ozan Oktay, Enzo Ferrante, Konstantinos Kamnitsas, Mattias Heinrich, Wenjia Bai, Jose Caballero, Stuart Cook, Antonio de Marvao, Timothy Dawes, Declan O'Regan, Bernhard Kainz, Ben Glocker, Daniel Rueckert

Abstract: Incorporation of prior knowledge about organ shape and location is key to improve performance of image analysis approaches. In particular, priors can be useful in cases where images are corrupted and contain artefacts due to limitations in image acquisition. The highly constrained nature of anatomical objects can be well captured with learning based techniques. However, in most recent and promisin… ▽ More Incorporation of prior knowledge about organ shape and location is key to improve performance of image analysis approaches. In particular, priors can be useful in cases where images are corrupted and contain artefacts due to limitations in image acquisition. The highly constrained nature of anatomical objects can be well captured with learning based techniques. However, in most recent and promising techniques such as CNN based segmentation it is not obvious how to incorporate such prior knowledge. State-of-the-art methods operate as pixel-wise classifiers where the training objectives do not incorporate the structure and inter-dependencies of the output. To overcome this limitation, we propose a generic training strategy that incorporates anatomical prior knowledge into CNNs through a new regularisation model, which is trained end-to-end. The new framework encourages models to follow the global anatomical properties of the underlying anatomy (e.g. shape, label structure) via learned non-linear representations of the shape. We show that the proposed approach can be easily adapted to different analysis tasks (e.g. image enhancement, segmentation) and improve the prediction accuracy of the state-of-the-art models. The applicability of our approach is shown on multi-modal cardiac datasets and public benchmarks. Additionally, we demonstrate how the learned deep models of 3D shapes can be interpreted and used as biomarkers for classification of cardiac pathologies. △ Less

Submitted 5 December, 2017; v1 submitted 22 May, 2017; originally announced May 2017.

Comments: Published in IEEE Transactions on Medical Imaging (Aug 2017)

arXiv:1704.02422 [pdf, other]

A Deep Cascade of Convolutional Neural Networks for Dynamic MR Image Reconstruction

Authors: Jo Schlemper, Jose Caballero, Joseph V. Hajnal, Anthony Price, Daniel Rueckert

Abstract: Inspired by recent advances in deep learning, we propose a framework for reconstructing dynamic sequences of 2D cardiac magnetic resonance (MR) images from undersampled data using a deep cascade of convolutional neural networks (CNNs) to accelerate the data acquisition process. In particular, we address the case where data is acquired using aggressive Cartesian undersampling. Firstly, we show that… ▽ More Inspired by recent advances in deep learning, we propose a framework for reconstructing dynamic sequences of 2D cardiac magnetic resonance (MR) images from undersampled data using a deep cascade of convolutional neural networks (CNNs) to accelerate the data acquisition process. In particular, we address the case where data is acquired using aggressive Cartesian undersampling. Firstly, we show that when each 2D image frame is reconstructed independently, the proposed method outperforms state-of-the-art 2D compressed sensing approaches such as dictionary learning-based MR image reconstruction, in terms of reconstruction error and reconstruction speed. Secondly, when reconstructing the frames of the sequences jointly, we demonstrate that CNNs can learn spatio-temporal correlations efficiently by combining convolution and data sharing approaches. We show that the proposed method consistently outperforms state-of-the-art methods and is capable of preserving anatomical structure more faithfully up to 11-fold undersampling. Moreover, reconstruction is very fast: each complete dynamic sequence can be reconstructed in less than 10s and, for the 2D case, each image frame can be reconstructed in 23ms, enabling real-time applications. △ Less

Submitted 23 November, 2017; v1 submitted 7 April, 2017; originally announced April 2017.

Comments: To be published in IEEE Transactions on Medical Imaging

arXiv:1703.10062 [pdf, other]

Exploring Heritability of Functional Brain Networks with Inexact Graph Matching

Authors: Sofia Ira Ktena, Salim Arslan, Sarah Parisot, Daniel Rueckert

Abstract: Data-driven brain parcellations aim to provide a more accurate representation of an individual's functional connectivity, since they are able to capture individual variability that arises due to development or disease. This renders comparisons between the emerging brain connectivity networks more challenging, since correspondences between their elements are not preserved. Unveiling these correspon… ▽ More Data-driven brain parcellations aim to provide a more accurate representation of an individual's functional connectivity, since they are able to capture individual variability that arises due to development or disease. This renders comparisons between the emerging brain connectivity networks more challenging, since correspondences between their elements are not preserved. Unveiling these correspondences is of major importance to keep track of local functional connectivity changes. We propose a novel method based on graph edit distance for the comparison of brain graphs directly in their domain, that can accurately reflect similarities between individual networks while providing the network element correspondences. This method is validated on a dataset of 116 twin subjects provided by the Human Connectome Project. △ Less

Submitted 29 March, 2017; originally announced March 2017.

Comments: accepted at ISBI 2017: International Symposium on Biomedical Imaging, Apr 2017, Melbourne, Australia

arXiv:1703.03020 [pdf, ps, other]

Spectral Graph Convolutions for Population-based Disease Prediction

Authors: Sarah Parisot, Sofia Ira Ktena, Enzo Ferrante, Matthew Lee, Ricardo Guerrerro Moreno, Ben Glocker, Daniel Rueckert

Abstract: Exploiting the wealth of imaging and non-imaging information for disease prediction tasks requires models capable of representing, at the same time, individual features as well as data associations between subjects from potentially large populations. Graphs provide a natural framework for such tasks, yet previous graph-based approaches focus on pairwise similarities without modelling the subjects'… ▽ More Exploiting the wealth of imaging and non-imaging information for disease prediction tasks requires models capable of representing, at the same time, individual features as well as data associations between subjects from potentially large populations. Graphs provide a natural framework for such tasks, yet previous graph-based approaches focus on pairwise similarities without modelling the subjects' individual characteristics and features. On the other hand, relying solely on subject-specific imaging feature vectors fails to model the interaction and similarity between subjects, which can reduce performance. In this paper, we introduce the novel concept of Graph Convolutional Networks (GCN) for brain analysis in populations, combining imaging and non-imaging data. We represent populations as a sparse graph where its vertices are associated with image-based feature vectors and the edges encode phenotypic information. This structure was used to train a GCN model on partially labelled graphs, aiming to infer the classes of unlabelled nodes from the node features and pairwise associations between subjects. We demonstrate the potential of the method on the challenging ADNI and ABIDE databases, as a proof of concept of the benefit from integrating contextual information in classification tasks. This has a clear impact on the quality of the predictions, leading to 69.5% accuracy for ABIDE (outperforming the current state of the art of 66.8%) and 77% for ADNI for prediction of MCI conversion, significantly outperforming standard linear classifiers where only individual features are considered. △ Less

Submitted 21 June, 2017; v1 submitted 8 March, 2017; originally announced March 2017.

Comments: International Conference on Medical Image Computing and Computer-Assisted Interventions (MICCAI) 2017

arXiv:1703.02161 [pdf, other]

Distance Metric Learning using Graph Convolutional Networks: Application to Functional Brain Networks

Authors: Sofia Ira Ktena, Sarah Parisot, Enzo Ferrante, Martin Rajchl, Matthew Lee, Ben Glocker, Daniel Rueckert

Abstract: Evaluating similarity between graphs is of major importance in several computer vision and pattern recognition problems, where graph representations are often used to model objects or interactions between elements. The choice of a distance or similarity metric is, however, not trivial and can be highly dependent on the application at hand. In this work, we propose a novel metric learning method to… ▽ More Evaluating similarity between graphs is of major importance in several computer vision and pattern recognition problems, where graph representations are often used to model objects or interactions between elements. The choice of a distance or similarity metric is, however, not trivial and can be highly dependent on the application at hand. In this work, we propose a novel metric learning method to evaluate distance between graphs that leverages the power of convolutional neural networks, while exploiting concepts from spectral graph theory to allow these operations on irregular graphs. We demonstrate the potential of our method in the field of connectomics, where neuronal pathways or functional connections between brain regions are commonly modelled as graphs. In this problem, the definition of an appropriate graph similarity function is critical to unveil patterns of disruptions associated with certain brain disorders. Experimental results on the ABIDE dataset show that our method can learn a graph similarity metric tailored for a clinical application, improving the performance of a simple k-nn classifier by 11.9% compared to a traditional distance metric. △ Less

Submitted 14 June, 2017; v1 submitted 6 March, 2017; originally announced March 2017.

Comments: International Conference on Medical Image Computing and Computer-Assisted Interventions (MICCAI) 2017

arXiv:1703.00555 [pdf, other]

A Deep Cascade of Convolutional Neural Networks for MR Image Reconstruction

Authors: Jo Schlemper, Jose Caballero, Joseph V. Hajnal, Anthony Price, Daniel Rueckert

Abstract: The acquisition of Magnetic Resonance Imaging (MRI) is inherently slow. Inspired by recent advances in deep learning, we propose a framework for reconstructing MR images from undersampled data using a deep cascade of convolutional neural networks to accelerate the data acquisition process. We show that for Cartesian undersampling of 2D cardiac MR images, the proposed method outperforms the state-o… ▽ More The acquisition of Magnetic Resonance Imaging (MRI) is inherently slow. Inspired by recent advances in deep learning, we propose a framework for reconstructing MR images from undersampled data using a deep cascade of convolutional neural networks to accelerate the data acquisition process. We show that for Cartesian undersampling of 2D cardiac MR images, the proposed method outperforms the state-of-the-art compressed sensing approaches, such as dictionary learning-based MRI (DLMRI) reconstruction, in terms of reconstruction error, perceptual quality and reconstruction speed for both 3-fold and 6-fold undersampling. Compared to DLMRI, the error produced by the method proposed is approximately twice as small, allowing to preserve anatomical structures more faithfully. Using our method, each image can be reconstructed in 23 ms, which is fast enough to enable real-time applications. △ Less

Submitted 1 March, 2017; originally announced March 2017.

arXiv:1702.08891 [pdf, other]

Predicting Slice-to-Volume Transformation in Presence of Arbitrary Subject Motion

Authors: Benjamin Hou, Amir Alansary, Steven McDonagh, Alice Davidson, Mary Rutherford, Jo V. Hajnal, Daniel Rueckert, Ben Glocker, Bernhard Kainz

Abstract: This paper aims to solve a fundamental problem in intensity-based 2D/3D registration, which concerns the limited capture range and need for very good initialization of state-of-the-art image registration methods. We propose a regression approach that learns to predict rotation and translations of arbitrary 2D image slices from 3D volumes, with respect to a learned canonical atlas co-ordinate syste… ▽ More This paper aims to solve a fundamental problem in intensity-based 2D/3D registration, which concerns the limited capture range and need for very good initialization of state-of-the-art image registration methods. We propose a regression approach that learns to predict rotation and translations of arbitrary 2D image slices from 3D volumes, with respect to a learned canonical atlas co-ordinate system. To this end, we utilize Convolutional Neural Networks (CNNs) to learn the highly complex regression function that maps 2D image slices into their correct position and orientation in 3D space. Our approach is attractive in challenging imaging scenarios, where significant subject motion complicates reconstruction performance of 3D volumes from 2D slice data. We extensively evaluate the effectiveness of our approach quantitatively on simulated MRI brain data with extreme random motion. We further demonstrate qualitative results on fetal MRI where our method is integrated into a full reconstruction and motion compensation pipeline. With our CNN regression approach we obtain an average prediction error of 7mm on simulated data, and convincing reconstruction quality of images of very young fetuses where previous methods fail. We further discuss applications to Computed Tomography and X-ray projections. Our approach is a general solution to the 2D/3D initialization problem. It is computationally efficient, with prediction times per slice of a few milliseconds, making it suitable for real-time scenarios. △ Less

Submitted 4 March, 2017; v1 submitted 28 February, 2017; originally announced February 2017.

Comments: 8 pages, 4 figures, 6 pages supplemental material, currently under review for MICCAI 2017

arXiv:1702.03407 [pdf, other]

Reverse Classification Accuracy: Predicting Segmentation Performance in the Absence of Ground Truth

Authors: Vanya V. Valindria, Ioannis Lavdas, Wenjia Bai, Konstantinos Kamnitsas, Eric O. Aboagye, Andrea G. Rockall, Daniel Rueckert, Ben Glocker

Abstract: When integrating computational tools such as automatic segmentation into clinical practice, it is of utmost importance to be able to assess the level of accuracy on new data, and in particular, to detect when an automatic method fails. However, this is difficult to achieve due to absence of ground truth. Segmentation accuracy on clinical data might be different from what is found through cross-val… ▽ More When integrating computational tools such as automatic segmentation into clinical practice, it is of utmost importance to be able to assess the level of accuracy on new data, and in particular, to detect when an automatic method fails. However, this is difficult to achieve due to absence of ground truth. Segmentation accuracy on clinical data might be different from what is found through cross-validation because validation data is often used during incremental method development, which can lead to overfitting and unrealistic performance expectations. Before deployment, performance is quantified using different metrics, for which the predicted segmentation is compared to a reference segmentation, often obtained manually by an expert. But little is known about the real performance after deployment when a reference is unavailable. In this paper, we introduce the concept of reverse classification accuracy (RCA) as a framework for predicting the performance of a segmentation method on new data. In RCA we take the predicted segmentation from a new image to train a reverse classifier which is evaluated on a set of reference images with available ground truth. The hypothesis is that if the predicted segmentation is of good quality, then the reverse classifier will perform well on at least some of the reference images. We validate our approach on multi-organ segmentation with different classifiers and segmentation methods. Our results indicate that it is indeed possible to predict the quality of individual segmentations, in the absence of ground truth. Thus, RCA is ideal for integration into automatic processing pipelines in clinical routine and as part of large-scale image analysis studies. △ Less

Submitted 11 February, 2017; originally announced February 2017.

Comments: Accepted article to appear in IEEE Transactions on Medical Imaging 2017

arXiv:1612.08894 [pdf, other]

Unsupervised domain adaptation in brain lesion segmentation with adversarial networks

Authors: Konstantinos Kamnitsas, Christian Baumgartner, Christian Ledig, Virginia F. J. Newcombe, Joanna P. Simpson, Andrew D. Kane, David K. Menon, Aditya Nori, Antonio Criminisi, Daniel Rueckert, Ben Glocker

Abstract: Significant advances have been made towards building accurate automatic segmentation systems for a variety of biomedical applications using machine learning. However, the performance of these systems often degrades when they are applied on new data that differ from the training data, for example, due to variations in imaging protocols. Manually annotating new data for each test domain is not a fea… ▽ More Significant advances have been made towards building accurate automatic segmentation systems for a variety of biomedical applications using machine learning. However, the performance of these systems often degrades when they are applied on new data that differ from the training data, for example, due to variations in imaging protocols. Manually annotating new data for each test domain is not a feasible solution. In this work we investigate unsupervised domain adaptation using adversarial neural networks to train a segmentation method which is more invariant to differences in the input data, and which does not require any annotations on the test domain. Specifically, we learn domain-invariant features by learning to counter an adversarial network, which attempts to classify the domain of the input data by observing the activations of the segmentation network. Furthermore, we propose a multi-connected domain discriminator for improved adversarial training. Our system is evaluated using two MR databases of subjects with traumatic brain injuries, acquired using different scanners and imaging protocols. Using our unsupervised approach, we obtain segmentation accuracies which are close to the upper bound of supervised domain adaptation. △ Less

Submitted 28 December, 2016; originally announced December 2016.

arXiv:1612.05601 [pdf, other]

SonoNet: Real-Time Detection and Localisation of Fetal Standard Scan Planes in Freehand Ultrasound

Authors: Christian F. Baumgartner, Konstantinos Kamnitsas, Jacqueline Matthew, Tara P. Fletcher, Sandra Smith, Lisa M. Koch, Bernhard Kainz, Daniel Rueckert

Abstract: Identifying and interpreting fetal standard scan planes during 2D ultrasound mid-pregnancy examinations are highly complex tasks which require years of training. Apart from guiding the probe to the correct location, it can be equally difficult for a non-expert to identify relevant structures within the image. Automatic image processing can provide tools to help experienced as well as inexperienced… ▽ More Identifying and interpreting fetal standard scan planes during 2D ultrasound mid-pregnancy examinations are highly complex tasks which require years of training. Apart from guiding the probe to the correct location, it can be equally difficult for a non-expert to identify relevant structures within the image. Automatic image processing can provide tools to help experienced as well as inexperienced operators with these tasks. In this paper, we propose a novel method based on convolutional neural networks which can automatically detect 13 fetal standard views in freehand 2D ultrasound data as well as provide a localisation of the fetal structures via a bounding box. An important contribution is that the network learns to localise the target anatomy using weak supervision based on image-level labels only. The network architecture is designed to operate in real-time while providing optimal output for the localisation task. We present results for real-time annotation, retrospective frame retrieval from saved videos, and localisation on a very large and challenging dataset consisting of images and video recordings of full clinical anomaly screenings. We found that the proposed method achieved an average F1-score of 0.798 in a realistic classification experiment modelling real-time detection, and obtained a 90.09% accuracy for retrospective frame retrieval. Moreover, an accuracy of 77.8% was achieved on the localisation task. △ Less

Submitted 25 July, 2017; v1 submitted 16 December, 2016; originally announced December 2016.

Comments: 12 pages, 8 figures, published in IEEE Transactions in Medical Imaging

arXiv:1611.07289 [pdf, other]

PVR: Patch-to-Volume Reconstruction for Large Area Motion Correction of Fetal MRI

Authors: Amir Alansary, Bernhard Kainz, Martin Rajchl, Maria Murgasova, Mellisa Damodaram, David F. A. Lloyd, Alice Davidson, Steven G. McDonagh, Mary Rutherford, Joseph V. Hajnal, Daniel Rueckert

Abstract: In this paper we present a novel method for the correction of motion artifacts that are present in fetal Magnetic Resonance Imaging (MRI) scans of the whole uterus. Contrary to current slice-to-volume registration (SVR) methods, requiring an inflexible anatomical enclosure of a single investigated organ, the proposed patch-to-volume reconstruction (PVR) approach is able to reconstruct a large fiel… ▽ More In this paper we present a novel method for the correction of motion artifacts that are present in fetal Magnetic Resonance Imaging (MRI) scans of the whole uterus. Contrary to current slice-to-volume registration (SVR) methods, requiring an inflexible anatomical enclosure of a single investigated organ, the proposed patch-to-volume reconstruction (PVR) approach is able to reconstruct a large field of view of non-rigidly deforming structures. It relaxes rigid motion assumptions by introducing a specific amount of redundant information that is exploited with parallelized patch-wise optimization, super-resolution, and automatic outlier rejection. We further describe and provide an efficient parallel implementation of PVR allowing its execution within reasonable time on commercially available graphics processing units (GPU), enabling its use in the clinical practice. We evaluate PVR's computational overhead compared to standard methods and observe improved reconstruction accuracy in presence of affine motion artifacts of approximately 30% compared to conventional SVR in synthetic experiments. Furthermore, we have evaluated our method qualitatively and quantitatively on real fetal MRI data subject to maternal breathing and sudden fetal movements. We evaluate peak-signal-to-noise ratio (PSNR), structural similarity index (SSIM), and cross correlation (CC) with respect to the originally acquired data and provide a method for visual inspection of reconstruction uncertainty. With these experiments we demonstrate successful application of PVR motion compensation to the whole uterus, the human fetus, and the human placenta. △ Less

Submitted 25 November, 2016; v1 submitted 22 November, 2016; originally announced November 2016.

Comments: 10 pages, 13 figures, submitted to IEEE Transactions on Medical Imaging. v2: wadded funders acknowledgements to preprint

ACM Class: I.2.10; I.4.3; D.1.3

arXiv:1611.04783 [pdf, other]

Comparison of Brain Networks with Unknown Correspondences

Authors: Sofia Ira Ktena, Sarah Parisot, Jonathan Passerat-Palmbach, Daniel Rueckert

Abstract: Graph theory has drawn a lot of attention in the field of Neuroscience during the last decade, mainly due to the abundance of tools that it provides to explore the interactions of elements in a complex network like the brain. The local and global organization of a brain network can shed light on mechanisms of complex cognitive functions, while disruptions within the network can be linked to neurod… ▽ More Graph theory has drawn a lot of attention in the field of Neuroscience during the last decade, mainly due to the abundance of tools that it provides to explore the interactions of elements in a complex network like the brain. The local and global organization of a brain network can shed light on mechanisms of complex cognitive functions, while disruptions within the network can be linked to neurodevelopmental disorders. In this effort, the construction of a representative brain network for each individual is critical for further analysis. Additionally, graph comparison is an essential step for inference and classification analyses on brain graphs. In this work we explore a method based on graph edit distance for evaluating graph similarity, when correspondences between network elements are unknown due to different underlying subdivisions of the brain. We test this method on 30 unrelated subjects as well as 40 twin pairs and show that this method can accurately reflect the higher similarity between two related networks compared to unrelated ones, while identifying node correspondences. △ Less

Submitted 15 November, 2016; originally announced November 2016.

Comments: Presented at The MICCAI-BACON 16 Workshop (https://arxiv.longhoe.net/abs/1611.03363)

Report number: BACON/2016/03

arXiv:1609.05158 [pdf, other]

Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network

Authors: Wenzhe Shi, Jose Caballero, Ferenc Huszár, Johannes Totz, Andrew P. Aitken, Rob Bishop, Daniel Rueckert, Zehan Wang

Abstract: Recently, several models based on deep neural networks have achieved great success in terms of both reconstruction accuracy and computational performance for single image super-resolution. In these methods, the low resolution (LR) input image is upscaled to the high resolution (HR) space using a single filter, commonly bicubic interpolation, before reconstruction. This means that the super-resolut… ▽ More Recently, several models based on deep neural networks have achieved great success in terms of both reconstruction accuracy and computational performance for single image super-resolution. In these methods, the low resolution (LR) input image is upscaled to the high resolution (HR) space using a single filter, commonly bicubic interpolation, before reconstruction. This means that the super-resolution (SR) operation is performed in HR space. We demonstrate that this is sub-optimal and adds computational complexity. In this paper, we present the first convolutional neural network (CNN) capable of real-time SR of 1080p videos on a single K2 GPU. To achieve this, we propose a novel CNN architecture where the feature maps are extracted in the LR space. In addition, we introduce an efficient sub-pixel convolution layer which learns an array of upscaling filters to upscale the final LR feature maps into the HR output. By doing so, we effectively replace the handcrafted bicubic filter in the SR pipeline with more complex upscaling filters specifically trained for each feature map, whilst also reducing the computational complexity of the overall SR operation. We evaluate the proposed approach using images and videos from publicly available datasets and show that it performs significantly better (+0.15dB on Images and +0.39dB on Videos) and is an order of magnitude faster than previous CNN-based methods. △ Less

Submitted 23 September, 2016; v1 submitted 16 September, 2016; originally announced September 2016.

Comments: CVPR 2016 paper with updated affiliations and supplemental material, fixed typo in equation 4

arXiv:1606.01100 [pdf, other]

Learning under Distributed Weak Supervision

Authors: Martin Rajchl, Matthew C. H. Lee, Franklin Schrans, Alice Davidson, Jonathan Passerat-Palmbach, Giacomo Tarroni, Amir Alansary, Ozan Oktay, Bernhard Kainz, Daniel Rueckert

Abstract: The availability of training data for supervision is a frequently encountered bottleneck of medical image analysis methods. While typically established by a clinical expert rater, the increase in acquired imaging data renders traditional pixel-wise segmentations less feasible. In this paper, we examine the use of a crowdsourcing platform for the distribution of super-pixel weak annotation tasks an… ▽ More The availability of training data for supervision is a frequently encountered bottleneck of medical image analysis methods. While typically established by a clinical expert rater, the increase in acquired imaging data renders traditional pixel-wise segmentations less feasible. In this paper, we examine the use of a crowdsourcing platform for the distribution of super-pixel weak annotation tasks and collect such annotations from a crowd of non-expert raters. The crowd annotations are subsequently used for training a fully convolutional neural network to address the problem of fetal brain segmentation in T2-weighted MR images. Using this approach we report encouraging results compared to highly targeted, fully supervised methods and potentially address a frequent problem impeding image analysis research. △ Less

Submitted 3 June, 2016; originally announced June 2016.

arXiv:1605.07866 [pdf, other]

DeepCut: Object Segmentation from Bounding Box Annotations using Convolutional Neural Networks

Authors: Martin Rajchl, Matthew C. H. Lee, Ozan Oktay, Konstantinos Kamnitsas, Jonathan Passerat-Palmbach, Wenjia Bai, Mellisa Damodaram, Mary A. Rutherford, Joseph V. Hajnal, Bernhard Kainz, Daniel Rueckert

Abstract: In this paper, we propose DeepCut, a method to obtain pixelwise object segmentations given an image dataset labelled with bounding box annotations. It extends the approach of the well-known GrabCut method to include machine learning by training a neural network classifier from bounding box annotations. We formulate the problem as an energy minimisation problem over a densely-connected conditional… ▽ More In this paper, we propose DeepCut, a method to obtain pixelwise object segmentations given an image dataset labelled with bounding box annotations. It extends the approach of the well-known GrabCut method to include machine learning by training a neural network classifier from bounding box annotations. We formulate the problem as an energy minimisation problem over a densely-connected conditional random field and iteratively update the training targets to obtain pixelwise object segmentations. Additionally, we propose variants of the DeepCut method and compare those to a naive approach to CNN training under weak supervision. We test its applicability to solve brain and lung segmentation problems on a challenging fetal magnetic resonance dataset and obtain encouraging results in terms of accuracy. △ Less

Submitted 5 June, 2016; v1 submitted 25 May, 2016; originally announced May 2016.

Showing 251–300 of 302 results for author: Rueckert, D