Search | arXiv e-print repository

doi 10.1117/12.2612200

An unsupervised learning-based shear wave tracking method for ultrasound elastography

Authors: Remi Delaunay, Yipeng Hu, Tom Vercauteren

Abstract: Shear wave elastography involves applying a non-invasive acoustic radiation force to the tissue and imaging the induced deformation to infer its mechanical properties. This work investigates the use of convolutional neural networks to improve displacement estimation accuracy in shear wave imaging. Our training approach is completely unsupervised, which allows to learn the estimation of the induced… ▽ More Shear wave elastography involves applying a non-invasive acoustic radiation force to the tissue and imaging the induced deformation to infer its mechanical properties. This work investigates the use of convolutional neural networks to improve displacement estimation accuracy in shear wave imaging. Our training approach is completely unsupervised, which allows to learn the estimation of the induced micro-scale deformations without ground truth labels. We also present an ultrasound simulation dataset where the shear wave propagation has been simulated via finite element method. Our dataset is made publicly available along with this paper, and consists in 150 shear wave propagation simulations in both homogenous and hetegeneous media, which represents a total of 20,000 ultrasound images. We assessed the ability of our learning-based approach to characterise tissue elastic properties (i.e., Young's modulus) on our dataset and compared our results with a classical normalised cross-correlation approach. △ Less

Submitted 25 April, 2024; originally announced April 2024.

Comments: Accepted to SPIE Medical Imaging 2022

arXiv:2311.04552 [pdf, other]

doi 10.1007/978-3-031-53767-7_13

A 3D generative model of pathological multi-modal MR images and segmentations

Authors: Virginia Fernandez, Walter Hugo Lopez Pinaya, Pedro Borges, Mark S. Graham, Tom Vercauteren, M. Jorge Cardoso

Abstract: Generative modelling and synthetic data can be a surrogate for real medical imaging datasets, whose scarcity and difficulty to share can be a nuisance when delivering accurate deep learning models for healthcare applications. In recent years, there has been an increased interest in using these models for data augmentation and synthetic data sharing, using architectures such as generative adversari… ▽ More Generative modelling and synthetic data can be a surrogate for real medical imaging datasets, whose scarcity and difficulty to share can be a nuisance when delivering accurate deep learning models for healthcare applications. In recent years, there has been an increased interest in using these models for data augmentation and synthetic data sharing, using architectures such as generative adversarial networks (GANs) or diffusion models (DMs). Nonetheless, the application of synthetic data to tasks such as 3D magnetic resonance imaging (MRI) segmentation remains limited due to the lack of labels associated with the generated images. Moreover, many of the proposed generative MRI models lack the ability to generate arbitrary modalities due to the absence of explicit contrast conditioning. These limitations prevent the user from adjusting the contrast and content of the images and obtaining more generalisable data for training task-specific models. In this work, we propose brainSPADE3D, a 3D generative model for brain MRI and associated segmentations, where the user can condition on specific pathological phenotypes and contrasts. The proposed joint imaging-segmentation generative model is shown to generate high-fidelity synthetic images and associated segmentations, with the ability to combine pathologies. We demonstrate how the model can alleviate issues with segmentation model performance when unexpected pathologies are present in the data. △ Less

Submitted 8 November, 2023; originally announced November 2023.

Comments: Accepted for publication at the 2023 Deep Generative Models (DGM4MICCAI) MICCAI workshop (Vancouver, Canada)

arXiv:2310.19392 [pdf, other]

doi 10.1117/12.3006526

A Clinical Guideline Driven Automated Linear Feature Extraction for Vestibular Schwannoma

Authors: Navodini Wijethilake, Steve Connor, Anna Oviedova, Rebecca Burger, Tom Vercauteren, Jonathan Shapey

Abstract: Vestibular Schwannoma is a benign brain tumour that grows from one of the balance nerves. Patients may be treated by surgery, radiosurgery or with a conservative "wait-and-scan" strategy. Clinicians typically use manually extracted linear measurements to aid clinical decision making. This work aims to automate and improve this process by using deep learning based segmentation to extract relevant c… ▽ More Vestibular Schwannoma is a benign brain tumour that grows from one of the balance nerves. Patients may be treated by surgery, radiosurgery or with a conservative "wait-and-scan" strategy. Clinicians typically use manually extracted linear measurements to aid clinical decision making. This work aims to automate and improve this process by using deep learning based segmentation to extract relevant clinical features through computational algorithms. To the best of our knowledge, our study is the first to propose an automated approach to replicate local clinical guidelines. Our deep learning based segmentation provided Dice-scores of 0.8124 +- 0.2343 and 0.8969 +- 0.0521 for extrameatal and whole tumour regions respectively for T2 weighted MRI, whereas 0.8222 +- 0.2108 and 0.9049 +- 0.0646 were obtained for T1 weighted MRI. We propose a novel algorithm to choose and extract the most appropriate maximum linear measurement from the segmented regions based on the size of the extrameatal portion of the tumour. Using this tool, clinicians will be provided with a visual guide and related metrics relating to tumour progression that will function as a clinical decision aid. In this study, we utilize 187 scans obtained from 50 patients referred to a tertiary specialist neurosurgical service in the United Kingdom. The measurements extracted manually by an expert neuroradiologist indicated a significant correlation with the automated measurements (p < 0.0001). △ Less

Submitted 30 October, 2023; originally announced October 2023.

Comments: SPIE Medical Imaging

arXiv:2310.10248 [pdf, other]

doi 10.1109/TBME.2023.3325551

Long-term Dependency for 3D Reconstruction of Freehand Ultrasound Without External Tracker

Authors: Qi Li, Ziyi Shen, Qian Li, Dean C. Barratt, Thomas Dowrick, Matthew J. Clarkson, Tom Vercauteren, Yipeng Hu

Abstract: Objective: Reconstructing freehand ultrasound in 3D without any external tracker has been a long-standing challenge in ultrasound-assisted procedures. We aim to define new ways of parameterising long-term dependencies, and evaluate the performance. Methods: First, long-term dependency is encoded by transformation positions within a frame sequence. This is achieved by combining a sequence model wit… ▽ More Objective: Reconstructing freehand ultrasound in 3D without any external tracker has been a long-standing challenge in ultrasound-assisted procedures. We aim to define new ways of parameterising long-term dependencies, and evaluate the performance. Methods: First, long-term dependency is encoded by transformation positions within a frame sequence. This is achieved by combining a sequence model with a multi-transformation prediction. Second, two dependency factors are proposed, anatomical image content and scanning protocol, for contributing towards accurate reconstruction. Each factor is quantified experimentally by reducing respective training variances. Results: 1) The added long-term dependency up to 400 frames at 20 frames per second (fps) indeed improved reconstruction, with an up to 82.4% lowered accumulated error, compared with the baseline performance. The improvement was found to be dependent on sequence length, transformation interval and scanning protocol and, unexpectedly, not on the use of recurrent networks with long-short term modules; 2) Decreasing either anatomical or protocol variance in training led to poorer reconstruction accuracy. Interestingly, greater performance was gained from representative protocol patterns, than from representative anatomical features. Conclusion: The proposed algorithm uses hyperparameter tuning to effectively utilise long-term dependency. The proposed dependency factors are of practical significance in collecting diverse training data, regulating scanning protocols and develo** efficient networks. Significance: The proposed new methodology with publicly available volunteer data and code for parametersing the long-term dependency, experimentally shown to be valid sources of performance improvement, which could potentially lead to better model development and practical optimisation of the reconstruction application. △ Less

Submitted 16 October, 2023; originally announced October 2023.

Comments: Accepted to IEEE Transactions on Biomedical Engineering (TBME, 2023)

arXiv:2309.02335 [pdf, ps, other]

doi 10.1109/IUS51837.2023.10308101

DEEPBEAS3D: Deep Learning and B-Spline Explicit Active Surfaces

Authors: Helena Williams, João Pedrosa, Muhammad Asad, Laura Cattani, Tom Vercauteren, Jan Deprest, Jan D'hooge

Abstract: Deep learning-based automatic segmentation methods have become state-of-the-art. However, they are often not robust enough for direct clinical application, as domain shifts between training and testing data affect their performance. Failure in automatic segmentation can cause sub-optimal results that require correction. To address these problems, we propose a novel 3D extension of an interactive s… ▽ More Deep learning-based automatic segmentation methods have become state-of-the-art. However, they are often not robust enough for direct clinical application, as domain shifts between training and testing data affect their performance. Failure in automatic segmentation can cause sub-optimal results that require correction. To address these problems, we propose a novel 3D extension of an interactive segmentation framework that represents a segmentation from a convolutional neural network (CNN) as a B-spline explicit active surface (BEAS). BEAS ensures segmentations are smooth in 3D space, increasing anatomical plausibility, while allowing the user to precisely edit the 3D surface. We apply this framework to the task of 3D segmentation of the anal sphincter complex (AS) from transperineal ultrasound (TPUS) images, and compare it to the clinical tool used in the pelvic floor disorder clinic (4D View VOCAL, GE Healthcare; Zipf, Austria). Experimental results show that: 1) the proposed framework gives the user explicit control of the surface contour; 2) the perceived workload calculated via the NASA-TLX index was reduced by 30% compared to VOCAL; and 3) it required 7 0% (170 seconds) less user time than VOCAL (p< 0.00001) △ Less

Submitted 5 September, 2023; originally announced September 2023.

Comments: 4 pages, 3 figures, 1 table, conference

arXiv:2307.12792 [pdf, other]

doi 10.1007/978-3-031-43996-4_21

Deep Homography Prediction for Endoscopic Camera Motion Imitation Learning

Authors: Martin Huber, Sebastien Ourselin, Christos Bergeles, Tom Vercauteren

Abstract: In this work, we investigate laparoscopic camera motion automation through imitation learning from retrospective videos of laparoscopic interventions. A novel method is introduced that learns to augment a surgeon's behavior in image space through object motion invariant image registration via homographies. Contrary to existing approaches, no geometric assumptions are made and no depth information… ▽ More In this work, we investigate laparoscopic camera motion automation through imitation learning from retrospective videos of laparoscopic interventions. A novel method is introduced that learns to augment a surgeon's behavior in image space through object motion invariant image registration via homographies. Contrary to existing approaches, no geometric assumptions are made and no depth information is necessary, enabling immediate translation to a robotic setup. Deviating from the dominant approach in the literature which consist of following a surgical tool, we do not handcraft the objective and no priors are imposed on the surgical scene, allowing the method to discover unbiased policies. In this new research field, significant improvements are demonstrated over two baselines on the Cholec80 and HeiChole datasets, showcasing an improvement of 47% over camera motion continuation. The method is further shown to indeed predict camera motion correctly on the public motion classification labels of the AutoLaparo dataset. All code is made accessible on GitHub. △ Less

Submitted 24 July, 2023; originally announced July 2023.

Comments: Early accepted at MICCAI 2023

arXiv:2307.12791 [pdf, other]

doi 10.1117/1.JMI.10.4.046001

Synthetic white balancing for intra-operative hyperspectral imaging

Authors: Anisha Bahl, Conor C. Horgan, Mirek Janatka, Oscar J. MacCormac, Philip Noonan, Yi**g Xie, Jianrong Qiu, Nicola Cavalcanti, Philipp Fürnstahl, Michael Ebner, Mads S. Bergholt, Jonathan Shapey, Tom Vercauteren

Abstract: Hyperspectral imaging shows promise for surgical applications to non-invasively provide spatially-resolved, spectral information. For calibration purposes, a white reference image of a highly-reflective Lambertian surface should be obtained under the same imaging conditions. Standard white references are not sterilizable, and so are unsuitable for surgical environments. We demonstrate the necessit… ▽ More Hyperspectral imaging shows promise for surgical applications to non-invasively provide spatially-resolved, spectral information. For calibration purposes, a white reference image of a highly-reflective Lambertian surface should be obtained under the same imaging conditions. Standard white references are not sterilizable, and so are unsuitable for surgical environments. We demonstrate the necessity for in situ white references and address this by proposing a novel, sterile, synthetic reference construction algorithm. The use of references obtained at different distances and lighting conditions to the subject were examined. Spectral and color reconstructions were compared with standard measurements qualitatively and quantitatively, using $ΔE$ and normalised RMSE respectively. The algorithm forms a composite image from a video of a standard sterile ruler, whose imperfect reflectivity is compensated for. The reference is modelled as the product of independent spatial and spectral components, and a scalar factor accounting for gain, exposure, and light intensity. Evaluation of synthetic references against ideal but non-sterile references is performed using the same metrics alongside pixel-by-pixel errors. Finally, intraoperative integration is assessed though cadaveric experiments. Improper white balancing leads to increases in all quantitative and qualitative errors. Synthetic references achieve median pixel-by-pixel errors lower than 6.5% and produce similar reconstructions and errors to an ideal reference. The algorithm integrated well into surgical workflow, achieving median pixel-by-pixel errors of 4.77%, while maintaining good spectral and color reconstruction. △ Less

Submitted 24 July, 2023; originally announced July 2023.

Comments: 22 pages, 10 figures

arXiv:2307.11638 [pdf, other]

doi 10.1007/978-3-031-43996-4_63

Deep Reinforcement Learning Based System for Intraoperative Hyperspectral Video Autofocusing

Authors: Charlie Budd, Jianrong Qiu, Oscar MacCormac, Martin Huber, Christopher Mower, Mirek Janatka, Théo Trotouin, Jonathan Shapey, Mads S. Bergholt, Tom Vercauteren

Abstract: Hyperspectral imaging (HSI) captures a greater level of spectral detail than traditional optical imaging, making it a potentially valuable intraoperative tool when precise tissue differentiation is essential. Hardware limitations of current optical systems used for handheld real-time video HSI result in a limited focal depth, thereby posing usability issues for integration of the technology into t… ▽ More Hyperspectral imaging (HSI) captures a greater level of spectral detail than traditional optical imaging, making it a potentially valuable intraoperative tool when precise tissue differentiation is essential. Hardware limitations of current optical systems used for handheld real-time video HSI result in a limited focal depth, thereby posing usability issues for integration of the technology into the operating room. This work integrates a focus-tunable liquid lens into a video HSI exoscope, and proposes novel video autofocusing methods based on deep reinforcement learning. A first-of-its-kind robotic focal-time scan was performed to create a realistic and reproducible testing dataset. We benchmarked our proposed autofocus algorithm against traditional policies, and found our novel approach to perform significantly ($p<0.05$) better than traditional techniques ($0.070\pm.098$ mean absolute focal error compared to $0.146\pm.148$). In addition, we performed a blinded usability trial by having two neurosurgeons compare the system with different autofocus policies, and found our novel approach to be the most favourable, making our system a desirable addition for intraoperative HSI. △ Less

Submitted 21 July, 2023; originally announced July 2023.

Comments: To be presented at MICCAI 2023

arXiv:2306.11034 [pdf, ps, other]

doi 10.1016/j.pacs.2024.100621

Learning-based sound speed estimation and aberration correction in linear-array photoacoustic imaging

Authors: Mengjie Shi, Tom Vercauteren, Wenfeng Xia

Abstract: Photoacoustic (PA) image reconstruction involves acoustic inversion that necessitates the specification of the speed of sound (SoS) within the medium of propagation. Due to the lack of information on the spatial distribution of the SoS within heterogeneous soft tissue, a homogeneous SoS distribution (such as 1540 m/s) is typically assumed in PA image reconstruction, similar to that of ultrasound (… ▽ More Photoacoustic (PA) image reconstruction involves acoustic inversion that necessitates the specification of the speed of sound (SoS) within the medium of propagation. Due to the lack of information on the spatial distribution of the SoS within heterogeneous soft tissue, a homogeneous SoS distribution (such as 1540 m/s) is typically assumed in PA image reconstruction, similar to that of ultrasound (US) imaging. Failure to compensate the SoS variations leads to aberration artefacts, deteriorating the image quality. Various methods have been proposed to address this issue, but they usually involve complex hardware and/or time-consuming algorithms, hindering clinical translation. In this work, we introduce a deep learning framework for SoS estimation and subsequent aberration correction in a dual-modal PA/US imaging system exploiting a clinical US probe. As the acquired PA and US images were inherently co-registered, the estimated SoS distribution from US channel data using a deep neural network was incorporated for accurate PA image reconstruction. The framework comprised an initial pre-training stage based on digital phantoms, which was further enhanced through transfer learning using physical phantom data and associated SoS maps obtained from measurements. This framework achieved a root mean square error of 10.2 m/s and 15.2 m/s for SoS estimation on digital and physical phantoms, respectively and structural similarity index measures of up to 0.86 for PA reconstructions as compared to the conventional approach of 0.69. A maximum of 1.2 times improvement in signal-to-noise ratio of PA images was further demonstrated with a human volunteer study. Our results show that the proposed framework could be valuable in various clinical and preclinical applications to enhance PA image reconstruction. △ Less

Submitted 5 March, 2024; v1 submitted 19 June, 2023; originally announced June 2023.

arXiv:2305.10655 [pdf, other]

doi 10.1007/978-3-031-17027-0_2

DeepEdit: Deep Editable Learning for Interactive Segmentation of 3D Medical Images

Authors: Andres Diaz-Pinto, Pritesh Mehta, Sachidanand Alle, Muhammad Asad, Richard Brown, Vishwesh Nath, Alvin Ihsani, Michela Antonelli, Daniel Palkovics, Csaba Pinter, Ron Alkalay, Steve Pieper, Holger R. Roth, Daguang Xu, Prerna Dogra, Tom Vercauteren, Andrew Feng, Abood Quraini, Sebastien Ourselin, M. Jorge Cardoso

Abstract: Automatic segmentation of medical images is a key step for diagnostic and interventional tasks. However, achieving this requires large amounts of annotated volumes, which can be tedious and time-consuming task for expert annotators. In this paper, we introduce DeepEdit, a deep learning-based method for volumetric medical image annotation, that allows automatic and semi-automatic segmentation, and… ▽ More Automatic segmentation of medical images is a key step for diagnostic and interventional tasks. However, achieving this requires large amounts of annotated volumes, which can be tedious and time-consuming task for expert annotators. In this paper, we introduce DeepEdit, a deep learning-based method for volumetric medical image annotation, that allows automatic and semi-automatic segmentation, and click-based refinement. DeepEdit combines the power of two methods: a non-interactive (i.e. automatic segmentation using nnU-Net, UNET or UNETR) and an interactive segmentation method (i.e. DeepGrow), into a single deep learning model. It allows easy integration of uncertainty-based ranking strategies (i.e. aleatoric and epistemic uncertainty computation) and active learning. We propose and implement a method for training DeepEdit by using standard training combined with user interaction simulation. Once trained, DeepEdit allows clinicians to quickly segment their datasets by using the algorithm in auto segmentation mode or by providing clicks via a user interface (i.e. 3D Slicer, OHIF). We show the value of DeepEdit through evaluation on the PROSTATEx dataset for prostate/prostatic lesions and the Multi-Atlas Labeling Beyond the Cranial Vault (BTCV) dataset for abdominal CT segmentation, using state-of-the-art network architectures as baseline for comparison. DeepEdit could reduce the time and effort annotating 3D medical images compared to DeepGrow alone. Source code is available at https://github.com/Project-MONAI/MONAILabel △ Less

Submitted 17 May, 2023; originally announced May 2023.

arXiv:2303.13696 [pdf, other]

doi 10.1007/978-3-031-43895-0_53

Adaptive Multi-scale Online Likelihood Network for AI-assisted Interactive Segmentation

Authors: Muhammad Asad, Helena Williams, Indrajeet Mandal, Sarim Ather, Jan Deprest, Jan D'hooge, Tom Vercauteren

Abstract: Existing interactive segmentation methods leverage automatic segmentation and user interactions for label refinement, significantly reducing the annotation workload compared to manual annotation. However, these methods lack quick adaptability to ambiguous and noisy data, which is a challenge in CT volumes containing lung lesions from COVID-19 patients. In this work, we propose an adaptive multi-sc… ▽ More Existing interactive segmentation methods leverage automatic segmentation and user interactions for label refinement, significantly reducing the annotation workload compared to manual annotation. However, these methods lack quick adaptability to ambiguous and noisy data, which is a challenge in CT volumes containing lung lesions from COVID-19 patients. In this work, we propose an adaptive multi-scale online likelihood network (MONet) that adaptively learns in a data-efficient online setting from both an initial automatic segmentation and user interactions providing corrections. We achieve adaptive learning by proposing an adaptive loss that extends the influence of user-provided interaction to neighboring regions with similar features. In addition, we propose a data-efficient probability-guided pruning method that discards uncertain and redundant labels in the initial segmentation to enable efficient online training and inference. Our proposed method was evaluated by an expert in a blinded comparative study on COVID-19 lung lesion annotation task in CT. Our approach achieved 5.86% higher Dice score with 24.67% less perceived NASA-TLX workload score than the state-of-the-art. Source code is available at: https://github.com/masadcv/MONet-MONAILabel △ Less

Submitted 24 September, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

arXiv:2303.10173 [pdf, other]

VideoSum: A Python Library for Surgical Video Summarization

Authors: Luis C. Garcia-Peraza-Herrera, Sebastien Ourselin, Tom Vercauteren

Abstract: The performance of deep learning (DL) algorithms is heavily influenced by the quantity and the quality of the annotated data. However, in Surgical Data Science, access to it is limited. It is thus unsurprising that substantial research efforts are made to develop methods aiming at mitigating the scarcity of annotated SDS data. In parallel, an increasing number of Computer Assisted Interventions (C… ▽ More The performance of deep learning (DL) algorithms is heavily influenced by the quantity and the quality of the annotated data. However, in Surgical Data Science, access to it is limited. It is thus unsurprising that substantial research efforts are made to develop methods aiming at mitigating the scarcity of annotated SDS data. In parallel, an increasing number of Computer Assisted Interventions (CAI) datasets are being released, although the scale of these remain limited. On these premises, data curation is becoming a key element of many SDS research endeavors. Surgical video datasets are demanding to curate and would benefit from dedicated support tools. In this work, we propose to summarize surgical videos into storyboards or collages of representative frames to ease visualization, annotation, and processing. Video summarization is well-established for natural images. However, state-of-the-art methods typically rely on models trained on human-made annotations, few methods have been evaluated on surgical videos, and the availability of software packages for the task is limited. We present videosum, an easy-to-use and open-source Python library to generate storyboards from surgical videos that contains a variety of unsupervised methods. △ Less

Submitted 14 July, 2023; v1 submitted 15 February, 2023; originally announced March 2023.

Comments: Camera-ready version accepted at CRAS 2023

arXiv:2303.08252 [pdf, other]

doi 10.1080/21681163.2022.2160377

Hyperspectral Image Segmentation: A Preliminary Study on the Oral and Dental Spectral Image Database (ODSI-DB)

Authors: Luis C. Garcia-Peraza-Herrera, Conor Horgan, Sebastien Ourselin, Michael Ebner, Tom Vercauteren

Abstract: Visual discrimination of clinical tissue types remains challenging, with traditional RGB imaging providing limited contrast for such tasks. Hyperspectral imaging (HSI) is a promising technology providing rich spectral information that can extend far beyond three-channel RGB imaging. Moreover, recently developed snapshot HSI cameras enable real-time imaging with significant potential for clinical a… ▽ More Visual discrimination of clinical tissue types remains challenging, with traditional RGB imaging providing limited contrast for such tasks. Hyperspectral imaging (HSI) is a promising technology providing rich spectral information that can extend far beyond three-channel RGB imaging. Moreover, recently developed snapshot HSI cameras enable real-time imaging with significant potential for clinical applications. Despite this, the investigation into the relative performance of HSI over RGB imaging for semantic segmentation purposes has been limited, particularly in the context of medical imaging. Here we compare the performance of state-of-the-art deep learning image segmentation methods when trained on hyperspectral images, RGB images, hyperspectral pixels (minus spatial context), and RGB pixels (disregarding spatial context). To achieve this, we employ the recently released Oral and Dental Spectral Image Database (ODSI-DB), which consists of 215 manually segmented dental reflectance spectral images with 35 different classes across 30 human subjects. The recent development of snapshot HSI cameras has made real-time clinical HSI a distinct possibility, though successful application requires a comprehensive understanding of the additional information HSI offers. Our work highlights the relative importance of spectral resolution, spectral range, and spatial information to both guide the development of HSI cameras and inform future clinical HSI applications. △ Less

Submitted 14 March, 2023; originally announced March 2023.

arXiv:2302.10927 [pdf, other]

doi 10.1007/s11548-023-02865-7

Spatial gradient consistency for unsupervised learning of hyperspectral demosaicking: Application to surgical imaging

Authors: Peichao Li, Muhammad Asad, Conor Horgan, Oscar MacCormac, Jonathan Shapey, Tom Vercauteren

Abstract: Hyperspectral imaging has the potential to improve intraoperative decision making if tissue characterisation is performed in real-time and with high-resolution. Hyperspectral snapshot mosaic sensors offer a promising approach due to their fast acquisition speed and compact size. However, a demosaicking algorithm is required to fully recover the spatial and spectral information of the snapshot imag… ▽ More Hyperspectral imaging has the potential to improve intraoperative decision making if tissue characterisation is performed in real-time and with high-resolution. Hyperspectral snapshot mosaic sensors offer a promising approach due to their fast acquisition speed and compact size. However, a demosaicking algorithm is required to fully recover the spatial and spectral information of the snapshot images. Most state-of-the-art demosaicking algorithms require ground-truth training data with paired snapshot and high-resolution hyperspectral images, but such imagery pairs with the exact same scene are physically impossible to acquire in intraoperative settings. In this work, we present a fully unsupervised hyperspectral image demosaicking algorithm which only requires exemplar snapshot images for training purposes. We regard hyperspectral demosaicking as an ill-posed linear inverse problem which we solve using a deep neural network. We take advantage of the spectral correlation occurring in natural scenes to design a novel inter spectral band regularisation term based on spatial gradient consistency. By combining our proposed term with standard regularisation techniques and exploiting a standard data fidelity term, we obtain an unsupervised loss function for training deep neural networks, which allows us to achieve real-time hyperspectral image demosaicking. Quantitative results on hyperspetral image datasets show that our unsupervised demosaicking approach can achieve similar performance to its supervised counter-part, and significantly outperform linear demosaicking. A qualitative user study on real snapshot hyperspectral surgical images confirms the results from the quantitative analysis. Our results suggest that the proposed unsupervised algorithm can achieve promising hyperspectral demosaicking in real-time thus advancing the suitability of the modality for intraoperative use. △ Less

Submitted 21 February, 2023; originally announced February 2023.

Journal ref: International Journal of Computer Assisted Radiology and Surgery, 2023

arXiv:2211.04867 [pdf, other]

doi 10.1109/ISBI53787.2023.10230773

Trackerless freehand ultrasound with sequence modelling and auxiliary transformation over past and future frames

Authors: Qi Li, Ziyi Shen, Qian Li, Dean C Barratt, Thomas Dowrick, Matthew J Clarkson, Tom Vercauteren, Yipeng Hu

Abstract: Three-dimensional (3D) freehand ultrasound (US) reconstruction without a tracker can be advantageous over its two-dimensional or tracked counterparts in many clinical applications. In this paper, we propose to estimate 3D spatial transformation between US frames from both past and future 2D images, using feed-forward and recurrent neural networks (RNNs). With the temporally available frames, a fur… ▽ More Three-dimensional (3D) freehand ultrasound (US) reconstruction without a tracker can be advantageous over its two-dimensional or tracked counterparts in many clinical applications. In this paper, we propose to estimate 3D spatial transformation between US frames from both past and future 2D images, using feed-forward and recurrent neural networks (RNNs). With the temporally available frames, a further multi-task learning algorithm is proposed to utilise a large number of auxiliary transformation-predicting tasks between them. Using more than 40,000 US frames acquired from 228 scans on 38 forearms of 19 volunteers in a volunteer study, the hold-out test performance is quantified by frame prediction accuracy, volume reconstruction overlap, accumulated tracking error and final drift, based on ground-truth from an optical tracker. The results show the importance of modelling the temporal-spatially correlated input frames as well as output transformations, with further improvement owing to additional past and/or future frames. The best performing model was associated with predicting transformation between moderately-spaced frames, with an interval of less than ten frames at 20 frames per second (fps). Little benefit was observed by adding frames more than one second away from the predicted transformation, with or without LSTM-based RNNs. Interestingly, with the proposed approach, explicit within-sequence loss that encourages consistency in composing transformations or minimises accumulated error may no longer be required. The implementation code and volunteer data will be made publicly available ensuring reproducibility and further research. △ Less

Submitted 4 February, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

Comments: Accepted to IEEE International Symposium on Biomedical Imaging (ISBI) 2023

arXiv:2208.04680 [pdf, other]

doi 10.1007/978-3-031-17899-3_8

Boundary Distance Loss for Intra-/Extra-meatal Segmentation of Vestibular Schwannoma

Authors: Navodini Wijethilake, Aaron Kujawa, Reuben Dorent, Muhammad Asad, Anna Oviedova, Tom Vercauteren, Jonathan Shapey

Abstract: Vestibular Schwannoma (VS) typically grows from the inner ear to the brain. It can be separated into two regions, intrameatal and extrameatal respectively corresponding to being inside or outside the inner ear canal. The growth of the extrameatal regions is a key factor that determines the disease management followed by the clinicians. In this work, a VS segmentation approach with subdivision into… ▽ More Vestibular Schwannoma (VS) typically grows from the inner ear to the brain. It can be separated into two regions, intrameatal and extrameatal respectively corresponding to being inside or outside the inner ear canal. The growth of the extrameatal regions is a key factor that determines the disease management followed by the clinicians. In this work, a VS segmentation approach with subdivision into intra-/extra-meatal parts is presented. We annotated a dataset consisting of 227 T2 MRI instances, acquired longitudinally on 137 patients, excluding post-operative instances. We propose a staged approach, with the first stage performing the whole tumour segmentation and the second stage performing the intra-/extra-meatal segmentation using the T2 MRI along with the mask obtained from the first stage. To improve on the accuracy of the predicted meatal boundary, we introduce a task-specific loss which we call Boundary Distance Loss. The performance is evaluated in contrast to the direct intrameatal extrameatal segmentation task performance, i.e. the Baseline. Our proposed method, with the two-stage approach and the Boundary Distance Loss, achieved a Dice score of 0.8279+-0.2050 and 0.7744+-0.1352 for extrameatal and intrameatal regions respectively, significantly improving over the Baseline, which gave Dice score of 0.7939+-0.2325 and 0.7475+-0.1346 for the extrameatal and intrameatal regions respectively. △ Less

Submitted 9 August, 2022; originally announced August 2022.

Comments: Accepted for the MICCAI MLCN workshop 2022

arXiv:2208.00001 [pdf, other]

doi 10.21105/joss.04532

FastGeodis: Fast Generalised Geodesic Distance Transform

Authors: Muhammad Asad, Reuben Dorent, Tom Vercauteren

Abstract: The FastGeodis package provides an efficient implementation for computing Geodesic and Euclidean distance transforms (or a mixture of both), targeting efficient utilisation of CPU and GPU hardware. In particular, it implements the paralellisable raster scan method from Criminisi et al. (2009), where elements in a row (2D) or plane (3D) can be computed with parallel threads. This package is able to… ▽ More The FastGeodis package provides an efficient implementation for computing Geodesic and Euclidean distance transforms (or a mixture of both), targeting efficient utilisation of CPU and GPU hardware. In particular, it implements the paralellisable raster scan method from Criminisi et al. (2009), where elements in a row (2D) or plane (3D) can be computed with parallel threads. This package is able to handle 2D as well as 3D data, where it achieves up to a 20x speedup on a CPU and up to a 74x speedup on a GPU as compared to an existing open-source library (Wang, 2020) that uses a non-parallelisable single-thread CPU implementation. The performance speedups reported here were evaluated using 3D volume data on an Nvidia GeForce Titan X (12 GB) with a 6-Core Intel Xeon E5-1650 CPU. Further in-depth comparison of performance improvements are discussed in the FastGeodis documentation: https://fastgeodis.readthedocs.io △ Less

Submitted 23 November, 2022; v1 submitted 26 July, 2022; originally announced August 2022.

Comments: Accepted at Journal of Open Source Software (JOSS)

arXiv:2207.12901 [pdf, other]

doi 10.1109/TMI.2022.3187873

Cross-Modality Image Registration using a Training-Time Privileged Third Modality

Authors: Qianye Yang, David Atkinson, Yunguan Fu, Tom Syer, Wen Yan, Shonit Punwani, Matthew J. Clarkson, Dean C. Barratt, Tom Vercauteren, Yipeng Hu

Abstract: In this work, we consider the task of pairwise cross-modality image registration, which may benefit from exploiting additional images available only at training time from an additional modality that is different to those being registered. As an example, we focus on aligning intra-subject multiparametric Magnetic Resonance (mpMR) images, between T2-weighted (T2w) scans and diffusion-weighted scans… ▽ More In this work, we consider the task of pairwise cross-modality image registration, which may benefit from exploiting additional images available only at training time from an additional modality that is different to those being registered. As an example, we focus on aligning intra-subject multiparametric Magnetic Resonance (mpMR) images, between T2-weighted (T2w) scans and diffusion-weighted scans with high b-value (DWI$_{high-b}$). For the application of localising tumours in mpMR images, diffusion scans with zero b-value (DWI$_{b=0}$) are considered easier to register to T2w due to the availability of corresponding features. We propose a learning from privileged modality algorithm, using a training-only imaging modality DWI$_{b=0}$, to support the challenging multi-modality registration problems. We present experimental results based on 369 sets of 3D multiparametric MRI images from 356 prostate cancer patients and report, with statistical significance, a lowered median target registration error of 4.34 mm, when registering the holdout DWI$_{high-b}$ and T2w image pairs, compared with that of 7.96 mm before registration. Results also show that the proposed learning-based registration networks enabled efficient registration with comparable or better accuracy, compared with a classical iterative algorithm and other tested learning-based methods with/without the additional modality. These compared algorithms also failed to produce any significantly improved alignment between DWI$_{high-b}$ and T2w in this challenging application. △ Less

Submitted 26 July, 2022; originally announced July 2022.

Comments: Accepted by IEEE Transactions on Medical Imaging (TMI, 2022)

arXiv:2207.04229 [pdf]

doi 10.1364/BOE.471198

Spatiotemporal singular value decomposition for denoising in photoacoustic imaging with low-energy excitation light source

Authors: Mengjie Shi, Tom Vercauteren, Wenfeng Xia

Abstract: Photoacoustic (PA) imaging is an emerging hybrid imaging modality that combines rich optical spectroscopic contrast and high ultrasonic resolution and thus holds tremendous promise for a wide range of pre-clinical and clinical applications. Compact and affordable light sources such as light-emitting diodes (LEDs) and laser diodes (LDs) are promising alternatives to bulky and expensive solid-state… ▽ More Photoacoustic (PA) imaging is an emerging hybrid imaging modality that combines rich optical spectroscopic contrast and high ultrasonic resolution and thus holds tremendous promise for a wide range of pre-clinical and clinical applications. Compact and affordable light sources such as light-emitting diodes (LEDs) and laser diodes (LDs) are promising alternatives to bulky and expensive solid-state laser systems that are commonly used as PA light sources. These could accelerate the clinical translation of PA technology. However, PA signals generated with these light sources are readily degraded by noise due to the low optical fluence, leading to decreased signal-to-noise ratio (SNR) in PA images. In this work, a spatiotemporal singular value decomposition (SVD) based PA denoising method was investigated for these light sources that usually have low fluence and high repetition rates. The proposed method leverages both spatial and temporal correlations between radiofrequency (RF) data frames. Validation was performed on simulations and in vivo PA data acquired from human fingers (2D) and forearm (3D) using a LED-based system. Spatiotemporal SVD greatly enhanced the PA signals of blood vessels corrupted by noise while preserving a high temporal resolution to slow motions, improving the SNR of in vivo PA images by 1.1, 0.7, and 1.9 times compared to single frame-based wavelet denoising, averaging across 200 frames, and single frame without denoising, respectively. The proposed method demonstrated a processing time of around 50 \mus per frame with SVD acceleration and GPU. Thus, spatiotemporal SVD is well suited to PA imaging systems with low-energy excitation light sources for real-time in vivo applications. △ Less

Submitted 9 July, 2022; originally announced July 2022.

arXiv:2205.10355 [pdf, other]

Deep Quality Estimation: Creating Surrogate Models for Human Quality Ratings

Authors: Florian Kofler, Ivan Ezhov, Lucas Fidon, Izabela Horvath, Ezequiel de la Rosa, John LaMaster, Hongwei Li, Tom Finck, Suprosanna Shit, Johannes Paetzold, Spyridon Bakas, Marie Piraud, Jan Kirschke, Tom Vercauteren, Claus Zimmer, Benedikt Wiestler, Bjoern Menze

Abstract: Human ratings are abstract representations of segmentation quality. To approximate human quality ratings on scarce expert data, we train surrogate quality estimation models. We evaluate on a complex multi-class segmentation problem, specifically glioma segmentation, following the BraTS annotation protocol. The training data features quality ratings from 15 expert neuroradiologists on a scale rangi… ▽ More Human ratings are abstract representations of segmentation quality. To approximate human quality ratings on scarce expert data, we train surrogate quality estimation models. We evaluate on a complex multi-class segmentation problem, specifically glioma segmentation, following the BraTS annotation protocol. The training data features quality ratings from 15 expert neuroradiologists on a scale ranging from 1 to 6 stars for various computer-generated and manual 3D annotations. Even though the networks operate on 2D images and with scarce training data, we can approximate segmentation quality within a margin of error comparable to human intra-rater reliability. Segmentation quality prediction has broad applications. While an understanding of segmentation quality is imperative for successful clinical translation of automatic segmentation quality algorithms, it can play an essential role in training new segmentation models. Due to the split-second inference times, it can be directly applied within a loss function or as a fully-automatic dataset curation mechanism in a federated learning setting. △ Less

Submitted 30 August, 2022; v1 submitted 17 May, 2022; originally announced May 2022.

Comments: 10 pages, 5 figures

arXiv:2205.08209 [pdf, other]

doi 10.1007/978-3-031-34048-2_58

blob loss: instance imbalance aware loss functions for semantic segmentation

Authors: Florian Kofler, Suprosanna Shit, Ivan Ezhov, Lucas Fidon, Izabela Horvath, Rami Al-Maskari, Hongwei Li, Harsharan Bhatia, Timo Loehr, Marie Piraud, Ali Erturk, Jan Kirschke, Jan C. Peeken, Tom Vercauteren, Claus Zimmer, Benedikt Wiestler, Bjoern Menze

Abstract: Deep convolutional neural networks (CNN) have proven to be remarkably effective in semantic segmentation tasks. Most popular loss functions were introduced targeting improved volumetric scores, such as the Dice coefficient (DSC). By design, DSC can tackle class imbalance, however, it does not recognize instance imbalance within a class. As a result, a large foreground instance can dominate minor i… ▽ More Deep convolutional neural networks (CNN) have proven to be remarkably effective in semantic segmentation tasks. Most popular loss functions were introduced targeting improved volumetric scores, such as the Dice coefficient (DSC). By design, DSC can tackle class imbalance, however, it does not recognize instance imbalance within a class. As a result, a large foreground instance can dominate minor instances and still produce a satisfactory DSC. Nevertheless, detecting tiny instances is crucial for many applications, such as disease monitoring. For example, it is imperative to locate and surveil small-scale lesions in the follow-up of multiple sclerosis patients. We propose a novel family of loss functions, \emph{blob loss}, primarily aimed at maximizing instance-level detection metrics, such as F1 score and sensitivity. \emph{Blob loss} is designed for semantic segmentation problems where detecting multiple instances matters. We extensively evaluate a DSC-based \emph{blob loss} in five complex 3D semantic segmentation tasks featuring pronounced instance heterogeneity in terms of texture and morphology. Compared to soft Dice loss, we achieve 5% improvement for MS lesions, 3% improvement for liver tumor, and an average 2% improvement for microscopy segmentation tasks considering F1 score. △ Less

Submitted 6 June, 2023; v1 submitted 17 May, 2022; originally announced May 2022.

Comments: 23 pages, 7 figures // corrected one mistake where it said beta instead of alpha in the text

arXiv:2205.03122 [pdf]

doi 10.1364/BOE.463057

Ultrathin, high-speed, all-optical photoacoustic endomicroscopy probe for guiding minimally invasive surgery

Authors: Tianrui Zhao, Truc Thuy Pham, Christian Baker, Michelle T. Ma, Sebastien Ourselin, Tom Vercauteren, Edward Zhang, Paul C. Beard, Wenfeng Xia

Abstract: Photoacoustic (PA) endoscopy has shown significant potential for clinical diagnosis and surgical guidance. Multimode fibres (MMFs) are becoming increasing attractive for the development of miniature endoscopy probes owing to ultrathin size, low cost and diffraction-limited spatial resolution enabled by wavefront sha**. However, current MMF-based PA endomicroscopy probes are either limited by a b… ▽ More Photoacoustic (PA) endoscopy has shown significant potential for clinical diagnosis and surgical guidance. Multimode fibres (MMFs) are becoming increasing attractive for the development of miniature endoscopy probes owing to ultrathin size, low cost and diffraction-limited spatial resolution enabled by wavefront sha**. However, current MMF-based PA endomicroscopy probes are either limited by a bulky ultrasound detector or a low imaging speed which hindered their usability. In this work, we report the development of a highly miniaturised and high-speed PA endomicroscopy probe that is integrated within the cannula of a 20 gauge medical needle. This probe comprises a MMF for delivering the PA excitation light and a single-mode optical fibre with a plano-concave microresonator for ultrasound detection. Wavefront sha** with a digital micromirror device enabled rapid raster-scanning of a focused light spot at the distal end of the MMF for tissue interrogation. High-resolution PA imaging of mouse red blood cells covering an area 100 microns in diameter was achieved with the needle probe at ~3 frames per second. Mosaicing imaging was performed after fibre characterisation by translating the needle probe to enlarge the field-of-view in real-time. The developed ultrathin PA endomicroscopy probe is promising for guiding minimally invasive surgery by providing functional, molecular and microstructural information of tissue in real-time. △ Less

Submitted 6 May, 2022; originally announced May 2022.

arXiv:2204.09573 [pdf]

doi 10.1016/j.media.2023.102833

Fetal Brain Tissue Annotation and Segmentation Challenge Results

Authors: Kelly Payette, Hongwei Li, Priscille de Dumast, Roxane Licandro, Hui Ji, Md Mahfuzur Rahman Siddiquee, Daguang Xu, Andriy Myronenko, Hao Liu, Yuchen Pei, Lisheng Wang, Ying Peng, Juanying Xie, Huiquan Zhang, Guiming Dong, Hao Fu, Guotai Wang, ZunHyan Rieu, Donghyeon Kim, Hyun Gi Kim, Davood Karimi, Ali Gholipour, Helena R. Torres, Bruno Oliveira, João L. Vilaça , et al. (33 additional authors not shown)

Abstract: In-utero fetal MRI is emerging as an important tool in the diagnosis and analysis of the develo** human brain. Automatic segmentation of the develo** fetal brain is a vital step in the quantitative analysis of prenatal neurodevelopment both in the research and clinical context. However, manual segmentation of cerebral structures is time-consuming and prone to error and inter-observer variabili… ▽ More In-utero fetal MRI is emerging as an important tool in the diagnosis and analysis of the develo** human brain. Automatic segmentation of the develo** fetal brain is a vital step in the quantitative analysis of prenatal neurodevelopment both in the research and clinical context. However, manual segmentation of cerebral structures is time-consuming and prone to error and inter-observer variability. Therefore, we organized the Fetal Tissue Annotation (FeTA) Challenge in 2021 in order to encourage the development of automatic segmentation algorithms on an international level. The challenge utilized FeTA Dataset, an open dataset of fetal brain MRI reconstructions segmented into seven different tissues (external cerebrospinal fluid, grey matter, white matter, ventricles, cerebellum, brainstem, deep grey matter). 20 international teams participated in this challenge, submitting a total of 21 algorithms for evaluation. In this paper, we provide a detailed analysis of the results from both a technical and clinical perspective. All participants relied on deep learning methods, mainly U-Nets, with some variability present in the network architecture, optimization, and image pre- and post-processing. The majority of teams used existing medical imaging deep learning frameworks. The main differences between the submissions were the fine tuning done during training, and the specific pre- and post-processing steps performed. The challenge results showed that almost all submissions performed similarly. Four of the top five teams used ensemble learning methods. However, one team's algorithm performed significantly superior to the other submissions, and consisted of an asymmetrical U-Net network architecture. This paper provides a first of its kind benchmark for future automatic multi-tissue segmentation algorithms for the develo** human brain in utero. △ Less

Submitted 20 April, 2022; originally announced April 2022.

Comments: Results from FeTA Challenge 2021, held at MICCAI; Manuscript submitted

arXiv:2204.02779 [pdf, other]

doi 10.1109/TPAMI.2023.3346330

A Dempster-Shafer approach to trustworthy AI with application to fetal brain MRI segmentation

Authors: Lucas Fidon, Michael Aertsen, Florian Kofler, Andrea Bink, Anna L. David, Thomas Deprest, Doaa Emam, Frédéric Guffens, András Jakab, Gregor Kasprian, Patric Kienast, Andrew Melbourne, Bjoern Menze, Nada Mufti, Ivana Pogledic, Daniela Prayer, Marlene Stuempflen, Esther Van Elslander, Sébastien Ourselin, Jan Deprest, Tom Vercauteren

Abstract: Deep learning models for medical image segmentation can fail unexpectedly and spectacularly for pathological cases and images acquired at different centers than training images, with labeling errors that violate expert knowledge. Such errors undermine the trustworthiness of deep learning models for medical image segmentation. Mechanisms for detecting and correcting such failures are essential for… ▽ More Deep learning models for medical image segmentation can fail unexpectedly and spectacularly for pathological cases and images acquired at different centers than training images, with labeling errors that violate expert knowledge. Such errors undermine the trustworthiness of deep learning models for medical image segmentation. Mechanisms for detecting and correcting such failures are essential for safely translating this technology into clinics and are likely to be a requirement of future regulations on artificial intelligence (AI). In this work, we propose a trustworthy AI theoretical framework and a practical system that can augment any backbone AI system using a fallback method and a fail-safe mechanism based on Dempster-Shafer theory. Our approach relies on an actionable definition of trustworthy AI. Our method automatically discards the voxel-level labeling predicted by the backbone AI that violate expert knowledge and relies on a fallback for those voxels. We demonstrate the effectiveness of the proposed trustworthy AI approach on the largest reported annotated dataset of fetal MRI consisting of 540 manually annotated fetal brain 3D T2w MRIs from 13 centers. Our trustworthy AI method improves the robustness of a state-of-the-art backbone AI for fetal brain MRIs acquired across various centers and for fetuses with various brain abnormalities. △ Less

Submitted 17 January, 2024; v1 submitted 5 April, 2022; originally announced April 2022.

Comments: Published in IEEE TPAMI. Minor revision compared to the previous version

arXiv:2203.12362 [pdf, other]

doi 10.1016/j.media.2024.103207

MONAI Label: A framework for AI-assisted Interactive Labeling of 3D Medical Images

Authors: Andres Diaz-Pinto, Sachidanand Alle, Vishwesh Nath, Yucheng Tang, Alvin Ihsani, Muhammad Asad, Fernando Pérez-García, Pritesh Mehta, Wenqi Li, Mona Flores, Holger R. Roth, Tom Vercauteren, Daguang Xu, Prerna Dogra, Sebastien Ourselin, Andrew Feng, M. Jorge Cardoso

Abstract: The lack of annotated datasets is a major bottleneck for training new task-specific supervised machine learning models, considering that manual annotation is extremely expensive and time-consuming. To address this problem, we present MONAI Label, a free and open-source framework that facilitates the development of applications based on artificial intelligence (AI) models that aim at reducing the t… ▽ More The lack of annotated datasets is a major bottleneck for training new task-specific supervised machine learning models, considering that manual annotation is extremely expensive and time-consuming. To address this problem, we present MONAI Label, a free and open-source framework that facilitates the development of applications based on artificial intelligence (AI) models that aim at reducing the time required to annotate radiology datasets. Through MONAI Label, researchers can develop AI annotation applications focusing on their domain of expertise. It allows researchers to readily deploy their apps as services, which can be made available to clinicians via their preferred user interface. Currently, MONAI Label readily supports locally installed (3D Slicer) and web-based (OHIF) frontends and offers two active learning strategies to facilitate and speed up the training of segmentation algorithms. MONAI Label allows researchers to make incremental improvements to their AI-based annotation application by making them available to other researchers and clinicians alike. Additionally, MONAI Label provides sample AI-based interactive and non-interactive labeling applications, that can be used directly off the shelf, as plug-and-play to any given dataset. Significant reduced annotation times using the interactive model can be observed on two public datasets. △ Less

Submitted 28 April, 2023; v1 submitted 23 March, 2022; originally announced March 2022.

arXiv:2201.04584 [pdf, other]

ECONet: Efficient Convolutional Online Likelihood Network for Scribble-based Interactive Segmentation

Authors: Muhammad Asad, Lucas Fidon, Tom Vercauteren

Abstract: Automatic segmentation of lung lesions associated with COVID-19 in CT images requires large amount of annotated volumes. Annotations mandate expert knowledge and are time-intensive to obtain through fully manual segmentation methods. Additionally, lung lesions have large inter-patient variations, with some pathologies having similar visual appearance as healthy lung tissues. This poses a challenge… ▽ More Automatic segmentation of lung lesions associated with COVID-19 in CT images requires large amount of annotated volumes. Annotations mandate expert knowledge and are time-intensive to obtain through fully manual segmentation methods. Additionally, lung lesions have large inter-patient variations, with some pathologies having similar visual appearance as healthy lung tissues. This poses a challenge when applying existing semi-automatic interactive segmentation techniques for data labelling. To address these challenges, we propose an efficient convolutional neural networks (CNNs) that can be learned online while the annotator provides scribble-based interaction. To accelerate learning from only the samples labelled through user-interactions, a patch-based approach is used for training the network. Moreover, we use weighted cross-entropy loss to address the class imbalance that may result from user-interactions. During online inference, the learned network is applied to the whole input volume using a fully convolutional approach. We compare our proposed method with state-of-the-art using synthetic scribbles and show that it outperforms existing methods on the task of annotating lung lesions associated with COVID-19, achieving 16% higher Dice score while reducing execution time by 3$\times$ and requiring 9000 lesser scribbles-based labelled voxels. Due to the online learning aspect, our approach adapts quickly to user input, resulting in high quality segmentation labels. Source code for ECONet is available at: https://github.com/masadcv/ECONet-MONAILabel. △ Less

Submitted 29 March, 2022; v1 submitted 12 January, 2022; originally announced January 2022.

Comments: Accepted at MIDL 2022

Report number: https://proceedings.mlr.press/v172/asad22a.html

arXiv:2201.02831 [pdf, other]

doi 10.1016/j.media.2022.102628

CrossMoDA 2021 challenge: Benchmark of Cross-Modality Domain Adaptation techniques for Vestibular Schwannoma and Cochlea Segmentation

Authors: Reuben Dorent, Aaron Kujawa, Marina Ivory, Spyridon Bakas, Nicola Rieke, Samuel Joutard, Ben Glocker, Jorge Cardoso, Marc Modat, Kayhan Batmanghelich, Arseniy Belkov, Maria Baldeon Calisto, Jae Won Choi, Benoit M. Dawant, Hexin Dong, Sergio Escalera, Yubo Fan, Lasse Hansen, Mattias P. Heinrich, Smriti Joshi, Victoriya Kashtanova, Hyeon Gyu Kim, Satoshi Kondo, Christian N. Kruse, Susana K. Lai-Yuen , et al. (15 additional authors not shown)

Abstract: Domain Adaptation (DA) has recently raised strong interests in the medical imaging community. While a large variety of DA techniques has been proposed for image segmentation, most of these techniques have been validated either on private datasets or on small publicly available datasets. Moreover, these datasets mostly addressed single-class problems. To tackle these limitations, the Cross-Modality… ▽ More Domain Adaptation (DA) has recently raised strong interests in the medical imaging community. While a large variety of DA techniques has been proposed for image segmentation, most of these techniques have been validated either on private datasets or on small publicly available datasets. Moreover, these datasets mostly addressed single-class problems. To tackle these limitations, the Cross-Modality Domain Adaptation (crossMoDA) challenge was organised in conjunction with the 24th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2021). CrossMoDA is the first large and multi-class benchmark for unsupervised cross-modality DA. The challenge's goal is to segment two key brain structures involved in the follow-up and treatment planning of vestibular schwannoma (VS): the VS and the cochleas. Currently, the diagnosis and surveillance in patients with VS are performed using contrast-enhanced T1 (ceT1) MRI. However, there is growing interest in using non-contrast sequences such as high-resolution T2 (hrT2) MRI. Therefore, we created an unsupervised cross-modality segmentation benchmark. The training set provides annotated ceT1 (N=105) and unpaired non-annotated hrT2 (N=105). The aim was to automatically perform unilateral VS and bilateral cochlea segmentation on hrT2 as provided in the testing set (N=137). A total of 16 teams submitted their algorithm for the evaluation phase. The level of performance reached by the top-performing teams is strikingly high (best median Dice - VS:88.4%; Cochleas:85.7%) and close to full supervision (median Dice - VS:92.5%; Cochleas:87.7%). All top-performing methods made use of an image-to-image translation approach to transform the source-domain images into pseudo-target-domain images. A segmentation network was then trained using these generated images and the manual annotations provided for the source image. △ Less

Submitted 14 December, 2022; v1 submitted 8 January, 2022; originally announced January 2022.

Comments: In Medical Image Analysis

arXiv:2112.13054 [pdf, other]

doi 10.1007/978-3-031-09002-8_17

Generalized Wasserstein Dice Loss, Test-time Augmentation, and Transformers for the BraTS 2021 challenge

Authors: Lucas Fidon, Suprosanna Shit, Ivan Ezhov, Johannes C. Paetzold, Sébastien Ourselin, Tom Vercauteren

Abstract: Brain tumor segmentation from multiple Magnetic Resonance Imaging (MRI) modalities is a challenging task in medical image computation. The main challenges lie in the generalizability to a variety of scanners and imaging protocols. In this paper, we explore strategies to increase model robustness without increasing inference time. Towards this aim, we explore finding a robust ensemble from models t… ▽ More Brain tumor segmentation from multiple Magnetic Resonance Imaging (MRI) modalities is a challenging task in medical image computation. The main challenges lie in the generalizability to a variety of scanners and imaging protocols. In this paper, we explore strategies to increase model robustness without increasing inference time. Towards this aim, we explore finding a robust ensemble from models trained using different losses, optimizers, and train-validation data split. Importantly, we explore the inclusion of a transformer in the bottleneck of the U-Net architecture. While we find transformer in the bottleneck performs slightly worse than the baseline U-Net in average, the generalized Wasserstein Dice loss consistently produces superior results. Further, we adopt an efficient test time augmentation strategy for faster and robust inference. Our final ensemble of seven 3D U-Nets with test-time augmentation produces an average dice score of 89.4% and an average Hausdorff 95% distance of 10.0 mm when evaluated on the BraTS 2021 testing dataset. Our code and trained models are publicly available at https://github.com/LucasFidon/TRABIT_BraTS2021. △ Less

Submitted 24 December, 2021; originally announced December 2021.

Comments: BraTS 2021 challenge

arXiv:2111.07673 [pdf, ps, other]

doi 10.1016/j.pacs.2022.100351

Improving needle visibility in LED-based photoacoustic imaging using deep learning with semi-synthetic datasets

Authors: Mengjie Shi, Tianrui Zhao, Simeon J. West, Adrien E. Desjardins, Tom Vercauteren, Wenfeng Xia

Abstract: Photoacoustic imaging has shown great potential for guiding minimally invasive procedures by accurate identification of critical tissue targets and invasive medical devices (such as metallic needles). The use of light emitting diodes (LEDs) as the excitation light sources accelerates its clinical translation owing to its high affordability and portability. However, needle visibility in LED-based p… ▽ More Photoacoustic imaging has shown great potential for guiding minimally invasive procedures by accurate identification of critical tissue targets and invasive medical devices (such as metallic needles). The use of light emitting diodes (LEDs) as the excitation light sources accelerates its clinical translation owing to its high affordability and portability. However, needle visibility in LED-based photoacoustic imaging is compromised primarily due to its low optical fluence. In this work, we propose a deep learning framework based on U-Net to improve the visibility of clinical metallic needles with a LED-based photoacoustic and ultrasound imaging system. To address the complexity of capturing ground truth for real data and the poor realism of purely simulated data, this framework included the generation of semi-synthetic training datasets combining both simulated data to represent features from the needles and in vivo measurements for tissue background. Evaluation of the trained neural network was performed with needle insertions into blood-vessel-mimicking phantoms, pork joint tissue ex vivo and measurements on human volunteers. This deep learning-based framework substantially improved the needle visibility in photoacoustic imaging in vivo compared to conventional reconstruction by suppressing background noise and image artefacts, achieving 5.8 and 4.5 times improvements in signal-to-noise ratio (SNR) and the modified Hausdorff distance (MHD) respectively. Thus, the proposed framework could be helpful for reducing complications during percutaneous needle insertions by accurate identification of clinical needles in photoacoustic imaging. △ Less

Submitted 9 July, 2022; v1 submitted 15 November, 2021; originally announced November 2021.

arXiv:2111.02408 [pdf, other]

Partial supervision for the FeTA challenge 2021

Authors: Lucas Fidon, Michael Aertsen, Suprosanna Shit, Philippe Demaerel, Sébastien Ourselin, Jan Deprest, Tom Vercauteren

Abstract: This paper describes our method for our participation in the FeTA challenge2021 (team name: TRABIT). The performance of convolutional neural networks for medical image segmentation is thought to correlate positively with the number of training data. The FeTA challenge does not restrict participants to using only the provided training data but also allows for using other publicly available sources.… ▽ More This paper describes our method for our participation in the FeTA challenge2021 (team name: TRABIT). The performance of convolutional neural networks for medical image segmentation is thought to correlate positively with the number of training data. The FeTA challenge does not restrict participants to using only the provided training data but also allows for using other publicly available sources. Yet, open access fetal brain data remains limited. An advantageous strategy could thus be to expand the training data to cover broader perinatal brain imaging sources. Perinatal brain MRIs, other than the FeTA challenge data, that are currently publicly available, span normal and pathological fetal atlases as well as neonatal scans. However, perinatal brain MRIs segmented in different datasets typically come with different annotation protocols. This makes it challenging to combine those datasets to train a deep neural network. We recently proposed a family of loss functions, the label-set loss functions, for partially supervised learning. Label-set loss functions allow to train deep neural networks with partially segmented images, i.e. segmentations in which some classes may be grouped into super-classes. We propose to use label-set loss functions to improve the segmentation performance of a state-of-the-art deep learning pipeline for multi-class fetal brain segmentation by merging several publicly available datasets. To promote generalisability, our approach does not introduce any additional hyper-parameters tuning. △ Less

Submitted 3 November, 2021; originally announced November 2021.

Comments: Accepted as a poster at the MICCAI 2021 Perinatal, Preterm and Paediatric Image Analysis (PIPPI) workshop

arXiv:2110.13245 [pdf, other]

doi 10.1109/ISMR48346.2021.9661563

Homography-based Visual Servoing with Remote Center of Motion for Semi-autonomous Robotic Endoscope Manipulation

Authors: Martin Huber, John Bason Mitchell, Ross Henry, Sébastien Ourselin, Tom Vercauteren, Christos Bergeles

Abstract: The dominant visual servoing approaches in Minimally Invasive Surgery (MIS) follow single points or adapt the endoscope's field of view based on the surgical tools' distance. These methods rely on point positions with respect to the camera frame to infer a control policy. Deviating from the dominant methods, we formulate a robotic controller that allows for image-based visual servoing that require… ▽ More The dominant visual servoing approaches in Minimally Invasive Surgery (MIS) follow single points or adapt the endoscope's field of view based on the surgical tools' distance. These methods rely on point positions with respect to the camera frame to infer a control policy. Deviating from the dominant methods, we formulate a robotic controller that allows for image-based visual servoing that requires neither explicit tool and camera positions nor any explicit image depth information. The proposed method relies on homography-based image registration, which changes the automation paradigm from point-centric towards surgical-scene-centric approach. It simultaneously respects a programmable Remote Center of Motion (RCM). Our approach allows a surgeon to build a graph of desired views, from which, once built, views can be manually selected and automatically servoed to irrespective of robot-patient frame transformation changes. We evaluate our method on an abdominal phantom and provide an open source ROS Moveit integration for use with any serial manipulator. △ Less

Submitted 25 October, 2021; originally announced October 2021.

Comments: Accepted as conference paper at ISMR 2021

arXiv:2110.12939 [pdf, other]

doi 10.1007/978-3-030-87193-2_30

Interactive Segmentation via Deep Learning and B-Spline Explicit Active Surfaces

Authors: Helena Williams, João Pedrosa, Laura Cattani, Susanne Housmans, Tom Vercauteren, Jan Deprest, Jan D'hooge

Abstract: Automatic medical image segmentation via convolutional neural networks (CNNs) has shown promising results. However, they may not always be robust enough for clinical use. Sub-optimal segmentation would require clinician's to manually delineate the target object, causing frustration. To address this problem, a novel interactive CNN-based segmentation framework is proposed in this work. The aim is t… ▽ More Automatic medical image segmentation via convolutional neural networks (CNNs) has shown promising results. However, they may not always be robust enough for clinical use. Sub-optimal segmentation would require clinician's to manually delineate the target object, causing frustration. To address this problem, a novel interactive CNN-based segmentation framework is proposed in this work. The aim is to represent the CNN segmentation contour as B-splines by utilising B-spline explicit active surfaces (BEAS). The interactive element of the framework allows the user to precisely edit the contour in real-time, and by utilising BEAS it ensures the final contour is smooth and anatomically plausible. This framework was applied to the task of 2D segmentation of the levator hiatus from 2D ultrasound (US) images, and compared to the current clinical tools used in pelvic floor disorder clinic (4DView, GE Healthcare; Zipf, Austria). Experimental results show that: 1) the proposed framework is more robust than current state-of-the-art CNNs; 2) the perceived workload calculated via the NASA-TLX index was reduced more than half for the proposed approach in comparison to current clinical tools; and 3) the proposed tool requires at least 13 seconds less user time than the clinical tools, which was significant (p=0.001). △ Less

Submitted 25 October, 2021; originally announced October 2021.

Comments: 11 pages, 3 figures, 2 tables

Journal ref: MICCAI 2021. Lecture Notes in Computer Science, vol 12901. Springer, Cham

arXiv:2109.15098 [pdf, other]

doi 10.1080/21681163.2021.2002195

Deep Homography Estimation in Dynamic Surgical Scenes for Laparoscopic Camera Motion Extraction

Authors: Martin Huber, Sébastien Ourselin, Christos Bergeles, Tom Vercauteren

Abstract: Current laparoscopic camera motion automation relies on rule-based approaches or only focuses on surgical tools. Imitation Learning (IL) methods could alleviate these shortcomings, but have so far been applied to oversimplified setups. Instead of extracting actions from oversimplified setups, in this work we introduce a method that allows to extract a laparoscope holder's actions from videos of la… ▽ More Current laparoscopic camera motion automation relies on rule-based approaches or only focuses on surgical tools. Imitation Learning (IL) methods could alleviate these shortcomings, but have so far been applied to oversimplified setups. Instead of extracting actions from oversimplified setups, in this work we introduce a method that allows to extract a laparoscope holder's actions from videos of laparoscopic interventions. We synthetically add camera motion to a newly acquired dataset of camera motion free da Vinci surgery image sequences through a novel homography generation algorithm. The synthetic camera motion serves as a supervisory signal for camera motion estimation that is invariant to object and tool motion. We perform an extensive evaluation of state-of-the-art (SOTA) Deep Neural Networks (DNNs) across multiple compute regimes, finding our method transfers from our camera motion free da Vinci surgery dataset to videos of laparoscopic interventions, outperforming classical homography estimation approaches in both, precision by 41%, and runtime on a CPU by 43%. △ Less

Submitted 27 October, 2021; v1 submitted 30 September, 2021; originally announced September 2021.

Comments: Accepted for publication in 2021 AE-CAI Special Issue of the Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization

arXiv:2109.01403 [pdf, other]

doi 10.1080/21681163.2021.1997646

Deep Learning Approach for Hyperspectral Image Demosaicking, Spectral Correction and High-resolution RGB Reconstruction

Authors: Peichao Li, Michael Ebner, Philip Noonan, Conor Horgan, Anisha Bahl, Sebastien Ourselin, Jonathan Shapey, Tom Vercauteren

Abstract: Hyperspectral imaging is one of the most promising techniques for intraoperative tissue characterisation. Snapshot mosaic cameras, which can capture hyperspectral data in a single exposure, have the potential to make a real-time hyperspectral imaging system for surgical decision-making possible. However, optimal exploitation of the captured data requires solving an ill-posed demosaicking problem a… ▽ More Hyperspectral imaging is one of the most promising techniques for intraoperative tissue characterisation. Snapshot mosaic cameras, which can capture hyperspectral data in a single exposure, have the potential to make a real-time hyperspectral imaging system for surgical decision-making possible. However, optimal exploitation of the captured data requires solving an ill-posed demosaicking problem and applying additional spectral corrections to recover spatial and spectral information of the image. In this work, we propose a deep learning-based image demosaicking algorithm for snapshot hyperspectral images using supervised learning methods. Due to the lack of publicly available medical images acquired with snapshot mosaic cameras, a synthetic image generation approach is proposed to simulate snapshot images from existing medical image datasets captured by high-resolution, but slow, hyperspectral imaging devices. Image reconstruction is achieved using convolutional neural networks for hyperspectral image super-resolution, followed by cross-talk and leakage correction using a sensor-specific calibration matrix. The resulting demosaicked images are evaluated both quantitatively and qualitatively, showing clear improvements in image quality compared to a baseline demosaicking method using linear interpolation. Moreover, the fast processing time of~45\,ms of our algorithm to obtain super-resolved RGB or oxygenation saturation maps per image frame for a state-of-the-art snapshot mosaic camera demonstrates the potential for its seamless integration into real-time surgical hyperspectral imaging applications. △ Less

Submitted 3 September, 2021; originally announced September 2021.

arXiv:2108.04175 [pdf, other]

doi 10.1007/978-3-030-87735-4_25

Distributionally Robust Segmentation of Abnormal Fetal Brain 3D MRI

Authors: Lucas Fidon, Michael Aertsen, Nada Mufti, Thomas Deprest, Doaa Emam, Frédéric Guffens, Ernst Schwartz, Michael Ebner, Daniela Prayer, Gregor Kasprian, Anna L. David, Andrew Melbourne, Sébastien Ourselin, Jan Deprest, Georg Langs, Tom Vercauteren

Abstract: The performance of deep neural networks typically increases with the number of training images. However, not all images have the same importance towards improved performance and robustness. In fetal brain MRI, abnormalities exacerbate the variability of the develo** brain anatomy compared to non-pathological cases. A small number of abnormal cases, as is typically available in clinical datasets… ▽ More The performance of deep neural networks typically increases with the number of training images. However, not all images have the same importance towards improved performance and robustness. In fetal brain MRI, abnormalities exacerbate the variability of the develo** brain anatomy compared to non-pathological cases. A small number of abnormal cases, as is typically available in clinical datasets used for training, are unlikely to fairly represent the rich variability of abnormal develo** brains. This leads machine learning systems trained by maximizing the average performance to be biased toward non-pathological cases. This problem was recently referred to as hidden stratification. To be suited for clinical use, automatic segmentation methods need to reliably achieve high-quality segmentation outcomes also for pathological cases. In this paper, we show that the state-of-the-art deep learning pipeline nnU-Net has difficulties to generalize to unseen abnormal cases. To mitigate this problem, we propose to train a deep neural network to minimize a percentile of the distribution of per-volume loss over the dataset. We show that this can be achieved by using Distributionally Robust Optimization (DRO). DRO automatically reweights the training samples with lower performance, encouraging nnU-Net to perform more consistently on all cases. We validated our approach using a dataset of 368 fetal brain T2w MRIs, including 124 MRIs of open spina bifida cases and 51 MRIs of cases with other severe abnormalities of brain development. △ Less

Submitted 9 August, 2021; originally announced August 2021.

Comments: Accepted at the MICCAI 2021 Perinatal, Preterm and Paediatric Image Analysis (PIPPI) workshop. arXiv admin note: substantial text overlap with arXiv:2001.02658

arXiv:2107.03846 [pdf, other]

doi 10.1007/978-3-030-87196-3_60

Label-set Loss Functions for Partial Supervision: Application to Fetal Brain 3D MRI Parcellation

Authors: Lucas Fidon, Michael Aertsen, Doaa Emam, Nada Mufti, Frédéric Guffens, Thomas Deprest, Philippe Demaerel, Anna L. David, Andrew Melbourne, Sébastien Ourselin, Jan Deprest, Tom Vercauteren

Abstract: Deep neural networks have increased the accuracy of automatic segmentation, however, their accuracy depends on the availability of a large number of fully segmented images. Methods to train deep neural networks using images for which some, but not all, regions of interest are segmented are necessary to make better use of partially annotated datasets. In this paper, we propose the first axiomatic d… ▽ More Deep neural networks have increased the accuracy of automatic segmentation, however, their accuracy depends on the availability of a large number of fully segmented images. Methods to train deep neural networks using images for which some, but not all, regions of interest are segmented are necessary to make better use of partially annotated datasets. In this paper, we propose the first axiomatic definition of label-set loss functions that are the loss functions that can handle partially segmented images. We prove that there is one and only one method to convert a classical loss function for fully segmented images into a proper label-set loss function. Our theory also allows us to define the leaf-Dice loss, a label-set generalization of the Dice loss particularly suited for partial supervision with only missing labels. Using the leaf-Dice loss, we set a new state of the art in partially supervised learning for fetal brain 3D MRI segmentation. We achieve a deep neural network able to segment white matter, ventricles, cerebellum, extra-ventricular CSF, cortical gray matter, deep gray matter, brainstem, and corpus callosum based on fetal brain 3D MRI of anatomically normal fetuses or with open spina bifida. Our implementation of the proposed label-set loss functions is available at https://github.com/LucasFidon/label-set-loss-functions △ Less

Submitted 9 July, 2021; v1 submitted 8 July, 2021; originally announced July 2021.

Comments: Accepted at MICCAI 2021

arXiv:2105.11239 [pdf, other]

doi 10.1007/s11548-021-02420-2

A self-supervised learning strategy for postoperative brain cavity segmentation simulating resections

Authors: Fernando Pérez-García, Reuben Dorent, Michele Rizzi, Francesco Cardinale, Valerio Frazzini, Vincent Navarro, Caroline Essert, Irène Ollivier, Tom Vercauteren, Rachel Sparks, John S. Duncan, Sébastien Ourselin

Abstract: Accurate segmentation of brain resection cavities (RCs) aids in postoperative analysis and determining follow-up treatment. Convolutional neural networks (CNNs) are the state-of-the-art image segmentation technique, but require large annotated datasets for training. Annotation of 3D medical images is time-consuming, requires highly-trained raters, and may suffer from high inter-rater variability.… ▽ More Accurate segmentation of brain resection cavities (RCs) aids in postoperative analysis and determining follow-up treatment. Convolutional neural networks (CNNs) are the state-of-the-art image segmentation technique, but require large annotated datasets for training. Annotation of 3D medical images is time-consuming, requires highly-trained raters, and may suffer from high inter-rater variability. Self-supervised learning strategies can leverage unlabeled data for training. We developed an algorithm to simulate resections from preoperative magnetic resonance images (MRIs). We performed self-supervised training of a 3D CNN for RC segmentation using our simulation method. We curated EPISURG, a dataset comprising 430 postoperative and 268 preoperative MRIs from 430 refractory epilepsy patients who underwent resective neurosurgery. We fine-tuned our model on three small annotated datasets from different institutions and on the annotated images in EPISURG, comprising 20, 33, 19 and 133 subjects. The model trained on data with simulated resections obtained median (interquartile range) Dice score coefficients (DSCs) of 81.7 (16.4), 82.4 (36.4), 74.9 (24.2) and 80.5 (18.7) for each of the four datasets. After fine-tuning, DSCs were 89.2 (13.3), 84.1 (19.8), 80.2 (20.1) and 85.2 (10.8). For comparison, inter-rater agreement between human annotators from our previous study was 84.0 (9.9). We present a self-supervised learning strategy for 3D CNNs using simulated RCs to accurately segment real RCs on postoperative MRI. Our method generalizes well to data from different institutions, pathologies and modalities. Source code, segmentation models and the EPISURG dataset are available at https://github.com/fepegar/ressegijcars . △ Less

Submitted 24 May, 2021; originally announced May 2021.

Comments: To be published in the International Journal of Computer Assisted Radiology and Surgery (IJCARS) - Special issue MICCAI 2020

arXiv:2104.14873 [pdf, other]

doi 10.1016/j.media.2021.102265

Robust joint registration of multiple stains and MRI for multimodal 3D histology reconstruction: Application to the Allen human brain atlas

Authors: Adrià Casamitjana, Marco Lorenzi, Sebastiano Ferraris, Loc Peter, Marc Modat, Allison Stevens, Bruce Fischl, Tom Vercauteren, Juan Eugenio Iglesias

Abstract: Joint registration of a stack of 2D histological sections to recover 3D structure (``3D histology reconstruction'') finds application in areas such as atlas building and validation of \emph{in vivo} imaging. Straightforward pairwise registration of neighbouring sections yields smooth reconstructions but has well-known problems such as ``banana effect'' (straightening of curved structures) and ``z-… ▽ More Joint registration of a stack of 2D histological sections to recover 3D structure (``3D histology reconstruction'') finds application in areas such as atlas building and validation of \emph{in vivo} imaging. Straightforward pairwise registration of neighbouring sections yields smooth reconstructions but has well-known problems such as ``banana effect'' (straightening of curved structures) and ``z-shift'' (drift). While these problems can be alleviated with an external, linearly aligned reference (e.g., Magnetic Resonance (MR) images), registration is often inaccurate due to contrast differences and the strong nonlinear distortion of the tissue, including artefacts such as folds and tears. In this paper, we present a probabilistic model of spatial deformation that yields reconstructions for multiple histological stains that that are jointly smooth, robust to outliers, and follow the reference shape. The model relies on a spanning tree of latent transforms connecting all the sections and slices of the reference volume, and assumes that the registration between any pair of images can be see as a noisy version of the composition of (possibly inverted) latent transforms connecting the two images. Bayesian inference is used to compute the most likely latent transforms given a set of pairwise registrations between image pairs within and across modalities. The framework is used for accurate 3D reconstruction of two stains (Nissl and parvalbumin) from the Allen human brain atlas, showing its benefits on real data with severe distortions. Moreover, we also provide the registration of the reconstructed volume to MNI space, bridging the gaps between two of the most widely used atlases in histology and MRI. The 3D reconstructed volumes and atlas registration can be downloaded from https://openneuro.org/datasets/ds003590. The code is freely available at https://github.com/acasamitjana/3dhirest. △ Less

Submitted 12 October, 2021; v1 submitted 30 April, 2021; originally announced April 2021.

Comments: Medical Image Analysis (Accepted on 07/10/2021). 22 pages, 11 figures,

arXiv:2103.14015 [pdf, other]

doi 10.1109/TMI.2021.3067512

Zero-shot super-resolution with a physically-motivated downsampling kernel for endomicroscopy

Authors: Agnieszka Barbara Szczotka, Dzhoshkun Ismail Shakir, Matthew J. Clarkson, Stephen P. Pereira, Tom Vercauteren

Abstract: Super-resolution (SR) methods have seen significant advances thanks to the development of convolutional neural networks (CNNs). CNNs have been successfully employed to improve the quality of endomicroscopy imaging. Yet, the inherent limitation of research on SR in endomicroscopy remains the lack of ground truth high-resolution (HR) images, commonly used for both supervised training and reference-b… ▽ More Super-resolution (SR) methods have seen significant advances thanks to the development of convolutional neural networks (CNNs). CNNs have been successfully employed to improve the quality of endomicroscopy imaging. Yet, the inherent limitation of research on SR in endomicroscopy remains the lack of ground truth high-resolution (HR) images, commonly used for both supervised training and reference-based image quality assessment (IQA). Therefore, alternative methods, such as unsupervised SR are being explored. To address the need for non-reference image quality improvement, we designed a novel zero-shot super-resolution (ZSSR) approach that relies only on the endomicroscopy data to be processed in a self-supervised manner without the need for ground-truth HR images. We tailored the proposed pipeline to the idiosyncrasies of endomicroscopy by introducing both: a physically-motivated Voronoi downscaling kernel accounting for the endomicroscope's irregular fibre-based sampling pattern, and realistic noise patterns. We also took advantage of video sequences to exploit a sequence of images for self-supervised zero-shot image quality improvement. We run ablation studies to assess our contribution in regards to the downscaling kernel and noise simulation. We validate our methodology on both synthetic and original data. Synthetic experiments were assessed with reference-based IQA, while our results for original images were evaluated in a user study conducted with both expert and non-expert observers. The results demonstrated superior performance in image quality of ZSSR reconstructions in comparison to the baseline method. The ZSSR is also competitive when compared to supervised single-image SR, especially being the preferred reconstruction technique by experts. △ Less

Submitted 25 March, 2021; originally announced March 2021.

Journal ref: IEEE Transactions on Medical Imaging, 2021

arXiv:2103.13314 [pdf, other]

MONAIfbs: MONAI-based fetal brain MRI deep learning segmentation

Authors: Marta B. M. Ranzini, Lucas Fidon, Sébastien Ourselin, Marc Modat, Tom Vercauteren

Abstract: In fetal Magnetic Resonance Imaging, Super Resolution Reconstruction (SRR) algorithms are becoming popular tools to obtain high-resolution 3D volume reconstructions from low-resolution stacks of 2D slices, acquired at different orientations. To be effective, these algorithms often require accurate segmentation of the region of interest, such as the fetal brain in suspected pathological cases. In t… ▽ More In fetal Magnetic Resonance Imaging, Super Resolution Reconstruction (SRR) algorithms are becoming popular tools to obtain high-resolution 3D volume reconstructions from low-resolution stacks of 2D slices, acquired at different orientations. To be effective, these algorithms often require accurate segmentation of the region of interest, such as the fetal brain in suspected pathological cases. In the case of Spina Bifida, Ebner, Wang et al. (NeuroImage, 2020) combined their SRR algorithm with a 2-step segmentation pipeline (2D localisation followed by a 2D segmentation network). However, if the localisation step fails, the second network is not able to recover a correct brain mask, thus requiring manual corrections for an effective SRR. In this work, we aim at improving the fetal brain segmentation for SRR in Spina Bifida. We hypothesise that a well-trained single-step UNet can achieve accurate performance, avoiding the need of a 2-step approach. We propose a new tool for fetal brain segmentation called MONAIfbs, which takes advantage of the Medical Open Network for Artificial Intelligence (MONAI) framework. Our network is based on the dynamic UNet (dynUNet), an adaptation of the nnU-Net framework. When compared to the original 2-step approach proposed in Ebner-Wang, and the same Ebner-Wang approach retrained with the expanded dataset available for this work, the dynUNet showed to achieve higher performance using a single step only. It also showed to reduce the number of outliers, as only 28 stacks obtained Dice score less than 0.9, compared to 68 for Ebner-Wang and 53 Ebner-Wang expanded. The proposed dynUNet model thus provides an improvement of the state-of-the-art fetal brain segmentation techniques, reducing the need for manual correction in automated SRR pipelines. Our code and our trained model are made publicly available at https://github.com/gift-surg/MONAIfbs. △ Less

Submitted 21 March, 2021; originally announced March 2021.

Comments: Abstract accepted at IEEE International Symposium on Biomedical Imaging (ISBI) 2021

arXiv:2102.09963 [pdf, other]

doi 10.1007/s11548-020-02127-w

Intrapapillary Capillary Loop Classification in Magnification Endoscopy: Open Dataset and Baseline Methodology

Authors: Luis C. Garcia-Peraza-Herrera, Martin Everson, Laurence Lovat, Hsiu-Po Wang, Wen Lun Wang, Rehan Haidry, Danail Stoyanov, Sebastien Ourselin, Tom Vercauteren

Abstract: Purpose. Early squamous cell neoplasia (ESCN) in the oesophagus is a highly treatable condition. Lesions confined to the mucosal layer can be curatively treated endoscopically. We build a computer-assisted detection (CADe) system that can classify still images or video frames as normal or abnormal with high diagnostic accuracy. Methods. We present a new benchmark dataset containing 68K binary labe… ▽ More Purpose. Early squamous cell neoplasia (ESCN) in the oesophagus is a highly treatable condition. Lesions confined to the mucosal layer can be curatively treated endoscopically. We build a computer-assisted detection (CADe) system that can classify still images or video frames as normal or abnormal with high diagnostic accuracy. Methods. We present a new benchmark dataset containing 68K binary labeled frames extracted from 114 patient videos whose imaged areas have been resected and correlated to histopathology. Our novel convolutional network (CNN) architecture solves the binary classification task and explains what features of the input domain drive the decision-making process of the network. Results. The proposed method achieved an average accuracy of 91.7 % compared to the 94.7 % achieved by a group of 12 senior clinicians. Our novel network architecture produces deeply supervised activation heatmaps that suggest the network is looking at intrapapillary capillary loop (IPCL) patterns when predicting abnormality. Conclusion. We believe that this dataset and baseline method may serve as a reference for future benchmarks on both video frame classification and explainability in the context of ESCN detection. A future work path of high clinical relevance is the extension of the classification to ESCN types. △ Less

Submitted 19 February, 2021; originally announced February 2021.

Comments: Accepted by IJCARS

arXiv:2101.06440 [pdf, other]

doi 10.1007/978-3-319-24571-3_81

Scale factor point spread function matching: Beyond aliasing in image resampling

Authors: M. Jorge Cardoso, Marc Modat, Tom Vercauteren, Sebastien Ourselin

Abstract: Imaging devices exploit the Nyquist-Shannon sampling theorem to avoid both aliasing and redundant oversampling by design. Conversely, in medical image resampling, images are considered as continuous functions, are warped by a spatial transformation, and are then sampled on a regular grid. In most cases, the spatial war** changes the frequency characteristics of the continuous function and no spe… ▽ More Imaging devices exploit the Nyquist-Shannon sampling theorem to avoid both aliasing and redundant oversampling by design. Conversely, in medical image resampling, images are considered as continuous functions, are warped by a spatial transformation, and are then sampled on a regular grid. In most cases, the spatial war** changes the frequency characteristics of the continuous function and no special care is taken to ensure that the resampling grid respects the conditions of the sampling theorem. This paper shows that this oversight introduces artefacts, including aliasing, that can lead to important bias in clinical applications. One notable exception to this common practice is when multi-resolution pyramids are constructed, with low-pass "anti-aliasing" filters being applied prior to downsampling. In this work, we illustrate why similar caution is needed when resampling images under general spatial transformations and propose a novel method that is more respectful of the sampling theorem, minimising aliasing and loss of information. We introduce the notion of scale factor point spread function (sfPSF) and employ Gaussian kernels to achieve a computationally tractable resampling scheme that can cope with arbitrary non-linear spatial transformations and grid sizes. Experiments demonstrate significant (p<1e-4) technical and clinical implications of the proposed method. △ Less

Submitted 16 January, 2021; originally announced January 2021.

Comments: Published in MICCAI 2015

arXiv:2101.06425 [pdf, other]

doi 10.1109/ISBI48211.2021.9433798

Morphological Change Forecasting for Prostate Glands using Feature-based Registration and Kernel Density Extrapolation

Authors: Qianye Yang, Tom Vercauteren, Yunguan Fu, Francesco Giganti, Nooshin Ghavami, Vasilis Stavrinides, Caroline Moore, Matt Clarkson, Dean Barratt, Yipeng Hu

Abstract: Organ morphology is a key indicator for prostate disease diagnosis and prognosis. For instance, In longitudinal study of prostate cancer patients under active surveillance, the volume, boundary smoothness and their changes are closely monitored on time-series MR image data. In this paper, we describe a new framework for forecasting prostate morphological changes, as the ability to detect such chan… ▽ More Organ morphology is a key indicator for prostate disease diagnosis and prognosis. For instance, In longitudinal study of prostate cancer patients under active surveillance, the volume, boundary smoothness and their changes are closely monitored on time-series MR image data. In this paper, we describe a new framework for forecasting prostate morphological changes, as the ability to detect such changes earlier than what is currently possible may enable timely treatment or avoiding unnecessary confirmatory biopsies. In this work, an efficient feature-based MR image registration is first developed to align delineated prostate gland capsules to quantify the morphological changes using the inferred dense displacement fields (DDFs). We then propose to use kernel density estimation (KDE) of the probability density of the DDF-represented \textit{future morphology changes}, between current and future time points, before the future data become available. The KDE utilises a novel distance function that takes into account morphology, stage-of-progression and duration-of-change, which are considered factors in such subject-specific forecasting. We validate the proposed approach on image masks unseen to registration network training, without using any data acquired at the future target time points. The experiment results are presented on a longitudinal data set with 331 images from 73 patients, yielding an average Dice score of 0.865 on a holdout set, between the ground-truth and the image masks warped by the KDE-predicted-DDFs. △ Less

Submitted 16 January, 2021; originally announced January 2021.

Comments: Accepted by ISBI 2021

arXiv:2011.11719 [pdf, other]

Explainable-by-design Semi-Supervised Representation Learning for COVID-19 Diagnosis from CT Imaging

Authors: Abel Díaz Berenguer, Hichem Sahli, Boris Joukovsky, Maryna Kvasnytsia, Ine Dirks, Mitchel Alioscha-Perez, Nikos Deligiannis, Panagiotis Gonidakis, Sebastián Amador Sánchez, Redona Brahimetaj, Evgenia Papavasileiou, Jonathan Cheung-Wai Chana, Fei Li, Shangzhen Song, Yixin Yang, Sofie Tilborghs, Siri Willems, Tom Eelbode, Jeroen Bertels, Dirk Vandermeulen, Frederik Maes, Paul Suetens, Lucas Fidon, Tom Vercauteren, David Robben , et al. (15 additional authors not shown)

Abstract: Our motivating application is a real-world problem: COVID-19 classification from CT imaging, for which we present an explainable Deep Learning approach based on a semi-supervised classification pipeline that employs variational autoencoders to extract efficient feature embedding. We have optimized the architecture of two different networks for CT images: (i) a novel conditional variational autoenc… ▽ More Our motivating application is a real-world problem: COVID-19 classification from CT imaging, for which we present an explainable Deep Learning approach based on a semi-supervised classification pipeline that employs variational autoencoders to extract efficient feature embedding. We have optimized the architecture of two different networks for CT images: (i) a novel conditional variational autoencoder (CVAE) with a specific architecture that integrates the class labels inside the encoder layers and uses side information with shared attention layers for the encoder, which make the most of the contextual clues for representation learning, and (ii) a downstream convolutional neural network for supervised classification using the encoder structure of the CVAE. With the explainable classification results, the proposed diagnosis system is very effective for COVID-19 classification. Based on the promising results obtained qualitatively and quantitatively, we envisage a wide deployment of our developed technique in large-scale clinical studies.Code is available at https://git.etrovub.be/AVSP/ct-based-covid-19-diagnostic-tool.git. △ Less

Submitted 2 September, 2021; v1 submitted 23 November, 2020; originally announced November 2020.

arXiv:2011.02580 [pdf, ps, other]

doi 10.21105/joss.02705

DeepReg: a deep learning toolkit for medical image registration

Authors: Yunguan Fu, Nina Montaña Brown, Shaheer U. Saeed, Adrià Casamitjana, Zachary M. C. Baum, Rémi Delaunay, Qianye Yang, Alexander Grimwood, Zhe Min, Stefano B. Blumberg, Juan Eugenio Iglesias, Dean C. Barratt, Ester Bonmati, Daniel C. Alexander, Matthew J. Clarkson, Tom Vercauteren, Yipeng Hu

Abstract: DeepReg (https://github.com/DeepRegNet/DeepReg) is a community-supported open-source toolkit for research and education in medical image registration using deep learning. DeepReg (https://github.com/DeepRegNet/DeepReg) is a community-supported open-source toolkit for research and education in medical image registration using deep learning. △ Less

Submitted 4 November, 2020; originally announced November 2020.

Comments: Accepted in The Journal of Open Source Software (JOSS)

arXiv:2011.01614 [pdf, other]

doi 10.1007/978-3-030-72087-2_18

Generalized Wasserstein Dice Score, Distributionally Robust Deep Learning, and Ranger for brain tumor segmentation: BraTS 2020 challenge

Authors: Lucas Fidon, Sebastien Ourselin, Tom Vercauteren

Abstract: Training a deep neural network is an optimization problem with four main ingredients: the design of the deep neural network, the per-sample loss function, the population loss function, and the optimizer. However, methods developed to compete in recent BraTS challenges tend to focus only on the design of deep neural network architectures, while paying less attention to the three other aspects. In t… ▽ More Training a deep neural network is an optimization problem with four main ingredients: the design of the deep neural network, the per-sample loss function, the population loss function, and the optimizer. However, methods developed to compete in recent BraTS challenges tend to focus only on the design of deep neural network architectures, while paying less attention to the three other aspects. In this paper, we experimented with adopting the opposite approach. We stuck to a generic and state-of-the-art 3D U-Net architecture and experimented with a non-standard per-sample loss function, the generalized Wasserstein Dice loss, a non-standard population loss function, corresponding to distributionally robust optimization, and a non-standard optimizer, Ranger. Those variations were selected specifically for the problem of multi-class brain tumor segmentation. The generalized Wasserstein Dice loss is a per-sample loss function that allows taking advantage of the hierarchical structure of the tumor regions labeled in BraTS. Distributionally robust optimization is a generalization of empirical risk minimization that accounts for the presence of underrepresented subdomains in the training dataset. Ranger is a generalization of the widely used Adam optimizer that is more stable with small batch size and noisy labels. We found that each of those variations of the optimization of deep neural networks for brain tumor segmentation leads to improvements in terms of Dice scores and Hausdorff distances. With an ensemble of three deep neural networks trained with various optimization procedures, we achieved promising results on the validation dataset of the BraTS 2020 challenge. Our ensemble ranked fourth out of the 693 registered teams for the segmentation task of the BraTS 2020 challenge. △ Less

Submitted 25 January, 2021; v1 submitted 3 November, 2020; originally announced November 2020.

Comments: MICCAI 2020 BrainLes Workshop. Our method ranked fourth out of the 693 registered teams for the segmentation task of the BraTS 2020 challenge. v2: Added some clarifications following reviewers' feedback (camera-ready version)

arXiv:2009.13318 [pdf]

doi 10.1021/acs.analchem.1c02178

High-throughput molecular imaging via deep learning enabled Raman spectroscopy

Authors: Conor C. Horgan, Magnus Jensen, Anika Nagelkerke, Jean-Phillipe St-Pierre, Tom Vercauteren, Molly M. Stevens, Mads S. Bergholt

Abstract: Raman spectroscopy enables non-destructive, label-free imaging with unprecedented molecular contrast but is limited by slow data acquisition, largely preventing high-throughput imaging applications. Here, we present a comprehensive framework for higher-throughput molecular imaging via deep learning enabled Raman spectroscopy, termed DeepeR, trained on a large dataset of hyperspectral Raman images,… ▽ More Raman spectroscopy enables non-destructive, label-free imaging with unprecedented molecular contrast but is limited by slow data acquisition, largely preventing high-throughput imaging applications. Here, we present a comprehensive framework for higher-throughput molecular imaging via deep learning enabled Raman spectroscopy, termed DeepeR, trained on a large dataset of hyperspectral Raman images, with over 1.5 million spectra (400 hours of acquisition) in total. We firstly perform denoising and reconstruction of low signal-to-noise ratio Raman molecular signatures via deep learning, with a 9x improvement in mean squared error over state-of-the-art Raman filtering methods. Next, we develop a neural network for robust 2-4x super-resolution of hyperspectral Raman images that preserves molecular cellular information. Combining these approaches, we achieve Raman imaging speed-ups of up to 160x, enabling high resolution, high signal-to-noise ratio cellular imaging in under one minute. Finally, transfer learning is applied to extend DeepeR from cell to tissue-scale imaging. DeepeR provides a foundation that will enable a host of higher-throughput Raman spectroscopy and molecular imaging applications across biomedicine. △ Less

Submitted 28 September, 2020; originally announced September 2020.

arXiv:2009.10549 [pdf, other]

doi 10.1109/TMI.2020.3035253

CA-Net: Comprehensive Attention Convolutional Neural Networks for Explainable Medical Image Segmentation

Authors: Ran Gu, Guotai Wang, Tao Song, Rui Huang, Michael Aertsen, Jan Deprest, Sébastien Ourselin, Tom Vercauteren, Shaoting Zhang

Abstract: Accurate medical image segmentation is essential for diagnosis and treatment planning of diseases. Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance for automatic medical image segmentation. However, they are still challenged by complicated conditions where the segmentation target has large variations of position, shape and scale, and existing CNNs have a poor explain… ▽ More Accurate medical image segmentation is essential for diagnosis and treatment planning of diseases. Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance for automatic medical image segmentation. However, they are still challenged by complicated conditions where the segmentation target has large variations of position, shape and scale, and existing CNNs have a poor explainability that limits their application to clinical decisions. In this work, we make extensive use of multiple attentions in a CNN architecture and propose a comprehensive attention-based CNN (CA-Net) for more accurate and explainable medical image segmentation that is aware of the most important spatial positions, channels and scales at the same time. In particular, we first propose a joint spatial attention module to make the network focus more on the foreground region. Then, a novel channel attention module is proposed to adaptively recalibrate channel-wise feature responses and highlight the most relevant feature channels. Also, we propose a scale attention module implicitly emphasizing the most salient feature maps among multiple scales so that the CNN is adaptive to the size of an object. Extensive experiments on skin lesion segmentation from ISIC 2018 and multi-class segmentation of fetal MRI found that our proposed CA-Net significantly improved the average segmentation Dice score from 87.77% to 92.08% for skin lesion, 84.79% to 87.08% for the placenta and 93.20% to 95.88% for the fetal brain respectively compared with U-Net. It reduced the model size to around 15 times smaller with close or even better accuracy compared with state-of-the-art DeepLabv3+. In addition, it has a much higher explainability than existing networks by visualizing the attention weight maps. Our code is available at https://github.com/HiLab-git/CA-Net △ Less

Submitted 22 September, 2020; v1 submitted 22 September, 2020; originally announced September 2020.

arXiv:2009.04009 [pdf, other]

doi 10.1016/j.media.2020.101862

Learning joint segmentation of tissues and brain lesions from task-specific hetero-modal domain-shifted datasets

Authors: Reuben Dorent, Thomas Booth, Wenqi Li, Carole H. Sudre, Sina Kafiabadi, Jorge Cardoso, Sebastien Ourselin, Tom Vercauteren

Abstract: Brain tissue segmentation from multimodal MRI is a key building block of many neuroimaging analysis pipelines. Established tissue segmentation approaches have, however, not been developed to cope with large anatomical changes resulting from pathology, such as white matter lesions or tumours, and often fail in these cases. In the meantime, with the advent of deep neural networks (DNNs), segmentatio… ▽ More Brain tissue segmentation from multimodal MRI is a key building block of many neuroimaging analysis pipelines. Established tissue segmentation approaches have, however, not been developed to cope with large anatomical changes resulting from pathology, such as white matter lesions or tumours, and often fail in these cases. In the meantime, with the advent of deep neural networks (DNNs), segmentation of brain lesions has matured significantly. However, few existing approaches allow for the joint segmentation of normal tissue and brain lesions. Develo** a DNN for such a joint task is currently hampered by the fact that annotated datasets typically address only one specific task and rely on task-specific imaging protocols including a task-specific set of imaging modalities. In this work, we propose a novel approach to build a joint tissue and lesion segmentation model from aggregated task-specific hetero-modal domain-shifted and partially-annotated datasets. Starting from a variational formulation of the joint problem, we show how the expected risk can be decomposed and optimised empirically. We exploit an upper bound of the risk to deal with heterogeneous imaging modalities across datasets. To deal with potential domain shift, we integrated and tested three conventional techniques based on data augmentation, adversarial learning and pseudo-healthy generation. For each individual task, our joint approach reaches comparable performance to task-specific and fully-supervised models. The proposed framework is assessed on two different types of brain lesions: White matter lesions and gliomas. In the latter case, lacking a joint ground-truth for quantitative assessment purposes, we propose and use a novel clinically-relevant qualitative assessment methodology. △ Less

Submitted 8 September, 2020; originally announced September 2020.

Comments: MIDL 2019 special issue - Medical Image Analysis

arXiv:2009.01924 [pdf, other]

Introduction to Medical Image Registration with DeepReg, Between Old and New

Authors: N. Montana Brown, Y. Fu, S. U. Saeed, A. Casamitjana, Z. M. C. Baum, R. Delaunay, Q. Yang, A. Grimwood, Z. Min, E. Bonmati, T. Vercauteren, M. J. Clarkson, Y. Hu

Abstract: This document outlines a tutorial to get started with medical image registration using the open-source package DeepReg. The basic concepts of medical image registration are discussed, linking classical methods to newer methods using deep learning. Two iterative, classical algorithms using optimisation and one learning-based algorithm using deep learning are coded step-by-step using DeepReg utiliti… ▽ More This document outlines a tutorial to get started with medical image registration using the open-source package DeepReg. The basic concepts of medical image registration are discussed, linking classical methods to newer methods using deep learning. Two iterative, classical algorithms using optimisation and one learning-based algorithm using deep learning are coded step-by-step using DeepReg utilities, all with real, open-accessible, medical data. △ Less

Submitted 7 September, 2020; v1 submitted 29 August, 2020; originally announced September 2020.

Comments: Submitted to MICCAI Educational Challenge 2020

Showing 1–50 of 67 results for author: Vercauteren, T