-
LOOC: Localizing Organs using Occupancy Networks and Body Surface Depth Images
Authors:
Pit Henrich,
Franziska Mathis-Ullrich
Abstract:
We introduce a novel method employing occupancy networks for the precise localization of 67 anatomical structures from single depth images captured from the exterior of the human body. This method considers the anatomical diversity across individuals. Our contributions include the application of occupancy networks for occluded structure localization, a robust method for estimating anatomical posit…
▽ More
We introduce a novel method employing occupancy networks for the precise localization of 67 anatomical structures from single depth images captured from the exterior of the human body. This method considers the anatomical diversity across individuals. Our contributions include the application of occupancy networks for occluded structure localization, a robust method for estimating anatomical positions from depth images, and the creation of detailed, individualized 3D anatomical atlases. This approach promises improvements in medical imaging and automated diagnostic procedures by offering accurate, non-invasive localization of critical anatomical features.
△ Less
Submitted 18 June, 2024;
originally announced June 2024.
-
Sim-To-Real Transfer for Visual Reinforcement Learning of Deformable Object Manipulation for Robot-Assisted Surgery
Authors:
Paul Maria Scheikl,
Eleonora Tagliabue,
Balázs Gyenes,
Martin Wagner,
Diego Dall'Alba,
Paolo Fiorini,
Franziska Mathis-Ullrich
Abstract:
Automation holds the potential to assist surgeons in robotic interventions, shifting their mental work load from visuomotor control to high level decision making. Reinforcement learning has shown promising results in learning complex visuomotor policies, especially in simulation environments where many samples can be collected at low cost. A core challenge is learning policies in simulation that c…
▽ More
Automation holds the potential to assist surgeons in robotic interventions, shifting their mental work load from visuomotor control to high level decision making. Reinforcement learning has shown promising results in learning complex visuomotor policies, especially in simulation environments where many samples can be collected at low cost. A core challenge is learning policies in simulation that can be deployed in the real world, thereby overcoming the sim-to-real gap.
In this work, we bridge the visual sim-to-real gap with an image-based reinforcement learning pipeline based on pixel-level domain adaptation and demonstrate its effectiveness on an image-based task in deformable object manipulation. We choose a tissue retraction task because of its importance in clinical reality of precise cancer surgery. After training in simulation on domain-translated images, our policy requires no retraining to perform tissue retraction with a 50% success rate on the real robotic system using raw RGB images. Furthermore, our sim-to-real transfer method makes no assumptions on the task itself and requires no paired images. This work introduces the first successful application of visual sim-to-real transfer for robotic manipulation of deformable objects in the surgical field, which represents a notable step towards the clinical translation of cognitive surgical robotics.
△ Less
Submitted 10 June, 2024;
originally announced June 2024.
-
Movement Primitive Diffusion: Learning Gentle Robotic Manipulation of Deformable Objects
Authors:
Paul Maria Scheikl,
Nicolas Schreiber,
Christoph Haas,
Niklas Freymuth,
Gerhard Neumann,
Rudolf Lioutikov,
Franziska Mathis-Ullrich
Abstract:
Policy learning in robot-assisted surgery (RAS) lacks data efficient and versatile methods that exhibit the desired motion quality for delicate surgical interventions. To this end, we introduce Movement Primitive Diffusion (MPD), a novel method for imitation learning (IL) in RAS that focuses on gentle manipulation of deformable objects. The approach combines the versatility of diffusion-based imit…
▽ More
Policy learning in robot-assisted surgery (RAS) lacks data efficient and versatile methods that exhibit the desired motion quality for delicate surgical interventions. To this end, we introduce Movement Primitive Diffusion (MPD), a novel method for imitation learning (IL) in RAS that focuses on gentle manipulation of deformable objects. The approach combines the versatility of diffusion-based imitation learning (DIL) with the high-quality motion generation capabilities of Probabilistic Dynamic Movement Primitives (ProDMPs). This combination enables MPD to achieve gentle manipulation of deformable objects, while maintaining data efficiency critical for RAS applications where demonstration data is scarce. We evaluate MPD across various simulated and real world robotic tasks on both state and image observations. MPD outperforms state-of-the-art DIL methods in success rate, motion quality, and data efficiency.
Project page: https://scheiklp.github.io/movement-primitive-diffusion/
△ Less
Submitted 10 June, 2024; v1 submitted 15 December, 2023;
originally announced December 2023.
-
Registered and Segmented Deformable Object Reconstruction from a Single View Point Cloud
Authors:
Pit Henrich,
Balázs Gyenes,
Paul Maria Scheikl,
Gerhard Neumann,
Franziska Mathis-Ullrich
Abstract:
In deformable object manipulation, we often want to interact with specific segments of an object that are only defined in non-deformed models of the object. We thus require a system that can recognize and locate these segments in sensor data of deformed real world objects. This is normally done using deformable object registration, which is problem specific and complex to tune. Recent methods util…
▽ More
In deformable object manipulation, we often want to interact with specific segments of an object that are only defined in non-deformed models of the object. We thus require a system that can recognize and locate these segments in sensor data of deformed real world objects. This is normally done using deformable object registration, which is problem specific and complex to tune. Recent methods utilize neural occupancy functions to improve deformable object registration by registering to an object reconstruction. Going one step further, we propose a system that in addition to reconstruction learns segmentation of the reconstructed object. As the resulting output already contains the information about the segments, we can skip the registration process. Tested on a variety of deformable objects in simulation and the real world, we demonstrate that our method learns to robustly find these segments. We also introduce a simple sampling algorithm to generate better training data for occupancy learning.
△ Less
Submitted 13 November, 2023;
originally announced November 2023.
-
Grounding Graph Network Simulators using Physical Sensor Observations
Authors:
Jonas Linkerhägner,
Niklas Freymuth,
Paul Maria Scheikl,
Franziska Mathis-Ullrich,
Gerhard Neumann
Abstract:
Physical simulations that accurately model reality are crucial for many engineering disciplines such as mechanical engineering and robotic motion planning. In recent years, learned Graph Network Simulators produced accurate mesh-based simulations while requiring only a fraction of the computational cost of traditional simulators. Yet, the resulting predictors are confined to learning from data gen…
▽ More
Physical simulations that accurately model reality are crucial for many engineering disciplines such as mechanical engineering and robotic motion planning. In recent years, learned Graph Network Simulators produced accurate mesh-based simulations while requiring only a fraction of the computational cost of traditional simulators. Yet, the resulting predictors are confined to learning from data generated by existing mesh-based simulators and thus cannot include real world sensory information such as point cloud data. As these predictors have to simulate complex physical systems from only an initial state, they exhibit a high error accumulation for long-term predictions. In this work, we integrate sensory information to ground Graph Network Simulators on real world observations. In particular, we predict the mesh state of deformable objects by utilizing point cloud data. The resulting model allows for accurate predictions over longer time horizons, even under uncertainties in the simulation, such as unknown material properties. Since point clouds are usually not available for every time step, especially in online settings, we employ an imputation-based model. The model can make use of such additional information only when provided, and resorts to a standard Graph Network Simulator, otherwise. We experimentally validate our approach on a suite of prediction tasks for mesh-based interactions between soft and rigid bodies. Our method results in utilization of additional point cloud information to accurately predict stable simulations where existing Graph Network Simulators fail.
△ Less
Submitted 7 March, 2023; v1 submitted 23 February, 2023;
originally announced February 2023.
-
LapGym -- An Open Source Framework for Reinforcement Learning in Robot-Assisted Laparoscopic Surgery
Authors:
Paul Maria Scheikl,
Balázs Gyenes,
Rayan Younis,
Christoph Haas,
Gerhard Neumann,
Martin Wagner,
Franziska Mathis-Ullrich
Abstract:
Recent advances in reinforcement learning (RL) have increased the promise of introducing cognitive assistance and automation to robot-assisted laparoscopic surgery (RALS). However, progress in algorithms and methods depends on the availability of standardized learning environments that represent skills relevant to RALS. We present LapGym, a framework for building RL environments for RALS that mode…
▽ More
Recent advances in reinforcement learning (RL) have increased the promise of introducing cognitive assistance and automation to robot-assisted laparoscopic surgery (RALS). However, progress in algorithms and methods depends on the availability of standardized learning environments that represent skills relevant to RALS. We present LapGym, a framework for building RL environments for RALS that models the challenges posed by surgical tasks, and sofa_env, a diverse suite of 12 environments. Motivated by surgical training, these environments are organized into 4 tracks: Spatial Reasoning, Deformable Object Manipulation & Gras**, Dissection, and Thread Manipulation. Each environment is highly parametrizable for increasing difficulty, resulting in a high performance ceiling for new algorithms. We use Proximal Policy Optimization (PPO) to establish a baseline for model-free RL algorithms, investigating the effect of several environment parameters on task difficulty. Finally, we show that many environments and parameter configurations reflect well-known, open problems in RL research, allowing researchers to continue exploring these fundamental problems in a surgical context. We aim to provide a challenging, standard environment suite for further development of RL for RALS, ultimately hel** to realize the full potential of cognitive surgical robotics. LapGym is publicly accessible through GitHub (https://github.com/ScheiklP/lap_gym).
△ Less
Submitted 19 February, 2023;
originally announced February 2023.
-
Multimodal CNN Networks for Brain Tumor Segmentation in MRI: A BraTS 2022 Challenge Solution
Authors:
Ramy A. Zeineldin,
Mohamed E. Karar,
Oliver Burgert,
Franziska Mathis-Ullrich
Abstract:
Automatic segmentation is essential for the brain tumor diagnosis, disease prognosis, and follow-up therapy of patients with gliomas. Still, accurate detection of gliomas and their sub-regions in multimodal MRI is very challenging due to the variety of scanners and imaging protocols. Over the last years, the BraTS Challenge has provided a large number of multi-institutional MRI scans as a benchmar…
▽ More
Automatic segmentation is essential for the brain tumor diagnosis, disease prognosis, and follow-up therapy of patients with gliomas. Still, accurate detection of gliomas and their sub-regions in multimodal MRI is very challenging due to the variety of scanners and imaging protocols. Over the last years, the BraTS Challenge has provided a large number of multi-institutional MRI scans as a benchmark for glioma segmentation algorithms. This paper describes our contribution to the BraTS 2022 Continuous Evaluation challenge. We propose a new ensemble of multiple deep learning frameworks namely, DeepSeg, nnU-Net, and DeepSCAN for automatic glioma boundaries detection in pre-operative MRI. It is worth noting that our ensemble models took first place in the final evaluation on the BraTS testing dataset with Dice scores of 0.9294, 0.8788, and 0.8803, and Hausdorf distance of 5.23, 13.54, and 12.05, for the whole tumor, tumor core, and enhancing tumor, respectively. Furthermore, the proposed ensemble method ranked first in the final ranking on another unseen test dataset, namely Sub-Saharan Africa dataset, achieving mean Dice scores of 0.9737, 0.9593, and 0.9022, and HD95 of 2.66, 1.72, 3.32 for the whole tumor, tumor core, and enhancing tumor, respectively. The docker image for the winning submission is publicly available at (https://hub.docker.com/r/razeineldin/camed22).
△ Less
Submitted 19 December, 2022;
originally announced December 2022.
-
Self-supervised iRegNet for the Registration of Longitudinal Brain MRI of Diffuse Glioma Patients
Authors:
Ramy A. Zeineldin,
Mohamed E. Karar,
Franziska Mathis-Ullrich,
Oliver Burgert
Abstract:
Reliable and accurate registration of patient-specific brain magnetic resonance imaging (MRI) scans containing pathologies is challenging due to tissue appearance changes. This paper describes our contribution to the Registration of the longitudinal brain MRI task of the Brain Tumor Sequence Registration Challenge 2022 (BraTS-Reg 2022). We developed an enhanced unsupervised learning-based method t…
▽ More
Reliable and accurate registration of patient-specific brain magnetic resonance imaging (MRI) scans containing pathologies is challenging due to tissue appearance changes. This paper describes our contribution to the Registration of the longitudinal brain MRI task of the Brain Tumor Sequence Registration Challenge 2022 (BraTS-Reg 2022). We developed an enhanced unsupervised learning-based method that extends the iRegNet. In particular, incorporating an unsupervised learning-based paradigm as well as several minor modifications to the network pipeline, allows the enhanced iRegNet method to achieve respectable results. Experimental findings show that the enhanced self-supervised model is able to improve the initial mean median registration absolute error (MAE) from 8.20 (7.62) mm to the lowest value of 3.51 (3.50) for the training set while achieving an MAE of 2.93 (1.63) mm for the validation set. Additional qualitative validation of this study was conducted through overlaying pre-post MRI pairs before and after the de-formable registration. The proposed method scored 5th place during the testing phase of the MICCAI BraTS-Reg 2022 challenge. The docker image to reproduce our BraTS-Reg submission results will be publicly available.
△ Less
Submitted 20 November, 2022;
originally announced November 2022.
-
LapSeg3D: Weakly Supervised Semantic Segmentation of Point Clouds Representing Laparoscopic Scenes
Authors:
Benjamin Alt,
Christian Kunz,
Darko Katic,
Rayan Younis,
Rainer Jäkel,
Beat Peter Müller-Stich,
Martin Wagner,
Franziska Mathis-Ullrich
Abstract:
The semantic segmentation of surgical scenes is a prerequisite for task automation in robot assisted interventions. We propose LapSeg3D, a novel DNN-based approach for the voxel-wise annotation of point clouds representing surgical scenes. As the manual annotation of training data is highly time consuming, we introduce a semi-autonomous clustering-based pipeline for the annotation of the gallbladd…
▽ More
The semantic segmentation of surgical scenes is a prerequisite for task automation in robot assisted interventions. We propose LapSeg3D, a novel DNN-based approach for the voxel-wise annotation of point clouds representing surgical scenes. As the manual annotation of training data is highly time consuming, we introduce a semi-autonomous clustering-based pipeline for the annotation of the gallbladder, which is used to generate segmented labels for the DNN. When evaluated against manually annotated data, LapSeg3D achieves an F1 score of 0.94 for gallbladder segmentation on various datasets of ex-vivo porcine livers. We show LapSeg3D to generalize accurately across different gallbladders and datasets recorded with different RGB-D camera systems.
△ Less
Submitted 15 July, 2022;
originally announced July 2022.
-
The Brain Tumor Sequence Registration (BraTS-Reg) Challenge: Establishing Correspondence Between Pre-Operative and Follow-up MRI Scans of Diffuse Glioma Patients
Authors:
Bhakti Baheti,
Satrajit Chakrabarty,
Hamed Akbari,
Michel Bilello,
Benedikt Wiestler,
Julian Schwarting,
Evan Calabrese,
Jeffrey Rudie,
Syed Abidi,
Mina Mousa,
Javier Villanueva-Meyer,
Brandon K. K. Fields,
Florian Kofler,
Russell Takeshi Shinohara,
Juan Eugenio Iglesias,
Tony C. W. Mok,
Albert C. S. Chung,
Marek Wodzinski,
Artur Jurgas,
Niccolo Marini,
Manfredo Atzori,
Henning Muller,
Christoph Grobroehmer,
Hanna Siebert,
Lasse Hansen
, et al. (48 additional authors not shown)
Abstract:
Registration of longitudinal brain MRI scans containing pathologies is challenging due to dramatic changes in tissue appearance. Although there has been progress in develo** general-purpose medical image registration techniques, they have not yet attained the requisite precision and reliability for this task, highlighting its inherent complexity. Here we describe the Brain Tumor Sequence Registr…
▽ More
Registration of longitudinal brain MRI scans containing pathologies is challenging due to dramatic changes in tissue appearance. Although there has been progress in develo** general-purpose medical image registration techniques, they have not yet attained the requisite precision and reliability for this task, highlighting its inherent complexity. Here we describe the Brain Tumor Sequence Registration (BraTS-Reg) challenge, as the first public benchmark environment for deformable registration algorithms focusing on estimating correspondences between pre-operative and follow-up scans of the same patient diagnosed with a diffuse brain glioma. The BraTS-Reg data comprise de-identified multi-institutional multi-parametric MRI (mpMRI) scans, curated for size and resolution according to a canonical anatomical template, and divided into training, validation, and testing sets. Clinical experts annotated ground truth (GT) landmark points of anatomical locations distinct across the temporal domain. Quantitative evaluation and ranking were based on the Median Euclidean Error (MEE), Robustness, and the determinant of the Jacobian of the displacement field. The top-ranked methodologies yielded similar performance across all evaluation metrics and shared several methodological commonalities, including pre-alignment, deep neural networks, inverse consistency analysis, and test-time instance optimization per-case basis as a post-processing step. The top-ranked method attained the MEE at or below that of the inter-rater variability for approximately 60% of the evaluated landmarks, underscoring the scope for further accuracy and robustness improvements, especially relative to human experts. The aim of BraTS-Reg is to continue to serve as an active resource for research, with the data and online evaluation tools accessible at https://bratsreg.github.io/.
△ Less
Submitted 17 April, 2024; v1 submitted 13 December, 2021;
originally announced December 2021.
-
Ensemble CNN Networks for GBM Tumors Segmentation using Multi-parametric MRI
Authors:
Ramy A. Zeineldin,
Mohamed E. Karar,
Franziska Mathis-Ullrich,
Oliver Burgert
Abstract:
Glioblastomas are the most aggressive fast-growing primary brain cancer which originate in the glial cells of the brain. Accurate identification of the malignant brain tumor and its sub-regions is still one of the most challenging problems in medical image segmentation. The Brain Tumor Segmentation Challenge (BraTS) has been a popular benchmark for automatic brain glioblastomas segmentation algori…
▽ More
Glioblastomas are the most aggressive fast-growing primary brain cancer which originate in the glial cells of the brain. Accurate identification of the malignant brain tumor and its sub-regions is still one of the most challenging problems in medical image segmentation. The Brain Tumor Segmentation Challenge (BraTS) has been a popular benchmark for automatic brain glioblastomas segmentation algorithms since its initiation. In this year, BraTS 2021 challenge provides the largest multi-parametric (mpMRI) dataset of 2,000 pre-operative patients. In this paper, we propose a new aggregation of two deep learning frameworks namely, DeepSeg and nnU-Net for automatic glioblastoma recognition in pre-operative mpMRI. Our ensemble method obtains Dice similarity scores of 92.00, 87.33, and 84.10 and Hausdorff Distances of 3.81, 8.91, and 16.02 for the enhancing tumor, tumor core, and whole tumor regions, respectively, on the BraTS 2021 validation set, ranking us among the top ten teams. These experimental findings provide evidence that it can be readily applied clinically and thereby aiding in the brain cancer prognosis, therapy planning, and therapy response monitoring. A docker image for reproducing our segmentation results is available online at (https://hub.docker.com/r/razeineldin/deepseg21).
△ Less
Submitted 27 December, 2021; v1 submitted 13 December, 2021;
originally announced December 2021.
-
Cooperative Assistance in Robotic Surgery through Multi-Agent Reinforcement Learning
Authors:
Paul Maria Scheikl,
Balázs Gyenes,
Tornike Davitashvili,
Rayan Younis,
André Schulze,
Beat P. Müller-Stich,
Gerhard Neumann,
Martin Wagner,
Franziska Mathis-Ullrich
Abstract:
Cognitive cooperative assistance in robot-assisted surgery holds the potential to increase quality of care in minimally invasive interventions. Automation of surgical tasks promises to reduce the mental exertion and fatigue of surgeons. In this work, multi-agent reinforcement learning is demonstrated to be robust to the distribution shift introduced by pairing a learned policy with a human team me…
▽ More
Cognitive cooperative assistance in robot-assisted surgery holds the potential to increase quality of care in minimally invasive interventions. Automation of surgical tasks promises to reduce the mental exertion and fatigue of surgeons. In this work, multi-agent reinforcement learning is demonstrated to be robust to the distribution shift introduced by pairing a learned policy with a human team member. Multi-agent policies are trained directly from images in simulation to control multiple instruments in a sub task of the minimally invasive removal of the gallbladder. These agents are evaluated individually and in cooperation with humans to demonstrate their suitability as autonomous assistants. Compared to human teams, the hybrid teams with artificial agents perform better considering completion time (44.4% to 71.2% shorter) as well as number of collisions (44.7% to 98.0% fewer). Path lengths, however, increase under control of an artificial agent (11.4% to 33.5% longer). A multi-agent formulation of the learning problem was favored over a single-agent formulation on this surgical sub task, due to the sequential learning of the two instruments. This approach may be extended to other tasks that are difficult to formulate within the standard reinforcement learning framework. Multi-agent reinforcement learning may shift the paradigm of cognitive robotic surgery towards seamless cooperation between surgeons and assistive technologies.
△ Less
Submitted 10 October, 2021;
originally announced October 2021.
-
Comparative Validation of Machine Learning Algorithms for Surgical Workflow and Skill Analysis with the HeiChole Benchmark
Authors:
Martin Wagner,
Beat-Peter Müller-Stich,
Anna Kisilenko,
Duc Tran,
Patrick Heger,
Lars Mündermann,
David M Lubotsky,
Benjamin Müller,
Tornike Davitashvili,
Manuela Capek,
Annika Reinke,
Tong Yu,
Armine Vardazaryan,
Chinedu Innocent Nwoye,
Nicolas Padoy,
Xinyang Liu,
Eung-Joo Lee,
Constantin Disch,
Hans Meine,
Tong Xia,
Fucang Jia,
Satoshi Kondo,
Wolfgang Reiter,
Yueming **,
Yonghao Long
, et al. (16 additional authors not shown)
Abstract:
PURPOSE: Surgical workflow and skill analysis are key technologies for the next generation of cognitive surgical assistance systems. These systems could increase the safety of the operation through context-sensitive warnings and semi-autonomous robotic assistance or improve training of surgeons via data-driven feedback. In surgical workflow analysis up to 91% average precision has been reported fo…
▽ More
PURPOSE: Surgical workflow and skill analysis are key technologies for the next generation of cognitive surgical assistance systems. These systems could increase the safety of the operation through context-sensitive warnings and semi-autonomous robotic assistance or improve training of surgeons via data-driven feedback. In surgical workflow analysis up to 91% average precision has been reported for phase recognition on an open data single-center dataset. In this work we investigated the generalizability of phase recognition algorithms in a multi-center setting including more difficult recognition tasks such as surgical action and surgical skill. METHODS: To achieve this goal, a dataset with 33 laparoscopic cholecystectomy videos from three surgical centers with a total operation time of 22 hours was created. Labels included annotation of seven surgical phases with 250 phase transitions, 5514 occurences of four surgical actions, 6980 occurences of 21 surgical instruments from seven instrument categories and 495 skill classifications in five skill dimensions. The dataset was used in the 2019 Endoscopic Vision challenge, sub-challenge for surgical workflow and skill analysis. Here, 12 teams submitted their machine learning algorithms for recognition of phase, action, instrument and/or skill assessment. RESULTS: F1-scores were achieved for phase recognition between 23.9% and 67.7% (n=9 teams), for instrument presence detection between 38.5% and 63.8% (n=8 teams), but for action recognition only between 21.8% and 23.3% (n=5 teams). The average absolute error for skill assessment was 0.78 (n=1 team). CONCLUSION: Surgical workflow and skill analysis are promising technologies to support the surgical team, but are not solved yet, as shown by our comparison of algorithms. This novel benchmark can be used for comparable evaluation and validation of future work.
△ Less
Submitted 30 September, 2021;
originally announced September 2021.