-
SystemDS: A Declarative Machine Learning System for the End-to-End Data Science Lifecycle
Authors:
Matthias Boehm,
Iulian Antonov,
Sebastian Baunsgaard,
Mark Dokter,
Robert Ginthoer,
Kevin Innerebner,
Florijan Klezin,
Stefanie Lindstaedt,
Arnab Phani,
Benjamin Rath,
Berthold Reinwald,
Shafaq Siddiqi,
Sebastian Benjamin Wrede
Abstract:
Machine learning (ML) applications become increasingly common in many domains. ML systems to execute these workloads include numerical computing frameworks and libraries, ML algorithm libraries, and specialized systems for deep neural networks and distributed ML. These systems focus primarily on efficient model training and scoring. However, the data science process is exploratory, and deals with underspecified objectives and a wide variety of heterogeneous data sources. Therefore, additional tools are employed for data engineering and debugging, which requires boundary crossing, unnecessary manual effort, and lacks optimization across the lifecycle. In this paper, we introduce SystemDS, an open source ML system for the end-to-end data science lifecycle from data integration, cleaning, and preparation, over local, distributed, and federated ML model training, to debugging and serving. To this end, we aim to provide a stack of declarative language abstractions for the different lifecycle tasks, and users with different expertise. We describe the overall system architecture, explain major design decisions (motivated by lessons learned from Apache SystemML), and discuss key features and research directions. Finally, we provide preliminary results that show the potential of end-to-end lifecycle optimization.
Submitted 7 January, 2020; v1 submitted 6 September, 2019;
originally announced September 2019.
-
US-Cut: Interactive Algorithm for rapid Detection and Segmentation of Liver Tumors in Ultrasound Acquisitions
Authors:
Jan Egger,
Philip Voglreiter,
Mark Dokter,
Michael Hofmann,
Xiaojun Chen,
Wolfram G. Zoller,
Dieter Schmalstieg,
Alexander Hann
Abstract:
Ultrasound (US) is the most commonly used liver imaging modality worldwide. It plays an important role in the follow-up of cancer patients with liver metastases. We present an interactive segmentation approach for liver tumors in US acquisitions. Due to the low image quality and the low contrast between the tumors and the surrounding tissue in US images, the segmentation is very challenging. Thus, the clinical practice still relies on manual measurement and outlining of the tumors in the US images. We target this problem by applying an interactive segmentation algorithm to the US data, allowing the user to get real-time feedback of the segmentation results. The algorithm has been developed and tested hand-in-hand by physicians and computer scientists to make sure a future practical usage in a clinical setting is feasible. To cover typical acquisitions from the clinical routine, the approach has been evaluated with dozens of datasets where the tumors are hyperechoic (brighter), hypoechoic (darker) or isoechoic (similar) in comparison to the surrounding liver tissue. Due to the interactive real-time behavior of the approach, it was possible even in difficult cases to find satisfying segmentations of the tumors within seconds and without parameter settings, and the average tumor deviation was only 1.4 mm compared with manual measurements. However, the long-term goal is to ease the volumetric acquisition of liver tumors in order to evaluate treatment response. An additional aim is the registration of intraoperative US images via the interactive segmentations to the patient's pre-interventional CT acquisitions.
Submitted 1 March, 2016;
originally announced March 2016.
-
Interactive Volumetry Of Liver Ablation Zones
Authors:
Jan Egger,
Harald Busse,
Philipp Brandmaier,
Daniel Seider,
Matthias Gawlitza,
Steffen Strocka,
Philip Voglreiter,
Mark Dokter,
Michael Hofmann,
Bernhard Kainz,
Alexander Hann,
Xiaojun Chen,
Tuomas Alhonnoro,
Mika Pollari,
Dieter Schmalstieg,
Michael Moche
Abstract:
Percutaneous radiofrequency ablation (RFA) is a minimally invasive technique that destroys cancer cells by heat. The heat results from focusing energy in the radiofrequency spectrum through a needle. Among other things, this can enable the treatment of patients who are not eligible for open surgery. However, the possibility of recurrent liver cancer due to incomplete ablation of the tumor makes post-interventional monitoring via regular follow-up scans mandatory. These scans have to be carefully inspected for any conspicuous features. Within this study, the RF ablation zones from twelve post-interventional CT acquisitions have been segmented semi-automatically to support the visual inspection. An interactive, graph-based contouring approach, which prefers spherically shaped regions, has been applied. For the quantitative and qualitative analysis of the algorithm's results, manual slice-by-slice segmentations produced by clinical experts have been used as the gold standard (which have also been compared among each other). As the evaluation metric for the statistical validation, the Dice Similarity Coefficient (DSC) has been calculated. The results show that the proposed tool provides lesion segmentation with sufficient accuracy much faster than manual segmentation. The visual feedback and interactivity make the proposed tool well suited to the clinical workflow.
Submitted 21 October, 2015;
originally announced December 2015.