
arXiv:2402.19340

One model to use them all: Training a segmentation model with complementary datasets

Authors: Alexander C. Jenke, Sebastian Bodenstedt, Fiona R. Kolbinger, Marius Distler, Jürgen Weitz, Stefanie Speidel

Abstract: Understanding a surgical scene is crucial for computer-assisted surgery systems to provide any intelligent assistance functionality. One way of achieving this scene understanding is via scene segmentation, where every pixel of a frame is classified, thereby identifying the visible structures and tissues. Progress on fully segmenting surgical scenes has been made using machine learning. However, such models require large amounts of annotated training data, containing examples of all relevant object classes. Such fully annotated datasets are hard to create, as every pixel in a frame needs to be annotated by medical experts, and are therefore rarely available. In this work, we propose a method to combine multiple partially annotated datasets, which provide complementary annotations, into one model, enabling better scene segmentation and the use of multiple readily available datasets. Our method aims to combine available data with complementary labels by leveraging mutually exclusive properties to maximize information. Specifically, we propose to use positive annotations of other classes as negative samples and to exclude background pixels of binary annotations, as we cannot tell if they contain a class not annotated but predicted by the model. We evaluate our method by training a DeepLabV3 on the publicly available Dresden Surgical Anatomy Dataset, which provides multiple subsets of binary segmented anatomical structures. Our approach successfully combines 6 classes into one model, increasing the overall Dice Score by 4.4% compared to an ensemble of models trained on the classes individually. By including information on multiple classes, we were able to reduce confusion between stomach and colon by 24%. Our results demonstrate the feasibility of training a model on multiple datasets. This paves the way for future work further alleviating the need for one large, fully segmented dataset.

Submitted 5 April, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

Comments: Accepted at IPCAI 2024; submitted to IJCARS (under revision)
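
The label-combination rule described in the abstract of arXiv:2402.19340 above can be illustrated with a small sketch. This is not the authors' implementation: it assumes per-class sigmoid outputs trained with binary cross-entropy, which the abstract does not specify, and the function names `partial_label_targets` and `masked_bce` are hypothetical. Under this reading, a binary mask for one class supervises that class everywhere, supervises every other class only where the annotated class is positive (as a negative sample), and leaves the remaining background pixels out of the loss for the other classes.

```python
import numpy as np

def partial_label_targets(mask, annotated_class, num_classes):
    """Build per-class targets and validity weights from one binary mask.

    mask: (H, W) array of 0/1 for the single annotated class.
    Returns (targets, weights), each of shape (num_classes, H, W).
    A weight of 0 means the pixel is excluded from the loss for that class.
    """
    mask = np.asarray(mask, dtype=np.float64)
    targets = np.zeros((num_classes,) + mask.shape)
    weights = np.zeros((num_classes,) + mask.shape)
    for k in range(num_classes):
        if k == annotated_class:
            # Assumption: the annotated class is known at every pixel.
            targets[k] = mask
            weights[k] = 1.0
        else:
            # Positives of the annotated class act as negatives for class k;
            # background pixels stay excluded because their label is unknown.
            weights[k] = mask
    return targets, weights

def masked_bce(probs, targets, weights, eps=1e-7):
    """Binary cross-entropy averaged over the pixels with non-zero weight."""
    probs = np.clip(probs, eps, 1.0 - eps)
    bce = -(targets * np.log(probs) + (1.0 - targets) * np.log(1.0 - probs))
    total = weights.sum()
    return float((bce * weights).sum() / total) if total > 0 else 0.0

# Usage: a 2x2 frame annotated only for class 1 out of 3 classes.
mask = np.array([[1, 0], [0, 0]])
targets, weights = partial_label_targets(mask, annotated_class=1, num_classes=3)
loss = masked_bce(np.full((3, 2, 2), 0.5), targets, weights)
```

In the usage example only six of the twelve class-pixel pairs contribute to the loss: all four pixels for the annotated class, plus the single positive pixel for each of the two other classes.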

arXiv:2312.05978

Neural Architecture Codesign for Fast Bragg Peak Analysis

Authors: Luke McDermott, Jason Weitz, Dmitri Demler, Daniel Cummings, Nhan Tran, Javier Duarte

Abstract: We develop an automated pipeline to streamline neural architecture codesign for fast, real-time Bragg peak analysis in high-energy diffraction microscopy. Traditional approaches, notably pseudo-Voigt fitting, demand significant computational resources, prompting interest in deep learning models for more efficient solutions. Our method employs neural architecture search and AutoML to enhance these models, including hardware costs, leading to the discovery of more hardware-efficient neural architectures. Our results match the performance, while achieving a 13$\times$ reduction in bit operations compared to the previous state-of-the-art. We show further speedup through model compression techniques such as quantization-aware-training and neural network pruning. Additionally, our hierarchical search space provides greater flexibility in optimization, which can easily extend to other tasks and domains.

Submitted 11 December, 2023; v1 submitted 10 December, 2023; originally announced December 2023.

Comments: To appear in 3rd Annual AAAI Workshop on AI to Accelerate Science and Engineering (AI2ASE)

Report number: FERMILAB-CONF-23-0813-CSAID-PPD

arXiv:2309.03048

Exploring Semantic Consistency in Unpaired Image Translation to Generate Data for Surgical Applications

Authors: Danush Kumar Venkatesh, Dominik Rivoir, Micha Pfeiffer, Fiona Kolbinger, Marius Distler, Jürgen Weitz, Stefanie Speidel

Abstract: In surgical computer vision applications, obtaining labeled training data is challenging due to data-privacy concerns and the need for expert annotation. Unpaired image-to-image translation techniques have been explored to automatically generate large annotated datasets by translating synthetic images to the realistic domain. However, preserving the structure and semantic consistency between the input and translated images presents significant challenges, mainly when there is a distributional mismatch in the semantic characteristics of the domains. This study empirically investigates unpaired image translation methods for generating suitable data in surgical applications, explicitly focusing on semantic consistency. We extensively evaluate various state-of-the-art image translation models on two challenging surgical datasets and downstream semantic segmentation tasks. We find that a simple combination of structural-similarity loss and contrastive learning yields the most promising results. Quantitatively, we show that the data generated with this approach yields higher semantic consistency and can be used more effectively as training data. The code is available at https://gitlab.com/nct_tso_public/constructs.

Submitted 21 February, 2024; v1 submitted 6 September, 2023; originally announced September 2023.

Comments: Accepted at IPCAI 2024

arXiv:2103.17204

Long-Term Temporally Consistent Unpaired Video Translation from Simulated Surgical 3D Data

Authors: Dominik Rivoir, Micha Pfeiffer, Reuben Docea, Fiona Kolbinger, Carina Riediger, Jürgen Weitz, Stefanie Speidel

Abstract: Research in unpaired video translation has mainly focused on short-term temporal consistency by conditioning on neighboring frames. However, for transfer from simulated to photorealistic sequences, available information on the underlying geometry offers potential for achieving global consistency across views. We propose a novel approach which combines unpaired image translation with neural rendering to transfer simulated to photorealistic surgical abdominal scenes. By introducing global learnable textures and a lighting-invariant view-consistency loss, our method produces consistent translations of arbitrary views and thus enables long-term consistent video synthesis. We design and test our model to generate video sequences from minimally-invasive surgical abdominal scenes. Because labeled data is often limited in this domain, photorealistic data where ground truth information from the simulated domain is preserved is especially relevant. By extending existing image-based methods to view-consistent videos, we aim to impact the applicability of simulated training and evaluation environments for surgical applications. Code and data: http://opencas.dkfz.de/video-sim2real.

Submitted 19 August, 2021; v1 submitted 31 March, 2021; originally announced March 2021.

Comments: Accepted at the International Conference on Computer Vision (ICCV) 2021

arXiv:2009.09150

Population Susceptibility Variation and Its Effect on Contagion Dynamics

Authors: Christopher Rose, Andrew J. Medford, C. Franklin Goldsmith, Tejs Vegge, Joshua Weitz, Andrew A. Peterson

Abstract: Susceptibility governs the dynamics of contagion. The classical SIR model is one of the simplest compartmental models of contagion spread, assuming a single shared susceptibility level. However, variation in susceptibility over a population can fundamentally affect the dynamics of contagion and thus the ultimate outcome of a pandemic. We develop mathematical machinery which explicitly considers susceptibility variation, illuminates how the susceptibility distribution is sculpted by contagion, and thence how such variation affects the SIR differential equations that govern contagion. Our methods allow us to derive closed form expressions for herd immunity thresholds as a function of initial susceptibility distributions and suggest an intuitively satisfying approach to inoculation when only a fraction of the population is accessible to such intervention. Of particular interest, if we assume static susceptibility of individuals in the susceptible pool, ignoring susceptibility diversity always results in overestimation of the herd immunity threshold, and that difference can be dramatic. Therefore, we should develop robust measures of susceptibility variation as part of public health strategies for handling pandemics.

Submitted 18 September, 2020; originally announced September 2020.

Comments: 12 pages, 2 figures
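
For context on the classical baseline that the abstract of arXiv:2009.09150 contrasts against, the sketch below covers only the textbook homogeneous SIR model, in which everyone shares one susceptibility level. It does not implement the paper's susceptibility-distribution machinery. The names `herd_immunity_threshold` and `simulate_sir`, the forward-Euler integrator, and the parameter values are illustrative assumptions. In this model the herd immunity threshold is 1 - 1/R0, with R0 = beta / gamma.

```python
def herd_immunity_threshold(beta, gamma):
    """Classical homogeneous-SIR herd immunity threshold, 1 - 1/R0."""
    r0 = beta / gamma
    return max(0.0, 1.0 - 1.0 / r0)

def simulate_sir(beta, gamma, i0=1e-3, dt=0.01, steps=20000):
    """Integrate dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I, dR/dt = gamma*I.

    Uses a simple forward-Euler step (an assumption made for brevity);
    S, I, R are population fractions that sum to 1.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    for _ in range(steps):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
    return s, i, r

# Usage: illustrative rates giving R0 = 3.
threshold = herd_immunity_threshold(beta=0.3, gamma=0.1)
s_final, i_final, r_final = simulate_sir(beta=0.3, gamma=0.1)
```

The abstract's claim is that, when individual susceptibility is static, this homogeneous threshold always overestimates the threshold obtained once susceptibility variation is taken into account.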

arXiv:2007.00548

Rethinking Anticipation Tasks: Uncertainty-aware Anticipation of Sparse Surgical Instrument Usage for Context-aware Assistance

Authors: Dominik Rivoir, Sebastian Bodenstedt, Isabel Funke, Felix von Bechtolsheim, Marius Distler, Jürgen Weitz, Stefanie Speidel

Abstract: Intra-operative anticipation of instrument usage is a necessary component for context-aware assistance in surgery, e.g. for instrument preparation or semi-automation of robotic tasks. However, the sparsity of instrument occurrences in long videos poses a challenge. Current approaches are limited as they assume knowledge on the timing of future actions or require dense temporal segmentations during training and inference. We propose a novel learning task for anticipation of instrument usage in laparoscopic videos that overcomes these limitations. During training, only sparse instrument annotations are required and inference is done solely on image data. We train a probabilistic model to address the uncertainty associated with future events. Our approach outperforms several baselines and is competitive to a variant using richer annotations. We demonstrate the model's ability to quantify task-relevant uncertainties. To the best of our knowledge, we are the first to propose a method for anticipating instruments in surgery.

Submitted 30 March, 2022; v1 submitted 1 July, 2020; originally announced July 2020.

Comments: Accepted at MICCAI 2020

arXiv:2005.14695

Non-Rigid Volume to Surface Registration using a Data-Driven Biomechanical Model

Authors: Micha Pfeiffer, Carina Riediger, Stefan Leger, Jens-Peter Kühn, Danilo Seppelt, Ralf-Thorsten Hoffmann, Jürgen Weitz, Stefanie Speidel

Abstract: Non-rigid registration is a key component in soft-tissue navigation. We focus on laparoscopic liver surgery, where we register the organ model obtained from a preoperative CT scan to the intraoperative partial organ surface, reconstructed from the laparoscopic video. This is a challenging task due to sparse and noisy intraoperative data, real-time requirements and many unknowns - such as tissue properties and boundary conditions. Furthermore, establishing correspondences between pre- and intraoperative data can be extremely difficult since the liver usually lacks distinct surface features and the used imaging modalities suffer from very different types of noise. In this work, we train a convolutional neural network to perform both the search for surface correspondences as well as the non-rigid registration in one step. The network is trained on physically accurate biomechanical simulations of randomly generated, deforming organ-like structures. This enables the network to immediately generalize to a new patient organ without the need to re-train. We add various amounts of noise to the intraoperative surfaces during training, making the network robust to noisy intraoperative data. During inference, the network outputs the displacement field which matches the preoperative volume to the partial intraoperative surface. In multiple experiments, we show that the network translates well to real data while maintaining a high inference speed. Our code is made available online.

Submitted 29 May, 2020; originally announced May 2020.

Comments: Provisionally accepted for MICCAI 2020

arXiv:2002.11367

Unsupervised Temporal Video Segmentation as an Auxiliary Task for Predicting the Remaining Surgery Duration

Authors: Dominik Rivoir, Sebastian Bodenstedt, Felix von Bechtolsheim, Marius Distler, Jürgen Weitz, Stefanie Speidel

Abstract: Estimating the remaining surgery duration (RSD) during surgical procedures can be useful for OR planning and anesthesia dose estimation. With the recent success of deep learning-based methods in computer vision, several neural network approaches have been proposed for fully automatic RSD prediction based solely on visual data from the endoscopic camera. We investigate whether RSD prediction can be improved using unsupervised temporal video segmentation as an auxiliary learning task. As opposed to previous work, which presented supervised surgical phase recognition as an auxiliary task, we avoid the need for manual annotations by proposing a similar but unsupervised learning objective which clusters video sequences into temporally coherent segments. In multiple experimental setups, results obtained by learning the auxiliary task are incorporated into a deep RSD model through feature extraction, pretraining or regularization. Further, we propose a novel loss function for RSD training which attempts to counteract unfavorable characteristics of the RSD ground truth. Using our unsupervised method as an auxiliary task for RSD training, we outperform other self-supervised methods and are comparable to the supervised state-of-the-art. Combined with the novel RSD loss, we slightly outperform the supervised approach.

Submitted 26 February, 2020; originally announced February 2020.

Journal ref: OR 2.0 Context-Aware Operating Theaters and Machine Learning in Clinical Neuroimaging (2019) 29-37

arXiv:1907.11454

Using 3D Convolutional Neural Networks to Learn Spatiotemporal Features for Automatic Surgical Gesture Recognition in Video

Authors: Isabel Funke, Sebastian Bodenstedt, Florian Oehme, Felix von Bechtolsheim, Jürgen Weitz, Stefanie Speidel

Abstract: Automatically recognizing surgical gestures is a crucial step towards a thorough understanding of surgical skill. Possible areas of application include automatic skill assessment, intra-operative monitoring of critical surgical steps, and semi-automation of surgical tasks. Solutions that rely only on the laparoscopic video and do not require additional sensor hardware are especially attractive as they can be implemented at low cost in many scenarios. However, surgical gesture recognition based only on video is a challenging problem that requires effective means to extract both visual and temporal information from the video. Previous approaches mainly rely on frame-wise feature extractors, either handcrafted or learned, which fail to capture the dynamics in surgical video. To address this issue, we propose to use a 3D Convolutional Neural Network (CNN) to learn spatiotemporal features from consecutive video frames. We evaluate our approach on recordings of robot-assisted suturing on a bench-top model, which are taken from the publicly available JIGSAWS dataset. Our approach achieves high frame-wise surgical gesture recognition accuracies of more than 84%, outperforming comparable models that either extract only spatial features or model spatial and low-level temporal information separately. For the first time, these results demonstrate the benefit of spatiotemporal CNNs for video-based surgical gesture recognition.

Submitted 26 July, 2019; originally announced July 2019.

Comments: Accepted at MICCAI 2019. Source code will be made available

arXiv:1907.02882

Generating large labeled data sets for laparoscopic image processing tasks using unpaired image-to-image translation

Authors: Micha Pfeiffer, Isabel Funke, Maria R. Robu, Sebastian Bodenstedt, Leon Strenger, Sandy Engelhardt, Tobias Roß, Matthew J. Clarkson, Kurinchi Gurusamy, Brian R. Davidson, Lena Maier-Hein, Carina Riediger, Thilo Welsch, Jürgen Weitz, Stefanie Speidel

Abstract: In the medical domain, the lack of large training data sets and benchmarks is often a limiting factor for training deep neural networks. In contrast to expensive manual labeling, computer simulations can generate large and fully labeled data sets with a minimum of manual effort. However, models that are trained on simulated data usually do not translate well to real scenarios. To bridge the domain gap between simulated and real laparoscopic images, we exploit recent advances in unpaired image-to-image translation. We extend an image-to-image translation method to generate a diverse multitude of realistic-looking synthetic images based on images from a simple laparoscopy simulation. By incorporating means to ensure that the image content is preserved during the translation process, we ensure that the labels given for the simulated images remain valid for their realistic-looking translations. This way, we are able to generate a large, fully labeled synthetic data set of laparoscopic images with realistic appearance. We show that this data set can be used to train models for the task of liver segmentation of laparoscopic images. We achieve average dice scores of up to 0.89 in some patients without manually labeling a single laparoscopic image and show that using our synthetic data to pre-train models can greatly improve their performance. The synthetic data set will be made publicly available, fully labeled with segmentation maps, depth maps, normal maps, and positions of tools and camera (http://opencas.dkfz.de/image2image).

Submitted 5 July, 2019; originally announced July 2019.

Comments: Accepted at MICCAI 2019
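
The Dice score reported in the abstract of arXiv:1907.02882 above is the standard overlap measure 2|A ∩ B| / (|A| + |B|) between a predicted and a reference mask. A minimal sketch for binary masks follows. The function name `dice_score` is hypothetical, and returning 1.0 when both masks are empty is a convention assumed here, not something the abstract states.

```python
import numpy as np

def dice_score(pred, target):
    """Dice coefficient between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return float(2.0 * intersection / denom)

# Usage: one of the two predicted pixels overlaps the single reference pixel.
pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
score = dice_score(pred, target)  # 2 * 1 / (2 + 1)
```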

arXiv:1904.00722

Learning Soft Tissue Behavior of Organs for Surgical Navigation with Convolutional Neural Networks

Authors: Micha Pfeiffer, Carina Riediger, Jürgen Weitz, Stefanie Speidel

Abstract: Purpose: In surgical navigation, pre-operative organ models are presented to surgeons during the intervention to help them in efficiently finding their target. In the case of soft tissue, these models need to be deformed and adapted to the current situation by using intra-operative sensor data. Real-time capable biomechanical models are a promising way to realize this. Methods: We train a fully convolutional neural network to estimate a displacement field of all points inside an organ when given only the displacement of a part of the organ's surface. The network trains on entirely synthetic data of random organ-like meshes, which allows us to generate much more data than is otherwise available. The input and output data is discretized into a regular grid, allowing us to fully utilize the capabilities of convolutional operators and to train and infer in a highly parallelized manner. Results: The system is evaluated on in-silico liver models, phantom liver data and human in-vivo breathing data. We test the performance with varying material parameters, organ shapes and amount of visible surface. Even though the network is only trained on synthetic data, it adapts well to the various cases and gives a good estimation of the internal organ displacement. The inference runs at over 50 frames per second. Conclusions: We present a novel method for training a data-driven, real-time capable deformation model. The accuracy is comparable to that of other registration methods; the model adapts very well to previously unseen organs and does not need to be re-trained for every patient. The high inference speed makes this method useful for many applications such as surgical navigation and real-time simulation.

Submitted 26 March, 2019; originally announced April 2019.

Comments: Accepted at IPCAI 2019; submitted to IJCARS (under revision). Source code will be released upon publication in IJCARS

arXiv:1903.02306

doi 10.1007/s11548-019-01995-1

Video-based surgical skill assessment using 3D convolutional neural networks

Authors: Isabel Funke, Sören Torge Mees, Jürgen Weitz, Stefanie Speidel

Abstract: Purpose: A profound education of novice surgeons is crucial to ensure that surgical interventions are effective and safe. One important aspect is the teaching of technical skills for minimally invasive or robot-assisted procedures. This includes the objective and preferably automatic assessment of surgical skill. Recent studies presented good results for automatic, objective skill evaluation by collecting and analyzing motion data such as trajectories of surgical instruments. However, obtaining the motion data generally requires additional equipment for instrument tracking or the availability of a robotic surgery system to capture kinematic data. In contrast, we investigate a method for automatic, objective skill assessment that requires video data only. This has the advantage that video can be collected effortlessly during minimally invasive and robot-assisted training scenarios. Methods: Our method builds on recent advances in deep learning-based video classification. Specifically, we propose to use an inflated 3D ConvNet to classify snippets, i.e., stacks of a few consecutive frames, extracted from surgical video. The network is extended into a Temporal Segment Network during training. Results: We evaluate the method on the publicly available JIGSAWS dataset, which consists of recordings of basic robot-assisted surgery tasks performed on a dry lab bench-top model. Our approach achieves high skill classification accuracies ranging from 95.1% to 100.0%. Conclusions: Our results demonstrate the feasibility of deep learning-based assessment of technical skill from surgical video. Notably, the 3D ConvNet is able to learn meaningful patterns directly from the data, alleviating the need for manual feature engineering. Further evaluation will require more annotated data for training and testing.

Submitted 4 September, 2019; v1 submitted 6 March, 2019; originally announced March 2019.

Comments: IPCAI 2019/ IJCARS

Journal ref: IJCARS 14.7 (2019) pp. 1217-1225

arXiv:1811.03384

Prediction of laparoscopic procedure duration using unlabeled, multimodal sensor data

Authors: Sebastian Bodenstedt, Martin Wagner, Lars Mündermann, Hannes Kenngott, Beat Müller-Stich, Michael Breucha, Sören Torge Mees, Jürgen Weitz, Stefanie Speidel

Abstract: Purpose: The course of surgical procedures is often unpredictable, making it difficult to estimate the duration of procedures beforehand. A context-aware method that analyses the workflow of an intervention online and automatically predicts the remaining duration would alleviate these problems. As basis for such an estimate, information regarding the current state of the intervention is required. Methods: Today, the operating room contains a diverse range of sensors. During laparoscopic interventions, the endoscopic video stream is an ideal source of such information. Extracting quantitative information from the video is challenging though, due to its high dimensionality. Other surgical devices (e.g. insufflator, lights, etc.) provide data streams which are, in contrast to the video stream, more compact and easier to quantify. However, it is uncertain whether such streams offer sufficient information for estimating the duration of surgery. Here, we propose and compare methods, based on convolutional neural networks, for continuously predicting the duration of laparoscopic interventions based on unlabeled data, such as from endoscopic images and surgical device streams. Results: The methods are evaluated on 80 laparoscopic interventions of various types, for which surgical device data and the endoscopic video are available. Here the combined method performs best with an overall average error of 37% and an average halftime error of 28%. Conclusion: In this paper, we present, to our knowledge, the first approach for online procedure duration prediction using unlabeled endoscopic video data and surgical device data in a laparoscopic setting. We also show that a method incorporating both vision and device data performs better than methods based only on vision, while methods only based on tool usage and surgical device data perform poorly, showing the importance of the visual channel.

Submitted 3 April, 2019; v1 submitted 8 November, 2018; originally announced November 2018.

arXiv:1811.03382

Active Learning using Deep Bayesian Networks for Surgical Workflow Analysis

Authors: Sebastian Bodenstedt, Dominik Rivoir, Alexander Jenke, Martin Wagner, Michael Breucha, Beat Müller-Stich, Sören Torge Mees, Jürgen Weitz, Stefanie Speidel

Abstract: For many applications in the field of computer assisted surgery, such as providing the position of a tumor, specifying the most probable tool required next by the surgeon or determining the remaining duration of surgery, methods for surgical workflow analysis are a prerequisite. Often machine learning based approaches serve as basis for surgical workflow analysis. In general, machine learning algorithms, such as convolutional neural networks (CNN), require large amounts of labeled data. While data is often available in abundance, many tasks in surgical workflow analysis need data annotated by domain experts, making it difficult to obtain a sufficient amount of annotations. The aim of using active learning to train a machine learning model is to reduce the annotation effort. Active learning methods determine which unlabeled data points would provide the most information according to some metric, such as prediction uncertainty. Experts will then be asked to only annotate these data points. The model is then retrained with the new data and used to select further data for annotation. Recently, active learning has been applied to CNN by means of Deep Bayesian Networks (DBN). These networks make it possible to assign uncertainties to predictions. In this paper, we present a DBN-based active learning approach adapted for image-based surgical workflow analysis tasks. Furthermore, by using a recurrent architecture, we extend this network to video-based surgical workflow analysis. We evaluate these approaches on the Cholec80 dataset by performing instrument presence detection and surgical phase segmentation. Here we are able to show that using a DBN-based active learning approach for selecting which data points to annotate next outperforms a baseline based on randomly selecting data points.

Submitted 2 April, 2019; v1 submitted 8 November, 2018; originally announced November 2018.

arXiv:1806.06811 [pdf, other]

doi 10.1007/978-3-030-01201-4_11

Temporal coherence-based self-supervised learning for laparoscopic workflow analysis

Authors: Isabel Funke, Alexander Jenke, Sören Torge Mees, Jürgen Weitz, Stefanie Speidel, Sebastian Bodenstedt

Abstract: In order to provide the right type of assistance at the right time, computer-assisted surgery systems need context awareness. To achieve this, methods for surgical workflow analysis are crucial. Currently, convolutional neural networks provide the best performance for video-based workflow analysis tasks. For training such networks, large amounts of annotated data are necessary. However, collecting a sufficient amount of data is often costly, time-consuming, and not always feasible. In this paper, we address this problem by presenting and comparing different approaches for self-supervised pretraining of neural networks on unlabeled laparoscopic videos using temporal coherence. We evaluate our pretrained networks on Cholec80, a publicly available dataset for surgical phase segmentation, on which a maximum F1 score of 84.6 was reached. Furthermore, we were able to achieve an increase of the F1 score of up to 10 points when compared to a non-pretrained neural network.

Submitted 7 September, 2018; v1 submitted 18 June, 2018; originally announced June 2018.

Comments: Accepted at the Workshop on Context-Aware Operating Theaters (OR 2.0), a MICCAI satellite event

Journal ref: CARE 2018, CLIP 2018, OR 2.0 2018, ISIC 2018. Lecture Notes in Computer Science, vol 11041 (2018) 85-93

arXiv:1607.02502 [pdf, other]

doi 10.1109/TCSS.2017.2719585

Networked SIS Epidemics with Awareness

Authors: Keith Paarporn, Ceyhun Eksin, Joshua S. Weitz, Jeff S. Shamma

Abstract: We study an SIS epidemic process over a static contact network where the nodes have partial information about the epidemic state. They react by limiting their interactions with their neighbors when they believe the epidemic is currently prevalent. A node's awareness is weighted by the fraction of infected neighbors in their social network, and a global broadcast of the fraction of infected nodes in the entire network. The dynamics of the benchmark (no awareness) and awareness models are described by discrete-time Markov chains, from which mean-field approximations (MFA) are derived. The states of the MFA are interpreted as the nodes' probabilities of being infected. We show a sufficient condition for existence of a "metastable", or endemic, state of the awareness model coincides with that of the benchmark model. Furthermore, we use a coupling technique to give a full stochastic comparison analysis between the two chains, which serves as a probabilistic analogue to the MFA analysis. In particular, we show that adding awareness reduces the expectation of any epidemic metric on the space of sample paths, e.g. eradication time or total infections. We characterize the reduction in expectations in terms of the coupling distribution. In simulations, we evaluate the effect social distancing has on contact networks from different random graph families (geometric, Erdős-Renyi, and scale-free random networks).

Submitted 12 July, 2016; v1 submitted 8 July, 2016; originally announced July 2016.

Comments: 10 pages, 5 figures

arXiv:1604.03240 [pdf, other]

Disease dynamics on a network game: a little empathy goes a long way

Authors: Ceyhun Eksin, Jeff S. Shamma, Joshua S. Weitz

Abstract: Individuals change their behavior during an epidemic in response to whether they and/or those they interact with are healthy or sick. Healthy individuals are concerned about contracting a disease from their sick contacts and may utilize protective measures. Sick individuals may be concerned with spreading the disease to their healthy contacts and adopt preemptive measures. Yet, in practice both protective and preemptive changes in behavior come with costs. This paper proposes a stochastic network disease game model that captures the self-interests of individuals during the spread of a susceptible-infected-susceptible (SIS) disease where individuals react to current risk of disease spread, and their reactions together with the current state of the disease stochastically determine the next stage of the disease. We show that there is a critical level of concern, i.e., empathy, by the sick individuals above which disease is eradicated fast. Furthermore, we find that if the network and disease parameters are above the epidemic threshold, the risk averse behavior by the healthy individuals cannot eradicate the disease without the preemptive measures of the sick individuals. This imbalance in the role played by the response of the infected versus the susceptible individuals in disease eradication affords critical policy insights.

Submitted 15 April, 2016; v1 submitted 12 April, 2016; originally announced April 2016.

Comments: 27 pages, 9 figures, submitted for publication

arXiv:1304.3521 [pdf, ps, other]

doi 10.1371/journal.pone.0085585

The Fiber Walk: A Model of Tip-Driven Growth with Lateral Expansion

Authors: Alexander Bucksch, Greg Turk, Joshua S. Weitz

Abstract: Tip-driven growth processes underlie the development of many plants. To date, tip-driven growth processes have been modelled as an elongating path or series of segments without taking into account lateral expansion during elongation. Instead, models of growth often introduce an explicit thickness by expanding the area around the completed elongated path. Modelling expansion in this way can lead to contradictions in the physical plausibility of the resulting surface and to uncertainty about how the object reached certain regions of space. Here, we introduce "fiber walks" as a self-avoiding random walk model for tip-driven growth processes that includes lateral expansion. In 2D, the fiber walk takes place on a square lattice and the space occupied by the fiber is modelled as a lateral contraction of the lattice. This contraction influences the possible follow-up steps of the fiber walk. The boundary of the area consumed by the contraction is derived as the dual of the lattice faces adjacent to the fiber. We show that fiber walks generate fibers that have well-defined curvatures, enable the identification of the process underlying the occupancy of physical space. Hence, fiber walks provide a base from which to model both the extension and expansion of physical biological objects with finite thickness.

Submitted 29 December, 2013; v1 submitted 11 April, 2013; originally announced April 2013.

Comments: Plos One (in press)

Showing 1–18 of 18 results for author: Weitz, J