Showing 1–50 of 68 results for author: Padoy, N

  1. arXiv:2405.20333  [pdf, other]

    cs.CV

    SurgiTrack: Fine-Grained Multi-Class Multi-Tool Tracking in Surgical Videos

    Authors: Chinedu Innocent Nwoye, Nicolas Padoy

    Abstract: Accurate tool tracking is essential for the success of computer-assisted intervention. Previous efforts often modeled tool trajectories rigidly, overlooking the dynamic nature of surgical procedures, especially tracking scenarios like out-of-body and out-of-camera views. Addressing this limitation, the new CholecTrack20 dataset provides detailed labels that account for multiple tool trajectories i…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 15 pages, 7 figures, 9 tables, 1 video. Supplementary video available at: https://vimeo.com/951853260

  2. arXiv:2405.10075  [pdf, other]

    cs.CV cs.AI

    HecVL: Hierarchical Video-Language Pretraining for Zero-shot Surgical Phase Recognition

    Authors: Kun Yuan, Vinkle Srivastav, Nassir Navab, Nicolas Padoy

    Abstract: Natural language could play an important role in developing generalist surgical models by providing a broad source of supervision from raw texts. This flexible form of supervision can enable the model's transferability across datasets and tasks as natural language can be used to reference learned visual concepts or describe new ones. In this work, we present HecVL, a novel hierarchical video-langu…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: Accepted by MICCAI2024

  3. On-the-Fly Point Annotation for Fast Medical Video Labeling

    Authors: Meyer Adrien, Mazellier Jean-Paul, Jeremy Dana, Nicolas Padoy

    Abstract: Purpose: In medical research, deep learning models rely on high-quality annotated data, a process often laborious and time-consuming. This is particularly true for detection tasks where bounding box annotations are required. The need to adjust two corners makes the process inherently frame-by-frame. Given the scarcity of experts' time, efficient annotation methods suitable for clinicians are needed…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 7 pages, 5 figures. Int J CARS (2024)

  4. arXiv:2404.02041  [pdf, other]

    cs.CV

    SelfPose3d: Self-Supervised Multi-Person Multi-View 3d Pose Estimation

    Authors: Vinkle Srivastav, Keqi Chen, Nicolas Padoy

    Abstract: We present a new self-supervised approach, SelfPose3d, for estimating 3d poses of multiple persons from multiple camera views. Unlike current state-of-the-art fully-supervised methods, our approach does not require any 2d or 3d ground-truth poses and uses only the multi-view input images from a calibrated camera setup and 2d pseudo poses generated from an off-the-shelf 2d human pose estimator. We…

    Submitted 8 June, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted for CVPR 2024. Code: https://github.com/CAMMA-public/SelfPose3D. Video demo: https://youtu.be/GAqhmUIr2E8

  5. arXiv:2403.13756  [pdf, other]

    cs.CV

    Enhancing Gait Video Analysis in Neurodegenerative Diseases by Knowledge Augmentation in Vision Language Model

    Authors: Diwei Wang, Kun Yuan, Candice Muller, Frédéric Blanc, Nicolas Padoy, Hyewon Seo

    Abstract: We present a knowledge augmentation strategy for assessing the diagnostic groups and gait impairment from monocular gait videos. Based on a large-scale pre-trained Vision Language Model (VLM), our model learns and improves visual, textual, and numerical representations of patient gait videos, through a collective learning across three distinct modalities: gait videos, class-specific descriptions,…

    Submitted 20 March, 2024; originally announced March 2024.

  6. arXiv:2403.06953  [pdf, other]

    cs.CV

    Optimizing Latent Graph Representations of Surgical Scenes for Zero-Shot Domain Transfer

    Authors: Siddhant Satyanaik, Aditya Murali, Deepak Alapatt, Xin Wang, Pietro Mascagni, Nicolas Padoy

    Abstract: Purpose: Advances in deep learning have resulted in effective models for surgical video analysis; however, these models often fail to generalize across medical centers due to domain shift caused by variations in surgical workflow, camera setups, and patient demographics. Recently, object-centric learning has emerged as a promising approach for improved surgical scene understanding, capturing and d…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: 7 pages, 3 figures, Accepted to IPCAI 2024

  7. arXiv:2402.14611  [pdf, other]

    cs.CV

    Overcoming Dimensional Collapse in Self-supervised Contrastive Learning for Medical Image Segmentation

    Authors: Jamshid Hassanpour, Vinkle Srivastav, Didier Mutter, Nicolas Padoy

    Abstract: Self-supervised learning (SSL) approaches have achieved great success when the amount of labeled data is limited. Within SSL, models learn robust feature representations by solving pretext tasks. One such pretext task is contrastive learning, which involves forming pairs of similar and dissimilar input samples, guiding the model to distinguish between them. In this work, we investigate the applica…

    Submitted 27 February, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: Accepted at ISBI-2024 (https://biomedicalimaging.org/2024/). 4 pages, 2 figures, 2 tables

  8. arXiv:2312.12429  [pdf, other]

    cs.CV

    The Endoscapes Dataset for Surgical Scene Segmentation, Object Detection, and Critical View of Safety Assessment: Official Splits and Benchmark

    Authors: Aditya Murali, Deepak Alapatt, Pietro Mascagni, Armine Vardazaryan, Alain Garcia, Nariaki Okamoto, Guido Costamagna, Didier Mutter, Jacques Marescaux, Bernard Dallemagne, Nicolas Padoy

    Abstract: This technical report provides a detailed overview of Endoscapes, a dataset of laparoscopic cholecystectomy (LC) videos with highly intricate annotations targeted at automated assessment of the Critical View of Safety (CVS). Endoscapes comprises 201 LC videos with frames annotated sparsely but regularly with segmentation masks, bounding boxes, and CVS assessment by three different clinical experts…

    Submitted 26 January, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

    Comments: 7 pages; 3 figures

  9. arXiv:2312.12250  [pdf, other]

    cs.CV

    ST(OR)2: Spatio-Temporal Object Level Reasoning for Activity Recognition in the Operating Room

    Authors: Idris Hamoud, Muhammad Abdullah Jamal, Vinkle Srivastav, Didier Mutter, Nicolas Padoy, Omid Mohareri

    Abstract: Surgical robotics holds much promise for improving patient safety and clinician experience in the Operating Room (OR). However, it also comes with new challenges, requiring strong team coordination and effective OR management. Automatic detection of surgical activities is a key requirement for developing AI-based intelligent tools to tackle these challenges. The current state-of-the-art surgical a…

    Submitted 19 December, 2023; originally announced December 2023.

  10. arXiv:2312.11250  [pdf, other]

    cs.CV

    Challenges in Multi-centric Generalization: Phase and Step Recognition in Roux-en-Y Gastric Bypass Surgery

    Authors: Joel L. Lavanchy, Sanat Ramesh, Diego Dall'Alba, Cristians Gonzalez, Paolo Fiorini, Beat Muller-Stich, Philipp C. Nett, Jacques Marescaux, Didier Mutter, Nicolas Padoy

    Abstract: Most studies on surgical activity recognition utilizing Artificial intelligence (AI) have focused mainly on recognizing one type of activity from small and mono-centric surgical video datasets. It remains speculative whether those models would generalize to other centers. In this work, we introduce a large multi-centric multi-activity dataset consisting of 140 videos (MultiBypass140) of laparoscop…

    Submitted 18 December, 2023; originally announced December 2023.

  11. arXiv:2312.10251  [pdf, other]

    cs.CV cs.AI

    Advancing Surgical VQA with Scene Graph Knowledge

    Authors: Kun Yuan, Manasi Kattel, Joel L. Lavanchy, Nassir Navab, Vinkle Srivastav, Nicolas Padoy

    Abstract: Modern operating room is becoming increasingly complex, requiring innovative intra-operative support systems. While the focus of surgical data science has largely been on video analysis, integrating surgical computer vision with language capabilities is emerging as a necessity. Our work aims to advance Visual Question Answering (VQA) in the surgical context with scene graph knowledge, addressing t…

    Submitted 24 June, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: IPCAI 2024, Int J CARS (2024)

  12. arXiv:2312.08593  [pdf]

    cs.CV

    MOSaiC: a Web-based Platform for Collaborative Medical Video Assessment and Annotation

    Authors: Jean-Paul Mazellier, Antoine Boujon, Méline Bour-Lang, Maël Erharhd, Julien Waechter, Emilie Wernert, Pietro Mascagni, Nicolas Padoy

    Abstract: This technical report presents MOSaiC 3.6.2, a web-based collaborative platform designed for the annotation and evaluation of medical videos. MOSaiC is engineered to facilitate video-based assessment and accelerate surgical data science projects. We provide an overview of MOSaiC's key functionalities, encompassing group and video management, annotation tools, ontologies, assessment capabilities, a…

    Submitted 13 December, 2023; originally announced December 2023.

  13. arXiv:2312.07352  [pdf, other]

    cs.CV cs.AI

    CholecTrack20: A Dataset for Multi-Class Multiple Tool Tracking in Laparoscopic Surgery

    Authors: Chinedu Innocent Nwoye, Kareem Elgohary, Anvita Srinivas, Fauzan Zaid, Joël L. Lavanchy, Nicolas Padoy

    Abstract: Tool tracking in surgical videos is vital in computer-assisted intervention for tasks like surgeon skill assessment, safety zone estimation, and human-machine collaboration during minimally invasive procedures. The lack of large-scale datasets hampers Artificial Intelligence implementation in this domain. Current datasets exhibit overly generic tracking formalization, often lacking surgical contex…

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: Surgical tool tracking dataset paper, 15 pages, 9 figures, 4 tables

  14. Encoding Surgical Videos as Latent Spatiotemporal Graphs for Object and Anatomy-Driven Reasoning

    Authors: Aditya Murali, Deepak Alapatt, Pietro Mascagni, Armine Vardazaryan, Alain Garcia, Nariaki Okamoto, Didier Mutter, Nicolas Padoy

    Abstract: Recently, spatiotemporal graphs have emerged as a concise and elegant manner of representing video clips in an object-centric fashion, and have shown to be useful for downstream tasks such as action recognition. In this work, we investigate the use of latent spatiotemporal graphs to represent a surgical video in terms of the constituent anatomical structures and tools and their evolving properties…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: 13 pages, 2 figures, MICCAI 2023

  15. arXiv:2312.05968  [pdf, other]

    cs.CV

    Jumpstarting Surgical Computer Vision

    Authors: Deepak Alapatt, Aditya Murali, Vinkle Srivastav, Pietro Mascagni, AI4SafeChole Consortium, Nicolas Padoy

    Abstract: Purpose: General consensus amongst researchers and industry points to a lack of large, representative annotated datasets as the biggest obstacle to progress in the field of surgical data science. Self-supervised learning represents a solution to part of this problem, removing the reliance on annotations. However, the robustness of current self-supervised learning methods to domain shifts remains u…

    Submitted 10 December, 2023; originally announced December 2023.

    Comments: 7 pages, 3 figures

  16. arXiv:2309.01723  [pdf, other]

    cs.CV

    SAF-IS: a Spatial Annotation Free Framework for Instance Segmentation of Surgical Tools

    Authors: Luca Sestini, Benoit Rosa, Elena De Momi, Giancarlo Ferrigno, Nicolas Padoy

    Abstract: Instance segmentation of surgical instruments is a long-standing research problem, crucial for the development of many applications for computer-assisted surgery. This problem is commonly tackled via fully-supervised training of deep learning models, requiring expensive pixel-level annotations to train. In this work, we develop a framework for instance segmentation not relying on spatial annotatio…

    Submitted 4 September, 2023; originally announced September 2023.

  17. arXiv:2307.15220  [pdf, other]

    cs.CV cs.AI

    Learning Multi-modal Representations by Watching Hundreds of Surgical Video Lectures

    Authors: Kun Yuan, Vinkle Srivastav, Tong Yu, Joel L. Lavanchy, Pietro Mascagni, Nassir Navab, Nicolas Padoy

    Abstract: Recent advancements in surgical computer vision applications have been driven by fully-supervised methods, primarily using only visual data. These methods rely on manually annotated surgical videos to predict a fixed set of object categories, limiting their generalizability to unseen surgical procedures and downstream tasks. In this work, we put forward the idea that the surgical video lectures av…

    Submitted 13 January, 2024; v1 submitted 27 July, 2023; originally announced July 2023.

  18. arXiv:2307.09548  [pdf, other]

    cs.CV

    Surgical Action Triplet Detection by Mixed Supervised Learning of Instrument-Tissue Interactions

    Authors: Saurav Sharma, Chinedu Innocent Nwoye, Didier Mutter, Nicolas Padoy

    Abstract: Surgical action triplets describe instrument-tissue interactions as (instrument, verb, target) combinations, thereby supporting a detailed analysis of surgical scene activities and workflow. This work focuses on surgical action triplet detection, which is challenging but more precise than the traditional triplet recognition task as it consists of joint (1) localization of surgical instruments and…

    Submitted 18 July, 2023; originally announced July 2023.

    Comments: Accepted at MICCAI, 2023. Project Page: https://github.com/CAMMA-public/mcit-ig

  19. arXiv:2306.14780  [pdf]

    cs.CV

    INDEXITY: a web-based collaborative tool for medical video annotation

    Authors: Jean-Paul Mazellier, Méline Bour-Lang, Sabrina Bourouis, Johan Moreau, Aimable Muzuri, Olivier Schweitzer, Aslan Vatsaev, Julien Waechter, Emilie Wernert, Frederic Woelffel, Alexandre Hostettler, Nicolas Padoy, Flavien Bridault

    Abstract: This technical report presents Indexity 1.4.0, a web-based tool designed for medical video annotation in surgical data science projects. We describe the main features available for the management of videos, annotations, ontology and users, as well as the global software architecture.

    Submitted 26 June, 2023; originally announced June 2023.

    Comments: 7 pages, 7 figures, technical report

  20. arXiv:2305.07152  [pdf, other]

    cs.CV

    Surgical tool classification and localization: results and methods from the MICCAI 2022 SurgToolLoc challenge

    Authors: Aneeq Zia, Kiran Bhattacharyya, Xi Liu, Max Berniker, Ziheng Wang, Rogerio Nespolo, Satoshi Kondo, Satoshi Kasai, Kousuke Hirasawa, Bo Liu, David Austin, Yiheng Wang, Michal Futrega, Jean-Francois Puget, Zhenqiang Li, Yoichi Sato, Ryo Fujii, Ryo Hachiuma, Mana Masuda, Hideo Saito, An Wang, Mengya Xu, Mobarakol Islam, Long Bai, Winnie Pang , et al. (46 additional authors not shown)

    Abstract: The ability to automatically detect and track surgical instruments in endoscopic videos can enable transformational interventions. Assessing surgical performance and efficiency, identifying skilled tool use and choreography, and planning operational and logistical aspects of OR resources are just a few of the applications that could benefit. Unfortunately, obtaining the annotations needed to train…

    Submitted 31 May, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

  21. arXiv:2303.17719  [pdf, other]

    cs.CV cs.LG

    Why is the winner the best?

    Authors: Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Sharib Ali, Vincent Andrearczyk, Marc Aubreville, Ujjwal Baid, Spyridon Bakas, Niranjan Balu, Sophia Bano, Jorge Bernal, Sebastian Bodenstedt, Alessandro Casella, Veronika Cheplygina, Marie Daum, Marleen de Bruijne, Adrien Depeursinge, Reuben Dorent, Jan Egger, David G. Ellis, Sandy Engelhardt, Melanie Ganz , et al. (100 additional authors not shown)

    Abstract: International benchmarking competitions have become fundamental for the comparative performance assessment of image analysis methods. However, little attention has been given to investigating what can be learnt from these competitions. Do they really generate scientific progress? What are common and successful participation strategies? What makes a solution superior to a competing method? To addre…

    Submitted 30 March, 2023; originally announced March 2023.

    Comments: accepted to CVPR 2023

  22. arXiv:2303.12915  [pdf, other]

    cs.CV cs.LG

    Self-distillation for surgical action recognition

    Authors: Amine Yamlahi, Thuy Nuong Tran, Patrick Godau, Melanie Schellenberg, Dominik Michael, Finn-Henri Smidt, Jan-Hinrich Noelke, Tim Adler, Minu Dietlinde Tizabi, Chinedu Nwoye, Nicolas Padoy, Lena Maier-Hein

    Abstract: Surgical scene understanding is a key prerequisite for context-aware decision support in the operating room. While deep learning-based approaches have already reached or even surpassed human performance in various fields, the task of surgical action recognition remains a major challenge. With this contribution, we are the first to investigate the concept of self-distillation as a means of addressin…

    Submitted 22 March, 2023; originally announced March 2023.

  23. Weakly Supervised Temporal Convolutional Networks for Fine-grained Surgical Activity Recognition

    Authors: Sanat Ramesh, Diego Dall'Alba, Cristians Gonzalez, Tong Yu, Pietro Mascagni, Didier Mutter, Jacques Marescaux, Paolo Fiorini, Nicolas Padoy

    Abstract: Automatic recognition of fine-grained surgical activities, called steps, is a challenging but crucial task for intelligent intra-operative computer assistance. The development of current vision-based activity recognition methods relies heavily on a high volume of manually annotated data. This data is difficult and time-consuming to generate and requires domain-specific knowledge. In this work, we…

    Submitted 11 April, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

  24. arXiv:2302.06294  [pdf, other]

    eess.IV cs.CV cs.LG

    CholecTriplet2022: Show me a tool and tell me the triplet -- an endoscopic vision challenge for surgical action triplet detection

    Authors: Chinedu Innocent Nwoye, Tong Yu, Saurav Sharma, Aditya Murali, Deepak Alapatt, Armine Vardazaryan, Kun Yuan, Jonas Hajek, Wolfgang Reiter, Amine Yamlahi, Finn-Henri Smidt, Xiaoyang Zou, Guoyan Zheng, Bruno Oliveira, Helena R. Torres, Satoshi Kondo, Satoshi Kasai, Felix Holm, Ege Özsoy, Shuangchun Gui, Han Li, Sista Raviteja, Rachana Sathish, Pranav Poudel, Binod Bhattarai , et al. (24 additional authors not shown)

    Abstract: Formalizing surgical activities as triplets of the used instruments, actions performed, and target anatomies is becoming a gold standard approach for surgical activity modeling. The benefit is that this formalization helps to obtain a more detailed understanding of tool-tissue interaction which can be used to develop better Artificial Intelligence assistance for image-guided surgery. Earlier effor…

    Submitted 14 July, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

    Comments: MICCAI EndoVis CholecTriplet2022 challenge report. Published at Elsevier journal of Medical Image Analysis. 25 pages, 15 figures, 8 tables

    Journal ref: Medical Image Analysis, Volume 89, 2023, 102888, ISSN 1361-8415

  25. Preserving Privacy in Surgical Video Analysis Using Artificial Intelligence: A Deep Learning Classifier to Identify Out-of-Body Scenes in Endoscopic Videos

    Authors: Joël L. Lavanchy, Armine Vardazaryan, Pietro Mascagni, AI4SafeChole Consortium, Didier Mutter, Nicolas Padoy

    Abstract: Objective: To develop and validate a deep learning model for the identification of out-of-body images in endoscopic videos. Background: Surgical video analysis facilitates education and research. However, video recordings of endoscopic surgeries can contain privacy-sensitive information, especially if out-of-body scenes are recorded. Therefore, identification of out-of-body scenes in endoscopic vi…

    Submitted 7 June, 2023; v1 submitted 17 January, 2023; originally announced January 2023.

    Comments: Joël L. Lavanchy and Armine Vardazaryan contributed equally and share first co-authorship

    Journal ref: Scientific Reports 13, 9235 (2023)

  26. arXiv:2212.08568  [pdf, other]

    cs.CV cs.LG

    Biomedical image analysis competitions: The state of current participation practice

    Authors: Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Patrick Godau, Veronika Cheplygina, Michal Kozubek, Sharib Ali, Anubha Gupta, Jan Kybic, Alison Noble, Carlos Ortiz de Solórzano, Samiksha Pachade, Caroline Petitjean, Daniel Sage, Donglai Wei, Elizabeth Wilden, Deepak Alapatt, Vincent Andrearczyk, Ujjwal Baid, Spyridon Bakas, Niranjan Balu, Sophia Bano , et al. (331 additional authors not shown)

    Abstract: The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis,…

    Submitted 12 September, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

  27. arXiv:2212.06809  [pdf]

    eess.IV cs.CV

    Real-Time Artificial Intelligence Assistance for Safe Laparoscopic Cholecystectomy: Early-Stage Clinical Evaluation

    Authors: Pietro Mascagni, Deepak Alapatt, Alfonso Lapergola, Armine Vardazaryan, Jean-Paul Mazellier, Bernard Dallemagne, Didier Mutter, Nicolas Padoy

    Abstract: Artificial intelligence is set to be deployed in operating rooms to improve surgical care. This early-stage clinical evaluation shows the feasibility of concurrently attaining real-time, high-quality predictions from several deep neural networks for endoscopic video analysis deployed for assistance during three laparoscopic cholecystectomies.

    Submitted 13 December, 2022; originally announced December 2022.

    Comments: 12 pages, 1 figure

  28. arXiv:2212.04155  [pdf, other]

    cs.CV

    Latent Graph Representations for Critical View of Safety Assessment

    Authors: Aditya Murali, Deepak Alapatt, Pietro Mascagni, Armine Vardazaryan, Alain Garcia, Nariaki Okamoto, Didier Mutter, Nicolas Padoy

    Abstract: Assessing the critical view of safety in laparoscopic cholecystectomy requires accurate identification and localization of key anatomical structures, reasoning about their geometric relationships to one another, and determining the quality of their exposure. Prior works have approached this task by including semantic segmentation as an intermediate step, using predicted segmentation masks to then…

    Submitted 19 December, 2023; v1 submitted 8 December, 2022; originally announced December 2022.

    Comments: 12 pages, 4 figures

    Report number: 10.1109/TMI.2023.3333034

  29. Rendezvous in Time: An Attention-based Temporal Fusion approach for Surgical Triplet Recognition

    Authors: Saurav Sharma, Chinedu Innocent Nwoye, Didier Mutter, Nicolas Padoy

    Abstract: One of the recent advances in surgical AI is the recognition of surgical activities as triplets of (instrument, verb, target). Albeit providing detailed information for computer-assisted intervention, current triplet recognition approaches rely only on single frame features. Exploiting the temporal cues from earlier frames would improve the recognition of surgical action triplets from videos. In t…

    Submitted 16 June, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

    Comments: Accepted at IPCAI, 2023. Project Page: https://github.com/CAMMA-public/rendezvous-in-time

  30. arXiv:2207.00449  [pdf, other]

    cs.CV

    Dissecting Self-Supervised Learning Methods for Surgical Computer Vision

    Authors: Sanat Ramesh, Vinkle Srivastav, Deepak Alapatt, Tong Yu, Aditya Murali, Luca Sestini, Chinedu Innocent Nwoye, Idris Hamoud, Saurav Sharma, Antoine Fleurentin, Georgios Exarchakis, Alexandros Karargyris, Nicolas Padoy

    Abstract: The field of surgical computer vision has undergone considerable breakthroughs in recent years with the rising popularity of deep neural network-based methods. However, standard fully-supervised approaches for training such models require vast amounts of annotated data, imposing a prohibitively high cost; especially in the clinical domain. Self-Supervised Learning (SSL) methods, which have begun t…

    Submitted 31 May, 2023; v1 submitted 1 July, 2022; originally announced July 2022.

  31. arXiv:2204.05235  [pdf, other]

    cs.CV

    Data Splits and Metrics for Method Benchmarking on Surgical Action Triplet Datasets

    Authors: Chinedu Innocent Nwoye, Nicolas Padoy

    Abstract: In addition to generating data and annotations, devising sensible data splitting strategies and evaluation metrics is essential for the creation of a benchmark dataset. This practice ensures consensus on the usage of the data, homogeneous assessment, and uniform comparison of research methods on the dataset. This study focuses on CholecT50, which is a 50 video surgical dataset that formalizes surg…

    Submitted 28 February, 2023; v1 submitted 11 April, 2022; originally announced April 2022.

    Comments: Official splits for the CholecT50 and CholecT45 datasets, 13 pages, 2 figures, 12 tables

  32. CholecTriplet2021: A benchmark challenge for surgical action triplet recognition

    Authors: Chinedu Innocent Nwoye, Deepak Alapatt, Tong Yu, Armine Vardazaryan, Fangfang Xia, Zixuan Zhao, Tong Xia, Fucang Jia, Yuxuan Yang, Hao Wang, Derong Yu, Guoyan Zheng, Xiaotian Duan, Neil Getty, Ricardo Sanchez-Matilla, Maria Robu, Li Zhang, Huabin Chen, Jiacheng Wang, Liansheng Wang, Bokai Zhang, Beerend Gerats, Sista Raviteja, Rachana Sathish, Rong Tao , et al. (37 additional authors not shown)

    Abstract: Context-aware decision support in the operating room can foster surgical safety and efficiency by leveraging real-time feedback from surgical workflow analysis. Most existing works recognize surgical activities at a coarse-grained level, such as phases, steps or events, leaving out fine-grained interaction details about the surgical activity; yet those are needed for more helpful AI assistance in…

    Submitted 29 December, 2022; v1 submitted 10 April, 2022; originally announced April 2022.

    Comments: CholecTriplet2021 challenge report. Paper accepted at Elsevier journal of Medical Image Analysis. 22 pages, 8 figures, 11 tables. Challenge website: https://cholectriplet2021.grand-challenge.org

    Journal ref: Medical Image Analysis 86 (2023) 102803

  33. arXiv:2203.07345  [pdf, other]

    cs.CV cs.AI cs.LG

    Federated Cycling (FedCy): Semi-supervised Federated Learning of Surgical Phases

    Authors: Hasan Kassem, Deepak Alapatt, Pietro Mascagni, AI4SafeChole Consortium, Alexandros Karargyris, Nicolas Padoy

    Abstract: Recent advancements in deep learning methods bring computer-assistance a step closer to fulfilling promises of safer surgical procedures. However, the generalizability of such methods is often dependent on training on diverse datasets from multiple medical institutions, which is a restrictive requirement considering the sensitive nature of medical data. Recently proposed collaborative learning met…

    Submitted 28 December, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: 13 pages, 6 figures

    ACM Class: I.2.10

  34. Live Laparoscopic Video Retrieval with Compressed Uncertainty

    Authors: Tong Yu, Pietro Mascagni, Juan Verde, Jacques Marescaux, Didier Mutter, Nicolas Padoy

    Abstract: Searching through large volumes of medical data to retrieve relevant information is a challenging yet crucial task for clinical care. However the primitive and most common approach to retrieval, involving text in the form of keywords, is severely limited when dealing with complex media formats. Content-based retrieval offers a way to overcome this limitation, by using rich media as the query itsel…

    Submitted 12 June, 2023; v1 submitted 8 March, 2022; originally announced March 2022.

    Comments: 16 pages, 13 figures

    Journal ref: Medical Image Analysis 88 (2023) 102866

  35. arXiv:2202.08141  [pdf, other]

    cs.CV

    FUN-SIS: a Fully UNsupervised approach for Surgical Instrument Segmentation

    Authors: Luca Sestini, Benoit Rosa, Elena De Momi, Giancarlo Ferrigno, Nicolas Padoy

    Abstract: Automatic surgical instrument segmentation of endoscopic images is a crucial building block of many computer-assistance applications for minimally invasive surgery. So far, state-of-the-art approaches completely rely on the availability of a ground-truth supervision signal, obtained via manual annotation, thus expensive to collect at large scale. In this paper, we present FUN-SIS, a Fully-UNsuperv…

    Submitted 16 February, 2022; originally announced February 2022.

  36. arXiv:2112.13815  [pdf, other]

    cs.CV cs.NE

    Temporally Constrained Neural Networks (TCNN): A framework for semi-supervised video semantic segmentation

    Authors: Deepak Alapatt, Pietro Mascagni, Armine Vardazaryan, Alain Garcia, Nariaki Okamoto, Didier Mutter, Jacques Marescaux, Guido Costamagna, Bernard Dallemagne, Nicolas Padoy

    Abstract: A major obstacle to building models for effective semantic segmentation, and particularly video semantic segmentation, is a lack of large and well annotated datasets. This bottleneck is particularly prohibitive in highly specialized and regulated fields such as medicine and surgery, where video semantic segmentation could have important applications but data and expert annotations are scarce. In t…

    Submitted 27 December, 2021; originally announced December 2021.

    Comments: 10 pages, 4 figures

  37. arXiv:2110.10965  [pdf, other]

    eess.IV cs.CV

    2020 CATARACTS Semantic Segmentation Challenge

    Authors: Imanol Luengo, Maria Grammatikopoulou, Rahim Mohammadi, Chris Walsh, Chinedu Innocent Nwoye, Deepak Alapatt, Nicolas Padoy, Zhen-Liang Ni, Chen-Chen Fan, Gui-Bin Bian, Zeng-Guang Hou, Heonjin Ha, Jiacheng Wang, Haojie Wang, Dong Guo, Lu Wang, Guotai Wang, Mobarakol Islam, Bharat Giddwani, Ren Hongliang, Theodoros Pissas, Claudio Ravasio, Martin Huber, Jeremy Birch, Joan M. Nunez Do Rio , et al. (15 additional authors not shown)

    Abstract: Surgical scene segmentation is essential for anatomy and instrument localization which can be further used to assess tissue-instrument interactions during a surgical procedure. In 2017, the Challenge on Automatic Tool Annotation for cataRACT Surgery (CATARACTS) released 50 cataract surgery videos accompanied by instrument usage annotations. These annotations included frame-level instrument presenc…

    Submitted 24 February, 2022; v1 submitted 21 October, 2021; originally announced October 2021.

  38. arXiv:2110.01406  [pdf]

    cs.LG cs.DC cs.PF cs.SE

    MedPerf: Open Benchmarking Platform for Medical Artificial Intelligence using Federated Evaluation

    Authors: Alexandros Karargyris, Renato Umeton, Micah J. Sheller, Alejandro Aristizabal, Johnu George, Srini Bala, Daniel J. Beutel, Victor Bittorf, Akshay Chaudhari, Alexander Chowdhury, Cody Coleman, Bala Desinghu, Gregory Diamos, Debo Dutta, Diane Feddema, Grigori Fursin, Junyi Guo, Xinyuan Huang, David Kanter, Satyananda Kashyap, Nicholas Lane, Indranil Mallick, Pietro Mascagni, Virendra Mehta, Vivek Natarajan, et al. (17 additional authors not shown)

    Abstract: Medical AI has tremendous potential to advance healthcare by supporting the evidence-based practice of medicine, personalizing patient treatment, reducing costs, and improving provider and patient experience. We argue that unlocking this potential requires a systematic way to measure the performance of medical AI models on large-scale heterogeneous data. To meet this need, we are building MedPerf,…

    Submitted 28 December, 2021; v1 submitted 29 September, 2021; originally announced October 2021.

  39. arXiv:2109.14956  [pdf]

    eess.IV cs.CV cs.LG

    Comparative Validation of Machine Learning Algorithms for Surgical Workflow and Skill Analysis with the HeiChole Benchmark

    Authors: Martin Wagner, Beat-Peter Müller-Stich, Anna Kisilenko, Duc Tran, Patrick Heger, Lars Mündermann, David M Lubotsky, Benjamin Müller, Tornike Davitashvili, Manuela Capek, Annika Reinke, Tong Yu, Armine Vardazaryan, Chinedu Innocent Nwoye, Nicolas Padoy, Xinyang Liu, Eung-Joo Lee, Constantin Disch, Hans Meine, Tong Xia, Fucang Jia, Satoshi Kondo, Wolfgang Reiter, Yueming Jin, Yonghao Long, et al. (16 additional authors not shown)

    Abstract: PURPOSE: Surgical workflow and skill analysis are key technologies for the next generation of cognitive surgical assistance systems. These systems could increase the safety of the operation through context-sensitive warnings and semi-autonomous robotic assistance or improve training of surgeons via data-driven feedback. In surgical workflow analysis up to 91% average precision has been reported fo…

    Submitted 30 September, 2021; originally announced September 2021.

  40. Rendezvous: Attention Mechanisms for the Recognition of Surgical Action Triplets in Endoscopic Videos

    Authors: Chinedu Innocent Nwoye, Tong Yu, Cristians Gonzalez, Barbara Seeliger, Pietro Mascagni, Didier Mutter, Jacques Marescaux, Nicolas Padoy

    Abstract: Out of all existing frameworks for surgical workflow analysis in endoscopic videos, action triplet recognition stands out as the only one aiming to provide truly fine-grained and comprehensive information on surgical activities. This information, presented as <instrument, verb, target> combinations, is highly challenging to be accurately identified. Triplet components can be difficult to recognize…

    Submitted 3 March, 2022; v1 submitted 7 September, 2021; originally announced September 2021.

    Comments: 21 pages, 11 figures, 19 tables, 1 video. Accepted at Elsevier Journal of Medical Image Analysis. Supplementary video available at: https://youtu.be/d_yHdJtCa98

    Journal ref: Medical Image Analysis (2022) 102433

  41. arXiv:2108.11801  [pdf, other]

    cs.CV

    Unsupervised domain adaptation for clinician pose estimation and instance segmentation in the operating room

    Authors: Vinkle Srivastav, Afshin Gangi, Nicolas Padoy

    Abstract: The fine-grained localization of clinicians in the operating room (OR) is a key component to design the new generation of OR support systems. Computer vision models for person pixel-based segmentation and body-keypoints detection are needed to better understand the clinical activities and the spatial layout of the OR. This is challenging, not only because OR images are very different from traditio…

    Submitted 30 June, 2022; v1 submitted 26 August, 2021; originally announced August 2021.

    Comments: Accepted at Elsevier Journal of Medical Image Analysis. Code is available at https://github.com/CAMMA-public/HPE-AdaptOR. Supplementary video is available at https://youtu.be/gqwPu9-nfGs

  42. arXiv:2106.10916  [pdf]

    eess.IV cs.CV

    Surgical data science for safe cholecystectomy: a protocol for segmentation of hepatocystic anatomy and assessment of the critical view of safety

    Authors: Pietro Mascagni, Deepak Alapatt, Alain Garcia, Nariaki Okamoto, Armine Vardazaryan, Guido Costamagna, Bernard Dallemagne, Nicolas Padoy

    Abstract: Minimally invasive image-guided surgery heavily relies on vision. Deep learning models for surgical video analysis could therefore support visual tasks such as assessing the critical view of safety (CVS) in laparoscopic cholecystectomy (LC), potentially contributing to surgical safety and efficiency. However, the performance, reliability and reproducibility of such models are deeply dependent on t…

    Submitted 20 September, 2021; v1 submitted 21 June, 2021; originally announced June 2021.

    Comments: 24 pages, 34 figures. v2: Minor revisions and code linked

  43. arXiv:2103.00586  [pdf, other]

    cs.RO cs.AI cs.CV

    A Kinematic Bottleneck Approach For Pose Regression of Flexible Surgical Instruments directly from Images

    Authors: Luca Sestini, Benoit Rosa, Elena De Momi, Giancarlo Ferrigno, Nicolas Padoy

    Abstract: 3-D pose estimation of instruments is a crucial step towards automatic scene understanding in robotic minimally invasive surgery. Although robotic systems can potentially directly provide joint values, this information is not commonly exploited inside the operating room, due to its possible unreliability, limited access and the time-consuming calibration required, especially for continuum robots.…

    Submitted 28 February, 2021; originally announced March 2021.

  44. Multi-Task Temporal Convolutional Networks for Joint Recognition of Surgical Phases and Steps in Gastric Bypass Procedures

    Authors: Sanat Ramesh, Diego Dall'Alba, Cristians Gonzalez, Tong Yu, Pietro Mascagni, Didier Mutter, Jacques Marescaux, Paolo Fiorini, Nicolas Padoy

    Abstract: Purpose: Automatic segmentation and classification of surgical activity is crucial for providing advanced support in computer-assisted interventions and autonomous functionalities in robot-assisted surgeries. Prior works have focused on recognizing either coarse activities, such as phases, or fine-grained activities, such as gestures. This work aims at jointly recognizing two complementary levels…

    Submitted 24 February, 2021; originally announced February 2021.

    Comments: Accepted to IPCAI 2021

  45. arXiv:2011.02284  [pdf, other]

    cs.CY cs.CV cs.LG eess.IV

    Surgical Data Science -- from Concepts toward Clinical Translation

    Authors: Lena Maier-Hein, Matthias Eisenmann, Duygu Sarikaya, Keno März, Toby Collins, Anand Malpani, Johannes Fallert, Hubertus Feussner, Stamatia Giannarou, Pietro Mascagni, Hirenkumar Nakawala, Adrian Park, Carla Pugh, Danail Stoyanov, Swaroop S. Vedula, Kevin Cleary, Gabor Fichtinger, Germain Forestier, Bernard Gibaud, Teodor Grantcharov, Makoto Hashizume, Doreen Heckmann-Nötzel, Hannes G. Kenngott, Ron Kikinis, Lars Mündermann, et al. (25 additional authors not shown)

    Abstract: Recent developments in data science in general and machine learning in particular have transformed the way experts envision the future of surgery. Surgical Data Science (SDS) is a new research field that aims to improve the quality of interventional healthcare through the capture, organization, analysis and modeling of data. While an increasing number of data-driven approaches and clinical applica…

    Submitted 30 July, 2021; v1 submitted 30 October, 2020; originally announced November 2020.

  46. arXiv:2009.14661  [pdf, other]

    cs.CV

    Encode the Unseen: Predictive Video Hashing for Scalable Mid-Stream Retrieval

    Authors: Tong Yu, Nicolas Padoy

    Abstract: This paper tackles a new problem in computer vision: mid-stream video-to-video retrieval. This task, which consists in searching a database for content similar to a video right as it is playing, e.g. from a live stream, exhibits challenging characteristics. Only the beginning part of the video is available as query and new frames are constantly added as the video plays out. To perform retrieval in…

    Submitted 2 October, 2020; v1 submitted 30 September, 2020; originally announced September 2020.

    Comments: Accepted at ACCV 2020

  47. arXiv:2009.13411  [pdf]

    cs.NE

    Artificial Intelligence in Surgery: Neural Networks and Deep Learning

    Authors: Deepak Alapatt, Pietro Mascagni, Vinkle Srivastav, Nicolas Padoy

    Abstract: Deep neural networks power most recent successes of artificial intelligence, spanning from self-driving cars to computer aided diagnosis in radiology and pathology. The high-stake data intensive process of surgery could highly benefit from such computational methods. However, surgeons and computer scientists should partner to develop and assess deep learning applications of value to patients and h…

    Submitted 28 September, 2020; originally announced September 2020.

    Journal ref: In Hashimoto D.A. (Ed.) Artificial Intelligence in Surgery: A Primer for Surgical Practice. New York: McGraw Hill. ISBN: 978-1260452730 (2020)

  48. Self-supervision on Unlabelled OR Data for Multi-person 2D/3D Human Pose Estimation

    Authors: Vinkle Srivastav, Afshin Gangi, Nicolas Padoy

    Abstract: 2D/3D human pose estimation is needed to develop novel intelligent tools for the operating room that can analyze and support the clinical activities. The lack of annotated data and the complexity of state-of-the-art pose estimation approaches limit, however, the deployment of such techniques inside the OR. In this work, we propose to use knowledge distillation in a teacher/student framework to har…

    Submitted 20 August, 2021; v1 submitted 16 July, 2020; originally announced July 2020.

    Comments: Published at MICCAI 2020. Code is available at https://github.com/CAMMA-public/ORPose-Color

    Journal ref: Springer (2020) LNCS, volume 12261

  49. Human Pose Estimation on Privacy-Preserving Low-Resolution Depth Images

    Authors: Vinkle Srivastav, Afshin Gangi, Nicolas Padoy

    Abstract: Human pose estimation (HPE) is a key building block for developing AI-based context-aware systems inside the operating room (OR). The 24/7 use of images coming from cameras mounted on the OR ceiling can however raise concerns for privacy, even in the case of depth images captured by RGB-D sensors. Being able to solely use low-resolution privacy-preserving images would address these concerns and he…

    Submitted 20 August, 2021; v1 submitted 16 July, 2020; originally announced July 2020.

    Comments: Published at MICCAI-2019. Code is available at https://github.com/CAMMA-public/ORPose-depth

    Journal ref: Springer (2019) 583-591

  50. Recognition of Instrument-Tissue Interactions in Endoscopic Videos via Action Triplets

    Authors: Chinedu Innocent Nwoye, Cristians Gonzalez, Tong Yu, Pietro Mascagni, Didier Mutter, Jacques Marescaux, Nicolas Padoy

    Abstract: Recognition of surgical activity is an essential component to develop context-aware decision support for the operating room. In this work, we tackle the recognition of fine-grained activities, modeled as action triplets <instrument, verb, target> representing the tool activity. To this end, we introduce a new laparoscopic dataset, CholecT40, consisting of 40 videos from the public dataset Cholec80…

    Submitted 10 July, 2020; originally announced July 2020.

    Comments: 13 pages, 4 figures, 6 tables. Accepted and to be published in MICCAI 2020

    Journal ref: Medical Image Computing and Computer Assisted Intervention MICCAI 12263 (2020) 364-374