-
Complementary Textures. A Novel Approach to Object Alignment in Mixed Reality
Authors:
Alejandro Martin-Gomez,
Alexander Winkler,
Rafael de la Tijera Obert,
Javad Fotouhi,
Daniel Roth,
Ulrich Eck,
Nassir Navab
Abstract:
Alignment between real and virtual objects is a challenging task required for the deployment of Mixed Reality (MR) into manufacturing, medical, and construction applications. To face this challenge, a series of methods have been proposed. While many approaches use dynamic augmentations such as animations, arrows, or text to assist users, they require tracking the position of real objects. In contrast, when tracking of the real objects is not available or desired, alternative approaches use virtual replicas of real objects to allow for interactive, perceptual virtual-to-real, and/or real-to-virtual alignment. In these cases, the accuracy achieved strongly depends on the quality of the perceptual information provided to the user. This paper proposes a novel set of perceptual alignment concepts that go beyond the use of traditional visualization of virtual replicas, introducing the concept of COMPLEMENTARY TEXTURES to improve interactive alignment in MR applications. To showcase the advantages of using COMPLEMENTARY TEXTURES, we describe three different implementations that provide highly salient visual cues when misalignment is observed; or present semantic augmentations that, when combined with a real object, provide contextual information that can be used during the alignment process. The authors aim to open new paths for the community to explore rather than describing end-to-end solutions. The objective is to show the multitude of opportunities such concepts could provide for further research and development.
Submitted 16 November, 2022;
originally announced November 2022.
-
4D-OR: Semantic Scene Graphs for OR Domain Modeling
Authors:
Ege Özsoy,
Evin Pınar Örnek,
Ulrich Eck,
Tobias Czempiel,
Federico Tombari,
Nassir Navab
Abstract:
Surgical procedures are conducted in highly complex operating rooms (OR), comprising different actors, devices, and interactions. To date, only medically trained human experts are capable of understanding all the links and interactions in such a demanding environment. This paper aims to bring the community one step closer to automated, holistic and semantic understanding and modeling of the OR domain. Towards this goal, for the first time, we propose using semantic scene graphs (SSG) to describe and summarize the surgical scene. The nodes of the scene graphs represent different actors and objects in the room, such as medical staff, patients, and medical equipment, whereas edges are the relationships between them. To validate the possibilities of the proposed representation, we create the first publicly available 4D surgical SSG dataset, 4D-OR, containing ten simulated total knee replacement surgeries recorded with six RGB-D sensors in a realistic OR simulation center. 4D-OR includes 6734 frames and is richly annotated with SSGs, human and object poses, and clinical roles. We propose an end-to-end neural network-based SSG generation pipeline that achieves 0.75 macro F1, showing that it is able to perform semantic reasoning in the OR. We further demonstrate the representation power of our scene graphs by using them for the problem of clinical role prediction, where we achieve 0.85 macro F1. The code and dataset will be made available upon acceptance.
Submitted 22 March, 2022;
originally announced March 2022.
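The 4D-OR abstract above describes scene graphs whose nodes are actors and objects and whose edges are relationships, and reports results as macro F1. The following is a minimal sketch of both ideas, not the authors' pipeline: the entity names, relation labels, and function names are hypothetical, chosen only for illustration. Macro F1 is computed in the standard way, as the unweighted mean of per-class F1 scores.

```python
from collections import defaultdict

def build_adjacency(triples):
    """Group (subject, predicate, object) triples by subject node."""
    graph = defaultdict(list)
    for subj, pred, obj in triples:
        graph[subj].append((pred, obj))
    return dict(graph)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted instances contributes 0.0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical scene graph for a single frame (labels are illustrative only).
frame_triples = [
    ("head_surgeon", "operating", "patient"),
    ("assistant", "assisting", "head_surgeon"),
    ("patient", "lying_on", "operating_table"),
]
graph = build_adjacency(frame_triples)

# Hypothetical ground-truth and predicted relation labels for four edges.
y_true = ["assisting", "holding", "assisting", "cutting"]
y_pred = ["assisting", "holding", "holding", "cutting"]
score = macro_f1(y_true, y_pred)
```

Because macro F1 weights every class equally, a rare relation that is predicted poorly lowers the score as much as a frequent one would.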
-
Know your sensORs -- A Modality Study For Surgical Action Classification
Authors:
Lennart Bastian,
Tobias Czempiel,
Christian Heiliger,
Konrad Karcz,
Ulrich Eck,
Benjamin Busam,
Nassir Navab
Abstract:
The surgical operating room (OR) presents many opportunities for automation and optimization. Videos from various sources in the OR are becoming increasingly available. The medical community seeks to leverage this wealth of data to develop automated methods to advance interventional care, lower costs, and improve overall patient outcomes. Existing datasets from OR cameras are thus far limited in size or modalities acquired, leaving it unclear which sensor modalities are best suited for tasks such as recognizing surgical action from videos. This study demonstrates that surgical action recognition performance can vary depending on the image modalities used. We perform a methodical analysis on several commonly available sensor modalities, presenting two fusion approaches that improve classification performance. The analyses are carried out on a set of multi-view RGB-D video recordings of 18 laparoscopic procedures.
Submitted 17 September, 2022; v1 submitted 16 March, 2022;
originally announced March 2022.
-
Motion-Aware Robotic 3D Ultrasound
Authors:
Zhongliang Jiang,
Hanyu Wang,
Zhenyu Li,
Matthias Grimm,
Mingchuan Zhou,
Ulrich Eck,
Sandra V. Brecht,
Tim C. Lueth,
Thomas Wendler,
Nassir Navab
Abstract:
Robotic three-dimensional (3D) ultrasound (US) imaging has been employed to overcome the drawbacks of traditional US examinations, such as high inter-operator variability and lack of repeatability. However, object movement remains a challenge as unexpected motion decreases the quality of the 3D compounding. Furthermore, attempted adjustment of objects, e.g., adjusting limbs to display the entire limb artery tree, is not allowed for conventional robotic US systems. To address this challenge, we propose a vision-based robotic US system that can monitor the object's motion and automatically update the sweep trajectory to provide 3D compounded images of the target anatomy seamlessly. To achieve these functions, a depth camera is employed to extract the manually planned sweep trajectory, after which the normal direction of the object is estimated using the extracted 3D trajectory. Subsequently, to monitor the movement and further compensate for this motion to accurately follow the trajectory, the position of firmly attached passive markers is tracked in real time. Finally, step-wise compounding is performed. The experiments on a gel phantom demonstrate that the system can resume a sweep when the object is not stationary during scanning.
Submitted 13 July, 2021;
originally announced July 2021.
-
Multimodal Semantic Scene Graphs for Holistic Modeling of Surgical Procedures
Authors:
Ege Özsoy,
Evin Pınar Örnek,
Ulrich Eck,
Federico Tombari,
Nassir Navab
Abstract:
From a computer science viewpoint, a surgical domain model needs to be a conceptual one incorporating both behavior and data. It should therefore model actors, devices, tools, their complex interactions and data flow. To capture and model these, we take advantage of the latest computer vision methodologies for generating 3D scene graphs from camera views. We then introduce the Multimodal Semantic Scene Graph (MSSG) which aims at providing a unified symbolic, spatiotemporal and semantic representation of surgical procedures. This methodology aims at modeling the relationship between different components in the surgical domain including medical staff, imaging systems, and surgical devices, opening the path towards holistic understanding and modeling of surgical procedures. We then use MSSG to introduce a dynamically generated graphical user interface tool for surgical procedure analysis which could be used for many applications including process optimization, OR design and automatic report generation. We finally demonstrate that the proposed MSSGs could also be used for synchronizing different complex surgical procedures. While the system still needs to be integrated into real operating rooms before it can be validated, this conference paper aims mainly at providing the community with the basic principles of this novel concept through a first prototypal partial realization based on the MVOR dataset.
Submitted 9 June, 2021;
originally announced June 2021.
-
Exploring Non-Reversing Magic Mirrors for Screen-Based Augmented Reality Systems
Authors:
Felix Bork,
Roghayeh Barmaki,
Ulrich Eck,
Pascal Fallavollita,
Bernhard Fuerst,
Nassir Navab
Abstract:
Screen-based Augmented Reality (AR) systems can be built as a window into the real world, as often done in mobile AR applications, or using the Magic Mirror metaphor, where users can see themselves with augmented graphics on a large display. Such Magic Mirror systems have been used in digital clothing environments to create virtual dressing rooms, to teach human anatomy, and for collaborative design tasks. The term Magic Mirror implies that the display shows the user's enantiomorph, i.e. the mirror image, such that the system mimics a real-world physical mirror. However, the question arises whether one should design a traditional mirror, or instead display the true mirror image by means of a non-reversing mirror. This is an intriguing perceptual question, as the image one observes in a mirror is not a real view, as it would be seen by an external observer, but a reflection, i.e. a front-to-back reversed image. In this paper, we discuss the perceptual differences between these two mirror visualization concepts and present a first comparative study in the context of Magic Mirror anatomy teaching. We investigate the ability of users to identify the correct placement of virtual anatomical structures in our screen-based AR system for two conditions: a regular mirror and a non-reversing mirror setup. The results of our study indicate that the latter is more suitable for applications where previously acquired domain-specific knowledge plays an important role. The lessons learned open up new research directions in the fields of user interfaces and interaction in non-reversing mirror environments and could impact the implementation of general screen-based AR systems in other domains.
Submitted 7 December, 2016; v1 submitted 10 November, 2016;
originally announced November 2016.
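The two display conditions compared in the Magic Mirror abstract above can be illustrated with a small sketch. It assumes a camera that faces the user and delivers frames as an external observer would see them; under that assumption a regular mirror display flips each row horizontally, while a non-reversing display shows the frame unchanged. The function names and the list-of-rows frame format are hypothetical and are not taken from the paper.

```python
def regular_mirror_view(frame):
    """Flip every row left-to-right, mimicking a physical mirror."""
    return [row[::-1] for row in frame]

def non_reversing_view(frame):
    """Show the frame as an external observer sees it (copy, no flip)."""
    return [list(row) for row in frame]

# A toy 2x3 frame; each cell stands in for a pixel.
frame = [
    ["a", "b", "c"],
    ["d", "e", "f"],
]
mirrored = regular_mirror_view(frame)
unflipped = non_reversing_view(frame)
```

Applying the horizontal flip twice returns the original frame, which is why switching between the two conditions is a cheap display-time operation in this simplified model.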