Search | arXiv e-print repository

A Multipurpose Interface for Close- and Far-Proximity Control of Mobile Collaborative Robots

Authors: Hamidreza Raei, Juan M. Gandarias, Elena De Momi, Pietro Balatti, Arash Ajoudani

Abstract: This letter introduces an innovative visuo-haptic interface to control Mobile Collaborative Robots (MCR). Thanks to a passive detachable mechanism, the interface can be attached/detached from a robot, offering two control modes: local control (attached) and teleoperation (detached). These modes are integrated with a robot whole-body controller and presented in a unified close- and far-proximity control framework for MCR. The earlier introduction of the haptic component in this interface enabled users to execute intricate loco-manipulation tasks via admittance-type control, effectively decoupling task dynamics and enhancing human capabilities. In contrast, this ongoing work proposes a novel design that integrates a visual component. This design utilizes Visual-Inertial Odometry (VIO) for teleoperation, estimating the interface's pose through stereo cameras and an Inertial Measurement Unit (IMU). The estimated pose serves as the reference for the robot's end-effector in teleoperation mode. Hence, the interface offers complete flexibility and adaptability, enabling any user to operate an MCR seamlessly without needing expert knowledge. In this letter, we primarily focus on the new visual feature, and first present a performance evaluation of different VIO-based methods for teleoperation. Next, the interface's usability is analyzed in a home-care application and compared to an alternative designed by a commercial MoCap system. Results show comparable performance in terms of accuracy, completion time, and usability. Nevertheless, the proposed interface is low-cost, poses minimal wearability constraints, and can be used anywhere and anytime without needing external devices or additional equipment, offering a versatile and accessible solution for teleoperation.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2405.04359 [pdf, other]

A Personalizable Controller for the Walking Assistive omNi-Directional Exo-Robot (WANDER)

Authors: A. Fortuna, M. Lorenzini, M. Leonori, JM. Gandarias, P. Balatti, Y. Cho, E. De Momi, A. Ajoudani

Abstract: Preserving and encouraging mobility in the elderly and adults with chronic conditions is of paramount importance. However, existing walking aids are either inadequate to provide sufficient support to users' stability or too bulky and poorly maneuverable to be used outside hospital environments. In addition, they all lack adaptability to individual requirements. To address these challenges, this paper introduces WANDER, a novel Walking Assistive omNi-Directional Exo-Robot. It consists of an omnidirectional platform and a robust aluminum structure mounted on top of it, which provides partial body weight support. A comfortable and minimally restrictive coupling interface embedded with a force/torque sensor allows the detection of users' intentions, which are translated into command velocities by means of a variable admittance controller. An optimization technique based on users' preferences, i.e., Preference-Based Optimization (PBO), guides the choice of the admittance parameters (i.e., virtual mass and damping) to better fit subject-specific needs and characteristics. Experiments with twelve healthy subjects exhibited a significant decrease in energy consumption and jerk when using WANDER with PBO parameters as well as improved user performance and comfort. The great interpersonal variability in the optimized parameters highlights the importance of personalized control settings when walking with an assistive device, aiming to enhance users' comfort and mobility while ensuring reliable physical support.

Submitted 7 May, 2024; originally announced May 2024.

Comments: 6 pages, 4 figures, IEEE International Conference on Robotics and Automation (2024)

arXiv:2402.13915 [pdf, other]

A Combined Learning and Optimization Framework to Transfer Human Whole-body Loco-manipulation Skills to Mobile Manipulators

Authors: Jianzhuang Zhao, Francesco Tassi, Yanlong Huang, Elena De Momi, Arash Ajoudani

Abstract: Humans' ability to smoothly switch between locomotion and manipulation is a remarkable feature of sensorimotor coordination. Learning and replication of such human-like strategies can lead to the development of more sophisticated robots capable of performing complex whole-body tasks in real-world environments. To this end, this paper proposes a combined learning and optimization framework for transferring humans' loco-manipulation soft-switching skills to mobile manipulators. The methodology departs from data collection of human demonstrations for a locomotion-integrated manipulation task through a vision system. Next, the wrist and pelvis motions are mapped to mobile manipulators' End-Effector (EE) and mobile base. A kernelized movement primitive algorithm learns the wrist and pelvis trajectories and generalizes to new desired points according to task requirements. Next, the reference trajectories are sent to a hierarchical quadratic programming controller, where the EE and the mobile base reference trajectories are provided as the first and second priority tasks, generating the feasible and optimal joint level commands. A locomotion-integrated pick-and-place task is executed to validate the proposed approach. After a human demonstrates the task, a mobile manipulator executes the task with the same and new settings, grasping a bottle at non-zero velocity. The results showed that the proposed approach successfully transfers the human loco-manipulation skills to mobile manipulators, even with different geometry.

Submitted 21 February, 2024; originally announced February 2024.

Comments: 8 pages, 6 figures

arXiv:2402.00537 [pdf, other]

Robust Path Planning via Learning from Demonstrations for Robotic Catheters in Deformable Environments

Authors: Zhen Li, Chiara Lambranzi, Di Wu, Alice Segato, Federico De Marco, Emmanuel Vander Poorten, Jenny Dankelman, Elena De Momi

Abstract: Navigation through tortuous and deformable vessels using catheters with limited steering capability underscores the need for reliable path planning. State-of-the-art path planners do not fully account for the deformable nature of the environment. This work proposes a robust path planner via a learning from demonstrations method, named Curriculum Generative Adversarial Imitation Learning (C-GAIL). This path planning framework takes into account the interaction between steerable catheters and vessel walls and the deformable property of vessels. In-silico comparative experiments show that the proposed network achieves smaller targeting errors, and a higher success rate, compared to a state-of-the-art approach based on GAIL. The in-vitro validation experiments demonstrate that the path generated by the proposed C-GAIL path planner aligns better with the actual steering capability of the pneumatic artificial muscle-driven catheter utilized in this study. Therefore, the proposed approach can provide enhanced support to the user in navigating the catheter towards the target with greater precision, in contrast to the conventional centerline-following technique. The targeting and tracking errors are 1.26$\pm$0.55mm and 5.18$\pm$3.48mm, respectively. The proposed path planning framework exhibits superior performance in managing uncertainty associated with vessel deformation, thereby resulting in lower tracking errors.

Submitted 1 February, 2024; originally announced February 2024.

Comments: Under review in IEEE Transactions on Biomedical Engineering (TBME)

arXiv:2401.04492 [pdf, other]

doi 10.1109/MRA.2024.3358721

Augmented Reality and Human-Robot Collaboration Framework for Percutaneous Nephrolithotomy: System Design, Implementation, and Performance Metrics

Authors: Junling Fu, Matteo Pecorella, Elisa Iovene, Maria Chiara Palumbo, Alberto Rota, Alberto Redaelli, Giancarlo Ferrigno, Elena De Momi

Abstract: During Percutaneous Nephrolithotomy (PCNL) operations, the surgeon is required to define the incision point on the patient's back, align the needle to a pre-planned path, and perform puncture operations afterward. The procedure is currently performed manually using ultrasound or fluoroscopy imaging for needle orientation, which, however, implies limited accuracy and low reproducibility. This work incorporates Augmented Reality (AR) visualization with an optical see-through head-mounted display (OST-HMD) and Human-Robot Collaboration (HRC) framework to empower the surgeon's task completion performance. In detail, Eye-to-Hand calibration, system registration, and hologram model registration are performed to realize visual guidance. A Cartesian impedance controller is used to guide the operator during the needle puncture task execution. Experiments are conducted to verify the system performance compared with conventional manual puncture procedures and a 2D monitor-based visualisation interface. The results showed that the proposed framework achieves the lowest median and standard deviation error across all the experimental groups, respectively. Furthermore, the NASA-TLX user evaluation results indicate that the proposed framework requires the lowest workload score for task completion compared to other experimental setups. The proposed framework exhibits significant potential for clinical application in the PCNL task, as it enhances the surgeon's perception capability, facilitates collision-free needle insertion path planning, and minimises errors in task completion.

Submitted 23 June, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

Comments: Accepted by IEEE Robotics and Automation Magazine

arXiv:2309.07693 [pdf, other]

Towards Safer Robot-Assisted Surgery: A Markerless Augmented Reality Framework

Authors: Ziyang Chen, Laura Cruciani, Ke Fan, Matteo Fontana, Elena Lievore, Ottavio De Cobelli, Gennaro Musi, Giancarlo Ferrigno, Elena De Momi

Abstract: Robot-assisted surgery is rapidly developing in the medical field, and the integration of augmented reality shows the potential of improving the surgeons' operation performance by providing more visual information. In this paper, we proposed a markerless augmented reality framework to enhance safety by avoiding intra-operative bleeding which is a high risk caused by the collision between the surgical instruments and the blood vessel. Advanced stereo reconstruction and segmentation networks are compared to find out the best combination to reconstruct the intra-operative blood vessel in the 3D space for the registration of the pre-operative model, and the minimum distance detection between the instruments and the blood vessel is implemented. A robot-assisted lymphadenectomy is simulated on the da Vinci Research Kit in a dry lab, and ten human subjects performed this operation to explore the usability of the proposed framework. The result shows that the augmented reality framework can help the users to avoid the dangerous collision between the instruments and the blood vessel while not introducing an extra load. It provides a flexible framework that integrates augmented reality into the medical robot platform to enhance safety during the operation.

Submitted 14 September, 2023; originally announced September 2023.

arXiv:2309.01723 [pdf, other]

SAF-IS: a Spatial Annotation Free Framework for Instance Segmentation of Surgical Tools

Authors: Luca Sestini, Benoit Rosa, Elena De Momi, Giancarlo Ferrigno, Nicolas Padoy

Abstract: Instance segmentation of surgical instruments is a long-standing research problem, crucial for the development of many applications for computer-assisted surgery. This problem is commonly tackled via fully-supervised training of deep learning models, requiring expensive pixel-level annotations to train. In this work, we develop a framework for instance segmentation not relying on spatial annotations for training. Instead, our solution only requires binary tool masks, obtainable using recent unsupervised approaches, and binary tool presence labels, freely obtainable in robot-assisted surgery. Based on the binary mask information, our solution learns to extract individual tool instances from single frames, and to encode each instance into a compact vector representation, capturing its semantic features. Such representations guide the automatic selection of a tiny number of instances (8 only in our experiments), displayed to a human operator for tool-type labelling. The gathered information is finally used to match each training instance with a binary tool presence label, providing an effective supervision signal to train a tool instance classifier. We validate our framework on the EndoVis 2017 and 2018 segmentation datasets. We provide results using binary masks obtained either by manual annotation or as predictions of an unsupervised binary segmentation model. The latter solution yields an instance segmentation approach completely free from spatial annotations, outperforming several state-of-the-art fully-supervised segmentation approaches.

Submitted 4 September, 2023; originally announced September 2023.

arXiv:2306.07205 [pdf, other]

doi 10.1109/LRA.2023.3280752

Maximising Coefficiency of Human-Robot Handovers through Reinforcement Learning

Authors: Marta Lagomarsino, Marta Lorenzini, Merryn Dale Constable, Elena De Momi, Cristina Becchio, Arash Ajoudani

Abstract: Handing objects to humans is an essential capability for collaborative robots. Previous research works on human-robot handovers focus on facilitating the performance of the human partner and possibly minimising the physical effort needed to grasp the object. However, altruistic robot behaviours may result in protracted and awkward robot motions, contributing to unpleasant sensations by the human partner and affecting perceived safety and social acceptance. This paper investigates whether transferring the cognitive science principle that "humans act coefficiently as a group" (i.e. simultaneously maximising the benefits of all agents involved) to human-robot cooperative tasks promotes a more seamless and natural interaction. Human-robot coefficiency is first modelled by identifying implicit indicators of human comfort and discomfort as well as calculating the robot energy consumption in performing the desired trajectory. We then present a reinforcement learning approach that uses the human-robot coefficiency score as reward to adapt and learn online the combination of robot interaction parameters that maximises such coefficiency. Results proved that by acting coefficiently the robot could meet the individual preferences of most subjects involved in the experiments, improve the human perceived comfort, and foster trust in the robotic partner.

Submitted 12 June, 2023; originally announced June 2023.

Comments: 8 pages, 6 figures, IEEE Robotics and Automation Letters

arXiv:2305.04027 [pdf, other]

doi 10.1109/TRO.2023.3269384

Autonomous Navigation for Robot-assisted Intraluminal and Endovascular Procedures: A Systematic Review

Authors: Ameya Pore, Zhen Li, Diego Dall'Alba, Albert Hernansanz, Elena De Momi, Arianna Menciassi, Alicia Casals, Jenny Dankelman, Paolo Fiorini, Emmanuel Vander Poorten

Abstract: Increased demand for less invasive procedures has accelerated the adoption of Intraluminal Procedures (IP) and Endovascular Interventions (EI) performed through body lumens and vessels. As navigation through lumens and vessels is quite complex, interest grows to establish autonomous navigation techniques for IP and EI for reaching the target area. Current research efforts are directed toward increasing the Level of Autonomy (LoA) during the navigation phase. One key ingredient for autonomous navigation is Motion Planning (MP) techniques. This paper provides an overview of MP techniques categorizing them based on LoA. Our analysis investigates advances for the different clinical scenarios. Through a systematic literature analysis using the PRISMA method, the study summarizes relevant works and investigates the clinical aim, LoA, adopted MP techniques, and validation types. We identify the limitations of the corresponding MP methods and provide directions to improve the robustness of the algorithms in dynamic intraluminal environments. MP for IP and EI can be classified into four subgroups: node, sampling, optimization, and learning-based techniques, with a notable rise in learning-based approaches in recent years. One of the review's contributions is the identification of the limiting factors in IP and EI robotic systems hindering higher levels of autonomous navigation. In the future, navigation is bound to become more autonomous, placing the clinician in a supervisory position to improve control precision and reduce workload.

Submitted 6 May, 2023; originally announced May 2023.

Comments: 31 pages, 7 figures, 3 tables; Accepted in IEEE Transactions on Robotics

arXiv:2304.14589 [pdf, other]

Uncertainty-aware Self-supervised Learning for Cross-domain Technical Skill Assessment in Robot-assisted Surgery

Authors: Ziheng Wang, Andrea Mariani, Arianna Menciassi, Elena De Momi, Ann Majewicz Fey

Abstract: Objective technical skill assessment is crucial for effective training of new surgeons in robot-assisted surgery. With advancements in surgical training programs in both physical and virtual environments, it is imperative to develop generalizable methods for automatically assessing skills. In this paper, we propose a novel approach for skill assessment by transferring domain knowledge from labeled kinematic data to unlabeled data. Our approach leverages labeled data from common surgical training tasks such as Suturing, Needle Passing, and Knot Tying to jointly train a model with both labeled and unlabeled data. Pseudo labels are generated for the unlabeled data in an iterative manner that incorporates uncertainty estimation to ensure accurate labeling. We evaluate our method on a virtual reality simulated training task (Ring Transfer) using data from the da Vinci Research Kit (dVRK). The results show that trainees with robotic assistance have significantly higher expert probability compared to those without any assistance, p < 0.05, which aligns with previous studies showing the benefits of robotic assistance in improving training proficiency. Our method offers a significant advantage over other existing works as it does not require manual labeling or prior knowledge of the surgical training task for robot-assisted surgery.

Submitted 27 April, 2023; originally announced April 2023.

Comments: Manuscript ACCEPTED on 18-April-2023 for publication in IEEE Transactions on Medical Robotics and Bionics (TMRB). 12 pages, 9 figures, and 2 tables

arXiv:2303.18119 [pdf, other]

Markerless 3D human pose tracking through multiple cameras and AI: Enabling high accuracy, robustness, and real-time performance

Authors: Luca Fortini, Mattia Leonori, Juan M. Gandarias, Elena De Momi, Arash Ajoudani

Abstract: Tracking 3D human motion in real-time is crucial for numerous applications across many fields. Traditional approaches involve attaching artificial fiducial objects or sensors to the body, limiting their usability and comfort-of-use and consequently narrowing their application fields. Recent advances in Artificial Intelligence (AI) have allowed for markerless solutions. However, most of these methods operate in 2D, while those providing 3D solutions compromise accuracy and real-time performance. To address this challenge and unlock the potential of visual pose estimation methods in real-world scenarios, we propose a markerless framework that combines multi-camera views and 2D AI-based pose estimation methods to track 3D human motion. Our approach integrates a Weighted Least Square (WLS) algorithm that computes 3D human motion from multiple 2D pose estimations provided by an AI-driven method. The method is integrated within the Open-VICO framework allowing simulation and real-world execution. Several experiments have been conducted, which have shown high accuracy and real-time performance, demonstrating the high level of readiness for real-world applications and the potential to revolutionize human motion capture.

Submitted 31 March, 2023; originally announced March 2023.

Comments: 19 pages, 7 figures

arXiv:2301.08038 [pdf, other]

A Unified Architecture for Dynamic Role Allocation and Collaborative Task Planning in Mixed Human-Robot Teams

Authors: Edoardo Lamon, Fabio Fusaro, Elena De Momi, Arash Ajoudani

Abstract: The growing deployment of human-robot collaborative processes in several industrial applications, such as handling, welding, and assembly, unfolds the pursuit of systems which are able to manage large heterogeneous teams and, at the same time, monitor the execution of complex tasks. In this paper, we present a novel architecture for dynamic role allocation and collaborative task planning in a mixed human-robot team of arbitrary size. The architecture capitalizes on a centralized reactive and modular task-agnostic planning method based on Behavior Trees (BTs), in charge of action scheduling, while the allocation problem is formulated through a Mixed-Integer Linear Program (MILP) that dynamically assigns individual roles or collaborations to the agents of the team. Different metrics used as MILP cost allow the architecture to favor various aspects of the collaboration (e.g. makespan, ergonomics, human preferences). Human preferences are identified through a negotiation phase, in which a human agent can accept/refuse to execute the assigned task. In addition, bilateral communication between humans and the system is achieved through an Augmented Reality (AR) custom user interface that provides intuitive functionalities to assist and coordinate workers in different action phases. The computational complexity of the proposed methodology outperforms literature approaches in industrial sized jobs and teams (problems up to 50 actions and 20 agents in the team with collaborations are solved within 1 s). The different allocated roles, as the cost functions change, highlight the flexibility of the architecture to several production requirements. Finally, the subjective evaluation demonstrates the high usability level and the suitability for the targeted scenario.

Submitted 25 September, 2023; v1 submitted 19 January, 2023; originally announced January 2023.

Comments: 18 pages, 20 figures, 2nd round review at Transaction on Robotics

arXiv:2212.11375 [pdf, other]

Semi-supervised Bladder Tissue Classification in Multi-Domain Endoscopic Images

Authors: Jorge F. Lazo, Benoit Rosa, Michele Catellani, Matteo Fontana, Francesco A. Mistretta, Gennaro Musi, Ottavio de Cobelli, Michel de Mathelin, Elena De Momi

Abstract: Objective: Accurate visual classification of bladder tissue during Trans-Urethral Resection of Bladder Tumor (TURBT) procedures is essential to improve early cancer diagnosis and treatment. During TURBT interventions, White Light Imaging (WLI) and Narrow Band Imaging (NBI) techniques are used for lesion detection. Each imaging technique provides diverse visual information that allows clinicians to identify and classify cancerous lesions. Computer vision methods that use both imaging techniques could improve endoscopic diagnosis. We address the challenge of tissue classification when annotations are available only in one domain, in our case WLI, and the endoscopic images correspond to an unpaired dataset, i.e. there is no exact equivalent for every image in both NBI and WLI domains. Method: We propose a semi-supervised Generative Adversarial Network (GAN)-based method composed of three main components: a teacher network trained on the labeled WLI data; a cycle-consistency GAN to perform unpaired image-to-image translation, and a multi-input student network. To ensure the quality of the synthetic images generated by the proposed GAN we perform a detailed quantitative and qualitative analysis with the help of specialists. Conclusion: The overall average classification accuracy, precision, and recall obtained with the proposed method for tissue classification are 0.90, 0.88, and 0.89 respectively, while the same metrics obtained in the unlabeled domain (NBI) are 0.92, 0.64, and 0.94 respectively. The quality of the generated images is reliable enough to deceive specialists. Significance: This study shows the potential of using semi-supervised GAN-based bladder tissue classification when annotations are limited in multi-domain data. The dataset is available at https://zenodo.org/record/7741476#.ZBQUK7TMJ6k

Submitted 17 March, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

Comments: Title and abstract updated. Typos corrected

arXiv:2209.12563 [pdf, other]

Impact-Friendly Object Catching at Non-Zero Velocity Based on Combined Optimization and Learning

Authors: Jianzhuang Zhao, Gustavo J. G. Lahr, Francesco Tassi, Alessandro Santopaolo, Elena De Momi, Arash Ajoudani

Abstract: This paper proposes a combined optimization and learning method for impact-friendly, non-prehensile catching of objects at non-zero velocity. Through a constrained Quadratic Programming problem, the method generates optimal trajectories up to the contact point between the robot and the object to minimize their relative velocity and reduce the impact forces. Next, the generated trajectories are updated by Kernelized Movement Primitives, which are based on human catching demonstrations to ensure a smooth transition around the catching point. In addition, the learned human variable stiffness (HVS) is sent to the robot's Cartesian impedance controller to absorb the post-impact forces and stabilize the catching position. Three experiments are conducted to compare our method with and without HVS against a fixed-position impedance controller (FP-IC). The results showed that the proposed methods outperform the FP-IC while adding HVS yields better results for absorbing the post-impact forces.

Submitted 5 September, 2023; v1 submitted 26 September, 2022; originally announced September 2022.

Comments: 8 pages, 9 figures, accepted by 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2023)

arXiv:2207.13185 [pdf, other]

Learning-Based Keypoint Registration for Fetoscopic Mosaicking

Authors: Alessandro Casella, Sophia Bano, Francisco Vasconcelos, Anna L. David, Dario Paladini, Jan Deprest, Elena De Momi, Leonardo S. Mattos, Sara Moccia, Danail Stoyanov

Abstract: In Twin-to-Twin Transfusion Syndrome (TTTS), abnormal vascular anastomoses in the monochorionic placenta can produce uneven blood flow between the two fetuses. In the current practice, TTTS is treated surgically by closing abnormal anastomoses using laser ablation. This surgery is minimally invasive and relies on fetoscopy. Limited field of view makes anastomosis identification a challenging task for the surgeon. To tackle this challenge, we propose a learning-based framework for in-vivo fetoscopy frame registration for field-of-view expansion. The novelty of this framework lies in a learning-based keypoint proposal network and an encoding strategy to filter (i) irrelevant keypoints based on fetoscopic image segmentation and (ii) inconsistent homographies. We validate our framework on a dataset of 6 intraoperative sequences from 6 TTTS surgeries from 6 different women against the most recent state-of-the-art algorithm, which relies on the segmentation of placenta vessels. The proposed framework achieves higher performance compared to the state of the art, paving the way for robust mosaicking to provide surgeons with context awareness during TTTS surgery.

Submitted 26 July, 2022; originally announced July 2022.

arXiv:2207.03779 [pdf, other]

doi 10.1109/TCDS.2022.3182811

Pick the Right Co-Worker: Online Assessment of Cognitive Ergonomics in Human-Robot Collaborative Assembly

Authors: Marta Lagomarsino, Marta Lorenzini, Pietro Balatti, Elena De Momi, Arash Ajoudani

Abstract: Human-robot collaborative assembly systems enhance the efficiency and productivity of the workplace but may increase the workers' cognitive demand. This paper proposes an online and quantitative framework to assess the cognitive workload induced by the interaction with a co-worker, either a human operator or an industrial collaborative robot with different control strategies. The approach monitors the operator's attention distribution and upper-body kinematics benefiting from the input images of a low-cost stereo camera and cutting-edge artificial intelligence algorithms (i.e. head pose estimation and skeleton tracking). Three experimental scenarios with variations in workstation features and interaction modalities were designed to test the performance of our online method against state-of-the-art offline measurements. Results proved that our vision-based cognitive load assessment has the potential to be integrated into the new generation of collaborative robotic technologies. The latter would enable human cognitive state monitoring and robot control strategy adaptation for improving human comfort, ergonomics, and trust in automation.

Submitted 8 July, 2022; originally announced July 2022.

Comments: 10 pages, 7 figures, IEEE Transactions on Cognitive and Developmental Systems

arXiv:2207.03739 [pdf, other]

doi 10.1109/IROS47612.2022.9981424

Robot Trajectory Adaptation to Optimise the Trade-off between Human Cognitive Ergonomics and Workplace Productivity in Collaborative Tasks

Authors: Marta Lagomarsino, Marta Lorenzini, Elena De Momi, Arash Ajoudani

Abstract: In hybrid industrial environments, workers' comfort and positive perception of safety are essential requirements for successful acceptance and usage of collaborative robots. This paper proposes a novel human-robot interaction framework in which the robot behaviour is adapted online according to the operator's cognitive workload and stress. The method exploits the generation of B-spline trajectories in the joint space and formulation of a multi-objective optimisation problem to online adjust the total execution time and smoothness of the robot trajectories. The former ensures human efficiency and productivity of the workplace, while the latter contributes to safeguarding the user's comfort and cognitive ergonomics. The performance of the proposed framework was evaluated in a typical industrial task. Results demonstrated its capability to enhance the productivity of the human-robot dyad while mitigating the cognitive workload induced in the worker.

Submitted 8 July, 2022; originally announced July 2022.

Comments: 7 pages, 8 figures, 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022)

arXiv:2207.03435 [pdf, other]

Sociable and Ergonomic Human-Robot Collaboration through Action Recognition and Augmented Hierarchical Quadratic Programming

Authors: Francesco Tassi, Francesco Iodice, Elena De Momi, Arash Ajoudani

Abstract: The recognition of actions performed by humans and the anticipation of their intentions are important enablers to yield sociable and successful collaboration in human-robot teams. Meanwhile, robots should have the capacity to deal with multiple objectives and constraints, arising from the collaborative task or the human. In this regard, we propose vision techniques to perform human action recognition and image classification, which are integrated into an Augmented Hierarchical Quadratic Programming (AHQP) scheme to hierarchically optimize the robot's reactive behavior and human ergonomics. The proposed framework allows one to intuitively command the robot in space while a task is being executed. The experiments confirm increased human ergonomics and usability, which are fundamental parameters for reducing musculoskeletal diseases and increasing trust in automation.

Submitted 7 July, 2022; originally announced July 2022.

Comments: 8 pages, 8 figures, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022)

arXiv:2207.00401 [pdf, other]

Autonomous Intraluminal Navigation of a Soft Robot using Deep-Learning-based Visual Servoing

Authors: Jorge F. Lazo, Chun-Feng Lai, Sara Moccia, Benoit Rosa, Michele Catellani, Michel de Mathelin, Giancarlo Ferrigno, Paul Breedveld, Jenny Dankelman, Elena De Momi

Abstract: Navigation inside luminal organs is an arduous task that requires non-intuitive coordination between the movement of the operator's hand and the information obtained from the endoscopic video. The development of tools to automate certain tasks could alleviate the physical and mental load of doctors during interventions, allowing them to focus on diagnosis and decision-making tasks. In this paper, we present a synergic solution for intraluminal navigation consisting of a 3D printed endoscopic soft robot that can move safely inside luminal structures. Visual servoing, based on Convolutional Neural Networks (CNNs), is used to achieve the autonomous navigation task. The CNN is trained with phantoms and in-vivo data to segment the lumen, and a model-less approach is presented to control the movement in constrained environments. The proposed robot is validated in anatomical phantoms in different path configurations. We analyze the movement of the robot using different metrics such as task completion time, smoothness, error in the steady-state, and mean and maximum error. We show that our method is suitable to navigate safely in hollow environments and conditions which are different from the ones the network was originally trained on.

Submitted 26 July, 2022; v1 submitted 1 July, 2022; originally announced July 2022.

arXiv:2206.12512 [pdf, other]

Placental Vessel Segmentation and Registration in Fetoscopy: Literature Review and MICCAI FetReg2021 Challenge Findings

Authors: Sophia Bano, Alessandro Casella, Francisco Vasconcelos, Abdul Qayyum, Abdesslam Benzinou, Moona Mazher, Fabrice Meriaudeau, Chiara Lena, Ilaria Anita Cintorrino, Gaia Romana De Paolis, Jessica Biagioli, Daria Grechishnikova, Jing Jiao, Bizhe Bai, Yanyan Qiao, Binod Bhattarai, Rebati Raman Gaire, Ronast Subedi, Eduard Vazquez, Szymon Płotka, Aneta Lisowska, Arkadiusz Sitek, George Attilakos, Ruwan Wimalasundera, Anna L David, et al. (6 additional authors not shown)

Abstract: Fetoscopy laser photocoagulation is a widely adopted procedure for treating Twin-to-Twin Transfusion Syndrome (TTTS). The procedure involves photocoagulating pathological anastomoses to regulate blood exchange among twins. The procedure is particularly challenging due to the limited field of view, poor manoeuvrability of the fetoscope, poor visibility, and variability in illumination. These challenges may lead to increased surgery time and incomplete ablation. Computer-assisted intervention (CAI) can provide surgeons with decision support and context awareness by identifying key structures in the scene and expanding the fetoscopic field of view through video mosaicking. Research in this domain has been hampered by the lack of high-quality data to design, develop and test CAI algorithms. Through the Fetoscopic Placental Vessel Segmentation and Registration (FetReg2021) challenge, which was organized as part of the MICCAI2021 Endoscopic Vision challenge, we released the first large-scale multicentre TTTS dataset for the development of generalized and robust semantic segmentation and video mosaicking algorithms. For this challenge, we released a dataset of 2060 images, pixel-annotated for vessels, tool, fetus and background classes, from 18 in-vivo TTTS fetoscopy procedures and 18 short video clips. Seven teams participated in this challenge and their model performance was assessed on an unseen test dataset of 658 pixel-annotated images from 6 fetoscopic procedures and 6 short clips. The challenge provided an opportunity for creating generalized solutions for fetoscopic scene understanding and mosaicking. In this paper, we present the findings of the FetReg2021 challenge alongside reporting a detailed literature review for CAI in TTTS fetoscopy. Through this challenge, its analysis and the release of multi-centre fetoscopic data, we provide a benchmark for future research in this field.

Submitted 26 February, 2023; v1 submitted 24 June, 2022; originally announced June 2022.

Comments: Accepted at MedIA (Medical Image Analysis)

arXiv:2204.04003 [pdf, other]

Learning Cooperative Dynamic Manipulation Skills from Human Demonstration Videos

Authors: Francesco Iodice, Yuqiang Wu, Wansoo Kim, Fei Zhao, Elena De Momi, Arash Ajoudani

Abstract: This article proposes a method for learning and robotic replication of dynamic collaborative tasks from offline videos. The objective is to extend the concept of learning from demonstration (LfD) to dynamic scenarios, benefiting from widely available or easily producible offline videos. To achieve this goal, we decode important dynamic information, such as the Configuration Dependent Stiffness (CDS), which reveals the contribution of arm pose to the arm endpoint stiffness, from a three-dimensional human skeleton model. Next, through encoding of the CDS via Gaussian Mixture Model (GMM) and decoding via Gaussian Mixture Regression (GMR), the robot's Cartesian impedance profile is estimated and replicated. We demonstrate the proposed method in a collaborative sawing task with leader-follower structure, considering environmental constraints and dynamic uncertainties. The experimental setup includes two Panda robots, which replicate the leader-follower roles and the impedance profiles extracted from a two-person sawing video.

Submitted 8 April, 2022; originally announced April 2022.

arXiv:2203.14733 [pdf, other]

doi 10.1109/RO-MAN53752.2022.9900851

Open-VICO: An Open-Source Gazebo Toolkit for Vision-based Skeleton Tracking in Human-Robot Collaboration

Authors: Luca Fortini, Mattia Leonori, Juan M. Gandarias, Elena De Momi, Arash Ajoudani

Abstract: Simulation tools are essential for robotics research, especially for those domains in which safety is crucial, such as Human-Robot Collaboration (HRC). However, it is challenging to simulate human behaviors, and existing robotics simulators do not integrate functional human models. This work presents Open-VICO, an open-source toolkit to integrate virtual human models in Gazebo focusing on vision-based human tracking. In particular, Open-VICO allows combining, in the same simulation environment, realistic human kinematic models, multi-camera vision setups, and human-tracking techniques along with numerous robot and sensor models thanks to Gazebo. The possibility to incorporate pre-recorded human skeleton motion with Motion Capture systems broadens the landscape of human performance behavioral analysis within Human-Robot Interaction (HRI) settings. To describe the functionalities and stress the potential of the toolkit, four specific examples, chosen among relevant literature challenges in the field, are developed using our simulation utils: i) 3D multi-RGB-D camera calibration in simulation, ii) creation of a synthetic human skeleton tracking dataset based on OpenPose, iii) multi-camera scenario for human skeleton tracking in simulation, and iv) a human-robot interaction example. The key of this work is to create a straightforward pipeline which we hope will motivate research on new vision-based algorithms and methodologies for lightweight human-tracking and flexible human-robot applications.

Submitted 4 October, 2022; v1 submitted 28 March, 2022; originally announced March 2022.

Comments: 7 pages, 8 figures. The final version of this preprint has been published at IEEE International Conference on Robot & Human Interactive Communication. DOI: 10.1109/RO-MAN53752.2022.9900851. Code: https://gitlab.iit.it/hrii-public/open-vico

arXiv:2203.14613 [pdf, other]

doi 10.1109/LRA.2022.3187258

A Hybrid Learning and Optimization Framework to Achieve Physically Interactive Tasks with Mobile Manipulators

Authors: Jianzhuang Zhao, Alberto Giammarino, Edoardo Lamon, Juan M. Gandarias, Elena De Momi, Arash Ajoudani

Abstract: This paper proposes a hybrid learning and optimization framework for mobile manipulators for complex and physically interactive tasks. The framework exploits an admittance-type physical interface to obtain intuitive and simplified human demonstrations and Gaussian Mixture Model (GMM)/Gaussian Mixture Regression (GMR) to encode and generate the learned task requirements in terms of position, velocity, and force profiles. Next, using the desired trajectories and force profiles generated by GMM/GMR, the impedance parameters of a Cartesian impedance controller are optimized online through a Quadratic Program augmented with an energy tank to ensure the passivity of the controlled system. Two experiments are conducted to validate the framework, comparing our method with two approaches with constant stiffness (high and low). The results showed that the proposed method outperforms the other two cases in terms of trajectory tracking and generated interaction forces, even in the presence of disturbances such as unexpected end-effector collisions.

Submitted 1 August, 2022; v1 submitted 28 March, 2022; originally announced March 2022.

Comments: 8 pages, 6 figures, accepted by IEEE Robotics and Automation Letters and IEEE International Conference on Intelligent Robots and Systems 2022

arXiv:2202.08141 [pdf, other]

FUN-SIS: a Fully UNsupervised approach for Surgical Instrument Segmentation

Authors: Luca Sestini, Benoit Rosa, Elena De Momi, Giancarlo Ferrigno, Nicolas Padoy

Abstract: Automatic surgical instrument segmentation of endoscopic images is a crucial building block of many computer-assistance applications for minimally invasive surgery. So far, state-of-the-art approaches completely rely on the availability of a ground-truth supervision signal, obtained via manual annotation, thus expensive to collect at large scale. In this paper, we present FUN-SIS, a Fully-UNsupervised approach for binary Surgical Instrument Segmentation. FUN-SIS trains a per-frame segmentation model on completely unlabelled endoscopic videos, by solely relying on implicit motion information and instrument shape-priors. We define shape-priors as realistic segmentation masks of the instruments, not necessarily coming from the same dataset/domain as the videos. The shape-priors can be collected in various and convenient ways, such as recycling existing annotations from other datasets. We leverage them as part of a novel generative-adversarial approach, which allows performing unsupervised instrument segmentation of optical-flow images during training. We then use the obtained instrument masks as pseudo-labels in order to train a per-frame segmentation model; to this aim, we develop a learning-from-noisy-labels architecture, designed to extract a clean supervision signal from these pseudo-labels, leveraging their peculiar noise properties. We validate the proposed contributions on three surgical datasets, including the MICCAI 2017 EndoVis Robotic Instrument Segmentation Challenge dataset. The obtained fully-unsupervised results for surgical instrument segmentation are almost on par with the ones of fully-supervised state-of-the-art approaches. This suggests the tremendous potential of the proposed method to leverage the great amount of unlabelled data produced in the context of minimally invasive surgery.

Submitted 16 February, 2022; originally announced February 2022.

arXiv:2109.03627 [pdf, other]

doi 10.1016/j.rcim.2022.102380

An Online Framework for Cognitive Load Assessment in Assembly Tasks

Authors: Marta Lagomarsino, Marta Lorenzini, Elena De Momi, Arash Ajoudani

Abstract: The ongoing trend towards Industry 4.0 has revolutionised ordinary workplaces, profoundly changing the role played by humans in the production chain. Research on ergonomics in industrial settings mainly focuses on reducing the operator's physical fatigue and discomfort to improve throughput and avoid safety hazards. However, as the production complexity increases, the cognitive resources demand and mental workload could compromise the operator's performance and the efficiency of the shop floor workplace. State-of-the-art methods in cognitive science work offline and/or involve bulky equipment hardly deployable in industrial settings. This paper presents a novel method for online assessment of cognitive load in manufacturing, primarily assembly, by detecting patterns in human motion directly from the input images of a stereo camera. Head pose estimation and skeleton tracking are exploited to investigate the workers' attention and assess hyperactivity and unforeseen movements. Pilot experiments suggest that our factor assessment tool provides significant insights into workers' mental workload, even confirmed by correlations with physiological and performance measurements. According to data gathered in this study, a vision-based cognitive load assessment has the potential to be integrated into the development of mechatronic systems for improving cognitive ergonomics in manufacturing.

Submitted 19 October, 2021; v1 submitted 8 September, 2021; originally announced September 2021.

Journal ref: Robotics and Computer-Integrated Manufacturing, Volume 78, December 2022

arXiv:2106.10206 [pdf, other]

doi 10.1109/LRA.2021.3090016

Position-based Dynamics Simulator of Brain Deformations for Path Planning and Intra-Operative Control in Keyhole Neurosurgery

Authors: Alice Segato, Chiara Di Vece, Sara Zucchelli, Marco Di Marzo, Thomas Wendler, Mohammad Farid Azampour, Stefano Galvan, Riccardo Secoli, Elena De Momi

Abstract: Many tasks in robot-assisted surgery require planning and controlling manipulators' motions that interact with highly deformable objects. This study proposes a realistic, time-bounded simulator based on Position-based Dynamics (PBD) simulation that mocks brain deformations due to catheter insertion for pre-operative path planning and intra-operative guidance in keyhole surgical procedures. It maximizes the probability of success by accounting for uncertainty in deformation models, noisy sensing, and unpredictable actuation. The PBD deformation parameters were initialized on a parallelepiped-shaped simulated phantom to obtain a reasonable starting guess for the brain white matter. They were calibrated by comparing the obtained displacements with deformation data for catheter insertion in a composite hydrogel phantom. Knowing the gray matter brain structures' different behaviors, the parameters were fine-tuned to obtain a generalized human brain model. The brain structures' average displacement was compared with values in the literature. The simulator's numerical model uses a novel approach with respect to the literature, and it has proved to be a close match with real brain deformations through validation using recorded deformation data of in-vivo animal trials with a mean mismatch of 4.73$\pm$2.15%. The stability, accuracy, and real-time performance make this model suitable for creating a dynamic environment for KN path planning, pre-operative path planning, and intra-operative guidance.

Submitted 18 June, 2021; originally announced June 2021.

Comments: 8 pages, 8 figures. This article has been accepted for publication in a future issue of IEEE Robotics and Automation Letters, but has not been fully edited. Content may change prior to final publication. 2377-3766 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. A. Segato and C. Di Vece equally contributed

Journal ref: IEEE Robotics and Automation Letters, vol. 6, no. 3, pp. 6061-6067, July 2021

arXiv:2106.05923 [pdf, other]

FetReg: Placental Vessel Segmentation and Registration in Fetoscopy Challenge Dataset

Authors: Sophia Bano, Alessandro Casella, Francisco Vasconcelos, Sara Moccia, George Attilakos, Ruwan Wimalasundera, Anna L. David, Dario Paladini, Jan Deprest, Elena De Momi, Leonardo S. Mattos, Danail Stoyanov

Abstract: Fetoscopy laser photocoagulation is a widely used procedure for the treatment of Twin-to-Twin Transfusion Syndrome (TTTS), which occurs in mono-chorionic multiple pregnancies due to placental vascular anastomoses. This procedure is particularly challenging due to limited field of view, poor manoeuvrability of the fetoscope, poor visibility due to fluid turbidity, variability in light source, and unusual position of the placenta. This may lead to increased procedural time and incomplete ablation, resulting in persistent TTTS. Computer-assisted intervention may help overcome these challenges by expanding the fetoscopic field of view through video mosaicking and providing better visualization of the vessel network. However, the research and development in this domain remain limited due to unavailability of high-quality data to encode the intra- and inter-procedure variability. Through the Fetoscopic Placental Vessel Segmentation and Registration (FetReg) challenge, we present a large-scale multi-centre dataset for the development of generalized and robust semantic segmentation and video mosaicking algorithms for the fetal environment with a focus on creating drift-free mosaics from long duration fetoscopy videos. In this paper, we provide an overview of the FetReg dataset, challenge tasks, evaluation metrics and baseline methods for both segmentation and registration. Baseline method results on the FetReg dataset show that our dataset poses interesting challenges, offering large opportunity for the creation of novel methods and models through a community effort initiative guided by the FetReg challenge.

Submitted 16 June, 2021; v1 submitted 10 June, 2021; originally announced June 2021.

arXiv:2105.12031 [pdf, other]

doi 10.1109/RO-MAN50785.2021.9515500

An Integrated Dynamic Method for Allocating Roles and Planning Tasks for Mixed Human-Robot Teams

Authors: Fabio Fusaro, Edoardo Lamon, Elena De Momi, Arash Ajoudani

Abstract: This paper proposes a novel integrated dynamic method based on Behavior Trees for planning and allocating tasks in mixed human-robot teams, suitable for manufacturing environments. The Behavior Tree formulation allows encoding a single job as a compound of different tasks with temporal and logic constraints. In this way, instead of the well-studied offline centralized optimization problem, the role allocation problem is solved with multiple simplified online optimization sub-problems, without complex and cross-schedule task dependencies. These sub-problems are defined as Mixed-Integer Linear Programs that, according to the worker-actions related costs and the workers' availability, allocate the yet-to-execute tasks among the available workers. To characterize the behavior of the developed method, we opted to perform different simulation experiments in which the results of the action-worker allocation and computational complexity are evaluated. The obtained results, due to the nature of the algorithm and to the possibility of simulating the agents' behavior, should describe well also how the algorithm performs in real experiments.

Submitted 19 January, 2023; v1 submitted 25 May, 2021; originally announced May 2021.

Comments: 6 pages, 5 figures, presented at the 30th IEEE International Conference on Robot and Human Interactive Communication

arXiv:2105.09809 [pdf, other]

Quantitative Physical Ergonomics Assessment of Teleoperation Interfaces

Authors: Soheil Gholami, Marta Lorenzini, Elena De Momi, Arash Ajoudani

Abstract: Human factors and ergonomics are the essential constituents of teleoperation interfaces, which can significantly affect the human operator's performance. Thus, a quantitative evaluation of these elements and the ability to establish reliable comparison bases for different teleoperation interfaces are the keys to select the most suitable one for a particular application. However, most of the works on teleoperation have so far focused on the stability analysis and the transparency improvement of these systems, and do not cover the important usability aspects. In this work, we propose a foundation to build a general framework for the analysis of human factors and ergonomics in employing diverse teleoperation interfaces. The proposed framework will go beyond the traditional subjective analyses of usability by complementing it with online measurements of the human body configurations. As a result, multiple quantitative metrics such as joints' usage, range of motion comfort, center of mass divergence, and posture comfort are introduced. To demonstrate the potential of the proposed framework, two different teleoperation interfaces are considered, and real-world experiments with eleven participants performing a simulated industrial remote pick-and-place task are conducted. The quantitative results of this analysis are provided, and compared with subjective questionnaires, illustrating the effectiveness of the proposed framework.

Submitted 20 May, 2021; originally announced May 2021.

Comments: 10 pages, 9 figures, submitted to IEEE Transactions on Human-Machine Systems

arXiv:2104.06510 [pdf]

Robotic needle steering in deformable tissues with extreme learning machines

Authors: Pedro Henrique Suruagy Perrusi, Anna Cazzaniga, Paul Baksic, Eleonora Tagliabue, Elena de Momi, Hadrien Courtecuisse

Abstract: Control strategies for robotic needle steering in soft tissues must account for complex interactions between the needle and the tissue to achieve accurate needle tip positioning. Recent findings show that a faster robotic command rate can improve control stability in realistic scenarios. This study proposes the use of Extreme Learning Machines to provide fast commands for robotic needle steering. A synthetic dataset based on the inverse finite element simulation control framework is used to train the model. Results show the model is capable of inferring commands 66% faster than the inverse simulation and reaches acceptable precision even on previously unseen trajectories.

Submitted 2 April, 2021; originally announced April 2021.

Journal ref: AUTOMED 2021, Jun 2021, Basel, Switzerland

arXiv:2104.03927 [pdf, other]

A transfer-learning approach for lesion detection in endoscopic images from the urinary tract

Authors: Jorge F. Lazo, Sara Moccia, Aldo Marzullo, Michele Catellani, Ottavio De Cobelli, Benoit Rosa, Michel de Mathelin, Elena De Momi

Abstract: Ureteroscopy and cystoscopy are the gold standard methods to identify and treat tumors along the urinary tract. It has been reported that during a normal procedure 10-20% of the lesions could be missed. In this work we study the implementation of 3 different Convolutional Neural Networks (CNNs), using a two-step training strategy, to classify images from the urinary tract with and without lesions. A total of 6,101 images from ureteroscopy and cystoscopy procedures were collected. The CNNs were trained and tested using transfer learning in a two-step fashion on 3 datasets. The datasets used were: 1) only ureteroscopy images, 2) only cystoscopy images and 3) the combination of both of them. For cystoscopy data, VGG performed better, obtaining an Area Under the ROC Curve (AUC) value of 0.846. In the cases of ureteroscopy and the combination of both datasets, ResNet50 achieved the best results with AUC values of 0.987 and 0.940. The use of a training dataset that comprises both domains results in generally better performance, but performing a second stage of transfer learning achieves comparable performance. There is no single model which performs better in all scenarios, but ResNet50 is the network that achieves the best performance in most of them. The obtained results open the opportunity for further investigation with a view to improving lesion detection in endoscopic images of the urinary system.

Submitted 8 April, 2021; originally announced April 2021.

arXiv:2104.01985 [pdf, ps, other]

Using spatial-temporal ensembles of convolutional neural networks for lumen segmentation in ureteroscopy

Authors: Jorge F. Lazo, Aldo Marzullo, Sara Moccia, Michele Catellani, Benoit Rosa, Michel de Mathelin, Elena De Momi

Abstract: Purpose: Ureteroscopy is an efficient endoscopic minimally invasive technique for the diagnosis and treatment of upper tract urothelial carcinoma (UTUC). During ureteroscopy, the automatic segmentation of the hollow lumen is of primary importance, since it indicates the path that the endoscope should follow. In order to obtain an accurate segmentation of the hollow lumen, this paper presents an automatic method based on Convolutional Neural Networks (CNNs). Methods: The proposed method is based on an ensemble of 4 parallel CNNs to simultaneously process single and multi-frame information. Of these, two architectures are taken as core-models, namely U-Net based on residual blocks ($m_1$) and Mask-RCNN ($m_2$), which are fed with single still-frames $I(t)$. The other two models ($M_1$, $M_2$) are modifications of the former ones consisting of the addition of a stage which makes use of 3D Convolutions to process temporal information. $M_1$, $M_2$ are fed with triplets of frames ($I(t-1)$, $I(t)$, $I(t+1)$) to produce the segmentation for $I(t)$. Results: The proposed method was evaluated using a custom dataset of 11 videos (2,673 frames) which were collected and manually annotated from 6 patients. We obtain a Dice similarity coefficient of 0.80, outperforming previous state-of-the-art methods. Conclusion: The obtained results show that spatial-temporal information can be effectively exploited by the ensemble model to improve hollow lumen segmentation in ureteroscopic images. The method is also effective in the presence of poor visibility, occasional bleeding, or specular reflections.

Submitted 5 April, 2021; originally announced April 2021.

arXiv:2103.00586 [pdf, other]

doi 10.1109/LRA.2021.3062308

A Kinematic Bottleneck Approach For Pose Regression of Flexible Surgical Instruments directly from Images

Authors: Luca Sestini, Benoit Rosa, Elena De Momi, Giancarlo Ferrigno, Nicolas Padoy

Abstract: 3-D pose estimation of instruments is a crucial step towards automatic scene understanding in robotic minimally invasive surgery. Although robotic systems can potentially directly provide joint values, this information is not commonly exploited inside the operating room, due to its possible unreliability, limited access and the time-consuming calibration required, especially for continuum robots. For this reason, standard approaches for 3-D pose estimation involve the use of external tracking systems. Recently, image-based methods have emerged as promising, non-invasive alternatives. While many image-based approaches in the literature have shown accurate results, they generally require either a complex iterative optimization for each processed image, making them unsuitable for real-time applications, or a large number of manually-annotated images for efficient learning. In this paper we propose a self-supervised image-based method, exploiting, at training time only, the imprecise kinematic information provided by the robot. In order to avoid introducing time-consuming manual annotations, the problem is formulated as an auto-encoder, smartly bottlenecked by the presence of a physical model of the robotic instruments and surgical camera, forcing a separation between image background and kinematic content. Validation of the method was performed on semi-synthetic, phantom and in-vivo datasets, obtained using a flexible robotized endoscope, showing promising results for real-time image-based 3-D pose estimation of surgical instruments.

Submitted 28 February, 2021; originally announced March 2021.

arXiv:2101.05021 [pdf, other]

A Lumen Segmentation Method in Ureteroscopy Images based on a Deep Residual U-Net architecture

Authors: Jorge F. Lazo, Aldo Marzullo, Sara Moccia, Michele Catellani, Benoit Rosa, Michel de Mathelin, Elena De Momi

Abstract: Ureteroscopy is becoming the first surgical treatment option for the majority of urinary affections. This procedure is performed using an endoscope which provides the surgeon with the visual information necessary to navigate inside the urinary tract. With a view to developing surgical assistance systems that could enhance the performance of the surgeon, the task of lumen segmentation is a fundamental part, since this is the visual reference which marks the path that the endoscope should follow. This is something that has not been analyzed in ureteroscopy data before. However, this task presents several challenges given the image quality and the conditions of ureteroscopy procedures themselves. In this paper, we study the implementation of a Deep Neural Network which exploits the advantage of residual units in an architecture based on U-Net. For the training of these networks, we analyze the use of two different color spaces: gray-scale and RGB data images. We found that training on gray-scale images gives the best results, obtaining mean values of Dice Score, Precision, and Recall of 0.73, 0.58, and 0.92 respectively. The results obtained show that the residual U-Net could be a suitable model for further development of a computer-aided system for navigation and guidance through the urinary system.

Submitted 13 January, 2021; originally announced January 2021.

arXiv:2012.14517 [pdf, other]

Comparison of different CNNs for breast tumor classification from ultrasound images

Authors: Jorge F. Lazo, Sara Moccia, Emanuele Frontoni, Elena De Momi

Abstract: Breast cancer is one of the deadliest cancers worldwide. Timely detection could reduce mortality rates. In the clinical routine, classifying benign and malignant tumors from ultrasound (US) imaging is a crucial but challenging task. An automated method which can deal with the variability of data is therefore needed. In this paper, we compared different Convolutional Neural Networks (CNNs) and transfer learning methods for the task of automated breast tumor classification. The architectures investigated in this study were VGG-16 and Inception V3. Two different training strategies were investigated: the first one was using pre-trained models as feature extractors and the second one was to fine-tune the pre-trained models. A total of 947 images were used: 587 corresponded to US images of benign tumors and 360 to malignant tumors. 678 images were used for the training and validation process, while 269 images were used for testing the models. Accuracy and Area Under the receiver operating characteristic Curve (AUC) were used as performance metrics. The best performance was obtained by fine-tuning VGG-16, with an accuracy of 0.919 and an AUC of 0.934. The obtained results open the opportunity for further investigation with a view to improving cancer detection.

Submitted 28 December, 2020; originally announced December 2020.

arXiv:1907.10993 [pdf, other]

Weakly Supervised Recognition of Surgical Gestures

Authors: Beatrice van Amsterdam, Hirenkumar Nakawala, Elena De Momi, Danail Stoyanov

Abstract: Kinematic trajectories recorded from surgical robots contain information about surgical gestures and potentially encode cues about surgeon's skill levels. Automatic segmentation of these trajectories into meaningful action units could help to develop new metrics for surgical skill assessment as well as to simplify surgical automation. State-of-the-art methods for action recognition relied on manual labelling of large datasets, which is time consuming and error prone. Unsupervised methods have been developed to overcome these limitations. However, they often rely on tedious parameter tuning and perform less well than supervised approaches, especially on data with high variability such as surgical trajectories. Hence, the potential of weak supervision could be to improve unsupervised learning while avoiding manual annotation of large datasets. In this paper, we used at a minimum one expert demonstration and its ground truth annotations to generate an appropriate initialization for a GMM-based algorithm for gesture recognition. We showed on real surgical demonstrations that the latter significantly outperforms standard task-agnostic initialization methods. We also demonstrated how to improve the recognition accuracy further by redefining the actions and optimising the inputs.

Submitted 25 July, 2019; originally announced July 2019.

Comments: 2019 IEEE International Conference on Robotics and Automation (ICRA)

MSC Class: 68 Computer science

arXiv:1804.03141 [pdf, other]

Automated pick-up of suturing needles for robotic surgical assistance

Authors: Claudia D'Ettorre, George Dwyer, Xiaofei Du, Francois Chadebecq, Francisco Vasconcelos, Elena De Momi, Danail Stoyanov

Abstract: Robot-assisted laparoscopic prostatectomy (RALP) is a treatment for prostate cancer that involves complete or nerve-sparing removal of prostate tissue that contains cancer. After removal the bladder neck is successively sutured directly with the urethra. The procedure is called urethrovesical anastomosis and is one of the most dexterity-demanding tasks during RALP. Two suturing instruments and a pair of needles are used in combination to perform a running stitch during urethrovesical anastomosis. While robotic instruments provide enhanced dexterity to perform the anastomosis, it is still highly challenging and difficult to learn. In this paper, we present a vision-guided needle grasping method for automatically grasping the needle that has been inserted into the patient prior to anastomosis. We aim to automatically grasp the suturing needle in a position that avoids hand-offs and immediately enables the start of suturing. The full grasping process can be broken down into: a needle detection algorithm; an approach phase where the surgical tool moves closer to the needle based on visual feedback; and a grasping phase through path planning based on observed surgical practice. Our experimental results show examples of successful autonomous grasping that has the potential to simplify and decrease the operational time in RALP by assisting a small component of urethrovesical anastomosis.

Submitted 9 April, 2018; originally announced April 2018.

arXiv:1706.07002 [pdf, other]

doi 10.1109/TBME.2018.2813015

Uncertainty-Aware Organ Classification for Surgical Data Science Applications in Laparoscopy

Authors: S. Moccia, S. J. Wirkert, H. Kenngott, A. S. Vemuri, M. Apitz, B. Mayer, E. De Momi, L. S. Mattos, L. Maier-Hein

Abstract: Objective: Surgical data science is evolving into a research field that aims to observe everything occurring within and around the treatment process to provide situation-aware data-driven assistance. In the context of endoscopic video analysis, the accurate classification of organs in the field of view of the camera proffers a technical challenge. Herein, we propose a new approach to anatomical structure classification and image tagging that features an intrinsic measure of confidence to estimate its own performance with high reliability and which can be applied to both RGB and multispectral imaging (MI) data. Methods: Organ recognition is performed using a superpixel classification strategy based on textural and reflectance information. Classification confidence is estimated by analyzing the dispersion of class probabilities. Assessment of the proposed technology is performed through a comprehensive in vivo study with seven pigs. Results: When applied to image tagging, mean accuracy in our experiments increased from 65% (RGB) and 80% (MI) to 90% (RGB) and 96% (MI) with the confidence measure. Conclusion: Results showed that the confidence measure had a significant influence on the classification accuracy, and MI data are better suited for anatomical structure labeling than RGB data. Significance: This work significantly enhances the state of the art in automatic labeling of endoscopic videos by introducing the use of the confidence metric, and by being the first study to use MI data for in vivo laparoscopic tissue classification. The data of our experiments will be released as the first in vivo MI dataset upon publication of this paper.

Submitted 19 October, 2018; v1 submitted 21 June, 2017; originally announced June 2017.

Comments: 7 pages, 6 images, 2 tables

Showing 1–38 of 38 results for author: de Momi, E