DEAR: Disentangled Environment and Agent Representations for Reinforcement Learning without Reconstruction

Authors: Ameya Pore, Riccardo Muradore, Diego Dall'Alba

Abstract: Reinforcement Learning (RL) algorithms can learn robotic control tasks from visual observations, but they often require a large amount of data, especially when the visual scene is complex and unstructured. In this paper, we explore how the agent's knowledge of its shape can improve the sample efficiency of visual RL methods. We propose a novel method, Disentangled Environment and Agent Representations (DEAR), that uses the segmentation mask of the agent as supervision to learn disentangled representations of the environment and the agent through feature separation constraints. Unlike previous approaches, DEAR does not require reconstruction of visual observations. These representations are then used as an auxiliary loss to the RL objective, encouraging the agent to focus on the relevant features of the environment. We evaluate DEAR on two challenging benchmarks: Distracting DeepMind control suite and Franka Kitchen manipulation tasks. Our findings demonstrate that DEAR surpasses state-of-the-art methods in sample efficiency, achieving comparable or superior performance with reduced parameters. Our results indicate that integrating agent knowledge into visual RL methods has the potential to enhance their learning efficiency and robustness.

Submitted 30 June, 2024; originally announced July 2024.

Comments: 7 pages, 8 figures, 2 tables. Accepted at 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)

arXiv:2305.04027 [pdf, other]

doi 10.1109/TRO.2023.3269384

Autonomous Navigation for Robot-assisted Intraluminal and Endovascular Procedures: A Systematic Review

Authors: Ameya Pore, Zhen Li, Diego Dall'Alba, Albert Hernansanz, Elena De Momi, Arianna Menciassi, Alicia Casals, Jenny Denkelman, Paolo Fiorini, Emmanuel Vander Poorten

Abstract: Increased demand for less invasive procedures has accelerated the adoption of Intraluminal Procedures (IP) and Endovascular Interventions (EI) performed through body lumens and vessels. As navigation through lumens and vessels is quite complex, interest grows to establish autonomous navigation techniques for IP and EI for reaching the target area. Current research efforts are directed toward increasing the Level of Autonomy (LoA) during the navigation phase. One key ingredient for autonomous navigation is Motion Planning (MP) techniques. This paper provides an overview of MP techniques categorizing them based on LoA. Our analysis investigates advances for the different clinical scenarios. Through a systematic literature analysis using the PRISMA method, the study summarizes relevant works and investigates the clinical aim, LoA, adopted MP techniques, and validation types. We identify the limitations of the corresponding MP methods and provide directions to improve the robustness of the algorithms in dynamic intraluminal environments. MP for IP and EI can be classified into four subgroups: node, sampling, optimization, and learning-based techniques, with a notable rise in learning-based approaches in recent years. One of the review's contributions is the identification of the limiting factors in IP and EI robotic systems hindering higher levels of autonomous navigation. In the future, navigation is bound to become more autonomous, placing the clinician in a supervisory position to improve control precision and reduce workload.

Submitted 6 May, 2023; originally announced May 2023.

Comments: 31 pages, 7 figures, 3 tables; Accepted in IEEE Transactions on Robotics

arXiv:2303.03207 [pdf, other]

Constrained Reinforcement Learning and Formal Verification for Safe Colonoscopy Navigation

Authors: Davide Corsi, Luca Marzari, Ameya Pore, Alessandro Farinelli, Alicia Casals, Paolo Fiorini, Diego Dall'Alba

Abstract: The field of robotic Flexible Endoscopes (FEs) has progressed significantly, offering a promising solution to reduce patient discomfort. However, the limited autonomy of most robotic FEs results in non-intuitive and challenging manoeuvres, constraining their application in clinical settings. While previous studies have employed lumen tracking for autonomous navigation, they fail to adapt to the presence of obstructions and sharp turns when the endoscope faces the colon wall. In this work, we propose a Deep Reinforcement Learning (DRL)-based navigation strategy that eliminates the need for lumen tracking. However, the use of DRL methods poses safety risks as they do not account for potential hazards associated with the actions taken. To ensure safety, we exploit a Constrained Reinforcement Learning (CRL) method to restrict the policy in a predefined safety regime. Moreover, we present a model selection strategy that utilises Formal Verification (FV) to choose a policy that is entirely safe before deployment. We validate our approach in a virtual colonoscopy environment and report that out of the 300 trained policies, we could identify three policies that are entirely safe. Our work demonstrates that CRL, combined with model selection through FV, can improve the robustness and safety of robotic behaviour in surgical applications.

Submitted 16 August, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

Comments: Accepted in the IEEE International Conference on Intelligent Robots and Systems (IROS), 2023. [Corsi, Marzari and Pore contributed equally]

arXiv:2206.15086 [pdf, other]

Colonoscopy Navigation using End-to-End Deep Visuomotor Control: A User Study

Authors: Ameya Pore, Martina Finocchiaro, Diego Dall'Alba, Albert Hernansanz, Gastone Ciuti, Alberto Arezzo, Arianna Menciassi, Alicia Casals, Paolo Fiorini

Abstract: Flexible endoscopes for colonoscopy present several limitations due to their inherent complexity, resulting in patient discomfort and lack of intuitiveness for clinicians. Robotic devices together with autonomous control represent a viable solution to reduce the workload of endoscopists and the training time while improving the overall procedure outcome. Prior works on autonomous endoscope control use heuristic policies that limit their generalisation to the unstructured and highly deformable colon environment and require frequent human intervention. This work proposes an image-based control of the endoscope using Deep Reinforcement Learning, called Deep Visuomotor Control (DVC), to exhibit adaptive behaviour in convoluted sections of the colon tract. DVC learns a mapping between the endoscopic images and the control signal of the endoscope. A first user study of 20 expert gastrointestinal endoscopists was carried out to compare their navigation performance with DVC policies using a realistic virtual simulator. The results indicate that DVC shows equivalent performance on several assessment parameters while being safer. Moreover, a second user study with 20 novice participants was performed to demonstrate easier human supervision compared to a state-of-the-art heuristic control policy. Seamless supervision of colonoscopy procedures would enable interventionists to focus on the medical decision rather than on the control problem of the endoscope.

Submitted 30 June, 2022; originally announced June 2022.

Comments: Accepted in IROS2022

arXiv:2110.00336 [pdf, other]

Learning from Demonstrations for Autonomous Soft-tissue Retraction

Authors: Ameya Pore, Eleonora Tagliabue, Marco Piccinelli, Diego Dall'Alba, Alicia Casals, Paolo Fiorini

Abstract: The current research focus in Robot-Assisted Minimally Invasive Surgery (RAMIS) is directed towards increasing the level of robot autonomy, to place surgeons in a supervisory position. Although Learning from Demonstrations (LfD) approaches are among the preferred ways for an autonomous surgical system to learn expert gestures, they require a high number of demonstrations and show poor generalization to the variable conditions of the surgical environment. In this work, we propose an LfD methodology based on Generative Adversarial Imitation Learning (GAIL) that is built on a Deep Reinforcement Learning (DRL) setting. GAIL combines generative adversarial networks to learn the distribution of expert trajectories with a DRL setting to ensure generalisation of trajectories providing human-like behaviour. We consider automation of tissue retraction, a common RAMIS task that involves soft tissues manipulation to expose a region of interest. In our proposed methodology, a small set of expert trajectories can be acquired through the da Vinci Research Kit (dVRK) and used to train the proposed LfD method inside a simulated environment. Results indicate that our methodology can accomplish the tissue retraction task with human-like behaviour while being more sample-efficient than the baseline DRL method. Towards the end, we show that the learnt policies can be successfully transferred to the real robotic platform and deployed for soft tissue retraction on a synthetic phantom.

Submitted 1 October, 2021; originally announced October 2021.

Comments: Accepted in IEEE International Symposium of Medical Robotics (ISMR 2021)

arXiv:2109.02323 [pdf, other]

Safe Reinforcement Learning using Formal Verification for Tissue Retraction in Autonomous Robotic-Assisted Surgery

Authors: Ameya Pore, Davide Corsi, Enrico Marchesini, Diego Dall'Alba, Alicia Casals, Alessandro Farinelli, Paolo Fiorini

Abstract: Deep Reinforcement Learning (DRL) is a viable solution for automating repetitive surgical subtasks due to its ability to learn complex behaviours in a dynamic environment. This task automation could lead to reduced surgeon's cognitive workload, increased precision in critical aspects of the surgery, and fewer patient-related complications. However, current DRL methods do not guarantee any safety criteria as they maximise cumulative rewards without considering the risks associated with the actions performed. Due to this limitation, the application of DRL in the safety-critical paradigm of robot-assisted Minimally Invasive Surgery (MIS) has been constrained. In this work, we introduce a Safe-DRL framework that incorporates safety constraints for the automation of surgical subtasks via DRL training. We validate our approach in a virtual scene that replicates a tissue retraction task commonly occurring in multiple phases of an MIS. Furthermore, to evaluate the safe behaviour of the robotic arms, we formulate a formal verification tool for DRL methods that provides the probability of unsafe configurations. Our results indicate that a formal analysis guarantees safety with high confidence such that the robotic instruments operate within the safe workspace and avoid hazardous interaction with other anatomical structures.

Submitted 6 September, 2021; originally announced September 2021.

Comments: 7 pages, 6 figures

arXiv:2104.01609 [pdf, ps, other]

doi 10.1007/s12036-021-09706-6

AstroSat Science Support Cell

Authors: J. Roy, Md S. Alam, C. Balamurugan, D. Bhattacharya, P. Bhoye, G. C. Dewangan, M. Hulsurkar, N. Mali, R. Misra, A. Pore

Abstract: AstroSat is India's first dedicated multi-wavelength space observatory launched by the Indian Space Research Organisation (ISRO) on 28 September 2015. After launch, the AstroSat Science Support Cell (ASSC) was set up as a joint venture of ISRO and the Inter-University Centre for Astronomy and Astrophysics (IUCAA) with the primary purpose of facilitating the use of AstroSat, both for making observing proposals and for utilising archival data. The ASSC organises meetings, workshops and webinars to train users in these activities, runs a help desk to address user queries, provides utility tools and disseminates analysis software through a consolidated web portal. It also maintains the AstroSat Proposal Processing System (APPS) which is deployed at ISSDC, a software platform central to the workflow management of AstroSat operations. This paper illustrates the various aspects of ASSC functionality.

Submitted 4 April, 2021; originally announced April 2021.

Comments: Accepted for publication in the Journal of Astrophysics & Astronomy (JOAA)

arXiv:2102.04022 [pdf, other]

Towards Hierarchical Task Decomposition using Deep Reinforcement Learning for Pick and Place Subtasks

Authors: Luca Marzari, Ameya Pore, Diego Dall'Alba, Gerardo Aragon-Camarasa, Alessandro Farinelli, Paolo Fiorini

Abstract: Deep Reinforcement Learning (DRL) is emerging as a promising approach to generate adaptive behaviors for robotic platforms. However, a major drawback of using DRL is the data-hungry training regime that requires millions of trial and error attempts, which is impractical when running experiments on robotic systems. Learning from Demonstrations (LfD) has been introduced to solve this issue by cloning the behavior of expert demonstrations. However, LfD requires a large number of demonstrations that are difficult to be acquired since dedicated complex setups are required. To overcome these limitations, we propose a multi-subtask reinforcement learning methodology where complex pick and place tasks can be decomposed into low-level subtasks. These subtasks are parametrized as expert networks and learned via DRL methods. Trained subtasks are then combined by a high-level choreographer to accomplish the intended pick and place task considering different initial configurations. As a testbed, we use a pick and place robotic simulator to demonstrate our methodology and show that our method outperforms a benchmark methodology based on LfD in terms of sample-efficiency. We transfer the learned policy to the real robotic system and demonstrate robust grasping using various geometric-shaped objects.

Submitted 19 October, 2021; v1 submitted 8 February, 2021; originally announced February 2021.

Comments: This work has been accepted to the IEEE International Conference on Advanced Robotics (ICAR) 2021

arXiv:2011.01880 [pdf, other]

Intrinsic Robotic Introspection: Learning Internal States From Neuron Activations

Authors: Nikos Pitsillos, Ameya Pore, Bjorn Sand Jensen, Gerardo Aragon-Camarasa

Abstract: We present an introspective framework inspired by the process of how humans perform introspection. Our working assumption is that neural network activations encode information, and building internal states from these activations can improve the performance of an actor-critic model. We perform experiments where we first train a Variational Autoencoder model to reconstruct the activations of a feature extraction network and use the latent space to improve the performance of an actor-critic when deciding which low-level robotic behaviour to execute. We show that internal states reduce the number of episodes needed by about 1300 episodes while training an actor-critic, denoting faster convergence to get a high success value while completing a robotic task.

Submitted 3 June, 2021; v1 submitted 3 November, 2020; originally announced November 2020.

Comments: Paper accepted at the International Conference on Development and Learning (ICDL) 2021

arXiv:2001.07973 [pdf, other]

On Simple Reactive Neural Networks for Behaviour-Based Reinforcement Learning

Authors: Ameya Pore, Gerardo Aragon-Camarasa

Abstract: We present a behaviour-based reinforcement learning approach, inspired by Brooks' subsumption architecture, in which simple fully connected networks are trained as reactive behaviours. Our working assumption is that a pick and place robotic task can be simplified by leveraging domain knowledge of a robotics developer to decompose and train such reactive behaviours; namely, approach, grasp, and retract. Then the robot autonomously learns how to combine them via an Actor-Critic architecture. The Actor-Critic policy is to determine the activation and inhibition mechanisms of the reactive behaviours in a particular temporal sequence. We validate our approach in a simulated robot environment where the task is picking a block and taking it to a target position while orienting the gripper from a top grasp. The latter represents an extra degree-of-freedom to which current end-to-end reinforcement learning approaches fail to generalise. Our findings suggest that robotic learning can be more effective if each behaviour is learnt in isolation and the behaviours are then combined to accomplish the task. That is, our approach learns the pick and place task in 8,000 episodes, which represents a drastic reduction in the number of training episodes required by an end-to-end approach and the existing state-of-the-art algorithms.

Submitted 29 May, 2020; v1 submitted 22 January, 2020; originally announced January 2020.

Comments: 6 pages, 5 figures. Accepted for publication to the International Conference on Robotics and Automation (ICRA 2020)
