
Policy Learning for Active Target Tracking over Continuous SE(3) Trajectories

Authors: Pengzhi Yang, Shumon Koga, Arash Asgharivaskasi, Nikolay Atanasov

Abstract: This paper proposes a novel model-based policy gradient algorithm for tracking dynamic targets using a mobile robot, equipped with an onboard sensor with limited field of view. The task is to obtain a continuous control policy for the mobile robot to collect sensor measurements that reduce uncertainty in the target states, measured by the target distribution entropy. We design a neural network control policy with the robot $SE(3)$ pose and the mean vector and information matrix of the joint target distribution as inputs and attention layers to handle variable numbers of targets. We also derive the gradient of the target entropy with respect to the network parameters explicitly, allowing efficient model-based policy gradient optimization.

Submitted 16 May, 2023; v1 submitted 2 December, 2022; originally announced December 2022.

Comments: 12 pages, 2 figures, submitted to Learning for Dynamics and Control Conference

arXiv:2209.12427

Learning Continuous Control Policies for Information-Theoretic Active Perception

Authors: Pengzhi Yang, Yuhan Liu, Shumon Koga, Arash Asgharivaskasi, Nikolay Atanasov

Abstract: This paper proposes a method for learning continuous control policies for active landmark localization and exploration using an information-theoretic cost. We consider a mobile robot detecting landmarks within a limited sensing range, and tackle the problem of learning a control policy that maximizes the mutual information between the landmark states and the sensor observations. We employ a Kalman filter to convert the partially observable problem in the landmark state to a Markov decision process (MDP), a differentiable field of view to shape the reward, and an attention-based neural network to represent the control policy. The approach is further unified with active volumetric mapping to promote exploration in addition to landmark localization. The performance is demonstrated in several simulated landmark localization tasks in comparison with benchmark methods.

Submitted 16 May, 2023; v1 submitted 26 September, 2022; originally announced September 2022.

Comments: 7 pages, 6 figures, submitted to International Conference on Robotics and Automation (ICRA) 2023

arXiv:2204.07623

Active Mapping via Gradient Ascent Optimization of Shannon Mutual Information over Continuous SE(3) Trajectories

Authors: Arash Asgharivaskasi, Shumon Koga, Nikolay Atanasov

Abstract: The problem of active mapping aims to plan an informative sequence of sensing views given a limited budget such as distance traveled. This paper considers active occupancy grid mapping using a range sensor, such as a LiDAR or depth camera. State-of-the-art methods optimize information-theoretic measures relating the occupancy grid probabilities with the range sensor measurements. The non-smooth nature of ray-tracing within a grid representation makes the objective function non-differentiable, forcing existing methods to search over a discrete space of candidate trajectories. This work proposes a differentiable approximation of the Shannon mutual information between a grid map and ray-based observations that enables gradient ascent optimization in the continuous space of SE(3) sensor poses. Our gradient-based formulation leads to more informative sensing trajectories, while avoiding occlusions and collisions. The proposed method is demonstrated in simulated and real-world experiments in 2-D and 3-D environments.

Submitted 15 April, 2022; originally announced April 2022.

arXiv:2110.07546

Active SLAM over Continuous Trajectory and Control: A Covariance-Feedback Approach

Authors: Shumon Koga, Arash Asgharivaskasi, Nikolay Atanasov

Abstract: This paper proposes a novel active Simultaneous Localization and Mapping (SLAM) method with continuous trajectory optimization over a stochastic robot dynamics model. The problem is formalized as a stochastic optimal control over the continuous robot kinematic model to minimize a cost function that involves the covariance matrix of the landmark states. We tackle the problem by separately obtaining an open-loop control sequence subject to deterministic dynamics by iterative Covariance Regulation (iCR) and a closed-loop feedback control under stochastic robot and covariance dynamics by Linear Quadratic Regulator (LQR). The proposed optimization method captures the coupling between localization and mapping in predicting uncertainty evolution and synthesizes highly informative sensing trajectories. We demonstrate its performance in active landmark-based SLAM using relative-position measurements with a limited field of view.

Submitted 14 October, 2021; originally announced October 2021.

Comments: 8 pages, 4 figures, submitted to American Control Conference 2022

arXiv:2103.05819

Active Exploration and Mapping via Iterative Covariance Regulation over Continuous $SE(3)$ Trajectories

Authors: Shumon Koga, Arash Asgharivaskasi, Nikolay Atanasov

Abstract: This paper develops iterative Covariance Regulation (iCR), a novel method for active exploration and mapping for a mobile robot equipped with on-board sensors. The problem is posed as optimal control over the $SE(3)$ pose kinematics of the robot to minimize the differential entropy of the map conditioned on the potential sensor observations. We introduce a differentiable field of view formulation, and derive iCR via the gradient descent method to iteratively update an open-loop control sequence in continuous space so that the covariance of the map estimate is minimized. We demonstrate autonomous exploration and uncertainty reduction in simulated occupancy grid environments.

Submitted 9 March, 2021; originally announced March 2021.

Comments: 8 pages, 5 figures, submitted to 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Showing 1–5 of 5 results for author: Koga, S