Search | arXiv e-print repository

arXiv:1909.09705

A Layered Architecture for Active Perception: Image Classification using Deep Reinforcement Learning

Authors: Hossein K. Mousavi, Guangyi Liu, Weihang Yuan, Martin Takáč, Héctor Muñoz-Avila, Nader Motee

Abstract: We propose a planning and perception mechanism for a robot (agent) that can only partially observe the underlying environment, in order to solve an image classification problem. A three-layer architecture is suggested that consists of a meta-layer that decides the intermediate goals, an action-layer that selects local actions as the agent navigates towards a goal, and a classification-layer that evaluates the reward and makes a prediction. We design and implement these layers using deep reinforcement learning. A generalized policy gradient algorithm is utilized to learn the parameters of these layers to maximize the expected reward. Our proposed methodology is tested on the MNIST dataset of handwritten digits, which provides us with a level of explainability while interpreting the agent's intermediate goals and course of action.

Submitted 20 September, 2019; originally announced September 2019.

Comments: Submitted to ICRA-2020

arXiv:1908.01421

Explicit Characterization of Performance of a Class of Networked Linear Control Systems

Authors: Hossein K. Mousavi, Nader Motee

Abstract: We show that the steady-state variance as a performance measure for a class of networked linear control systems is expressible as the summation of a rational function over the Laplacian eigenvalues of the network graph. Moreover, we characterize the role of connectivity thresholds for the feedback (and observer) gain design of these networks. We use our framework to derive bounds and scaling laws for the performance of the dynamical network. Our approach generalizes and unifies the previous results on the performance measure of these networks for the case of arbitrary nodal dynamics. We extend our methodology to the case of decentralized observer-based output feedback as well as a class of composite networks. Numerous examples support our theoretical contributions.

Submitted 4 August, 2019; originally announced August 2019.

Comments: detailed version of a paper of the same name to be submitted to IEEE TCNS

arXiv:1905.04835

Multi-Agent Image Classification via Reinforcement Learning

Authors: Hossein K. Mousavi, Mohammadreza Nazari, Martin Takáč, Nader Motee

Abstract: We investigate a classification problem using multiple mobile agents capable of collecting (partial) pose-dependent observations of an unknown environment. The objective is to classify an image over a finite time horizon. We propose a network architecture for how agents should form a local belief, take local actions, and extract relevant features from their raw partial observations. Agents are allowed to exchange information with their neighboring agents to update their own beliefs. It is shown how reinforcement learning techniques can be utilized to achieve decentralized implementation of the classification problem by running a decentralized consensus protocol. Our experimental results on the MNIST handwritten digit dataset demonstrate the effectiveness of our proposed framework.

Submitted 6 August, 2019; v1 submitted 12 May, 2019; originally announced May 2019.

Comments: Preprint of the paper to be published in IROS'19 proceedings

arXiv:1902.01026

Estimation with Fast Landmark Selection in Robot Visual Navigation

Authors: Hossein K. Mousavi, Nader Motee

Abstract: We consider visual feature selection to improve the estimation quality required for the accurate navigation of a robot. We build upon a key property that asserts: contributions of trackable features (landmarks) appear linearly in the information matrix of the corresponding estimation problem. We utilize standard models for motion and vision system using a camera to formulate the feature selection problem over moving finite time horizons. A scalable randomized sampling algorithm is proposed to select more informative features (and ignore the rest) to achieve a superior position estimation quality. We provide probabilistic performance guarantees for our method. The time-complexity of our feature selection algorithm is linear in the number of candidate features, which is practically plausible and outperforms existing greedy methods that scale quadratically with the number of candidate features. Our numerical simulations confirm that the execution time of our proposed method is lower than that of the greedy method, while the resulting estimation quality is very close to that of the greedy method.

Submitted 3 February, 2019; originally announced February 2019.

Showing 1–4 of 4 results for author: Mousavi, H K