Showing 1–20 of 20 results for author: Panov, A I

  1. arXiv:2407.09287

    cs.AI

    Instruction Following with Goal-Conditioned Reinforcement Learning in Virtual Environments

    Authors: Zoya Volovikova, Alexey Skrynnik, Petr Kuderov, Aleksandr I. Panov

    Abstract: In this study, we address the issue of enabling an artificial intelligence agent to execute complex language instructions within virtual environments. In our framework, we assume that these instructions involve intricate linguistic structures and multiple interdependent tasks that must be navigated successfully to achieve the desired outcomes. To effectively manage these complexities, we propose a…

    Submitted 12 July, 2024; originally announced July 2024.

  2. arXiv:2311.06295

    physics.chem-ph cs.LG

    Gradual Optimization Learning for Conformational Energy Minimization

    Authors: Artem Tsypin, Leonid Ugadiarov, Kuzma Khrabrov, Alexander Telepov, Egor Rumiantsev, Alexey Skrynnik, Aleksandr I. Panov, Dmitry Vetrov, Elena Tutubalina, Artur Kadurin

    Abstract: Molecular conformation optimization is crucial to computer-aided drug discovery and materials design. Traditional energy minimization techniques rely on iterative optimization methods that use molecular forces calculated by a physical simulator (oracle) as anti-gradients. However, this is a computationally expensive approach that requires many interactions with a physical simulator. One way to acc…

    Submitted 12 March, 2024; v1 submitted 5 November, 2023; originally announced November 2023.

    Comments: Published as a conference paper at ICLR2024 (Poster)

  3. arXiv:2311.04640

    cs.LG cs.AI cs.CV

    Object-Centric Learning with Slot Mixture Module

    Authors: Daniil Kirilenko, Vitaliy Vorobyov, Alexey K. Kovalev, Aleksandr I. Panov

    Abstract: Object-centric architectures usually apply a differentiable module to the entire feature map to decompose it into sets of entity representations called slots. Some of these methods structurally resemble clustering algorithms, where the cluster's center in latent space serves as a slot representation. Slot Attention is an example of such a method, acting as a learnable analog of the soft k-means al…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: 17 pages, 6 figures

  4. arXiv:2310.17178

    cs.AI cs.LG cs.RO

    Graphical Object-Centric Actor-Critic

    Authors: Leonid Ugadiarov, Aleksandr I. Panov

    Abstract: There have recently been significant advances in the problem of unsupervised object-centric representation learning and its application to downstream tasks. The latest works support the argument that employing disentangled object representations in image-based object-centric reinforcement learning tasks facilitates policy learning. We propose a novel object-centric reinforcement learning algorithm…

    Submitted 26 October, 2023; originally announced October 2023.

  5. arXiv:2310.13391

    cs.LG cs.AI cs.NE

    Learning Successor Features with Distributed Hebbian Temporal Memory

    Authors: Evgenii Dzhivelikian, Petr Kuderov, Aleksandr I. Panov

    Abstract: This paper presents a novel approach to address the challenge of online temporal memory learning for decision-making under uncertainty in non-stationary, partially observable environments. The proposed algorithm, Distributed Hebbian Temporal Memory (DHTM), is based on factor graph formalism and a multicomponent neuron model. DHTM aims to capture sequential data relationships and make cumulative pr…

    Submitted 19 March, 2024; v1 submitted 20 October, 2023; originally announced October 2023.

    Comments: 20 pages, 9 figures

  6. arXiv:2306.09459

    cs.LG cs.AI

    Recurrent Action Transformer with Memory

    Authors: Alexey Staroverov, Egor Cherepanov, Dmitry Yudin, Alexey K. Kovalev, Aleksandr I. Panov

    Abstract: Recently, the use of transformers in offline reinforcement learning has become a rapidly developing area. This is due to their ability to treat the agent's trajectory in the environment as a sequence, thereby reducing the policy learning problem to sequence modeling. In environments where the agent's decisions depend on past events, it is essential to capture both the event itself and the decision…

    Submitted 27 March, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: 15 pages, 11 figures

  7. arXiv:2301.10067

    cs.LG cs.AI

    Intrinsic Motivation in Model-based Reinforcement Learning: A Brief Review

    Authors: Artem Latyshev, Aleksandr I. Panov

    Abstract: The reinforcement learning research area contains a wide range of methods for solving the problems of intelligent agent control. Despite the progress that has been made, the task of creating a highly autonomous agent is still a significant challenge. One potential solution to this problem is intrinsic motivation, a concept derived from developmental psychology. This review considers the existing m…

    Submitted 24 January, 2023; originally announced January 2023.

    Comments: 13 pages, 7 figures

  8. arXiv:2212.14649

    cs.CV cs.AI

    HPointLoc: Point-based Indoor Place Recognition using Synthetic RGB-D Images

    Authors: Dmitry Yudin, Yaroslav Solomentsev, Ruslan Musaev, Aleksei Staroverov, Aleksandr I. Panov

    Abstract: We present a novel dataset named as HPointLoc, specially designed for exploring capabilities of visual place recognition in indoor environment and loop detection in simultaneous localization and mapping. The loop detection sub-task is especially relevant when a robot with an on-board RGB-D camera can drive past the same place ("Point") at different angles. The dataset is based on the popular Habi…

    Submitted 30 December, 2022; originally announced December 2022.

    Comments: Accepted for publishing in proceedings of the 29th International Conference on Neural Information Processing (ICONIP 2022)

  9. arXiv:2206.10944

    cs.LG cs.AI cs.MA

    POGEMA: Partially Observable Grid Environment for Multiple Agents

    Authors: Alexey Skrynnik, Anton Andreychuk, Konstantin Yakovlev, Aleksandr I. Panov

    Abstract: We introduce POGEMA (https://github.com/AIRI-Institute/pogema) a sandbox for challenging partially observable multi-agent pathfinding (PO-MAPF) problems. This is a grid-based environment that was specifically designed to be a flexible, tunable and scalable benchmark. It can be tailored to a variety of PO-MAPF, which can serve as an excellent testing ground for planning and learning methods, and t…

    Submitted 22 June, 2022; originally announced June 2022.

    Comments: 7 pages, 7 figures

  10. arXiv:2206.00142

    cs.LG cs.AI cs.CL

    IGLU Gridworld: Simple and Fast Environment for Embodied Dialog Agents

    Authors: Artem Zholus, Alexey Skrynnik, Shrestha Mohanty, Zoya Volovikova, Julia Kiseleva, Artur Szlam, Marc-Alexandre Coté, Aleksandr I. Panov

    Abstract: We present the IGLU Gridworld: a reinforcement learning environment for building and evaluating language conditioned embodied agents in a scalable way. The environment features visual agent embodiment, interactive learning through collaboration, language conditioned RL, and combinatorically hard task (3d blocks building) space.

    Submitted 31 May, 2022; originally announced June 2022.

  11. arXiv:2110.13241

    cs.LG

    Multitask Adaptation by Retrospective Exploration with Learned World Models

    Authors: Artem Zholus, Aleksandr I. Panov

    Abstract: Model-based reinforcement learning (MBRL) allows solving complex tasks in a sample-efficient manner. However, no information is reused between the tasks. In this work, we propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from continuously growing task-agnostic storage. The model is trained to maximize the expected agent's performance by sel…

    Submitted 25 October, 2021; originally announced October 2021.

  12. arXiv:2109.10173

    cs.LG cs.AI

    Long-Term Exploration in Persistent MDPs

    Authors: Leonid Ugadiarov, Alexey Skrynnik, Aleksandr I. Panov

    Abstract: Exploration is an essential part of reinforcement learning, which restricts the quality of learned policy. Hard-exploration environments are defined by huge state space and sparse rewards. In such conditions, an exhaustive exploration of the environment is often impossible, and the successful training of an agent requires a lot of interaction steps. In this paper, we propose an exploration method…

    Submitted 21 September, 2021; originally announced September 2021.

    Comments: This is a preprint of the paper accepted to MICAI 2021. It contains 13 pages and 6 figures

  13. arXiv:2109.09512

    cs.AI cs.RO

    Landmark Policy Optimization for Object Navigation Task

    Authors: Aleksey Staroverov, Aleksandr I. Panov

    Abstract: This work studies object goal navigation task, which involves navigating to the closest object related to the given semantic category in unseen environments. Recent works have shown significant achievements both in the end-to-end Reinforcement Learning approach and modular systems, but need a big step forward to be robust and optimal. We propose a hierarchical method that incorporates standard tas…

    Submitted 17 September, 2021; originally announced September 2021.

  14. arXiv:2108.06148

    cs.LG cs.AI

    Q-Mixing Network for Multi-Agent Pathfinding in Partially Observable Grid Environments

    Authors: Vasilii Davydov, Alexey Skrynnik, Konstantin Yakovlev, Aleksandr I. Panov

    Abstract: In this paper, we consider the problem of multi-agent navigation in partially observable grid environments. This problem is challenging for centralized planning approaches as they, typically, rely on the full knowledge of the environment. We suggest utilizing the reinforcement learning approach when the agents, first, learn the policies that map observations to actions and then follow these polici…

    Submitted 13 August, 2021; originally announced August 2021.

    Comments: This is a preprint of the paper accepted to RCAI 2021. It contains 11 pages and 5 figures

  15. arXiv:2006.09950

    cs.LG cs.AI

    Delta Schema Network in Model-based Reinforcement Learning

    Authors: Andrey Gorodetskiy, Alexandra Shlychkova, Aleksandr I. Panov

    Abstract: This work is devoted to unresolved problems of Artificial General Intelligence - the inefficiency of transfer learning. One of the mechanisms that are used to solve this problem in the area of reinforcement learning is a model-based approach. In the paper we are expanding the schema networks method which allows to extract the logical relationships between objects and actions from the environment d…

    Submitted 8 July, 2020; v1 submitted 17 June, 2020; originally announced June 2020.

    Comments: Published at the AGI 2020 conference

  16. arXiv:2006.09939

    cs.LG cs.AI

    Forgetful Experience Replay in Hierarchical Reinforcement Learning from Demonstrations

    Authors: Alexey Skrynnik, Aleksey Staroverov, Ermek Aitygulov, Kirill Aksenov, Vasilii Davydov, Aleksandr I. Panov

    Abstract: Currently, deep reinforcement learning (RL) shows impressive results in complex gaming and robotic environments. Often these results are achieved at the expense of huge computational costs and require an incredible number of episodes of interaction between the agent and the environment. There are two main approaches to improving the sample efficiency of reinforcement learning methods - using hiera…

    Submitted 17 June, 2020; originally announced June 2020.

  17. arXiv:1912.08664

    cs.AI

    Hierarchical Deep Q-Network from Imperfect Demonstrations in Minecraft

    Authors: Alexey Skrynnik, Aleksey Staroverov, Ermek Aitygulov, Kirill Aksenov, Vasilii Davydov, Aleksandr I. Panov

    Abstract: We present Hierarchical Deep Q-Network (HDQfD) that took first place in the MineRL competition. HDQfD works on imperfect demonstrations and utilizes the hierarchical structure of expert trajectories. We introduce the procedure of extracting an effective sequence of meta-actions and subgoals from demonstration data. We present a structured task-dependent replay buffer and adaptive prioritizing tech…

    Submitted 13 July, 2020; v1 submitted 18 December, 2019; originally announced December 2019.

  18. arXiv:1806.05292

    cs.AI

    Automatic formation of the structure of abstract machines in hierarchical reinforcement learning with state clustering

    Authors: Aleksandr I. Panov, Aleksey Skrynnik

    Abstract: We introduce a new approach to hierarchy formation and task decomposition in hierarchical reinforcement learning. Our method is based on the Hierarchy Of Abstract Machines (HAM) framework because HAM approach is able to design efficient controllers that will realize specific behaviors in real robots. The key to our algorithm is the introduction of the internal or "mental" environment in which the…

    Submitted 13 June, 2018; originally announced June 2018.

  19. arXiv:1607.08181

    cs.AI

    Psychologically inspired planning method for smart relocation task

    Authors: Aleksandr I. Panov, Konstantin Yakovlev

    Abstract: Behavior planning is known to be one of the basic cognitive functions, which is essential for any cognitive architecture of any control system used in robotics. At the same time most of the widespread planning algorithms employed in those systems are developed using only approaches and models of Artificial Intelligence and don't take into account numerous results of cognitive experiments. As a res…

    Submitted 27 July, 2016; originally announced July 2016.

    Comments: As submitted to the 7th International Conference on Biologically Inspired Cognitive Architectures (BICA 2016), New-York, USA, July 16-19 2016

  20. Behavior and path planning for the coalition of cognitive robots in smart relocation tasks

    Authors: Aleksandr I. Panov, Konstantin Yakovlev

    Abstract: In this paper we outline the approach of solving special type of navigation tasks for robotic systems, when a coalition of robots (agents) acts in the 2D environment, which can be modified by the actions, and share the same goal location. The latter is originally unreachable for some members of the coalition, but the common task still can be accomplished as the agents can assist each other (e.g. b…

    Submitted 27 July, 2016; originally announced July 2016.

    Comments: As submitted to the 4th International Conference on Robot Intelligence Technology and Applications (RiTA-2015), Bucheon, Korea, December 14-16, 2015