
Showing 1–14 of 14 results for author: Kurin, V

  1. arXiv:2209.11142  [pdf, other]

    cs.LG cs.AI stat.ML

    A Generalist Neural Algorithmic Learner

    Authors: Borja Ibarz, Vitaly Kurin, George Papamakarios, Kyriacos Nikiforou, Mehdi Bennani, Róbert Csordás, Andrew Dudzik, Matko Bošnjak, Alex Vitvitskyi, Yulia Rubanova, Andreea Deac, Beatrice Bevilacqua, Yaroslav Ganin, Charles Blundell, Petar Veličković

    Abstract: The cornerstone of neural algorithmic reasoning is the ability to solve algorithmic tasks, especially in a way that generalises out of distribution. While recent years have seen a surge in methodological improvements in this area, they mostly focused on building specialist models. Specialist models are capable of learning to neurally execute either only one algorithm or a collection of algorithms…

    Submitted 3 December, 2022; v1 submitted 22 September, 2022; originally announced September 2022.

    Comments: To appear at LoG 2022 (Spotlight talk). 23 pages, 11 figures

  2. arXiv:2203.11889  [pdf, other]

    cs.LG cs.AI cs.NE cs.SC stat.ML

    Insights From the NeurIPS 2021 NetHack Challenge

    Authors: Eric Hambro, Sharada Mohanty, Dmitrii Babaev, Minwoo Byeon, Dipam Chakraborty, Edward Grefenstette, Minqi Jiang, Daejin Jo, Anssi Kanervisto, Jongmin Kim, Sungwoong Kim, Robert Kirk, Vitaly Kurin, Heinrich Küttler, Taehwon Kwon, Donghoon Lee, Vegard Mella, Nantas Nardelli, Ivan Nazarov, Nikita Ovsov, Jack Parker-Holder, Roberta Raileanu, Karolis Ramanauskas, Tim Rocktäschel, Danielle Rothermel, et al. (4 additional authors not shown)

    Abstract: In this report, we summarize the takeaways from the first NeurIPS 2021 NetHack Challenge. Participants were tasked with developing a program or agent that can win (i.e., 'ascend' in) the popular dungeon-crawler game of NetHack by interacting with the NetHack Learning Environment (NLE), a scalable, procedurally generated, and challenging Gym environment for reinforcement learning (RL). The challeng…

    Submitted 22 March, 2022; originally announced March 2022.

    Comments: Under review at PMLR for the NeurIPS 2021 Competition Workshop Track, 10 pages + 10 in appendices

  3. arXiv:2202.00079  [pdf, other]

    cs.LG cs.AI

    You May Not Need Ratio Clipping in PPO

    Authors: Mingfei Sun, Vitaly Kurin, Guoqing Liu, Sam Devlin, Tao Qin, Katja Hofmann, Shimon Whiteson

    Abstract: Proximal Policy Optimization (PPO) methods learn a policy by iteratively performing multiple mini-batch optimization epochs of a surrogate objective with one set of sampled data. Ratio clipping PPO is a popular variant that clips the probability ratios between the target policy and the policy used to collect samples. Ratio clipping yields a pessimistic estimate of the original surrogate objective,…

    Submitted 31 January, 2022; originally announced February 2022.

  4. arXiv:2201.04122  [pdf, other]

    cs.LG cs.AI cs.CV

    In Defense of the Unitary Scalarization for Deep Multi-Task Learning

    Authors: Vitaly Kurin, Alessandro De Palma, Ilya Kostrikov, Shimon Whiteson, M. Pawan Kumar

    Abstract: Recent multi-task learning research argues against unitary scalarization, where training simply minimizes the sum of the task losses. Several ad-hoc multi-task optimization algorithms have instead been proposed, inspired by various hypotheses about what makes multi-task settings difficult. The majority of these optimizers require per-task gradients, and introduce significant memory, runtime, and i…

    Submitted 8 March, 2023; v1 submitted 11 January, 2022; originally announced January 2022.

    Comments: NeurIPS 2022 camera-ready version, fixed training loss y axis scale

  5. arXiv:2109.13202  [pdf, other]

    cs.LG stat.ML

    MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research

    Authors: Mikayel Samvelyan, Robert Kirk, Vitaly Kurin, Jack Parker-Holder, Minqi Jiang, Eric Hambro, Fabio Petroni, Heinrich Küttler, Edward Grefenstette, Tim Rocktäschel

    Abstract: Progress in deep reinforcement learning (RL) is heavily driven by the availability of challenging benchmarks used for training agents. However, benchmarks that are widely adopted by the community are not explicitly designed for evaluating specific capabilities of RL methods. While there exist environments for assessing particular open problems in RL (such as exploration, transfer learning, unsuper…

    Submitted 16 November, 2021; v1 submitted 27 September, 2021; originally announced September 2021.

    Comments: NeurIPS 2021: Datasets and Benchmarks Track

  6. arXiv:2103.01009  [pdf, other]

    cs.LG

    Snowflake: Scaling GNNs to High-Dimensional Continuous Control via Parameter Freezing

    Authors: Charlie Blake, Vitaly Kurin, Maximilian Igl, Shimon Whiteson

    Abstract: Recent research has shown that graph neural networks (GNNs) can learn policies for locomotion control that are as effective as a typical multi-layer perceptron (MLP), with superior transfer and multi-task performance (Wang et al., 2018; Huang et al., 2020). Results have so far been limited to training on small agents, with the performance of GNNs deteriorating rapidly as the number of sensors and…

    Submitted 3 January, 2022; v1 submitted 1 March, 2021; originally announced March 2021.

    Comments: 20 pages, 14 figures, published at NeurIPS 2021

  7. arXiv:2010.01856  [pdf, other]

    cs.LG stat.ML

    My Body is a Cage: the Role of Morphology in Graph-Based Incompatible Control

    Authors: Vitaly Kurin, Maximilian Igl, Tim Rocktäschel, Wendelin Boehmer, Shimon Whiteson

    Abstract: Multitask Reinforcement Learning is a promising way to obtain models with better performance, generalisation, data efficiency, and robustness. Most existing work is limited to compatible settings, where the state and action space dimensions are the same across tasks. Graph Neural Networks (GNN) are one way to address incompatible environments, because they can process graphs of arbitrary size. The…

    Submitted 14 April, 2021; v1 submitted 5 October, 2020; originally announced October 2020.

    Comments: ICLR 2021 Camera-Ready Version

  8. arXiv:1910.00091  [pdf, other]

    cs.LG cs.AI cs.MA

    Deep Coordination Graphs

    Authors: Wendelin Böhmer, Vitaly Kurin, Shimon Whiteson

    Abstract: This paper introduces the deep coordination graph (DCG) for collaborative multi-agent reinforcement learning. DCG strikes a flexible trade-off between representational capacity and generalization by factoring the joint value function of all agents according to a coordination graph into payoffs between pairs of agents. The value can be maximized by local message passing along the graph, which allow…

    Submitted 23 June, 2020; v1 submitted 27 September, 2019; originally announced October 2019.

    Comments: Accepted at ICML 2020

  9. arXiv:1909.11830  [pdf, other]

    cs.LG cs.AI

    Can $Q$-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver?

    Authors: Vitaly Kurin, Saad Godil, Shimon Whiteson, Bryan Catanzaro

    Abstract: We present Graph-$Q$-SAT, a branching heuristic for a Boolean SAT solver trained with value-based reinforcement learning (RL) using Graph Neural Networks for function approximation. Solvers using Graph-$Q$-SAT are complete SAT solvers that either provide a satisfying assignment or proof of unsatisfiability, which is required for many SAT applications. The branching heuristics commonly used in SAT…

    Submitted 25 November, 2020; v1 submitted 25 September, 2019; originally announced September 2019.

    Comments: Camera-ready for NeurIPS 2020

  10. arXiv:1902.06583  [pdf, other]

    cs.LG stat.ML

    Fast Efficient Hyperparameter Tuning for Policy Gradients

    Authors: Supratik Paul, Vitaly Kurin, Shimon Whiteson

    Abstract: The performance of policy gradient methods is sensitive to hyperparameter settings that must be tuned for any new application. Widely used grid search methods for tuning hyperparameters are sample inefficient and computationally expensive. More advanced methods like Population Based Training that learn optimal schedules for hyperparameters instead of fixed settings can yield better results, but ar…

    Submitted 17 September, 2019; v1 submitted 18 February, 2019; originally announced February 2019.

    Comments: NeurIPS 2019

  11. arXiv:1811.03516  [pdf, other]

    cs.LG stat.ML

    Learning from Demonstration in the Wild

    Authors: Feryal Behbahani, Kyriacos Shiarlis, Xi Chen, Vitaly Kurin, Sudhanshu Kasewa, Ciprian Stirbu, João Gomes, Supratik Paul, Frans A. Oliehoek, João Messias, Shimon Whiteson

    Abstract: Learning from demonstration (LfD) is useful in settings where hand-coding behaviour or a reward function is impractical. It has succeeded in a wide range of problems but typically relies on manually generated demonstrations or specially deployed sensors and has not generally been able to leverage the copious demonstrations available in the wild: those that capture behaviours that were occurring an…

    Submitted 25 March, 2019; v1 submitted 8 November, 2018; originally announced November 2018.

    Comments: Accepted to the IEEE International Conference on Robotics and Automation (ICRA) 2019; extended version with appendix

  12. arXiv:1810.03642  [pdf, other]

    cs.LG stat.ML

    Fast Context Adaptation via Meta-Learning

    Authors: Luisa M Zintgraf, Kyriacos Shiarlis, Vitaly Kurin, Katja Hofmann, Shimon Whiteson

    Abstract: We propose CAVIA for meta-learning, a simple extension to MAML that is less prone to meta-overfitting, easier to parallelise, and more interpretable. CAVIA partitions the model parameters into two parts: context parameters that serve as additional input to the model and are adapted on individual tasks, and shared parameters that are meta-trained and shared across tasks. At test time, only the cont…

    Submitted 10 June, 2019; v1 submitted 8 October, 2018; originally announced October 2018.

    Comments: Published at the International Conference on Machine Learning (ICML) 2019

  13. arXiv:1705.10998  [pdf, other]

    cs.AI

    The Atari Grand Challenge Dataset

    Authors: Vitaly Kurin, Sebastian Nowozin, Katja Hofmann, Lucas Beyer, Bastian Leibe

    Abstract: Recent progress in Reinforcement Learning (RL), fueled by its combination with Deep Learning, has enabled impressive results in learning to interact with complex virtual environments, yet real-world applications of RL are still scarce. A key limitation is data efficiency, with current state-of-the-art approaches requiring millions of training samples. A promising way to tackle this problem is to a…

    Submitted 31 May, 2017; originally announced May 2017.

  14. arXiv:1705.04608  [pdf, other]

    cs.CV

    Towards a Principled Integration of Multi-Camera Re-Identification and Tracking through Optimal Bayes Filters

    Authors: Lucas Beyer, Stefan Breuers, Vitaly Kurin, Bastian Leibe

    Abstract: With the rise of end-to-end learning through deep learning, person detectors and re-identification (ReID) models have recently become very strong. Multi-camera multi-target (MCMT) tracking has not fully gone through this transformation yet. We intend to take another step in this direction by presenting a theoretically principled way of integrating ReID with tracking formulated as an optimal Bayes…

    Submitted 16 May, 2017; v1 submitted 12 May, 2017; originally announced May 2017.

    Comments: First two authors have equal contribution. This is initial work into a new direction, not a benchmark-beating method. v2 only adds acknowledgements and fixes a typo in e-mail