Skip to main content

Showing 1–14 of 14 results for author: Zolna, K

Searching in archive stat. Search in all archives.
.
  1. arXiv:2011.13885  [pdf, other

    cs.LG cs.AI cs.RO stat.ML

    Offline Learning from Demonstrations and Unlabeled Experience

    Authors: Konrad Zolna, Alexander Novikov, Ksenia Konyushkova, Caglar Gulcehre, Ziyu Wang, Yusuf Aytar, Misha Denil, Nando de Freitas, Scott Reed

    Abstract: Behavior cloning (BC) is often practical for robot learning because it allows a policy to be trained offline without rewards, by supervised learning on expert demonstrations. However, BC does not effectively leverage what we will refer to as unlabeled experience: data of mixed and unknown quality without reward annotations. This unlabeled data can be generated by a variety of sources such as human… ▽ More

    Submitted 27 November, 2020; originally announced November 2020.

    Comments: Accepted to Offline Reinforcement Learning Workshop at Neural Information Processing Systems (2020)

  2. arXiv:2007.09055  [pdf, other

    cs.LG cs.AI stat.ML

    Hyperparameter Selection for Offline Reinforcement Learning

    Authors: Tom Le Paine, Cosmin Paduraru, Andrea Michi, Caglar Gulcehre, Konrad Zolna, Alexander Novikov, Ziyu Wang, Nando de Freitas

    Abstract: Offline reinforcement learning (RL purely from logged data) is an important avenue for deploying RL techniques in real-world scenarios. However, existing hyperparameter selection methods for offline RL break the offline assumption by evaluating policies corresponding to each hyperparameter setting in the environment. This online execution is often infeasible and hence undermines the main aim of of… ▽ More

    Submitted 17 July, 2020; originally announced July 2020.

  3. arXiv:2006.15134  [pdf, other

    cs.LG cs.AI stat.ML

    Critic Regularized Regression

    Authors: Ziyu Wang, Alexander Novikov, Konrad Zolna, Jost Tobias Springenberg, Scott Reed, Bobak Shahriari, Noah Siegel, Josh Merel, Caglar Gulcehre, Nicolas Heess, Nando de Freitas

    Abstract: Offline reinforcement learning (RL), also known as batch RL, offers the prospect of policy optimization from large pre-recorded datasets without online environment interaction. It addresses challenges with regard to the cost of data collection and safety, both of which are particularly pertinent to real-world applications of RL. Unfortunately, most off-policy algorithms perform poorly when learnin… ▽ More

    Submitted 22 September, 2021; v1 submitted 26 June, 2020; originally announced June 2020.

    Comments: 24 pages; presented at NeurIPS 2020

  4. arXiv:2006.13888  [pdf, other

    cs.LG stat.ML

    RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning

    Authors: Caglar Gulcehre, Ziyu Wang, Alexander Novikov, Tom Le Paine, Sergio Gomez Colmenarejo, Konrad Zolna, Rishabh Agarwal, Josh Merel, Daniel Mankowitz, Cosmin Paduraru, Gabriel Dulac-Arnold, Jerry Li, Mohammad Norouzi, Matt Hoffman, Ofir Nachum, George Tucker, Nicolas Heess, Nando de Freitas

    Abstract: Offline methods for reinforcement learning have a potential to help bridge the gap between reinforcement learning research and real-world applications. They make it possible to learn policies from offline datasets, thus overcoming concerns associated with online data collection in the real-world, including cost, safety, or ethical concerns. In this paper, we propose a benchmark called RL Unplugged… ▽ More

    Submitted 12 February, 2021; v1 submitted 24 June, 2020; originally announced June 2020.

    Comments: NeurIPS paper. 21 pages including supplementary material, the github link for the datasets: https://github.com/deepmind/deepmind-research/rl_unplugged

  5. arXiv:2002.00412  [pdf, other

    cs.LG cs.AI stat.ML

    Combating False Negatives in Adversarial Imitation Learning

    Authors: Konrad Zolna, Chitwan Saharia, Leonard Boussioux, David Yu-Tung Hui, Maxime Chevalier-Boisvert, Dzmitry Bahdanau, Yoshua Bengio

    Abstract: In adversarial imitation learning, a discriminator is trained to differentiate agent episodes from expert demonstrations representing the desired behavior. However, as the trained policy learns to be more successful, the negative examples (the ones produced by the agent) become increasingly similar to expert ones. Despite the fact that the task is successfully accomplished in some of the agent's t… ▽ More

    Submitted 2 February, 2020; originally announced February 2020.

    Comments: This is an extended version of the student abstract published at 34th AAAI Conference on Artificial Intelligence

  6. arXiv:1910.01077  [pdf, other

    cs.LG cs.AI cs.RO stat.ML

    Task-Relevant Adversarial Imitation Learning

    Authors: Konrad Zolna, Scott Reed, Alexander Novikov, Sergio Gomez Colmenarejo, David Budden, Serkan Cabi, Misha Denil, Nando de Freitas, Ziyu Wang

    Abstract: We show that a critical vulnerability in adversarial imitation is the tendency of discriminator networks to learn spurious associations between visual features and expert labels. When the discriminator focuses on task-irrelevant features, it does not provide an informative reward signal, leading to poor task performance. We analyze this problem in detail and propose a solution that outperforms sta… ▽ More

    Submitted 12 November, 2020; v1 submitted 2 October, 2019; originally announced October 2019.

    Comments: Accepted to CoRL 2020 (see presentation here: https://youtu.be/ZgQvFGuEgFU )

  7. arXiv:1906.07576  [pdf, other

    cs.CY cs.LG stat.ML

    The Dynamics of Handwriting Improves the Automated Diagnosis of Dysgraphia

    Authors: Konrad Zolna, Thibault Asselborn, Caroline Jolly, Laurence Casteran, Marie-Ange~Nguyen-Morel, Wafa Johal, Pierre Dillenbourg

    Abstract: Handwriting disorder (termed dysgraphia) is a far from a singular problem as nearly 8.6% of the population in France is considered dysgraphic. Moreover, research highlights the fundamental importance to detect and remediate these handwriting difficulties as soon as possible as they may affect a child's entire life, undermining performance and self-confidence in a wide variety of school activities.… ▽ More

    Submitted 12 June, 2019; originally announced June 2019.

  8. arXiv:1904.03515  [pdf, other

    cs.LG stat.ML

    Split Batch Normalization: Improving Semi-Supervised Learning under Domain Shift

    Authors: Michał Zając, Konrad Zolna, Stanisław Jastrzębski

    Abstract: Recent work has shown that using unlabeled data in semi-supervised learning is not always beneficial and can even hurt generalization, especially when there is a class mismatch between the unlabeled and labeled examples. We investigate this phenomenon for image classification on the CIFAR-10 and the ImageNet datasets, and with many other forms of domain shifts applied (e.g. salt-and-pepper noise).… ▽ More

    Submitted 6 April, 2019; originally announced April 2019.

    Comments: Under review for ECML PKDD 2019

  9. arXiv:1904.03438  [pdf, other

    cs.LG cs.AI stat.ML

    Reinforced Imitation in Heterogeneous Action Space

    Authors: Konrad Zolna, Negar Rostamzadeh, Yoshua Bengio, Sung** Ahn, Pedro O. Pinheiro

    Abstract: Imitation learning is an effective alternative approach to learn a policy when the reward function is sparse. In this paper, we consider a challenging setting where an agent and an expert use different actions from each other. We assume that the agent has access to a sparse reward function and state-only expert observations. We propose a method which gradually balances between the imitation learni… ▽ More

    Submitted 26 August, 2019; v1 submitted 6 April, 2019; originally announced April 2019.

    Comments: The extended version of the work "Reinforced Imitation Learning from Observations" presented on the NeurIPS workshop "Imitation Learning and its Challenges in Robotics"

  10. arXiv:1812.04599  [pdf, other

    cs.CV cs.AI cs.LG stat.ML

    Adversarial Framing for Image and Video Classification

    Authors: Konrad Zolna, Michal Zajac, Negar Rostamzadeh, Pedro O. Pinheiro

    Abstract: Neural networks are prone to adversarial attacks. In general, such attacks deteriorate the quality of the input by either slightly modifying most of its pixels, or by occluding it with a patch. In this paper, we propose a method that keeps the image unchanged and only adds an adversarial framing on the border of the image. We show empirically that our method is able to successfully attack state-of… ▽ More

    Submitted 17 October, 2019; v1 submitted 11 December, 2018; originally announced December 2018.

    Comments: This is an extended version of the paper published at 33rd AAAI Conference on Artificial Intelligence (see https://doi.org/10.1609/aaai.v33i01.330110077 )

  11. arXiv:1806.04342  [pdf, other

    stat.ML cs.LG

    Focused Hierarchical RNNs for Conditional Sequence Processing

    Authors: Nan Rosemary Ke, Konrad Zolna, Alessandro Sordoni, Zhouhan Lin, Adam Trischler, Yoshua Bengio, Joelle Pineau, Laurent Charlin, Chris Pal

    Abstract: Recurrent Neural Networks (RNNs) with attention mechanisms have obtained state-of-the-art results for many sequence processing tasks. Most of these models use a simple form of encoder with attention that looks over the entire sequence and assigns a weight to each token independently. We present a mechanism for focusing RNN encoders for sequence modelling tasks which allows them to attend to key pa… ▽ More

    Submitted 12 June, 2018; originally announced June 2018.

    Comments: To appear at ICML 2018

  12. arXiv:1805.08249  [pdf, other

    cs.LG cs.AI cs.CV cs.NE stat.ML

    Classifier-agnostic saliency map extraction

    Authors: Konrad Zolna, Krzysztof J. Geras, Kyunghyun Cho

    Abstract: Currently available methods for extracting saliency maps identify parts of the input which are the most important to a specific fixed classifier. We show that this strong dependence on a given classifier hinders their performance. To address this problem, we propose classifier-agnostic saliency map extraction, which finds all parts of the image that any classifier could use, not just one given in… ▽ More

    Submitted 19 July, 2020; v1 submitted 21 May, 2018; originally announced May 2018.

    Journal ref: Computer Vision and Image Understanding, Volume 196, 2020, 102969, ISSN 1077-3142

  13. arXiv:1711.00066  [pdf, other

    stat.ML cs.AI cs.LG

    Fraternal Dropout

    Authors: Konrad Zolna, Devansh Arpit, Dendi Suhubdy, Yoshua Bengio

    Abstract: Recurrent neural networks (RNNs) are important class of architectures among neural networks useful for language modeling and sequential prediction. However, optimizing RNNs is known to be harder compared to feed-forward neural networks. A number of techniques have been proposed in literature to address this problem. In this paper we propose a simple technique called fraternal dropout that takes ad… ▽ More

    Submitted 28 March, 2018; v1 submitted 31 October, 2017; originally announced November 2017.

    Comments: Accepted to ICLR 2018. Extended appendix. Added official GitHub code for replication: https://github.com/kondiz/fraternal-dropout . Added references. Corrected typos

  14. arXiv:1612.01589  [pdf, other

    cs.LG cs.AI cs.NE stat.ML

    Improving the Performance of Neural Networks in Regression Tasks Using Drawering

    Authors: Konrad Zolna

    Abstract: The method presented extends a given regression neural network to make its performance improve. The modification affects the learning procedure only, hence the extension may be easily omitted during evaluation without any change in prediction. It means that the modified model may be evaluated as quickly as the original one but tends to perform better. This improvement is possible because the mod… ▽ More

    Submitted 5 December, 2016; originally announced December 2016.