
Showing 1–18 of 18 results for author: Denil, M

Searching in archive stat.
  1. arXiv:2106.10251  [pdf, other]

    cs.LG cs.AI stat.ML

    Active Offline Policy Selection

    Authors: Ksenia Konyushkova, Yutian Chen, Tom Le Paine, Caglar Gulcehre, Cosmin Paduraru, Daniel J Mankowitz, Misha Denil, Nando de Freitas

    Abstract: This paper addresses the problem of policy selection in domains with abundant logged data, but with a restricted interaction budget. Solving this problem would enable safe evaluation and deployment of offline reinforcement learning policies in industry, robotics, and recommendation domains among others. Several off-policy evaluation (OPE) techniques have been proposed to assess the value of polici…

    Submitted 6 May, 2022; v1 submitted 18 June, 2021; originally announced June 2021.

    Comments: Presented at NeurIPS 2021

  2. arXiv:2011.13885  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Offline Learning from Demonstrations and Unlabeled Experience

    Authors: Konrad Zolna, Alexander Novikov, Ksenia Konyushkova, Caglar Gulcehre, Ziyu Wang, Yusuf Aytar, Misha Denil, Nando de Freitas, Scott Reed

    Abstract: Behavior cloning (BC) is often practical for robot learning because it allows a policy to be trained offline without rewards, by supervised learning on expert demonstrations. However, BC does not effectively leverage what we will refer to as unlabeled experience: data of mixed and unknown quality without reward annotations. This unlabeled data can be generated by a variety of sources such as human…

    Submitted 27 November, 2020; originally announced November 2020.

    Comments: Accepted to Offline Reinforcement Learning Workshop at Neural Information Processing Systems (2020)

  3. arXiv:1911.00459  [pdf, other]

    cs.LG stat.ML

    Positive-Unlabeled Reward Learning

    Authors: Danfei Xu, Misha Denil

    Abstract: Learning reward functions from data is a promising path towards achieving scalable Reinforcement Learning (RL) for robotics. However, a major challenge in training agents from learned reward models is that the agent can learn to exploit errors in the reward model to achieve high reward behaviors that do not correspond to the intended task. These reward delusions can lead to unintended and even dan…

    Submitted 1 November, 2019; originally announced November 2019.

  4. arXiv:1910.01077  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Task-Relevant Adversarial Imitation Learning

    Authors: Konrad Zolna, Scott Reed, Alexander Novikov, Sergio Gomez Colmenarejo, David Budden, Serkan Cabi, Misha Denil, Nando de Freitas, Ziyu Wang

    Abstract: We show that a critical vulnerability in adversarial imitation is the tendency of discriminator networks to learn spurious associations between visual features and expert labels. When the discriminator focuses on task-irrelevant features, it does not provide an informative reward signal, leading to poor task performance. We analyze this problem in detail and propose a solution that outperforms sta…

    Submitted 12 November, 2020; v1 submitted 2 October, 2019; originally announced October 2019.

    Comments: Accepted to CoRL 2020 (see presentation here: https://youtu.be/ZgQvFGuEgFU )

  5. arXiv:1706.06383  [pdf, other]

    cs.AI cs.NE stat.ML

    Programmable Agents

    Authors: Misha Denil, Sergio Gómez Colmenarejo, Serkan Cabi, David Saxton, Nando de Freitas

    Abstract: We build deep RL agents that execute declarative programs expressed in formal language. The agents learn to ground the terms in this language in their environment, and can generalize their behavior at test time to execute new programs that refer to objects that were not referenced during training. The agents develop disentangled interpretable representations that allow them to generalize to a wide…

    Submitted 20 June, 2017; originally announced June 2017.

  6. arXiv:1703.04813  [pdf, other]

    cs.LG cs.NE stat.ML

    Learned Optimizers that Scale and Generalize

    Authors: Olga Wichrowska, Niru Maheswaranathan, Matthew W. Hoffman, Sergio Gomez Colmenarejo, Misha Denil, Nando de Freitas, Jascha Sohl-Dickstein

    Abstract: Learning to learn has emerged as an important direction for achieving artificial intelligence. Two of the primary barriers to its adoption are an inability to scale to larger problems and a limited ability to generalize to new tasks. We introduce a learned gradient descent optimizer that generalizes well to new tasks, and which has significantly reduced memory and computation overhead. We achieve…

    Submitted 7 September, 2017; v1 submitted 14 March, 2017; originally announced March 2017.

    Comments: Final ICML paper after reviewer suggestions

  7. arXiv:1611.03824  [pdf, other]

    stat.ML cs.LG

    Learning to Learn without Gradient Descent by Gradient Descent

    Authors: Yutian Chen, Matthew W. Hoffman, Sergio Gomez Colmenarejo, Misha Denil, Timothy P. Lillicrap, Matt Botvinick, Nando de Freitas

    Abstract: We learn recurrent neural network optimizers trained on simple synthetic functions by gradient descent. We show that these learned optimizers exhibit a remarkable degree of transfer in that they can be used to efficiently optimize a broad range of derivative-free black-box functions, including Gaussian process bandits, simple control objectives, global optimization benchmarks and hyper-parameter t…

    Submitted 12 June, 2017; v1 submitted 11 November, 2016; originally announced November 2016.

    Comments: Accepted by ICML 2017. Previous version "Learning to Learn for Global Optimization of Black Box Functions" was published in the Deep Reinforcement Learning Workshop, NIPS 2016

  8. arXiv:1611.01843  [pdf, other]

    stat.ML cs.AI cs.CV cs.LG cs.NE physics.soc-ph

    Learning to Perform Physics Experiments via Deep Reinforcement Learning

    Authors: Misha Denil, Pulkit Agrawal, Tejas D Kulkarni, Tom Erez, Peter Battaglia, Nando de Freitas

    Abstract: When encountering novel objects, humans are able to infer a wide range of physical properties such as mass, friction and deformability by interacting with them in a goal driven way. This process of active interaction is in the same spirit as a scientist performing experiments to discover hidden facts. Recent advances in artificial intelligence have yielded machines that can achieve superhuman perf…

    Submitted 17 August, 2017; v1 submitted 6 November, 2016; originally announced November 2016.

  9. arXiv:1603.00391  [pdf, other]

    cs.LG cs.NE stat.ML

    Noisy Activation Functions

    Authors: Caglar Gulcehre, Marcin Moczulski, Misha Denil, Yoshua Bengio

    Abstract: Common nonlinear activation functions used in neural networks can cause training difficulties due to the saturation behavior of the activation function, which may hide dependencies that are not visible to vanilla-SGD (using first order gradients only). Gating mechanisms that use softly saturating activation functions to emulate the discrete switching of digital logic circuits are good examples of…

    Submitted 3 April, 2016; v1 submitted 1 March, 2016; originally announced March 2016.

  10. arXiv:1412.7149  [pdf, other]

    cs.LG cs.NE stat.ML

    Deep Fried Convnets

    Authors: Zichao Yang, Marcin Moczulski, Misha Denil, Nando de Freitas, Alex Smola, Le Song, Ziyu Wang

    Abstract: The fully connected layers of a deep convolutional neural network typically contain over 90% of the network parameters, and consume the majority of the memory required to store the network parameters. Reducing the number of parameters while preserving essentially the same predictive performance is critically important for operating deep neural networks in memory constrained environments such as GP…

    Submitted 17 July, 2015; v1 submitted 22 December, 2014; originally announced December 2014.

    Comments: svd experiments included

  11. arXiv:1411.3128  [pdf, other]

    cs.LG stat.ML

    Deep Multi-Instance Transfer Learning

    Authors: Dimitrios Kotzias, Misha Denil, Phil Blunsom, Nando de Freitas

    Abstract: We present a new approach for transferring knowledge from groups to individuals that comprise them. We evaluate our method in text, by inferring the ratings of individual sentences using full-review ratings. This approach, which combines ideas from transfer learning, deep learning and multi-instance learning, reduces the need for laborious human labelling of fine-grained data when abundant labels…

    Submitted 10 December, 2014; v1 submitted 12 November, 2014; originally announced November 2014.

  12. arXiv:1406.3830  [pdf, other]

    cs.CL cs.LG stat.ML

    Modelling, Visualising and Summarising Documents with a Single Convolutional Neural Network

    Authors: Misha Denil, Alban Demiraj, Nal Kalchbrenner, Phil Blunsom, Nando de Freitas

    Abstract: Capturing the compositional process which maps the meaning of words to that of documents is a central challenge for researchers in Natural Language Processing and Information Retrieval. We introduce a model that is able to represent the meaning of documents by embedding them in a low dimensional vector space, while preserving distinctions of word and sentence order crucial for capturing nuanced se…

    Submitted 15 June, 2014; originally announced June 2014.

  13. arXiv:1406.3070  [pdf, other]

    stat.ML

    Distributed Parameter Estimation in Probabilistic Graphical Models

    Authors: Yariv Dror Mizrahi, Misha Denil, Nando de Freitas

    Abstract: This paper presents foundational theoretical results on distributed parameter estimation for undirected probabilistic graphical models. It introduces a general condition on composite likelihood decompositions of these models which guarantees the global consistency of distributed estimators, provided the local estimators are consistent.

    Submitted 11 June, 2014; originally announced June 2014.

  14. arXiv:1310.1415  [pdf, other]

    stat.ML cs.LG

    Narrowing the Gap: Random Forests In Theory and In Practice

    Authors: Misha Denil, David Matheson, Nando de Freitas

    Abstract: Despite widespread interest and practical use, the theoretical properties of random forests are still not well understood. In this paper we contribute to this understanding in two ways. We present a new theoretically tractable variant of random regression forests and prove that our algorithm is consistent. We also provide an empirical evaluation, comparing our algorithm and other theoretically tra…

    Submitted 4 October, 2013; originally announced October 2013.

    Comments: Under review by the International Conference on Machine Learning (ICML) 2014

  15. arXiv:1308.6342  [pdf, other]

    stat.ML cs.LG

    Linear and Parallel Learning of Markov Random Fields

    Authors: Yariv Dror Mizrahi, Misha Denil, Nando de Freitas

    Abstract: We introduce a new embarrassingly parallel parameter learning algorithm for Markov random fields with untied parameters which is efficient for a large class of practical models. Our algorithm parallelizes naturally over cliques and, for graphs of bounded degree, its complexity is linear in the number of cliques. Unlike its competitors, our algorithm is fully parallel and for log-linear models it i…

    Submitted 5 February, 2014; v1 submitted 28 August, 2013; originally announced August 2013.

  16. arXiv:1306.0543  [pdf, other]

    cs.LG cs.NE stat.ML

    Predicting Parameters in Deep Learning

    Authors: Misha Denil, Babak Shakibi, Laurent Dinh, Marc'Aurelio Ranzato, Nando de Freitas

    Abstract: We demonstrate that there is significant redundancy in the parameterization of several deep learning models. Given only a few weight values for each feature it is possible to accurately predict the remaining values. Moreover, we show that not only can the parameter values be predicted, but many of them need not be learned at all. We train several different architectures by learning only a small nu…

    Submitted 27 October, 2014; v1 submitted 3 June, 2013; originally announced June 2013.

  17. arXiv:1302.4853  [pdf, other]

    stat.ML

    Consistency of Online Random Forests

    Authors: Misha Denil, David Matheson, Nando de Freitas

    Abstract: As a testament to their success, the theory of random forests has long been outpaced by their application in practice. In this paper, we take a step towards narrowing this gap by providing a consistency result for online random forests.

    Submitted 8 May, 2013; v1 submitted 20 February, 2013; originally announced February 2013.

    Comments: To appear in Proceedings of the 30th International Conference on Machine Learning, 2013

  18. arXiv:1208.0959  [pdf, other]

    cs.LG cs.CV stat.ML

    Recklessly Approximate Sparse Coding

    Authors: Misha Denil, Nando de Freitas

    Abstract: It has recently been observed that certain extremely simple feature encoding techniques are able to achieve state of the art performance on several standard image classification benchmarks including deep belief networks, convolutional nets, factored RBMs, mcRBMs, convolutional RBMs, sparse autoencoders and several others. Moreover, these "triangle" or "soft threshold" encodings are extremely eff…

    Submitted 6 January, 2013; v1 submitted 4 August, 2012; originally announced August 2012.