Showing 1–50 of 57 results for author: Lamb, A

Searching in archive cs.
  1. arXiv:2406.01424  [pdf, other]

    cs.LG cs.AI cs.CL

    Universal In-Context Approximation By Prompting Fully Recurrent Models

    Authors: Aleksandar Petrov, Tom A. Lamb, Alasdair Paren, Philip H. S. Torr, Adel Bibi

    Abstract: Zero-shot and in-context learning enable solving tasks without model fine-tuning, making them essential for developing generative model solutions. Therefore, it is crucial to understand whether a pretrained model can be prompted to approximate any function, i.e., whether it is a universal in-context approximator. While it was recently shown that transformer models do possess this property, these r…

    Submitted 3 June, 2024; originally announced June 2024.

  2. arXiv:2404.14552  [pdf, other]

    cs.LG cs.AI

    Generalizing Multi-Step Inverse Models for Representation Learning to Finite-Memory POMDPs

    Authors: Lili Wu, Ben Evans, Riashat Islam, Raihan Seraj, Yonathan Efroni, Alex Lamb

    Abstract: Discovering an informative, or agent-centric, state representation that encodes only the relevant information while discarding the irrelevant is a key challenge towards scaling reinforcement learning algorithms and efficiently applying them to downstream tasks. Prior works studied this problem in high-dimensional Markovian environments, when the current observation may be a complex object but is s…

    Submitted 22 April, 2024; originally announced April 2024.

  3. arXiv:2403.13765  [pdf, other]

    cs.LG cs.AI cs.CV

    Towards Principled Representation Learning from Videos for Reinforcement Learning

    Authors: Dipendra Misra, Akanksha Saran, Tengyang Xie, Alex Lamb, John Langford

    Abstract: We study pre-training representations for decision-making using video data, which is abundantly available for tasks such as game agents and software testing. Even though significant empirical advances have been made on this problem, a theoretical understanding remains absent. We initiate the theoretical investigation into principled approaches for representation learning and focus on learning the…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: ICLR 2024 Spotlight Conference Paper

  4. arXiv:2401.01623  [pdf, other]

    cs.AI cs.CL

    Can AI Be as Creative as Humans?

    Authors: Haonan Wang, James Zou, Michael Mozer, Anirudh Goyal, Alex Lamb, Linjun Zhang, Weijie J Su, Zhun Deng, Michael Qizhe Xie, Hannah Brown, Kenji Kawaguchi

    Abstract: Creativity serves as a cornerstone for societal progress and innovation. With the rise of advanced generative AI models capable of tasks once reserved for human creativity, the study of AI's creative potential becomes imperative for its responsible development and application. In this paper, we prove in theory that AI can be as creative as humans under the condition that it can properly fit the da…

    Submitted 25 January, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

    Comments: The paper examines AI's creativity, introducing Relative and Statistical Creativity for theoretical and practical analysis, along with practical training guidelines. Project Page: ai-relative-creativity.github.io

  5. arXiv:2311.03534  [pdf, other]

    cs.LG cs.AI cs.RO

    PcLast: Discovering Plannable Continuous Latent States

    Authors: Anurag Koul, Shivakanth Sujit, Shaoru Chen, Ben Evans, Lili Wu, Byron Xu, Rajan Chari, Riashat Islam, Raihan Seraj, Yonathan Efroni, Lekan Molu, Miro Dudik, John Langford, Alex Lamb

    Abstract: Goal-conditioned planning benefits from learned low-dimensional representations of rich observations. While compact latent representations typically learned from variational autoencoders or inverse dynamics enable goal-conditioned decision making, they ignore state reachability, hampering their performance. In this paper, we learn a representation that associates reachable states together for effe…

    Submitted 10 June, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

    Comments: Accepted at ICML 2024

  6. arXiv:2310.09412  [pdf, other]

    cs.AI cs.LG

    Hybrid Reinforcement Learning for Optimizing Pump Sustainability in Real-World Water Distribution Networks

    Authors: Harsh Patel, Yuan Zhou, Alexander P Lamb, Shu Wang, Jieliang Luo

    Abstract: This article addresses the pump-scheduling optimization problem to enhance real-time control of real-world water distribution networks (WDNs). Our primary objectives are to adhere to physical operational constraints while reducing energy consumption and operational costs. Traditional optimization techniques, such as evolution-based and genetic algorithms, often fall short due to their lack of conv…

    Submitted 13 October, 2023; originally announced October 2023.

  7. arXiv:2308.14976  [pdf]

    astro-ph.SR astro-ph.IM cs.AI cs.LG eess.IV

    Efficient labeling of solar flux evolution videos by a deep learning model

    Authors: Subhamoy Chatterjee, Andrés Muñoz-Jaramillo, Derek A. Lamb

    Abstract: Machine learning (ML) is becoming a critical tool for interrogation of large complex data. Labeling, defined as the process of adding meaningful annotations, is a crucial step of supervised ML. However, labeling datasets is time consuming. Here we show that convolutional neural networks (CNNs), trained on crudely labeled astronomical videos, can be leveraged to improve the quality of data labeling…

    Submitted 28 August, 2023; originally announced August 2023.

    Comments: 16 pages, 7 figures, published in Nature Astronomy, June 27, 2022

    Journal ref: Nat.Astron.6(2022)796-803

  8. arXiv:2306.13564  [pdf, other]

    cs.CV eess.IV

    Estimating Residential Solar Potential Using Aerial Data

    Authors: Ross Goroshin, Alex Wilson, Andrew Lamb, Betty Peng, Brandon Ewonus, Cornelius Ratsch, Jordan Raisher, Marisa Leung, Max Burq, Thomas Colthurst, William Rucklidge, Carl Elkin

    Abstract: Project Sunroof estimates the solar potential of residential buildings using high quality aerial data. That is, it estimates the potential solar energy (and associated financial savings) that can be captured by buildings if solar panels were to be installed on their roofs. Unfortunately its coverage is limited by the lack of high resolution digital surface map (DSM) data. We present a deep learnin…

    Submitted 23 June, 2023; originally announced June 2023.

    Journal ref: ICLR 2023 - Tackling Climate Change with Machine Learning Workshop

  9. arXiv:2306.04431  [pdf, other]

    cs.LG

    Faithful Knowledge Distillation

    Authors: Tom A. Lamb, Rudy Brunel, Krishnamurthy DJ Dvijotham, M. Pawan Kumar, Philip H. S. Torr, Francisco Eiras

    Abstract: Knowledge distillation (KD) has received much attention due to its success in compressing networks to allow for their deployment in resource-constrained systems. While the problem of adversarial robustness has been studied before in the KD setting, previous works overlook what we term the relative calibration of the student network with respect to its teacher in terms of soft confidences. In parti…

    Submitted 11 August, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: 7pgs (main content), 4 figures

  10. arXiv:2301.11790  [pdf, other]

    cs.CV cs.LG stat.ML

    Leveraging the Third Dimension in Contrastive Learning

    Authors: Sumukh Aithal, Anirudh Goyal, Alex Lamb, Yoshua Bengio, Michael Mozer

    Abstract: Self-Supervised Learning (SSL) methods operate on unlabeled data to learn robust representations useful for downstream tasks. Most SSL methods rely on augmentations obtained by transforming the 2D image pixel map. These augmentations ignore the fact that biological vision takes place in an immersive three-dimensional, temporally contiguous environment, and that low-level biological vision relies h…

    Submitted 27 January, 2023; originally announced January 2023.

  11. arXiv:2212.13835  [pdf, other]

    cs.LG

    Representation Learning in Deep RL via Discrete Information Bottleneck

    Authors: Riashat Islam, Hongyu Zang, Manan Tomar, Aniket Didolkar, Md Mofijul Islam, Samin Yeasar Arnob, Tariq Iqbal, Xin Li, Anirudh Goyal, Nicolas Heess, Alex Lamb

    Abstract: Several self-supervised representation learning methods have been proposed for reinforcement learning (RL) with rich observations. For real-world applications of RL, recovering underlying latent states is crucial, particularly when sensory inputs contain irrelevant and exogenous information. In this work, we study how information bottlenecks can be used to construct latent states efficiently in th…

    Submitted 30 May, 2023; v1 submitted 28 December, 2022; originally announced December 2022.

    Comments: AISTATS 2023

  12. arXiv:2211.07614  [pdf, other]

    cs.LG

    Towards Data-Driven Offline Simulations for Online Reinforcement Learning

    Authors: Shengpu Tang, Felipe Vieira Frujeri, Dipendra Misra, Alex Lamb, John Langford, Paul Mineiro, Sebastian Kochman

    Abstract: Modern decision-making systems, from robots to web recommendation engines, are expected to adapt: to user preferences, changing circumstances or even new tasks. Yet, it is still uncommon to deploy a dynamically learning agent (rather than a fixed policy) to a production system, as it's perceived as unsafe. Using historical data to reason about learning algorithms, similar to offline policy evaluat…

    Submitted 14 November, 2022; originally announced November 2022.

    Comments: Presented at the 3rd Offline Reinforcement Learning Workshop at NeurIPS 2022

  13. arXiv:2211.00928  [pdf, ps, other]

    cs.LG cs.AI

    Neural Active Learning on Heteroskedastic Distributions

    Authors: Savya Khosla, Chew Kin Whye, Jordan T. Ash, Cyril Zhang, Kenji Kawaguchi, Alex Lamb

    Abstract: Models that can actively seek out the best quality training data hold the promise of more accurate, adaptable, and efficient machine learning. Active learning techniques often tend to prefer examples that are the most difficult to classify. While this works well on homogeneous datasets, we find that it can lead to catastrophic failures when performed on multiple distributions with different degree…

    Submitted 23 July, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

  14. arXiv:2211.00247  [pdf, other]

    cs.LG cs.AI

    Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning

    Authors: Riashat Islam, Hongyu Zang, Anirudh Goyal, Alex Lamb, Kenji Kawaguchi, Xin Li, Romain Laroche, Yoshua Bengio, Remi Tachet Des Combes

    Abstract: Goal-conditioned reinforcement learning (RL) is a promising direction for training agents that are capable of solving multiple tasks and reaching a diverse set of objectives. How to specify and ground these goals in such a way that we can both reliably reach goals during training as well as generalize to new goals during evaluation remains an open area of research. Defining goals in…

    Submitted 31 October, 2022; originally announced November 2022.

    Comments: NeurIPS 2022

  15. arXiv:2211.00164  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO

    Agent-Controller Representations: Principled Offline RL with Rich Exogenous Information

    Authors: Riashat Islam, Manan Tomar, Alex Lamb, Yonathan Efroni, Hongyu Zang, Aniket Didolkar, Dipendra Misra, Xin Li, Harm van Seijen, Remi Tachet des Combes, John Langford

    Abstract: Learning to control an agent from data collected offline in a rich pixel-based visual observation space is vital for real-world applications of reinforcement learning (RL). A major challenge in this setting is the presence of input information that is hard to model and irrelevant to controlling the agent. This problem has been approached by the theoretical RL community through the lens of exogenou…

    Submitted 13 August, 2023; v1 submitted 31 October, 2022; originally announced November 2022.

    Comments: ICML 2023

  16. arXiv:2210.09505  [pdf, other]

    cs.LG stat.ML

    CNT (Conditioning on Noisy Targets): A new Algorithm for Leveraging Top-Down Feedback

    Authors: Alexia Jolicoeur-Martineau, Alex Lamb, Vikas Verma, Aniket Didolkar

    Abstract: We propose a novel regularizer for supervised learning called Conditioning on Noisy Targets (CNT). This approach consists in conditioning the model on a noisy version of the target(s) (e.g., actions in imitation learning or labels in classification) at a random noise level (from small to large noise). At inference time, since we do not know the target, we run the network with only noise in place o…

    Submitted 26 October, 2022; v1 submitted 17 October, 2022; originally announced October 2022.

  17. arXiv:2207.08229  [pdf, other]

    cs.LG cs.RO stat.ML

    Guaranteed Discovery of Control-Endogenous Latent States with Multi-Step Inverse Models

    Authors: Alex Lamb, Riashat Islam, Yonathan Efroni, Aniket Didolkar, Dipendra Misra, Dylan Foster, Lekan Molu, Rajan Chari, Akshay Krishnamurthy, John Langford

    Abstract: In many sequential decision-making tasks, the agent is not able to model the full complexity of the world, which consists of multitudes of relevant and irrelevant information. For example, a person walking along a city street who tries to model all aspects of the world would quickly be overwhelmed by a multitude of shops, cars, and people moving in and out of view, each following their own complex…

    Submitted 27 December, 2022; v1 submitted 17 July, 2022; originally announced July 2022.

    Comments: Project Website: https://controllable-latent-state.github.io/

  18. arXiv:2205.14794  [pdf, other]

    cs.LG cs.AI

    Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning

    Authors: Aniket Didolkar, Kshitij Gupta, Anirudh Goyal, Nitesh B. Gundavarapu, Alex Lamb, Nan Rosemary Ke, Yoshua Bengio

    Abstract: Recurrent neural networks have a strong inductive bias towards learning temporally compressed representations, as the entire history of a sequence is represented by a single vector. By contrast, Transformers have little inductive bias towards learning temporally compressed representations, as they allow for attention over all previously computed elements in a sequence. Having a more compressed rep…

    Submitted 25 October, 2022; v1 submitted 29 May, 2022; originally announced May 2022.

  19. arXiv:2202.02195  [pdf, other]

    stat.ML cs.LG

    Deep End-to-end Causal Inference

    Authors: Tomas Geffner, Javier Antoran, Adam Foster, Wenbo Gong, Chao Ma, Emre Kiciman, Amit Sharma, Angus Lamb, Martin Kukla, Nick Pawlowski, Miltiadis Allamanis, Cheng Zhang

    Abstract: Causal inference is essential for data-driven decision making across domains such as business engagement, medical treatment and policy making. However, research on causal discovery has evolved separately from inference methods, preventing straight-forward combination of methods from both fields. In this work, we develop Deep End-to-end Causal Inference (DECI), a single flow-based non-linear additi…

    Submitted 20 June, 2022; v1 submitted 4 February, 2022; originally announced February 2022.

  20. arXiv:2202.01334  [pdf, other]

    cs.LG cs.AI

    Adaptive Discrete Communication Bottlenecks with Dynamic Vector Quantization

    Authors: Dianbo Liu, Alex Lamb, Xu Ji, Pascal Notsawo, Mike Mozer, Yoshua Bengio, Kenji Kawaguchi

    Abstract: Vector Quantization (VQ) is a method for discretizing latent representations and has become a major part of the deep learning toolkit. It has been theoretically and empirically shown that discretization of representations leads to improved generalization, including in reinforcement learning where discretization can be used to bottleneck multi-agent communication to promote agent specialization and…

    Submitted 2 February, 2022; originally announced February 2022.

  21. arXiv:2110.08223  [pdf, other]

    cs.LG

    Simultaneous Missing Value Imputation and Structure Learning with Groups

    Authors: Pablo Morales-Alvarez, Wenbo Gong, Angus Lamb, Simon Woodhead, Simon Peyton Jones, Nick Pawlowski, Miltiadis Allamanis, Cheng Zhang

    Abstract: Learning structures between groups of variables from data with missing values is an important task in the real world, yet difficult to solve. One typical scenario is discovering the structure among topics in the education domain to identify learning pathways. Here, the observations are student performances for questions under each topic which contain missing values. However, most existing methods…

    Submitted 24 February, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

  22. arXiv:2110.04866  [pdf, other]

    cs.LG

    CoRGi: Content-Rich Graph Neural Networks with Attention

    Authors: Jooyeon Kim, Angus Lamb, Simon Woodhead, Simon Peyton Jones, Cheng Zheng, Miltiadis Allamanis

    Abstract: Graph representations of a target domain often project it to a set of entities (nodes) and their relations (edges). However, such projections often miss important and rich information. For example, in graph representations used in missing value imputation, items - represented as nodes - may contain rich textual information. However, when processing graphs with graph neural networks (GNN), such inf…

    Submitted 10 October, 2021; originally announced October 2021.

  23. arXiv:2107.02367  [pdf, other]

    cs.LG cs.AI

    Discrete-Valued Neural Communication

    Authors: Dianbo Liu, Alex Lamb, Kenji Kawaguchi, Anirudh Goyal, Chen Sun, Michael Curtis Mozer, Yoshua Bengio

    Abstract: Deep learning has advanced from fully connected architectures to structured models organized into components, e.g., the transformer composed of positional elements, modular architectures divided into slots, and graph neural nets made up of nodes. In structured models, an interesting question is how to conduct dynamic and possibly sparse communication among the separate components. Here, we explore…

    Submitted 10 July, 2021; v1 submitted 5 July, 2021; originally announced July 2021.

  24. arXiv:2106.06786  [pdf, other]

    cs.CL cs.DL cs.LG

    Predicting the Ordering of Characters in Japanese Historical Documents

    Authors: Alex Lamb, Tarin Clanuwat, Siyu Han, Mikel Bober-Irizar, Asanobu Kitamoto

    Abstract: Japan is a unique country with a distinct cultural heritage, which is reflected in billions of historical documents that have been preserved. However, the change in the Japanese writing system in 1900 made these documents inaccessible for the general public. A major research project has been to make these historical documents accessible and understandable. An increasing amount of research has focused…

    Submitted 12 June, 2021; originally announced June 2021.

  25. arXiv:2104.05860  [pdf, other]

    cs.LG

    Contextual HyperNetworks for Novel Feature Adaptation

    Authors: Angus Lamb, Evgeny Saveliev, Yingzhen Li, Sebastian Tschiatschek, Camilla Longden, Simon Woodhead, José Miguel Hernández-Lobato, Richard E. Turner, Pashmina Cameron, Cheng Zhang

    Abstract: While deep learning has obtained state-of-the-art results in many applications, the adaptation of neural network architectures to incorporate new output features remains a challenge, as neural networks are commonly trained to produce a fixed output dimension. This issue is particularly severe in online learning settings, where new output features, such as items in a recommender system, are added c…

    Submitted 12 April, 2021; originally announced April 2021.

    Comments: 17 pages, 9 Figures, workshop paper at NeurIPS 2020 Meta-Learning Workshop

  26. arXiv:2104.04034  [pdf, other]

    cs.CY cs.HC

    Results and Insights from Diagnostic Questions: The NeurIPS 2020 Education Challenge

    Authors: Zichao Wang, Angus Lamb, Evgeny Saveliev, Pashmina Cameron, Yordan Zaykov, Jose Miguel Hernandez-Lobato, Richard E. Turner, Richard G. Baraniuk, Craig Barton, Simon Peyton Jones, Simon Woodhead, Cheng Zhang

    Abstract: This competition concerns educational diagnostic questions, which are pedagogically effective, multiple-choice questions (MCQs) whose distractors embody misconceptions. With a large and ever-increasing number of such questions, it becomes overwhelming for teachers to know which questions are the best ones to use for their students. We thus seek to answer the following question: how can we use data…

    Submitted 8 April, 2021; originally announced April 2021.

    Comments: arXiv admin note: text overlap with arXiv:2007.12061

  27. arXiv:2103.01197  [pdf, other]

    cs.LG cs.AI stat.ML

    Coordination Among Neural Modules Through a Shared Global Workspace

    Authors: Anirudh Goyal, Aniket Didolkar, Alex Lamb, Kartikeya Badola, Nan Rosemary Ke, Nasim Rahaman, Jonathan Binas, Charles Blundell, Michael Mozer, Yoshua Bengio

    Abstract: Deep learning has seen a movement away from representing examples with a monolithic hidden state towards a richly structured state. For example, Transformers segment by position, and object-centric architectures decompose images into entities. In all these architectures, interactions between different elements are modeled via pairwise interactions: Transformers make use of self-attention to incorp…

    Submitted 22 March, 2022; v1 submitted 1 March, 2021; originally announced March 2021.

    Comments: ICLR'22 accepted paper

  28. arXiv:2103.00336  [pdf, other]

    cs.LG cs.AI

    Transformers with Competitive Ensembles of Independent Mechanisms

    Authors: Alex Lamb, Di He, Anirudh Goyal, Guolin Ke, Chien-Feng Liao, Mirco Ravanelli, Yoshua Bengio

    Abstract: An important development in deep learning from the earliest MLPs has been a move towards architectures with structural inductive biases which enable the model to keep distinct sources of information and routes of processing well-separated. This structure is linked to the notion of independent mechanisms from the causality literature, in which a mechanism is able to retain the same processing as ir…

    Submitted 27 February, 2021; originally announced March 2021.

    Comments: Under Review, ICML 2021

  29. arXiv:2103.00265  [pdf, ps, other]

    cs.LG

    A Brief Introduction to Generative Models

    Authors: Alex Lamb

    Abstract: We introduce and motivate generative modeling as a central task for machine learning and provide a critical view of the algorithms which have been proposed for solving this task. We overview how generative modeling can be defined mathematically as trying to make an estimating distribution the same as an unknown ground truth distribution. This can then be quantified in terms of the value of a stati…

    Submitted 27 February, 2021; originally announced March 2021.

  30. arXiv:2010.15187  [pdf, other]

    cs.LG cs.AI

    A Study on Efficiency in Continual Learning Inspired by Human Learning

    Authors: Philip J. Ball, Yingzhen Li, Angus Lamb, Cheng Zhang

    Abstract: Humans are efficient continual learning systems; we continually learn new skills from birth with finite cells and resources. Our learning is highly optimized both in terms of capacity and time while not suffering from catastrophic forgetting. In this work we study the efficiency of continual learning systems, taking inspiration from human learning. In particular, inspired by the mechanisms of slee…

    Submitted 28 October, 2020; originally announced October 2020.

    Comments: Accepted at NeurIPS 2020 BabyMind Workshop

  31. arXiv:2010.08012  [pdf, other]

    cs.LG stat.ML

    Neural Function Modules with Sparse Arguments: A Dynamic Approach to Integrating Information across Layers

    Authors: Alex Lamb, Anirudh Goyal, Agnieszka Słowik, Michael Mozer, Philippe Beaudoin, Yoshua Bengio

    Abstract: Feed-forward neural networks consist of a sequence of layers, in which each layer performs some processing on the information from the previous layer. A downside to this approach is that each layer (or module, as multiple modules can operate in parallel) is tasked with processing the entire hidden state, rather than a particular part of the state which is most relevant for that module. Methods whi…

    Submitted 15 October, 2020; originally announced October 2020.

  32. arXiv:2007.12061  [pdf, other]

    cs.CY cs.HC cs.LG

    Instructions and Guide for Diagnostic Questions: The NeurIPS 2020 Education Challenge

    Authors: Zichao Wang, Angus Lamb, Evgeny Saveliev, Pashmina Cameron, Yordan Zaykov, José Miguel Hernández-Lobato, Richard E. Turner, Richard G. Baraniuk, Craig Barton, Simon Peyton Jones, Simon Woodhead, Cheng Zhang

    Abstract: Digital technologies are becoming increasingly prevalent in education, enabling personalized, high quality education resources to be accessible by students across the world. Importantly, among these resources are diagnostic questions: the answers that the students give to these questions reveal key information about the specific nature of misconceptions that the students may hold. Analyzing the ma…

    Submitted 12 April, 2021; v1 submitted 23 July, 2020; originally announced July 2020.

    Comments: 28 pages, 6 figures, NeurIPS 2020 Competition Track

  33. arXiv:2006.16981  [pdf, other]

    cs.LG cs.NE stat.ML

    Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules

    Authors: Sarthak Mittal, Alex Lamb, Anirudh Goyal, Vikram Voleti, Murray Shanahan, Guillaume Lajoie, Michael Mozer, Yoshua Bengio

    Abstract: Robust perception relies on both bottom-up and top-down signals. Bottom-up signals consist of what's directly observed through sensation. Top-down signals consist of beliefs and expectations based on past experience and short-term memory, such as how the phrase 'peanut butter and ...' will be completed. The optimal combination of bottom-up and top-down information remains an open question, but the…

    Submitted 15 November, 2020; v1 submitted 30 June, 2020; originally announced June 2020.

    Comments: ICML 2020

  34. arXiv:2006.16225  [pdf, other]

    cs.LG stat.ML

    Object Files and Schemata: Factorizing Declarative and Procedural Knowledge in Dynamical Systems

    Authors: Anirudh Goyal, Alex Lamb, Phanideep Gampa, Philippe Beaudoin, Sergey Levine, Charles Blundell, Yoshua Bengio, Michael Mozer

    Abstract: Modeling a structured, dynamic environment like a video game requires keeping track of the objects and their states (declarative knowledge) as well as predicting how objects behave (procedural knowledge). Black-box models with a monolithic hidden state often fail to apply procedural knowledge consistently and uniformly, i.e., they lack systematicity. For example, in a video game, correct prediction…

    Submitted 12 November, 2020; v1 submitted 29 June, 2020; originally announced June 2020.

    Comments: Type/Token Distinction in Deep learning Framework

  35. arXiv:2005.05496  [pdf, other]

    cs.LG cs.CV stat.ML

    Jigsaw-VAE: Towards Balancing Features in Variational Autoencoders

    Authors: Saeid Asgari Taghanaki, Mohammad Havaei, Alex Lamb, Aditya Sanghi, Ara Danielyan, Tonya Custis

    Abstract: The latent variables learned by VAEs have seen considerable interest as an unsupervised way of extracting features, which can then be used for downstream tasks. There is a growing interest in the question of whether features learned on one environment will generalize across different environments. We demonstrate here that VAE latent variables often focus on some factors of variation at the expense…

    Submitted 11 May, 2020; originally announced May 2020.

  36. arXiv:2002.08595  [pdf, other]

    cs.CV cs.LG stat.ML

    KaoKore: A Pre-modern Japanese Art Facial Expression Dataset

    Authors: Yingtao Tian, Chikahiko Suzuki, Tarin Clanuwat, Mikel Bober-Irizar, Alex Lamb, Asanobu Kitamoto

    Abstract: From classifying handwritten digits to generating strings of text, the datasets which have received long-time focus from the machine learning community vary greatly in their subject matter. This has motivated a renewed interest in building datasets which are socially and culturally relevant, so that algorithmic research may have a more direct and immediate impact on society. One such area is in hi…

    Submitted 20 February, 2020; originally announced February 2020.

  37. arXiv:1912.11570  [pdf, other]

    cs.CV cs.LG stat.ML

    SketchTransfer: A Challenging New Task for Exploring Detail-Invariance and the Abstractions Learned by Deep Networks

    Authors: Alex Lamb, Sherjil Ozair, Vikas Verma, David Ha

    Abstract: Deep networks have achieved excellent results in perceptual tasks, yet their ability to generalize to variations not seen during training has come under increasing scrutiny. In this work we focus on their ability to have invariance towards the presence or absence of details. For example, humans are able to watch cartoons, which are missing many visual details, without being explicitly trained to d…

    Submitted 24 December, 2019; originally announced December 2019.

    Comments: Accepted WACV 2020

  38. arXiv:1910.09433  [pdf, other]

    cs.CV cs.LG

    KuroNet: Pre-Modern Japanese Kuzushiji Character Recognition with Deep Learning

    Authors: Tarin Clanuwat, Alex Lamb, Asanobu Kitamoto

    Abstract: Kuzushiji, a cursive writing style, had been used in Japan for over a thousand years starting from the 8th century. Over 3 million books on a diverse array of topics, such as literature, science, mathematics and even cooking are preserved. However, following a change to the Japanese writing system in 1900, Kuzushiji has not been included in regular school curricula. Therefore, most Japanese nativ…

    Submitted 21 October, 2019; originally announced October 2019.

    Comments: International Conference on Document Analysis and Recognition (ICDAR) 2019 [oral]

  39. arXiv:1909.11715  [pdf, other]

    cs.LG stat.ML

    GraphMix: Improved Training of GNNs for Semi-Supervised Learning

    Authors: Vikas Verma, Meng Qu, Kenji Kawaguchi, Alex Lamb, Yoshua Bengio, Juho Kannala, Jian Tang

    Abstract: We present GraphMix, a regularization method for Graph Neural Network based semi-supervised object classification, whereby we propose to train a fully-connected network jointly with the graph neural network via parameter sharing and interpolation-based regularization. Further, we provide a theoretical analysis of how GraphMix improves the generalization bounds of the underlying graph neural networ…

    Submitted 8 October, 2020; v1 submitted 25 September, 2019; originally announced September 2019.

    Comments: https://github.com/vikasverma1077/GraphMix

  40. arXiv:1909.10893  [pdf, other]

    cs.LG cs.AI stat.ML

    Recurrent Independent Mechanisms

    Authors: Anirudh Goyal, Alex Lamb, Jordan Hoffmann, Shagun Sodhani, Sergey Levine, Yoshua Bengio, Bernhard Schölkopf

    Abstract: Learning modular structures which reflect the dynamics of the environment can lead to better generalization and robustness to changes which only affect a few of the underlying causes. We propose Recurrent Independent Mechanisms (RIMs), a new recurrent architecture in which multiple groups of recurrent cells operate with nearly independent transition dynamics, communicate only sparingly through the…

    Submitted 17 November, 2020; v1 submitted 24 September, 2019; originally announced September 2019.

  41. Interpolated Adversarial Training: Achieving Robust Neural Networks without Sacrificing Too Much Accuracy

    Authors: Alex Lamb, Vikas Verma, Kenji Kawaguchi, Alexander Matyasko, Savya Khosla, Juho Kannala, Yoshua Bengio

    Abstract: Adversarial robustness has become a central goal in deep learning, both in the theory and the practice. However, successful methods to improve the adversarial robustness (such as adversarial training) greatly hurt generalization performance on the unperturbed data. This could have a major impact on how the adversarial robustness affects real world systems (i.e. many may opt to forego robustness if…

    Submitted 19 October, 2022; v1 submitted 16 June, 2019; originally announced June 2019.

    Comments: This is the latest version, which is published in the Journal, "Neural Networks", in 2022. All the previous results are unchanged. First two authors contributed equally

    Journal ref: Neural Networks, volume 154, pages 218-233 (2022)

  42. arXiv:1905.11382  [pdf, other]

    cs.LG cs.AI stat.ML

    State-Reification Networks: Improving Generalization by Modeling the Distribution of Hidden Representations

    Authors: Alex Lamb, Jonathan Binas, Anirudh Goyal, Sandeep Subramanian, Ioannis Mitliagkas, Denis Kazakov, Yoshua Bengio, Michael C. Mozer

    Abstract: Machine learning promises methods that generalize well from finite labeled data. However, the brittleness of existing neural net approaches is revealed by notable failures, such as the existence of adversarial examples that are misclassified despite being nearly identical to a training example, or the inability of recurrent sequence-processing nets to stay on track without teacher forcing. We intr…

    Submitted 26 May, 2019; originally announced May 2019.

    Comments: ICML 2019 [full oral]. arXiv admin note: text overlap with arXiv:1805.08394

  43. arXiv:1903.03825  [pdf]

    stat.ML cs.AI cs.LG

    Interpolation Consistency Training for Semi-Supervised Learning

    Authors: Vikas Verma, Kenji Kawaguchi, Alex Lamb, Juho Kannala, Arno Solin, Yoshua Bengio, David Lopez-Paz

    Abstract: We introduce Interpolation Consistency Training (ICT), a simple and computation efficient algorithm for training Deep Neural Networks in the semi-supervised learning paradigm. ICT encourages the prediction at an interpolation of unlabeled points to be consistent with the interpolation of the predictions at those points. In classification problems, ICT moves the decision boundary to low-density reg…

    Submitted 19 October, 2022; v1 submitted 9 March, 2019; originally announced March 2019.

    Comments: This is the latest version, which is published in the Journal, "Neural Networks", in 2022. All the previous results are unchanged. Keyword: Deep Learning, Semi-supervised Learning, Mixup

    Journal ref: Neural Networks, volume 145, pages 90-106 (2022)

  44. arXiv:1903.02709  [pdf, other]

    stat.ML cs.LG

    On Adversarial Mixup Resynthesis

    Authors: Christopher Beckham, Sina Honari, Vikas Verma, Alex Lamb, Farnoosh Ghadiri, R Devon Hjelm, Yoshua Bengio, Christopher Pal

    Abstract: In this paper, we explore new approaches to combining information encoded within the learned representations of auto-encoders. We explore models that are capable of combining the attributes of multiple inputs such that a resynthesised output is trained to fool an adversarial discriminator for real versus synthesised data. Furthermore, we explore the use of such an architecture in the context of se…

    Submitted 23 October, 2019; v1 submitted 6 March, 2019; originally announced March 2019.

    Comments: 'Camera-ready draft'

  45. Multispectral snapshot demosaicing via non-convex matrix completion

    Authors: Giancarlo A. Antonucci, Simon Vary, David Humphreys, Robert A. Lamb, Jonathan Piper, Jared Tanner

    Abstract: Snapshot mosaic multispectral imagery acquires an undersampled data cube by acquiring a single spectral measurement per spatial pixel. Sensors which acquire $p$ frequencies, therefore, suffer from severe $1/p$ undersampling of the full data cube. We show that the missing entries can be accurately imputed using non-convex techniques from sparse approximation and matrix completion initialised with t…

    Submitted 23 April, 2019; v1 submitted 28 February, 2019; originally announced February 2019.

    Comments: 5 pages, 2 figures, 1 table

    MSC Class: 94A08; 15A83 ACM Class: I.4.5; I.4.9

  46. arXiv:1812.01718  [pdf, other]

    cs.CV cs.LG stat.ML

    Deep Learning for Classical Japanese Literature

    Authors: Tarin Clanuwat, Mikel Bober-Irizar, Asanobu Kitamoto, Alex Lamb, Kazuaki Yamamoto, David Ha

    Abstract: Much of machine learning research focuses on producing models which perform well on benchmark tasks, in turn improving our understanding of the challenges associated with those tasks. From the perspective of ML researchers, the content of the task itself is largely irrelevant, and thus there have increasingly been calls for benchmark tasks to more heavily focus on problems which are of social or c…

    Submitted 3 December, 2018; originally announced December 2018.

    Comments: To appear at Neural Information Processing Systems 2018 Workshop on Machine Learning for Creativity and Design

  47. arXiv:1806.05236  [pdf, other]

    stat.ML cs.AI cs.LG cs.NE

    Manifold Mixup: Better Representations by Interpolating Hidden States

    Authors: Vikas Verma, Alex Lamb, Christopher Beckham, Amir Najafi, Ioannis Mitliagkas, Aaron Courville, David Lopez-Paz, Yoshua Bengio

    Abstract: Deep neural networks excel at learning the training data, but often provide incorrect and confident predictions when evaluated on slightly different test examples. This includes distribution shifts, outliers, and adversarial examples. To address these issues, we propose Manifold Mixup, a simple regularizer that encourages neural networks to predict less confidently on interpolations of hidden repr…

    Submitted 11 May, 2019; v1 submitted 13 June, 2018; originally announced June 2018.

    Comments: To appear in ICML 2019

  48. arXiv:1804.02485  [pdf, other]

    stat.ML cs.LG

    Fortified Networks: Improving the Robustness of Deep Networks by Modeling the Manifold of Hidden Representations

    Authors: Alex Lamb, Jonathan Binas, Anirudh Goyal, Dmitriy Serdyuk, Sandeep Subramanian, Ioannis Mitliagkas, Yoshua Bengio

    Abstract: Deep networks have achieved impressive results across a variety of important tasks. However, a known weakness is a failure to perform well when evaluated on data which differ from the training distribution, even if these differences are very small, as is the case with adversarial examples. We propose Fortified Networks, a simple transformation of existing networks, which fortifies the hidden layers…

    Submitted 6 April, 2018; originally announced April 2018.

    Comments: Under Review ICML 2018

  49. arXiv:1712.04120  [pdf, other]

    stat.ML cs.LG

    GibbsNet: Iterative Adversarial Inference for Deep Graphical Models

    Authors: Alex Lamb, Devon Hjelm, Yaroslav Ganin, Joseph Paul Cohen, Aaron Courville, Yoshua Bengio

    Abstract: Directed latent variable models that formulate the joint distribution as $p(x,z) = p(z) p(x \mid z)$ have the advantage of fast and exact sampling. However, these models have the weakness of needing to specify $p(z)$, often with a simple fixed prior that limits the expressiveness of the model. Undirected latent variable models discard the requirement that $p(z)$ be specified with a prior, yet samp…

    Submitted 11 December, 2017; originally announced December 2017.

    Comments: NIPS 2017

  50. arXiv:1711.04755  [pdf, other]

    stat.ML cs.LG

    ACtuAL: Actor-Critic Under Adversarial Learning

    Authors: Anirudh Goyal, Nan Rosemary Ke, Alex Lamb, R Devon Hjelm, Chris Pal, Joelle Pineau, Yoshua Bengio

    Abstract: Generative Adversarial Networks (GANs) are a powerful framework for deep generative modeling. Posed as a two-player minimax problem, GANs are typically trained end-to-end on real-valued data and can be used to train a generator of high-dimensional and realistic images. However, a major limitation of GANs is that training relies on passing gradients from the discriminator through the generator via…

    Submitted 13 November, 2017; originally announced November 2017.