Showing 1–18 of 18 results for author: Doan, T

  1. arXiv:2010.04003  [pdf, other]

    cs.LG cs.AI stat.ML

    A Theoretical Analysis of Catastrophic Forgetting through the NTK Overlap Matrix

    Authors: Thang Doan, Mehdi Bennani, Bogdan Mazoure, Guillaume Rabusseau, Pierre Alquier

    Abstract: Continual learning (CL) is a setting in which an agent has to learn from an incoming stream of data during its entire lifetime. Although major advances have been made in the field, one recurring problem which remains unsolved is that of Catastrophic Forgetting (CF). While the issue has been extensively studied empirically, little attention has been paid from a theoretical angle. In this paper, we…

    Submitted 25 February, 2021; v1 submitted 7 October, 2020; originally announced October 2020.

    Comments: Accepted to AISTATS 2021. Keywords: continual learning, catastrophic forgetting, NTK regime, orthogonal gradient descent

    Journal ref: Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021)

  2. arXiv:2006.13460  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Local Stochastic Approximation: A Unified View of Federated Learning and Distributed Multi-Task Reinforcement Learning Algorithms

    Authors: Thinh T. Doan

    Abstract: Motivated by broad applications in reinforcement learning and federated learning, we study local stochastic approximation over a network of agents, where their goal is to find the root of an operator composed of the local operators at the agents. Our focus is to characterize the finite-time performance of this method when the data at each agent are generated from Markov processes, and hence they a…

    Submitted 24 June, 2020; originally announced June 2020.

  3. arXiv:2006.11942  [pdf, other]

    stat.ML cs.LG

    Generalisation Guarantees for Continual Learning with Orthogonal Gradient Descent

    Authors: Mehdi Abbana Bennani, Thang Doan, Masashi Sugiyama

    Abstract: In Continual Learning settings, deep neural networks are prone to Catastrophic Forgetting. Orthogonal Gradient Descent was proposed to tackle the challenge. However, no theoretical guarantees have been proven yet. We present a theoretical framework to study Continual Learning algorithms in the Neural Tangent Kernel regime. This framework comprises closed form expression of the model through tasks…

    Submitted 4 December, 2020; v1 submitted 21 June, 2020; originally announced June 2020.

  4. arXiv:2006.07217  [pdf, other]

    cs.LG stat.ML

    Deep Reinforcement and InfoMax Learning

    Authors: Bogdan Mazoure, Remi Tachet des Combes, Thang Doan, Philip Bachman, R Devon Hjelm

    Abstract: We begin with the hypothesis that a model-free agent whose representations are predictive of properties of future states (beyond expected rewards) will be more capable of solving and adapting to new RL problems. To test that hypothesis, we introduce an objective based on Deep InfoMax (DIM) which trains the agent to predict the future by maximizing the mutual information between its internal repres…

    Submitted 16 November, 2020; v1 submitted 12 June, 2020; originally announced June 2020.

    Comments: NeurIPS 2020

  5. arXiv:2006.04338  [pdf, other]

    cs.LG stat.ML

    A Decentralized Policy Gradient Approach to Multi-task Reinforcement Learning

    Authors: Sihan Zeng, Aqeel Anwar, Thinh Doan, Arijit Raychowdhury, Justin Romberg

    Abstract: We develop a mathematical framework for solving multi-task reinforcement learning (MTRL) problems based on a type of policy gradient method. The goal in MTRL is to learn a common policy that operates effectively in different environments; these environments have similar (or overlapping) state spaces, but have different rewards and dynamics. We highlight two fundamental challenges in MTRL that are…

    Submitted 27 May, 2021; v1 submitted 7 June, 2020; originally announced June 2020.

  6. arXiv:2002.02863  [pdf, other]

    cs.LG stat.ML

    Representation of Reinforcement Learning Policies in Reproducing Kernel Hilbert Spaces

    Authors: Bogdan Mazoure, Thang Doan, Tianyu Li, Vladimir Makarenkov, Joelle Pineau, Doina Precup, Guillaume Rabusseau

    Abstract: We propose a general framework for policy representation for reinforcement learning tasks. This framework involves finding a low-dimensional embedding of the policy on a reproducing kernel Hilbert space (RKHS). The usage of RKHS based methods allows us to derive strong theoretical guarantees on the expected return of the reconstructed policy. Such guarantees are typically lacking in black-box mode…

    Submitted 15 October, 2020; v1 submitted 7 February, 2020; originally announced February 2020.

  7. arXiv:1912.10583  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Finite-Time Analysis and Restarting Scheme for Linear Two-Time-Scale Stochastic Approximation

    Authors: Thinh T. Doan

    Abstract: Motivated by their broad applications in reinforcement learning, we study the linear two-time-scale stochastic approximation, an iterative method using two different step sizes for finding the solutions of a system of two equations. Our main focus is to characterize the finite-time complexity of this method under time-varying step sizes and Markovian noise. In particular, we show that the mean squ…

    Submitted 9 January, 2020; v1 submitted 22 December, 2019; originally announced December 2019.

  8. arXiv:1909.07543  [pdf, other]

    cs.LG cs.AI cs.MA stat.ML

    Attraction-Repulsion Actor-Critic for Continuous Control Reinforcement Learning

    Authors: Thang Doan, Bogdan Mazoure, Moloud Abdar, Audrey Durand, Joelle Pineau, R Devon Hjelm

    Abstract: Continuous control tasks in reinforcement learning are important because they provide an important framework for learning in high-dimensional state spaces with deceptive rewards, where the agent can easily become trapped into suboptimal solutions. One way to avoid local optima is to use a population of agents to ensure coverage of the policy space, yet learning a population with the "best" coverag…

    Submitted 9 July, 2020; v1 submitted 16 September, 2019; originally announced September 2019.

  9. arXiv:1907.02998  [pdf, other]

    cs.LG cs.AI stat.ML

    Self-supervised Learning of Distance Functions for Goal-Conditioned Reinforcement Learning

    Authors: Srinivas Venkattaramanujam, Eric Crawford, Thang Doan, Doina Precup

    Abstract: Goal-conditioned policies are used in order to break down complex reinforcement learning (RL) problems by using subgoals, which can be defined either in state space or in a latent feature space. This can increase the efficiency of learning by using a curriculum, and also enables simultaneous learning and generalization across goals. A crucial requirement of goal-conditioned policies is to be able…

    Submitted 2 June, 2020; v1 submitted 5 July, 2019; originally announced July 2019.

    Comments: Preprint; Under Review (updated)

  10. arXiv:1905.06893  [pdf, other]

    cs.LG stat.ML

    Leveraging exploration in off-policy algorithms via normalizing flows

    Authors: Bogdan Mazoure, Thang Doan, Audrey Durand, R Devon Hjelm, Joelle Pineau

    Abstract: The ability to discover approximately optimal policies in domains with sparse rewards is crucial to applying reinforcement learning (RL) in many real-world scenarios. Approaches such as neural density models and continuous exploration (e.g., Go-Explore) have been proposed to maintain the high exploration rate necessary to find high performing and generalizable policies. Soft actor-critic (SAC) is a…

    Submitted 24 September, 2019; v1 submitted 16 May, 2019; originally announced May 2019.

    Comments: Accepted to 3rd Conference on Robot Learning (CoRL 2019); Keywords: Exploration, soft actor-critic, normalizing flow, off-policy; maximum entropy, reinforcement learning; deceptive reward; sparse reward; inverse autoregressive flow

  11. arXiv:1901.08680  [pdf, other]

    cs.LG stat.ML

    Multi-objective training of Generative Adversarial Networks with multiple discriminators

    Authors: Isabela Albuquerque, João Monteiro, Thang Doan, Breandan Considine, Tiago Falk, Ioannis Mitliagkas

    Abstract: Recent literature has demonstrated promising results for training Generative Adversarial Networks by employing a set of discriminators, in contrast to the traditional game involving one generator against a single adversary. Such methods perform single-objective optimization on some simple consolidation of the losses, e.g. an arithmetic average. In this work, we revisit the multiple-discriminator s…

    Submitted 24 January, 2019; originally announced January 2019.

    Comments: The first two authors contributed equally to this work

  12. arXiv:1901.05577  [pdf, other]

    cs.LG cs.AI stat.ML

    Generating Realistic Sequences of Customer-level Transactions for Retail Datasets

    Authors: Thang Doan, Neil Veira, Saibal Ray, Brian Keng

    Abstract: In order to better engage with customers, retailers rely on extensive customer and product databases which allows them to better understand customer behaviour and purchasing patterns. This has long been a challenging task as customer modelling is a multi-faceted, noisy and time-dependent problem. The most common way to tackle this problem is indirectly through task-specific supervised learning pre…

    Submitted 16 September, 2019; v1 submitted 16 January, 2019; originally announced January 2019.

    Comments: Published at IEEE ICDM Workshop on Data Mining for Services 2018

  13. arXiv:1811.02722  [pdf, ps, other]

    cs.LG stat.ML

    Scalable Bottom-up Subspace Clustering using FP-Trees for High Dimensional Data

    Authors: Minh Tuan Doan, Jianzhong Qi, Sutharshan Rajasegarar, Christopher Leckie

    Abstract: Subspace clustering aims to find groups of similar objects (clusters) that exist in lower dimensional subspaces from a high dimensional dataset. It has a wide range of applications, such as analysing high dimensional sensor data or DNA sequences. However, existing algorithms have limitations in finding clusters in non-disjoint subspaces and scaling to large data, which impinge their applicability…

    Submitted 6 November, 2018; originally announced November 2018.

    Comments: Accepted to IEEE International Conference on Big Data 2018

  14. arXiv:1808.00020  [pdf, other]

    cs.LG stat.ML

    On-line Adaptative Curriculum Learning for GANs

    Authors: Thang Doan, Joao Monteiro, Isabela Albuquerque, Bogdan Mazoure, Audrey Durand, Joelle Pineau, R Devon Hjelm

    Abstract: Generative Adversarial Networks (GANs) can successfully approximate a probability distribution and produce realistic samples. However, open questions such as sufficient convergence conditions and mode collapse still persist. In this paper, we build on existing work in the area by proposing a novel framework for training the generator against an ensemble of discriminator networks, which can be seen…

    Submitted 11 March, 2019; v1 submitted 31 July, 2018; originally announced August 2018.

    Comments: Accepted to the Thirty-Third AAAI Conference On Artificial Intelligence, 2019 (Added 128x128 CelebA samples to the end of the appendix)

    Journal ref: Proceedings of 33rd AAAI Conference on Artificial Intelligence (AAAI 2019)

  15. arXiv:1805.04874  [pdf, other]

    stat.ML cs.LG

    GAN Q-learning

    Authors: Thang Doan, Bogdan Mazoure, Clare Lyle

    Abstract: Distributional reinforcement learning (distributional RL) has seen empirical success in complex Markov Decision Processes (MDPs) in the setting of nonlinear function approximation. However, there are many different ways in which one can leverage the distributional approach to reinforcement learning. In this paper, we propose GAN Q-learning, a novel distributional RL method based on generative adve…

    Submitted 20 July, 2018; v1 submitted 13 May, 2018; originally announced May 2018.

  16. arXiv:1512.03308  [pdf, other]

    stat.ML

    Guaranteed inference in topic models

    Authors: Khoat Than, Tung Doan

    Abstract: One of the core problems in statistical models is the estimation of a posterior distribution. For topic models, the problem of posterior inference for individual texts is particularly important, especially when dealing with data streams, but is often intractable in the worst case. As a consequence, existing methods for posterior inference are approximate and do not have any guarantee on neither qu…

    Submitted 17 August, 2016; v1 submitted 10 December, 2015; originally announced December 2015.

  17. arXiv:1402.3014  [pdf, other]

    stat.AP

    Joint Inference of Misaligned Irregular Time Series with Application to Greenland Ice Core Data

    Authors: Thinh K. Doan, Andrew C. Parnell, John Haslett

    Abstract: Ice cores provide insight into the past climate over many millennia. Due to ice compaction, the raw data for any single core are irregular in time. Multiple cores have different irregularities; jointly these series are misaligned. After processing, such data are made available to researchers as regular time series: a data product. Typically, these cores are independently processed. In this paper,…

    Submitted 22 September, 2014; v1 submitted 12 February, 2014; originally announced February 2014.

    Comments: 14 pages, 8 figures

  18. arXiv:1206.5009  [pdf, other]

    stat.AP

    On Bayesian Modelling of the Uncertainties in Palaeoclimate Reconstruction

    Authors: Andrew C. Parnell, James Sweeney, Thinh K. Doan, Michael Salter-Townshend, Judy R. M. Allen, Brian Huntley, John Haslett

    Abstract: We outline a model and algorithm to perform inference on the palaeoclimate and palaeoclimate volatility from pollen proxy data. We use a novel multivariate non-linear non-Gaussian state space model consisting of an observation equation linking climate to proxy data and an evolution equation driving climate change over time. The link from climate to proxy data is defined by a pre-calibrated forward…

    Submitted 21 June, 2012; originally announced June 2012.

    Comments: 25 pages, 7 figures