Showing 1–11 of 11 results for author: Tachet, R

  1. arXiv:2202.07496  [pdf, other]

    cs.LG cs.AI math.OC stat.ML

    Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms

    Authors: Romain Laroche, Remi Tachet

    Abstract: In Reinforcement Learning, the optimal action at a given state is dependent on policy decisions at subsequent states. As a consequence, the learning targets evolve with time and the policy optimization process must be efficient at unlearning what it previously learnt. In this paper, we discover that the policy gradient theorem prescribes policy updates that are slow to unlearn because of their str…

    Submitted 15 February, 2022; originally announced February 2022.

    Comments: 9p+appendix, accepted to AISTATS 2022

  2. arXiv:2202.06828  [pdf, other]

    cs.LG

    On the Convergence of SARSA with Linear Function Approximation

    Authors: Shangtong Zhang, Remi Tachet, Romain Laroche

    Abstract: SARSA, a classical on-policy control algorithm for reinforcement learning, is known to chatter when combined with linear function approximation: SARSA does not diverge but oscillates in a bounded region. However, little is known about how fast SARSA converges to that region and how large the region is. In this paper, we make progress towards this open problem by showing the convergence rate of pro…

    Submitted 12 May, 2023; v1 submitted 14 February, 2022; originally announced February 2022.

    Comments: ICML 2023

  3. arXiv:2111.02997  [pdf, other]

    cs.LG

    Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch

    Authors: Shangtong Zhang, Remi Tachet, Romain Laroche

    Abstract: In this paper, we establish the global optimality and convergence rate of an off-policy actor critic algorithm in the tabular setting without using density ratio to correct the discrepancy between the state distribution of the behavior policy and that of the target policy. Our work goes beyond existing works on the optimality of policy gradient methods in that existing works use the exact policy g…

    Submitted 24 October, 2022; v1 submitted 4 November, 2021; originally announced November 2021.

    Comments: Journal of Machine Learning Research 2022

  4. arXiv:2109.14727  [pdf, other]

    cs.LG cs.AI

    Dr Jekyll and Mr Hyde: the Strange Case of Off-Policy Policy Updates

    Authors: Romain Laroche, Remi Tachet

    Abstract: The policy gradient theorem states that the policy should only be updated in states that are visited by the current policy, which leads to insufficient planning in the off-policy states, and thus to convergence to suboptimal policies. We tackle this planning issue by extending the policy gradient theory to policy updates with respect to any state density. Under these generalized policy updates, we…

    Submitted 29 September, 2021; originally announced September 2021.

    Comments: accepted to NeurIPS as a poster

  5. arXiv:2106.13401  [pdf, other]

    cs.LG cs.AI

    Decomposed Mutual Information Estimation for Contrastive Representation Learning

    Authors: Alessandro Sordoni, Nouha Dziri, Hannes Schulz, Geoff Gordon, Phil Bachman, Remi Tachet

    Abstract: Recent contrastive representation learning methods rely on estimating mutual information (MI) between multiple views of an underlying context. E.g., we can derive multiple views of a given image by applying data augmentation, or we can split a sequence into views comprising the past and future of some step in the sequence. Contrastive lower bounds on MI are easy to optimize, but have a strong unde…

    Submitted 24 June, 2021; originally announced June 2021.

    Comments: ICML 2021

  6. arXiv:2003.04475  [pdf, other]

    cs.LG cs.AI stat.ML

    Domain Adaptation with Conditional Distribution Matching and Generalized Label Shift

    Authors: Remi Tachet, Han Zhao, Yu-Xiang Wang, Geoff Gordon

    Abstract: Adversarial learning has demonstrated good performance in the unsupervised domain adaptation setting, by learning domain-invariant representations. However, recent work has shown limitations of this approach when label distributions differ between the source and target domains. In this paper, we propose a new assumption, generalized label shift ($GLS$), to improve robustness against mismatched lab…

    Submitted 11 December, 2020; v1 submitted 9 March, 2020; originally announced March 2020.

    Comments: Appeared in NeurIPS 2020

  7. arXiv:2002.10948  [pdf, other]

    q-bio.NC cs.AI cs.LG eess.SY

    Reinforcement Learning Framework for Deep Brain Stimulation Study

    Authors: Dmitrii Krylov, Remi Tachet, Romain Laroche, Michael Rosenblum, Dmitry V. Dylov

    Abstract: Malfunctioning neurons in the brain sometimes operate synchronously, reportedly causing many neurological diseases, e.g. Parkinson's. Suppression and control of this collective synchronous activity are therefore of great importance for neuroscience, and can only rely on limited engineering trials due to the need to experiment with live human brains. We present the first Reinforcement Learning gym…

    Submitted 22 February, 2020; originally announced February 2020.

    Comments: 7 pages + 1 references, 7 figures. arXiv admin note: text overlap with arXiv:1909.12154

    Journal ref: IJCAI 2020, pp. 2847-2854

  8. arXiv:1911.03861  [pdf, other]

    cs.CL cs.LG

    Increasing Robustness to Spurious Correlations using Forgettable Examples

    Authors: Yadollah Yaghoobzadeh, Soroush Mehri, Remi Tachet, T. J. Hazen, Alessandro Sordoni

    Abstract: Neural NLP models tend to rely on spurious correlations between labels and input features to perform their tasks. Minority examples, i.e., examples that contradict the spurious correlations present in the majority of data points, have been shown to increase the out-of-distribution generalization of pre-trained language models. In this paper, we first propose using example forgetting to find minori…

    Submitted 1 February, 2021; v1 submitted 10 November, 2019; originally announced November 2019.

    Comments: 14 pages, Accepted at EACL2021

  9. arXiv:1809.06848  [pdf, other]

    cs.LG cs.AI stat.ML

    On the Learning Dynamics of Deep Neural Networks

    Authors: Remi Tachet, Mohammad Pezeshki, Samira Shabanian, Aaron Courville, Yoshua Bengio

    Abstract: While a lot of progress has been made in recent years, the dynamics of learning in deep nonlinear neural networks remain to this day largely misunderstood. In this work, we study the case of binary classification and prove various properties of learning in such networks under strong assumptions such as linear separability of the data. Extending existing results from the linear case, we confirm emp…

    Submitted 11 December, 2020; v1 submitted 18 September, 2018; originally announced September 2018.

    Comments: 19 pages, 7 figures

  10. arXiv:1809.02591  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning Invariances for Policy Generalization

    Authors: Remi Tachet, Philip Bachman, Harm van Seijen

    Abstract: While recent progress has spawned very powerful machine learning systems, those agents remain extremely specialized and fail to transfer the knowledge they gain to similar yet unseen tasks. In this paper, we study a simple reinforcement learning problem and focus on learning policies that encode the proper invariances for generalization to different settings. We evaluate three potential methods fo…

    Submitted 12 December, 2020; v1 submitted 7 September, 2018; originally announced September 2018.

    Comments: 7 pages, 1 figure

  11. Estimating savings in parking demand using shared vehicles for home-work commuting

    Authors: Dániel Kondor, Hongmou Zhang, Remi Tachet, Paolo Santi, Carlo Ratti

    Abstract: The increasing availability and adoption of shared vehicles as an alternative to personally-owned cars presents ample opportunities for achieving more efficient transportation in cities. With private cars spending on the average over 95% of the time parked, one of the possible benefits of shared mobility is the reduced need for parking space. While widely discussed, a systematic quantification of…

    Submitted 21 October, 2018; v1 submitted 13 October, 2017; originally announced October 2017.

    Comments: IEEE Transactions on Intelligent Transportation Systems, 2018