Showing 1–31 of 31 results for author: Gelly, S

Searching in archive stat.

  1. arXiv:2010.14766  [pdf, other]

    cs.LG stat.ML

    A Sober Look at the Unsupervised Learning of Disentangled Representations and their Evaluation

    Authors: Francesco Locatello, Stefan Bauer, Mario Lucic, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf, Olivier Bachem

    Abstract: The idea behind the unsupervised learning of disentangled representations is that real-world data is generated by a few explanatory factors of variation which can be recovered by unsupervised learning algorithms. In this paper, we provide a sober look at recent progress in the field and challenge some common assumptions. We first theoretically show that the unsupervised learning of d…

    Submitted 27 October, 2020; originally announced October 2020.

    Comments: arXiv admin note: substantial text overlap with arXiv:1811.12359

    Journal ref: Journal of Machine Learning Research 2020, Volume 21, Number 209

  2. arXiv:2009.13239  [pdf, other]

    cs.LG cs.CV stat.ML

    Scalable Transfer Learning with Expert Models

    Authors: Joan Puigcerver, Carlos Riquelme, Basil Mustafa, Cedric Renggli, André Susano Pinto, Sylvain Gelly, Daniel Keysers, Neil Houlsby

    Abstract: Transfer of pre-trained representations can improve sample efficiency and reduce computational requirements for new tasks. However, representations used for transfer are usually generic, and are not tailored to a particular distribution of downstream tasks. We explore the use of expert representations for transfer with a simple, yet effective, strategy. We train a diverse set of experts by exploit…

    Submitted 28 September, 2020; originally announced September 2020.

  3. arXiv:2007.14184  [pdf, other]

    cs.LG cs.AI stat.ML

    A Commentary on the Unsupervised Learning of Disentangled Representations

    Authors: Francesco Locatello, Stefan Bauer, Mario Lucic, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf, Olivier Bachem

    Abstract: The goal of the unsupervised learning of disentangled representations is to separate the independent explanatory factors of variation in the data without access to supervision. In this paper, we summarize the results of Locatello et al., 2019, and focus on their implications for practitioners. We discuss the theoretical result showing that the unsupervised learning of disentangled representations…

    Submitted 28 July, 2020; originally announced July 2020.

    Journal ref: The Thirty-Fourth AAAI Conference on Artificial Intelligence 2020 (AAAI-20)

  4. arXiv:2006.10455  [pdf, other]

    stat.ML cs.LG

    What Do Neural Networks Learn When Trained With Random Labels?

    Authors: Hartmut Maennel, Ibrahim Alabdulmohsin, Ilya Tolstikhin, Robert J. N. Baldock, Olivier Bousquet, Sylvain Gelly, Daniel Keysers

    Abstract: We study deep neural networks (DNNs) trained on natural image data with entirely random labels. Despite its popularity in the literature, where it is often used to study memorization, generalization, and other phenomena, little is known about what DNNs learn in this setting. In this paper, we show analytically for convolutional and fully connected networks that an alignment between the principal c…

    Submitted 11 November, 2020; v1 submitted 18 June, 2020; originally announced June 2020.

    Comments: Accepted at NeurIPS 2020

  5. arXiv:2006.05990  [pdf, other]

    cs.LG stat.ML

    What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study

    Authors: Marcin Andrychowicz, Anton Raichuk, Piotr Stańczyk, Manu Orsini, Sertan Girgin, Raphael Marinier, Léonard Hussenot, Matthieu Geist, Olivier Pietquin, Marcin Michalski, Sylvain Gelly, Olivier Bachem

    Abstract: In recent years, on-policy reinforcement learning (RL) has been successfully applied to many different continuous control tasks. While RL algorithms are often conceptually simple, their state-of-the-art implementations take numerous low- and high-level design decisions that strongly affect the performance of the resulting agents. Those choices are usually not extensively discussed in the literatur…

    Submitted 10 June, 2020; originally announced June 2020.

  6. arXiv:2002.11448  [pdf, other]

    stat.ML cs.LG

    Predicting Neural Network Accuracy from Weights

    Authors: Thomas Unterthiner, Daniel Keysers, Sylvain Gelly, Olivier Bousquet, Ilya Tolstikhin

    Abstract: We show experimentally that the accuracy of a trained neural network can be predicted surprisingly well by looking only at its weights, without evaluating it on input data. We motivate this task and introduce a formal setting for it. Even when using simple statistics of the weights, the predictors are able to rank neural networks by their performance with very high accuracy (R2 score more than 0.9…

    Submitted 9 April, 2021; v1 submitted 26 February, 2020; originally announced February 2020.

    Comments: Updated the Small CNN Zoo dataset: reduced the maximal learning rate and got rid of multiple bad runs. Replaced all the experiments with the new numbers. Added MLP. Fixed typo in the abstract (R2 score instead of Kendall's tau). Added several earlier related works to the literature overview

  7. arXiv:2001.08049  [pdf, other]

    stat.ML cs.LG

    On Last-Layer Algorithms for Classification: Decoupling Representation from Uncertainty Estimation

    Authors: Nicolas Brosse, Carlos Riquelme, Alice Martin, Sylvain Gelly, Éric Moulines

    Abstract: Uncertainty quantification for deep learning is a challenging open problem. Bayesian statistics offer a mathematically grounded framework to reason about uncertainties; however, approximate posteriors for modern neural networks still require prohibitive computational costs. We propose a family of algorithms which split the classification task into two stages: representation learning and uncertaint…

    Submitted 22 January, 2020; originally announced January 2020.

  8. arXiv:1911.11357  [pdf, other]

    cs.LG cs.CV stat.ML

    Semantic Bottleneck Scene Generation

    Authors: Samaneh Azadi, Michael Tschannen, Eric Tzeng, Sylvain Gelly, Trevor Darrell, Mario Lucic

    Abstract: Coupling the high-fidelity generation capabilities of label-conditional image synthesis methods with the flexibility of unconditional generative models, we propose a semantic bottleneck GAN model for unconditional synthesis of complex scenes. We assume pixel-wise segmentation labels are available during training and use them to learn the scene structure. During inference, our model first synthesiz…

    Submitted 26 November, 2019; originally announced November 2019.

  9. arXiv:1910.04867  [pdf, other]

    cs.CV cs.LG stat.ML

    A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark

    Authors: Xiaohua Zhai, Joan Puigcerver, Alexander Kolesnikov, Pierre Ruyssen, Carlos Riquelme, Mario Lucic, Josip Djolonga, Andre Susano Pinto, Maxim Neumann, Alexey Dosovitskiy, Lucas Beyer, Olivier Bachem, Michael Tschannen, Marcin Michalski, Olivier Bousquet, Sylvain Gelly, Neil Houlsby

    Abstract: Representation learning promises to unlock deep learning for the long tail of vision tasks without expensive labelled datasets. Yet, the absence of a unified evaluation for general visual representations hinders progress. Popular protocols are often too constrained (linear classification), limited in diversity (ImageNet, CIFAR, Pascal-VOC), or only weakly related to representation quality (ELBO, r…

    Submitted 21 February, 2020; v1 submitted 1 October, 2019; originally announced October 2019.

  10. arXiv:1907.13625  [pdf, other]

    cs.LG stat.ML

    On Mutual Information Maximization for Representation Learning

    Authors: Michael Tschannen, Josip Djolonga, Paul K. Rubenstein, Sylvain Gelly, Mario Lucic

    Abstract: Many recent methods for unsupervised or self-supervised representation learning train feature extractors by maximizing an estimate of the mutual information (MI) between different views of the data. This comes with several immediate problems: For example, MI is notoriously hard to estimate, and using it as an objective for representation learning may lead to highly entangled representations due to…

    Submitted 23 January, 2020; v1 submitted 31 July, 2019; originally announced July 2019.

    Comments: ICLR 2020. Michael Tschannen and Josip Djolonga contributed equally

  11. arXiv:1907.11180  [pdf, other]

    cs.LG stat.ML

    Google Research Football: A Novel Reinforcement Learning Environment

    Authors: Karol Kurach, Anton Raichuk, Piotr Stańczyk, Michał Zając, Olivier Bachem, Lasse Espeholt, Carlos Riquelme, Damien Vincent, Marcin Michalski, Olivier Bousquet, Sylvain Gelly

    Abstract: Recent progress in the field of reinforcement learning has been accelerated by virtual learning environments such as video games, where novel algorithms and ideas can be quickly tested in a safe and reproducible manner. We introduce the Google Research Football Environment, a new reinforcement learning environment where agents are trained to play football in an advanced, physics-based 3D simulator…

    Submitted 14 April, 2020; v1 submitted 25 July, 2019; originally announced July 2019.

  12. arXiv:1907.00868  [pdf, other]

    cs.LG cs.AI stat.ML

    MULEX: Disentangling Exploitation from Exploration in Deep RL

    Authors: Lucas Beyer, Damien Vincent, Olivier Teboul, Sylvain Gelly, Matthieu Geist, Olivier Pietquin

    Abstract: An agent learning through interactions should balance its action selection process between probing the environment to discover new rewards and using the information acquired in the past to adopt useful behaviour. This trade-off is usually obtained by perturbing either the agent's actions (e.g., ε-greedy or Gibbs sampling) or the agent's parameters (e.g., NoisyNet), or by modifying the reward it re…

    Submitted 1 July, 2019; originally announced July 2019.

  13. arXiv:1906.07987  [pdf, other]

    cs.LG cs.AI stat.ML

    Adaptive Temporal-Difference Learning for Policy Evaluation with Per-State Uncertainty Estimates

    Authors: Hugo Penedones, Carlos Riquelme, Damien Vincent, Hartmut Maennel, Timothy Mann, Andre Barreto, Sylvain Gelly, Gergely Neu

    Abstract: We consider the core reinforcement-learning problem of on-policy value function approximation from a batch of trajectory data, and focus on various issues of Temporal Difference (TD) learning and Monte Carlo (MC) policy evaluation. The two methods are known to achieve complementary bias-variance trade-off properties, with TD tending to achieve lower variance but potentially higher bias. In this pa…

    Submitted 19 June, 2019; originally announced June 2019.

  14. arXiv:1905.11866  [pdf, ps, other]

    cs.LG stat.ML

    When can unlabeled data improve the learning rate?

    Authors: Christina Göpfert, Shai Ben-David, Olivier Bousquet, Sylvain Gelly, Ilya Tolstikhin, Ruth Urner

    Abstract: In semi-supervised classification, one is given access both to labeled and unlabeled data. As unlabeled data is typically cheaper to acquire than labeled data, this setup becomes advantageous as soon as one can exploit the unlabeled data in order to produce a better classifier than with labeled data alone. However, the conditions under which such an improvement is possible are not fully understood…

    Submitted 9 February, 2022; v1 submitted 28 May, 2019; originally announced May 2019.

    Comments: Small correction in proof of Theorem 1

    Journal ref: Proceedings of the Thirty-Second Conference on Learning Theory, PMLR 99:1500-1518, 2019

  15. arXiv:1905.10768  [pdf, other]

    cs.LG stat.ML

    Precision-Recall Curves Using Information Divergence Frontiers

    Authors: Josip Djolonga, Mario Lucic, Marco Cuturi, Olivier Bachem, Olivier Bousquet, Sylvain Gelly

    Abstract: Despite the tremendous progress in the estimation of generative models, the development of tools for diagnosing their failures and assessing their performance has advanced at a much slower pace. Recent developments have investigated metrics that quantify which parts of the true distribution are modeled well, and, on the contrary, what the model fails to capture, akin to precision and recall in info…

    Submitted 8 June, 2020; v1 submitted 26 May, 2019; originally announced May 2019.

    Comments: Updated to the AISTATS 2020 version

  16. arXiv:1903.02271  [pdf, other]

    cs.LG cs.CV stat.ML

    High-Fidelity Image Generation With Fewer Labels

    Authors: Mario Lucic, Michael Tschannen, Marvin Ritter, Xiaohua Zhai, Olivier Bachem, Sylvain Gelly

    Abstract: Deep generative models are becoming a cornerstone of modern machine learning. Recent work on conditional generative adversarial networks has shown that learning complex, high-dimensional distributions over natural images is within reach. While the latest models are able to generate high-fidelity, diverse natural images at high resolution, they rely on a vast quantity of labeled data. In this work…

    Submitted 14 May, 2019; v1 submitted 6 March, 2019; originally announced March 2019.

    Comments: Mario Lucic, Michael Tschannen, and Marvin Ritter contributed equally to this work. ICML 2019 camera-ready version. Code available at https://github.com/google/compare_gan

  17. arXiv:1902.08077  [pdf, other]

    cs.LG stat.ML

    Breaking the Softmax Bottleneck via Learnable Monotonic Pointwise Non-linearities

    Authors: Octavian-Eugen Ganea, Sylvain Gelly, Gary Bécigneul, Aliaksei Severyn

    Abstract: The Softmax function on top of a final linear layer is the de facto method to output probability distributions in neural networks. In many applications such as language models or text generation, this model has to produce distributions over large output vocabularies. Recently, this has been shown to have limited representational capacity due to its connection with the rank bottleneck in matrix fac…

    Submitted 13 May, 2019; v1 submitted 21 February, 2019; originally announced February 2019.

    Journal ref: ICML 2019

  18. arXiv:1902.00751  [pdf, other]

    cs.LG cs.CL stat.ML

    Parameter-Efficient Transfer Learning for NLP

    Authors: Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, Sylvain Gelly

    Abstract: Fine-tuning large pre-trained models is an effective transfer mechanism in NLP. However, in the presence of many downstream tasks, fine-tuning is parameter inefficient: an entire new model is required for every task. As an alternative, we propose transfer with adapter modules. Adapter modules yield a compact and extensible model; they add only a few trainable parameters per task, and new tasks can…

    Submitted 13 June, 2019; v1 submitted 2 February, 2019; originally announced February 2019.

  19. arXiv:1812.01717  [pdf, other]

    cs.CV cs.AI cs.LG cs.NE stat.ML

    Towards Accurate Generative Models of Video: A New Metric & Challenges

    Authors: Thomas Unterthiner, Sjoerd van Steenkiste, Karol Kurach, Raphael Marinier, Marcin Michalski, Sylvain Gelly

    Abstract: Recent advances in deep generative models have led to remarkable progress in synthesizing high quality images. Following their successful application in image processing and representation learning, an important next step is to consider videos. Learning generative models of video is a much harder task, requiring a model to capture the temporal dynamics of a scene, in addition to the visual presen…

    Submitted 27 March, 2019; v1 submitted 2 December, 2018; originally announced December 2018.

  20. arXiv:1811.12359  [pdf, other]

    cs.LG cs.AI stat.ML

    Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations

    Authors: Francesco Locatello, Stefan Bauer, Mario Lucic, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf, Olivier Bachem

    Abstract: The key idea behind the unsupervised learning of disentangled representations is that real-world data is generated by a few explanatory factors of variation which can be recovered by unsupervised learning algorithms. In this paper, we provide a sober look at recent progress in the field and challenge some common assumptions. We first theoretically show that the unsupervised learning of disentangle…

    Submitted 18 June, 2019; v1 submitted 29 November, 2018; originally announced November 2018.

    Journal ref: Proceedings of the 36th International Conference on Machine Learning (ICML 2019)

  21. arXiv:1810.02274  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO stat.ML

    Episodic Curiosity through Reachability

    Authors: Nikolay Savinov, Anton Raichuk, Raphaël Marinier, Damien Vincent, Marc Pollefeys, Timothy Lillicrap, Sylvain Gelly

    Abstract: Rewards are sparse in the real world and most of today's reinforcement learning algorithms struggle with such sparsity. One solution to this problem is to allow the agent to create rewards for itself - thus making rewards dense and more suitable for learning. In particular, inspired by curious behaviour in animals, observing something novel could be rewarded with a bonus. Such bonus is summed up w…

    Submitted 6 August, 2019; v1 submitted 4 October, 2018; originally announced October 2018.

    Comments: Accepted to ICLR 2019. Code at https://github.com/google-research/episodic-curiosity/. Videos at https://sites.google.com/view/episodic-curiosity/

  22. arXiv:1810.01365  [pdf, other]

    cs.LG cs.CV stat.ML

    On Self Modulation for Generative Adversarial Networks

    Authors: Ting Chen, Mario Lucic, Neil Houlsby, Sylvain Gelly

    Abstract: Training Generative Adversarial Networks (GANs) is notoriously challenging. We propose and study an architectural modification, self-modulation, which improves GAN performance across different data sets, architectures, losses, regularizers, and hyperparameter settings. Intuitively, self-modulation allows the intermediate feature maps of a generator to change as a function of the input noise vector…

    Submitted 2 May, 2019; v1 submitted 2 October, 2018; originally announced October 2018.

  23. arXiv:1807.04720  [pdf, other]

    cs.LG stat.ML

    A Large-Scale Study on Regularization and Normalization in GANs

    Authors: Karol Kurach, Mario Lucic, Xiaohua Zhai, Marcin Michalski, Sylvain Gelly

    Abstract: Generative adversarial networks (GANs) are a class of deep generative models which aim to learn a target distribution in an unsupervised fashion. While they were successfully applied to many problems, training a GAN is a notoriously challenging task and requires a significant amount of hyperparameter tuning, neural architecture engineering, and a non-trivial amount of "tricks". The success in many…

    Submitted 14 May, 2019; v1 submitted 12 July, 2018; originally announced July 2018.

    Comments: Revision accepted to ICML'19: More focus on regularization and normalization aspects. Added recent references and promising future directions

  24. arXiv:1807.03064  [pdf, other]

    cs.LG stat.ML

    Temporal Difference Learning with Neural Networks - Study of the Leakage Propagation Problem

    Authors: Hugo Penedones, Damien Vincent, Hartmut Maennel, Sylvain Gelly, Timothy Mann, Andre Barreto

    Abstract: Temporal-Difference learning (TD) [Sutton, 1988] with function approximation can converge to solutions that are worse than those obtained by Monte-Carlo regression, even in the simple case of on-policy evaluation. To increase our understanding of the problem, we investigate the issue of approximation errors in areas of sharp discontinuities of the value function being further propagated by bootstr…

    Submitted 9 July, 2018; originally announced July 2018.

  25. arXiv:1806.00035  [pdf, other]

    stat.ML cs.LG

    Assessing Generative Models via Precision and Recall

    Authors: Mehdi S. M. Sajjadi, Olivier Bachem, Mario Lucic, Olivier Bousquet, Sylvain Gelly

    Abstract: Recent advances in generative modeling have led to an increased interest in the study of statistical divergences as means of model comparison. Commonly used evaluation methods, such as the Frechet Inception Distance (FID), correlate well with the perceived quality of samples and are sensitive to mode dropping. However, these metrics are unable to distinguish between different failure cases since t…

    Submitted 28 October, 2018; v1 submitted 31 May, 2018; originally announced June 2018.

    Comments: NIPS 2018

  26. arXiv:1804.11130  [pdf, other]

    cs.LG cs.AI stat.ML

    Competitive Training of Mixtures of Independent Deep Generative Models

    Authors: Francesco Locatello, Damien Vincent, Ilya Tolstikhin, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf

    Abstract: A common assumption in causal modeling posits that the data is generated by a set of independent mechanisms, and algorithms should aim to recover this structure. Standard unsupervised learning, however, is often concerned with training a single model to capture the overall distribution or aspects thereof. Inspired by clustering approaches, we consider mixtures of implicit generative models that ``…

    Submitted 3 March, 2019; v1 submitted 30 April, 2018; originally announced April 2018.

  27. arXiv:1803.08367  [pdf, other]

    stat.ML cs.LG

    Gradient Descent Quantizes ReLU Network Features

    Authors: Hartmut Maennel, Olivier Bousquet, Sylvain Gelly

    Abstract: Deep neural networks are often trained in the over-parametrized regime (i.e. with far more parameters than training examples), and understanding why the training converges to solutions that generalize remains an open problem. Several studies have highlighted the fact that the training procedure, i.e. mini-batch Stochastic Gradient Descent (SGD), leads to solutions that have specific properties in t…

    Submitted 22 March, 2018; originally announced March 2018.

  28. arXiv:1711.10337  [pdf, other]

    stat.ML cs.LG

    Are GANs Created Equal? A Large-Scale Study

    Authors: Mario Lucic, Karol Kurach, Marcin Michalski, Sylvain Gelly, Olivier Bousquet

    Abstract: Generative adversarial networks (GAN) are a powerful subclass of generative models. Despite a very rich research activity leading to numerous interesting GAN algorithms, it is still very hard to assess which algorithm(s) perform better than others. We conduct a neutral, multi-faceted large-scale empirical study on state-of-the-art models and evaluation measures. We find that most models can reach…

    Submitted 29 October, 2018; v1 submitted 28 November, 2017; originally announced November 2017.

    Comments: NIPS'18: Added a section on the limitations of the study and additional empirical results

  29. arXiv:1711.01558  [pdf, other]

    stat.ML cs.LG

    Wasserstein Auto-Encoders

    Authors: Ilya Tolstikhin, Olivier Bousquet, Sylvain Gelly, Bernhard Schoelkopf

    Abstract: We propose the Wasserstein Auto-Encoder (WAE), a new algorithm for building a generative model of the data distribution. WAE minimizes a penalized form of the Wasserstein distance between the model distribution and the target distribution, which leads to a different regularizer than the one used by the Variational Auto-Encoder (VAE). This regularizer encourages the encoded training distribution t…

    Submitted 5 December, 2019; v1 submitted 5 November, 2017; originally announced November 2017.

    Comments: Published at ICLR 2018. Included a much wider hyperparameter sweep, resulting in significant improvements in FIDs on CelebA

  30. arXiv:1705.07642  [pdf, other]

    stat.ML

    From optimal transport to generative modeling: the VEGAN cookbook

    Authors: Olivier Bousquet, Sylvain Gelly, Ilya Tolstikhin, Carl-Johann Simon-Gabriel, Bernhard Schoelkopf

    Abstract: We study unsupervised generative modeling in terms of the optimal transport (OT) problem between the true (but unknown) data distribution $P_X$ and the latent variable model distribution $P_G$. We show that the OT problem can be equivalently written in terms of probabilistic encoders, which are constrained to match the posterior and prior distributions over the latent space. When relaxed, this constra…

    Submitted 22 May, 2017; originally announced May 2017.

  31. arXiv:1701.02386  [pdf, other]

    stat.ML cs.LG

    AdaGAN: Boosting Generative Models

    Authors: Ilya Tolstikhin, Sylvain Gelly, Olivier Bousquet, Carl-Johann Simon-Gabriel, Bernhard Schölkopf

    Abstract: Generative Adversarial Networks (GAN) (Goodfellow et al., 2014) are an effective method for training generative models of complex data such as natural images. However, they are notoriously hard to train and can suffer from the problem of missing modes where the model is not able to produce examples in certain regions of the space. We propose an iterative procedure, called AdaGAN, where at every st…

    Submitted 24 May, 2017; v1 submitted 9 January, 2017; originally announced January 2017.

    Comments: Updated with MNIST pictures and discussions + Unrolled GAN experiments