Showing 1–36 of 36 results for author: Jenatton, R

Searching in archive cs.
  1. arXiv:2310.06600  [pdf, other]

    cs.LG cs.CV

    Pi-DUAL: Using Privileged Information to Distinguish Clean from Noisy Labels

    Authors: Ke Wang, Guillermo Ortiz-Jimenez, Rodolphe Jenatton, Mark Collier, Efi Kokiopoulou, Pascal Frossard

    Abstract: Label noise is a pervasive problem in deep learning that often compromises the generalization performance of trained models. Recently, leveraging privileged information (PI) -- information available only during training but not at test time -- has emerged as an effective approach to mitigate this issue. Yet, existing PI-based methods have failed to consistently outperform their no-PI counterparts…

    Submitted 28 May, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: Accepted ICML 2024

  2. arXiv:2305.16999  [pdf, other]

    cs.CV cs.AI cs.LG

    Three Towers: Flexible Contrastive Learning with Pretrained Image Models

    Authors: Jannik Kossen, Mark Collier, Basil Mustafa, Xiao Wang, Xiaohua Zhai, Lucas Beyer, Andreas Steiner, Jesse Berent, Rodolphe Jenatton, Efi Kokiopoulou

    Abstract: We introduce Three Towers (3T), a flexible method to improve the contrastive learning of vision-language models by incorporating pretrained image classifiers. While contrastive models are usually trained from scratch, LiT (Zhai et al., 2022) has recently shown performance gains from using pretrained classifier embeddings. However, LiT directly replaces the image tower with the frozen embeddings, e…

    Submitted 30 October, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: Accepted for publication at NeurIPS 2023

  3. arXiv:2303.01806  [pdf, other]

    cs.LG cs.CV

    When does Privileged Information Explain Away Label Noise?

    Authors: Guillermo Ortiz-Jimenez, Mark Collier, Anant Nawalgaria, Alexander D'Amour, Jesse Berent, Rodolphe Jenatton, Effrosyni Kokiopoulou

    Abstract: Leveraging privileged information (PI), or features available during training but not at test time, has recently been shown to be an effective method for addressing label noise. However, the reasons for its effectiveness are not well understood. In this study, we investigate the role played by different properties of the PI in explaining away label noise. Through experiments on multiple datasets w…

    Submitted 1 June, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

    Comments: Accepted ICML 2023, Honolulu

  4. arXiv:2302.05442  [pdf, other]

    cs.CV cs.AI cs.LG

    Scaling Vision Transformers to 22 Billion Parameters

    Authors: Mostafa Dehghani, Josip Djolonga, Basil Mustafa, Piotr Padlewski, Jonathan Heek, Justin Gilmer, Andreas Steiner, Mathilde Caron, Robert Geirhos, Ibrahim Alabdulmohsin, Rodolphe Jenatton, Lucas Beyer, Michael Tschannen, Anurag Arnab, Xiao Wang, Carlos Riquelme, Matthias Minderer, Joan Puigcerver, Utku Evci, Manoj Kumar, Sjoerd van Steenkiste, Gamaleldin F. Elsayed, Aravindh Mahendran, Fisher Yu, Avital Oliver, et al. (17 additional authors not shown)

    Abstract: The scaling of Transformers has driven breakthrough capabilities for language models. At present, the largest large language models (LLMs) contain upwards of 100B parameters. Vision Transformers (ViT) have introduced the same architecture to image and video modelling, but these have not yet been successfully scaled to nearly the same degree; the largest dense ViT contains 4B parameters (Chen et al…

    Submitted 10 February, 2023; originally announced February 2023.

  5. arXiv:2301.12860  [pdf, other]

    cs.LG stat.ML

    Massively Scaling Heteroscedastic Classifiers

    Authors: Mark Collier, Rodolphe Jenatton, Basil Mustafa, Neil Houlsby, Jesse Berent, Effrosyni Kokiopoulou

    Abstract: Heteroscedastic classifiers, which learn a multivariate Gaussian distribution over prediction logits, have been shown to perform well on image classification problems with hundreds to thousands of classes. However, compared to standard classifiers, they introduce extra parameters that scale linearly with the number of classes. This makes them infeasible to apply to larger-scale problems. In additi…

    Submitted 30 January, 2023; originally announced January 2023.

    Comments: Accepted to ICLR 2023

  6. arXiv:2210.10253  [pdf, other]

    cs.LG cs.AI cs.CR cs.CV

    On the Adversarial Robustness of Mixture of Experts

    Authors: Joan Puigcerver, Rodolphe Jenatton, Carlos Riquelme, Pranjal Awasthi, Srinadh Bhojanapalli

    Abstract: Adversarial robustness is a key desirable property of neural networks. It has been empirically shown to be affected by their sizes, with larger networks being typically more robust. Recently, Bubeck and Sellke proved a lower bound on the Lipschitz constant of functions that fit the training data in terms of their number of parameters. This raises an interesting open question, do -- and can -- func…

    Submitted 18 October, 2022; originally announced October 2022.

    Comments: Accepted to NeurIPS 2022

  7. arXiv:2207.07411  [pdf, other]

    cs.LG stat.ML

    Plex: Towards Reliability using Pretrained Large Model Extensions

    Authors: Dustin Tran, Jeremiah Liu, Michael W. Dusenberry, Du Phan, Mark Collier, Jie Ren, Kehang Han, Zi Wang, Zelda Mariet, Huiyi Hu, Neil Band, Tim G. J. Rudner, Karan Singhal, Zachary Nado, Joost van Amersfoort, Andreas Kirsch, Rodolphe Jenatton, Nithum Thain, Honglin Yuan, Kelly Buchanan, Kevin Murphy, D. Sculley, Yarin Gal, Zoubin Ghahramani, Jasper Snoek, et al. (1 additional author not shown)

    Abstract: A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures. Probing these models' abilities in diverse ways is therefore critical to the field. In this paper, we explore the reliability of models, where we define a reliable model as one that not only achieves strong predictive per…

    Submitted 15 July, 2022; originally announced July 2022.

    Comments: Code available at https://goo.gle/plex-code

  8. arXiv:2206.02770  [pdf, other]

    cs.CV

    Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts

    Authors: Basil Mustafa, Carlos Riquelme, Joan Puigcerver, Rodolphe Jenatton, Neil Houlsby

    Abstract: Large sparsely-activated models have obtained excellent performance in multiple domains. However, such models are typically trained on a single modality at a time. We present the Language-Image MoE, LIMoE, a sparse mixture of experts model capable of multimodal learning. LIMoE accepts both images and text simultaneously, while being trained using a contrastive loss. MoEs are a natural fit for a mu…

    Submitted 6 June, 2022; originally announced June 2022.

  9. arXiv:2202.09244  [pdf, other]

    cs.LG

    Transfer and Marginalize: Explaining Away Label Noise with Privileged Information

    Authors: Mark Collier, Rodolphe Jenatton, Efi Kokiopoulou, Jesse Berent

    Abstract: Supervised learning datasets often have privileged information, in the form of features which are available at training time but are not available at test time e.g. the ID of the annotator that provided the label. We argue that privileged information is useful for explaining away label noise, thereby reducing the harmful impact of noisy labels. We develop a simple and efficient method for supervis…

    Submitted 15 June, 2022; v1 submitted 18 February, 2022; originally announced February 2022.

    Comments: Accepted at ICML 2022, Baltimore

  10. arXiv:2112.08250  [pdf, other]

    cs.LG

    Predicting the utility of search spaces for black-box optimization: a simple, budget-aware approach

    Authors: Setareh Ariafar, Justin Gilmer, Zachary Nado, Jasper Snoek, Rodolphe Jenatton, George E. Dahl

    Abstract: Black box optimization requires specifying a search space to explore for solutions, e.g. a d-dimensional compact space, and this choice is critical for getting the best results at a reasonable budget. Unfortunately, determining a high quality search space can be challenging in many applications. For example, when tuning hyperparameters for machine learning pipelines on a new problem given a limite…

    Submitted 16 December, 2021; v1 submitted 15 December, 2021; originally announced December 2021.

  11. arXiv:2110.03360  [pdf, other]

    cs.LG cs.CV stat.ML

    Sparse MoEs meet Efficient Ensembles

    Authors: James Urquhart Allingham, Florian Wenzel, Zelda E Mariet, Basil Mustafa, Joan Puigcerver, Neil Houlsby, Ghassen Jerfel, Vincent Fortuin, Balaji Lakshminarayanan, Jasper Snoek, Dustin Tran, Carlos Riquelme Ruiz, Rodolphe Jenatton

    Abstract: Machine learning models based on the aggregated outputs of submodels, either at the activation or prediction levels, often exhibit strong performance compared to individual models. We study the interplay of two popular classes of such models: ensembles of neural networks and sparse mixture of experts (sparse MoEs). First, we show that the two approaches have complementary features whose combinatio…

    Submitted 9 July, 2023; v1 submitted 7 October, 2021; originally announced October 2021.

    Comments: 59 pages, 26 figures, 36 tables. Accepted at TMLR

  12. arXiv:2110.02609  [pdf, other]

    stat.ML cs.LG

    Deep Classifiers with Label Noise Modeling and Distance Awareness

    Authors: Vincent Fortuin, Mark Collier, Florian Wenzel, James Allingham, Jeremiah Liu, Dustin Tran, Balaji Lakshminarayanan, Jesse Berent, Rodolphe Jenatton, Effrosyni Kokiopoulou

    Abstract: Uncertainty estimation in deep learning has recently emerged as a crucial area of interest to advance reliability and robustness in safety-critical applications. While there have been many proposed methods that either focus on distance-aware model uncertainties for out-of-distribution detection or on input-dependent label uncertainties for in-distribution calibration, both of these types of uncert…

    Submitted 8 August, 2022; v1 submitted 6 October, 2021; originally announced October 2021.

    Comments: Published in TMLR

  13. arXiv:2106.05974  [pdf, other]

    cs.CV cs.LG stat.ML

    Scaling Vision with Sparse Mixture of Experts

    Authors: Carlos Riquelme, Joan Puigcerver, Basil Mustafa, Maxim Neumann, Rodolphe Jenatton, André Susano Pinto, Daniel Keysers, Neil Houlsby

    Abstract: Sparsely-gated Mixture of Experts networks (MoEs) have demonstrated excellent scalability in Natural Language Processing. In Computer Vision, however, almost all performant networks are "dense", that is, every input is processed by every parameter. We present a Vision MoE (V-MoE), a sparse version of the Vision Transformer, that is scalable and competitive with the largest dense networks. When app…

    Submitted 10 June, 2021; originally announced June 2021.

    Comments: 44 pages, 38 figures

  14. arXiv:2106.04015  [pdf, other]

    cs.LG

    Uncertainty Baselines: Benchmarks for Uncertainty & Robustness in Deep Learning

    Authors: Zachary Nado, Neil Band, Mark Collier, Josip Djolonga, Michael W. Dusenberry, Sebastian Farquhar, Qixuan Feng, Angelos Filos, Marton Havasi, Rodolphe Jenatton, Ghassen Jerfel, Jeremiah Liu, Zelda Mariet, Jeremy Nixon, Shreyas Padhy, Jie Ren, Tim G. J. Rudner, Faris Sbahi, Yeming Wen, Florian Wenzel, Kevin Murphy, D. Sculley, Balaji Lakshminarayanan, Jasper Snoek, Yarin Gal, et al. (1 additional author not shown)

    Abstract: High-quality estimates of uncertainty and robustness are crucial for numerous real-world applications, especially for deep learning which underlies many deployed ML systems. The ability to compare techniques for improving these estimates is therefore very important for research and practice alike. Yet, competitive comparisons of methods are often lacking due to a range of reasons, including: compu…

    Submitted 5 January, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

  15. arXiv:2105.10305  [pdf, other]

    cs.LG cs.CV stat.ML

    Correlated Input-Dependent Label Noise in Large-Scale Image Classification

    Authors: Mark Collier, Basil Mustafa, Efi Kokiopoulou, Rodolphe Jenatton, Jesse Berent

    Abstract: Large scale image classification datasets often contain noisy labels. We take a principled probabilistic approach to modelling input-dependent, also known as heteroscedastic, label noise in these datasets. We place a multivariate Normal distributed latent variable on the final hidden layer of a neural network classifier. The covariance matrix of this latent variable models the aleatoric uncertain…

    Submitted 19 May, 2021; originally announced May 2021.

    Comments: Accepted as Oral at CVPR 2021

  16. arXiv:2012.08489  [pdf, other]

    cs.LG cs.AI stat.ML

    Amazon SageMaker Automatic Model Tuning: Scalable Gradient-Free Optimization

    Authors: Valerio Perrone, Huibin Shen, Aida Zolic, Iaroslav Shcherbatyi, Amr Ahmed, Tanya Bansal, Michele Donini, Fela Winkelmolen, Rodolphe Jenatton, Jean Baptiste Faddoul, Barbara Pogorzelska, Miroslav Miladinovic, Krishnaram Kenthapadi, Matthias Seeger, Cédric Archambeau

    Abstract: Tuning complex machine learning systems is challenging. Machine learning typically requires setting hyperparameters, be it regularization, architecture, or optimization parameters, whose tuning is critical to achieve good predictive performance. To democratize access to machine learning systems, it is essential to automate the tuning. This paper presents Amazon SageMaker Automatic Model Tuning (AMT…

    Submitted 18 June, 2021; v1 submitted 15 December, 2020; originally announced December 2020.

  17. arXiv:2012.08483  [pdf, other]

    cs.LG

    Amazon SageMaker Autopilot: a white box AutoML solution at scale

    Authors: Piali Das, Valerio Perrone, Nikita Ivkin, Tanya Bansal, Zohar Karnin, Huibin Shen, Iaroslav Shcherbatyi, Yotam Elor, Wilton Wu, Aida Zolic, Thibaut Lienart, Alex Tang, Amr Ahmed, Jean Baptiste Faddoul, Rodolphe Jenatton, Fela Winkelmolen, Philip Gautier, Leo Dirac, Andre Perunicic, Miroslav Miladinovic, Giovanni Zappella, Cédric Archambeau, Matthias Seeger, Bhaskar Dutt, Laurence Rouesnel

    Abstract: AutoML systems provide a black-box solution to machine learning problems by selecting the right way of processing features, choosing an algorithm and tuning the hyperparameters of the entire pipeline. Although these systems perform well on many datasets, there is still a non-negligible number of datasets for which the one-shot solution produced by each particular system would provide sub-par perfo…

    Submitted 16 December, 2020; v1 submitted 15 December, 2020; originally announced December 2020.

  18. arXiv:2010.06610  [pdf, other]

    cs.LG cs.CV stat.ML

    Training independent subnetworks for robust prediction

    Authors: Marton Havasi, Rodolphe Jenatton, Stanislav Fort, Jeremiah Zhe Liu, Jasper Snoek, Balaji Lakshminarayanan, Andrew M. Dai, Dustin Tran

    Abstract: Recent approaches to efficiently ensemble neural networks have shown that strong robustness and uncertainty performance can be achieved with a negligible gain in parameters over the original network. However, these methods still require multiple forward passes for prediction, leading to a significant computational cost. In this work, we show a surprising result: the benefits of using multiple pred…

    Submitted 4 August, 2021; v1 submitted 13 October, 2020; originally announced October 2020.

    Comments: Updated to the ICLR camera ready version, added reference to Soflaei et al. 2020

  19. arXiv:2006.13570  [pdf, other]

    cs.LG stat.ML

    Hyperparameter Ensembles for Robustness and Uncertainty Quantification

    Authors: Florian Wenzel, Jasper Snoek, Dustin Tran, Rodolphe Jenatton

    Abstract: Ensembles over neural network weights trained from different random initialization, known as deep ensembles, achieve state-of-the-art accuracy and calibration. The recently introduced batch ensembles provide a drop-in replacement that is more parameter efficient. In this paper, we design ensembles not only over weights, but over hyperparameters to improve the state of the art in both settings. For…

    Submitted 8 January, 2021; v1 submitted 24 June, 2020; originally announced June 2020.

    Comments: Accepted at NeurIPS 2020

  20. arXiv:2006.06049  [pdf, other]

    cs.LG stat.ML

    On Mixup Regularization

    Authors: Luigi Carratino, Moustapha Cissé, Rodolphe Jenatton, Jean-Philippe Vert

    Abstract: Mixup is a data augmentation technique that creates new examples as convex combinations of training points and labels. This simple technique has been empirically shown to improve the accuracy of many state-of-the-art models in different settings and applications, but the reasons behind this empirical success remain poorly understood. In this paper we take a substantial step in explaining the theoretica…

    Submitted 17 October, 2022; v1 submitted 10 June, 2020; originally announced June 2020.

  21. arXiv:2003.06778  [pdf, other]

    cs.LG stat.ML

    A Simple Probabilistic Method for Deep Classification under Input-Dependent Label Noise

    Authors: Mark Collier, Basil Mustafa, Efi Kokiopoulou, Rodolphe Jenatton, Jesse Berent

    Abstract: Datasets with noisy labels are a common occurrence in practical applications of classification methods. We propose a simple probabilistic method for training deep classifiers under input-dependent (heteroscedastic) label noise. We assume an underlying heteroscedastic generative process for noisy labels. To make gradient based training feasible we use a temperature parameterized softmax as a smooth…

    Submitted 12 November, 2020; v1 submitted 15 March, 2020; originally announced March 2020.

  22. arXiv:2002.02655  [pdf, other]

    cs.LG stat.ML

    The k-tied Normal Distribution: A Compact Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks

    Authors: Jakub Swiatkowski, Kevin Roth, Bastiaan S. Veeling, Linh Tran, Joshua V. Dillon, Jasper Snoek, Stephan Mandt, Tim Salimans, Rodolphe Jenatton, Sebastian Nowozin

    Abstract: Variational Bayesian Inference is a popular methodology for approximating posterior distributions over Bayesian neural network weights. Recent work developing this class of methods has explored ever richer parameterizations of the approximate posterior in the hope of improving performance. In contrast, here we share a curious experimental finding that suggests instead restricting the variational d…

    Submitted 5 July, 2020; v1 submitted 7 February, 2020; originally announced February 2020.

  23. arXiv:2002.02405  [pdf, other]

    stat.ML cs.LG stat.CO

    How Good is the Bayes Posterior in Deep Neural Networks Really?

    Authors: Florian Wenzel, Kevin Roth, Bastiaan S. Veeling, Jakub Świątkowski, Linh Tran, Stephan Mandt, Jasper Snoek, Tim Salimans, Rodolphe Jenatton, Sebastian Nowozin

    Abstract: During the past five years the Bayesian deep learning community has developed increasingly accurate and efficient approximate inference procedures that allow for Bayesian inference in deep neural networks. However, despite this algorithmic progress and the promise of improved uncertainty quantification and sample efficiency there are---as of early 2020---no publicized deployments of Bayesian neura…

    Submitted 2 July, 2020; v1 submitted 6 February, 2020; originally announced February 2020.

    Comments: Full version (main paper and appendix) of the ICML 2020 publication

  24. arXiv:2001.04694  [pdf, other]

    cs.LG stat.ML

    Hydra: Preserving Ensemble Diversity for Model Distillation

    Authors: Linh Tran, Bastiaan S. Veeling, Kevin Roth, Jakub Swiatkowski, Joshua V. Dillon, Jasper Snoek, Stephan Mandt, Tim Salimans, Sebastian Nowozin, Rodolphe Jenatton

    Abstract: Ensembles of models have been empirically shown to improve predictive performance and to yield robust measures of uncertainty. However, they are expensive in computation and memory. Therefore, recent research has focused on distilling ensembles into a single compact model, reducing the computational and memory burden of the ensemble while trying to preserve its predictive behavior. Most existing d…

    Submitted 19 March, 2021; v1 submitted 14 January, 2020; originally announced January 2020.

    Comments: Accepted to ICML 2020 Workshop on Uncertainty and Robustness in Deep Learning

  25. arXiv:1910.07003  [pdf, other]

    stat.ML cs.LG

    Constrained Bayesian Optimization with Max-Value Entropy Search

    Authors: Valerio Perrone, Iaroslav Shcherbatyi, Rodolphe Jenatton, Cedric Archambeau, Matthias Seeger

    Abstract: Bayesian optimization (BO) is a model-based approach to sequentially optimize expensive black-box functions, such as the validation error of a deep neural network with respect to its hyperparameters. In many real-world scenarios, the optimization is further subject to a priori unknown constraints. For example, training a deep network configuration may fail with an out-of-memory error when the mode…

    Submitted 15 October, 2019; originally announced October 2019.

  26. arXiv:1909.12552  [pdf, other]

    stat.ML cs.LG

    Learning search spaces for Bayesian optimization: Another view of hyperparameter transfer learning

    Authors: Valerio Perrone, Huibin Shen, Matthias Seeger, Cedric Archambeau, Rodolphe Jenatton

    Abstract: Bayesian optimization (BO) is a successful methodology to optimize black-box functions that are expensive to evaluate. While traditional methods optimize each black-box function in isolation, there has been recent interest in speeding up BO by transferring knowledge across multiple related black-box functions. In this work, we introduce a method to automatically design the BO search space by relyi…

    Submitted 27 September, 2019; originally announced September 2019.

  27. arXiv:1602.05394  [pdf, other]

    stat.ML cs.LG math.OC math.ST

    Online optimization and regret guarantees for non-additive long-term constraints

    Authors: Rodolphe Jenatton, Jim Huang, Dominik Csiba, Cedric Archambeau

    Abstract: We consider online optimization in the 1-lookahead setting, where the objective does not decompose additively over the rounds of the online game. The resulting formulation enables us to deal with non-stationary and/or long-term constraints, which arise, for example, in online display advertising problems. We propose an on-line primal-dual algorithm for which we obtain dynamic cumulative regret gu…

    Submitted 8 June, 2016; v1 submitted 17 February, 2016; originally announced February 2016.

  28. arXiv:1512.07422  [pdf, other]

    stat.ML cs.LG math.OC

    Adaptive Algorithms for Online Convex Optimization with Long-term Constraints

    Authors: Rodolphe Jenatton, Jim Huang, Cédric Archambeau

    Abstract: We present an adaptive online gradient descent algorithm to solve online convex optimization problems with long-term constraints, which are constraints that need to be satisfied when accumulated over a finite number of rounds T, but can be violated in intermediate rounds. For some user-defined trade-off parameter $\beta \in (0, 1)$, the proposed algorithm achieves cumulative regret bounds of O(T^m…

    Submitted 23 December, 2015; originally announced December 2015.

  29. arXiv:1407.5155  [pdf, ps, other]

    cs.LG stat.ML

    Sparse and spurious: dictionary learning with noise and outliers

    Authors: Rémi Gribonval, Rodolphe Jenatton, Francis Bach

    Abstract: A popular approach within the signal processing and machine learning communities consists in modelling signals as sparse linear combinations of atoms selected from a learned dictionary. While this paradigm has led to numerous empirical successes in various fields ranging from image to audio processing, there have only been a few theoretical arguments supporting this evidence. In particular, spar…

    Submitted 22 August, 2015; v1 submitted 19 July, 2014; originally announced July 2014.

    Comments: This is a substantially revised version of a first draft that appeared as a preprint titled "Local stability and robustness of sparse dictionary learning in the presence of noise", http://hal.inria.fr/hal-00737152, IEEE Transactions on Information Theory, Institute of Electrical and Electronics Engineers (IEEE), 2015, pp.22

  30. arXiv:1312.3790  [pdf, ps, other]

    stat.ML cs.IT

    Sample Complexity of Dictionary Learning and other Matrix Factorizations

    Authors: Rémi Gribonval, Rodolphe Jenatton, Francis Bach, Martin Kleinsteuber, Matthias Seibert

    Abstract: Many modern tools in machine learning and signal processing, such as sparse dictionary learning, principal component analysis (PCA), non-negative matrix factorization (NMF), $K$-means clustering, etc., rely on the factorization of a matrix obtained by concatenating high-dimensional vectors from a training collection. While the idealized task would be to optimize the expected quality of the factors…

    Submitted 9 April, 2015; v1 submitted 13 December, 2013; originally announced December 2013.

    Comments: to appear

    Journal ref: IEEE Transactions on Information Theory, Institute of Electrical and Electronics Engineers (IEEE), 2015, pp.18

  31. arXiv:1210.0685  [pdf, ps, other]

    stat.ML cs.LG

    Local stability and robustness of sparse dictionary learning in the presence of noise

    Authors: Rodolphe Jenatton, Rémi Gribonval, Francis Bach

    Abstract: A popular approach within the signal processing and machine learning communities consists in modelling signals as sparse linear combinations of atoms selected from a learned dictionary. While this paradigm has led to numerous empirical successes in various fields ranging from image to audio processing, there have only been a few theoretical arguments supporting this evidence. In particular, spar…

    Submitted 2 October, 2012; originally announced October 2012.

  32. Learning Hierarchical and Topographic Dictionaries with Structured Sparsity

    Authors: Julien Mairal, Rodolphe Jenatton, Guillaume Obozinski, Francis Bach

    Abstract: Recent work in signal processing and statistics has focused on defining new regularization functions, which not only induce sparsity of the solution, but also take into account the structure of the problem. We present in this paper a class of convex penalties introduced in the machine learning community, which take the form of a sum of l_2 and l_infinity-norms over groups of variables. They exten…

    Submitted 20 October, 2011; originally announced October 2011.

    Journal ref: SPIE Wavelets and Sparsity XIV 81381P (2011)

  33. arXiv:1109.2397  [pdf, ps, other]

    cs.LG stat.ML

    Structured sparsity through convex optimization

    Authors: Francis Bach, Rodolphe Jenatton, Julien Mairal, Guillaume Obozinski

    Abstract: Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. While naturally cast as a combinatorial optimization problem, variable or feature selection admits a convex relaxation through the regularization by the $\ell_1$-norm. In this paper, we consider situations where we are not only interested in sparsity, but where some structural prior knowledge…

    Submitted 20 April, 2012; v1 submitted 12 September, 2011; originally announced September 2011.

    Comments: Statistical Science (2012) To appear

  34. arXiv:1108.0775  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Optimization with Sparsity-Inducing Penalties

    Authors: Francis Bach, Rodolphe Jenatton, Julien Mairal, Guillaume Obozinski

    Abstract: Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. They were first dedicated to linear variable selection but numerous extensions have now emerged such as structured sparsity or kernel selection. It turns out that many of the related estimation problems can be cast as convex optimization problems by regularizing the empirical risk with appropr…

    Submitted 22 November, 2011; v1 submitted 3 August, 2011; originally announced August 2011.

  35. arXiv:1104.1872  [pdf, ps, other]

    math.OC cs.LG stat.ML

    Convex and Network Flow Optimization for Structured Sparsity

    Authors: Julien Mairal, Rodolphe Jenatton, Guillaume Obozinski, Francis Bach

    Abstract: We consider a class of learning problems regularized by a structured sparsity-inducing norm defined as the sum of l_2- or l_infinity-norms over groups of variables. Whereas much effort has been put in developing fast optimization techniques when the groups are disjoint or embedded in a hierarchy, we address here the case of general overlapping groups. To this end, we present two different strategi…

    Submitted 16 September, 2011; v1 submitted 11 April, 2011; originally announced April 2011.

    Comments: to appear in the Journal of Machine Learning Research (JMLR)

    Journal ref: Journal of Machine Learning Research 12 (2011) 2681–2720

  36. arXiv:1008.5209  [pdf, ps, other]

    cs.LG stat.ML

    Network Flow Algorithms for Structured Sparsity

    Authors: Julien Mairal, Rodolphe Jenatton, Guillaume Obozinski, Francis Bach

    Abstract: We consider a class of learning problems that involve a structured sparsity-inducing norm defined as the sum of $\ell_\infty$-norms over groups of variables. Whereas a lot of effort has been put in developing fast optimization methods when the groups are disjoint or embedded in a specific hierarchical structure, we address here the case of general overlapping groups. To this end, we show that the…

    Submitted 30 August, 2010; originally announced August 2010.

    Comments: accepted for publication in Adv. Neural Information Processing Systems, 2010

    Report number: RR-7372