Showing 1–32 of 32 results for author: Vidal, R

  1. arXiv:2405.15942  [pdf, other]

    cs.LG stat.ML

    Can Implicit Bias Imply Adversarial Robustness?

    Authors: Hancheng Min, René Vidal

    Abstract: The implicit bias of gradient-based training algorithms has been considered mostly beneficial as it leads to trained networks that often generalize well. However, Frei et al. (2023) show that such implicit bias can harm adversarial robustness. Specifically, they show that if the data consists of clusters with small inter-cluster correlation, a shallow (two-layer) ReLU network trained by gradient f…

    Submitted 5 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: ICML 2024 camera-ready

  2. arXiv:2403.07148  [pdf, other]

    math.OC cs.GT cs.LG stat.ML

    Stochastic Extragradient with Random Reshuffling: Improved Convergence for Variational Inequalities

    Authors: Konstantinos Emmanouilidis, René Vidal, Nicolas Loizou

    Abstract: The Stochastic Extragradient (SEG) method is one of the most popular algorithms for solving finite-sum min-max optimization and variational inequality problems (VIPs) appearing in various machine learning tasks. However, existing convergence analyses of SEG focus on its with-replacement variants, while practical implementations of the method randomly reshuffle components and sequentially use them.…

    Submitted 11 March, 2024; originally announced March 2024.

  3. arXiv:2308.12562  [pdf, other]

    cs.LG stat.ML

    Variational Information Pursuit with Large Language and Multimodal Models for Interpretable Predictions

    Authors: Kwan Ho Ryan Chan, Aditya Chattopadhyay, Benjamin David Haeffele, Rene Vidal

    Abstract: Variational Information Pursuit (V-IP) is a framework for making interpretable predictions by design by sequentially selecting a short chain of task-relevant, user-defined and interpretable queries about the data that are most informative for the task. While this allows for built-in interpretability in predictive models, applying V-IP to any task requires data samples with dense concept-labeling b…

    Submitted 24 August, 2023; originally announced August 2023.

  4. arXiv:2302.02876  [pdf, other]

    cs.LG cs.AI stat.ML

    Variational Information Pursuit for Interpretable Predictions

    Authors: Aditya Chattopadhyay, Kwan Ho Ryan Chan, Benjamin D. Haeffele, Donald Geman, René Vidal

    Abstract: There is a growing interest in the machine learning community in developing predictive algorithms that are "interpretable by design". Towards this end, recent work proposes to make interpretable decisions by sequentially asking interpretable queries about data until a prediction can be made with high confidence based on the answers obtained (the history). To promote short query-answer chains, a gr…

    Submitted 15 February, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

    Comments: Code is available at https://github.com/ryanchankh/VariationalInformationPursuit

    Report number: https://openreview.net/forum?id=77lSWa-Tm3Z

  5. arXiv:2301.01542  [pdf, other]

    cs.LG cs.AI math.OC stat.ML

    Federated Learning for Data Streams

    Authors: Othmane Marfoq, Giovanni Neglia, Laetitia Kameni, Richard Vidal

    Abstract: Federated learning (FL) is an effective solution to train machine learning models on the increasing amount of data generated by IoT devices and smartphones while keeping such data localized. Most previous work on federated learning assumes that clients operate on static datasets collected before training starts. This approach may be inefficient because 1) it ignores new samples clients collect dur…

    Submitted 4 January, 2023; originally announced January 2023.

    Comments: 34 pages

  6. arXiv:2111.09360  [pdf, other]

    cs.LG stat.ML

    Personalized Federated Learning through Local Memorization

    Authors: Othmane Marfoq, Giovanni Neglia, Laetitia Kameni, Richard Vidal

    Abstract: Federated learning allows clients to collaboratively learn statistical models while keeping their data local. Federated learning was originally used to train a unique global model to be served to all clients, but this approach might be sub-optimal when clients' local data distributions are heterogeneous. In order to tackle this limitation, recent personalized federated learning methods train a sep…

    Submitted 17 June, 2022; v1 submitted 17 November, 2021; originally announced November 2021.

    Comments: 23 pages, ICML 2022

  7. arXiv:2108.10252  [pdf, other]

    cs.LG cs.AI math.OC stat.ML

    Federated Multi-Task Learning under a Mixture of Distributions

    Authors: Othmane Marfoq, Giovanni Neglia, Aurélien Bellet, Laetitia Kameni, Richard Vidal

    Abstract: The increasing size of data generated by smartphones and IoT devices motivated the development of Federated Learning (FL), a framework for on-device collaborative training of machine learning models. First efforts in FL focused on learning a single global model with good average performance across clients, but the global model may be arbitrarily bad for a given client, due to the inherent heteroge…

    Submitted 7 November, 2022; v1 submitted 23 August, 2021; originally announced August 2021.

    Comments: 77 pages, NeurIPS 2021

  8. arXiv:2009.06530  [pdf, ps, other]

    cs.LG stat.ML

    A Game Theoretic Analysis of Additive Adversarial Attacks and Defenses

    Authors: Ambar Pal, René Vidal

    Abstract: Research in adversarial learning follows a cat and mouse game between attackers and defenders where attacks are proposed, they are mitigated by new defenses, and subsequently new attacks are proposed that break earlier defenses, and so on. However, it has remained unclear as to whether there are conditions under which no better attacks or defenses can be proposed. In this paper, we propose a game-…

    Submitted 11 November, 2020; v1 submitted 14 September, 2020; originally announced September 2020.

    Comments: Accepted at Neural Information Processing Systems (NeurIPS) 2020

  9. arXiv:2006.11901  [pdf, other]

    cs.LG stat.ML

    Free-rider Attacks on Model Aggregation in Federated Learning

    Authors: Yann Fraboni, Richard Vidal, Marco Lorenzi

    Abstract: Free-rider attacks against federated learning consist of dissimulating participation in the federated learning process with the goal of obtaining the final aggregated model without actually contributing any data. This kind of attack is critical in sensitive applications of federated learning, where data is scarce and the model has high commercial value. We introduce here the first theoretica…

    Submitted 22 February, 2021; v1 submitted 21 June, 2020; originally announced June 2020.

  10. arXiv:2005.03888  [pdf, other]

    cs.LG cs.CV stat.ML

    Is an Affine Constraint Needed for Affine Subspace Clustering?

    Authors: Chong You, Chun-Guang Li, Daniel P. Robinson, Rene Vidal

    Abstract: Subspace clustering methods based on expressing each data point as a linear combination of other data points have achieved great success in computer vision applications such as motion segmentation, face and digit clustering. In face clustering, the subspaces are linear and subspace clustering methods can be applied directly. In motion segmentation, the subspaces are affine and an additional affine…

    Submitted 8 May, 2020; originally announced May 2020.

    Comments: ICCV 2019. Including proofs that are omitted in the conference version

  11. arXiv:2004.06840  [pdf, other]

    math.OC cond-mat.dis-nn cond-mat.stat-mech stat.ML

    On dissipative symplectic integration with applications to gradient-based optimization

    Authors: Guilherme França, Michael I. Jordan, René Vidal

    Abstract: Recently, continuous-time dynamical systems have proved useful in providing conceptual and quantitative insights into gradient-based optimization, widely used in modern machine learning and statistics. An important question that arises in this line of work is how to discretize the system in such a way that its stability and rates of convergence are preserved. In this paper we propose a geometric f…

    Submitted 28 April, 2021; v1 submitted 14 April, 2020; originally announced April 2020.

    Comments: matches published version

    Journal ref: J. Stat. Mech. (2021) 043402

  12. arXiv:2001.06970  [pdf, other]

    cs.LG cs.IT eess.IV math.OC stat.ML

    Finding the Sparsest Vectors in a Subspace: Theory, Algorithms, and Applications

    Authors: Qing Qu, Zhihui Zhu, Xiao Li, Manolis C. Tsakiris, John Wright, René Vidal

    Abstract: The problem of finding the sparsest vector (direction) in a low dimensional subspace can be considered as a homogeneous variant of the sparse recovery problem, which finds applications in robust subspace recovery, dictionary learning, sparse blind deconvolution, and many other problems in signal processing and machine learning. However, in contrast to the classical sparse recovery problem, the mos…

    Submitted 19 January, 2020; originally announced January 2020.

    Comments: QQ and ZZ contributed equally to the work. Invited review paper for IEEE Signal Processing Magazine Special Issue on non-convex optimization for signal processing and machine learning. This article contains 26 pages with 11 figures

  13. arXiv:1912.13091  [pdf, ps, other]

    cs.LG cs.CV stat.ML

    Basis Pursuit and Orthogonal Matching Pursuit for Subspace-preserving Recovery: Theoretical Analysis

    Authors: Daniel P. Robinson, Rene Vidal, Chong You

    Abstract: Given an overcomplete dictionary $A$ and a signal $b = Ac^*$ for some sparse vector $c^*$ whose nonzero entries correspond to linearly independent columns of $A$, classical sparse signal recovery theory considers the problem of whether $c^*$ can be recovered as the unique sparsest solution to $b = A c$. It is now well-understood that such recovery is possible by practical algorithms when the dicti…

    Submitted 30 December, 2019; originally announced December 2019.

    Comments: 31 pages, 6 figures

  14. arXiv:1910.14186  [pdf, other]

    cs.LG stat.ML

    On the Regularization Properties of Structured Dropout

    Authors: Ambar Pal, Connor Lane, René Vidal, Benjamin D. Haeffele

    Abstract: Dropout and its extensions (e.g., DropBlock and DropConnect) are popular heuristics for training neural networks, which have been shown to improve generalization performance in practice. However, a theoretical understanding of their optimization and regularization properties remains elusive. Recent work shows that in the case of single hidden-layer linear networks, Dropout is a stochastic gradient d…

    Submitted 20 June, 2020; v1 submitted 30 October, 2019; originally announced October 2019.

    Comments: Accepted at Computer Vision and Pattern Recognition (CVPR) 2020

  15. arXiv:1910.03749  [pdf, other]

    cs.LG eess.SP math.OC stat.ML

    The fastest $\ell_{1,\infty}$ prox in the west

    Authors: Benjamín Béjar, Ivan Dokmanić, René Vidal

    Abstract: Proximal operators are of particular interest in optimization problems dealing with non-smooth objectives because in many practical cases they lead to optimization algorithms whose updates can be computed in closed form or very efficiently. A well-known example is the proximal operator of the vector $\ell_1$ norm, which is given by the soft-thresholding operator. In this paper we study the proxima…

    Submitted 8 October, 2019; originally announced October 2019.

    Comments: 9 pages, 2 figures, journal

  16. arXiv:1908.00865  [pdf, other]

    math.OC math.NA stat.ML

    Gradient flows and proximal splitting methods: A unified view on accelerated and stochastic optimization

    Authors: Guilherme França, Daniel P. Robinson, René Vidal

    Abstract: Optimization is at the heart of machine learning, statistics and many applied scientific disciplines. It also has a long history in physics, ranging from the minimal action principle to finding ground states of disordered systems such as spin glasses. Proximal algorithms form a class of methods that are broadly applicable and are particularly well-suited to nonsmooth, constrained, large-scale, and…

    Submitted 10 May, 2021; v1 submitted 2 August, 2019; originally announced August 2019.

    Comments: the paper was reorganized; new additional material; matches published version

    Journal ref: Phys. Rev. E 103, 053304 (2021)

  17. Conformal Symplectic and Relativistic Optimization

    Authors: Guilherme França, Jeremias Sulam, Daniel P. Robinson, René Vidal

    Abstract: Arguably, the two most popular accelerated or momentum-based optimization methods in machine learning are Nesterov's accelerated gradient and Polyak's heavy ball, both corresponding to different discretizations of a particular second order differential equation with friction. Such connections with continuous-time dynamical systems have been instrumental in demystifying acceleration phenomena in o…

    Submitted 24 December, 2020; v1 submitted 10 March, 2019; originally announced March 2019.

    Comments: A short version of this paper appeared at NeurIPS 2020 (spotlight). This lengthier version matches the published paper at JSTAT, which contains additional results

    Journal ref: J. Stat. Mech. (2020) 124008

  18. A Nonsmooth Dynamical Systems Perspective on Accelerated Extensions of ADMM

    Authors: Guilherme França, Daniel P. Robinson, René Vidal

    Abstract: Recently, there has been great interest in connections between continuous-time dynamical systems and optimization methods, notably in the context of accelerated methods for smooth and unconstrained problems. In this paper we extend this perspective to nonsmooth and constrained problems by obtaining differential inclusions associated to novel accelerated variants of the alternating direction method…

    Submitted 17 January, 2023; v1 submitted 12 August, 2018; originally announced August 2018.

    Comments: Last version was completely rewritten. New results for modified/perturbed equations, constraints, and singular perturbation theory. Matches the version to appear on IEEE Transactions on Automatic Control

    Journal ref: IEEE Transactions on Automatic Control, Vol. 68, Issue 5 (2023)

  19. arXiv:1806.09777  [pdf, other]

    cs.LG cs.AI stat.ML

    On the Implicit Bias of Dropout

    Authors: Poorya Mianjy, Raman Arora, Rene Vidal

    Abstract: Algorithmic approaches endow deep learning systems with implicit bias that helps them generalize even in over-parametrized settings. In this paper, we focus on understanding such a bias induced in learning through dropout, a popular technique to avoid overfitting in deep learning. For single hidden-layer linear neural networks, we show that dropout tends to make the norm of incoming/outgoing weigh…

    Submitted 25 June, 2018; originally announced June 2018.

    Comments: 17 pages, 3 figures, In Proceedings of the Thirty-fifth International Conference on Machine Learning (ICML), 2018

  20. arXiv:1801.00393  [pdf, ps, other]

    cs.LG stat.ML

    Theoretical Analysis of Sparse Subspace Clustering with Missing Entries

    Authors: Manolis C. Tsakiris, Rene Vidal

    Abstract: Sparse Subspace Clustering (SSC) is a popular unsupervised machine learning method for clustering data lying close to an unknown union of low-dimensional linear subspaces; a problem with numerous applications in pattern recognition and computer vision. Even though the behavior of SSC for complete data is by now well-understood, little is known about its theoretical properties when applied to data…

    Submitted 9 February, 2018; v1 submitted 31 December, 2017; originally announced January 2018.

    Journal ref: Proceedings of the 35th International Conference on Machine Learning, PMLR 80:4975-4984, 2018

  21. arXiv:1710.05092  [pdf, other]

    cs.LG stat.ML

    Dropout as a Low-Rank Regularizer for Matrix Factorization

    Authors: Jacopo Cavazza, Pietro Morerio, Benjamin Haeffele, Connor Lane, Vittorio Murino, Rene Vidal

    Abstract: Regularization for matrix factorization (MF) and approximation problems has been carried out in many different ways. Due to its popularity in deep learning, dropout has been applied also for this class of problems. Despite its solid empirical performance, the theoretical properties of dropout as a regularizer remain quite elusive for this class of problems. In this paper, we present a theoretical…

    Submitted 13 October, 2017; originally announced October 2017.

  22. arXiv:1710.03487  [pdf, other]

    cs.LG stat.ML

    An Analysis of Dropout for Matrix Factorization

    Authors: Jacopo Cavazza, Connor Lane, Benjamin D. Haeffele, Vittorio Murino, René Vidal

    Abstract: Dropout is a simple yet effective algorithm for regularizing neural networks by randomly dropping out units through Bernoulli multiplicative noise, and for some restricted problem classes, such as linear or logistic regression, several theoretical studies have demonstrated the equivalence between dropout and a fully deterministic optimization problem with data-dependent Tikhonov regularization. Th…

    Submitted 10 October, 2017; originally announced October 2017.

  23. arXiv:1706.01604  [pdf, other]

    cs.CV cs.LG stat.ML

    Hyperplane Clustering Via Dual Principal Component Pursuit

    Authors: Manolis C. Tsakiris, Rene Vidal

    Abstract: We extend the theoretical analysis of a recently proposed single subspace learning algorithm, called Dual Principal Component Pursuit (DPCP), to the case where the data are drawn from a union of hyperplanes. To gain insight into the properties of the $\ell_1$ non-convex problem associated with DPCP, we develop a geometric analysis of a closely related continuous optimization problem. Then trans…

    Submitted 19 June, 2017; v1 submitted 6 June, 2017; originally announced June 2017.

    Journal ref: Proceedings of the 34th International Conference on Machine Learning, PMLR 70:3472-3481, 2017

  24. arXiv:1704.03925  [pdf, other]

    cs.CV stat.ML

    Provable Self-Representation Based Outlier Detection in a Union of Subspaces

    Authors: Chong You, Daniel P. Robinson, René Vidal

    Abstract: Many computer vision tasks involve processing large amounts of data contaminated by outliers, which need to be detected and rejected. While outlier detection methods based on robust statistics have existed for decades, only recently have methods based on sparse and low-rank representation been developed along with guarantees of correct outlier detection when the inliers lie in one or more low-dime…

    Submitted 12 April, 2017; originally announced April 2017.

    Comments: 16 pages. CVPR 2017 spotlight oral presentation

  25. arXiv:1703.06229  [pdf, other]

    cs.NE cs.LG stat.ML

    Curriculum Dropout

    Authors: Pietro Morerio, Jacopo Cavazza, Riccardo Volpi, Rene Vidal, Vittorio Murino

    Abstract: Dropout is a very effective way of regularizing neural networks. Stochastically "dropping out" units with a certain probability discourages over-specific co-adaptations of feature detectors, preventing overfitting and improving network generalization. Besides, Dropout can be interpreted as an approximate model aggregation technique, where an exponential number of smaller networks are averaged in o…

    Submitted 3 August, 2017; v1 submitted 17 March, 2017; originally announced March 2017.

    Comments: Accepted at ICCV (International Conference on Computer Vision) 2017

  26. arXiv:1701.02343  [pdf, other]

    cs.CV cs.AI stat.ML

    Information Pursuit: A Bayesian Framework for Sequential Scene Parsing

    Authors: Ehsan Jahangiri, Erdem Yoruk, Rene Vidal, Laurent Younes, Donald Geman

    Abstract: Despite enormous progress in object detection and classification, the problem of incorporating expected contextual relationships among object instances into modern recognition systems remains a key challenge. In this work we propose Information Pursuit, a Bayesian framework for scene parsing that combines prior models for the geometry of the scene and the spatial arrangement of object instances w…

    Submitted 9 January, 2017; originally announced January 2017.

  27. arXiv:1612.05846  [pdf, other]

    stat.ML cs.CV q-bio.QM

    Joint Spatial-Angular Sparse Coding for dMRI with Separable Dictionaries

    Authors: Evan Schwab, René Vidal, Nicolas Charon

    Abstract: Diffusion MRI (dMRI) provides the ability to reconstruct neuronal fibers in the brain, $\textit{in vivo}$, by measuring water diffusion along angular gradient directions in q-space. High angular resolution diffusion imaging (HARDI) can produce better estimates of fiber orientation than the popularly used diffusion tensor imaging, but the high number of samples needed to estimate diffusivity requir…

    Submitted 29 May, 2018; v1 submitted 17 December, 2016; originally announced December 2016.

    Journal ref: Evan Schwab, Rene Vidal, Nicolas Charon, Joint spatial-angular sparse coding for dMRI with separable dictionaries, Medical Image Analysis, Volume 48, August 2018, Pages 25-42, ISSN 1361-8415

  28. arXiv:1605.02633  [pdf, ps, other]

    cs.LG cs.CV stat.ML

    Oracle Based Active Set Algorithm for Scalable Elastic Net Subspace Clustering

    Authors: Chong You, Chun-Guang Li, Daniel P. Robinson, Rene Vidal

    Abstract: State-of-the-art subspace clustering methods are based on expressing each data point as a linear combination of other data points while regularizing the matrix of coefficients with $\ell_1$, $\ell_2$ or nuclear norms. $\ell_1$ regularization is guaranteed to give a subspace-preserving affinity (i.e., there are no connections between points from different subspaces) under broad theoretical conditio…

    Submitted 9 May, 2016; originally announced May 2016.

    Comments: 15 pages, 6 figures, accepted to CVPR 2016 for oral presentation

  29. arXiv:1507.01307  [pdf, ps, other]

    stat.ML cs.IT

    Subspace-Sparse Representation

    Authors: C. You, R. Vidal

    Abstract: Given an overcomplete dictionary $A$ and a signal $b$ that is a linear combination of a few linearly independent columns of $A$, classical sparse recovery theory deals with the problem of recovering the unique sparse representation $x$ such that $b = A x$. It is known that under certain conditions on $A$, $x$ can be recovered by the Basis Pursuit (BP) and the Orthogonal Matching Pursuit (OMP) algo…

    Submitted 5 July, 2015; originally announced July 2015.

    Comments: 15 pages, 3 figures, previous version published in ICML 2015

  30. arXiv:1507.01238  [pdf, other]

    cs.CV cs.LG stat.ML

    Scalable Sparse Subspace Clustering by Orthogonal Matching Pursuit

    Authors: Chong You, Daniel P. Robinson, Rene Vidal

    Abstract: Subspace clustering methods based on $\ell_1$, $\ell_2$ or nuclear norm regularization have become very popular due to their simplicity, theoretical guarantees and empirical success. However, the choice of the regularizer can greatly impact both theory and practice. For instance, $\ell_1$ regularization is guaranteed to give a subspace-preserving affinity (i.e., there are no connections between po…

    Submitted 5 May, 2016; v1 submitted 5 July, 2015; originally announced July 2015.

    Comments: 13 pages, 1 figure, 2 tables. Accepted to CVPR 2016 as an oral presentation

  31. arXiv:1506.07540  [pdf, ps, other]

    math.NA cs.LG stat.ML

    Global Optimality in Tensor Factorization, Deep Learning, and Beyond

    Authors: Benjamin D. Haeffele, Rene Vidal

    Abstract: Techniques involving factorization are found in a wide range of applications and have enjoyed significant empirical success in many fields. However, common to a vast majority of these problems is the significant disadvantage that the associated optimization problems are typically non-convex due to a multilinear form or other convexity destroying transformation. Here we build on ideas from convex r…

    Submitted 24 June, 2015; originally announced June 2015.

  32. arXiv:1203.1005  [pdf, other]

    cs.CV cs.IR cs.IT cs.LG math.OC stat.ML

    Sparse Subspace Clustering: Algorithm, Theory, and Applications

    Authors: Ehsan Elhamifar, Rene Vidal

    Abstract: In many real-world problems, we are dealing with collections of high-dimensional data, such as images, videos, text and web documents, DNA microarray data, and more. Often, high-dimensional data lie close to low-dimensional structures corresponding to several classes or categories the data belongs to. In this paper, we propose and study an algorithm, called Sparse Subspace Clustering (SSC), to clu…

    Submitted 4 February, 2013; v1 submitted 5 March, 2012; originally announced March 2012.