Showing 1–45 of 45 results for author: Pedregosa, F

  1. arXiv:2406.09073  [pdf, other]

    cs.LG

    Are we making progress in unlearning? Findings from the first NeurIPS unlearning competition

    Authors: Eleni Triantafillou, Peter Kairouz, Fabian Pedregosa, Jamie Hayes, Meghdad Kurmanji, Kairan Zhao, Vincent Dumoulin, Julio Jacques Junior, Ioannis Mitliagkas, Jun Wan, Lisheng Sun Hosoya, Sergio Escalera, Gintare Karolina Dziugaite, Peter Triantafillou, Isabelle Guyon

    Abstract: We present the findings of the first NeurIPS competition on unlearning, which sought to stimulate the development of novel algorithms and initiate discussions on formal and robust evaluation methodologies. The competition was highly successful: nearly 1,200 teams from across the world participated, and a wealth of novel, imaginative solutions with different characteristics were contributed. In thi…

    Submitted 13 June, 2024; originally announced June 2024.

  2. arXiv:2402.13984  [pdf, other]

    cs.LG cond-mat.dis-nn cond-mat.mtrl-sci physics.chem-ph physics.comp-ph

    Stability-Aware Training of Neural Network Interatomic Potentials with Differentiable Boltzmann Estimators

    Authors: Sanjeev Raja, Ishan Amin, Fabian Pedregosa, Aditi S. Krishnapriyan

    Abstract: Neural network interatomic potentials (NNIPs) are an attractive alternative to ab-initio methods for molecular dynamics (MD) simulations. However, they can produce unstable simulations which sample unphysical states, limiting their usefulness for modeling phenomena occurring over longer timescales. To address these challenges, we present Stability-Aware Boltzmann Estimator (StABlE) Training, a mul…

    Submitted 21 February, 2024; originally announced February 2024.

  3. arXiv:2312.00209  [pdf, other]

    cs.LG cs.AI math.OC

    On the Interplay Between Stepsize Tuning and Progressive Sharpening

    Authors: Vincent Roulet, Atish Agarwala, Fabian Pedregosa

    Abstract: Recent empirical work has revealed an intriguing property of deep learning models by which the sharpness (largest eigenvalue of the Hessian) increases throughout optimization until it stabilizes around a critical value at which the optimizer operates at the edge of stability, given a fixed stepsize (Cohen et al, 2022). We investigate empirically how the sharpness evolves when using stepsize-tuners…

    Submitted 29 December, 2023; v1 submitted 30 November, 2023; originally announced December 2023.

    Comments: Presented at the NeurIPS 2023 OPT Workshop

  4. arXiv:2212.14032  [pdf, other]

    cs.LG

    On Implicit Bias in Overparameterized Bilevel Optimization

    Authors: Paul Vicol, Jonathan Lorraine, Fabian Pedregosa, David Duvenaud, Roger Grosse

    Abstract: Many problems in machine learning involve bilevel optimization (BLO), including hyperparameter optimization, meta-learning, and dataset distillation. Bilevel problems consist of two nested sub-problems, called the outer and inner problems, respectively. In practice, often at least one of these sub-problems is overparameterized. In this case, there are many ways to choose among optima that achieve…

    Submitted 28 December, 2022; originally announced December 2022.

    Comments: ICML 2022

  5. arXiv:2212.04025  [pdf, other]

    cs.LG cs.AI stat.ML

    A Novel Stochastic Gradient Descent Algorithm for Learning Principal Subspaces

    Authors: Charline Le Lan, Joshua Greaves, Jesse Farebrother, Mark Rowland, Fabian Pedregosa, Rishabh Agarwal, Marc G. Bellemare

    Abstract: Many machine learning problems encode their data as a matrix with a possibly very large number of rows and columns. In several applications like neuroscience, image compression or deep reinforcement learning, the principal subspace of such a matrix provides a useful, low-dimensional representation of individual data. Here, we are interested in determining the $d$-dimensional principal subspace of…

    Submitted 7 December, 2022; originally announced December 2022.

    Comments: 8 pages in main content, 2 pages of bibliography and 5 pages in Appendix

  6. arXiv:2211.04659  [pdf, other]

    cs.LG math.OC stat.ML

    When is Momentum Extragradient Optimal? A Polynomial-Based Analysis

    Authors: Junhyung Lyle Kim, Gauthier Gidel, Anastasios Kyrillidis, Fabian Pedregosa

    Abstract: The extragradient method has gained popularity due to its robust convergence properties for differentiable games. Unlike single-objective optimization, game dynamics involve complex interactions reflected by the eigenvalues of the game vector field's Jacobian scattered across the complex plane. This complexity can cause the simple gradient method to diverge, even for bilinear games, while the extr…

    Submitted 10 February, 2024; v1 submitted 8 November, 2022; originally announced November 2022.

  7. arXiv:2210.04860  [pdf, other]

    cs.LG cs.AI math.OC

    Second-order regression models exhibit progressive sharpening to the edge of stability

    Authors: Atish Agarwala, Fabian Pedregosa, Jeffrey Pennington

    Abstract: Recent studies of gradient descent with large step sizes have shown that there is often a regime with an initial increase in the largest eigenvalue of the loss Hessian (progressive sharpening), followed by a stabilization of the eigenvalue near the maximum value which allows convergence (edge of stability). These phenomena are intrinsically non-linear and do not happen for models in the constant N…

    Submitted 10 October, 2022; originally announced October 2022.

  8. arXiv:2209.13271  [pdf, other]

    math.OC stat.ML

    The Curse of Unrolling: Rate of Differentiating Through Optimization

    Authors: Damien Scieur, Quentin Bertrand, Gauthier Gidel, Fabian Pedregosa

    Abstract: Computing the Jacobian of the solution of an optimization problem is a central problem in machine learning, with applications in hyperparameter optimization, meta-learning, optimization as a layer, and dataset distillation, to name a few. Unrolled differentiation is a popular heuristic that approximates the solution using an iterative solver and differentiates it through the computational path. Th…

    Submitted 25 August, 2023; v1 submitted 27 September, 2022; originally announced September 2022.

  9. arXiv:2206.09901  [pdf, other]

    math.OC cs.LG

    Only Tails Matter: Average-Case Universality and Robustness in the Convex Regime

    Authors: Leonardo Cunha, Gauthier Gidel, Fabian Pedregosa, Damien Scieur, Courtney Paquette

    Abstract: The recently developed average-case analysis of optimization methods allows a more fine-grained and representative convergence analysis than usual worst-case results. In exchange, this analysis requires a more precise hypothesis over the data generating process, namely assuming knowledge of the expected spectral distribution (ESD) of the random matrix associated with the problem. This work shows t…

    Submitted 22 June, 2022; v1 submitted 20 June, 2022; originally announced June 2022.

    Comments: To be published in ICML 2022

  10. arXiv:2202.12328  [pdf, other]

    cs.LG

    Cutting Some Slack for SGD with Adaptive Polyak Stepsizes

    Authors: Robert M. Gower, Mathieu Blondel, Nidham Gazagnadou, Fabian Pedregosa

    Abstract: Tuning the step size of stochastic gradient descent is tedious and error prone. This has motivated the development of methods that automatically adapt the step size using readily available information. In this paper, we consider the family of SPS (Stochastic gradient with a Polyak Stepsize) adaptive methods. These are methods that make use of gradient and loss value at the sampled points to adapti…

    Submitted 20 May, 2022; v1 submitted 24 February, 2022; originally announced February 2022.

    Comments: 48 pages, 7 figures

    MSC Class: 90C53; 74S60; 90C06; 62L20; 68W20; 15B52; 65Y20; 68W40 ACM Class: G.1.6

  11. arXiv:2201.05125  [pdf, other]

    cs.LG cs.CV

    GradMax: Growing Neural Networks using Gradient Information

    Authors: Utku Evci, Bart van Merriënboer, Thomas Unterthiner, Max Vladymyrov, Fabian Pedregosa

    Abstract: The architecture and the parameters of neural networks are often optimized independently, which requires costly retraining of the parameters whenever the architecture is modified. In this work we instead focus on growing the architecture without requiring costly retraining. We present a method that adds new neurons during training without impacting what is already learned, while improving the trai…

    Submitted 7 June, 2022; v1 submitted 13 January, 2022; originally announced January 2022.

    Comments: ICLR 2022

    Journal ref: International Conference on Learning Representations, 2022

  12. arXiv:2106.09687  [pdf, other]

    math.OC

    Super-Acceleration with Cyclical Step-sizes

    Authors: Baptiste Goujaud, Damien Scieur, Aymeric Dieuleveut, Adrien Taylor, Fabian Pedregosa

    Abstract: We develop a convergence-rate analysis of momentum with cyclical step-sizes. We show that under some assumption on the spectral gap of Hessians in machine learning, cyclical step-sizes are provably faster than constant step-sizes. More precisely, we develop a convergence rate analysis for quadratic objectives that provides optimal parameters and shows that cyclical learning rates can improve upon…

    Submitted 9 May, 2022; v1 submitted 17 June, 2021; originally announced June 2021.

    Journal ref: Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:3028-3065, 2022

  13. arXiv:2105.15183  [pdf, other]

    cs.LG math.NA stat.ML

    Efficient and Modular Implicit Differentiation

    Authors: Mathieu Blondel, Quentin Berthet, Marco Cuturi, Roy Frostig, Stephan Hoyer, Felipe Llinares-López, Fabian Pedregosa, Jean-Philippe Vert

    Abstract: Automatic differentiation (autodiff) has revolutionized machine learning. It allows to express complex computations by composing elementary ones in creative ways and removes the burden of computing their derivatives by hand. More recently, differentiation of optimization problem solutions has attracted widespread attention with applications such as optimization layers, and in bi-level problems suc…

    Submitted 12 October, 2022; v1 submitted 31 May, 2021; originally announced May 2021.

    Comments: V3: added more related work and Jacobian precision figure

  14. arXiv:2105.09240  [pdf, other]

    cs.LG stat.ML

    Boosting Variational Inference With Locally Adaptive Step-Sizes

    Authors: Gideon Dresdner, Saurav Shekhar, Fabian Pedregosa, Francesco Locatello, Gunnar Rätsch

    Abstract: Variational Inference makes a trade-off between the capacity of the variational family and the tractability of finding an approximate posterior distribution. Instead, Boosting Variational Inference allows practitioners to obtain increasingly good posterior approximations by spending more compute. The main obstacle to widespread adoption of Boosting Variational Inference is the amount of resources…

    Submitted 19 May, 2021; originally announced May 2021.

  15. arXiv:2102.08868  [pdf, other]

    cs.LG cs.CV stat.ML

    Bridging the Gap Between Adversarial Robustness and Optimization Bias

    Authors: Fartash Faghri, Sven Gowal, Cristina Vasconcelos, David J. Fleet, Fabian Pedregosa, Nicolas Le Roux

    Abstract: We demonstrate that the choice of optimizer, neural network architecture, and regularizer significantly affect the adversarial robustness of linear neural networks, providing guarantees without the need for adversarial training. To this end, we revisit a known result linking maximally robust classifiers and minimum norm solutions, and combine it with recent results on the implicit bias of optimize…

    Submitted 7 June, 2021; v1 submitted 17 February, 2021; originally announced February 2021.

    Comments: New CIFAR-10 experiments and Fourier attack variations

  16. arXiv:2102.04396  [pdf, other]

    math.OC cs.LG math.PR stat.ML

    SGD in the Large: Average-case Analysis, Asymptotics, and Stepsize Criticality

    Authors: Courtney Paquette, Kiwon Lee, Fabian Pedregosa, Elliot Paquette

    Abstract: We propose a new framework, inspired by random matrix theory, for analyzing the dynamics of stochastic gradient descent (SGD) when both number of samples and dimensions are large. This framework applies to any fixed stepsize and the finite sum setting. Using this new framework, we show that the dynamics of SGD on a least squares problem with random data become deterministic in the large sample and…

    Submitted 8 February, 2021; originally announced February 2021.

  17. arXiv:2010.02076  [pdf, other]

    math.OC cs.GT cs.LG

    Average-case Acceleration for Bilinear Games and Normal Matrices

    Authors: Carles Domingo-Enrich, Fabian Pedregosa, Damien Scieur

    Abstract: Advances in generative modeling and adversarial learning have given rise to renewed interest in smooth games. However, the absence of symmetry in the matrix of second derivatives poses challenges that are not present in the classical minimization framework. While a rich theory of average-case analysis has been developed for minimization problems, little is known in the context of smooth games. In…

    Submitted 5 October, 2020; originally announced October 2020.

    Comments: 24 pages, 1 figure

  18. arXiv:2006.04299  [pdf, other]

    math.OC stat.ML

    Halting Time is Predictable for Large Models: A Universality Property and Average-case Analysis

    Authors: Courtney Paquette, Bart van Merriënboer, Elliot Paquette, Fabian Pedregosa

    Abstract: Average-case analysis computes the complexity of an algorithm averaged over all possible inputs. Compared to worst-case analysis, it is more representative of the typical behavior of an algorithm, but remains largely unexplored in optimization. One difficulty is that the analysis can depend on the probability distribution of the inputs to the model. However, we show that this is not the case for a…

    Submitted 2 October, 2021; v1 submitted 7 June, 2020; originally announced June 2020.

  19. arXiv:2002.11860  [pdf, other]

    math.OC cs.LG

    Stochastic Frank-Wolfe for Constrained Finite-Sum Minimization

    Authors: Geoffrey Négiar, Gideon Dresdner, Alicia Tsai, Laurent El Ghaoui, Francesco Locatello, Robert M. Freund, Fabian Pedregosa

    Abstract: We propose a novel Stochastic Frank-Wolfe (a.k.a. conditional gradient) algorithm for constrained smooth finite-sum minimization with a generalized linear prediction/structure. This class of problems includes empirical risk minimization with sparse, low-rank, or other structured constraints. The proposed method is simple to implement, does not require step-size tuning, and has a constant per-itera…

    Submitted 8 September, 2022; v1 submitted 26 February, 2020; originally announced February 2020.

    Comments: Proceedings of the 37th International Conference on Machine Learning, 2020

  20. arXiv:2002.08056  [pdf, other]

    cs.LG stat.ML

    The Geometry of Sign Gradient Descent

    Authors: Lukas Balles, Fabian Pedregosa, Nicolas Le Roux

    Abstract: Sign-based optimization methods have become popular in machine learning due to their favorable communication cost in distributed optimization and their surprisingly good performance in neural network training. Furthermore, they are closely connected to so-called adaptive gradient methods like Adam. Recent works on signSGD have used a non-standard "separable smoothness" assumption, whereas some old…

    Submitted 19 February, 2020; originally announced February 2020.

  21. arXiv:2002.04756  [pdf, other]

    math.OC cs.LG

    Average-case Acceleration Through Spectral Density Estimation

    Authors: Fabian Pedregosa, Damien Scieur

    Abstract: We develop a framework for the average-case analysis of random quadratic problems and derive algorithms that are optimal under this analysis. This yields a new class of methods that achieve acceleration given a model of the Hessian's eigenvalue distribution. We develop explicit algorithms for the uniform, Marchenko-Pastur, and exponential distributions. These methods are momentum-based algorithms,…

    Submitted 20 February, 2023; v1 submitted 11 February, 2020; originally announced February 2020.

    Journal ref: Proceedings of the 37th International Conference on Machine Learning, PMLR 119, 2020

  22. arXiv:2002.04664  [pdf, other]

    math.OC

    Universal Average-Case Optimality of Polyak Momentum

    Authors: Damien Scieur, Fabian Pedregosa

    Abstract: Polyak momentum (PM), also known as the heavy-ball method, is a widely used optimization method that enjoys an asymptotic optimal worst-case complexity on quadratic objectives. However, its remarkable empirical success is not fully explained by this optimality, as the worst-case analysis -- contrary to the average-case -- is not representative of the expected complexity of an algorithm. In this wo…

    Submitted 21 January, 2021; v1 submitted 11 February, 2020; originally announced February 2020.

    Comments: Added references in the proof of Theorem 4.1

    Journal ref: Proceedings of the 37th International Conference on Machine Learning, PMLR 119, 2020

  23. arXiv:1910.05271  [pdf, other]

    q-bio.NC cs.LG stat.ML stat.OT

    A Test for Shared Patterns in Cross-modal Brain Activation Analysis

    Authors: Elena Kalinina, Fabian Pedregosa, Vittorio Iacovella, Emanuele Olivetti, Paolo Avesani

    Abstract: Determining the extent to which different cognitive modalities (understood here as the set of cognitive processes underlying the elaboration of a stimulus by the brain) rely on overlapping neural representations is a fundamental issue in cognitive neuroscience. In the last decade, the identification of shared activity patterns has been mostly framed as a supervised learning problem. For instance,…

    Submitted 8 October, 2019; originally announced October 2019.

    Comments: 5 figures, tables after References (as required by SciRep template)

  24. arXiv:1907.10121  [pdf, other]

    cs.MS cs.DS cs.SE physics.comp-ph

    SciPy 1.0--Fundamental Algorithms for Scientific Computing in Python

    Authors: Pauli Virtanen, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, Evgeni Burovski, Pearu Peterson, Warren Weckesser, Jonathan Bright, Stéfan J. van der Walt, Matthew Brett, Joshua Wilson, K. Jarrod Millman, Nikolay Mayorov, Andrew R. J. Nelson, Eric Jones, Robert Kern, Eric Larson, CJ Carey, İlhan Polat, Yu Feng, Eric W. Moore, Jake VanderPlas, Denis Laxalde, et al. (10 additional authors not shown)

    Abstract: SciPy is an open source scientific computing library for the Python programming language. SciPy 1.0 was released in late 2017, about 16 years after the original version 0.1 release. SciPy has become a de facto standard for leveraging scientific algorithms in the Python programming language, with more than 600 unique code contributors, thousands of dependent packages, over 100,000 dependent reposit…

    Submitted 23 July, 2019; originally announced July 2019.

    Comments: Article source data is available here: https://github.com/scipy/scipy-articles

    Journal ref: Nature Methods 17, 261 (2020)

  25. arXiv:1906.10732  [pdf, other]

    cs.LG cs.CV stat.ML

    The Difficulty of Training Sparse Neural Networks

    Authors: Utku Evci, Fabian Pedregosa, Aidan Gomez, Erich Elsen

    Abstract: We investigate the difficulties of training sparse neural networks and make new observations about optimization dynamics and the energy landscape within the sparse regime. Recent work of \citep{Gale2019, Liu2018} has shown that sparse ResNet-50 architectures trained on ImageNet-2012 dataset converge to solutions that are significantly worse than those found by pruning. We show that, despite the fa…

    Submitted 7 October, 2020; v1 submitted 25 June, 2019; originally announced June 2019.

    Comments: sparse networks, pruning, energy landscape, sparsity

  26. arXiv:1906.07774  [pdf, other]

    cs.LG stat.ML

    On the interplay between noise and curvature and its effect on optimization and generalization

    Authors: Valentin Thomas, Fabian Pedregosa, Bart van Merriënboer, Pierre-Antoine Mangazol, Yoshua Bengio, Nicolas Le Roux

    Abstract: The speed at which one can minimize an expected loss using stochastic methods depends on two properties: the curvature of the loss and the variance of the gradients. While most previous works focus on one or the other of these properties, we explore how their interaction affects optimization speed. Further, as the ultimate goal is good generalization performance, we clarify how both curvature and…

    Submitted 6 April, 2020; v1 submitted 18 June, 2019; originally announced June 2019.

    Comments: Accepted to AISTATS 2020

  27. arXiv:1806.07294  [pdf, other]

    math.OC

    Proximal Splitting Meets Variance Reduction

    Authors: Fabian Pedregosa, Kilian Fatras, Mattia Casotto

    Abstract: Despite the rise to fame of incremental variance-reduced methods in recent years, their use in nonsmooth optimization is still limited to few simple cases. This is due to the fact that existing methods require to evaluate the proximity operator for the nonsmooth terms, which can be a costly operation for complex penalties. In this work we introduce two variance-reduced incremental methods based on…

    Submitted 24 January, 2019; v1 submitted 19 June, 2018; originally announced June 2018.

    MSC Class: 65K10

    Journal ref: Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics 2019

  28. arXiv:1806.05123  [pdf, other]

    math.OC

    Linearly Convergent Frank-Wolfe with Backtracking Line-Search

    Authors: Fabian Pedregosa, Geoffrey Negiar, Armin Askari, Martin Jaggi

    Abstract: Structured constraints in Machine Learning have recently brought the Frank-Wolfe (FW) family of algorithms back in the spotlight. While the classical FW algorithm has poor local convergence properties, the Away-steps and Pairwise FW variants have emerged as improved variants with faster convergence. However, these improved variants suffer from two practical limitations: they require at each iterat…

    Submitted 8 September, 2022; v1 submitted 13 June, 2018; originally announced June 2018.

    Journal ref: Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS) 2020

  29. arXiv:1804.03176  [pdf, other]

    math.OC cs.LG stat.ML

    Frank-Wolfe Splitting via Augmented Lagrangian Method

    Authors: Gauthier Gidel, Fabian Pedregosa, Simon Lacoste-Julien

    Abstract: Minimizing a function over an intersection of convex sets is an important task in optimization that is often much more challenging than minimizing it over each individual constraint set. While traditional methods such as Frank-Wolfe (FW) or proximal gradient descent assume access to a linear or quadratic oracle on the intersection, splitting techniques take advantage of the structure of each sets,…

    Submitted 9 April, 2018; originally announced April 2018.

    Comments: Appears in: Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS 2018). 30 pages

    MSC Class: 90C52; 90C90; 68T05 ACM Class: G.1.6; I.2.6

  30. arXiv:1804.02339  [pdf, other]

    math.OC cs.LG

    Adaptive Three Operator Splitting

    Authors: Fabian Pedregosa, Gauthier Gidel

    Abstract: We propose and analyze an adaptive step-size variant of the Davis-Yin three operator splitting. This method can solve optimization problems composed by a sum of a smooth term for which we have access to its gradient and an arbitrary number of potentially non-smooth terms for which we have access to their proximal operator. The proposed method sets the step-size based on local information of the ob…

    Submitted 31 July, 2018; v1 submitted 6 April, 2018; originally announced April 2018.

    Journal ref: Proceedings of the 35th International Conference on Machine Learning, PMLR 80:4082-4091, 2018

  31. arXiv:1803.07348  [pdf, ps, other]

    math.OC cs.LG stat.ML

    Frank-Wolfe with Subsampling Oracle

    Authors: Thomas Kerdreux, Fabian Pedregosa, Alexandre d'Aspremont

    Abstract: We analyze two novel randomized variants of the Frank-Wolfe (FW) or conditional gradient algorithm. While classical FW algorithms require solving a linear minimization problem over the domain at each iteration, the proposed method only requires to solve a linear minimization problem over a small subset of the original domain. The first algorithm that we propose is a randomized variant of th…

    Submitted 20 March, 2018; originally announced March 2018.

  32. arXiv:1801.03749  [pdf, other]

    math.OC cs.LG stat.ML

    Improved asynchronous parallel optimization analysis for stochastic incremental methods

    Authors: Rémi Leblond, Fabian Pedregosa, Simon Lacoste-Julien

    Abstract: As datasets continue to increase in size and multi-core computer architectures are developed, asynchronous parallel optimization algorithms become more and more essential to the field of Machine Learning. Unfortunately, conducting the theoretical analysis of asynchronous methods is difficult, notably due to the introduction of delay and inconsistency in inherently sequential algorithms. Handling thes…

    Submitted 21 March, 2019; v1 submitted 11 January, 2018; originally announced January 2018.

    Comments: 67 pages, published in JMLR, can be found online at http://jmlr.org/papers/v19/17-650.html. arXiv admin note: substantial text overlap with arXiv:1606.04809

  33. arXiv:1707.06468  [pdf, other]

    math.OC cs.LG stat.ML

    Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization

    Authors: Fabian Pedregosa, Rémi Leblond, Simon Lacoste-Julien

    Abstract: Due to their simplicity and excellent performance, parallel asynchronous variants of stochastic gradient descent have become popular methods to solve a wide range of large-scale optimization problems on multi-core architectures. Yet, despite their practical success, support for nonsmooth objectives is still lacking, making them unsuitable for many problems of interest in machine learning, such as…

    Submitted 5 November, 2017; v1 submitted 20 July, 2017; originally announced July 2017.

    Comments: Appears in Advances in Neural Information Processing Systems 30 (NIPS 2017), 28 pages

    MSC Class: 90C52; 90C90; 68T05 ACM Class: G.1.6; I.2.6

    Journal ref: Advances in Neural Information Processing Systems 30 (NIPS 2017)

  34. arXiv:1610.07830  [pdf, ps, other]

    stat.ML math.OC

    On the convergence rate of the three operator splitting scheme

    Authors: Fabian Pedregosa

    Abstract: The three operator splitting scheme was recently proposed by [Davis and Yin, 2015] as a method to optimize composite objective functions with one convex smooth term and two convex (possibly non-smooth) terms for which we have access to their proximity operator. In this short note we provide an alternative proof for the sublinear rate of convergence of this method.

    Submitted 25 June, 2021; v1 submitted 25 October, 2016; originally announced October 2016.

    Comments: Fixed typo in Lemma 3

  35. arXiv:1606.04809  [pdf, other]

    math.OC cs.LG stat.ML

    ASAGA: Asynchronous Parallel SAGA

    Authors: Rémi Leblond, Fabian Pedregosa, Simon Lacoste-Julien

    Abstract: We describe ASAGA, an asynchronous parallel version of the incremental gradient algorithm SAGA that enjoys fast linear convergence rates. Through a novel perspective, we revisit and clarify a subtle but important technical issue present in a large fraction of the recent convergence rate proofs for asynchronous parallel optimization algorithms, and propose a simplification of the recently introduce…

    Submitted 8 November, 2017; v1 submitted 15 June, 2016; originally announced June 2016.

    Comments: Appears in: Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS 2017), 37 pages

  36. arXiv:1602.02355  [pdf, other]

    stat.ML cs.LG math.OC

    Hyperparameter optimization with approximate gradient

    Authors: Fabian Pedregosa

    Abstract: Most models in machine learning contain at least one hyperparameter to control for model complexity. Choosing an appropriate set of hyperparameters is both crucial in terms of model accuracy and computationally challenging. In this work we propose an algorithm for the optimization of continuous hyperparameters using inexact gradient information. An advantage of this method is that hyperparameters…

    Submitted 21 November, 2022; v1 submitted 7 February, 2016; originally announced February 2016.

    Comments: Fixes error in proof of Theorem 2

  37. arXiv:1412.3919  [pdf, other]

    cs.LG cs.CV stat.ML

    Machine Learning for Neuroimaging with Scikit-Learn

    Authors: Alexandre Abraham, Fabian Pedregosa, Michael Eickenberg, Philippe Gervais, Andreas Muller, Jean Kossaifi, Alexandre Gramfort, Bertrand Thirion, Gaël Varoquaux

    Abstract: Statistical machine learning methods are increasingly used for neuroimaging data analysis. Their main virtue is their ability to model high-dimensional datasets, e.g. multivariate analysis of activation images or resting-state time series. Supervised learning is typically used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learnin…

    Submitted 12 December, 2014; originally announced December 2014.

    Comments: Frontiers in neuroscience, Frontiers Research Foundation, 2013, pp.15

  38. arXiv:1408.2327  [pdf, other]

    cs.LG

    On the Consistency of Ordinal Regression Methods

    Authors: Fabian Pedregosa, Francis Bach, Alexandre Gramfort

    Abstract: Many of the ordinal regression models that have been proposed in the literature can be seen as methods that minimize a convex surrogate of the zero-one, absolute, or squared loss functions. A key property that allows to study the statistical implications of such approximations is that of Fisher consistency. Fisher consistency is a desirable property for surrogate loss functions and implies that in…

    Submitted 21 July, 2017; v1 submitted 11 August, 2014; originally announced August 2014.

    Comments: Journal of Machine Learning Research 18 (2017)

    Journal ref: Journal of Machine Learning Research 18 (2017) 1-35

  39. Data-driven HRF estimation for encoding and decoding models

    Authors: Fabian Pedregosa, Michael Eickenberg, Philippe Ciuciu, Bertrand Thirion, Alexandre Gramfort

    Abstract: Despite the common usage of a canonical, data-independent, hemodynamic response function (HRF), it is known that the shape of the HRF varies across brain regions and subjects. This suggests that a data-driven estimation of this function could lead to more statistical power when modeling BOLD fMRI data. However, unconstrained estimation of the HRF can yield highly unstable results when the number o…

    Submitted 7 November, 2014; v1 submitted 27 February, 2014; originally announced February 2014.

    Comments: appears in NeuroImage (2015)

  40. arXiv:1310.1257  [pdf, other]

    cs.CV

    Second order scattering descriptors predict fMRI activity due to visual textures

    Authors: Michael Eickenberg, Fabian Pedregosa, Senoussi Mehdi, Alexandre Gramfort, Bertrand Thirion

    Abstract: Second layer scattering descriptors are known to provide good classification performance on natural quasi-stationary processes such as visual textures due to their sensitivity to higher order moments and continuity with respect to small deformations. In a functional Magnetic Resonance Imaging (fMRI) experiment we present visual textures to subjects and evaluate the predictive power of these descri…

    Submitted 10 August, 2013; originally announced October 2013.

    Comments: 3rd International Workshop on Pattern Recognition in NeuroImaging (2013)

  41. arXiv:1309.0238  [pdf, ps, other]

    cs.LG cs.MS

    API design for machine learning software: experiences from the scikit-learn project

    Authors: Lars Buitinck, Gilles Louppe, Mathieu Blondel, Fabian Pedregosa, Andreas Mueller, Olivier Grisel, Vlad Niculae, Peter Prettenhofer, Alexandre Gramfort, Jaques Grobler, Robert Layton, Jake Vanderplas, Arnaud Joly, Brian Holt, Gaël Varoquaux

    Abstract: Scikit-learn is an increasingly popular machine learning library. Written in Python, it is designed to be simple and efficient, accessible to non-experts, and reusable in various contexts. In this paper, we present and discuss our design choices for the application programming interface (API) of the project. In particular, we describe the simple and elegant interface shared by all learning and p…

    Submitted 1 September, 2013; originally announced September 2013.

    Journal ref: European Conference on Machine Learning and Principles and Practices of Knowledge Discovery in Databases (2013)

  42. arXiv:1305.2788  [pdf, other]

    cs.LG stat.AP

    HRF estimation improves sensitivity of fMRI encoding and decoding models

    Authors: Fabian Pedregosa, Michael Eickenberg, Bertrand Thirion, Alexandre Gramfort

    Abstract: Extracting activation patterns from functional Magnetic Resonance Images (fMRI) datasets remains challenging in rapid-event designs due to the inherent delay of blood oxygen level-dependent (BOLD) signal. The general linear model (GLM) allows to estimate the activation from a design matrix and a fixed hemodynamic response function (HRF). However, the HRF is known to vary substantially between subj…

    Submitted 13 May, 2013; originally announced May 2013.

    Comments: 3rd International Workshop on Pattern Recognition in NeuroImaging (2013)

  43. arXiv:1207.3598  [pdf, other]

    cs.LG cs.CV

    Learning to rank from medical imaging data

    Authors: Fabian Pedregosa, Alexandre Gramfort, Gaël Varoquaux, Elodie Cauvet, Christophe Pallier, Bertrand Thirion

    Abstract: Medical images can be used to predict a clinical score coding for the severity of a disease, a pain level or the complexity of a cognitive task. In all these cases, the predicted variable has a natural order. While a standard classifier discards this information, we would like to take it into account in order to improve prediction performance. A standard linear regression does model such informati…

    Submitted 30 September, 2012; v1 submitted 16 July, 2012; originally announced July 2012.

    Journal ref: MLMI 2012 - 3rd International Workshop on Machine Learning in Medical Imaging (2012)

  44. arXiv:1207.3520  [pdf, other]

    cs.LG stat.ML

    Improved brain pattern recovery through ranking approaches

    Authors: Fabian Pedregosa, Alexandre Gramfort, Gaël Varoquaux, Bertrand Thirion, Christophe Pallier, Elodie Cauvet

    Abstract: Inferring the functional specificity of brain regions from functional Magnetic Resonance Images (fMRI) data is a challenging statistical problem. While the General Linear Model (GLM) remains the standard approach for brain mapping, supervised learning techniques (a.k.a. decoding) have proven to be useful to capture multivariate statistical effects distributed across voxels and brain regions. Up t…

    Submitted 15 July, 2012; originally announced July 2012.

    Journal ref: Pattern Recognition in NeuroImaging (PRNI 2012) (2012)

  45. arXiv:1201.0490  [pdf, ps, other]

    cs.LG cs.MS

    Scikit-learn: Machine Learning in Python

    Authors: Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Andreas Müller, Joel Nothman, Gilles Louppe, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Édouard Duchesnay

    Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distribute…

    Submitted 5 June, 2018; v1 submitted 2 January, 2012; originally announced January 2012.

    Comments: Update authors list and URLs

    Journal ref: Journal of Machine Learning Research (2011)