Showing 1–26 of 26 results for author: Gaillard, P

Searching in archive stat.
  1. arXiv:2406.12366  [pdf, ps, other]

    cs.LG math.ST stat.ML

    Structured Prediction in Online Learning

    Authors: Pierre Boudart, Alessandro Rudi, Pierre Gaillard

    Abstract: We study a theoretical and algorithmic framework for structured prediction in the online learning setting. The problem of structured prediction, i.e., estimating a function whose output space lacks a vectorial structure, is well studied in the literature of supervised statistical learning. We show that our algorithm is a generalisation of optimal algorithms from the supervised learning setting, a…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 29 pages

  2. arXiv:2405.19807  [pdf, ps, other]

    cs.LG math.PR math.ST stat.ML

    MetaCURL: Non-stationary Concave Utility Reinforcement Learning

    Authors: Bianca Marin Moreno, Margaux Brégère, Pierre Gaillard, Nadia Oudjane

    Abstract: We explore online learning in episodic loop-free Markov decision processes in non-stationary environments (changing losses and probability transitions). Our focus is on the Concave Utility Reinforcement Learning problem (CURL), an extension of classical RL for handling convex performance criteria in state-action distributions induced by agent policies. While various machine learning problems can b…

    Submitted 30 May, 2024; originally announced May 2024.

  3. arXiv:2402.15171  [pdf, ps, other]

    cs.LG math.ST stat.ML

    Towards Efficient and Optimal Covariance-Adaptive Algorithms for Combinatorial Semi-Bandits

    Authors: Julien Zhou, Pierre Gaillard, Thibaud Rahier, Houssam Zenati, Julyan Arbel

    Abstract: We address the problem of stochastic combinatorial semi-bandits, where a player selects among $P$ actions from the power set of a set containing $d$ base items. Adaptivity to the problem's structure is essential in order to obtain optimal regret upper bounds. As estimating the coefficients of a covariance matrix can be manageable in practice, leveraging them should improve the regret. We design ``…

    Submitted 3 July, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

  4. arXiv:2402.05145  [pdf, other]

    cs.LG physics.data-an stat.ML

    Online Learning Approach for Survival Analysis

    Authors: Camila Fernandez, Pierre Gaillard, Joseph de Vilmarest, Olivier Wintenberger

    Abstract: We introduce an online mathematical framework for survival analysis, allowing real-time adaptation to dynamic environments and censored data. This framework enables the estimation of event time distributions through an optimal second-order online convex optimization algorithm, Online Newton Step (ONS). This approach, previously unexplored, presents substantial advantages, including explicit algorit…

    Submitted 7 February, 2024; originally announced February 2024.

  5. arXiv:2311.18346  [pdf, other]

    math.OC physics.data-an stat.ML

    Efficient Model-Based Concave Utility Reinforcement Learning through Greedy Mirror Descent

    Authors: Bianca Marin Moreno, Margaux Brégère, Pierre Gaillard, Nadia Oudjane

    Abstract: Many machine learning tasks can be solved by minimizing a convex function of an occupancy measure over the policies that generate them. These include reinforcement learning and imitation learning, among others. This more general paradigm is called the Concave Utility Reinforcement Learning problem (CURL). Since CURL invalidates classical Bellman equations, it requires new algorithms. We introduce MD-…

    Submitted 30 November, 2023; originally announced November 2023.

  6. arXiv:2302.08190  [pdf, other]

    math.OC cs.LG math.PR stat.AP stat.ML

    Reimagining Demand-Side Management with Mean Field Learning

    Authors: Bianca Marin Moreno, Margaux Brégère, Pierre Gaillard, Nadia Oudjane

    Abstract: Integrating renewable energy into the power grid while balancing supply and demand is a complex issue, given the intermittent nature of renewable generation. Demand side management (DSM) offers solutions to this challenge. We propose a new method for DSM, in particular the problem of controlling a large population of electrical devices to follow a desired consumption signal. We model it as a finite horizon Markovian mean…

    Submitted 25 May, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

  7. arXiv:2202.06694  [pdf, other]

    cs.LG stat.ML

    Versatile Dueling Bandits: Best-of-both-World Analyses for Online Learning from Preferences

    Authors: Aadirupa Saha, Pierre Gaillard

    Abstract: We study the problem of $K$-armed dueling bandits for both stochastic and adversarial environments, where the goal of the learner is to aggregate information through relative preferences of pairs of decision points queried in an online sequential manner. We first propose a novel reduction from any (general) dueling bandits to multi-armed bandits and despite the simplicity, it allows us to improve m…

    Submitted 14 February, 2022; originally announced February 2022.

  8. arXiv:2110.03960  [pdf, other]

    cs.LG math.ST stat.ML

    Mixability made efficient: Fast online multiclass logistic regression

    Authors: Rémi Jézéquel, Pierre Gaillard, Alessandro Rudi

    Abstract: Mixability has been shown to be a powerful tool to obtain algorithms with optimal regret. However, the resulting methods often suffer from high computational complexity which has reduced their practical applicability. For example, in the case of multiclass logistic regression, the aggregating forecaster (Foster et al. (2018)) achieves a regret of $O(\log(Bn))$ whereas Online Newton Step achieves…

    Submitted 8 October, 2021; originally announced October 2021.

  9. arXiv:2106.07644  [pdf, other]

    math.OC cs.LG cs.MA math.PR stat.ML

    A Continuized View on Nesterov Acceleration for Stochastic Gradient Descent and Randomized Gossip

    Authors: Mathieu Even, Raphaël Berthier, Francis Bach, Nicolas Flammarion, Pierre Gaillard, Hadrien Hendrikx, Laurent Massoulié, Adrien Taylor

    Abstract: We introduce the continuized Nesterov acceleration, a close variant of Nesterov acceleration whose variables are indexed by a continuous time parameter. The two variables continuously mix following a linear ordinary differential equation and take gradient steps at random times. This continuized variant benefits from the best of the continuous and the discrete frameworks: as a continuous process, o…

    Submitted 27 October, 2021; v1 submitted 10 June, 2021; originally announced June 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2102.06035

  10. arXiv:2102.03594  [pdf, other]

    math.ST cs.LG stat.ML

    Online nonparametric regression with Sobolev kernels

    Authors: Oleksandr Zadorozhnyi, Pierre Gaillard, Sébastien Gerchinovitz, Alessandro Rudi

    Abstract: In this work we investigate a variation of the online kernelized ridge regression algorithm in the setting of $d$-dimensional adversarial nonparametric regression. We derive regret upper bounds on the classes of Sobolev spaces $W_{p}^β(\mathcal{X})$, $p\geq 2, β>\frac{d}{p}$. The upper bounds are supported by the minimax regret analysis, which reveals that in the cases $β> \frac{d}{2}$ or…

    Submitted 13 July, 2021; v1 submitted 6 February, 2021; originally announced February 2021.

    Comments: 40 pages, 5 figures, 3 tables (version 2)

  11. arXiv:2006.08212  [pdf, other]

    cs.LG cs.MA math.OC stat.ML

    Tight Nonparametric Convergence Rates for Stochastic Gradient Descent under the Noiseless Linear Model

    Authors: Raphaël Berthier, Francis Bach, Pierre Gaillard

    Abstract: In the context of statistical supervised learning, the noiseless linear model assumes that there exists a deterministic linear relation $Y = \langle θ_*, Φ(U) \rangle$ between the random output $Y$ and the random feature vector $Φ(U)$, a potentially non-linear transformation of the inputs $U$. We analyze the convergence of single-pass, fixed step-size stochastic gradient descent on the least-square r…

    Submitted 27 October, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

  12. arXiv:2004.11722  [pdf, other]

    stat.ML cs.LG

    Counterfactual Learning of Stochastic Policies with Continuous Actions: from Models to Offline Evaluation

    Authors: Houssam Zenati, Alberto Bietti, Matthieu Martin, Eustache Diemert, Pierre Gaillard, Julien Mairal

    Abstract: Counterfactual reasoning from logged data has become increasingly important for many applications such as web advertising or healthcare. In this paper, we address the problem of learning stochastic policies with continuous actions from the viewpoint of counterfactual risk minimization (CRM). While the CRM framework is appealing and well studied for discrete actions, the continuous action case rais…

    Submitted 14 December, 2022; v1 submitted 22 April, 2020; originally announced April 2020.

  13. arXiv:2004.06248  [pdf, other]

    cs.LG stat.ML

    Improved Sleeping Bandits with Stochastic Actions Sets and Adversarial Rewards

    Authors: Aadirupa Saha, Pierre Gaillard, Michal Valko

    Abstract: In this paper, we consider the problem of sleeping bandits with stochastic action sets and adversarial rewards. In this setting, in contrast to most work in bandits, the actions may not be available at all times. For instance, some products might be out of stock in item recommendation. The best existing efficient (i.e., polynomial-time) algorithms for this problem only guarantee an $O(T^{2/3})$ up…

    Submitted 8 August, 2020; v1 submitted 13 April, 2020; originally announced April 2020.

    Comments: Accepted to ICML 2020

  14. arXiv:2003.08820  [pdf, other]

    cs.LG stat.ML

    Experimental Comparison of Semi-parametric, Parametric, and Machine Learning Models for Time-to-Event Analysis Through the Concordance Index

    Authors: Camila Fernandez, Chung Shue Chen, Pierre Gaillard, Alonso Silva

    Abstract: In this paper, we make an experimental comparison of semi-parametric (Cox proportional hazards model, Aalen's additive regression model), parametric (Weibull AFT model), and machine learning models (Random Survival Forest, Gradient Boosting with Cox Proportional Hazards Loss, DeepSurv) through the concordance index on two different datasets (PBC and GBCSG2). We present two comparisons: one with th…

    Submitted 13 March, 2020; originally announced March 2020.

  15. arXiv:2003.08109  [pdf, other]

    cs.LG math.ST stat.ML

    Efficient improper learning for online logistic regression

    Authors: Rémi Jézéquel, Pierre Gaillard, Alessandro Rudi

    Abstract: We consider the setting of online logistic regression and study the regret with respect to the $\ell_2$-ball of radius B. It is known (see [Hazan et al., 2014]) that any proper algorithm which has logarithmic regret in the number of samples (denoted n) necessarily suffers an exponential multiplicative constant in B. In this work, we design an efficient improper algorithm that avoids this exponential c…

    Submitted 3 November, 2020; v1 submitted 18 March, 2020; originally announced March 2020.

    Journal ref: Conference on Learning Theory 2020, Jul 2020, Graz, Austria

  16. arXiv:1902.09917  [pdf, other]

    stat.ML cs.LG math.ST

    Efficient online learning with kernels for adversarial large scale problems

    Authors: Rémi Jézéquel, Pierre Gaillard, Alessandro Rudi

    Abstract: We are interested in a framework of online learning with kernels for low-dimensional but large-scale and potentially adversarial datasets. We study the computational and theoretical performance of online variations of kernel Ridge regression. Despite its simplicity, the algorithm we study is the first to achieve the optimal regret for a wide range of kernels with a per-round complexity of order…

    Submitted 29 May, 2019; v1 submitted 26 February, 2019; originally announced February 2019.

  17. Bayesian inference and non-linear extensions of the CIRCE method for quantifying the uncertainty of closure relationships integrated into thermal-hydraulic system codes

    Authors: Guillaume Damblin, Pierre Gaillard

    Abstract: Uncertainty Quantification of closure relationships integrated into thermal-hydraulic system codes is a critical prerequisite in applying the Best-Estimate Plus Uncertainty (BEPU) methodology for nuclear safety and licensing processes. The purpose of the CIRCE method is to estimate the (log)-Gaussian probability distribution of a multiplicative factor applied to a reference closure relationship in…

    Submitted 9 March, 2020; v1 submitted 13 February, 2019; originally announced February 2019.

    Comments: 37 pages, 5 figures

    MSC Class: 62F15

    Journal ref: Nuclear Engineering and Design, 2020, Volume 359, 1 April 2020, 110391

  18. arXiv:1901.09532  [pdf, other]

    cs.LG stat.ML

    Target Tracking for Contextual Bandits: Application to Demand Side Management

    Authors: Margaux Brégère, Pierre Gaillard, Yannig Goude, Gilles Stoltz

    Abstract: We propose a contextual-bandit approach for demand side management by offering price incentives. More precisely, a target mean consumption is set at each round and the mean consumption is modeled as a complex function of the distribution of prices sent and of some contextual variables such as the temperature, weather, and so on. The performance of our strategies is measured in quadratic losses thr…

    Submitted 13 May, 2019; v1 submitted 28 January, 2019; originally announced January 2019.

    Journal ref: ICML 2019 (Thirty-sixth International Conference on Machine Learning), Jun 2019, Long Beach, United States

  19. arXiv:1805.11386  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Uniform regret bounds over $R^d$ for the sequential linear regression problem with the square loss

    Authors: Pierre Gaillard, Sébastien Gerchinovitz, Malo Huard, Gilles Stoltz

    Abstract: We consider the setting of online linear regression for arbitrary deterministic sequences, with the square loss. We are interested in the aim set by Bartlett et al. (2015): obtain regret bounds that hold uniformly over all competitor vectors. When the feature sequence is known at the beginning of the game, they provided closed-form regret bounds of $2d B^2 \ln T + \mathcal{O}_T(1)$, where $T$ is t…

    Submitted 25 February, 2019; v1 submitted 29 May, 2018; originally announced May 2018.

    Comments: Proceedings of ALT'2019

  20. arXiv:1805.08531  [pdf, other]

    cs.MA cs.DC stat.ML

    Accelerated Gossip in Networks of Given Dimension using Jacobi Polynomial Iterations

    Authors: Raphaël Berthier, Francis Bach, Pierre Gaillard

    Abstract: Consider a network of agents connected by communication links, where each agent holds a real value. The gossip problem consists in estimating the average of the values diffused in the network in a distributed manner. We develop a method solving the gossip problem that depends only on the spectral dimension of the network, that is, in the communication network set-up, the dimension of the space in…

    Submitted 11 June, 2019; v1 submitted 22 May, 2018; originally announced May 2018.

  21. arXiv:1702.08211  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Algorithmic Chaining and the Role of Partial Feedback in Online Nonparametric Learning

    Authors: Nicolò Cesa-Bianchi, Pierre Gaillard, Claudio Gentile, Sébastien Gerchinovitz

    Abstract: We investigate contextual online learning with nonparametric (Lipschitz) comparison classes under different assumptions on losses and feedback information. For full information feedback and Lipschitz losses, we design the first explicit algorithm achieving the minimax regret rate (up to log factors). In a partial feedback model motivated by second-price auctions, we obtain algorithms for Lipschitz…

    Submitted 30 June, 2017; v1 submitted 27 February, 2017; originally announced February 2017.

    Comments: This document is the full version of an extended abstract accepted for presentation at COLT 2017

  22. arXiv:1502.07697  [pdf, other]

    stat.ML cs.LG

    A Chaining Algorithm for Online Nonparametric Regression

    Authors: Pierre Gaillard, Sébastien Gerchinovitz

    Abstract: We consider the problem of online nonparametric regression with arbitrary deterministic sequences. Using ideas from the chaining technique, we design an algorithm that achieves a Dudley-type regret bound similar to the one obtained in a non-constructive fashion by Rakhlin and Sridharan (2014). Our regret bound is expressed in terms of the metric entropy in the sup norm, which yields optimal guaran…

    Submitted 1 July, 2015; v1 submitted 26 February, 2015; originally announced February 2015.

    Comments: Published in the proceedings of COLT 2015: http://jmlr.org/proceedings/papers/v40/Gaillard15.html

  23. arXiv:1405.1533  [pdf, ps, other]

    math.ST cs.LG stat.ML

    A consistent deterministic regression tree for non-parametric prediction of time series

    Authors: Pierre Gaillard, Paul Baudin

    Abstract: We study online prediction of bounded stationary ergodic processes. To do so, we consider the setting of prediction of individual sequences and build a deterministic regression tree that performs asymptotically as well as the best L-Lipschitz constant predictors. Then, we show why the obtained regret bound entails the asymptotic optimality with respect to the class of bounded stationary ergodic…

    Submitted 8 May, 2014; v1 submitted 7 May, 2014; originally announced May 2014.

  24. arXiv:1402.2044  [pdf, ps, other]

    stat.ML cs.LG math.ST

    A Second-order Bound with Excess Losses

    Authors: Pierre Gaillard, Gilles Stoltz, Tim Van Erven

    Abstract: We study online aggregation of the predictions of experts, and first show new second-order regret bounds in the standard setting, which are obtained via a version of the Prod algorithm (and also a version of the polynomially weighted average algorithm) with multiple learning rates. These bounds are in terms of excess losses, the differences between the instantaneous losses suffered by the algorith…

    Submitted 10 February, 2014; originally announced February 2014.

  25. arXiv:1207.1965  [pdf, other]

    stat.ML cs.LG stat.AP

    Forecasting electricity consumption by aggregating specialized experts

    Authors: Marie Devaine, Pierre Gaillard, Yannig Goude, Gilles Stoltz

    Abstract: We consider the setting of sequential prediction of arbitrary sequences based on specialized experts. We first provide a review of the relevant literature and present two theoretical contributions: a general analysis of the specialist aggregation rule of Freund et al. (1997) and an adaptation of fixed-share rules of Herbster and Warmuth (1998) in this setting. We then apply these rules to the sequ…

    Submitted 9 July, 2012; originally announced July 2012.

    Comments: 33 pages

  26. arXiv:1202.3323  [pdf, ps, other]

    cs.LG stat.ML

    Mirror Descent Meets Fixed Share (and feels no regret)

    Authors: Nicolò Cesa-Bianchi, Pierre Gaillard, Gabor Lugosi, Gilles Stoltz

    Abstract: Mirror descent with an entropic regularizer is known to achieve shifting regret bounds that are logarithmic in the dimension. This is done either by a carefully designed projection or by a weight-sharing technique. Via a novel unified analysis, we show that these two approaches deliver essentially equivalent bounds on a notion of regret generalizing shifting, adaptive, discounted, and other rel…

    Submitted 27 September, 2012; v1 submitted 15 February, 2012; originally announced February 2012.

    Journal ref: NIPS 2012, Lake Tahoe : United States (2012)