
Showing 51–63 of 63 results for author: Guedj, B

  1. arXiv:1904.00865  [pdf, other]

    stat.ML cs.CV cs.LG eess.IV

    Non-linear aggregation of filters to improve image denoising

    Authors: Benjamin Guedj, Juliette Rengot

    Abstract: We introduce a novel aggregation method to efficiently perform image denoising. Preliminary filters are aggregated in a non-linear fashion, using a new metric of pixel proximity based on how the pool of filters reaches a consensus. We provide a theoretical bound to support our aggregation scheme, illustrate its numerical performance, and show that the aggregate significantly outperforms each…

    Submitted 23 June, 2020; v1 submitted 1 April, 2019; originally announced April 2019.

    Comments: To appear at Computing Conference 2020

    Journal ref: Computing Conference 2020

  2. arXiv:1903.04479  [pdf, other]

    cs.LG stat.ML

    Revisiting clustering as matrix factorisation on the Stiefel manifold

    Authors: Stéphane Chrétien, Benjamin Guedj

    Abstract: This paper studies clustering for possibly high-dimensional data (e.g. images, time series, gene expression data, and many other settings), and rephrases it as low-rank matrix estimation in the PAC-Bayesian framework. Our approach leverages the well-known Burer-Monteiro factorisation strategy from large-scale optimisation, in the context of low-rank estimation. Moreover, our Burer-Monteiro factors…

    Submitted 18 June, 2020; v1 submitted 11 March, 2019; originally announced March 2019.

    Comments: Accepted at the LOD 2020 Conference -- The Sixth International Conference on Machine Learning, Optimization, and Data Science

    Journal ref: LOD 2020

  3. arXiv:1901.05353  [pdf, ps, other]

    stat.ML cs.LG

    A Primer on PAC-Bayesian Learning

    Authors: Benjamin Guedj

    Abstract: Generalised Bayesian learning algorithms are increasingly popular in machine learning, due to their PAC generalisation properties and flexibility. The present paper aims at providing a self-contained survey on the resulting PAC-Bayes framework and some of its main theoretical and algorithmic developments.

    Submitted 7 May, 2019; v1 submitted 16 January, 2019; originally announced January 2019.

    Journal ref: Proceedings of the 2nd congress of the Société Mathématique de France, 2019, pp. 391--414

  4. arXiv:1805.07418  [pdf, other]

    stat.ML cs.LG math.ST

    Sequential Learning of Principal Curves: Summarizing Data Streams on the Fly

    Authors: Benjamin Guedj, Le Li

    Abstract: When confronted with massive data streams, summarizing data with dimension reduction methods such as PCA raises theoretical and algorithmic pitfalls. Principal curves act as a nonlinear generalization of PCA and the present paper proposes a novel algorithm to automatically and sequentially learn principal curves from data streams. We show that our procedure is supported by regret bounds with optim…

    Submitted 8 May, 2019; v1 submitted 18 May, 2018; originally announced May 2018.

    Journal ref: Entropy 2021

  5. arXiv:1804.10028  [pdf, other]

    stat.ML cs.AI cs.DC cs.LG

    Decentralized learning with budgeted network load using Gaussian copulas and classifier ensembles

    Authors: John Klein, Mahmoud Albardan, Benjamin Guedj, Olivier Colot

    Abstract: We examine a network of learners that address the same classification task but must learn from different data sets. The learners cannot share data but instead share their models. Models are shared only once, so as to limit the network load. We introduce DELCO (standing for Decentralized Ensemble Learning with COpulas), a new approach allowing to aggregate the predictions of the classifiers…

    Submitted 15 July, 2019; v1 submitted 26 April, 2018; originally announced April 2018.

    Journal ref: ECML-PKDD 2019

  6. arXiv:1707.00558  [pdf, other]

    stat.CO stat.ML

    Pycobra: A Python Toolbox for Ensemble Learning and Visualisation

    Authors: Benjamin Guedj, Bhargav Srinivasa Desikan

    Abstract: We introduce \texttt{pycobra}, a Python library devoted to ensemble learning (regression and classification) and visualisation. Its main assets are the implementation of several ensemble learning algorithms, a flexible and generic interface to compare and blend any existing machine learning algorithm available in Python libraries (as long as a \texttt{predict} method is given), and visualisation t…

    Submitted 23 May, 2019; v1 submitted 25 April, 2017; originally announced July 2017.

    Journal ref: Journal of Machine Learning Research (JMLR), 18(190):1--5, 2018

  7. Simpler PAC-Bayesian Bounds for Hostile Data

    Authors: Pierre Alquier, Benjamin Guedj

    Abstract: PAC-Bayesian learning bounds are of the utmost interest to the learning community. Their role is to connect the generalization ability of an aggregation distribution $ρ$ to its empirical risk and to its Kullback-Leibler divergence with respect to some prior distribution $π$. Unfortunately, most of the available bounds typically rely on heavy assumptions such as boundedness and independence of the…

    Submitted 23 May, 2019; v1 submitted 23 October, 2016; originally announced October 2016.

    Comments: 18 pages

    Journal ref: Machine Learning (2018), vol. 107 (5), 887--902

  8. arXiv:1608.06412  [pdf, ps, other]

    stat.ML math.ST

    Stability revisited: new generalisation bounds for the Leave-one-Out

    Authors: Alain Celisse, Benjamin Guedj

    Abstract: The present paper provides a new generic strategy leading to non-asymptotic theoretical guarantees on the Leave-one-Out procedure applied to a broad class of learning algorithms. This strategy relies on two main ingredients: the new notion of $L^q$ stability, and the strong use of moment inequalities. $L^q$ stability extends the ongoing notion of hypothesis stability while remaining weaker than th…

    Submitted 23 August, 2016; originally announced August 2016.

    Comments: 12 pages

  9. arXiv:1602.00522  [pdf, other]

    stat.ML math.ST

    A Quasi-Bayesian Perspective to Online Clustering

    Authors: Le Li, Benjamin Guedj, Sébastien Loustau

    Abstract: When faced with high frequency streams of data, clustering raises theoretical and algorithmic pitfalls. We introduce a new and adaptive online clustering algorithm relying on a quasi-Bayesian approach, with a dynamic (i.e., time-dependent) estimation of the (unknown and changing) number of clusters. We prove that our approach is supported by minimax regret bounds. We also provide an RJMCMC-flavore…

    Submitted 25 May, 2018; v1 submitted 1 February, 2016; originally announced February 2016.

    Journal ref: Electronic Journal of Statistics (2018), vol. 12(2), 3071--3113

  10. An Oracle Inequality for Quasi-Bayesian Non-Negative Matrix Factorization

    Authors: Pierre Alquier, Benjamin Guedj

    Abstract: The aim of this paper is to provide some theoretical understanding of quasi-Bayesian aggregation methods for non-negative matrix factorization. We derive an oracle inequality for an aggregated estimator. This result holds for a very general class of prior distributions and shows how the prior affects the rate of convergence.

    Submitted 26 June, 2018; v1 submitted 6 January, 2016; originally announced January 2016.

    Comments: This is the corrected version of the published paper P. Alquier, B. Guedj, An Oracle Inequality for Quasi-Bayesian Non-negative Matrix Factorization, Mathematical Methods of Statistics, 2017, vol. 26, no. 1, pp. 55-67. After publication, Arnak Dalalyan (ENSAE) found a mistake in the proofs. We fixed the mistake at the price of a slightly different logarithmic term in the bound.

    Journal ref: Mathematical Methods of Statistics (MMS), 26(1): 55-67, 2017

  11. PAC-Bayesian High Dimensional Bipartite Ranking

    Authors: Benjamin Guedj, Sylvain Robbiano

    Abstract: This paper is devoted to the bipartite ranking problem, a classical statistical learning task, in a high dimensional setting. We propose a scoring and ranking strategy based on the PAC-Bayesian approach. We consider nonlinear additive scoring functions, and we derive non-asymptotic risk bounds under a sparsity assumption. In particular, oracle inequalities in probability holding under a margin con…

    Submitted 16 May, 2019; v1 submitted 9 November, 2015; originally announced November 2015.

    Journal ref: Journal of Statistical Planning and Inference (2018), vol. 196, 70--86

  12. COBRA: A Combined Regression Strategy

    Authors: Gérard Biau, Aurélie Fischer, Benjamin Guedj, James Malley

    Abstract: A new method for combining several initial estimators of the regression function is introduced. Instead of building a linear or convex optimized combination over a collection of basic estimators $r_1,\dots,r_M$, we use them as a collective indicator of the proximity between the training data and a test observation. This local distance approach is model-free and very fast. More specifically, the re…

    Submitted 23 May, 2019; v1 submitted 9 March, 2013; originally announced March 2013.

    Comments: 42 pages

    Journal ref: Journal of Multivariate Analysis (2016), vol. 146, 18--28

  13. arXiv:1208.1211  [pdf, ps, other]

    stat.ME math.ST

    PAC-Bayesian Estimation and Prediction in Sparse Additive Models

    Authors: Benjamin Guedj, Pierre Alquier

    Abstract: The present paper is about estimation and prediction in high-dimensional additive models under a sparsity assumption ($p\gg n$ paradigm). A PAC-Bayesian strategy is investigated, delivering oracle inequalities in probability. The implementation is performed through recent outcomes in high-dimensional MCMC algorithms, and the performance of our method is assessed on simulated data.

    Submitted 1 February, 2013; v1 submitted 6 August, 2012; originally announced August 2012.

    Comments: 28 pages

    MSC Class: 62G08; 62J02; 65C40

    Journal ref: Electronic Journal of Statistics, volume 7, 2013, 264--291