Showing 1–50 of 96 results for author: Moulines, E

  1. arXiv:2407.01794  [pdf, other]

    stat.ML cs.LG math.PR math.ST stat.ME

    Conditionally valid Probabilistic Conformal Prediction

    Authors: Vincent Plassier, Alexander Fishkov, Maxim Panov, Eric Moulines

    Abstract: We develop a new method for creating prediction sets that combines the flexibility of conformal methods with an estimate of the conditional distribution $P_{Y \mid X}$. Most existing methods, such as conformalized quantile regression and probabilistic conformal prediction, only offer marginal coverage guarantees. Our approach extends these methods to achieve conditional coverage, which is essentia…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 23 pages

  2. arXiv:2406.19824  [pdf, ps, other]

    cs.GT stat.ML

    Learning to Mitigate Externalities: the Coase Theorem with Hindsight Rationality

    Authors: Antoine Scheid, Aymeric Capitaine, Etienne Boursier, Eric Moulines, Michael I Jordan, Alain Durmus

    Abstract: In economic theory, the concept of externality refers to any indirect effect resulting from an interaction between players that affects the social welfare. Most of the models within which externality has been studied assume that agents have perfect knowledge of their environment and preferences. This is a major hindrance to the practical implementation of many proposed solutions. To address this i…

    Submitted 3 July, 2024; v1 submitted 28 June, 2024; originally announced June 2024.

  3. arXiv:2406.09048  [pdf, other]

    stat.ML cs.LG math.PR math.ST

    Central Limit Theorem for Bayesian Neural Network trained with Variational Inference

    Authors: Arnaud Descours, Tom Huix, Arnaud Guillin, Manon Michel, Éric Moulines, Boris Nectoux

    Abstract: In this paper, we rigorously derive Central Limit Theorems (CLT) for Bayesian two-layer neural networks in the infinite-width limit and trained by variational inference on a regression task. The different networks are trained via different maximization schemes of the regularized evidence lower bound: (i) the idealized case with exact estimation of a multiple Gaussian integral from the reparametriza…

    Submitted 10 June, 2024; originally announced June 2024.

  4. arXiv:2406.04012  [pdf, other]

    stat.ML cs.LG

    Theoretical Guarantees for Variational Inference with Fixed-Variance Mixture of Gaussians

    Authors: Tom Huix, Anna Korba, Alain Durmus, Eric Moulines

    Abstract: Variational inference (VI) is a popular approach in Bayesian inference that looks for the best approximation of the posterior distribution within a parametric family, minimizing a loss that is typically the (reverse) Kullback-Leibler (KL) divergence. Despite its empirical success, the theoretical properties of VI have only received attention recently, and mostly when the parametric family is the…

    Submitted 10 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

  5. arXiv:2405.16644  [pdf, other]

    stat.ML cs.LG math.OC math.PR math.ST

    Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning

    Authors: Sergey Samsonov, Eric Moulines, Qi-Man Shao, Zhuo-Song Zhang, Alexey Naumov

    Abstract: In this paper, we obtain the Berry-Esseen bound for multivariate normal approximation for the Polyak-Ruppert averaged iterates of the linear stochastic approximation (LSA) algorithm with decreasing step size. Our findings reveal that the fastest rate of normal approximation is achieved when setting the most aggressive step size $α_{k} \asymp k^{-1/2}$. Moreover, we prove the non-asymptotic validit…

    Submitted 26 May, 2024; originally announced May 2024.

    MSC Class: 60F05; 62L20; 62E20

  6. arXiv:2405.00017  [pdf, other]

    cs.DC cs.LG stat.ML

    Queuing dynamics of asynchronous Federated Learning

    Authors: Louis Leconte, Matthieu Jonckheere, Sergey Samsonov, Eric Moulines

    Abstract: We study asynchronous federated learning mechanisms with nodes having potentially different computational speeds. In such an environment, each node is allowed to work on models with potential delays and contribute to updates to the central server at its own pace. Existing analyses of such algorithms typically depend on intractable quantities such as the maximum node delay and do not consider the u…

    Submitted 12 February, 2024; originally announced May 2024.

  7. arXiv:2404.19517  [pdf, ps, other]

    math.OC stat.ML

    Inexact subgradient methods for semialgebraic functions

    Authors: Jérôme Bolte, Tam Le, Éric Moulines, Edouard Pauwels

    Abstract: Motivated by the widespread use of approximate derivatives in machine learning and optimization, we study inexact subgradient methods with non-vanishing additive errors and step sizes. In the nonconvex semialgebraic setting, under boundedness assumptions, we prove that the method provides points that eventually fluctuate close to the critical set at a distance proportional to $ε^ρ$ where $ε$ is t…

    Submitted 30 April, 2024; originally announced April 2024.

  8. arXiv:2403.11407  [pdf, other]

    stat.ML cs.LG

    Divide-and-Conquer Posterior Sampling for Denoising Diffusion Priors

    Authors: Yazid Janati, Alain Durmus, Eric Moulines, Jimmy Olsson

    Abstract: Interest in the use of Denoising Diffusion Models (DDM) as priors for solving inverse Bayesian problems has recently increased significantly. However, sampling from the resulting posterior distribution poses a challenge. To solve this problem, previous works have proposed approximations to bias the drift term of the diffusion. In this work, we take a different approach and utilize the specific str…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: preprint

  9. arXiv:2403.03811  [pdf, other]

    stat.ML cs.GT cs.LG

    Incentivized Learning in Principal-Agent Bandit Games

    Authors: Antoine Scheid, Daniil Tiapkin, Etienne Boursier, Aymeric Capitaine, El Mahdi El Mhamdi, Eric Moulines, Michael I. Jordan, Alain Durmus

    Abstract: This work considers a repeated principal-agent bandit game, where the principal can only interact with her environment through the agent. The principal and the agent have misaligned objectives and the choice of action is only left to the agent. However, the principal can influence the agent's decisions by offering incentives which add up to his rewards. The principal aims to iteratively learn an i…

    Submitted 6 March, 2024; originally announced March 2024.

  10. arXiv:2402.04114  [pdf, other]

    stat.ML cs.LG math.OC

    SCAFFLSA: Taming Heterogeneity in Federated Linear Stochastic Approximation and TD Learning

    Authors: Paul Mangold, Sergey Samsonov, Safwan Labbi, Ilya Levin, Reda Alami, Alexey Naumov, Eric Moulines

    Abstract: In this paper, we analyze the sample and communication complexity of the federated linear stochastic approximation (FedLSA) algorithm. We explicitly quantify the effects of local training with agent heterogeneity. We show that the communication complexity of FedLSA scales polynomially with the inverse of the desired accuracy $ε$. To overcome this, we propose SCAFFLSA, a new variant of FedLSA that u…

    Submitted 27 May, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: now with linear speed-up!

  11. arXiv:2401.05388  [pdf, other]

    eess.SP cs.LG stat.ML

    Bayesian ECG reconstruction using denoising diffusion generative models

    Authors: Gabriel V. Cardoso, Lisa Bedin, Josselin Duchateau, Rémi Dubois, Eric Moulines

    Abstract: In this work, we propose a denoising diffusion generative model (DDGM) trained with healthy electrocardiogram (ECG) data that focuses on ECG morphology and inter-lead dependence. Our results show that this innovative generative model can successfully generate realistic ECG signals. Furthermore, we explore the application of recent breakthroughs in solving linear inverse Bayesian problems using DDG…

    Submitted 18 December, 2023; originally announced January 2024.

  12. arXiv:2312.15799  [pdf, other]

    stat.ML cs.LG

    Efficient Conformal Prediction under Data Heterogeneity

    Authors: Vincent Plassier, Nikita Kotelevskii, Aleksandr Rubashevskii, Fedor Noskov, Maksim Velikanov, Alexander Fishkov, Samuel Horvath, Martin Takac, Eric Moulines, Maxim Panov

    Abstract: Conformal Prediction (CP) stands out as a robust framework for uncertainty quantification, which is crucial for ensuring the reliability of predictions. However, common CP methods heavily rely on data exchangeability, a condition often violated in practice. Existing approaches for tackling non-exchangeability lead to methods that are not computable beyond the simplest examples. This work introduce…

    Submitted 25 December, 2023; originally announced December 2023.

    Comments: 28 pages

  13. arXiv:2310.18186  [pdf, other]

    stat.ML cs.LG

    Model-free Posterior Sampling via Learning Rate Randomization

    Authors: Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Remi Munos, Alexey Naumov, Pierre Perrault, Michal Valko, Pierre Menard

    Abstract: In this paper, we introduce Randomized Q-learning (RandQL), a novel randomized model-free algorithm for regret minimization in episodic Markov Decision Processes (MDPs). To the best of our knowledge, RandQL is the first tractable model-free posterior sampling-based algorithm. We analyze the performance of RandQL in both tabular and non-tabular metric space settings. In tabular MDPs, RandQL achieve…

    Submitted 27 October, 2023; originally announced October 2023.

    Comments: NeurIPS-2023

  14. arXiv:2310.17303  [pdf, ps, other]

    stat.ML cs.LG

    Demonstration-Regularized RL

    Authors: Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Alexey Naumov, Pierre Perrault, Michal Valko, Pierre Menard

    Abstract: Incorporating expert demonstrations has empirically helped to improve the sample efficiency of reinforcement learning (RL). This paper quantifies theoretically to what extent this extra information reduces RL's sample complexity. In particular, we study the demonstration-regularized reinforcement learning that leverages the expert demonstrations by KL-regularization for a policy learned by behavio…

    Submitted 10 June, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

    Comments: This revision fixes an error due to use of some incorrect results (Lemma 32, Corollary 11 by Talebi & Maillard, 2018) in the proof of Theorem 8. The condition for the RLHF results have slightly changed

  15. arXiv:2310.14286  [pdf, ps, other]

    stat.ML cs.LG math.OC

    Improved High-Probability Bounds for the Temporal Difference Learning Algorithm via Exponential Stability

    Authors: Sergey Samsonov, Daniil Tiapkin, Alexey Naumov, Eric Moulines

    Abstract: In this paper we consider the problem of obtaining sharp bounds for the performance of temporal difference (TD) methods with linear function approximation for policy evaluation in discounted Markov decision processes. We show that a simple algorithm with a universal and instance-independent step size together with Polyak-Ruppert tail averaging is sufficient to obtain near-optimal variance and bias…

    Submitted 15 June, 2024; v1 submitted 22 October, 2023; originally announced October 2023.

    Comments: Accepted to COLT-2024

    MSC Class: 62L20; 60J20

  16. arXiv:2308.07983  [pdf, other]

    stat.ML cs.LG stat.ME

    Monte Carlo guided Diffusion for Bayesian linear inverse problems

    Authors: Gabriel Cardoso, Yazid Janati El Idrissi, Sylvain Le Corff, Eric Moulines

    Abstract: Ill-posed linear inverse problems arise frequently in various applications, from computational photography to medical imaging. A recent line of research exploits Bayesian inference with informative priors to handle the ill-posedness of such problems. Amongst such priors, score-based generative models (SGM) have recently been successfully applied to several different inverse problems. In this study…

    Submitted 25 October, 2023; v1 submitted 15 August, 2023; originally announced August 2023.

    Comments: preprint

  17. arXiv:2307.04779  [pdf, other]

    stat.ML math.PR math.ST

    Law of Large Numbers for Bayesian two-layer Neural Network trained with Variational Inference

    Authors: Arnaud Descours, Tom Huix, Arnaud Guillin, Manon Michel, Éric Moulines, Boris Nectoux

    Abstract: We provide a rigorous analysis of training by variational inference (VI) of Bayesian neural networks in the two-layer and infinite-width case. We consider a regression problem with a regularized evidence lower bound (ELBO) which is decomposed into the expected log-likelihood of the data and the Kullback-Leibler (KL) divergence between the a priori distribution and the variational posterior. With a…

    Submitted 10 July, 2023; originally announced July 2023.

  18. arXiv:2306.05131  [pdf, other]

    stat.ML cs.LG

    Conformal Prediction for Federated Uncertainty Quantification Under Label Shift

    Authors: Vincent Plassier, Mehdi Makni, Aleksandr Rubashevskii, Eric Moulines, Maxim Panov

    Abstract: Federated Learning (FL) is a machine learning framework where many clients collaboratively train models while keeping the training data decentralized. Despite recent advances in FL, the topic of uncertainty quantification (UQ) remains only partially addressed. Among UQ methods, conformal prediction (CP) approaches provide distribution-free guarantees under minimal assumptions. We develop a new federated…

    Submitted 24 October, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: ICML 2023

  19. arXiv:2306.00684  [pdf, other]

    cs.LG stat.ML

    Balanced Training of Energy-Based Models with Adaptive Flow Sampling

    Authors: Louis Grenioux, Éric Moulines, Marylou Gabrié

    Abstract: Energy-based models (EBMs) are versatile density estimation models that directly parameterize an unnormalized log density. Although very flexible, EBMs lack a specified normalization constant of the model, making the likelihood of the model computationally intractable. Several approximate samplers and variational inference techniques have been proposed to estimate the likelihood gradients for trai…

    Submitted 18 February, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

  20. arXiv:2305.16099  [pdf, other]

    cs.LG stat.ML

    FAVANO: Federated AVeraging with Asynchronous NOdes

    Authors: Louis Leconte, Van Minh Nguyen, Eric Moulines

    Abstract: In this paper, we propose a novel centralized Asynchronous Federated Learning (FL) framework, FAVANO, for training Deep Neural Networks (DNNs) in resource-constrained environments. Despite its popularity, "classical" federated learning faces the increasingly difficult task of scaling synchronous communication over large wireless networks. Moreover, clients typically have different computing reso…

    Submitted 22 November, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

  21. arXiv:2305.15938  [pdf, ps, other]

    math.OC cs.LG stat.ML

    First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities

    Authors: Aleksandr Beznosikov, Sergey Samsonov, Marina Sheshukova, Alexander Gasnikov, Alexey Naumov, Eric Moulines

    Abstract: This paper delves into stochastic optimization problems that involve Markovian noise. We present a unified approach for the theoretical analysis of first-order gradient methods for stochastic optimization and variational inequalities. Our approach covers scenarios for both non-convex and strongly convex minimization problems. To achieve an optimal (linear) dependence on the mixing time of the unde…

    Submitted 30 March, 2024; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: Appears in: Advances in Neural Information Processing Systems 36 (NeurIPS 2023). 41 pages, 3 algorithms, 2 tables

    Journal ref: https://proceedings.neurips.cc/paper_files/paper/2023/hash/8c3e38ce55a0fa44bc325bc6fdb7f4e5-Abstract-Conference.html

  22. arXiv:2304.14421  [pdf, other]

    cs.LG stat.ML

    One-Step Distributional Reinforcement Learning

    Authors: Mastane Achab, Reda Alami, Yasser Abdelaziz Dahou Djilali, Kirill Fedyanin, Eric Moulines

    Abstract: Reinforcement learning (RL) allows an agent interacting sequentially with an environment to maximize its long-term expected return. In the distributional RL (DistrRL) paradigm, the agent goes beyond the limit of the expected value, to capture the underlying probability distribution of the return across all time steps. The set of DistrRL algorithms has led to improved empirical performance. Neverth…

    Submitted 27 April, 2023; originally announced April 2023.

  23. arXiv:2304.00232  [pdf, other]

    cs.LG cs.AI stat.ML

    Restarted Bayesian Online Change-point Detection for Non-Stationary Markov Decision Processes

    Authors: Reda Alami, Mohammed Mahfoud, Eric Moulines

    Abstract: We consider the problem of learning in a non-stationary reinforcement learning (RL) environment, where the setting can be fully described by a piecewise stationary discrete-time Markov decision process (MDP). We introduce a variant of the Restarted Bayesian Online Change-Point Detection algorithm (R-BOCPD) that operates on input streams originating from the more general multinomial distribution an…

    Submitted 1 April, 2023; originally announced April 2023.

  24. arXiv:2303.09261  [pdf, other]

    math.OC stat.ML

    Orthogonal Directions Constrained Gradient Method: from non-linear equality constraints to Stiefel manifold

    Authors: Sholom Schechtman, Daniil Tiapkin, Michael Muehlebach, Eric Moulines

    Abstract: We consider the problem of minimizing a non-convex function over a smooth manifold $\mathcal{M}$. We propose a novel algorithm, the Orthogonal Directions Constrained Gradient Method (ODCGM) which only requires computing a projection onto a vector space. ODCGM is infeasible but the iterates are constantly pulled towards the manifold, ensuring the convergence of ODCGM towards $\mathcal{M}$. ODCGM is…

    Submitted 16 March, 2023; originally announced March 2023.

  25. arXiv:2303.08059  [pdf, other]

    stat.ML cs.LG

    Fast Rates for Maximum Entropy Exploration

    Authors: Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Remi Munos, Alexey Naumov, Pierre Perrault, Yunhao Tang, Michal Valko, Pierre Menard

    Abstract: We address the challenge of exploration in reinforcement learning (RL) when the agent operates in an unknown environment with sparse or no rewards. In this work, we study the maximum entropy exploration problem of two different types. The first type is visitation entropy maximization previously considered by Hazan et al. (2019) in the discounted setting. For this type of exploration, we propose a g…

    Submitted 6 June, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

    Comments: ICML-2023

  26. arXiv:2303.05838  [pdf, ps, other]

    math.PR math.ST stat.ML

    Rosenthal-type inequalities for linear statistics of Markov chains

    Authors: Alain Durmus, Eric Moulines, Alexey Naumov, Sergey Samsonov, Marina Sheshukova

    Abstract: In this paper, we establish novel deviation bounds for additive functionals of geometrically ergodic Markov chains similar to Rosenthal and Bernstein inequalities for sums of independent random variables. We pay special attention to the dependence of our bounds on the mixing time of the corresponding chain. More precisely, we establish explicit bounds that are linked to the constants from the mart…

    Submitted 28 June, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

    MSC Class: 60E15; 60J20; 65C40

  27. arXiv:2302.11147  [pdf, other]

    math.OC stat.ML

    Stochastic Approximation Beyond Gradient for Signal Processing and Machine Learning

    Authors: Aymeric Dieuleveut, Gersende Fort, Eric Moulines, Hoi-To Wai

    Abstract: Stochastic Approximation (SA) is a classical algorithm that has had, since its early days, a huge impact on signal processing, and nowadays on machine learning, due to the necessity to deal with a large amount of data observed with uncertainties. An exemplar special case of SA pertains to the popular stochastic (sub)gradient algorithm which is the workhorse behind many important applications. A…

    Submitted 16 July, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

    Comments: Accepted for publication at IEEE Transactions on Signal Processing; 31 pages, 7 pages of supplementary materials

  28. arXiv:2302.04763  [pdf, other]

    stat.ML cs.LG

    On Sampling with Approximate Transport Maps

    Authors: Louis Grenioux, Alain Durmus, Éric Moulines, Marylou Gabrié

    Abstract: Transport maps can ease the sampling of distributions with non-trivial geometries by transforming them into distributions that are easier to handle. The potential of this approach has risen with the development of Normalizing Flows (NF) which are maps parameterized with deep neural networks trained to push a reference distribution towards a target. NF-enhanced samplers recently proposed blend (Mar…

    Submitted 18 February, 2024; v1 submitted 9 February, 2023; originally announced February 2023.

  29. arXiv:2301.00900  [pdf, other]

    stat.ME stat.ML

    State and parameter learning with PaRIS particle Gibbs

    Authors: Gabriel Cardoso, Yazid Janati El Idrissi, Sylvain Le Corff, Eric Moulines, Jimmy Olsson

    Abstract: Non-linear state-space models, also known as general hidden Markov models, are ubiquitous in statistical machine learning, being the most classical generative models for serial data and sequences in general. The particle-based, rapid incremental smoother PaRIS is a sequential Monte Carlo (SMC) technique allowing for efficient online approximation of expectations of additive functionals under the s…

    Submitted 2 January, 2023; originally announced January 2023.

    Comments: preprint. arXiv admin note: text overlap with arXiv:2209.10351

  30. arXiv:2211.03741  [pdf, other]

    stat.ML cs.LG

    AskewSGD : An Annealed interval-constrained Optimisation method to train Quantized Neural Networks

    Authors: Louis Leconte, Sholom Schechtman, Eric Moulines

    Abstract: In this paper, we develop a new algorithm, Annealed Skewed SGD - AskewSGD - for training deep neural networks (DNNs) with quantized weights. First, we formulate the training of quantized neural networks (QNNs) as a smoothed sequence of interval-constrained optimization problems. Then, we propose a new first-order stochastic method, AskewSGD, to solve each constrained optimization subproblem. Unlik…

    Submitted 20 December, 2022; v1 submitted 7 November, 2022; originally announced November 2022.

  31. arXiv:2211.00100  [pdf, other]

    stat.ML cs.LG

    Federated Averaging Langevin Dynamics: Toward a unified theory and new algorithms

    Authors: Vincent Plassier, Alain Durmus, Eric Moulines

    Abstract: This paper focuses on Bayesian inference in a federated learning (FL) context. While several distributed MCMC algorithms have been proposed, few consider the specific limitations of FL such as communication bottlenecks and statistical heterogeneity. Recently, Federated Averaging Langevin Dynamics (FALD) was introduced, which extends the Federated Averaging algorithm to Bayesian inference. We obtai…

    Submitted 31 October, 2022; originally announced November 2022.

    Comments: 58 pages

  32. arXiv:2209.14414  [pdf, other]

    stat.ML cs.LG

    Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees

    Authors: Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Remi Munos, Alexey Naumov, Mark Rowland, Michal Valko, Pierre Menard

    Abstract: We consider reinforcement learning in an environment modeled by an episodic, finite, stage-dependent Markov decision process of horizon $H$ with $S$ states, and $A$ actions. The performance of an agent is measured by the regret after interacting with the environment for $T$ episodes. We propose an optimistic posterior sampling algorithm for reinforcement learning (OPSRL), a simple variant of poste…

    Submitted 28 September, 2022; originally announced September 2022.

    Comments: arXiv admin note: text overlap with arXiv:2205.07704

  33. arXiv:2209.10351  [pdf, other]

    stat.ME

    Particle-based, rapid incremental smoother meets particle Gibbs

    Authors: Gabriel Cardoso, Eric Moulines, Jimmy Olsson

    Abstract: The particle-based, rapid incremental smoother (PARIS) is a sequential Monte Carlo technique allowing for efficient online approximation of expectations of additive functionals under Feynman-Kac path distributions. Under weak assumptions, the algorithm has linear computational complexity and limited memory requirements. It also comes with a number of non-asymptotic bounds and convergence results.…

    Submitted 21 September, 2022; originally announced September 2022.

    Comments: Preprint

  34. arXiv:2207.06364  [pdf, other]

    stat.ML cs.LG stat.CO

    BR-SNIS: Bias Reduced Self-Normalized Importance Sampling

    Authors: Gabriel Cardoso, Sergey Samsonov, Achille Thin, Eric Moulines, Jimmy Olsson

    Abstract: Importance Sampling (IS) is a method for approximating expectations under a target distribution using independent samples from a proposal distribution and the associated importance weights. In many applications, the target distribution is known only up to a normalization constant, in which case self-normalized IS (SNIS) can be used. While the use of self-normalization can have a positive effect on…

    Submitted 13 September, 2022; v1 submitted 13 July, 2022; originally announced July 2022.

  35. arXiv:2207.04475  [pdf, ps, other]

    stat.ML cs.LG math.PR math.ST

    Finite-time High-probability Bounds for Polyak-Ruppert Averaged Iterates of Linear Stochastic Approximation

    Authors: Alain Durmus, Eric Moulines, Alexey Naumov, Sergey Samsonov

    Abstract: This paper provides a finite-time analysis of linear stochastic approximation (LSA) algorithms with fixed step size, a core method in statistics and machine learning. LSA is used to compute approximate solutions of a $d$-dimensional linear system $\bar{\mathbf{A}} θ = \bar{\mathbf{b}}$ for which $(\bar{\mathbf{A}}, \bar{\mathbf{b}})$ can only be estimated by (asymptotically) unbiased observations…

    Submitted 29 March, 2023; v1 submitted 10 July, 2022; originally announced July 2022.

    MSC Class: 62L20; 60J20

  36. arXiv:2207.03859  [pdf, other]

    stat.ML cs.LG

    Variational Inference of overparameterized Bayesian Neural Networks: a theoretical and empirical study

    Authors: Tom Huix, Szymon Majewski, Alain Durmus, Eric Moulines, Anna Korba

    Abstract: This paper studies the Variational Inference (VI) used for training Bayesian Neural Networks (BNN) in the overparameterized regime, i.e., when the number of neurons tends to infinity. More specifically, we consider overparameterized two-layer BNN and point out a critical issue in the mean-field VI training. This problem arises from the decomposition of the lower bound on the evidence (ELBO) into t…

    Submitted 8 July, 2022; originally announced July 2022.

  37. arXiv:2206.03611  [pdf, other]

    cs.LG stat.ME stat.ML

    FedPop: A Bayesian Approach for Personalised Federated Learning

    Authors: Nikita Kotelevskii, Maxime Vono, Eric Moulines, Alain Durmus

    Abstract: Personalised federated learning (FL) aims at collaboratively learning a machine learning model tailored for each client. Although promising advances have been made in this direction, most existing approaches do not allow for uncertainty quantification, which is crucial in many applications. In addition, personalisation in the cross-device setting still involves important issues, especially f…

    Submitted 26 January, 2023; v1 submitted 7 June, 2022; originally announced June 2022.

  38. arXiv:2205.07704  [pdf, other]

    stat.ML cs.LG

    From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses

    Authors: Daniil Tiapkin, Denis Belomestny, Eric Moulines, Alexey Naumov, Sergey Samsonov, Yunhao Tang, Michal Valko, Pierre Menard

    Abstract: We propose the Bayes-UCBVI algorithm for reinforcement learning in tabular, stage-dependent, episodic Markov decision processes: a natural extension of the Bayes-UCB algorithm by Kaufmann et al. (2012) for multi-armed bandits. Our method uses the quantile of a Q-value function posterior as an upper confidence bound on the optimal Q-value function. For Bayes-UCBVI, we prove a regret bound of order…

    Submitted 22 June, 2022; v1 submitted 16 May, 2022; originally announced May 2022.

  39. arXiv:2202.04895  [pdf, other]

    stat.ML

    Diffusion bridges vector quantized Variational AutoEncoders

    Authors: Max Cohen, Guillaume Quispe, Sylvain Le Corff, Charles Ollion, Eric Moulines

    Abstract: Vector Quantized-Variational AutoEncoders (VQ-VAE) are generative models based on discrete latent representations of the data, where inputs are mapped to a finite set of learned embeddings. To generate new samples, an autoregressive prior distribution over the discrete states must be trained separately. This prior is generally very complex and leads to slow generation. In this work, we propose a ne…

    Submitted 3 August, 2022; v1 submitted 10 February, 2022; originally announced February 2022.

  40. arXiv:2201.01951  [pdf, ps, other]

    stat.CO math.NA math.PR

    On the geometric convergence for MALA under verifiable conditions

    Authors: Alain Durmus, Éric Moulines

    Abstract: While the Metropolis Adjusted Langevin Algorithm (MALA) is a popular and widely used Markov chain Monte Carlo method, very few papers derive conditions that ensure its convergence. In particular, to the authors' knowledge, assumptions that are both easy to verify and guarantee geometric convergence are still missing. In this work, we establish $V$-uniformly geometric convergence for MALA under mi…

    Submitted 6 January, 2022; originally announced January 2022.

  41. arXiv:2111.02702  [pdf, other]

    stat.ML cs.LG

    Local-Global MCMC kernels: the best of both worlds

    Authors: Sergey Samsonov, Evgeny Lagutin, Marylou Gabrié, Alain Durmus, Alexey Naumov, Eric Moulines

    Abstract: Recent works leveraging learning to enhance sampling have shown promising results, in particular by designing effective non-local moves and global proposals. However, learning accuracy is inevitably limited in regions where little data is available such as in the tails of distributions as well as in high-dimensional problems. In the present paper we study an Explore-Exploit Markov chain Monte Carl…

    Submitted 4 October, 2022; v1 submitted 4 November, 2021; originally announced November 2021.

    Comments: arXiv admin note: text overlap with arXiv:1111.5421 by other authors

  42. arXiv:2107.14542  [pdf, ps, other]

    math.PR math.NA stat.CO

    Uniform minorization condition and convergence bounds for discretizations of kinetic Langevin dynamics

    Authors: Alain Durmus, Aurélien Enfroy, Éric Moulines, Gabriel Stoltz

    Abstract: We study the convergence in total variation and $V$-norm of discretization schemes of the underdamped Langevin dynamics. Such algorithms are very popular and commonly used in molecular dynamics and computational statistics to approximately sample from a target distribution of interest. We show first that, for a very large class of schemes, a minorization condition uniform in the stepsize holds.…

    Submitted 21 April, 2023; v1 submitted 30 July, 2021; originally announced July 2021.

  43. arXiv:2106.15921  [pdf, other]

    stat.ML cs.LG

    Monte Carlo Variational Auto-Encoders

    Authors: Achille Thin, Nikita Kotelevskii, Arnaud Doucet, Alain Durmus, Eric Moulines, Maxim Panov

    Abstract: Variational auto-encoders (VAE) are popular deep latent variable models which are trained by maximizing an Evidence Lower Bound (ELBO). To obtain tighter ELBO and hence better variational approximations, it has been proposed to use importance sampling to get a lower variance estimate of the evidence. However, importance sampling is known to perform poorly in high dimensions. While it has been sugg…

    Submitted 30 June, 2021; originally announced June 2021.

  44. arXiv:2106.06300  [pdf, other]

    stat.ME cs.AI cs.LG stat.CO

    DG-LMC: A Turn-key and Scalable Synchronous Distributed MCMC Algorithm via Langevin Monte Carlo within Gibbs

    Authors: Vincent Plassier, Maxime Vono, Alain Durmus, Eric Moulines

    Abstract: Performing reliable Bayesian inference on a big data scale is becoming a keystone in the modern era of machine learning. A workhorse class of methods to achieve this task are Markov chain Monte Carlo (MCMC) algorithms and their design to handle distributed datasets has been the subject of many works. However, existing methods are either not completely reliable or not computationally efficient. In this…

    Submitted 18 June, 2021; v1 submitted 11 June, 2021; originally announced June 2021.

    Comments: 77 pages. Accepted for publication at ICML 2021, to appear

  45. arXiv:2106.01257  [pdf, ps, other]

    stat.ML cs.LG math.PR math.ST

    Tight High Probability Bounds for Linear Stochastic Approximation with Fixed Stepsize

    Authors: Alain Durmus, Eric Moulines, Alexey Naumov, Sergey Samsonov, Kevin Scaman, Hoi-To Wai

    Abstract: This paper provides a non-asymptotic analysis of linear stochastic approximation (LSA) algorithms with fixed stepsize. This family of methods arises in many machine learning tasks and is used to obtain approximate solutions of a linear system $\bar{A}\theta = \bar{b}$ for which $\bar{A}$ and $\bar{b}$ can only be accessed through random estimates $\{({\bf A}_n, {\bf b}_n): n \in \mathbb{N}^*\}$. Our ana…

    Submitted 2 June, 2021; originally announced June 2021.

    Comments: 21 pages

  46. arXiv:2106.00797  [pdf, other]

    cs.LG cs.AI stat.CO stat.ME stat.ML

    QLSD: Quantised Langevin stochastic dynamics for Bayesian federated learning

    Authors: Maxime Vono, Vincent Plassier, Alain Durmus, Aymeric Dieuleveut, Eric Moulines

    Abstract: The objective of Federated Learning (FL) is to perform statistical inference for data which are decentralised and stored locally on networked clients. FL raises many constraints which include privacy and data ownership, communication overhead, statistical heterogeneity, and partial client participation. In this paper, we address these problems in the framework of the Bayesian paradigm. To this end…

    Submitted 31 May, 2022; v1 submitted 1 June, 2021; originally announced June 2021.

  47. arXiv:2103.10943  [pdf, other]

    stat.CO stat.ME stat.ML

    NEO: Non Equilibrium Sampling on the Orbit of a Deterministic Transform

    Authors: Achille Thin, Yazid Janati, Sylvain Le Corff, Charles Ollion, Arnaud Doucet, Alain Durmus, Eric Moulines, Christian Robert

    Abstract: Sampling from a complex distribution $π$ and approximating its intractable normalizing constant Z are challenging problems. In this paper, a novel family of importance samplers (IS) and Markov chain Monte Carlo (MCMC) samplers is derived. Given an invertible map T, these schemes combine (with weights) elements from the forward and backward orbits through points sampled from a proposal distributi…

    Submitted 23 August, 2021; v1 submitted 17 March, 2021; originally announced March 2021.

  48. arXiv:2102.07586  [pdf, other]

    stat.ML cs.LG math.PR

    On Riemannian Stochastic Approximation Schemes with Fixed Step-Size

    Authors: Alain Durmus, Pablo Jiménez, Éric Moulines, Salem Said

    Abstract: This paper studies fixed step-size stochastic approximation (SA) schemes, including stochastic gradient schemes, in a Riemannian framework. It is motivated by several applications, where geodesics can be computed explicitly, and their use accelerates crude Euclidean methods. A fixed step-size scheme defines a family of time-homogeneous Markov chains, parametrized by the step-size. Here, using this…

    Submitted 19 February, 2021; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: 37 pages, 4 figures, to appear in AISTATS 2021

    MSC Class: 60F05

  49. arXiv:2102.00199  [pdf, ps, other]

    math.ST stat.ML

    Rates of convergence for density estimation with generative adversarial networks

    Authors: Nikita Puchkin, Sergey Samsonov, Denis Belomestny, Eric Moulines, Alexey Naumov

    Abstract: In this work we undertake a thorough study of the non-asymptotic properties of the vanilla generative adversarial networks (GANs). We prove an oracle inequality for the Jensen-Shannon (JS) divergence between the underlying density $\mathsf{p}^*$ and the GAN estimate with a significantly better statistical error term compared to the previously known results. The advantage of our bound becomes clear…

    Submitted 25 January, 2024; v1 submitted 30 January, 2021; originally announced February 2021.

    Comments: To appear in Journal of Machine Learning Research

  50. arXiv:2102.00185  [pdf, ps, other]

    stat.ML cs.LG math.PR math.ST

    On the Stability of Random Matrix Product with Markovian Noise: Application to Linear Stochastic Approximation and TD Learning

    Authors: Alain Durmus, Eric Moulines, Alexey Naumov, Sergey Samsonov, Hoi-To Wai

    Abstract: This paper studies the exponential stability of random matrix products driven by a general (possibly unbounded) state space Markov chain. It is a cornerstone in the analysis of stochastic algorithms in machine learning (e.g. for parameter tracking in online learning or reinforcement learning). The existing results impose strong conditions such as uniform boundedness of the matrix-valued functions…

    Submitted 30 January, 2021; originally announced February 2021.