-
Improving GFlowNets with Monte Carlo Tree Search
Authors:
Nikita Morozov,
Daniil Tiapkin,
Sergey Samsonov,
Alexey Naumov,
Dmitry Vetrov
Abstract:
Generative Flow Networks (GFlowNets) treat sampling from distributions over compositional discrete spaces as a sequential decision-making problem, training a stochastic policy to construct objects step by step. Recent studies have revealed strong connections between GFlowNets and entropy-regularized reinforcement learning. Building on these insights, we propose to enhance planning capabilities of GFlowNets by applying Monte Carlo Tree Search (MCTS). Specifically, we show how the MENTS algorithm (Xiao et al., 2019) can be adapted for GFlowNets and used during both training and inference. Our experiments demonstrate that this approach improves the sample efficiency of GFlowNet training and the generation fidelity of pre-trained GFlowNet models.
Submitted 19 June, 2024;
originally announced June 2024.
-
Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning
Authors:
Sergey Samsonov,
Eric Moulines,
Qi-Man Shao,
Zhuo-Song Zhang,
Alexey Naumov
Abstract:
In this paper, we obtain the Berry-Esseen bound for multivariate normal approximation for the Polyak-Ruppert averaged iterates of the linear stochastic approximation (LSA) algorithm with decreasing step size. Our findings reveal that the fastest rate of normal approximation is achieved when setting the most aggressive step size $α_{k} \asymp k^{-1/2}$. Moreover, we prove the non-asymptotic validity of the confidence intervals for parameter estimation with LSA based on multiplier bootstrap. This procedure updates the LSA estimate together with a set of randomly perturbed LSA estimates upon the arrival of subsequent observations. We illustrate our findings in the setting of temporal difference learning with linear function approximation.
Submitted 26 May, 2024;
originally announced May 2024.
-
Queuing dynamics of asynchronous Federated Learning
Authors:
Louis Leconte,
Matthieu Jonckheere,
Sergey Samsonov,
Eric Moulines
Abstract:
We study asynchronous federated learning mechanisms with nodes having potentially different computational speeds. In such an environment, each node is allowed to work on models with potential delays and contribute to updates to the central server at its own pace. Existing analyses of such algorithms typically depend on intractable quantities such as the maximum node delay and do not consider the underlying queuing dynamics of the system. In this paper, we propose a non-uniform sampling scheme for the central server that allows for lower delays with better complexity, taking into account the closed Jackson network structure of the associated computational graph. Our experiments clearly show a significant improvement of our method over current state-of-the-art asynchronous algorithms on an image classification problem.
Submitted 12 February, 2024;
originally announced May 2024.
-
SCAFFLSA: Taming Heterogeneity in Federated Linear Stochastic Approximation and TD Learning
Authors:
Paul Mangold,
Sergey Samsonov,
Safwan Labbi,
Ilya Levin,
Reda Alami,
Alexey Naumov,
Eric Moulines
Abstract:
In this paper, we analyze the sample and communication complexity of the federated linear stochastic approximation (FedLSA) algorithm. We explicitly quantify the effects of local training with agent heterogeneity. We show that the communication complexity of FedLSA scales polynomially with the inverse of the desired accuracy $ε$. To overcome this, we propose SCAFFLSA, a new variant of FedLSA that uses control variates to correct for client drift, and establish its sample and communication complexities. We show that for statistically heterogeneous agents, its communication complexity scales logarithmically with the desired accuracy, similar to Scaffnew. An important finding is that, compared to the existing results for Scaffnew, the sample complexity scales with the inverse of the number of agents, a property referred to as linear speed-up. Achieving this linear speed-up requires completely new theoretical arguments. We apply the proposed method to federated temporal difference learning with linear function approximation and analyze the corresponding complexity improvements.
Submitted 27 May, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
Improved High-Probability Bounds for the Temporal Difference Learning Algorithm via Exponential Stability
Authors:
Sergey Samsonov,
Daniil Tiapkin,
Alexey Naumov,
Eric Moulines
Abstract:
In this paper we consider the problem of obtaining sharp bounds for the performance of temporal difference (TD) methods with linear function approximation for policy evaluation in discounted Markov decision processes. We show that a simple algorithm with a universal and instance-independent step size together with Polyak-Ruppert tail averaging is sufficient to obtain near-optimal variance and bias terms. We also provide the respective sample complexity bounds. Our proof technique is based on refined error bounds for linear stochastic approximation together with the novel stability result for the product of random matrices that arise from the TD-type recurrence.
Submitted 15 June, 2024; v1 submitted 22 October, 2023;
originally announced October 2023.
-
First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities
Authors:
Aleksandr Beznosikov,
Sergey Samsonov,
Marina Sheshukova,
Alexander Gasnikov,
Alexey Naumov,
Eric Moulines
Abstract:
This paper delves into stochastic optimization problems that involve Markovian noise. We present a unified approach for the theoretical analysis of first-order gradient methods for stochastic optimization and variational inequalities. Our approach covers scenarios for both non-convex and strongly convex minimization problems. To achieve an optimal (linear) dependence on the mixing time of the underlying noise sequence, we use the randomized batching scheme, which is based on the multilevel Monte Carlo method. Moreover, our technique allows us to eliminate the limiting assumptions of previous research on Markov noise, such as the need for a bounded domain and uniformly bounded stochastic gradients. Our extension to variational inequalities under Markovian noise is original. Additionally, we provide lower bounds that match the oracle complexity of our method in the case of strongly convex optimization problems.
Submitted 30 March, 2024; v1 submitted 25 May, 2023;
originally announced May 2023.
-
Theoretical guarantees for neural control variates in MCMC
Authors:
Denis Belomestny,
Artur Goldman,
Alexey Naumov,
Sergey Samsonov
Abstract:
In this paper, we propose a variance reduction approach for Markov chains based on additive control variates and the minimization of an appropriate estimate for the asymptotic variance. We focus on the particular case when control variates are represented as deep neural networks. We derive the optimal convergence rate of the asymptotic variance under various ergodicity assumptions on the underlying Markov chain. The proposed approach relies upon recent results on the stochastic errors of variance reduction algorithms and function approximation theory.
Submitted 3 April, 2023;
originally announced April 2023.
-
Rosenthal-type inequalities for linear statistics of Markov chains
Authors:
Alain Durmus,
Eric Moulines,
Alexey Naumov,
Sergey Samsonov,
Marina Sheshukova
Abstract:
In this paper, we establish novel deviation bounds for additive functionals of geometrically ergodic Markov chains similar to Rosenthal and Bernstein inequalities for sums of independent random variables. We pay special attention to the dependence of our bounds on the mixing time of the corresponding chain. More precisely, we establish explicit bounds that are linked to the constants from the martingale version of the Rosenthal inequality, as well as the constants that characterize the mixing properties of the underlying Markov kernel. Finally, our proof technique is, to the best of our knowledge, new and based on a recurrent application of the Poisson decomposition.
Submitted 28 June, 2023; v1 submitted 10 March, 2023;
originally announced March 2023.
-
Opacity of relativistically underdense plasmas for extremely intense laser pulses
Authors:
M. A. Serebryakov,
A. S. Samsonov,
E. N. Nerush,
I. Yu. Kostyukov
Abstract:
It is generally believed that relativistically underdense plasma is transparent for intense laser radiation. However, particle-in-cell simulations reveal abnormal laser field absorption above an intensity threshold of about $3 \times 10^{24}~\mathrm{W}\,\mathrm{cm}^{-2}$ for a wavelength of $1~μ\mathrm{m}$. Above the threshold, a further increase of the laser intensity does not increase the propagation distance. The simulations take into account emission of hard photons and subsequent pair photoproduction in the laser field. These effects lead to the onset of a self-sustained electromagnetic cascade and to the formation of dense electron-positron ($e^+e^-$) plasma right inside the laser field. The plasma absorbs the field efficiently, which ensures the plasma opacity. The role of a weak longitudinal electron-ion electric field in the cascade growth is discussed.
Submitted 4 October, 2022;
originally announced October 2022.
-
High-order corrections to the radiation-free dynamics of an electron in the strongly radiation-dominated regime
Authors:
A. S. Samsonov,
E. N. Nerush,
I. Yu. Kostyukov
Abstract:
A system of reduced equations is proposed for the electron motion in the strongly radiation-dominated regime for an arbitrary electromagnetic field configuration. The developed approach is used to analyze various scenarios of electron dynamics in the strongly radiation-dominated regime: motion in rotating electric and magnetic fields, longitudinal acceleration in a plane wave and in a plasma wakefield. The obtained results show that the developed approach is able to describe features of the electron dynamics which are essential to a certain scenario, but which could not be captured in the framework of the original radiation-free approximation [A. S. Samsonov et al., Phys. Rev. A 98, 053858 (2018); A. Gonoskov and M. Marklund, Phys. Plasmas 25, 093109 (2018)]. The results are verified by numerical integration of non-reduced motion equations with account of radiation reaction in both semi-classical and fully quantum cases.
Submitted 1 August, 2022;
originally announced August 2022.
-
BR-SNIS: Bias Reduced Self-Normalized Importance Sampling
Authors:
Gabriel Cardoso,
Sergey Samsonov,
Achille Thin,
Eric Moulines,
Jimmy Olsson
Abstract:
Importance Sampling (IS) is a method for approximating expectations under a target distribution using independent samples from a proposal distribution and the associated importance weights. In many applications, the target distribution is known only up to a normalization constant, in which case self-normalized IS (SNIS) can be used. While the use of self-normalization can have a positive effect on the dispersion of the estimator, it introduces bias. In this work, we propose a new method, BR-SNIS, whose complexity is essentially the same as that of SNIS and which significantly reduces bias without increasing the variance. This method is a wrapper in the sense that it uses the same proposal samples and importance weights as SNIS, but makes clever use of iterated sampling--importance resampling (ISIR) to form a bias-reduced version of the estimator. We furnish the proposed algorithm with rigorous theoretical results, including new bias, variance and high-probability bounds, and these are illustrated by numerical examples.
Submitted 13 September, 2022; v1 submitted 13 July, 2022;
originally announced July 2022.
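As a reading aid for the BR-SNIS entry above, the baseline it wraps, plain self-normalized importance sampling, can be sketched as follows. This is a minimal illustration and not the BR-SNIS algorithm itself; the function name `snis_estimate`, the Gaussian target and proposal, and the sample size are all assumptions chosen for the example.

```python
import math
import random


def snis_estimate(f, log_target_unnorm, sample_proposal, log_proposal, n, rng):
    """Plain SNIS estimate of E_target[f(X)] (hypothetical helper, not BR-SNIS)."""
    xs = [sample_proposal(rng) for _ in range(n)]
    # Log importance weights; additive constants cancel after normalization.
    log_w = [log_target_unnorm(x) - log_proposal(x) for x in xs]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    return sum(wi * f(x) for wi, x in zip(w, xs)) / total


# Assumed toy setup: target N(1, 1) known up to a constant, proposal N(0, 2^2).
rng = random.Random(0)
estimate = snis_estimate(
    f=lambda x: x,
    log_target_unnorm=lambda x: -0.5 * (x - 1.0) ** 2,
    sample_proposal=lambda r: r.gauss(0.0, 2.0),
    log_proposal=lambda x: -0.5 * (x / 2.0) ** 2,
    n=20000,
    rng=rng,
)
print(estimate)  # should be close to the target mean 1.0
```

The normalization by the sum of weights is what removes the need for the target's normalizing constant, and it is also the source of the bias that the abstract refers to.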
-
Finite-time High-probability Bounds for Polyak-Ruppert Averaged Iterates of Linear Stochastic Approximation
Authors:
Alain Durmus,
Eric Moulines,
Alexey Naumov,
Sergey Samsonov
Abstract:
This paper provides a finite-time analysis of linear stochastic approximation (LSA) algorithms with fixed step size, a core method in statistics and machine learning. LSA is used to compute approximate solutions of a $d$-dimensional linear system $\bar{\mathbf{A}} θ = \bar{\mathbf{b}}$ for which $(\bar{\mathbf{A}}, \bar{\mathbf{b}})$ can only be estimated by (asymptotically) unbiased observations $\{(\mathbf{A}(Z_n),\mathbf{b}(Z_n))\}_{n \in \mathbb{N}}$. We consider here the case where $\{Z_n\}_{n \in \mathbb{N}}$ is an i.i.d. sequence or a uniformly geometrically ergodic Markov chain. We derive $p$-th moment and high-probability deviation bounds for the iterates defined by LSA and its Polyak-Ruppert-averaged version. Our finite-time instance-dependent bounds for the averaged LSA iterates are sharp in the sense that the leading term we obtain coincides with the local asymptotic minimax limit. Moreover, the remainder terms of our bounds admit a tight dependence on the mixing time $t_{\operatorname{mix}}$ of the underlying chain and the norm of the noise variables. We emphasize that our result requires the SA step size to scale only with the logarithm of the problem dimension $d$.
Submitted 29 March, 2023; v1 submitted 10 July, 2022;
originally announced July 2022.
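The fixed-step LSA recursion with Polyak-Ruppert averaging described in the entry above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's experimental setup: the function name `lsa_polyak_ruppert`, the 2-dimensional system, the additive i.i.d. Gaussian noise model, and the step size are all chosen here for the example.

```python
import random


def lsa_polyak_ruppert(a_bar, b_bar, step, n_iter, noise_std, rng):
    """Run theta_{k} = theta_{k-1} - step * (A_k theta_{k-1} - b_k) and average the iterates."""
    d = len(b_bar)
    theta = [0.0] * d
    avg = [0.0] * d
    for k in range(1, n_iter + 1):
        # Assumed noise model: unbiased observations (A_k, b_k) of (A_bar, b_bar).
        a_k = [[a_bar[i][j] + rng.gauss(0.0, noise_std) for j in range(d)]
               for i in range(d)]
        b_k = [b_bar[i] + rng.gauss(0.0, noise_std) for i in range(d)]
        residual = [sum(a_k[i][j] * theta[j] for j in range(d)) - b_k[i]
                    for i in range(d)]
        theta = [theta[i] - step * residual[i] for i in range(d)]
        # Running mean of the iterates (Polyak-Ruppert average).
        avg = [avg[i] + (theta[i] - avg[i]) / k for i in range(d)]
    return theta, avg


# Assumed toy system; its exact solution is theta* = (2/7, 6/7).
a_bar = [[2.0, 0.5], [0.5, 1.0]]
b_bar = [1.0, 1.0]
last, averaged = lsa_polyak_ruppert(
    a_bar, b_bar, step=0.05, n_iter=20000, noise_std=0.1, rng=random.Random(0)
)
print(last, averaged)
```

With a fixed step size the last iterate keeps fluctuating around the solution, while the averaged iterate smooths those fluctuations out, which is the quantity the bounds in the abstract concern.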
-
Simultaneous approximation of a smooth function and its derivatives by deep neural networks with piecewise-polynomial activations
Authors:
Denis Belomestny,
Alexey Naumov,
Nikita Puchkin,
Sergey Samsonov
Abstract:
This paper investigates the approximation properties of deep neural networks with piecewise-polynomial activation functions. We derive the required depth, width, and sparsity of a deep neural network to approximate any Hölder smooth function up to a given approximation error in Hölder norms in such a way that all weights of this neural network are bounded by $1$. The latter feature is essential to control generalization errors in many statistical and machine learning applications.
Submitted 2 December, 2022; v1 submitted 19 June, 2022;
originally announced June 2022.
-
From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses
Authors:
Daniil Tiapkin,
Denis Belomestny,
Eric Moulines,
Alexey Naumov,
Sergey Samsonov,
Yunhao Tang,
Michal Valko,
Pierre Menard
Abstract:
We propose the Bayes-UCBVI algorithm for reinforcement learning in tabular, stage-dependent, episodic Markov decision process: a natural extension of the Bayes-UCB algorithm by Kaufmann et al. (2012) for multi-armed bandits. Our method uses the quantile of a Q-value function posterior as upper confidence bound on the optimal Q-value function. For Bayes-UCBVI, we prove a regret bound of order $\widetilde{O}(\sqrt{H^3SAT})$ where $H$ is the length of one episode, $S$ is the number of states, $A$ the number of actions, $T$ the number of episodes, that matches the lower-bound of $Ω(\sqrt{H^3SAT})$ up to poly-$\log$ terms in $H,S,A,T$ for a large enough $T$. To the best of our knowledge, this is the first algorithm that obtains an optimal dependence on the horizon $H$ (and $S$) without the need for an involved Bernstein-like bonus or noise. Crucial to our analysis is a new fine-grained anti-concentration bound for a weighted Dirichlet sum that can be of independent interest. We then explain how Bayes-UCBVI can be easily extended beyond the tabular setting, exhibiting a strong link between our algorithm and Bayesian bootstrap (Rubin, 1981).
Submitted 22 June, 2022; v1 submitted 16 May, 2022;
originally announced May 2022.
-
Local-Global MCMC kernels: the best of both worlds
Authors:
Sergey Samsonov,
Evgeny Lagutin,
Marylou Gabrié,
Alain Durmus,
Alexey Naumov,
Eric Moulines
Abstract:
Recent works leveraging learning to enhance sampling have shown promising results, in particular by designing effective non-local moves and global proposals. However, learning accuracy is inevitably limited in regions where little data is available such as in the tails of distributions as well as in high-dimensional problems. In the present paper we study an Explore-Exploit Markov chain Monte Carlo strategy ($Ex^2MCMC$) that combines local and global samplers showing that it enjoys the advantages of both approaches. We prove $V$-uniform geometric ergodicity of $Ex^2MCMC$ without requiring a uniform adaptation of the global sampler to the target distribution. We also compute explicit bounds on the mixing rate of the Explore-Exploit strategy under realistic conditions. Moreover, we also analyze an adaptive version of the strategy ($FlEx^2MCMC$) where a normalizing flow is trained while sampling to serve as a proposal for global moves. We illustrate the efficiency of $Ex^2MCMC$ and its adaptive version on classical sampling benchmarks as well as in sampling high-dimensional distributions defined by Generative Adversarial Networks seen as Energy Based Models. We provide the code to reproduce the experiments at the link: https://github.com/svsamsonov/ex2mcmc_new.
Submitted 4 October, 2022; v1 submitted 4 November, 2021;
originally announced November 2021.
-
Probability and moment inequalities for additive functionals of geometrically ergodic Markov chains
Authors:
Alain Durmus,
Eric Moulines,
Alexey Naumov,
Sergey Samsonov
Abstract:
In this paper, we establish moment and Bernstein-type inequalities for additive functionals of geometrically ergodic Markov chains. These inequalities extend the corresponding inequalities for independent random variables. Our conditions cover Markov chains converging geometrically to the stationary distribution either in $V$-norms or in weighted Wasserstein distances. Our inequalities apply to unbounded functions and depend explicitly on constants appearing in the conditions that we consider.
Submitted 15 June, 2023; v1 submitted 1 September, 2021;
originally announced September 2021.
-
Beamstrahlung-enhanced disruption in beam-beam interaction
Authors:
A. S. Samsonov,
E. N. Nerush,
I. Yu. Kostyukov,
M. Filipovic,
C. Baumann,
A. Pukhov
Abstract:
The radiation reaction (beamstrahlung) effect on particle dynamics during interaction of oppositely charged beams is studied. It is shown that the beam focusing can be strongly enhanced due to beamstrahlung. An approximate analytical solution of the motion equation including the radiation reaction force is derived. The disruption parameter is calculated for the classical and quantum regimes of beamstrahlung. The analytical model is verified by QED-PIC simulations. A model for head-on collision of long beams undergoing a number of betatron oscillations during interaction is also developed. It is demonstrated that the beamstrahlung-enhanced disruption effect can play a significant role in future lepton colliders with high-current particle beams.
Submitted 10 July, 2021;
originally announced July 2021.
-
Tight High Probability Bounds for Linear Stochastic Approximation with Fixed Stepsize
Authors:
Alain Durmus,
Eric Moulines,
Alexey Naumov,
Sergey Samsonov,
Kevin Scaman,
Hoi-To Wai
Abstract:
This paper provides a non-asymptotic analysis of linear stochastic approximation (LSA) algorithms with fixed stepsize. This family of methods arises in many machine learning tasks and is used to obtain approximate solutions of a linear system $\bar{A}θ = \bar{b}$ for which $\bar{A}$ and $\bar{b}$ can only be accessed through random estimates $\{({\bf A}_n, {\bf b}_n): n \in \mathbb{N}^*\}$. Our analysis is based on new results regarding moments and high probability bounds for products of matrices which are shown to be tight. We derive high probability bounds on the performance of LSA under weaker conditions on the sequence $\{({\bf A}_n, {\bf b}_n): n \in \mathbb{N}^*\}$ than previous works. However, in contrast, we establish polynomial concentration bounds with order depending on the stepsize. We show that our conclusions cannot be improved without additional assumptions on the sequence of random matrices $\{{\bf A}_n: n \in \mathbb{N}^*\}$, and in particular that no Gaussian or exponential high probability bounds can hold. Finally, we pay particular attention to establishing bounds with sharp order with respect to the number of iterations and the stepsize and whose leading terms contain the covariance matrices appearing in the central limit theorems.
Submitted 2 June, 2021;
originally announced June 2021.
-
UVIP: Model-Free Approach to Evaluate Reinforcement Learning Algorithms
Authors:
D. Belomestny,
I. Levin,
E. Moulines,
A. Naumov,
S. Samsonov,
V. Zorina
Abstract:
Policy evaluation is an important instrument for the comparison of different algorithms in Reinforcement Learning (RL). Yet even a precise knowledge of the value function $V^π$ corresponding to a policy $π$ does not provide reliable information on how far the policy $π$ is from the optimal one. We present a novel model-free upper value iteration procedure $({\sf UVIP})$ that allows us to estimate the suboptimality gap $V^{\star}(x) - V^π(x)$ from above and to construct confidence intervals for $V^\star$. Our approach relies on upper bounds to the solution of the Bellman optimality equation via a martingale approach. We provide theoretical guarantees for ${\sf UVIP}$ under general assumptions and illustrate its performance on a number of benchmark RL problems.
Submitted 3 June, 2021; v1 submitted 5 May, 2021;
originally announced May 2021.
-
Rates of convergence for density estimation with generative adversarial networks
Authors:
Nikita Puchkin,
Sergey Samsonov,
Denis Belomestny,
Eric Moulines,
Alexey Naumov
Abstract:
In this work we undertake a thorough study of the non-asymptotic properties of the vanilla generative adversarial networks (GANs). We prove an oracle inequality for the Jensen-Shannon (JS) divergence between the underlying density $\mathsf{p}^*$ and the GAN estimate with a significantly better statistical error term compared to the previously known results. The advantage of our bound becomes clear in application to nonparametric density estimation. We show that the JS-divergence between the GAN estimate and $\mathsf{p}^*$ decays as fast as $(\log{n}/n)^{2β/(2β + d)}$, where $n$ is the sample size and $β$ determines the smoothness of $\mathsf{p}^*$. This rate of convergence coincides (up to logarithmic factors) with the minimax optimal rate for the considered class of densities.
Submitted 25 January, 2024; v1 submitted 30 January, 2021;
originally announced February 2021.
-
On the Stability of Random Matrix Product with Markovian Noise: Application to Linear Stochastic Approximation and TD Learning
Authors:
Alain Durmus,
Eric Moulines,
Alexey Naumov,
Sergey Samsonov,
Hoi-To Wai
Abstract:
This paper studies the exponential stability of random matrix products driven by a general (possibly unbounded) state space Markov chain. It is a cornerstone in the analysis of stochastic algorithms in machine learning (e.g. for parameter tracking in online learning or reinforcement learning). The existing results impose strong conditions such as uniform boundedness of the matrix-valued functions and uniform ergodicity of the Markov chains. Our main contribution is an exponential stability result for the $p$-th moment of random matrix product, provided that (i) the underlying Markov chain satisfies a super-Lyapunov drift condition, (ii) the growth of the matrix-valued functions is controlled by an appropriately defined function (related to the drift condition). Using this result, we give finite-time $p$-th moment bounds for constant and decreasing stepsize linear stochastic approximation schemes with Markovian noise on general state space. We illustrate these findings for linear value-function estimation in reinforcement learning. We provide finite-time $p$-th moment bound for various members of temporal difference (TD) family of algorithms.
Submitted 30 January, 2021;
originally announced February 2021.
-
Hydrodynamical model of QED cascade expansion in an extremely strong laser pulse
Authors:
A. S. Samsonov,
I. Yu. Kostyukov,
E. N. Nerush
Abstract:
The development of a self-sustained quantum-electrodynamical (QED) cascade in a single strong laser pulse is studied analytically and numerically. The hydrodynamical approach is used to construct an analytical model of the cascade evolution, which includes the key features of the cascade observed in 3D QED particle-in-cell (QED-PIC) simulations, such as the magnetic field predominance in the cascade plasma and laser energy absorption. The equations of the model are derived in closed form and are solved numerically. Direct comparison between the solutions of the model equations and 3D QED-PIC simulations shows that our model is able to describe the complex nonlinear process of the cascade development qualitatively well. The various regimes of the interaction, which depend on the intensity of the laser pulse, are revealed in both the solutions of the model equations and the results of the QED-PIC simulations.
Submitted 27 October, 2020;
originally announced October 2020.
-
Variance reduction for dependent sequences with applications to Stochastic Gradient MCMC
Authors:
D. Belomestny,
L. Iosipoi,
E. Moulines,
A. Naumov,
S. Samsonov
Abstract:
In this paper we propose a novel and practical variance reduction approach for additive functionals of dependent sequences. Our approach combines the use of control variates with the minimisation of an empirical variance estimate. We analyse finite sample properties of the proposed method and derive finite-time bounds on the convergence of the excess asymptotic variance to zero. We apply our methodology to Stochastic Gradient MCMC (SGMCMC) methods for Bayesian inference on large data sets and combine it with existing variance reduction methods for SGMCMC. We present empirical results on a number of benchmark examples showing that our variance reduction method achieves significant improvement compared to state-of-the-art methods at the expense of a moderate increase in computational overhead.
Submitted 16 August, 2020;
originally announced August 2020.
-
Variance reduction for Markov chains with application to MCMC
Authors:
D. Belomestny,
L. Iosipoi,
E. Moulines,
A. Naumov,
S. Samsonov
Abstract:
In this paper we propose a novel variance reduction approach for additive functionals of Markov chains based on minimization of an estimate for the asymptotic variance of these functionals over suitable classes of control variates. A distinctive feature of the proposed approach is its ability to significantly reduce the overall finite sample variance. This feature is demonstrated theoretically by means of a deep non-asymptotic analysis of a variance-reduced functional, as well as by a thorough simulation study. In particular, we apply our method to various MCMC Bayesian estimation problems, where it compares favourably with the existing variance reduction approaches.
Submitted 15 February, 2020; v1 submitted 8 October, 2019;
originally announced October 2019.
-
Variance reduction for additive functional of Markov chains via martingale representations
Authors:
D. Belomestny,
E. Moulines,
S. Samsonov
Abstract:
In this paper we propose an efficient variance reduction approach for additive functionals of Markov chains relying on a novel discrete time martingale representation. Our approach is fully non-asymptotic and does not require knowledge of the stationary distribution (or even any type of ergodicity) or a specific structure of the underlying density. By rigorously analyzing the convergence properties of the proposed algorithm, we show that its cost-to-variance product is indeed smaller than that of the naive algorithm. The numerical performance of the new method is illustrated for Langevin-type Markov Chain Monte Carlo (MCMC) methods.
Submitted 21 December, 2021; v1 submitted 18 March, 2019;
originally announced March 2019.
-
Laser-driven vacuum breakdown waves
Authors:
A. S. Samsonov,
E. N. Nerush,
I. Yu. Kostyukov
Abstract:
It is demonstrated by three-dimensional quantum-electrodynamics particle-in-cell (QED-PIC) simulations that a vacuum breakdown wave in the form of a QED cascade front can propagate in an extremely intense plane electromagnetic wave. The result disproves the statement that self-sustained cascading is not possible in a plane wave configuration. In the simulations the cascade initiates during laser-foil interaction in the light sail regime. As a result, a constantly growing electron-positron plasma cushion is formed between the foil and the laser radiation. The cushion plasma efficiently absorbs the laser energy and decouples the radiation from the moving foil, thereby interrupting the ion acceleration. The models describing propagation of the cascade front and electrodynamics of the cushion plasma are presented, and their predictions are in qualitative agreement with the results of numerical simulations.
Submitted 27 June, 2019; v1 submitted 17 September, 2018;
originally announced September 2018.
-
Asymptotic electron motion in strong radiation-dominated regime
Authors:
A. S. Samsonov,
E. N. Nerush,
I. Yu. Kostyukov
Abstract:
We study electron motion in electromagnetic (EM) fields in the radiation-dominated regime. It is shown that the electron trajectories become close to some asymptotic trajectories in the strong field limit. The description of the electron dynamics by these asymptotic trajectories significantly differs from the ponderomotive description, which is barely applicable in the radiation-dominated regime. The particle velocity on the asymptotic trajectory is completely determined by the local and instantaneous EM field. The general properties of the asymptotic trajectories are discussed. In most standing EM waves (including identical tightly-focused counter-propagating beams) the asymptotic trajectories are periodic with the period of the wave field. Furthermore, for a certain model of the laser beam we show that the asymptotic trajectories are periodic in the reference frame moving along the beam with its group velocity, which may explain the effect of radiation-reaction trapping.
Submitted 11 July, 2018;
originally announced July 2018.