
Showing 1–24 of 24 results for author: Thoppe, G

  1. arXiv:2406.14141  [pdf, other]

    eess.SY cs.AI cs.NI

    Online Learning of Weakly Coupled MDP Policies for Load Balancing and Auto Scaling

    Authors: S. R. Eshwar, Lucas Lopes Felipe, Alexandre Reiffers-Masson, Daniel Sadoc Menasché, Gugan Thoppe

    Abstract: Load balancing and auto scaling are at the core of scalable, contemporary systems, addressing dynamic resource allocation and service rate adjustments in response to workload changes. This paper introduces a novel model and algorithms for tuning load balancers coupled with auto scalers, considering bursty traffic arriving at finite queues. We begin by presenting the problem as a weakly coupled Mar…

    Submitted 20 June, 2024; originally announced June 2024.

  2. arXiv:2403.09940  [pdf, ps, other]

    cs.LG cs.AI math.OC

    Global Convergence Guarantees for Federated Policy Gradient Methods with Adversaries

    Authors: Swetha Ganesh, Jiayu Chen, Gugan Thoppe, Vaneet Aggarwal

    Abstract: Federated Reinforcement Learning (FRL) allows multiple agents to collaboratively build a decision making policy without sharing raw trajectories. However, if a small fraction of these agents are adversarial, it can lead to catastrophic results. We propose a policy gradient based approach that is robust to adversarial agents which can send arbitrary values to the server. Under this setting, our res…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 27 pages, 6 figures

  3. arXiv:2310.11389  [pdf, ps, other]

    cs.LG stat.ML

    Risk Estimation in a Markov Cost Process: Lower and Upper Bounds

    Authors: Gugan Thoppe, L. A. Prashanth, Sanjay Bhat

    Abstract: We tackle the problem of estimating risk measures of the infinite-horizon discounted cost within a Markov cost process. The risk measures we study include variance, Value-at-Risk (VaR), and Conditional Value-at-Risk (CVaR). First, we show that estimating any of these risk measures with $ε$-accuracy, either in expected or high-probability sense, requires at least $Ω(1/ε^2)$ samples. Then, using a t…

    Submitted 11 April, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

  4. arXiv:2304.01525  [pdf, other]

    cs.LG eess.SY math.OC

    Online Learning with Adversaries: A Differential-Inclusion Analysis

    Authors: Swetha Ganesh, Alexandre Reiffers-Masson, Gugan Thoppe

    Abstract: We introduce an observation-matrix-based framework for fully asynchronous online Federated Learning (FL) with adversaries. In this work, we demonstrate its effectiveness in estimating the mean of a random vector. Our main result is that the proposed algorithm almost surely converges to the desired mean $μ.$ This makes ours the first asynchronous FL method to have an a.s. convergence guarantee in t…

    Submitted 26 September, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

    Comments: 6 pages, 2 figures

  5. arXiv:2301.13236  [pdf, other]

    cs.LG cs.AI

    SoftTreeMax: Exponential Variance Reduction in Policy Gradient via Tree Search

    Authors: Gal Dalal, Assaf Hallak, Gugan Thoppe, Shie Mannor, Gal Chechik

    Abstract: Despite the popularity of policy gradient methods, they are known to suffer from large variance and high sample complexity. To mitigate this, we introduce SoftTreeMax -- a generalization of softmax that takes planning into account. In SoftTreeMax, we extend the traditional logits with the multi-step discounted cumulative reward, topped with the logits of future states. We consider two variants of…

    Submitted 30 January, 2023; originally announced January 2023.

    Comments: arXiv admin note: text overlap with arXiv:2209.13966

  6. arXiv:2208.10583  [pdf, other]

    cs.LG cs.AI

    Improving Sample Efficiency in Evolutionary RL Using Off-Policy Ranking

    Authors: Eshwar S R, Shishir Kolathaya, Gugan Thoppe

    Abstract: Evolution Strategy (ES) is a powerful black-box optimization technique based on the idea of natural evolution. In each of its iterations, a key step entails ranking candidate solutions based on some fitness score. For an ES method in Reinforcement Learning (RL), this ranking step requires evaluating multiple policies. This is presently done via on-policy approaches: each policy's score is estimate…

    Submitted 21 February, 2023; v1 submitted 22 August, 2022; originally announced August 2022.

  7. arXiv:2205.13617  [pdf, other]

    cs.LG math.OC

    Demystifying Approximate Value-based RL with $ε$-greedy Exploration: A Differential Inclusion View

    Authors: Aditya Gopalan, Gugan Thoppe

    Abstract: Q-learning and SARSA with $ε$-greedy exploration are leading reinforcement learning methods. Their tabular forms converge to the optimal Q-function under reasonable conditions. However, with function approximation, these methods exhibit strange behaviors such as policy oscillation, chattering, and convergence to different attractors (possibly even the worst policy) on different runs, apart from th…

    Submitted 10 February, 2023; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: 22 pages, 3 figures

    MSC Class: 93E35; 68Q32 ACM Class: I.2.0

  8. arXiv:2110.15547  [pdf, ps, other]

    cs.LG

    Does Momentum Help? A Sample Complexity Analysis

    Authors: Swetha Ganesh, Rohan Deb, Gugan Thoppe, Amarjit Budhiraja

    Abstract: Stochastic Heavy Ball (SHB) and Nesterov's Accelerated Stochastic Gradient (ASG) are popular momentum methods in stochastic optimization. While benefits of such acceleration ideas in deterministic settings are well understood, their advantages in stochastic optimization is still unclear. In fact, in some specific instances, it is known that momentum does not help in the sample complexity sense. Ou…

    Submitted 11 July, 2022; v1 submitted 29 October, 2021; originally announced October 2021.

  9. arXiv:2110.15092  [pdf, ps, other]

    cs.LG

    A Law of Iterated Logarithm for Multi-Agent Reinforcement Learning

    Authors: Gugan Thoppe, Bhumesh Kumar

    Abstract: In Multi-Agent Reinforcement Learning (MARL), multiple agents interact with a common environment, as also with each other, for solving a shared problem in sequential decision-making. It has wide-ranging applications in gaming, robotics, finance, etc. In this work, we derive a novel law of iterated logarithm for a family of distributed nonlinear stochastic approximation schemes that is useful in MA…

    Submitted 15 January, 2022; v1 submitted 27 October, 2021; originally announced October 2021.

    Comments: Some typos corrected; 19 pages

    MSC Class: 93E35; 68Q32 ACM Class: I.2.11

  10. arXiv:2012.14122  [pdf, other]

    math.PR

    The Shadow knows: Empirical Distributions of Minimum Spanning Acycles and Persistence Diagrams of Random Complexes

    Authors: Nicolas Fraiman, Sayan Mukherjee, Gugan Thoppe

    Abstract: In 1985, Frieze showed that the expected sum of the edge weights of the minimum spanning tree (MST) in the uniformly weighted graph converges to $ζ(3)$. Recently, Hino and Kanazawa extended this result to a uniformly weighted simplicial complex, where the role of the MST is played by its higher-dimensional analog -- the Minimum Spanning Acycle (MSA). Our work goes beyond and describes the histogra…

    Submitted 29 January, 2024; v1 submitted 28 December, 2020; originally announced December 2020.

    Comments: 18 pages, 4 figures

    MSC Class: 60C05; 60G57; 05E45

  11. arXiv:2009.08142  [pdf, other]

    cs.IR cs.LG cs.SI math.PR stat.ML

    Online Algorithms for Estimating Change Rates of Web Pages

    Authors: Konstantin Avrachenkov, Kishor Patil, Gugan Thoppe

    Abstract: A search engine maintains local copies of different web pages to provide quick search results. This local cache is kept up-to-date by a web crawler that frequently visits these different pages to track changes in them. Ideally, the local copy should be updated as soon as a page changes on the web. However, finite bandwidth availability and server restrictions limit how frequently different pages c…

    Submitted 4 November, 2021; v1 submitted 17 September, 2020; originally announced September 2020.

    Comments: This is the author version of the paper accepted to International Journal of Performance Evaluation, Elsevier; 25 pages. arXiv admin note: text overlap with arXiv:2004.02167

  12. arXiv:2004.02167  [pdf, other]

    cs.IR cs.LG cs.SI math.PR

    Change Rate Estimation and Optimal Freshness in Web Page Crawling

    Authors: Konstantin Avrachenkov, Kishor Patil, Gugan Thoppe

    Abstract: For providing quick and accurate results, a search engine maintains a local snapshot of the entire web. And, to keep this local cache fresh, it employs a crawler for tracking changes across various web pages. However, finite bandwidth availability and server restrictions impose some constraints on the crawling frequency. Consequently, the ideal crawling rates are the ones that maximise the freshne…

    Submitted 5 April, 2020; originally announced April 2020.

    Comments: This paper has been accepted to the 13th EAI International Conference on Performance Evaluation Methodologies and Tools, VALUETOOLS'20, May 18--20, 2020, Tsukuba, Japan. This is the author version of the paper

  13. arXiv:2001.06860  [pdf, other]

    math.PR

    Limit theorems for topological invariants of the dynamic multi-parameter simplicial complex

    Authors: Takashi Owada, Gennady Samorodnitsky, Gugan Thoppe

    Abstract: Topological study of existing random simplicial complexes is non-trivial and has led to several seminal works. However, the applicability of such studies is limited since the randomness there is usually governed by a single parameter. With this in mind, we focus here on the topology of the recently proposed multi-parameter random simplicial complex and, more importantly, of its dynamic analogue th…

    Submitted 4 February, 2021; v1 submitted 19 January, 2020; originally announced January 2020.

    Comments: 42 pages, 1 figure

    MSC Class: 60F17; 55U05; 60C05; 60F15

  14. arXiv:1911.09157  [pdf, ps, other]

    cs.LG math.PR

    A Tale of Two-Timescale Reinforcement Learning with the Tightest Finite-Time Bound

    Authors: Gal Dalal, Balazs Szorenyi, Gugan Thoppe

    Abstract: Policy evaluation in reinforcement learning is often conducted using two-timescale stochastic approximation, which results in various gradient temporal difference methods such as GTD(0), GTD2, and TDC. Here, we provide convergence rate bounds for this suite of algorithms. Algorithms such as these have two iterates, $θ_n$ and $w_n,$ which are updated using two distinct stepsize sequences, $α_n$ and…

    Submitted 4 December, 2019; v1 submitted 20 November, 2019; originally announced November 2019.

  15. arXiv:1807.11018  [pdf, other]

    math.PR

    Betti Numbers of Gaussian Excursions in the Sparse Regime

    Authors: Gugan Thoppe, Sunder Ram Krishnan

    Abstract: Random field excursions is an increasingly vital topic within data analysis in medicine, cosmology, materials science, etc. This work is the first detailed study of their Betti numbers in the so-called `sparse' regime. Specifically, we consider a piecewise constant Gaussian field whose covariance function is positive and satisfies some local, boundedness, and decay rate conditions. We model its ex…

    Submitted 23 August, 2018; v1 submitted 29 July, 2018; originally announced July 2018.

    Comments: 66 pages, 4 figures

    MSC Class: 60G15; 60F05; 05E45 (Primary) 60G60; 60G70; 60G10; 55U10 (Secondary)

  16. arXiv:1704.01161  [pdf, other]

    cs.AI

    Finite Sample Analyses for TD(0) with Function Approximation

    Authors: Gal Dalal, Balázs Szörényi, Gugan Thoppe, Shie Mannor

    Abstract: TD(0) is one of the most commonly used algorithms in reinforcement learning. Despite this, there is no existing finite sample analysis for TD(0) with function approximation, even for the linear case. Our work is the first to provide such results. Existing convergence rates for Temporal Difference (TD) methods apply only to somewhat modified versions, e.g., projected variants or ones where stepsize…

    Submitted 11 December, 2017; v1 submitted 4 April, 2017; originally announced April 2017.

  17. arXiv:1703.05376  [pdf, other]

    cs.AI

    Finite Sample Analysis of Two-Timescale Stochastic Approximation with Applications to Reinforcement Learning

    Authors: Gal Dalal, Balazs Szorenyi, Gugan Thoppe, Shie Mannor

    Abstract: Two-timescale Stochastic Approximation (SA) algorithms are widely used in Reinforcement Learning (RL). Their iterates have two parts that are updated using distinct stepsizes. In this work, we develop a novel recipe for their finite sample analysis. Using this, we provide a concentration bound, which is the first such result for a two-timescale SA. The type of bound we obtain is known as `lock-in…

    Submitted 4 June, 2018; v1 submitted 15 March, 2017; originally announced March 2017.

  18. arXiv:1701.00239  [pdf, other]

    math.PR

    Randomly Weighted $d-$complexes: Minimal Spanning Acycles and Persistence Diagrams

    Authors: Primoz Skraba, Gugan Thoppe, D. Yogeshwaran

    Abstract: A weighted $d-$complex is a simplicial complex of dimension $d$ in which each face is assigned a real-valued weight. We derive three key results here concerning persistence diagrams and minimal spanning acycles (MSAs) of such complexes. First, we establish an equivalence between the MSA face-weights and death times in the persistence diagram. Next, we show a novel stability result for the M…

    Submitted 22 March, 2020; v1 submitted 1 January, 2017; originally announced January 2017.

    Comments: 42 Pages, 1 Figure. Streamlined introduction, modified Section 3 significantly

    MSC Class: 60C05; 05E45 (Primary) 60G70; 60B99; 05C80 (Secondary)

  19. arXiv:1506.08657  [pdf, ps, other]

    math.OC

    A Concentration Bound for Stochastic Approximation via Alekseev's Formula

    Authors: Gugan Thoppe, Vivek S. Borkar

    Abstract: Given an ODE and its perturbation, the Alekseev formula expresses the solutions of the latter in terms related to the former. By exploiting this formula and a new concentration inequality for martingale-differences, we develop a novel approach for analyzing nonlinear Stochastic Approximation (SA). This approach is useful for studying a SA's behaviour close to a Locally Asymptotically Stable Equili…

    Submitted 30 March, 2019; v1 submitted 26 June, 2015; originally announced June 2015.

    Comments: 44 pages. Mentioned that Dh(x*) needs to be Hurwitz

  20. arXiv:1503.01983  [pdf, ps, other]

    math.PR

    On the evolution of topology in dynamic clique complexes

    Authors: Gugan Thoppe, D. Yogeshwaran, Robert Adler

    Abstract: We consider a time varying analogue of the Erdős-Rényi graph and study the topological variations of its associated clique complex. The dynamics of the graph are stationary and are determined by the edges, which evolve independently as continuous time Markov chains. Our main result is that when the edge inclusion probability is of the form $p = n^α$, where $n$ is the number of vertices a…

    Submitted 15 January, 2016; v1 submitted 5 March, 2015; originally announced March 2015.

    Comments: Rewrote the introduction

  21. arXiv:1404.6635  [pdf, other]

    math.OC eess.SY stat.CO

    Greedy Block Coordinate Descent (GBCD) Method for High Dimensional Quadratic Programs

    Authors: Gugan Thoppe, Vivek S. Borkar, Dinesh Garg

    Abstract: High dimensional unconstrained quadratic programs (UQPs) involving massive datasets are now common in application areas such as web, social networks, etc. Unless computational resources that match up to these datasets are available, solving such problems using classical UQP methods is very difficult. This paper discusses alternatives. We first define high dimensional compliant (HDC) methods for UQ…

    Submitted 12 July, 2014; v1 submitted 26 April, 2014; originally announced April 2014.

    Comments: 29 pages, 3 figures, New references added

  22. A Stochastic Kaczmarz Algorithm for Network Tomography

    Authors: Gugan Thoppe, Vivek S. Borkar, D. Manjunath

    Abstract: We develop a stochastic approximation version of the classical Kaczmarz algorithm that is incremental in nature and takes as input noisy real time data. Our analysis shows that with probability one it mimics the behavior of the original scheme: starting from the same initial point, our algorithm and the corresponding deterministic Kaczmarz algorithm converge to precisely the same point. The motiva…

    Submitted 18 October, 2013; v1 submitted 15 December, 2012; originally announced December 2012.

    Comments: Figures have been improved. Streamlined notation

  23. arXiv:1210.7911  [pdf, other]

    math.ST

    Generalized Network Tomography (journal version)

    Authors: Gugan Thoppe

    Abstract: Generalized network tomography (GNT) deals with estimation of link performance parameters for networks with arbitrary topologies using only end-to-end path measurements of pure unicast probe packets. In this paper, by taking advantage of the properties of generalized hyperexponential distributions and polynomial systems, a novel algorithm to infer the complete link metric distributions under the f…

    Submitted 30 October, 2012; originally announced October 2012.

    Comments: 33 pages. Extended version of arXiv:1207.2530

    MSC Class: 47A50; 47A52; 62J99;

  24. arXiv:1207.2530  [pdf, other]

    cs.NI cs.DC math.ST

    Generalized Network Tomography

    Authors: Gugan Thoppe

    Abstract: For successful estimation, the usual network tomography algorithms crucially require i) end-to-end data generated using multicast probe packets, real or emulated, and ii) the network to be a tree rooted at a single sender with destinations at leaves. These requirements, consequently, limit their scope of application. In this paper, we address successfully a general problem, henceforth called gener…

    Submitted 1 November, 2012; v1 submitted 10 July, 2012; originally announced July 2012.

    Comments: 8 Pages, Corrected Typos in Lemma 1