Showing 1–50 of 88 results for author: Lan, G

Searching in archive math.
  1. arXiv:2311.02814  [pdf, other]

    math.OC

    A Novel Catalyst Scheme for Stochastic Minimax Optimization

    Authors: Guanghui Lan, Yan Li

    Abstract: This paper presents a proximal-point-based catalyst scheme for simple first-order methods applied to convex minimization and convex-concave minimax problems. In particular, for smooth and (strongly)-convex minimization problems, the proposed catalyst scheme, instantiated with a simple variant of stochastic gradient method, attains the optimal rate of convergence in terms of both deterministic and…

    Submitted 7 November, 2023; v1 submitted 5 November, 2023; originally announced November 2023.

  2. arXiv:2310.19807  [pdf, other]

    cs.LG math.OC

    Improved Communication Efficiency in Federated Natural Policy Gradient via ADMM-based Gradient Updates

    Authors: Guangchen Lan, Han Wang, James Anderson, Christopher Brinton, Vaneet Aggarwal

    Abstract: Federated reinforcement learning (FedRL) enables agents to collaboratively train a global policy without sharing their individual data. However, high communication overhead remains a critical bottleneck, particularly for natural policy gradient (NPG) methods, which are second-order. To address this issue, we propose the FedNPG-ADMM framework, which leverages the alternating direction method of mul…

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: Accepted at the 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

    ACM Class: I.2.6

  3. arXiv:2310.12139  [pdf, ps, other]

    math.OC stat.CO

    Optimal and parameter-free gradient minimization methods for convex and nonconvex optimization

    Authors: Guanghui Lan, Yuyuan Ouyang, Zhe Zhang

    Abstract: We propose novel optimal and parameter-free algorithms for computing an approximate solution with small (projected) gradient norm. Specifically, for computing an approximate solution such that the norm of its (projected) gradient does not exceed $\varepsilon$, we obtain the following results: a) for the convex case, the total number of gradient evaluations is bounded by…

    Submitted 29 November, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

  4. arXiv:2310.10082  [pdf, other]

    math.OC cs.LG

    A simple uniformly optimal method without line search for convex optimization

    Authors: Tianjiao Li, Guanghui Lan

    Abstract: Line search (or backtracking) procedures have been widely employed in first-order methods for solving convex optimization problems, especially those with unknown problem parameters (e.g., Lipschitz constant). In this paper, we show that line search is superfluous in attaining the optimal rate of convergence for solving a convex optimization problem whose parameters are not given a priori. In par…

    Submitted 26 October, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

  5. arXiv:2307.15890  [pdf, ps, other]

    math.OC cs.LG

    First-order Policy Optimization for Robust Policy Evaluation

    Authors: Yan Li, Guanghui Lan

    Abstract: We adopt a policy optimization viewpoint towards policy evaluation for robust Markov decision processes with $\mathrm{s}$-rectangular ambiguity sets. The developed method, named first-order policy evaluation (FRPE), provides the first unified framework for robust policy evaluation in both deterministic (offline) and stochastic (online) settings, with either tabular representation or generic function…

    Submitted 29 July, 2023; originally announced July 2023.

  6. arXiv:2307.01497  [pdf, other]

    math.OC cs.LG stat.CO stat.ML

    Accelerated stochastic approximation with state-dependent noise

    Authors: Sasila Ilandarideva, Anatoli Juditsky, Guanghui Lan, Tianjiao Li

    Abstract: We consider a class of stochastic smooth convex optimization problems under rather general assumptions on the noise in the stochastic gradient observation. As opposed to the classical problem setting in which the variance of noise is assumed to be uniformly bounded, herein we assume that the variance of stochastic gradients is related to the "sub-optimality" of the approximate solutions delivered…

    Submitted 13 July, 2023; v1 submitted 4 July, 2023; originally announced July 2023.

  7. arXiv:2306.12116  [pdf, ps, other]

    math.NA math.PR

    Mean square exponential stability of numerical methods for stochastic differential delay equations

    Authors: Guangqiang Lan, Qi Liu

    Abstract: Mean square exponential stability of the $θ$-EM and modified truncated Euler-Maruyama (MTEM) methods for stochastic differential delay equations (SDDEs) is investigated in this paper. We present new criteria for mean square exponential stability of the $θ$-EM and MTEM methods for SDDEs, which are different from most existing results under Khasminskii-type conditions. Two examples are provided to supp…

    Submitted 21 June, 2023; originally announced June 2023.

    Comments: 19 pages

    MSC Class: 65C30; 65C20; 65L05; 65L20

  8. arXiv:2303.15672  [pdf, ps, other]

    math.OC cs.AI cs.LG

    Numerical Methods for Convex Multistage Stochastic Optimization

    Authors: Guanghui Lan, Alexander Shapiro

    Abstract: Optimization problems involving sequential decisions in a stochastic environment were studied in Stochastic Programming (SP), Stochastic Optimal Control (SOC) and Markov Decision Processes (MDP). In this paper we mainly concentrate on SP and SOC modelling approaches. In these frameworks there are natural situations when the considered problems are convex. Classical approach to sequential optimizat…

    Submitted 27 March, 2023; originally announced March 2023.

    MSC Class: 65K05; 90C15; 90C39; 90C40

  9. arXiv:2303.04386  [pdf, ps, other]

    cs.LG cs.AI math.OC

    Policy Mirror Descent Inherently Explores Action Space

    Authors: Yan Li, Guanghui Lan

    Abstract: Explicit exploration in the action space was assumed to be indispensable for online policy gradient methods to avoid a drastic degradation in sample complexity, for solving general reinforcement learning problems over finite state and action spaces. In this paper, we establish for the first time an $\tilde{\mathcal{O}}(1/ε^2)$ sample complexity for online policy gradient methods without incorporat…

    Submitted 20 March, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

  10. arXiv:2303.02024  [pdf, other]

    math.OC

    Dual dynamic programming for stochastic programs over an infinite horizon

    Authors: Caleb Ju, Guanghui Lan

    Abstract: We consider a dual dynamic programming algorithm for solving stochastic programs over an infinite horizon. We show non-asymptotic convergence results when using an explorative strategy, and we then enhance this result by reducing the dependence of the effective planning horizon from quadratic to linear. This improvement is achieved by combining the forward and backward phases from dual dynamic pro…

    Submitted 4 April, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

    Comments: 45 pages. New experiments for hierarchical problem and writing updates

    MSC Class: 90C25; 49M37; 90C06; 90C15; 93E20

  11. arXiv:2212.00084  [pdf, other]

    math.OC

    A model-free first-order method for linear quadratic regulator with $\tilde{O}(1/\varepsilon)$ sampling complexity

    Authors: Caleb Ju, Georgios Kotsalis, Guanghui Lan

    Abstract: We consider the classic stochastic linear quadratic regulator (LQR) problem under an infinite horizon average stage cost. By leveraging recent policy gradient methods from reinforcement learning, we obtain a first-order method that finds a stable feedback law whose objective function gap to the optima is at most $\varepsilon$ with high probability using $\tilde{O}(1/\varepsilon)$ samples, where…

    Submitted 10 May, 2023; v1 submitted 30 November, 2022; originally announced December 2022.

    Comments: Pre-print. 23 pages, 1 figure. Update fixes some parts of proof that had incorrect constants and addresses stability of policy. Comments are welcome

    MSC Class: 93C05; 65K05

  12. arXiv:2211.16715  [pdf, ps, other]

    cs.LG cs.AI math.OC

    Policy Optimization over General State and Action Spaces

    Authors: Guanghui Lan

    Abstract: Reinforcement learning (RL) problems over general state and action spaces are notoriously challenging. In contrast to the tabular setting, one cannot enumerate all the states and then iteratively update the policies for each state. This prevents the application of many well-studied RL methods, especially those with provable convergence guarantees. In this paper, we first present a substantial gene…

    Submitted 9 May, 2023; v1 submitted 29 November, 2022; originally announced November 2022.

  13. arXiv:2210.05807  [pdf, ps, other]

    math.OC

    Solving Convex Smooth Function Constrained Optimization Is Almost As Easy As Unconstrained Optimization

    Authors: Zhe Zhang, Guanghui Lan

    Abstract: Consider applying first-order methods to solve the smooth convex constrained optimization problem of the form $\min_{x \in X} F(x).$ For a simple closed convex set $X$ which is easy to project onto, Nesterov proposed the Accelerated Gradient Descent (AGD) method to solve the constrained problem as efficiently as an unconstrained problem in terms of the number of gradient computations of $F$ (i.e.,…

    Submitted 2 November, 2022; v1 submitted 11 October, 2022; originally announced October 2022.

  14. arXiv:2210.05108  [pdf, ps, other]

    math.OC cs.LG

    Functional Constrained Optimization for Risk Aversion and Sparsity Control

    Authors: Yi Cheng, Guanghui Lan, H. Edwin Romeijn

    Abstract: Risk and sparsity requirements often need to be enforced simultaneously in many applications, e.g., in portfolio optimization, assortment planning, and treatment planning. Properly balancing these potentially conflicting requirements entails the formulation of functional constrained optimization with either convex or nonconvex objectives. In this paper, we focus on projection-free methods that can…

    Submitted 10 October, 2022; originally announced October 2022.

  15. arXiv:2209.12111  [pdf, other]

    math.NA

    Convergence and exponential stability of modified truncated Milstein method for stochastic differential equations

    Authors: Yu Jiang, Guangqiang Lan

    Abstract: In this paper, we develop a new explicit scheme called the modified truncated Milstein method, which is motivated by the truncated Milstein method proposed by Guo (2018) and the modified truncated Euler-Maruyama method introduced by Lan (2018). We obtain the strong convergence of the scheme under local boundedness and Khasminskii-type conditions, which are relatively weaker than the existing results, and we pr…

    Submitted 24 September, 2022; originally announced September 2022.

    Comments: 28 pages, 6 figures

    MSC Class: 65C30; 65C20; 65L05; 65L20

  16. arXiv:2209.10579  [pdf, other]

    cs.LG cs.AI math.OC

    First-order Policy Optimization for Robust Markov Decision Process

    Authors: Yan Li, Guanghui Lan, Tuo Zhao

    Abstract: We consider the problem of solving a robust Markov decision process (MDP), which involves a set of discounted, finite state, finite action space MDPs with uncertain transition kernels. The goal of planning is to find a robust policy that optimizes the worst-case values against the transition uncertainties, and thus encompasses the standard MDP planning as a special case. For…

    Submitted 10 June, 2023; v1 submitted 21 September, 2022; originally announced September 2022.

  17. arXiv:2205.08011  [pdf, other]

    math.OC

    Level Constrained First Order Methods for Function Constrained Optimization

    Authors: Digvijay Boob, Qi Deng, Guanghui Lan

    Abstract: We present a new feasible proximal gradient method for constrained optimization where both the objective and constraint functions are given by the summation of a smooth, possibly nonconvex function and a convex simple function. The algorithm converts the original problem into a sequence of convex subproblems. Formulating those subproblems requires the evaluation of at most one gradient value of th…

    Submitted 31 January, 2024; v1 submitted 16 May, 2022; originally announced May 2022.

    Comments: Accepted at Mathematical Programming

    MSC Class: 90C26; 90C30; 90C06; 90C51; 49M37

  18. arXiv:2205.05800  [pdf, other]

    cs.LG math.OC stat.ML

    Stochastic first-order methods for average-reward Markov decision processes

    Authors: Tianjiao Li, Feiyang Wu, Guanghui Lan

    Abstract: We study the problem of average-reward Markov decision processes (AMDPs) and develop novel first-order methods with strong theoretical guarantees for both policy evaluation and optimization. Existing on-policy evaluation methods suffer from sub-optimal convergence rates as well as failure in handling insufficiently random policies, e.g., deterministic policies, for lack of exploration. To remedy t…

    Submitted 14 September, 2022; v1 submitted 11 May, 2022; originally announced May 2022.

  19. arXiv:2203.05117  [pdf, other]

    math.OC cs.DC cs.LG

    Optimal Methods for Convex Risk Averse Distributed Optimization

    Authors: Guanghui Lan, Zhe Zhang

    Abstract: This paper studies the communication complexity of convex risk-averse optimization over a network. The problem generalizes the well-studied risk-neutral finite-sum distributed optimization problem and its importance stems from the need to handle risk in an uncertain environment. For algorithms in the literature, there exists a gap in communication complexities for solving risk-averse and risk-neut…

    Submitted 7 March, 2023; v1 submitted 9 March, 2022; originally announced March 2022.

  20. arXiv:2202.07868  [pdf, ps, other]

    cs.LG math.OC

    Data-Driven Minimax Optimization with Expectation Constraints

    Authors: Shuoguang Yang, Xudong Li, Guanghui Lan

    Abstract: Attention to data-driven optimization approaches, including the well-known stochastic gradient descent method, has grown significantly over recent decades, but data-driven constraints have rarely been studied, because of the computational challenges of projections onto the feasible set defined by these hard constraints. In this paper, we focus on the non-smooth convex-concave stochastic minimax re…

    Submitted 9 October, 2023; v1 submitted 16 February, 2022; originally announced February 2022.

  21. arXiv:2201.09457  [pdf, other]

    cs.LG cs.AI math.OC

    Homotopic Policy Mirror Descent: Policy Convergence, Implicit Regularization, and Improved Sample Complexity

    Authors: Yan Li, Guanghui Lan, Tuo Zhao

    Abstract: We propose a new policy gradient method, named homotopic policy mirror descent (HPMD), for solving discounted, infinite horizon MDPs with finite state and action spaces. HPMD performs a mirror descent type policy update with an additional diminishing regularization term, and possesses several computational properties that seem to be new in the literature. We first establish the global linear conve…

    Submitted 29 November, 2022; v1 submitted 23 January, 2022; originally announced January 2022.

  22. arXiv:2201.05756  [pdf, other]

    cs.LG cs.AI math.OC

    Block Policy Mirror Descent

    Authors: Guanghui Lan, Yan Li, Tuo Zhao

    Abstract: In this paper, we present a new policy gradient (PG) method, namely the block policy mirror descent (BPMD) method, for solving a class of regularized reinforcement learning (RL) problems with (strongly)-convex regularizers. Compared to the traditional PG methods with a batch update rule, which visits and updates the policy for every state, the BPMD method has cheap per-iteration computation via a part…

    Submitted 17 September, 2022; v1 submitted 14 January, 2022; originally announced January 2022.

    MSC Class: 90C40; 90C15; 90C26; 68Q25

  23. arXiv:2112.13109  [pdf, other]

    stat.ML cs.LG math.OC

    Accelerated and instance-optimal policy evaluation with linear function approximation

    Authors: Tianjiao Li, Guanghui Lan, Ashwin Pananjady

    Abstract: We study the problem of policy evaluation with linear function approximation and present efficient and practical algorithms that come with strong optimality guarantees. We begin by proving lower bounds that establish baselines on both the deterministic error and stochastic error in this problem. In particular, we prove an oracle complexity lower bound on the deterministic error in an instance-depe…

    Submitted 13 August, 2022; v1 submitted 24 December, 2021; originally announced December 2021.

  24. arXiv:2111.00996  [pdf, ps, other]

    math.OC

    Mirror-prox sliding methods for solving a class of monotone variational inequalities

    Authors: Guanghui Lan, Yuyuan Ouyang

    Abstract: In this paper we propose new algorithms for solving a class of structured monotone variational inequality (VI) problems over compact feasible sets. By identifying the gradient components existing in the operator of VI, we show that it is possible to skip computations of the gradients from time to time, while still maintaining the optimal iteration complexity for solving these VI problems. Specific…

    Submitted 1 November, 2021; originally announced November 2021.

  25. arXiv:2110.10351  [pdf, other]

    math.OC cs.LG

    Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process

    Authors: Tianjiao Li, Ziwei Guan, Shaofeng Zou, Tengyu Xu, Yingbin Liang, Guanghui Lan

    Abstract: The problem of constrained Markov decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints on its utilities/costs. A new primal-dual approach is proposed with a novel integration of three ingredients: entropy regularized policy optimizer, dual variable regularizer, and Nesterov's accelerated gradient descent…

    Submitted 19 October, 2021; originally announced October 2021.

    Comments: The paper was initially submitted for publication in January 2021

  26. arXiv:2110.04844  [pdf, other]

    cs.LG math.OC

    Frequency-aware SGD for Efficient Embedding Learning with Provable Benefits

    Authors: Yan Li, Dhruv Choudhary, Xiaohan Wei, Baichuan Yuan, Bhargav Bhushanam, Tuo Zhao, Guanghui Lan

    Abstract: Embedding learning has found widespread applications in recommendation systems and natural language modeling, among other domains. To learn quality embeddings efficiently, adaptive learning rate algorithms have demonstrated superior empirical performance over SGD, largely attributed to their token-dependent learning rate. However, the underlying mechanism for the efficiency of token-dependent lear…

    Submitted 23 November, 2021; v1 submitted 10 October, 2021; originally announced October 2021.

    Comments: Additional experiments on Word2Vec embedding learning included

  27. arXiv:2102.00135  [pdf, ps, other]

    cs.LG cs.AI math.OC

    Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes

    Authors: Guanghui Lan

    Abstract: We present new policy mirror descent (PMD) methods for solving reinforcement learning (RL) problems with either strongly convex or general convex regularizers. By exploring the structural properties of these overall highly nonconvex problems we show that the PMD methods exhibit a fast linear rate of convergence to global optimality. We develop stochastic counterparts of these methods, and establ…

    Submitted 6 April, 2022; v1 submitted 29 January, 2021; originally announced February 2021.

  28. arXiv:2101.00143  [pdf, other]

    math.OC

    Graph topology invariant gradient and sampling complexity for decentralized and stochastic optimization

    Authors: Guanghui Lan, Yuyuan Ouyang, Yi Zhou

    Abstract: One fundamental problem in decentralized multi-agent optimization is the trade-off between gradient/sampling complexity and communication complexity. We propose new algorithms whose gradient and sampling complexities are graph topology invariant while their communication complexities remain optimal. For convex smooth deterministic problems, we propose a primal dual sliding (PDS) algorithm that com…

    Submitted 12 January, 2021; v1 submitted 31 December, 2020; originally announced January 2021.

    Comments: 25 pages, 1 figure

  29. arXiv:2011.10076  [pdf, other]

    math.OC cs.AI cs.LG

    Optimal Algorithms for Convex Nested Stochastic Composite Optimization

    Authors: Zhe Zhang, Guanghui Lan

    Abstract: Recently, convex nested stochastic composite optimization (NSCO) has received considerable attention for its applications in reinforcement learning and risk-averse optimization. The current NSCO algorithms have worse stochastic oracle complexities, by orders of magnitude, than those for simpler stochastic composite optimization problems (e.g., sum of smooth and nonsmooth functions) without the nes…

    Submitted 21 June, 2022; v1 submitted 19 November, 2020; originally announced November 2020.

  30. arXiv:2011.08434  [pdf, other]

    math.OC cs.AI cs.LG

    Simple and optimal methods for stochastic variational inequalities, II: Markovian noise and policy evaluation in reinforcement learning

    Authors: Georgios Kotsalis, Guanghui Lan, Tianjiao Li

    Abstract: The focus of this paper is on stochastic variational inequalities (VI) under Markovian noise. A prominent application of our algorithmic developments is the stochastic policy evaluation problem in reinforcement learning. Prior investigations in the literature focused on temporal difference (TD) learning by employing nonsmooth finite time analysis motivated by stochastic subgradient descent leading…

    Submitted 13 August, 2021; v1 submitted 14 November, 2020; originally announced November 2020.

    Comments: arXiv admin note: text overlap with arXiv:2011.02987

    MSC Class: 90C25; 90C15; 62L20; 68Q25

  31. arXiv:2011.02987  [pdf, other]

    math.OC cs.AI cs.LG

    Simple and optimal methods for stochastic variational inequalities, I: operator extrapolation

    Authors: Georgios Kotsalis, Guanghui Lan, Tianjiao Li

    Abstract: In this paper we first present a novel operator extrapolation (OE) method for solving deterministic variational inequality (VI) problems. Similar to the gradient (operator) projection method, OE updates one single search sequence by solving a single projection subproblem in each iteration. We show that OE can achieve the optimal rate of convergence for solving a variety of VI problems in a much si…

    Submitted 19 June, 2023; v1 submitted 5 November, 2020; originally announced November 2020.

    MSC Class: 90C25; 90C15; 62L20; 68Q25

  32. arXiv:2010.12169  [pdf, other]

    math.OC cs.LG

    A Feasible Level Proximal Point Method for Nonconvex Sparse Constrained Optimization

    Authors: Digvijay Boob, Qi Deng, Guanghui Lan, Yilin Wang

    Abstract: Nonconvex sparse models have received significant attention in high-dimensional machine learning. In this paper, we study a new model consisting of a general convex or nonconvex objective and a variety of continuous nonconvex sparsity-inducing constraints. For this constrained model, we propose a novel proximal point algorithm that solves a sequence of convex subproblems with gradually relaxed co…

    Submitted 23 October, 2020; originally announced October 2020.

    Comments: Accepted at NeurIPS 2020

  33. arXiv:2008.04827  [pdf, ps, other]

    math.PR

    The 4-D Gaussian Random Vector Maximum Conjecture and the 3-D Simplex Mean Width Conjecture

    Authors: Wei Sun, Ze-Chun Hu, Guolie Lan

    Abstract: We prove the four-dimensional Gaussian random vector maximum conjecture. This conjecture asserts that among all centered Gaussian random vectors $X=(X_1,X_2,X_3,X_4)$ with $E[X_i^2]=1$, $1\le i\le 4$, the expectation $E[\max(X_1,X_2,X_3,X_4)]$ is maximal if and only if all off-diagonal elements of the covariance matrix equal $-\frac{1}{3}$. As a direct consequence, we resolve the three-dimensional…

    Submitted 15 August, 2020; v1 submitted 10 August, 2020; originally announced August 2020.

    MSC Class: 60E15; 52A40

  34. arXiv:2007.00153  [pdf, other]

    math.OC cs.LG

    Conditional Gradient Methods for Convex Optimization with General Affine and Nonlinear Constraints

    Authors: Guanghui Lan, Edwin Romeijn, Zhiqiang Zhou

    Abstract: Conditional gradient methods have attracted much attention in both machine learning and optimization communities recently. These simple methods can guarantee the generation of sparse solutions. In addition, without the computation of full gradients, they can handle huge-scale problems sometimes even with an exponentially increasing number of decision variables. This paper aims to significantly exp…

    Submitted 29 June, 2021; v1 submitted 30 June, 2020; originally announced July 2020.

  35. arXiv:2007.00132  [pdf, other]

    math.OC

    Convex optimization for finite horizon robust covariance control of linear stochastic systems

    Authors: Georgios Kotsalis, Guanghui Lan, Arkadi Nemirovski

    Abstract: This work addresses the finite-horizon robust covariance control problem for discrete-time, partially observable, linear systems affected by random zero mean noise and deterministic but unknown disturbances restricted to lie in a so-called ellitopic uncertainty set (e.g., a finite intersection of ellipsoids/elliptic cylinders centered at the origin). Performance specifications are imposed on the r…

    Submitted 30 June, 2020; originally announced July 2020.

    Comments: 29 pages, 1 figure

    MSC Class: 90C47; 90C22; 49K30; 49M29

  36. A Unified Single-loop Alternating Gradient Projection Algorithm for Nonconvex-Concave and Convex-Nonconcave Minimax Problems

    Authors: Zi Xu, Huiling Zhang, Yang Xu, Guanghui Lan

    Abstract: Much recent research effort has been directed to the development of efficient algorithms for solving minimax problems with theoretical convergence guarantees due to the relevance of these problems to a few emergent applications. In this paper, we propose a unified single-loop alternating gradient projection (AGP) algorithm for solving smooth nonconvex-(strongly) concave and (strongly) convex-nonco…

    Submitted 14 January, 2023; v1 submitted 3 June, 2020; originally announced June 2020.

    MSC Class: 90C47; 90C26; 90C30

    Journal ref: Mathematical Programming, 2023

  37. arXiv:1912.07702  [pdf, ps, other]

    math.OC cs.LG

    Complexity of Stochastic Dual Dynamic Programming

    Authors: Guanghui Lan

    Abstract: Stochastic dual dynamic programming is a cutting plane type algorithm for multi-stage stochastic optimization that originated about 30 years ago. In spite of its popularity in practice, there does not exist any analysis on the convergence rates of this method. In this paper, we first establish the number of iterations, i.e., iteration complexity, required by a basic dynamic cutting plane method for sol…

    Submitted 9 May, 2023; v1 submitted 16 December, 2019; originally announced December 2019.

  38. arXiv:1910.11312  [pdf, ps, other]

    math.PR math.CO

    Some explorations on two conjectures about Rademacher sequences

    Authors: Ze-Chun Hu, Guolie Lan, Wei Sun

    Abstract: In this paper, we explore two conjectures about Rademacher sequences. Let $(ε_i)$ be a Rademacher sequence, i.e., a sequence of independent $\{-1,1\}$-valued symmetric random variables. Set $S_n=a_1ε_1+\cdots+a_nε_n$ for $a=(a_1,\dots,a_n)\in \mathbb{R}^n$. The first conjecture says that $P(|S_n|\leq \|a\|)\geq\frac{1}{2}$ for all $a\in \mathbb{R}^n$ and $n\in \mathbb{N}$. The second conje…

    Submitted 24 October, 2019; originally announced October 2019.

    Comments: 19 pages

    MSC Class: 60C05; 60G50

  39. arXiv:1909.11216  [pdf, ps, other]

    math.OC

    Efficient Algorithms for Distributionally Robust Stochastic Optimization with Discrete Scenario Support

    Authors: Zhe Zhang, Shabbir Ahmed, Guanghui Lan

    Abstract: Recently, there has been a growing interest in distributionally robust optimization (DRO) as a principled approach to data-driven decision making. In this paper, we consider a distributionally robust two-stage stochastic optimization problem with discrete scenario support. While much research effort has been devoted to tractable reformulations for DRO problems, especially those with continuous sce…

    Submitted 3 December, 2020; v1 submitted 24 September, 2019; originally announced September 2019.

  40. arXiv:1908.02734  [pdf, ps, other]

    math.OC cs.LG

    Stochastic First-order Methods for Convex and Nonconvex Functional Constrained Optimization

    Authors: Digvijay Boob, Qi Deng, Guanghui Lan

    Abstract: Functional constrained optimization is becoming more and more important in machine learning and operations research. Such problems have potential applications in risk-averse machine learning, semisupervised learning, and robust optimization among others. In this paper, we first present a novel Constraint Extrapolation (ConEx) method for solving convex functional constrained problems, which utilize…

    Submitted 26 January, 2022; v1 submitted 7 August, 2019; originally announced August 2019.

    Comments: 36 pages, final version, accepted at Math Programming

  41. arXiv:1905.12412  [pdf, other]

    math.OC cs.DS cs.LG

    A unified variance-reduced accelerated gradient method for convex optimization

    Authors: Guanghui Lan, Zhize Li, Yi Zhou

    Abstract: We propose a novel randomized incremental gradient algorithm, namely, VAriance-Reduced Accelerated Gradient (Varag), for finite-sum optimization. Equipped with a unified step-size policy that adjusts itself to the value of the condition number, Varag exhibits the unified optimal rates of convergence for solving smooth convex finite-sum problems directly regardless of their strong convexity. Moreov…

    Submitted 30 October, 2019; v1 submitted 29 May, 2019; originally announced May 2019.

    Comments: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019)

  42. arXiv:1905.04279  [pdf, ps, other]

    math.PR

    The Three-Dimensional Gaussian Product Inequality

    Authors: Guolie Lan, Ze-Chun Hu, Wei Sun

    Abstract: We prove the 3-dimensional Gaussian product inequality, i.e., for any real-valued centered Gaussian random vector $(X,Y,Z)$ and $m\in \mathbb{N}$, it holds that ${\mathbf{E}}[X^{2m}Y^{2m}Z^{2m}]\geq{\mathbf{E}}[X^{2m}]{\mathbf{E}}[Y^{2m}]{\mathbf{E}}[Z^{2m}]$. Our proof is based on some improved inequalities on multi-term products involving 2-dimensional Gaussian random vectors. The improved inequ…

    Submitted 10 May, 2019; originally announced May 2019.

    MSC Class: 60E15; 62H12

  43. arXiv:1903.03917  [pdf, ps, other]

    math.PR

    Products of Conditional Expectation Operators: Convergence and Divergence

    Authors: Guolie Lan, Ze-Chun Hu, Wei Sun

    Abstract: In this paper, we investigate the convergence of products of conditional expectation operators. We show that if $(Ω,\mathcal{F},P)$ is a probability space that is not purely atomic, then divergent sequences of products of conditional expectation operators involving 3 or 4 sub-$σ$-fields of $\mathcal{F}$ can be constructed for a large class of random variables in $L^2(Ω,\mathcal{F},P)$. This settles in the neg…

    Submitted 5 July, 2019; v1 submitted 9 March, 2019; originally announced March 2019.

    MSC Class: 60A05; 60F15; 60F25

  44. arXiv:1810.03763  [pdf, other]

    math.OC cs.LG stat.ML

    Cubic Regularization with Momentum for Nonconvex Optimization

    Authors: Zhe Wang, Yi Zhou, Yingbin Liang, Guanghui Lan

    Abstract: Momentum is a popular technique to accelerate the convergence in practical training, and its impact on convergence guarantee has been well-studied for first-order algorithms. However, such a successful acceleration technique has not yet been proposed for second-order algorithms in nonconvex optimization. In this paper, we apply the momentum scheme to cubic regularized (CR) Newton's method and explo…

    Submitted 27 June, 2019; v1 submitted 8 October, 2018; originally announced October 2018.

  45. arXiv:1809.09258  [pdf, other]

    math.OC cs.LG

    Asynchronous decentralized accelerated stochastic gradient descent

    Authors: Guanghui Lan, Yi Zhou

    Abstract: In this work, we introduce an asynchronous decentralized accelerated stochastic gradient descent type of method for decentralized stochastic optimization, considering communication and synchronization are the major bottlenecks. We establish $\mathcal{O}(1/ε)$ (resp., $\mathcal{O}(1/\sqrt{ε})$) communication complexity and $\mathcal{O}(1/ε^2)$ (resp., $\mathcal{O}(1/ε)$) sampling complexity for solvi…

    Submitted 24 September, 2018; originally announced September 2018.

  46. arXiv:1808.07384  [pdf, ps, other]

    math.OC cs.LG

    A Note on Inexact Condition for Cubic Regularized Newton's Method

    Authors: Zhe Wang, Yi Zhou, Yingbin Liang, Guanghui Lan

    Abstract: This note considers the inexact cubic-regularized Newton's method (CR), which has been shown in \cite{Cartis2011a} to achieve the same order-level convergence rate to a secondary stationary point as the exact CR \citep{Nesterov2006}. However, the inexactness condition in \cite{Cartis2011a} is not implementable due to its dependence on future iterates variable. This note fixes such an issue by prov…

    Submitted 22 August, 2018; originally announced August 2018.

  47. arXiv:1807.08983  [pdf, ps, other]

    math.PR

    Strong convergence rates of modified truncated EM methods for neutral stochastic differential delay equations

    Authors: Guangqiang Lan, Qiushi Wang

    Abstract: The aim of this paper is to investigate strong convergence of modified truncated Euler-Maruyama method for neutral stochastic differential delay equations introduced in Lan (2018). Strong convergence rates of the given numerical scheme to the exact solutions at fixed time $T$ are obtained under local Lipschitz and Khasminskii-type conditions. Moreover, convergence rates over a time interval…

    Submitted 24 July, 2018; originally announced July 2018.

    Comments: 21 pages

    MSC Class: 60H10; 65C30; 65L20

  48. arXiv:1805.05411  [pdf, ps, other]

    math.OC

    Accelerated Stochastic Algorithms for Nonconvex Finite-sum and Multi-block Optimization

    Authors: Guanghui Lan, Yu Yang

    Abstract: In this paper, we present new stochastic methods for solving two important classes of nonconvex optimization problems. We first introduce a randomized accelerated proximal gradient (RapGrad) method for solving a class of nonconvex optimization problems consisting of the sum of $m$ component functions, and show that it can significantly reduce the number of gradient computations especially when the…

    Submitted 18 August, 2019; v1 submitted 14 May, 2018; originally announced May 2018.

  49. arXiv:1802.07372  [pdf, ps, other]

    math.OC cs.LG stat.ML

    Stochastic Variance-Reduced Cubic Regularization for Nonconvex Optimization

    Authors: Zhe Wang, Yi Zhou, Yingbin Liang, Guanghui Lan

    Abstract: Cubic regularization (CR) is an optimization method with emerging popularity due to its capability to escape saddle points and converge to second-order stationary solutions for nonconvex optimization. However, CR encounters a high sample complexity issue for finite-sum problems with a large data size. %Various inexact variants of CR have been proposed to improve the sample complexity. In this pape…

    Submitted 8 October, 2018; v1 submitted 20 February, 2018; originally announced February 2018.

  50. arXiv:1801.04517  [pdf, ps, other]

    math.PR

    Polynomial stability of exact solution and a numerical method for stochastic differential equations with time-dependent delay

    Authors: Guangqiang Lan, Fang Xia, Qiushi Wang

    Abstract: Polynomial stability of exact solution and modified truncated Euler-Maruyama method for stochastic differential equations with time-dependent delay are investigated in this paper. By using the well known discrete semimartingale convergence theorem, sufficient conditions are obtained for both bounded and unbounded delay $δ$ to ensure the polynomial stability of the corresponding numerical approxima…

    Submitted 14 January, 2018; originally announced January 2018.

    MSC Class: 60H10; 65C30