Showing 1–50 of 100 results for author: Hong, M

  1. arXiv:2404.10575  [pdf, other]

    cs.LG cs.AI cs.CV math.OC

    EMC$^2$: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence

    Authors: Chung-Yiu Yau, Hoi-To Wai, Parameswaran Raman, Soumajyoti Sarkar, Mingyi Hong

    Abstract: A key challenge in contrastive learning is to generate negative samples from a large sample set to contrast with positive samples, for learning better encoding of the data. These negative samples often follow a softmax distribution which are dynamically updated during the training process. However, sampling from this distribution is non-trivial due to the high computational costs in computing the… ▽ More

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: 20 pages

  2. arXiv:2404.09800  [pdf, ps, other]

    math.PR

    Fractional derivatives of local times for some Gaussian processes

    Authors: Minhao Hong, Qian Yu

    Abstract: In this article, we consider fractional derivatives of local time for $d$-dimensional centered Gaussian processes satisfying certain strong local nondeterminism property. We first give a condition for existence of fractional derivatives of the local time defined by Marchaud derivatives in $L^p$ ($p\ge1$) and show that these derivatives are Hölder continuous with respect to both time and space variabl… ▽ More

    Submitted 15 April, 2024; originally announced April 2024.

  3. arXiv:2402.08821  [pdf, other]

    math.OC cs.DC

    Problem-Parameter-Free Decentralized Nonconvex Stochastic Optimization

    Authors: Jiaxiang Li, Xuxing Chen, Shiqian Ma, Mingyi Hong

    Abstract: Existing decentralized algorithms usually require knowledge of problem parameters for updating local iterates. For example, the hyperparameters (such as learning rate) usually require the knowledge of Lipschitz constant of the global gradient or topological information of the communication networks, which are usually not accessible in practice. In this paper, we propose D-NASA, the first algorithm… ▽ More

    Submitted 13 February, 2024; originally announced February 2024.

  4. arXiv:2401.12025  [pdf, other]

    cs.IT eess.SP math.OC

    A Survey of Recent Advances in Optimization Methods for Wireless Communications

    Authors: Ya-Feng Liu, Tsung-Hui Chang, Mingyi Hong, Zheyu Wu, Anthony Man-Cho So, Eduard A. Jorswieck, Wei Yu

    Abstract: Mathematical optimization is now widely regarded as an indispensable modeling and solution tool for the design of wireless communications systems. While optimization has played a significant role in the revolutionary progress in wireless communication and networking technologies from 1G to 5G and onto the future 6G, the innovations in wireless technologies have also substantially transformed the n… ▽ More

    Submitted 7 June, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

    Comments: 39 pages, 5 figures, accepted for publication in IEEE Journal on Selected Areas in Communications

  5. arXiv:2401.11380  [pdf, other]

    cs.LG math.ST stat.ME stat.ML

    MoMA: Model-based Mirror Ascent for Offline Reinforcement Learning

    Authors: Mao Hong, Zhiyue Zhang, Yue Wu, Yanxun Xu

    Abstract: Model-based offline reinforcement learning methods (RL) have achieved state-of-the-art performance in many decision-making problems thanks to their sample efficiency and generalizability. Despite these advancements, existing model-based offline RL approaches either focus on theoretical studies without developing practical algorithms or rely on a restricted parametric policy space, thus not fully l… ▽ More

    Submitted 20 January, 2024; originally announced January 2024.

  6. arXiv:2401.08893  [pdf, other]

    cs.LG math.OC

    MADA: Meta-Adaptive Optimizers through hyper-gradient Descent

    Authors: Kaan Ozkara, Can Karakus, Parameswaran Raman, Mingyi Hong, Shoham Sabach, Branislav Kveton, Volkan Cevher

    Abstract: Following the introduction of Adam, several novel adaptive optimizers for deep learning have been proposed. These optimizers typically excel in some tasks but may not outperform Adam uniformly across all tasks. In this work, we introduce Meta-Adaptive Optimizers (MADA), a unified optimizer framework that can generalize several known optimizers and dynamically learn the most suitable one during tra… ▽ More

    Submitted 17 June, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

  7. arXiv:2401.03058  [pdf, other]

    math.OC cs.LG stat.ML

    Krylov Cubic Regularized Newton: A Subspace Second-Order Method with Dimension-Free Convergence Rate

    Authors: Ruichen Jiang, Parameswaran Raman, Shoham Sabach, Aryan Mokhtari, Mingyi Hong, Volkan Cevher

    Abstract: Second-order optimization methods, such as cubic regularized Newton methods, are known for their rapid convergence rates; nevertheless, they become impractical in high-dimensional problems due to their substantial memory requirements and computational costs. One promising approach is to execute second-order updates within a lower-dimensional subspace, giving rise to subspace second-order methods.… ▽ More

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: 27 pages, 2 figures

  8. arXiv:2308.00788  [pdf, other]

    cs.LG math.OC

    An Introduction to Bi-level Optimization: Foundations and Applications in Signal Processing and Machine Learning

    Authors: Yihua Zhang, Prashant Khanduri, Ioannis Tsaknakis, Yuguang Yao, Mingyi Hong, Sijia Liu

    Abstract: Recently, bi-level optimization (BLO) has taken center stage in some very exciting developments in the area of signal processing (SP) and machine learning (ML). Roughly speaking, BLO is a classical optimization problem that involves two levels of hierarchy (i.e., upper and lower levels), wherein obtaining the solution to the upper-level problem requires solving the lower-level one. BLO has become… ▽ More

    Submitted 20 December, 2023; v1 submitted 1 August, 2023; originally announced August 2023.

  9. arXiv:2305.17083  [pdf, other]

    stat.ML cs.LG econ.EM math.ST stat.ME

    A Policy Gradient Method for Confounded POMDPs

    Authors: Mao Hong, Zhengling Qi, Yanxun Xu

    Abstract: In this paper, we propose a policy gradient method for confounded partially observable Markov decision processes (POMDPs) with continuous state and observation spaces in the offline setting. We first establish a novel identification result to non-parametrically estimate any history-dependent policy gradient under POMDPs using the offline data. The identification enables us to solve a sequence of c… ▽ More

    Submitted 30 November, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: 95 pages, 3 figures

  10. arXiv:2305.13146  [pdf, ps, other]

    math.PR

    Limit theorems for additive functionals of some self-similar Gaussian processes

    Authors: Minhao Hong, Heguang Liu, Fangjun Xu

    Abstract: Under certain mild conditions, limit theorems for additive functionals of some $d$-dimensional self-similar Gaussian processes are obtained. These limit theorems work for general Gaussian processes including fractional Brownian motions, sub-fractional Brownian motions and bi-fractional Brownian motions. To prove these results, we use the method of moments and an enhanced chaining argument. The Gau… ▽ More

    Submitted 22 May, 2023; originally announced May 2023.

  11. arXiv:2302.10157  [pdf, other]

    nlin.CG math-ph math.DS

    The Game of Life on the Robinson Triangle Penrose Tiling: Still Life

    Authors: Seung Hyeon Mandy Hong, May Mei

    Abstract: We investigate Conway's Game of Life played on the Robinson triangle Penrose tiling. In this paper, we classify all four-cell still lifes.

    Submitted 10 April, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

  12. arXiv:2210.16152  [pdf, ps, other]

    math.PR

    Limit laws for functionals of self-intersection symmetric alpha-stable processes

    Authors: Minhao Hong, Qian Yu

    Abstract: In this paper, we prove two limit laws for functionals of self-intersection symmetric $\alpha$-stable processes with $\alpha\in(1,2)$. The results are obtained based on the method of moments; the sample configuration and the chaining argument introduced in (Nualart and Xu 2013) are employed.

    Submitted 28 October, 2022; originally announced October 2022.

    Comments: 18 pages

  13. arXiv:2205.12433  [pdf, ps, other]

    math.AP

    Global Classical Solutions Near Vacuum to the Initial-Boundary Value Problem of Isentropic Supersonic Flows through Divergent Ducts

    Authors: Ying-Chieh Lin, Jay Chu, John M. Hong, Hsin-Yi Lee

    Abstract: In this paper, we study the global existence and asymptotic behavior of classical solutions near vacuum for the initial-boundary value problem modeling isentropic supersonic flows through divergent ducts. The governing equations are the compressible Euler equations with a small parameter, which can be written as a hyperbolic system in terms of the Riemann invariants with a non-dissipative source.… ▽ More

    Submitted 24 May, 2022; originally announced May 2022.

    MSC Class: 35L45; 35L65; 35L67; 35L81

  14. arXiv:2112.04074  [pdf, ps, other]

    math.AP

    Existence and convergence of the Beris-Edwards system with general Landau-de Gennes energy

    Authors: Zhewen Feng, Min-Chun Hong, Yu Mei

    Abstract: In this paper, we investigate the Beris-Edwards system for both biaxial and uniaxial $Q$-tensors with a general Landau-de Gennes energy density depending on four non-zero elastic constants. We prove existence of the strong solution of the Beris-Edwards system for uniaxial $Q$-tensors up to a maximal time. Furthermore, we prove that the strong solutions of the Beris-Edwards system for biaxial $Q$-t… ▽ More

    Submitted 10 November, 2022; v1 submitted 7 December, 2021; originally announced December 2021.

    MSC Class: 35K51; 35Q30; 76A15

  15. arXiv:2112.03453  [pdf, ps, other]

    math.AP

    Existence of minimizers and convergence of critical points for a new Landau-de Gennes energy functional in nematic liquid crystals

    Authors: Zhewen Feng, Min-Chun Hong

    Abstract: The Landau-de Gennes energy in nematic liquid crystals depends on four elastic constants $L_1$, $L_2$, $L_3$, $L_4$. In the case of $L_4\neq 0$, Ball and Majumdar (Mol. Cryst. Liq. Cryst., 2010) found an example that the original Landau-de Gennes energy functional in physics does not satisfy a coercivity condition, which causes a problem in mathematics to establish existence of energy minimizers.… ▽ More

    Submitted 28 September, 2022; v1 submitted 6 December, 2021; originally announced December 2021.

    Comments: Revised version. To appear in Calculus of Variations and Partial Differential Equations. arXiv admin note: text overlap with arXiv:2007.11144

    MSC Class: 35J20; 35Q35; 76A15

  16. arXiv:2110.11210  [pdf, other]

    math.OC

    Minimax Problems with Coupled Linear Constraints: Computational Complexity, Duality and Solution Methods

    Authors: Ioannis Tsaknakis, Mingyi Hong, Shuzhong Zhang

    Abstract: In this work we study a special minimax problem where there are linear constraints that couple both the minimization and maximization decision variables. The problem is a generalization of the traditional saddle point problem (which does not have the coupling constraint), and it finds applications in wireless communication, game theory, transportation, just to name a few. We show that the consider… ▽ More

    Submitted 25 November, 2022; v1 submitted 21 October, 2021; originally announced October 2021.

  17. arXiv:2110.03438  [pdf, ps, other]

    math.DG

    Biconservative hypersurfaces with constant scalar curvature in space forms

    Authors: Yu Fu, Min-Chun Hong, Dan Yang, Xin Zhan

    Abstract: Biconservative hypersurfaces are hypersurfaces which have conservative stress-energy tensor with respect to the bienergy, containing all minimal and constant mean curvature hypersurfaces. The purpose of this paper is to study biconservative hypersurfaces $M^n$ with constant scalar curvature in a space form $N^{n+1}(c)$. We prove that every biconservative hypersurface with constant scalar curvature… ▽ More

    Submitted 7 October, 2021; originally announced October 2021.

    Comments: 20 pages

  18. arXiv:2109.14212  [pdf, other]

    math.OC

    Primal-Dual First-Order Methods for Affinely Constrained Multi-Block Saddle Point Problems

    Authors: Junyu Zhang, Mengdi Wang, Mingyi Hong, Shuzhong Zhang

    Abstract: We consider the convex-concave saddle point problem $\min_{\mathbf{x}}\max_{\mathbf{y}}\Phi(\mathbf{x},\mathbf{y})$, where the decision variables $\mathbf{x}$ and/or $\mathbf{y}$ are subject to a multi-block structure and affine coupling constraints, and $\Phi(\mathbf{x},\mathbf{y})$ possesses certain separable structure. Although the minimization counterpart of such problem has been widely studied under th… ▽ More

    Submitted 16 March, 2023; v1 submitted 29 September, 2021; originally announced September 2021.

    Comments: 25 pages

  19. arXiv:2106.10435  [pdf, other]

    cs.LG math.OC stat.ML

    STEM: A Stochastic Two-Sided Momentum Algorithm Achieving Near-Optimal Sample and Communication Complexities for Federated Learning

    Authors: Prashant Khanduri, Pranay Sharma, Haibo Yang, Mingyi Hong, Jia Liu, Ketan Rajawat, Pramod K. Varshney

    Abstract: Federated Learning (FL) refers to the paradigm where multiple worker nodes (WNs) build a joint model by using local data. Despite extensive research, for a generic non-convex FL problem, it is not clear, how to choose the WNs' and the server's update directions, the minibatch sizes, and the local update frequency, so that the WNs use the minimum number of samples and communication rounds to achiev… ▽ More

    Submitted 19 June, 2021; originally announced June 2021.

  20. arXiv:2102.07367  [pdf, other]

    math.OC cs.LG stat.ML

    A Near-Optimal Algorithm for Stochastic Bilevel Optimization via Double-Momentum

    Authors: Prashant Khanduri, Siliang Zeng, Mingyi Hong, Hoi-To Wai, Zhaoran Wang, Zhuoran Yang

    Abstract: This paper proposes a new algorithm -- the \underline{S}ingle-timescale Do\underline{u}ble-momentum \underline{St}ochastic \underline{A}pprox\underline{i}matio\underline{n} (SUSTAIN) -- for tackling stochastic unconstrained bilevel optimization problems. We focus on bilevel problems where the lower level subproblem is strongly-convex and the upper level objective function is smooth. Unlike prior w… ▽ More

    Submitted 15 June, 2021; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: 36 Pages, 10 Figures

  21. arXiv:2102.07091  [pdf, other]

    math.OC cs.LG eess.SY

    Decentralized Riemannian Gradient Descent on the Stiefel Manifold

    Authors: Shixiang Chen, Alfredo Garcia, Mingyi Hong, Shahin Shahrampour

    Abstract: We consider a distributed non-convex optimization where a network of agents aims at minimizing a global function over the Stiefel manifold. The global function is represented as a finite sum of smooth local functions, where each local function is associated with one agent and agents communicate with each other over an undirected connected graph. The problem is non-convex as local functions are pos… ▽ More

    Submitted 14 February, 2021; originally announced February 2021.

  22. arXiv:2101.09346  [pdf, ps, other]

    math.OC cs.LG eess.SY

    On the Local Linear Rate of Consensus on the Stiefel Manifold

    Authors: Shixiang Chen, Alfredo Garcia, Mingyi Hong, Shahin Shahrampour

    Abstract: We study the convergence properties of Riemannian gradient method for solving the consensus problem (for an undirected connected graph) over the Stiefel manifold. The Stiefel manifold is a non-convex set and the standard notion of averaging in the Euclidean space does not work for this problem. We propose Distributed Riemannian Consensus on Stiefel Manifold (DRCS) and prove that it enjoys a local… ▽ More

    Submitted 22 January, 2021; originally announced January 2021.

  23. Towards Understanding Asynchronous Advantage Actor-critic: Convergence and Linear Speedup

    Authors: Han Shen, Kaiqing Zhang, Mingyi Hong, Tianyi Chen

    Abstract: Asynchronous and parallel implementation of standard reinforcement learning (RL) algorithms is a key enabler of the tremendous success of modern RL. Among many asynchronous RL algorithms, arguably the most popular and effective one is the asynchronous advantage actor-critic (A3C) algorithm. Although A3C is becoming the workhorse of RL, its theoretical properties are still not well-understood, incl… ▽ More

    Submitted 16 March, 2022; v1 submitted 31 December, 2020; originally announced December 2020.

  24. arXiv:2010.11626  [pdf, ps, other]

    math.PR

    Derivatives of local times for some Gaussian fields II

    Authors: Minhao Hong, Fangjun Xu

    Abstract: Given a $(2,d)$-Gaussian field \[ Z=\big\{ Z(t,s)= X^{H_1}_t -\tilde{X}^{H_2}_s, s,t \ge 0\big\}, \] where $X^{H_1}$ and $\tilde{X}^{H_2}$ are independent $d$-dimensional centered Gaussian processes satisfying certain properties, we will give the necessary condition for existence of derivatives of the local time of $Z$.

    Submitted 22 October, 2020; originally announced October 2020.

  25. arXiv:2010.03194  [pdf, ps, other]

    math.OC

    First-Order Algorithms Without Lipschitz Gradient: A Sequential Local Optimization Approach

    Authors: Junyu Zhang, Mingyi Hong

    Abstract: First-order algorithms have been popular for solving convex and non-convex optimization problems. A key assumption for the majority of these algorithms is that the gradient of the objective function is globally Lipschitz continuous, but many contemporary problems such as tensor decomposition fail to satisfy such an assumption. This paper develops a sequential local optimization (SLO) framework of… ▽ More

    Submitted 5 February, 2024; v1 submitted 7 October, 2020; originally announced October 2020.

    Comments: Accepted by Informs Journal on Optimization

  26. arXiv:2007.11144  [pdf, ps, other]

    math.AP cond-mat.soft math.FA

    A new representation for the Landau-de Gennes energy of nematic liquid crystals

    Authors: Zhewen Feng, Min-Chun Hong

    Abstract: In the Landau-de Gennes theory on nematic liquid crystals, the well-known Landau-de Gennes energy depends on four elastic constants; $L_1$, $L_2$, $L_3$, $L_4$. For the general case of $L_4\neq 0$, Ball-Majumdar \cite {BM} found an example that the Landau-de Gennes energy functional from physics literature \cite{MN} does not satisfy a coercivity condition, which causes a problem in mathematics to… ▽ More

    Submitted 6 January, 2021; v1 submitted 21 July, 2020; originally announced July 2020.

    Comments: corrects several typos

    MSC Class: 35J20; 35Q35; 76A15

  27. arXiv:2007.05170  [pdf, other]

    math.OC cs.LG

    A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic

    Authors: Mingyi Hong, Hoi-To Wai, Zhaoran Wang, Zhuoran Yang

    Abstract: This paper analyzes a two-timescale stochastic algorithm framework for bilevel optimization. Bilevel optimization is a class of problems which exhibit a two-level structure, and its goal is to minimize an outer objective function with variables which are constrained to be the optimal solution to an (inner) optimization problem. We consider the case when the inner problem is unconstrained and stron… ▽ More

    Submitted 8 June, 2022; v1 submitted 10 July, 2020; originally announced July 2020.

    Comments: Minor revision

  28. arXiv:2006.15429  [pdf, other]

    cs.LG cs.CR math.OC stat.ML

    Understanding Gradient Clipping in Private SGD: A Geometric Perspective

    Authors: Xiangyi Chen, Zhiwei Steven Wu, Mingyi Hong

    Abstract: Deep learning models are increasingly popular in many machine learning applications where the training data may contain sensitive information. To provide formal and rigorous privacy guarantee, many learning systems now incorporate differential privacy by training their models with (differentially) private SGD. A key step in each private SGD update is gradient clipping that shrinks the gradient of… ▽ More

    Submitted 17 March, 2021; v1 submitted 27 June, 2020; originally announced June 2020.

  29. arXiv:2006.11662  [pdf, ps, other]

    math.OC cs.LG

    On the Divergence of Decentralized Non-Convex Optimization

    Authors: Mingyi Hong, Siliang Zeng, Junyu Zhang, Haoran Sun

    Abstract: We study a generic class of decentralized algorithms in which $N$ agents jointly optimize the non-convex objective $f(u):=1/N\sum_{i=1}^{N}f_i(u)$, while only communicating with their neighbors. This class of problems has become popular in modeling many signal processing and machine learning applications, and many efficient algorithms have been proposed. However, by constructing some counter-examp… ▽ More

    Submitted 20 June, 2020; originally announced June 2020.

    Comments: 34 pages

  30. arXiv:2006.08141  [pdf, other]

    math.OC cs.LG stat.ML

    Non-convex Min-Max Optimization: Applications, Challenges, and Recent Theoretical Advances

    Authors: Meisam Razaviyayn, Tianjian Huang, Songtao Lu, Maher Nouiehed, Maziar Sanjabi, Mingyi Hong

    Abstract: The min-max optimization problem, also known as the saddle point problem, is a classical optimization problem which is also studied in the context of zero-sum games. Given a class of objective functions, the goal is to find a value for the argument which leads to a small objective value even for the worst case function in the given class. Min-max optimization problems have recently become very pop… ▽ More

    Submitted 18 August, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

    Journal ref: IEEE Signal Processing Magazine (Volume: 37, Issue: 5, Sept. 2020)

  31. arXiv:2006.07612  [pdf, ps, other]

    math.DG

    On Chen's biharmonic conjecture for hypersurfaces in $\mathbb R^5$

    Authors: Yu Fu, Min-Chun Hong, Xin Zhan

    Abstract: A longstanding conjecture on biharmonic submanifolds, proposed by Chen in 1991, is that {\it any biharmonic submanifold in a Euclidean space is minimal}. In the case of a hypersurface $M^n$ in $\mathbb R^{n+1}$, Chen's conjecture was settled in the case of $n=2$ by Chen and Jiang around 1987 independently. Hasanis and Vlachos in 1995 settled Chen's conjecture for a hypersurface with $n=3$. However… ▽ More

    Submitted 22 July, 2020; v1 submitted 13 June, 2020; originally announced June 2020.

    Comments: 23 pages

  32. arXiv:2006.02067  [pdf, ps, other]

    math.OC

    Generalization Bounds for Stochastic Saddle Point Problems

    Authors: Junyu Zhang, Mingyi Hong, Mengdi Wang, Shuzhong Zhang

    Abstract: This paper studies the generalization bounds for the empirical saddle point (ESP) solution to stochastic saddle point (SSP) problems. For SSP with Lipschitz continuous and strongly convex-strongly concave objective functions, we establish an $\mathcal{O}(1/n)$ generalization bound by using a uniform stability argument. We also provide generalization bounds under a variety of assumptions, including… ▽ More

    Submitted 3 June, 2020; originally announced June 2020.

  33. arXiv:2005.03267  [pdf, other]

    eess.SY math.OC

    Online Proximal-ADMM For Time-varying Constrained Convex Optimization

    Authors: Yijian Zhang, Emiliano Dall'Anese, Mingyi Hong

    Abstract: This paper considers a convex optimization problem with cost and constraints that evolve over time. The function to be minimized is strongly convex and possibly non-differentiable, and variables are coupled through linear constraints. In this setting, the paper proposes an online algorithm based on the alternating direction method of multipliers (ADMM), to track the optimal solution trajectory of… ▽ More

    Submitted 12 January, 2021; v1 submitted 7 May, 2020; originally announced May 2020.

  34. arXiv:2001.04786  [pdf, other]

    cs.LG math.OC stat.ML

    Distributed Learning in the Non-Convex World: From Batch to Streaming Data, and Beyond

    Authors: Tsung-Hui Chang, Mingyi Hong, Hoi-To Wai, Xinwei Zhang, Songtao Lu

    Abstract: Distributed learning has become a critical enabler of the massively connected world envisioned by many. This article discusses four key elements of scalable distributed processing and real-time intelligence --- problems, data, communication and computation. Our aim is to provide a fresh and unique perspective about how these elements should work together in an effective and coherent manner. In par… ▽ More

    Submitted 14 January, 2020; originally announced January 2020.

    Comments: Submitted to IEEE Signal Processing Magazine Special Issue on Distributed, Streaming Machine Learning; THC, MH, HTW contributed equally

  35. On Lower Iteration Complexity Bounds for the Saddle Point Problems

    Authors: Junyu Zhang, Mingyi Hong, Shuzhong Zhang

    Abstract: In this paper, we study the lower iteration complexity bounds for finding the saddle point of a strongly convex and strongly concave saddle point problem: $\min_x\max_yF(x,y)$. We restrict the classes of algorithms in our investigation to be either pure first-order methods or methods using proximal mappings. The existing lower bound result for this type of problems is obtained via the framework of… ▽ More

    Submitted 20 June, 2021; v1 submitted 16 December, 2019; originally announced December 2019.

    Journal ref: Mathematical Programming (2021)

  36. arXiv:1910.06513  [pdf, other]

    cs.LG math.OC stat.ML

    ZO-AdaMM: Zeroth-Order Adaptive Momentum Method for Black-Box Optimization

    Authors: Xiangyi Chen, Sijia Liu, Kaidi Xu, Xingguo Li, Xue Lin, Mingyi Hong, David Cox

    Abstract: The adaptive momentum method (AdaMM), which uses past gradients to update descent directions and learning rates simultaneously, has become one of the most popular first-order optimization methods for solving machine learning problems. However, AdaMM is not suited for solving black-box optimization problems, where explicit gradient forms are difficult or infeasible to obtain. In this paper, we prop… ▽ More

    Submitted 15 October, 2019; v1 submitted 14 October, 2019; originally announced October 2019.

  37. arXiv:1910.05857  [pdf, ps, other]

    math.OC cs.DC cs.LG eess.SP stat.ML

    Improving the Sample and Communication Complexity for Decentralized Non-Convex Optimization: A Joint Gradient Estimation and Tracking Approach

    Authors: Haoran Sun, Songtao Lu, Mingyi Hong

    Abstract: Many modern large-scale machine learning problems benefit from decentralized and stochastic optimization. Recent works have shown that utilizing both decentralized computing and local stochastic gradient estimates can outperform state-of-the-art centralized algorithms, in applications involving highly non-convex problems, such as training deep neural networks. In this work, we propose a decentra… ▽ More

    Submitted 13 October, 2019; originally announced October 2019.

    Journal ref: Published at the International Conference on Machine Learning (ICML 2020)

  38. arXiv:1909.13806  [pdf, other]

    cs.LG math.OC stat.ML

    Min-Max Optimization without Gradients: Convergence and Applications to Adversarial ML

    Authors: Sijia Liu, Songtao Lu, Xiangyi Chen, Yao Feng, Kaidi Xu, Abdullah Al-Dujaili, Minyi Hong, Una-May O'Reilly

    Abstract: In this paper, we study the problem of constrained robust (min-max) optimization in a black-box setting, where the desired optimizer cannot access the gradients of the objective function but may query its values. We present a principled optimization framework, integrating a zeroth-order (ZO) gradient estimator with an alternating projected stochastic gradient descent-ascent method, where the former… ▽ More

    Submitted 16 June, 2020; v1 submitted 30 September, 2019; originally announced September 2019.

    Comments: ICML 2020

  39. arXiv:1907.06246  [pdf, ps, other]

    cs.LG math.OC stat.ML

    On the Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost

    Authors: Zhuoran Yang, Yongxin Chen, Mingyi Hong, Zhaoran Wang

    Abstract: Despite the empirical success of the actor-critic algorithm, its theoretical understanding lags behind. In a broader context, actor-critic can be viewed as an online alternating update algorithm for bilevel optimization, whose convergence is known to be fragile. To understand the instability of actor-critic, we focus on its application to linear quadratic regulators, a simple yet fundamental setti… ▽ More

    Submitted 14 July, 2019; originally announced July 2019.

    Comments: 41 pages

  40. arXiv:1907.04450  [pdf, ps, other]

    math.OC cs.CC stat.ML

    SNAP: Finding Approximate Second-Order Stationary Solutions Efficiently for Non-convex Linearly Constrained Problems

    Authors: Songtao Lu, Meisam Razaviyayn, Bo Yang, Kejun Huang, Mingyi Hong

    Abstract: This paper proposes low-complexity algorithms for finding approximate second-order stationary points (SOSPs) of problems with smooth non-convex objective and linear constraints. While finding (approximate) SOSPs is computationally intractable, we first show that generic instances of the problem can be solved efficiently. More specifically, for a generic problem instance, certain strict complementa… ▽ More

    Submitted 9 July, 2019; originally announced July 2019.

  41. arXiv:1906.01736  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    Distributed Training with Heterogeneous Data: Bridging Median- and Mean-Based Algorithms

    Authors: Xiangyi Chen, Tiancong Chen, Haoran Sun, Zhiwei Steven Wu, Mingyi Hong

    Abstract: Recently, there is a growing interest in the study of median-based algorithms for distributed non-convex optimization. Two prominent such algorithms include signSGD with majority vote, an effective approach for communication reduction via 1-bit compression on the local gradients, and medianSGD, an algorithm recently proposed to ensure robustness against Byzantine workers. The convergence analyses… ▽ More

    Submitted 6 June, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

  42. arXiv:1905.09631  [pdf, ps, other]

    math.PR

    Derivatives of local times for some Gaussian fields

    Authors: Minhao Hong, Fangjun Xu

    Abstract: In this article, we consider derivatives of local time for a $(2,d)$-Gaussian field \[ Z=\big\{ Z(t,s)= X^{H_1}_t -\widetilde{X}^{H_2}_s, s,t \ge 0\big\}, \] where $X^{H_1}$ and $\widetilde{X}^{H_2}$ are two independent processes from a class of $d$-dimensional centered Gaussian processes satisfying certain local nondeterminism property. We first give a condition for existence of derivatives of th… ▽ More

    Submitted 23 May, 2019; originally announced May 2019.

  43. Hybrid Block Successive Approximation for One-Sided Non-Convex Min-Max Problems: Algorithms and Applications

    Authors: Songtao Lu, Ioannis Tsaknakis, Mingyi Hong, Yongxin Chen

    Abstract: The min-max problem, also known as the saddle point problem, is a class of optimization problems which minimizes and maximizes two subsets of variables simultaneously. This class of problems can be used to formulate a wide range of signal processing and communication (SPCOM) problems. Despite its popularity, most existing theory for this class has been mainly developed for problems with certain sp… ▽ More

    Submitted 16 March, 2021; v1 submitted 21 February, 2019; originally announced February 2019.

  44. arXiv:1901.03674  [pdf, ps, other]

    cs.LG cs.AI math.OC stat.ML

    On the Global Convergence of Imitation Learning: A Case for Linear Quadratic Regulator

    Authors: Qi Cai, Mingyi Hong, Yongxin Chen, Zhaoran Wang

    Abstract: We study the global convergence of generative adversarial imitation learning for linear quadratic regulators, which is posed as minimax optimization. To address the challenges arising from non-convex-concave geometry, we analyze the alternating gradient algorithm and establish its Q-linear rate of convergence to a unique saddle point, which simultaneously recovers the globally optimal policy and r…

    Submitted 11 January, 2019; originally announced January 2019.

  45. Coordinating Multiple Sources for Service Restoration to Enhance Resilience of Distribution Systems

    Authors: Ying Wang, Yin Xu, Jinghan He, Chen-Ching Liu, Kevin P. Schneider, Mingguo Hong, Dan T. Ton

    Abstract: When a major outage occurs on a distribution system due to extreme events, microgrids, distributed generators, and other local resources can be used to restore critical loads and enhance resiliency. This paper proposes a decision-making method to determine the optimal restoration strategy coordinating multiple sources to serve critical loads after blackouts. The critical load restoration problem i…

    Submitted 15 January, 2019; v1 submitted 16 October, 2018; originally announced October 2018.

    Comments: 13 pages, 7 figures, journal

  46. arXiv:1810.05251  [pdf, other]

    math.OC

    A Linearly Convergent Doubly Stochastic Gauss-Seidel Algorithm for Solving Linear Equations and A Certain Class of Over-Parameterized Optimization Problems

    Authors: Meisam Razaviyayn, Mingyi Hong, Navid Reyhanian, Zhi-Quan Luo

    Abstract: Consider the classical problem of solving a general linear system of equations $Ax=b$. It is well known that the (successively over relaxed) Gauss-Seidel scheme and many of its variants may not converge when $A$ is neither diagonally dominant nor symmetric positive definite. Can we have a linearly convergent G-S type algorithm that works for {\it any} $A$? In this paper we answer this question aff…

    Submitted 13 May, 2019; v1 submitted 11 October, 2018; originally announced October 2018.

  47. arXiv:1808.02941  [pdf, other]

    cs.LG math.OC stat.ML

    On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization

    Authors: Xiangyi Chen, Sijia Liu, Ruoyu Sun, Mingyi Hong

    Abstract: This paper studies a class of adaptive gradient based momentum algorithms that update the search directions and learning rates simultaneously using past gradients. This class, which we refer to as the "Adam-type", includes the popular algorithms such as the Adam, AMSGrad and AdaGrad. Despite their popularity in training deep neural networks, the convergence of these algorithms for solving nonconve…

    Submitted 9 March, 2019; v1 submitted 8 August, 2018; originally announced August 2018.

  48. arXiv:1806.10684  [pdf]

    math.OC eess.SP eess.SY

    Price-Based Market Clearing with V2G Integration Using Generalized Benders Decomposition

    Authors: Reza Jamalzadeh, Sajjad Abedi, Masoud Rashidinejad, Mingguo Hong

    Abstract: Currently, most ISOs adopt offer cost minimization (OCM) auction mechanism which minimizes the total offer cost, and then, a settlement rule based on either locational marginal prices (LMPs) or market clearing price (MCP) is used to determine the payments to the committed units, which is not compatible with the auction mechanism because the minimized cost is different from the payment cost calcula…

    Submitted 27 June, 2018; originally announced June 2018.

  49. arXiv:1806.00877  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization

    Authors: Hoi-To Wai, Zhuoran Yang, Zhaoran Wang, Mingyi Hong

    Abstract: Despite the success of single-agent reinforcement learning, multi-agent reinforcement learning (MARL) remains challenging due to complex interactions between agents. Motivated by decentralized applications such as sensor networks, swarm robotics, and power grids, we study policy evaluation in MARL, where agents with jointly observed state-action pairs and private local rewards collaborate to learn…

    Submitted 8 January, 2019; v1 submitted 3 June, 2018; originally announced June 2018.

    Comments: final version as appeared in NeurIPS 2018

  50. Convergence of the Ginzburg-Landau approximation for the Ericksen-Leslie system

    Authors: Zhewen Feng, Min-Chun Hong, Yu Mei

    Abstract: We establish the local well-posedness of the general Ericksen-Leslie system in liquid crystals with the initial velocity and director field in $H^1 \times H_b^2$. In particular, we prove that the solutions of the Ginzburg-Landau approximation system converge smoothly to the solution of the Ericksen-Leslie system for any $t \in (0,T^\ast)$ with a maximal existence time $T^\ast$ of the Ericksen- Les…

    Submitted 13 June, 2019; v1 submitted 22 April, 2018; originally announced April 2018.