-
A Single-Loop Robust Policy Gradient Method for Robust Markov Decision Processes
Authors:
Zhenwei Lin,
Chenyu Xue,
Qi Deng,
Yinyu Ye
Abstract:
Robust Markov Decision Processes (RMDPs) have recently been recognized as a valuable and promising approach to discovering a policy with creditable performance, particularly in the presence of a dynamic environment and estimation errors in the transition matrix due to limited data. Despite extensive exploration of dynamic programming algorithms for solving RMDPs, there has been a notable upswing in interest in developing efficient algorithms using the policy gradient method. In this paper, we propose the first single-loop robust policy gradient (SRPG) method with the global optimality guarantee for solving RMDPs through its minimax formulation. Moreover, we complement the convergence analysis of the nonconvex-nonconcave min-max optimization problem with the objective function's gradient dominance property, which is not explored in the prior literature. Numerical experiments validate the efficacy of SRPG, demonstrating its faster and more robust convergence behavior compared to its nested-loop counterpart.
Submitted 31 May, 2024;
originally announced June 2024.
-
Restarted Primal-Dual Hybrid Conjugate Gradient Method for Large-Scale Quadratic Programming
Authors:
Yicheng Huang,
Wanyu Zhang,
Hongpei Li,
Weihan Xue,
Dongdong Ge,
Huikang Liu,
Yinyu Ye
Abstract:
Convex quadratic programming (QP) is an essential class of optimization problems with broad applications across various fields. Traditional QP solvers, typically based on simplex or barrier methods, face significant scalability challenges. In response to these limitations, recent research has shifted towards matrix-free first-order methods to enhance scalability in QP. Among these, the restarted accelerated primal-dual hybrid gradient (rAPDHG) method, proposed by H. Lu (2023), has gained notable attention due to its linear convergence rate to an optimal solution and its straightforward implementation on Graphics Processing Units (GPUs). Building on this framework, this paper introduces a restarted primal-dual hybrid conjugate gradient (PDHCG) method, which incorporates conjugate gradient (CG) techniques to address the primal subproblems inexactly. We demonstrate that PDHCG maintains a linear convergence rate with an improved convergence constant and is also straightforward to implement on GPUs. Extensive numerical experiments affirm that, compared to rAPDHG, our method could significantly reduce the number of iterations required to achieve the desired accuracy and offer a substantial performance improvement in large-scale problems. These findings highlight the significant potential of our proposed PDHCG method to boost both the efficiency and scalability of solving complex QP challenges.
Submitted 25 May, 2024;
originally announced May 2024.
-
New exotic examples of Ricci limit spaces
Authors:
Xilun Li,
Yanan Ye,
Shengxuan Zhou
Abstract:
For any integers $m\geqslant n\geqslant 3$, we construct a Ricci limit space $X_{m,n}$ such that for a fixed point, some tangent cones are $\mathbb{R}^m$ and some are $\mathbb{R}^n$. This is an improvement of Menguy's example.
Submitted 23 April, 2024;
originally announced April 2024.
-
Four-fifths laws in electron and Hall magnetohydrodynamic fluids: Energy, Magnetic helicity and Generalized helicity
Authors:
Yanqing Wang,
Yulin Ye,
Otto Chkhetiani
Abstract:
This paper examines the Kolmogorov type laws of conserved quantities in the electron and Hall magnetohydrodynamic fluids. Inspired by Eyink's longitudinal structure functions and recent progress in classical MHD equations, we derive four-fifths laws for energy, magnetic helicity and generalized helicity in these systems.
Submitted 10 April, 2024;
originally announced April 2024.
-
Four-fifths laws in incompressible and magnetized fluids: Helicity, Energy and Cross-helicity
Authors:
Yulin Ye,
Yanqing Wang,
Otto Chkhetiani
Abstract:
In this paper, we are concerned with Kolmogorov's scaling laws of conserved quantities. By means of Eyink's longitudinal structure functions and the analysis of the interaction of different physical quantities, we extend the celebrated four-fifths laws from energy to helicity in incompressible fluids, and to energy and cross-helicity in magnetohydrodynamic flow. In contrast to the previous 4/5 laws of energy and cross-helicity in magnetized fluids obtained by Politano and Pouquet, they are in terms of the mixed third-order structure functions rather than the structure coupling correlation functions.
Submitted 10 April, 2024;
originally announced April 2024.
-
On the shrinking solitons of generalized Ricci flow
Authors:
Xilun Li,
Yanan Ye
Abstract:
We show that every gradient shrinking soliton of the generalized Ricci flow on a compact manifold is a Ricci soliton. We also prove that a pluriclosed soliton is a gradient Kähler-Ricci soliton under a broad cohomological condition. Moreover, we construct the first example of a non-trivial shrinking generalized soliton, which can serve as a singularity model of the generalized Ricci flow.
Submitted 9 April, 2024;
originally announced April 2024.
-
Particle approximation for a conditional McKean--Vlasov stochastic differential equation
Authors:
Kai Du,
Yunzhang Li,
Yuyang Ye
Abstract:
In this paper, we construct a type of interacting particle system to approximate a class of stochastic differential equations whose coefficients depend on the conditional probability distributions of the processes given partial observations. After proving the well-posedness and regularity of the particle systems, we establish a quantitative convergence result for the empirical measures of the particle systems in the Wasserstein space, as the number of particles increases. Moreover, we discuss an Euler--Maruyama scheme of the particle system and validate its strong convergence. A numerical experiment is conducted to illustrate our results.
Submitted 26 March, 2024;
originally announced March 2024.
-
Diffusion Model for Data-Driven Black-Box Optimization
Authors:
Zihao Li,
Hui Yuan,
Kaixuan Huang,
Chengzhuo Ni,
Yinyu Ye,
Minshuo Chen,
Mengdi Wang
Abstract:
Generative AI has redefined artificial intelligence, enabling the creation of innovative content and customized solutions that drive business practices into a new era of efficiency and creativity. In this paper, we focus on diffusion models, a powerful generative AI technology, and investigate their potential for black-box optimization over complex structured variables. Consider the practical scenario where one wants to optimize some structured design in a high-dimensional space, based on massive unlabeled data (representing design variables) and a small labeled dataset. We study two practical types of labels: 1) noisy measurements of a real-valued reward function and 2) human preference based on pairwise comparisons. The goal is to generate new designs that are near-optimal and preserve the designed latent structures. Our proposed method reformulates the design optimization problem into a conditional sampling problem, which allows us to leverage the power of diffusion models for modeling complex distributions. In particular, we propose a reward-directed conditional diffusion model, to be trained on the mixed data, for sampling a near-optimal solution conditioned on high predicted rewards. Theoretically, we establish sub-optimality error bounds for the generated designs. The sub-optimality gap nearly matches the optimal guarantee in off-policy bandits, demonstrating the efficiency of reward-directed diffusion models for black-box optimization. Moreover, when the data admits a low-dimensional latent subspace structure, our model efficiently generates high-fidelity designs that closely respect the latent structure. We provide empirical experiments validating our model in decision-making and content-creation tasks.
Submitted 19 March, 2024;
originally announced March 2024.
-
A Low-Rank ADMM Splitting Approach for Semidefinite Programming
Authors:
Qiushi Han,
Chenxi Li,
Zhenwei Lin,
Caihua Chen,
Qi Deng,
Dongdong Ge,
Huikang Liu,
Yinyu Ye
Abstract:
We introduce a new first-order method for solving general semidefinite programming problems, based on the alternating direction method of multipliers (ADMM) and a matrix-splitting technique. Our algorithm has an advantage over the Burer-Monteiro approach as it only involves much easier quadratically regularized subproblems in each iteration. For a linear objective, the subproblems are well-conditioned quadratic programs that can be efficiently solved by the standard conjugate gradient method. We show that the ADMM algorithm achieves sublinear or linear convergence rates to the KKT solutions under different conditions. Building on this theoretical development, we present LoRADS, a new solver for linear SDP based on the Low-Rank ADMM Splitting approach. LoRADS incorporates several strategies that significantly increase its efficiency. Firstly, it initiates with a warm-start phase that uses the Burer-Monteiro approach. Moreover, motivated by the SDP low-rank theory [So et al. 2008], LoRADS chooses an initial rank of logarithmic order and then employs a dynamic approach to increase the rank. Numerical experiments indicate that LoRADS exhibits promising performance on various SDP problems. A noteworthy achievement of LoRADS is its successful solving of a matrix completion problem with $15,694,167$ constraints and a matrix variable of size $40,000 \times 40,000$ in $351$ seconds.
Submitted 25 March, 2024; v1 submitted 14 March, 2024;
originally announced March 2024.
-
The associated graded algebras of Brauer graph algebras II: infinite representation type
Authors:
Jing Guo,
Yuming Liu,
Yu Ye
Abstract:
Let $G$ be a Brauer graph and $A$ the associated Brauer graph algebra. Denote by $gr(A)$ the graded algebra associated with the radical filtration of $A$. The question when $gr(A)$ is of finite representation type was answered in [9]. In the present paper, we characterize when $gr(A)$ is domestic in terms of the associated Brauer graph $G$.
Submitted 15 May, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.
-
On the Classification of Finite Quasi-Quantum Groups over Abelian Groups
Authors:
Hua-Lin Huang,
Gongxiang Liu,
Yuping Yang,
Yu Ye
Abstract:
Using a variety of methods developed in the theory of finite-dimensional quasi-Hopf algebras, we classify all finite-dimensional coradically graded pointed coquasi-Hopf algebras over abelian groups. As a consequence, we partially confirm the generation conjecture of pointed finite tensor categories due to Etingof, Gelaki, Nikshych and Ostrik.
Submitted 7 March, 2024;
originally announced March 2024.
-
Quasi-diagrams and gentle algebras
Authors:
Haigang Hu,
Xiao-Chuang Wang,
Yu Ye
Abstract:
Any gentle algebra $A$ with one maximal path corresponds to a unique quasi-diagram $α$. We introduce the regularity for $α$, and show that $A$ has finite global dimension if and only if $α$ is regular. We characterize regular quasi-diagrams which remain regular under the dihedral group action. We prove that the set of maximal chord diagrams is the "biggest" one among the sets closed under taking Koszul dual and rotations.
Submitted 5 March, 2024;
originally announced March 2024.
-
An Adaptive Orthogonal Basis Method for Computing Multiple Solutions of Differential Equations with polynomial nonlinearities
Authors:
Lin Li,
Yangyi Ye,
Huiyuan Li
Abstract:
This paper presents an innovative approach, the Adaptive Orthogonal Basis Method, tailored for computing multiple solutions to differential equations characterized by polynomial nonlinearities. Departing from conventional practices of predefining candidate basis pools, our novel method adaptively computes bases, considering the equation's nature and structural characteristics of the solution. It further leverages companion matrix techniques to generate initial guesses for subsequent computations. Thus this approach not only yields numerous initial guesses for solving such equations but also adapts orthogonal basis functions to effectively address discretized nonlinear systems. Through a series of numerical experiments, this paper demonstrates the method's effectiveness and robustness. By reducing computational costs in various applications, this novel approach opens new avenues for uncovering multiple solutions to differential equations with polynomial nonlinearities.
Submitted 20 April, 2024; v1 submitted 28 February, 2024;
originally announced February 2024.
-
Achieving $\tilde{O}(1/ε)$ Sample Complexity for Constrained Markov Decision Process
Authors:
Jiashuo Jiang,
Yinyu Ye
Abstract:
We consider the reinforcement learning problem for the constrained Markov decision process (CMDP), which plays a central role in satisfying safety or resource constraints in sequential learning and decision-making. In this problem, we are given finite resources and an MDP with unknown transition probabilities. At each stage, we take an action, collecting a reward and consuming some resources, all of which are assumed to be unknown and must be learned over time. In this work, we take the first step towards deriving optimal problem-dependent guarantees for the CMDP problems. We derive a logarithmic regret bound, which translates into a $O(\frac{1}{Δ\cdot\eps}\cdot\log^2(1/\eps))$ sample complexity bound, with $Δ$ being a problem-dependent parameter, yet independent of $\eps$. Our sample complexity bound improves upon the state-of-the-art $O(1/\eps^2)$ sample complexity for CMDP problems established in the previous literature, in terms of the dependency on $\eps$. To achieve this advance, we develop a new framework for analyzing CMDP problems. To be specific, our algorithm operates in the primal space and we resolve the primal LP for the CMDP problem at each period in an online manner, with \textit{adaptive} remaining resource capacities. The key elements of our algorithm are: i) a characterization of the instance hardness via LP basis, ii) an eliminating procedure that identifies one optimal basis of the primal LP, and iii) a resolving procedure that is adaptive to the remaining resources and sticks to the characterized optimal basis.
Submitted 2 June, 2024; v1 submitted 26 February, 2024;
originally announced February 2024.
-
Decoupling Learning and Decision-Making: Breaking the $\mathcal{O}(\sqrt{T})$ Barrier in Online Resource Allocation with First-Order Methods
Authors:
Wenzhi Gao,
Chunlin Sun,
Chenyu Xue,
Dongdong Ge,
Yinyu Ye
Abstract:
Online linear programming plays an important role in both revenue management and resource allocation, and recent research has focused on developing efficient first-order online learning algorithms. Despite the empirical success of first-order methods, they typically achieve a regret no better than $\mathcal{O}(\sqrt{T})$, which is suboptimal compared to the $\mathcal{O}(\log T)$ bound guaranteed by the state-of-the-art linear programming (LP)-based online algorithms. This paper establishes several important facts about online linear programming, which unveil the challenge for first-order-method-based online algorithms to achieve beyond $\mathcal{O}(\sqrt{T})$ regret. To address the challenge, we introduce a new algorithmic framework that decouples learning from decision-making. For the first time, we show that first-order methods can attain regret $\mathcal{O}(T^{1/3})$ with this new framework.
Submitted 28 May, 2024; v1 submitted 11 February, 2024;
originally announced February 2024.
-
A Tuning-Free Primal-Dual Splitting Algorithm for Large-Scale Semidefinite Programming
Authors:
Yinjun Wang,
Haixiang Lan,
Yinyu Ye
Abstract:
This paper proposes and analyzes a tuning-free variant of Primal-Dual Hybrid Gradient (PDHG), and investigates its effectiveness for solving large-scale semidefinite programming (SDP). The core idea is based on the combination of two seemingly unrelated results: (1) the equivalence of PDHG and Douglas-Rachford splitting (DRS); (2) the asymptotic convergence of non-stationary DRS. This combination provides a unified approach to analyze the convergence of generic adaptive PDHG, including the proposed tuning-free algorithm and various existing ones. Numerical experiments are conducted to show the performance of our algorithm, highlighting its superior convergence speed and robustness in the context of SDP.
Submitted 31 January, 2024;
originally announced February 2024.
-
Distribution of neighboring values of the Liouville and Möbius functions
Authors:
Qi Luo,
Yangbo Ye
Abstract:
Let $λ(n)$ and $μ(n)$ denote the Liouville function and the Möbius function, respectively. In this study, relationships between the values of $λ(n)$ and $λ(n+h)$ up to $n\leq10^8$ for $1\leq h\leq1,000$ are explored. Chowla's conjecture predicts that the conditional expectation of $λ(n+h)$ given $λ(n)=1$ for $1\leq n\leq X$ converges to the conditional expectation of $λ(n+h)$ given $λ(n)=-1$ for $1\leq n\leq X$ as $X\rightarrow\infty$. However, for finite $X$, these conditional expectations are different. The observed difference, together with the significant difference in $χ^2$ tests of independence, reveals hidden additive properties among the values of the Liouville function. Similarly, such additive structures for $μ(n)$ for square-free $n$'s are identified. These findings pave the way for developing possible, and hopefully efficient, additive algorithms for these functions. The potential existence of fast, additive algorithms for $λ(n)$ and $μ(n)$ may eventually provide scientific evidence supporting the belief that prime factorization of large integers should not be too difficult. For $1\leq h\leq1,000$, the study also tested the convergence speeds of Chowla's conjecture and found no relation on $h$.
Submitted 31 January, 2024;
originally announced January 2024.
-
From law of the iterated logarithm to Zolotarev distance for supercritical branching processes in random environment
Authors:
Yinna Ye
Abstract:
Consider $(Z_n)_{n\geq0}$ a supercritical branching process in an independent and identically distributed environment. Based on some recent developments in martingale limit theory, we establish the law of the iterated logarithm, the strong law of large numbers, an invariance principle and the optimal convergence rate in the central limit theorem under Zolotarev and Wasserstein distances of order $p\in(0,2]$ for the process $(\log Z_n)_{n\geq0}$.
Submitted 20 June, 2024; v1 submitted 9 January, 2024;
originally announced January 2024.
-
Scalable Approximate Optimal Diagonal Preconditioning
Authors:
Wenzhi Gao,
Zhaonan Qu,
Madeleine Udell,
Yinyu Ye
Abstract:
We consider the problem of finding the optimal diagonal preconditioner for a positive definite matrix. Although this problem has been shown to be solvable and various methods have been proposed, none of the existing approaches are scalable to matrices of large dimension, or when access is limited to black-box matrix-vector products, thereby significantly limiting their practical application. In view of these challenges, we propose practical algorithms applicable to finding approximate optimal diagonal preconditioners of large sparse systems. Our approach is based on the idea of dimension reduction, and combines techniques from semi-definite programming (SDP), random projection, semi-infinite programming (SIP), and column generation. Numerical experiments demonstrate that our method scales to sparse matrices of size greater than $10^7$. Notably, our approach is efficient and implementable using only black-box matrix-vector product operations, making it highly practical for a wide variety of applications.
Submitted 24 December, 2023;
originally announced December 2023.
-
cuPDLP-C: A Strengthened Implementation of cuPDLP for Linear Programming by C language
Authors:
Haihao Lu,
Jinwen Yang,
Haodong Hu,
Qi Huangfu,
Jinsong Liu,
Tianhao Liu,
Yinyu Ye,
Chuwen Zhang,
Dongdong Ge
Abstract:
A recent GPU implementation of the Restarted Primal-Dual Hybrid Gradient Method for Linear Programming was proposed in Lu and Yang (2023). Its computational results demonstrate the significant computational advantages of the GPU-based first-order algorithm on certain large-scale problems. The average performance also achieves a level close to commercial solvers for the first time in history. However, due to limitations in experimental hardware and the disadvantage of implementing the algorithm in Julia compared to C language, neither the commercial solver nor cuPDLP reached their maximum efficiency. Therefore, in this report, we have re-implemented and optimized cuPDLP in C language. Utilizing state-of-the-art CPU and GPU hardware, we extensively compare cuPDLP with the best commercial solvers. The experiments further highlight its substantial computational advantages and potential for solving large-scale linear programming problems. We also discuss the profound impact this breakthrough may have on mathematical programming research and the entire operations research community.
Submitted 7 January, 2024; v1 submitted 22 December, 2023;
originally announced December 2023.
-
Linear stability of inner case of double averaged spatial restricted elliptic three body problem
Authors:
Xiumin Huang,
Yan Luo,
Kaicheng Sheng,
Yiru Ye
Abstract:
We study the secular effects in the motion of an asteroid with negligible mass in a spatial restricted elliptic three body problem with arbitrary inclination. Averaging over the mean anomalies of the asteroid and the planet is applied to obtain the double averaged Hamiltonian system. It admits a two-parameter family of orbits corresponding to the motion of the third body in the plane of the primaries' motion. The aim of our investigation is to analyze the stability of these orbits in the inner case. We show that they are stable in the linear approximation and describe the linear stability with respect to the eccentricity and argument of periapsis of the asteroid. Numerical simulations of different types of orbits are performed as well.
Submitted 13 December, 2023;
originally announced December 2023.
-
A Universal Trust-Region Method for Convex and Nonconvex Optimization
Authors:
Yuntian Jiang,
Chang He,
Chuwen Zhang,
Dongdong Ge,
Bo Jiang,
Yinyu Ye
Abstract:
This paper presents a universal trust-region method simultaneously incorporating quadratic regularization and the ball constraint. We introduce a novel mechanism to set the parameters in the proposed method that unifies the analysis for convex and nonconvex optimization. Our method exhibits an iteration complexity of $\tilde O(ε^{-3/2})$ to find an approximate second-order stationary point for nonconvex optimization. Meanwhile, the analysis reveals that the universal method attains an $O(ε^{-1/2})$ complexity bound for convex optimization and can be accelerated. These results are complementary to the existing literature as the trust-region method was historically conceived for nonconvex optimization. Finally, we develop an adaptive universal method to address practical implementations. The numerical results show the effectiveness of our method in both nonconvex and convex problems.
Submitted 12 March, 2024; v1 submitted 19 November, 2023;
originally announced November 2023.
-
Trust Region Methods For Nonconvex Stochastic Optimization Beyond Lipschitz Smoothness
Authors:
Chenghan Xie,
Chenxi Li,
Chuwen Zhang,
Qi Deng,
Dongdong Ge,
Yinyu Ye
Abstract:
In many important machine learning applications, the standard assumption of having a globally Lipschitz continuous gradient may fail to hold. This paper delves into a more general $(L_0, L_1)$-smoothness setting, which gains particular significance within the realms of deep neural networks and distributionally robust optimization (DRO). We demonstrate the significant advantage of trust region methods for stochastic nonconvex optimization under such generalized smoothness assumption. We show that first-order trust region methods can recover the normalized and clipped stochastic gradient as special cases and then provide a unified analysis to show their convergence to first-order stationary conditions. Motivated by the important application of DRO, we propose a generalized high-order smoothness condition, under which second-order trust region methods can achieve a complexity of $\mathcal{O}(ε^{-3.5})$ for convergence to second-order stationary points. By incorporating variance reduction, the second-order trust region method obtains an even better complexity of $\mathcal{O}(ε^{-3})$, matching the optimal bound for standard smooth optimization. To our best knowledge, this is the first work to show convergence beyond the first-order stationary condition for generalized smooth optimization. Preliminary experiments show that our proposed algorithms perform favorably compared with existing methods.
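As a rough illustration of how a first-order trust region step can reduce to normalized or clipped gradient updates, here is a minimal sketch. The abstract does not state the subproblem models, so the two models used here (a linear model, and a linear model plus an isotropic proximal term with step size `eta`) are assumptions, as are the function names and the parameters `radius` and `eta`.

```python
import numpy as np


def tr_normalized_step(g, radius):
    """Minimize the linear model g^T d subject to ||d|| <= radius.

    The minimizer lies on the boundary of the ball, opposite to g,
    which is exactly a normalized gradient step of length `radius`.
    """
    norm = np.linalg.norm(g)
    if norm == 0.0:
        return np.zeros_like(g)
    return -radius * g / norm


def tr_clipped_step(g, radius, eta):
    """Minimize g^T d + ||d||^2 / (2 * eta) subject to ||d|| <= radius.

    The unconstrained minimizer is -eta * g; because the quadratic term is
    isotropic, the constrained minimizer is its projection onto the ball,
    i.e. a gradient step whose length is clipped at `radius`.
    """
    norm = np.linalg.norm(g)
    scale = eta if norm == 0.0 else min(eta, radius / norm)
    return -scale * g


g = np.array([3.0, 4.0])                       # stand-in for a stochastic gradient
print(tr_normalized_step(g, radius=1.0))       # step of length 1 along -g
print(tr_clipped_step(g, radius=1.0, eta=0.1)) # short step, no clipping
print(tr_clipped_step(g, radius=1.0, eta=1.0)) # clipped to the trust region
```

With a large `eta` the clipped step coincides with the normalized step, which is one way to see the two updates as members of the same family.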
Submitted 26 October, 2023;
originally announced October 2023.
-
Data-driven aerodynamic shape design with distributionally robust optimization approaches
Authors:
Long Chen,
Jan Rottmayer,
Lisa Kusch,
Nicolas R. Gauger,
Yinyu Ye
Abstract:
We formulate and solve data-driven aerodynamic shape design problems with distributionally robust optimization (DRO) approaches. Building on the findings of the work \cite{gotoh2018robust}, we study the connections between a class of DRO and the Taguchi method in the context of robust design optimization. Our preliminary computational experiments on aerodynamic shape optimization in transonic turbulent flow show promising design results.
Submitted 13 October, 2023;
originally announced October 2023.
-
Derivative estimates of pluriclosed flow
Authors:
Yanan Ye
Abstract:
We provide a derivative estimate for the pluriclosed flow, controlling higher order derivatives of Chern curvature and torsion using the Chern curvature. Moreover, we derive an estimate for torsion tensor using Chern Ricci curvature in dimension two. And in the Hermitian-symplectic case, we find a monotonic quantity and use it to prove that all Hermitian-symplectic solitons are Kähler Ricci solitons.
Submitted 13 November, 2023; v1 submitted 28 August, 2023;
originally announced August 2023.
-
A Homogenization Approach for Gradient-Dominated Stochastic Optimization
Authors:
Jiyuan Tan,
Chenyu Xue,
Chuwen Zhang,
Qi Deng,
Dongdong Ge,
Yinyu Ye
Abstract:
Gradient dominance property is a condition weaker than strong convexity, yet sufficiently ensures global convergence even in non-convex optimization. This property finds wide applications in machine learning, reinforcement learning (RL), and operations management. In this paper, we propose the stochastic homogeneous second-order descent method (SHSODM) for stochastic functions enjoying gradient dominance property based on a recently proposed homogenization approach. Theoretically, we provide its sample complexity analysis, and further present an enhanced result by incorporating variance reduction techniques. Our findings show that SHSODM matches the best-known sample complexity achieved by other second-order methods for gradient-dominated stochastic optimization but without cubic regularization. Empirically, since the homogenization approach only relies on solving extremal eigenvector problem at each iteration instead of Newton-type system, our methods gain the advantage of cheaper computational cost and robustness in ill-conditioned problems. Numerical experiments on several RL tasks demonstrate the better performance of SHSODM compared to other off-the-shelf methods.
Submitted 29 May, 2024; v1 submitted 21 August, 2023;
originally announced August 2023.
-
Learning to Pivot as a Smart Expert
Authors:
Tianhao Liu,
Shanwen Pu,
Dongdong Ge,
Yinyu Ye
Abstract:
Linear programming has been practically solved mainly by simplex and interior point methods. Compared with the weakly polynomial complexity obtained by the interior point methods, the existence of strongly polynomial bounds for the length of the pivot path generated by the simplex methods remains a mystery. In this paper, we propose two novel pivot experts that leverage both global and local information of the linear programming instances for the primal simplex method and show their excellent performance numerically. The experts can be regarded as a benchmark to evaluate the performance of classical pivot rules, although they are hard to directly implement. To tackle this challenge, we employ a graph convolutional neural network model, trained via imitation learning, to mimic the behavior of the pivot expert. Our pivot rule, learned empirically, displays a significant advantage over conventional methods in various linear programming problems, as demonstrated through a series of rigorous experiments.
Submitted 31 August, 2023; v1 submitted 16 August, 2023;
originally announced August 2023.
-
Blessing of High-Order Dimensionality: from Non-Convex to Convex Optimization for Sensor Network Localization
Authors:
Mingyu Lei,
Jiayu Zhang,
Yinyu Ye
Abstract:
This paper investigates the Sensor Network Localization (SNL) problem, which seeks to determine sensor locations based on known anchor locations and partially given anchors-sensors and sensors-sensors distances. Two primary methods for solving the SNL problem are analyzed: the low-dimensional method that directly minimizes a loss function, and the high-dimensional semi-definite relaxation (SDR) method that reformulates the SNL problem as an SDP (semi-definite programming) problem. The paper primarily focuses on the intrinsic non-convexity of the loss function of the low-dimensional method, which is shown in our main theorem. The SDR method, via second-order dimension augmentation, is discussed in the context of its ability to transform non-convex problems into convex ones; while the first-order direct dimension augmentation fails. Additionally, we will show that more edges don't necessarily contribute to the better convexity of the loss function. Moreover, we provide an explanation for the success of the SDR+GD (gradient descent) method which uses the SDR solution as a warm-start of the minimization of the loss function by gradient descent. The paper also explores the parallels among SNL, max-cut, and neural networks in terms of the blessing of high-order dimension augmentation.
Submitted 4 August, 2023;
originally announced August 2023.
-
On the energy and helicity conservation of the incompressible Euler equations
Authors:
Yanqing Wang,
Wei Wei,
Gnag Wu,
Yulin Ye
Abstract:
In this paper, we are concerned with the minimal regularity of weak solutions implying the law of balance for both energy and helicity in the incompressible Euler equations. In the spirit of recent works due to Berselli [5] and Berselli-Georgiadis [6], it is shown that the energy of weak solutions is invariant if $v\in L^{p}(0,T;B^{\frac1p}_{\frac{2p}{p-1},c(\mathbb{N})} )$ with $1<p\leq3$ and the helicity is conserved if $v\in L^{p}(0,T;B^{\frac2p}_{\frac{2p}{p-1},c(\mathbb{N})} )$ with $2<p\leq3 $ for both the periodic domain and the whole space, which generalizes the classical work of Cheskidov-Constantin-Friedlander-Shvydkoy in [10]. This indicates the role of the time integrability, spatial integrability and differential regularity of the velocity in the conserved quantities of weak solutions of the ideal fluid.
Submitted 17 July, 2023;
originally announced July 2023.
-
Ideals of the associative algebra operad
Authors:
Y. -H. Bao,
J. -N. Xu,
Y. Ye,
J. J. Zhang,
Y. -F. Zhang
Abstract:
We prove a one-to-one correspondence between the operadic ideals of the operad $\As$ and $T$-ideals. As a consequence, we show that $\As$ is noetherian and that every proper operadic ideal of $\ias$ is generated by a single element.
Submitted 26 November, 2023; v1 submitted 14 July, 2023;
originally announced July 2023.
-
Pattern formation and bifurcation analysis of delay induced fractional-order epidemic spreading on networks
Authors:
Jiaying Zhou,
Yong Ye,
Alex Arenas,
Sergio Gómez,
Yi Zhao
Abstract:
The spontaneous emergence of ordered structures, known as Turing patterns, in complex networks is a phenomenon that holds potential applications across diverse scientific fields, including biology, chemistry, and physics. Here, we present a novel delayed fractional-order susceptible-infected-recovered-susceptible (SIRS) reaction-diffusion model functioning on a network, which is typically used to simulate disease transmission but can also model rumor propagation in social contexts. Our theoretical analysis establishes the Turing instability resulting from delay, and we support our conclusions through numerical experiments. We identify the unique impacts of delay, average network degree, and diffusion rate on pattern formation. The primary outcomes of our study are: (i) Delays cause system instability, mainly evidenced by periodic temporal fluctuations; (ii) The average network degree produces periodic oscillatory states in uneven spatial distributions; (iii) The combined influence of diffusion rate and delay results in irregular oscillations in both time and space. However, we also find that fractional-order can suppress the formation of spatiotemporal patterns. These findings are crucial for comprehending the impact of network structure on the dynamics of fractional-order systems.
Submitted 5 July, 2023;
originally announced July 2023.
-
Wasserstein-$1$ distance and nonuniform Berry-Esseen bound for a supercritical branching process in a random environment
Authors:
Hao Wu,
Xiequan Fan,
Zhiqiang Gao,
Yinna Ye
Abstract:
Let $ (Z_{n})_{n\geq 0} $ be a supercritical branching process in an independent and identically distributed random environment. We establish an optimal convergence rate in the Wasserstein-$1$ distance for the process $ (Z_{n})_{n\geq 0} $, which completes a result of Grama et al. [Stochastic Process. Appl., 127(4), 1255-1281, 2017]. Moreover, an exponential nonuniform Berry-Esseen bound is also given. At last, some applications of the main results to the confidence interval estimation for the criticality parameter and the population size $Z_n$ are discussed.
Submitted 4 December, 2023; v1 submitted 3 July, 2023;
originally announced July 2023.
-
Homogeneous Second-Order Descent Framework: A Fast Alternative to Newton-Type Methods
Authors:
Chang He,
Yuntian Jiang,
Chuwen Zhang,
Dongdong Ge,
Bo Jiang,
Yinyu Ye
Abstract:
This paper proposes a homogeneous second-order descent framework (HSODF) for nonconvex and convex optimization based on the generalized homogeneous model (GHM). In comparison to the Newton steps, the GHM can be solved by extremal symmetric eigenvalue procedures and thus grant an advantage in ill-conditioned problems. Moreover, GHM extends the ordinary homogeneous model (OHM) (Zhang et al. 2022) to allow adaptiveness in the construction of the aggregated matrix. Consequently, HSODF is able to recover some well-known second-order methods, such as trust-region methods and gradient regularized methods, while maintaining comparable iteration complexity bounds. We also study two specific realizations of HSODF. One is adaptive HSODM, which has a parameter-free $O(ε^{-3/2})$ global complexity bound for nonconvex second-order Lipschitz continuous objective functions. The other one is homotopy HSODM, which is proven to have a global linear rate of convergence without strong convexity. The efficiency of our approach to ill-conditioned and high-dimensional problems is justified by some preliminary numerical results.
Submitted 6 May, 2024; v1 submitted 30 June, 2023;
originally announced June 2023.
-
Pattern formation in a predator-prey model with Allee effect and hyperbolic mortality on networked and non-networked environments
Authors:
Yong Ye,
Jiaying Zhou
Abstract:
With the development of network science, Turing pattern has been proven to be formed in discrete media such as complex networks, opening up the possibility of exploring it as a generation mechanism in the context of biology, chemistry, and physics. Turing instability in the predator-prey system has been widely studied in recent years. We hope to use the predator-prey interaction relationship in biological populations to explain the influence of network topology on pattern formation. In this paper, we establish a predator-prey model with weak Allee effect, analyze and verify the Turing instability conditions on the large ER (Erdös-Rényi) random network with the help of Turing stability theory and numerical experiments, and obtain the Turing instability region. The results indicate that diffusion plays a decisive role in the generation of spatial patterns, whether in continuous or discrete media. For spatiotemporal patterns, different initial values can also bring about changes in the pattern. When we analyze the model based on the network framework, we find that the average degree of the network has an important impact on the model, and different average degrees will lead to changes in the distribution pattern of the population.
Submitted 4 October, 2023; v1 submitted 20 June, 2023;
originally announced June 2023.
-
Pre-trained Mixed Integer Optimization through Multi-variable Cardinality Branching
Authors:
Yanguang Chen,
Wenzhi Gao,
Dongdong Ge,
Yinyu Ye
Abstract:
We propose a new method to accelerate online Mixed Integer Optimization with Pre-trained machine learning models (PreMIO). The key component of PreMIO is a multi-variable cardinality branching procedure that splits the feasible region with data-driven hyperplanes, which can be easily integrated into any MIP solver with two lines of code. Moreover, we incorporate learning theory and concentration inequalities to develop a straightforward and interpretable hyper-parameter selection strategy for our method. We test the performance of PreMIO by applying it to state-of-the-art MIP solvers and running numerical experiments on both classical OR benchmark datasets and real-life instances. The results validate the effectiveness of our proposed method.
Submitted 21 May, 2023;
originally announced May 2023.
-
A Riemannian Dimension-reduced Second Order Method with Application in Sensor Network Localization
Authors:
Tianyun Tang,
Kim-Chuan Toh,
Nachuan Xiao,
Yinyu Ye
Abstract:
In this paper, we propose a cubic-regularized Riemannian optimization method (RDRSOM), which partially exploits the second order information and achieves the iteration complexity of $\mathcal{O}(1/ε^{3/2})$. In order to reduce the per-iteration computational cost, we further propose a practical version of (RDRSOM), which is an extension of the well known Barzilai-Borwein method and achieves the iteration complexity of $\mathcal{O}(1/ε^{3/2})$. We apply our method to solve a nonlinear formulation of the wireless sensor network localization problem whose feasible set is a Riemannian manifold that has not been considered in the literature before. Numerical experiments are conducted to verify the high efficiency of our algorithm compared to state-of-the-art Riemannian optimization methods and other nonlinear solvers.
Submitted 24 April, 2023; v1 submitted 20 April, 2023;
originally announced April 2023.
-
On two conserved quantities in the inviscid electron and Hall magnetohydrodynamic equations
Authors:
Yanqing Wang,
Jing Yang,
Yulin Ye
Abstract:
In this paper, we are concerned with the energy and magnetic helicity conservation of weak solutions for both the electron and Hall magnetohydrodynamic equations. Various sufficient criteria to ensure the energy and magnetic helicity conservation in Onsager's critical spaces $\underline{B}^α_{p,VMO}$ and $B^α_{p,c(\mathbb{N})}$ in these systems are established. Moreover, for the E-MHD equations, we observe that the conservation criteria of energy and magnetic helicity to the E-MHD equations correspond to the helicity and energy to the ideal incompressible Euler equations, respectively.
Submitted 21 March, 2023;
originally announced March 2023.
-
A gradient descent akin method for constrained optimization: algorithms and applications
Authors:
Long Chen,
Kai-Uwe Bletzinger,
Nicolas R. Gauger,
Yinyu Ye
Abstract:
We present a first-order method for solving constrained optimization problems. The method is derived from our previous work, a modified search direction method inspired by singular value decomposition. In this work, we simplify its computational framework to a ``gradient descent akin'' method (GDAM), i.e., the search direction is computed using a linear combination of the negative and normalized objective and constraint gradient. We give fundamental theoretical guarantees on the global convergence of the method. This work focuses on the algorithms and applications of GDAM. We present computational algorithms that adapt common strategies for the gradient descent method. We demonstrate the potential of the method using two engineering applications, shape optimization and sensor network localization. When practically implemented, GDAM is robust and very competitive in solving the considered large and challenging optimization problems.
Submitted 23 February, 2023;
originally announced February 2023.
-
On the polynomiality conjecture of cluster realization of quantum groups
Authors:
Ivan Chi-Ho Ip,
Jeff York Ye
Abstract:
In this paper, we give a sufficient and necessary condition for a regular element of a quantum cluster algebra $\mathcal{O}_q(\mathcal{X})$ to be universally polynomial. This resolves several conjectures by the first author on the polynomiality of the cluster realization of quantum group generators in different families of positive representations.
Submitted 8 February, 2023;
originally announced February 2023.
-
Stochastic Dimension-reduced Second-order Methods for Policy Optimization
Authors:
Jinsong Liu,
Chenghan Xie,
Qi Deng,
Dongdong Ge,
Yinyu Ye
Abstract:
In this paper, we propose several new stochastic second-order algorithms for policy optimization that only require gradient and Hessian-vector product in each iteration, making them computationally efficient and comparable to policy gradient methods. Specifically, we propose a dimension-reduced second-order method (DR-SOPO) which repeatedly solves a projected two-dimensional trust region subproblem. We show that DR-SOPO obtains an $\mathcal{O}(ε^{-3.5})$ complexity for reaching approximate first-order stationary condition and certain subspace second-order stationary condition. In addition, we present an enhanced algorithm (DVR-SOPO) which further improves the complexity to $\mathcal{O}(ε^{-3})$ based on the variance reduction technique. Preliminary experiments show that our proposed algorithms perform favorably compared with stochastic and variance-reduced policy gradient methods.
Submitted 28 January, 2023;
originally announced January 2023.
-
Yaglom's law and conserved quantity dissipation in turbulence
Authors:
Yanqing Wang,
Wei Wei,
Yulin Ye
Abstract:
In this paper, we are concerned with the local exact relationship for third-order structure functions in the temperature equation, the inviscid MHD equations and the Euler equations in the sense of Duchon-Robert type and Eyink type. It is shown that the local version of Yaglom's $4/3$ law is valid for the dissipation rates of conserved quantities such as the energy, cross-helicity and helicity in these systems. In the spirit of Duchon-Robert's classical work, we derive the dissipation term resulting from the lack of smoothness of the solutions in the corresponding conservation relation. These results suggest that Yaglom's law for the hydrodynamic equations holds if an analogue of the Duchon-Robert dissipation term is obtained. Based on this, the first Yaglom's relation for the Oldroyd-B model and, inspired by the very recent work due to Boutros-Titi, six new $4/3$ laws for subgrid scale $α$-models of turbulence are also presented.
Submitted 19 February, 2023; v1 submitted 25 January, 2023;
originally announced January 2023.
-
Solving cubic equations by completing the cube and higher degree equations by completing powers
Authors:
Hua-Lin Huang,
Shengyuan Ruan,
Xiaodan Xu,
Yu Ye
Abstract:
We derive the Cardano formula of cubic equations by completing the cube, and provide radical solutions to some algebraic equations of higher degree by completing powers. The main idea of completing powers arises from Harrison's center theory of higher degree forms. A very simple criterion for such algebraic equations is presented, and the computation amounts to solving linear equations and quadratic equations.
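For readers who want to see the Cardano formula in action, the following is a minimal sketch of the textbook route: shift the variable to remove the quadratic term, then apply Cardano's formula to the depressed cubic. This is the standard derivation and not necessarily the completing-the-cube argument of the paper; the function name `solve_cubic` and the zero tolerance `1e-14` are assumptions made for illustration.

```python
import cmath


def solve_cubic(a, b, c, d):
    """Return the three complex roots of a*x^3 + b*x^2 + c*x + d = 0, a != 0."""
    # Substituting x = t - b/(3a) removes the quadratic term and leaves
    # the depressed cubic t^3 + p*t + q = 0.
    shift = b / (3 * a)
    p = (3 * a * c - b * b) / (3 * a * a)
    q = (2 * b**3 - 9 * a * b * c + 27 * a * a * d) / (27 * a**3)

    # Cardano: write t = u + v with u*v = -p/3, so that
    # u^3 = -q/2 + sqrt(q^2/4 + p^3/27) (or the same with a minus sign).
    disc = cmath.sqrt(q * q / 4 + p**3 / 27)
    u3 = -q / 2 + disc
    if abs(u3) < 1e-14:
        u3 = -q / 2 - disc  # pick the other branch to avoid dividing by zero
    if abs(u3) < 1e-14:
        return [complex(-shift)] * 3  # p = q = 0: a triple root

    u = u3 ** (1 / 3)  # principal complex cube root
    omega = complex(-0.5, 3**0.5 / 2)  # primitive cube root of unity
    roots = []
    for k in range(3):
        uk = u * omega**k
        vk = -p / (3 * uk)
        roots.append(uk + vk - shift)
    return roots


# x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
for r in solve_cubic(1, -6, 11, -6):
    print(r)
```

The example cubic has three real roots, so the square root inside the formula is imaginary and the real roots emerge only after the complex cube roots are combined.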
Submitted 13 March, 2024; v1 submitted 18 January, 2023;
originally announced January 2023.
-
Bismut Einstein metrics on compact complex manifolds
Authors:
Yanan Ye
Abstract:
We observe that, for a Bismut Einstein metric, the (2,0)-part of Bismut Ricci form is an eigenvector of the Chern Laplacian. With the help of this observation, we prove that a Bismut Einstein metric with non-zero Einstein constant is Kähler Einstein. Additionally, for Bismut Einstein metrics with zero Einstein constant, we prove that they are actually Bismut Ricci flat.
Submitted 26 July, 2023; v1 submitted 7 December, 2022;
originally announced December 2022.
-
A Homogeneous Second-Order Descent Method for Nonconvex Optimization
Authors:
Chuwen Zhang,
Dongdong Ge,
Chang He,
Bo Jiang,
Yuntian Jiang,
Chenyu Xue,
Yinyu Ye
Abstract:
In this paper, we introduce a Homogeneous Second-Order Descent Method (HSODM) using the homogenized quadratic approximation to the original function. The merit of homogenization is that only the leftmost eigenvector of a gradient-Hessian integrated matrix is computed at each iteration. Therefore, the algorithm is a single-loop method that does not need to switch to other sophisticated algorithms and is easy to implement. We show that HSODM has a global convergence rate of $O(ε^{-3/2})$ to find an $ε$-approximate second-order stationary point, and has a local quadratic convergence rate under the standard assumptions. The numerical results demonstrate the advantage of the proposed method over other second-order methods.
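The idea of taking a step from the leftmost eigenvector of a gradient-Hessian integrated matrix can be sketched as follows. The abstract does not give the matrix, so the block form `[[H, g], [g^T, -delta]]`, the parameter `delta`, the rescaling `v / t`, and the function name are assumptions made for illustration; the dense `numpy.linalg.eigh` call also stands in for whatever extremal eigenvalue routine a real implementation would use.

```python
import numpy as np


def homogenized_direction(grad, hess, delta=0.0):
    """Direction from the leftmost eigenvector of [[H, g], [g^T, -delta]].

    Returns the direction d and the leftmost eigenvalue lam. Writing the
    eigenvector as [v; t] with t != 0, d = v / t satisfies
    (H - lam * I) d = -g, i.e. a regularized Newton-like step.
    """
    n = grad.shape[0]
    F = np.zeros((n + 1, n + 1))
    F[:n, :n] = hess
    F[:n, n] = grad
    F[n, :n] = grad
    F[n, n] = -delta

    eigvals, eigvecs = np.linalg.eigh(F)  # eigenvalues in ascending order
    lam = eigvals[0]
    v, t = eigvecs[:n, 0], eigvecs[n, 0]
    if abs(t) < 1e-12:
        # Degenerate case; this sketch does not handle it.
        raise ValueError("last eigenvector component is numerically zero")
    return v / t, lam


H = np.diag([1.0, 2.0])
g = np.array([1.0, 1.0])
d, lam = homogenized_direction(g, H)
print(d, lam)
```

Only one extremal eigenpair is needed per step, so no linear system has to be solved explicitly.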
Submitted 6 May, 2024; v1 submitted 15 November, 2022;
originally announced November 2022.
-
A note on the Hurwitz problem and cone spherical metrics
Authors:
Jijian Song,
Bin Xu,
Yu Ye
Abstract:
We are motivated by cone spherical metrics on compact Riemann surfaces of positive genus to solve a special case of the Hurwitz problem. Precisely speaking, letting $d,\,g$ and $\ell$ be three positive integers and $Λ$ be the following collection of $(\ell+2)$ partitions of a positive integer $d$: \[(a_1,\cdots, a_p),\,(b_1,\cdots, b_q),\,(m_1+1,1,\cdots,1),\cdots, (m_{\ell}+1,1,\cdots,1),\] where $(m_1,\cdots, m_{\ell})$ is a partition of $p+q-2+2g$, we prove that there exists a branched cover from some compact Riemann surface of genus $g$ to the Riemann sphere ${\Bbb P}^1$ with branch data $Λ$. An analogue for the genus-zero case was found by the first two authors ({\it Algebra Colloq.} {\bf 27} (2020), no. 2, 231-246), who were stimulated by such metrics on ${\Bbb P}^1$ and conjectured the veracity of the above statement there.
Submitted 6 February, 2024; v1 submitted 18 October, 2022;
originally announced October 2022.
-
SOLNP+: A Derivative-Free Solver for Constrained Nonlinear Optimization
Authors:
Dongdong Ge,
Tianhao Liu,
Jinsong Liu,
Jiyuan Tan,
Yinyu Ye
Abstract:
SOLNP+ is a derivative-free solver for constrained nonlinear optimization. It starts from SOLNP proposed in 1989 by Ye Ye with the main idea that uses finite difference to approximate the gradient. We incorporate the techniques of implicit filtering, new restart mechanism and modern quadratic programming solver into this new version with an ANSI C implementation. The algorithm exhibits a great advantage in running time and robustness under noise compared with the last version by MATLAB. SOLNP+ is free to download at https://github.com/COPT-Public/SOLNP_plus.
Submitted 13 October, 2022;
originally announced October 2022.
-
2-unitary Operads of GK-dimension 3
Authors:
Yan-Hong Bao,
Dong-Xing Fu,
Yu Ye,
James J. Zhang
Abstract:
We study and classify the 2-unitary operads of Gelfand-Kirillov dimension three.
Submitted 7 October, 2022;
originally announced October 2022.
-
Semi-strict chordality of digraphs
Authors:
Jing Huang,
Ying Ying Ye
Abstract:
Chordal graphs are important in algorithmic graph theory. Chordal digraphs are a digraph analogue of chordal graphs and have been a subject of active studies recently. Unlike chordal graphs, chordal digraphs lack many structural properties such as forbidden subdigraph or representation characterizations. In this paper we introduce the notion of semi-strict chordal digraphs which form a class strictly between chordal digraphs and chordal graphs. Semi-strict chordal digraphs have rich structural properties. We characterize semi-strict chordal digraphs in terms of knotting graphs, a notion analogous to the one introduced by Gallai for the study of comparability graphs. We also give forbidden subdigraph characterizations of semi-strict chordal digraphs within the cases of locally semicomplete digraphs and weakly quasi-transitive digraphs.
Submitted 25 November, 2022; v1 submitted 13 September, 2022;
originally announced September 2022.
-
Double square moments and bounds for resonance sums of cusp forms
Authors:
Tim Gillespie,
Praneel Samanta,
Yangbo Ye
Abstract:
Let $f$ and $g$ be holomorphic cusp forms for the modular group $SL_2(\mathbb Z)$ of weight $k_1$ and $k_2$ with Fourier coefficients $λ_f(n)$ and $λ_g(n)$, respectively. For real $α\neq0$ and $0<β\leq1$, consider a smooth resonance sum $S_X(f,g;α,β)$ of $λ_f(n)λ_g(n)$ against $e(αn^β)$ over $X\leq n\leq2X$. Double square moments of $S_X(f,g;α,β)$ over both $f$ and $g$ are nontrivially bounded when their weights $k_1$ and $k_2$ tend to infinity together. By allowing both $f$ and $g$ to move, these double moments are indeed square moments associated with automorphic forms for $GL(4)$. By taking out a small exceptional set of $f$ and $g$, bounds for individual $S_X(f,g;α,β)$ will then be proved. These individual bounds break the resonance barrier of $X^\frac58$ for $\frac16<β<1$ and achieve a square-root cancellation for $\frac13<β<1$ for almost all $f$ and $g$ as an evidence for Hypothesis S for cusp forms over integers. The methods used in this study include Petersson's formula, Poisson's summation formula, and stationary phase integrals.
Submitted 8 September, 2022;
originally announced September 2022.
-
An Enhanced ADMM-based Interior Point Method for Linear and Conic Optimization
Authors:
Qi Deng,
Qing Feng,
Wenzhi Gao,
Dongdong Ge,
Bo Jiang,
Yuntian Jiang,
Jingsong Liu,
Tianhao Liu,
Chenyu Xue,
Yinyu Ye,
Chuwen Zhang
Abstract:
The ADMM-based interior point (ABIP, Lin et al. 2021) method is a hybrid algorithm that effectively combines interior point method (IPM) and first-order methods to achieve a performance boost in large-scale linear optimization. Different from traditional IPM that relies on computationally intensive Newton steps, the ABIP method applies the alternating direction method of multipliers (ADMM) to approximately solve the barrier penalized problem. However, similar to other first-order methods, this technique remains sensitive to condition number and inverse precision. In this paper, we provide an enhanced ABIP method with multiple improvements. Firstly, we develop an ABIP method to solve the general linear conic optimization and establish the associated iteration complexity. Secondly, inspired by some existing methods, we develop different implementation strategies for ABIP method, which substantially improve its performance in linear optimization. Finally, we conduct extensive numerical experiments in both synthetic and real-world datasets to demonstrate the empirical advantage of our developments. In particular, the enhanced ABIP method achieves a 5.8x reduction in the geometric mean of run time on $105$ selected LP instances from Netlib, and it exhibits advantages in certain structured problems such as SVM and PageRank. However, the enhanced ABIP method still falls behind commercial solvers in many benchmarks, especially when high accuracy is desired. We posit that it can serve as a complementary tool alongside well-established solvers.
Submitted 6 April, 2024; v1 submitted 5 September, 2022;
originally announced September 2022.