-
Fractional Brauer configuration algebras I: definitions and examples
Authors:
Nengqun Li,
Yuming Liu
Abstract:
In 2017, Green and Schroll introduced a generalization of Brauer graph algebras which they call Brauer configuration algebras. In the present paper, we further generalize Brauer configuration algebras to fractional Brauer configuration algebras by generalizing Brauer configurations to fractional Brauer configurations. The fractional Brauer configuration algebras are locally bounded but neither finite-dimensional nor symmetric in general. We show that if the fractional Brauer configuration is of type S (resp. of type MS), then the corresponding fractional Brauer configuration algebra is a locally bounded Frobenius algebra (resp. a locally bounded special multiserial Frobenius algebra). Moreover, we show that over an algebraically closed field, the class of finite-dimensional indecomposable representation-finite fractional Brauer configuration algebras in type S coincides with the class of basic indecomposable finite-dimensional standard representation-finite self-injective algebras.
Submitted 30 June, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
-
Equilibrium Selection for Multi-agent Reinforcement Learning: A Unified Framework
Authors:
Runyu Zhang,
Jeff Shamma,
Na Li
Abstract:
While there are numerous works in multi-agent reinforcement learning (MARL), most of them focus on designing algorithms and proving convergence to a Nash equilibrium (NE) or other equilibrium such as coarse correlated equilibrium. However, NEs can be non-unique and their performance varies drastically. Thus, it is important to design algorithms that converge to Nash equilibrium with better rewards or social welfare. In contrast, classical game theory literature has extensively studied equilibrium selection for multi-agent learning in normal-form games, demonstrating that decentralized learning algorithms can asymptotically converge to potential-maximizing or Pareto-optimal NEs. These insights motivate this paper to investigate equilibrium selection in the MARL setting. We focus on the stochastic game model, leveraging classical equilibrium selection results from normal-form games to propose a unified framework for equilibrium selection in stochastic games. The proposed framework is highly modular and can extend various learning rules and their corresponding equilibrium selection results from normal-form games to the stochastic game setting.
Submitted 13 June, 2024;
originally announced June 2024.
-
Policy Optimization in Control: Geometry and Algorithmic Implications
Authors:
Shahriar Talebi,
Yang Zheng,
Spencer Kraisler,
Na Li,
Mehran Mesbahi
Abstract:
This survey explores the geometric perspective on policy optimization within the realm of feedback control systems, emphasizing the intrinsic relationship between control design and optimization. By adopting a geometric viewpoint, we aim to provide a nuanced understanding of how various ``complete parameterizations'' -- referring to the policy parameters together with their Riemannian geometry -- of control design problems influence the stability and performance of local search algorithms. The paper is structured to address key themes such as policy parameterization, the topology and geometry of stabilizing policies, and their implications for various (non-convex) dynamic performance measures. We focus on a few iconic control design problems, including the Linear Quadratic Regulator (LQR), Linear Quadratic Gaussian (LQG) control, and $\mathcal{H}_\infty$ control. In particular, we first discuss the topology and Riemannian geometry of stabilizing policies, distinguishing between their static and dynamic realizations. Expanding on this geometric perspective, we then explore structural properties of the aforementioned performance measures and their interplay with the geometry of stabilizing policies in the presence of policy constraints; along the way, we address issues such as spurious stationary points, symmetries of dynamic feedback policies, and (non-)smoothness of the corresponding performance measures. We conclude the survey with algorithmic implications of policy optimization in feedback design.
Submitted 6 June, 2024;
originally announced June 2024.
-
Convolution Identities of Stirling Numbers
Authors:
Nadia Na Li,
Wenchang Chu
Abstract:
By means of the generating function method, a linear recurrence relation is explicitly resolved. The solution is expressed in terms of the Stirling numbers of both the first and the second kind. Two remarkable pairs of combinatorial identities are established as applications, that contain some well-known convolution formulae on Stirling numbers as special cases.
Submitted 15 April, 2024;
originally announced April 2024.
-
Efficient Duple Perturbation Robustness in Low-rank MDPs
Authors:
Yang Hu,
Haitong Ma,
Bo Dai,
Na Li
Abstract:
The pursuit of robustness has recently been a popular topic in reinforcement learning (RL) research, yet the existing methods generally suffer from efficiency issues that obstruct their real-world implementation. In this paper, we introduce duple perturbation robustness, i.e. perturbation on both the feature and factor vectors for low-rank Markov decision processes (MDPs), via a novel characterization of $(ξ,η)$-ambiguity sets. The novel robust MDP formulation is compatible with the function representation view, and therefore, is naturally applicable to practical RL problems with large or even continuous state-action spaces. Meanwhile, it also gives rise to a provably efficient and practical algorithm with theoretical convergence rate guarantee. Examples are designed to justify the new robustness concept, and algorithmic efficiency is supported by both theoretical bounds and numerical simulations.
Submitted 11 April, 2024;
originally announced April 2024.
-
Multi-Agent Coverage Control with Transient Behavior Consideration
Authors:
Runyu Zhang,
Haitong Ma,
Na Li
Abstract:
This paper studies the multi-agent coverage control (MAC) problem where agents must dynamically learn an unknown density function while performing coverage tasks. Unlike many current theoretical frameworks that concentrate solely on the regret occurring at specific targeted sensory locations, our approach additionally considers the regret caused by transient behavior - the path from one location to another. We propose the multi-agent coverage control with the doubling trick (MAC-DT) algorithm and demonstrate that it achieves (approximated) regret of $\widetilde{O}(\sqrt{T})$ even when accounting for the transient behavior. Our result is also supported by numerical experiments, showcasing that the proposed algorithm manages to match or even outperform the baseline algorithms in simulation environments. We also show how our algorithm can be modified to handle safety constraints and further implement the algorithm on a real-robotic testbed.
Submitted 8 April, 2024;
originally announced April 2024.
-
Competition-Aware Decision-Making Approach for Mobile Robots in Racing Scenarios
Authors:
Kyoungtae Ji,
Sangjae Bae,
Nan Li,
Kyoungseok Han
Abstract:
This paper presents a game-theoretic strategy for racing, where the autonomous ego agent seeks to block a racing opponent that aims to overtake the ego agent. After a library of trajectory candidates and an associated reward matrix are constructed, the optimal trajectory in terms of maximizing the cumulative reward over the planning horizon is determined based on the level-K reasoning framework. In particular, the level of the opponent is estimated online according to its behavior over a past window and is then used to determine the trajectory for the ego agent. Taking into account that the opponent may change its level and strategy during the decision process of the ego agent, we introduce a trajectory mixing strategy that blends the level-K optimal trajectory with a fail-safe trajectory. The overall algorithm was tested and evaluated in various simulated racing scenarios, which also include human-in-the-loop experiments. Comparative analysis against the conventional level-K framework demonstrates the superiority of our proposed approach in terms of overtake-blocking success rates.
Submitted 30 March, 2024;
originally announced April 2024.
-
Data-Driven Predictive Control with Adaptive Disturbance Attenuation for Constrained Systems
Authors:
Nan Li,
Ilya Kolmanovsky,
Hong Chen
Abstract:
In this paper, we propose a novel data-driven predictive control approach for systems subject to time-domain constraints. The approach combines the strengths of H-infinity control for rejecting disturbances and MPC for handling constraints. In particular, the approach can dynamically adapt H-infinity disturbance attenuation performance depending on measured system state and forecasted disturbance level to satisfy constraints. We establish theoretical properties of the approach including robust guarantees of closed-loop stability, disturbance attenuation, constraint satisfaction under noisy data, as well as sufficient conditions for recursive feasibility, and illustrate the approach with a numerical example.
Submitted 21 March, 2024;
originally announced March 2024.
-
TS-RSR: A provably efficient approach for batch bayesian optimization
Authors:
Zhaolin Ren,
Na Li
Abstract:
This paper presents a new approach for batch Bayesian Optimization (BO) called Thompson Sampling-Regret to Sigma Ratio directed sampling (TS-RSR), where we sample a new batch of actions by minimizing a Thompson Sampling approximation of a regret to uncertainty ratio. Our sampling objective is able to coordinate the actions chosen in each batch in a way that minimizes redundancy between points whilst focusing on points with high predictive means or high uncertainty. Theoretically, we provide rigorous convergence guarantees on our algorithm's regret, and numerically, we demonstrate that our method attains state-of-the-art performance on a range of challenging synthetic and realistic test functions, where it outperforms several competitive benchmark batch BO algorithms.
Submitted 2 May, 2024; v1 submitted 7 March, 2024;
originally announced March 2024.
-
Reduction on the congruences of partial sums of P-recursive sequences
Authors:
Qing-Hu Hou,
Na Li
Abstract:
Hou and Liu developed a telescoping method to prove the congruence of partial sums of P-recursive sequences. We relax the requirement on the telescoper and utilize the congruence of the sequence. With this approach, we are able to confirm a conjecture of Sun and find a new congruence on the central trinomial coefficient.
Submitted 21 December, 2023;
originally announced December 2023.
-
Anchored spirals in the driven curvature flow approximation
Authors:
Nan Li,
Arnd Scheel
Abstract:
We study existence, asymptotics, and stability of spiral waves in a driven curvature approximation, supplemented with an anchoring condition on a circle of finite radius. We analyze the motion of curves written as graphs in polar coordinates, finding spiral waves as rigidly rotating shapes. The existence analysis reduces to a planar ODE and asymptotics are given through center manifold expansions. In the limit of a large core, we find rotation frequencies and corrections starting from a problem without curvature corrections. Finally, we demonstrate orbital stability of spiral waves by exploiting a comparison principle inherent to curvature driven flow.
Submitted 14 February, 2024; v1 submitted 12 December, 2023;
originally announced December 2023.
-
Minimum-Time Trajectory Optimization With Data-Based Models: A Linear Programming Approach
Authors:
Nan Li,
Ehsan Taheri,
Ilya Kolmanovsky,
Dimitar Filev
Abstract:
In this paper, we develop a computationally-efficient approach to minimum-time trajectory optimization using input-output data-based models, to produce an end-to-end data-to-control solution to time-optimal planning/control of dynamic systems and hence facilitate their autonomous operation. The approach integrates a non-parametric data-based model for trajectory prediction and a continuous optimization formulation based on an exponential weighting scheme for minimum-time trajectory planning. The optimization problem in its final form is a linear program and is easy to solve. We validate the approach and illustrate its application with a spacecraft relative motion planning problem.
Submitted 9 December, 2023;
originally announced December 2023.
-
MPC-Inspired Reinforcement Learning for Verifiable Model-Free Control
Authors:
Yiwen Lu,
Zishuo Li,
Yihan Zhou,
Na Li,
Yilin Mo
Abstract:
In this paper, we introduce a new class of parameterized controllers, drawing inspiration from Model Predictive Control (MPC). The controller resembles a Quadratic Programming (QP) solver of a linear MPC problem, with the parameters of the controller being trained via Deep Reinforcement Learning (DRL) rather than derived from system models. This approach addresses the limitations of common controllers with Multi-Layer Perceptron (MLP) or other general neural network architecture used in DRL, in terms of verifiability and performance guarantees, and the learned controllers possess verifiable properties like persistent feasibility and asymptotic stability akin to MPC. On the other hand, numerical examples illustrate that the proposed controller empirically matches MPC and MLP controllers in terms of control performance and has superior robustness against modeling uncertainty and noise. Furthermore, the proposed controller is significantly more computationally efficient compared to MPC and requires fewer parameters to learn than MLP controllers. Real-world experiments on a vehicle drift maneuvering task demonstrate the potential of these controllers for robotics and other demanding control tasks.
Submitted 9 April, 2024; v1 submitted 8 December, 2023;
originally announced December 2023.
-
A generalization of Dugas' construction on stable auto-equivalences for symmetric algebras
Authors:
Nengqun Li,
Yuming Liu
Abstract:
We give a unified generalization of Dugas' construction on stable auto-equivalences of Morita type from local symmetric algebras to arbitrary symmetric algebras. For group algebras $kP$ of $p$-groups in characteristic $p$, we recover all the stable auto-equivalences corresponding to endo-trivial modules over $kP$ except when $P$ is generalized quaternion of order $2^m$. Moreover, we give many examples of stable auto-equivalences of Morita type for non-local symmetric algebras.
Submitted 21 October, 2023;
originally announced October 2023.
-
A Comparison between Markov Chain and Koopman Operator Based Data-Driven Modeling of Dynamical Systems
Authors:
Saeid Tafazzol,
Nan Li,
Ilya Kolmanovsky,
Dimitar Filev
Abstract:
Markov chain-based modeling and Koopman operator-based modeling are two popular frameworks for data-driven modeling of dynamical systems. They share notable similarities from a computational and practitioner's perspective, especially for modeling autonomous systems. The first part of this paper aims to elucidate these similarities. For modeling systems with control inputs, the models produced by the two approaches differ. The second part of this paper introduces these models and their corresponding control design methods. We illustrate the two approaches and compare them in terms of model accuracy and computational efficiency for both autonomous and controlled systems in numerical examples.
Submitted 1 April, 2024; v1 submitted 9 October, 2023;
originally announced October 2023.
-
On constructing bent functions from cyclotomic mappings
Authors:
Xi Xie,
Nian Li,
Qiang Wang,
Xiangyong Zeng
Abstract:
We study a new method of constructing Boolean bent functions from cyclotomic mappings. Three generic constructions are obtained by considering different branch functions such as Dillon functions, Niho functions and Kasami functions over multiplicative cosets and additive cosets respectively. As a result, several new explicit infinite families of bent functions and their duals are derived. We demonstrate that some previous constructions are special cases of our simple constructions. In addition, by studying their polynomial forms, we observe that the last construction provides some examples which are EA-inequivalent to five classes of monomials, Dillon type and Niho type polynomials.
Submitted 2 October, 2023;
originally announced October 2023.
-
MQENet: A Mesh Quality Evaluation Neural Network Based on Dynamic Graph Attention
Authors:
Haoxuan Zhang,
Haisheng Li,
Nan Li,
Xiaochuan Wang
Abstract:
With the development of computational fluid dynamics, the requirements for the fluid simulation accuracy in industrial applications have also increased. The quality of the generated mesh directly affects the simulation accuracy. However, previous mesh quality metrics and models cannot evaluate meshes comprehensively and objectively. To this end, we propose MQENet, a structured mesh quality evaluation neural network based on dynamic graph attention. MQENet treats the mesh evaluation task as a graph classification task for classifying the quality of the input structured mesh. To make graphs generated from structured meshes more informative, MQENet introduces two novel structured mesh preprocessing algorithms. These two algorithms can also improve the conversion efficiency of structured mesh data. Experimental results on the benchmark structured mesh dataset NACA-Market show the effectiveness of MQENet in the mesh quality evaluation task.
Submitted 2 September, 2023;
originally announced September 2023.
-
Carbon-Aware Optimal Power Flow
Authors:
Xin Chen,
Andy Sun,
Wenbo Shi,
Na Li
Abstract:
To facilitate effective decarbonization of the electric power sector, this paper introduces the generic Carbon-aware Optimal Power Flow (C-OPF) method for power system decision-making that considers demand-side carbon accounting and emission management. Built upon the classic optimal power flow (OPF) model, the C-OPF method incorporates carbon emission flow equations and constraints, as well as carbon-related objectives, to jointly optimize power flow and carbon flow. In particular, this paper establishes the invertibility of the carbon flow matrix and proposes modeling and linearization techniques to address the issues of undetermined power flow directions and bilinear terms in the C-OPF model. Additionally, two novel carbon emission models, together with the carbon accounting schemes, for energy storage systems are developed and integrated into the C-OPF model. Numerical simulations demonstrate the characteristics and effectiveness of the C-OPF method, in comparison with OPF solutions.
Submitted 6 August, 2023;
originally announced August 2023.
-
Soft Robust MDPs and Risk-Sensitive MDPs: Equivalence, Policy Gradient, and Sample Complexity
Authors:
Runyu Zhang,
Yang Hu,
Na Li
Abstract:
Robust Markov Decision Processes (MDPs) and risk-sensitive MDPs are both powerful tools for making decisions in the presence of uncertainties. Previous efforts have aimed to establish their connections, revealing equivalences in specific formulations. This paper introduces a new formulation for risk-sensitive MDPs, which assesses risk in a slightly different manner compared to the classical Markov risk measure (Ruszczyński 2010), and establishes its equivalence with a class of soft robust MDP (RMDP) problems, including the standard RMDP as a special case. Leveraging this equivalence, we further derive the policy gradient theorem for both problems, proving gradient domination and global convergence of the exact policy gradient method under the tabular setting with direct parameterization. This forms a sharp contrast to the Markov risk measure, known to be potentially non-gradient-dominant (Huang et al. 2021). We also propose a sample-based offline learning algorithm, namely the robust fitted-Z iteration (RFZI), for a specific soft RMDP problem with a KL-divergence regularization term (or equivalently the risk-sensitive MDP with an entropy risk measure). We showcase its streamlined design and less stringent assumptions due to the equivalence and analyze its sample complexity.
Submitted 24 May, 2024; v1 submitted 20 June, 2023;
originally announced June 2023.
-
Non-asymptotic System Identification for Linear Systems with Nonlinear Policies
Authors:
Yingying Li,
Tianpeng Zhang,
Subhro Das,
Jeff Shamma,
Na Li
Abstract:
This paper considers a single-trajectory system identification problem for linear systems under general nonlinear and/or time-varying policies with i.i.d. random excitation noises. The problem is motivated by safe learning-based control for constrained linear systems, where the safe policies during the learning process are usually nonlinear and time-varying for satisfying the state and input constraints. In this paper, we provide a non-asymptotic error bound for least square estimation when the data trajectory is generated by any nonlinear and/or time-varying policies as long as the generated state and action trajectories are bounded. This significantly generalizes the existing non-asymptotic guarantees for linear system identification, which usually consider i.i.d. random inputs or linear policies. Interestingly, our error bound is consistent with that for linear policies with respect to the dependence on the trajectory length, system dimensions, and excitation levels. Lastly, we demonstrate the applications of our results by safe learning with robust model predictive control and provide numerical analysis.
Submitted 17 June, 2023;
originally announced June 2023.
-
Quantitative estimates on the $C^2$-singular sets in Alexandrov spaces
Authors:
Nan Li
Abstract:
The total disaster may be controllable if not preventable. We will explore this phenomenon for singularities in metric spaces. A point in an $n$-dimensional Alexandrov space is called regular if its tangent cone is isometric to $\mathbb R^n$. Examples show that not every regular point is smooth, and the non-smooth points, away from the boundary, can have co-dimension 1. In this paper, we define a non-negative function $\mathcal K(x)$, which quantitatively measures the extent of the point $x$ from being $C^2$. The so-called $C^2$-singular points are identified as the set where $\mathcal K>0$. We show that $\int_{B_r(p)} \mathcal K(x)\, \operatorname d\mathcal H^{n-1}\le c(n,κ,ν)r^{n-2}$ for any $n$-dimensional Alexandrov space $(X,p)$ with curv $\ge κ$ and $\operatorname{Vol}\left(B_1(p)\right)\geν>0$. This leads to the Hausdorff dimension estimate $\dim_\mathcal H\{\mathcal K>0\}\le n-1$, and the quantitative Hausdorff measure estimate $\mathcal H^{n-1}\left(\{\mathcal K>ε\}\cap B_r(p)\right)\le ε^{-1}\cdot c(n,ν)r^{n-2}$. These results also make progress on Naber's conjecture on the convergence of curvature measures.
The measure $\mathcal K(x)\, \operatorname d\mathcal H^{n-1}$ on Alexandrov spaces can be viewed as the counterpart of the curvature measure $scal \,\operatorname d {vol}_{g}$ on smooth manifolds. We also show that if $n$-dimensional Alexandrov spaces $X_i$ Gromov-Hausdorff converge to a smooth manifold with no boundary without collapsing, then $\mathcal K_i\, \operatorname d\mathcal H^{n-1}\to 0$ as a measure.
Submitted 26 July, 2023; v1 submitted 5 June, 2023;
originally announced June 2023.
-
Two-step Newton's method for deflation-one singular zeros of analytic systems
Authors:
Kisun Lee,
Nan Li,
Lihong Zhi
Abstract:
We propose a two-step Newton's method for refining an approximation of a singular zero whose deflation process terminates after one step, also known as a deflation-one singularity. Given an isolated singular zero of a square analytic system, our algorithm exploits an invertible linear operator obtained by combining the Jacobian and a projection of the Hessian in the direction of the kernel of the Jacobian. We prove the quadratic convergence of the two-step Newton method when it is applied to an approximation of a deflation-one singular zero. Also, the algorithm requires a smaller size of matrices than the existing methods, making it more efficient. We demonstrate examples and experiments to show the efficiency of the method.
Submitted 24 January, 2024; v1 submitted 18 May, 2023;
originally announced May 2023.
-
Some new curious congruences involving multiple harmonic sums
Authors:
Rong Ma,
Ni Li
Abstract:
It is significant to study congruences involving multiple harmonic sums. Let $p$ be an odd prime. In recent years, the following curious congruence $$\sum_{\substack{i+j+k=p \\ i, j, k>0}} \frac{1}{i j k} \equiv-2 B_{p-3}\pmod p$$ has been generalized along different directions, where $B_n$ denotes the $n$th Bernoulli number. In this paper, we obtain several new generalizations of the above congruence by applying congruences involving multiple harmonic sums. For example, we have $$\sum_{\substack{k_1+k_2+\cdots+k_n=p \\ k_i> 0, 1 \le i \le n}} \dfrac{(-1)^{k_1}\left(\dfrac{k_1}{3}\right)}{k_1 \cdots k_n} \equiv \dfrac{(n-1)!}{n}\dfrac{2^{n-1}+1}{3\cdot6^{n-1}}B_{p-n}\left(\dfrac{1}{3}\right)\pmod p,$$ where $n$ is even and $B_n(x)$ denotes the Bernoulli polynomials.
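The first congruence quoted in this abstract can be checked numerically for small primes. The following is a minimal sketch, not taken from the paper: the function names `bernoulli`, `lhs`, and `rhs` are hypothetical, the Bernoulli numbers come from the standard recurrence, and modular inverses use Python 3.8+ `pow(x, -1, p)`.

```python
from fractions import Fraction
from math import comb


def bernoulli(n):
    """Return the n-th Bernoulli number as an exact Fraction (B_1 = -1/2)."""
    B = [Fraction(0)] * (n + 1)
    B[0] = Fraction(1)
    for m in range(1, n + 1):
        # Standard recurrence: sum_{k=0}^{m} C(m+1, k) * B_k = 0 for m >= 1.
        B[m] = -sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1)
    return B[n]


def lhs(p):
    """Sum of 1/(i*j*k) over i + j + k = p with i, j, k > 0, reduced mod p."""
    total = 0
    for i in range(1, p - 1):
        for j in range(1, p - i):
            k = p - i - j  # k >= 1 by the loop bounds
            total += pow(i * j * k, -1, p)  # modular inverse of i*j*k
    return total % p


def rhs(p):
    """-2 * B_{p-3} reduced mod p (the denominator is invertible mod p)."""
    b = bernoulli(p - 3)
    return (-2 * b.numerator * pow(b.denominator, -1, p)) % p


if __name__ == "__main__":
    for p in (5, 7, 11, 13):
        print(p, lhs(p), rhs(p))
```

For each prime in the loop the two printed residues should agree. The brute-force sum costs on the order of $p^2$ operations, so this is only meant as a sanity check for small $p$.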
Submitted 13 May, 2023;
originally announced May 2023.
-
Policy Iteration Reinforcement Learning Method for Continuous-time Mean-Field Linear-Quadratic Optimal Problem
Authors:
Na Li,
Xun Li,
Zuo Quan Xu
Abstract:
This paper employs a policy iteration reinforcement learning (RL) method to study continuous-time linear quadratic mean-field control problems in the infinite horizon. The drift and diffusion terms in the dynamics involve the state as well as the control. We investigate the stability and convergence of the RL algorithm using a Lyapunov Recursion. Instead of solving a pair of coupled Riccati equations, the RL technique focuses on strengthening an auxiliary function and the cost functional as the objective functions and updating the new policy to compute the optimal control via state trajectories. A numerical example sheds light on the established theoretical results.
Submitted 29 April, 2024; v1 submitted 30 April, 2023;
originally announced May 2023.
-
A Unified Safety Protection and Extension Governor
Authors:
Nan Li,
Yutong Li,
Ilya Kolmanovsky
Abstract:
In this paper, we propose a supervisory control scheme that unifies the abilities of safety protection and safety extension. It produces a control that is able to keep the system safe indefinitely when such a control exists. When such a control does not exist due to abnormal system states, it optimizes the control to maximize the time before any safety violation, which translates into more time to seek recovery and/or mitigate any harm. We describe the scheme and develop an approach that integrates the two capabilities into a single constrained optimization problem with only continuous variables. For linear systems with convex constraints, the problem reduces to a convex quadratic program and is easy to solve. We illustrate the proposed safety supervisor with an automotive example.
Submitted 17 April, 2023;
originally announced April 2023.
-
Stochastic Nonlinear Control via Finite-dimensional Spectral Dynamic Embedding
Authors:
Tongzheng Ren,
Zhaolin Ren,
Haitong Ma,
Na Li,
Bo Dai
Abstract:
This paper presents an approach, Spectral Dynamics Embedding Control (SDEC), to optimal control for nonlinear stochastic systems. This method leverages an infinite-dimensional feature to linearly represent the state-action value function and exploits finite-dimensional truncation approximation for practical implementation. To characterize the effectiveness of these finite dimensional approximations, we provide an in-depth theoretical analysis to characterize the approximation error induced by the finite-dimension truncation and statistical error induced by finite-sample approximation in both policy evaluation and policy optimization. Our analysis includes two prominent kernel approximation methods: truncations onto random features and Nystrom features. We also empirically test the algorithm and compare the performance with Koopman-based, iLQR, and energy-based methods on a few benchmark problems.
Submitted 20 December, 2023; v1 submitted 8 April, 2023;
originally announced April 2023.
-
Markov Games with Decoupled Dynamics: Price of Anarchy and Sample Complexity
Authors:
Runyu Zhang,
Yuyang Zhang,
Rohit Konda,
Bryce Ferguson,
Jason Marden,
Na Li
Abstract:
This paper studies the finite-time horizon Markov games where the agents' dynamics are decoupled but the rewards can possibly be coupled across agents. The policy class is restricted to local policies where agents make decisions using their local state. We first introduce the notion of smooth Markov games which extends the smoothness argument for normal form games to our setting, and leverage the smoothness property to bound the price of anarchy of the Markov game. For a specific type of Markov game called the Markov potential game, we also develop a distributed learning algorithm, multi-agent soft policy iteration (MA-SPI), which provably converges to a Nash equilibrium. Sample complexity of the algorithm is also provided. Lastly, our results are validated using a dynamic covering game.
Submitted 7 April, 2023;
originally announced April 2023.
-
On the Relationship of Optimal State Feedback and Disturbance Response Controllers
Authors:
Runyu Zhang,
Yang Zheng,
Weiyu Li,
Na Li
Abstract:
This paper studies the relationship between state feedback policies and disturbance response policies for the standard Linear Quadratic Regulator (LQR). For open-loop stable plants, we establish a simple relationship between the optimal state feedback controller $u_t=K_\star x_t$ and the optimal disturbance response controller $u_t=L^{(H)}_{\star;1}w_{t-1}+\cdots+L^{(H)}_{\star;H}w_{t-H}$ of order $H$. Here $x_t, w_t, u_t$ stand for the state, disturbance, and control action of the system, respectively. Our result shows that $L^{(H)}_{\star;1}$ is a good approximation of $K_\star$ and the approximation error $\|K_\star - L^{(H)}_{\star;1}\|$ decays exponentially with $H$. We further extend this result to LQR for open-loop unstable systems, when a pre-stabilizing controller $K_0$ is available.
Submitted 7 April, 2023;
originally announced April 2023.
-
Decentralized Riemannian natural gradient methods with Kronecker-product approximations
Authors:
Jiang Hu,
Kangkang Deng,
Na Li,
Quanzheng Li
Abstract:
With a computationally efficient approximation of the second-order information, natural gradient methods have been successful in solving large-scale structured optimization problems. We study the natural gradient methods for the large-scale decentralized optimization problems on Riemannian manifolds, where the local objective function defined by the local dataset is of a log-probability type. By utilizing the structure of the Riemannian Fisher information matrix (RFIM), we present an efficient decentralized Riemannian natural gradient descent (DRNGD) method. To overcome the communication issue of the high-dimension RFIM, we consider a class of structured problems for which the RFIM can be approximated by a Kronecker product of two low-dimension matrices. By performing the communications over the Kronecker factors, a high-quality approximation of the RFIM can be obtained in a low cost. We prove that DRNGD converges to a stationary point with the best-known rate of $\mathcal{O}(1/K)$. Numerical experiments demonstrate the efficiency of our proposed method compared with the state-of-the-art ones. To the best of our knowledge, this is the first Riemannian second-order method for solving decentralized manifold optimization problems.
Submitted 16 March, 2023;
originally announced March 2023.
-
Continuous-Time Zeroth-Order Dynamics with Projection Maps: Model-Free Feedback Optimization with Safety Guarantees
Authors:
Xin Chen,
Jorge I. Poveda,
Na Li
Abstract:
This paper introduces a class of model-free feedback methods for solving generic constrained optimization problems where the specific mathematical forms of the objective and constraint functions are not available. The proposed methods, termed Projected Zeroth-Order (P-ZO) dynamics, incorporate projection maps into a class of continuous-time model-free dynamics that make use of periodic dithering for the purpose of gradient learning. In particular, the proposed P-ZO algorithms can be interpreted as new extremum-seeking algorithms that autonomously drive an unknown system toward a neighborhood of the set of solutions of an optimization problem using only output feedback, while systematically guaranteeing that the input trajectories remain in a feasible set for all times. In this way, the P-ZO algorithms can properly handle hard and asymptotical constraints in model-free optimization problems without using penalty terms or barrier functions. Moreover, the proposed dynamics have suitable robustness properties with respect to small bounded additive disturbances on the states and dynamics, a property that is fundamental for practical real-world implementations. Additional tracking results for time-varying and switching cost functions are also derived under stronger convexity and smoothness assumptions and using tools from hybrid dynamical systems. Numerical examples are presented throughout the paper to illustrate the above results.
Submitted 13 March, 2023;
originally announced March 2023.
-
Convergence of a quantum lattice Boltzmann scheme to the nonlinear Dirac equation for Gross-Neveu model in $1+1$ dimensions
Authors:
Ningning Li,
Jing Zhang,
Yongqian Zhang
Abstract:
This paper studies the quantum lattice Boltzmann scheme for the nonlinear Dirac equations for Gross-Neveu model in $1+1$ dimensions. The initial data for the scheme are assumed to be convergent in $L^2$. Then for any $T\ge 0$ the corresponding solutions for the quantum lattice Boltzmann scheme are shown to be convergent in $C([0,T];L^2(R^1))$ to the strong solution to the nonlinear Dirac equations as the mesh sizes converge to zero. In the proof, at first a Glimm type functional is introduced to establish the stability estimates for the difference between two solutions for the corresponding quantum lattice Boltzmann scheme, which leads to the compactness of the set of the solutions for the quantum lattice Boltzmann scheme. Finally, the limit of any convergent subsequence of the solutions for the quantum lattice Boltzmann scheme is shown to coincide with the strong solution to a Cauchy problem for the nonlinear Dirac equations.
Submitted 1 February, 2023;
originally announced February 2023.
-
Online switching control with stability and regret guarantees
Authors:
Yingying Li,
James A. Preiss,
Na Li,
Yiheng Lin,
Adam Wierman,
Jeff Shamma
Abstract:
This paper considers online switching control with a finite candidate controller pool, an unknown dynamical system, and unknown cost functions. The candidate controllers can be unstabilizing policies. We only require at least one candidate controller to satisfy certain stability properties, but we do not know which one is stabilizing. We design an online algorithm that guarantees finite-gain stability throughout the duration of its execution. We also provide a sublinear policy regret guarantee compared with the optimal stabilizing candidate controller. Lastly, we numerically test our algorithm on quadrotor planar flights and compare it with a classical switching control algorithm, falsification-based switching, and a classical multi-armed bandit algorithm, Exp3 with batches.
Submitted 23 January, 2023; v1 submitted 20 January, 2023;
originally announced January 2023.
-
Counting compatible indexing systems for $C_{p^n}$
Authors:
Michael A. Hill,
Jiayun Meng,
Nan Li
Abstract:
We count the number of compatible pairs of indexing systems for the cyclic group $C_{p^n}$. Building on work of Balchin--Barnes--Roitzheim, we show that this sequence of natural numbers is another family of Fuss--Catalan numbers. We count this two different ways: showing how the conditions of compatibility give natural recursive formulas for the number of admissible sets and using an enumeration of ways to extend indexing systems by conceptually simpler pieces.
Submitted 21 December, 2022;
originally announced December 2022.
-
Fractal dimensions for Iterated Graph Systems
Authors:
Nero Ziyu Li
Abstract:
Building upon [1], this study aims to introduce fractal geometry into graph theory, and to establish a potential theoretical foundation for complex networks. Specifically, we employ the method of substitution to create and explore fractal-like graphs, termed deterministic or random iterated graph systems. While the concept of substitution is commonplace in fractal geometry and dynamical systems, its analysis in the context of graph theory remains a nascent field.
By delving into the properties of these systems, including diameter and distal, we derive two primary outcomes. Firstly, within the deterministic iterated graph systems, we establish that the Minkowski dimension and Hausdorff dimension align analytically through explicit formulae. Secondly, in the case of random iterated graph systems, we demonstrate that almost every graph limit exhibits identical Minkowski and Hausdorff dimensions numerically by their Lyapunov exponents.
The exploration of iterated graph systems holds the potential to unveil novel directions. These findings not only, mathematically, contribute to our understanding of the interplay between fractals and graphs, but also, physically, suggest promising avenues for applications for complex networks.
Submitted 28 May, 2024; v1 submitted 4 December, 2022;
originally announced December 2022.
-
On Controller Reduction in Linear Quadratic Gaussian Control with Performance Bounds
Authors:
Zhaolin Ren,
Yang Zheng,
Maryam Fazel,
Na Li
Abstract:
The problem of controller reduction has a rich history in control theory. Yet, many questions remain open. In particular, there exist very few results on the order reduction of general non-observer based controllers and the subsequent quantification of the closed-loop performance. Recent developments in model-free policy optimization for Linear Quadratic Gaussian (LQG) control have highlighted the importance of this question. In this paper, we first propose a new set of sufficient conditions ensuring that a perturbed controller remains internally stabilizing. Based on this result, we illustrate how to perform order reduction of general non-observer based controllers using balanced truncation and modal truncation. We also provide explicit bounds on the LQG performance of the reduced-order controller. Furthermore, for single-input-single-output (SISO) systems, we introduce a new controller reduction technique by truncating unstable modes. We illustrate our theoretical results with numerical simulations. Our results will serve as valuable tools to design direct policy search algorithms for control problems with partial observations.
Submitted 29 November, 2022;
originally announced November 2022.
-
Some congruences involving generalized Bernoulli numbers and Bernoulli polynomials
Authors:
Ni Li,
Rong Ma
Abstract:
Let $[x]$ be the integral part of $x$, $n>1$ be a positive integer and $χ_n$ denote the trivial Dirichlet character modulo $n$. In this paper, we use an identity established by Z. H. Sun to get congruences of $T_{m,k}(n)=\sum_{x=1}^{[n/m]}\frac{χ_n(x)}{x^k}\left(\bmod n^{r+1}\right)$ for $r\in \{1,2\}$ and any positive integer $m$ with $n \equiv \pm 1 \left(\bmod m \right)$ in terms of Bernoulli polynomials. As an application, we also obtain some new congruences involving binomial coefficients modulo $n^4$ in terms of generalized Bernoulli numbers.
Submitted 28 November, 2022;
originally announced November 2022.
-
Safe Control and Learning Using Generalized Action Governor
Authors:
Nan Li,
Yutong Li,
Ilya Kolmanovsky,
Anouck Girard,
H. Eric Tseng,
Dimitar Filev
Abstract:
This paper introduces the Generalized Action Governor, which is a supervisory scheme for augmenting a nominal closed-loop system with the capability of strictly handling constraints. After presenting its theory for general systems and introducing tailored design approaches for linear and discrete systems, we discuss its application to safe online learning, which aims to safely evolve control parameters using real-time data to improve performance for uncertain systems. In particular, we propose two safe learning algorithms based on integration of reinforcement learning/data-driven Koopman operator-based control with the generalized action governor. The developments are illustrated with a numerical example.
Submitted 22 November, 2022;
originally announced November 2022.
-
Learning to Optimize with Stochastic Dominance Constraints
Authors:
Hanjun Dai,
Yuan Xue,
Niao He,
Bethany Wang,
Na Li,
Dale Schuurmans,
Bo Dai
Abstract:
In real-world decision-making, uncertainty is important yet difficult to handle. Stochastic dominance provides a theoretically sound approach for comparing uncertain quantities, but optimization with stochastic dominance constraints is often computationally expensive, which limits practical applicability. In this paper, we develop a simple yet efficient approach for the problem, the Light Stochastic Dominance Solver (light-SD), that leverages useful properties of the Lagrangian. We recast the inner optimization in the Lagrangian as a learning problem for surrogate approximation, which bypasses apparent intractability and leads to tractable updates or even closed-form solutions for gradient calculations. We prove convergence of the algorithm and test it empirically. The proposed light-SD demonstrates superior performance on several representative problems ranging from finance to supply chain management.
Submitted 24 February, 2023; v1 submitted 14 November, 2022;
originally announced November 2022.
-
The integral closure of a primary ideal is not always primary
Authors:
Nan Li,
Zijia Li,
Zhi-Hong Yang,
Lihong Zhi
Abstract:
In 1936, Krull asked if the integral closure of a primary ideal is still primary. Fifty years later, Huneke partially answered this question by giving a primary polynomial ideal whose integral closure is not primary in a regular local ring of characteristic $p=2$. We provide counterexamples to Krull's question regarding polynomial rings with any characteristics. We also find that the Jacobian ideal $J$ of the polynomial $f = x^6 + y^6 + x^4 z t + z^3$ given by Briançon and Speder in 1975 is a counterexample to Krull's question. Let $V_1$ be the hypersurface defined by $f = 0$ and $V_2$ be its singular locus. Briançon and Speder proved that Whitney equisingularity does not imply Zariski equisingularity by showing that the pair $(V_1 \setminus V_2,\ V_2)$ satisfies Whitney's conditions around the origin but fails Zariski's equisingular conditions. We discover that the pair $(V_1 \setminus V_2,\ V_2)$ fails Whitney's conditions at the variety of the embedded prime of the integral closure $\bar{J}$, which means that $V_1$ is not Whitney regular along $V_2$. Moreover, we also show that Whitney stratification of this hypersurface is different from the stratification of isosingular sets given by Hauenstein and Wampler, which is related to Thom-Boardman singularity.
Submitted 16 November, 2022; v1 submitted 30 October, 2022;
originally announced October 2022.
-
Towards a Theoretical Foundation of Policy Optimization for Learning Control Policies
Authors:
Bin Hu,
Kaiqing Zhang,
Na Li,
Mehran Mesbahi,
Maryam Fazel,
Tamer Başar
Abstract:
Gradient-based methods have been widely used for system design and optimization in diverse application domains. Recently, there has been a renewed interest in studying theoretical properties of these methods in the context of control and reinforcement learning. This article surveys some of the recent developments on policy optimization, a gradient-based iterative approach for feedback control synthesis, popularized by successes of reinforcement learning. We take an interdisciplinary perspective in our exposition that connects control theory, reinforcement learning, and large-scale optimization. We review a number of recently-developed theoretical results on the optimization landscape, global convergence, and sample complexity of gradient-based methods for various continuous control problems such as the linear quadratic regulator (LQR), $\mathcal{H}_\infty$ control, risk-sensitive control, linear quadratic Gaussian (LQG) control, and output feedback synthesis. In conjunction with these optimization results, we also discuss how direct policy optimization handles stability and robustness concerns in learning-based control, two main desiderata in control engineering. We conclude the survey by pointing out several challenges and opportunities at the intersection of learning and control.
Submitted 10 October, 2022;
originally announced October 2022.
-
Data-driven policy iteration algorithm for continuous-time stochastic linear-quadratic optimal control problems
Authors:
Heng Zhang,
Na Li
Abstract:
This paper studies a continuous-time stochastic linear-quadratic (SLQ) optimal control problem on infinite-horizon. A data-driven policy iteration algorithm is proposed to solve the SLQ problem. Without knowing three system coefficient matrices, this algorithm uses the collected data to iteratively approximate a solution of the corresponding stochastic algebraic Riccati equation (SARE). A simulation example is provided to illustrate the effectiveness and applicability of the algorithm.
Submitted 28 September, 2022;
originally announced September 2022.
-
On the Optimal Control of Network LQR with Spatially-Exponential Decaying Structure
Authors:
Runyu Zhang,
Weiyu Li,
Na Li
Abstract:
This paper studies network LQR problems with system matrices being spatially-exponential decaying (SED) between nodes in the network. The major objective is to study whether the optimal controller also enjoys a SED structure, which is an appealing property for ensuring the optimality of decentralized control over the network. We start with studying the open-loop asymptotically stable system and show that the optimal LQR state feedback gain $K$ is `quasi'-SED in this setting, i.e. $\|[K]_{ij}\|\sim O\left(e^{-\frac{c}{\mathrm{poly}\ln(N)}\mathrm{dist}(i,j)}\right)$. The decaying rate $c$ depends on the decaying rate and norms of system matrices and the open-loop exponential stability constants. Then the result is further generalized to unstable systems under a SED stabilizability assumption. Building upon the `quasi'-SED result on $K$, we give an upper-bound on the performance of $κ$-truncated local controllers, suggesting that distributed controllers can achieve near-optimal performance for SED systems. We develop these results via studying the structure of another type of controller, disturbance response control, which has been studied and used in recent online control literature; thus as a side result, we also prove the `quasi'-SED property of the optimal disturbance response control, which serves as a contribution on its own merit.
Submitted 8 May, 2023; v1 submitted 28 September, 2022;
originally announced September 2022.
-
Escaping saddle points in zeroth-order optimization: the power of two-point estimators
Authors:
Zhaolin Ren,
Yujie Tang,
Na Li
Abstract:
Two-point zeroth-order methods are important in many applications of zeroth-order optimization, such as robotics, wind farms, power systems, online optimization, and adversarial robustness to black-box attacks in deep neural networks, where the problem may be high-dimensional and/or time-varying. Most problems in these applications are nonconvex and contain saddle points. While existing works have shown that zeroth-order methods utilizing $Ω(d)$ function evaluations per iteration (with $d$ denoting the problem dimension) can escape saddle points efficiently, it remains an open question whether zeroth-order methods based on two-point estimators can escape saddle points. In this paper, we show that by adding an appropriate isotropic perturbation at each iteration, a zeroth-order algorithm based on $2m$ (for any $1 \leq m \leq d$) function evaluations per iteration can not only find $ε$-second order stationary points polynomially fast, but do so using only $\tilde{O}\left(\frac{d}{mε^{2}\bar{ψ}}\right)$ function evaluations, where $\bar{ψ} \geq \tilde{Ω}\left(\sqrt{ε}\right)$ is a parameter capturing the extent to which the function of interest exhibits the strict saddle property.
Submitted 8 May, 2023; v1 submitted 27 September, 2022;
originally announced September 2022.
-
Kähler Finsler manifolds with curvatures bounded from below
Authors:
Bin Chen,
Nan Li,
Siwei Liu
Abstract:
We obtain a partial parallelism of the complex structure on Kähler Finsler manifolds. As applications, we prove the Synge-Tsukamoto theorem and the Bonnet-Myers theorem for positively curved Kähler Finsler manifolds. Moreover, we generalize a comparison theorem due to Ni-Zheng by introducing the notion of orthogonal Ricci curvature to Kähler Finsler geometry.
Submitted 23 December, 2022; v1 submitted 1 July, 2022;
originally announced July 2022.
-
Asymptotic Geometry of the Moduli Space of Rank Two Irregular Higgs Bundles over the Projective Line
Authors:
Gao Chen,
Nianzi Li
Abstract:
We study the asymptotic behavior of Hitchin's hyperkähler metric on the moduli space of rank two irregular Higgs bundles over $\mathbb{C}P^1$. Along a generic curve, we prove that the Hitchin metric is asymptotic to the semiflat metric at an arbitrary polynomial order. When there are no weakly parabolic singularities, the rate is exponential. In the case of four-dimensional moduli spaces, we prove that the semiflat metric is asymptotic to an ALG/ALG$^\ast$ model metric.
Submitted 3 January, 2024; v1 submitted 23 June, 2022;
originally announced June 2022.
-
Model-Free Feedback Constrained Optimization Via Projected Primal-Dual Zeroth-Order Dynamics
Authors:
Xin Chen,
Jorge I. Poveda,
Na Li
Abstract:
In this paper, we propose a model-free feedback solution method to solve generic constrained optimization problems, without knowing the specific formulations of the objective and constraint functions. This solution method is termed projected primal-dual zeroth-order dynamics (P-PDZD) and is developed based on projected primal-dual gradient dynamics and extremum seeking control. In particular, the P-PDZD method can be interpreted as a model-free controller that autonomously drives an unknown system to the solution of the optimization problem using only output feedback. The P-PDZD can properly handle both hard and asymptotic constraints, and we develop a decentralized version of P-PDZD for multi-agent systems. Moreover, we prove that the P-PDZD achieves semi-global practical asymptotic stability and structural robustness. We then apply the decentralized P-PDZD to the optimal voltage control problem in power distribution systems with square probing signals, and the simulation results verify the optimality, robustness, and adaptivity of the P-PDZD method.
Submitted 22 June, 2022;
originally announced June 2022.
-
Policy Optimization for Markov Games: Unified Framework and Faster Convergence
Authors:
Runyu Zhang,
Qinghua Liu,
Huan Wang,
Caiming Xiong,
Na Li,
Yu Bai
Abstract:
This paper studies policy optimization algorithms for multi-agent reinforcement learning. We begin by proposing an algorithm framework for two-player zero-sum Markov Games in the full-information setting, where each iteration consists of a policy update step at each state using a certain matrix game algorithm, and a value update step with a certain learning rate. This framework unifies many existing and new policy optimization algorithms. We show that the state-wise average policy of this algorithm converges to an approximate Nash equilibrium (NE) of the game, as long as the matrix game algorithms achieve low weighted regret at each state, with respect to weights determined by the speed of the value updates. Next, we show that this framework instantiated with the Optimistic Follow-The-Regularized-Leader (OFTRL) algorithm at each state (and smooth value updates) can find an $\mathcal{\widetilde{O}}(T^{-5/6})$ approximate NE in $T$ iterations, and a similar algorithm with slightly modified value update rule achieves a faster $\mathcal{\widetilde{O}}(T^{-1})$ convergence rate. These improve over the current best $\mathcal{\widetilde{O}}(T^{-1/2})$ rate of symmetric policy optimization type algorithms. We also extend this algorithm to multi-player general-sum Markov Games and show an $\mathcal{\widetilde{O}}(T^{-3/4})$ convergence rate to Coarse Correlated Equilibria (CCE). Finally, we provide a numerical example to verify our theory and investigate the importance of smooth value updates, and find that using "eager" value updates instead (equivalent to the independent natural policy gradient algorithm) may significantly slow down the convergence, even on a simple game with $H=2$ layers.
Submitted 22 July, 2022; v1 submitted 6 June, 2022;
originally announced June 2022.
-
On value distribution of certain delay-differential polynomials
Authors:
Nan Li,
Lianzhong Yang
Abstract:
Given an entire function $f$ of finite order $ρ$, let $L(z,f)=\sum_{j=0}^{m}b_{j}(z)f^{(k_{j})}(z+c_{j})$ be a linear delay-differential polynomial of $f$ with small coefficients in the sense of $O(r^{λ+\varepsilon})+S(r,f)$, $λ<ρ$. Letting $α$ and $β$ be similar small functions, we consider the zero distribution of $L(z,f)-αf^{n}-β$ for $n\geq 3$ and $n=2$, respectively. Our results improve and complement those of Chen (Abstract Appl. Anal., 2011, 2011: ID239853, 1--9) and Laine (J. Math. Anal. Appl., 2019, 469(2): 808--826), among others.
Submitted 15 May, 2022;
originally announced May 2022.
-
Restarted randomized surrounding methods for solving large linear equations
Authors:
Junfeng Yin,
Nan Li,
Ning Zheng
Abstract:
A class of restarted randomized surrounding methods is presented to accelerate the surrounding algorithms by restart techniques for solving linear equations. Theoretical analysis proves that the proposed method converges under the randomized row selection rule, and the convergence rate in expectation is also addressed. Numerical experiments further demonstrate that the proposed algorithms are efficient and outperform the existing method for over-determined and under-determined linear equations, as well as in an image processing application.
Submitted 8 July, 2022; v1 submitted 3 May, 2022;
originally announced May 2022.
-
A Survey on Distributed Online Optimization and Game
Authors:
Xiuxian Li,
Lihua Xie,
Na Li
Abstract:
Distributed online optimization and game have been increasingly researched in the last decade, mostly motivated by their wide applications in sensor networks, robotics (e.g., distributed target tracking and formation control), smart grids, deep learning, and so forth. In these problems, there is a network of agents who may be cooperative (i.e., distributed online optimization) or noncooperative (i.e., online game) through local information exchanges. Moreover, the local cost function of each agent is often time-varying in dynamic and even adversarial environments. At each time, a decision must be made by each agent based on the historical information at hand, without knowing future information on cost functions. For these problems, a comprehensive survey is still lacking. This paper aims to provide a thorough overview of distributed online optimization and game from the perspective of problem settings, communication, computation, algorithms, and performance. In addition, some potential future directions are also discussed.
Submitted 21 January, 2023; v1 submitted 1 May, 2022;
originally announced May 2022.