-
A Poincaré-Birkhoff-Witt theorem for the universal enveloping algebra of a Rota-Baxter Lie algebra
Authors:
Zhi-Cheng Zhu,
Xing Gao,
Li Guo,
Jun Pei
Abstract:
Rota-Baxter associative algebras and Rota-Baxter Lie algebras are both important in mathematics and mathematical physics, with the former a basic structure in quantum field renormalization and the latter an operator form of the classical Yang-Baxter equation. An outstanding problem posed by Gubarev is to determine whether there is a Poincaré-Birkhoff-Witt theorem for the universal enveloping Rota-Baxter associative algebra of a Rota-Baxter Lie algebra. This paper resolves this problem positively, working with operated algebras and applying the method of Gröbner-Shirshov bases.
Submitted 21 October, 2022; v1 submitted 8 September, 2022;
originally announced September 2022.
-
Decentralized Composite Optimization in Stochastic Networks: A Dual Averaging Approach with Linear Convergence
Authors:
Changxin Liu,
Zirui Zhou,
Jian Pei,
Yong Zhang,
Yang Shi
Abstract:
Decentralized optimization, particularly the class of decentralized composite convex optimization (DCCO) problems, has found many applications. Due to ubiquitous communication congestion and random dropouts in practice, it is highly desirable to design decentralized algorithms that can handle stochastic communication networks. However, most existing algorithms for DCCO only work in networks that are deterministically connected during bounded communication rounds, and therefore cannot be extended to stochastic networks. In this paper, we propose a new decentralized dual averaging (DDA) algorithm that can solve DCCO in stochastic networks. Under a rather mild condition on stochastic networks, we show that the proposed algorithm attains global linear convergence if each local objective function is strongly convex. Our algorithm substantially improves the existing DDA-type algorithms as the latter were only known to converge sublinearly prior to our work. The key to achieving the improved rate is the design of a novel dynamic averaging consensus protocol for DDA, which intuitively leads to more accurate local estimates of the global dual variable. To the best of our knowledge, this is the first linearly convergent DDA-type decentralized algorithm and also the first algorithm that attains global linear convergence for solving DCCO in stochastic networks. Numerical results are also presented to support our design and analysis.
Submitted 11 October, 2022; v1 submitted 26 June, 2021;
originally announced June 2021.
-
A Fubini type theorem for rough integration
Authors:
Thomas Cass,
Jeffrey Pei
Abstract:
We develop the integration theory of two-parameter controlled paths $Y$ allowing us to define integrals of the form \begin{equation}
\int_{[s,t] \times [u,v]}
Y_{r,r'}
\;d(X_{r}, X_{r'}) \end{equation} where $X$ is the geometric $p$-rough path that controls $Y$. This extends to arbitrary regularity the definition presented for $2\leq p<3$ in the recent paper of Hairer and Gerasimovičs where it is used in the proof of a version of Hörmander's theorem for a class of SPDEs. We extend the Fubini type theorem of the same paper by showing that this two-parameter integral coincides with the two iterated one-parameter integrals \[
\int_{[s,t] \times [u,v]}
Y_{r,r'}
\;d(X_{r}, X_{r'})
=
\int_{s}^{t}
\int_{u}^{v}
Y_{r,r'}
\;dX_{r'}
\;dX_{r}
=
\int_{u}^{v}
\int_{s}^{t}
Y_{r,r'}
\;dX_{r}
\;dX_{r'}. \] A priori these three integrals have distinct definitions, and so this parallels the classical Fubini's theorem for product measures. By extending the two-parameter Young-Towghi inequality in this context, we derive a maximal inequality for the discrete integrals approximating the two-parameter integral. We also extend the analysis to consider integrals of the form \begin{equation*}
\int_{[s,t] \times [u,v]}
Y_{r,r'}
\;
d(X_{r}, \tilde{X}_{r'}) \end{equation*} for possibly different rough paths $X$ and $\tilde{X}$, and obtain the corresponding Fubini type theorem. We prove continuity estimates for these integrals in the appropriate rough path topologies. As an application we consider the signature kernel, which has recently emerged as a useful tool in data science, as an example of a two-parameter controlled rough path which also solves a two-parameter rough integral equation.
Submitted 11 June, 2021;
originally announced June 2021.
-
Accelerated Zeroth-Order and First-Order Momentum Methods from Mini to Minimax Optimization
Authors:
Feihu Huang,
Shangqian Gao,
Jian Pei,
Heng Huang
Abstract:
In this paper, we propose a class of accelerated zeroth-order and first-order momentum methods for both nonconvex mini-optimization and minimax-optimization. Specifically, we propose a new accelerated zeroth-order momentum (Acc-ZOM) method for black-box mini-optimization where only function values can be obtained. Moreover, we prove that our Acc-ZOM method achieves a lower query complexity of $\tilde{O}(d^{3/4}ε^{-3})$ for finding an $ε$-stationary point, which improves the best known result by a factor of $O(d^{1/4})$, where $d$ denotes the variable dimension. In particular, our Acc-ZOM does not need the large batches required by existing zeroth-order stochastic algorithms. Meanwhile, we propose an accelerated zeroth-order momentum descent ascent (Acc-ZOMDA) method for black-box minimax optimization, where only function values can be obtained. Our Acc-ZOMDA obtains a low query complexity of $\tilde{O}((d_1+d_2)^{3/4}κ_y^{4.5}ε^{-3})$ without requiring large batches for finding an $ε$-stationary point, where $d_1$ and $d_2$ denote variable dimensions and $κ_y$ is the condition number. Moreover, we propose an accelerated first-order momentum descent ascent (Acc-MDA) method for minimax optimization in which explicit gradients are accessible. Our Acc-MDA achieves a low gradient complexity of $\tilde{O}(κ_y^{4.5}ε^{-3})$ without requiring large batches for finding an $ε$-stationary point. In particular, our Acc-MDA can obtain a lower gradient complexity of $\tilde{O}(κ_y^{2.5}ε^{-3})$ with a batch size $O(κ_y^4)$, which improves the best known result by a factor of $O(κ_y^{1/2})$. Extensive experimental results on black-box adversarial attacks on deep neural networks and poisoning attacks on logistic regression demonstrate the efficiency of our algorithms.
Submitted 16 January, 2022; v1 submitted 18 August, 2020;
originally announced August 2020.
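As background for the "only function values can be obtained" setting in the abstract above, here is a minimal sketch of a zeroth-order gradient estimate. It is a plain coordinate-wise central difference, not the Acc-ZOM or Acc-ZOMDA estimator of the paper; the names `zo_gradient`, `quadratic`, and the smoothing step `mu` are illustrative assumptions.

```python
from typing import Callable, List


def zo_gradient(f: Callable[[List[float]], float],
                x: List[float],
                mu: float = 1e-5) -> List[float]:
    """Estimate the gradient of f at x using function values only.

    Illustrative central-difference estimator (an assumption, not the
    paper's method). It costs 2 * len(x) function queries per call.
    """
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += mu   # perturb coordinate i upward
        backward[i] -= mu  # perturb coordinate i downward
        grad.append((f(forward) - f(backward)) / (2 * mu))
    return grad


def quadratic(x: List[float]) -> float:
    """Hypothetical black-box objective: sum of squares, gradient 2x."""
    return sum(v * v for v in x)


point = [1.0, -2.0, 0.5]
estimate = zo_gradient(quadratic, point)
```

Each estimate here uses two queries per coordinate, which is one simple way to see why query-complexity bounds for zeroth-order methods carry a dependence on the dimension $d$.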
-
Controlling the spread of COVID-19 on college campuses
Authors:
Molly Borowiak,
Fayfay Ning,
Justin Pei,
Sarah Zhao,
Hwai-Ray Tung,
Rick Durrett
Abstract:
This research was done during the DOMath program at Duke University from May 18 to July 10, 2020. At the time, Duke and other universities across the country were wrestling with the question of how to safely welcome students back to campus in the Fall. Because of this, our project focused on using mathematical models to evaluate strategies to suppress the spread of the virus on campus, specifically in dorms and in classrooms. For dorms, we show that giving students single rooms rather than double rooms can substantially reduce virus spread. For classrooms, we show that moving classes with size above some cutoff online can make the basic reproduction number $R_0<1$, preventing a widespread epidemic. The cutoff will depend on the contagiousness of the disease in classrooms.
Submitted 17 August, 2020;
originally announced August 2020.
-
Momentum-Based Policy Gradient Methods
Authors:
Feihu Huang,
Shangqian Gao,
Jian Pei,
Heng Huang
Abstract:
In this paper, we propose a class of efficient momentum-based policy gradient methods for model-free reinforcement learning, which use adaptive learning rates and do not require any large batches. Specifically, we propose a fast importance-sampling momentum-based policy gradient (IS-MBPG) method based on a new momentum-based variance reduction technique and the importance sampling technique. We also propose a fast Hessian-aided momentum-based policy gradient (HA-MBPG) method based on the momentum-based variance reduction technique and the Hessian-aided technique. Moreover, we prove that both the IS-MBPG and HA-MBPG methods reach the best known sample complexity of $O(ε^{-3})$ for finding an $ε$-stationary point of the non-concave performance function, and require only one trajectory at each iteration. In particular, we present a non-adaptive version of the IS-MBPG method, i.e., IS-MBPG*, which also reaches the best known sample complexity of $O(ε^{-3})$ without any large batches. In the experiments, we use four benchmark tasks to demonstrate the effectiveness of our algorithms.
Submitted 6 August, 2020; v1 submitted 13 July, 2020;
originally announced July 2020.
-
Nonconvex Zeroth-Order Stochastic ADMM Methods with Lower Function Query Complexity
Authors:
Feihu Huang,
Shangqian Gao,
Jian Pei,
Heng Huang
Abstract:
Zeroth-order (a.k.a. derivative-free) methods are a class of effective optimization methods for solving complex machine learning problems, where gradients of the objective functions are not available or are computationally prohibitive. Although many zeroth-order methods have been developed recently, these approaches still have two main drawbacks: 1) high function query complexity; 2) not being well suited to problems with complex penalties and constraints. To address these drawbacks, in this paper we propose a class of faster zeroth-order stochastic alternating direction method of multipliers (ADMM) methods (ZO-SPIDER-ADMM) to solve nonconvex finite-sum problems with multiple nonsmooth penalties. Moreover, we prove that the ZO-SPIDER-ADMM methods can achieve a lower function query complexity of $O(nd+dn^{\frac{1}{2}}ε^{-1})$ for finding an $ε$-stationary point, which improves the existing best nonconvex zeroth-order ADMM methods by a factor of $O(d^{\frac{1}{3}}n^{\frac{1}{6}})$, where $n$ and $d$ denote the sample size and data dimension, respectively. At the same time, we propose a class of faster zeroth-order online ADMM methods (ZOO-ADMM+) to solve nonconvex online problems with multiple nonsmooth penalties. We also prove that the proposed ZOO-ADMM+ methods achieve a lower function query complexity of $O(dε^{-\frac{3}{2}})$, which improves the existing best result by a factor of $O(ε^{-\frac{1}{2}})$. Extensive experimental results on the structure adversarial attack on black-box deep neural networks demonstrate the efficiency of our new algorithms.
Submitted 11 December, 2023; v1 submitted 29 July, 2019;
originally announced July 2019.
-
Representations of Polynomial Rota-Baxter Algebras
Authors:
Li Qiao,
Jun Pei
Abstract:
A Rota-Baxter operator is an algebraic abstraction of integration, which is the typical example of a weight zero Rota-Baxter operator. We show that studying the modules over the polynomial Rota-Baxter algebra $(k[x],P)$ is equivalent to studying the modules over the Jordan plane, and we generalize the direct decomposability results for the $(k[x],P)$-modules in [Iy] from algebraically closed fields of characteristic zero to fields of characteristic zero. Furthermore, we provide a classification of Rota-Baxter modules up to isomorphism based on indecomposable $k[x]$-modules.
Submitted 31 August, 2017;
originally announced September 2017.
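The statement in the abstract above that integration is the typical weight zero Rota-Baxter operator can be checked on a small example. The sketch below assumes $P(f)(x)=\int_0^x f(t)\,dt$ on $k[x]$ with $k=\mathbb{Q}$ and verifies the weight zero identity $P(f)P(g) = P(fP(g)) + P(P(f)g)$ for one pair of polynomials; the helper names and the sample polynomials are illustrative, not taken from the paper.

```python
from fractions import Fraction
from typing import List

Poly = List[Fraction]  # coefficients, lowest degree first


def integrate(p: Poly) -> Poly:
    """P(p)(x): integral of p from 0 to x (assumed form of the operator)."""
    return [Fraction(0)] + [c / (i + 1) for i, c in enumerate(p)]


def multiply(p: Poly, q: Poly) -> Poly:
    """Product of two polynomials."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def add(p: Poly, q: Poly) -> Poly:
    """Sum of two polynomials, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


f = [Fraction(1), Fraction(2)]               # 1 + 2x
g = [Fraction(0), Fraction(0), Fraction(3)]  # 3x^2

# Weight zero Rota-Baxter identity: P(f)P(g) = P(f P(g)) + P(P(f) g)
lhs = multiply(integrate(f), integrate(g))
rhs = add(integrate(multiply(f, integrate(g))),
          integrate(multiply(integrate(f), g)))
```

For these inputs both sides equal $x^4 + x^5$; the identity is integration by parts in operator form.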
-
Averaging algebras, Schröder numbers, rooted trees and operads
Authors:
Li Guo,
Jun Pei
Abstract:
In this paper, we study averaging operators from an algebraic and combinatorial point of view. We first construct free averaging algebras in terms of a class of bracketed words called averaging words. We next apply this construction to obtain one and two variable generating functions for subsets of averaging words when the averaging operator is taken to be idempotent. When the averaging algebra has an idempotent generator, the generating function in one variable is twice the generating function for large Schröder numbers, leading us to give interpretations of large Schröder numbers in terms of bracketed words and rooted trees, as well as a recursive formula for these numbers. We also give a representation of free averaging algebras by unreduced trees and apply it to give a combinatorial description of the operad of averaging algebras.
Submitted 13 December, 2014; v1 submitted 28 January, 2014;
originally announced January 2014.
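For readers unfamiliar with the large Schröder numbers mentioned in the abstract above, the sketch below computes the first few terms. It uses the standard convolution recurrence $S_0 = 1$, $S_n = S_{n-1} + \sum_{k=0}^{n-1} S_k S_{n-1-k}$, which is an assumption for illustration and not necessarily the recursive formula derived in the paper; the function name `large_schroeder` is hypothetical.

```python
from typing import List


def large_schroeder(count: int) -> List[int]:
    """Return the first `count` large Schröder numbers S_0, S_1, ...

    Uses the standard recurrence
        S_n = S_{n-1} + sum_{k=0}^{n-1} S_k * S_{n-1-k},  S_0 = 1,
    which may differ from the recursion given in the paper.
    """
    s = [1]
    for n in range(1, count):
        convolution = sum(s[k] * s[n - 1 - k] for k in range(n))
        s.append(s[n - 1] + convolution)
    return s[:count]


first_terms = large_schroeder(6)
print(first_terms)  # [1, 2, 6, 22, 90, 394]
```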
-
Rota-Baxter operators on $sl(2,C)$ and solutions of the classical Yang-Baxter equation
Authors:
Jun Pei,
Chengming Bai,
Li Guo
Abstract:
We explicitly determine all Rota-Baxter operators (of weight zero) on $sl(2,C)$ under the Cartan-Weyl basis. For the skew-symmetric operators, we give the corresponding skew-symmetric solutions of the classical Yang-Baxter equation in $sl(2,C)$, confirming the related study by Semenov-Tian-Shansky. In general, these Rota-Baxter operators give a family of solutions of the classical Yang-Baxter equation in the 6-dimensional Lie algebra $sl(2,C) \ltimes_{{\rm ad}^{\ast}} sl(2,C)^{\ast}$. They also give rise to 3-dimensional pre-Lie algebras which in turn yield solutions of the classical Yang-Baxter equation in other 6-dimensional Lie algebras.
Submitted 4 November, 2013;
originally announced November 2013.
-
Splitting of operads and Rota-Baxter operators on operads
Authors:
Jun Pei,
Chengming Bai,
Li Guo
Abstract:
This paper establishes a uniform procedure to split the operations in any algebraic operad, generalizing previously known notions of splitting algebraic structures, from the dendriform algebra of Loday that splits the associative operation to the successors that split any binary operad. Examples are provided for various $n$-associative algebras, $n$-Lie algebras, $A_\infty$ algebras and $L_\infty$ algebras. Further, the concept of a Rota-Baxter operator, which first showed its importance in the associative and Lie algebra context and was then generalized to any binary operad, is generalized to arbitrary operads. The classical links from the Rota-Baxter associative algebra to the dendriform algebra and its numerous generalizations are further generalized and unified as the link from the Rota-Baxter operator on an operad to the splitting of the operad. Finally, the remarkable fact that any dendriform algebra can be recovered from a relative Rota-Baxter operator is generalized to the context of operads with the generalized notion of a relative Rota-Baxter operator for any operad.
Submitted 13 June, 2013;
originally announced June 2013.
-
Replicating of binary operads, Koszul duality, Manin products and average operators
Authors:
Jun Pei,
Chengming Bai,
Li Guo,
Xiang Ni
Abstract:
We consider the notions of the replicators, including the duplicator and triplicator, of a binary operad. As in the closely related notions of di-Var-algebra and tri-Var-algebra in [14], they provide a general operadic definition for the recent constructions of replicating the operations of algebraic structures. We show that taking replicators is Koszul dual to taking successors in [3] for binary quadratic operads and is equivalent to taking the white product with certain operads such as Perm. We also relate the replicators to the actions of average operators.
Submitted 2 January, 2013; v1 submitted 1 December, 2012;
originally announced December 2012.