
A Poincaré-Birkhoff-Witt theorem for the universal enveloping algebra of a Rota-Baxter Lie algebra

Authors: Zhi-Cheng Zhu, Xing Gao, Li Guo, Jun Pei

Abstract: Rota-Baxter associative algebras and Rota-Baxter Lie algebras are both important in mathematics and mathematical physics, with the former a basic structure in quantum field renormalization and the latter an operator form of the classical Yang-Baxter equation. An outstanding problem posed by Gubarev is to determine whether there is a Poincaré-Birkhoff-Witt theorem for the universal enveloping Rota-Baxter associative algebra of a Rota-Baxter Lie algebra. This paper resolves this problem positively, working with operated algebras and applying the method of Gröbner-Shirshov bases.

Submitted 21 October, 2022; v1 submitted 8 September, 2022; originally announced September 2022.

Comments: There was an error in considering all the compositions in Sec 3.1

MSC Class: 17B38; 16W99; 16S10; 08B20; 17B60

arXiv:2106.14075

Decentralized Composite Optimization in Stochastic Networks: A Dual Averaging Approach with Linear Convergence

Authors: Changxin Liu, Zirui Zhou, Jian Pei, Yong Zhang, Yang Shi

Abstract: Decentralized optimization, particularly the class of decentralized composite convex optimization (DCCO) problems, has found many applications. Due to ubiquitous communication congestion and random dropouts in practice, it is highly desirable to design decentralized algorithms that can handle stochastic communication networks. However, most existing algorithms for DCCO only work in networks that are deterministically connected during bounded communication rounds, and therefore cannot be extended to stochastic networks. In this paper, we propose a new decentralized dual averaging (DDA) algorithm that can solve DCCO in stochastic networks. Under a rather mild condition on stochastic networks, we show that the proposed algorithm attains global linear convergence if each local objective function is strongly convex. Our algorithm substantially improves the existing DDA-type algorithms as the latter were only known to converge sublinearly prior to our work. The key to achieving the improved rate is the design of a novel dynamic averaging consensus protocol for DDA, which intuitively leads to more accurate local estimates of the global dual variable. To the best of our knowledge, this is the first linearly convergent DDA-type decentralized algorithm and also the first algorithm that attains global linear convergence for solving DCCO in stochastic networks. Numerical results are also presented to support our design and analysis.

Submitted 11 October, 2022; v1 submitted 26 June, 2021; originally announced June 2021.

Comments: 17 pages, 2 figures

arXiv:2106.06474

A Fubini type theorem for rough integration

Authors: Thomas Cass, Jeffrey Pei

Abstract: We develop the integration theory of two-parameter controlled paths $Y$ allowing us to define integrals of the form \begin{equation} \int_{[s,t] \times [u,v]} Y_{r,r'} \;d(X_{r}, X_{r'}) \end{equation} where $X$ is the geometric $p$-rough path that controls $Y$. This extends to arbitrary regularity the definition presented for $2\leq p<3$ in the recent paper of Hairer and Gerasimovičs where it is used in the proof of a version of Hörmander's theorem for a class of SPDEs. We extend the Fubini type theorem of the same paper by showing that this two-parameter integral coincides with the two iterated one-parameter integrals \[ \int_{[s,t] \times [u,v]} Y_{r,r'} \;d(X_{r}, X_{r'}) = \int_{s}^{t} \int_{u}^{v} Y_{r,r'} \;dX_{r'} \;dX_{r} = \int_{u}^{v} \int_{s}^{t} Y_{r,r'} \;dX_{r} \;dX_{r'}. \] A priori these three integrals have distinct definitions, and so this parallels the classical Fubini's theorem for product measures. By extending the two-parameter Young-Towghi inequality in this context, we derive a maximal inequality for the discrete integrals approximating the two-parameter integral. We also extend the analysis to consider integrals of the form \begin{equation*} \int_{[s,t] \times [u,v]} Y_{r,r'} \; d(X_{r}, \tilde{X}_{r'}) \end{equation*} for possibly different rough paths $X$ and $\tilde{X}$, and obtain the corresponding Fubini type theorem. We prove continuity estimates for these integrals in the appropriate rough path topologies. As an application we consider the signature kernel, which has recently emerged as a useful tool in data science, as an example of a two-parameter controlled rough path which also solves a two-parameter rough integral equation.

Submitted 11 June, 2021; originally announced June 2021.

Comments: 40 pages

MSC Class: 60L20 (Primary)

arXiv:2008.08170

Accelerated Zeroth-Order and First-Order Momentum Methods from Mini to Minimax Optimization

Authors: Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang

Abstract: In this paper, we propose a class of accelerated zeroth-order and first-order momentum methods for both nonconvex mini-optimization and minimax-optimization. Specifically, we propose a new accelerated zeroth-order momentum (Acc-ZOM) method for black-box mini-optimization where only function values can be obtained. Moreover, we prove that our Acc-ZOM method achieves a lower query complexity of $\tilde{O}(d^{3/4}\epsilon^{-3})$ for finding an $\epsilon$-stationary point, which improves the best known result by a factor of $O(d^{1/4})$ where $d$ denotes the variable dimension. In particular, our Acc-ZOM does not need the large batches required in the existing zeroth-order stochastic algorithms. Meanwhile, we propose an accelerated zeroth-order momentum descent ascent (Acc-ZOMDA) method for black-box minimax optimization, where only function values can be obtained. Our Acc-ZOMDA obtains a low query complexity of $\tilde{O}((d_1+d_2)^{3/4}\kappa_y^{4.5}\epsilon^{-3})$ without requiring large batches for finding an $\epsilon$-stationary point, where $d_1$ and $d_2$ denote variable dimensions and $\kappa_y$ is the condition number. Moreover, we propose an accelerated first-order momentum descent ascent (Acc-MDA) method for minimax optimization, whose explicit gradients are accessible. Our Acc-MDA achieves a low gradient complexity of $\tilde{O}(\kappa_y^{4.5}\epsilon^{-3})$ without requiring large batches for finding an $\epsilon$-stationary point. In particular, our Acc-MDA can obtain a lower gradient complexity of $\tilde{O}(\kappa_y^{2.5}\epsilon^{-3})$ with a batch size $O(\kappa_y^4)$, which improves the best known result by a factor of $O(\kappa_y^{1/2})$. Extensive experimental results on black-box adversarial attacks on deep neural networks and poisoning attacks on logistic regression demonstrate the efficiency of our algorithms.

Submitted 16 January, 2022; v1 submitted 18 August, 2020; originally announced August 2020.

Comments: Published in Journal of Machine Learning Research (JMLR)

arXiv:2008.07293

Controlling the spread of COVID-19 on college campuses

Authors: Molly Borowiak, Fayfay Ning, Justin Pei, Sarah Zhao, Hwai-Ray Tung, Rick Durrett

Abstract: This research was done during the DOMath program at Duke University from May 18 to July 10, 2020. At the time, Duke and other universities across the country were wrestling with the question of how to safely welcome students back to campus in the Fall. Because of this, our project focused on using mathematical models to evaluate strategies to suppress the spread of the virus on campus, specifically in dorms and in classrooms. For dorms, we show that giving students single rooms rather than double rooms can substantially reduce virus spread. For classrooms, we show that moving classes with size above some cutoff online can make the basic reproduction number $R_0<1$, preventing a widespread epidemic. The cutoff will depend on the contagiousness of the disease in classrooms.

Submitted 17 August, 2020; originally announced August 2020.

Comments: 10 pages, 7 figures

arXiv:2007.06680

Momentum-Based Policy Gradient Methods

Authors: Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang

Abstract: In this paper, we propose a class of efficient momentum-based policy gradient methods for model-free reinforcement learning, which use adaptive learning rates and do not require any large batches. Specifically, we propose a fast importance-sampling momentum-based policy gradient (IS-MBPG) method based on a new momentum-based variance reduction technique and the importance sampling technique. We also propose a fast Hessian-aided momentum-based policy gradient (HA-MBPG) method based on the momentum-based variance reduction technique and the Hessian-aided technique. Moreover, we prove that both the IS-MBPG and HA-MBPG methods reach the best known sample complexity of $O(\epsilon^{-3})$ for finding an $\epsilon$-stationary point of the non-concave performance function, and that they require only one trajectory at each iteration. In particular, we present a non-adaptive version of the IS-MBPG method, i.e., IS-MBPG*, which also reaches the best known sample complexity of $O(\epsilon^{-3})$ without any large batches. In the experiments, we apply four benchmark tasks to demonstrate the effectiveness of our algorithms.

Submitted 6 August, 2020; v1 submitted 13 July, 2020; originally announced July 2020.

Comments: ICML 2020, 24 pages

arXiv:1907.13463

Nonconvex Zeroth-Order Stochastic ADMM Methods with Lower Function Query Complexity

Authors: Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang

Abstract: Zeroth-order (a.k.a. derivative-free) methods are a class of effective optimization methods for solving complex machine learning problems, where gradients of the objective functions are not available or computationally prohibitive. Recently, although many zeroth-order methods have been developed, these approaches still have two main drawbacks: 1) high function query complexity; 2) being poorly suited to problems with complex penalties and constraints. To address these drawbacks, in this paper, we propose a class of faster zeroth-order stochastic alternating direction method of multipliers (ADMM) methods (ZO-SPIDER-ADMM) to solve the nonconvex finite-sum problems with multiple nonsmooth penalties. Moreover, we prove that the ZO-SPIDER-ADMM methods can achieve a lower function query complexity of $O(nd+dn^{\frac{1}{2}}\epsilon^{-1})$ for finding an $\epsilon$-stationary point, which improves the existing best nonconvex zeroth-order ADMM methods by a factor of $O(d^{\frac{1}{3}}n^{\frac{1}{6}})$, where $n$ and $d$ denote the sample size and data dimension, respectively. At the same time, we propose a class of faster zeroth-order online ADMM methods (ZOO-ADMM+) to solve the nonconvex online problems with multiple nonsmooth penalties. We also prove that the proposed ZOO-ADMM+ methods achieve a lower function query complexity of $O(d\epsilon^{-\frac{3}{2}})$, which improves the existing best result by a factor of $O(\epsilon^{-\frac{1}{2}})$. Extensive experimental results on the structure adversarial attack on black-box deep neural networks demonstrate the efficiency of our new algorithms.

Submitted 11 December, 2023; v1 submitted 29 July, 2019; originally announced July 2019.

Comments: This paper was accepted to IEEE Transactions on Pattern Analysis and Machine Intelligence

arXiv:1709.00121

Representations of Polynomial Rota-Baxter Algebras

Authors: Li Qiao, Jun Pei

Abstract: A Rota-Baxter operator is an algebraic abstraction of integration, which is the typical example of a weight zero Rota-Baxter operator. We show that studying the modules over the polynomial Rota-Baxter algebra $(k[x],P)$ is equivalent to studying the modules over the Jordan plane, and we generalize the direct decomposability results for the $(k[x],P)$-modules in [Iy] from algebraically closed fields of characteristic zero to fields of characteristic zero. Furthermore, we provide a classification of Rota-Baxter modules up to isomorphism based on indecomposable $k[x]$-modules.

Submitted 31 August, 2017; originally announced September 2017.

MSC Class: 16W99; 16G99; 45N05; 12H20
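
The abstract above calls integration the typical weight zero Rota-Baxter operator. As a minimal sketch (not taken from the paper), the following checks the weight zero identity $P(f)P(g) = P(fP(g)) + P(P(f)g)$ on $k[x]$ with $k = \mathbb{Q}$, assuming $P(f)(x) = \int_0^x f(t)\,dt$. Polynomials are stored as coefficient lists, and all function names are hypothetical choices made for illustration.

```python
from fractions import Fraction


def integrate(p):
    """P(f)(x) = integral of f from 0 to x; p[i] is the coefficient of x**i."""
    return [Fraction(0)] + [Fraction(c) / (i + 1) for i, c in enumerate(p)]


def multiply(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def add(p, q):
    """Sum of two polynomials, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def rota_baxter_holds(f, g):
    """Check P(f)P(g) == P(f P(g)) + P(P(f) g) for nonzero-length inputs."""
    lhs = multiply(integrate(f), integrate(g))
    rhs = add(integrate(multiply(f, integrate(g))),
              integrate(multiply(integrate(f), g)))
    return lhs == rhs
```

For example, `rota_baxter_holds([1, 2, 3], [0, 5])` tests the identity for $f = 1 + 2x + 3x^2$ and $g = 5x$. The identity is integration by parts in disguise, so the check is expected to pass for any pair of polynomials.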

arXiv:1401.7386

doi 10.1007/s10801-014-0574-x

Averaging algebras, Schröder numbers, rooted trees and operads

Authors: Li Guo, Jun Pei

Abstract: In this paper, we study averaging operators from an algebraic and combinatorial point of view. We first construct free averaging algebras in terms of a class of bracketed words called averaging words. We next apply this construction to obtain one and two variable generating functions for subsets of averaging words when the averaging operator is taken to be idempotent. When the averaging algebra has an idempotent generator, the generating function in one variable is twice the generating function for large Schröder numbers, leading us to give interpretations of large Schröder numbers in terms of bracketed words and rooted trees, as well as a recursive formula for these numbers. We also give a representation of free averaging algebras by unreduced trees and apply it to give a combinatorial description of the operad of averaging algebras.

Submitted 13 December, 2014; v1 submitted 28 January, 2014; originally announced January 2014.

Comments: 30 pages

Journal ref: Jour Algebraic Combinatorics, 42 (2015) 73-109

arXiv:1311.0612

doi 10.1063/1.4863898

Rota-Baxter operators on $sl(2,C)$ and solutions of the classical Yang-Baxter equation

Authors: Jun Pei, Chengming Bai, Li Guo

Abstract: We explicitly determine all Rota-Baxter operators (of weight zero) on $sl(2,C)$ under the Cartan-Weyl basis. For the skew-symmetric operators, we give the corresponding skew-symmetric solutions of the classical Yang-Baxter equation in $sl(2,C)$, confirming the related study by Semenov-Tian-Shansky. In general, these Rota-Baxter operators give a family of solutions of the classical Yang-Baxter equation in the 6-dimensional Lie algebra $sl(2,C) \ltimes_{{\rm ad}^{\ast}} sl(2,C)^{\ast}$. They also give rise to 3-dimensional pre-Lie algebras which in turn yield solutions of the classical Yang-Baxter equation in other 6-dimensional Lie algebras.

Submitted 4 November, 2013; originally announced November 2013.

Comments: 17 pages

MSC Class: 16T25; 81R15

Journal ref: Journal of Mathematical Physics. 55 (2014), 021701

arXiv:1306.3046

Splitting of operads and Rota-Baxter operators on operads

Authors: Jun Pei, Chengming Bai, Li Guo

Abstract: This paper establishes a uniform procedure to split the operations in any algebraic operad, generalizing previously known notions of splitting algebraic structures from the dendriform algebra of Loday that splits the associative operation to the successors that split any binary operad. Examples are provided for various $n$-associative algebras, $n$-Lie algebras, $A_\infty$ algebras and $L_\infty$ algebras. Further, the concept of a Rota-Baxter operator, first showing its importance in the associative and Lie algebra context and then generalized to any binary operads, is generalized to arbitrary operads. The classical links from the Rota-Baxter associative algebra to the dendriform algebra and its numerous generalizations are further generalized and unified as the link from the Rota-Baxter operator on an operad to the splitting of the operad. Finally, the remarkable fact that any dendriform algebra can be recovered from a relative Rota-Baxter operator is generalized to the context of operads with the generalized notion of a relative Rota-Baxter operator for any operad.

Submitted 13 June, 2013; originally announced June 2013.

Comments: 30 pages

MSC Class: 18D50; 17A30; 17A36; 17B99; 17C99

Journal ref: Appl. Categorical Structures 25 (2017), 505-538

arXiv:1212.0177

Replicating of binary operads, Koszul duality, Manin products and average operators

Authors: Jun Pei, Chengming Bai, Li Guo, Xiang Ni

Abstract: We consider the notions of the replicators, including the duplicator and triplicator, of a binary operad. As in the closely related notions of di-Var-algebra and tri-Var-algebra in [14], they provide a general operadic definition for the recent constructions of replicating the operations of algebraic structures. We show that taking replicators is Koszul dual to taking successors in [3] for binary quadratic operads and is equivalent to taking the white product with certain operads such as Perm. We also relate the replicators to the actions of average operators.

Submitted 2 January, 2013; v1 submitted 1 December, 2012; originally announced December 2012.

Comments: 28 pages, 2 figures

MSC Class: 18D50; 17A30; 16W99; 17A36; 17B99

Journal ref: New Trends in Algebras and Combinatorics, 317-353, (2020)

Showing 1–12 of 12 results for author: Pei, J