-
Non-splitting Eulerian-Lagrangian WENO schemes for two-dimensional nonlinear convection-diffusion equations
Authors:
Nanyi Zheng,
Xiaofeng Cai,
Jing-Mei Qiu,
Jianxian Qiu
Abstract:
In this paper, we develop high-order, conservative, non-splitting Eulerian-Lagrangian (EL) Runge-Kutta (RK) finite volume (FV) weighted essentially non-oscillatory (WENO) schemes for convection-diffusion equations. The proposed EL-RK-FV-WENO scheme defines modified characteristic lines and evolves the solution along them, significantly relaxing the time-step constraint for the convection term. The main algorithm design challenge arises from the complexity of constructing accurate and robust reconstructions on dynamically varying Lagrangian meshes. This reconstruction process is needed for flux evaluations on time-dependent upstream quadrilaterals and time integrations along moving characteristics. To address this, we propose a strategy that utilizes a WENO reconstruction on a fixed Eulerian mesh for spatial reconstruction, and updates intermediate solutions on the Eulerian background mesh for implicit-explicit RK temporal integration. This strategy leverages efficient reconstruction and remapping algorithms to manage the complexities of polynomial reconstructions on time-dependent quadrilaterals, while ensuring local mass conservation. The proposed scheme ensures mass conservation due to the flux-form semi-discretization and the mass-conservative reconstruction on both background and upstream cells. Extensive numerical tests have been performed to verify the effectiveness of the proposed scheme.
Submitted 3 June, 2024;
originally announced June 2024.
-
A New Two-Sided Sketching Algorithm for Large-Scale Tensor Decomposition Based on Discrete Cosine Transformation
Authors:
Zhiguang Cheng,
Gaohang Yu,
Xiaohao Cai,
Liqun Qi
Abstract:
Large tensors are frequently encountered in various fields such as computer vision, scientific simulations, sensor networks, and data mining. However, these tensors are often too large for convenient processing, transfer, or storage. Fortunately, they typically exhibit a low-rank structure that can be leveraged through tensor decomposition. Despite this, performing large-scale tensor decomposition can be time-consuming. Sketching is a useful technique to reduce the dimensionality of the data. In this study, we introduce a novel two-sided sketching method based on the $t$-product decomposition and the discrete cosine transformation. We conduct a thorough theoretical analysis to assess the approximation error of the proposed method. Specifically, we enhance the algorithm with power iteration to achieve more precise approximate solutions. Extensive numerical experiments and comparisons on low-rank approximation of color images and grayscale videos illustrate the efficiency and effectiveness of the proposed approach in terms of both CPU time and approximation accuracy.
Submitted 28 April, 2024; v1 submitted 25 April, 2024;
originally announced April 2024.
-
Frequency-Reactive Power Optimization Strategy of Grid-forming Offshore Wind Farm Using DRU-HVDC Transmission
Authors:
Zhekai Li,
Kun Han,
Xu Cai,
Renxin Yang,
Haotian Yu,
Kepeng Xia,
Lulu Liu
Abstract:
The diode rectifier unit-based high voltage direct current (DRU-HVDC) transmission with grid-forming (GFM) wind turbines is becoming a promising scheme for offshore wind farm (OWF) integration due to its high reliability and low cost. In this scheme, the AC network of the OWF and the DRU have completely different synchronization mechanisms and power flow characteristics from the traditional power system. To optimize the power flow and reduce the net loss, this paper carries out the power flow modeling and optimization analysis for the DRU-HVDC transmission system with grid-forming OWFs. The influence of the DRU and the GFM wind turbines on the power flow of the system is analyzed. On this basis, improved constraint conditions are proposed and an optimal power flow (OPF) method is established. This method can minimize the power loss by adjusting the reactive power output of each wind turbine and the internal network frequency. Finally, based on MATLAB, this paper uses the YALMIP toolkit and the CPLEX mathematical solver to implement and solve the proposed OPF model. The results show that the proposed optimization strategy can effectively reduce the power loss of the entire OWF and the transmission system with an optimization ratio of network losses exceeding 25.3%.
Submitted 16 March, 2024;
originally announced March 2024.
-
Last Iterate Convergence of Incremental Methods and Applications in Continual Learning
Authors:
Xufeng Cai,
Jelena Diakonikolas
Abstract:
Incremental gradient and incremental proximal methods are a fundamental class of optimization algorithms used for solving finite sum problems, broadly studied in the literature. Yet, without strong convexity, their convergence guarantees have primarily been established for the ergodic (average) iterate. Motivated by applications in continual learning, we obtain the first convergence guarantees for the last iterate of both incremental gradient and incremental proximal methods, in general convex smooth (for both) and convex Lipschitz (for the proximal variants) settings. Our oracle complexity bounds for the last iterate nearly match (i.e., match up to a square-root-log or a log factor) the best known oracle complexity bounds for the average iterate, for both classes of methods. We further obtain generalizations of our results to weighted averaging of the iterates with increasing weights and for randomly permuted ordering of updates. We study incremental proximal methods as a model of continual learning with generalization and argue that large amount of regularization is crucial to preventing catastrophic forgetting. Our results generalize last iterate guarantees for incremental methods compared to state of the art, as such results were previously known only for overparameterized linear models, which correspond to convex quadratic problems with infinitely many solutions.
Submitted 27 June, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.
-
A Multilayer Eigen-Sensitivity Method Using Loop Gain Model for Oscillation Diagnosis of Converter-Based System
Authors:
Haoxiang Zong,
Chen Zhang,
Xu Cai,
Marta Molinas
Abstract:
Loop gain-based eigen-sensitivity (LGES) is a useful frequency-domain tool for oscillation diagnosis of converter-based system. However, the existing theory is still scant in two aspects: participation factor (PF) is bound up with the frequency-domain modal characteristic that does not necessarily point to the stability as that of the time-domain eigen-sensitivity (i.e., PF of oscillation mode); a systematic LGES analysis framework containing both component- and parameter- level sensitivity is missing. These two factors hinder the application of LGES method on the proper evaluation of stability effects, which are closely related with the time-domain oscillation mode. To address these issues, this paper proposes a multilayer LGES method directed to the oscillation mode, and a full set of indices like PF, component and parameter sensitivity are established. The link from the eigen-sensitivity of frequency domain to that of time domain is revealed, through which it is shown how the proposed LGES method can facilitate the control parameter tuning-guided oscillation suppression. The effectiveness of the proposed LGES method is validated via case studies conducted on a generic AC/DC converter-based system.
Submitted 19 February, 2024;
originally announced February 2024.
-
Variance Reduced Halpern Iteration for Finite-Sum Monotone Inclusions
Authors:
Xufeng Cai,
Ahmet Alacaoglu,
Jelena Diakonikolas
Abstract:
Machine learning approaches relying on such criteria as adversarial robustness or multi-agent settings have raised the need for solving game-theoretic equilibrium problems. Of particular relevance to these applications are methods targeting finite-sum structure, which generically arises in empirical variants of learning problems in these contexts. Further, methods with computable approximation errors are highly desirable, as they provide verifiable exit criteria. Motivated by these applications, we study finite-sum monotone inclusion problems, which model broad classes of equilibrium problems. Our main contributions are variants of the classical Halpern iteration that employ variance reduction to obtain improved complexity guarantees in which $n$ component operators in the finite sum are ``on average'' either cocoercive or Lipschitz continuous and monotone, with parameter $L$. The resulting oracle complexity of our methods, which provide guarantees for the last iterate and for a (computable) operator norm residual, is $\widetilde{\mathcal{O}}( n + \sqrt{n}L\varepsilon^{-1})$, which improves upon existing methods by a factor up to $\sqrt{n}$. This constitutes the first variance reduction-type result for general finite-sum monotone inclusions and for more specific problems such as convex-concave optimization when operator norm residual is the optimality measure. We further argue that, up to poly-logarithmic factors, this complexity is unimprovable in the monotone Lipschitz setting; i.e., the provided result is near-optimal.
Submitted 26 October, 2023; v1 submitted 4 October, 2023;
originally announced October 2023.
-
Distributed event-triggered aggregative optimization with applications to price-based energy management
Authors:
Xin Cai,
Feng Xiao,
Bo Wei,
Aiping Wang
Abstract:
This paper studies a distributed continuous-time aggregative optimization problem, which is a fundamental problem in the price-based energy management. The objective of the distributed aggregative optimization is to minimize the sum of local objective functions, which have a specific expression that relies on agents' own decisions and the aggregation of all agents' decisions. To solve the problem, a novel distributed continuous-time algorithm is proposed by combining gradient dynamics with a dynamic average consensus estimator in a two-time scale. The exponential convergence of the proposed algorithm is established under the assumption of a convex global cost function by virtue of the stability theory of singular perturbation systems. Motivated by practical applications, the implementation of the continuous-time algorithm with event-triggered communication is investigated. Simulations on the price-based energy management of distributed energy resources are given to illustrate the proposed method.
Submitted 24 September, 2023;
originally announced September 2023.
-
Liouville equations on complete surfaces with nonnegative Gauss curvature
Authors:
Xiaohan Cai,
Mijia Lai
Abstract:
We study finite total curvature solutions of the Liouville equation $Δu+e^{2u}=0$ on a complete surface $(M,g)$ with nonnegative Gauss curvature. It turns out that the asymptotic behavior of the solution separates two extremal cases: on the one end, if the solution decays not too fast, then $(M,g)$ must be isometric to the standard Euclidean plane; on the other end, if $(M,g)$ is isometric to the flat cylinder $\mathbb{S}^1\times \mathbb{R}$, then solutions must decay linearly and are completely classified.
Submitted 5 September, 2023;
originally announced September 2023.
-
A Randomized Block Krylov Method for Tensor Train Approximation
Authors:
Gaohang Yu,
Jinhong Feng,
Zhongming Chen,
Xiaohao Cai,
Liqun Qi
Abstract:
Tensor train decomposition is a powerful tool for dealing with high-dimensional, large-scale tensor data, which does not suffer from the curse of dimensionality. To accelerate the calculation of the auxiliary unfolding matrix, some randomized algorithms have been proposed; however, they are not suitable for noisy data. The randomized block Krylov method is capable of dealing with heavy-tailed noisy data in the low-rank approximation of matrices. In this paper, we present a randomized algorithm for low-rank tensor train approximation of large-scale tensors based on randomized block Krylov subspace iteration and provide theoretical guarantees. Numerical experiments on synthetic and real-world tensor data demonstrate the effectiveness of the proposed algorithm.
Submitted 7 August, 2023; v1 submitted 2 August, 2023;
originally announced August 2023.
-
Empirical Risk Minimization with Shuffled SGD: A Primal-Dual Perspective and Improved Bounds
Authors:
Xufeng Cai,
Cheuk Yin Lin,
Jelena Diakonikolas
Abstract:
Stochastic gradient descent (SGD) is perhaps the most prevalent optimization method in modern machine learning. Contrary to the empirical practice of sampling from the datasets without replacement and with (possible) reshuffling at each epoch, the theoretical counterpart of SGD usually relies on the assumption of sampling with replacement. It is only very recently that SGD with sampling without replacement -- shuffled SGD -- has been analyzed. For convex finite sum problems with $n$ components and under the $L$-smoothness assumption for each component function, there are matching upper and lower bounds, under sufficiently small -- $\mathcal{O}(\frac{1}{nL})$ -- step sizes. Yet those bounds appear too pessimistic -- in fact, the predicted performance is generally no better than for full gradient descent -- and do not agree with the empirical observations. In this work, to narrow the gap between the theory and practice of shuffled SGD, we sharpen the focus from general finite sum problems to empirical risk minimization with linear predictors. This allows us to take a primal-dual perspective and interpret shuffled SGD as a primal-dual method with cyclic coordinate updates on the dual side. Leveraging this perspective, we prove fine-grained complexity bounds that depend on the data matrix and are never worse than what is predicted by the existing bounds. Notably, our bounds predict much faster convergence than the existing analyses -- by a factor of the order of $\sqrt{n}$ in some cases. We empirically demonstrate that on common machine learning datasets our bounds are indeed much tighter. We further extend our analysis to nonsmooth convex problems and more general finite-sum problems, with similar improvements.
Submitted 7 February, 2024; v1 submitted 21 June, 2023;
originally announced June 2023.
-
Few-shot Learning for Inference in Medical Imaging with Subspace Feature Representations
Authors:
Jiahui Liu,
Keqiang Fan,
Xiaohao Cai,
Mahesan Niranjan
Abstract:
Unlike the field of visual scene recognition where tremendous advances have taken place due to the availability of very large datasets to train deep neural networks, inference from medical images is often hampered by the fact that only small amounts of data may be available. When working with very small dataset problems, of the order of a few hundred items of data, the power of deep learning may still be exploited by using a model pre-trained on natural images as a feature extractor and carrying out classic pattern recognition techniques in this feature space, the so-called few-shot learning problem. In regimes where the dimension of this feature space is comparable to or even larger than the number of items of data, dimensionality reduction is a necessity and is often achieved by principal component analysis, i.e., singular value decomposition (SVD). In this paper, noting the inappropriateness of using SVD for this setting, we usher in and explore two alternatives based on discriminant analysis and non-negative matrix factorization (NMF). Using 14 different datasets spanning $11$ distinct disease types, we demonstrate that discriminant subspaces at low dimensions achieve significant improvements over SVD-based subspaces and the original feature space. We also show that NMF at modest dimensions is a competitive alternative to SVD in this setting.
Submitted 19 June, 2023;
originally announced June 2023.
-
GO-LDA: Generalised Optimal Linear Discriminant Analysis
Authors:
Jiahui Liu,
Xiaohao Cai,
Mahesan Niranjan
Abstract:
Linear discriminant analysis (LDA) has been a useful tool in pattern recognition and data analysis research and practice. While linearity of class boundaries cannot always be expected, nonlinear projections through pre-trained deep neural networks have served to map complex data onto feature spaces in which linear discrimination has served well. The solution to binary LDA is obtained by eigenvalue analysis of within-class and between-class scatter matrices. It is well known that the multiclass LDA is solved by an extension to the binary LDA, a generalised eigenvalue problem, from which the largest subspace that can be extracted is of dimension one lower than the number of classes in the given problem. In this paper, we show that, apart from the first of the discriminant directions, the generalised eigenanalysis solution to multiclass LDA does neither yield orthogonal discriminant directions nor maximise discrimination of projected data along them. Surprisingly, to the best of our knowledge, this has not been noted in decades of literature on LDA. To overcome this drawback, we present a derivation with a strict theoretical support for sequentially obtaining discriminant directions that are orthogonal to previously computed ones and maximise in each step the Fisher criterion. We show distributions of projections along these axes and demonstrate that discrimination of data projected onto these discriminant directions has optimal separation, which is much higher than those from the generalised eigenvectors of the multiclass LDA. Using a wide range of benchmark tasks, we present a comprehensive empirical demonstration that on a number of pattern recognition and classification problems, the optimal discriminant subspaces obtained by the proposed method, referred to as GO-LDA (Generalised Optimal LDA), can offer superior accuracy.
Submitted 23 May, 2023;
originally announced May 2023.
-
Practical Sketching Algorithms for Low-Rank Tucker Approximation of Large Tensors
Authors:
Wandi Dong,
Gaohang Yu,
Liqun Qi,
Xiaohao Cai
Abstract:
Low-rank approximation of tensors has been widely used in high-dimensional data analysis. It usually involves singular value decomposition (SVD) of large-scale matrices with high computational complexity. Sketching is an effective data compression and dimensionality reduction technique applied to the low-rank approximation of large matrices. This paper presents two practical randomized algorithms for low-rank Tucker approximation of large tensors based on sketching and power scheme, with a rigorous error-bound analysis. Numerical experiments on synthetic and real-world tensor data demonstrate the competitive performance of the proposed algorithms.
Submitted 27 January, 2023;
originally announced January 2023.
-
Understanding the convergence of the preconditioned PDHG method: a view of indefinite proximal ADMM
Authors:
Yumin Ma,
Xingju Cai,
Bo Jiang,
Deren Han
Abstract:
The primal-dual hybrid gradient (PDHG) algorithm is popular in solving min-max problems, which are widely used in a variety of areas. To improve the applicability and efficiency of PDHG for different application scenarios, we focus on the preconditioned PDHG (PrePDHG) algorithm, which is a framework covering PDHG, alternating direction method of multipliers (ADMM), and other methods. We give the optimal convergence condition of PrePDHG in the sense that the key parameters in the condition cannot be further improved, which fills the theoretical gap in the state-of-the-art convergence results of PrePDHG, and obtain the ergodic and non-ergodic sublinear convergence rates of PrePDHG. The theoretical analysis is achieved by establishing the equivalence between PrePDHG and indefinite proximal ADMM. Besides, we discuss various choices of the proximal matrices in PrePDHG and derive some interesting results. For example, the convergence condition of diagonal PrePDHG is improved to be tight, the dual stepsize of the balanced augmented Lagrangian method can be enlarged to $4/3$ from $1$, and a balanced augmented Lagrangian method with symmetric Gauss-Seidel iterations is also explored. Numerical results on the matrix game, projection onto the Birkhoff polytope, earth mover's distance, and CT reconstruction verify the effectiveness and superiority of PrePDHG.
Submitted 8 January, 2023;
originally announced January 2023.
-
Cyclic Block Coordinate Descent With Variance Reduction for Composite Nonconvex Optimization
Authors:
Xufeng Cai,
Chaobing Song,
Stephen J. Wright,
Jelena Diakonikolas
Abstract:
Nonconvex optimization is central in solving many machine learning problems, in which block-wise structure is commonly encountered. In this work, we propose cyclic block coordinate methods for nonconvex optimization problems with non-asymptotic gradient norm guarantees. Our convergence analysis is based on a gradient Lipschitz condition with respect to a Mahalanobis norm, inspired by a recent progress on cyclic block coordinate methods. In deterministic settings, our convergence guarantee matches the guarantee of (full-gradient) gradient descent, but with the gradient Lipschitz constant being defined w.r.t.~a Mahalanobis norm. In stochastic settings, we use recursive variance reduction to decrease the per-iteration cost and match the arithmetic operation complexity of current optimal stochastic full-gradient methods, with a unified analysis for both finite-sum and infinite-sum cases. We prove a faster linear convergence result when a Polyak-Łojasiewicz (PŁ) condition holds. To our knowledge, this work is the first to provide non-asymptotic convergence guarantees -- variance-reduced or not -- for a cyclic block coordinate method in general composite (smooth + nonsmooth) nonconvex settings. Our experimental results demonstrate the efficacy of the proposed cyclic scheme in training deep neural nets.
Submitted 27 January, 2023; v1 submitted 9 December, 2022;
originally announced December 2022.
-
Resilient distributed resource allocation algorithm under false data injection attacks
Authors:
Xin Cai,
Xinyuan Nan,
Binpeng Gao
Abstract:
A resilient distributed algorithm is proposed to solve the distributed resource allocation problem of a first-order nonlinear multi-agent system that is subject to false data injection (FDI) attacks. An intelligent attacker injects false data into agents' actuators and sensors such that agents execute the algorithm according to the compromised control inputs and interactive information. The goal of the attacker is to make the multi-agent system unstable and to cause agents' decisions to deviate from the optimal resource allocation. First, we analyze the robustness of a distributed resource allocation algorithm under FDI attacks. Then, the unknown nonlinear term and the false data injected in agents are considered as extended states which can be estimated by extended state observers. The estimation is used in the feedback control to suppress the effect of the FDI attacks. A resilient distributed resource allocation algorithm based on the extended state observer is proposed to ensure that it can converge to the optimal allocation without requiring any information about the nature of the attacker. An example is given to illustrate the results.
Submitted 6 December, 2022;
originally announced December 2022.
-
Fourth-order conservative non-splitting semi-Lagrangian Hermite WENO schemes for kinetic and fluid simulations
Authors:
Nanyi Zheng,
Xiaofeng Cai,
Jing-Mei Qiu,
Jianxian Qiu
Abstract:
We present fourth-order conservative non-splitting semi-Lagrangian (SL) Hermite weighted essentially non-oscillatory (HWENO) schemes for linear transport equations with applications to nonlinear problems including the Vlasov-Poisson system, the guiding center Vlasov model, and the incompressible Euler equations in the vorticity-stream function formulation. The proposed SL HWENO schemes combine a weak formulation of the characteristic Galerkin method with two newly constructed HWENO reconstruction methods. Fourth-order accuracy is accomplished in both space and time under a non-splitting setting. Mass conservation naturally holds due to the weak formulation of the characteristic Galerkin method and the design of the HWENO reconstructions. We apply a positivity-preserving limiter to maintain the positivity of numerical solutions when needed. Although the proposed SL framework allows us to take large time steps for improving computational efficiency, it also brings challenges to the spatial reconstruction technique; we construct two kinds of novel HWENO reconstructions to fit the needs of the proposed SL framework. Abundant benchmark tests are performed to verify the effectiveness of the proposed SL HWENO schemes.
Submitted 7 August, 2022;
originally announced August 2022.
-
Stochastic Halpern Iteration with Variance Reduction for Stochastic Monotone Inclusions
Authors:
Xufeng Cai,
Chaobing Song,
Cristóbal Guzmán,
Jelena Diakonikolas
Abstract:
We study stochastic monotone inclusion problems, which widely appear in machine learning applications, including robust regression and adversarial learning. We propose novel variants of stochastic Halpern iteration with recursive variance reduction. In the cocoercive -- and more generally Lipschitz-monotone -- setup, our algorithm attains $ε$ norm of the operator with $\mathcal{O}(\frac{1}{ε^3})$ stochastic operator evaluations, which significantly improves over state of the art $\mathcal{O}(\frac{1}{ε^4})$ stochastic operator evaluations required for existing monotone inclusion solvers applied to the same problem classes. We further show how to couple one of the proposed variants of stochastic Halpern iteration with a scheduled restart scheme to solve stochastic monotone inclusion problems with ${\mathcal{O}}(\frac{\log(1/ε)}{ε^2})$ stochastic operator evaluations under additional sharpness or strong monotonicity assumptions.
Submitted 8 January, 2023; v1 submitted 17 March, 2022;
originally announced March 2022.
-
On Average-Case Error Bounds for Kernel-Based Bayesian Quadrature
Authors:
Xu Cai,
Chi Thanh Lam,
Jonathan Scarlett
Abstract:
In this paper, we study error bounds for {\em Bayesian quadrature} (BQ), with an emphasis on noisy settings, randomized algorithms, and average-case performance measures. We seek to approximate the integral of functions in a {\em Reproducing Kernel Hilbert Space} (RKHS), particularly focusing on the Matérn-$ν$ and squared exponential (SE) kernels, with samples from the function potentially being corrupted by Gaussian noise. We provide a two-step meta-algorithm that serves as a general tool for relating the average-case quadrature error with the $L^2$-function approximation error. When specialized to the Matérn kernel, we recover an existing near-optimal error rate while avoiding the existing method of repeatedly sampling points. When specialized to other settings, we obtain new average-case results for settings including the SE kernel with noise and the Matérn kernel with misspecification. Finally, we present algorithm-independent lower bounds that have greater generality and/or give distinct proofs compared to existing ones.
Submitted 10 February, 2023; v1 submitted 21 February, 2022;
originally announced February 2022.
-
A distributed generalized Nash equilibrium seeking algorithm based on extremum seeking control
Authors:
Feng Xiao,
Xin Cai,
Bo Wei
Abstract:
In this paper, a distributed non-model-based seeking algorithm, which combines extremum seeking control (ESC) with learning algorithms, is proposed to seek a generalized Nash equilibrium (GNE) for a class of noncooperative games with a coupled equality constraint. The strategy of each agent is restricted by both the coupled inter-agent constraint and local inequality constraints. Thanks to the ESC, implementing the proposed GNE seeking algorithm does not require knowing the specific expressions of the agents' cost functions and local constraints, or the strategies of other agents. To deal with the coupled constraints, only the Lagrange multiplier is transmitted among agents, together with some prior information about the coupled constraints. Moreover, a diminishing dither signal is designed in the seeking algorithm to remove undesirable steady-state oscillations. The non-local convergence of the designed seeking algorithm is analyzed via singular perturbation theory, averaging analysis and Lyapunov stability theory. Numerical examples are given to verify the effectiveness of the proposed method.
Submitted 26 January, 2022;
originally announced January 2022.
-
Feedback Linearization of Car Dynamics for Racing via Reinforcement Learning
Authors:
Michael Estrada,
Sida Li,
Xiangyu Cai
Abstract:
Through the method of Learning Feedback Linearization, we seek to learn a linearizing controller to simplify the process of controlling a car to race autonomously. A soft actor-critic approach is used to learn a decoupling matrix and drift vector that effectively correct for errors in a hand-designed linearizing controller. The result is an exactly linearizing controller that can be used to enable the well-developed theory of linear systems to design path planning and tracking schemes that are easy to implement and significantly less computationally demanding. To demonstrate the method of feedback linearization, it is first used to learn a simulated model whose exact structure is known, but varied from the initial controller, so as to introduce error. We further seek to apply this method to a system that introduces even more error in the form of a gym environment specifically designed for modeling the dynamics of car racing. To do so, we posit an extension to the method of learning feedback linearization; a neural network that is trained using supervised learning to convert the output of our linearizing controller to the required input for the racing environment. Our progress towards these goals is reported and the next steps in their accomplishment are discussed.
Submitted 20 October, 2021;
originally announced October 2021.
-
Distributed continuous-time strategy-updating rules for noncooperative games with discrete-time communication
Authors:
Xin Cai,
Feng Xiao,
Bo Wei,
Fang Fang
Abstract:
In this paper, continuous-time noncooperative games in networks of double-integrator agents are explored. The existing methods require that agents communicate with their neighbors in real time. We propose two discrete-time communication schemes based on the designed continuous-time strategy-updating rule for the efficient use of communication resources. First, the property of the designed continuous-time rule is analyzed to ensure that all agents' strategies can converge to the Nash equilibrium. Then, we propose periodic and event-triggered communication schemes for the implementation of the designed rule with discrete-time communication. The rule in the periodic case is implemented synchronously and easily. The rule in the event-triggered case is executed asynchronously without Zeno behaviors. All agents in both cases can asymptotically reach the Nash equilibrium by interacting with neighbors at discrete times. Simulations are performed in networks of Cournot competition to illustrate the effectiveness of the proposed methods.
Submitted 23 February, 2023; v1 submitted 18 October, 2021;
originally announced October 2021.
-
Nash Equilibrium Seeking for General Linear Systems with Disturbance Rejection
Authors:
Xin Cai,
Feng Xiao,
Bo Wei,
Mei Yu,
Fang Fang
Abstract:
This paper explores aggregative games in a network of general linear systems subject to external disturbances. To deal with external disturbances, distributed strategy-updating rules based on an internal model are proposed for the cases of perfect and imperfect information, respectively. Unlike existing algorithms based on gradient dynamics, the proposed rules introduce the integral of the gradient of the cost functions on the basis of passivity theory, forcing the strategies of all players to evolve to the Nash equilibrium regardless of the effect of disturbances. The convergence of the two strategy-updating rules is analyzed via Lyapunov stability theory, passivity theory and singular perturbation theory. Simulations are presented to verify the obtained results.
Submitted 11 December, 2021; v1 submitted 17 October, 2021;
originally announced October 2021.
-
Distributed strategy-updating rules for aggregative games of multi-integrator systems with coupled constraints
Authors:
Xin Cai,
Feng Xiao,
Bo Wei
Abstract:
In this paper, we explore aggregative games over networks of multi-integrator agents with coupled constraints. To reach the general Nash equilibrium of an aggregative game, a distributed strategy-updating rule is proposed that combines the coordination of Lagrange multipliers with the estimation of the aggregator. Each player has access only to partial-decision information and communicates with his neighbors in a weight-balanced digraph which characterizes players' preferences as to the values of information received from neighbors. We first consider networks of double-integrator agents and then focus on multi-integrator agents. The effectiveness of the proposed strategy-updating rules is demonstrated by analyzing the convergence of the corresponding dynamical systems via Lyapunov stability theory, singular perturbation theory and passivity theory. Numerical examples are given to illustrate our results.
Submitted 20 June, 2021;
originally announced June 2021.
-
The standard cohomology of regular Courant algebroids
Authors:
Xiongwei Cai,
Zhuo Chen,
Maosong Xiang
Abstract:
For any regular Courant algebroid $E$ over a smooth manifold $M$ with characteristic distribution $F$ and ample Lie algebroid $A_E$, we prove that there exists a canonical homological vector field on the graded manifold $A_E[1] \oplus (TM/F)^\ast[2]$ such that the resulting dg manifold $\mathcal{M}_E$, which we call the minimal model of the Courant algebroid $E$, encodes all cohomological information of $E$. Indeed, the standard cohomology of $E$ can be identified with the cohomology of the function space on $\mathcal{M}_E$, which can be computed by a Hodge-to-de Rham type spectral sequence. We apply this result to generalized exact Courant algebroids and those arising from regular Lie algebroids.
Submitted 1 November, 2022; v1 submitted 31 May, 2021;
originally announced May 2021.
-
Rankings in directed configuration models with heavy tailed in-degrees
Authors:
Xing Shi Cai,
Pietro Caputo,
Guillem Perarnau,
Matteo Quattropani
Abstract:
We consider the extremal values of the stationary distribution of sparse directed random graphs with given degree sequences and their relation to the extremal values of the in-degree sequence. The graphs are generated by the directed configuration model. Under the assumption of bounded $(2+η)$-moments on the in-degrees and of bounded out-degrees, we obtain tight comparisons between the maximum value of the stationary distribution and the maximum in-degree. Under the further assumption that the order statistics of the in-degrees have a power-law behavior, we show that the extremal values of the stationary distribution also have a power-law behavior with the same index. In the same setting, we prove that these results extend to the PageRank scores of the random digraph, thus confirming a version of the so-called power-law hypothesis. Along the way, we establish several facts about the model, including the mixing time cutoff and the characterization of the typical values of the stationary distribution, which were previously obtained under the assumption of bounded in-degrees.
Submitted 16 April, 2021;
originally announced April 2021.
-
Simplicial Complex Representation Learning
Authors:
Mustafa Hajij,
Ghada Zamzmi,
Theodore Papamarkou,
Vasileios Maroulas,
Xuanting Cai
Abstract:
Simplicial complexes form an important class of topological spaces that are frequently used in many application areas such as computer-aided design, computer graphics, and simulation. Representation learning on graphs, which are just 1-d simplicial complexes, has attracted great attention in recent years. However, there has not been enough effort to extend representation learning to higher dimensional simplicial objects due to the additional complexity these objects hold, especially when it comes to entire-simplicial complex representation learning. In this work, we propose a method for simplicial complex-level representation learning that embeds a simplicial complex into a universal embedding space in a way that preserves complex-to-complex proximity. Our method uses our novel geometric message passing schemes to learn an entire simplicial complex representation in an end-to-end fashion. We demonstrate the proposed model on a publicly available mesh dataset. To the best of our knowledge, this work presents the first method for learning simplicial complex-level representations.
Submitted 1 February, 2022; v1 submitted 6 March, 2021;
originally announced March 2021.
-
Persistent Homology and Graphs Representation Learning
Authors:
Mustafa Hajij,
Ghada Zamzmi,
Xuanting Cai
Abstract:
This article aims to study the topological invariant properties encoded in node graph representational embeddings by utilizing tools available in persistent homology. Specifically, given a node embedding representation algorithm, we consider the case when these embeddings are real-valued. By viewing these embeddings as scalar functions on a domain of interest, we can utilize the tools available in persistent homology to study the topological information encoded in these representations. Our construction effectively defines a unique persistence-based graph descriptor, on both the graph and node levels, for every node representation algorithm. To demonstrate the effectiveness of the proposed method, we study the topological descriptors induced by DeepWalk, Node2Vec and Diff2Vec.
Submitted 8 October, 2021; v1 submitted 25 February, 2021;
originally announced February 2021.
-
A Safety and Passivity Filter for Robot Teleoperation Systems
Authors:
Gennaro Notomista,
Xiaoyi Cai
Abstract:
In this paper, we present a way of enforcing safety and passivity properties of robot teleoperation systems, where a human operator interacts with a dynamical system modeling the robot. The approach does so in a holistic fashion, by combining safety and passivity constraints in a single optimization-based controller which effectively filters the desired control input before supplying it to the system. The result is a safety and passivity filter implemented as a convex quadratic program which can be solved efficiently and employed in an online fashion in many robotic teleoperation applications. Simulation results show the benefits of the approach developed in this paper applied to the human teleoperation of a second-order dynamical system.
Submitted 17 February, 2021;
originally announced February 2021.
-
Lenient Regret and Good-Action Identification in Gaussian Process Bandits
Authors:
Xu Cai,
Selwyn Gomes,
Jonathan Scarlett
Abstract:
In this paper, we study the problem of Gaussian process (GP) bandits under relaxed optimization criteria stating that any function value above a certain threshold is "good enough". On the theoretical side, we study various {\em lenient regret} notions in which all near-optimal actions incur zero penalty, and provide upper bounds on the lenient regret for GP-UCB and an elimination algorithm, circumventing the usual $O(\sqrt{T})$ term (with time horizon $T$) resulting from zooming extremely close towards the function maximum. In addition, we complement these upper bounds with algorithm-independent lower bounds. On the practical side, we consider the problem of finding a single "good action" according to a known pre-specified threshold, and introduce several good-action identification algorithms that exploit knowledge of the threshold. We experimentally find that such algorithms can often find a good action faster than standard optimization-based approaches.
Submitted 26 May, 2021; v1 submitted 10 February, 2021;
originally announced February 2021.
-
Clustered active-subspace based local Gaussian Process emulator for high-dimensional and complex computer models
Authors:
Junda Xiong,
Xin Cai,
Jinglai Li
Abstract:
Quantifying uncertainties in physical or engineering systems often requires a large number of simulations of the underlying computer models, which are computationally intensive. Emulators or surrogate models are often used to accelerate the computation in such problems, and in this regard the Gaussian Process (GP) emulator is a popular choice for its ability to quantify the approximation error in the emulator itself. However, a major limitation of the GP emulator is that it cannot handle problems of very high dimensions, which is often addressed with dimension reduction techniques. In this work we address the issue that the models of interest can be so complex that they admit different low-dimensional structures in different parameter regimes. Building upon the active subspace method for dimension reduction, we propose a clustered active subspace method which identifies the local low-dimensional structures as well as the parameter regimes they are in (represented as clusters), and then constructs low-dimensional, local GP emulators within the clusters. Specifically, we design a clustering method based on the gradient information to identify these clusters, and a local GP construction procedure to construct the GP emulator within a local cluster. With numerical examples, we demonstrate that the proposed method is effective when the underlying models have complex low-dimensional structures.
Submitted 9 November, 2021; v1 submitted 31 December, 2020;
originally announced January 2021.
-
Asynchronous Parallel Stochastic Quasi-Newton Methods
Authors:
Qianqian Tong,
Guannan Liang,
Xingyu Cai,
Chunjiang Zhu,
Jinbo Bi
Abstract:
Although first-order stochastic algorithms, such as stochastic gradient descent, have been the main force to scale up machine learning models, such as deep neural nets, the second-order quasi-Newton methods start to draw attention due to their effectiveness in dealing with ill-conditioned optimization problems. The L-BFGS method is one of the most widely used quasi-Newton methods. We propose an asynchronous parallel algorithm for stochastic quasi-Newton (AsySQN) method. Unlike prior attempts, which parallelize only the calculation for gradient or the two-loop recursion of L-BFGS, our algorithm is the first one that truly parallelizes L-BFGS with a convergence guarantee. Adopting the variance reduction technique, a prior stochastic L-BFGS, which has not been designed for parallel computing, reaches a linear convergence rate. We prove that our asynchronous parallel scheme maintains the same linear convergence rate but achieves significant speedup. Empirical evaluations in both simulations and benchmark datasets demonstrate the speedup in comparison with the non-parallel stochastic L-BFGS, as well as the better performance than first-order methods in solving ill-conditioned problems.
Submitted 1 November, 2020;
originally announced November 2020.
-
Minimum stationary values of sparse random directed graphs
Authors:
Xing Shi Cai,
Guillem Perarnau
Abstract:
We consider the stationary distribution of the simple random walk on the directed configuration model with bounded degrees. Provided that the minimum out-degree is at least $2$, with high probability (whp) there is a unique stationary distribution. We show that the minimum positive stationary value is whp $n^{-(1+C+o(1))}$ for some constant $C \ge 0$ determined by the degree distribution. In particular, $C$ is the competing combination of two factors: (1) the contribution of atypically "thin" in-neighbourhoods, controlled by subcritical branching processes; and (2) the contribution of atypically "light" trajectories, controlled by large deviation rate functions. Additionally, our proof implies that whp the hitting and the cover time are both $n^{1+C+o(1)}$. Our results complement those of Caputo and Quattropani who showed that if the minimum in-degree is at least 2, stationary values have logarithmic fluctuations around $n^{-1}$.
Submitted 19 February, 2021; v1 submitted 14 October, 2020;
originally announced October 2020.
-
On Lower Bounds for Standard and Robust Gaussian Process Bandit Optimization
Authors:
Xu Cai,
Jonathan Scarlett
Abstract:
In this paper, we consider algorithm-independent lower bounds for the problem of black-box optimization of functions having a bounded norm in some Reproducing Kernel Hilbert Space (RKHS), which can be viewed as a non-Bayesian Gaussian process bandit problem. In the standard noisy setting, we provide a novel proof technique for deriving lower bounds on the regret, with benefits including simplicity, versatility, and an improved dependence on the error probability. In a robust setting in which every sampled point may be perturbed by a suitably-constrained adversary, we provide a novel lower bound for deterministic strategies, demonstrating an inevitable joint dependence of the cumulative regret on the corruption level and the time horizon, in contrast with existing lower bounds that only characterize the individual dependencies. Furthermore, in a distinct robust setting in which the final point is perturbed by an adversary, we strengthen an existing lower bound that only holds for target success probabilities very close to one, by allowing for arbitrary success probabilities above $\frac{2}{3}$.
Submitted 24 May, 2021; v1 submitted 19 August, 2020;
originally announced August 2020.
-
A Gradient-Aware Search Algorithm for Constrained Markov Decision Processes
Authors:
Sami Khairy,
Prasanna Balaprakash,
Lin X. Cai
Abstract:
The canonical solution methodology for finite constrained Markov decision processes (CMDPs), where the objective is to maximize the expected infinite-horizon discounted rewards subject to the expected infinite-horizon discounted costs constraints, is based on convex linear programming. In this brief, we first prove that the optimization objective in the dual linear program of a finite CMDP is a piece-wise linear convex function (PWLC) with respect to the Lagrange penalty multipliers. Next, we propose a novel two-level Gradient-Aware Search (GAS) algorithm which exploits the PWLC structure to find the optimal state-value function and Lagrange penalty multipliers of a finite CMDP. The proposed algorithm is applied in two stochastic control problems with constraints: robot navigation in a grid world and solar-powered unmanned aerial vehicle (UAV)-based wireless network management. We empirically compare the convergence performance of the proposed GAS algorithm with binary search (BS), Lagrangian primal-dual optimization (PDO), and Linear Programming (LP). Compared with benchmark algorithms, it is shown that the proposed GAS algorithm converges to the optimal solution faster, does not require hyper-parameter tuning, and is not sensitive to initialization of the Lagrange penalty multiplier.
Submitted 7 May, 2020;
originally announced May 2020.
-
The giant component of the directed configuration model revisited
Authors:
Xing Shi Cai,
Guillem Perarnau
Abstract:
We prove a law of large numbers for the order and size of the largest strongly connected component in the directed configuration model. Our result extends previous work by Cooper and Frieze.
Submitted 10 April, 2020;
originally announced April 2020.
-
The diameter of the directed configuration model
Authors:
Xing Shi Cai,
Guillem Perarnau
Abstract:
We show that the diameter of the directed configuration model with $n$ vertices rescaled by $\log n$ converges in probability to a constant. Our assumptions are the convergence of the in- and out-degree of a uniform random vertex in distribution, first and second moment. Our result extends previous results on the diameter of the model and applies to many other random directed graphs.
Submitted 10 March, 2020;
originally announced March 2020.
-
Cohomology of hemistrict Lie 2-algebras
Authors:
Xiongwei Cai,
Zhangju Liu,
Maosong Xiang
Abstract:
We study representations of hemistrict Lie 2-algebras and give a functorial construction of their cohomology. We prove that both the cohomology of an injective hemistrict Lie 2-algebra $L$ and the cohomology of the semistrict Lie 2-algebra obtained from skew-symmetrization of $L$ are isomorphic to the Chevalley-Eilenberg cohomology of the induced Lie algebra $L_{\operatorname{Lie}}$.
Submitted 7 March, 2020;
originally announced March 2020.
-
An Eulerian-Lagrangian discontinuous Galerkin method for transport problems and its application to nonlinear dynamics
Authors:
Xiaofeng Cai,
Jing-Mei Qiu,
Yang Yang
Abstract:
We propose a new Eulerian-Lagrangian (EL) discontinuous Galerkin (DG) method. The method is designed as a generalization of the semi-Lagrangian (SL) DG method for linear advection problems proposed in [J. Sci. Comput. 73: 514-542, 2017], which is formulated based on an adjoint problem and traces upstream cells by tracking characteristic curves with high accuracy. In the SLDG method, depending on the velocity field, upstream cells could be of arbitrary shape. Thus, a more sophisticated approximation to the sides of the upstream cells is required to obtain a high-order approximation. For example, quadratic-curved (QC) quadrilaterals were proposed to approximate upstream cells for third-order spatial accuracy in a swirling deformation example. In this paper, for linear advection problems, we propose a more general formulation, named the ELDG method. The scheme is formulated based on a modified adjoint problem for which the upstream cells are always quadrilaterals, which avoids the need to use QC quadrilaterals in the SLDG algorithm. The newly proposed ELDG method can be viewed as a new general framework into which both the classical Eulerian Runge-Kutta DG formulation and the SLDG formulation fit. Numerical results on linear transport problems, as well as the nonlinear Vlasov and incompressible Euler dynamics using the exponential RK time integrators, are presented to demonstrate the effectiveness of the ELDG method.
Submitted 14 February, 2020; v1 submitted 7 February, 2020;
originally announced February 2020.
-
Transgression maps for crossed modules of groupoids
Authors:
Xiongwei Cai
Abstract:
Given a crossed module of groupoids $N\rightarrow G$, we construct (1) a natural homomorphism from the product groupoid $\mathbb{Z}\times(N\rtimes G)\rightrightarrows N$ to the crossed product groupoid $N\rtimes G\rightrightarrows N$ and (2) a transgression map from the singular cohomology $H^\ast(G_\bullet,\mathbb{Z})$ of the nerve of the groupoid $G$ to the singular cohomology $H^{\ast-1}\big((N\rtimes G)_\bullet,\mathbb{Z}\big)$ of the nerve of the crossed product groupoid $N\rtimes G$. The latter turns out to be identical to the transgression map obtained by Tu--Xu in their study of equivariant $K$-theory.
Submitted 28 January, 2020;
originally announced January 2020.
-
Identification of causal intervention effects under contagion
Authors:
Xiaoxuan Cai,
Wen Wei Loh,
Forrest W. Crawford
Abstract:
Defining and identifying causal intervention effects for transmissible infectious disease outcomes is challenging because a treatment -- such as a vaccine -- given to one individual may affect the infection outcomes of others. Epidemiologists have proposed causal estimands to quantify effects of interventions under contagion using a two-person partnership model. These simple conceptual models have helped researchers develop causal estimands relevant to clinical evaluation of vaccine effects. However, many of these partnership models are formulated under structural assumptions that preclude realistic infectious disease transmission dynamics, limiting their conceptual usefulness in defining and identifying causal treatment effects in empirical intervention trials. In this paper, we propose causal intervention effects in two-person partnerships under arbitrary infectious disease transmission dynamics, and give nonparametric identification results showing how effects can be estimated in empirical trials using time-to-infection or binary outcome data. The key insight is that contagion is a causal phenomenon that induces conditional independencies on infection outcomes that can be exploited for the identification of clinically meaningful causal estimands. These new estimands are compared to existing quantities, and results are illustrated using a realistic simulation of an HIV vaccine trial.
Submitted 10 December, 2019; v1 submitted 9 December, 2019;
originally announced December 2019.
-
High Order Semi-Lagrangian Discontinuous Galerkin Method Coupled with Runge-Kutta Exponential Integrators for Nonlinear Vlasov Dynamics
Authors:
Xiaofeng Cai,
Sebastiano Boscarino,
Jing-Mei Qiu
Abstract:
In this paper, we propose a semi-Lagrangian discontinuous Galerkin method coupled with Runge-Kutta exponential integrators (SLDG-RKEI) for nonlinear Vlasov dynamics. The commutator-free Runge-Kutta (RK) exponential integrators (EI) were proposed by Celledoni, et al. (FGCS, 2003). In the nonlinear transport setting, the RKEI can be used to decompose the evolution of the nonlinear transport into a composition of a sequence of linearized dynamics. The resulting linearized transport equations can be solved by the semi-Lagrangian (SL) discontinuous Galerkin (DG) method proposed in Cai, et al. (JSC, 2017). The proposed method can achieve high order spatial accuracy via the SLDG framework, and high order temporal accuracy via the RKEI. Due to its SL nature, the proposed SLDG-RKEI method is not subject to the CFL condition, and thus has the potential to use larger time-stepping sizes than those in the Eulerian approach. Inheriting advantages from the SLDG method, the proposed SLDG-RKEI schemes are mass conservative, positivity-preserving, have no dimensional splitting error, perform well in resolving complex solution structures, and can be evolved with adaptive time-stepping sizes. We show the performance of the SLDG-RKEI algorithm by classical test problems for the nonlinear Vlasov-Poisson system, as well as the guiding center Vlasov model. Although it is not the focus of this paper to explore the SLDG-RKEI scheme for nonlinear hyperbolic conservation laws that develop shocks, we show some preliminary results on the scheme's performance on Burgers' equation.
Submitted 24 November, 2020; v1 submitted 27 November, 2019;
originally announced November 2019.
-
A semi-Lagrangian discontinuous Galerkin (DG) -- local DG method for solving convection-diffusion equations
Authors:
Mingchang Ding,
Xiaofeng Cai,
Wei Guo,
Jing-Mei Qiu
Abstract:
In this paper, we propose an efficient high order semi-Lagrangian (SL) discontinuous Galerkin (DG) method for solving linear convection-diffusion equations. The method generalizes our previous work on developing the SLDG method for transport equations (J. Sci. Comput. 73: 514-542, 2017), making it capable of handling additional diffusion and source terms. Within the DG framework, the solution is evolved along the characteristics, while the diffusion term is discretized by the local DG (LDG) method and integrated along characteristics by implicit Runge-Kutta methods together with source terms. The proposed method is named the 'SLDG-LDG' method and enjoys many attractive features of the DG and SL methods. These include uniformly high order accuracy (e.g. third order) in space and in time, compactness, mass conservation, and stability under large time-stepping sizes. An $L^2$ stability analysis is provided when the method is coupled with the first order backward Euler discretization. The effectiveness of the method is demonstrated by a group of numerical tests in one and two dimensions.
Submitted 27 January, 2020; v1 submitted 13 July, 2019;
originally announced July 2019.
-
The $k$-cut model in deterministic and random trees
Authors:
Gabriel Berzunza,
Xing Shi Cai,
Cecilia Holmgren
Abstract:
The $k$-cut number of rooted graphs was introduced by Cai et al. as a generalization of the classical cutting model by Meir and Moon. In this paper, we show that all moments of the $k$-cut number of conditioned Galton-Watson trees converge after proper rescaling, which implies convergence in distribution to the same limit law regardless of the offspring distribution of the trees. This extends the result of Janson. Using the same method, we also show that the $k$-cut number of various random or deterministic trees of logarithmic height converges in probability to a constant after rescaling, such as random split-trees, uniform random recursive trees, and scale-free random trees.
Submitted 16 October, 2020; v1 submitted 5 July, 2019;
originally announced July 2019.
-
A Two-stage Classification Method for High-dimensional Data and Point Clouds
Authors:
Xiaohao Cai,
Raymond Chan,
Xiaoyu Xie,
Tieyong Zeng
Abstract:
High-dimensional data classification is a fundamental task in machine learning and imaging science. In this paper, we propose a two-stage multiphase semi-supervised classification method for classifying high-dimensional data and unstructured point clouds. To begin with, a fuzzy classification method such as the standard support vector machine is used to generate a warm initialization. We then apply a two-stage approach named SaT (smoothing and thresholding) to improve the classification. In the first stage, an unconstrained convex variational model is implemented to purify and smooth the initialization, followed by the second stage, which projects the smoothed partition obtained at stage one to a binary partition. These two stages can be repeated, with the latest result as a new initialization, to keep improving the classification quality. We show that the convex model of the smoothing stage has a unique solution and can be solved by a specifically designed primal-dual algorithm whose convergence is guaranteed. We test our method and compare it with the state-of-the-art methods on several benchmark data sets. The experimental results demonstrate clearly that our method is superior in both classification accuracy and computation speed for high-dimensional data and point clouds.
Submitted 21 May, 2019;
originally announced May 2019.
-
Comparison of semi-Lagrangian discontinuous Galerkin schemes for linear and nonlinear transport simulations
Authors:
Xiaofeng Cai,
Wei Guo,
Jing-Mei Qiu
Abstract:
Transport problems arise across diverse fields of science and engineering. Semi-Lagrangian (SL) discontinuous Galerkin (DG) methods are a class of high order deterministic transport solvers that enjoy advantages of both the SL approach and the DG spatial discretization. In this paper, we review existing SLDG methods to date and compare their numerical performance. In particular, we make a comparison between the splitting and non-splitting SLDG methods for multi-dimensional transport simulations. Through extensive numerical results, we offer a practical guide for choosing optimal SLDG solvers for linear and nonlinear transport simulations.
Submitted 28 March, 2019;
originally announced March 2019.
-
The fluctuations of the giant cluster for percolation on random split trees
Authors:
Gabriel Berzunza,
Xing Shi Cai,
Cecilia Holmgren
Abstract:
A split tree of cardinality $n$ is constructed by distributing $n$ "balls" in a subset of vertices of an infinite tree which encompasses many types of random trees such as $m$-ary search trees, quad trees, median-of-$(2k+1)$ trees, fringe-balanced trees, digital search trees and random simplex trees. In this work, we study Bernoulli bond percolation on arbitrary split trees of large but finite cardinality $n$. We show, for appropriate percolation regimes that depend on the cardinality $n$ of the split tree, that there exists a unique giant cluster, and that the fluctuations of the size of the giant cluster as $n \rightarrow \infty$ are described by an infinitely divisible distribution that belongs to the class of stable Cauchy laws. This work generalizes the results for the random $m$-ary recursive trees in Berzunza (2015). Our approach is based on a remarkable decomposition of the size of the giant percolation cluster as a sum of essentially independent random variables, which may be useful for studying percolation on other trees with logarithmic height; for instance, in this work we also study the case of regular trees.
Submitted 26 May, 2021; v1 submitted 21 February, 2019;
originally announced February 2019.
-
Cutting resilient networks -- complete binary trees
Authors:
Xing Shi Cai,
Cecilia Holmgren
Abstract:
In our previous work, we introduced the random $k$-cut number for rooted graphs. In this paper, we show that the distribution of the $k$-cut number in complete binary trees of size $n$, after rescaling, is asymptotically a periodic function of $\lg n - \lg \lg n$. Thus there are different limit distributions for different subsequences, where these limits are similar to weakly 1-stable distributions. This generalizes the result for the case $k = 1$, i.e., the traditional cutting model, by Janson.
Submitted 11 November, 2019; v1 submitted 14 November, 2018;
originally announced November 2018.
-
Quantifying Uncertainty in High Dimensional Inverse Problems by Convex Optimisation
Authors:
Xiaohao Cai,
Marcelo Pereyra,
Jason D. McEwen
Abstract:
Inverse problems play a key role in modern image/signal processing methods. However, since they are generally ill-conditioned or ill-posed due to lack of observations, their solutions may have significant intrinsic uncertainty. Analysing and quantifying this uncertainty is very challenging, particularly in high-dimensional problems and problems with non-smooth objective functionals (e.g. sparsity-promoting priors). In this article, a series of strategies to visualise this uncertainty are presented, e.g. highest posterior density credible regions, and local credible intervals (cf. error bars) for individual pixels and superpixels. Our methods support non-smooth priors for inverse problems and can be scaled to high-dimensional settings. Moreover, we present strategies to automatically set regularisation parameters so that the proposed uncertainty quantification (UQ) strategies become much easier to use. Also, different kinds of dictionaries (complete and over-complete) are used to represent the image/signal and their performance in the proposed UQ methodology is investigated.
Submitted 5 September, 2019; v1 submitted 4 November, 2018;
originally announced November 2018.
-
The minimal and maximal symmetries for $J$-contractive projections
Authors:
Yuan Li,
Xiaomei Cai,
Jiajia Niu,
Jiaxin Zhang
Abstract:
In this paper, we first characterize the structure of symmetries $J$ such that a projection $P$ is $J$-contractive. Then the minimal and maximal elements of the symmetries $J$ with $P^{\ast}JP\leqslant J$ (or $JP\geqslant0$) are given. Moreover, some formulas between $P_{(2I-P-P^{\ast})^{+}}$ $(P_{(2I-P-P^{\ast})^{-}})$ and $P_{(P+P^{\ast})^-}$ $(P_{(P+P^{\ast})^+})$ are established.
Submitted 17 October, 2018; v1 submitted 12 October, 2018;
originally announced October 2018.