Search | arXiv e-print repository

arXiv:2406.19475 [pdf, other]

Stochastic First-Order Methods with Non-smooth and Non-Euclidean Proximal Terms for Nonconvex High-Dimensional Stochastic Optimization

Authors: Yue Xie, Jiawen Bi, Hongcheng Liu

Abstract: When the nonconvex problem is complicated by stochasticity, the sample complexity of stochastic first-order methods may depend linearly on the problem dimension, which is undesirable for large-scale problems. In this work, we propose dimension-insensitive stochastic first-order methods (DISFOMs) to address nonconvex optimization with an expected-valued objective function. Our algorithms allow for non-Euclidean and non-smooth distance functions as the proximal terms. Under mild assumptions, we show that DISFOM using minibatches to estimate the gradient enjoys sample complexity of $ \mathcal{O} ( (\log d) / ε^4 ) $ to obtain an $ε$-stationary point. Furthermore, we prove that DISFOM employing variance reduction can sharpen this bound to $\mathcal{O} ( (\log d)^{2/3}/ε^{10/3} )$, which perhaps leads to the best-known sample complexity result in terms of $d$. We provide two choices of the non-smooth distance functions, both of which allow for closed-form solutions to the proximal step. Numerical experiments are conducted to illustrate the dimension-insensitive property of the proposed frameworks.

Submitted 27 June, 2024; originally announced June 2024.

MSC Class: 90C06; 90C15; 90C26; 90C30

arXiv:2405.16828 [pdf, other]

Kernel-based optimally weighted conformal prediction intervals

Authors: Jonghyeok Lee, Chen Xu, Yao Xie

Abstract: Conformal prediction has been a popular distribution-free framework for uncertainty quantification. In this paper, we present a novel conformal prediction method for time-series, which we call Kernel-based Optimally Weighted Conformal Prediction Intervals (KOWCPI). Specifically, KOWCPI adapts the classic Reweighted Nadaraya-Watson (RNW) estimator for quantile regression on dependent data and learns optimal data-adaptive weights. Theoretically, we tackle the challenge of establishing a conditional coverage guarantee for non-exchangeable data under strong mixing conditions on the non-conformity scores. We demonstrate the superior performance of KOWCPI on real time-series against state-of-the-art methods, where KOWCPI achieves narrower confidence intervals without losing coverage.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.13940 [pdf, other]

High-dimensional (Group) Adversarial Training in Linear Regression

Authors: Yiling Xie, Xiaoming Huo

Abstract: Adversarial training can achieve robustness against adversarial perturbations and has been widely used in machine learning models. This paper delivers a non-asymptotic consistency analysis of the adversarial training procedure under $\ell_\infty$-perturbation in high-dimensional linear regression. It will be shown that the associated convergence rate of prediction error can achieve the minimax rate up to a logarithmic factor in the high-dimensional linear regression on the class of sparse parameters. Additionally, the group adversarial training procedure is analyzed. Compared with classic adversarial training, it will be proved that the group adversarial training procedure enjoys a better prediction error upper bound under certain group-sparsity patterns.

Submitted 22 May, 2024; originally announced May 2024.
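
For readers unfamiliar with the setup, the following is a minimal sketch of the adversarial squared loss under $\ell_\infty$-perturbation in linear regression. It relies on the standard identity $\max_{\|δ\|_\infty \le ε} (y - (x+δ)^\top β)^2 = (|y - x^\top β| + ε\|β\|_1)^2$, which is not stated in the abstract. The function names and the brute-force check are illustrative assumptions, not the authors' code.

```python
from itertools import product

import numpy as np


def adversarial_squared_loss(beta, X, y, eps):
    """Closed-form worst-case squared loss under l_inf perturbations of size eps.

    Each sample is perturbed independently, so the inner maximum adds
    eps * ||beta||_1 to the absolute residual of every sample.
    """
    residual = np.abs(y - X @ beta)
    return np.mean((residual + eps * np.sum(np.abs(beta))) ** 2)


def brute_force_adversarial_loss(beta, X, y, eps):
    """Same quantity by enumerating the 2^d corners of the l_inf ball.

    The squared residual is convex in the perturbation, so its maximum over
    the ball is attained at a corner. Only practical for small d.
    """
    d = X.shape[1]
    corners = eps * np.array(list(product([-1.0, 1.0], repeat=d)))  # (2^d, d)
    # Perturbed residuals for every sample and every corner: shape (n, 2^d).
    perturbed = (y - X @ beta)[:, None] - (corners @ beta)[None, :]
    return np.mean(np.max(perturbed ** 2, axis=1))
```

The closed form shows why this loss behaves like a regularized regression: the perturbation budget enters only through the $\ell_1$ norm of the coefficient vector.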

arXiv:2405.13430 [pdf, ps, other]

The Unisolvence of Lagrange Interpolation with Symmetric Interpolation Space and Nodes in High Dimension

Authors: Yulin Xie, Yifa Tang

Abstract: High-dimensional Lagrange interpolation plays a pivotal role in finite element methods, where ensuring the unisolvence and symmetry of its interpolation space and nodes set is crucial. In this paper, we leverage group action and group representation theories to precisely delineate the conditions for unisolvence. We establish a necessary condition for unisolvence: the symmetry of the interpolation nodes set is determined by the given interpolation space. Our findings not only contribute to a deeper theoretical understanding but also promise practical benefits by reducing the computational overhead associated with identifying appropriate interpolation nodes.

Submitted 22 May, 2024; originally announced May 2024.

Comments: 18 pages

MSC Class: 65D05

arXiv:2404.18838 [pdf, other]

Accurate adaptive deep learning method for solving elliptic problems

Authors: **gyong Ying, Yaqi Xie, Jiao Li, Hongqiao Wang

Abstract: Deep learning methods are of great importance in solving partial differential equations. In this paper, inspired by the failure-informed idea proposed by Gao et al. (SIAM Journal on Scientific Computing 45(4)(2023)) and as an improvement, a new accurate adaptive deep learning method is proposed for solving elliptic problems, including the interface problems and the convection-dominated problems. Based on the failure probability framework, the piece-wise uniform distribution is used to approximate the optimal proposal distribution and a kernel-based method is proposed for efficient sampling. Together with the improved Levenberg-Marquardt optimization method, the proposed adaptive deep learning method shows great potential in improving solution accuracy. Numerical tests on the elliptic problems without interface conditions, on the elliptic interface problem, and on the convection-dominated problems demonstrate the effectiveness of the proposed method, as it reduces the relative errors by a factor varying from $10^2$ to $10^4$ for different cases.

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.18753 [pdf, ps, other]

Fixers and derangements of finite permutation groups

Authors: Hong Yi Huang, Cai Heng Li, Yi Lin Xie

Abstract: Let $G\leqslant\mathrm{Sym}(Ω)$ be a finite transitive permutation group with point stabiliser $H$. We say that a subgroup $K$ of $G$ is a fixer if every element of $K$ has fixed points, and we say that $K$ is large if $|K| \geqslant |H|$. There is a special interest in studying large fixers due to connections with Erdős-Ko-Rado type problems. In this paper, we classify up to conjugacy the large fixers of the almost simple primitive groups with socle $\mathrm{PSL}_2(q)$, and we use this result to verify a special case of a conjecture of Spiga on permutation characters. We also present some results on large fixers of almost simple primitive groups with socle an alternating or sporadic group.

Submitted 29 April, 2024; originally announced April 2024.

Comments: 40 pages

arXiv:2404.12727 [pdf, ps, other]

Characterizations of open and semi-open maps of compact Hausdorff spaces by induced maps

Authors: ** Dai, Yuxun Xie

Abstract: Let $f\colon X\rightarrow Y$ be a continuous surjection of compact Hausdorff spaces. By $$f_*\colon\mathfrak{M}(X)\rightarrow\mathfrak{M}(Y),\ μ\mapsto μ\circ f^{-1} \quad{\rm and}\quad 2^f\colon2^X\rightarrow2^Y,\ A\mapsto f[A]$$ we denote the induced continuous surjections on the probability measure spaces and hyperspaces, respectively. In this paper we mainly show the following facts: (1) If $f_*$ is semi-open, then $f$ is semi-open. (2) If $f$ is semi-open densely open, then $f_*$ is semi-open densely open. (3) $f$ is open iff $2^f$ is open. (4) $f$ is semi-open iff $2^f$ is semi-open. (5) $f$ is irreducible iff $2^f$ is irreducible.

Submitted 29 April, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

Comments: 9 pages; Topology and its Applications (in press)

MSC Class: 37B05; 54B20

arXiv:2404.09276 [pdf, other]

Algorithm xxx: Faster Randomized SVD with Dynamic Shifts

Authors: Xu Feng, Wenjian Yu, Yuyang Xie, Jie Tang

Abstract: Aiming to provide a faster and more convenient truncated SVD algorithm for large sparse matrices from real applications (i.e. for computing a few of the largest singular values and the corresponding singular vectors), a dynamically shifted power iteration technique is applied to improve the accuracy of the randomized SVD method. This results in a dynamic-shifts-based randomized SVD (dashSVD) algorithm, which also incorporates techniques for handling sparse matrices. An accuracy-control mechanism is included in the dashSVD algorithm to approximately monitor the per-vector error bound of computed singular vectors with negligible overhead. Experiments on real-world data validate that the dashSVD algorithm largely improves the accuracy of the randomized SVD algorithm or attains the same accuracy with fewer passes over the matrix, and provides an efficient accuracy-control mechanism for the randomized SVD computation, while demonstrating advantages in runtime and parallel efficiency. A bound on the approximation error of the randomized SVD with the shifted power iteration is also proved.

Submitted 14 April, 2024; originally announced April 2024.

Comments: 26 pages, accepted by ACM Transactions on Mathematical Software
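
As background for this entry, here is a minimal sketch of the baseline that the abstract builds on: randomized SVD with plain (unshifted) power iteration. It is not the dashSVD algorithm; the dynamic shifts, sparse-matrix handling, and accuracy control described in the abstract are omitted. The function name and the parameters `oversample` and `n_power_iter` are illustrative assumptions.

```python
import numpy as np


def randomized_svd(A, k, oversample=10, n_power_iter=2, seed=0):
    """Approximate the top-k singular triplets of A by random range finding.

    Plain power iteration with QR re-orthonormalization after each product;
    no shifts are applied (unlike the shifted scheme the abstract describes).
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    l = min(k + oversample, m, n)          # sketch size, capped by the matrix size

    # Orthonormal basis for the range of A applied to a Gaussian test matrix.
    Q, _ = np.linalg.qr(A @ rng.standard_normal((n, l)))

    # Each pass multiplies by A^T and then A, sharpening the captured subspace.
    for _ in range(n_power_iter):
        Q, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q)

    # Project A onto the basis and take a small dense SVD.
    B = Q.T @ A                            # shape (l, n)
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k]
```

Each power-iteration pass costs two more products with the matrix, which is the accuracy-versus-passes trade-off the abstract refers to.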

arXiv:2403.17783 [pdf, ps, other]

Intersecting subsets in finite permutation groups

Authors: CaiHeng Li, Venkata Raghu Tej Pantangi, Shujiao Song, Yilin Xie

Abstract: A subset (subgroup) $S$ of a transitive permutation group $G\leq Sym(Ω)$ is called an intersecting subset (subgroup, resp.) if the ratio $xy^{-1}$ of any elements $x,y\in S$ fixes some point. A transitive group is said to have the EKR property if the size of each intersecting subset is at most the order of the point stabilizer. A nice result of Meagher-Spiga-Tiep (2016) says that 2-transitive permutation groups have the EKR property. In this paper, we systematically study intersecting subsets in more general transitive permutation groups, including primitive (quasiprimitive) groups, rank-3 groups, Suzuki groups, and some special solvable groups. We present new families of groups that have the EKR property, and various families of groups that do not have the EKR property. This paper significantly improves the unpublished version of this paper, and particularly solves Problem 1.4 of it.

Submitted 26 March, 2024; originally announced March 2024.

Comments: 21 pages, 2 figures

MSC Class: 05E18

arXiv:2403.14822 [pdf, other]

Non-Convex Robust Hypothesis Testing using Sinkhorn Uncertainty Sets

Authors: Jie Wang, Rui Gao, Yao Xie

Abstract: We present a new framework to address the non-convex robust hypothesis testing problem, wherein the goal is to seek the optimal detector that minimizes the maximum of worst-case type-I and type-II risk functions. The distributional uncertainty sets are constructed to center around the empirical distribution derived from samples based on Sinkhorn discrepancy. Given that the objective involves non-convex, non-smooth probabilistic functions that are often intractable to optimize, existing methods resort to approximations rather than exact solutions. To tackle the challenge, we introduce an exact mixed-integer exponential conic reformulation of the problem, which can be solved to global optimality with a moderate amount of input data. Subsequently, we propose a convex approximation, demonstrating its superiority over current state-of-the-art methodologies in the literature. Furthermore, we establish connections between robust hypothesis testing and regularized formulations of non-robust risk functions, offering insightful interpretations. Our numerical study highlights the satisfactory testing performance and computational efficiency of the proposed framework.

Submitted 21 March, 2024; originally announced March 2024.

Comments: 26 pages, 2 figures

arXiv:2402.17294 [pdf, ps, other]

Advancing Continuous Distribution Generation: An Exponentiated Odds Ratio Generator Approach

Authors: Xinyu Chen, Yuanqi Xie, Achraf Cohen, Shusen Pu

Abstract: This paper presents a new methodology for generating continuous statistical distributions, integrating the exponentiated odds ratio within the framework of survival analysis. This new method enhances the flexibility and adaptability of distribution models to effectively address the complexities inherent in contemporary datasets. The core of this advancement is illustrated by introducing a particular subfamily, the "Type-2 Gumbel Weibull-G Family of Distributions." We provide a comprehensive analysis of the mathematical properties of these distributions, encompassing statistical properties such as density functions, moments, hazard rate and quantile functions, Rényi entropy, order statistics, and the concept of stochastic ordering. To establish the robustness of our approach, we apply five distinct methods for parameter estimation. The practical applicability of the Type-2 Gumbel Weibull-G distributions is further supported through the analysis of three real-world datasets. These empirical applications illustrate the exceptional statistical precision of our distributions compared to existing models, thereby reinforcing their significant value in both theoretical and practical statistical applications.

Submitted 27 February, 2024; originally announced February 2024.

MSC Class: 62E99; 60E05

arXiv:2402.09661 [pdf, ps, other]

Asymptotic stability for $n$-dimensional isentropic compressible MHD equations without magnetic diffusion

Authors: Quansen Jiu, Jitao Liu, Yaowei Xie

Abstract: Whether the global well-posedness of strong solutions of $n$-dimensional compressible isentropic magnetohydrodynamic (MHD for short) equations without magnetic diffusion holds true or not remains a challenging open problem, even for small initial data. In recent years, starting from the pioneering work by Wu and Wu [Adv. Math. 310 (2017), 759--888], much more attention has been paid to the system when the magnetic field is near an equilibrium state (the background magnetic field for short). In particular, when the background magnetic field satisfies the Diophantine condition (see (1.3) for details), Wu and Zhai [Math. Models Methods Appl. Sci. 33 (2023), no. 13, 2629--2656] established the decay estimates and asymptotic stability for smooth solutions of the 3D compressible isentropic MHD system without magnetic diffusion in $H^{4r+7}(\mathbb{T}^3)$ with $r>2$ by exploiting a wave structure. In this paper, a new dissipative mechanism is discovered and applied so that we can improve the spaces in which Wu and Zhai obtained the decay estimates and asymptotic stability of solutions. More precisely, we establish the decay estimates of solutions in $H^{r+1}(\mathbb{T}^n)$ and an asymptotic stability result in $H^{\left(3r+3\right)^+}(\mathbb{T}^n)$ for the periodic domain $\mathbb{T}^n$ of any dimension $n\geq 2$ and $r>n-1$.
Our results provide an approach for establishing the decay estimates and asymptotic stability in Sobolev spaces with much lower regularity and uniformly in the dimension, which can be used to study many other related models such as the compressible non-isentropic MHD system without magnetic diffusion.

Submitted 16 February, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

Comments: 39 pages

arXiv:2402.08913 [pdf, ps, other]

Sharp decay estimates and asymptotic stability for incompressible MHD equations without viscosity or magnetic diffusion

Authors: Yaowei Xie, Quansen Jiu, Jitao Liu

Abstract: Whether the global existence and uniqueness of strong solutions of $n$-dimensional incompressible magnetohydrodynamic (MHD for short) equations with only kinematic viscosity or magnetic diffusion holds true or not remains an outstanding open problem. In recent years, more attention has been paid to the case when the magnetic field is close to an equilibrium state (the background magnetic field for short). Specifically, when the background magnetic field satisfies the Diophantine condition (see (1.2) for details), Chen, Zhang and Zhou [Sci. China Math. 41 (2022), pp.1-10] first studied the perturbation system and established the decay estimates and stability of its solutions in the 3D periodic domain $\mathbb{T}^3$, which was then improved to $H^{(3+2β)r+5+(α+2β)}(\mathbb{T}^2)$ for the 2D periodic domain $\mathbb{T}^2$ and any $α>0$, $β>0$ by Zhai [J. Differ. Equ. 374 (2023), pp.267-278]. In this paper, we seek to find the optimal decay estimates and improve the space where the global stability takes place. By exploring and fully utilizing the structure of the perturbation system, we discover a new dissipative mechanism, which enables us to establish the decay estimates in Sobolev spaces with much lower regularity. Based on this discovery, we greatly reduce the initial regularity requirement of the aforementioned two works from $H^{4r+7}(\mathbb{T}^3)$ and $H^{(3+2β)r+5+(α+2β)}(\mathbb{T}^2)$ to $H^{(3r+3)^+}(\mathbb{T}^n)$ for $r>n-1$ when $n=3$ and $n=2$ respectively.
Additionally, this paper is the first to present the linear stability result via the method of spectral analysis, from which the decay estimates obtained for the nonlinear system can be seen to be sharp, in the sense that they are in line with those for the linearized system.

Submitted 16 February, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

Comments: 24 pages

arXiv:2401.16043 [pdf, ps, other]

A new interpretation of Jimbo's formula for Painlevé VI

Authors: Zikang Wang, Yuancheng Xie, Xiaomeng Xu

Abstract: In this paper, we first give a new interpretation of Jimbo's boundary condition for the generic Painlevé VI transcendents, as the shrinking phenomenon in long time behaviour of the Jimbo-Miwa-Mori-Sato equation with rank $n=3$. We then interpret Jimbo's monodromy formula from the viewpoint of the isomonodromy deformation with respect to irregular singularities.

Submitted 9 March, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

Comments: 22 pages

arXiv:2401.15262 [pdf, other]

Asymptotic Behavior of Adversarial Training Estimator under $\ell_\infty$-Perturbation

Authors: Yiling Xie, Xiaoming Huo

Abstract: Adversarial training has been proposed to hedge against adversarial attacks in machine learning and statistical models. This paper focuses on adversarial training under $\ell_\infty$-perturbation, which has recently attracted much research attention. The asymptotic behavior of the adversarial training estimator is investigated in the generalized linear model. The results imply that the limiting distribution of the adversarial training estimator under $\ell_\infty$-perturbation could put a positive probability mass at $0$ when the true parameter is $0$, providing a theoretical guarantee of the associated sparsity-recovery ability. Alternatively, a two-step procedure is proposed -- adaptive adversarial training, which could further improve the performance of adversarial training under $\ell_\infty$-perturbation. Specifically, the proposed procedure could achieve asymptotic unbiasedness and variable-selection consistency. Numerical experiments are conducted to show the sparsity-recovery ability of adversarial training under $\ell_\infty$-perturbation and to compare the empirical performance between classic adversarial training and adaptive adversarial training.

Submitted 26 January, 2024; originally announced January 2024.

arXiv:2401.13466 [pdf, ps, other]

Characterizations of umbilical hypersurfaces by partially overdetermined problems in space forms

Authors: Yangsen Xie

Abstract: In this paper, we characterize the rigidity of umbilical hypersurfaces by a Serrin-type partially overdetermined problem in space forms, which generalizes the similar results in the Euclidean half-space and the Euclidean half-ball. Guo-Xia first obtained these rigidity results when the Robin boundary condition on the support hypersurface is homogeneous, in which case the target umbilical hypersurface meets the support at an orthogonal contact angle. In this paper, however, we can obtain any contact angle $θ\in (0,π)$ by changing the Robin boundary condition to be inhomogeneous.

Submitted 24 January, 2024; originally announced January 2024.

arXiv:2312.11093 [pdf, other]

MGCNN: a learnable multigrid solver for sparse linear systems from PDEs on structured grids

Authors: Yan Xie, Minrui Lv, Chensong Zhang

Abstract: This paper presents a learnable solver tailored to iteratively solve sparse linear systems from discretized partial differential equations (PDEs). Unlike traditional approaches relying on specialized expertise, our solver streamlines the algorithm design process for a class of PDEs through training, which requires only training data of coefficient distributions. The proposed method is anchored by three core principles: (1) a multilevel hierarchy to promote rapid convergence, (2) adherence to linearity concerning the right-hand-side of equations, and (3) weights sharing across different levels to facilitate adaptability to various problem sizes. Built on these foundational principles and considering the similar computation pattern of the convolutional neural network (CNN) as multigrid components, we introduce a network adept at solving linear systems from PDEs with heterogeneous coefficients, discretized on structured grids. Notably, our proposed solver possesses the ability to generalize over right-hand-side terms, PDE coefficients, and grid sizes, thereby ensuring its training is purely offline. To evaluate its effectiveness, we train the solver on convection-diffusion equations featuring heterogeneous diffusion coefficients. The solver exhibits swift convergence to high accuracy over a range of grid sizes, extending from $31 \times 31$ to $4095 \times 4095$. Remarkably, our method outperforms the classical Geometric Multigrid (GMG) solver, demonstrating a speedup of approximately 3 to 8 times.
Furthermore, our numerical investigation into the solver's capacity to generalize to untrained coefficient distributions reveals promising outcomes.

Submitted 9 May, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

Comments: 24 pages, 11 figures

MSC Class: 65F10; 65N55; 65N22

arXiv:2312.06214 [pdf, ps, other]

Duplex Hecke Algebras of type B

Authors: Yu Xie, An Zhang, Bin Shu

Abstract: As a sequel to [14], in this article we first introduce the so-called duplex Hecke algebra of type B, which is a Q(q)-algebra associated with the Weyl group W(B) of type B and the symmetric groups S_l for l = 0, 1, ..., m, satisfying some Hecke relations. This notion originates from the degenerate duplex Hecke algebra arising in the course of the study of a kind of Schur-Weyl duality of Levi-type, extending the duplex Hecke algebra of type A arising from the related q-Schur-Weyl duality of Levi-type. A duplex Hecke algebra of type B admits natural representations on certain tensor spaces. We then establish a Levi-type q-Schur-Weyl duality of type B, which reveals the double centralizer property between such duplex Hecke algebras and ıquantum groups studied by Bao-Wang in [1].

Submitted 12 December, 2023; v1 submitted 11 December, 2023; originally announced December 2023.

Comments: 17 pages. To appear in Journal of Algebra and its Applications

MSC Class: 20G05; 17B20; 17B45; 17B50

arXiv:2311.15491 [pdf, ps, other]

The Cowen-Douglas operators with strongly flag structure

Authors: Yufang Xie, Kui Ji

Abstract: Denote by $\mathcal{FB}_{n}(Ω)$ the collection of operators possessing a flag structure in the Cowen-Douglas class $\mathcal{B}_{n}(Ω)$; all the irreducible homogeneous operators in $\mathcal{B}_{n}(Ω)$ belong to this class. G. Misra et al. pointed out in \cite{JJKM} that the unitary invariants of this class of operators include the curvature and the second fundamental form of the corresponding line bundle. In terms of the invariants, it is more tractable compared to general operators in $\mathcal{B}_{n}(Ω)$. A subclass of $\mathcal{FB}_{n}(Ω)$, denoted by $\mathcal{CFB}_{n}(Ω)$, was proven to be norm dense in $\mathcal{B}_{n}(Ω)$ in \cite{JJ}. In this paper, we introduce a smaller subclass of $\mathcal{FB}_{n}(Ω)$ which possesses a strongly flag structure, and for which the curvature and the second fundamental form of the associated line bundle form a complete set of unitary invariants. We also show that this class of operators is norm dense in $\mathcal{B}_{n}(Ω)$ up to similarity. On this basis, we complete the similarity classification of a large class of operators with flag structure, which reduces the number of the similarity invariants in \cite{JKSX} from $\frac{n(n-1)}{2}+1$ to $n$. Furthermore, we also obtain a complete characterization of weakly homogeneous operators with high index and flag structure.

Submitted 26 November, 2023; originally announced November 2023.

Comments: 22 pages

arXiv:2310.18535 [pdf, other]

Contextual Stochastic Bilevel Optimization

Authors: Yifan Hu, Jie Wang, Yao Xie, Andreas Krause, Daniel Kuhn

Abstract: We introduce contextual stochastic bilevel optimization (CSBO) -- a stochastic bilevel optimization framework with the lower-level problem minimizing an expectation conditioned on some contextual information and the upper-level decision variable. This framework extends classical stochastic bilevel optimization when the lower-level decision maker responds optimally not only to the decision of the upper-level decision maker but also to some side information and when there are multiple or even infinitely many followers. It captures important applications such as meta-learning, personalized federated learning, end-to-end learning, and Wasserstein distributionally robust optimization with side information (WDRO-SI). Due to the presence of contextual information, existing single-loop methods for classical stochastic bilevel optimization are unable to converge. To overcome this challenge, we introduce an efficient double-loop gradient method based on the Multilevel Monte-Carlo (MLMC) technique and establish its sample and computational complexities. When specialized to stochastic nonconvex optimization, our method matches existing lower bounds. For meta-learning, the complexity of our method does not depend on the number of tasks. Numerical experiments further validate our theoretical results.

Submitted 27 October, 2023; originally announced October 2023.

Comments: The paper is accepted by NeurIPS 2023

arXiv:2310.17582 [pdf, other]

Convergence of flow-based generative models via proximal gradient descent in Wasserstein space

Authors: Xiuyuan Cheng, Jianfeng Lu, Yixin Tan, Yao Xie

Abstract: Flow-based generative models enjoy certain advantages in computing the data generation and the likelihood, and have recently shown competitive empirical performance. Compared to the accumulating theoretical studies on related score-based diffusion models, analysis of flow-based models, which are deterministic in both forward (data-to-noise) and reverse (noise-to-data) directions, remains sparse. In this paper, we provide a theoretical guarantee of generating data distribution by a progressive flow model, the so-called JKO flow model, which implements the Jordan-Kinderlehrer-Otto (JKO) scheme in a normalizing flow network. Leveraging the exponential convergence of the proximal gradient descent (GD) in Wasserstein space, we prove the Kullback-Leibler (KL) guarantee of data generation by a JKO flow model to be $O(\varepsilon^2)$ when using $N \lesssim \log (1/\varepsilon)$ many JKO steps ($N$ Residual Blocks in the flow) where $\varepsilon $ is the error in the per-step first-order condition. The assumption on data density is merely a finite second moment, and the theory extends to data distributions without density and when there are inversion errors in the reverse process where we obtain KL-$W_2$ mixed error guarantees. The non-asymptotic convergence rate of the JKO-type $W_2$-proximal GD is proved for a general class of convex objective functionals that includes the KL divergence as a special case, which can be of independent interest. The analysis framework can extend to other first-order Wasserstein optimization schemes applied to flow-based generative models.

Submitted 16 May, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

arXiv:2310.14156 [pdf, other]

Configuration space integrals and formal smooth structures

Authors: Jianfeng Lin, Yi Xie

Abstract: Watanabe disproved the 4-dimensional Smale conjecture by constructing topologically trivial $D^{4}$-bundles over spheres and showing that they are smoothly nontrivial using configuration space integrals. In this paper, we define a new version of configuration space integrals that only relies on a formal smooth structure on the $D^{4}$-bundle (i.e., a vector bundle structure on the vertical tangent microbundle). It coincides with Watanabe's definition when the $D^{4}$-bundle is smooth. We obtain several applications. First, we give a lower bound (in terms of the graph homology) on the dimension of the rational homotopy and homology groups of $\textrm{Top}(4)$ and $\textrm{Homeo}(S^4)$ (the homeomorphism group of $\mathbb{R}^4$ and $S^4$). In particular, this implies that $\textrm{Top}(4)$ and $\textrm{Homeo}(S^4)$ are not rationally equivalent to any finite-dimensional CW complexes. Second, we discover a generalized Miller-Morita-Mumford class $κ_θ(π)\in H^{3}(B;\mathbf{Q})$, which is defined for any topological 4-manifold bundle $X\to E\to B$. This class obstructs the existence of a formal smooth structure on the bundle. Third, we show that for any compact, orientable, smooth 4-manifold $X$ (possibly with boundary), the inclusion map from its diffeomorphism group to its homeomorphism group is not rationally $2$-connected (hence not a weak homotopy equivalence). This implies that the space of smooth structures on $X$ has a nontrivial rational homotopy group in dimension 2.

Submitted 21 October, 2023; originally announced October 2023.

Comments: 79 pages, comments welcome

arXiv:2310.08218 [pdf, other]

Convergence of Arbitrary Lagrangian-Eulerian Second-order Projection Method for the Stokes Equations on an Evolving Domain

Authors: Qiqi Rao, Jilu Wang, Yupei Xie

Abstract: The numerical solution of the Stokes equations on an evolving domain with a moving boundary is studied based on the arbitrary Lagrangian-Eulerian finite element method and a second-order projection method along the trajectories of the evolving mesh for decoupling the unknown solutions of velocity and pressure. The error of the semidiscrete arbitrary Lagrangian-Eulerian method is shown to be $O(h^{r+1})$ for the Taylor--Hood finite elements of degree $r\ge 2$, using Nitsche's duality argument adapted to an evolving mesh, by proving that the material derivative and the Stokes--Ritz projection commute up to terms which have optimal-order convergence in the $L^2$ norm. Additionally, the error of the fully discrete finite element method, with a second-order projection method along the trajectories of the evolving mesh, is shown to be $O(\ln(1/τ+1)τ^{2}+\ln(1/h+1)h^{r+1})$ in the discrete $L^\infty(0,T; L^2)$ norm using newly developed energy techniques and backward parabolic duality arguments that are applicable to the Stokes equations with an evolving mesh. To maintain consistency between the notations of the numerical scheme in a moving domain and those in a fixed domain, we introduce the equivalence class of finite element spaces across time levels. Numerical examples are provided to support the theoretical analysis and to illustrate the performance of the method in simulating Navier--Stokes flow in a domain with a rotating propeller.

Submitted 12 October, 2023; originally announced October 2023.

MSC Class: 65M15; 65M60; 76D07

arXiv:2310.06140 [pdf, other]

NP-Hardness of Tensor Network Contraction Ordering

Authors: Jianyu Xu, Hanwen Zhang, Ling Liang, Lei Deng, Yuan Xie, Guoqi Li

Abstract: We study the optimal order (or sequence) of contracting a tensor network with a minimal computational cost. We identify two versions of this optimal sequence: one that minimizes the operation number (OMS) and one that minimizes the time complexity (CMS). Existing results show only that OMS is NP-hard, with no conclusion on the CMS problem. In this work, we first reduce CMS to CMS-0, which is a sub-problem of CMS with no free indices. Then we prove that CMS is easier than OMS, both in general and in tree cases. Last but not least, we prove that CMS is still NP-hard. Based on our results, we have built up relationships of hardness of different tensor network contraction problems.

Submitted 9 October, 2023; originally announced October 2023.

Comments: Jianyu Xu and Hanwen Zhang are equal contributors. 10 pages (reference and appendix excluded), 20 pages in total, 6 figures

MSC Class: 05C35; 05C76 ACM Class: F.2.2

arXiv:2309.11136 [pdf, ps, other]

On Derived Categories of Generalized Grassmannian Flips

Authors: Naichung Conan Leung, Ying Xie

Abstract: In this paper, we construct and classify a new family of flips, called generalized Grassmannian flips, by generalizing the construction of standard flips for $\mathbb{P}^m\times \mathbb{P}^n$ to any generalized Grassmannian $G/P$, where $P$ is a maximal parabolic subgroup of a complex semi-simple algebraic group. In addition, we show that a 9-fold generalized Grassmannian flip for $Sp(6, \mathbb{C})$ satisfies the DK flip conjecture by Bondal-Orlov and Kawamata via mutation techniques by Kuznetsov and Thomas' chess game method.

Submitted 20 September, 2023; originally announced September 2023.

Comments: To appear in Mathematical Research Letters

arXiv:2309.09529 [pdf, other]

Proof-of-Prospect-Theory: A Novel Game-based Consensus Mechanism for Blockchain

Authors: Yuqi Xie, Changbing Tang, Feilong Lin, Guanrong Chen, Zhao Zhang, Zhonglong Zheng

Abstract: Blockchain technology is a breakthrough in changing the ways of business and organization operations, in which the consensus problem is challenging with practical constraints, such as computational power and consensus standard. In this paper, a novel consensus mechanism named Proof-of-Prospect-Theory (PoPT) is designed from the view of game theory, where the game prospect value is considered as an important election criterion of the block-recorder. PoPT portrays the popularity of a node in the network as an attribute, which is constituted by the subjective sensibilities of nodes. Furthermore, the performances of the PoPT and the willingness of ordinary nodes to participate in the consensus are analyzed, exploring fairness, decentralization, credibility, and the motivating ability of the consensus mechanism. Finally, numerical simulations with optimization of the PoPT consensus mechanism are demonstrated in the scenario of a smart grid system to illustrate the effectiveness of the PoPT.

Submitted 18 September, 2023; originally announced September 2023.

arXiv:2307.10539 [pdf, ps, other]

Induced log-concavity of equivariant matroid invariants

Authors: Alice L. L. Gao, Ethan Y. H. Li, Matthew H. Y. Xie, Arthur L. B. Yang, Zhong-Xue Zhang

Abstract: Inspired by the notion of equivariant log-concavity, we introduce the concept of induced log-concavity for a sequence of representations of a finite group. For an equivariant matroid equipped with a symmetric group action or a finite general linear group action, we transform the problem of proving the induced log-concavity of matroid invariants to that of proving the Schur positivity of symmetric functions. We prove the induced log-concavity of the equivariant Kazhdan-Lusztig polynomials of $q$-niform matroids equipped with the action of a finite general linear group, as well as that of the equivariant Kazhdan-Lusztig polynomials of uniform matroids equipped with the action of a symmetric group. As a consequence of the former, we obtain the log-concavity of Kazhdan-Lusztig polynomials of $q$-niform matroids, thus providing further positive evidence for Elias, Proudfoot and Wakefield's log-concavity conjecture on the matroid Kazhdan-Lusztig polynomials. From the latter we obtain the log-concavity of Kazhdan-Lusztig polynomials of uniform matroids, which was recently proved by Xie and Zhang by using a computer algebra approach. We also establish the induced log-concavity of the equivariant characteristic polynomials and the equivariant inverse Kazhdan-Lusztig polynomials for $q$-niform matroids and uniform matroids.

Submitted 19 July, 2023; originally announced July 2023.

Comments: 36 pages

MSC Class: 05B35; 05E05; 20C30

arXiv:2307.08845 [pdf, ps, other]

Ring structures in singular instanton homology

Authors: Yi Xie, Boyu Zhang

Abstract: We calculate the ring structure of the singular instanton Floer homology of $(S^1\times Σ, S^1\times \{p_1,\dots,p_n\})$ with C-coefficients, where $Σ$ is a closed oriented surface. As an application, we prove an excision formula for singular instanton homology when n=1. This settles the last unknown case of excision formula for instanton Floer homology.

Submitted 17 July, 2023; originally announced July 2023.

Comments: 53 pages

MSC Class: 57R58; 14H60

arXiv:2307.07898 [pdf, other]

A Graph-Prediction-Based Approach for Debiasing Underreported Data

Authors: Hanyang Jiang, Yao Xie

Abstract: We present a novel Graph-based debiasing Algorithm for Underreported Data (GRAUD) aiming at an efficient joint estimation of event counts and discovery probabilities across spatial or graphical structures. This innovative method provides a solution to problems seen in fields such as policing data and COVID-19 data analysis. Our approach avoids the need for strong priors typically associated with Bayesian frameworks. By leveraging the graph structures on unknown variables $n$ and $p$, our method debiases the underreported data and estimates the discovery probability at the same time. We validate the effectiveness of our method through simulation experiments and illustrate its practicality in one real-world application: police 911 calls-to-service data.

Submitted 19 April, 2024; v1 submitted 15 July, 2023; originally announced July 2023.

arXiv:2307.07622 [pdf, ps, other]

Limiting distributions for RWCRE in the sub-ballistic regime and in the critical Gaussian regime

Authors: Conrado da Costa, Jonathon Peterson, Yongjia Xie

Abstract: Random Walks in Cooling Random Environments (RWCRE) is a model of random walks in dynamic random environments where the environment is frozen between a fixed sequence of times (called the cooling map) where it is resampled. Naturally the limiting distributions for this model depend both on the structure of the cooling sequence and on the distribution $μ$ from which the environments are sampled. Previous results have considered the cases where $μ$ is such that the corresponding model of random walks in a fixed random environment (RWRE) is either (1) recurrent, (2) has a Gaussian limit with diffusive scaling (the $κ> 2$ case), or (3) has positive speed and a stable, non-Gaussian limit (the $κ\in (1,2)$ case). In this paper we examine the limiting distributions in two other transient regimes: the sub-ballistic, non-stable regime (i.e., $κ\in (0,1)$), and the Gaussian regime with non-diffusive scaling (i.e., $κ= 2$). In the first case we show that the limiting distributions are either Gaussian or a mixture of Gaussian and independent sums of Mittag-Leffler random variables, while in the second case the limiting distributions are always Gaussian but with a scaling that differs from the standard deviation by a factor (which can oscillate, but which remains confined to some interval $[β,1]$) that depends very delicately on the properties of the cooling map.

Submitted 14 July, 2023; originally announced July 2023.

Comments: 34 pages

MSC Class: 60K37

arXiv:2306.05248 [pdf, other]

Optimal $L^2$ error analysis of a loosely coupled finite element scheme for thin-structure interactions

Authors: Buyang Li, Weiwei Sun, Yupei Xie, Wenshan Yu

Abstract: Finite element methods and kinematically coupled schemes that decouple the fluid velocity and structure displacement have been extensively studied for incompressible fluid-structure interaction (FSI) over the past decade. While these methods are known to be stable and easy to implement, optimal error analysis has remained challenging. Previous work has primarily relied on the classical elliptic projection technique, which is only suitable for parabolic problems and does not lead to optimal convergence of numerical solutions to the FSI problems in the standard $L^2$ norm. In this article, we propose a new stable fully discrete kinematically coupled scheme for incompressible FSI thin-structure model and establish a new approach for the numerical analysis of FSI problems in terms of a newly introduced coupled non-stationary Ritz projection, which allows us to prove the optimal-order convergence of the proposed method in the $L^2$ norm. The methodology presented in this article is also applicable to numerous other FSI models and serves as a fundamental tool for advancing research in this field.

Submitted 12 December, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

arXiv:2305.09978 [pdf, other]

Stochastic Ratios Tracking Algorithm for Large Scale Machine Learning Problems

Authors: Shigeng Sun, Yuchen Xie

Abstract: Many machine learning applications and tasks rely on the stochastic gradient descent (SGD) algorithm and its variants. Effective step length selection is crucial for the success of these algorithms, which has motivated the development of algorithms such as ADAM or AdaGrad. In this paper, we propose a novel algorithm for adaptive step length selection in the classical SGD framework, which can be readily adapted to other stochastic algorithms. Our proposed algorithm is inspired by traditional nonlinear optimization techniques and is supported by analytical findings. We show that under reasonable conditions, the algorithm produces step lengths in line with well-established theoretical requirements, and generates iterates that converge to a stationary neighborhood of a solution in expectation. We test the proposed algorithm on logistic regressions and deep neural networks and demonstrate that the algorithm can generate step lengths comparable to the best step length obtained from manual tuning.

Submitted 17 May, 2023; originally announced May 2023.

arXiv:2305.09126 [pdf, other]

Transfer Learning for Causal Effect Estimation

Authors: Song Wei, Hanyu Zhang, Ronald Moore, Rishikesan Kamaleswaran, Yao Xie

Abstract: We present a Transfer Causal Learning (TCL) framework when target and source domains share the same covariate/feature spaces, aiming to improve causal effect estimation accuracy in limited data. Limited data is very common in medical applications, where some rare medical conditions, such as sepsis, are of interest. Our proposed method, named \texttt{$\ell_1$-TCL}, incorporates $\ell_1$ regularized TL for nuisance models (e.g., propensity score model); the TL estimator of the nuisance parameters is plugged into downstream average causal/treatment effect estimators (e.g., inverse probability weighted estimator). We establish non-asymptotic recovery guarantees for the \texttt{$\ell_1$-TCL} with generalized linear model (GLM) under the sparsity assumption in the high-dimensional setting, and demonstrate the empirical benefits of \texttt{$\ell_1$-TCL} through extensive numerical simulation for GLM and recent neural network nuisance models. Our method is subsequently extended to real data and generates meaningful insights consistent with medical literature, a case where all baseline methods fail.

Submitted 1 January, 2024; v1 submitted 15 May, 2023; originally announced May 2023.

Comments: Preliminary version, titled "Transfer causal learning: Causal effect estimation with knowledge transfer", has been presented in ICML 3rd Workshop on Interpretable Machine Learning in Healthcare (IMLH), 2023; see the arXiv version in v2

arXiv:2305.05915 [pdf, ps, other]

A synchronization-capturing multi-scale solver to the noisy integrate-and-fire neuron networks

Authors: Ziyu Du, Yantong Xie, Zhennan Zhou

Abstract: The noisy leaky integrate-and-fire (NLIF) model describes the voltage configurations of neuron networks with an interacting many-particles system at a microscopic level. When simulating neuron networks of large sizes, computing a coarse-grained mean-field Fokker-Planck equation solving the voltage densities of the networks at a macroscopic level practically serves as a feasible alternative in its high efficiency and credible accuracy. However, the macroscopic model fails to yield valid results of the networks when simulating considerably synchronous networks with active firing events. In this paper, we propose a multi-scale solver for the NLIF networks, which inherits the low cost of the macroscopic solver and the high reliability of the microscopic solver. For each temporal step, the multi-scale solver uses the macroscopic solver when the firing rate of the simulated network is low, while it switches to the microscopic solver when the firing rate tends to blow up. Moreover, the macroscopic and microscopic solvers are integrated with a high-precision switching algorithm to ensure the accuracy of the multi-scale solver. The validity of the multi-scale solver is analyzed from two perspectives: firstly, we provide practically sufficient conditions that guarantee the mean-field approximation of the macroscopic model and present rigorous numerical analysis on simulation errors when coupling the two solvers; secondly, the numerical performance of the multi-scale solver is validated through simulating several large neuron networks, including networks with either instantaneous or periodic input currents which prompt active firing events over a period of time.

Submitted 10 May, 2023; originally announced May 2023.

Comments: 25 Pages, 18 Figures

arXiv:2304.13793 [pdf, other]

Generalized generalized linear models: Convex estimation and online bounds

Authors: Anatoli Juditsky, Arkadi Nemirovski, Yao Xie, Chen Xu

Abstract: We introduce a new computational framework for estimating parameters in generalized generalized linear models (GGLM), a class of models that extends the popular generalized linear models (GLM) to account for dependencies among observations in spatio-temporal data. The proposed approach uses a monotone operator-based variational inequality method to overcome non-convexity in parameter estimation and provide guarantees for parameter recovery. The results can be applied to GLM and GGLM, focusing on spatio-temporal models. We also present online instance-based bounds using martingale concentration inequalities. Finally, we demonstrate the performance of the algorithm using numerical simulations and a real data example for wildfire incidents.

Submitted 26 April, 2023; originally announced April 2023.

arXiv:2304.04969 [pdf, ps, other]

Poisson Equation and Application to Multi-Scale SDEs with State-Dependent Switching

Authors: Xiaobin Sun, Yingchao Xie

Abstract: In this paper, we study the averaging principle and central limit theorem for multi-scale stochastic differential equations with state-dependent switching. To accomplish this, we first study the Poisson equation associated with a Markov chain and the regularity of its solutions. As applications of the results on the Poisson equations, we prove three averaging principle results and two central limit theorem results. The first averaging principle result is a strong convergence of order $1/2$ of the slow component $X^{\varepsilon}$ in the space $C([0,T],\mathbb{R}^n)$. The second averaging principle result is a weak convergence of $X^{\varepsilon}$ in $C([0,T],\mathbb{R}^n)$. The third averaging principle result is a weak convergence of order $1$ of $X^{\varepsilon}_t$ in $\mathbb{R}^n$ for any fixed $t\ge 0$. The first central limit theorem type result is a weak convergence of $(X^{\varepsilon}-\bar{X})/\sqrt{\varepsilon}$ in $C([0,T],\mathbb{R}^n)$, where $\bar{X}$ is the solution of the averaged equation. The second central limit theorem type result is a weak convergence of order $1/2$ of $(X^{\varepsilon}_t-\bar{X}_t)/\sqrt{\varepsilon}$ in $\mathbb{R}^n$ for fixed $t\ge 0$. Several examples are given to show that all the achieved orders are optimal.

Submitted 17 December, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

Comments: 42 pages. We relax the assumptions and add several new results in this version

arXiv:2304.02824 [pdf, other]

Generalized Hypercube Queuing Models with Overlapping Service Regions

Authors: Shixiang Zhu, Wenqian Xing, Yao Xie

Abstract: We present a generalized hypercube queuing model, building upon the original model by Larson (1974), focusing on its application to overlapping service regions such as police beats. To design a service region, we need to capture the workload and police car operation, a type of mobile server. The traditional hypercube queuing model excels in capturing systems' dynamics with light traffic, as it primarily considers whether each server is busy or idle. However, contemporary service operations often experience saturation, in which each server in the system can only process a subset of calls and a queue in front of each server is allowed. Hence, the simple binary status for each server becomes inadequate, prompting the need for a more intricate state space analysis. Our proposed model addresses this problem using a Markov model with a large state space represented by non-negative integer-valued vectors. By leveraging the sparsity structure of the transition matrix, where transitions occur between states whose vectors differ by one in the $\ell_1$ distance, we can solve the steady-state distribution of states efficiently. This solution can then be used to evaluate general performance metrics for the service system. We validate our model's effectiveness through simulations of various artificial service systems. We also apply our model to the Atlanta police operation system, which faces challenges such as increased workload, significant staff shortages, and the impact of boundary effects among crime incidents. Using real 911 calls-for-service data, our analysis indicates that the police operations system with permitted overlapping patrols can significantly mitigate these problems, leading to more effective police force deployment.

Submitted 11 January, 2024; v1 submitted 5 April, 2023; originally announced April 2023.

arXiv:2303.14671 [pdf, other]

doi 10.1002/jgt.23099

A relation between the cube polynomials of partial cubes and the clique polynomials of their crossing graphs

Authors: Yan-Ting Xie, Yong-De Feng, Shou-Jun Xu

Abstract: Partial cubes are the graphs which can be embedded into hypercubes. The {\em cube polynomial} of a graph $G$ is a counting polynomial of induced hypercubes of $G$, which is defined as $C(G,x):=\sum_{i\geqslant 0}α_i(G)x^i$, where $α_i(G)$ is the number of induced $i$-cubes (hypercubes of dimension $i$) of $G$. The {\em clique polynomial} of $G$ is defined as $Cl(G,x):=\sum_{i\geqslant 0}a_i(G)x^i$, where $a_i(G)$ ($i\geqslant 1$) is the number of $i$-cliques in $G$ and $a_0(G)=1$. Equivalently, $Cl(G, x)$ is exactly the independence polynomial of the complement $\overline{G}$ of $G$. The {\em crossing graph} $G^{\#}$ of a partial cube $G$ is the graph whose vertices are corresponding to the $Θ$-classes of $G$, and two $Θ$-classes are adjacent in $G^{\#}$ if and only if they cross in $G$. In the present paper, we prove that for a partial cube $G$, $C(G,x)\leqslant Cl(G^{\#}, x+1)$ and the equality holds if and only if $G$ is a median graph. Since every graph can be represented as the crossing graph of a median graph [SIAM J. Discrete Math., 15 (2002) 235--251], the above necessary-and-sufficient result shows that the study on the cube polynomials of median graphs can be transformed to the one on the clique polynomials of general graphs (equivalently, on the independence polynomials of their complements). In addition, we disprove the conjecture that the cube polynomials of median graphs are unimodal.

Submitted 16 June, 2024; v1 submitted 26 March, 2023; originally announced March 2023.

Comments: 13 pages, 2 figures

MSC Class: 05C31; 05C75

Journal ref: J. Graph Theory, 106 (2024) 907-922

arXiv:2303.09143 [pdf, ps, other]

Weak discrete maximum principle of isoparametric finite element methods in curvilinear polyhedra

Authors: Buyang Li, Weifeng Qiu, Yupei Xie, Wenshan Yu

Abstract: The weak maximum principle of the isoparametric finite element method is proved for the Poisson equation under the Dirichlet boundary condition in a (possibly concave) curvilinear polyhedral domain with edge openings smaller than $π$, which include smooth domains and smooth deformations of convex polyhedra. The proof relies on the analysis of a dual elliptic problem with a discontinuous coefficient matrix arising from the isoparametric finite elements. Therefore, the standard $H^2$ elliptic regularity which is required in the proof of the weak maximum principle in the literature does not hold for this dual problem. To overcome this difficulty, we have decomposed the solution into a smooth part and a nonsmooth part, and estimated the two parts by $H^2$ and $W^{1,p}$ estimates, respectively. As an application of the weak maximum principle, we have proved a maximum-norm best approximation property of the isoparametric finite element method for the Poisson equation in a curvilinear polyhedron. The proof contains non-trivial modifications of Schatz's argument due to the non-conformity of the iso-parametric finite elements, which requires us to construct a globally smooth flow map which maps the curvilinear polyhedron to a perturbed larger domain on which we can establish the $W^{1,\infty}$ regularity estimate of the Poisson equation uniformly with respect to the perturbation.

Submitted 16 March, 2023; originally announced March 2023.

arXiv:2303.07331 [pdf, ps, other]

On the full Kostant-Toda hierarchy and its $\ell$-banded reductions for the Lie algebras of type $A, B$ and $G$

Authors: Yuji Kodama, Yuancheng Xie

Abstract: This paper concerns the solutions of the full Kostant-Toda (f-KT) hierarchy in the Hessenberg form and their reductions to the $\ell$-banded Kostant-Toda ($\ell$-KT) hierarchy. We also study the f-KT hierarchy and the corresponding $\ell$-KT hierarchy on simple Lie algebras of type $A, B$ and $G$ based on root space reductions with proper Chevalley systems. Explicit formulas of the polynomial solutions for the $τ$-functions are also given in terms of the Schur functions and Schur's $Q$-functions.

Submitted 13 March, 2023; originally announced March 2023.

Comments: 34 pages

arXiv:2302.11109 [pdf, other]

A deformation of Asaeda-Przytycki-Sikora homology

Authors: Zhenkun Li, Yi Xie, Boyu Zhang

Abstract: We define a 1-parameter family of homology invariants for links in thickened oriented surfaces. It recovers the homology invariant of Asaeda-Przytycki-Sikora (arxiv:0409414) and the invariant defined by Winkeler (arxiv:2106.03834). The new invariant can be regarded as a deformation of Asaeda-Przytycki-Sikora homology; it is not a Lee-type deformation as the deformation is only non-trivial when the surface is not simply connected. Our construction is motivated by computations in singular instanton Floer homology. We also prove a detection property for the new invariant, which is a stronger result than the main theorem of arxiv:2208.13963.

Submitted 21 February, 2023; originally announced February 2023.

Comments: 14 pages, 6 figures

MSC Class: 57K18

arXiv:2302.09570 [pdf, other]

A Posteriori Error Estimates for A Modified Weak Galerkin Finite Element Method Solving Linear Elasticity Problems

Authors: Liu Chunmei, Zhong Liuqiang, Xie Yingying Xie, Zhou Liping

Abstract: In this paper, a residual-type a posteriori error estimator is proposed and analyzed for a modified weak Galerkin finite element method solving linear elasticity problems. The estimator is proven to be both reliable and efficient because it provides upper and lower bounds on the actual error in a discrete energy norm. Numerical experiments are given to illustrate the effectiveness of this error estimator.

Submitted 19 February, 2023; originally announced February 2023.

Comments: 17 pages, 8 figures

arXiv:2301.12200 [pdf, ps, other]

A characterization of regular partial cubes whose all convex cycles have the same lengths

Authors: Yan-Ting Xie, Yong-De Feng, Shou-Jun Xu

Abstract: Partial cubes are graphs that can be isometrically embedded into hypercubes. Convex cycles play an important role in the study of partial cubes. In this paper, we prove that a regular partial cube is a hypercube (resp., a doubled Odd graph, an even cycle of length $2n$ where $n\geqslant 4$) if and only if all its convex cycles are 4-cycles (resp., 6-cycles, $2n$-cycles). In particular, the partial cubes all of whose convex cycles are 4-cycles are equivalent to almost-median graphs, so we obtain that regular almost-median graphs are exactly hypercubes, which generalizes the result by Mulder [J. Graph Theory, 4 (1980) 107--110] that regular median graphs are hypercubes.

Submitted 28 January, 2023; originally announced January 2023.

Comments: 8 pages, 0 figures

MSC Class: 05C75

arXiv:2301.11197

Causal Graph Discovery from Self and Mutually Exciting Time Series

Authors: Song Wei, Yao Xie, Christopher S. Josef, Rishikesan Kamaleswaran

Abstract: We present a generalized linear structural causal model, coupled with a novel data-adaptive linear regularization, to recover causal directed acyclic graphs (DAGs) from time series. By leveraging a recently developed stochastic monotone Variational Inequality (VI) formulation, we cast the causal discovery problem as a general convex optimization. Furthermore, we develop a non-asymptotic recovery guarantee and quantifiable uncertainty by solving a linear program to establish confidence intervals for a wide range of non-linear monotone link functions. We validate our theoretical results and show the competitive performance of our method via extensive numerical experiments. Most importantly, we demonstrate the effectiveness of our approach in recovering highly interpretable causal DAGs over Sepsis Associated Derangements (SADs) while achieving comparable prediction performance to powerful ``black-box'' models such as XGBoost. Thus, the future adoption of our proposed method to conduct continuous surveillance of high-risk patients by clinicians is much more likely.

Submitted 27 January, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

Comments: This is an updated version of our previous workshop paper; instead of posting it as a new submission, we update the previous arXiv preprint arXiv:2106.02600. Also, the previous workshop paper can be found in the "past version" using the above arXiv link

arXiv:2301.09675 [pdf, other]

Improved Rate of First Order Algorithms for Entropic Optimal Transport

Authors: Yiling Luo, Yiling Xie, Xiaoming Huo

Abstract: This paper improves the state-of-the-art rate of a first-order algorithm for solving entropy regularized optimal transport. The resulting rate for approximating the optimal transport (OT) has been improved from $\widetilde{O}({n^{2.5}}/ε)$ to $\widetilde{O}({n^2}/ε)$, where $n$ is the problem size and $ε$ is the accuracy level. In particular, we propose an accelerated primal-dual stochastic mirror descent algorithm with variance reduction. Such special design helps us improve the rate compared to other accelerated primal-dual algorithms. We further propose a batch version of our stochastic algorithm, which improves the computational performance through parallel computing. To compare, we prove that the computational complexity of the Stochastic Sinkhorn algorithm is $\widetilde{O}({n^2}/{ε^2})$, which is slower than our accelerated primal-dual stochastic mirror algorithm. Experiments are done using synthetic and real data, and the results match our theoretical rates. Our algorithm may inspire more research to develop accelerated primal-dual algorithms that have rate $\widetilde{O}({n^2}/ε)$ for solving OT.

Submitted 23 January, 2023; originally announced January 2023.

arXiv:2212.07046 [pdf, other]

Randomized methods for computing optimal transport without regularization and their convergence analysis

Authors: Yue Xie, Zhongjian Wang, Zhiwen Zhang

Abstract: The optimal transport (OT) problem can be reduced to a linear programming (LP) problem through discretization. In this paper, we introduced the random block coordinate descent (RBCD) methods to directly solve this LP problem. Our approach involves restricting the potentially large-scale optimization problem to small LP subproblems constructed via randomly chosen working sets. By using a random Gauss-Southwell-$q$ rule to select these working sets, we equip the vanilla version of (${\bf{\rm RBCD}_0}$) with almost sure convergence and a linear convergence rate to solve general standard LP problems. To further improve the efficiency of the (${\bf{\rm RBCD}_0}$) method, we explore the special structure of constraints in the OT problems and leverage the theory of linear systems to propose several approaches for refining the random working set selection and accelerating the vanilla method. Inexact versions of the RBCD methods are also discussed. Our preliminary numerical experiments demonstrate that the accelerated random block coordinate descent (${\bf {\rm ARBCD}}$) method compares well with other solvers including Sinkhorn's algorithm when seeking solutions with relatively high accuracy, and offers the advantage of saving memory.

Submitted 23 November, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

MSC Class: 65C35; 68W20; 90C08; 90C25

arXiv:2212.03679 [pdf, ps, other]

On the full Kostant-Toda lattice and the flag varieties. I. The singular solutions

Authors: Yuancheng Xie

Abstract: The full Kostant-Toda (f-KT) lattice is a natural generalization of the classical tridiagonal Toda lattice. We study singular structure of solutions of the f-KT lattices defined on simple Lie algebras in two different ways: through the $τ$-functions and through the Kowalevski-Painlevé analysis. The $τ$-function formalism relies on and is equivalent to the representation theory of the underlying Lie algebras, while the Kowalevski-Painlevé analysis is representation independent and we are able to characterize all the terms in the Laurent series solutions of the f-KT lattices via the structure theory of the Lie algebras. Through the above analysis we compactify the initial condition spaces of f-KT lattice by the corresponding flag varieties, that is fixing the spectral parameters which are invariant under the f-KT flows, we build a one to one correspondence between solutions of the f-KT lattices and points in the corresponding flag varieties. As all the important characters we obtain in the Kowalevski-Painlevé analysis are integral valued, results in this paper are valid in any field containing the rational field.

Submitted 19 December, 2023; v1 submitted 7 December, 2022; originally announced December 2022.

Comments: 23 pages

arXiv:2211.16739 [pdf, other]

Quasi Non-Negative Quaternion Matrix Factorization with Application to Color Face Recognition

Authors: Yifen Ke, Changfeng Ma, Zhigang Jia, Yajun Xie, Riwei Liao

Abstract: To address the non-negativity dropout problem of quaternion models, a novel quasi non-negative quaternion matrix factorization (QNQMF) model is presented for color image processing. To implement QNQMF, the quaternion projected gradient algorithm and the quaternion alternating direction method of multipliers are proposed via formulating QNQMF as the non-convex constraint quaternion optimization problems. Some properties of the proposed algorithms are studied. The numerical experiments on the color image reconstruction show that these algorithms encoded on the quaternion perform better than these algorithms encoded on the red, green and blue channels. Furthermore, we apply the proposed algorithms to the color face recognition. Numerical results indicate that the accuracy rate of face recognition on the quaternion model is better than on the red, green and blue channels of color image as well as single channel of gray level images for the same data, when large facial expressions and shooting angle variations are presented.

Submitted 29 November, 2022; originally announced November 2022.

Comments: 35 pages, 8 figures

arXiv:2211.15070 [pdf, other]

Online Kernel CUSUM for Change-Point Detection

Authors: Song Wei, Yao Xie

Abstract: We present a computationally efficient online kernel Cumulative Sum (CUSUM) method for change-point detection that utilizes the maximum over a set of kernel statistics to account for the unknown change-point location. Our approach exhibits increased sensitivity to small changes compared to existing kernel-based change-point detection methods, including Scan-B statistic, corresponding to a non-parametric Shewhart chart-type procedure. We provide accurate analytic approximations for two key performance metrics: the Average Run Length (ARL) and Expected Detection Delay (EDD), which enable us to establish an optimal window length to be on the order of the logarithm of ARL to ensure minimal power loss relative to an oracle procedure with infinite memory. Moreover, we introduce a recursive calculation procedure for detection statistics to ensure constant computational and memory complexity, which is essential for online implementation. Through extensive experiments on both simulated and real data, we demonstrate the competitive performance of our method and validate our theoretical results.

Submitted 8 November, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

Comments: The Matlab code can be found at https://github.com/SongWei-GT/online_kernel_cusum

arXiv:2210.16645 [pdf, other]

Solving a Special Type of Optimal Transport Problem by a Modified Hungarian Algorithm

Authors: Yiling Xie, Yiling Luo, Xiaoming Huo

Abstract: Computing the empirical Wasserstein distance in the Wasserstein-distance-based independence test is an optimal transport (OT) problem with a special structure. This observation inspires us to study a special type of OT problem and propose a modified Hungarian algorithm to solve it exactly. For the OT problem involving two marginals with $m$ and $n$ atoms ($m\geq n$), respectively, the computational complexity of the proposed algorithm is $O(m^2n)$. Computing the empirical Wasserstein distance in the independence test requires solving this special type of OT problem, where $m=n^2$. The associated computational complexity of the proposed algorithm is $O(n^5)$, while the order of applying the classic Hungarian algorithm is $O(n^6)$. In addition to the aforementioned special type of OT problem, it is shown that the modified Hungarian algorithm could be adopted to solve a wider range of OT problems. Broader applications of the proposed algorithm are discussed -- solving the one-to-many assignment problem and the many-to-many assignment problem. We conduct numerical experiments to validate our theoretical results. The experiment results demonstrate that the proposed modified Hungarian algorithm compares favorably with the Hungarian algorithm, the well-known Sinkhorn algorithm, and the network simplex algorithm.

Submitted 28 February, 2023; v1 submitted 29 October, 2022; originally announced October 2022.

Showing 1–50 of 213 results for author: Xie, Y