-
A class of overdetermined problem for fractional Capacity
Authors:
Lei Qin,
Lu Zhang
Abstract:
In this paper, we consider an unconventional overdetermined problem through a concavity property, which provides some characterizations of balls via Brunn-Minkowski inequalities. In this setting, our results can be viewed as a generalization of those for the $p$-capacity in [14], and are of independent interest.
Submitted 22 June, 2024;
originally announced June 2024.
-
Two-component system modelling shallow-water waves with constant vorticity under the Camassa-Holm scaling
Authors:
Leyi Zhang,
Xingxing Liu
Abstract:
This paper is concerned with the derivation of a two-component system modelling shallow-water waves with constant vorticity under the Camassa-Holm scaling from our newly established Green-Naghdi equations with a linear shear. It is worth pointing out that the $ρ$ component in this new system is quite different from that in the previous two-component system due to the effects of both vorticity and larger amplitude. We then establish the local well-posedness of this new system in Besov spaces, and present a blow-up criterion. We finally give a sufficient condition for global strong solutions to the system in a special case.
Submitted 13 June, 2024;
originally announced June 2024.
-
On Stewart's Perturbation Theorem for SVD
Authors:
Ren-Cang Li,
Ninoslav Truhar,
Lei-Hong Zhang
Abstract:
This paper establishes a variant of Stewart's theorem (Theorem 6.4 of Stewart, SIAM Rev., 15:727-764, 1973) for the singular subspaces associated with the SVD of a matrix subject to perturbations. Stewart's original version uses both the Frobenius and spectral norms, whereas the new variant uses the spectral norm and any unitarily invariant norm, which offers a choice to suit particular applications and leads to sharper bounds than those straightforwardly derived from Stewart's original theorem with the help of the well-known equivalence inequalities between matrix norms. Of interest in their own right, bounds on the solution to two coupled Sylvester equations are established for a few different circumstances.
Submitted 10 June, 2024;
originally announced June 2024.
-
A Characterization for Tightness of the Sparse Moment-SOS Hierarchy
Authors:
Jiawang Nie,
Zheng Qu,
Xindong Tang,
Linghao Zhang
Abstract:
This paper studies the sparse Moment-SOS hierarchy of relaxations for solving sparse polynomial optimization problems. We show that this sparse hierarchy is tight if and only if the objective can be written as a sum of sparse nonnegative polynomials, each of which belongs to the sum of the ideal and quadratic module generated by the corresponding sparse constraints. Based on this characterization, we give several sufficient conditions for the sparse Moment-SOS hierarchy to be tight. In particular, we show that this sparse hierarchy is tight under some assumptions such as convexity, optimality conditions or finiteness of constraining sets.
Submitted 10 June, 2024;
originally announced June 2024.
-
Quality of Approximate Balanced Truncation
Authors:
Lei-Hong Zhang,
Ren-Cang Li
Abstract:
Model reduction is a powerful tool in dealing with numerical simulation of large scale dynamic systems for studying complex physical systems. Two major types of model reduction methods for linear time-invariant dynamic systems are Krylov subspace-based methods and balanced truncation-based methods. The methods of the second type are much more theoretically sound than the first type in that there is a fairly tight global error bound on the approximation error between the original system and the reduced one. It is noted that the error bound is established based upon the availability of the exact controllability and observability Gramians. However, numerically, the Gramians are not available and have to be numerically calculated, and for a large scale system, a viable option is to compute low-rank approximations of the Gramians from which an approximate balanced truncation is then performed. Hence, rigorously speaking, the existing global error bound is not applicable to any reduced system obtained via approximate Gramians. The goal of this paper is to address this issue by establishing global error bounds for reduced systems via approximate balanced truncation.
Submitted 9 June, 2024;
originally announced June 2024.
-
Projection-Free Variance Reduction Methods for Stochastic Constrained Multi-Level Compositional Optimization
Authors:
Wei Jiang,
Sifan Yang,
Wenhao Yang,
Yibo Wang,
Yuanyu Wan,
Lijun Zhang
Abstract:
This paper investigates projection-free algorithms for stochastic constrained multi-level optimization. In this context, the objective function is a nested composition of several smooth functions, and the decision set is closed and convex. Existing projection-free algorithms for solving this problem suffer from two limitations: 1) they solely focus on the gradient mapping criterion and fail to match the optimal sample complexities in unconstrained settings; 2) their analysis is exclusively applicable to non-convex functions, without considering convex and strongly convex objectives. To address these issues, we introduce novel projection-free variance reduction algorithms and analyze their complexities under different criteria. For gradient mapping, our complexities improve existing results and match the optimal rates for unconstrained problems. For the widely-used Frank-Wolfe gap criterion, we provide theoretical guarantees that align with those for single-level problems. Additionally, by using a stage-wise adaptation, we further obtain complexities for convex and strongly convex functions. Finally, numerical experiments on different tasks demonstrate the effectiveness of our methods.
Submitted 6 June, 2024;
originally announced June 2024.
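For readers unfamiliar with the two ingredients named in the abstract above, projection-free updates and the Frank-Wolfe gap, the following is a minimal sketch of the classical deterministic, single-level Frank-Wolfe method on the probability simplex. It is not the stochastic multi-level algorithm of the paper; the function name `frank_wolfe_simplex`, the step size 2/(t+2), and the choice of the simplex as the feasible set are assumptions made for illustration.

```python
def frank_wolfe_simplex(grad, x0, steps=200):
    """Classical Frank-Wolfe on the probability simplex (illustrative sketch).

    grad  -- callable returning the gradient (a list) at a point x
    x0    -- starting point, a list of non-negative numbers summing to 1
    Returns the final iterate and the Frank-Wolfe gap measured at the
    last point where the gradient was evaluated.
    """
    x = list(x0)
    gap = float("inf")
    for t in range(steps):
        g = grad(x)
        # Linear minimization oracle: over the simplex, <g, v> is minimized
        # at the vertex e_i with the smallest gradient coordinate, so no
        # projection is ever needed.
        i = min(range(len(x)), key=lambda k: g[k])
        # Frank-Wolfe gap: max over v in the simplex of <g, x - v>.
        gap = sum(gk * xk for gk, xk in zip(g, x)) - g[i]
        step = 2.0 / (t + 2.0)
        # Convex combination of x and the vertex e_i keeps x feasible.
        x = [(1.0 - step) * xk for xk in x]
        x[i] += step
    return x, gap


# Example: minimize 0.5 * ||x - c||^2 over the simplex, with c feasible.
c = [0.2, 0.3, 0.5]
x, gap = frank_wolfe_simplex(lambda x: [xi - ci for xi, ci in zip(x, c)],
                             [1.0, 0.0, 0.0], steps=2000)
```

The gap is non-negative at every feasible point and vanishes at a constrained minimizer of a convex objective, which is why it can serve as a convergence criterion without any projection step.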
-
Multi-Patch Isogeometric Convolution Hierarchical Deep-learning Neural Network
Authors:
Lei Zhang,
Chanwook Park,
T. J. R. Hughes,
Wing Kam Liu
Abstract:
A seamless integration of neural networks with Isogeometric Analysis (IGA) was first introduced in [1] under the name of Hierarchical Deep-learning Neural Network (HiDeNN) and has systematically evolved into Isogeometric Convolution HiDeNN (in short, C-IGA) [2]. C-IGA achieves higher order approximations without increasing the number of degrees of freedom. Due to the Kronecker delta property of C-IGA shape functions, one can refine the mesh in the physical domain as in the standard finite element method (FEM) while maintaining the exact geometrical mapping of IGA. In this article, C-IGA theory is generalized to multi-CAD-patch systems with a mathematical investigation of the compatibility conditions at patch interfaces and convergence of error estimates. Two compatibility conditions (nodal compatibility and G^0 (i.e., global C^0) compatibility) are presented and validated through numerical examples.
Submitted 5 June, 2024;
originally announced June 2024.
-
Adaptive Variance Reduction for Stochastic Optimization under Weaker Assumptions
Authors:
Wei Jiang,
Sifan Yang,
Yibo Wang,
Lijun Zhang
Abstract:
This paper explores adaptive variance reduction methods for stochastic optimization based on the STORM technique. Existing adaptive extensions of STORM rely on strong assumptions like bounded gradients and bounded function values, or suffer an additional $\mathcal{O}(\log T)$ term in the convergence rate. To address these limitations, we introduce a novel adaptive STORM method that achieves an optimal convergence rate of $\mathcal{O}(T^{-1/3})$ for non-convex functions with our newly designed learning rate strategy. Compared with existing approaches, our method requires weaker assumptions and attains the optimal convergence rate without the additional $\mathcal{O}(\log T)$ term. We also extend the proposed technique to stochastic compositional optimization, obtaining the same optimal rate of $\mathcal{O}(T^{-1/3})$. Furthermore, we investigate the non-convex finite-sum problem and develop another innovative adaptive variance reduction method that achieves an optimal convergence rate of $\mathcal{O}(n^{1/4} T^{-1/2} )$, where $n$ represents the number of component functions. Numerical experiments across various tasks validate the effectiveness of our method.
Submitted 4 June, 2024;
originally announced June 2024.
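The STORM technique that the abstract above builds on can be sketched in a few lines. The version below uses the standard recursive estimator d_t = g(x_t; ξ_t) + (1 - a)(d_{t-1} - g(x_{t-1}; ξ_t)), where the same sample ξ_t is evaluated at both the current and the previous iterate. The constant learning rate `lr` and momentum parameter `a` are simplifying assumptions; the adaptive learning-rate strategy designed in the paper is not reproduced here, and the names `storm`, `stoch_grad`, and `sample` are hypothetical.

```python
import random


def storm(stoch_grad, sample, x0, steps=2000, lr=0.05, a=0.1, seed=0):
    """STORM-style variance-reduced SGD for a scalar variable (sketch).

    stoch_grad(x, xi) -- stochastic gradient at x for the sample xi
    sample(rng)       -- draws one sample xi from a random.Random instance
    lr, a             -- fixed step size and momentum (an assumption; the
                         paper uses an adaptive schedule instead)
    """
    rng = random.Random(seed)
    x_prev = x0
    d = stoch_grad(x_prev, sample(rng))  # first estimate: plain stochastic gradient
    x = x_prev - lr * d
    for _ in range(steps - 1):
        xi = sample(rng)  # one fresh sample per iteration
        # The same sample is used at x and x_prev, so the correction term
        # (d - stoch_grad(x_prev, xi)) carries over the previous estimation error.
        d = stoch_grad(x, xi) + (1.0 - a) * (d - stoch_grad(x_prev, xi))
        x_prev, x = x, x - lr * d
    return x


# Example: minimize E[0.5 * (x - xi)^2] with xi ~ N(1, 0.5^2); the minimizer is x = 1.
x_final = storm(lambda x, xi: x - xi, lambda rng: rng.gauss(1.0, 0.5), x0=5.0)
```

Setting `a = 1.0` removes the correction term and recovers plain SGD, which makes the role of the recursive term easy to isolate in experiments.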
-
Generalization Bound and New Algorithm for Clean-Label Backdoor Attack
Authors:
Lijia Yu,
Shuang Liu,
Yibo Miao,
Xiao-Shan Gao,
Lijun Zhang
Abstract:
The generalization bound is a crucial theoretical tool for assessing the generalizability of learning methods, and there is a vast literature on the generalizability of normal learning, adversarial learning, and data poisoning. Unlike other data poisoning attacks, the backdoor attack has the special property that the poisoned triggers are contained in both the training set and the test set and the purpose of the attack is two-fold. To our knowledge, the generalization bound for the backdoor attack has not been established. In this paper, we fill this gap by deriving algorithm-independent generalization bounds in the clean-label backdoor attack scenario. Precisely, based on the goals of the backdoor attack, we give upper bounds for the clean sample population errors and the poison population errors in terms of the empirical error on the poisoned training dataset. Furthermore, based on the theoretical result, a new clean-label backdoor attack is proposed that computes the poisoning trigger by combining adversarial noise and indiscriminate poison. We show its effectiveness in a variety of settings.
Submitted 1 June, 2024;
originally announced June 2024.
-
Efficient Sign-Based Optimization: Accelerating Convergence via Variance Reduction
Authors:
Wei Jiang,
Sifan Yang,
Wenhao Yang,
Lijun Zhang
Abstract:
Sign stochastic gradient descent (signSGD) is a communication-efficient method that transmits only the sign of stochastic gradients for parameter updating. Existing literature has demonstrated that signSGD can achieve a convergence rate of $\mathcal{O}(d^{1/2}T^{-1/4})$, where $d$ represents the dimension and $T$ is the iteration number. In this paper, we improve this convergence rate to $\mathcal{O}(d^{1/2}T^{-1/3})$ by introducing the Sign-based Stochastic Variance Reduction (SSVR) method, which employs variance reduction estimators to track gradients and leverages their signs to update. For finite-sum problems, our method can be further enhanced to achieve a convergence rate of $\mathcal{O}(m^{1/4}d^{1/2}T^{-1/2})$, where $m$ denotes the number of component functions. Furthermore, we investigate the heterogeneous majority vote in distributed settings and introduce two novel algorithms that attain improved convergence rates of $\mathcal{O}(d^{1/2}T^{-1/2} + dn^{-1/2})$ and $\mathcal{O}(d^{1/4}T^{-1/4})$ respectively, outperforming the previous results of $\mathcal{O}(dT^{-1/4} + dn^{-1/2})$ and $\mathcal{O}(d^{3/8}T^{-1/8})$, where $n$ represents the number of nodes. Numerical experiments across different tasks validate the effectiveness of our proposed methods.
Submitted 1 June, 2024;
originally announced June 2024.
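As a point of reference for the abstract above, here is a minimal sketch of the plain signSGD baseline, in which only the sign of each stochastic gradient coordinate is used for the update. It does not implement the SSVR method or the majority-vote variants of the paper; the names `sign_sgd` and `stoch_grad`, the fixed step size, and the test objective are assumptions chosen for illustration.

```python
import random


def sign(v):
    """Return -1, 0, or +1 according to the sign of v."""
    return (v > 0) - (v < 0)


def sign_sgd(stoch_grad, x0, steps=500, lr=0.01, seed=0):
    """Plain signSGD baseline (sketch): x <- x - lr * sign(g).

    stoch_grad(x, rng) -- returns a stochastic gradient (list) at x
    """
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        g = stoch_grad(x, rng)
        # Only one sign per coordinate is needed, which is what makes the
        # method communication-efficient in distributed settings.
        x = [xi - lr * sign(gi) for xi, gi in zip(x, g)]
    return x


# Example: f(x) = 0.5 * ||x||^2 with additive Gaussian gradient noise.
def noisy_grad(x, rng):
    return [xi + rng.gauss(0.0, 0.1) for xi in x]


x_final = sign_sgd(noisy_grad, [1.0, -2.0])
```

Because every coordinate moves by exactly `lr` per step regardless of the gradient magnitude, the iterate ends up oscillating in a neighborhood of the minimizer whose size depends on the step size and the noise level.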
-
Universal Online Convex Optimization with $1$ Projection per Round
Authors:
Wenhao Yang,
Yibo Wang,
Peng Zhao,
Lijun Zhang
Abstract:
To address the uncertainty in function types, recent progress in online convex optimization (OCO) has spurred the development of universal algorithms that simultaneously attain minimax rates for multiple types of convex functions. However, for a $T$-round online problem, state-of-the-art methods typically conduct $O(\log T)$ projections onto the domain in each round, a process potentially time-consuming with complicated feasible sets. In this paper, inspired by the black-box reduction of Cutkosky and Orabona (2018), we employ a surrogate loss defined over simpler domains to develop universal OCO algorithms that only require $1$ projection. Embracing the framework of prediction with expert advice, we maintain a set of experts for each type of function and aggregate their predictions via a meta-algorithm. The crux of our approach lies in a uniquely designed expert-loss for strongly convex functions, stemming from an innovative decomposition of the regret into the meta-regret and the expert-regret. Our analysis sheds new light on the surrogate loss, facilitating a rigorous examination of the discrepancy between the regret of the original loss and that of the surrogate loss, and carefully controlling meta-regret under the strong convexity condition. In this way, with only $1$ projection per round, we establish optimal regret bounds for general convex, exponentially concave, and strongly convex functions simultaneously. Furthermore, we enhance the expert-loss to exploit the smoothness property, and demonstrate that our algorithm can attain small-loss regret for multiple types of convex and smooth functions.
Submitted 30 May, 2024;
originally announced May 2024.
-
A Hessian-Aware Stochastic Differential Equation for Modelling SGD
Authors:
Xiang Li,
Zebang Shen,
Liang Zhang,
Niao He
Abstract:
Continuous-time approximation of Stochastic Gradient Descent (SGD) is a crucial tool to study its escaping behaviors from stationary points. However, existing stochastic differential equation (SDE) models fail to fully capture these behaviors, even for simple quadratic objectives. Built on a novel stochastic backward error analysis framework, we derive the Hessian-Aware Stochastic Modified Equation (HA-SME), an SDE that incorporates Hessian information of the objective function into both its drift and diffusion terms. Our analysis shows that HA-SME matches the order-best approximation error guarantee among existing SDE models in the literature, while achieving a significantly reduced dependence on the smoothness parameter of the objective. Further, for quadratic objectives, under mild conditions, HA-SME is proved to be the first SDE model that recovers exactly the SGD dynamics in the distributional sense. Consequently, when the local landscape near a stationary point can be approximated by quadratics, HA-SME is expected to accurately predict the local escaping behaviors of SGD.
Submitted 28 May, 2024;
originally announced May 2024.
-
Strongly Tempered BZSV Quadruples
Authors:
Zhengyu Mao,
Chen Wan,
Lei Zhang
Abstract:
In this paper, we give a list of strongly tempered BZSV quadruples. This gives a conceptual explanation of many existing Rankin-Selberg integrals and period integrals. It also proposes many new interesting period integrals to study.
Submitted 27 May, 2024;
originally announced May 2024.
-
Geometry of Critical Sets and Existence of Saddle Branches for Two-layer Neural Networks
Authors:
Leyang Zhang,
Yaoyu Zhang,
Tao Luo
Abstract:
This paper presents a comprehensive analysis of critical point sets in two-layer neural networks. To study such complex entities, we introduce the critical embedding operator and critical reduction operator as our tools. Given a critical point, we use these operators to uncover the whole underlying critical set representing the same output function, which exhibits a hierarchical structure. Furthermore, we prove the existence of saddle branches for any critical set whose output function can be represented by a narrower network. Our results provide a solid foundation for further study of the optimization and training behavior of neural networks.
Submitted 25 May, 2024;
originally announced May 2024.
-
Well-posedness and invariant measures for the stochastically perturbed Landau-Lifshitz-Baryakhtar equation
Authors:
Fan Xu,
Lei Zhang,
Bin Liu
Abstract:
In this paper, we study the initial-boundary value problem for the stochastic Landau-Lifshitz-Baryakhtar (SLLBar) equation with Stratonovich-type noise in bounded domains $\mathcal{O}\subset\mathbb{R}^d$, $d=1,2,3$. Our main results can be briefly described as follows: (1) for $d=1,2,3$ and any $\mathbf{u}_0\in\mathbb{H}^1$, the SLLBar equation admits a unique local-in-time pathwise weak solution; (2) for $d=1$ and small data $\mathbf{u}_0\in\mathbb{H}^1$, the SLLBar equation has a unique global-in-time pathwise weak solution and at least one invariant measure; (3) for $d=1,2$ and small data $\mathbf{u}_0\in\mathbb{L}^2$, the SLLBar equation possesses a unique global-in-time pathwise very weak solution and at least one invariant measure, while for $d=3$ only the existence of a martingale solution is obtained due to the loss of pathwise uniqueness.
Submitted 24 May, 2024;
originally announced May 2024.
-
Generalized Jacobi Method for Computing Eigenvalues of Dual Quaternion Hermitian Matrices
Authors:
Yongjun Chen,
Liping Zhang
Abstract:
Dual quaternion matrices have various applications in robotics research, and their spectral theory has been extensively studied in recent years. In this paper, we extend the Jacobi method to compute all eigenpairs of dual quaternion Hermitian matrices and establish its convergence. An improved version with an elimination strategy is proposed to reduce the computational time. In particular, we present a novel three-step Jacobi method to compute eigenvalues that have identical standard parts but different dual parts. We prove that the proposed three-step Jacobi method terminates after finitely many iterations and provides an $ε$-approximation of the eigenvalues. To the best of our knowledge, neither the power method nor the Rayleigh quotient iteration method can handle the eigenvalue problem in this scenario. Numerical experiments illustrate that the proposed Jacobi-type algorithms are effective and stable, and also outperform the power method and the Rayleigh quotient iteration method.
Submitted 25 June, 2024; v1 submitted 22 May, 2024;
originally announced May 2024.
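The abstract above extends the Jacobi method to dual quaternion Hermitian matrices. As background only, the following sketch shows the classical Jacobi eigenvalue iteration for real symmetric matrices, which repeatedly zeroes the largest off-diagonal entry with a plane rotation. It does not handle dual quaternions, and the function name `jacobi_eigenvalues`, the pivot rule, and the tolerance are assumptions made for illustration.

```python
import math


def jacobi_eigenvalues(a, tol=1e-12, max_rotations=1000):
    """Classical Jacobi iteration for a real symmetric matrix (sketch).

    a -- symmetric matrix as a list of lists of floats
    Returns the eigenvalue approximations in ascending order.
    """
    n = len(a)
    a = [row[:] for row in a]  # work on a copy
    if n < 2:
        return [a[0][0]] if n == 1 else []
    for _ in range(max_rotations):
        # Pivot: off-diagonal entry of largest absolute value.
        p, q = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                   key=lambda ij: abs(a[ij[0]][ij[1]]))
        if abs(a[p][q]) < tol:
            break
        # Rotation angle chosen so that the (p, q) entry of J^T a J vanishes:
        # t = tan(theta) is the smaller-magnitude root of t^2 + 2*tau*t - 1 = 0.
        tau = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
        t = math.copysign(1.0, tau) / (abs(tau) + math.sqrt(1.0 + tau * tau))
        c = 1.0 / math.sqrt(1.0 + t * t)
        s = t * c
        # Plane rotation J acting on coordinates p and q.
        rot = [[float(i == j) for j in range(n)] for i in range(n)]
        rot[p][p], rot[p][q], rot[q][p], rot[q][q] = c, s, -s, c
        # Similarity transform a <- J^T a J (dense products, for clarity).
        aj = [[sum(a[i][k] * rot[k][m] for k in range(n)) for m in range(n)]
              for i in range(n)]
        a = [[sum(rot[k][i] * aj[k][m] for k in range(n)) for m in range(n)]
             for i in range(n)]
    return sorted(a[i][i] for i in range(n))


# Example: the eigenvalues of [[2, 1], [1, 2]] are 1 and 3.
eigs = jacobi_eigenvalues([[2.0, 1.0], [1.0, 2.0]])
```

The dense matrix products keep the sketch short; a practical implementation would update only rows and columns p and q in each rotation.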
-
A stochastic algorithm approach for the elephant random walk with applications
Authors:
Li-Xin Zhang
Abstract:
The randomized play-the-winner rule (RPW) is a response-adaptive design proposed by Wei and Durham (1978) for sequentially randomizing patients to treatments in a two-treatment clinical trial so that more patients are assigned to the better treatment as the clinical trial goes on. The elephant random walk (ERW) proposed by Schutz and Trimper (2004) is a non-Markovian discrete-time random walk on $\mathbb Z$ which has a link to a famous saying that elephants can always remember where they have been. The asymptotic behaviors of the RPW rule and the ERW have been studied independently in the literature, and they are very similar. In this paper, we show that the RPW rule is a biased ERW and link them with the recursive stochastic algorithm. With the help of a recursive stochastic algorithm, we obtain the strong invariance principle of the ERW as well as the multi-dimensional varying-memory ERW with random step sizes. By the strong invariance principle, the central limit theorem, the precise law of the iterated logarithm and the almost sure central limit theorem are obtained for both the multi-dimensional ERW and the multi-dimensional ERW with random step sizes and their centers of mass. The Chung type law of the iterated logarithm of the multi-dimensional ERW is also obtained.
Submitted 7 June, 2024; v1 submitted 21 May, 2024;
originally announced May 2024.
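The one-dimensional elephant random walk described in the abstract above is easy to simulate, which may help make its memory mechanism concrete. The sketch below follows the usual definition: the first step is +1 with probability q, and each later step repeats a uniformly chosen earlier step with probability p and reverses it otherwise. The function name and parameter names are assumptions, and the multi-dimensional, varying-memory, and random-step-size variants studied in the paper are not covered.

```python
import random


def elephant_random_walk(n_steps, p, q=0.5, seed=0):
    """Simulate a one-dimensional elephant random walk (sketch).

    n_steps -- number of steps (at least 1)
    p       -- probability of repeating the remembered step
    q       -- probability that the first step is +1
    Returns the list of positions S_0 = 0, S_1, ..., S_n.
    """
    rng = random.Random(seed)
    steps = [1 if rng.random() < q else -1]
    for _ in range(n_steps - 1):
        remembered = rng.choice(steps)  # uniform over the entire past
        # Repeat the remembered step with probability p, reverse it otherwise.
        steps.append(remembered if rng.random() < p else -remembered)
    positions = [0]
    for s in steps:
        positions.append(positions[-1] + s)
    return positions


# Example: a walk of 1000 steps with memory parameter p = 0.75.
path = elephant_random_walk(1000, p=0.75, seed=1)
```

Because each step depends on the whole history rather than on the current position alone, the walk is non-Markovian, which is the feature the abstract emphasizes.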
-
Stability for Nash Equilibrium Problems
Authors:
Ruoyu Diao,
Yu-Hong Dai,
Liwei Zhang
Abstract:
This paper is devoted to studying the stability properties of the Karush-Kuhn-Tucker (KKT) solution mapping $S_{\rm KKT}$ for Nash equilibrium problems (NEPs) with canonical perturbations. Firstly, we obtain an exact characterization of the strong regularity of $S_{\rm KKT}$ and a sufficient condition that is easy to verify. Secondly, we propose equivalent conditions for the continuously differentiable single-valued localization of $S_{\rm KKT}$. Thirdly, the isolated calmness of $S_{\rm KKT}$ is studied based on two conditions: Property A and Property B, and Property B proves to be sufficient for the robustness of both $E(p)$ and $S_{\rm KKT}$ under the convex assumptions, where $E(p)$ denotes the Nash equilibria at perturbation $p$. Furthermore, we establish that studying the stability properties of the NEP with canonical perturbations is equivalent to studying those of the NEP with only tilt perturbations based on the prior discussions. Finally, we provide detailed characterizations of stability for NEPs in which each individual player solves a quadratic programming (QP) problem.
Submitted 18 May, 2024;
originally announced May 2024.
-
A Vector Representation for Phylogenetic Trees
Authors:
Cedric Chauve,
Caroline Colijn,
Louxin Zhang
Abstract:
Good representations for phylogenetic trees and networks are important for optimizing storage efficiency and implementation of scalable methods for the inference and analysis of evolutionary trees for genes, genomes and species. We introduce a new representation for rooted phylogenetic trees that encodes a binary tree on n taxa as a vector of length 2n in which each taxon appears exactly twice. Using this new tree representation, we introduce a novel tree rearrangement operator, called a HOP, that results in a tree space of diameter n and a quadratic neighbourhood size. We also introduce a novel metric, the HOP distance, which is the minimum number of HOPs to transform a tree into another tree. The HOP distance can be computed in near-linear time, a rare instance of a tree rearrangement distance that is tractable. Our experiments show that the HOP distance is better correlated to the Subtree-Prune-and-Regraft distance than the widely used Robinson-Foulds distance. We also describe how the novel tree representation we introduce can be further generalized to tree-child networks.
Submitted 11 May, 2024;
originally announced May 2024.
-
A characterization of entangled two-qubit states via partial-transpose-moments
Authors:
Lin Zhang,
Ming-Jing Zhao,
Lin Chen,
Hua Xiang,
Yi Shen
Abstract:
Although quantum entanglement is an important resource, its characterization is quite challenging. The partial transposition is a common method to detect bipartite entanglement. In this paper, the authors study the partial-transpose (PT)-moments of two-qubit states, and completely describe the whole region, composed of the second and third PT-moments, for all two-qubit states. Furthermore, they determine the accurate region corresponding to all entangled two-qubit states. The states corresponding to those boundary points of the whole region, and to the border lines between separable and entangled states are analyzed. As an application, they characterize the entangled region of PT-moments for the two families of Werner states and Bell-diagonal states. The relations between entanglement and the pairs of PT-moments are revealed from these typical examples. They also numerically plot the whole region of possible PT-moments for all two-qubit X-states, and find that this region is almost the same as the whole region of PT-moments for all two-qubit states. Moreover, they extend their results to detect the entanglement of multiqubit states. By utilizing the PT-moment-based method to characterize the entanglement of the multiqubit states mixed by the GHZ and W states, they propose an operational way of verifying the genuine entanglement in such states.
Submitted 30 April, 2024;
originally announced April 2024.
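To make the quantities in the abstract above concrete, the following sketch computes PT-moments of a two-qubit state, assuming the usual definition of the k-th PT-moment as the trace of the k-th power of the partially transposed density matrix. The function names and the convention that index 2*i + j labels the basis state |i>|j> are assumptions for illustration; the region analysis of the paper is not reproduced.

```python
def partial_transpose_b(rho):
    """Partial transpose on the second qubit of a 4x4 density matrix.

    Row/column index 2*i + j stands for the basis state |i>|j>.
    """
    out = [[0.0] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            for k in range(2):
                for l in range(2):
                    # <i j| rho^{T_B} |k l> = <i l| rho |k j>
                    out[2 * i + j][2 * k + l] = rho[2 * i + l][2 * k + j]
    return out


def matmul4(a, b):
    """Product of two 4x4 matrices given as lists of lists."""
    return [[sum(a[r][m] * b[m][c] for m in range(4)) for c in range(4)]
            for r in range(4)]


def pt_moment(rho, k):
    """k-th PT-moment: trace of the k-th power of the partial transpose."""
    pt = partial_transpose_b(rho)
    power = pt
    for _ in range(k - 1):
        power = matmul4(power, pt)
    return sum(power[d][d] for d in range(4))


# Example: the Bell state (|00> + |11>)/sqrt(2).
bell = [[0.5, 0.0, 0.0, 0.5],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
        [0.5, 0.0, 0.0, 0.5]]
p2 = pt_moment(bell, 2)  # 1.0
p3 = pt_moment(bell, 3)  # 0.25
```

For this Bell state the partial transpose has eigenvalues 1/2, 1/2, 1/2 and -1/2, which gives the second and third moments 1 and 1/4; the negative eigenvalue is what signals entanglement.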
-
Uncertainty relation and the constrained quadratic programming
Authors:
Lin Zhang,
Dade Wu,
Ming-Jing Zhao,
Hua Nan
Abstract:
The uncertainty relation is a fundamental concept in quantum theory and plays a pivotal role in various quantum information processing tasks. In this study, we explore the additive uncertainty relation pertaining to two or more observables, in terms of their variance, by utilizing the generalized Gell-Mann representation in qudit systems. We find that the tight state-independent lower bound of the variance sum can be characterized as a quadratic programming problem with nonlinear constraints in optimization theory. As illustrative examples, we derive analytical solutions for these quadratic programming problems in lower-dimensional systems, which align with the state-independent lower bounds. Additionally, we introduce a numerical algorithm tailored for solving these quadratic programming instances, highlighting its efficiency and accuracy. The advantage of our approach lies in its potential ability not only to achieve the optimal value of the quadratic programming problem with nonlinear constraints but also to precisely identify the extremal state where this optimal value is attained. This enables us to establish a tight state-independent lower bound for the sum of variances, and further identify the extremal state at which this lower bound is realized.
△ Less
Submitted 29 April, 2024;
originally announced April 2024.
-
Symmetry group based domain decomposition to enhance physics-informed neural networks for solving partial differential equations
Authors:
Ye Liu,
Jie-Ying Li,
Li-Sheng Zhang,
Lei-Lei Guo,
Zhi-Yong Zhang
Abstract:
Domain decomposition provides an effective way to tackle the dilemma of physics-informed neural networks (PINNs), which struggle to accurately and efficiently solve partial differential equations (PDEs) in the whole domain, but the lack of efficient tools for dealing with the interfaces between two adjacent sub-domains heavily hinders the training and can even lead to discontinuity of the learned solutions. In this paper, we propose a symmetry group based domain decomposition strategy to enhance the PINN for solving the forward and inverse problems of PDEs possessing a Lie symmetry group. Specifically, for the forward problem, we first deploy the symmetry group to generate dividing lines with known solution information, which can be adjusted flexibly and are used to divide the whole training domain into a finite number of non-overlapping sub-domains, then utilize the PINN and the symmetry-enhanced PINN methods to learn the solutions in each sub-domain, and finally stitch them into the overall solution of the PDEs. For the inverse problem, we first utilize the symmetry group acting on the data of the initial and boundary conditions to generate labeled data in the interior domain of the PDEs and then find the undetermined parameters as well as the solution by only training the neural networks in a sub-domain. Consequently, the proposed method can predict high-accuracy solutions of PDEs for which the vanilla PINN in the whole domain and the extended physics-informed neural network in the same sub-domains fail. Numerical results for the Korteweg-de Vries equation with a translation symmetry and the nonlinear viscous fluid equation with a scaling symmetry show that the accuracy of the learned solutions is greatly improved.
Submitted 29 April, 2024;
originally announced April 2024.
-
Variational Optimization for Quantum Problems using Deep Generative Networks
Authors:
Lingxia Zhang,
Xiaodie Lin,
Peidong Wang,
Kaiyan Yang,
Xiao Zeng,
Zhaohui Wei,
Zizhu Wang
Abstract:
Optimization is one of the keystones of modern science and engineering. Its applications in quantum technology and machine learning helped nurture variational quantum algorithms and generative AI respectively. We propose a general approach to design variational optimization algorithms based on generative models: the Variational Generative Optimization Network (VGON). To demonstrate its broad applicability, we apply VGON to three quantum tasks: finding the best state in an entanglement-detection protocol, finding the ground state of a 1D quantum spin model with variational quantum circuits, and generating degenerate ground states of many-body quantum Hamiltonians. For the first task, VGON greatly reduces the optimization time compared to stochastic gradient descent while generating nearly optimal quantum states. For the second task, VGON alleviates the barren plateau problem in variational quantum circuits. For the final task, VGON can identify the degenerate ground state spaces after a single stage of training and generate a variety of states therein.
Submitted 27 April, 2024;
originally announced April 2024.
-
Polynomial Optimization Over Unions of Sets
Authors:
Jiawang Nie,
Linghao Zhang
Abstract:
This paper studies the polynomial optimization problem whose feasible set is a union of several basic closed semialgebraic sets. We propose a unified hierarchy of Moment-SOS relaxations to solve it globally. Under some assumptions, we prove the asymptotic or finite convergence of the unified hierarchy. Special properties for the univariate case are discussed. An application to computing $(p,q)$-norms of matrices is also presented.
Submitted 17 May, 2024; v1 submitted 26 April, 2024;
originally announced April 2024.
-
Differentially Private Federated Learning: Servers Trustworthiness, Estimation, and Statistical Inference
Authors:
Zhe Zhang,
Ryumei Nakada,
Linjun Zhang
Abstract:
Differentially private federated learning is crucial for maintaining privacy in distributed environments. This paper investigates the challenges of high-dimensional estimation and inference under the constraints of differential privacy. First, we study scenarios involving an untrusted central server, demonstrating the inherent difficulties of accurate estimation in high-dimensional problems. Our findings indicate that the tight minimax rates depend on the high dimensionality of the data even with sparsity assumptions. Second, we consider a scenario with a trusted central server and introduce a novel federated estimation algorithm tailored for linear regression models. This algorithm effectively handles the slight variations among models distributed across different machines. We also propose methods for statistical inference, including coordinate-wise confidence intervals for individual parameters and strategies for simultaneous inference. Extensive simulation experiments support our theoretical advances, underscoring the efficacy and reliability of our approaches.
Submitted 24 April, 2024;
originally announced April 2024.
-
4-connected 1-planar chordal graphs are Hamiltonian-connected
Authors:
Licheng Zhang,
Yuanqiu Huang,
Shengxiang Lv,
Fengming Dong
Abstract:
Tutte proved that 4-connected planar graphs are Hamiltonian. It is unknown if there is an analogous result on 1-planar graphs. In this paper, we characterize 4-connected 1-planar chordal graphs, and show that all such graphs are Hamiltonian-connected. A crucial tool used in our proof is a characteristic of 1-planar 4-trees.
Submitted 24 April, 2024;
originally announced April 2024.
-
Metric operator and geodesic orbit property for a standard homogeneous Finsler metric
Authors:
Lei Zhang,
Ming Xu
Abstract:
In this paper, we introduce the metric operator for a compact homogeneous Finsler space, and use it to investigate the geodesic orbit property. We define the notion of a standard homogeneous $(α_1,\cdots,α_s)$-metric, which generalizes the notion of a standard homogeneous $(α_1,α_2)$-metric. We classify all connected, simply connected homogeneous manifolds $G/H$ with a compact connected simple Lie group $G$ and two irreducible summands in the isotropy representation, such that there exists a standard homogeneous $(α_1,α_2)$-metric which is g.o. but not naturally reductive on $G/H$. We also prove that on a generalized Wallach space which is not a product of three symmetric spaces, any standard homogeneous $(α_1,α_2,α_3)$-metric $F$ with respect to the canonical decomposition is g.o. on $G/H$ if and only if $F$ is a normal homogeneous Riemannian metric.
Submitted 20 April, 2024;
originally announced April 2024.
-
Multivariate confluent Vandermonde with G-Arnoldi and applications
Authors:
Lei-Hong Zhang,
Ya-Nan Zhang,
Linyi Yang,
Yifu Wu
Abstract:
In the least-squares fitting framework, the Vandermonde with Arnoldi (V+A) method presented in [Brubeck, Nakatsukasa, and Trefethen, SIAM Review, 63 (2021), pp. 405-415] is an effective approach to compute a polynomial that approximates an underlying univariate function f. Extensions of V+A include its multivariate version and the univariate confluent V+A; the latter enables us to use information on the derivative of f in obtaining the approximating polynomial. In this paper, we extend V+A further to the multivariate confluent V+A. Besides the technical generalization of the univariate confluent V+A, we also introduce a general and application-dependent G-orthogonalization in the Arnoldi process. We demonstrate with several applications that, by specifying an application-related G-inner product, the desired approximate multivariate polynomial as well as certain of its partial derivatives can be computed accurately from a well-conditioned least-squares problem whose coefficient matrix is orthonormal. The desired multivariate polynomial is represented in a discrete G-orthogonal polynomial basis which admits an explicit recurrence and therefore facilitates the efficient evaluation of function values and certain partial derivatives at new nodes. We demonstrate its flexibility by applying it to solve the multivariate Hermite least-squares problem and PDEs with various boundary conditions in irregular domains.
Submitted 14 April, 2024;
originally announced April 2024.
-
Projection method for quasiperiodic elliptic equations and application to quasiperiodic homogenization
Authors:
Kai Jiang,
Meng Li,
Juan Zhang,
Lei Zhang
Abstract:
In this study, our main objective is to address the challenge of solving elliptic equations with quasiperiodic coefficients. To achieve accurate and efficient computation, we introduce the projection method, which enables the embedding of quasiperiodic systems into higher-dimensional periodic systems. To enhance the computational efficiency, we propose a compressed storage strategy for the stiffness matrix by its multi-level block circulant structure, reducing memory requirements while preserving accuracy. Furthermore, we design a diagonal preconditioner to efficiently solve the resulting high-dimensional linear system by reducing the condition number of the stiffness matrix. These techniques collectively contribute to the computational effectiveness of our proposed approach. We demonstrate the effectiveness and accuracy of our approach through a series of numerical examples. Moreover, we apply our method to achieve a highly accurate computation of the homogenized coefficients for a quasiperiodic multiscale elliptic equation.
Submitted 18 April, 2024; v1 submitted 10 April, 2024;
originally announced April 2024.
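The compressed-storage idea mentioned in the entry above rests on a standard fact: a circulant matrix is fully determined by its first column. The sketch below shows only the simplest one-level case, not the multi-level block circulant stiffness matrix of the paper, and the function names are hypothetical. It stores n numbers instead of n*n and still reproduces the dense matrix-vector product.

```python
def circulant_matvec(first_col, x):
    """Multiply the circulant matrix C[i][j] = first_col[(i - j) % n] by x,
    using only the stored first column (n numbers instead of n * n)."""
    n = len(first_col)
    return [sum(first_col[(i - j) % n] * x[j] for j in range(n))
            for i in range(n)]


def dense_circulant(first_col):
    """Expand the first column into the full matrix (for checking only)."""
    n = len(first_col)
    return [[first_col[(i - j) % n] for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    c = [4, -1, 0, -1]   # first column of a small periodic stencil matrix
    x = [1, 2, 3, 4]
    dense = dense_circulant(c)
    reference = [sum(row[j] * x[j] for j in range(len(x))) for row in dense]
    print(circulant_matvec(c, x), reference)
```

In practice the product with a circulant matrix is usually computed with an FFT rather than the quadratic loop used here; the loop is kept only to make the indexing explicit.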
-
To Cool or not to Cool? Temperature Network Meets Large Foundation Models via DRO
Authors:
Zi-Hao Qiu,
Siqi Guo,
Mao Xu,
Tuo Zhao,
Lijun Zhang,
Tianbao Yang
Abstract:
The temperature parameter plays a profound role during training and/or inference with large foundation models (LFMs) such as large language models (LLMs) and CLIP models. In particular, it adjusts the logits in the softmax function in LLMs, which is crucial for next-token generation, and it scales the similarities in the contrastive loss for training CLIP models. A significant question remains: Is it viable to learn a neural network to predict a personalized temperature of any input data for enhancing LFMs? In this paper, we present a principled framework for learning a small yet generalizable temperature prediction network (TempNet) to improve LFMs. Our solution is composed of a novel learning framework with a robust loss underpinned by constrained distributionally robust optimization (DRO), and a properly designed TempNet with theoretical inspiration. TempNet can be trained together with a large foundation model from scratch or learned separately given a pretrained foundation model. It is not only useful for predicting personalized temperature to promote the training of LFMs but also generalizable and transferable to new tasks. Our experiments on LLMs and CLIP models demonstrate that TempNet greatly improves the performance of existing solutions or models, e.g. Table 1. The code to reproduce the experimental results in this paper can be found at https://github.com/zhqiu/TempNet.
Submitted 16 June, 2024; v1 submitted 6 April, 2024;
originally announced April 2024.
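The role of the temperature in the softmax, as described in the entry above, can be illustrated with a minimal sketch. It uses a fixed, hand-chosen temperature `tau`; it is not TempNet, which according to the abstract predicts a temperature per input, and the function name is hypothetical.

```python
import math


def softmax_with_temperature(logits, tau):
    """Softmax of logits / tau; a smaller tau sharpens the distribution."""
    if tau <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / tau for z in logits]
    shift = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.1]
    for tau in (0.5, 1.0, 2.0):
        print(tau, softmax_with_temperature(logits, tau))
```

Lowering `tau` concentrates probability on the largest logit, while raising it flattens the distribution toward uniform; this is the knob a learned temperature predictor would set per input.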
-
The Hadamard variational formula for Riesz capacity and its applications
Authors:
Lu Zhang
Abstract:
In this paper, our focus lies on a fundamental geometric invariant known as Riesz capacity, which holds an essential position in potential theory. We establish the Hadamard variational formula for Riesz capacity of convex bodies. As a meaningful application, we derive a Serrin-type symmetry result for an overdetermined problem.
Submitted 5 April, 2024;
originally announced April 2024.
-
Strong law of large numbers for $m$-dependent and stationary random variables under sub-linear expectations
Authors:
Wang-Yun Gu,
Li-Xin Zhang
Abstract:
The aim of this paper is to establish the strong law of large numbers (SLLN) for $m$-dependent random variables under the framework of sub-linear expectations. We establish the SLLN for a sequence of independent, but not necessarily identically distributed, random variables. The study further extends the SLLN to $m$-dependent and stationary sequences of random variables under the condition $C_{\mathbb V}(|X_1|)<\infty$, which is the sufficient and necessary condition for the SLLN in the case of independent and identically distributed random variables.
Submitted 1 April, 2024;
originally announced April 2024.
-
The paradigm of tax-reward and tax-punishment strategies in the advancement of public resource management dynamics
Authors:
Lichen Wang,
Yuyuan Liu,
Ruqiang Guo,
Liang Zhang,
Linjie Liu,
Shijia Hua
Abstract:
In contemporary society, the effective utilization of public resources remains a subject of significant concern. A common issue arises from defectors seeking to obtain an excessive share of these resources for personal gain, potentially leading to resource depletion. To mitigate this tragedy and ensure sustainable development of resources, implementing mechanisms to either reward those who adhere to distribution rules or penalize those who do not, appears advantageous. We introduce two models: a tax-reward model and a tax-punishment model, to address this issue. Our analysis reveals that in the tax-reward model, the evolutionary trajectory of the system is influenced not only by the tax revenue collected but also by the natural growth rate of the resources. Conversely, the tax-punishment model exhibits distinct characteristics when compared to the tax-reward model, notably the potential for bistability. In such scenarios, the selection of initial conditions is critical, as it can determine the system's path. Furthermore, our study identifies instances where the system lacks stable points, exemplified by a limit cycle phenomenon, underscoring the complexity and dynamism inherent in managing public resources using these models.
Submitted 31 March, 2024;
originally announced April 2024.
-
Stochastic Approximation Proximal Subgradient Method for Stochastic Convex-Concave Minimax Optimization
Authors:
Yu-Hong Dai,
Jiani Wang,
Liwei Zhang
Abstract:
This paper presents a stochastic approximation proximal subgradient (SAPS) method for stochastic convex-concave minimax optimization. By accessing unbiased and variance-bounded approximate subgradients, we show that this algorithm exhibits an ${\rm O}(N^{-1/2})$ expected convergence rate of the minimax optimality measure if the parameters in the algorithm are properly chosen, where $N$ denotes the number of iterations. Moreover, we show that the algorithm has an ${\rm O}(\log(N)N^{-1/2})$ minimax optimality measure bound with high probability. Further, we study a specific class of stochastic convex-concave minimax optimization problems arising from stochastic convex conic optimization problems, for which the bounded subgradient condition fails. To overcome the lack of the bounded subgradient condition in such convex-concave minimax problems, we propose a linearized stochastic approximation augmented Lagrange (LSAAL) method and prove that this algorithm exhibits an ${\rm O}(N^{-1/2})$ expected convergence rate for the minimax optimality measure and an ${\rm O}(\log^2(N)N^{-1/2})$ minimax optimality measure bound with high probability as well. Preliminary numerical results demonstrate the effectiveness of the SAPS and LSAAL methods.
Submitted 29 March, 2024;
originally announced March 2024.
-
Hopf heap and Hopf heap module
Authors:
Huihui Zheng,
Liangyun Zhang
Abstract:
In this paper, we first give some results and properties of Hopf heaps. In addition, we obtain an equivalent definition of a Hopf truss in terms of a Hopf heap. Using this equivalent definition, we construct some examples of Hopf trusses from Hopf heaps. Finally, we introduce the notion of Hopf heap modules and give the structure theorem of Hopf heap modules.
Submitted 28 March, 2024;
originally announced March 2024.
-
Nonexistence of invariant nodal line and improved $L^2$ restriction bounds for Neumann data on negatively curved surface
Authors:
Xianchao Wu,
Lan Zhang
Abstract:
The problem of obtaining the lower bounds on the restriction of Laplacian eigenfunctions to hypersurfaces inside a compact Riemannian manifold $(M,g)$ is challenging and has been attempted by many authors \cite{BR, GRS, Jun, ET}. This paper aims to show that if $(M,g)$ is assumed to be a negatively curved surface then one can get the corresponding restricted lower bounds, as well as quantitative improvement of restricted bounds for Neumann data.
Submitted 2 April, 2024; v1 submitted 28 March, 2024;
originally announced March 2024.
-
Reweighted Quasi Norm Regularized Low-Rank Factorization for Matrix Robust PCA
Authors:
Zhenzhi Qin,
Liping Zhang
Abstract:
Robust Principal Component Analysis (RPCA) and its associated non-convex relaxation methods constitute a significant component of matrix completion problems, wherein matrix factorization strategies effectively reduce dimensionality and enhance computational speed. However, some non-convex factorization forms lack theoretical guarantees. This paper proposes a novel strategy in non-convex quasi-norm representation, introducing a method to obtain weighted matrix quasi-norm factorization forms. In particular, explicit bilinear factor matrix factorization formulations for the weighted logarithmic norm and weighted Schatten-$q$ quasi-norms with $q=1, 1/2, 2/3$ are provided, along with the corresponding matrix completion models. An algorithm based on the Alternating Direction Method of Multipliers (ADMM) framework is employed to solve these models, and convergence results for the algorithm are presented.
Submitted 27 March, 2024;
originally announced March 2024.
-
Online Prediction for Streaming Tensor Time Series
Authors:
Zhenting Luan,
Haoning Wang,
Liping Zhang,
Shansuo Liang,
Wei Han
Abstract:
Real-time prediction plays a vital role in various control systems, such as traffic congestion control and wireless channel resource allocation. In these scenarios, the predictor usually needs to track the evolution of the latent statistical patterns in the modern high-dimensional streaming time series continuously and quickly, which presents new challenges for traditional prediction methods. This paper proposes a novel algorithm based on tensor factorization to predict streaming tensor time series online. The proposed algorithm updates the predictor in a low-complexity online manner to adapt to the time-evolving data. Additionally, an automatically adaptive version of the algorithm is presented to mitigate the negative impact of stale data. Simulation results demonstrate that our proposed methods achieve prediction accuracy similar to that of conventional offline tensor prediction methods, while being much faster than them during long-term online prediction. Therefore, our proposed algorithm provides an effective and efficient solution for the online prediction of streaming tensor time series.
Submitted 27 March, 2024;
originally announced March 2024.
-
Unsupervised Feature Selection via Nonnegative Orthogonal Constrained Regularized Minimization
Authors:
Yan Li,
Defeng Sun,
Liping Zhang
Abstract:
Unsupervised feature selection has drawn wide attention in the era of big data since it is a primary technique for dimensionality reduction. However, many existing unsupervised feature selection models and solution methods were presented for the purpose of application and lack theoretical support, e.g., convergence analysis. In this paper, we first establish a novel unsupervised feature selection model based on regularized minimization with nonnegative orthogonal constraints, which has the advantages of embedding feature selection into nonnegative spectral clustering and preventing overfitting. An effective inexact augmented Lagrangian multiplier method is proposed to solve our model, which adopts the proximal alternating minimization method to solve the subproblem at each iteration. We show that the sequence generated by our method globally converges to a Karush-Kuhn-Tucker point of our model. Extensive numerical experiments on popular datasets demonstrate the stability and robustness of our method. Moreover, comparisons of algorithm performance show that our method outperforms some existing state-of-the-art methods.
Submitted 25 March, 2024;
originally announced March 2024.
-
A Novel Loss Function-based Support Vector Machine for Binary Classification
Authors:
Yan Li,
Liping Zhang
Abstract:
Previous support vector machines (SVMs), including the $0/1$ loss SVM, hinge loss SVM, ramp loss SVM, truncated pinball loss SVM, and others, overlooked the degree of penalty for the correctly classified samples within the margin. This oversight affects the generalization ability of the SVM classifier to some extent. To address this limitation, from the perspective of confidence margin, we propose a novel Slide loss function ($\ell_s$) to construct the support vector machine classifier ($\ell_s$-SVM). By introducing the concept of proximal stationary point, and utilizing the property of Lipschitz continuity, we derive the first-order optimality conditions for $\ell_s$-SVM. Based on this, we define the $\ell_s$ support vectors and working set of $\ell_s$-SVM. To efficiently handle $\ell_s$-SVM, we devise a fast alternating direction method of multipliers with the working set ($\ell_s$-ADMM), and provide the convergence analysis. Numerical experiments on real-world datasets confirm the robustness and effectiveness of the proposed method.
Submitted 25 March, 2024;
originally announced March 2024.
-
Low-rank quaternion tensor completion for color video inpainting via a novel factorization strategy
Authors:
Zhenzhi Qin,
Zhenyu Ming,
Defeng Sun,
Liping Zhang
Abstract:
Recently, a quaternion tensor product named Qt-product was proposed, and then the singular value decomposition and the rank of a third-order quaternion tensor were given. From a more applicable perspective, we extend the Qt-product and propose a novel multiplication principle for third-order quaternion tensor named gQt-product. With the gQt-product, we introduce a brand-new singular value decomposition for third-order quaternion tensors named gQt-SVD and then define gQt-rank and multi-gQt-rank. We prove that the optimal low-rank approximation of a third-order quaternion tensor exists and some numerical experiments demonstrate the low-rankness of color videos. So, we apply the low-rank quaternion tensor completion to color video inpainting problems and present alternating least-square algorithms to solve the proposed low gQt-rank and multi-gQt-rank quaternion tensor completion models. The convergence analyses of the proposed algorithms are established and some numerical experiments on various color video datasets show the high recovery accuracy and computational efficiency of our methods.
Submitted 25 March, 2024;
originally announced March 2024.
-
Bulk universality for deformed GinUEs
Authors:
Lu Zhang
Abstract:
For the deformed complex Ginibre ensemble with a mean normal matrix, under certain assumptions on the mean matrix we prove that the same bulk statistics holds as in the complex Ginibre matrix bulk. This is the continuation of the previous joint papers ``Critical edge statistics for deformed GinUEs Preprint arXiv: 2311.13227v1'' and ``Repeated erfc statistics for deformed GinUEs Preprint arXiv: 2402.14362'', which deal with local eigenvalue statistics at the edge.
Submitted 24 March, 2024;
originally announced March 2024.
-
Primal Methods for Variational Inequality Problems with Functional Constraints
Authors:
Liang Zhang,
Niao He,
Michael Muehlebach
Abstract:
Constrained variational inequality problems are recognized for their broad applications across various fields including machine learning and operations research. First-order methods have emerged as the standard approach for solving these problems due to their simplicity and scalability. However, they typically rely on projection or linear minimization oracles to navigate the feasible set, which becomes computationally expensive in practical scenarios featuring multiple functional constraints. Existing efforts to tackle such functional constrained variational inequality problems have centered on primal-dual algorithms grounded in the Lagrangian function. These algorithms along with their theoretical analysis often require the existence and prior knowledge of the optimal Lagrange multipliers. In this work, we propose a simple primal method, termed Constrained Gradient Method (CGM), for addressing functional constrained variational inequality problems, without necessitating any information on the optimal Lagrange multipliers. We establish a non-asymptotic convergence analysis of the algorithm for variational inequality problems with monotone operators under smooth constraints. Remarkably, our algorithms match the complexity of projection-based methods in terms of operator queries for both monotone and strongly monotone settings, while utilizing significantly cheaper oracles based on quadratic programming. Furthermore, we provide several numerical examples to evaluate the efficacy of our algorithms.
Submitted 19 March, 2024;
originally announced March 2024.
-
Data-driven Stabilization of Nitsche's Method
Authors:
S. Saberi,
L. Zhang,
A. Vogel
Abstract:
The weak imposition of essential boundary conditions is an integral aspect of unfitted finite element methods, where the physical boundary does not in general coincide with the computational domain. In this regard, the symmetric Nitsche's method is a powerful technique that preserves the symmetry and variational consistency of the unmodified weak formulation. The stabilization parameter in Nitsche's method plays a crucial role in the stability of the resultant formulation, whose estimation is computationally intensive and dependent on the particular cut configuration using the conventional eigenvalue-based approach. In this work, we employ as a model problem the finite cell method, in which the need for the generation of a boundary-conforming mesh is circumvented by embedding the physical domain in a, typically regular, background mesh. We propose a data-driven estimate based on machine learning methods for the estimation of the stabilization parameter in Nitsche's method that offers an efficient constant-complexity alternative to the eigenvalue-based approach independent of the cut configuration. It is shown, using numerical benchmarks, that the proposed method can estimate the stabilization parameter accurately and is far more computationally efficient. The data-driven estimate can be integrated into existing numerical codes with minimal modifications and, thanks to the wide adoption of accelerators such as GPUs by machine learning frameworks, can be used with virtually no extra implementation cost on GPU devices, further increasing the potential for computational gains over the conventional eigenvalue-based estimate.
Submitted 18 March, 2024;
originally announced March 2024.
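For orientation, the role of the stabilization parameter in the abstract above can be sketched on a Poisson model problem $-Δu = f$ in $Ω$ with $u = g$ on $Γ$. This is the textbook symmetric Nitsche form and is not necessarily the exact formulation used in the paper; the symbols $λ$ (stabilization parameter), $C$ (inverse-estimate constant) and $V_h$ (discrete space) are notation assumed here.

```latex
% Symmetric Nitsche form for the Poisson model problem (assumed notation).
\begin{aligned}
a_h(u,v) &= \int_\Omega \nabla u\cdot\nabla v\,dx
          - \int_\Gamma (\partial_n u)\,v\,ds
          - \int_\Gamma u\,(\partial_n v)\,ds
          + \lambda \int_\Gamma u\,v\,ds,\\
\ell_h(v) &= \int_\Omega f\,v\,dx
          - \int_\Gamma g\,(\partial_n v)\,ds
          + \lambda \int_\Gamma g\,v\,ds.
\end{aligned}

% If C satisfies the discrete inverse estimate
%   \|\partial_n v\|_{L^2(\Gamma)}^2 \le C\,\|\nabla v\|_{L^2(\Omega)}^2
%   for all v in V_h,
% then Young's inequality with any \varepsilon \in (1/\lambda, 1/C) gives
a_h(v,v) \;\ge\; (1-\varepsilon C)\,\|\nabla v\|_{L^2(\Omega)}^2
          + \Bigl(\lambda-\tfrac{1}{\varepsilon}\Bigr)\|v\|_{L^2(\Gamma)}^2 ,
\qquad\text{so } \lambda > C \text{ suffices for coercivity.}
```

The smallest admissible $C$ is the largest eigenvalue of a generalized eigenvalue problem posed on $V_h$, which is why the conventional estimate depends on how the boundary cuts each cell.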
-
Quantum Circuits for partial differential equations via Schrödingerisation
Authors:
Junpeng Hu,
Shi **,
Nana Liu,
Lei Zhang
Abstract:
Quantum computing has emerged as a promising avenue for achieving significant speedup, particularly in large-scale PDE simulations, compared to classical computing. One of the main quantum approaches involves utilizing Hamiltonian simulation, which is directly applicable only to Schrödinger-type equations. To address this limitation, Schrödingerisation techniques have been developed, employing the warped transformation to convert general linear PDEs into Schrödinger-type equations. However, despite the development of Schrödingerisation techniques, the explicit implementation of the corresponding quantum circuit for solving general PDEs remains to be designed. In this paper, we present a detailed implementation of a quantum algorithm for general PDEs using Schrödingerisation techniques. We provide examples of the heat equation, and the advection equation approximated by the upwind scheme, to demonstrate the effectiveness of our approach. Complexity analysis is also carried out to demonstrate the quantum advantages of these algorithms in high dimensions over their classical counterparts.
Submitted 12 May, 2024; v1 submitted 15 March, 2024;
originally announced March 2024.
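As a rough sketch of the warped transformation mentioned in the abstract above, consider a semi-discretized linear system $du/dt = Au$. The derivation below follows the commonly presented form of Schrödingerisation; the symbols $H_1$, $H_2$, $w$, $p$ and $ξ$ are notation assumed here, and the paper's own conventions may differ.

```latex
% Split A into Hermitian parts (assumed notation):
A = H_1 + i H_2, \qquad
H_1 = \tfrac{1}{2}\,(A + A^\dagger), \qquad
H_2 = \tfrac{1}{2i}\,(A - A^\dagger).

% Warped phase transformation with an auxiliary variable p > 0:
w(t,p) = e^{-p}\,u(t)
\;\Longrightarrow\;
\partial_p w = -w ,
\qquad
\partial_t w = (H_1 + iH_2)\,w = -H_1\,\partial_p w + i H_2\, w .

% Fourier transform in p (\partial_p \to i\xi) gives, for each mode \xi,
\partial_t \hat{w}(t,\xi) = -\,i\,\bigl(\xi H_1 - H_2\bigr)\,\hat{w}(t,\xi),
```

Since $ξH_1 - H_2$ is Hermitian for every real $ξ$, each Fourier mode evolves unitarily, which is what makes Hamiltonian simulation applicable. The usual presentation additionally assumes that $H_1$ is negative semidefinite and extends $w$ to $p < 0$ before transforming.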
-
Efficient Matching Boundary Conditions of Two-dimensional Honeycomb Lattice for Atomic Simulations
Authors:
Baiyili Liu,
Songsong Ji,
Gang Pang,
Shaoqiang Tang,
Lei Zhang
Abstract:
In this paper, we design a series of matching boundary conditions for a two-dimensional compound honeycomb lattice, which have an explicit and simple form, high computing efficiency and good effectiveness in suppressing boundary reflections. First, we formulate the dynamic equations and calculate the dispersion relation for the harmonic honeycomb lattice, then symmetrically choose specific atoms near the boundary to design different forms of matching boundary conditions. The boundary coefficients are determined by matching a residual function at some selected wavenumbers. Several atomic simulations are performed to test the effectiveness of matching boundary conditions in the example of a harmonic honeycomb lattice and a nonlinear honeycomb lattice with the FPU-$β$ potential. Numerical results illustrate that low-order matching boundary conditions mainly treat long waves, while the high-order matching boundary conditions can efficiently suppress short waves and long waves simultaneously. Decaying kinetic energy curves indicate the stability of matching boundary conditions in numerical simulations.
Submitted 6 February, 2024;
originally announced March 2024.
-
pETNNs: Partial Evolutionary Tensor Neural Networks for Solving Time-dependent Partial Differential Equations
Authors:
Tunan Kao,
** Zhao,
Lei Zhang
Abstract:
We present partial evolutionary tensor neural networks (pETNNs), a novel framework for solving time-dependent partial differential equations with both high accuracy and remarkable extrapolation. Our proposed architecture leverages the inherent accuracy of tensor neural networks, while incorporating evolutionary parameters that enable remarkable extrapolation capabilities. By adopting innovative parameter update strategies, the pETNNs achieve a significant reduction in computational cost while maintaining precision and robustness. Notably, the pETNNs enhance the accuracy of conventional evolutional deep neural networks and empower computational abilities to address high-dimensional problems. Numerical experiments demonstrate the superior performance of the pETNNs in solving time-dependent complex equations, including the Navier-Stokes equations, high-dimensional heat equation, high-dimensional transport equation and Korteweg-de Vries type equation.
Submitted 9 March, 2024;
originally announced March 2024.
-
Low-rank Tensor Autoregressive Predictor for Third-Order Time-Series Forecasting
Authors:
Haoning Wang,
Li** Zhang,
Shengbo Eben Li
Abstract:
Recently, tensor time-series forecasting has gained increasing attention, whose core requirement is how to perform dimensionality reduction. Among all multidimensional data, the third-order tensor is the most prevalent structure in real-world scenarios, such as RGB images and network traffic data. Previous studies in this field are mainly based on tensor Tucker decomposition and such methods have limitations in terms of computational cost, with iteration complexity of approximately $O(2n^3r)$, where $n$ and $r$ are the dimension and rank of the original tensor data. Moreover, much real-world data does not exhibit the low-rank property under Tucker decomposition, which may cause the dimensionality reduction to fail. In this paper, we pioneer the application of tensor singular value decomposition (t-SVD) to third-order time-series, which builds an efficient forecasting algorithm, called Low-rank Tensor Autoregressive Predictor (LOTAP). We observe that tensor tubal rank in t-SVD is always less than Tucker rank, which leads to great benefit in computational complexity. By combining it with the autoregressive (AR) model, the forecasting problem is formulated as a least squares optimization. We divide such an optimization problem by fast Fourier transformation into four decoupled subproblems, whose variables include the regressive coefficient, the f-diagonal tensor, and the left and right orthogonal tensors. The alternating minimization algorithm is proposed with iteration complexity of about $O(n^3 + n^2r^2)$, in which each subproblem has a closed-form solution. Numerical experiments show that, compared to Tucker-decomposition-based algorithms, LOTAP achieves a speed improvement ranging from 2 to 6 times while maintaining accurate forecasting performance in all four baseline tasks. In addition, LOTAP is applicable to a wider range of tensor forecasting tasks due to its more effective dimensionality reduction ability.
Submitted 5 March, 2024;
originally announced March 2024.
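The t-SVD and tubal rank referred to in the abstract above can be summarized as follows. This is a sketch of the standard definitions for a tensor $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, not the LOTAP algorithm itself; the hat notation and the unfoldings $A_{(1)}$, $A_{(2)}$ are notation assumed here.

```latex
% FFT along the third mode, then a matrix SVD of every frontal slice:
\hat{\mathcal{A}} = \mathrm{fft}(\mathcal{A}, [\,], 3), \qquad
\hat{A}^{(k)} = \hat{U}^{(k)}\,\hat{S}^{(k)}\,\bigl(\hat{V}^{(k)}\bigr)^{\mathsf{H}},
\quad k = 1,\dots,n_3 .

% Inverse FFT of the factors gives the t-SVD under the t-product *:
\mathcal{A} = \mathcal{U} * \mathcal{S} * \mathcal{V}^{\mathsf{H}},
\qquad \mathcal{S} \text{ f-diagonal}.

% Tubal rank: number of nonzero singular tubes of S.
\operatorname{rank}_t(\mathcal{A})
  = \#\{\, i : \mathcal{S}(i,i,:) \neq 0 \,\}
  = \max_{1 \le k \le n_3} \operatorname{rank}\bigl(\hat{A}^{(k)}\bigr).

% Comparison with the mode-1 and mode-2 Tucker (multilinear) ranks:
\operatorname{rank}_t(\mathcal{A})
  \;\le\; \min\bigl\{ \operatorname{rank}(A_{(1)}),\ \operatorname{rank}(A_{(2)}) \bigr\}.
```

The last line is the commonly stated non-strict bound. It holds because every Fourier-domain slice $\hat{A}^{(k)}$ is a linear combination of the frontal slices of $\mathcal{A}$, so its column and row spaces lie inside those of the mode-1 and mode-2 unfoldings.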
-
SGD with Partial Hessian for Deep Neural Networks Optimization
Authors:
Ying Sun,
Hongwei Yong,
Lei Zhang
Abstract:
Due to the effectiveness of second-order algorithms in solving classical optimization problems, designing second-order optimizers to train deep neural networks (DNNs) has attracted much research interest in recent years. However, because of the very high dimension of intermediate features in DNNs, it is difficult to directly compute and store the Hessian matrix for network optimization. Most of the previous second-order methods approximate the Hessian information imprecisely, resulting in unstable performance. In this work, we propose a compound optimizer, which is a combination of a second-order optimizer with a precise partial Hessian matrix for updating channel-wise parameters and the first-order stochastic gradient descent (SGD) optimizer for updating the other parameters. We show that the associated Hessian matrices of channel-wise parameters are diagonal and can be extracted directly and precisely from Hessian-free methods. The proposed method, namely SGD with Partial Hessian (SGD-PH), inherits the advantages of both first-order and second-order optimizers. Compared with first-order optimizers, it adopts a certain amount of information from the Hessian matrix to assist optimization, while compared with the existing second-order optimizers, it keeps the good generalization performance of first-order optimizers. Experiments on image classification tasks demonstrate the effectiveness of our proposed optimizer SGD-PH. The code is publicly available at https://github.com/myingysun/SGDPH.
Submitted 5 March, 2024;
originally announced March 2024.
-
Extremal problems about the order and size of nonhamiltonian locally linear graphs
Authors:
Feng Liu,
Leilei Zhang
Abstract:
The study of the relationship between local and global properties of mathematical objects has always been a key subject of investigation in different areas of mathematics. A graph $G$ is called locally linear if the neighbourhood of every vertex of $G$ induces a path. Similarly, $G$ is called locally hamiltonian (traceable) if the neighbourhood of every vertex of $G$ induces a hamiltonian (traceable) graph. The local properties of graphs are being studied extensively. For example, the minimum order of a nonhamiltonian (or nontraceable) locally hamiltonian (or traceable) graph has been determined. In this paper, we determine the minimum order of a nonhamiltonian locally linear graph and the minimum size of a nonhamiltonian locally linear graph of a given order.
Submitted 25 February, 2024;
originally announced February 2024.
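The definition of a locally linear graph used in the abstract above (the neighbourhood of every vertex induces a path) can be checked mechanically. The following is a minimal sketch in plain Python, not code from the paper; the function names `build_adj`, `induces_path` and `is_locally_linear` are hypothetical, and a single vertex is assumed to count as a trivial path.

```python
def build_adj(edges):
    """Build an adjacency map {vertex: set of neighbours} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def induces_path(adj, vertices):
    """Return True if the subgraph induced by `vertices` is a path.

    A graph is a path exactly when it is connected, has n - 1 edges
    (so it is a tree) and has no vertex of degree above 2.
    Assumption: one vertex counts as a path, the empty set does not.
    """
    vs = set(vertices)
    n = len(vs)
    if n == 0:
        return False
    # Degrees inside the induced subgraph only.
    deg = {v: len(adj[v] & vs) for v in vs}
    if sum(deg.values()) // 2 != n - 1:
        return False
    if any(d > 2 for d in deg.values()):
        return False
    # Depth-first search to confirm connectivity.
    start = next(iter(vs))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj[u] & vs:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


def is_locally_linear(adj):
    """Return True if every vertex neighbourhood induces a path."""
    return all(induces_path(adj, adj[v]) for v in adj)
```

For instance, a triangle passes the check because each neighbourhood is a single edge, while $K_4$ fails because each neighbourhood is a triangle rather than a path.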