
Convergence of Implicit Gradient Descent for Training Two-Layer Physics-Informed Neural Networks

Authors: Xianliang Xu, Zhongyi Huang, Ye Li

Abstract: Optimization algorithms are crucial in training physics-informed neural networks (PINNs); unsuitable methods may lead to poor solutions. Compared to the common gradient descent algorithm, implicit gradient descent (IGD) performs better on some multi-scale problems. In this paper, we provide a convergence analysis of implicit gradient descent for training over-parametrized two-layer PINNs. We first demonstrate the positive definiteness of Gram matrices for general smooth activation functions, such as the sigmoidal, softplus, and tanh functions. Over-parameterization then allows us to show that randomly initialized IGD converges to a globally optimal solution at a linear rate. Moreover, due to the different training dynamics, the learning rate of IGD can be chosen independently of the sample size and the least eigenvalue of the Gram matrix.

Submitted 3 July, 2024; originally announced July 2024.
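
The contrast between explicit and implicit gradient descent in the abstract above can be illustrated on a toy problem. The following is a minimal sketch, assuming a one-dimensional quadratic loss L(theta) = (lam / 2) * theta^2 rather than a PINN loss; the names `explicit_gd_step`, `implicit_gd_step`, `lam`, and `eta` are hypothetical and do not come from the paper. For this loss the implicit update theta_next = theta - eta * lam * theta_next can be solved in closed form.

```python
def explicit_gd_step(theta, lam, eta):
    # Explicit (forward Euler) step: the gradient is evaluated at the current iterate.
    return theta - eta * lam * theta


def implicit_gd_step(theta, lam, eta):
    # Implicit (backward Euler) step: solve theta_next = theta - eta * lam * theta_next.
    return theta / (1.0 + eta * lam)


lam = 100.0   # curvature of the toy quadratic (assumed value)
eta = 0.1     # step size with eta * lam = 10, far beyond the explicit stability limit of 2
theta_explicit = 1.0
theta_implicit = 1.0
for _ in range(20):
    theta_explicit = explicit_gd_step(theta_explicit, lam, eta)
    theta_implicit = implicit_gd_step(theta_implicit, lam, eta)

print("explicit:", theta_explicit)
print("implicit:", theta_implicit)
```

On this toy problem the explicit iterate is multiplied by (1 - eta * lam) each step and grows in magnitude, while the implicit iterate is multiplied by 1 / (1 + eta * lam) and contracts for any positive step size. This only suggests why an implicit scheme may tolerate larger learning rates; it is not the paper's analysis.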

arXiv:2407.02681 [pdf, other]

Uniform Transformation: Refining Latent Representation in Variational Autoencoders

Authors: Ye Shi, C. S. George Lee

Abstract: Irregular distributions in latent space cause posterior collapse, misalignment between posterior and prior, and ill-sampling problems in Variational Autoencoders (VAEs). In this paper, we introduce a novel adaptable three-stage Uniform Transformation (UT) module -- Gaussian Kernel Density Estimation (G-KDE) clustering, non-parametric Gaussian Mixture (GM) Modeling, and Probability Integral Transform (PIT) -- to address irregular latent distributions. By reconfiguring irregular distributions into a uniform distribution in the latent space, our approach significantly enhances the disentanglement and interpretability of latent representations, overcoming the limitation of traditional VAE models in capturing complex data structures. Empirical evaluations demonstrated the efficacy of our proposed UT module in improving disentanglement metrics across benchmark datasets -- dSprites and MNIST. Our findings suggest a promising direction for advancing representation learning techniques, with implications for future research in extending this framework to more sophisticated datasets and downstream tasks.

Submitted 2 July, 2024; originally announced July 2024.

Comments: Accepted by 2024 IEEE 20th International Conference on Automation Science and Engineering
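
The last stage named in the abstract above, the Probability Integral Transform, maps samples of a continuous random variable through its own CDF to obtain uniform samples on [0, 1]. The following is a minimal sketch of that single step, assuming the source distribution is a known standard Gaussian; it does not reproduce the G-KDE clustering or Gaussian mixture stages, and the names `standard_normal_cdf`, `samples`, and `uniformized` are hypothetical.

```python
import math
import random


def standard_normal_cdf(x):
    # CDF of N(0, 1) written with the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


rng = random.Random(0)  # fixed seed so the run is reproducible
samples = [rng.gauss(0.0, 1.0) for _ in range(20000)]

# Probability integral transform: push each sample through its own CDF.
uniformized = [standard_normal_cdf(x) for x in samples]

mean_u = sum(uniformized) / len(uniformized)
var_u = sum((u - mean_u) ** 2 for u in uniformized) / len(uniformized)
print("mean:", mean_u, "variance:", var_u)
```

A uniform distribution on [0, 1] has mean 1/2 and variance 1/12, so the printed values should be close to those. In the paper's setting the CDF would come from a fitted mixture model rather than a known Gaussian.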

arXiv:2407.02015 [pdf, other]

Robust First and Second-Order Differentiation for Regularized Optimal Transport

Authors: Xingjie Li, Fei Lu, Molei Tao, Felix X. -F. Ye

Abstract: Applications such as unbalanced and fully shuffled regression can be approached by optimizing regularized optimal transport (OT) distances, such as the entropic OT and Sinkhorn distances. A common approach for this optimization is to use a first-order optimizer, which requires the gradient of the OT distance. For faster convergence, one might also resort to a second-order optimizer, which additionally requires the Hessian. The computations of these derivatives are crucial for efficient and accurate optimization. However, they present significant challenges in terms of memory consumption and numerical instability, especially for large datasets and small regularization strengths. We circumvent these issues by analytically computing the gradients for OT distances and the Hessian for the entropic OT distance, which was not previously used due to intricate tensor-wise calculations and the complex dependency on parameters within the bi-level loss function. Through analytical derivation and spectral analysis, we identify and resolve the numerical instability caused by the singularity and ill-posedness of a key linear system. Consequently, we achieve scalable and stable computation of the Hessian, enabling the implementation of the stochastic gradient descent (SGD)-Newton methods. Tests on shuffled regression examples demonstrate that the second stage of the SGD-Newton method converges orders of magnitude faster than the gradient descent-only method while achieving significantly more accurate parameter estimations.

Submitted 2 July, 2024; originally announced July 2024.

MSC Class: 68Q25; 68R10; 68U05
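
For readers unfamiliar with entropic OT, the inner problem mentioned in the abstract above is commonly solved with Sinkhorn iterations. The following is a minimal sketch of the standard Sinkhorn scaling loop on a 2x2 example, assuming a fixed regularization strength `eps`; it does not implement the paper's analytic gradient or Hessian, and the function name `sinkhorn` and all numeric values are hypothetical.

```python
import math


def sinkhorn(cost, a, b, eps=0.5, iters=500):
    # Gibbs kernel K_ij = exp(-C_ij / eps) for entropic regularization strength eps.
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals a and b.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P_ij = u_i * K_ij * v_j.
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


cost = [[0.0, 1.0], [1.0, 0.0]]
a = [0.5, 0.5]   # source marginal
b = [0.3, 0.7]   # target marginal
plan = sinkhorn(cost, a, b)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(2)) for j in range(2)]
print(row_sums, col_sums)
```

Smaller values of `eps` make the kernel entries very small, which is one source of the numerical instability the abstract refers to; practical implementations often work in the log domain for that reason.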

arXiv:2406.19788 [pdf, ps, other]

Involves averaging arithmetic and integral partial functions over sparse set

Authors: Zhaoxi Ye, Zhefeng Xu

Abstract: Let $p$ be a prime number, $k\ge 0$ and $f$ be a class of arithmetic functions satisfying some simple conditions. In this short paper, we study the asymptotic behaviour of the summation functions $$ψ_{f,k}(x):=\sum_{n\le x}Λ(n)\frac{f\left(\left[\frac{x}{n}\right]\right)}{\left[\frac{x}{n}\right]^{k}}, \qquad π_{f,k}(x):=\sum_{p\le x}\frac{f\left(\left[\frac{x}{p}\right]\right)}{\left[\frac{x}{p}\right]^{k}}$$ as $x\to\infty$, where $\left[\cdot\right]$ is the integral part function and $Λ(n)$ is the von Mangoldt function.

Submitted 28 June, 2024; originally announced June 2024.

arXiv:2406.19619 [pdf, other]

ScoreFusion: fusing score-based generative models via Kullback-Leibler barycenters

Authors: Hao Liu, Junze Ye, Jose Blanchet, Nian Si

Abstract: We study the problem of fusing pre-trained (auxiliary) generative models to enhance the training of a target generative model. We propose using KL-divergence weighted barycenters as an optimal fusion mechanism, in which the barycenter weights are optimally trained to minimize a suitable loss for the target population. While computing the optimal KL-barycenter weights can be challenging, we demonstrate that this process can be efficiently executed using diffusion score training when the auxiliary generative models are also trained based on diffusion score methods. Moreover, we show that our fusion method has a dimension-free sample complexity in total variation distance provided that the auxiliary models are well fitted for their own task and the auxiliary tasks combined capture the target well. The main takeaway of our method is that if the auxiliary models are well-trained and can borrow features from each other that are present in the target, our fusion method significantly improves the training of generative models. We provide a concise computational implementation of the fusion algorithm, and validate its efficiency in the low-data regime with numerical experiments involving mixture models and image datasets.

Submitted 27 June, 2024; originally announced June 2024.

Comments: 40 pages, 6 figures

arXiv:2406.19377 [pdf, ps, other]

Grassmannian optimization is NP-hard

Authors: Zehua Lai, Lek-Heng Lim, Ke Ye

Abstract: We show that unconstrained quadratic optimization over a Grassmannian $\operatorname{Gr}(k,n)$ is NP-hard. Our results cover all scenarios: (i) when $k$ and $n$ are both allowed to grow; (ii) when $k$ is arbitrary but fixed; (iii) when $k$ is fixed at its lowest possible value $1$. We then deduce the NP-hardness of unconstrained cubic optimization over the Stiefel manifold $\operatorname{V}(k,n)$ and the orthogonal group $\operatorname{O}(n)$. As an addendum we demonstrate the NP-hardness of unconstrained quadratic optimization over the Cartan manifold, i.e., the positive definite cone $\mathbb{S}^n_{\scriptscriptstyle++}$ regarded as a Riemannian manifold, another popular example in manifold optimization. We will also establish the nonexistence of $\mathrm{FPTAS}$ in all cases.

Submitted 27 June, 2024; originally announced June 2024.

Comments: 19 pages

MSC Class: 03D15; 90C26; 90C23; 65K10; 68Q25; 90C60

arXiv:2406.18147 [pdf, ps, other]

Correlation entropy of free semigroup actions

Authors: Xiaojiang Ye, Yanjie Tang, Dongkui Ma

Abstract: This paper introduces the concepts of correlation entropy and local correlation entropy for free semigroup actions on compact metric spaces, and explores their fundamental properties. Thereafter, we generalize some classical results on correlation entropy and local correlation entropy to free semigroup actions. Finally, we establish the relationship between topological entropy, measure-theoretic entropy, correlation entropy, and local correlation entropy for free semigroup actions under various conditions.

Submitted 26 June, 2024; originally announced June 2024.

Comments: 35 pages

arXiv:2406.15713 [pdf, other]

Efficient Low-rank Identification via Accelerated Iteratively Reweighted Nuclear Norm Minimization

Authors: Hao Wang, Ye Wang, Xiangyu Yang

Abstract: This paper considers the problem of minimizing the sum of a smooth function and the Schatten-$p$ norm of the matrix. Our contribution involves proposing accelerated iteratively reweighted nuclear norm methods designed for solving the nonconvex low-rank minimization problem. Two major novelties characterize our approach. Firstly, the proposed method possesses a rank identification property, enabling the provable identification of the "correct" rank of the stationary point within a finite number of iterations. Secondly, we introduce an adaptive updating strategy for smoothing parameters. This strategy automatically fixes parameters associated with zero singular values as constants upon detecting the "correct" rank while quickly driving the rest of the parameters to zero. This adaptive behavior transforms the algorithm into one that effectively solves smooth problems after a few iterations, setting our work apart from existing iteratively reweighted methods for low-rank optimization. We prove the global convergence of the proposed algorithm, guaranteeing that every limit point of the iterates is a critical point. Furthermore, a local convergence rate analysis is provided under the Kurdyka-Łojasiewicz property. We conduct numerical experiments using both synthetic and real data to showcase our algorithm's efficiency and superiority over existing methods.

Submitted 26 June, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

Comments: Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2406.13241 [pdf, ps, other]

Achirality of Sol 3-Manifolds, Stevenhagen Conjecture and Shimizu's L-series

Authors: Ye Tian, Shicheng Wang, Zhongzi Wang

Abstract: A closed orientable manifold is {\em achiral} if it admits an orientation-reversing homeomorphism. A commensurable class of closed manifolds is achiral if it contains an achiral element, or equivalently, each manifold in $\CM$ has an achiral finite cover. Each commensurable class containing non-orientable elements must be achiral. It is natural to wonder how many commensurable classes are achiral and how many achiral classes have non-orientable elements. We study this problem for Sol 3-manifolds. Each commensurable class $\CM$ of Sol 3-manifolds has a complete topological invariant $D_{\CM}$, the discriminant of $\CM$. Our main result is: (1) Among all commensurable classes of Sol 3-manifolds, there are infinitely many achiral classes; however, ordered by discriminants, the density of achiral commensurable classes is 0. (2) Among all achiral commensurable classes of Sol 3-manifolds, ordered by discriminants, the density of classes containing non-orientable elements is $1-ρ$, where $$ρ:=\prod_{j=1}^\infty \left(1+2^{-j}\right)^{-1} = 0.41942\cdots.$$

Submitted 19 June, 2024; originally announced June 2024.

Comments: 19 pages
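
The numerical value of the constant in the abstract above can be checked directly. The following is a minimal sketch that truncates the infinite product after a fixed number of factors; the function name `rho` and the truncation length of 60 are assumptions chosen for illustration, and the neglected factors differ from 1 by less than 2^-60.

```python
def rho(terms=60):
    # Partial product of (1 + 2^-j) for j = 1..terms, then inverted.
    product = 1.0
    for j in range(1, terms + 1):
        product *= 1.0 + 2.0 ** (-j)
    return 1.0 / product


rho_value = rho()
print("rho:", rho_value)
print("1 - rho:", 1.0 - rho_value)
```

The second printed number corresponds to the density $1-ρ$ of achiral classes containing non-orientable elements stated in part (2).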

arXiv:2406.12839 [pdf, other]

Evaluating the design space of diffusion-based generative models

Authors: Yuqing Wang, Ye He, Molei Tao

Abstract: Most existing theoretical investigations of the accuracy of diffusion models, albeit significant, assume the score function has been approximated to a certain accuracy, and then use this a priori bound to control the error of generation. This article instead provides a first quantitative understanding of the whole generation process, i.e., both training and sampling. More precisely, it conducts a non-asymptotic convergence analysis of denoising score matching under gradient descent. In addition, a refined sampling error analysis for variance exploding models is also provided. The combination of these two results yields a full error analysis, which elucidates (again, but this time theoretically) how to design the training and sampling processes for effective generation. For instance, our theory implies a preference toward noise distribution and loss weighting that qualitatively agree with the ones used in [Karras et al. 2022]. It also provides some perspectives on why the time and variance schedule used in [Karras et al. 2022] could be better tuned than the pioneering version in [Song et al. 2020].

Submitted 18 June, 2024; originally announced June 2024.

Comments: Comments are welcome

arXiv:2406.12215 [pdf, other]

Discrete Variable Topology Optimization Using Multi-Cut Formulation and Adaptive Trust Regions

Authors: Zisheng Ye, Wenxiao Pan

Abstract: We present a new framework for solving general topology optimization (TO) problems that find an optimal material distribution within a design space to maximize the performance of a structure while satisfying design constraints. These problems involve state variables that nonlinearly depend on the design variables, with objective functions that can be convex or non-convex, and may include multiple candidate materials. The framework is designed to greatly enhance computational efficiency, primarily by diminishing optimization iteration counts and thereby reducing the solving of associated state-equilibrium partial differential equations (PDEs). It maintains binary design variables and addresses the large-scale mixed integer nonlinear programming (MINLP) problem that arises from discretizing the design space and PDEs. The core of this framework is the integration of the generalized Benders' decomposition and adaptive trust regions. The trust-region radius adapts based on a merit function. To mitigate ill-conditioning due to extreme parameter values, we further introduce a parameter relaxation scheme where two parameters are relaxed in stages at different paces. Numerical tests validate the framework's superior performance, including minimum compliance and compliant mechanism problems in single-material and multi-material designs. We compare our results with those of other methods and demonstrate significant reductions in optimization iterations by about one order of magnitude, while maintaining comparable optimal objective function values. As the design variables and constraints increase, the framework maintains consistent solution quality and efficiency, underscoring its good scalability. We anticipate this framework will be especially advantageous for TO applications involving substantial design variables and constraints and requiring significant computational resources for PDE solving.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.11821 [pdf, ps, other]

Simple matrix expressions for the curvatures of Grassmannian

Authors: Zehua Lai, Lek-Heng Lim, Ke Ye

Abstract: We show that modeling a Grassmannian as symmetric orthogonal matrices $\operatorname{Gr}(k,\mathbb{R}^n) \cong\{Q \in \mathbb{R}^{n \times n} : Q^{\scriptscriptstyle\mathsf{T}} Q = I, \; Q^{\scriptscriptstyle\mathsf{T}} = Q,\; \operatorname{tr}(Q)=2k - n\}$ yields exceedingly simple matrix formulas for various curvatures and curvature-related quantities, both intrinsic and extrinsic. These include Riemann, Ricci, Jacobi, sectional, scalar, mean, principal, and Gaussian curvatures; Schouten, Weyl, Cotton, Bach, Plebański, cocurvature, nonmetricity, and torsion tensors; first, second, and third fundamental forms; Gauss and Weingarten maps; and upper and lower delta invariants. We will derive explicit, simple expressions for the aforementioned quantities in terms of standard matrix operations that are stably computable with numerical linear algebra. Many of these aforementioned quantities have never before been presented for the Grassmannian.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 25 pages

MSC Class: 15A75; 14M15
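
The three conditions defining the matrix model in the abstract above can be checked on a small example. The following is a minimal sketch, assuming the usual identification of a $k$-dimensional subspace with $Q = 2P - I$, where $P$ is the orthogonal projection onto the subspace; the abstract itself does not spell out this map, so it is an assumption here, and the helper names `involution_from_unit_vector` and `matmul` are hypothetical. The example takes $k = 1$, $n = 3$.

```python
def involution_from_unit_vector(u):
    # Q = 2 * u u^T - I, the reflection fixing the line spanned by the unit vector u.
    n = len(u)
    return [[2.0 * u[i] * u[j] - (1.0 if i == j else 0.0) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    # Plain triple-loop matrix product for small dense matrices.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


u = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]   # unit vector spanning a line in R^3
Q = involution_from_unit_vector(u)
Qt = [list(column) for column in zip(*Q)]  # transpose of Q
QtQ = matmul(Qt, Q)
trace_Q = sum(Q[i][i] for i in range(3))

print("Q^T Q:", QtQ)
print("trace:", trace_Q)   # expected to be close to 2k - n = 2*1 - 3 = -1
```

Here $Q$ has eigenvalue $+1$ on the subspace and $-1$ on its orthogonal complement, which is why the trace equals $k - (n - k) = 2k - n$.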

arXiv:2406.05337 [pdf, ps, other]

doi 10.1016/j.jfa.2024.110527

On Onsager's type conjecture for the inviscid Boussinesq equations

Authors: Changxing Miao, Yao Nie, Weikui Ye

Abstract: In this paper, we investigate the Cauchy problem for the three-dimensional inviscid Boussinesq system in the periodic setting. For $1\le p\le \infty$, we show that the threshold regularity exponent for $L^p$-norm conservation of temperature of this system is $1/3$, consistent with the Onsager exponent. More precisely, for $1\le p\le\infty$, every weak solution $(v,θ)\in C_tC^β_x$ to the inviscid Boussinesq equations satisfies $\|θ(t)\|_{L^p(\mathbb{T}^3)}=\|θ_0\|_{L^p(\mathbb{T}^3)}$ if $β>\frac{1}{3}$, while if $β<\frac{1}{3}$, there exist infinitely many weak solutions $(v,θ)\in C_tC^β_x$ such that the $L^p$-norm of temperature is not conserved. As a byproduct, we are able to construct many weak solutions in $C_tC^β_x$ for $β<\frac{1}{3}$ displaying wild behavior, such as fast kinetic energy dissipation and high oscillation of velocity. Moreover, we also show that if a weak solution $(v, θ)$ of this system has at least one interval of regularity, then this weak solution $(v,θ)$ is not unique in $C_tC^β_x$ for $β<\frac{1}{3}$.

Submitted 7 June, 2024; originally announced June 2024.

Journal ref: Journal of Functional Analysis 287 (2024) 110527

arXiv:2406.03076 [pdf, ps, other]

Fourier integral operators on Hardy spaces with Hörmander class

Authors: Ye Xiaofeng, Zhang Chunjie, Zhu Xiangrong

Abstract: In this note, we consider a Fourier integral operator defined by \begin{align*} T_{φ,a}f(x)=\int_{\mathbb{R}^{n}}e^{iφ(x,ξ)}a(x,ξ)\widehat{f}(ξ)dξ, \end{align*} where $a$ is the amplitude, and $φ$ is the phase. Let $0\leqρ\leq 1,n\geq 2$ or $0\leqρ<1,n=1$ and $$m_p=\frac{ρ-n}{p}+(n-1)\min\{\frac 12,ρ\}.$$ If $a$ belongs to the forbidden Hörmander class $S^{m_p}_{ρ,1}$ and $φ\in Φ^{2}$ satisfies the strong non-degeneracy condition, then for any $\frac {n}{n+1}<p\leq 1$, we can show that the Fourier integral operator $T_{φ,a}$ is bounded from the local Hardy space $h^p$ to $L^p$. Furthermore, if $a$ has compact support in variable $x$, then we can extend this result to $0<p\leq 1$. As $S^{m_p}_{ρ,δ}\subset S^{m_p}_{ρ,1}$ for any $0\leq δ\leq 1$, our result supplements and improves upon recent theorems proved by Staubach and his collaborators for $a\in S^{m}_{ρ,δ}$ when $δ$ is close to 1. As an important special case, when $n\geq 2$, we show that $T_{φ,a}$ is bounded from $H^1$ to $L^1$ if $a\in S^{(1-n)/2}_{1,1}$ which is a generalization of the well-known Seeger-Sogge-Stein theorem for $a\in S^{(1-n)/2}_{1,0}$. This result is false when $n=1$ and $a\in S^{0}_{1,1}$.

Submitted 5 June, 2024; originally announced June 2024.

Comments: 24 pages

MSC Class: 42B20; 35S30

arXiv:2406.02909 [pdf, other]

An iterative constraint energy minimizing generalized multiscale finite element method for contact problem

Authors: Zishang Li, Changqing Ye, Eric T. Chung

Abstract: This work presents an Iterative Constraint Energy Minimizing Generalized Multiscale Finite Element Method (ICEM-GMsFEM) for solving the contact problem with high contrast coefficients. The model problem can be characterized by a variational inequality, where we add a penalty term to convert this problem into a non-smooth and non-linear unconstrained minimization problem. The minimizer satisfies the variational form of a mixed Dirichlet-Neumann-Robin boundary value problem, so we apply CEM-GMsFEM iteratively and introduce special boundary correctors along with multiscale spaces to achieve an optimal convergence rate. Numerical experiments are conducted for different highly heterogeneous permeability fields, validating the fast convergence of the CEM-GMsFEM iteration in handling the contact boundary and illustrating the stability of the proposed method with different sets of parameters. We also prove the fast convergence of the proposed iterative CEM-GMsFEM method and provide an error estimate of the multiscale solution under a mild assumption.

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2406.00880 [pdf, ps, other]

Lang-Weil Type Estimates in Finite Difference Fields

Authors: Martin Hils, Ehud Hrushovski, Jinhe Ye, Tingxiang Zou

Abstract: We prove a uniform estimate of the number of points for difference algebraic varieties in finite difference fields in the spirit of Lang-Weil. More precisely, we give uniform lower and upper bounds for the number of rational points of a difference variety in terms of its transformal dimension. As a main technical ingredient, we prove an equidimensionality result for Frobenius reductions of difference varieties.

Submitted 2 June, 2024; originally announced June 2024.

MSC Class: Primary 12H10; 11U09 Secondary 03C60; 03C20

arXiv:2406.00274 [pdf, other]

A Single-Loop Robust Policy Gradient Method for Robust Markov Decision Processes

Authors: Zhenwei Lin, Chenyu Xue, Qi Deng, Yinyu Ye

Abstract: Robust Markov Decision Processes (RMDPs) have recently been recognized as a valuable and promising approach to discovering a policy with creditable performance, particularly in the presence of a dynamic environment and estimation errors in the transition matrix due to limited data. Despite extensive exploration of dynamic programming algorithms for solving RMDPs, there has been a notable upswing in interest in developing efficient algorithms using the policy gradient method. In this paper, we propose the first single-loop robust policy gradient (SRPG) method with the global optimality guarantee for solving RMDPs through its minimax formulation. Moreover, we complement the convergence analysis of the nonconvex-nonconcave min-max optimization problem with the objective function's gradient dominance property, which is not explored in the prior literature. Numerical experiments validate the efficacy of SRPG, demonstrating its faster and more robust convergence behavior compared to its nested-loop counterpart.

Submitted 31 May, 2024; originally announced June 2024.

arXiv:2405.17761 [pdf, other]

Double Variance Reduction: A Smoothing Trick for Composite Optimization Problems without First-Order Gradient

Authors: Hao Di, Haishan Ye, Yueling Zhang, Xiangyu Chang, Guang Dai, Ivor W. Tsang

Abstract: Variance reduction techniques are designed to decrease the sampling variance, thereby accelerating convergence rates of first-order (FO) and zeroth-order (ZO) optimization methods. However, in composite optimization problems, ZO methods encounter an additional variance called the coordinate-wise variance, which stems from the random gradient estimation. To reduce this variance, prior works require estimating all partial derivatives, essentially approximating FO information. This approach demands O(d) function evaluations (d is the dimension size), which incurs substantial computational costs and is prohibitive in high-dimensional scenarios. This paper proposes the Zeroth-order Proximal Double Variance Reduction (ZPDVR) method, which utilizes the averaging trick to reduce both sampling and coordinate-wise variances. Compared to prior methods, ZPDVR relies solely on random gradient estimates, calls the stochastic zeroth-order oracle (SZO) in expectation $\mathcal{O}(1)$ times per iteration, and achieves the optimal $\mathcal{O}(d(n + κ)\log (\frac{1}ε))$ SZO query complexity in the strongly convex and smooth setting, where $κ$ represents the condition number and $ε$ is the desired accuracy. Empirical results validate ZPDVR's linear convergence and demonstrate its superior performance over other related methods.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.17343 [pdf, ps, other]

Bounded geometry for PCF-special subvarieties

Authors: Laura DeMarco, Niki Myrto Mavraki, Hexi Ye

Abstract: For each integer $d\geq 2$, let $M_d$ denote the moduli space of maps $f: \mathbb{P}^1\to \mathbb{P}^1$ of degree $d$. We study the geometric configurations of subsets of postcritically finite (or PCF) maps in $M_d$. A complex-algebraic subvariety $Y \subset M_d$ is said to be PCF-special if it contains a Zariski-dense set of PCF maps. Here we prove that there are only finitely many positive-dimensional irreducible PCF-special subvarieties in $M_d$ with degree $\leq D$. In addition, there exist constants $N = N(D,d)$ and $B = B(D,d)$ so that for any complex algebraic subvariety $X \subset M_d$ of degree $\leq D$, the Zariski closure $\overline{X\cap\mathrm{PCF}}~$ has at most $N$ irreducible components, each with degree $\leq B$. We also prove generalizations of these results for points with small critical height in $M_d(\bar{\mathbb{Q}})$.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.16736 [pdf, ps, other]

A Separation in Heavy-Tailed Sampling: Gaussian vs. Stable Oracles for Proximal Samplers

Authors: Ye He, Alireza Mousavi-Hosseini, Krishnakumar Balasubramanian, Murat A. Erdogdu

Abstract: We study the complexity of heavy-tailed sampling and present a separation result in terms of obtaining high-accuracy versus low-accuracy guarantees, i.e., samplers that require only $O(\log(1/\varepsilon))$ versus $Ω(\text{poly}(1/\varepsilon))$ iterations to output a sample which is $\varepsilon$-close to the target in $χ^2$-divergence. Our results are presented for proximal samplers that are based on Gaussian versus stable oracles. We show that proximal samplers based on the Gaussian oracle have a fundamental barrier in that they necessarily achieve only low-accuracy guarantees when sampling from a class of heavy-tailed targets. In contrast, proximal samplers based on the stable oracle exhibit high-accuracy guarantees, thereby overcoming the aforementioned limitation. We also prove lower bounds for samplers under the stable oracle and show that our upper bounds cannot be fundamentally improved.

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.16252 [pdf, other]

2-torsion in instanton Floer homology

Authors: Zhenkun Li, Fan Ye

Abstract: This paper studies the existence of $2$-torsion in instanton Floer homology with $\mathbb{Z}$ coefficients for closed $3$-manifolds and singular knots. First, we show that the non-existence of $2$-torsion in the framed instanton Floer homology $I^\sharp(S_n^3(K);\mathbb{Z})$ of any nonzero integral $n$-surgery along a knot $K$ in $S^3$ would imply that $K$ is fibered. Also, we show that $I^\sharp(S_{r}^3(K);\mathbb{Z})$ for any nontrivial $K$ with $r=1,1/2,1/4$ always has $2$-torsion. These two results indicate that the existence of $2$-torsion is expected to be a generic phenomenon for Dehn surgeries along knots. Second, we show that for genus-one knots with nontrivial Alexander polynomials and for unknotting-number-one knots, the unreduced singular instanton knot homology $I^\sharp(S^3,K;\mathbb{Z})$ always has $2$-torsion. Finally, some crucial lemmas that help us demonstrate the existence of $2$-torsion are motivated by analogous results in Heegaard Floer theory, which may be of independent interest. In particular, we show that, for a knot $K$ in $S^3$, if there is a nonzero rational number $r$ such that the dual knot $\widetilde{K}_r$ inside $S^3_r(K)$ is Floer simple, then $S^3_r(K)$ must be an L-space and $K$ must be an L-space knot.

Submitted 25 May, 2024; originally announced May 2024.

Comments: 41 pages, 17 figures; comments are welcome

arXiv:2405.16160 [pdf, other]

Restarted Primal-Dual Hybrid Conjugate Gradient Method for Large-Scale Quadratic Programming

Authors: Yicheng Huang, Wanyu Zhang, Hongpei Li, Weihan Xue, Dongdong Ge, Huikang Liu, Yinyu Ye

Abstract: Convex quadratic programming (QP) is an essential class of optimization problems with broad applications across various fields. Traditional QP solvers, typically based on simplex or barrier methods, face significant scalability challenges. In response to these limitations, recent research has shifted towards matrix-free first-order methods to enhance scalability in QP. Among these, the restarted accelerated primal-dual hybrid gradient (rAPDHG) method, proposed by H. Lu (2023), has gained notable attention due to its linear convergence rate to an optimal solution and its straightforward implementation on Graphics Processing Units (GPUs). Building on this framework, this paper introduces a restarted primal-dual hybrid conjugate gradient (PDHCG) method, which incorporates conjugate gradient (CG) techniques to address the primal subproblems inexactly. We demonstrate that PDHCG maintains a linear convergence rate with an improved convergence constant and is also straightforward to implement on GPUs. Extensive numerical experiments affirm that, compared to rAPDHG, our method could significantly reduce the number of iterations required to achieve the desired accuracy and offer a substantial performance improvement in large-scale problems. These findings highlight the significant potential of our proposed PDHCG method to boost both the efficiency and scalability of solving complex QP challenges.

Submitted 25 May, 2024; originally announced May 2024.

arXiv:2405.16126 [pdf, other]

Near-Optimal Distributed Minimax Optimization under the Second-Order Similarity

Authors: Qihao Zhou, Haishan Ye, Luo Luo

Abstract: This paper considers distributed convex-concave minimax optimization under second-order similarity. We propose the stochastic variance-reduced optimistic gradient sliding (SVOGS) method, which takes advantage of the finite-sum structure in the objective by involving mini-batch client sampling and variance reduction. We prove SVOGS can achieve the $\varepsilon$-duality gap within communication rounds of ${\mathcal O}(δD^2/\varepsilon)$, communication complexity of ${\mathcal O}(n+\sqrt{n}δD^2/\varepsilon)$, and local gradient calls of $\tilde{\mathcal O}(n+(\sqrt{n}δ+L)D^2/\varepsilon\log(1/\varepsilon))$, where $n$ is the number of nodes, $δ$ is the degree of the second-order similarity, $L$ is the smoothness parameter and $D$ is the diameter of the constraint set. We can verify that all of the above complexities (nearly) match the corresponding lower bounds. For the specific $μ$-strongly-convex-$μ$-strongly-convex case, our algorithm has upper bounds on communication rounds, communication complexity, and local gradient calls of $\mathcal O(δ/μ\log(1/\varepsilon))$, ${\mathcal O}((n+\sqrt{n}δ/μ)\log(1/\varepsilon))$, and $\tilde{\mathcal O}((n+(\sqrt{n}δ+L)/μ)\log(1/\varepsilon))$ respectively, which are also nearly tight. Furthermore, we conduct numerical experiments to show the empirical advantages of the proposed method.

Submitted 25 May, 2024; originally announced May 2024.

arXiv:2405.11251 [pdf, ps, other]

A refined saturation theorem for polynomials and applications

Authors: Xiangdong Ye, Jiaqi Yu

Abstract: For a dynamical system $(X,T)$, $d\in\mathbb{N}$ and distinct non-constant integral polynomials $p_1,\ldots, p_d$ vanishing at $0$, the notion of regionally proximal relation along $C=\{p_1,\ldots,p_d\}$ (denoted by $RP_C^{[d]}(X,T)$) is introduced. It turns out that for a minimal system, $RP_C^{[d]}(X,T)=Δ$ implies that $X$ is an almost one-to-one extension of $X_k$ for some $k\in\mathbb{N}$ only depending on a set of finite polynomials associated with $C$ and has zero entropy, where $X_k$ is the maximal $k$-step pro-nilfactor of $X$. Particularly, when $C$ is a collection of linear polynomials, it is proved that $RP_C^{[d]}(X,T)=Δ$ implies $(X,T)$ is a $d$-step pro-nilsystem, which answers negatively a conjecture in \cite{5p}. The results are obtained by proving a refined saturation theorem for polynomials.

Submitted 18 May, 2024; originally announced May 2024.

arXiv:2405.11132 [pdf, ps, other]

Quadratic twists of tiling number elliptic curves

Authors: Keqin Feng, Qiuyue Liu, Jinzhao Pan, Ye Tian

Abstract: A positive integer $n$ is called a tiling number if the equilateral triangle can be dissected into $nk^2$ congruent triangles for some integer $k$. An integer $n>3$ is a tiling number if and only if at least one of the elliptic curves $E^{(\pm n)}:\pm ny^2=x(x-1)(x+3)$ has positive Mordell-Weil rank. Let $A$ denote one of the two curves. In this paper, using the Waldspurger formula and an induction method, for $n\equiv 3,7\mod 24$ positive square-free, as well as some other residue classes, we express the parity of the analytic Sha of $A$ in terms of the genus number $g(m):=\#2\mathrm{Cl}(\mathbb{Q}(\sqrt{-m}))$ as $m$ runs over factors of $n$. Together with the $2$-descent method, which expresses $\mathrm{dim}_{\mathbb{F}_2}\mathrm{Sel}_2(A/\mathbb{Q})/A[2]$ in terms of the corank of a matrix with $\mathbb{F}_2$-coefficients, we show that for $n\equiv 3,7\mod 24$ positive square-free, the analytic Sha of $A$ being odd is equivalent to $\mathrm{Sel}_2(A/\mathbb{Q})/A[2]$ being trivial, as predicted by the BSD conjecture. We also show that, among the residue classes $3$, resp. $7\mod 24$, the subset of $n$ such that both $E^{(n)}$ and $E^{(-n)}$ have odd analytic Sha is of limit density $0.288\cdots$ and $0.144\cdots$, respectively; in particular, they are non-tiling numbers. This exhibits two new phenomena on tiling number elliptic curves: firstly, the limit density is different from the general phenomenon on elliptic curves predicted by Bhargava-Kane-Lenstra-Poonen-Rains; secondly, the joint distribution has different behavior among different residue classes.

Submitted 17 May, 2024; originally announced May 2024.

Comments: 25 pages

MSC Class: 11G05 (Primary) 11G40 (Secondary)

arXiv:2405.08822 [pdf, other]

Despite Absolute Information Advantages, All Investors Incur Welfare Loss

Authors: Zongxia Liang, Qi Ye

Abstract: This paper delves into financial markets that incorporate a novel form of heterogeneity among investors, specifically in terms of their beliefs regarding the reliability of signals in the business cycle economy model, which may be biased. Unlike most papers in this field, we not only analyze the equilibrium but also examine welfare using objective measures while investors aim to maximize their utility based on subjective measures. Furthermore, we introduce passive investors and use their utility as a benchmark, thereby revealing the phenomenon of double loss sometimes. In the analysis, we examine two effects: the distortion effect on total welfare and the advantage effect of information and highlight their key factors of influence, with a particular emphasis on the proportion of investors. We also demonstrate that manipulating investors' estimation towards the economy can be a way to improve utility and identify an inner connection between welfare and survival.

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.08605 [pdf, ps, other]

On gradient estimates of the heat semigroups on step-two Carnot groups

Authors: Sheng-Chen Mao, Ye Zhang

Abstract: In this work, we give a sufficient condition for a step-two Carnot group to satisfy the quasi Bakry-Émery curvature condition. As an application, we establish the gradient estimate for the heat semigroup on the free step-two Carnot group with three generators $N_{3,2}$. Moreover, high order gradient estimates and the Riemannian counterparts are also deduced under an extra condition.

Submitted 14 May, 2024; originally announced May 2024.

MSC Class: 58J35; 22E30; 35R03

arXiv:2405.06871 [pdf, other]

Statistical Error of Numerical Integrators for Underdamped Langevin Dynamics with Deterministic And Stochastic Gradients

Authors: Xuda Ye, Zhennan Zhou

Abstract: We propose a novel discrete Poisson equation approach to estimate the statistical error of a broad class of numerical integrators for the underdamped Langevin dynamics. The statistical error refers to the mean square error of the estimator to the exact ensemble average with a finite number of iterations. With the proposed error analysis framework, we show that when the potential function $U(x)$ is strongly convex in $\mathbb R^d$ and the numerical integrator has strong order $p$, the statistical error is $O(h^{2p}+\frac1{Nh})$, where $h$ is the time step and $N$ is the number of iterations. Besides, this approach can be adopted to analyze integrators with stochastic gradients, and quantitative estimates can be derived as well. Our approach only requires the geometric ergodicity of the continuous-time underdamped Langevin dynamics, and relaxes the constraint on the time step.

Submitted 10 May, 2024; originally announced May 2024.

MSC Class: 60H35; 37M05

arXiv:2405.05483 [pdf, ps, other]

Zero-one Grothendieck Polynomials

Authors: Yiming Chen, Neil J. Y. Fan, Zelin Ye

Abstract: Fink, Mészáros and St.Dizier showed that the Schubert polynomial $\mathfrak{S}_w(x)$ is zero-one if and only if $w$ avoids twelve permutation patterns. In this paper, we prove that the Grothendieck polynomial $\mathfrak{G}_w(x)$ is zero-one, i.e., with coefficients either 0 or $\pm$1, if and only if $w$ avoids six patterns. As applications, we show that zero-one homogeneous Grothendieck polynomials are Lorentzian, partially confirming three conjectures of Huh, Matherne, Mészáros and St.Dizier. Moreover, we verify several conjectures on the support and coefficients of Grothendieck polynomials posed by Mészáros, Setiabrata and St.Dizier for the case of zero-one Grothendieck polynomials.

Submitted 8 May, 2024; originally announced May 2024.

Comments: 23 pages, 22 figures

arXiv:2405.05128 [pdf, ps, other]

Degree of the Grassmannian as an affine variety

Authors: Lek-Heng Lim, Ke Ye

Abstract: The degree of the Grassmannian with respect to the Plücker embedding is well-known. However, the Plücker embedding, while ubiquitous in pure mathematics, is almost never used in applied mathematics. In applied mathematics, the Grassmannian is usually embedded as projection matrices $\operatorname{Gr}(k,\mathbb{R}^n) \cong \{P \in \mathbb{R}^{n \times n} : P^{\scriptscriptstyle\mathsf{T}} = P = P^2,\; \operatorname{tr}(P) = k\}$ or as involution matrices $\operatorname{Gr}(k,\mathbb{R}^n) \cong \{X \in \mathbb{R}^{n \times n} : X^{\scriptscriptstyle\mathsf{T}} = X,\; X^2 = I,\; \operatorname{tr}(X)=2k - n\}$. We will determine an explicit expression for the degree of the Grassmannian with respect to these embeddings. In so doing, we resolved a conjecture of Devriendt--Friedman--Sturmfels about the degree of $\operatorname{Gr}(2, \mathbb{R}^n)$ and in fact generalized it to $\operatorname{Gr}(k, \mathbb{R}^n)$. We also proved a set theoretic variant of another conjecture of Devriendt--Friedman--Sturmfels about the limit of $\operatorname{Gr}(k,\mathbb{R}^n)$ in the sense of Gröbner degeneration.

Submitted 8 May, 2024; originally announced May 2024.

Comments: 15 pages

MSC Class: 14E25; 14F45

arXiv:2405.02209 [pdf, ps, other]

Zilber's Trichotomy in Hausdorff Geometric Structures

Authors: Benjamin Castle, Assaf Hasson, Jinhe Ye

Abstract: We give a new axiomatic treatment of the Zilber trichotomy, and use it to complete the proof of the trichotomy for relics of algebraically closed fields, i.e., reducts of the ACF-induced structure on ACF-definable sets. More precisely, we introduce a class of geometric structures equipped with a Hausdorff topology, called \textit{Hausdorff geometric structures}. Natural examples include the complex field; algebraically closed valued fields; o-minimal expansions of real closed fields; and characteristic zero Henselian fields (in particular $p$-adically closed fields). We then study the Zilber trichotomy for relics of Hausdorff geometric structures, showing that under additional assumptions, every non-locally modular strongly minimal relic on a real sort interprets a one-dimensional group. Combined with recent results, this allows us to prove the trichotomy for strongly minimal relics on the real sorts of algebraically closed valued fields. Finally, we make progress on the imaginary sorts, reducing the trichotomy for \textit{all} ACVF relics (in all sorts) to a conjectural technical condition that we prove in characteristic $(0,0)$.

Submitted 3 May, 2024; originally announced May 2024.

MSC Class: 0C345; 14A99

arXiv:2405.00852 [pdf, ps, other]

On manifolds with nonnegative Ricci curvature and the infimum of volume growth order $<2$

Authors: Zhu Ye

Abstract: We prove two rigidity theorems for open (complete and noncompact) $n$-manifolds $M$ with nonnegative Ricci curvature and the infimum of volume growth order $<2$. The first theorem asserts that the Riemannian universal cover of $M$ has Euclidean volume growth if and only if $M$ is flat with an $n-1$ dimensional soul. The second theorem asserts that there exists a nonconstant linear growth harmonic function on $M$ if and only if $M$ is isometric to the metric product $\mathbb{R}\times N$ for some compact manifold $N$.

Submitted 1 May, 2024; originally announced May 2024.

arXiv:2404.19133 [pdf, other]

Parameterized Wasserstein Gradient Flow

Authors: Yijie Jin, Shu Liu, Hao Wu, Xiaojing Ye, Haomin Zhou

Abstract: We develop a fast and scalable numerical approach to solve Wasserstein gradient flows (WGFs), particularly suitable for high-dimensional cases. Our approach is to use general reduced-order models, like deep neural networks, to parameterize the push-forward maps such that they can push a simple reference density to the one solving the given WGF. The new dynamical system is called parameterized WGF (PWGF), and it is defined on the finite-dimensional parameter space equipped with a pullback Wasserstein metric. Our numerical scheme can approximate the solutions of WGFs for general energy functionals effectively, without requiring spatial discretization or nonconvex optimization procedures, thus avoiding some limitations of classical numerical methods and more recent deep-learning-based approaches. A comprehensive analysis of the approximation errors measured by Wasserstein distance is also provided in this work. Numerical experiments show promising computational efficiency and verified accuracy on various WGF examples using our approach.

Submitted 22 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.18538 [pdf, ps, other]

Symmetry group based domain decomposition to enhance physics-informed neural networks for solving partial differential equations

Authors: Ye Liu, Jie-Ying Li, Li-Sheng Zhang, Lei-Lei Guo, Zhi-Yong Zhang

Abstract: Domain decomposition provides an effective way to tackle the dilemma of physics-informed neural networks (PINNs), which struggle to accurately and efficiently solve partial differential equations (PDEs) in the whole domain, but the lack of efficient tools for dealing with the interfaces between two adjacent sub-domains heavily hinders training and can even lead to discontinuities in the learned solutions. In this paper, we propose a symmetry group based domain decomposition strategy to enhance the PINN for solving the forward and inverse problems of PDEs possessing a Lie symmetry group. Specifically, for the forward problem, we first deploy the symmetry group to generate dividing lines with known solution information, which can be adjusted flexibly and are used to divide the whole training domain into a finite number of non-overlapping sub-domains, then utilize the PINN and the symmetry-enhanced PINN methods to learn the solutions in each sub-domain, and finally stitch them into the overall solution of the PDEs. For the inverse problem, we first utilize the symmetry group acting on the data of the initial and boundary conditions to generate labeled data in the interior domain of the PDEs and then find the undetermined parameters as well as the solution by only training the neural networks in a sub-domain. Consequently, the proposed method can predict high-accuracy solutions of PDEs for which the vanilla PINN fails in the whole domain and the extended physics-informed neural network fails in the same sub-domains. Numerical results for the Korteweg-de Vries equation with a translation symmetry and the nonlinear viscous fluid equation with a scaling symmetry show that the accuracy of the learned solutions is largely improved.

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.17741 [pdf, ps, other]

On some relationships between the center and the derived subalgebra in Poisson (2-3)-algebras

Authors: P. Ye. Minaiev, O. O. Pypka, I. V. Shyshenko

Abstract: One of the classic results of group theory is the so-called Schur theorem. It states that if the central factor-group $G/ζ(G)$ of a group $G$ is finite, then its derived subgroup $[G,G]$ is also finite. This result has numerous generalizations and modifications in group theory. At the same time, similar investigations were conducted in other algebraic structures, namely in modules, linear groups, topological groups, $n$-groups, associative algebras, Lie algebras, Lie $n$-algebras, Lie rings, Leibniz algebras. In 2021, L.A. Kurdachenko, O.O. Pypka and I.Ya. Subbotin proved an analogue of Schur theorem for Poisson algebras: if the center of the Poisson algebra $P$ has finite codimension, then $P$ includes an ideal $K$ of finite dimension such that $P/K$ is abelian. In this paper, we continue similar studies for another algebraic structure. An analogue of Schur theorem for Poisson (2-3)-algebras is proved.

Submitted 26 April, 2024; originally announced April 2024.

arXiv:2404.17740 [pdf, ps, other]

On some relationships between the centers and the derived ideal in Leibniz 3-algebras

Authors: P. Ye. Minaiev, O. O. Pypka

Abstract: One of the classic results of group theory is the so-called Schur theorem. It states that if the central factor-group $G/ζ(G)$ of a group $G$ is finite, then its derived subgroup $[G,G]$ is also finite. This result has numerous generalizations and modifications in group theory. At the same time, similar investigations were conducted in other algebraic structures. In 2016, L.A. Kurdachenko, J. Otal and O.O. Pypka proved an analogue of Schur theorem for Leibniz algebras: if central factor-algebra $L/ζ(L)$ of Leibniz algebra $L$ has finite dimension, then its derived ideal $[L,L]$ is also finite-dimensional. Moreover, they also proved a slightly modified analogue of Schur theorem: if the codimensions of the left $ζ^{l}(L)$ and right $ζ^{r}(L)$ centers of Leibniz algebra $L$ are finite, then its derived ideal $[L,L]$ is also finite-dimensional. One of the generalizations of Leibniz algebras is the so-called Leibniz $n$-algebras. Therefore, the question of proving analogs of the above results for this type of algebras naturally arises. In this article, we prove the analogues of the two mentioned theorems for Leibniz 3-algebras.

Submitted 26 April, 2024; originally announced April 2024.

arXiv:2404.17696 [pdf, ps, other]

New second-order optimality conditions for directional optimality of a general set-constrained optimization problem

Authors: Wei Ouyang, Jane Ye, Binbin Zhang

Abstract: In this paper we derive new second-order optimality conditions for a very general set-constrained optimization problem where the underlying set may be nonconvex. We consider local optimality in specific directions (i.e., optimal in a directional neighborhood) in pursuit of developing these new optimality conditions. First-order necessary conditions for local optimality in given directions are provided by virtue of the corresponding directional normal cones. Utilizing the classical and/or the lower generalized support function, we obtain new second-order necessary and sufficient conditions for local optimality of general nonconvex constrained optimization problems in given directions via both the corresponding asymptotic second-order tangent cone and outer second-order tangent set. Our results do not require convexity and/or nonemptiness of the outer second-order tangent set. This is an important improvement over other results in the literature since the outer second-order tangent set can be nonconvex and empty even when the set is convex.

Submitted 26 April, 2024; originally announced April 2024.

arXiv:2404.15054 [pdf, ps, other]

New exotic examples of Ricci limit spaces

Authors: Xilun Li, Yanan Ye, Shengxuan Zhou

Abstract: For any integers $m\geqslant n\geqslant 3$, we construct a Ricci limit space $X_{m,n}$ such that for a fixed point, some tangent cones are $\mathbb{R}^m$ and some are $\mathbb{R}^n$. This is an improvement of Menguy's example.

Submitted 23 April, 2024; originally announced April 2024.

Comments: Comments are welcome

arXiv:2404.14825 [pdf, ps, other]

doi 10.1016/j.jfa.2023.110302

Sharp ill-posedness for the non-resistive MHD equations in Sobolev spaces

Authors: Qionglei Chen, Yao Nie, Weikui Ye

Abstract: In this paper, we prove a sharp ill-posedness result for the incompressible non-resistive MHD equations. In any dimension $d\ge 2$, we show the ill-posedness of the non-resistive MHD equations in $H^{\frac{d}{2}-1}(\mathbb{R}^d)\times H^{\frac{d}{2}}(\mathbb{R}^d)$, which is sharp in view of the results of the local well-posedness in $H^{s-1}(\mathbb{R}^d)\times H^{s}(\mathbb{R}^d)(s>\frac{d}{2})$ established by Fefferman et al.(Arch. Ration. Mech. Anal., \textbf{223} (2), 677-691, 2017). Furthermore, we generalize the ill-posedness results from $H^{\frac{d}{2}-1}(\mathbb{R}^d)\times H^{\frac{d}{2}}(\mathbb{R}^d)$ to Besov spaces $B^{\frac{d}{p}-1}_{p, q}(\mathbb{R}^d)\times B^{\frac{d}{p}}_{p, q}(\mathbb{R}^d)$ and $\dot B^{\frac{d}{p}-1}_{p, q}(\mathbb{R}^d)\times \dot B^{\frac{d}{p}}_{p, q}(\mathbb{R}^d)$ for $1\le p\le\infty, q>1$. Different from the ill-posedness mechanism of the incompressible Navier-Stokes equations in $\dot B^{-1}_{\infty, q}$ \cite{B,W}, we construct an initial data such that the paraproduct terms (low-high frequency interaction) of the nonlinear term make the main contribution to the norm inflation of the magnetic field.

Submitted 23 April, 2024; originally announced April 2024.

Comments: 20 pages

Journal ref: Journal of Functional Analysis, 286 (2024) 110302

arXiv:2404.11134 [pdf, ps, other]

Co-existence of Type II blow-ups with multiple blow-up rates for five-dimensional heat equation with critical nonlinear boundary conditions

Authors: Juncheng Wei, Zikai Ye, Xiaoyu Zeng, Qidi Zhang

Abstract: We consider the following five-dimensional heat equation with critical boundary condition \begin{equation*} \partial_t u=\Delta u \mbox{ \ in \ } \mathbb{R}_+^5\times (0,T) , \quad -\partial_{x_5}u =|u|^\frac{2}{3}u \mbox{ \ on \ } \partial \mathbb{R}^5_+ \times (0,T) . \end{equation*} Given $\mathfrak{o}$ distinct boundary points $q^{[i]} \in \partial \mathbb{R}_+^5$, and $\mathfrak{o}$ integers $l_i\in \mathbb{N}$ (possibly duplicated), $i=1,2,\dots, \mathfrak{o}$, for $T>0$ sufficiently small, we construct a finite-time blow-up solution $u$ with a type II blow-up rate $(T-t)^{-3l_i -3}$ for $x$ near $q^{[i]}$. This seems to be the first result of the co-existence of type II blowups with different blow-up rates. To accommodate highly unstable blowups with different blowup rates, we first develop a unified linear theory for the inner problem with more time decay in the blow-up scheme through restriction on the spatial growth of the right-hand side, and then use vanishing adjustment functions for deriving multiple rates at distinct points. This paper is inspired by [25, 52, 60].

Submitted 17 April, 2024; originally announced April 2024.

Comments: 59 pages; comments welcome

arXiv:2404.10559 [pdf, ps, other]

Nonlinear kernel-free quadratic hyper-surface support vector machine with 0-1 loss function

Authors: Mingyang Wu, Zhixia Yang, Junyou Ye

Abstract: For the binary classification problem, a novel nonlinear kernel-free quadratic hyper-surface support vector machine with 0-1 loss function (QSSVM$_{0/1}$) is proposed. Specifically, the task of QSSVM$_{0/1}$ is to seek a quadratic separating hyper-surface to divide the samples into two categories. It has better interpretability than the methods using kernel functions, since each feature of the sample acts both independently and synergistically. Introducing the 0-1 loss function into the optimization model gives the model strong sample sparsity. The proximal stationary point of the optimization problem is defined by the proximal operator of the 0-1 loss function, which addresses the non-convexity and discontinuity that the 0-1 loss function introduces into the optimization problem. A new iterative algorithm based on the alternating direction method of multipliers (ADMM) framework is designed to solve the optimization problem, which relates to the working set defined by support vectors. The computational complexity and convergence of the algorithm are discussed. Numerical experiments on 4 artificial datasets and 14 benchmark datasets demonstrate that our QSSVM$_{0/1}$ achieves higher classification accuracy, fewer support vectors and less CPU time cost than other state-of-the-art methods.

Submitted 16 April, 2024; originally announced April 2024.

arXiv:2404.10145 [pdf, ps, other]

Nonnegative Ricci curvature, splitting at infinity, and first Betti number rigidity

Authors: Jiayin Pan, Zhu Ye

Abstract: We study the rigidity problems for open (complete and noncompact) $n$-manifolds with nonnegative Ricci curvature. We prove that if an asymptotic cone of $M$ properly contains a Euclidean $\mathbb{R}^{k-1}$, then the first Betti number of $M$ is at most $n-k$; moreover, if equality holds, then $M$ is flat. Next, we study the geometry of the orbit $Γ\tilde{p}$, where $Γ=π_1(M,p)$ acts on the universal cover $(\widetilde{M},\tilde{p})$. Under a similar asymptotic condition, we prove a geometric rigidity in terms of the growth order of $Γ\tilde{p}$. We also give the first example of a manifold $M$ of $\mathrm{Ric}>0$ and $π_1(M)=\mathbb{Z}$ but with a varying orbit growth order.

Submitted 15 April, 2024; originally announced April 2024.

arXiv:2404.09804 [pdf, other]

The $L_p$ dual Minkowski problem for unbounded closed convex sets

Authors: Wen Ai, Yunlong Yang, Deping Ye

Abstract: The central focus of this paper is the $L_p$ dual Minkowski problem for $C$-compatible sets, where $C$ is a pointed closed convex cone in $\mathbb{R}^n$ with nonempty interior. Such a problem deals with the characterization of the $(p, q)$-th dual curvature measure of a $C$-compatible set. It produces new Monge-Ampère equations for unbounded convex hypersurfaces, often defined over open domains and with non-positive unknown convex functions. Within the family of $C$-determined sets, the $L_p$ dual Minkowski problem is solved for $0\neq p\in \mathbb{R}$ and $q\in \mathbb{R}$; while it is solved for the range of $p\leq 0$ and $p<q$ within the newly defined family of $(C, p, q)$-close sets. When $p\leq q$, we also obtain some results regarding the uniqueness of solutions to the $L_p$ dual Minkowski problem for $C$-compatible sets.

Submitted 15 April, 2024; originally announced April 2024.

MSC Class: 52A20; 52A39

arXiv:2404.07047 [pdf, ps, other]

Four-fifths laws in electron and Hall magnetohydrodynamic fluids: Energy, Magnetic helicity and Generalized helicity

Authors: Yanqing Wang, Yulin Ye, Otto Chkhetiani

Abstract: This paper examines the Kolmogorov type laws of conserved quantities in the electron and Hall magnetohydrodynamic fluids. Inspired by Eyink's longitudinal structure functions and recent progress in classical MHD equations, we derive four-fifths laws for energy, magnetic helicity and generalized helicity in these systems.

Submitted 10 April, 2024; originally announced April 2024.

Comments: 30 pages, 1 table

arXiv:2404.07035 [pdf, ps, other]

Four-fifths laws in incompressible and magnetized fluids: Helicity, Energy and Cross-helicity

Authors: Yulin Ye, Yanqing Wang, Otto Chkhetiani

Abstract: In this paper, we are concerned with Kolmogorov's scaling laws of conserved quantities. By means of Eyink's longitudinal structure functions and the analysis of interaction of different physical quantities, we extend the celebrated four-fifths laws from energy to helicity in incompressible fluids, and to energy and cross-helicity in magnetohydrodynamic flows. In contrast to the previous 4/5 laws of energy and cross-helicity in magnetized fluids obtained by Politano and Pouquet, our laws are expressed in terms of the mixed third-order structure functions rather than the structure coupling correlation functions.

Submitted 10 April, 2024; originally announced April 2024.

Comments: 33 pages

arXiv:2404.06141 [pdf, ps, other]

On the shrinking solitons of generalized Ricci flow

Authors: Xilun Li, Yanan Ye

Abstract: We show that every gradient shrinking soliton of the generalized Ricci flow on a compact manifold is a Ricci soliton. We also prove that a pluriclosed soliton is a gradient Kähler-Ricci soliton under a broad cohomological condition. Moreover, we construct the first example of a non-trivial shrinking generalized soliton, which can serve as a singularity model of the generalized Ricci flow.

Submitted 9 April, 2024; originally announced April 2024.

arXiv:2404.04062 [pdf, other]

Derivative-free tree optimization for complex systems

Authors: Ye Wei, Bo Peng, Ruiwen Xie, Yangtao Chen, Yu Qin, Peng Wen, Stefan Bauer, Po-Yen Tung

Abstract: A tremendous range of design tasks in materials, physics, and biology can be formulated as finding the optimum of an objective function depending on many parameters without knowing its closed-form expression or the derivative. Traditional derivative-free optimization techniques often rely on strong assumptions about objective functions, thereby failing at optimizing non-convex systems beyond 100 dimensions. Here, we present a tree search method for derivative-free optimization that enables accelerated optimal design of high-dimensional complex systems. Specifically, we introduce stochastic tree expansion, dynamic upper confidence bound, and short-range backpropagation mechanism to evade local optima, iteratively approximating the global optimum using machine learning models. This development effectively confronts the dimensionally challenging problems, achieving convergence to global optima across various benchmark functions up to 2,000 dimensions, surpassing the existing methods by 10- to 20-fold. Our method applies to a wide range of real-world complex systems spanning materials, physics, and biology, considerably outperforming state-of-the-art algorithms. This enables efficient autonomous knowledge discovery and facilitates self-driving virtual laboratories. Although we focus on problems within the realm of natural science, the advancements in optimization techniques achieved herein are applicable to a broader spectrum of challenges across all quantitative disciplines.

Submitted 5 April, 2024; originally announced April 2024.

Comments: 39 pages, 3 figures

arXiv:2404.02433 [pdf, other]

A fast cosine transformation accelerated method for predicting effective thermal conductivity

Authors: Changqing Ye, Shubin Fu, Eric T. Chung

Abstract: Predicting effective thermal conductivity by solving a Partial Differential Equation (PDE) defined on a high-resolution Representative Volume Element (RVE) is a computationally intensive task. In this paper, we tackle the task by proposing an efficient and implementation-friendly computational method that can fully leverage the computing power offered by hardware accelerators, namely, graphics processing units (GPUs). We first employ the Two-Point Flux-Approximation scheme to discretize the PDE and then utilize the preconditioned conjugate gradient method to solve the resulting algebraic linear system. The construction of the preconditioner originates from FFT-based homogenization methods, and an engineered linear programming technique is utilized to determine the homogeneous reference parameters. The fundamental observation presented in this paper is that the preconditioner system can be effectively solved using multiple Fast Cosine Transformations (FCT) and parallel tridiagonal matrix solvers. Since multiple FCTs are not available by default on the CUDA platform, we detail how to derive FCTs from FFTs with nearly optimal memory usage. Numerical experiments including the stability comparison with standard preconditioners are conducted for 3D RVEs. Our performance reports indicate that the proposed method can achieve a $5$-fold acceleration on the GPU platform over the pure CPU platform and solve the problems with $512^3$ degrees of freedom and reasonable contrast ratios in less than $30$ seconds.

Submitted 2 April, 2024; originally announced April 2024.

arXiv:2404.01224 [pdf, other]

Collaborative Pareto Set Learning in Multiple Multi-Objective Optimization Problems

Authors: Chikai Shang, Rongguang Ye, Jiaqi Jiang, Fangqing Gu

Abstract: Pareto Set Learning (PSL) is an emerging research area in multi-objective optimization, focusing on training neural networks to learn the mapping from preference vectors to Pareto optimal solutions. However, existing PSL methods are limited to addressing a single Multi-objective Optimization Problem (MOP) at a time. When faced with multiple MOPs, this limitation results in significant inefficiencies and hinders the ability to exploit potential synergies across varying MOPs. In this paper, we propose a Collaborative Pareto Set Learning (CoPSL) framework, which learns the Pareto sets of multiple MOPs simultaneously in a collaborative manner. CoPSL particularly employs an architecture consisting of shared and MOP-specific layers. The shared layers are designed to capture commonalities among MOPs collaboratively, while the MOP-specific layers tailor these general insights to generate solution sets for individual MOPs. This collaborative approach enables CoPSL to efficiently learn the Pareto sets of multiple MOPs in a single execution while leveraging the potential relationships among various MOPs. To further understand these relationships, we experimentally demonstrate that shareable representations exist among MOPs. Leveraging these shared representations effectively improves the capability to approximate Pareto sets. Extensive experiments underscore the superior efficiency and robustness of CoPSL in approximating Pareto sets compared to state-of-the-art approaches on a variety of synthetic and real-world MOPs. Code is available at https://github.com/ckshang/CoPSL.

Submitted 28 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

Comments: Accepted by IJCNN 2024

arXiv:2403.19356 [pdf, other]

A robust two-level overlapping preconditioner for Darcy flow in high-contrast media

Authors: Changqing Ye, Shubin Fu, Eric T. Chung, Jizu Huang

Abstract: In this article, a two-level overlapping domain decomposition preconditioner is developed for solving linear algebraic systems obtained from simulating Darcy flow in high-contrast media. Our preconditioner starts at a mixed finite element method for discretizing the partial differential equation by Darcy's law with the no-flux boundary condition and is then followed by a velocity elimination technique to yield a linear algebraic system with only unknowns of pressure. Then, our main objective is to design a robust and efficient domain decomposition preconditioner for this system, which is accomplished by engineering a multiscale coarse space that is capable of characterizing high-contrast features of the permeability field. A generalized eigenvalue problem is solved in each non-overlapping coarse element in a communication-free manner to form the global solver, which is accompanied by local solvers originating from additive Schwarz methods but with a non-Galerkin discretization to derive the two-level preconditioner. We provide a rigorous analysis that indicates that the condition number of the preconditioned system could be bounded above with several assumptions. Extensive numerical experiments with various types of three-dimensional high-contrast models are exhibited. In particular, we study the robustness against the contrast of the media as well as the influences of numbers of eigenfunctions, oversampling sizes, and subdomain partitions on the efficiency of the proposed preconditioner. Besides, strong and weak scalability performances are also examined.

Submitted 28 March, 2024; originally announced March 2024.

Showing 1–50 of 1,189 results for author: Ye