Search | arXiv e-print repository

Fast Iterative Solver For Neural Network Method: II. 1D Diffusion-Reaction Problems And Data Fitting

Authors: Zhiqiang Cai, Anastassia Doktorova, Robert D. Falgout, César Herrera

Abstract: This paper extends the damped block Newton (dBN) method introduced recently in [4] to 1D diffusion-reaction equations and least-squares data fitting problems. To determine the linear parameters (the weights and bias of the output layer) of the neural network (NN), the dBN method requires solving systems of linear equations involving the mass matrix. While the mass matrix for local hat basis functions is tridiagonal and well-conditioned, the mass matrix for NNs is dense and ill-conditioned. For example, the condition number of the NN mass matrix for quasi-uniform meshes is at least ${\cal O}(n^4)$. We present a factorization of the mass matrix that enables solving the systems of linear equations in ${\cal O}(n)$ operations. To determine the non-linear parameters (the weights and bias of the hidden layer), one step of a damped Newton method is employed at each iteration. A Gauss-Newton method is used in place of Newton for the instances in which the Hessian matrices are singular. This modified dBN is referred to as dBGN. For both methods, the computational cost per iteration is ${\cal O}(n)$. Numerical results demonstrate the ability of dBN and dBGN to efficiently achieve accurate results and outperform BFGS for select examples.

Submitted 1 July, 2024; originally announced July 2024.

MSC Class: 65K10; 65F05
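
Note: as a point of reference for the abstract's remark that the hat-function mass matrix is tridiagonal and well-conditioned, the following is a minimal sketch, not the paper's factorization of the NN mass matrix. It assumes a uniform mesh of width h with n interior nodes, for which the standard piecewise-linear mass matrix is (h/6)·tridiag(1, 4, 1), and solves the system in O(n) operations with the Thomas algorithm. The function names `hat_mass_bands`, `thomas_solve` and `tridiag_matvec` are hypothetical.

```python
def hat_mass_bands(n, h):
    """Bands of the hat-function mass matrix (h/6)*tridiag(1, 4, 1).

    Assumes a uniform mesh of width h with n interior nodes.
    """
    lower = [h / 6.0] * (n - 1)
    diag = [2.0 * h / 3.0] * n
    upper = [h / 6.0] * (n - 1)
    return lower, diag, upper


def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system in O(n) operations (no pivoting).

    Safe here because the mass matrix is strictly diagonally dominant.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    if n > 1:
        c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination.
    for i in range(1, n):
        denom = diag[i] - lower[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i - 1] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def tridiag_matvec(lower, diag, upper, x):
    """Multiply a tridiagonal matrix, given by its bands, with a vector."""
    n = len(diag)
    y = [diag[i] * x[i] for i in range(n)]
    for i in range(n - 1):
        y[i] += upper[i] * x[i + 1]
        y[i + 1] += lower[i] * x[i]
    return y


# Usage: manufacture a right-hand side from a known vector, then recover it.
n, h = 8, 1.0 / 9
lower, diag, upper = hat_mass_bands(n, h)
x_true = [float(i + 1) for i in range(n)]
rhs = tridiag_matvec(lower, diag, upper, x_true)
x = thomas_solve(lower, diag, upper, rhs)
```

The dense, ill-conditioned NN mass matrix discussed in the abstract does not have this band structure, which is why the paper relies on a dedicated factorization to reach the same O(n) cost.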

arXiv:2406.02397 [pdf, ps, other]

One-arm Probabilities for Metric Graph Gaussian Free Fields below and at the Critical Dimension

Authors: Zhenhao Cai, Jian Ding

Abstract: For the critical level-set of the Gaussian free field on the metric graph of $\mathbb Z^d$, we consider the one-arm probability $θ_d(N)$, i.e., the probability that the boundary of a box of side length $2N$ is connected to the center. We prove that $θ_d(N)$ is $O(N^{-\frac{d}{2}+1})$ for $3\le d\le 5$, and is $N^{-2+o(1)}$ for $d=6$. Our upper bounds match the lower bounds in a previous work by Ding and Wirth up to a constant factor for $3\le d\le 5$, and match the exponent therein for $d=6$. Combined with our previous result that $θ_d(N) \asymp N^{-2}$ for $d>6$, this seems to present the first percolation model whose one-arm probabilities are essentially completely understood in all dimensions. In particular, these results fully confirm Werner's conjectures (2021) on the one-arm exponents: \begin{equation*} \text{(1) for}\ 3\le d<d_c=6,\ θ_d(N)=N^{-\frac{d}{2}+o(1)};\ \text{(2) for}\ d>d_c,\ θ_d(N)=N^{-2+o(1)}. \end{equation*} Prior to our work, Drewitz, Prévost and Rodriguez obtained upper bounds for $d\in \{3, 4\}$, which are very sharp although they lose some diverging factors. In the same work, they conjectured that $θ_{d_c}(N) = N^{-2+o(1)}$, which is now established. In addition, in a recent concurrent work, Drewitz, Prévost and Rodriguez independently obtained the up-to-constant upper bound for $d=3$.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2405.19256 [pdf, other]

Weak Generative Sampler to Efficiently Sample Invariant Distribution of Stochastic Differential Equation

Authors: Zhiqiang Cai, Yu Cao, Yuanfei Huang, Xiang Zhou

Abstract: Sampling invariant distributions from an Ito diffusion process presents a significant challenge in stochastic simulation. Traditional numerical solvers for stochastic differential equations require both a fine step size and a lengthy simulation period, resulting in both biased and correlated samples. Current deep learning-based methods solve the stationary Fokker--Planck equation to determine the invariant probability density function in the form of deep neural networks, but they generally do not directly address the problem of sampling from the computed density function. In this work, we introduce a framework that employs a weak generative sampler (WGS) to directly generate independent and identically distributed (iid) samples induced by a transformation map derived from the stationary Fokker--Planck equation. Our proposed loss function is based on the weak form of the Fokker--Planck equation, integrating normalizing flows to characterize the invariant distribution and facilitate sample generation from the base distribution. Our randomized test function circumvents the need for mini-max optimization in the traditional weak formulation. Distinct from conventional generative models, our method necessitates neither the computationally intensive calculation of the Jacobian determinant nor the invertibility of the transformation map. A crucial component of our framework is the adaptively chosen family of test functions in the form of Gaussian kernel functions with centres selected from the generated data samples.
Experimental results on several benchmark examples demonstrate the effectiveness of our method, which offers both low computational costs and excellent capability in exploring multiple metastable states.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 24 pages, 10 figures

arXiv:2404.17750 [pdf, other]

Fast Iterative Solver For Neural Network Method: I. 1D Diffusion Problems

Authors: Zhiqiang Cai, Anastassia Doktorova, Robert D. Falgout, César Herrera

Abstract: The discretization of the deep Ritz method [18] for the Poisson equation leads to a high-dimensional non-convex minimization problem that is difficult and expensive to solve numerically. In this paper, we consider the shallow Ritz approximation to one-dimensional diffusion problems and introduce an effective and efficient iterative method, a damped block Newton (dBN) method, for solving the resulting non-convex minimization problem. The method employs the block Gauss-Seidel method as an outer iteration by dividing the parameters of a shallow neural network into the linear parameters (the weights and bias of the output layer) and the non-linear parameters (the weights and bias of the hidden layer). In each outer iteration, the linear and the non-linear parameters are updated by exact inversion and one step of a damped Newton method, respectively. Inverses of the coefficient matrix and the Hessian matrix are tridiagonal and diagonal, respectively, and hence the cost of each dBN iteration is $\mathcal{O}(n)$. To move the breakpoints (the non-linear parameters) more efficiently, we propose an adaptive damped block Newton (AdBN) method by combining the dBN with the adaptive neuron enhancement (ANE) method [25]. Numerical examples demonstrate the ability of dBN and AdBN not only to move the breakpoints quickly and efficiently but also to achieve a nearly optimal order of convergence for AdBN. These iterative solvers are capable of outperforming BFGS for select examples.

Submitted 26 April, 2024; originally announced April 2024.

MSC Class: 65N99

arXiv:2404.05064 [pdf, other]

A Structure-Guided Gauss-Newton Method for Shallow ReLU Neural Network

Authors: Zhiqiang Cai, Tong Ding, Min Liu, Xinyu Liu, Jianlin Xia

Abstract: In this paper, we propose a structure-guided Gauss-Newton (SgGN) method for solving least squares problems using a shallow ReLU neural network. The method effectively takes advantage of both the least squares structure and the neural network structure of the objective function. By categorizing the weights and biases of the hidden and output layers of the network as nonlinear and linear parameters, respectively, the method iterates back and forth between the nonlinear and linear parameters. The nonlinear parameters are updated by a damped Gauss-Newton method and the linear ones are updated by a linear solver. Moreover, at the Gauss-Newton step, a special form of the Gauss-Newton matrix is derived for the shallow ReLU neural network and is used for efficient iterations. It is shown that the corresponding mass and Gauss-Newton matrices in the respective linear and nonlinear steps are symmetric and positive definite under reasonable assumptions. Thus, the SgGN method naturally produces an effective search direction without the need for additional techniques like shifting in the Levenberg-Marquardt method to achieve invertibility of the Gauss-Newton matrix. The convergence and accuracy of the method are demonstrated numerically for several challenging function approximation problems, especially those with discontinuities or sharp transition layers that pose significant challenges for commonly used training algorithms in machine learning.

Submitted 7 April, 2024; originally announced April 2024.

MSC Class: 65D15; 65K10

arXiv:2401.08150 [pdf, other]

Differentially Private Sliced Inverse Regression: Minimax Optimality and Algorithm

Authors: Xintao Xia, Linjun Zhang, Zhanrui Cai

Abstract: Privacy preservation has become a critical concern in high-dimensional data analysis due to the growing prevalence of data-driven applications. Proposed by Li (1991), sliced inverse regression has emerged as a widely utilized statistical technique for reducing covariate dimensionality while maintaining sufficient statistical information. In this paper, we propose optimally differentially private algorithms specifically designed to address privacy concerns in the context of sufficient dimension reduction. We proceed to establish lower bounds for differentially private sliced inverse regression in both the low and high-dimensional settings. Moreover, we develop differentially private algorithms that achieve the minimax lower bounds up to logarithmic factors. Through a combination of simulations and real data analysis, we illustrate the efficacy of these differentially private algorithms in safeguarding privacy while preserving vital information within the reduced dimension space. As a natural extension, we can readily offer analogous lower and upper bounds for differentially private sparse principal component analysis, a topic that may also be of potential interest to the statistical and machine learning community.

Submitted 16 January, 2024; originally announced January 2024.

arXiv:2312.06919 [pdf, other]

Evolving Neural Network (ENN) Method for One-Dimensional Scalar Hyperbolic Conservation Laws: I Linear and Quadratic Fluxes

Authors: Zhiqiang Cai, Brooke Hejnal

Abstract: We propose and study the evolving neural network (ENN) method for solving one-dimensional scalar hyperbolic conservation laws with linear and quadratic spatial fluxes. The ENN method first represents the initial data and the inflow boundary data by neural networks. Then, it evolves the neural network representation of the initial data along the temporal direction. The evolution is computed using a combination of characteristic and finite volume methods. For the linear spatial flux, the method is not subject to any time step size, and it is shown theoretically that the error at any time step is bounded by the representation errors of the initial and boundary condition. For the quadratic flux, an error estimate is studied in a companion paper. Finally, numerical results for the linear advection equation and the inviscid Burgers equation are presented to show that the ENN method is more accurate and cost efficient than traditional mesh-based methods.

Submitted 11 December, 2023; originally announced December 2023.

arXiv:2312.06191 [pdf, ps, other]

Iterative methods of linearized moment equations for rarefied gases

Authors: Xiaoyu Dong, Zhenning Cai

Abstract: We study the iterative methods for large moment systems derived from the linearized Boltzmann equation. By Fourier analysis, it is shown that the direct application of the block symmetric Gauss-Seidel (BSGS) method has slower convergence for smaller Knudsen numbers. Better convergence rates for dense flows are then achieved by coupling the BSGS method with the micro-macro decomposition, which treats the moment equations as a coupled system with a microscopic part and a macroscopic part. Since the macroscopic part contains only a small number of equations, it can be solved accurately during the iteration with a relatively small computational cost, which accelerates the overall iteration. The method is further generalized to the multiscale decomposition which splits the moment system into many subsystems with different orders of magnitude. Both one- and two-dimensional numerical tests are carried out to examine the performances of these methods. Possible issues regarding the efficiency and convergence are discussed in the conclusion.

Submitted 11 December, 2023; originally announced December 2023.

Comments: 26 pages, 16 figures

arXiv:2311.14138 [pdf, ps, other]

A symmetric Gauss-Seidel method for the steady-state Boltzmann equation

Authors: Tianai Yin, Zhenning Cai, Yanli Wang

Abstract: We introduce numerical solvers for the steady-state Boltzmann equation based on the symmetric Gauss-Seidel (SGS) method. Due to the quadratic collision operator in the Boltzmann equation, the SGS method requires solving a nonlinear system on each grid cell, and we consider two methods, namely Newton's method and the fixed-point iteration, in our numerical tests. For small Knudsen numbers, our method has an efficiency between the classical source iteration and the modern generalized synthetic iterative scheme, and the complexity of its implementation is closer to the source iteration. A variety of numerical tests are carried out to demonstrate its performance, and it is concluded that the proposed method is suitable for applications with moderate to large Knudsen numbers.

Submitted 23 November, 2023; originally announced November 2023.

arXiv:2311.03670 [pdf, ps, other]

Vertex-removal stability and the least positive value of harmonic measures

Authors: Zhenhao Cai, Gady Kozma, Eviatar B. Procaccia, Yuan Zhang

Abstract: We prove that for $\mathbb{Z}^d$ ($d\ge 2$), the vertex-removal stability of harmonic measures (i.e. it is feasible to remove some vertex while changing the harmonic measure by a bounded factor) holds if and only if $d=2$. The proof mainly relies on geometric arguments, with a surprising use of the discrete Klein bottle. Moreover, a direct application of this stability verifies a conjecture of Calvert, Ganguly and Hammond [9] for the exponential decay of the least positive value of harmonic measures on $\mathbb{Z}^2$. Furthermore, the analogue of this conjecture for $\mathbb{Z}^d$ with $d\ge 3$ is also proved in this paper, despite vertex-removal stability no longer holding.

Submitted 6 November, 2023; originally announced November 2023.

Comments: 34 pages, 9 figures

arXiv:2310.05489 [pdf, ps, other]

Some extensions of the $φ$-divergence moment closures for the radiative transfer equation

Authors: Micheal R A Abdelmalik, Zhenning Cai, Teddy Pichard

Abstract: The $φ$-divergence-based moment method was recently introduced by Abdelmalik et al. (2023) for the discretization of the radiative transfer equation. At the continuous level, this method is very close to the entropy-based MN methods and possesses their main properties, i.e. entropy dissipation, rotational invariance and energy conservation. However, the $φ$-divergence-based moment systems are easier to resolve numerically due to the improved conditioning of the discrete equations. Moreover, exact quadrature rules can be used to compute moments of the distribution function, which enables the preservation of energy conservation, entropy dissipation and rotational invariance at the discrete level. In this paper we consider different variants of the $φ$-divergence closures that are based on different approximations of the exponential function and the Planck function. We compare the approximation properties of the proposed closures in the numerical benchmarks.

Submitted 9 October, 2023; originally announced October 2023.

arXiv:2307.04434 [pdf, other]

One-arm exponent of critical level-set for metric graph Gaussian free field in high dimensions

Authors: Zhenhao Cai, Jian Ding

Abstract: In this paper, we study the critical level-set of Gaussian free field (GFF) on the metric graph $\widetilde{\mathbb{Z}}^d,d>6$. We prove that the one-arm probability (i.e. the probability of the event that the origin is connected to the boundary of the box $B(N)$) is proportional to $N^{-2}$, where $B(N)$ is centered at the origin and has side length $2\lfloor N \rfloor$. Our proof is hugely inspired by Kozma and Nachmias [29] which proves the analogous result of the critical bond percolation for $d\geq 11$, and by Werner [51] which conjectures the similarity between the GFF level-set and the bond percolation in general and proves this connection for various geometric aspects.

Submitted 10 July, 2023; originally announced July 2023.

arXiv:2306.07445 [pdf, other]

Least-Squares Neural Network (LSNN) Method For Linear Advection-Reaction Equation: Non-constant Jumps

Authors: Zhiqiang Cai, Junpyo Choi, Min Liu

Abstract: The least-squares ReLU neural network (LSNN) method was introduced and studied for solving linear advection-reaction equations with discontinuous solutions in \cite{Cai2021linear,cai2023least}. The method is based on an equivalent least-squares formulation and \cite{cai2023least} employs ReLU neural network (NN) functions with $\lceil \log_2(d+1)\rceil+1$-layer representations for approximating solutions. In this paper, we show theoretically that the method is also capable of accurately approximating non-constant jumps along discontinuous interfaces that are not necessarily straight lines. Theoretical results are confirmed through multiple numerical examples with $d=2,3$ and various non-constant jumps and interface shapes, showing that the LSNN method with $\lceil \log_2(d+1)\rceil+1$ layers approximates solutions accurately with fewer degrees of freedom than mesh-based methods and without the common Gibbs phenomena along discontinuous interfaces having non-constant jumps.

Submitted 29 May, 2024; v1 submitted 12 June, 2023; originally announced June 2023.

Comments: 19 pages. A continuation of arXiv:2301.06156

MSC Class: 65N15; 65N99

arXiv:2305.18344 [pdf, ps, other]

doi 10.1088/1572-9494/acc7f3

The Dirac equation on metrics of Eguchi-Hanson type

Authors: Zhuohua Cai, Xiao Zhang

Abstract: We investigate parallel spinors on the Eguchi-Hanson metrics and find that the space of complex parallel spinors is complex 2-dimensional. For the metrics of Eguchi-Hanson type with zero scalar curvature, we separate variables for the harmonic spinors and obtain the solutions explicitly.

Submitted 26 May, 2023; originally announced May 2023.

Comments: 11 pages

Journal ref: Commun. Theor. Phys. 75 (2023) 055002 (5pp)

arXiv:2305.15257 [pdf, other]

doi 10.1016/j.cma.2023.116229

Deep Ritz Method with Adaptive Quadrature for Linear Elasticity

Authors: Min Liu, Zhiqiang Cai, Karthik Ramani

Abstract: In this paper, we study the deep Ritz method for solving the linear elasticity equation from a numerical analysis perspective. A modified Ritz formulation using the $H^{1/2}(Γ_D)$ norm is introduced and analyzed for the linear elasticity equation in order to deal with the (essential) Dirichlet boundary condition. We show that the resulting deep Ritz method provides the best approximation among the set of deep neural network (DNN) functions with respect to the ``energy'' norm. Furthermore, we demonstrate that the total error of the deep Ritz simulation is bounded by the sum of the network approximation error and the numerical integration error, disregarding the algebraic error. To effectively control the numerical integration error, we propose an adaptive quadrature-based numerical integration technique with a residual-based local error indicator. This approach enables efficient approximation of the modified energy functional. Through numerical experiments involving smooth and singular problems, as well as problems with stress concentration, we validate the effectiveness and efficiency of the proposed deep Ritz method with adaptive quadrature.

Submitted 24 May, 2023; originally announced May 2023.

arXiv:2305.14409 [pdf, ps, other]

Evolution: A Unified Formula for Feature Operators from a High-level Perspective

Authors: Zhicheng Cai

Abstract: Traditionally, different types of feature operators (e.g., convolution, self-attention and involution) utilize different approaches to extract and aggregate the features. Resemblance can hardly be discovered from their mathematical formulas. However, these three operators all serve the same paramount purpose and bear no difference in essence. Hence we probe into the essence of various feature operators from a high-level perspective, transform their components equivalently, and explore their mathematical expressions within higher dimensions. We propose one clear and concrete unified formula for different feature operators, termed Evolution. Evolution utilizes the Evolution Function to generate the Evolution Kernel, which extracts and aggregates the features in certain positions of the input feature map. We mathematically deduce the equivalent transformation from the traditional formulas of these feature operators to Evolution and prove the unification. In addition, we discuss the forms of Evolution Functions and the properties of generated Evolution Kernels, intending to inspire further research on and innovation in powerful feature operators.

Submitted 23 May, 2023; originally announced May 2023.

arXiv:2304.11847 [pdf, ps, other]

A positive and moment-preserving Fourier spectral method

Authors: Zhenning Cai, Bo Lin, Meixia Lin

Abstract: This paper presents a novel Fourier spectral method that utilizes optimization techniques to ensure the positivity and conservation of moments in the space of trigonometric polynomials. We rigorously analyze the accuracy of the new method and prove that it maintains spectral accuracy. To solve the optimization problem, we propose an efficient Newton solver that has a quadratic convergence rate. Numerical examples are provided to demonstrate the high accuracy of the proposed method. Our method is also integrated into the spectral solver of the Boltzmann equation, showing the benefit of our approach in applications.

Submitted 24 April, 2023; originally announced April 2023.

Comments: 25 pages, 5 figures

arXiv:2304.08342 [pdf, other]

NF-ULA: Langevin Monte Carlo with Normalizing Flow Prior for Imaging Inverse Problems

Authors: Ziruo Cai, Junqi Tang, Subhadip Mukherjee, Jinglai Li, Carola Bibiane Schönlieb, Xiaoqun Zhang

Abstract: Bayesian methods for solving inverse problems are a powerful alternative to classical methods since the Bayesian approach offers the ability to quantify the uncertainty in the solution. In recent years, data-driven techniques for solving inverse problems have also been remarkably successful, due to their superior representation ability. In this work, we incorporate data-based models into a class of Langevin-based sampling algorithms for Bayesian inference in imaging inverse problems. In particular, we introduce NF-ULA (Normalizing Flow-based Unadjusted Langevin algorithm), which involves learning a normalizing flow (NF) as the image prior. We use NF to learn the prior because a tractable closed-form expression for the log prior enables the differentiation of it using autograd libraries. Our algorithm only requires a normalizing flow-based generative network, which can be pre-trained independently of the considered inverse problem and the forward operator. We perform theoretical analysis by investigating the well-posedness and non-asymptotic convergence of the resulting NF-ULA algorithm. The efficacy of the proposed NF-ULA algorithm is demonstrated in various image restoration problems such as image deblurring, image inpainting, and limited-angle X-ray computed tomography (CT) reconstruction. NF-ULA is found to perform better than competing methods for severely ill-posed inverse problems.

Submitted 14 October, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

arXiv:2304.04544 [pdf, other]

Approximate Primal-Dual Fixed-Point based Langevin Algorithms for Non-smooth Convex Potentials

Authors: Ziruo Cai, Jinglai Li, Xiaoqun Zhang

Abstract: The Langevin algorithms are frequently used to sample the posterior distributions in Bayesian inference. In many practical problems, however, the posterior distributions often consist of non-differentiable components, posing challenges for the standard Langevin algorithms, as they require evaluating the gradient of the energy function in each iteration. To this end, a popular remedy is to utilize the proximity operator, and as a result one needs to solve a proximity subproblem in each iteration. The conventional practice is to solve the subproblems accurately, which can be exceedingly expensive, as the subproblem needs to be solved in each iteration. We propose an approximate primal-dual fixed-point algorithm for solving the subproblem, which only seeks an approximate solution of the subproblem and therefore reduces the computational cost considerably. We provide theoretical analysis of the proposed method and also demonstrate its performance with numerical examples.

Submitted 10 April, 2023; originally announced April 2023.

arXiv:2303.14899 [pdf, ps, other]

Classification and enumeration of lattice polygons in a disc

Authors: Qiuyue Liu, Yuqin Zhang, Zhanyuan Cai

Abstract: In 1980, V. I. Arnold studied the classification problem for convex lattice polygons of given area. Since then, this problem and its analogues have been studied by many authors, including Bárány, Lagarias, Pach, Santos, Ziegler and Zong. Recently, Zong proposed two computer programs to prove Hadwiger's covering conjecture and Borsuk's partition problem, respectively, based on enumeration of the convex lattice polytopes contained in certain balls. For this purpose, similar to Bárány and Pach's work on volume and Liu and Zong's work on cardinality, we obtain bounds on the number of non-equivalent convex lattice polygons in a given disc. Furthermore, we propose an algorithm to enumerate these convex lattice polygons.

Submitted 26 March, 2023; originally announced March 2023.

Comments: 19 pages, 3 figures

MSC Class: 52B20; 52C07

arXiv:2301.11582 [pdf, other]

Adaptive Least-Squares Methods for Convection-Dominated Diffusion-Reaction Problems

Authors: Zhiqiang Cai, Binghe Chen, Jing Yang

Abstract: This paper studies adaptive least-squares finite element methods for convection-dominated diffusion-reaction problems. The least-squares methods are based on the first-order system of the primal and dual variables with various ways of imposing outflow boundary conditions. The coercivity of the homogeneous least-squares functionals is established, and the a priori error estimates of the least-squares methods are obtained in a norm that incorporates the streamline derivative. All methods have the same convergence rate provided that meshes in the layer regions are fine enough. To increase computational accuracy and reduce computational cost, adaptive least-squares methods are implemented and numerical results are presented for some test problems.

Submitted 27 January, 2023; originally announced January 2023.

MSC Class: 65N50

arXiv:2301.06156 [pdf, other]

Least-Squares Neural Network (LSNN) Method For Linear Advection-Reaction Equation: Discontinuity Interface

Authors: Zhiqiang Cai, Junpyo Choi, Min Liu

Abstract: We studied the least-squares ReLU neural network (LSNN) method for solving linear advection-reaction equation with discontinuous solution in [Cai, Zhiqiang, Jingshuang Chen, and Min Liu. "Least-squares ReLU neural network (LSNN) method for linear advection-reaction equation." Journal of Computational Physics 443 (2021), 110514]. The method is based on a least-squares formulation and uses a new class of approximating functions: ReLU neural network (NN) functions. A critical and additional component of the LSNN method, differing from other NN-based methods, is the introduction of a properly designed and physics-preserved discrete differential operator. In this paper, we study the LSNN method for problems with discontinuity interfaces. First, we show that ReLU NN functions with depth $\lceil \log_2(d+1)\rceil+1$ can approximate any $d$-dimensional step function on a discontinuity interface generated by a vector field as streamlines with any prescribed accuracy. By decomposing the solution into continuous and discontinuous parts, we prove theoretically that discretization error of the LSNN method using ReLU NN functions with depth $\lceil \log_2(d+1)\rceil+1$ is mainly determined by the continuous part of the solution provided that the solution jump is constant. Numerical results for both two- and three-dimensional test problems with various discontinuity interfaces show that the LSNN method with enough layers is accurate and does not exhibit the common Gibbs phenomena along discontinuity interfaces.

Submitted 5 February, 2024; v1 submitted 15 January, 2023; originally announced January 2023.

Comments: 30 pages

MSC Class: 65N15; 65N99

arXiv:2212.09014 [pdf, ps, other]

New sufficient degree conditions for an $r$-uniform hypergraph to be $k$-edge-connected

Authors: Jiyun Guo, Jun Wang, Zhanyuan Cai, Haiyan Li

Abstract: An $r$-uniform hypergraphic sequence (i.e., $r$-graphic sequence) $d=(d_1, d_2,\cdots,d_n)$ is said to be forcibly $k$-edge-connected if every realization of $d$ is $k$-edge-connected. In this paper, we obtain a strongest sufficient degree condition for $d$ to be $k$-edge-connected for all $k\ge 1$ and a strongest sufficient degree condition for $d$ to be super edge-connected. As a corollary, we give the minimum degree condition for $d$ to be maximally edge-connected. We also obtain another sufficient degree condition for $d$ to be $k$-edge-connected.

Submitted 18 December, 2022; originally announced December 2022.

arXiv:2205.02312 [pdf, ps, other]

Asymptotic analysis of diabatic surface hopping algorithm in the adiabatic and non-adiabatic limits

Authors: Zhenning Cai, Di Fang, Jianfeng Lu

Abstract: Surface hopping algorithms, as an important class of quantum dynamics simulation algorithms for non-adiabatic dynamics, are typically performed in the adiabatic representation, which can break down in the presence of ill-defined adiabatic potential energy surfaces (PESs) and adiabatic coupling term. Another issue of surface hopping algorithms is the difficulty in capturing the correct scaling of the transition rate in the Marcus (weak-coupling/non-adiabatic) regime. Though the first issue can be circumvented by exploiting the diabatic representation, diabatic surface hopping algorithms usually lack justification on the theoretical level. We consider the diabatic surface hopping algorithm proposed in [Fang, Lu. Multiscale Model. Simul. 16:4, 1603-1622, 2018] and provide the asymptotic analysis of the transition rate in the Marcus regime that justifies the correct scaling for the spin-boson model. We propose two conditions that guarantee the correctness for general potentials. In the opposite (strong-coupling/adiabatic) regime, we derive the asymptotic behavior of the algorithm that interestingly matches a type of mean-field description. The techniques used here may shed light on the analysis for other diabatic-based algorithms.

Submitted 4 May, 2022; originally announced May 2022.

arXiv:2204.12787 [pdf, other]

doi 10.1002/mma.9204

3-D generalized analytic signal associated with linear canonical transform in Clifford biquaternion domain

Authors: Zhen Feng Cai, Kit Ian Kou

Abstract: The analytic signal is a useful mathematical tool. It separates qualitative and quantitative information of a signal in the form of the local phase and local amplitude. The Clifford Fourier transform (CFT) plays a vital role in the representation of multidimensional signals. By generalizing the CFT to the Clifford linear canonical transform (CLCT), we present a new type of Clifford biquaternionic analytic signal. Owing to its additional degrees of freedom, this new analytic signal gives envelope detection for 3D images a better visual appearance. Synthesis examples are presented to demonstrate these advantages.

Submitted 27 April, 2022; originally announced April 2022.

Comments: 19 pages and 5 figures

MSC Class: 45P05

arXiv:2112.01136 [pdf, other]

On the exact orders of critical value in Finitary Random Interlacements

Authors: Zhenhao Cai, Yuan Zhang

Abstract: In this paper, we prove the exact orders of critical intensity $u_*(T)$ in Finitary Random Interlacements (FRI) in $\mathbb{Z}^d, \ d\ge 3$ with respect to the expected fiber length $T$. We show that as $T\to\infty$, $u_*(T)\sim T^{-1}, \ d\ge 5;$ $u_*(T)\sim T^{-1}\log T, \ d=4;$ $u_*(T)\sim T^{-1/2},\ d=3.$ Our estimates also give the order of magnitude at which the percolative phase transition with respect to $T$ takes place.

Submitted 4 December, 2021; v1 submitted 2 December, 2021; originally announced December 2021.

Comments: 29 pages, 2 figures

arXiv:2110.10895 [pdf, other]

Least-Squares Neural Network (LSNN) Method For Scalar Nonlinear Hyperbolic Conservation Laws: Discrete Divergence Operator

Authors: Zhiqiang Cai, Jingshuang Chen, Min Liu

Abstract: A least-squares neural network (LSNN) method was introduced for solving scalar linear and nonlinear hyperbolic conservation laws (HCLs) in [7, 6]. This method is based on an equivalent least-squares (LS) formulation and uses ReLU neural networks as approximating functions, making it ideal for approximating discontinuous functions with unknown interface location. In the design of the LSNN method for HCLs, the numerical approximation of differential operators is a critical factor, and standard numerical or automatic differentiation along coordinate directions can often lead to a failed NN-based method. To overcome this challenge, this paper rewrites HCLs in their divergence form of space and time and introduces a new discrete divergence operator. As a result, the proposed LSNN method is free of penalization of artificial viscosity. Theoretically, the accuracy of the discrete divergence operator is estimated even for discontinuous solutions. Numerically, the LSNN method with the new discrete divergence operator was tested for several benchmark problems with both convex and non-convex fluxes, and was able to compute the correct physical solution for problems with rarefaction, shock or compound waves. The method is capable of capturing the shock of the underlying problem without oscillation or smearing, even without any penalization of the entropy condition, total variation, and/or artificial viscosity.

Submitted 7 May, 2023; v1 submitted 21 October, 2021; originally announced October 2021.

Comments: Published in Journal of Computational and Applied Mathematics

arXiv:2109.11756 [pdf, other]

Continuity and uniqueness of percolation critical parameters in Finitary Random Interlacements

Authors: Zhenhao Cai, Eviatar B. Procaccia, Yuan Zhang

Abstract: We prove that the critical percolation parameter for Finitary Random Interlacements (FRI) is continuous with respect to the path length parameter $T$. The proof uses a result which is interesting in its own right: equality of natural critical parameters for FRI percolation phase transition.

Submitted 24 September, 2021; originally announced September 2021.

Comments: 49 pages, 5 figures

arXiv:2109.02839 [pdf, other]

doi 10.1016/j.jcp.2022.111021

Self-adaptive deep neural network: Numerical approximation to functions and PDEs

Authors: Zhiqiang Cai, Jingshuang Chen, Min Liu

Abstract: Designing an optimal deep neural network for a given task is important and challenging in many machine learning applications. To address this issue, we introduce a self-adaptive algorithm: the adaptive network enhancement (ANE) method, written as loops of the form train, estimate and enhance. Starting with a small two-layer neural network (NN), the step train is to solve the optimization problem at the current NN; the step estimate is to compute a posteriori estimator/indicators using the solution at the current NN; the step enhance is to add new neurons to the current NN. Novel network enhancement strategies based on the computed estimator/indicators are developed in this paper to determine how many new neurons and when a new layer should be added to the current NN. The ANE method provides a natural process for obtaining a good initialization in training the current NN; in addition, we introduce an advanced procedure on how to initialize newly added neurons for a better approximation. We demonstrate that the ANE method can automatically design a nearly minimal NN for learning functions exhibiting sharp transitional layers as well as discontinuous solutions of hyperbolic partial differential equations.

Submitted 4 February, 2022; v1 submitted 6 September, 2021; originally announced September 2021.

Comments: Published in Journal of Computational Physics

arXiv:2108.02975 [pdf, ps, other]

Biquaternion Z Transform

Authors: Wenshan Bi, Zhen-Feng Cai, Kit Ian Kou

Abstract: In this work, the biquaternion Z transformation method is proposed to solve a class of biquaternion recurrence relations. The biquaternion Z transform is a natural extension of the complex Z transform. In the design process, a special norm representation is employed to analyze the region of convergence of the biquaternion geometric sequence. In addition, some useful properties are given. It is shown that the proposed properties are helpful for understanding the biquaternion Z transform. Finally, several examples are given to illustrate the effectiveness of the proposed design method.

Submitted 6 August, 2021; originally announced August 2021.

arXiv:2107.08935 [pdf, other]

Adaptive Two-Layer ReLU Neural Network: I. Best Least-squares Approximation

Authors: Min Liu, Zhiqiang Cai, Jingshuang Chen

Abstract: In this paper, we introduce the adaptive neuron enhancement (ANE) method for the best least-squares approximation using two-layer ReLU neural networks (NNs). For a given function f(x), the ANE method generates a two-layer ReLU NN and a numerical integration mesh such that the approximation accuracy is within the prescribed tolerance. The ANE method provides a natural process for obtaining a good initialization, which is crucial for training nonlinear optimization problems. Numerical results of the ANE method are presented for functions of two variables exhibiting either intersecting interface singularities or sharp interior layers.

Submitted 14 January, 2022; v1 submitted 19 July, 2021; originally announced July 2021.

Comments: 17 pages
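
Illustrative note: the least-squares building block behind the abstract above can be sketched in one dimension as follows. This is a minimal sketch under stated assumptions, not the ANE method: the hidden-layer breakpoints are fixed by hand and only the output-layer (linear) parameters are fitted, with no adaptive neuron enhancement and no adaptive integration mesh. The names `relu_features`, `fit_output_layer`, and `breakpoints` are hypothetical.

```python
import numpy as np

def relu_features(x, breakpoints):
    """Hidden-layer outputs ReLU(x - b_i), plus a constant column for the output bias."""
    hidden = np.maximum(x[:, None] - breakpoints[None, :], 0.0)
    return np.hstack([np.ones((x.size, 1)), hidden])

def fit_output_layer(x, y, breakpoints):
    """Discrete least-squares fit of the output-layer coefficients only."""
    features = relu_features(x, breakpoints)
    coeffs, *_ = np.linalg.lstsq(features, y, rcond=None)
    return coeffs

# Toy target (assumption): f(x) = |x - 0.5| on [0, 1], which a two-layer ReLU
# network with breakpoints at 0 and 0.5 represents exactly as
# 0.5 - ReLU(x) + 2 * ReLU(x - 0.5).
x = np.linspace(0.0, 1.0, 201)
y = np.abs(x - 0.5)
breakpoints = np.array([0.0, 0.5])
coeffs = fit_output_layer(x, y, breakpoints)
max_err = np.max(np.abs(relu_features(x, breakpoints) @ coeffs - y))
```

With the breakpoints placed where the target has its kink, the fit is exact up to rounding; an adaptive method would instead have to locate such breakpoints itself.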

arXiv:2107.06459 [pdf, other]

Adaptive Two-Layer ReLU Neural Network: II. Ritz Approximation to Elliptic PDEs

Authors: Min Liu, Zhiqiang Cai

Abstract: In this paper, we study the adaptive neuron enhancement (ANE) method for solving self-adjoint second-order elliptic partial differential equations (PDEs). The ANE method is a self-adaptive method generating a two-layer spline NN and a numerical integration mesh such that the approximation accuracy is within the prescribed tolerance. Moreover, the ANE method provides a natural process for obtaining a good initialization, which is crucial for training nonlinear optimization problems. The underlying PDE is discretized by the Ritz method using a two-layer spline neural network based on either the primal or dual formulations that minimize the respective energy or complementary functionals. Essential boundary conditions are imposed weakly through the functionals with proper norms. It is proved that the Ritz approximation is the best approximation in the energy norm; moreover, the effect of numerical integration for the Ritz approximation is analyzed as well. Two estimators for the adaptive neuron enhancement method are introduced: one is the so-called recovery estimator and the other is the least-squares estimator. Finally, numerical results for diffusion problems with either corner or intersecting interface singularities are presented.

Submitted 13 July, 2021; originally announced July 2021.

Comments: 18 pages

arXiv:2107.06341 [pdf, ps, other]

Hybrid A Posteriori Error Estimators for Conforming Finite Element Approximations to Stationary Convection-Diffusion-Reaction equations

Authors: Difeng Cai, Zhiqiang Cai

Abstract: We consider the a posteriori error estimation for convection-diffusion-reaction equations in both diffusion-dominated and convection/reaction-dominated regimes. We present an explicit hybrid estimator, which, in each regime, is proved to be reliable and efficient with constants independent of the parameters in the underlying problem. For convection-dominated problems, the norm introduced by Verfürth is used to measure the approximation error. Various numerical experiments are performed to (1) demonstrate the robustness of the hybrid estimator; (2) show that the hybrid estimator is more accurate than the explicit residual estimator and is less sensitive to the size of reaction, even though both of them are robust.

Submitted 15 July, 2021; v1 submitted 13 July, 2021; originally announced July 2021.

arXiv:2106.12428 [pdf, other]

An entropic method for discrete systems with Gibbs entropy

Authors: Zhenning Cai, Jingwei Hu, Yang Kuang, Bo Lin

Abstract: We consider general systems of ordinary differential equations with monotonic Gibbs entropy, and introduce an entropic scheme that simply imposes an entropy fix after every time step of any existing time integrator. It is proved that in the general case, our entropy fix has only infinitesimal influence on the numerical order of the original scheme, and in many circumstances, it can be shown that the scheme does not affect the numerical order. Numerical experiments on the linear Fokker-Planck equation and nonlinear Boltzmann equation are carried out to support our numerical analysis.

Submitted 23 June, 2021; originally announced June 2021.

MSC Class: 65L05

arXiv:2105.11632 [pdf, other]

doi 10.1016/j.jcp.2021.110514

Least-Squares ReLU Neural Network (LSNN) Method For Linear Advection-Reaction Equation

Authors: Zhiqiang Cai, Jingshuang Chen, Min Liu

Abstract: This paper studies least-squares ReLU neural network method for solving the linear advection-reaction problem with discontinuous solution. The method is a discretization of an equivalent least-squares formulation in the set of neural network functions with the ReLU activation function. The method is capable of approximating the discontinuous interface of the underlying problem automatically through the free hyper-planes of the ReLU neural network and, hence, outperforms mesh-based numerical methods in terms of the number of degrees of freedom. Numerical results of some benchmark test problems show that the method can not only approximate the solution with the least number of parameters, but also avoid the common Gibbs phenomena along the discontinuous interface. Moreover, a three-layer ReLU neural network is necessary and sufficient in order to well approximate a discontinuous solution with an interface in $\mathbb{R}^2$ that is not a straight line.

Submitted 24 May, 2021; originally announced May 2021.

Comments: submitted to Journal of Computational Physics

arXiv:2105.11627 [pdf, other]

Least-Squares ReLU Neural Network (LSNN) Method For Scalar Nonlinear Hyperbolic Conservation Law

Authors: Zhiqiang Cai, Jingshuang Chen, Min Liu

Abstract: We introduced the least-squares ReLU neural network (LSNN) method for solving the linear advection-reaction problem with discontinuous solution and showed that the method outperforms mesh-based numerical methods in terms of the number of degrees of freedom. This paper studies the LSNN method for scalar nonlinear hyperbolic conservation law. The method is a discretization of an equivalent least-squares (LS) formulation in the set of neural network functions with the ReLU activation function. Evaluation of the LS functional is done by using numerical integration and conservative finite volume scheme. Numerical results of some test problems show that the method is capable of approximating the discontinuous interface of the underlying problem automatically through the free breaking lines of the ReLU neural network. Moreover, the method does not exhibit the common Gibbs phenomena along the discontinuous interface.

Submitted 24 January, 2022; v1 submitted 24 May, 2021; originally announced May 2021.

Journal ref: Published in Applied Numerical Mathematics. 2022

arXiv:2102.08559 [pdf, other]

Numerical Solver for the Boltzmann Equation With Self-Adaptive Collision Operators

Authors: Zhenning Cai, Yanli Wang

Abstract: We use the Burnett spectral method to solve the Boltzmann equation whose collision term is modeled by separate treatments for the low-frequency part and high-frequency part of the solution. For the low-frequency part representing the sketch of the distribution function, the binary collision is applied, while for the high-frequency part representing the finer details, the BGK approximation is applied. The parameter controlling the ratio of the high-frequency part and the low-frequency part is selected adaptively on every grid cell at every time step. This self-adaptation is based on an error indicator describing the difference between the model collision term and the original binary collision term. The indicator is derived by controlling the quadratic terms in the modeling error with linear operators. Our numerical experiments show that such an error indicator is effective and computationally affordable.

Submitted 22 October, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

arXiv:2101.08010 [pdf, ps, other]

Some Rigorous Results on the Phase Transition of Finitary Random Interlacement

Authors: Zhenhao Cai, Yuan Zhang

Abstract: In this paper, we show several rigorous results on the phase transition of Finitary Random Interlacement (FRI). For the high intensity regime, we show the existence of a critical fiber length, and give the exact asymptotic of it as intensity goes to infinity. At the same time, our result for the low intensity regime proves the global existence of a non-trivial phase transition with respect to the system intensity.

Submitted 20 January, 2021; originally announced January 2021.

arXiv:2012.06744 [pdf, ps, other]

Solutions of quaternion-valued differential equations with or without commutativity

Authors: Z. Cai, K. I. Kou, W. Zhang

Abstract: Most results on quaternion-valued differential equations (QDEs) are based on J. Campos and J. Mawhin's fundamental solution of exponential form for the homogeneous linear equation, but their result requires a commutativity property. In this paper we discuss two problems: What quaternion function satisfies the commutativity property? Without the commutativity property, what can we do for the homogeneous equation? We prove that the commutativity property actually requires quaternionic functions to be complex-like functions. Without the commutativity property, we reduce the initial value problem of the homogeneous equation to a real nonautonomous nonlinear differential equation.

Submitted 12 December, 2020; originally announced December 2020.

arXiv:2010.14254 [pdf, other]

doi 10.3390/e23010069

On (non-)monotonicity and phase diagram of finitary random interlacement

Authors: Zhenhao Cai, Yunfeng Xiong, Yuan Zhang

Abstract: In this paper, we study the evolution of a Finitary Random Interlacement (FRI) with respect to the expected length of each fiber. In contrast to the previously proved phase transition between sufficiently large and small fiber length, we show that for $d=3,4$, FRI is NOT stochastically monotone as fiber length increases. At the same time, numerical evidence still strongly supports the existence of a unique and sharp phase transition on the existence of a unique infinite cluster, while the critical value for phase transition is estimated to be an inversely proportional function with respect to the system intensity.

Submitted 27 October, 2020; originally announced October 2020.

arXiv:2009.04044 [pdf, ps, other]

On Chemical Distance and Local Uniqueness of a Sufficiently Supercritical Finitary Random Interlacement

Authors: Zhenhao Cai, Xiao Han, Jiayan Ye, Yuan Zhang

Abstract: In this paper, we study geometric properties of the unique infinite cluster $Γ$ in a sufficiently supercritical Finitary Random Interlacements $\mathcal{FI}^{u,T}$ in $\mathbb{Z}^d, \ d\ge 3$. We prove that the chemical distance in $Γ$ is, with stretched exponentially high probability, of the same order as the Euclidean distance in $\mathbb{Z}^d$. This also implies a shape theorem parallel to those for Bernoulli percolation and random interlacements. We also prove local uniqueness of $\mathcal{FI}^{u,T}$, which says any two large clusters in $\mathcal{FI}^{u,T}$ "close to each other" will with stretched exponentially high probability be connected to each other within the same order of the distance between them.

Submitted 8 September, 2020; originally announced September 2020.

Comments: 43 pages

arXiv:2007.10198 [pdf, other]

On the validity of complex Langevin method for path integral computations

Authors: Zhenning Cai, Xiaoyu Dong, Yang Kuang

Abstract: The complex Langevin (CL) method is a classical numerical strategy to alleviate the numerical sign problem in the computation of lattice field theories. Mathematically, it is a simple numerical tool to compute a wide class of high-dimensional and oscillatory integrals. However, it is often observed that the CL method converges but the limiting result is incorrect. The literature has several unclear or even conflicting statements, making the method look mysterious. By an in-depth analysis of a model problem, we reveal the mechanism of how the CL result turns biased as the parameter changes, and it is demonstrated that such a transition is difficult to capture. Our analysis also shows that the method works for any observables only if the probability density function generated by the CL process is localized. To generalize such observations to lattice field theories, we formulate the CL method on general groups using rigorous mathematical languages for the first time, and we demonstrate that such localized probability density function does not exist in the simulation of lattice field theories for general compact groups, which explains the unstable behavior of the CL method. Fortunately, we also find that the gauge cooling technique creates additional velocity that helps confine the samples, so that we can still see localized probability density functions in certain cases, as significantly broadens the application of the CL method. The limitations of gauge cooling are also discussed. In particular, we prove that gauge cooling has no effect for Abelian groups, and we provide an example showing that biased results still exist when gauge cooling is insufficient to confine the probability density function.

Submitted 5 November, 2020; v1 submitted 20 July, 2020; originally announced July 2020.

Comments: 28 pages, 9 figures

arXiv:2006.07654 [pdf, ps, other]

Numerical analysis for inchworm Monte Carlo method: Sign problem and error growth

Authors: Zhenning Cai, Jianfeng Lu, Siyao Yang

Abstract: We consider the numerical analysis of the inchworm Monte Carlo method, which is proposed recently to tackle the numerical sign problem for open quantum systems. We focus on the growth of the numerical error with respect to the simulation time, for which the inchworm Monte Carlo method shows a flatter curve than the direct application of Monte Carlo method to the classical Dyson series. To better understand the underlying mechanism of the inchworm Monte Carlo method, we distinguish two types of exponential error growth, which are known as the numerical sign problem and the error amplification. The former is due to the fast growth of variance in the stochastic method, which can be observed from the Dyson series, and the latter comes from the evolution of the numerical solution. Our analysis demonstrates that the technique of partial resummation can be considered as a tool to balance these two types of error, and the inchworm Monte Carlo method is a successful case where the numerical sign problem is effectively suppressed by such means. We first demonstrate our idea in the context of ordinary differential equations, and then provide complete analysis for the inchworm Monte Carlo method. Several numerical experiments are carried out to verify our theoretical results.

Submitted 5 December, 2021; v1 submitted 13 June, 2020; originally announced June 2020.

Comments: 58 pages, 8 figures

arXiv:2006.02397 [pdf, other]

One Step to Efficient Synthetic Data

Authors: Jordan Awan, Zhanrui Cai

Abstract: A common approach to synthetic data is to sample from a fitted model. We show that under general assumptions, this approach results in a sample with inefficient estimators and whose joint distribution is inconsistent with the true distribution. Motivated by this, we propose a general method of producing synthetic data, which is widely applicable for parametric models, has asymptotically efficient summary statistics, and is both easily implemented and highly computationally efficient. Our approach allows for the construction of both partially synthetic datasets, which preserve certain summary statistics, as well as fully synthetic data which satisfy the strong guarantee of differential privacy (DP), both with the same asymptotic guarantees. We also provide theoretical and empirical evidence that the distribution from our procedure converges to the true distribution. Besides our focus on synthetic data, our procedure can also be used to perform approximate hypothesis tests in the presence of intractable likelihood functions.

Submitted 29 March, 2023; v1 submitted 3 June, 2020; originally announced June 2020.

Comments: 17 pages before appendices/references

arXiv:2002.09733 [pdf, ps, other]

Numerical Analysis of a High-Order Scheme for Nonlinear Fractional Differential Equations with Uniform Accuracy

Authors: Junying Cao, Zhenning Cai

Abstract: We introduce a high-order numerical scheme for fractional ordinary differential equations with the Caputo derivative. The method is developed by dividing the domain into a number of subintervals, and applying the quadratic interpolation on each subinterval. The method is shown to be unconditionally stable, and for general nonlinear equations, the uniform sharp numerical order $3-ν$ can be rigorously proven for sufficiently smooth solutions at all time steps. The proof provides a general guide for proving the sharp order for higher-order schemes in the nonlinear case. Some numerical examples are given to validate our theoretical results.

Submitted 22 February, 2020; originally announced February 2020.

arXiv:2001.09102 [pdf, other]

Generalized Prager-Synge Inequality and Equilibrated Error Estimators for Discontinuous Elements

Authors: Cuiyu He, Zhiqiang Cai, Shun Zhang

Abstract: The well-known Prager-Synge identity is valid in $H^1(Ω)$ and serves as a foundation for developing equilibrated a posteriori error estimators for continuous elements. In this paper, we introduce a new inequality, that may be regarded as a generalization of the Prager-Synge identity, to be valid for piecewise $H^1(Ω)$ functions for diffusion problems. The inequality is proved to be identity in two dimensions. For nonconforming finite element approximation of arbitrary odd order, we propose a fully explicit approach that recovers an equilibrated flux in $H(div; Ω)$ through a local element-wise scheme and that recovers a gradient in $H(curl;Ω)$ through a simple averaging technique over edges. The resulting error estimator is then proved to be globally reliable and locally efficient. Moreover, the reliability and efficiency constants are independent of the jump of the diffusion coefficient regardless of its distribution.

Submitted 24 January, 2020; originally announced January 2020.

MSC Class: 65N30 ACM Class: G.1.8

arXiv:1911.02109 [pdf, other]

doi 10.1016/j.jcp.2020.109707

Deep least-squares methods: an unsupervised learning-based numerical method for solving elliptic PDEs

Authors: Zhiqiang Cai, Jingshuang Chen, Min Liu, Xinyu Liu

Abstract: This paper studies an unsupervised deep learning-based numerical approach for solving partial differential equations (PDEs). The approach makes use of the deep neural network to approximate solutions of PDEs through the compositional construction and employs least-squares functionals as loss functions to determine parameters of the deep neural network. There are various least-squares functionals for a partial differential equation. This paper focuses on the so-called first-order system least-squares (FOSLS) functional studied in [3], which is based on a first-order system of scalar second-order elliptic PDEs. Numerical results for second-order elliptic PDEs in one dimension are presented.

Submitted 12 July, 2020; v1 submitted 5 November, 2019; originally announced November 2019.

Comments: 15 pages, 6 figures, 5 tables, accepted by Journal of Computational Physics

MSC Class: 35Q68

arXiv:1910.09355 [pdf, ps, other]

Burnett Spectral Method for High-Speed Rarefied Gas Flows

Authors: Zhicheng Hu, Zhenning Cai

Abstract: We introduce a numerical solver for the spatially inhomogeneous Boltzmann equation using the Burnett spectral method. The modelling and discretization of the collision operator are based on the previous work [Z. Cai, Y. Fan, and Y. Wang, Burnett spectral method for the spatially homogeneous Boltzmann equation, arXiv:1810.07804], which is the hybridization of the BGK operator for higher moments and the quadratic collision operator for lower moments. To ensure the preservation of the equilibrium state, we introduce an additional term to the discrete collision operator, which equals zero when the number of degrees of freedom tends to infinity. Compared with the previous work [Z. Hu, Z. Cai, and Y. Wang, Numerical simulation of microflows using Hermite spectral methods, arXiv:1807.06236], the computational cost is reduced by one order. Numerical experiments such as shock structure calculation and Fourier flows are carried out to show the efficiency and accuracy of our numerical method.

Submitted 18 October, 2019; originally announced October 2019.

arXiv:1907.08894 [pdf, ps, other]

doi 10.1080/00029890.2020.1690387

The Surprising Accuracy of Benford's Law in Mathematics

Authors: Zhaodong Cai, Matthew Faust, A. J. Hildebrand, Junxian Li, Yuan Zhang

Abstract: Benford's law is an empirical ``law'' governing the frequency of leading digits in numerical data sets. Surprisingly, for mathematical sequences the predictions derived from it can be uncannily accurate. For example, among the first billion powers of $2$, exactly $301029995$ begin with digit 1, while the Benford prediction for this count is $10^9\log_{10}2=301029995.66\dots$. Similar ``perfect hits'' can be observed in other instances, such as the digit $1$ and $2$ counts for the first billion powers of $3$. We prove results that explain many, but not all, of these surprising accuracies, and we relate the observed behavior to classical results in Diophantine approximation as well as recent deep conjectures in this area.

Submitted 28 November, 2019; v1 submitted 20 July, 2019; originally announced July 2019.

Comments: Accepted for publication in the American Mathematical Monthly

Journal ref: The American Mathematical Monthly, 127 (2020), 217-237

arXiv:1905.11683 [pdf, other]

doi 10.4208/cicp.OA-2019-0126

How does Gauge Cooling Stabilize Complex Langevin?

Authors: Zhenning Cai, Yana Di, Xiaoyu Dong

Abstract: We study the mechanism of the gauge cooling technique to stabilize the complex Langevin method in the one-dimensional periodic setting. In this case, we find the exact solutions for the gauge transform which minimizes the Frobenius norm of link variables. Thereby, we derive the underlying stochastic differential equations by continuing the numerical method with gauge cooling, and thus provide a number of insights on the effects of gauge cooling. A specific case study is carried out for the Polyakov loop model in $SU(2)$ theory, in which we show that the gauge cooling may help form a localized distribution to guarantee there is no excursion too far away from the real axis.

Submitted 28 May, 2019; originally announced May 2019.

Comments: 23 pages, 4 figures

Showing 1–50 of 69 results for author: Cai, Z