
Error Analysis of Three-Layer Neural Network Trained with PGD for Deep Ritz Method

Authors: Yuling Jiao, Yanming Lai, Yang Wang

Abstract: Machine learning is a rapidly advancing field with diverse applications across various domains. One prominent area of research is the utilization of deep learning techniques for solving partial differential equations (PDEs). In this work, we specifically focus on employing a three-layer tanh neural network within the framework of the deep Ritz method (DRM) to solve second-order elliptic equations with three different types of boundary conditions. We perform projected gradient descent (PGD) to train the three-layer network and we establish its global convergence. To the best of our knowledge, we are the first to provide a comprehensive error analysis of using overparameterized networks to solve PDE problems, as our analysis simultaneously includes estimates for approximation error, generalization error, and optimization error. We present an error bound in terms of the sample size $n$ and our work provides guidance on how to set the network depth, width, step size, and number of iterations for the projected gradient descent algorithm. Importantly, our assumptions in this work are classical and we do not require any additional assumptions on the solution of the equation. This ensures the broad applicability and generality of our results.

Submitted 19 May, 2024; originally announced May 2024.

MSC Class: 65N12; 65N15; 68T07; 62G05; 35J25

arXiv:2403.04089 [pdf, ps, other]

A family of Kähler flying wing steady Ricci solitons

Authors: Pak-Yeung Chan, Ronan J. Conlon, Yi Lai

Abstract: In $1996$, H.-D. Cao constructed a $U(n)$-invariant steady gradient Kähler-Ricci soliton on $\mathbb{C}^{n}$ and asked whether every steady gradient Kähler-Ricci soliton of positive curvature on $\mathbb{C}^{n}$ is necessarily $U(n)$-invariant (and hence unique up to scaling). Recently, Apostolov-Cifarelli answered this question in the negative for $n=2$. Here, we construct a family of $U(1)\times U(n-1)$-invariant, but not $U(n)$-invariant, complete steady gradient Kähler-Ricci solitons with strictly positive curvature operator on real $(1,\,1)$-forms (in particular, with strictly positive sectional curvature) on $\mathbb{C}^{n}$ for $n\geq3$, thereby answering Cao's question in the negative for $n\geq3$. This family of steady Ricci solitons interpolates between Cao's $U(n)$-invariant steady Kähler-Ricci soliton and the product of the cigar soliton and Cao's $U(n-1)$-invariant steady Kähler-Ricci soliton. This provides the Kähler analog of the Riemannian flying wings construction of Lai. In the process of the proof, we also demonstrate that the almost diameter rigidity of $\mathbb{P}^{n}$ endowed with the Fubini-Study metric does not hold even if the curvature operator is bounded below by $2$ on real $(1,\,1)$-forms.

Submitted 23 May, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

Comments: Appendix B added; more references added

MSC Class: 53E20; 53E30

arXiv:2402.14877 [pdf, other]

Machine-learning prediction of tipping and collapse of the Atlantic Meridional Overturning Circulation

Authors: Shirin Panahi, Ling-Wei Kong, Mohammadamin Moradi, Zheng-Meng Zhai, Bryan Glaz, Mulugeta Haile, Ying-Cheng Lai

Abstract: Recent research on the Atlantic Meridional Overturning Circulation (AMOC) raised concern about its potential collapse through a tipping point due to the climate-change caused increase in the freshwater input into the North Atlantic. The predicted time window of collapse is centered about the middle of the century and the earliest possible start is approximately two years from now. More generally, anticipating a tipping point at which the system transitions from one stable steady state to another is relevant to a broad range of fields. We develop a machine-learning approach to predicting tipping in noisy dynamical systems with a time-varying parameter and test it on a number of systems including the AMOC, ecological networks, an electrical power system, and a climate model. For the AMOC, our prediction based on simulated fingerprint data and real data of the sea surface temperature places the time window of a potential collapse between the years 2040 and 2065.

Submitted 21 February, 2024; originally announced February 2024.

Comments: 6 pages, 3 figures

arXiv:2311.09142 [pdf, other]

Machine-learning parameter tracking with partial state observation

Authors: Zheng-Meng Zhai, Mohammadamin Moradi, Bryan Glaz, Mulugeta Haile, Ying-Cheng Lai

Abstract: Complex and nonlinear dynamical systems often involve parameters that change with time, accurate tracking of which is essential to tasks such as state estimation, prediction, and control. Existing machine-learning methods require full state observation of the underlying system and tacitly assume adiabatic changes in the parameter. Formulating an inverse problem and exploiting reservoir computing, we develop a model-free and fully data-driven framework to accurately track time-varying parameters from partial state observation in real time. In particular, with training data from a subset of the dynamical variables of the system for a small number of known parameter values, the framework is able to accurately predict the parameter variations in time. Low- and high-dimensional, Markovian and non-Markovian nonlinear dynamical systems are used to demonstrate the power of the machine-learning based parameter-tracking framework. Pertinent issues affecting the tracking performance are addressed.

Submitted 15 November, 2023; originally announced November 2023.

Comments: 5 pages, 4 figures

arXiv:2311.09140 [pdf, other]

Rate-induced tipping in complex high-dimensional ecological networks

Authors: Shirin Panahi, Younghae Do, Alan Hastings, Ying-Cheng Lai

Abstract: In an ecosystem, environmental changes as a result of natural and human processes can cause some key parameters of the system to change with time. Depending on how fast such a parameter changes, a tipping point can occur. Existing works on rate-induced tipping, or R-tipping, offered a theoretical way to study this phenomenon but from a local dynamical point of view, revealing, e.g., the existence of a critical rate for some specific initial condition above which a tipping point will occur. As ecosystems are subject to constant disturbances and can drift away from their equilibrium point, it is necessary to study R-tipping from a global perspective in terms of the initial conditions in the entire relevant phase space region. In particular, we introduce the notion of the probability of R-tipping defined for initial conditions taken from the whole relevant phase space. Using a number of real-world, complex mutualistic networks as a paradigm, we discover a scaling law between this probability and the rate of parameter change and provide a geometric theory to explain the law. The real-world implication is that even a slow parameter change can lead to a system collapse with catastrophic consequences. In fact, to mitigate the environmental changes by merely slowing down the parameter drift may not always be effective: only when the rate of parameter change is reduced to practically zero would the tipping be avoided. Our global dynamics approach offers a more complete and physically meaningful way to understand the important phenomenon of R-tipping.

Submitted 15 November, 2023; originally announced November 2023.

Comments: 8 pages, 5 figures

arXiv:2309.11470 [pdf, other]

doi 10.1038/s41467-023-41379-3

Model-free tracking control of complex dynamical trajectories with machine learning

Authors: Zheng-Meng Zhai, Mohammadamin Moradi, Ling-Wei Kong, Bryan Glaz, Mulugeta Haile, Ying-Cheng Lai

Abstract: Nonlinear tracking control enabling a dynamical system to track a desired trajectory is fundamental to robotics, serving a wide range of civil and defense applications. In control engineering, designing tracking control requires complete knowledge of the system model and equations. We develop a model-free, machine-learning framework to control a two-arm robotic manipulator using only partially observed states, where the controller is realized by reservoir computing. Stochastic input is exploited for training, which consists of the observed partial state vector as the first and its immediate future as the second component so that the neural machine regards the latter as the future state of the former. In the testing (deployment) phase, the immediate-future component is replaced by the desired observational vector from the reference trajectory. We demonstrate the effectiveness of the control framework using a variety of periodic and chaotic signals, and establish its robustness against measurement noise, disturbances, and uncertainties.

Submitted 20 September, 2023; originally announced September 2023.

Comments: 16 pages, 8 figures

Journal ref: Nat Commun 14, 5698 (2023)

arXiv:2309.11461 [pdf, other]

Digital twins of nonlinear dynamical systems: A perspective

Authors: Ying-Cheng Lai

Abstract: Digital twins have attracted a great deal of recent attention from a wide range of fields. A basic requirement for digital twins of nonlinear dynamical systems is the ability to generate the system evolution and predict potentially catastrophic emergent behaviors so as to provide early warnings. The digital twin can then be used for system "health" monitoring in real time and for predictive problem solving. In particular, if the digital twin forecasts a possible system collapse in the future due to parameter drifting as caused by environmental changes or perturbations, an optimal control strategy can be devised and executed as early intervention to prevent the collapse. Two approaches exist for constructing digital twins of nonlinear dynamical systems: sparse optimization and machine learning. The basics of these two approaches are described and their advantages and caveats are discussed.

Submitted 20 September, 2023; originally announced September 2023.

Comments: 12 pages, 3 figures

arXiv:2308.09655 [pdf, other]

Oscillatory networks: Insights from piecewise-linear modeling

Authors: Stephen Coombes, Mustafa Sayli, Rüdiger Thul, Rachel Nicks, Mason A Porter, Yi Ming Lai

Abstract: There is enormous interest -- both mathematically and in diverse applications -- in understanding the dynamics of coupled oscillator networks. The real-world motivation of such networks arises from studies of the brain, the heart, ecology, and more. It is common to describe the rich emergent behavior in these systems in terms of complex patterns of network activity that reflect both the connectivity and the nonlinear dynamics of the network components. Such behavior is often organized around phase-locked periodic states and their instabilities. However, the explicit calculation of periodic orbits in nonlinear systems (even in low dimensions) is notoriously hard, so network-level insights often require the numerical construction of some underlying periodic component. In this paper, we review powerful techniques for studying coupled oscillator networks. We discuss phase reductions, phase-amplitude reductions, and the master stability function for smooth dynamical systems. We then focus in particular on the augmentation of these methods to analyze piecewise-linear systems, for which one can readily construct periodic orbits. This yields useful insights into network behavior, but the cost is that one needs to study nonsmooth dynamical systems.
The study of nonsmooth systems is well-developed when focusing on the interacting units (i.e., at the node level) of a system, and we give a detailed presentation of how to use \textit{saltation operators}, which can treat the propagation of perturbations through switching manifolds, to understand dynamics and bifurcations at the network level. We illustrate this merger of tools and techniques from network science and nonsmooth dynamical systems with applications to neural systems, cardiac systems, networks of electro-mechanical oscillators, and cooperation in cattle herds.

Submitted 18 August, 2023; originally announced August 2023.

Comments: 63 pages, 26 figures

MSC Class: 34C15; 49J52; 90B10; 92C42; 91D30; 49J52

arXiv:2302.02405 [pdf, ps, other]

Convergence Analysis of the Deep Galerkin Method for Weak Solutions

Authors: Yuling Jiao, Yanming Lai, Yang Wang, Haizhao Yang, Yunfei Yang

Abstract: This paper analyzes the convergence rate of a deep Galerkin method for the weak solution (DGMW) of second-order elliptic partial differential equations on $\mathbb{R}^d$ with Dirichlet, Neumann, and Robin boundary conditions, respectively. In DGMW, a deep neural network is applied to parametrize the PDE solution, and a second neural network is adopted to parametrize the test function in the traditional Galerkin formulation. By properly choosing the depth and width of these two networks in terms of the number of training samples $n$, it is shown that the convergence rate of DGMW is $\mathcal{O}(n^{-1/d})$, which is the first convergence result for weak solutions. The main idea of the proof is to divide the error of the DGMW into an approximation error and a statistical error. We derive an upper bound on the approximation error in the $H^{1}$ norm and bound the statistical error via Rademacher complexity.

Submitted 5 February, 2023; originally announced February 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2107.14478

arXiv:2211.09955 [pdf, other]

Emergence of a stochastic resonance in machine learning

Authors: Zheng-Meng Zhai, Ling-Wei Kong, Ying-Cheng Lai

Abstract: Can noise be beneficial to machine-learning prediction of chaotic systems? Utilizing reservoir computers as a paradigm, we find that injecting noise into the training data can induce a stochastic resonance with significant benefits to both short-term prediction of the state variables and long-term prediction of the attractor of the system. A key to inducing the stochastic resonance is to include the amplitude of the noise in the set of hyperparameters for optimization. By so doing, the prediction accuracy, stability and horizon can be dramatically improved. The stochastic resonance phenomenon is demonstrated using two prototypical high-dimensional chaotic systems.

Submitted 15 November, 2022; originally announced November 2022.

Comments: 7 pages, 4 figures

arXiv:2207.02714 [pdf, ps, other]

3D flying wings for any asymptotic cones

Authors: Yi Lai

Abstract: For every $θ\in(0,π)$, we construct a 3D steady gradient Ricci soliton whose asymptotic cone is a sector with angle $θ$, which is called a 3D flying wing.

Submitted 6 July, 2022; originally announced July 2022.

MSC Class: 53E20

arXiv:2205.01146 [pdf, ps, other]

O(2)-symmetry of 3D steady gradient Ricci solitons

Authors: Yi Lai

Abstract: For any 3D steady gradient Ricci soliton with positive curvature, we prove that it must be isometric to the Bryant soliton if it is asymptotic to a ray. Otherwise, it is asymptotic to a sector and hence a flying wing. We show that all 3D flying wings are O(2)-symmetric. Therefore, all 3D steady gradient Ricci solitons are O(2)-symmetric.

Submitted 18 July, 2023; v1 submitted 2 May, 2022; originally announced May 2022.

MSC Class: 53E20

arXiv:2203.17155 [pdf, other]

Predicting extreme events from data using deep machine learning: when and where

Authors: Junjie Jiang, Zi-Gang Huang, Celso Grebogi, Ying-Cheng Lai

Abstract: We develop a deep convolutional neural network (DCNN) based framework for model-free prediction of the occurrence of extreme events both in time ("when") and in space ("where") in nonlinear physical systems of spatial dimension two. The measurements or data are a set of two-dimensional snapshots or images. For a desired time horizon of prediction, a proper labeling scheme can be designated to enable successful training of the DCNN and subsequent prediction of extreme events in time. Given that an extreme event has been predicted to occur within the time horizon, a space-based labeling scheme can be applied to predict, within certain resolution, the location at which the event will occur. We use synthetic data from the 2D complex Ginzburg-Landau equation and empirical wind speed data of the North Atlantic ocean to demonstrate and validate our machine-learning based prediction framework. The trade-offs among the prediction horizon, spatial resolution, and accuracy are illustrated, and the detrimental effect of spatially biased occurrence of extreme events on prediction accuracy is discussed. The deep learning framework is viable for predicting extreme events in the real world.

Submitted 31 March, 2022; originally announced March 2022.

Comments: 15 pages, 10 figures

arXiv:2203.14006 [pdf, other]

Continuity scaling: A rigorous framework for detecting and quantifying causality accurately

Authors: Xiong Ying, Si-Yang Leng, Huan-Fei Ma, Qing Nie, Ying-Cheng Lai, Wei Lin

Abstract: Data based detection and quantification of causation in complex, nonlinear dynamical systems is of paramount importance to science, engineering and beyond. Inspired by the widely used methodology in recent years, the cross-map-based techniques, we develop a general framework to advance towards a comprehensive understanding of dynamical causal mechanisms, which is consistent with the natural interpretation of causality. In particular, instead of measuring the smoothness of the cross map as conventionally implemented, we define causation through measuring the {\it scaling law} for the continuity of the investigated dynamical system directly. The uncovered scaling law enables accurate, reliable, and efficient detection of causation and assessment of its strength in general complex dynamical systems, outperforming those existing representative methods. The continuity scaling based framework is rigorously established and demonstrated using datasets from model complex systems and the real world.

Submitted 26 March, 2022; originally announced March 2022.

Comments: 7 figures; The article has been peer reviewed and accepted by RESEARCH

arXiv:2111.02009 [pdf, ps, other]

Analysis of Deep Ritz Methods for Laplace Equations with Dirichlet Boundary Conditions

Authors: Chenguang Duan, Yuling Jiao, Yanming Lai, Xiliang Lu, Qimeng Quan, Jerry Zhijian Yang

Abstract: Deep Ritz methods (DRM) have been proven numerically to be efficient in solving partial differential equations. In this paper, we present a convergence rate in $H^{1}$ norm for deep Ritz methods for Laplace equations with Dirichlet boundary condition, where the error depends on the depth and width in the deep neural networks and the number of samples explicitly. Further we can properly choose the depth and width in the deep neural networks in terms of the number of training samples. The main idea of the proof is to decompose the total error of DRM into three parts, that is approximation error, statistical error and the error caused by the boundary penalty. We bound the approximation error in $H^{1}$ norm with $\mathrm{ReLU}^{2}$ networks and control the statistical error via Rademacher complexity. In particular, we derive the bound on the Rademacher complexity of the non-Lipschitz composition of gradient norm with $\mathrm{ReLU}^{2}$ network, which is of immense independent interest. We also analyze the error induced by the boundary penalty method and give a prior rule for tuning the penalty parameter.

Submitted 3 November, 2021; originally announced November 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2103.13330; text overlap with arXiv:2109.01780

arXiv:2109.01780 [pdf, ps, other]

doi 10.4208/cicp.OA-2021-0186

A rate of convergence of Physics Informed Neural Networks for the linear second order elliptic PDEs

Authors: Yuling Jiao, Yanming Lai, Dingwei Li, Xiliang Lu, Fengru Wang, Yang Wang, Jerry Zhijian Yang

Abstract: In recent years, physics-informed neural networks (PINNs) have been shown to be a powerful tool for solving PDEs empirically. However, numerical analysis of PINNs is still missing. In this paper, we prove the convergence rate of PINNs for the second order elliptic equations with Dirichlet boundary condition, by establishing the upper bounds on the number of training samples, depth and width of the deep neural networks to achieve desired accuracy. The error of PINNs is decomposed into approximation error and statistical error, where the approximation error is given in $C^2$ norm with $\mathrm{ReLU}^{3}$ networks (deep network with activation function $\max\{0,x^3\}$) and the statistical error is estimated by Rademacher complexity. We derive the bound on the Rademacher complexity of the non-Lipschitz composition of gradient norm with $\mathrm{ReLU}^{3}$ network, which is of immense independent interest.

Submitted 15 March, 2022; v1 submitted 3 September, 2021; originally announced September 2021.

Comments: arXiv admin note: text overlap with arXiv:2103.13330

arXiv:2107.14478 [pdf, ps, other]

Error Analysis of Deep Ritz Methods for Elliptic Equations

Authors: Yuling Jiao, Yanming Lai, Yisu Lo, Yang Wang, Yunfei Yang

Abstract: Using deep neural networks to solve PDEs has attracted a lot of attention recently. However, why the deep learning method works is falling far behind its empirical success. In this paper, we provide a rigorous numerical analysis on deep Ritz method (DRM) \cite{Weinan2017The} for second order elliptic equations with Dirichlet, Neumann and Robin boundary conditions, respectively. We establish the first nonasymptotic convergence rate in $H^1$ norm for DRM using deep networks with smooth activation functions including logistic and hyperbolic tangent functions. Our results show how to set the hyper-parameter of depth and width to achieve the desired convergence rate in terms of number of training samples.

Submitted 4 September, 2021; v1 submitted 30 July, 2021; originally announced July 2021.

arXiv:2103.13330 [pdf, ps, other]

doi 10.4208/cicp.OA-2021-0195

Convergence Rate Analysis for Deep Ritz Method

Authors: Chenguang Duan, Yuling Jiao, Yanming Lai, Xiliang Lu, Zhijian Yang

Abstract: Using deep neural networks to solve PDEs has attracted a lot of attention recently. However, why the deep learning method works is falling far behind its empirical success. In this paper, we provide a rigorous numerical analysis on deep Ritz method (DRM) \cite{wan11} for second order elliptic equations with Neumann boundary conditions. We establish the first nonasymptotic convergence rate in $H^1$ norm for DRM using deep networks with $\mathrm{ReLU}^2$ activation functions. In addition to providing a theoretical justification of DRM, our study also sheds light on how to set the hyper-parameter of depth and width to achieve the desired convergence rate in terms of number of training samples. Technically, we derive bounds on the approximation error of deep $\mathrm{ReLU}^2$ network in $H^1$ norm and on the Rademacher complexity of the non-Lipschitz composition of gradient norm and $\mathrm{ReLU}^2$ network, both of which are of independent interest.

Submitted 29 March, 2021; v1 submitted 24 March, 2021; originally announced March 2021.

arXiv:2103.00542 [pdf, ps, other]

Deep Neural Networks with ReLU-Sine-Exponential Activations Break Curse of Dimensionality in Approximation on Hölder Class

Authors: Yuling Jiao, Yanming Lai, Xiliang Lu, Fengru Wang, Jerry Zhijian Yang, Yuanyuan Yang

Abstract: In this paper, we construct neural networks with ReLU, sine and $2^x$ as activation functions. For general continuous $f$ defined on $[0,1]^d$ with continuity modulus $ω_f(\cdot)$, we construct ReLU-sine-$2^x$ networks that enjoy an approximation rate $\mathcal{O}(ω_f(\sqrt{d})\cdot2^{-M}+ω_{f}\left(\frac{\sqrt{d}}{N}\right))$, where $M,N\in \mathbb{N}^{+}$ denote the hyperparameters related to widths of the networks. As a consequence, we can construct ReLU-sine-$2^x$ network with the depth $5$ and width $\max\left\{\left\lceil2d^{3/2}\left(\frac{3μ}ε\right)^{1/α}\right\rceil,2\left\lceil\log_2\frac{3μd^{α/2}}{2ε}\right\rceil+2\right\}$ that approximates $f\in \mathcal{H}_μ^α([0,1]^d)$ within a given tolerance $ε>0$ measured in $L^p$ norm $p\in[1,\infty)$, where $\mathcal{H}_μ^α([0,1]^d)$ denotes the Hölder continuous function class defined on $[0,1]^d$ with order $α\in (0,1]$ and constant $μ> 0$. Therefore, the ReLU-sine-$2^x$ networks overcome the curse of dimensionality on $\mathcal{H}_μ^α([0,1]^d)$. In addition to its super expressive power, functions implemented by ReLU-sine-$2^x$ networks are (generalized) differentiable, enabling us to apply SGD to train.

Submitted 12 August, 2022; v1 submitted 28 February, 2021; originally announced March 2021.

arXiv:2012.04556 [pdf, other]

doi 10.1063/5.0062042

Finding nonlinear system equations and complex network structures from data: a sparse optimization approach

Authors: Ying-Cheng Lai

Abstract: In applications of nonlinear and complex dynamical systems, a common situation is that the system can be measured but its structure and the detailed rules of dynamical evolution are unknown. The inverse problem is to determine the system equations and structure based solely on measured time series. Recently, methods based on sparse optimization have been developed. For example, the principle of exploiting sparse optimization such as compressive sensing to find the equations of nonlinear dynamical systems from data was articulated in 2011 by the Nonlinear Dynamics Group at Arizona State University. This article presents a brief review of the recent progress in this area. The basic idea is to expand the equations governing the dynamical evolution of the system into a power series or a Fourier series of a finite number of terms and then to determine the vector of the expansion coefficients based solely on data through sparse optimization. Examples discussed here include discovering the equations of stationary or nonstationary chaotic systems to enable prediction of dynamical events such as critical transition and system collapse, inferring the full topology of complex networks of dynamical oscillators and social networks hosting evolutionary game dynamics, and identifying partial differential equations for spatiotemporal dynamical systems. Situations where sparse optimization is effective and those in which the method fails are discussed. Comparisons with the traditional method of delay coordinate embedding in nonlinear time series analysis are given, and the recent development of a model-free, data-driven prediction framework based on machine learning is briefly introduced.

Submitted 7 December, 2020; originally announced December 2020.

Comments: 23 pages, 2 figures. arXiv admin note: text overlap with arXiv:1704.08764

arXiv:2012.01545 [pdf, other]

Machine learning prediction of critical transition and system collapse

Authors: Ling-Wei Kong, Hua-Wei Fan, Celso Grebogi, Ying-Cheng Lai

Abstract: To predict a critical transition due to parameter drift without relying on a model is an outstanding problem in nonlinear dynamics and applied fields. A closely related problem is to predict whether the system is already in or if the system will be in a transient state preceding its collapse. We develop a model-free, machine-learning-based solution to both problems by exploiting reservoir computing to incorporate a parameter input channel. We demonstrate that, when the machine is trained in the normal functioning regime with a chaotic attractor (i.e., before the critical transition), the transition point can be predicted accurately. Remarkably, for a parameter drift through the critical point, the machine with the input parameter channel is able to predict not only that the system will be in a transient state, but also the average transient time before the final collapse.

Submitted 2 December, 2020; originally announced December 2020.

Comments: 5 pages, 3 figures

arXiv:2011.10597 [pdf, other]

Synchronization within synchronization: transients and intermittency in ecological networks

Authors: Huawei Fan, Ling-Wei Kong, Xingang Wang, Alan Hastings, Ying-Cheng Lai

Abstract: Transients are fundamental to ecological systems with significant implications to management, conservation, and biological control. We uncover a type of transient synchronization behavior in spatial ecological networks whose local dynamics are of the chaotic, predator-prey type. In the parameter regime where there is phase synchronization among all the patches, complete synchronization (i.e., synchronization in both phase and amplitude) can arise in certain pairs of patches as determined by the network symmetry - henceforth the phenomenon of "synchronization within synchronization." Distinct patterns of complete synchronization coexist but, due to intrinsic instability or noise, each pattern is a transient and there is random, intermittent switching among the patterns in the course of time evolution. The probability distribution of the transient time is found to follow an algebraic scaling law with a divergent average transient lifetime. Based on symmetry considerations, we develop a stability analysis to understand these phenomena. The general principle of symmetry can also be exploited to explain previously discovered, counterintuitive synchronization behaviors in ecological networks.

Submitted 20 November, 2020; originally announced November 2020.

Comments: 17 pages, 7 figures

arXiv:2010.07272 [pdf, other]

A family of 3d steady gradient solitons that are flying wings

Authors: Yi Lai

Abstract: We find a family of 3d steady gradient Ricci solitons that are flying wings. This verifies a conjecture by Hamilton. For a 3d flying wing, we show that the scalar curvature does not vanish at infinity. The 3d flying wings are collapsed. For dimension $n\ge 4$, we find a family of $\mathbb{Z}_2\times O(n-1)$-symmetric but non-rotationally symmetric n-dimensional steady gradient solitons with positive curvature operator. We show that these solitons are non-collapsed.

Submitted 29 November, 2020; v1 submitted 14 October, 2020; originally announced October 2020.

Comments: 25 pages, a figure added, minor changes

MSC Class: 53E20

arXiv:2004.05291 [pdf, ps, other]

doi 10.2140/gt.2021.25.3629

Producing 3d Ricci flows with non-negative Ricci curvature via singular Ricci flows

Authors: Yi Lai

Abstract: We extend the concept of singular Ricci flow by Kleiner and Lott from 3d compact manifolds to 3d complete manifolds with possibly unbounded curvature. As an application of the generalized singular Ricci flow, we show that for any 3d complete Riemannian manifold with non-negative Ricci curvature, there exists a smooth Ricci flow starting from it.

Submitted 15 June, 2020; v1 submitted 10 April, 2020; originally announced April 2020.

MSC Class: 53E20

Journal ref: Geom. Topol. 25 (2021) 3629-3690

arXiv:2004.04769 [pdf, other]

doi 10.1103/PhysRevResearch.2.023196

Scaling law of transient lifetime of chimera states under dimension-augmenting perturbations

Authors: Ling-Wei Kong, Ying-Cheng Lai

Abstract: Chimera states arising in the classic Kuramoto system of two-dimensional phase coupled oscillators are transient but they are "long" transients in the sense that the average transient lifetime grows exponentially with the system size. For reasonably large systems, e.g., those consisting of a few hundred oscillators, it is infeasible to numerically calculate or experimentally measure the average lifetime, so the chimera states are practically permanent. We find that small perturbations in the third dimension, which make the system "slightly" three-dimensional, will reduce dramatically the transient lifetime. In particular, under such a perturbation, the practically infinite average transient lifetime will become extremely short, because it scales with the magnitude of the perturbation only logarithmically. Physically, this means that a reduction in the perturbation strength over many orders of magnitude, insofar as it is not zero, would result in only an incremental increase in the lifetime. The uncovered type of fragility of chimera states raises concerns about their observability in physical systems.

Submitted 13 April, 2020; v1 submitted 9 April, 2020; originally announced April 2020.

Comments: 15 pages, 13 figures

Journal ref: Phys. Rev. Research 2, 023196 (2020)

arXiv:1910.11417 [pdf, other]

doi 10.1103/PhysRevE.100.052306

Asymmetry in interdependence makes a multilayer system more robust against cascading failures

Authors: Run-Ran Liu, Chun-Xiao Jia, Ying-Cheng Lai

Abstract: Multilayer networked systems are ubiquitous in nature and engineering, and the robustness of these systems against failures is of great interest. A main line of theoretical pursuit has been percolation induced cascading failures, where interdependence between network layers is conveniently and tacitly assumed to be symmetric. In the real world, interdependent interactions are generally asymmetric. To uncover and quantify the impact of asymmetry in interdependence on network robustness, we focus on percolation dynamics in double-layer systems and implement the following failure mechanism: once a node in a network layer fails, the damage it can cause depends not only on its position in the layer but also on the position of its counterpart neighbor in the other layer. We find that the characteristics of the percolation transition depend on the degree of asymmetry, where the striking phenomenon of a switch in the nature of the phase transition from first- to second-order arises. We derive a theory to calculate the percolation transition points in both network layers, as well as the transition switching point, with strong numerical support from synthetic and empirical networks. Not only does our work shed light upon the factors that determine the robustness of multilayer networks against cascading failures, but it also provides a scenario by which the system can be designed or controlled to reach a desirable level of resilience.

Submitted 24 October, 2019; originally announced October 2019.

Comments: 20 pages, 7 figures

Journal ref: Phys. Rev. E 100, 052306 (2019)

arXiv:1909.01288 [pdf, other]

doi 10.1038/s41467-019-11822-5

Irrelevance of linear controllability to nonlinear dynamical networks

Authors: Junjie Jiang, Ying-Cheng Lai

Abstract: There has been tremendous development of linear controllability of complex networks. Real-world systems are fundamentally nonlinear. Is linear controllability relevant to nonlinear dynamical networks? We identify a common trait underlying both types of control: the nodal "importance." For nonlinear and linear control, the importance is determined, respectively, by physical/biological considerations and the probability for a node to be in the minimum driver set. We study empirical mutualistic networks and a gene regulatory network, for which the nonlinear nodal importance can be quantified by the ability of individual nodes to restore the system from the aftermath of a tipping-point transition. We find that the nodal importance ranking for nonlinear and linear control exhibits opposite trends: for the former, large-degree nodes are more important, but for the latter, the importance scale is tilted towards the small-degree nodes, strongly suggesting the irrelevance of linear controllability to these systems. The recent claim of successful application of linear controllability to the C. elegans connectome is examined and discussed.

Submitted 3 September, 2019; originally announced September 2019.

Comments: 26 pages, 8 figures

arXiv:1804.08073 [pdf, other]

Ricci flow under local almost non-negative curvature conditions

Authors: Yi Lai

Abstract: We find a local solution to the Ricci flow equation under a negative lower bound for many known curvature conditions. The flow exists for a uniform amount of time, during which the curvature stays bounded below by a controllable negative number. The curvature conditions we consider include 2-non-negative and weakly $\textnormal{PIC}_1$ cases, of which the results are new. We complete the discussion of the almost preservation problem by Bamler-Cabezas-Rivas-Wilking, and the 2-non-negative case generalizes a result in 3D by Simon-Topping to higher dimensions. As an application, we use the local Ricci flow to smooth a metric space which is the limit of a sequence of manifolds with the almost non-negative curvature conditions, and show that this limit space is bi-Hölder homeomorphic to a smooth manifold.

Submitted 11 June, 2018; v1 submitted 22 April, 2018; originally announced April 2018.

Comments: 45 pages, 1 figure

arXiv:1801.08366 [pdf, other]

Networks of piecewise linear neural mass models

Authors: S Coombes, Y-M Lai, M Sayli, R Thul

Abstract: Neural mass models are ubiquitous in large scale brain modelling. At the node level they are written in terms of a set of ODEs with a nonlinearity that is typically a sigmoidal shape. Using structural data from brain atlases they may be connected into a network to investigate the emergence of functional dynamic states, such as synchrony. With the simple restriction of the classic sigmoidal nonlinearity to a piecewise linear caricature we show that the famous Wilson-Cowan neural mass model can be analysed at both the node and network level. The construction of periodic orbits at the node level is achieved by patching together matrix exponential solutions, and stability is determined using Floquet theory. For networks with interactions described by circulant matrices, we show that the stability of the synchronous state can be determined in terms of a low-dimensional Floquet problem parameterised by the eigenvalues of the interaction matrix. This network Floquet problem is readily solved using linear algebra, to predict the onset of spatio-temporal network patterns arising from a synchronous instability. We consider the case of a discontinuous choice for the node nonlinearity, namely the replacement of the sigmoid by a Heaviside nonlinearity. This gives rise to a continuous-time switching network. At the node level this allows for the existence of unstable sliding periodic orbits, which we construct. The stability of a periodic orbit is now treated with a modification of Floquet theory to treat the evolution of small perturbations through switching manifolds via saltation matrices. At the network level the stability analysis of the synchronous state is considerably more challenging. Here we report on the use of ideas originally developed for the study of Glass networks to treat the stability of periodic network states in neural mass models with discontinuous interactions.

Submitted 25 January, 2018; originally announced January 2018.

arXiv:1301.0796 [pdf, ps, other]

doi 10.1103/PhysRevE.88.012905

Noise-Induced Synchronization, Desynchronization, and Clustering in Globally Coupled Nonidentical Oscillators

Authors: Yi Ming Lai, Mason A. Porter

Abstract: We study ensembles of globally coupled, nonidentical phase oscillators subject to correlated noise, and we identify several important factors that cause noise and coupling to synchronize or desynchronize a system. By introducing noise in various ways, we find a novel estimate for the onset of synchrony of a system in terms of the coupling strength, noise strength, and width of the frequency distribution of its natural oscillations. We also demonstrate that noise alone is sufficient to synchronize nonidentical oscillators. However, this synchrony depends on the first Fourier mode of a phase-sensitivity function, through which we introduce common noise into the system. We show that higher Fourier modes can cause desynchronization due to clustering effects, and that this can reinforce clustering caused by different forms of coupling. Finally, we discuss the effects of noise on an ensemble in which antiferromagnetic coupling causes oscillators to form two clusters in the absence of noise.

Submitted 24 September, 2013; v1 submitted 4 January, 2013; originally announced January 2013.

Comments: 7 pages, 5 figures (some with multiple parts)

Journal ref: Phys. Rev. E 88, 012905 (2013)

arXiv:1205.0088

ProPPA: A Fast Algorithm for $\ell_1$ Minimization and Low-Rank Matrix Completion

Authors: Ranch Y. Q. Lai, Pong C. Yuen

Abstract: We propose a Projected Proximal Point Algorithm (ProPPA) for solving a class of optimization problems. The algorithm iteratively computes the proximal point of the last estimated solution projected into an affine space which itself is parallel to and approaches the feasible set. We provide a convergence analysis that theoretically supports the general algorithm, and then apply it to solving $\ell_1$-minimization problems and the matrix completion problem. These problems arise in many applications including machine learning, image and signal processing. We compare our algorithm with the existing state-of-the-art algorithms. Experimental results on solving these problems show that our algorithm is very efficient and competitive.

Submitted 19 May, 2012; v1 submitted 1 May, 2012; originally announced May 2012.

Comments: update needed

arXiv:0902.2718 [pdf, other]

An Effective Compactness Theorem for Coxeter Groups

Authors: Yvonne Lai

Abstract: Through highly non-constructive methods, works by Bestvina, Culler, Feighn, Morgan, Paulin, Rips, Shalen, and Thurston show that if a finitely presented group does not split over a virtually solvable subgroup, then the space of its discrete and faithful actions on hyperbolic n-space, modulo conjugation, is compact for all dimensions. Although this implies that the space of hyperbolic structures of such groups has finite diameter, the known methods do not give an explicit bound. We establish such a bound for Coxeter groups. We find that either the group splits over a virtually solvable subgroup or there is a constant C and a point in hyperbolic n-space that is moved no more than C by any generator. The constant C depends only on the number of generators of the group, and is independent of the relators.

Submitted 16 February, 2009; originally announced February 2009.

Comments: PDFLaTeX, 6 figures

MSC Class: 20F65; 57M99

arXiv:0804.4361 [pdf, ps, other]

Improving Coverage Accuracy of Block Bootstrap Confidence Intervals

Authors: Stephen M. S. Lee, P. Y. Lai

Abstract: The block bootstrap confidence interval based on dependent data can outperform the computationally more convenient normal approximation only with non-trivial Studentization which, in the case of complicated statistics, calls for highly specialist treatment. We propose two different approaches to improving the accuracy of the block bootstrap confidence interval under very general conditions. The first calibrates the coverage level by iterating the block bootstrap. The second calculates Studentizing factors directly from block bootstrap series and requires no non-trivial analytic treatment. Both approaches involve two nested levels of block bootstrap resampling and yield high-order accuracy with simple tuning of block lengths at the two resampling levels. A simulation study is reported to provide empirical support for our theory.

Submitted 28 April, 2008; originally announced April 2008.

Report number: Research Report No. 435. Department of Statistics and Actuarial Science, The University of Hong Kong

Showing 1–33 of 33 results for author: Lai, Y