-
Error Analysis of Three-Layer Neural Network Trained with PGD for Deep Ritz Method
Authors:
Yuling Jiao,
Yanming Lai,
Yang Wang
Abstract:
Machine learning is a rapidly advancing field with diverse applications across various domains. One prominent area of research is the utilization of deep learning techniques for solving partial differential equations (PDEs). In this work, we specifically focus on employing a three-layer tanh neural network within the framework of the deep Ritz method (DRM) to solve second-order elliptic equations with three different types of boundary conditions. We perform projected gradient descent (PGD) to train the three-layer network and we establish its global convergence. To the best of our knowledge, we are the first to provide a comprehensive error analysis of using overparameterized networks to solve PDE problems, as our analysis simultaneously includes estimates for approximation error, generalization error, and optimization error. We present error bounds in terms of the sample size $n$ and our work provides guidance on how to set the network depth, width, step size, and number of iterations for the projected gradient descent algorithm. Importantly, our assumptions in this work are classical and we do not require any additional assumptions on the solution of the equation. This ensures the broad applicability and generality of our results.
Submitted 19 May, 2024;
originally announced May 2024.
-
A family of Kähler flying wing steady Ricci solitons
Authors:
Pak-Yeung Chan,
Ronan J. Conlon,
Yi Lai
Abstract:
In $1996$, H.-D. Cao constructed a $U(n)$-invariant steady gradient Kähler-Ricci soliton on $\mathbb{C}^{n}$ and asked whether every steady gradient Kähler-Ricci soliton of positive curvature on $\mathbb{C}^{n}$ is necessarily $U(n)$-invariant (and hence unique up to scaling). Recently, Apostolov-Cifarelli answered this question in the negative for $n=2$. Here, we construct a family of $U(1)\times U(n-1)$-invariant, but not $U(n)$-invariant, complete steady gradient Kähler-Ricci solitons with strictly positive curvature operator on real $(1,\,1)$-forms (in particular, with strictly positive sectional curvature) on $\mathbb{C}^{n}$ for $n\geq3$, thereby answering Cao's question in the negative for $n\geq3$. This family of steady Ricci solitons interpolates between Cao's $U(n)$-invariant steady Kähler-Ricci soliton and the product of the cigar soliton and Cao's $U(n-1)$-invariant steady Kähler-Ricci soliton. This provides the Kähler analog of the Riemannian flying wings construction of Lai. In the process of the proof, we also demonstrate that the almost diameter rigidity of $\mathbb{P}^{n}$ endowed with the Fubini-Study metric does not hold even if the curvature operator is bounded below by $2$ on real $(1,\,1)$-forms.
Submitted 23 May, 2024; v1 submitted 6 March, 2024;
originally announced March 2024.
-
Machine-learning prediction of tipping and collapse of the Atlantic Meridional Overturning Circulation
Authors:
Shirin Panahi,
Ling-Wei Kong,
Mohammadamin Moradi,
Zheng-Meng Zhai,
Bryan Glaz,
Mulugeta Haile,
Ying-Cheng Lai
Abstract:
Recent research on the Atlantic Meridional Overturning Circulation (AMOC) raised concern about its potential collapse through a tipping point due to the climate-change caused increase in the freshwater input into the North Atlantic. The predicted time window of collapse is centered about the middle of the century and the earliest possible start is approximately two years from now. More generally, anticipating a tipping point at which the system transitions from one stable steady state to another is relevant to a broad range of fields. We develop a machine-learning approach to predicting tipping in noisy dynamical systems with a time-varying parameter and test it on a number of systems including the AMOC, ecological networks, an electrical power system, and a climate model. For the AMOC, our prediction based on simulated fingerprint data and real data of the sea surface temperature places the time window of a potential collapse between the years 2040 and 2065.
Submitted 21 February, 2024;
originally announced February 2024.
-
Machine-learning parameter tracking with partial state observation
Authors:
Zheng-Meng Zhai,
Mohammadamin Moradi,
Bryan Glaz,
Mulugeta Haile,
Ying-Cheng Lai
Abstract:
Complex and nonlinear dynamical systems often involve parameters that change with time, accurate tracking of which is essential to tasks such as state estimation, prediction, and control. Existing machine-learning methods require full state observation of the underlying system and tacitly assume adiabatic changes in the parameter. Formulating an inverse problem and exploiting reservoir computing, we develop a model-free and fully data-driven framework to accurately track time-varying parameters from partial state observation in real time. In particular, with training data from a subset of the dynamical variables of the system for a small number of known parameter values, the framework is able to accurately predict the parameter variations in time. Low- and high-dimensional, Markovian and non-Markovian nonlinear dynamical systems are used to demonstrate the power of the machine-learning based parameter-tracking framework. Pertinent issues affecting the tracking performance are addressed.
Submitted 15 November, 2023;
originally announced November 2023.
-
Rate-induced tipping in complex high-dimensional ecological networks
Authors:
Shirin Panahi,
Younghae Do,
Alan Hastings,
Ying-Cheng Lai
Abstract:
In an ecosystem, environmental changes as a result of natural and human processes can cause some key parameters of the system to change with time. Depending on how fast such a parameter changes, a tipping point can occur. Existing works on rate-induced tipping, or R-tipping, offered a theoretical way to study this phenomenon but from a local dynamical point of view, revealing, e.g., the existence of a critical rate for some specific initial condition above which a tipping point will occur. As ecosystems are subject to constant disturbances and can drift away from their equilibrium point, it is necessary to study R-tipping from a global perspective in terms of the initial conditions in the entire relevant phase space region. In particular, we introduce the notion of the probability of R-tipping defined for initial conditions taken from the whole relevant phase space. Using a number of real-world, complex mutualistic networks as a paradigm, we discover a scaling law between this probability and the rate of parameter change and provide a geometric theory to explain the law. The real-world implication is that even a slow parameter change can lead to a system collapse with catastrophic consequences. In fact, to mitigate the environmental changes by merely slowing down the parameter drift may not always be effective: only when the rate of parameter change is reduced to practically zero would the tipping be avoided. Our global dynamics approach offers a more complete and physically meaningful way to understand the important phenomenon of R-tipping.
Submitted 15 November, 2023;
originally announced November 2023.
-
Model-free tracking control of complex dynamical trajectories with machine learning
Authors:
Zheng-Meng Zhai,
Mohammadamin Moradi,
Ling-Wei Kong,
Bryan Glaz,
Mulugeta Haile,
Ying-Cheng Lai
Abstract:
Nonlinear tracking control enabling a dynamical system to track a desired trajectory is fundamental to robotics, serving a wide range of civil and defense applications. In control engineering, designing tracking control requires complete knowledge of the system model and equations. We develop a model-free, machine-learning framework to control a two-arm robotic manipulator using only partially observed states, where the controller is realized by reservoir computing. Stochastic input is exploited for training, which consists of the observed partial state vector as the first and its immediate future as the second component so that the neural machine regards the latter as the future state of the former. In the testing (deployment) phase, the immediate-future component is replaced by the desired observational vector from the reference trajectory. We demonstrate the effectiveness of the control framework using a variety of periodic and chaotic signals, and establish its robustness against measurement noise, disturbances, and uncertainties.
Submitted 20 September, 2023;
originally announced September 2023.
-
Digital twins of nonlinear dynamical systems: A perspective
Authors:
Ying-Cheng Lai
Abstract:
Digital twins have attracted a great deal of recent attention from a wide range of fields. A basic requirement for digital twins of nonlinear dynamical systems is the ability to generate the system evolution and predict potentially catastrophic emergent behaviors so as to provide early warnings. The digital twin can then be used for system "health" monitoring in real time and for predictive problem solving. In particular, if the digital twin forecasts a possible system collapse in the future due to parameter drifting as caused by environmental changes or perturbations, an optimal control strategy can be devised and executed as early intervention to prevent the collapse. Two approaches exist for constructing digital twins of nonlinear dynamical systems: sparse optimization and machine learning. The basics of these two approaches are described and their advantages and caveats are discussed.
Submitted 20 September, 2023;
originally announced September 2023.
-
Oscillatory networks: Insights from piecewise-linear modeling
Authors:
Stephen Coombes,
Mustafa Sayli,
Rüdiger Thul,
Rachel Nicks,
Mason A Porter,
Yi Ming Lai
Abstract:
There is enormous interest -- both mathematically and in diverse applications -- in understanding the dynamics of coupled oscillator networks. The real-world motivation of such networks arises from studies of the brain, the heart, ecology, and more. It is common to describe the rich emergent behavior in these systems in terms of complex patterns of network activity that reflect both the connectivity and the nonlinear dynamics of the network components. Such behavior is often organized around phase-locked periodic states and their instabilities. However, the explicit calculation of periodic orbits in nonlinear systems (even in low dimensions) is notoriously hard, so network-level insights often require the numerical construction of some underlying periodic component. In this paper, we review powerful techniques for studying coupled oscillator networks. We discuss phase reductions, phase-amplitude reductions, and the master stability function for smooth dynamical systems. We then focus in particular on the augmentation of these methods to analyze piecewise-linear systems, for which one can readily construct periodic orbits. This yields useful insights into network behavior, but the cost is that one needs to study nonsmooth dynamical systems. The study of nonsmooth systems is well-developed when focusing on the interacting units (i.e., at the node level) of a system, and we give a detailed presentation of how to use \textit{saltation operators}, which can treat the propagation of perturbations through switching manifolds, to understand dynamics and bifurcations at the network level. We illustrate this merger of tools and techniques from network science and nonsmooth dynamical systems with applications to neural systems, cardiac systems, networks of electro-mechanical oscillators, and cooperation in cattle herds.
Submitted 18 August, 2023;
originally announced August 2023.
-
Convergence Analysis of the Deep Galerkin Method for Weak Solutions
Authors:
Yuling Jiao,
Yanming Lai,
Yang Wang,
Haizhao Yang,
Yunfei Yang
Abstract:
This paper analyzes the convergence rate of a deep Galerkin method for the weak solution (DGMW) of second-order elliptic partial differential equations on $\mathbb{R}^d$ with Dirichlet, Neumann, and Robin boundary conditions, respectively. In DGMW, a deep neural network is applied to parametrize the PDE solution, and a second neural network is adopted to parametrize the test function in the traditional Galerkin formulation. By properly choosing the depth and width of these two networks in terms of the number of training samples $n$, it is shown that the convergence rate of DGMW is $\mathcal{O}(n^{-1/d})$, which is the first convergence result for weak solutions. The main idea of the proof is to divide the error of the DGMW into an approximation error and a statistical error. We derive an upper bound on the approximation error in the $H^{1}$ norm and bound the statistical error via Rademacher complexity.
Submitted 5 February, 2023;
originally announced February 2023.
-
Emergence of a stochastic resonance in machine learning
Authors:
Zheng-Meng Zhai,
Ling-Wei Kong,
Ying-Cheng Lai
Abstract:
Can noise be beneficial to machine-learning prediction of chaotic systems? Utilizing reservoir computers as a paradigm, we find that injecting noise to the training data can induce a stochastic resonance with significant benefits to both short-term prediction of the state variables and long-term prediction of the attractor of the system. A key to inducing the stochastic resonance is to include the amplitude of the noise in the set of hyperparameters for optimization. By so doing, the prediction accuracy, stability and horizon can be dramatically improved. The stochastic resonance phenomenon is demonstrated using two prototypical high-dimensional chaotic systems.
Submitted 15 November, 2022;
originally announced November 2022.
-
3D flying wings for any asymptotic cones
Authors:
Yi Lai
Abstract:
For every $θ\in(0,π)$, we construct a 3D steady gradient Ricci soliton whose asymptotic cone is a sector with angle $θ$, which is called a 3D flying wing.
Submitted 6 July, 2022;
originally announced July 2022.
-
O(2)-symmetry of 3D steady gradient Ricci solitons
Authors:
Yi Lai
Abstract:
For any 3D steady gradient Ricci soliton with positive curvature, we prove that it must be isometric to the Bryant soliton if it is asymptotic to a ray. Otherwise, it is asymptotic to a sector and hence a flying wing. We show that all 3D flying wings are O(2)-symmetric. Therefore, all 3D steady gradient Ricci solitons are O(2)-symmetric.
Submitted 18 July, 2023; v1 submitted 2 May, 2022;
originally announced May 2022.
-
Predicting extreme events from data using deep machine learning: when and where
Authors:
Junjie Jiang,
Zi-Gang Huang,
Celso Grebogi,
Ying-Cheng Lai
Abstract:
We develop a deep convolutional neural network (DCNN) based framework for model-free prediction of the occurrence of extreme events both in time ("when") and in space ("where") in nonlinear physical systems of spatial dimension two. The measurements or data are a set of two-dimensional snapshots or images. For a desired time horizon of prediction, a proper labeling scheme can be designated to enable successful training of the DCNN and subsequent prediction of extreme events in time. Given that an extreme event has been predicted to occur within the time horizon, a space-based labeling scheme can be applied to predict, within certain resolution, the location at which the event will occur. We use synthetic data from the 2D complex Ginzburg-Landau equation and empirical wind speed data of the North Atlantic ocean to demonstrate and validate our machine-learning based prediction framework. The trade-offs among the prediction horizon, spatial resolution, and accuracy are illustrated, and the detrimental effect of spatially biased occurrence of extreme event on prediction accuracy is discussed. The deep learning framework is viable for predicting extreme events in the real world.
Submitted 31 March, 2022;
originally announced March 2022.
-
Continuity scaling: A rigorous framework for detecting and quantifying causality accurately
Authors:
Xiong Ying,
Si-Yang Leng,
Huan-Fei Ma,
Qing Nie,
Ying-Cheng Lai,
Wei Lin
Abstract:
Data based detection and quantification of causation in complex, nonlinear dynamical systems is of paramount importance to science, engineering and beyond. Inspired by the widely used methodology in recent years, the cross-map-based techniques, we develop a general framework to advance towards a comprehensive understanding of dynamical causal mechanisms, which is consistent with the natural interpretation of causality. In particular, instead of measuring the smoothness of the cross map as conventionally implemented, we define causation through measuring the {\it scaling law} for the continuity of the investigated dynamical system directly. The uncovered scaling law enables accurate, reliable, and efficient detection of causation and assessment of its strength in general complex dynamical systems, outperforming those existing representative methods. The continuity scaling based framework is rigorously established and demonstrated using datasets from model complex systems and the real world.
Submitted 26 March, 2022;
originally announced March 2022.
-
Analysis of Deep Ritz Methods for Laplace Equations with Dirichlet Boundary Conditions
Authors:
Chenguang Duan,
Yuling Jiao,
Yanming Lai,
Xiliang Lu,
Qimeng Quan,
Jerry Zhijian Yang
Abstract:
Deep Ritz methods (DRM) have been proven numerically to be efficient in solving partial differential equations. In this paper, we present a convergence rate in $H^{1}$ norm for deep Ritz methods for Laplace equations with Dirichlet boundary condition, where the error depends on the depth and width in the deep neural networks and the number of samples explicitly. Further we can properly choose the depth and width in the deep neural networks in terms of the number of training samples. The main idea of the proof is to decompose the total error of DRM into three parts, that is approximation error, statistical error and the error caused by the boundary penalty. We bound the approximation error in $H^{1}$ norm with $\mathrm{ReLU}^{2}$ networks and control the statistical error via Rademacher complexity. In particular, we derive the bound on the Rademacher complexity of the non-Lipschitz composition of gradient norm with $\mathrm{ReLU}^{2}$ network, which is of immense independent interest. We also analyze the error induced by the boundary penalty method and give an a priori rule for tuning the penalty parameter.
Submitted 3 November, 2021;
originally announced November 2021.
-
A rate of convergence of Physics Informed Neural Networks for the linear second order elliptic PDEs
Authors:
Yuling Jiao,
Yanming Lai,
Dingwei Li,
Xiliang Lu,
Fengru Wang,
Yang Wang,
Jerry Zhijian Yang
Abstract:
In recent years, physics-informed neural networks (PINNs) have been shown to be a powerful tool for solving PDEs empirically. However, numerical analysis of PINNs is still missing. In this paper, we prove the convergence rate of PINNs for the second order elliptic equations with Dirichlet boundary condition, by establishing the upper bounds on the number of training samples, depth and width of the deep neural networks to achieve desired accuracy. The error of PINNs is decomposed into approximation error and statistical error, where the approximation error is given in $C^2$ norm with $\mathrm{ReLU}^{3}$ networks (deep network with activation function $\max\{0,x^3\}$) and the statistical error is estimated by Rademacher complexity. We derive the bound on the Rademacher complexity of the non-Lipschitz composition of gradient norm with $\mathrm{ReLU}^{3}$ network, which is of immense independent interest.
Submitted 15 March, 2022; v1 submitted 3 September, 2021;
originally announced September 2021.
-
Error Analysis of Deep Ritz Methods for Elliptic Equations
Authors:
Yuling Jiao,
Yanming Lai,
Yisu Lo,
Yang Wang,
Yunfei Yang
Abstract:
Using deep neural networks to solve PDEs has attracted a lot of attention recently. However, the understanding of why the deep learning method works is falling far behind its empirical success. In this paper, we provide a rigorous numerical analysis on deep Ritz method (DRM) \cite{Weinan2017The} for second order elliptic equations with Dirichlet, Neumann and Robin boundary conditions, respectively. We establish the first nonasymptotic convergence rate in $H^1$ norm for DRM using deep networks with smooth activation functions including logistic and hyperbolic tangent functions. Our results show how to set the hyper-parameter of depth and width to achieve the desired convergence rate in terms of number of training samples.
Submitted 4 September, 2021; v1 submitted 30 July, 2021;
originally announced July 2021.
-
Convergence Rate Analysis for Deep Ritz Method
Authors:
Chenguang Duan,
Yuling Jiao,
Yanming Lai,
Xiliang Lu,
Zhijian Yang
Abstract:
Using deep neural networks to solve PDEs has attracted a lot of attention recently. However, the understanding of why the deep learning method works is falling far behind its empirical success. In this paper, we provide a rigorous numerical analysis on deep Ritz method (DRM) \cite{wan11} for second order elliptic equations with Neumann boundary conditions. We establish the first nonasymptotic convergence rate in $H^1$ norm for DRM using deep networks with $\mathrm{ReLU}^2$ activation functions. In addition to providing a theoretical justification of DRM, our study also sheds light on how to set the hyper-parameter of depth and width to achieve the desired convergence rate in terms of number of training samples. Technically, we derive bounds on the approximation error of deep $\mathrm{ReLU}^2$ network in $H^1$ norm and on the Rademacher complexity of the non-Lipschitz composition of gradient norm and $\mathrm{ReLU}^2$ network, both of which are of independent interest.
Submitted 29 March, 2021; v1 submitted 24 March, 2021;
originally announced March 2021.
-
Deep Neural Networks with ReLU-Sine-Exponential Activations Break Curse of Dimensionality in Approximation on Hölder Class
Authors:
Yuling Jiao,
Yanming Lai,
Xiliang Lu,
Fengru Wang,
Jerry Zhijian Yang,
Yuanyuan Yang
Abstract:
In this paper, we construct neural networks with ReLU, sine and $2^x$ as activation functions. For general continuous $f$ defined on $[0,1]^d$ with continuity modulus $ω_f(\cdot)$, we construct ReLU-sine-$2^x$ networks that enjoy an approximation rate $\mathcal{O}(ω_f(\sqrt{d})\cdot2^{-M}+ω_{f}\left(\frac{\sqrt{d}}{N}\right))$, where $M,N\in \mathbb{N}^{+}$ denote the hyperparameters related to widths of the networks. As a consequence, we can construct a ReLU-sine-$2^x$ network with depth $5$ and width $\max\left\{\left\lceil2d^{3/2}\left(\frac{3μ}ε\right)^{1/α}\right\rceil,2\left\lceil\log_2\frac{3μd^{α/2}}{2ε}\right\rceil+2\right\}$ that approximates $f\in \mathcal{H}_μ^α([0,1]^d)$ within a given tolerance $ε>0$ measured in $L^p$ norm $p\in[1,\infty)$, where $\mathcal{H}_μ^α([0,1]^d)$ denotes the Hölder continuous function class defined on $[0,1]^d$ with order $α\in (0,1]$ and constant $μ> 0$. Therefore, the ReLU-sine-$2^x$ networks overcome the curse of dimensionality on $\mathcal{H}_μ^α([0,1]^d)$. In addition to their super expressive power, functions implemented by ReLU-sine-$2^x$ networks are (generalized) differentiable, enabling us to apply SGD to train.
Submitted 12 August, 2022; v1 submitted 28 February, 2021;
originally announced March 2021.
-
Finding nonlinear system equations and complex network structures from data: a sparse optimization approach
Authors:
Ying-Cheng Lai
Abstract:
In applications of nonlinear and complex dynamical systems, a common situation is that the system can be measured but its structure and the detailed rules of dynamical evolution are unknown. The inverse problem is to determine the system equations and structure based solely on measured time series. Recently, methods based on sparse optimization have been developed. For example, the principle of exploiting sparse optimization such as compressive sensing to find the equations of nonlinear dynamical systems from data was articulated in 2011 by the Nonlinear Dynamics Group at Arizona State University. This article presents a brief review of the recent progress in this area. The basic idea is to expand the equations governing the dynamical evolution of the system into a power series or a Fourier series of a finite number of terms and then to determine the vector of the expansion coefficients based solely on data through sparse optimization. Examples discussed here include discovering the equations of stationary or nonstationary chaotic systems to enable prediction of dynamical events such as critical transition and system collapse, inferring the full topology of complex networks of dynamical oscillators and social networks hosting evolutionary game dynamics, and identifying partial differential equations for spatiotemporal dynamical systems. Situations where sparse optimization is effective and those in which the method fails are discussed. Comparisons with the traditional method of delay coordinate embedding in nonlinear time series analysis are given and the recent development of model-free, data driven prediction framework based on machine learning is briefly introduced.
Submitted 7 December, 2020;
originally announced December 2020.
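The basic idea summarized in this abstract (expand the governing equation in a finite power series, then recover a sparse coefficient vector from measured data) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the test system is a logistic map, the helper names `simulate_logistic`, `power_library`, and `sparse_fit` are hypothetical, and the sparse solver is sequentially thresholded least squares, which stands in for the compressive-sensing ($\ell_1$) solvers the abstract mentions.

```python
import numpy as np

def simulate_logistic(a=3.9, x0=0.3, n=200):
    """Generate a time series from x[k+1] = a * x[k] * (1 - x[k]) (illustrative system)."""
    x = np.empty(n)
    x[0] = x0
    for k in range(n - 1):
        x[k + 1] = a * x[k] * (1.0 - x[k])
    return x

def power_library(x, max_order=4):
    """Candidate terms 1, x, x^2, ..., x^max_order, one column per term."""
    return np.vander(x, max_order + 1, increasing=True)

def sparse_fit(theta, y, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares (a stand-in for an l1 solver).

    Repeatedly zero out coefficients below `threshold` and refit the rest.
    """
    coef = np.linalg.lstsq(theta, y, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(coef) < threshold
        coef[small] = 0.0
        keep = ~small
        if not keep.any():
            break
        coef[keep] = np.linalg.lstsq(theta[:, keep], y, rcond=None)[0]
    return coef

# Only the measured series is used: regress x[k+1] on powers of x[k].
x = simulate_logistic()
coef = sparse_fit(power_library(x[:-1]), x[1:])
print(np.round(coef, 3))  # coefficients of 1, x, x^2, x^3, x^4
```

With noise-free data the fit should keep only the $x$ and $x^2$ terms, matching the map $x_{k+1} = a x_k - a x_k^2$. The threshold is a tuning parameter; with noisy or short time series, a different threshold or a true $\ell_1$ solver may be needed.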
-
Machine learning prediction of critical transition and system collapse
Authors:
Ling-Wei Kong,
Hua-Wei Fan,
Celso Grebogi,
Ying-Cheng Lai
Abstract:
To predict a critical transition due to parameter drift without relying on a model is an outstanding problem in nonlinear dynamics and applied fields. A closely related problem is to predict whether the system is already in, or will be in, a transient state preceding its collapse. We develop a model-free, machine-learning-based solution to both problems by exploiting reservoir computing to incorporate a parameter input channel. We demonstrate that, when the machine is trained in the normal functioning regime with a chaotic attractor (i.e., before the critical transition), the transition point can be predicted accurately. Remarkably, for a parameter drift through the critical point, the machine with the input parameter channel is able to predict not only that the system will be in a transient state, but also the average transient time before the final collapse.
Submitted 2 December, 2020;
originally announced December 2020.
-
Synchronization within synchronization: transients and intermittency in ecological networks
Authors:
Huawei Fan,
Ling-Wei Kong,
Xingang Wang,
Alan Hastings,
Ying-Cheng Lai
Abstract:
Transients are fundamental to ecological systems with significant implications to management, conservation, and biological control. We uncover a type of transient synchronization behavior in spatial ecological networks whose local dynamics are of the chaotic, predator-prey type. In the parameter regime where there is phase synchronization among all the patches, complete synchronization (i.e., synchronization in both phase and amplitude) can arise in certain pairs of patches as determined by the network symmetry - henceforth the phenomenon of "synchronization within synchronization." Distinct patterns of complete synchronization coexist but, due to intrinsic instability or noise, each pattern is a transient and there is random, intermittent switching among the patterns in the course of time evolution. The probability distribution of the transient time is found to follow an algebraic scaling law with a divergent average transient lifetime. Based on symmetry considerations, we develop a stability analysis to understand these phenomena. The general principle of symmetry can also be exploited to explain previously discovered, counterintuitive synchronization behaviors in ecological networks.
Submitted 20 November, 2020;
originally announced November 2020.
-
A family of 3d steady gradient solitons that are flying wings
Authors:
Yi Lai
Abstract:
We find a family of 3d steady gradient Ricci solitons that are flying wings. This verifies a conjecture by Hamilton. For a 3d flying wing, we show that the scalar curvature does not vanish at infinity. The 3d flying wings are collapsed.
For dimension $n\ge 4$, we find a family of $\mathbb{Z}_2\times O(n-1)$-symmetric but non-rotationally symmetric $n$-dimensional steady gradient solitons with positive curvature operator. We show that these solitons are non-collapsed.
Submitted 29 November, 2020; v1 submitted 14 October, 2020;
originally announced October 2020.
-
Producing 3d Ricci flows with non-negative Ricci curvature via singular Ricci flows
Authors:
Yi Lai
Abstract:
We extend the concept of singular Ricci flow by Kleiner and Lott from 3d compact manifolds to 3d complete manifolds with possibly unbounded curvature. As an application of the generalized singular Ricci flow, we show that for any 3d complete Riemannian manifold with non-negative Ricci curvature, there exists a smooth Ricci flow starting from it.
Submitted 15 June, 2020; v1 submitted 10 April, 2020;
originally announced April 2020.
-
Scaling law of transient lifetime of chimera states under dimension-augmenting perturbations
Authors:
Ling-Wei Kong,
Ying-Cheng Lai
Abstract:
Chimera states arising in the classic Kuramoto system of two-dimensional phase coupled oscillators are transient but they are "long" transients in the sense that the average transient lifetime grows exponentially with the system size. For reasonably large systems, e.g., those consisting of a few hundred oscillators, it is infeasible to numerically calculate or experimentally measure the average lifetime, so the chimera states are practically permanent. We find that small perturbations in the third dimension, which make the system "slightly" three-dimensional, will dramatically reduce the transient lifetime. In particular, under such a perturbation, the practically infinite average transient lifetime will become extremely short, because it scales with the magnitude of the perturbation only logarithmically. Physically, this means that a reduction in the perturbation strength over many orders of magnitude, insofar as it is not zero, would result in only an incremental increase in the lifetime. The uncovered type of fragility of chimera states raises concerns about their observability in physical systems.
Submitted 13 April, 2020; v1 submitted 9 April, 2020;
originally announced April 2020.
-
Asymmetry in interdependence makes a multilayer system more robust against cascading failures
Authors:
Run-Ran Liu,
Chun-Xiao Jia,
Ying-Cheng Lai
Abstract:
Multilayer networked systems are ubiquitous in nature and engineering, and the robustness of these systems against failures is of great interest. A main line of theoretical pursuit has been percolation induced cascading failures, where interdependence between network layers is conveniently and tacitly assumed to be symmetric. In the real world, interdependent interactions are generally asymmetric. To uncover and quantify the impact of asymmetry in interdependence on network robustness, we focus on percolation dynamics in double-layer systems and implement the following failure mechanism: once a node in a network layer fails, the damage it can cause depends not only on its position in the layer but also on the position of its counterpart neighbor in the other layer. We find that the characteristics of the percolation transition depend on the degree of asymmetry, where the striking phenomenon of a switch in the nature of the phase transition from first- to second-order arises. We derive a theory to calculate the percolation transition points in both network layers, as well as the transition switching point, with strong numerical support from synthetic and empirical networks. Not only does our work shed light upon the factors that determine the robustness of multilayer networks against cascading failures, but it also provides a scenario by which the system can be designed or controlled to reach a desirable level of resilience.
Submitted 24 October, 2019;
originally announced October 2019.
-
Irrelevance of linear controllability to nonlinear dynamical networks
Authors:
Junjie Jiang,
Ying-Cheng Lai
Abstract:
There has been tremendous development of linear controllability of complex networks. Real-world systems are fundamentally nonlinear. Is linear controllability relevant to nonlinear dynamical networks? We identify a common trait underlying both types of control: the nodal "importance." For nonlinear and linear control, the importance is determined, respectively, by physical/biological considerations and the probability for a node to be in the minimum driver set. We study empirical mutualistic networks and a gene regulatory network, for which the nonlinear nodal importance can be quantified by the ability of individual nodes to restore the system from the aftermath of a tipping-point transition. We find that the nodal importance ranking for nonlinear and linear control exhibits opposite trends: for the former, large-degree nodes are more important, but for the latter, the importance scale is tilted towards the small-degree nodes, strongly suggesting the irrelevance of linear controllability to these systems. The recent claim of successful application of linear controllability to the C. elegans connectome is examined and discussed.
Submitted 3 September, 2019;
originally announced September 2019.
-
Ricci flow under local almost non-negative curvature conditions
Authors:
Yi Lai
Abstract:
We find a local solution to the Ricci flow equation under a negative lower bound for many known curvature conditions. The flow exists for a uniform amount of time, during which the curvature stays bounded below by a controllable negative number. The curvature conditions we consider include 2-non-negative and weakly $\textnormal{PIC}_1$ cases, for which the results are new. We complete the discussion of the almost preservation problem by Bamler-Cabezas-Rivas-Wilking, and the 2-non-negative case generalizes a result in 3D by Simon-Topping to higher dimensions. As an application, we use the local Ricci flow to smooth a metric space which is the limit of a sequence of manifolds with the almost non-negative curvature conditions, and show that this limit space is bi-Hölder homeomorphic to a smooth manifold.
Submitted 11 June, 2018; v1 submitted 22 April, 2018;
originally announced April 2018.
-
Networks of piecewise linear neural mass models
Authors:
S Coombes,
Y-M Lai,
M Sayli,
R Thul
Abstract:
Neural mass models are ubiquitous in large scale brain modelling. At the node level they are written in terms of a set of ODEs with a nonlinearity that is typically a sigmoidal shape. Using structural data from brain atlases they may be connected into a network to investigate the emergence of functional dynamic states, such as synchrony. With the simple restriction of the classic sigmoidal nonlinearity to a piecewise linear caricature we show that the famous Wilson-Cowan neural mass model can be analysed at both the node and network level. The construction of periodic orbits at the node level is achieved by patching together matrix exponential solutions, and stability is determined using Floquet theory. For networks with interactions described by circulant matrices, we show that the stability of the synchronous state can be determined in terms of a low-dimensional Floquet problem parameterised by the eigenvalues of the interaction matrix. This network Floquet problem is readily solved using linear algebra, to predict the onset of spatio-temporal network patterns arising from a synchronous instability. We consider the case of a discontinuous choice for the node nonlinearity, namely the replacement of the sigmoid by a Heaviside nonlinearity. This gives rise to a continuous-time switching network. At the node level this allows for the existence of unstable sliding periodic orbits, which we construct. The stability of a periodic orbit is now treated with a modification of Floquet theory to treat the evolution of small perturbations through switching manifolds via saltation matrices. At the network level the stability analysis of the synchronous state is considerably more challenging. Here we report on the use of ideas originally developed for the study of Glass networks to treat the stability of periodic network states in neural mass models with discontinuous interactions.
Submitted 25 January, 2018;
originally announced January 2018.
-
Noise-Induced Synchronization, Desynchronization, and Clustering in Globally Coupled Nonidentical Oscillators
Authors:
Yi Ming Lai,
Mason A. Porter
Abstract:
We study ensembles of globally coupled, nonidentical phase oscillators subject to correlated noise, and we identify several important factors that cause noise and coupling to synchronize or desynchronize a system. By introducing noise in various ways, we find a novel estimate for the onset of synchrony of a system in terms of the coupling strength, noise strength, and width of the frequency distribution of its natural oscillations. We also demonstrate that noise alone is sufficient to synchronize nonidentical oscillators. However, this synchrony depends on the first Fourier mode of a phase-sensitivity function, through which we introduce common noise into the system. We show that higher Fourier modes can cause desynchronization due to clustering effects, and that this can reinforce clustering caused by different forms of coupling. Finally, we discuss the effects of noise on an ensemble in which antiferromagnetic coupling causes oscillators to form two clusters in the absence of noise.
Submitted 24 September, 2013; v1 submitted 4 January, 2013;
originally announced January 2013.
-
ProPPA: A Fast Algorithm for $\ell_1$ Minimization and Low-Rank Matrix Completion
Authors:
Ranch Y. Q. Lai,
Pong C. Yuen
Abstract:
We propose a Projected Proximal Point Algorithm (ProPPA) for solving a class of optimization problems. The algorithm iteratively computes the proximal point of the last estimated solution projected into an affine space which itself is parallel to and approaching the feasible set. We provide a convergence analysis that theoretically supports the general algorithm, and then apply it to solve $\ell_1$-minimization problems and the matrix completion problem. These problems arise in many applications including machine learning, image and signal processing. We compare our algorithm with the existing state-of-the-art algorithms. Experimental results on solving these problems show that our algorithm is very efficient and competitive.
Submitted 19 May, 2012; v1 submitted 1 May, 2012;
originally announced May 2012.
-
An Effective Compactness Theorem for Coxeter Groups
Authors:
Yvonne Lai
Abstract:
Through highly non-constructive methods, works by Bestvina, Culler, Feighn, Morgan, Paulin, Rips, Shalen, and Thurston show that if a finitely presented group does not split over a virtually solvable subgroup, then the space of its discrete and faithful actions on hyperbolic n-space, modulo conjugation, is compact for all dimensions. Although this implies that the space of hyperbolic structures of such groups has finite diameter, the known methods do not give an explicit bound. We establish such a bound for Coxeter groups. We find that either the group splits over a virtually solvable subgroup or there is a constant C and a point in hyperbolic n-space that is moved no more than C by any generator. The constant C depends only on the number of generators of the group, and is independent of the relators.
Submitted 16 February, 2009;
originally announced February 2009.
-
Improving Coverage Accuracy of Block Bootstrap Confidence Intervals
Authors:
Stephen M. S. Lee,
P. Y. Lai
Abstract:
The block bootstrap confidence interval based on dependent data can outperform the computationally more convenient normal approximation only with non-trivial Studentization which, in the case of complicated statistics, calls for highly specialist treatment. We propose two different approaches to improving the accuracy of the block bootstrap confidence interval under very general conditions. The first calibrates the coverage level by iterating the block bootstrap. The second calculates Studentizing factors directly from block bootstrap series and requires no non-trivial analytic treatment. Both approaches involve two nested levels of block bootstrap resampling and yield high-order accuracy with simple tuning of block lengths at the two resampling levels. A simulation study is reported to provide empirical support for our theory.
Submitted 28 April, 2008;
originally announced April 2008.