
Fast Iterative Solver for Neural Network Method:
II. 1D diffusion-reaction problems and data fitting

Funding: This work was supported in part by the National Science Foundation under grant DMS-2110571. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344 (LLNL-JRNL-865920).

Zhiqiang Cai (Department of Mathematics, Purdue University, West Lafayette, IN), Anastassia Doktorova (Department of Mathematics, Purdue University, West Lafayette, IN), Robert D. Falgout (Lawrence Livermore National Laboratory, Livermore, CA), César Herrera (Department of Mathematics, Purdue University, West Lafayette, IN)

Abstract

This paper extends the damped block Newton (dBN) method, introduced recently in [4], to 1D diffusion-reaction equations and least-squares data fitting problems. To determine the linear parameters (the weights and bias of the output layer) of the neural network (NN), the dBN method requires solving systems of linear equations involving the mass matrix. While the mass matrix for local hat basis functions is tri-diagonal and well-conditioned, the mass matrix for NNs is dense and ill-conditioned. For example, the condition number of the NN mass matrix for quasi-uniform meshes is at least ${\cal O}(n^{4})$. We present a factorization of the mass matrix that enables solving the systems of linear equations in ${\cal O}(n)$ operations. To determine the non-linear parameters (the weights and bias of the hidden layer), one step of a damped Newton method is employed at each iteration. A Gauss-Newton method is used in place of Newton for the instances in which the Hessian matrices are singular. This modified dBN is referred to as dBGN. For both methods, the computational cost per iteration is ${\cal O}(n)$. Numerical results demonstrate the ability of dBN and dBGN to efficiently achieve accurate results and outperform BFGS for select examples.

keywords:
Fast iterative solvers, Neural network, Ritz formulation, ReLU activation, Diffusion-Reaction problems, Data fitting, Newton’s method, Gauss-Newton’s method

1 Introduction

Using neural networks to solve partial differential equations (PDEs) has recently gained traction in the iterative solvers community (see, e.g., [1, 2, 6, 7, 11, 12]). In particular, the damped block Newton (dBN) method presented in [4] is a fast iterative solver for 1D diffusion problems. The discretization from the Ritz formulation of the one-dimensional diffusion equation introduces a high-dimensional, non-convex minimization problem. The dBN method numerically solves this problem using the block Gauss-Seidel method for the linear and non-linear parameters as an outer iteration. For the inner iteration, the corresponding coefficient and Hessian matrices are inverted exactly. The computational cost of the dBN method is $\mathcal{O}(n)$ per iteration, which is an improvement over $\mathcal{O}(n^{2})$ for common second-order methods. This paper extends the methods in [4] to a broader class of problems, while maintaining the efficiency achieved in [4].

For elliptic PDEs beyond diffusion problems, as well as data fitting problems, the mass matrix must be inverted to solve for the linear parameter. Just as for the coefficient matrix in [4], the mass matrix depends on the non-linear parameter. However, the mass matrix is dense and much more ill-conditioned than the coefficient matrix. Whereas the coefficient matrix has condition number bounded by $\mathcal{O}(nh_{\text{min}}^{-1})$ and has a tri-diagonal inverse [4], the mass matrix has condition number bounded by $\mathcal{O}(nh_{\text{min}}^{-3})$ (see Lemma 2.3). Here, $n$ is the number of neurons and $h_{\text{min}}$ is the smallest distance between two neighboring breakpoints. This is completely different from the finite element method, in which the mass matrix is tri-diagonal and the condition number is $\mathcal{O}(1)$ for local hat basis functions on quasi-uniform meshes. Yet solving the linear systems efficiently is still possible; two representations of the mass matrix in terms of simpler matrices are presented in Section 2. Both representations make the inversion less computationally expensive.

The non-linear parameters for this broader class of problems present further challenges. Unlike in diffusion problems, the Hessian matrices for both diffusion-reaction and non-linear least squares problems are no longer diagonal and depend on the coefficient matrix. However, a factorization is used to compute the inverse of the Hessian efficiently, utilizing the explicit formula for the inverse of the coefficient matrix from [4]. Furthermore, for the cases in which the Hessian matrices are non-invertible, a damped block Gauss-Newton (dBGN) method is presented. The Gauss-Newton matrix is positive-definite, and its inverse is tri-diagonal. Whether using dBN or dBGN, the computational cost per iteration remains $\mathcal{O}(n)$, as in [4]. Even faster convergence for the non-linear parameter is possible for diffusion-reaction problems when adding adaptive neuron enhancement (ANE) [10]. Numerical examples demonstrate the ability of the aforementioned methods to move the breakpoints quickly and efficiently and to outperform BFGS for select examples.

The paper is structured as follows. Section 2 introduces the notation for shallow neural networks and the corresponding mass matrix. The condition numbers of the neural network and finite element mass matrices are presented and compared, followed by a discussion of two ways to decompose the mass matrix so that it can be inverted more efficiently. The problems in which the mass matrix arises are presented in Section 3.1 and Section 3.2: the non-linear least-squares optimization problem using shallow neural networks in Section 3.1, and the diffusion-reaction equation with the modified Ritz formulation in Section 3.2. Next, the dBN method is revisited in Section 4, emphasizing the modifications to the dBN method of [4] that are needed for the broader class of problems presented in this paper. For cases in which the Hessian for the non-linear parameter is non-invertible, the dBGN method is outlined. This is followed by Section 4.1, in which we recall the adaptivity scheme (AdBN) from [4], which can also be used for diffusion-reaction problems. Lastly, numerical results are presented in Section 5, demonstrating the performance of the aforementioned methods, as compared to BFGS, for select example problems. The examples in Section 5 highlight the ability of these methods to move mesh points to enhance the approximation. In particular, the results in Section 5.3 demonstrate the ability of dBN to solve the singularly perturbed reaction-diffusion equation.

2 Mass Matrix for Shallow Neural Network

This section studies the mass matrix resulting from a shallow ReLU neural network and the computation of its inverse.

As in [4], the set of approximating functions generated by the shallow ReLU neural network with $n$ neurons is denoted by

\[
{\cal M}_{n}(\Omega) = \left\{c_{0}+\sum_{i=1}^{n}c_{i}\sigma(x-b_{i})\,:\,c_{i}\in\mathbb{R},\,0=b_{0}\leq b_{1}<\cdots<b_{n}<b_{n+1}=1\right\},
\]

where $\Omega=(0,1)$ and $\sigma(t)=\max\{0,t\}$ is the ReLU activation function. Let $r(x)\in L^{\infty}(\Omega)$ be a real-valued function defined on $\Omega$ and bounded below by a positive constant $r_{0}>0$ almost everywhere.
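As a small illustration of the function class ${\cal M}_{n}(\Omega)$, the following Python sketch evaluates $c_{0}+\sum_{i}c_{i}\sigma(x-b_{i})$ at a few points. The helper name `shallow_relu` and the sample values of `b` and `c` are hypothetical choices made only for this example; they are not taken from the paper.

```python
import numpy as np

def relu(t):
    """ReLU activation sigma(t) = max{0, t}."""
    return np.maximum(0.0, t)

def shallow_relu(x, c0, c, b):
    """Evaluate c0 + sum_i c[i] * relu(x - b[i]) at the points x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # Row k of the matrix below holds relu(x_k - b_i) for all neurons i.
    return c0 + relu(x[:, None] - b[None, :]) @ c

# Hypothetical breakpoints 0 = b_1 < b_2 < b_3 < b_4 < 1 and output weights.
b = np.array([0.0, 0.25, 0.5, 0.75])
c = np.array([1.0, -2.0, 2.0, -2.0])

# The result is continuous and piecewise linear with kinks at the breakpoints.
vals = shallow_relu([0.0, 0.25, 0.5, 1.0], 0.0, c, b)
print(vals)
```

Each neuron changes the slope of the function only at its own breakpoint $b_{i}$, which is why the breakpoints play the role of mesh points in the rest of the discussion.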

Consider the mass matrix associated with the weight function $r$, given by

\begin{equation}\tag{1}
M_{r}({\bf b})=\big(m_{ij}\big)_{n\times n}\quad\mbox{with }\,m_{ij}=\int_{0}^{1}r(x)\sigma(x-b_{i})\sigma(x-b_{j})\,dx,
\end{equation}

and the coefficient matrix associated with $r$, given by

\[
A_{r}({\bf b})=\big(a_{ij}\big)_{n\times n}\quad\mbox{with }\,a_{ij}=\int_{0}^{1}r(x)H(x-b_{i})H(x-b_{j})\,dx,
\]

for $i,j=1,\ldots,n$, where $H(t)=\sigma^{\prime}(t)$ is the Heaviside (unit) step function and ${\bf b}=\left(b_{1},\ldots,b_{n}\right)^{T}$ is the non-linear parameter.
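For the special case $r(x)=1$, both matrices have simple closed-form entries, which the following Python sketch assembles. The formulas in the comments follow from integrating the definitions directly with $r=1$; the function names `mass_matrix` and `coefficient_matrix` are hypothetical, and a general weight $r$ would need quadrature instead.

```python
import numpy as np

def coefficient_matrix(b):
    """A_r(b) for r = 1: a_ij = int_0^1 H(x-b_i) H(x-b_j) dx = 1 - max(b_i, b_j)."""
    return 1.0 - np.maximum.outer(b, b)

def mass_matrix(b):
    """M_r(b) for r = 1.

    With lo = min(b_i, b_j), hi = max(b_i, b_j), L = 1 - hi and d = hi - lo,
    m_ij = int_hi^1 (x - lo)(x - hi) dx = L^3 / 3 + d * L^2 / 2.
    """
    lo = np.minimum.outer(b, b)
    hi = np.maximum.outer(b, b)
    L = 1.0 - hi
    return L**3 / 3.0 + (hi - lo) * L**2 / 2.0

# Hypothetical breakpoints used only for illustration.
b = np.array([0.0, 0.25, 0.5, 0.75])
M = mass_matrix(b)
A = coefficient_matrix(b)
print("M dense:", np.all(M > 0), " A dense:", np.all(A > 0))
```

Both matrices are dense because every neuron $\sigma(x-b_{i})$ is supported on all of $[b_{i},1]$, in contrast to local hat basis functions.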

While the coefficient matrix $A_{r}({\bf b})$ is dense, its inverse is a tri-diagonal matrix with an explicit algebraic formula (see [4]). This property holds for a class of matrices with a special structure.

Lemma 2.1.

For $\left\{\alpha_{i}\right\}_{i=1}^{k},\,\left\{\beta_{i}\right\}_{i=1}^{k}\subset\mathbb{R}$, assume that

\[
\alpha_{1}\neq 0,\quad\beta_{k}\neq 0,\quad\mbox{and}\quad\alpha_{i+1}\beta_{i}-\alpha_{i}\beta_{i+1}\neq 0
\]

for all $i=1,\dots,k-1$. Then the matrix

\begin{equation}\tag{2}
{\cal M}=\begin{pmatrix}
\alpha_{1}\beta_{1}&\alpha_{1}\beta_{2}&\alpha_{1}\beta_{3}&\ldots&\alpha_{1}\beta_{k}\\
\alpha_{1}\beta_{2}&\alpha_{2}\beta_{2}&\alpha_{2}\beta_{3}&\ldots&\alpha_{2}\beta_{k}\\
\alpha_{1}\beta_{3}&\alpha_{2}\beta_{3}&\alpha_{3}\beta_{3}&\ldots&\alpha_{3}\beta_{k}\\
\vdots&\vdots&\vdots&\ddots&\vdots\\
\alpha_{1}\beta_{k}&\alpha_{2}\beta_{k}&\alpha_{3}\beta_{k}&\ldots&\alpha_{k}\beta_{k}
\end{pmatrix}
\end{equation}

is invertible. Moreover, its inverse is symmetric and tri-diagonal with non-zero entries given by

\[
{\cal M}^{-1}_{ii}=\frac{\alpha_{i+1}\beta_{i-1}-\alpha_{i-1}\beta_{i+1}}{(\alpha_{i}\beta_{i-1}-\alpha_{i-1}\beta_{i})(\alpha_{i+1}\beta_{i}-\alpha_{i}\beta_{i+1})}\quad\mbox{and}\quad{\cal M}^{-1}_{i,i+1}={\cal M}^{-1}_{i+1,i}=\frac{-1}{\alpha_{i+1}\beta_{i}-\alpha_{i}\beta_{i+1}},
\]

where $\alpha_{0}=\beta_{k+1}=0$ and $\alpha_{k+1}=\beta_{0}=1$.

Proof 2.2.

It is easy to verify that ${\cal M}{\cal M}^{-1}=I$.

The coefficient matrix $A_{r}({\bf b})$ has the same structure as ${\cal M}$ with $k=n$, $\alpha_{i}=1$, and $\beta_{i}=\int_{b_{i}}^{1}r(x)\,dx$. The mass matrix $M_{r}({\bf b})$, in contrast, remains dense due to the global support of the neurons, and its condition number is very large (see Section 2.1). This section derives formulas for the inverse of the mass matrix whose application requires $\mathcal{O}(n)$ operations. The derivation is given both algebraically in Section 2.2 and geometrically in Section 2.3.
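The tri-diagonal inverse formula of Lemma 2.1 can be checked numerically. The Python sketch below builds the structured matrix (2) from given sequences $\alpha_{i}$, $\beta_{i}$, assembles the claimed inverse entry by entry, and applies it to the coefficient matrix with $r(x)=1$, for which $\alpha_{i}=1$ and $\beta_{i}=1-b_{i}$. The function names and the sample breakpoints are hypothetical and serve only as an illustration.

```python
import numpy as np

def structured_matrix(alpha, beta):
    """Matrix (2): entry (i, j) equals alpha_{min(i,j)} * beta_{max(i,j)}."""
    k = len(alpha)
    i, j = np.indices((k, k))
    return alpha[np.minimum(i, j)] * beta[np.maximum(i, j)]

def tridiagonal_inverse(alpha, beta):
    """Tri-diagonal inverse from Lemma 2.1 (1-based indices via padding)."""
    k = len(alpha)
    a = np.concatenate(([0.0], alpha, [1.0]))   # alpha_0 = 0, alpha_{k+1} = 1
    bt = np.concatenate(([1.0], beta, [0.0]))   # beta_0 = 1, beta_{k+1} = 0
    inv = np.zeros((k, k))
    for i in range(1, k + 1):
        left = a[i] * bt[i - 1] - a[i - 1] * bt[i]
        right = a[i + 1] * bt[i] - a[i] * bt[i + 1]
        inv[i - 1, i - 1] = (a[i + 1] * bt[i - 1] - a[i - 1] * bt[i + 1]) / (left * right)
        if i < k:
            inv[i - 1, i] = inv[i, i - 1] = -1.0 / right
    return inv

# Coefficient matrix with r = 1: alpha_i = 1 and beta_i = 1 - b_i.
b = np.array([0.1, 0.3, 0.6, 0.8])              # hypothetical breakpoints
alpha, beta = np.ones_like(b), 1.0 - b
A = structured_matrix(alpha, beta)
A_inv = tridiagonal_inverse(alpha, beta)
print(np.allclose(A @ A_inv, np.eye(len(b))))
```

Because the inverse is tri-diagonal and its entries are explicit, applying it to a vector costs $\mathcal{O}(n)$ operations without ever forming the dense matrix.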

2.1 Condition Number

Let $h_{i}=b_{i+1}-b_{i}$ for $i=0,\ldots,n$, and set

\[
h_{\text{max}}=\max_{1\leq i\leq n}h_{i}\quad\mbox{and}\quad h_{\text{min}}=\min_{1\leq i\leq n}h_{i}.
\]

It was shown in [4] that the condition number of $A_{r}({\bf b})$ is bounded by $\mathcal{O}(nh_{\text{min}}^{-1})$ for $r(x)=1$. The next lemma provides an upper bound for the condition number of the mass matrix.

Lemma 2.3.

Let $r(x)=1$. Then the condition number of the mass matrix $M_{r}({\bf b})$ is bounded by $\mathcal{O}\left(n/h_{\text{min}}^{3}\right)$.
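To get a feel for the size of this bound, the following Python sketch computes the extreme eigenvalues of the mass matrix with $r(x)=1$ on uniform breakpoints and prints the condition number next to the scale $n/h_{\text{min}}^{3}=n^{4}$. The uniform breakpoint choice $b_{i}=(i-1)/n$, the closed-form entries, and the helper names are assumptions made for this illustration; the sketch is an experiment, not a proof, and does not identify the constant hidden in the $\mathcal{O}$ notation.

```python
import numpy as np

def mass_matrix(b):
    """Mass matrix for r = 1: m_ij = L^3/3 + d*L^2/2 with L = 1 - max, d = max - min."""
    lo = np.minimum.outer(b, b)
    hi = np.maximum.outer(b, b)
    L = 1.0 - hi
    return L**3 / 3.0 + (hi - lo) * L**2 / 2.0

def report(n):
    """Return (largest eigenvalue, condition number, n / h_min^3) for uniform breakpoints."""
    b = np.arange(n) / n                 # b_1 = 0, ..., b_n = (n-1)/n, so h_i = 1/n
    eigs = np.linalg.eigvalsh(mass_matrix(b))   # ascending eigenvalues of the SPD matrix
    h_min = 1.0 / n
    return eigs[-1], eigs[-1] / eigs[0], n / h_min**3

for n in (8, 16, 32):
    lam_max, cond, scale = report(n)
    print(f"n={n:3d}  lambda_max={lam_max:.3e}  cond={cond:.3e}  n/h_min^3={scale:.3e}")
```

The largest eigenvalue stays below $n/3$, while the rapid growth of the condition number with $n$ illustrates why a direct dense solve with $M_{r}({\bf b})$ is unattractive.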

Proof 2.4.

For any vector $\boldsymbol{\xi}=(\xi_{1},\ldots,\xi_{n})^{T}\in\mathbb{R}^{n}$, denote its magnitude by $\big|\boldsymbol{\xi}\big|=\left(\sum\limits_{i=1}^{n}\xi^{2}_{i}\right)^{1/2}$. By the Cauchy-Schwarz inequality and the fact that $\sigma(x-b_{j})=0$ for $x\leq b_{j}$, we have

\begin{equation}\tag{3}
\begin{split}
\boldsymbol{\xi}^{T}M_{r}({\bf b})\boldsymbol{\xi}&=\int_{0}^{1}\left(\sum_{i=1}^{n}\xi_{i}\sigma(x-b_{i})\right)^{2}\,dx\leq|\boldsymbol{\xi}|^{2}\int_{0}^{1}\left(\sum_{i=1}^{n}\sigma(x-b_{i})^{2}\right)\,dx\\
&=|\boldsymbol{\xi}|^{2}\sum_{j=1}^{n}\int_{b_{j}}^{b_{j+1}}\sum_{i=1}^{j}\sigma(x-b_{i})^{2}\,dx=\frac{|\boldsymbol{\xi}|^{2}}{3}\sum_{j=1}^{n}\sum_{i=1}^{j}\left\{(b_{j+1}-b_{i})^{3}-(b_{j}-b_{i})^{3}\right\}\\
&=\frac{|\boldsymbol{\xi}|^{2}}{3}\sum_{i=1}^{n}(b_{n+1}-b_{i})^{3}=\frac{|\boldsymbol{\xi}|^{2}}{3}\sum_{i=1}^{n}(1-b_{i})^{3}\leq\frac{n}{3}\,|\boldsymbol{\xi}|^{2}.
\end{split}
\end{equation}

To estimate the lower bound of $\boldsymbol{\xi}^{T}M_{r}({\bf b})\boldsymbol{\xi}$, let

\[
\tau_{i}(x)=\sum_{j=1}^{i}\xi_{j}\sigma(x-b_{j})\quad\mbox{and}\quad a_{i-1}=\tau_{i}(b_{i})=\sum_{j=1}^{i}\xi_{j}(b_{i}-b_{j}),
\]

for $i=1,\dots,n+1$. Then $\tau_{i}\left(\frac{b_{i+1}+b_{i}}{2}\right)=\dfrac{a_{i-1}+a_{i}}{2}$. Since $\tau_{i}^{2}(x)$ is a quadratic function in each sub-interval $[b_{i},b_{i+1}]$, Simpson's Rule implies

\[
\begin{split}
\boldsymbol{\xi}^{T}M_{r}({\bf b})\boldsymbol{\xi}&=\sum_{i=1}^{n}\int_{b_{i}}^{b_{i+1}}\tau^{2}_{i}(x)\,dx=\frac{1}{6}\sum_{i=1}^{n}h_{i}\left[\tau_{i}^{2}(b_{i})+4\tau_{i}^{2}\left(\frac{b_{i+1}+b_{i}}{2}\right)+\tau_{i}^{2}(b_{i+1})\right]\\
&=\frac{1}{6}\sum_{i=1}^{n}h_{i}\left[a_{i-1}^{2}+(a_{i-1}+a_{i})^{2}+a_{i}^{2}\right]\geq\frac{1}{6}h_{\text{min}}\lvert{\bf a}\rvert^{2},
\end{split}
\]

where ${\bf a}=(a_{1},\ldots,a_{n})^{T}$. It is easy to see that $\boldsymbol{\xi}=Q{\bf a}$, where $Q$ is an $n$-order lower tri-diagonal matrix given by

\[
Q=\begin{pmatrix}
\frac{1}{h_{1}} & 0 & 0 & \dots & 0 & 0\\
-\left(\frac{1}{h_{1}}+\frac{1}{h_{2}}\right) & \frac{1}{h_{2}} & 0 & \dots & 0 & 0\\
\frac{1}{h_{2}} & -\left(\frac{1}{h_{2}}+\frac{1}{h_{3}}\right) & \frac{1}{h_{3}} & \dots & 0 & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & 0 & \ldots & \frac{1}{h_{n-1}} & 0\\
0 & 0 & 0 & \ldots & -\left(\frac{1}{h_{n-1}}+\frac{1}{h_{n}}\right) & \frac{1}{h_{n}}
\end{pmatrix}.
\]

It is easy to verify that $Q$ has spectral norm bounded by

\[
\lVert Q\rVert_{2}\leq\sqrt{\lVert Q\rVert_{1}\lVert Q\rVert_{\infty}}\leq 4h_{min}^{-1}.
\]

Hence,

\[
\boldsymbol{\xi}^{T}M_{r}({\bf b})\boldsymbol{\xi}\geq\frac{1}{6}h_{min}\frac{\lvert\boldsymbol{\xi}\rvert^{2}}{\lVert Q\rVert_{2}^{2}}\geq\frac{1}{96}h^{3}_{min}\lvert\boldsymbol{\xi}\rvert^{2},
\]

which, together with the upper bound in Eq. 3, implies the validity of the lemma.
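The lower bound can be checked numerically. The following is a minimal sketch, assuming the constant weight $r\equiv 1$, breakpoints $0\le b_1<\dots<b_n<b_{n+1}=1$, and $h_i=b_{i+1}-b_i$; the function names and the sample breakpoints are illustrative and not taken from the paper. It assembles the ReLU mass matrix from the closed form of $\int_{b_j}^{1}(x-b_i)(x-b_j)\,dx$ and compares its smallest eigenvalue with $h_{min}^3/96$, and the spectral norm of $Q$ with $4h_{min}^{-1}$.

```python
import numpy as np

def relu_mass_matrix(b):
    """Mass matrix of sigma(x - b_i) on [0, 1] for r = 1 (assumed weight)."""
    L = 1.0 - np.asarray(b, dtype=float)      # L_k = 1 - b_k
    Lmin = np.minimum.outer(L, L)             # length of the common support
    Lmax = np.maximum.outer(L, L)
    # int_{max(b_i,b_j)}^1 (x-b_i)(x-b_j) dx = Lmin^3/3 + (Lmax-Lmin)*Lmin^2/2
    return Lmin**3 / 3.0 + (Lmax - Lmin) * Lmin**2 / 2.0

def second_difference_matrix(h):
    """Lower tri-diagonal Q with xi = Q a, for h = (h_1, ..., h_n)."""
    n = len(h)
    Q = np.zeros((n, n))
    for i in range(n):
        Q[i, i] = 1.0 / h[i]
        if i >= 1:
            Q[i, i - 1] = -(1.0 / h[i - 1] + 1.0 / h[i])
        if i >= 2:
            Q[i, i - 2] = 1.0 / h[i - 1]
    return Q

b = np.array([0.05, 0.2, 0.3, 0.55, 0.7, 0.9])   # hypothetical breakpoints
h = np.diff(np.append(b, 1.0))                    # h_i = b_{i+1} - b_i, b_{n+1} = 1
M = relu_mass_matrix(b)
Q = second_difference_matrix(h)

lam_min = np.linalg.eigvalsh(M)[0]                # eigenvalues in ascending order
print("lambda_min  :", lam_min)
print("h_min^3 / 96:", h.min()**3 / 96.0)
print("||Q||_2     :", np.linalg.norm(Q, 2))
print("4 / h_min   :", 4.0 / h.min())
```

The dense assembly is only meant for small $n$; it illustrates the estimate rather than the fast solver.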

Lemma 2.5.

Under the assumption on the weight function $r$, the condition number of the mass matrix $M_{r}({\bf b})$ is bounded by $\mathcal{O}\left(n\,r^{-1}_{0}h_{min}^{-3}\right)$.

Proof 2.6.

Since $r\in L^{\infty}(I)$ and $r(x)\geq r_{0}$ almost everywhere, in a similar fashion as the proof of Lemma 2.3, we have

\[
\frac{1}{96}r_{0}h_{min}^{3}\lvert\boldsymbol{\xi}\rvert^{2}\leq\boldsymbol{\xi}^{T}M_{r}({\bf b})\boldsymbol{\xi}\leq C\,n\lvert\boldsymbol{\xi}\rvert^{2},
\]

which implies the validity of the lemma.

Whereas the mass matrix associated with the ReLU neural network is very ill-conditioned, it is well known that the mass matrix for the finite element (FE) method is much better conditioned (see, for example, [8]). Lemma 2.7 below restates the result in [8], with an alternative proof that follows the argument used for Lemma 2.3.

Assume that $b_{0}<b_{1}$, and set

\[
\tilde{h}_{max}=\max_{0\leq i\leq n}h_{i}\quad\mbox{and}\quad\tilde{h}_{min}=\min_{0\leq i\leq n}h_{i}.
\]

For the partition $\{b_{i}\}_{i=0}^{n+1}$, denote the hat basis functions for $i=1,\dots,n$ by

\[
\varphi_{i}(x)=\left\{\begin{array}{ll}
(x-b_{i-1})/h_{i}, & x\in(b_{i-1},b_{i}),\\[2mm]
(b_{i+1}-x)/h_{i+1}, & x\in(b_{i},b_{i+1}),\\[2mm]
0, & \text{otherwise}.
\end{array}\right.
\]

Next let $\boldsymbol{\varphi}=\left(\varphi_{1},\dots,\varphi_{n}\right)^{T}$. Then the corresponding FE mass matrix for this partition is denoted by

\[
\tilde{M}({\bf b})=\int_{0}^{1}\boldsymbol{\varphi}\boldsymbol{\varphi}^{T}\,dx.
\]
Lemma 2.7.

The condition number of the finite element mass matrix $\tilde{M}({\bf b})$ is bounded by $\mathcal{O}(\tilde{h}_{max}/\tilde{h}_{min})$.

Proof 2.8.

For any vector $\boldsymbol{\xi}=(\xi_{1},\ldots,\xi_{n})^{T}\in\mathbb{R}^{n}$, in a similar fashion as that of Lemma 2.3, we get the equality

\[
\boldsymbol{\xi}^{T}\tilde{M}({\bf b})\boldsymbol{\xi}
=\sum_{j=0}^{n}\int_{b_{j}}^{b_{j+1}}\left(\xi_{j}\varphi_{j}+\xi_{j+1}\varphi_{j+1}\right)^{2}\,dx
=\sum_{j=0}^{n}\frac{h_{j}}{6}\left[\xi_{j}^{2}+(\xi_{j}+\xi_{j+1})^{2}+\xi_{j+1}^{2}\right]
\]

with $\varphi_{0}(x)=\varphi_{n+1}(x)=\xi_{0}=\xi_{n+1}=0$, which leads to the inequalities

\[
\frac{1}{6}\tilde{h}_{min}\lvert\boldsymbol{\xi}\rvert^{2}\leq\boldsymbol{\xi}^{T}\tilde{M}({\bf b})\boldsymbol{\xi}\leq\tilde{h}_{max}\lvert\boldsymbol{\xi}\rvert^{2},
\]
where the upper bound uses $(\xi_{j}+\xi_{j+1})^{2}\leq 2\xi_{j}^{2}+2\xi_{j+1}^{2}$.

This completes the proof of the lemma.
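As a numerical illustration, the sketch below assembles the hat-function mass matrix on a non-uniform partition and checks that its eigenvalues lie between $\tilde{h}_{min}/6$ and $\tilde{h}_{max}$ (the latter follows from the quadratic-form identity in the proof by an elementary bound). It assumes $h_j=b_{j+1}-b_j$ for $j=0,\dots,n$, as in the sum over $[b_j,b_{j+1}]$; the partition and the function name are hypothetical.

```python
import numpy as np

def fe_mass_matrix(b):
    """Hat-function mass matrix on the partition b_0 < b_1 < ... < b_{n+1}."""
    h = np.diff(np.asarray(b, dtype=float))   # h_j = b_{j+1} - b_j, j = 0..n (assumed)
    n = len(h) - 1                            # number of interior nodes
    M = np.zeros((n, n))
    for i in range(n):                        # row i belongs to node b_{i+1}
        M[i, i] = (h[i] + h[i + 1]) / 3.0
        if i + 1 < n:
            M[i, i + 1] = M[i + 1, i] = h[i + 1] / 6.0
    return M, h

b = np.array([0.0, 0.1, 0.15, 0.4, 0.45, 0.8, 1.0])  # hypothetical partition
M_fe, h = fe_mass_matrix(b)
lam = np.linalg.eigvalsh(M_fe)                        # ascending eigenvalues
print("eigenvalue range:", lam[0], lam[-1])
print("bounds          :", h.min() / 6.0, h.max())
print("condition number:", lam[-1] / lam[0], "<=", 6.0 * h.max() / h.min())
```

In contrast to the ReLU mass matrix, the bound here depends only on the mesh ratio and not on $n$.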

2.2 Algebraic Approach

This section derives a formula for the inverse of the mass matrix through a decomposition into two matrices. The decomposition is based on the fact that matrices with the structure of ${\cal M}$ in Eq. 2 have tri-diagonal inverses.

For $1\leq i\leq j\leq n$, let $m_{ij}$ be the $(i,j)$-element of the mass matrix $M_{r}({\bf b})$. Then

\begin{align*}
m_{ij}=m_{ji}&=\int_{0}^{1}r(x)\sigma(x-b_{i})\sigma(x-b_{j})\,dx=\int_{b_{j}}^{1}r(x)(x-b_{i})(x-b_{j})\,dx\\
&=\int_{b_{j}}^{1}r(x)\left(x-1\right)\left(x-b_{j}\right)dx+(1-b_{i})\int_{b_{j}}^{1}r(x)\left(x-b_{j}\right)dx\equiv m^{1}_{ij}+m^{2}_{ij},
\end{align*}

which implies the following decomposition

\[
M_{r}({\bf b})=M_{1}({\bf b})+M_{2}({\bf b})\equiv\left(m_{ij}^{1}\right)_{n\times n}+\left(m_{ij}^{2}\right)_{n\times n}.
\]

Both $M_{1}({\bf b})$ and $M_{2}({\bf b})$ have the same structure as ${\cal M}$ in Eq. 2 with

\[
m_{ij}^{1}=\beta^{1}_{\max\{i,j\}}\quad\mbox{and}\quad m_{ij}^{2}=\alpha_{\min\{i,j\}}^{2}\beta^{2}_{\max\{i,j\}},
\]

where

\[
\beta^{1}_{k}=\int_{b_{k}}^{1}r(x)\left(x-1\right)\left(x-b_{k}\right)dx,\quad\alpha^{2}_{k}=1-b_{k},\quad\mbox{and}\quad\beta^{2}_{k}=\int_{b_{k}}^{1}r(x)\left(x-b_{k}\right)dx.
\]
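As an illustration (not part of the original derivation), for the constant weight $r\equiv 1$ these integrals can be evaluated in closed form with the substitution $u=x-b_{k}$:
\[
\beta^{1}_{k}=\int_{0}^{1-b_{k}}\bigl(u-(1-b_{k})\bigr)u\,du=-\frac{(1-b_{k})^{3}}{6},
\qquad
\beta^{2}_{k}=\int_{0}^{1-b_{k}}u\,du=\frac{(1-b_{k})^{2}}{2},
\]
so that, for $i\leq j$,
\[
m_{ij}=m^{1}_{ij}+m^{2}_{ij}=\frac{(1-b_{i})(1-b_{j})^{2}}{2}-\frac{(1-b_{j})^{3}}{6}.
\]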
Proposition 2.9.

The inverse of the mass matrix $M_{r}({\bf b})$ is given by

\[
M_{r}({\bf b})^{-1}=M_{2}({\bf b})^{-1}\left(M_{2}({\bf b})^{-1}+M_{1}({\bf b})^{-1}\right)^{-1}M_{1}({\bf b})^{-1}.
\tag{4}
\]

Proof 2.10.

Eq. 4 is a direct consequence of the fact that

\[
M_{r}({\bf b})=M_{1}({\bf b})\left(M_{2}({\bf b})^{-1}+M_{1}({\bf b})^{-1}\right)M_{2}({\bf b}).
\]

Remark 1.

Since $M_{1}({\bf b})^{-1}$ and $M_{2}({\bf b})^{-1}$ are tri-diagonal, so is $M_{1}({\bf b})^{-1}+M_{2}({\bf b})^{-1}$. Hence, $M_{r}({\bf b})^{-1}$ in Eq. 4 applied to any vector can be computed in ${\cal O}(n)$ operations.
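The structure behind this remark can be inspected on a small example. The sketch below assumes $r\equiv 1$, for which $\beta^{1}_{k}=-(1-b_{k})^{3}/6$ and $\beta^{2}_{k}=(1-b_{k})^{2}/2$, and uses hypothetical breakpoints and helper names. It forms $M_1$ and $M_2$, inverts them densely, reports how far the inverses are from tri-diagonal, and applies Eq. 4 to a right-hand side. The dense inversions are for checking only and are not the $\mathcal{O}(n)$ procedure itself.

```python
import numpy as np

def split_mass_matrices(b):
    """M_1 and M_2 of the decomposition M_r = M_1 + M_2 for r = 1 (assumed)."""
    L = 1.0 - np.asarray(b, dtype=float)      # L_k = 1 - b_k
    idx = np.arange(len(L))
    lo = np.minimum.outer(idx, idx)           # min{i, j}
    hi = np.maximum.outer(idx, idx)           # max{i, j}
    beta1 = -L**3 / 6.0                       # beta^1_k for r = 1
    alpha2, beta2 = L, L**2 / 2.0             # alpha^2_k and beta^2_k for r = 1
    return beta1[hi], alpha2[lo] * beta2[hi]

def off_tridiagonal_max(A):
    """Largest magnitude outside the three central diagonals."""
    i, j = np.indices(A.shape)
    return np.abs(A[np.abs(i - j) > 1]).max()

b = np.array([0.0, 0.2, 0.4, 0.6, 0.8])      # hypothetical breakpoints
M1, M2 = split_mass_matrices(b)
M1_inv, M2_inv = np.linalg.inv(M1), np.linalg.inv(M2)   # dense, for checking only
T = M2_inv + M1_inv                           # tri-diagonal up to round-off

x_true = np.ones(len(b))
f = (M1 + M2) @ x_true
x = M2_inv @ np.linalg.solve(T, M1_inv @ f)   # Eq. (4) applied to f
print("off-tridiagonal size:", off_tridiagonal_max(M1_inv), off_tridiagonal_max(M2_inv))
print("error in x          :", np.abs(x - x_true).max())
```

In an actual $\mathcal{O}(n)$ implementation one would presumably store only the three diagonals of each inverse and replace `np.linalg.solve` with a tri-diagonal solver.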

2.3 Geometric Approach

This section presents another way to invert the mass matrix, based on a factorization of $M_{r}({\bf b})$ into the product of three tri-diagonal matrices. The factorization arises from expressing the global ReLU basis functions in terms of local discontinuous basis functions.

To this end, for $k=0,\ldots,n$, let $I_{k}=[b_{k},b_{k+1})$ and define the local basis functions

\[
\varphi_{k}^{0}(x)=\left\{\begin{array}{cl}1,&x\in I_{k},\\ 0,&\text{otherwise}\end{array}\right.
\quad\mbox{and}\quad
\varphi_{k}^{1}(x)=\left\{\begin{array}{cl}h_{k}^{-1}(x-b_{k}),&x\in I_{k},\\ 0,&\text{otherwise}.\end{array}\right.
\]

Since $\sum\limits_{k=0}^{n}\varphi_{k}^{0}(x)\equiv 1$ in $\Omega$, we have

\[
\mbox{span}\left\{1,\sigma(x-b_{1}),\ldots,\sigma(x-b_{n})\right\}\subset\mbox{span}\left\{\varphi^{0}_{k}(x)\right\}_{k=1}^{n}\bigcup\,\mbox{span}\left\{\varphi^{1}_{k}(x)\right\}_{k=1}^{n}.
\tag{5}
\]

Set

\[
\boldsymbol{\psi}(x)=(\psi_{1}(x),\ldots,\psi_{n}(x))^{T}\quad\mbox{and}\quad\boldsymbol{\varphi}_{i}(x)=(\varphi_{1}^{i}(x),\ldots,\varphi^{i}_{n}(x))^{T},
\tag{6}
\]

where $\psi_{k}(x)=\sigma(x-b_{k})$; and let $D(\mathbf{h})=\mbox{diag}(h_{1},\ldots,h_{n})$,

\[
G=\left(\begin{array}{cccc}1&&&\\ -1&1&&\\ &\ddots&\ddots&\\ &&-1&1\end{array}\right)_{n\times n},
\quad\mbox{and}\quad
G^{-1}=\left(\begin{array}{cccc}1&&&\\ 1&1&&\\ \vdots&\vdots&\ddots&\\ 1&1&\ldots&1\end{array}\right)_{n\times n}.
\]
Lemma 2.11.

There exist mappings $B_{0}:\mathbb{R}^{n}\to\mathbb{R}^{n}$ and $B_{1}:\mathbb{R}^{n}\to\mathbb{R}^{n}$ such that

\[
\boldsymbol{\psi}=B_{0}\boldsymbol{\varphi}_{0}+B_{1}\boldsymbol{\varphi}_{1}.
\tag{7}
\]

Moreover, we have

B0=GTD(𝐡)(GTI)andB1=GTD(𝐡),formulae-sequencesubscript𝐵0superscript𝐺𝑇𝐷𝐡superscript𝐺𝑇𝐼andsubscript𝐵1superscript𝐺𝑇𝐷𝐡B_{0}=G^{-T}D(\mathbf{h})\left(G^{-T}-I\right)\quad\mbox{and}\quad B_{1}=G^{-T% }D(\mathbf{h}),italic_B start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = italic_G start_POSTSUPERSCRIPT - italic_T end_POSTSUPERSCRIPT italic_D ( bold_h ) ( italic_G start_POSTSUPERSCRIPT - italic_T end_POSTSUPERSCRIPT - italic_I ) and italic_B start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = italic_G start_POSTSUPERSCRIPT - italic_T end_POSTSUPERSCRIPT italic_D ( bold_h ) ,

where I𝐼Iitalic_I is the n𝑛nitalic_n-order identity matrix.

Proof 2.12.

Eq. 5 implies that there exist $B_0$ and $B_1$ such that Eq. 7 is valid. To determine $B_0$ and $B_1$, for any $\mathbf{c}=(c_1,\ldots,c_n)^T\in\mathbb{R}^n$, let $v(x)=\mathbf{c}^T\boldsymbol{\psi}(x)$; then

\[
v(x)=\mathbf{c}^T\boldsymbol{\psi}(x)=\mathbf{c}^TB_0\boldsymbol{\varphi}_0(x)+\mathbf{c}^TB_1\boldsymbol{\varphi}_1(x).
\]

On each $I_k$, using the facts that $v'(x)$ and $\mathbf{c}^TB_0\boldsymbol{\varphi}_0(x)$ are constants, we have

\begin{align*}
\left(\mathbf{c}^TB_1\right)_k
&=v(b_{k+1})-v(b_k)
=\sum_{i=1}^{k}c_i\Big(\sum_{j=i}^{k}h_j\Big)-\sum_{i=1}^{k-1}c_i\Big(\sum_{j=i}^{k-1}h_j\Big)\\
&=\sum_{i=1}^{k}c_ih_k=\left(D(\mathbf{h})G^{-1}\mathbf{c}\right)_k,
\end{align*}

which, together with arbitrariness of $\mathbf{c}$, implies that $B_1=G^{-T}D(\mathbf{h})$.

By the definitions of $\boldsymbol{\varphi}_0(x)$ and $\boldsymbol{\varphi}_1(x)$ and the fact that $v(b_1)=0$, we have

\begin{align*}
\left(\mathbf{c}^TB_0\right)_k
&=v(b_k)=\left(v(b_k)-v(b_{k-1})\right)+\left(v(b_{k-1})-v(b_{k-2})\right)+\cdots+\left(v(b_2)-v(b_1)\right)\\
&=\left(D(\mathbf{h})G^{-1}\mathbf{c}\right)_{k-1}+\left(D(\mathbf{h})G^{-1}\mathbf{c}\right)_{k-2}+\cdots+\left(D(\mathbf{h})G^{-1}\mathbf{c}\right)_{1}
=\left((G^{-1}-I)D(\mathbf{h})G^{-1}\mathbf{c}\right)_k,
\end{align*}

which, together with arbitrariness of $\mathbf{c}$, implies that $B_0=G^{-T}D(\mathbf{h})\left(G^{-T}-I\right)$. This completes the proof of the lemma.
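The identity in Eq. 7 can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes that $\psi_i(x)=\sigma(x-b_i)$ with $\sigma$ the ReLU function, that $b_{n+1}=1$ and $h_k=b_{k+1}-b_k$, and that the local bases are $\varphi_{0,k}=1$ and $\varphi_{1,k}=(x-b_k)/h_k$ on $I_k=[b_k,b_{k+1})$ and zero elsewhere. These definitions are inferred from the proof and from the matrices $D_{ij}(r)$, since Eq. 5 and Eq. 6 are not reproduced in this section. All function names are hypothetical.

```python
import numpy as np

def lemma_matrices(b):
    """Return (B0, B1, nodes) for increasing breakpoints b in [0, 1)."""
    n = len(b)
    nodes = np.append(b, 1.0)               # assumption: b_{n+1} = 1
    h = np.diff(nodes)                      # h_k = b_{k+1} - b_k
    G_inv_T = np.triu(np.ones((n, n)))      # G^{-T}: upper-triangular ones
    B1 = G_inv_T @ np.diag(h)
    B0 = B1 @ (G_inv_T - np.eye(n))
    return B0, B1, nodes

def psi(x, b):
    """Assumed global ReLU basis: psi_i(x) = max(x - b_i, 0)."""
    return np.maximum(x - b, 0.0)

def phi(x, nodes):
    """Assumed local bases on I_k: indicator (phi0) and scaled linear (phi1)."""
    n = len(nodes) - 1
    phi0, phi1 = np.zeros(n), np.zeros(n)
    k = np.searchsorted(nodes, x, side="right") - 1   # interval containing x
    if 0 <= k < n:
        phi0[k] = 1.0
        phi1[k] = (x - nodes[k]) / (nodes[k + 1] - nodes[k])
    return phi0, phi1

b = np.array([0.0, 0.2, 0.55, 0.7])         # hypothetical breakpoints
B0, B1, nodes = lemma_matrices(b)
for x in np.linspace(0.0, 0.99, 12):
    phi0, phi1 = phi(x, nodes)
    assert np.allclose(psi(x, b), B0 @ phi0 + B1 @ phi1)
```

The check evaluates both sides of Eq. 7 pointwise on $[0,1)$. The right endpoint is excluded only because the half-open intervals used in this sketch do not contain it.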

For $i,j=0,1$, let

\[
D_{ij}(r)=\int_0^1 r(x)\,\boldsymbol{\varphi}_i\boldsymbol{\varphi}_j^T\,dx.
\]

For $k=0,1,2$, let

\[
D_r(\mathbf{s}^k)=\text{diag}(s_0^k(r),\dots,s_n^k(r))\quad\text{with}\quad s_i^k(r)=\int_{b_i}^{b_{i+1}}r(x)(x-b_i)^k\,dx. \tag{8}
\]

Then, together with $D(\mathbf{h})=\text{diag}(h_1,\ldots,h_n)$, it is easy to see that

\[
D_{00}(r)=D_r(\mathbf{s}^0),\quad D_{01}(r)=D_{10}(r)=D(\mathbf{h})^{-1}D_r(\mathbf{s}^1),\quad\text{and}\quad D_{11}(r)=D(\mathbf{h})^{-2}D_r(\mathbf{s}^2).
\]
Theorem 2.

Let $Q=GD(\mathbf{h})^{-1}G$ and let

\[
T_{M_r}=(I-G^T)D_{00}(r)(I-G)+(I-G^T)D_{01}(r)G+G^TD_{10}(r)(I-G)+G^TD_{11}(r)G,
\]

then the mass matrix $M_r(\mathbf{b})$ defined in Eq. 1 has the following factorization

\[
M_r(\mathbf{b})=Q^{-T}T_{M_r}Q^{-1}. \tag{9}
\]

Proof 2.13.

By Eq. 7 and the fact that $B_0=B_1(G^{-T}-I)$, we have

\begin{align*}
M_r(\mathbf{b})
&=\int_0^1 r(x)\boldsymbol{\psi}\boldsymbol{\psi}^T\,dx
=B_0D_{00}B_0^T+B_0D_{01}B_1^T+B_1D_{10}B_0^T+B_1D_{11}B_1^T\\
&=B_1\left\{(G^{-T}-I)D_{00}(G^{-1}-I)+(G^{-T}-I)D_{01}+D_{10}(G^{-1}-I)+D_{11}\right\}B_1^T\\
&=B_1G^{-T}\left\{(I-G^T)D_{00}(I-G)+(I-G^T)D_{01}G+G^TD_{10}(I-G)+G^TD_{11}G\right\}G^{-1}B_1^T,
\end{align*}

which, together with the fact that $Q^{-1}=G^{-1}B_1^T$, implies Eq. 9.
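The factorization in Eq. 9 can be verified on a small example. The sketch below assumes $r\equiv 1$, $\psi_i(x)=\sigma(x-b_i)$ with ReLU $\sigma$, and $b_{n+1}=1$, so that $s_i^k=h_i^{k+1}/(k+1)$ and the mass matrix has a closed form. It forms all matrices densely for clarity only, and the function names are hypothetical.

```python
import numpy as np

def exact_mass_matrix(b):
    """M_ij = int_0^1 relu(x - b_i) relu(x - b_j) dx for r = 1 (closed form)."""
    lo = np.minimum.outer(b, b)
    hi = np.maximum.outer(b, b)
    L = 1.0 - hi                  # length of the common support [max(b_i,b_j), 1]
    d = hi - lo                   # gap between the two breakpoints
    return L**3 / 3.0 + d * L**2 / 2.0

def factored_mass_matrix(b):
    """Return (Q^{-T} T Q^{-1}, T) built from the formulas of the theorem, r = 1."""
    n = len(b)
    h = np.diff(np.append(b, 1.0))
    I = np.eye(n)
    G = I - np.eye(n, k=-1)
    D00 = np.diag(h)              # D_r(s^0)
    D01 = np.diag(h / 2.0)        # D(h)^{-1} D_r(s^1)
    D11 = np.diag(h / 3.0)        # D(h)^{-2} D_r(s^2)
    T = ((I - G.T) @ D00 @ (I - G) + (I - G.T) @ D01 @ G
         + G.T @ D01 @ (I - G) + G.T @ D11 @ G)
    Q_inv = np.linalg.inv(G @ np.diag(1.0 / h) @ G)
    return Q_inv.T @ T @ Q_inv, T

b = np.array([0.0, 0.2, 0.55, 0.7])          # hypothetical breakpoints
M_fact, T = factored_mass_matrix(b)
print(np.allclose(M_fact, exact_mass_matrix(b)))
print(np.allclose(np.triu(T, 2), 0.0) and np.allclose(np.tril(T, -2), 0.0))
```

The second check confirms that $T_{M_r}$ has no entries outside its three central diagonals in this example.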

Remark 3.

Clearly, $T_{M_r}$ is tri-diagonal, and $Q$ is a product of bidiagonal and diagonal matrices. Since Eq. 9 gives $M_r(\mathbf{b})^{-1}=Q\,T_{M_r}^{-1}Q^{T}$, both $T_{M_r}^{-1}$ and $M_r(\mathbf{b})^{-1}$ applied to any vector can be computed in $\mathcal{O}(n)$ operations.
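A minimal sketch of such an $\mathcal{O}(n)$ solve is given here for the special case $r\equiv 1$, again assuming $b_{n+1}=1$. The diagonal and off-diagonal entries of $T_{M_r}$ used in the code were obtained by expanding the four products in its definition; this expansion, the Thomas-algorithm helper, and all function names are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def apply_G(y):
    """(G y)_i = y_i - y_{i-1}."""
    out = y.copy()
    out[1:] -= y[:-1]
    return out

def apply_GT(y):
    """(G^T y)_i = y_i - y_{i+1}."""
    out = y.copy()
    out[:-1] -= y[1:]
    return out

def thomas(diag, off, rhs):
    """Solve a symmetric tridiagonal system without pivoting in O(n)."""
    d = np.array(diag, dtype=float)
    x = np.array(rhs, dtype=float)
    for i in range(1, len(d)):
        m = off[i - 1] / d[i - 1]
        d[i] -= m * off[i - 1]
        x[i] -= m * x[i - 1]
    x[-1] /= d[-1]
    for i in range(len(d) - 2, -1, -1):
        x[i] = (x[i] - off[i] * x[i + 1]) / d[i]
    return x

def solve_mass(b, f):
    """Return c with M_1(b) c = f via c = Q T^{-1} Q^T f (case r = 1)."""
    f = np.asarray(f, dtype=float)
    h = np.diff(np.append(b, 1.0))
    d0, d1, d2 = h, h / 2.0, h / 3.0        # diagonals of D00, D01, D11 for r = 1
    diag = d2.copy()
    diag[:-1] += d0[1:] - 2.0 * d1[1:] + d2[1:]
    off = d1[1:] - d2[1:]
    y = apply_GT(apply_GT(f) / h)           # Q^T f = G^T D(h)^{-1} G^T f
    z = thomas(diag, off, y)                # T^{-1} Q^T f
    return apply_G(apply_G(z) / h)          # Q z = G D(h)^{-1} G z

b = np.array([0.0, 0.2, 0.55, 0.7, 0.9])    # hypothetical breakpoints
c = solve_mass(b, np.ones(len(b)))
```

No dense matrix is formed: every step is a difference, a diagonal scaling, or a tridiagonal sweep. The unpivoted elimination is a reasonable choice here because $T_{M_r}=Q^TM_r(\mathbf{b})Q$ is symmetric positive definite whenever $M_r(\mathbf{b})$ is.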

Remark 4.

The transformation in Eq. 7 leads to a similar factorization of the coefficient matrix as

\[
A_a(\mathbf{b})=Q^{-T}T_{A_a}Q^{-1} \tag{10}
\]

with $T_{A_a}=G^TD(\mathbf{h})^{-2}D_a(\mathbf{s}^0)G$ being tri-diagonal, where $D_a(\mathbf{s}^0)$ is defined similarly as in Eq. 8.
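Eq. 10 admits the same kind of numerical check. The sketch assumes $a\equiv 1$ and takes the coefficient matrix to be $\big(A_a(\mathbf{b})\big)_{ij}=\int_0^1 a\,\sigma'(x-b_i)\sigma'(x-b_j)\,dx$; this definition is an assumption here, since it appears earlier in the paper and not in this section. For $a\equiv 1$ it reduces to $1-\max(b_i,b_j)$, and $D_a(\mathbf{s}^0)=D(\mathbf{h})$.

```python
import numpy as np

b = np.array([0.0, 0.3, 0.45, 0.8])        # hypothetical breakpoints, b_{n+1} = 1
n = len(b)
h = np.diff(np.append(b, 1.0))
G = np.eye(n) - np.eye(n, k=-1)

# Assumed closed form of A_1(b): overlap of the supports [b_i, 1] and [b_j, 1].
A_exact = 1.0 - np.maximum.outer(b, b)

# a = 1 gives D_a(s^0) = D(h), hence T_A = G^T D(h)^{-1} G.
T_A = G.T @ np.diag(1.0 / h) @ G
Q_inv = np.linalg.inv(G @ np.diag(1.0 / h) @ G)
A_fact = Q_inv.T @ T_A @ Q_inv

print(np.allclose(A_fact, A_exact))
```

In this special case the product collapses to $G^{-T}D(\mathbf{h})G^{-1}$, whose $(i,j)$ entry is the sum of $h_k$ over $k\ge\max(i,j)$, in agreement with the closed form.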

3 Applications

This section considers two applications: the least-squares data fitting and the diffusion-reaction equation in one dimension. When using the shallow ReLU neural network, the resulting discretization requires inversion of the corresponding mass matrix.

3.1 Least-Squares Approximation

The first problem type in which the mass matrix arises is least-squares data fitting. Given a function $f(x)\in L^2(\Omega)$, the best least-squares approximation to $f$ in $\mathcal{M}_n(\Omega)$ is to find $u_n\in\mathcal{M}_n(\Omega)$ with $u_n(0)=f(0)$ such that

\[
J(u_n)=\min_{v\in\mathcal{M}_n(\Omega)\cap\{v(0)=f(0)\}}J(v), \tag{11}
\]

where $J(v)$ is the weighted continuous least-squares loss functional given by

\[
J(v)=\frac{1}{2}\int_0^1 r(x)\left(v(x)-f(x)\right)^2dx.
\]

Let $u_n(x)\in\mathcal{M}_n(\Omega)$ be a solution of Eq. 11 having the form of

\[
u_n(x)=f(0)+\sum_{i=1}^{n}c_i\sigma(x-b_i).
\]

Clearly, the optimality condition on the linear parameter $\mathbf{c}=\left(c_1,\ldots,c_n\right)^T$ gives

\[
M_r(\mathbf{b})\,\mathbf{c}=\mathbf{f}(\mathbf{b}), \tag{12}
\]

where $M_r(\mathbf{b})$ is the mass matrix defined in Eq. 1 and $\mathbf{f}(\mathbf{b})$ is given by

\[
\mathbf{f}(\mathbf{b})=\int_0^1 r(x)\left(f(x)-f(0)\right)\boldsymbol{\psi}(x)\,dx,
\]

where $\boldsymbol{\psi}(x)$ is defined in Eq. 6.
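To make Eq. 12 concrete, the following sketch assembles $M_r(\mathbf{b})$ and $\mathbf{f}(\mathbf{b})$ by piecewise Gauss-Legendre quadrature and solves for $\mathbf{c}$. It assumes $\psi_i(x)=\sigma(x-b_i)$ with ReLU $\sigma$, uses a dense solve for readability instead of the factorization in Eq. 9, and every function name as well as the test data are hypothetical.

```python
import numpy as np

def piecewise_gauss(b, order=4):
    """Gauss-Legendre points and weights on each piece of [0, 1] cut at b."""
    nodes = np.unique(np.concatenate(([0.0], b, [1.0])))
    t, w = np.polynomial.legendre.leggauss(order)
    half = 0.5 * np.diff(nodes)
    x = nodes[:-1, None] + half[:, None] * (t[None, :] + 1.0)
    return x.ravel(), (half[:, None] * w[None, :]).ravel()

def fit_linear_parameters(f, r, b):
    """Assemble M_r(b) and f(b) by quadrature, then solve Eq. 12 densely."""
    x, w = piecewise_gauss(b)
    P = np.maximum(x[:, None] - b[None, :], 0.0)   # psi_i at quadrature points
    wr = w * r(x)
    M = P.T @ (wr[:, None] * P)
    rhs = P.T @ (wr * (f(x) - float(f(0.0))))
    return np.linalg.solve(M, rhs)

# Target that lies in the network space, so the exact answer is known.
b = np.array([0.0, 0.5])
f = lambda x: 2.0 + 3.0 * np.maximum(x, 0.0) - 4.0 * np.maximum(x - 0.5, 0.0)
r = lambda x: 1.0 + x
c = fit_linear_parameters(f, r, b)
print(c)
```

Because the chosen $f$ equals $f(0)+3\sigma(x)-4\sigma(x-0.5)$ and the quadrature is exact for the piecewise cubic integrands involved, the computed $\mathbf{c}$ should agree with $(3,-4)^T$ up to rounding.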

Let $D(\mathbf{c})=\text{diag}(c_1,\dots,c_n)$ be a diagonal matrix with the linear parameter, then the optimality condition on the non-linear parameter leads to

\[
\mathbf{0}=\nabla_{\mathbf{b}}J\left(u_n\right)=-D(\mathbf{c})\left(\int_{b_1}^{1}r(u_n-f)\,dx,\ldots,\int_{b_n}^{1}r(u_n-f)\,dx\right)^T. \tag{13}
\]
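The gradient formula in Eq. 13 can be compared with central finite differences of $J$. This is a sketch under the assumption $\psi_i(x)=\sigma(x-b_i)$ with ReLU $\sigma$; the quadrature helper, the function names, and the test data ($f(x)=x^2$, $r(x)=1+x$) are all hypothetical choices for which the quadrature is exact.

```python
import numpy as np

def u_n(x, b, c, f0):
    """u_n(x) = f(0) + sum_i c_i relu(x - b_i) at an array of points x."""
    return f0 + np.maximum(x[:, None] - b[None, :], 0.0) @ c

def integrate(g, lo, hi, cuts, order=6):
    """Integrate g over [lo, hi] by Gauss-Legendre on pieces split at cuts."""
    inner = np.sort(cuts[(cuts > lo) & (cuts < hi)])
    nodes = np.concatenate(([lo], inner, [hi]))
    t, w = np.polynomial.legendre.leggauss(order)
    total = 0.0
    for left, right in zip(nodes[:-1], nodes[1:]):
        half = 0.5 * (right - left)
        total += half * np.sum(w * g(left + half * (t + 1.0)))
    return total

def loss(b, c, f, r):
    """J(u_n) = 1/2 int_0^1 r (u_n - f)^2 dx."""
    f0 = f(np.array([0.0]))[0]
    return integrate(lambda x: 0.5 * r(x) * (u_n(x, b, c, f0) - f(x)) ** 2, 0.0, 1.0, b)

def grad_b(b, c, f, r):
    """Right-hand side of Eq. 13: -c_i * int_{b_i}^1 r (u_n - f) dx."""
    f0 = f(np.array([0.0]))[0]
    g = lambda x: r(x) * (u_n(x, b, c, f0) - f(x))
    return np.array([-c[i] * integrate(g, b[i], 1.0, b) for i in range(len(b))])

f = lambda x: x ** 2
r = lambda x: 1.0 + x
b = np.array([0.1, 0.4, 0.75])
c = np.array([1.0, -0.5, 2.0])
eps = 1e-6
fd = np.array([(loss(b + eps * e, c, f, r) - loss(b - eps * e, c, f, r)) / (2 * eps)
               for e in np.eye(len(b))])
print(np.max(np.abs(fd - grad_b(b, c, f, r))))
```

The printed discrepancy should be at the level of finite-difference round-off, which supports the sign and the integration limits in Eq. 13.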

Eq. 13 is a system of non-linear algebraic equations and will be solved by Newton's method. Let $w_i=r(b_i)\bigl(u_n(b_i)-f(b_i)\bigr)$ for $i=1,\dots,n$. In one dimension, Lemma 4.1 in [3] implies that the corresponding Hessian matrix is of the form

\[
\nabla_{\mathbf{b}}^2J(u_n)\equiv\mathbf{H}(\mathbf{c},\mathbf{b})=D(\mathbf{w})D(\mathbf{c})+D(\mathbf{c})A_r(\mathbf{b})D(\mathbf{c}), \tag{14}
\]

where $D(\mathbf{w})$ is a diagonal matrix given by

\begin{align*}
D(\mathbf{w})&=\int_0^1 r(u_n-f)\,\text{diag}(\delta(x-b_1),\ldots,\delta(x-b_n))\,dx\\
&=\text{diag}(w_1,\dots,w_n).
\end{align*}
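The structure of Eq. 14 can be illustrated in the special case $r\equiv 1$, $f(x)=x^2$, where the gradient in Eq. 13 has a closed form. The sketch assumes $\psi_i(x)=\sigma(x-b_i)$ with ReLU $\sigma$ and takes $\big(A_r(\mathbf{b})\big)_{ij}=\int_0^1 r\,\sigma'(x-b_i)\sigma'(x-b_j)\,dx$, which equals $1-\max(b_i,b_j)$ for $r\equiv 1$. Both are assumptions, as are the function names.

```python
import numpy as np

def grad_b(b, c):
    """Closed-form Eq. 13 for r = 1 and f(x) = x^2 (so f(0) = 0)."""
    m = np.maximum.outer(b, b)                              # m_ij = max(b_i, b_j)
    # I_ij = int_{m_ij}^1 (x - b_j) dx
    I = 0.5 * ((1.0 - b[None, :]) ** 2 - (m - b[None, :]) ** 2)
    int_u = I @ c                                           # int_{b_i}^1 u_n dx
    int_f = (1.0 - b ** 3) / 3.0                            # int_{b_i}^1 x^2 dx
    return -c * (int_u - int_f)

def hessian(b, c):
    """Eq. 14 for r = 1: H = D(w) D(c) + D(c) A_1(b) D(c)."""
    u_at_b = np.maximum(b[:, None] - b[None, :], 0.0) @ c   # u_n(b_i)
    w = u_at_b - b ** 2                                     # w_i = u_n(b_i) - f(b_i)
    A = 1.0 - np.maximum.outer(b, b)                        # assumed A_1(b)
    return np.diag(w * c) + (c[:, None] * A) * c[None, :]

b = np.array([0.1, 0.4, 0.75])
c = np.array([1.0, -0.5, 2.0])
eps = 1e-6
H_fd = np.column_stack([(grad_b(b + eps * e, c) - grad_b(b - eps * e, c)) / (2 * eps)
                        for e in np.eye(len(b))])
print(np.max(np.abs(H_fd - hessian(b, c))))
```

The first term of Eq. 14 only touches the diagonal, while the second is a diagonal scaling of $A_r(\mathbf{b})$. The finite-difference comparison is a consistency check for this example, not a proof.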

3.2 Diffusion-Reaction Problem

The second application that we consider is the following diffusion-reaction equation in one dimension:

\[
\left\{\begin{array}{lr}
-(a(x)u'(x))'+r(x)u(x)=f(x), & \text{in }\Omega=(0,1),\\[2mm]
u(0)=\alpha,\quad u(1)=\beta, &
\end{array}\right. \tag{15}
\]

where the diffusion coefficient $a(x)$, the reaction coefficient $r(x)$, and $f(x)$ are given real-valued functions defined on $\Omega$. Assume that $a(x)\in L^{\infty}(\Omega)$ and $r(x)\in L^{\infty}(\Omega)$ are bounded below by the respective positive constant $a_0>0$ and non-negative constant $r_0\geq 0$ almost everywhere on $\Omega$.

As in [4], the modified Ritz formulation of problem (15) is to find $u\in H^1(\Omega)\cap\{u(0)=\alpha\}$ such that

\[
J(u)=\min_{v\in H^1(\Omega)\cap\{v(0)=\alpha\}}J(v), \tag{16}
\]

where the modified energy functional is given by

\[
J(v)=\frac{1}{2}\int_0^1 a(x)(v'(x))^2dx+\frac{1}{2}\int_0^1 r(x)(v(x))^2dx-\int_0^1 f(x)v(x)\,dx+\frac{\gamma}{2}(v(1)-\beta)^2. \tag{17}
\]

Here, $\gamma>0$ is a penalization constant. Then the Ritz neural network approximation is to find $u_n\in\mathcal{M}_n(\Omega)\cap\{u_n(0)=\alpha\}$ such that

\[
J(u_n)=\min_{v\in\mathcal{M}_n(\Omega)\cap\{v(0)=\alpha\}}J(v). \tag{18}
\]

The corresponding bilinear form of the modified energy functional is given by

\[
a(u,v):=\int_0^1 a(x)u'(x)v'(x)\,dx+\int_0^1 r(x)u(x)v(x)\,dx+\gamma\,u(1)v(1)
\]

for any $u,\,v\in H^1(\Omega)$. Denote by $\|\cdot\|_a$ the induced norm of the bilinear form.
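For fixed breakpoints, minimizing Eq. 17 over the linear parameters is a linear problem. Writing $u_n=\alpha+\mathbf{c}^T\boldsymbol{\psi}$ and setting the derivative of Eq. 17 with respect to $\mathbf{c}$ to zero gives, by a direct calculation made here and not taken from the paper, $\big(A_a+M_r+\gamma\,\mathbf{d}\mathbf{d}^T\big)\mathbf{c}=\int_0^1(f-\alpha r)\boldsymbol{\psi}\,dx+\gamma(\beta-\alpha)\mathbf{d}$ with $\mathbf{d}=\boldsymbol{\psi}(1)$. The sketch below assembles this system by quadrature, assuming $\psi_i(x)=\sigma(x-b_i)$ with ReLU $\sigma$; the function name and the test problem are hypothetical.

```python
import numpy as np

def ritz_linear_parameters(a, r, f, alpha, beta, gamma, b, order=6):
    """Minimize Eq. 17 over c for fixed b, with u_n = alpha + sum_i c_i relu(x - b_i)."""
    nodes = np.unique(np.concatenate(([0.0], b, [1.0])))
    t, w = np.polynomial.legendre.leggauss(order)
    half = 0.5 * np.diff(nodes)
    x = (nodes[:-1, None] + half[:, None] * (t[None, :] + 1.0)).ravel()
    w = (half[:, None] * w[None, :]).ravel()
    P = np.maximum(x[:, None] - b[None, :], 0.0)      # psi_i at quadrature points
    dP = (x[:, None] > b[None, :]).astype(float)      # psi_i' (Heaviside step)
    d = 1.0 - b                                       # psi_i(1)
    K = (dP.T @ ((w * a(x))[:, None] * dP)            # diffusion part
         + P.T @ ((w * r(x))[:, None] * P)            # reaction part
         + gamma * np.outer(d, d))                    # boundary penalty
    rhs = P.T @ (w * (f(x) - alpha * r(x))) + gamma * (beta - alpha) * d
    return np.linalg.solve(K, rhs)

# -u'' = 0, u(0) = 0, u(1) = 1, whose exact solution is u(x) = x.
one = lambda x: np.ones_like(x)
zero = lambda x: np.zeros_like(x)
c = ritz_linear_parameters(one, zero, zero, 0.0, 1.0, 99.0, np.array([0.0, 0.5]))
print(c)
```

For this test problem the minimizer is $u_n(x)=\frac{\gamma}{1+\gamma}x$, so with $\gamma=99$ the computed coefficients should be close to $(0.99,\,0)$. The boundary value $u_n(1)$ is only enforced approximately, which is the effect of the penalty term.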

Proposition 3.1.

Let $u$ and $u_{n}$ be the solutions of problems Eq. 16 and Eq. 18, respectively. Then

(19) $\|u-u_{n}\|_{a}\leq\sqrt{3}\inf_{v\in{\cal M}_{n}(\Omega)\cap\{v(0)=\alpha\}}\|u-v\|_{a}+\sqrt{2}\,\big|a(1)u^{\prime}(1)\big|\,\gamma^{-1/2}.$

Moreover, if ${\cal M}_{n}(\Omega)$ has the following approximation property

(20) $\inf_{v\in{\cal M}_{n}(\Omega)}\|u-v\|_{H^{1}(\Omega)}\leq C(u)\,n^{-1},$

then there exists a constant $C$ depending on $u$ such that

(21) $\|u-u_{n}\|_{a}\leq C\left(n^{-1}+\gamma^{-1/2}\right).$

Proof 3.2.

Eq. 19 may be proved in a similar fashion as that of Lemma 2.1 in [4], and Eq. 21 is a direct consequence of Eq. 19 and Eq. 20.

3.2.1 System of Algebraic Equations

Let $u_{n}(x)$ be the solution of problem Eq. 18. Then the linear parameter ${\bf c}=\left(c_{1},\ldots,c_{n}\right)^{T}$ and the non-linear parameter ${\bf b}=\left(b_{1},\ldots,b_{n}\right)^{T}$ satisfy the following optimality conditions

(22) $\nabla_{{\bf c}}J\left(u_{n}\right)={\bf 0}\quad\mbox{and}\quad\nabla_{{\bf b}}J\left(u_{n}\right)={\bf 0},$

where $\nabla_{{\bf c}}$ and $\nabla_{{\bf b}}$ denote the gradients with respect to ${\bf c}$ and ${\bf b}$, respectively.

Denote the right-hand side vector by

${\bf f}({\bf b})=\int_{0}^{1}\left(f(x)-\alpha\right)\nabla_{{\bf c}}u_{n}(x)\,dx,$

and let ${\bf d}=\nabla_{{\bf c}}u_{n}(1)$. By the same derivation as in [4], the first equation in Eq. 22 becomes

(23) $\left(A_{a}({\bf b})+M_{r}({\bf b})+\gamma{\bf d}{\bf d}^{T}\right){\bf c}={\bf f}({\bf b})+\gamma(\beta-\alpha){\bf d}.$

Compared with (3.2) in [4], the additional term $M_{r}({\bf b}){\bf c}$ in Eq. 23 results from the reaction term.

For $j=1,\ldots,n$, let

$g_{j}=r(b_{j})u_{n}(b_{j})-f(b_{j})-a^{\prime}(b_{j})\left(\sum_{k=1}^{j-1}c_{k}+\frac{c_{j}}{2}\right).$

Let $D({\bf g})=\text{diag}(g_{1},\dots,g_{n})$ be the diagonal matrix whose $i$-th diagonal element is $g_{i}$.

Lemma 3.3.

The Hessian matrix $\nabla^{2}_{{\bf b}}J(u_{n})$ has the form

(24) $\mathbf{H}({\bf c},{\bf b})=D({\bf g})D({\bf c})+D({\bf c})A_{r}({\bf b})D({\bf c})+\gamma{\bf c}{\bf c}^{T}.$
Proof 3.4.

Eq. 24 can be derived in a similar fashion to Lemma 3.2 in [4]. The only difference here is the additional reaction term in Eq. 17. For that term, the computations shown in Lemma 4.1 from [3] can be used to obtain the second-order derivatives with respect to ${\bf b}$.

4 Damped Block Newton and Gauss-Newton Methods

Optimality conditions of the minimization problems in Eq. 11 and Eq. 18 lead to systems of non-linear algebraic equations of the form

(25) ${\cal A}({\bf b})\,{\bf c}={\cal F}({\bf b})\quad\mbox{and}\quad\nabla_{{\bf b}}J(u_{n})={\bf 0},$

for the linear and non-linear parameters, respectively, where the first equation is given in Eq. 12 for the least-squares (LS) approximation and in Eq. 23 for the diffusion-reaction (DR) equation with

${\cal A}({\bf b})=\left\{\begin{array}{ll}M_{r}({\bf b}),&\mbox{LS},\\[5.69054pt] A_{a}({\bf b})+M_{r}({\bf b})+\gamma{\bf d}{\bf d}^{T},&\mbox{DR}.\end{array}\right.$

The respective Hessian matrix ${\bf H}({\bf c},{\bf b})=\nabla^{2}_{{\bf b}}J(u_{n})$ is given in Eq. 14 and Eq. 24 with

(26) ${\bf H}({\bf c},{\bf b})=\left\{\begin{array}{ll}D({\bf w})D({\bf c})+D({\bf c})A_{r}({\bf b})D({\bf c}),&\mbox{LS},\\[5.69054pt] D({\bf g})D({\bf c})+D({\bf c})A_{r}({\bf b})D({\bf c})+\gamma{\bf c}{\bf c}^{T},&\mbox{DR}.\end{array}\right.$

In a similar fashion as in [3], the Gauss-Newton matrix is given by

(27) ${\bf H}_{GN}({\bf c},{\bf b})=\left\{\begin{array}{ll}D({\bf c})A_{r}({\bf b})D({\bf c}),&\mbox{LS},\\[5.69054pt] D({\bf c})A_{r}({\bf b})D({\bf c})+\gamma{\bf c}{\bf c}^{T},&\mbox{DR}.\end{array}\right.$

In the case that ${\bf H}({\bf c},{\bf b})$ in Eq. 26 is invertible, the non-linear system in Eq. 25 can be solved by the damped block Newton (dBN) method described in Algorithm 4.1 of [4]. The method employs the block Gauss-Seidel method as an outer iteration for the linear and non-linear parameters. In each outer iteration, the linear and the non-linear parameters are updated by exact inversion and one step of a damped Newton method, respectively.

To efficiently invert ${\cal A}({\bf b})$, we use the factorizations of $M_{r}({\bf b})$ and $A_{a}({\bf b})$ given in Eq. 9 and Eq. 10, respectively. That is,

(28) $M_{r}({\bf b})^{-1}=Q\,T_{M_{r}}^{-1}\,Q^{T}\quad\mbox{and}\quad\left(A_{a}({\bf b})+M_{r}({\bf b})\right)^{-1}=Q\,(T_{A_{a}}+T_{M_{r}})^{-1}\,Q^{T}.$

Since $T_{M_{r}}$ and $T_{A_{a}}+T_{M_{r}}$ are tri-diagonal, the action of their inverses on any vector can be computed in ${\cal O}(n)$ operations, and so can the action of ${\cal A}({\bf b})^{-1}$. For the diffusion-reaction problem, the Sherman-Morrison formula is needed to account for the rank-one update $\gamma{\bf d}{\bf d}^{T}$.
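The following Python sketch illustrates the two ingredients named above: an ${\cal O}(n)$ tri-diagonal solve and the Sherman-Morrison formula for a rank-one update. It is a minimal illustration under simplifying assumptions, not the implementation used in the paper: the function names are hypothetical, the tri-diagonal solve has no pivoting, and the update is applied directly to a tri-diagonal matrix $T$. In the setting of Eq. 28, the tri-diagonal solve would be replaced by the action of $Q\,(T_{A_{a}}+T_{M_{r}})^{-1}\,Q^{T}$.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm: solve T x = rhs in O(n) for a tri-diagonal T.

    lower[i] = T[i+1][i], diag[i] = T[i][i], upper[i] = T[i][i+1].
    No pivoting is performed, so T is assumed to be well behaved
    (for example symmetric positive definite).
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0] if n > 1 else 0.0
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i - 1] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def solve_rank_one_update(lower, diag, upper, d_vec, gamma, rhs):
    """Solve (T + gamma * d d^T) x = rhs with two O(n) tri-diagonal solves.

    Sherman-Morrison: x = y - gamma * (d.y) / (1 + gamma * d.z) * z,
    where y = T^{-1} rhs and z = T^{-1} d.
    """
    y = solve_tridiagonal(lower, diag, upper, rhs)
    z = solve_tridiagonal(lower, diag, upper, d_vec)
    dot_dy = sum(di * yi for di, yi in zip(d_vec, y))
    dot_dz = sum(di * zi for di, zi in zip(d_vec, z))
    scale = gamma * dot_dy / (1.0 + gamma * dot_dz)
    return [yi - scale * zi for yi, zi in zip(y, z)]
```

Both solves reuse the same tri-diagonal matrix, so the total cost remains linear in $n$ even though $T+\gamma{\bf d}{\bf d}^{T}$ is dense.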

In the case that ${\bf H}({\bf c},{\bf b})$ in Eq. 26 is singular, the non-linear system in Eq. 25 can be solved by the structure-guided Gauss-Newton (SgGN) method described in Algorithm 4.1 of [3]. This is because the layer Gauss-Newton matrix $A_{r}({\bf b})$ is always symmetric positive-definite and its inverse is tri-diagonal (see [4]). The SgGN method is essentially the damped block Gauss-Newton (dBGN) method, which replaces ${\bf H}({\bf c},{\bf b})^{-1}$ in the dBN method by ${\bf H}_{GN}({\bf c},{\bf b})^{-1}$.

Lemma 4.1.

Assume that $c_{i}\neq 0$ for all $i=1,\dots,n$. Then $D({\bf s})D({\bf c})+D({\bf c})A_{r}({\bf b})D({\bf c})$ is invertible if and only if $I+D({\bf s})A_{r}({\bf b})^{-1}D({\bf c})^{-1}$ is invertible. Moreover, we have

(29) $\left(D({\bf s})D({\bf c})+D({\bf c})A_{r}({\bf b})D({\bf c})\right)^{-1}=\left(I+D({\bf c})^{-1}A_{r}({\bf b})^{-1}D({\bf s})\right)^{-1}D({\bf c})^{-1}A_{r}({\bf b})^{-1}D({\bf c})^{-1}.$

Proof 4.2.

Under the assumption, $D({\bf c})A_{r}({\bf b})D({\bf c})$ is invertible, and Eq. 29 follows from the identity

$D({\bf s})D({\bf c})+D({\bf c})A_{r}({\bf b})D({\bf c})=\left(I+D({\bf s})A_{r}({\bf b})^{-1}D({\bf c})^{-1}\right)D({\bf c})A_{r}({\bf b})D({\bf c}),$

which, together with the symmetry of the matrix on the left-hand side, proves the lemma.

Lemma 4.1, together with the fact that $I+D({\bf s})A_{r}({\bf b})^{-1}D({\bf c})^{-1}$ is tri-diagonal and the Sherman-Morrison formula, implies that the action of ${\bf H}({\bf c},{\bf b})^{-1}$ on any vector can be computed in ${\cal O}(n)$ operations.
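As an illustration of Eq. 29, the Python sketch below applies $\left(D({\bf s})D({\bf c})+D({\bf c})A_{r}({\bf b})D({\bf c})\right)^{-1}$ to a vector using only tri-diagonal operations. It is a sketch under stated assumptions rather than the authors' code: the function names are hypothetical, the tri-diagonal matrix $B=A_{r}({\bf b})^{-1}$ is assumed to be available through its three bands, the tri-diagonal solve uses no pivoting, and the rank-one term $\gamma{\bf c}{\bf c}^{T}$ of the diffusion-reaction case (which would require an additional Sherman-Morrison step) is omitted.

```python
def tridiag_matvec(lower, diag, upper, v):
    """Return T v in O(n) for tri-diagonal T given by its three bands."""
    n = len(diag)
    out = [diag[i] * v[i] for i in range(n)]
    for i in range(n - 1):
        out[i] += upper[i] * v[i + 1]
        out[i + 1] += lower[i] * v[i]
    return out


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm without pivoting: solve T x = rhs in O(n)."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0] if n > 1 else 0.0
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i - 1] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def apply_hessian_inverse(b_lower, b_diag, b_upper, c, s, v):
    """Apply (D(s)D(c) + D(c) A D(c))^{-1} to v following Eq. 29.

    (b_lower, b_diag, b_upper) are the bands of B = A^{-1}; all c[i] != 0.
    """
    n = len(c)
    # w = D(c)^{-1} B D(c)^{-1} v
    w = [v[i] / c[i] for i in range(n)]
    w = tridiag_matvec(b_lower, b_diag, b_upper, w)
    w = [w[i] / c[i] for i in range(n)]
    # K = I + D(c)^{-1} B D(s) is tri-diagonal with
    # K[i][j] = delta_ij + B[i][j] * s[j] / c[i]
    k_diag = [1.0 + b_diag[i] * s[i] / c[i] for i in range(n)]
    k_upper = [b_upper[i] * s[i + 1] / c[i] for i in range(n - 1)]
    k_lower = [b_lower[i] * s[i] / c[i + 1] for i in range(n - 1)]
    return solve_tridiagonal(k_lower, k_diag, k_upper, w)
```

The dense matrix $A_{r}({\bf b})$ itself is never formed; only the bands of its inverse are used, which is what keeps the cost at ${\cal O}(n)$.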

4.1 An Adaptivity Scheme

For a fixed number of neurons, the dBN method for the diffusion-reaction equation moves the initial uniformly distributed breakpoints very efficiently to nearly optimal locations, as shown in Section 5. However, it was shown in [4] that introducing adaptivity results in a better convergence rate.

In fact, the adaptive neuron enhancement (ANE) method [9, 10] was employed in [4]. The ANE method starts with a relatively small neural network and adaptively adds new neurons based on the previous approximation. Moreover, the newly added neurons are initialized where the previous approximation is inaccurate. At each adaptive step, we use the dBN method to numerically solve the minimization problem in Eq. 18. Section 5 in [4] describes how to introduce adaptivity, and Algorithm 5.1 in [4] describes the adaptive block Newton (AdBN) method.

Here, the only modification is the local indicator. Letting $\mathcal{K}=[c,d]\subseteq[0,1]$ be a subinterval, a modified local indicator of the ZZ type on $\mathcal{K}$ (see, e.g., [5]) is defined by

$\xi_{\mathcal{K}}^{2}=\lVert a^{-1/2}\left(G(au_{n}^{\prime})-au_{n}^{\prime}\right)\rVert_{L^{2}(\mathcal{K})}^{2}+(d-c)^{2}\lVert-G^{\prime}(a^{2}u_{n}^{\prime})+u_{n}-f\rVert_{L^{2}(\mathcal{K})}^{2},$

where $G(v)$ is the projection of $v$ onto the space of the continuous piecewise linear functions.

5 Numerical Experiments

This section first presents numerical results of the dBN and dBGN methods for solving Eq. 11. Afterwards, results of the dBN, dBGN and AdBN methods for solving Eq. 15 are shown in Section 5.2 and Section 5.3. For diffusion-reaction problems, the penalization parameter $\gamma$ was set to $10^{4}$. For the AdBN method, a refinement occurred when the difference of the total estimators for two consecutive iterates was less than $10^{-7}$.

For each test problem of the diffusion-reaction equation, let $u$ and $u_{n}$ be the exact solution and its approximation in $\mathcal{M}_{n}(\Omega)$, respectively. Denote the relative error by

$e_{n}=\frac{|u-u_{n}|_{H^{1}(\Omega)}}{|u|_{H^{1}(\Omega)}}.$

5.1 Least-Squares Problem

The first test problem uses the function

(30) $u(x)=\sqrt{x}$

as the target function for problem Eq. 11, with $r(x)=1$. We aim to test the performance of dBN and dBGN for least-squares data fitting problems. LABEL:example3BFGSdBN presents a comparison between dBN, dBGN and BFGS. In this comparison, we utilized a Python BFGS implementation from ‘scipy.optimize’. The initial network parameters for the three algorithms were set to the uniform mesh for ${\bf b}^{(0)}$, with ${\bf c}^{(0)}$ given by solving Eq. 12. Recall that the computational cost per iteration of dBN and dBGN is ${\cal O}(n)$, while each iteration of BFGS has a cost of ${\cal O}(n^{2})$. In this example our solvers outperform BFGS, achieving smaller losses in fewer and cheaper iterations.

LABEL:example2DF (a) illustrates the neural network approximation of the function in Eq. 30, obtained using uniform breakpoints and determining the linear parameter through the solution of Eq. 12. Clearly, it is better to concentrate more mesh points on the left side, where the curve is steeper. The dBN method is capable of making this adjustment, as illustrated in LABEL:example2DF (b). The loss functions confirm that the approximation improves substantially when the breakpoints are allocated according to the steepness of the function.

5.2 Exponential Solution

The second test problem involves the function

(31) $u(x)=x\left(\exp\left(-\frac{(x-\frac{1}{3})^{2}}{0.01}\right)-\exp\left(-\frac{4}{9\times 0.01}\right)\right),$

serving as a solution of Eq. 15 for $a(x)=r(x)=1$ and $\alpha=\beta=0$.

Similarly to LABEL:example3BFGSdBN, we start by comparing our two solvers with BFGS. The initial network parameters for all algorithms were set to be the uniform mesh for ${\bf b}^{(0)}$, with ${\bf c}^{(0)}$ given by the exact solution of equation Eq. 23. We observe in LABEL:example1BFGSdB that in about 25 iterations, both dBN and dBGN achieve an accuracy that BFGS cannot attain.

LABEL:ex1Figure (a) shows the initial neural network approximation of the function in Eq. 31, obtained by using uniform breakpoints and determining the linear parameter through the solution of Eq. 23. The approximation generated by dBN is shown in LABEL:ex1Figure (b), while LABEL:ex1Figure (c) illustrates the approximation obtained by employing dBN with adaptivity. Notably, in both cases, the breakpoints are moved, and the resulting approximation improves on the initial one.

Theoretically, from Eq. 21, $\frac{1}{n}$ is the order of convergence for approximating the solution in Eq. 31 by functions in ${\cal M}_{n}$. However, since Eq. 18 is a non-convex optimization problem, the existence of local minima makes it challenging to achieve this order. Therefore, given the neural network approximation $u_{n}$ to $u$ provided by the dBN method, assume that

$e_{n}=\left(\frac{1}{n}\right)^{r},$

for some $r>0$. As in [4], we can use the AdBN method to improve the order of convergence of the dBN method (achieve an $r$ closer to 1).

Table 1 illustrates adaptive dBN (AdBN) starting with 20 neurons, refining 8 times, and reaching a final count of 194 neurons. The stopping tolerance was set to $\epsilon=0.05$. The recorded data in Table 1 includes the relative seminorm error and the relative error estimator for each iteration of the adaptive process. Additionally, Table 1 provides the results for dBN with fixed 144 and 194 neurons. Comparing these results to the adaptive run with the same number of neurons, we observe a significant improvement in rate, error estimator, and seminorm error within the adaptive run.

| NN ($n$ neurons) | $e_{n}$ | $\xi_{n}$ | $r$ |
|---|---|---|---|
| Adaptive (20) | $1.01\times 10^{-1}$ | 0.545 | 0.766 |
| Adaptive (27) | $6.40\times 10^{-2}$ | 0.342 | 0.834 |
| Adaptive (32) | $5.64\times 10^{-2}$ | 0.259 | 0.830 |
| Adaptive (46) | $3.81\times 10^{-2}$ | 0.193 | 0.854 |
| Adaptive (52) | $3.47\times 10^{-2}$ | 0.146 | 0.851 |
| Adaptive (71) | $2.64\times 10^{-2}$ | 0.107 | 0.853 |
| Adaptive (99) | $1.90\times 10^{-2}$ | 0.079 | 0.862 |
| Adaptive (144) | $1.31\times 10^{-2}$ | 0.052 | 0.872 |
| Adaptive (194) | $8.83\times 10^{-3}$ | 0.037 | 0.898 |
| Fixed (144) | $1.90\times 10^{-2}$ | 0.075 | 0.798 |
| Fixed (194) | $1.50\times 10^{-2}$ | 0.057 | 0.797 |

Table 1: Comparison of an adaptive network with fixed networks for relative error $e_{n}$, relative error estimators $\xi_{n}$, and powers $r$
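The powers $r$ in Table 1 are consistent with solving the assumed relation $e_{n}=(1/n)^{r}$ for $r$, that is, $r=-\log e_{n}/\log n$. The short Python sketch below shows this computation; the function name is hypothetical, and the inputs are the rounded values of $e_{n}$ printed in the table, so the results may differ from the tabulated $r$ in the last digit.

```python
import math


def empirical_rate(n, e_n):
    """Solve e_n = (1/n)**r for r, i.e. r = -log(e_n) / log(n)."""
    return -math.log(e_n) / math.log(n)


# (label, n, e_n) triples copied from Table 1 (rounded values)
rows = [
    ("Adaptive", 20, 1.01e-1),
    ("Adaptive", 194, 8.83e-3),
    ("Fixed", 194, 1.50e-2),
]
for label, n, e_n in rows:
    print(label, n, round(empirical_rate(n, e_n), 3))
```

A larger $r$ at the same $n$ means a smaller error, which is how the adaptive and fixed runs with 194 neurons can be compared directly.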

5.3 Singularly Perturbed Reaction-Diffusion Equation

The third test problem is a singularly perturbed reaction-diffusion equation:

(32) $\left\{\begin{array}{lr}-\varepsilon^{2}u^{\prime\prime}(x)+u(x)=f(x),&x\in\Omega=(-1,1),\\[5.69054pt] u(-1)=u(1)=0.\end{array}\right.$

For $f(x)=-2\left(\varepsilon-4x^{2}\tanh\left(\frac{1}{\varepsilon}(x^{2}-\frac{1}{4})\right)\right)\left(1/\cosh\left(\frac{1}{\varepsilon}(x^{2}-\frac{1}{4})\right)\right)^{2}+\tanh\left(\frac{1}{\varepsilon}(x^{2}-\frac{1}{4})\right)-\tanh\left(\frac{3}{4\varepsilon}\right)$, problem Eq. 32 has the following exact solution

(33) $u(x)=\tanh\left(\frac{1}{\varepsilon}(x^{2}-\frac{1}{4})\right)-\tanh\left(\frac{3}{4\varepsilon}\right).$

For small $\nu=\varepsilon^{2}$, these problems exhibit interior layers that make them challenging for mesh-based methods such as finite element and finite difference methods, which tend to produce overshooting and oscillations. For $\nu=10^{-4}$, LABEL:example2DR illustrates the neural network approximation of the function described in Eq. 33, using uniform breakpoints (a) and employing dBN to adjust the breakpoints (b). An interesting observation is that the resulting approximation from dBN does not exhibit overshooting or oscillations. This indicates that dBN is capable of successfully adjusting the breakpoints and may have the potential to accurately approximate solutions with boundary and/or interior layers.

It is worth mentioning that the relative $L^{2}$-norm error of the approximation depicted in LABEL:example2DR (b) is $8.85\times 10^{-4}$. In [2], similar errors were obtained using deep neural networks with $2962$ parameters. In our case, the number of parameters is only $65$.

The resulting relative errors obtained after using dBN for various values of $\nu$ are shown in Table 2. For each value of $\nu$, dBN considerably improves the initial approximation, and the error does not vary significantly with different values of $\nu$.

\begin{tabular}{ccc}
$\nu$ & $e_{n}$ (initial) & $e_{n}$ (dBN) \\
\hline
$10^{-2}$ & $1.63\times 10^{-1}$ & $6.72\times 10^{-2}$ \\
$10^{-3}$ & $5.53\times 10^{-1}$ & $8.08\times 10^{-2}$ \\
$10^{-4}$ & $8.89\times 10^{-1}$ & $7.65\times 10^{-2}$ \\
$10^{-5}$ & $9.69\times 10^{-1}$ & $8.58\times 10^{-2}$ \\
$10^{-6}$ & $9.90\times 10^{-1}$ & $8.09\times 10^{-2}$ \\
\end{tabular}
Table 2: Relative errors $e_{n}$ obtained by using ReLU networks to approximate the function in Eq. 33 for different $\nu=\varepsilon^{2}$. Initial: NN model with 32 uniform breakpoints. dBN: optimized NN model with 32 breakpoints after 200 iterations.
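The two observations drawn from Table 2 can be quantified directly from its entries. The short sketch below is an illustration only; the variable names are assumptions and the numbers are copied from the table. It computes, for each $\nu$, the factor by which dBN reduces the initial error, and the ratio between the largest and smallest dBN errors.

```python
# Values copied from Table 2 (relative errors for 32 breakpoints).
nu = [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]
e_initial = [1.63e-1, 5.53e-1, 8.89e-1, 9.69e-1, 9.90e-1]
e_dbn = [6.72e-2, 8.08e-2, 7.65e-2, 8.58e-2, 8.09e-2]

# Factor by which dBN reduces the initial error for each nu.
reduction = [e0 / e1 for e0, e1 in zip(e_initial, e_dbn)]

# Ratio of the largest to the smallest dBN error across all nu.
spread = max(e_dbn) / min(e_dbn)

for v, r in zip(nu, reduction):
    print(f"nu = {v:.0e}: error reduced by a factor of {r:.2f}")
print(f"max/min dBN error: {spread:.2f}")
```

The spread is roughly $1.28$ while $\nu$ ranges over four orders of magnitude, and the reduction factor grows as $\nu$ decreases because the initial error on uniform breakpoints deteriorates while the dBN error stays nearly constant.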

We also present the results of using adaptive mesh refinement. LABEL:example22DR shows the neural network approximation obtained by starting with 12 uniform breakpoints. Refinements are performed using the average marking strategy (see equation (5.2) in [4]) to achieve an error similar to that of the approximation in LABEL:example2DR (b). After each refinement, the linear parameters were computed by solving Eq. 23. In LABEL:example22DR (a), the breakpoints were not moved, whereas LABEL:example22DR (b) illustrates the AdBN method, in which the breakpoints were moved after each refinement.

6 Discussion and Conclusion

The mass matrix $M_{r}({\bf b})$ associated with the shallow ReLU neural network arises in applications such as diffusion-reaction equations and least-squares data fitting. Unlike the finite element mass matrix, the NN mass matrix is dense and very ill-conditioned (see Lemma 2.3). These features hinder the efficiency of commonly used numerical methods for solving the resulting systems of linear equations.
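The density and ill-conditioning can be observed on a small model problem. The following is a minimal sketch under stated assumptions, not the paper's exact setup: the basis is taken to be $\sigma(x-b_{i})$ with $\sigma$ the ReLU function, the domain is $(0,1)$, the breakpoints are uniform, $b_{i}=i/n$, and no constant or reaction coefficient is included. The entries follow from $\int_{m}^{1}(x-b_{i})(x-b_{j})\,dx=\ell^{3}/3+d\,\ell^{2}/2$ with $m=\max(b_{i},b_{j})$, $\ell=1-m$, and $d=|b_{i}-b_{j}|$. The function names are hypothetical.

```python
import numpy as np

def relu_mass_matrix(b):
    # Gram matrix of ReLU(x - b_i) on (0, 1), assuming 0 <= b_i < 1.
    b = np.asarray(b, dtype=float)
    m = np.maximum.outer(b, b)            # max(b_i, b_j)
    d = np.abs(np.subtract.outer(b, b))   # |b_i - b_j|
    ell = 1.0 - m                         # length of the common support
    return ell**3 / 3.0 + d * ell**2 / 2.0

def condition_number(n):
    # 2-norm condition number for n uniform breakpoints b_i = i / n.
    return float(np.linalg.cond(relu_mass_matrix(np.arange(n) / n)))

M = relu_mass_matrix(np.arange(8) / 8)
conds = [condition_number(n) for n in (8, 16, 32)]
print("all entries nonzero:", bool(np.all(M > 0)))
print("condition numbers for n = 8, 16, 32:", conds)
```

Every entry of `M` is positive because all basis functions overlap near $x=1$, in contrast to the tri-diagonal hat-function mass matrix, and the computed condition numbers increase rapidly with $n$.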

This difficulty is overcome in one dimension through a special factorization of the mass matrix, which was derived using both algebraic and geometrical approaches. This factorization enables inversion of the mass matrix at ${\cal O}(n)$ computational cost. Combined with the fact that the inverse of the coefficient matrix $A_{r}({\bf b})$ is tri-diagonal, this yields a damped block Newton (dBN) method with a computational cost of just ${\cal O}(n)$ per iteration, provided that the corresponding Hessian matrix is invertible. The quadratic form of the objective functions for certain problems allows the construction of damped block Gauss-Newton (dBGN) methods, which benefit from having symmetric positive-definite Gauss-Newton matrices. For diffusion-reaction problems in particular, the addition of adaptive network enhancement (ANE) improves the rate of convergence.
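The role of a tri-diagonal inverse can be illustrated on a model matrix. The sketch below rests on assumptions and may differ from the paper's $A_{r}({\bf b})$: it uses the pure-diffusion stiffness matrix of the ReLU basis $\sigma(x-b_{i})$ on $(0,1)$, with entries $\int_{0}^{1}\sigma^{\prime}(x-b_{i})\sigma^{\prime}(x-b_{j})\,dx=1-\max(b_{i},b_{j})$, and uniform breakpoints. The function names are hypothetical. The inverse is formed densely here only to inspect its structure; in an ${\cal O}(n)$ method the three diagonals would be available in closed form.

```python
import numpy as np

def relu_stiffness_matrix(b):
    # Entries 1 - max(b_i, b_j): a dense matrix (assumed model problem).
    b = np.asarray(b, dtype=float)
    return 1.0 - np.maximum.outer(b, b)

def off_tridiagonal_max(mat):
    # Largest magnitude among entries with |i - j| > 1.
    i, j = np.indices(mat.shape)
    return float(np.max(np.abs(mat[np.abs(i - j) > 1])))

def apply_tridiagonal(t, v):
    # O(n) product using only the three central diagonals of t.
    lower, diag, upper = np.diag(t, -1), np.diag(t), np.diag(t, 1)
    y = diag * v
    y[:-1] += upper * v[1:]
    y[1:] += lower * v[:-1]
    return y

n = 8
A = relu_stiffness_matrix(np.arange(n) / n)
A_inv = np.linalg.inv(A)              # dense inverse, for inspection only
rhs = np.sin(np.arange(1, n + 1))     # arbitrary right-hand side
x_fast = apply_tridiagonal(A_inv, rhs)
x_ref = np.linalg.solve(A, rhs)
print("largest off-tridiagonal entry of the inverse:", off_tridiagonal_max(A_inv))
```

Although `A` itself is dense, the entries of its inverse outside the three central diagonals vanish up to rounding, so the solution of the linear system reduces to a tri-diagonal matrix-vector product.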

Overall, the numerical results demonstrate the efficiency of the various methods in terms of not only the number of iterations but also the cost per iteration, making a compelling case to pursue the construction of similar solvers for higher dimensional problems. Of particular interest is the application of dBN methods to the singularly perturbed reaction-diffusion problem. For a fixed number of mesh points $n$, dBN appears to achieve an accuracy independent of the diffusion coefficient $\varepsilon^{2}$. Furthermore, when adaptivity is added, AdBN seems to be comparable to FE methods using mesh refinement.

References

  • [1] J. Berg and K. Nyström. A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing, 317:28–41, 2018.
  • [2] Z. Cai, J. Chen, M. Liu, and X. Liu. Deep least-squares methods: An unsupervised learning-based numerical method for solving elliptic PDEs. Journal of Computational Physics, 420:109707, 2020.
  • [3] Z. Cai, T. Ding, M. Liu, X. Liu, and J. Xia. A structure-guided Gauss-Newton method for shallow ReLU neural network. arXiv:2404.05064v1 [cs.LG], 2024.
  • [4] Z. Cai, A. Doktorova, R. D. Falgout, and C. Herrera. Fast iterative solver for neural network method: I. 1D diffusion problems. arXiv:2404.17750 [math.NA], 2024.
  • [5] Z. Cai and S. Zhang. Recovery-based error estimators for interface problems: conforming linear elements. SIAM Journal on Numerical Analysis, 47(3):2132–2156, 2009.
  • [6] T. Dockhorn. A discussion on solving partial differential equations using neural networks. arXiv:1904.07200 [cs.LG], 2019.
  • [7] W. E and B. Yu. The deep Ritz method: A deep learning-based numerical algorithm for solving variational problems. Communications in Mathematics and Statistics, 6(1):1–12, March 2018.
  • [8] I. Fried. The $l_{2}$ and $l_{\infty}$ condition numbers of the finite element stiffness and mass matrices, and the pointwise convergence of the method. In J.R. Whiteman, editor, The Mathematics of Finite Elements and Applications, pages 163–174. Academic Press, 1973.
  • [9] M. Liu and Z. Cai. Adaptive two-layer ReLU neural network: II. Ritz approximation to elliptic PDEs. Computers & Mathematics with Applications, 113:103–116, May 2022.
  • [10] M. Liu, Z. Cai, and J. Chen. Adaptive two-layer ReLU neural network: I. Best least-squares approximation. Computers & Mathematics with Applications, 113:34–44, May 2022.
  • [11] M. Raissi, P. Perdikaris, and G.E. Karniadakis. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378:686–707, 2019.
  • [12] J. Sirignano and K. Spiliopoulos. DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics, 375:1339–1364, 2018.