-
Fast Iterative Solver For Neural Network Method: II. 1D Diffusion-Reaction Problems And Data Fitting
Authors:
Zhiqiang Cai,
Anastassia Doktorova,
Robert D. Falgout,
César Herrera
Abstract:
This paper expands the damped block Newton (dBN) method introduced recently in [4] for 1D diffusion-reaction equations and least-squares data fitting problems. To determine the linear parameters (the weights and bias of the output layer) of the neural network (NN), the dBN method requires solving systems of linear equations involving the mass matrix. While the mass matrix for local hat basis functions is tri-diagonal and well-conditioned, the mass matrix for NNs is dense and ill-conditioned. For example, the condition number of the NN mass matrix for quasi-uniform meshes is at least ${\cal O}(n^4)$. We present a factorization of the mass matrix that enables solving the systems of linear equations in ${\cal O}(n)$ operations. To determine the non-linear parameters (the weights and bias of the hidden layer), one step of a damped Newton method is employed at each iteration. A Gauss-Newton method is used in place of Newton for the instances in which the Hessian matrices are singular. This modified dBN is referred to as dBGN. For both methods, the computational cost per iteration is ${\cal O}(n)$. Numerical results demonstrate the ability of dBN and dBGN to efficiently achieve accurate results and outperform BFGS for select examples.
Submitted 1 July, 2024;
originally announced July 2024.
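The abstract above contrasts the dense NN mass matrix with the tri-diagonal, well-conditioned mass matrix of local hat basis functions. As a minimal sketch of that baseline case only (not the paper's factorization of the NN mass matrix), the following assembles the hat-function mass matrix on a uniform mesh and solves a system with it in ${\cal O}(n)$ operations using the Thomas algorithm. The uniform mesh on $[0,1]$, the restriction to interior nodes, and all function names are assumptions made for illustration.

```python
import math


def hat_mass_tridiagonal(n, a=0.0, b=1.0):
    """Mass matrix of the n interior hat functions on a uniform mesh of [a, b].

    Returned as three lists (sub-diagonal, diagonal, super-diagonal).
    Assumption: uniform mesh, interior nodes only.
    """
    h = (b - a) / (n + 1)
    off = [h / 6.0] * (n - 1)      # integral of phi_i * phi_{i+1}
    diag = [2.0 * h / 3.0] * n     # integral of phi_i squared
    return off[:], diag, off[:]


def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system in O(n) by forward elimination and back substitution."""
    n = len(diag)
    c = [0.0] * n                  # modified super-diagonal
    d = [0.0] * n                  # modified right-hand side
    denom = diag[0]
    d[0] = rhs[0] / denom
    for i in range(1, n):
        c[i - 1] = upper[i - 1] / denom
        denom = diag[i] - lower[i - 1] * c[i - 1]
        d[i] = (rhs[i] - lower[i - 1] * d[i - 1]) / denom
    x = [0.0] * n
    x[n - 1] = d[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def tridiagonal_matvec(lower, diag, upper, x):
    """Multiply the tridiagonal matrix by a vector, also in O(n)."""
    n = len(diag)
    y = []
    for i in range(n):
        s = diag[i] * x[i]
        if i > 0:
            s += lower[i - 1] * x[i - 1]
        if i < n - 1:
            s += upper[i] * x[i + 1]
        y.append(s)
    return y


# Round trip: build a right-hand side from a known vector, then recover it.
n = 50
lower, diag, upper = hat_mass_tridiagonal(n)
x_true = [math.sin(0.1 * i) for i in range(n)]
rhs = tridiagonal_matvec(lower, diag, upper, x_true)
x = thomas_solve(lower, diag, upper, rhs)
max_error = max(abs(u - v) for u, v in zip(x, x_true))
```

Because this matrix is strictly diagonally dominant, the elimination needs no pivoting, which is what keeps the cost linear in $n$.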
-
Fast Iterative Solver For Neural Network Method: I. 1D Diffusion Problems
Authors:
Zhiqiang Cai,
Anastassia Doktorova,
Robert D. Falgout,
César Herrera
Abstract:
The discretization of the deep Ritz method [18] for the Poisson equation leads to a high-dimensional non-convex minimization problem, that is difficult and expensive to solve numerically. In this paper, we consider the shallow Ritz approximation to one-dimensional diffusion problems and introduce an effective and efficient iterative method, a damped block Newton (dBN) method, for solving the resulting non-convex minimization problem.
The method employs the block Gauss-Seidel method as an outer iteration by dividing the parameters of a shallow neural network into the linear parameters (the weights and bias of the output layer) and the non-linear parameters (the weights and bias of the hidden layer). Per each outer iteration, the linear and the non-linear parameters are updated by exact inversion and one step of a damped Newton method, respectively. Inverses of the coefficient matrix and the Hessian matrix are tridiagonal and diagonal, respectively, and hence the cost of each dBN iteration is $\mathcal{O}(n)$. To move the breakpoints (the non-linear parameters) more efficiently, we propose an adaptive damped block Newton (AdBN) method by combining the dBN with the adaptive neuron enhancement (ANE) method [25]. Numerical examples demonstrate the ability of dBN and AdBN not only to move the breakpoints quickly and efficiently but also to achieve a nearly optimal order of convergence for AdBN. These iterative solvers are capable of outperforming BFGS for select examples.
Submitted 26 April, 2024;
originally announced April 2024.
-
Invariant conformal Killing forms on almost abelian Lie groups
Authors:
Cecilia Herrera,
Marcos Origlia
Abstract:
We describe completely conformal Killing or conformal Killing-Yano (CKY) $p$-forms on almost abelian metric Lie algebras. In particular we prove that if an $n$-dimensional almost abelian metric Lie algebra admits a non-parallel CKY $p$-form, then $p=1$ or $p=n-1$. In other words, any CKY $p$-form on a metric almost abelian Lie algebra is parallel for $2\leq p\leq n-2$. Moreover, we characterize almost abelian Lie algebras admitting non-parallel CKY $p$-forms, and we classify all Lie algebras with this property up to dimension $5$, distinguishing also those cases where the associated simply connected Lie group admits lattices.
Submitted 14 February, 2024;
originally announced February 2024.
-
Optimal Stopping via Randomized Neural Networks
Authors:
Calypso Herrera,
Florian Krach,
Pierre Ruyssen,
Josef Teichmann
Abstract:
This paper presents the benefits of using randomized neural networks instead of standard basis functions or deep neural networks to approximate the solutions of optimal stopping problems. The key idea is to use neural networks, where the parameters of the hidden layers are generated randomly and only the last layer is trained, in order to approximate the continuation value. Our approaches are applicable to high dimensional problems where the existing approaches become increasingly impractical. In addition, since our approaches can be optimized using simple linear regression, they are easy to implement and theoretical guarantees can be provided. We test our approaches for American option pricing on Black--Scholes, Heston and rough Heston models and for optimally stopping a fractional Brownian motion. In all cases, our algorithms outperform the state-of-the-art and other relevant machine learning approaches in terms of computation time while achieving comparable results. Moreover, we show that they can also be used to efficiently compute Greeks of American options.
Submitted 1 December, 2023; v1 submitted 28 April, 2021;
originally announced April 2021.
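The key idea stated in the abstract above, random hidden-layer parameters with only the last layer trained by linear regression, can be illustrated with a small sketch. This is not the authors' optimal stopping algorithm: it only shows the regression step on a synthetic one-dimensional target. The tanh activation, the put-like toy target, the ridge term, and every function name are assumptions chosen for illustration.

```python
import math
import random


def random_hidden_layer(num_features, input_dim, rng):
    """Draw hidden-layer weights and biases once; they are never trained."""
    weights = [[rng.gauss(0.0, 1.0) for _ in range(input_dim)]
               for _ in range(num_features)]
    biases = [rng.gauss(0.0, 1.0) for _ in range(num_features)]
    return weights, biases


def features(x, weights, biases):
    """Map an input vector to [1, tanh(w_1.x + b_1), ..., tanh(w_K.x + b_K)]."""
    hidden = [math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)
              for w, b in zip(weights, biases)]
    return [1.0] + hidden


def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def fit_last_layer(inputs, targets, weights, biases, ridge=1e-6):
    """Train only the output layer via ridge-regularised normal equations."""
    phi = [features(x, weights, biases) for x in inputs]
    m = len(phi[0])
    gram = [[sum(p[i] * p[j] for p in phi) + (ridge if i == j else 0.0)
             for j in range(m)] for i in range(m)]
    moment = [sum(p[i] * t for p, t in zip(phi, targets)) for i in range(m)]
    return solve_linear_system(gram, moment)


def predict(x, weights, biases, output_weights):
    """Evaluate the network: fixed random features times trained output weights."""
    return sum(c * f for c, f in zip(output_weights, features(x, weights, biases)))


# Toy data (assumption): a put-like payoff on a 1D grid, not a real continuation value.
rng = random.Random(0)
hidden_w, hidden_b = random_hidden_layer(num_features=8, input_dim=1, rng=rng)
xs = [[-2.0 + 4.0 * i / 199] for i in range(200)]
ys = [max(1.0 - x[0], 0.0) for x in xs]
out_w = fit_last_layer(xs, ys, hidden_w, hidden_b)

mean_y = sum(ys) / len(ys)
mse_const = sum((y - mean_y) ** 2 for y in ys) / len(ys)
mse_fit = sum((predict(x, hidden_w, hidden_b, out_w) - y) ** 2
              for x, y in zip(xs, ys)) / len(ys)
```

Since the hidden parameters are fixed, the fit reduces to one linear solve of size equal to the number of features plus one, which is the property the abstract points to when it says the approach can be optimized using simple linear regression.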
-
Invariant Conformal Killing-Yano $2$-forms on five dimensional Lie Groups
Authors:
Andrea Cecilia Herrera,
Marcos Origlia
Abstract:
We study left invariant conformal Killing-Yano (CKY) $2$-forms on Lie groups endowed with a left invariant metric. We classify all $5$-dimensional metric Lie algebras carrying CKY tensors that are obtained as a one-dimensional central extension of 4-dimensional metric Lie algebras endowed with an invertible parallel skew-symmetric tensor. On the other hand, we also classify $5$-dimensional metric Lie algebras with center of dimension greater than one admitting strict CKY tensors. In addition, we determine all possible CKY tensors on these metric Lie algebras. In particular, we exhibit the first examples of CKY $2$-forms on metric Lie algebras which do not admit any Sasakian structure.
Submitted 24 May, 2022; v1 submitted 20 December, 2020;
originally announced December 2020.
-
Parallel skew-symmetric tensors on 4-dimensional metric Lie algebras
Authors:
A. C. Herrera
Abstract:
We give a complete classification, up to isometric isomorphism and scaling, of $4$-dimensional metric Lie algebras $(\mathfrak{g},\langle \cdot,\cdot \rangle)$ that admit a non-zero parallel skew-symmetric endomorphism. In particular, we distinguish those metric Lie algebras that admit such an endomorphism which is not a multiple of a complex structure, and for each of them we obtain the de Rham decomposition of the associated simply connected Lie group with the corresponding left invariant metric. On the other hand, we find that the associated simply connected Lie group is irreducible as a Riemannian manifold for those metric Lie algebras where each parallel skew-symmetric endomorphism is a multiple of a complex structure.
Submitted 15 March, 2022; v1 submitted 16 December, 2020;
originally announced December 2020.
-
Neural Jump Ordinary Differential Equations: Consistent Continuous-Time Prediction and Filtering
Authors:
Calypso Herrera,
Florian Krach,
Josef Teichmann
Abstract:
Combinations of neural ODEs with recurrent neural networks (RNN), like GRU-ODE-Bayes or ODE-RNN, are well suited to model irregularly observed time series. While those models outperform existing discrete-time approaches, no theoretical guarantees for their predictive capabilities are available. Assuming that the irregularly-sampled time series data originates from a continuous stochastic process, the $L^2$-optimal online prediction is the conditional expectation given the currently available information. We introduce the Neural Jump ODE (NJ-ODE) that provides a data-driven approach to learn, continuously in time, the conditional expectation of a stochastic process. Our approach models the conditional expectation between two observations with a neural ODE and jumps whenever a new observation is made. We define a novel training framework, which allows us to prove theoretical guarantees for the first time. In particular, we show that the output of our model converges to the $L^2$-optimal prediction. This can be interpreted as a solution to a special filtering problem. We provide experiments showing that the theoretical results also hold empirically. Moreover, we experimentally show that our model outperforms the baselines in more complex learning tasks and give comparisons on real-world datasets.
Submitted 16 April, 2021; v1 submitted 8 June, 2020;
originally announced June 2020.
-
Denise: Deep Robust Principal Component Analysis for Positive Semidefinite Matrices
Authors:
Calypso Herrera,
Florian Krach,
Anastasis Kratsios,
Pierre Ruyssen,
Josef Teichmann
Abstract:
The robust PCA of covariance matrices plays an essential role when isolating key explanatory features. The currently available methods for performing such a low-rank plus sparse decomposition are matrix specific, meaning, those algorithms must re-run for every new matrix. Since these algorithms are computationally expensive, it is preferable to learn and store a function that nearly instantaneously performs this decomposition when evaluated. Therefore, we introduce Denise, a deep learning-based algorithm for robust PCA of covariance matrices, or more generally, of symmetric positive semidefinite matrices, which learns precisely such a function. Theoretical guarantees for Denise are provided. These include a novel universal approximation theorem adapted to our geometric deep learning problem and convergence to an optimal solution to the learning problem. Our experiments show that Denise matches state-of-the-art performance in terms of decomposition quality, while being approximately $2000\times$ faster than the state-of-the-art, principal component pursuit (PCP), and $200 \times$ faster than the current speed-optimized method, fast PCP.
Submitted 6 June, 2023; v1 submitted 28 April, 2020;
originally announced April 2020.
-
Low-Rank plus Sparse Decomposition of Covariance Matrices using Neural Network Parametrization
Authors:
Michel Baes,
Calypso Herrera,
Ariel Neufeld,
Pierre Ruyssen
Abstract:
This paper revisits the problem of decomposing a positive semidefinite matrix as a sum of a matrix with a given rank plus a sparse matrix. An immediate application can be found in portfolio optimization, when the matrix to be decomposed is the covariance between the different assets in the portfolio. Our approach consists in representing the low-rank part of the solution as the product $MM^{T}$, where $M$ is a rectangular matrix of appropriate size, parametrized by the coefficients of a deep neural network. We then use a gradient descent algorithm to minimize an appropriate loss function over the parameters of the network. We deduce its convergence rate to a local optimum from the Lipschitz smoothness of our loss function. We show that the rate of convergence grows polynomially in the dimensions of the input, output, and the size of each of the hidden layers.
Submitted 15 June, 2021; v1 submitted 1 August, 2019;
originally announced August 2019.
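The abstract above represents the low-rank part as the product $MM^{T}$ and minimizes a loss by gradient descent. A heavily simplified sketch of that one ingredient follows: plain gradient descent on $f(M)=\|MM^{T}-\Sigma\|_F^2$, whose gradient for symmetric $\Sigma$ is $4(MM^{T}-\Sigma)M$. It omits the sparse component and the neural network parametrization of $M$ described in the abstract, and the target matrix, starting point, step size, and function names are all assumptions for illustration.

```python
def matmul(a, b):
    """Dense matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*a)]


def residual(m, sigma):
    """R = M M^T - Sigma."""
    mmt = matmul(m, transpose(m))
    return [[mmt[i][j] - sigma[i][j] for j in range(len(sigma))]
            for i in range(len(sigma))]


def loss(m, sigma):
    """Squared Frobenius norm of the residual."""
    return sum(v * v for row in residual(m, sigma) for v in row)


def gradient(m, sigma):
    """Gradient of the loss with respect to M: 4 R M (Sigma symmetric)."""
    rm = matmul(residual(m, sigma), m)
    return [[4.0 * v for v in row] for row in rm]


def gradient_descent(m, sigma, step, iterations):
    """Fixed-step gradient descent on the factor M."""
    for _ in range(iterations):
        g = gradient(m, sigma)
        m = [[m[i][j] - step * g[i][j] for j in range(len(m[0]))]
             for i in range(len(m))]
    return m


# Toy target (assumption): an exactly rank-one PSD matrix u u^T with |u| = 1.
u = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]
sigma = [[ui * uj for uj in u] for ui in u]
m0 = [[0.5], [0.5], [0.5]]          # 3 x 1 factor, so M M^T has rank one
initial_loss = loss(m0, sigma)
m_final = gradient_descent(m0, sigma, step=0.05, iterations=500)
final_loss = loss(m_final, sigma)
```

Writing the low-rank part as $MM^{T}$ makes it positive semidefinite with rank at most the number of columns of $M$ by construction, so the descent needs no explicit rank or definiteness constraint.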
-
A direct D-bar reconstruction algorithm for recovering a complex conductivity in 2-D
Authors:
S. J. Hamilton,
C. N. L. Herrera,
J. L. Mueller,
A. Von Herrmann
Abstract:
A direct reconstruction algorithm for complex conductivities in $W^{2,\infty}(Ω)$, where $Ω$ is a bounded, simply connected Lipschitz domain in $\mathbb{R}^2$, is presented. The framework is based on the uniqueness proof by Francini [Inverse Problems 20 2000], but equations relating the Dirichlet-to-Neumann to the scattering transform and the exponentially growing solutions are not present in that work, and are derived here. The algorithm constitutes the first D-bar method for the reconstruction of conductivities and permittivities in two dimensions. Reconstructions of numerically simulated chest phantoms with discontinuities at the organ boundaries are included.
Submitted 12 November, 2012; v1 submitted 8 February, 2012;
originally announced February 2012.