Fast Iterative Solver For Neural Network Method: II. 1D Diffusion-Reaction Problems And Data Fitting

Authors: Zhiqiang Cai, Anastassia Doktorova, Robert D. Falgout, César Herrera

Abstract: This paper expands the damped block Newton (dBN) method introduced recently in [4] for 1D diffusion-reaction equations and least-squares data fitting problems. To determine the linear parameters (the weights and bias of the output layer) of the neural network (NN), the dBN method requires solving systems of linear equations involving the mass matrix. While the mass matrix for local hat basis functions is tridiagonal and well-conditioned, the mass matrix for NNs is dense and ill-conditioned. For example, the condition number of the NN mass matrix for quasi-uniform meshes is at least ${\cal O}(n^4)$. We present a factorization of the mass matrix that enables solving the systems of linear equations in ${\cal O}(n)$ operations. To determine the non-linear parameters (the weights and bias of the hidden layer), one step of a damped Newton method is employed at each iteration. A Gauss-Newton method is used in place of Newton for the instances in which the Hessian matrices are singular. This modified dBN is referred to as dBGN. For both methods, the computational cost per iteration is ${\cal O}(n)$. Numerical results demonstrate the ability of dBN and dBGN to efficiently achieve accurate results and outperform BFGS for select examples.

Submitted 1 July, 2024; originally announced July 2024.

MSC Class: 65K10; 65F05

arXiv:2404.17750

Fast Iterative Solver For Neural Network Method: I. 1D Diffusion Problems

Authors: Zhiqiang Cai, Anastassia Doktorova, Robert D. Falgout, César Herrera

Abstract: The discretization of the deep Ritz method [18] for the Poisson equation leads to a high-dimensional non-convex minimization problem that is difficult and expensive to solve numerically. In this paper, we consider the shallow Ritz approximation to one-dimensional diffusion problems and introduce an effective and efficient iterative method, a damped block Newton (dBN) method, for solving the resulting non-convex minimization problem. The method employs the block Gauss-Seidel method as an outer iteration by dividing the parameters of a shallow neural network into the linear parameters (the weights and bias of the output layer) and the non-linear parameters (the weights and bias of the hidden layer). In each outer iteration, the linear and the non-linear parameters are updated by exact inversion and one step of a damped Newton method, respectively. Inverses of the coefficient matrix and the Hessian matrix are tridiagonal and diagonal, respectively, and hence the cost of each dBN iteration is $\mathcal{O}(n)$. To move the breakpoints (the non-linear parameters) more efficiently, we propose an adaptive damped block Newton (AdBN) method by combining the dBN with the adaptive neuron enhancement (ANE) method [25]. Numerical examples demonstrate the ability of dBN and AdBN not only to move the breakpoints quickly and efficiently but also to achieve a nearly optimal order of convergence for AdBN. These iterative solvers are capable of outperforming BFGS for select examples.

Submitted 26 April, 2024; originally announced April 2024.

MSC Class: 65N99

arXiv:2402.09229

Invariant conformal Killing forms on almost abelian Lie groups

Authors: Cecilia Herrera, Marcos Origlia

Abstract: We completely describe conformal Killing or conformal Killing-Yano (CKY) $p$-forms on almost abelian metric Lie algebras. In particular, we prove that if an $n$-dimensional almost abelian metric Lie algebra admits a non-parallel CKY $p$-form, then $p=1$ or $p=n-1$. In other words, any CKY $p$-form on a metric almost abelian Lie algebra is parallel for $2\leq p\leq n-2$. Moreover, we characterize almost abelian Lie algebras admitting non-parallel CKY $p$-forms, and we classify all Lie algebras with this property up to dimension $5$, distinguishing also those cases where the associated simply connected Lie group admits lattices.

Submitted 14 February, 2024; originally announced February 2024.

arXiv:2104.13669

Optimal Stopping via Randomized Neural Networks

Authors: Calypso Herrera, Florian Krach, Pierre Ruyssen, Josef Teichmann

Abstract: This paper presents the benefits of using randomized neural networks instead of standard basis functions or deep neural networks to approximate the solutions of optimal stopping problems. The key idea is to use neural networks, where the parameters of the hidden layers are generated randomly and only the last layer is trained, in order to approximate the continuation value. Our approaches are applicable to high dimensional problems where the existing approaches become increasingly impractical. In addition, since our approaches can be optimized using simple linear regression, they are easy to implement and theoretical guarantees can be provided. We test our approaches for American option pricing on Black--Scholes, Heston and rough Heston models and for optimally stopping a fractional Brownian motion. In all cases, our algorithms outperform the state-of-the-art and other relevant machine learning approaches in terms of computation time while achieving comparable results. Moreover, we show that they can also be used to efficiently compute Greeks of American options.

Submitted 1 December, 2023; v1 submitted 28 April, 2021; originally announced April 2021.

MSC Class: 60G40 (Primary); 68T07 (Secondary)

arXiv:2012.11054

Invariant Conformal Killing-Yano $2$-forms on five dimensional Lie Groups

Authors: Andrea Cecilia Herrera, Marcos Origlia

Abstract: We study left invariant conformal Killing-Yano (CKY) $2$-forms on Lie groups endowed with a left invariant metric. We classify all $5$-dimensional metric Lie algebras carrying CKY tensors that are obtained as a one-dimensional central extension of $4$-dimensional metric Lie algebras endowed with an invertible parallel skew-symmetric tensor. On the other hand, we also classify $5$-dimensional metric Lie algebras with center of dimension greater than one admitting strict CKY tensors. In addition, we determine all possible CKY tensors on these metric Lie algebras. In particular, we exhibit the first examples of CKY $2$-forms on metric Lie algebras which do not admit any Sasakian structure.

Submitted 24 May, 2022; v1 submitted 20 December, 2020; originally announced December 2020.

Comments: to appear in The Journal of Geometric Analysis

arXiv:2012.09356

Parallel skew-symmetric tensors on 4-dimensional metric Lie algebras

Authors: A. C. Herrera

Abstract: We give a complete classification, up to isometric isomorphism and scaling, of $4$-dimensional metric Lie algebras $(\mathfrak{g},\langle \cdot,\cdot \rangle)$ that admit a non-zero parallel skew-symmetric endomorphism. In particular, we distinguish those metric Lie algebras that admit such an endomorphism which is not a multiple of a complex structure, and for each of them we obtain the de Rham decomposition of the associated simply connected Lie group with the corresponding left invariant metric. On the other hand, we find that the associated simply connected Lie group is irreducible as a Riemannian manifold for those metric Lie algebras where each parallel skew-symmetric endomorphism is a multiple of a complex structure.

Submitted 15 March, 2022; v1 submitted 16 December, 2020; originally announced December 2020.

Comments: To appear in Revista de la Unión Matemática Argentina, https://doi.org/10.33044/revuma.2451

arXiv:2006.04727

Neural Jump Ordinary Differential Equations: Consistent Continuous-Time Prediction and Filtering

Authors: Calypso Herrera, Florian Krach, Josef Teichmann

Abstract: Combinations of neural ODEs with recurrent neural networks (RNN), like GRU-ODE-Bayes or ODE-RNN, are well suited to model irregularly observed time series. While those models outperform existing discrete-time approaches, no theoretical guarantees for their predictive capabilities are available. Assuming that the irregularly-sampled time series data originates from a continuous stochastic process, the $L^2$-optimal online prediction is the conditional expectation given the currently available information. We introduce the Neural Jump ODE (NJ-ODE) that provides a data-driven approach to learn, continuously in time, the conditional expectation of a stochastic process. Our approach models the conditional expectation between two observations with a neural ODE and jumps whenever a new observation is made. We define a novel training framework, which allows us to prove theoretical guarantees for the first time. In particular, we show that the output of our model converges to the $L^2$-optimal prediction. This can be interpreted as a solution to a special filtering problem. We provide experiments showing that the theoretical results also hold empirically. Moreover, we experimentally show that our model outperforms the baselines in more complex learning tasks and give comparisons on real-world datasets.

Submitted 16 April, 2021; v1 submitted 8 June, 2020; originally announced June 2020.

Journal ref: International Conference on Learning Representations (2021)

arXiv:2004.13612

Denise: Deep Robust Principal Component Analysis for Positive Semidefinite Matrices

Authors: Calypso Herrera, Florian Krach, Anastasis Kratsios, Pierre Ruyssen, Josef Teichmann

Abstract: The robust PCA of covariance matrices plays an essential role when isolating key explanatory features. The currently available methods for performing such a low-rank plus sparse decomposition are matrix specific, meaning that those algorithms must be re-run for every new matrix. Since these algorithms are computationally expensive, it is preferable to learn and store a function that nearly instantaneously performs this decomposition when evaluated. Therefore, we introduce Denise, a deep learning-based algorithm for robust PCA of covariance matrices, or more generally, of symmetric positive semidefinite matrices, which learns precisely such a function. Theoretical guarantees for Denise are provided. These include a novel universal approximation theorem adapted to our geometric deep learning problem and convergence to an optimal solution to the learning problem. Our experiments show that Denise matches state-of-the-art performance in terms of decomposition quality, while being approximately $2000\times$ faster than the state-of-the-art, principal component pursuit (PCP), and $200 \times$ faster than the current speed-optimized method, fast PCP.

Submitted 6 June, 2023; v1 submitted 28 April, 2020; originally announced April 2020.

Journal ref: Transactions on Machine Learning Research (2023)

arXiv:1908.00461

Low-Rank plus Sparse Decomposition of Covariance Matrices using Neural Network Parametrization

Authors: Michel Baes, Calypso Herrera, Ariel Neufeld, Pierre Ruyssen

Abstract: This paper revisits the problem of decomposing a positive semidefinite matrix as a sum of a matrix with a given rank plus a sparse matrix. An immediate application can be found in portfolio optimization, when the matrix to be decomposed is the covariance between the different assets in the portfolio. Our approach consists in representing the low-rank part of the solution as the product $MM^{T}$, where $M$ is a rectangular matrix of appropriate size, parametrized by the coefficients of a deep neural network. We then use a gradient descent algorithm to minimize an appropriate loss function over the parameters of the network. We deduce its convergence rate to a local optimum from the Lipschitz smoothness of our loss function. We show that the rate of convergence grows polynomially in the dimensions of the input, output, and the size of each of the hidden layers.

Submitted 15 June, 2021; v1 submitted 1 August, 2019; originally announced August 2019.

arXiv:1202.1785

doi 10.1088/0266-5611/28/9/095005

A direct D-bar reconstruction algorithm for recovering a complex conductivity in 2-D

Authors: S. J. Hamilton, C. N. L. Herrera, J. L. Mueller, A. Von Herrmann

Abstract: A direct reconstruction algorithm for complex conductivities in $W^{2,\infty}(Ω)$, where $Ω$ is a bounded, simply connected Lipschitz domain in $\mathbb{R}^2$, is presented. The framework is based on the uniqueness proof by Francini [Inverse Problems 20 2000], but equations relating the Dirichlet-to-Neumann map to the scattering transform and the exponentially growing solutions are not present in that work, and are derived here. The algorithm constitutes the first D-bar method for the reconstruction of conductivities and permittivities in two dimensions. Reconstructions of numerically simulated chest phantoms with discontinuities at the organ boundaries are included.

Submitted 12 November, 2012; v1 submitted 8 February, 2012; originally announced February 2012.

Comments: This is an author-created, un-copyedited version of an article accepted for publication in [insert name of journal]. IOP Publishing Ltd is not responsible for any errors or omissions in this version of the manuscript or any version derived from it. The Version of Record is available online at 10.1088/0266-5611/28/9/095005

Journal ref: Inverse Problems 28 (2012) 095005

Showing 1–10 of 10 results for author: Herrera, C