Search | arXiv e-print repository

New algebraic fast algorithms for $N$-body problems in two and three dimensions

Authors: Ritesh Khan, Sivaram Ambikasaran

Abstract: This article presents two new algebraic algorithms to perform fast matrix-vector product for $N$-body problems in $d$ dimensions, namely nHODLR$d$D (nested algorithm) and s-nHODLR$d$D (semi-nested or partially nested algorithm). The nHODLR$d$D and s-nHODLR$d$D algorithms are the nested and semi-nested versions of our previously proposed fast algorithm, the hierarchically off-diagonal low-rank matrix in $d$ dimensions (HODLR$d$D), respectively, where the admissible clusters are certain far-field clusters and the vertex-sharing clusters. We rely on algebraic low-rank approximation techniques (ACA and NCA) and develop both algorithms in a black-box (kernel-independent) fashion. The initialization time of the proposed hierarchical structures scales quasi-linearly. Using the nHODLR$d$D and s-nHODLR$d$D hierarchical structures, one can perform the multiplication of a dense matrix (arising out of $N$-body problems) with a vector that scales as $\mathcal{O}(pN)$ and $\mathcal{O}(pN \log(N))$, respectively, where $p$ grows at most polylogarithmically with $N$. The numerical results in $2$D and $3$D $(d=2,3)$ show that the proposed nHODLR$d$D algorithm is competitive with the algebraic Fast Multipole Method in $d$ dimensions with respect to the matrix-vector product time and space complexity. The C++ implementation with OpenMP parallelization of the proposed algorithms is available at \url{https://github.com/riteshkhan/nHODLRdD/}.

Submitted 27 April, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

Comments: 41 pages

MSC Class: 65F55; 65D12; 65R20; 65D05; 65R10

arXiv:2307.16303 [pdf, other]

HODLR3D: Hierarchical matrices for $N$-body problems in three dimensions

Authors: V A Kandappan, Vaishnavi Gujjula, Sivaram Ambikasaran

Abstract: This article introduces HODLR3D, a class of hierarchical matrices arising out of $N$-body problems in three dimensions. HODLR3D relies on the fact that certain off-diagonal matrix sub-blocks arising out of the $N$-body problems in three dimensions are numerically low-rank. For the Laplace kernel in $3$D, which is widely encountered, we prove that all the off-diagonal matrix sub-blocks are rank deficient in finite precision. We also obtain the growth of the rank as a function of the size of these matrix sub-blocks. For other kernels in three dimensions, we numerically illustrate a similar scaling in rank for the different off-diagonal sub-blocks. We leverage this hierarchical low-rank structure to construct HODLR3D representation, with which we accelerate matrix-vector products. The storage and computational complexity of the HODLR3D matrix-vector product scales almost linearly with system size. We demonstrate the computational performance of HODLR3D representation through various numerical experiments. Further, we explore the performance of the HODLR3D representation on distributed memory systems. HODLR3D, described in this article, is based on a weak admissibility condition. Among the hierarchical matrices with different weak admissibility conditions in $3$D, only in HODLR3D did the rank of the admissible off-diagonal blocks not scale with any power of the system size. Thus, the storage and the computational complexity of the HODLR3D matrix-vector product remain tractable for $N$-body problems with large system sizes.

Submitted 30 July, 2023; originally announced July 2023.

Comments: pre-peer review version

MSC Class: 68Q25; 68R10; 68U05; 45B05; 68U20

arXiv:2301.12704 [pdf, other]

Algebraic Inverse Fast Multipole Method: A fast direct solver that is better than HODLR based fast direct solver

Authors: Vaishnavi Gujjula, Sivaram Ambikasaran

Abstract: This article presents a fast direct solver, termed Algebraic Inverse Fast Multipole Method (from now on abbreviated as AIFMM), for linear systems arising out of $N$-body problems. AIFMM relies on the following three main ideas: (i) Certain sub-blocks in the matrix corresponding to $N$-body problems can be efficiently represented as low-rank matrices; (ii) The low-rank sub-blocks in the above matrix are leveraged to construct an extended sparse linear system; (iii) While solving the extended sparse linear system, certain fill-ins that arise in the elimination phase are represented as low-rank matrices and are "redirected" through other variables maintaining zero fill-in sparsity. The main highlights of this article are the following: (i) Our method is completely algebraic (as opposed to the existing Inverse Fast Multipole Method~\cite{ arXiv:1407.1572,doi:10.1137/15M1034477,TAKAHASHI2017406}, from now on abbreviated as IFMM). We rely on our new Nested Cross Approximation~\cite{arXiv:2203.14832} (from now on abbreviated as NNCA) to represent the matrix arising out of $N$-body problems. (ii) A significant contribution is that the algorithm presented in this article is more efficient than the existing IFMMs. In the existing IFMMs, the fill-ins are compressed and redirected as and when they are created. Whereas in this article, we update the fill-ins first without affecting the computational complexity. We then compress and redirect them only once. (iii) Another noteworthy contribution of this article is that we provide a comparison of AIFMM with Hierarchical Off-Diagonal Low-Rank (from now on abbreviated as HODLR) based fast direct solver and NNCA powered GMRES based fast iterative solver. (iv) Additionally, AIFMM is also demonstrated as a preconditioner.

Submitted 30 January, 2023; originally announced January 2023.

Comments: 32 pages, 16 Figures, 13 Tables

MSC Class: 65F05 (Primary); 65F08; 65Y20 (Secondary)

arXiv:2209.05819 [pdf, other]

HODLR$d$D: A new Black-box fast algorithm for $N$-body problems in $d$-dimensions with guaranteed error bounds

Authors: Ritesh Khan, V A Kandappan, Sivaram Ambikasaran

Abstract: In this article, we prove new theorems bounding the rank of different sub-matrices arising from these kernel functions. Bounds like these are often useful for analyzing the complexity of various hierarchical matrix algorithms. We also plot the numerical rank growth of different sub-matrices arising out of various kernel functions in $1$D, $2$D, $3$D and $4$D, which, not surprisingly, agrees with the proposed theorems. Another significant contribution of this article is that, using the obtained rank bounds, we also propose a way to extend the notion of \textbf{\emph{weak-admissibility}} for hierarchical matrices in higher dimensions. Based on this proposed \textbf{\emph{weak-admissibility}} condition, we develop a black-box (kernel-independent) fast algorithm for $N$-body problems, hierarchically off-diagonal low-rank matrix in $d$ dimensions (HODLR$d$D), which can perform matrix-vector products with $\mathcal{O}(pN \log (N))$ complexity in any dimension $d$, where $p$ doesn't grow with any power of $N$. More precisely, our theorems guarantee that $p \in \mathcal{O} (\log (N) \log^d (\log (N)))$, which implies our HODLR$d$D algorithm scales almost linearly. The $\texttt{C++}$ implementation with \texttt{OpenMP} parallelization of the HODLR$d$D is available at \url{https://github.com/SAFRAN-LAB/HODLRdD}. We also discuss the scalability of the HODLR$d$D algorithm and showcase the applicability by solving an integral equation in $4$ dimensions and accelerating the training phase of the support vector machines (SVM) for the data sets with four and five features.

Submitted 17 September, 2023; v1 submitted 13 September, 2022; originally announced September 2022.

Comments: 35 pages, 23 figures, 14 tables

MSC Class: 65F55; 65D05; 65R10; 65R20; 65F55; 65D12 ACM Class: G.1.1; G.1.3

arXiv:2205.15563 [pdf, other]

Spectrum of MATLABs magic squares

Authors: Hariprasad Manjunath, Sivaram Ambikasaran

Abstract: This article looks at the eigenvalues of magic squares generated by MATLAB's magic($n$) function. The magic($n$) function constructs doubly even ($n = 4k$) magic squares, singly even ($n = 4k+2$) magic squares and odd ($n = 2k+1$) magic squares using different algorithms. The doubly even magic squares are constructed by a criss-cross method that involves reflecting the entries of a simple square about the center. The odd magic squares are constructed using the Siamese method. The singly even magic squares are constructed using a lower-order odd magic square (Strachey method). We obtain approximations of eigenvalues of odd and singly even magic squares and prove error bounds on the approximation. For the sake of completeness, we also obtain the eigenpairs of doubly even magic squares generated by MATLAB. The approximation of the spectra involves some interesting connections with the spectrum of g-circulant matrices and the use of the Bauer-Fike theorem.

Submitted 18 September, 2022; v1 submitted 31 May, 2022; originally announced May 2022.

MSC Class: 15A18; 15A60

arXiv:2204.05536 [pdf, other]

HODLR2D: A new class of Hierarchical matrices

Authors: V A Kandappan, Vaishnavi Gujjula, Sivaram Ambikasaran

Abstract: This article introduces HODLR2D, a new hierarchical low-rank representation for a class of dense matrices arising out of $N$ body problems in two dimensions. Using this new hierarchical framework, we propose a new fast matrix-vector product that scales almost linearly. We apply this fast matrix-vector product to accelerate the iterative solution of large dense linear systems arising out of radial basis function interpolation and discretized integral equation. The space and computational complexity of HODLR2D matrix-vector products scales as $\mathcal{O}(pN \log(N))$, where $p$ is the maximum rank of the compressed matrix subblocks. We also prove that $p \in \mathcal{O}(\log(N)\log(\log(N)))$, which ensures that the storage and computational complexity of HODLR2D matrix-vector products remain tractable for large $N$. Additionally, we also present the parallel scalability of HODLR2D as part of this article.

Submitted 12 April, 2022; originally announced April 2022.

Comments: 26 pages, Removed line numbers

MSC Class: 65F55; 05C50; 31A10; 65D12
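
The low-rank idea behind the HODLR2D matrix-vector product can be illustrated on a single off-diagonal block. The sketch below is not taken from the paper: it assumes a 2D logarithmic kernel, two hypothetical well-separated point clusters, and a truncated SVD as a stand-in for whatever compression the authors use. It only shows that such a block has small numerical rank $p$ and that the factored product costs $\mathcal{O}(p(m+n))$ instead of $\mathcal{O}(mn)$.

```python
import numpy as np

rng = np.random.default_rng(0)
src = rng.random((200, 2))          # hypothetical source cluster in [0, 1]^2
tgt = rng.random((200, 2)) + 3.0    # hypothetical target cluster in [3, 4]^2

# Dense interaction block A[i, j] = log|tgt_i - src_j| (assumed kernel).
diff = tgt[:, None, :] - src[None, :, :]
A = np.log(np.linalg.norm(diff, axis=2))

# Truncated SVD as a stand-in compression; keep singular values above
# a relative tolerance of 1e-10.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
p = int(np.sum(s > 1e-10 * s[0]))   # numerical rank of the block
Up = U[:, :p] * s[:p]               # 200 x p factor
Vp = Vt[:p, :]                      # p x 200 factor

x = rng.random(200)
y_dense = A @ x                     # O(m n) work
y_lowrank = Up @ (Vp @ x)           # O(p (m + n)) work
rel_err = np.linalg.norm(y_dense - y_lowrank) / np.linalg.norm(y_dense)
print(p, rel_err)
```

A hierarchical format applies this kind of compression to many blocks across the levels of a tree, which is where the $\log(N)$ factor in the stated complexity comes from.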

arXiv:2204.00326 [pdf, other]

doi 10.4208/cicp.OA-2022-0103

A new Directional Algebraic Fast Multipole Method based iterative solver for the Lippmann-Schwinger equation accelerated with HODLR preconditioner

Authors: Vaishnavi Gujjula, Sivaram Ambikasaran

Abstract: We present a fast iterative solver for scattering problems in 2D, where a penetrable object with compact support is considered. By representing the scattered field as a volume potential in terms of the Green's function, we arrive at the Lippmann-Schwinger equation in integral form, which is then discretized using an appropriate quadrature technique. The discretized linear system is then solved using an iterative solver accelerated by Directional Algebraic Fast Multipole Method (DAFMM). The DAFMM presented here relies on the directional admissibility condition of the 2D Helmholtz kernel. And the construction of low-rank factorizations of the appropriate low-rank matrix sub-blocks is based on our new Nested Cross Approximation (NCA)~\cite{ arXiv:2203.14832 [math.NA]}. The advantage of our new NCA is that the search space of so-called far-field pivots is smaller than that of the existing NCAs. Another significant contribution of this work is the use of HODLR based direct solver as a preconditioner to further accelerate the iterative solver. In one of our numerical experiments, the iterative solver does not converge without a preconditioner. We show that the HODLR preconditioner is capable of solving problems that the iterative solver can not. Another noteworthy contribution of this article is that we perform a comparative study of the HODLR based fast direct solver, DAFMM based fast iterative solver, and HODLR preconditioned DAFMM based fast iterative solver for the discretized Lippmann-Schwinger problem. To the best of our knowledge, this work is one of the first to provide a systematic study and comparison of these different solvers for various problem sizes and contrast functions. In the spirit of reproducible computational science, the implementation of the algorithms developed in this article is made available at \url{https://github.com/vaishna77/Lippmann_Schwinger_Solver}.

Submitted 26 March, 2023; v1 submitted 1 April, 2022; originally announced April 2022.

Comments: 36 pages, 15 figures, 8 tables

MSC Class: 65R20; 65F55 (Primary) 31A10; 35J05; 35J08; 65R10 (Secondary)

Journal ref: Communications in Computational Physics, Volume 32, Issue 4, Year 2022, Pages 1061-1093

arXiv:2203.14832 [pdf, other]

A new Nested Cross Approximation

Authors: Vaishnavi Gujjula, Sivaram Ambikasaran

Abstract: In this article, we present a new Nested Cross Approximation (NNCA) for constructing H2 matrices. It differs from the existing NCAs~\cite{bebendorf2012constructing, zhao2019fast} in the technique of choosing pivots, a key part of the approximation. Our technique of choosing pivots is purely algebraic and involves only a single tree traversal. We demonstrate its applicability by developing a fast H2 matrix-vector product, that uses NNCA for the appropriate low-rank approximations. We illustrate the timing profiles and the accuracy of NNCA based H2 matrix-vector product. We also provide a comparison of NNCA based H2 matrix-vector product with the existing NCA based H2 matrix-vector products. A key observation is that NNCA performs better than the existing NCAs. In addition, using the NNCA based H2 matrix-vector product, we accelerate i) solving an integral equation in 3D and ii) Support Vector Machine (SVM). In the spirit of reproducible computational science, the implementation of the algorithm developed in this article is made available at \url{https://github.com/SAFRAN-LAB/NNCA}.

Submitted 12 September, 2023; v1 submitted 28 March, 2022; originally announced March 2022.

Comments: 25 pages, 11 figures, 6 Tables

MSC Class: 65F55 ACM Class: G.1.3; G.1.9

arXiv:1703.09710 [pdf, other]

doi 10.3847/1538-3881/aa9332

Fast and scalable Gaussian process modeling with applications to astronomical time series

Authors: Daniel Foreman-Mackey, Eric Agol, Sivaram Ambikasaran, Ruth Angus

Abstract: The growing field of large-scale time domain astronomy requires methods for probabilistic data analysis that are computationally tractable, even with large datasets. Gaussian Processes are a popular class of models used for this purpose but, since the computational cost scales, in general, as the cube of the number of data points, their application has been limited to small datasets. In this paper, we present a novel method for Gaussian Process modeling in one-dimension where the computational requirements scale linearly with the size of the dataset. We demonstrate the method by applying it to simulated and real astronomical time series datasets. These demonstrations are examples of probabilistic inference of stellar rotation periods, asteroseismic oscillation spectra, and transiting planet parameters. The method exploits structure in the problem when the covariance function is expressed as a mixture of complex exponentials, without requiring evenly spaced observations or uniform noise. This form of covariance arises naturally when the process is a mixture of stochastically-driven damped harmonic oscillators -- providing a physical motivation for and interpretation of this choice -- but we also demonstrate that it can be a useful effective model in some other cases. We present a mathematical description of the method and compare it to existing scalable Gaussian Process methods. The method is fast and interpretable, with a range of potential applications within astronomical data analysis and beyond. We provide well-tested and documented open-source implementations of this method in C++, Python, and Julia.

Submitted 19 July, 2017; v1 submitted 28 March, 2017; originally announced March 2017.

Comments: Updated in response to referee. Submitted to the AAS Journals. Comments (still) welcome. Code available: https://github.com/dfm/celerite

arXiv:1512.01338 [pdf, other]

An accurate, fast, mathematically robust, universal, non-iterative algorithm for computing multi-component diffusion velocities

Authors: Sivaram Ambikasaran, Krithika Narayanaswamy

Abstract: Using accurate multi-component diffusion treatment in numerical combustion studies remains formidable due to the computational cost associated with solving for diffusion velocities. To obtain the diffusion velocities, for low density gases, one needs to solve the Stefan-Maxwell equations along with the zero diffusion flux criteria, which scales as $\mathcal{O}(N^3)$, when solved exactly. In this article, we propose an accurate, fast, direct and robust algorithm to compute multi-component diffusion velocities. To our knowledge, this is the first provably accurate algorithm (the solution can be obtained up to an arbitrary degree of precision) scaling at a computational complexity of $\mathcal{O}(N)$ in finite precision. The key idea involves leveraging the fact that the matrix of the reciprocal of the binary diffusivities, $V$, is low rank, with its rank being independent of the number of species involved. The low rank representation of matrix $V$ is computed in a fast manner at a computational complexity of $\mathcal{O}(N)$ and the Sherman-Morrison-Woodbury formula is used to solve for the diffusion velocities at a computational complexity of $\mathcal{O}(N)$. Rigorous proofs and numerical benchmarks illustrate the low rank property of the matrix $V$ and scaling of the algorithm.

Submitted 16 February, 2016; v1 submitted 4 December, 2015; originally announced December 2015.

Comments: 16 pages, 7 figures, 1 table, 1 algorithm
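
The role of the Sherman-Morrison-Woodbury formula in the abstract above can be sketched on a generic system. The example below does not reproduce the Stefan-Maxwell system or the paper's algorithm; it assumes a hypothetical diagonal-plus-low-rank matrix $D + UW^T$ with rank $r \ll N$ and shows why such a solve costs $\mathcal{O}(Nr^2)$, which is linear in $N$ when $r$ is fixed. The helper name `woodbury_solve` and the test data are illustrative only.

```python
import numpy as np

def woodbury_solve(d, U, W, b):
    """Solve (diag(d) + U @ W.T) x = b using the Woodbury identity.

    Cost is O(N r^2) for U, W of shape (N, r); no N x N matrix is formed.
    """
    Dinv_b = b / d                              # D^{-1} b
    Dinv_U = U / d[:, None]                     # D^{-1} U
    S = np.eye(U.shape[1]) + W.T @ Dinv_U       # r x r capacitance matrix
    return Dinv_b - Dinv_U @ np.linalg.solve(S, W.T @ Dinv_b)

rng = np.random.default_rng(0)
N, r = 500, 3
d = 1.0 + rng.random(N)                 # positive diagonal
U = rng.standard_normal((N, r))
W = U.copy()                            # symmetric update keeps the toy system well conditioned
b = rng.standard_normal(N)

x = woodbury_solve(d, U, W, b)

# Dense matrix formed only to check the answer.
A = np.diag(d) + U @ W.T
residual = np.linalg.norm(A @ x - b) / np.linalg.norm(b)
print(residual)
```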

arXiv:1505.07157 [pdf, other]

Fast, adaptive, high order accurate discretization of the Lippmann-Schwinger equation in two dimension

Authors: Sivaram Ambikasaran, Carlos Borges, Lise-Marie Imbert-Gerard, Leslie Greengard

Abstract: We present a fast direct solver for two dimensional scattering problems, where an incident wave impinges on a penetrable medium with compact support. We represent the scattered field using a volume potential whose kernel is the outgoing Green's function for the exterior domain. Inserting this representation into the governing partial differential equation, we obtain an integral equation of the Lippmann-Schwinger type. The principal contribution here is the development of an automatically adaptive, high-order accurate discretization based on a quad tree data structure which provides rapid access to arbitrary elements of the discretized system matrix. This permits the straightforward application of state-of-the-art algorithms for constructing compressed versions of the solution operator. These solvers typically require $O(N^{3/2})$ work, where $N$ denotes the number of degrees of freedom. We demonstrate the performance of the method for a variety of problems in both the low and high frequency regimes.

Submitted 26 May, 2015; originally announced May 2015.

Comments: 18 pages

MSC Class: 65R20

arXiv:1409.7852 [pdf, other]

Generalized Rybicki Press algorithm

Authors: Sivaram Ambikasaran

Abstract: This article discusses a more general and numerically stable Rybicki Press algorithm, which enables inverting and computing determinants of covariance matrices, whose elements are sums of exponentials. The algorithm is true in exact arithmetic and relies on introducing new variables and corresponding equations, thereby converting the matrix into a banded matrix of larger size. Linear complexity banded algorithms for solving linear systems and computing determinants on the larger matrix enable linear complexity algorithms for the initial semi-separable matrix as well. Benchmarks provided illustrate the linear scaling of the algorithm.

Submitted 1 May, 2015; v1 submitted 27 September, 2014; originally announced September 2014.

Comments: 13 pages, 11 figures, 1 table

MSC Class: 05C50; 05C85; 62M10; 05B20

arXiv:1407.1572 [pdf, other]

The Inverse Fast Multipole Method

Authors: Sivaram Ambikasaran, Eric Darve

Abstract: This article introduces a new fast direct solver for linear systems arising out of wide range of applications, integral equations, multivariate statistics, radial basis interpolation, etc., to name a few. \emph{The highlight of this new fast direct solver is that the solver scales linearly in the number of unknowns in all dimensions.} The solver, termed as Inverse Fast Multipole Method (abbreviated as IFMM), works on the same data-structure as the Fast Multipole Method (abbreviated as FMM). More generally, the solver can be immediately extended to the class of hierarchical matrices, denoted as $\mathcal{H}^2$ matrices with strong admissibility criteria (weak low-rank structure), i.e., \emph{the interaction between neighboring cluster of particles is full-rank whereas the interaction between particles corresponding to well-separated clusters can be efficiently represented as a low-rank matrix}. The algorithm departs from existing approaches in the fact that throughout the algorithm the interaction corresponding to neighboring clusters are always treated as full-rank interactions. Our approach relies on two major ideas: (i) The $N \times N$ matrix arising out of FMM (from now on termed as FMM matrix) can be represented as an extended sparser matrix of size $M \times M$, where $M \approx 3N$. (ii) While solving the larger extended sparser matrix, \emph{the fill-in's that arise in the matrix blocks corresponding to well-separated clusters are hierarchically compressed}. The ordering of the equations and the unknowns in the extended sparser matrix is strongly related to the local and multipole coefficients in the FMM~\cite{greengard1987fast} and \emph{the order of elimination is different from the usual nested dissection approach}. Numerical benchmarks on $2$D manifold confirm the linear scaling of the algorithm.

Submitted 6 July, 2014; originally announced July 2014.

Comments: 25 pages, 28 figures

arXiv:1405.0223 [pdf, other]

Fast symmetric factorization of hierarchical matrices with applications

Authors: Sivaram Ambikasaran, Michael O'Neil, Karan Raj Singh

Abstract: We present a fast direct algorithm for computing symmetric factorizations, i.e. $A = WW^T$, of symmetric positive-definite hierarchical matrices with weak-admissibility conditions. The computational cost for the symmetric factorization scales as $\mathcal{O}(n \log^2 n)$ for hierarchically off-diagonal low-rank matrices. Once this factorization is obtained, the cost for inversion, application, and determinant computation scales as $\mathcal{O}(n \log n)$. In particular, this allows for the near optimal generation of correlated random variates in the case where $A$ is a covariance matrix. This symmetric factorization algorithm depends on two key ingredients. First, we present a novel symmetric factorization formula for low-rank updates to the identity of the form $I+UKU^T$. This factorization can be computed in $\mathcal{O}(n)$ time if the rank of the perturbation is sufficiently small. Second, combining this formula with a recursive divide-and-conquer strategy, near linear complexity symmetric factorizations for hierarchically structured matrices can be obtained. We present numerical results for matrices relevant to problems in probability \& statistics (Gaussian processes), interpolation (Radial basis functions), and Brownian dynamics calculations in fluid mechanics (the Rotne-Prager-Yamakawa tensor).

Submitted 30 December, 2016; v1 submitted 1 May, 2014; originally announced May 2014.

Comments: 18 pages, 8 figures, 1 table
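
One way to obtain a symmetric factor of a low-rank update $I + UKU^T$ is sketched below. This is a derivation under stated assumptions, not necessarily the formula from the paper: it looks for $W = I + UXU^T$ and uses two small Cholesky factorizations, $U^TU = LL^T$ and $I + L^TKL = MM^T$, which give $X = L^{-T}(M - I)L^{-1}$. The function name `symmetric_factor_lowrank` and the test data are hypothetical. All factorizations act on $r \times r$ matrices, so the cost is $\mathcal{O}(nr^2)$.

```python
import numpy as np

def symmetric_factor_lowrank(U, K):
    """Return X (r x r) such that W = I + U X U^T satisfies W W^T = I + U K U^T.

    Assumes K is symmetric, U has full column rank, and I + U K U^T is
    positive definite.
    """
    r = K.shape[0]
    L = np.linalg.cholesky(U.T @ U)                     # U^T U = L L^T
    M = np.linalg.cholesky(np.eye(r) + L.T @ K @ L)     # I + L^T K L = M M^T
    Linv = np.linalg.inv(L)                             # r x r, cheap to invert
    return Linv.T @ (M - np.eye(r)) @ Linv

rng = np.random.default_rng(0)
n, r = 300, 4
U = rng.standard_normal((n, r))
B = rng.standard_normal((r, r))
K = B @ B.T                             # symmetric positive semi-definite

X = symmetric_factor_lowrank(U, K)

# Dense matrices formed only to verify the identity on a small example.
A = np.eye(n) + U @ K @ U.T
W = np.eye(n) + U @ X @ U.T
err = np.linalg.norm(W @ W.T - A) / np.linalg.norm(A)
print(err)
```

In practice $W$ would be kept in the factored form $(U, X)$, so applying it to a vector costs $\mathcal{O}(nr)$.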

arXiv:1404.3816 [pdf, ps, other]

doi 10.1002/2013WR014607

A Kalman filter powered by $\mathcal{H}^2$-matrices for quasi-continuous data assimilation problems

Authors: Judith Y. Li, Sivaram Ambikasaran, Eric F. Darve, Peter K. Kitanidis

Abstract: Continuously tracking the movement of a fluid or a plume in the subsurface is a challenge that is often encountered in applications, such as tracking a plume of injected CO$_2$ or of a hazardous substance. Advances in monitoring techniques have made it possible to collect measurements at a high frequency while the plume moves, which has the potential advantage of providing continuous high-resolution images of fluid flow with the aid of data processing. However, the applicability of this approach is limited by the high computational cost associated with having to analyze large data sets within the time constraints imposed by real-time monitoring. Existing data assimilation methods have computational requirements that increase super-linearly with the size of the unknowns $m$. In this paper, we present the HiKF, a new Kalman filter (KF) variant powered by the hierarchical matrix approach that dramatically reduces the computational and storage cost of the standard KF from $\mathcal{O}(m^2)$ to $\mathcal{O}(m)$, while producing practically the same results. The version of HiKF that is presented here takes advantage of the so-called random walk dynamical model, which is tailored to a class of data assimilation problems in which measurements are collected quasi-continuously. The proposed method has been applied to a realistic CO$_2$ injection model and compared with the ensemble Kalman filter (EnKF). Numerical results show that HiKF can provide estimates that are more accurate than EnKF, and also demonstrate the usefulness of modeling the system dynamics as a random walk in this context.

Submitted 15 April, 2014; originally announced April 2014.

Comments: 18 pages, 7 figures. Water Resources Research, 2014

ACM Class: I.4.4; I.4.5; I.4.10; G.1.3; I.1.2

arXiv:1404.3451

A fast direct solver for high frequency scattering from a large cavity in two dimensions

Authors: Jun Lai, Sivaram Ambikasaran, Leslie F. Greengard

Abstract: We present a fast direct solver for the simulation of electromagnetic scattering from an arbitrarily-shaped, large, empty cavity embedded in an infinite perfectly conducting half space. The governing Maxwell equations are reformulated as a well-conditioned second kind integral equation and the resulting linear system is solved in nearly linear time using a hierarchical matrix factorization technique. We illustrate the performance of the scheme with several numerical examples for complex cavity shapes over a wide range of frequencies.

Submitted 13 April, 2014; originally announced April 2014.

Comments: 15 pages, 9 figures. Contact author for animation

MSC Class: 97N40; 65F05; 45B05; 45A05 ACM Class: G.1; G.1.3

arXiv:1403.6015

Fast Direct Methods for Gaussian Processes

Authors: Sivaram Ambikasaran, Daniel Foreman-Mackey, Leslie Greengard, David W. Hogg, Michael O'Neil

Abstract: A number of problems in probability and statistics can be addressed using the multivariate normal (Gaussian) distribution. In the one-dimensional case, computing the probability for a given mean and variance simply requires the evaluation of the corresponding Gaussian density. In the $n$-dimensional setting, however, it requires the inversion of an $n \times n$ covariance matrix, $C$, as well as the evaluation of its determinant, $\det(C)$. In many cases, such as regression using Gaussian processes, the covariance matrix is of the form $C = \sigma^2 I + K$, where $K$ is computed using a specified covariance kernel which depends on the data and additional parameters (hyperparameters). The matrix $C$ is typically dense, causing standard direct methods for inversion and determinant evaluation to require $\mathcal{O}(n^3)$ work. This cost is prohibitive for large-scale modeling. Here, we show that for the most commonly used covariance functions, the matrix $C$ can be hierarchically factored into a product of block low-rank updates of the identity matrix, yielding an $\mathcal{O}(n \log^2 n)$ algorithm for inversion. More importantly, we show that this factorization enables the evaluation of the determinant $\det(C)$, permitting the direct calculation of probabilities in high dimensions under fairly broad assumptions on the kernel defining $K$. Our fast algorithm brings many problems in marginalization and the adaptation of hyperparameters within practical reach using a single CPU core. The combination of nearly optimal scaling in terms of problem size with high-performance computing resources will permit the modeling of previously intractable problems. We illustrate the performance of the scheme on standard covariance kernels.

Submitted 4 April, 2015; v1 submitted 24 March, 2014; originally announced March 2014.

arXiv:1403.5337

doi 10.1016/j.jcp.2015.10.012

A Fast Block Low-Rank Dense Solver with Applications to Finite-Element Matrices

Authors: Amirhossein Aminfar, Sivaram Ambikasaran, Eric Darve

Abstract: This article presents a fast solver for the dense "frontal" matrices that arise from the multifrontal sparse elimination process of 3D elliptic PDEs. The solver relies on the fact that these matrices can be efficiently represented as a hierarchically off-diagonal low-rank (HODLR) matrix. To construct the low-rank approximation of the off-diagonal blocks, we propose a new pseudo-skeleton scheme, the boundary distance low-rank approximation, that picks rows and columns based on the location of their corresponding vertices in the sparse matrix graph. We compare this new low-rank approximation method to the adaptive cross approximation (ACA) algorithm and show that it achieves better speedup, especially for unstructured meshes. Using the HODLR direct solver as a preconditioner (with a low tolerance) to the GMRES iterative scheme, we can reach machine accuracy much faster than a conventional LU solver. Numerical benchmarks are provided for frontal matrices arising from 3D finite element problems corresponding to a wide range of applications.

Submitted 18 March, 2015; v1 submitted 20 March, 2014; originally announced March 2014.

arXiv:1101.4081

A simple way to speedup Gauss Elimination

Authors: Sivaram Ambikasaran

Abstract: This article looks at a simple modification to speed up the conventional Gauss Elimination. The proposed modification speeds up the conventional Gauss Elimination by a factor of nearly 9/7 (in the asymptotic limit).

Submitted 1 August, 2013; v1 submitted 21 January, 2011; originally announced January 2011.

Comments: Withdrawing test file

Showing 1–19 of 19 results for author: Ambikasaran, S