-
Dynamical low-rank tensor approximations to high-dimensional parabolic problems: existence and convergence of spatial discretizations
Authors:
Markus Bachmayr,
Henrik Eisenmann,
André Uschmajew
Abstract:
We consider dynamical low-rank approximations to parabolic problems on higher-order tensor manifolds in Hilbert spaces. In addition to existence of solutions and their stability with respect to perturbations to the problem data, we show convergence of spatial discretizations. Our framework accommodates various standard low-rank tensor formats for multivariate functions, including tensor train and hierarchical tensors.
Submitted 31 August, 2023;
originally announced August 2023.
-
Gauss-Southwell type descent methods for low-rank matrix optimization
Authors:
Guillaume Olikier,
André Uschmajew,
Bart Vandereycken
Abstract:
We consider gradient-related methods for low-rank matrix optimization with a smooth cost function. The methods operate on single factors of the low-rank factorization and share aspects of both alternating and Riemannian optimization. Two possible choices for the search directions based on Gauss-Southwell type selection rules are compared: one using the gradient of a factorized non-convex formulation, the other using the Riemannian gradient. While both methods provide gradient convergence guarantees that are similar to the unconstrained case, numerical experiments on a quadratic cost function indicate that the version based on the Riemannian gradient is significantly more robust with respect to small singular values and the condition number of the cost function. As a side result of our approach, we also obtain new convergence results for the alternating least squares method.
Submitted 16 July, 2023; v1 submitted 1 June, 2023;
originally announced June 2023.
-
On the approximation of vector-valued functions by samples
Authors:
Daniel Kressner,
Tingting Ni,
André Uschmajew
Abstract:
Given a Hilbert space $\mathcal H$ and a finite measure space $Ω$, the approximation of a vector-valued function $f: Ω\to \mathcal H$ by a $k$-dimensional subspace $\mathcal U \subset \mathcal H$ plays an important role in dimension reduction techniques, such as reduced basis methods for solving parameter-dependent partial differential equations. For functions in the Lebesgue--Bochner space $L^2(Ω;\mathcal H)$, the best possible subspace approximation error $d_k^{(2)}$ is characterized by the singular values of $f$. However, for practical reasons, $\mathcal U$ is often restricted to be spanned by point samples of $f$. We show that this restriction only has a mild impact on the attainable error; there always exist $k$ samples such that the resulting error is not larger than $\sqrt{k+1} \cdot d_k^{(2)}$. Our work extends existing results by Binev et al. (SIAM J. Math. Anal., 43(3):1457--1472, 2011) on approximation in supremum norm and by Deshpande et al. (Theory Comput., 2:225--247, 2006) on column subset selection for matrices.
Submitted 6 April, 2023;
originally announced April 2023.
-
Dynamical low-rank approximation of the Vlasov-Poisson equation with piecewise linear spatial boundary
Authors:
André Uschmajew,
Andreas Zeiser
Abstract:
We consider dynamical low-rank approximation (DLRA) for the numerical simulation of Vlasov--Poisson equations based on separation of space and velocity variables, as proposed in several recent works. The standard approach for the time integration in the DLRA model uses a splitting of the tangent space projector for the low-rank manifold according to the separated variables. It can also be modified to allow for rank-adaptivity. A less studied aspect is the incorporation of boundary conditions in the DLRA model. We propose a variational formulation of the projector splitting which allows handling inflow boundary conditions on spatial domains with piecewise linear boundary. Numerical experiments demonstrate the feasibility of this approach in principle.
Submitted 10 April, 2024; v1 submitted 3 March, 2023;
originally announced March 2023.
-
Time-Varying Semidefinite Programming: Path Following a Burer-Monteiro Factorization
Authors:
Antonio Bellon,
Mareike Dressler,
Vyacheslav Kungurtsev,
Jakub Marecek,
André Uschmajew
Abstract:
We present an online algorithm for time-varying semidefinite programs (TV-SDPs), based on the tracking of the solution trajectory of a low-rank matrix factorization, also known as the Burer-Monteiro factorization, in a path-following procedure. There, a predictor-corrector algorithm solves a sequence of linearized systems. This requires the introduction of a horizontal space constraint to ensure the local injectivity of the low-rank factorization. The method produces a sequence of approximate solutions for the original TV-SDP problem, for which we show that they stay close to the optimal solution path if properly initialized. Numerical experiments for a time-varying max-cut SDP relaxation demonstrate the computational advantages of the proposed method for tracking TV-SDPs in terms of runtime compared to off-the-shelf interior point methods.
Submitted 9 January, 2024; v1 submitted 15 October, 2022;
originally announced October 2022.
-
Kronecker Product Approximation of Operators in Spectral Norm via Alternating SDP
Authors:
Mareike Dressler,
André Uschmajew,
Venkat Chandrasekaran
Abstract:
The decomposition or approximation of a linear operator on a matrix space as a sum of Kronecker products plays an important role in matrix equations and low-rank modeling. The approximation problem in Frobenius norm admits a well-known solution via the singular value decomposition. However, the approximation problem in spectral norm, which is more natural for linear operators, is much more challenging. In particular, the Frobenius norm solution can be far from optimal in spectral norm. We describe an alternating optimization method based on semidefinite programming to obtain high-quality approximations in spectral norm, and we present computational experiments to illustrate the advantages of our approach.
Submitted 6 December, 2023; v1 submitted 7 July, 2022;
originally announced July 2022.
-
Local convergence of alternating low-rank optimization methods with overrelaxation
Authors:
Ivan V. Oseledets,
Maxim V. Rakhuba,
André Uschmajew
Abstract:
The local convergence of alternating optimization methods with overrelaxation for low-rank matrix and tensor problems is established. The analysis is based on the linearization of the method which takes the form of an SOR iteration for a positive semidefinite Hessian and can be studied in the corresponding quotient geometry of equivalent low-rank representations. In the matrix case, the optimal relaxation parameter for accelerating the local convergence can be determined from the convergence rate of the standard method. This result relies on a version of Young's SOR theorem for positive semidefinite $2 \times 2$ block systems.
Submitted 28 June, 2022; v1 submitted 29 November, 2021;
originally announced November 2021.
-
Maximum relative distance between real rank-two and rank-one tensors
Authors:
Henrik Eisenmann,
André Uschmajew
Abstract:
It is shown that the relative distance in Frobenius norm of a real symmetric order-$d$ tensor of rank two to its best rank-one approximation is upper bounded by $\sqrt{1-(1-1/d)^{d-1}}$. This is achieved by determining the minimal possible ratio between spectral and Frobenius norm for symmetric tensors of border rank two, which equals $\left(1-{1}/{d}\right)^{(d-1)/{2}}$. These bounds are also verified for arbitrary real rank-two tensors by reducing to the symmetric case.
Submitted 25 September, 2022; v1 submitted 24 November, 2021;
originally announced November 2021.
-
A note on the optimal convergence rate of descent methods with fixed step sizes for smooth strongly convex functions
Authors:
André Uschmajew,
Bart Vandereycken
Abstract:
Based on a result by Taylor, Hendrickx, and Glineur (J. Optim. Theory Appl., 178(2):455--476, 2018) on the attainable convergence rate of gradient descent for smooth and strongly convex functions in terms of function values, an elementary convergence analysis for general descent methods with fixed step sizes is presented. It covers general variable metric methods, gradient related search directions under angle and scaling conditions, as well as inexact gradient methods. In all cases, optimal rates are obtained.
Submitted 23 March, 2022; v1 submitted 15 June, 2021;
originally announced June 2021.
-
Riemannian thresholding methods for row-sparse and low-rank matrix recovery
Authors:
Henrik Eisenmann,
Felix Krahmer,
Max Pfeffer,
André Uschmajew
Abstract:
In this paper, we present modifications of the iterative hard thresholding (IHT) method for recovery of jointly row-sparse and low-rank matrices. In particular a Riemannian version of IHT is considered which significantly reduces computational cost of the gradient projection in the case of rank-one measurement operators, which have concrete applications in blind deconvolution. Experimental results are reported that show near-optimal recovery for Gaussian and rank-one measurements, and that adaptive stepsizes give crucial improvement. A Riemannian proximal gradient method is derived for the special case of unknown sparsity.
Submitted 30 September, 2022; v1 submitted 3 March, 2021;
originally announced March 2021.
-
A note on overrelaxation in the Sinkhorn algorithm
Authors:
Tobias Lehmann,
Max-K. von Renesse,
Alexander Sambale,
André Uschmajew
Abstract:
We derive an a priori parameter range for overrelaxation of the Sinkhorn algorithm, which guarantees global convergence and a strictly faster asymptotic local convergence. Guided by the spectral analysis of the linearized problem we pursue a zero cost procedure to choose a near optimal relaxation parameter.
Submitted 6 December, 2021; v1 submitted 23 December, 2020;
originally announced December 2020.
-
Existence of dynamical low-rank approximations to parabolic problems
Authors:
Markus Bachmayr,
Henrik Eisenmann,
Emil Kieri,
André Uschmajew
Abstract:
The existence and uniqueness of weak solutions to dynamical low-rank evolution problems for parabolic partial differential equations in two spatial dimensions is shown, covering also non-diagonal diffusion in the elliptic part. The proof is based on a variational time-stepping scheme on the low-rank manifold. Moreover, this scheme is shown to be closely related to practical methods for computing such low-rank evolutions.
Submitted 4 December, 2020; v1 submitted 27 February, 2020;
originally announced February 2020.
-
Chebyshev polynomials and best rank-one approximation ratio
Authors:
Andrei Agrachev,
Khazhgali Kozhasov,
André Uschmajew
Abstract:
We establish a new extremal property of the classical Chebyshev polynomials in the context of best rank-one approximation of tensors. We also give some necessary conditions for a tensor to be a minimizer of the ratio of spectral and Frobenius norms.
Submitted 11 March, 2020; v1 submitted 31 March, 2019;
originally announced April 2019.
-
Alternating least squares as moving subspace correction
Authors:
Ivan Oseledets,
Maxim Rakhuba,
André Uschmajew
Abstract:
In this note we take a new look at the local convergence of alternating optimization methods for low-rank matrices and tensors. Our abstract interpretation as sequential optimization on moving subspaces yields insightful reformulations of some known convergence conditions that focus on the interplay between the contractivity of classical multiplicative Schwarz methods with overlapping subspaces and the curvature of low-rank matrix and tensor manifolds. While the verification of the abstract conditions in concrete scenarios remains open in most cases, we are able to provide an alternative and conceptually simple derivation of the asymptotic convergence rate of the two-sided block power method of numerical algebra for computing the dominant singular subspaces of a rectangular matrix. This method is equivalent to an alternating least squares method applied to a distance function. The theoretical results are illustrated and validated by numerical experiments.
Submitted 11 January, 2019; v1 submitted 21 September, 2017;
originally announced September 2017.
-
On orthogonal tensors and best rank-one approximation ratio
Authors:
Zhening Li,
Yuji Nakatsukasa,
Tasuku Soma,
André Uschmajew
Abstract:
As is well known, the smallest possible ratio between the spectral norm and the Frobenius norm of an $m \times n$ matrix with $m \le n$ is $1/\sqrt{m}$ and is (up to scalar scaling) attained only by matrices having pairwise orthonormal rows. In the present paper, the smallest possible ratio between spectral and Frobenius norms of $n_1 \times \dots \times n_d$ tensors of order $d$, also called the best rank-one approximation ratio in the literature, is investigated. The exact value is not known for most configurations of $n_1 \le \dots \le n_d$. Using a natural definition of orthogonal tensors over the real field (resp., unitary tensors over the complex field), it is shown that the obvious lower bound $1/\sqrt{n_1 \cdots n_{d-1}}$ is attained if and only if a tensor is orthogonal (resp., unitary) up to scaling. Whether or not orthogonal or unitary tensors exist depends on the dimensions $n_1,\dots,n_d$ and the field. A connection between the (non)existence of real orthogonal tensors of order three and the classical Hurwitz problem on composition algebras can be established: existence of orthogonal tensors of size $\ell \times m \times n$ is equivalent to the admissibility of the triple $[\ell,m,n]$ to the Hurwitz problem. Some implications for higher-order tensors are then given. For instance, real orthogonal $n \times \dots \times n$ tensors of order $d \ge 3$ do exist, but only when $n = 1,2,4,8$. In the complex case, the situation is more drastic: unitary tensors of size $\ell \times m \times n$ with $\ell \le m \le n$ exist only when $\ell m \le n$. Finally, some numerical illustrations for spectral norm computation are presented.
Submitted 13 March, 2018; v1 submitted 9 July, 2017;
originally announced July 2017.
-
Tensor Networks for Latent Variable Analysis. Part I: Algorithms for Tensor Train Decomposition
Authors:
Anh-Huy Phan,
Andrzej Cichocki,
Andre Uschmajew,
Petr Tichavsky,
George Luta,
Danilo Mandic
Abstract:
Decompositions of tensors into factor matrices, which interact through a core tensor, have found numerous applications in signal processing and machine learning. A more general tensor model which represents data as an ordered network of sub-tensors of order-2 or order-3 has, so far, not been widely considered in these fields, although this so-called tensor network decomposition has been long studied in quantum physics and scientific computing. In this study, we present novel algorithms and applications of tensor network decompositions, with a particular focus on the tensor train decomposition and its variants. The novel algorithms developed for the tensor train decomposition update, in an alternating way, one or several core tensors at each iteration, and exhibit enhanced mathematical tractability and scalability to exceedingly large-scale data tensors. The proposed algorithms are tested in classic paradigms of blind source separation from a single mixture, denoising, and feature extraction, and achieve superior performance over the widely used truncated algorithms for tensor train decomposition.
Submitted 29 September, 2016;
originally announced September 2016.
-
Finding a low-rank basis in a matrix subspace
Authors:
Yuji Nakatsukasa,
Tasuku Soma,
André Uschmajew
Abstract:
For a given matrix subspace, how can we find a basis that consists of low-rank matrices? This is a generalization of the sparse vector problem. It turns out that when the subspace is spanned by rank-1 matrices, the matrices can be obtained by the tensor CP decomposition. For the higher rank case, the situation is not as straightforward. In this work we present an algorithm based on a greedy process applicable to higher rank problems. Our algorithm first estimates the minimum rank by applying soft singular value thresholding to a nuclear norm relaxation, and then computes a matrix with that rank using the method of alternating projections. We provide local convergence results, and compare our algorithm with several alternative approaches. Applications include data compression beyond the classical truncated SVD, computing accurate eigenvectors of a near-multiple eigenvalue, image separation and graph Laplacian eigenproblems.
Submitted 27 June, 2016; v1 submitted 30 March, 2015;
originally announced March 2015.
-
A new convergence proof for the higher-order power method and generalizations
Authors:
André Uschmajew
Abstract:
A proof for the point-wise convergence of the factors in the higher-order power method for tensors towards a critical point is given. It is obtained by applying established results from the theory of Łojasiewicz inequalities to the equivalent, unconstrained alternating least squares algorithm for best rank-one tensor approximation.
Submitted 23 January, 2015; v1 submitted 17 July, 2014;
originally announced July 2014.
-
On low-rank approximability of solutions to high-dimensional operator equations and eigenvalue problems
Authors:
Daniel Kressner,
André Uschmajew
Abstract:
Low-rank tensor approximation techniques attempt to mitigate the overwhelming complexity of linear algebra tasks arising from high-dimensional applications. In this work, we study the low-rank approximability of solutions to linear systems and eigenvalue problems on Hilbert spaces. Although this question is central to the success of all existing solvers based on low-rank tensor techniques, very few of the results available so far allow one to draw meaningful conclusions for higher dimensions. We develop a constructive framework to study low-rank approximability. One major assumption is that the involved linear operator admits a low-rank representation with respect to the chosen tensor format, a property that is known to hold in a number of applications. Simple conditions, which are shown to hold for a fairly general problem class, guarantee that our derived low-rank truncation error estimates do not deteriorate as the dimensionality increases.
Submitted 7 January, 2016; v1 submitted 26 June, 2014;
originally announced June 2014.
-
Convergence results for projected line-search methods on varieties of low-rank matrices via Łojasiewicz inequality
Authors:
Reinhold Schneider,
André Uschmajew
Abstract:
The aim of this paper is to derive convergence results for projected line-search methods on the real-algebraic variety $\mathcal{M}_{\le k}$ of real $m \times n$ matrices of rank at most $k$. Such methods extend Riemannian optimization methods, which are successfully used on the smooth manifold $\mathcal{M}_k$ of rank-$k$ matrices, to its closure by taking steps along gradient-related directions in the tangent cone, and afterwards projecting back to $\mathcal{M}_{\le k}$. Considering such a method circumvents the difficulties which arise from the nonclosedness and the unbounded curvature of $\mathcal{M}_k$. The pointwise convergence is obtained for real-analytic functions on the basis of a Łojasiewicz inequality for the projection of the antigradient to the tangent cone. If the derived limit point lies on the smooth part of $\mathcal{M}_{\le k}$, i.e. in $\mathcal{M}_k$, this boils down to more or less known results, but with the benefit that asymptotic convergence rate estimates (for specific step-sizes) can be obtained without an a priori curvature bound, simply from the fact that the limit lies on a smooth manifold. At the same time, one can give a convincing justification for assuming critical points to lie in $\mathcal{M}_k$: if $X$ is a critical point of $f$ on $\mathcal{M}_{\le k}$, then either $X$ has rank $k$, or $\nabla f(X) = 0$.
Submitted 22 April, 2015; v1 submitted 21 February, 2014;
originally announced February 2014.