-
Behaviour Distillation
Authors:
Andrei Lupu,
Chris Lu,
Jarek Liesen,
Robert Tjarko Lange,
Jakob Foerster
Abstract:
Dataset distillation aims to condense large datasets into a small number of synthetic examples that can be used as drop-in replacements when training new models. It has applications to interpretability, neural architecture search, privacy, and continual learning. Despite strong successes in supervised domains, such methods have not yet been extended to reinforcement learning, where the lack of a fixed dataset renders most distillation methods unusable. Filling the gap, we formalize behaviour distillation, a setting that aims to discover and then condense the information required for training an expert policy into a synthetic dataset of state-action pairs, without access to expert data. We then introduce Hallucinating Datasets with Evolution Strategies (HaDES), a method for behaviour distillation that can discover datasets of just four state-action pairs which, under supervised learning, train agents to competitive performance levels in continuous control tasks. We show that these datasets generalize out of distribution to training policies with a wide range of architectures and hyperparameters. We also demonstrate application to a downstream task, namely training multi-task agents in a zero-shot fashion. Beyond behaviour distillation, HaDES provides significant improvements in neuroevolution for RL over previous approaches and achieves SoTA results on one standard supervised dataset distillation task. Finally, we show that visualizing the synthetic datasets can provide human-interpretable task insights.
Submitted 21 June, 2024;
originally announced June 2024.
-
Discovering Minimal Reinforcement Learning Environments
Authors:
Jarek Liesen,
Chris Lu,
Andrei Lupu,
Jakob N. Foerster,
Henning Sprekeler,
Robert T. Lange
Abstract:
Reinforcement learning (RL) agents are commonly trained and evaluated in the same environment. In contrast, humans often train in a specialized environment before being evaluated, such as studying a book before taking an exam. The potential of such specialized training environments is still vastly underexplored, despite their capacity to dramatically speed up training.
The framework of synthetic environments takes a first step in this direction by meta-learning neural network-based Markov decision processes (MDPs). The initial approach was limited to toy problems and produced environments that did not transfer to unseen RL algorithms. We extend this approach in three ways: Firstly, we modify the meta-learning algorithm to discover environments invariant towards hyperparameter configurations and learning algorithms. Secondly, by leveraging hardware parallelism and introducing a curriculum on an agent's evaluation episode horizon, we can achieve competitive results on several challenging continuous control problems. Thirdly, we surprisingly find that contextual bandits enable training RL agents that transfer well to their evaluation environment, even if it is a complex MDP. Hence, we set up our experiments to train synthetic contextual bandits, which perform on par with synthetic MDPs, yield additional insights into the evaluation environment, and can speed up downstream applications.
Submitted 18 June, 2024;
originally announced June 2024.
-
Towards understanding CG and GMRES through examples
Authors:
Erin Carson,
Jörg Liesen,
Zdeněk Strakoš
Abstract:
When the CG method for solving linear algebraic systems was formulated about 70 years ago by Lanczos, Hestenes, and Stiefel, it was considered an iterative process possessing a mathematical finite termination property. CG was placed into a rich mathematical context, including links with Gauss quadrature and continued fractions. The optimality property of CG was described via a normalized weighted polynomial least squares approximation to zero. This highly nonlinear problem explains the adaptation of CG iterates to the given data. Karush and Hayes immediately considered CG in infinite dimensional Hilbert spaces and investigated its superlinear convergence. Since then, the view of CG and other Krylov subspace methods has changed. Today these methods are primarily used as computational tools, and their behavior is typically characterized using linear upper bounds or heuristics based on clustering of eigenvalues. Such simplifications limit the mathematical understanding and also negatively affect their practical application.
This paper offers a different perspective. Focusing on CG and GMRES, it presents mathematically important and practically relevant phenomena that uncover their behavior through a discussion of computed examples. These examples provide an easily accessible approach that enables understanding of the methods, while pointers to more detailed analyses in the literature are given. This approach allows readers to choose the level of depth and thoroughness appropriate for their intentions. Some of the points made in this paper illustrate well known facts. Others challenge mainstream views and explain existing misunderstandings. Several points refer to recent results leading to open problems. We consider CG and GMRES crucially important for the mathematical understanding, further development, and practical applications also of other Krylov subspace methods.
Submitted 1 February, 2024; v1 submitted 2 November, 2022;
originally announced November 2022.
-
On the Forsythe conjecture
Authors:
Vance Faber,
Jörg Liesen,
Petr Tichý
Abstract:
Forsythe formulated a conjecture about the asymptotic behavior of the restarted conjugate gradient method in 1968. We translate several of his results into modern terms, and generalize the conjecture (originally formulated only for symmetric positive definite matrices) to symmetric and nonsymmetric matrices. Our generalization is based on a two-sided or cross iteration with the given matrix and its transpose, which is based on the projection process used in the Arnoldi (or for symmetric matrices the Lanczos) algorithm. We prove several new results about the limiting behavior of this iteration, but the conjecture still remains largely open.
Submitted 29 September, 2022;
originally announced September 2022.
-
Computing the logarithmic capacity of compact sets having (infinitely) many components with the Charge Simulation Method
Authors:
Jörg Liesen,
Mohamed M. S. Nasser,
Olivier Sète
Abstract:
We apply the Charge Simulation Method (CSM) in order to compute the logarithmic capacity of compact sets consisting of (infinitely) many "small" components. This application allows us to use just a single charge point for each component. The resulting method is therefore significantly more efficient than methods based on discretizations of the boundaries (for example, our own method presented in [Liesen, Sète, Nasser, 2017]), while maintaining a very high level of accuracy. We study properties of the linear algebraic systems that arise in the CSM, and show how these systems can be solved efficiently using preconditioned iterative methods, where the matrix-vector products are computed using the Fast Multipole Method. We illustrate the use of the method on generalized Cantor sets and the Cantor dust.
Submitted 1 March, 2023; v1 submitted 25 January, 2022;
originally announced January 2022.
-
On non-Hermitian positive (semi)definite linear algebraic systems arising from dissipative Hamiltonian DAEs
Authors:
Candan Güdücü,
Jörg Liesen,
Volker Mehrmann,
Daniel B. Szyld
Abstract:
We discuss different cases of dissipative Hamiltonian differential-algebraic equations and the linear algebraic systems that arise in their linearization or discretization. For each case we give examples from practical applications. An important feature of the linear algebraic systems is that the (non-Hermitian) system matrix has a positive definite or semidefinite Hermitian part. In the positive definite case we can solve the linear algebraic systems iteratively by Krylov subspace methods based on efficient three-term recurrences. We illustrate the performance of these iterative methods on several examples. The semidefinite case can be challenging and requires additional techniques to deal with the "singular part", while the "positive definite part" can still be treated with the three-term recurrence methods.
Submitted 4 August, 2022; v1 submitted 10 November, 2021;
originally announced November 2021.
-
Centrality of nodes in Federated Byzantine Agreement Systems
Authors:
André Gaul,
Jörg Liesen
Abstract:
The federated Byzantine agreement system (FBAS) is a consensus model introduced by Mazières in 2016 where the participating nodes conceptually form a network, with links between them being established by each node individually and thus in a decentralized way. An important question is whether these decentralized decisions lead to an overall decentralized network. The level of (de-)centralization in a network can be assessed using centrality measures. In this paper we consider three different approaches for obtaining centrality measures for the nodes in an FBAS. Two of them are based on adapting well-known measures based on graphs and hypergraphs to the FBAS context. Since the network structure of an FBAS can be more complex than (usual) graphs or hypergraphs, we also develop a new, problem-adapted centrality measure. This new measure is based on the intactness of nodes, which is an important ingredient of the FBAS model. We illustrate advantages and disadvantages of the three approaches on several computed examples. We have implemented all centrality measures and performed all computations in the Python package Stellar Observatory.
Submitted 7 December, 2020;
originally announced December 2020.
-
Analysis of the multiplicative Schwarz method for matrices with a special block structure
Authors:
Carlos Echeverría,
Jörg Liesen,
Petr Tichý
Abstract:
We analyze the convergence of the (algebraic) multiplicative Schwarz method applied to linear algebraic systems with matrices having a special block structure that arises, for example, when a (partial) differential equation is posed and discretized on a domain that consists of two subdomains with an overlap. This is a basic situation in the context of domain decomposition methods. Our analysis is based on the algebraic structure of the Schwarz iteration matrices, and we derive error bounds that are based on the block diagonal dominance of the given system matrix. Our analysis does not assume that the system matrix is symmetric (positive definite), or has the $M$- or $H$-matrix property. Our approach is motivated by and significantly generalizes an analysis for a special one-dimensional model problem given in [4].
Submitted 19 December, 2019;
originally announced December 2019.
-
Mathematical Analysis and Algorithms for Federated Byzantine Agreement Systems
Authors:
André Gaul,
Ismail Khoffi,
Jörg Liesen,
Torsten Stüber
Abstract:
We give an introduction to federated Byzantine agreement systems (FBAS) with many examples ranging from small "academic" cases to the current Stellar network. We then analyze the main concepts from a mathematical and an algorithmic point of view. Based on work of Lachowski we derive algorithms for quorum enumeration, checking quorum intersection, and computing the intact nodes with respect to a given set of ill-behaved (Byzantine) nodes. We also show that from the viewpoint of the intactness probability of nodes, which we introduce in this paper, a hierarchical setup of nodes is inferior to an arrangement that we call a symmetric simple FBAS. All algorithms described in this paper are implemented in the Python package Stellar Observatory, which is also used in some of the computed examples.
Submitted 3 December, 2019;
originally announced December 2019.
-
Block diagonal dominance of matrices revisited: bounds for the norms of inverses and eigenvalue inclusion sets
Authors:
Carlos Echeverría,
Jörg Liesen,
Reinhard Nabben
Abstract:
We generalize the bounds on the inverses of diagonally dominant matrices obtained in [16] from scalar to block tridiagonal matrices. Our derivations are based on a generalization of the classical condition of block diagonal dominance of matrices given by Feingold and Varga in [11]. Based on this generalization, which was recently presented in [3], we also derive a variant of the Gershgorin Circle Theorem for general block matrices which can provide tighter spectral inclusion regions than those obtained by Feingold and Varga.
Submitted 16 May, 2018; v1 submitted 15 December, 2017;
originally announced December 2017.
-
The maximum number of zeros of $r(z) - \overline{z}$ revisited
Authors:
Jörg Liesen,
Jan Zur
Abstract:
Generalizing several previous results in the literature on rational harmonic functions, we derive bounds on the maximum number of zeros of functions $f(z) = \frac{p(z)}{q(z)} - \overline{z}$, which depend on both $\mathrm{deg}(p)$ and $\mathrm{deg}(q)$. Furthermore, we prove that any function that attains one of these upper bounds is regular.
Submitted 7 December, 2017; v1 submitted 13 June, 2017;
originally announced June 2017.
-
How constant shifts affect the zeros of certain rational harmonic functions
Authors:
Jörg Liesen,
Jan Zur
Abstract:
We study the effect of constant shifts on the zeros of rational harmonic functions $f(z) = r(z) - \overline{z}$. In particular, we characterize how shifting through the caustics of $f$ changes the number of zeros and their respective orientations. This also yields insight into the nature of the singular zeros of $f$. Our results have applications in gravitational lensing theory, where certain such functions $f$ represent gravitational point-mass lenses, and a constant shift can be interpreted as the position of the light source of the lens.
Submitted 22 May, 2018; v1 submitted 24 February, 2017;
originally announced February 2017.
-
Using separable non-negative matrix factorization techniques for the analysis of time-resolved Raman spectra
Authors:
Robert Luce,
Peter Hildebrandt,
Uwe Kuhlmann,
Jörg Liesen
Abstract:
The key challenge of time-resolved Raman spectroscopy is the identification of the constituent species and the analysis of the kinetics of the underlying reaction network. In this work we present an integral approach that allows for determining both the component spectra and the rate constants simultaneously from a series of vibrational spectra. It is based on an algorithm for non-negative matrix factorization which is applied to the experimental data set following a few pre-processing steps. As a prerequisite for physically unambiguous solutions, each component spectrum must include one vibrational band that does not significantly interfere with vibrational bands of other species. The approach is applied to synthetic "experimental" spectra derived from model systems comprising a set of species with component spectra differing with respect to their degree of spectral interferences and signal-to-noise ratios. In each case, the species involved are connected via monomolecular reaction pathways. The potential and limitations of the approach for recovering the respective rate constants and component spectra are discussed.
Submitted 14 September, 2017; v1 submitted 3 February, 2016;
originally announced February 2016.
-
Fast and accurate computation of the logarithmic capacity of compact sets
Authors:
Jörg Liesen,
Olivier Sète,
Mohamed M. S. Nasser
Abstract:
We present a numerical method for computing the logarithmic capacity of compact subsets of $\mathbb{C}$, which are bounded by Jordan curves and have finitely connected complement. The subsets may have several components and need not have any special symmetry. The method relies on the conformal map onto lemniscatic domains and, computationally, on the solution of a boundary integral equation with the Neumann kernel. Our numerical examples indicate that the method is fast and accurate. We apply it to give an estimate of the logarithmic capacity of the Cantor middle third set and generalizations of it.
Submitted 19 October, 2016; v1 submitted 21 July, 2015;
originally announced July 2015.
-
Numerical computation of the conformal map onto lemniscatic domains
Authors:
Mohamed M. S. Nasser,
Jörg Liesen,
Olivier Sète
Abstract:
We present a numerical method for the computation of the conformal map from unbounded multiply-connected domains onto lemniscatic domains. For $\ell$-times connected domains the method requires solving $\ell$ boundary integral equations with the Neumann kernel. This can be done in $O(\ell^2 n \log n)$ operations, where $n$ is the number of nodes in the discretization of each boundary component of the multiply connected domain. As demonstrated by numerical examples, the method works for domains with close-to-touching boundaries, non-convex boundaries, piecewise smooth boundaries, and for domains of high connectivity.
Submitted 15 December, 2015; v1 submitted 19 May, 2015;
originally announced May 2015.
-
Properties and examples of Faber--Walsh polynomials
Authors:
Olivier Sète,
Jörg Liesen
Abstract:
The Faber--Walsh polynomials are a direct generalization of the (classical) Faber polynomials from simply connected sets to sets with several simply connected components. In this paper we derive new properties of the Faber--Walsh polynomials, where we focus on results of interest in numerical linear algebra, and on the relation between the Faber--Walsh polynomials and the classical Faber and Chebyshev polynomials. Moreover, we present examples of Faber--Walsh polynomials for two real intervals as well as some non-real sets consisting of several simply connected components.
Submitted 20 July, 2016; v1 submitted 26 February, 2015;
originally announced February 2015.
-
On conformal maps from multiply connected domains onto lemniscatic domains
Authors:
Olivier Sète,
Jörg Liesen
Abstract:
We study conformal maps from multiply connected domains in the extended complex plane onto lemniscatic domains. Walsh proved the existence of such maps in 1956 and thus obtained a direct generalization of the Riemann mapping theorem to multiply connected domains. For polynomial pre-images of simply connected sets we derive a construction principle for Walsh's conformal map in terms of the Riemann map for the simply connected set. Moreover, we explicitly construct examples of Walsh's conformal map for certain radial slit domains and circular domains.
Submitted 26 April, 2016; v1 submitted 8 January, 2015;
originally announced January 2015.
-
Fast Recovery and Approximation of Hidden Cauchy Structure
Authors:
Jörg Liesen,
Robert Luce
Abstract:
We derive an algorithm of optimal complexity which determines whether a given matrix is a Cauchy matrix, and which exactly recovers the Cauchy points defining a Cauchy matrix from the matrix entries. Moreover, we study how to approximate a given matrix by a Cauchy matrix with a particular focus on the recovery of Cauchy points from noisy data. We derive an approximation algorithm of optimal complexity for this task, and prove approximation bounds. Numerical examples illustrate our theoretical results.
Submitted 30 August, 2015; v1 submitted 8 December, 2014;
originally announced December 2014.
-
A Note on the Maximum Number of Zeros of $r(z) - \bar{z}$
Authors:
Robert Luce,
Olivier Sète,
Jörg Liesen
Abstract:
An important theorem of Khavinson & Neumann (Proc. Amer. Math. Soc. 134(4), 2006) states that the complex harmonic function $r(z) - \bar{z}$, where $r$ is a rational function of degree $n \geq 2$, has at most $5 (n - 1)$ zeros. In this note we resolve a slight inaccuracy in their proof and in addition we show that for certain functions of the form $r(z) - \bar{z}$ no more than $5 (n - 1) - 1$ zeros can occur. Moreover, we show that $r(z) - \bar{z}$ is regular, if it has the maximal number of zeros.
Submitted 23 December, 2014; v1 submitted 1 October, 2014;
originally announced October 2014.
-
Creating images by adding masses to gravitational point lenses
Authors:
Olivier Sète,
Robert Luce,
Jörg Liesen
Abstract:
A well-studied maximal gravitational point lens construction of S. H. Rhie produces $5n$ images of a light source using $n+1$ deflector masses. The construction arises from a circular, symmetric deflector configuration on $n$ masses (producing only $3n+1$ images) by adding a tiny mass in the center of the other mass positions (and reducing all the other masses a little bit).
In a recent paper we studied this "image creating effect" from a purely mathematical point of view (Sète, Luce & Liesen, Comput. Methods Funct. Theory 15(1):9-35, 2015). Here we discuss a few consequences of our findings for gravitational microlensing models. We present a complete characterization of the effect of adding small masses to these point lens models, with respect to the number of images. In particular, we give several examples of maximal lensing models that are different from Rhie's construction and that do not share its highly symmetric appearance. We give generally applicable conditions that allow the construction of maximal point lenses on $n+1$ masses from maximal lenses on $n$ masses.
Submitted 6 March, 2015; v1 submitted 12 May, 2014;
originally announced May 2014.
-
Pták's nondiscrete induction and its application to matrix iterations
Authors:
Jörg Liesen
Abstract:
Vlastimil Pták's method of nondiscrete induction is based on the idea that in the analysis of iterative processes one should aim at rates of convergence as functions rather than just numbers, because functions may give convergence estimates that are tight throughout the iteration rather than just asymptotically. In this paper we motivate and prove a theorem on nondiscrete induction originally due to Potra and Pták, and we apply it to the Newton iterations for computing the matrix polar decomposition and the matrix square root. Our goal is to illustrate the application of the method of nondiscrete induction in the finite dimensional numerical linear algebra context. We show the sharpness of the resulting convergence estimate analytically for the polar decomposition iteration and for special cases of the square root iteration, as well as on some numerical examples for the square root iteration. We also discuss some of the method's limitations and possible extensions.
Submitted 12 May, 2014;
originally announced May 2014.
-
Perturbing rational harmonic functions by poles
Authors:
Olivier Sète,
Robert Luce,
Jörg Liesen
Abstract:
We study how adding certain poles to rational harmonic functions of the form $R(z)-\bar{z}$, with $R(z)$ rational and of degree $d\geq 2$, affects the number of zeros of the resulting functions. Our results are motivated by and generalize a construction of Rhie derived in the context of gravitational microlensing (ArXiv e-print 2003). Of particular interest is the construction and the behavior of rational functions $R(z)$ that are {\em extremal} in the sense that $R(z)-\bar{z}$ has the maximal possible number of $5(d-1)$ zeros.
Submitted 27 May, 2014; v1 submitted 4 March, 2014;
originally announced March 2014.
-
Sharp parameter bounds for certain maximal point lenses
Authors:
Robert Luce,
Olivier Sète,
Jörg Liesen
Abstract:
Starting from an $n$-point circular gravitational lens having $3n+1$ images, Rhie (2003) used a perturbation argument to construct an $(n+1)$-point lens producing $5n$ images. In this work we give a concise proof of Rhie's result, and we extend the range of parameters in Rhie's model for which maximal lensing occurs.
We also study a slightly different construction given by Bayer and Dyer (2007) arising from the $(3n+1)$-point lens. In particular, we extend their results and give sharp parameter bounds for their lens model. By a substitution of variables and parameters we show that both models are equivalent in a certain sense.
Submitted 23 April, 2014; v1 submitted 17 December, 2013;
originally announced December 2013.
-
Max-min and min-max approximation problems for normal matrices revisited
Authors:
Jörg Liesen,
Petr Tichý
Abstract:
We give a new proof for an equality of certain max-min and min-max approximation problems involving normal matrices. The previously published proofs of this equality apply tools from matrix theory, (analytic) optimization theory and constrained convex optimization. Our proof uses a classical characterization theorem from approximation theory and thus exploits the link between the two approximation problems with normal matrices on the one hand and approximation problems on compact sets in the complex plane on the other.
Submitted 22 October, 2013;
originally announced October 2013.
-
Characterization of worst-case GMRES
Authors:
Vance Faber,
Jörg Liesen,
Petr Tichý
Abstract:
Given a matrix $A$ and iteration step $k$, we study a best possible attainable upper bound on the GMRES residual norm that does not depend on the initial vector $b$. This quantity is called the worst-case GMRES approximation. We show that the worst-case behavior of GMRES for the matrices $A$ and $A^T$ is the same, and we analyze properties of initial vectors for which the worst-case residual norm is attained. In particular, we show that such vectors satisfy a certain "cross equality", and we characterize them as right singular vectors of the corresponding GMRES residual matrix. We show that the worst-case GMRES polynomial may not be uniquely determined, and we consider the relation between the worst-case and the ideal GMRES approximations, giving new examples in which the inequality between the two quantities is sharp at all iteration steps $k\geq 3$. Finally, we give a complete characterization of how the values of the approximation problems in the context of worst-case and ideal GMRES for a real matrix change when one considers complex (rather than real) polynomials and initial vectors in these problems.
Submitted 22 February, 2013;
originally announced February 2013.
-
The field of values bound on ideal GMRES
Authors:
Jörg Liesen,
Petr Tichý
Abstract:
A widely known result of Elman, and its improvements due to Starke, Eiermann and Ernst, gives a bound on the worst-case GMRES residual norm using quantities related to the field of values of the given matrix and its inverse. We prove that these bounds also hold for the ideal GMRES approximation, and we derive and discuss some improvements of the bounds.
Submitted 16 July, 2020; v1 submitted 26 November, 2012;
originally announced November 2012.
-
A framework for deflated and augmented Krylov subspace methods
Authors:
André Gaul,
Martin H. Gutknecht,
Jörg Liesen,
Reinhard Nabben
Abstract:
We consider deflation and augmentation techniques for accelerating the convergence of Krylov subspace methods for the solution of nonsingular linear algebraic systems. Despite some formal similarity, the two techniques are conceptually different from preconditioning. Deflation (in the sense the term is used here) "removes" certain parts from the operator making it singular, while augmentation adds a subspace to the Krylov subspace (often the one that is generated by the singular operator); in contrast, preconditioning changes the spectrum of the operator without making it singular. Deflation and augmentation have been used in a variety of methods and settings. Typically, deflation is combined with augmentation to compensate for the singularity of the operator, but both techniques can be applied separately.
We introduce a framework of Krylov subspace methods that satisfy a Galerkin condition. It includes the families of orthogonal residual (OR) and minimal residual (MR) methods. We show that in this framework augmentation can be achieved either explicitly or, equivalently, implicitly by projecting the residuals appropriately and correcting the approximate solutions in a final step. We study conditions for a breakdown of the deflated methods, and we show several possibilities to avoid such breakdowns for the deflated MINRES method. Numerical experiments illustrate properties of different variants of deflated MINRES analyzed in this paper.
Submitted 1 February, 2013; v1 submitted 7 June, 2012;
originally announced June 2012.